[State of AI Startups] Memory/Learning, RL Envs & DBT-Fivetran — Sarah Catanzaro, Amplify - Latent Space Recap
Podcast: Latent Space
Published: 2025-12-30
Duration: 29 minutes
Guests: Sarah Catanzaro
Summary
Sarah Catanzaro from Amplify Partners discusses the current landscape of AI startups, highlighting the implications of the DBT-Fivetran merger, the consumerization of AI, and the trend of massive seed funding without clear short-term roadmaps.
What Happened
Sarah Catanzaro, a partner at Amplify Partners, provides insights on the current state of AI startups. She argues against the notion that the DBT-Fivetran merger signals the end of the modern data stack, suggesting instead that it positions the companies for IPO scale with a target of $600M+ combined revenue. Catanzaro points out that frontier labs are effectively using DBT and Fivetran for managing training data and agent analytics, indicating a strong demand for these tools despite the diminished need for large analytics teams.
Catanzaro also reflects on the failure of standalone data catalogs, which she attributes to a focus on human discoverability rather than machine-centric governance. She suggests that metadata services integrated within existing platforms like Snowflake, DBT, and Fivetran have absorbed most of the data catalog functions.
A significant trend discussed is the rise of $100M+ seed rounds, which Catanzaro finds concerning because they often come without clear roadmaps and follow rushed investment decisions. She warns that this trend can lead to a transactional view of funding, where signal and valuation are prioritized over strategic partnerships.
The episode also covers the consumerization of AI through personalization, where memory management and continual learning are seen as key drivers for retention and growth in 2026. Catanzaro emphasizes that AI applications need to adapt dynamically to user preferences and changing environments.
Catanzaro expresses skepticism about the hype surrounding RL environments, suggesting that real-world logs provide richer and more generalizable data than synthetic clones. She argues that the real world should serve as the primary RL environment, which is consistent with how companies like Cursor leverage actual user activity.
Finally, Catanzaro outlines her investment thesis, which centers on startups that tackle hard research problems like Retrieval-Augmented Generation (RAG) and rule-following and pair them with transformative applications that were previously unattainable.
Key Insights
- The DBT-Fivetran merger is aimed at reaching IPO scale, with a target of $600M+ combined revenue. Frontier labs already use both tools to manage training data and agent analytics, which points to strong demand despite the diminished need for large analytics teams.
- Standalone data catalogs have failed due to their focus on human discoverability, while metadata services integrated within platforms like Snowflake, DBT, and Fivetran have taken over their functions.
- The trend of $100M+ seed rounds in AI startups is concerning due to a lack of clear roadmaps and rushed investment decisions, leading to a transactional view of funding that prioritizes signal and valuation over strategic partnerships.
- Catanzaro considers real-world logs more valuable than synthetic clones for reinforcement learning environments because they provide richer and more generalizable data, consistent with how companies like Cursor use actual user activity.