May 31, 2023

Investing in Predibase

Coffee drinkers used to fall into two camps: those who value convenience and those who value quality. The former drank instant coffee; the latter pulled their own espresso shots. It wasn’t until the rise of Starbucks in the early 2000s that the two camps collided. Starbucks combined the convenience and price of Folgers with the quality and customizability of a La Marzocco. The world never looked back. Thanks to large language models (LLMs), a similar progression is playing out in the field of machine learning, and a Starbucks-scale ML company will soon be crowned. We think that company is Predibase.

Before LLMs, enterprises had two options to leverage ML: buy inflexible, black-box AutoML tooling—instant coffee—or hire engineers to wield complex, low-level frameworks—an espresso machine. With LLMs, the burden of feature engineering, hyperparameter search, and data labeling has decreased. ML teams now have access to large pre-trained models that can execute zero-shot and few-shot predictions reasonably well. LLMs combine the flexibility of low-level ML primitives with the simplicity of AutoML. The world will never look back.

The best way to leverage ML in the enterprise is still in flux. Enterprises can fine-tune an open-source LLM, work directly with a foundation model API, train their own LLM from scratch, or use “traditional” ML models for things like recommendation engines or image classifiers. That’s a lot to digest. In general, more pathways to value means more confusion for buyers, which in turn means an opportunity to simplify the process under one umbrella. Enter Predibase, a low-code, “declarative” approach to ML. We believe this is the quickest path to deriving value from ML, particularly in the age of LLMs. We’re thrilled to announce that Felicis is leading Predibase’s latest financing round.

Predibase strikes a chord with the ML community. At Predibase’s inaugural community meetup in late 2022, dozens of engineers flooded into the company’s SF office to hear a fireside chat between CEO Piero Molino and Sr. Data Scientist Daliana Liu. We RSVP’d on a whim. At the event, one attendee pointed out the similarities between Predibase’s declarative approach to ML and Terraform, HashiCorp’s infrastructure-as-code tool that uses a declarative configuration language. It’s rare to see a vision, “Terraform for ML,” strike such a chord with an audience of highly technical, default-skeptical engineers. Because we showed up, we felt that potential energy firsthand; it was palpable. We set out to build a relationship with the Predibase team—Piero Molino, Travis Addair, Devvret Rishi, and Chris Ré. What followed convinced us that we needed to be part of the journey.

World-class founders at the helm. After several meetings, it became abundantly clear that the Predibase team has an optimal balance of ML expertise, business acumen, and user empathy to crack the code on “declarative ML.” Piero is the author of Ludwig (9K GitHub stars, 136 contributors) and a former Sr. Staff Research Scientist at Uber AI; Travis, CTO, led the ML infrastructure team at Uber and was the lead architect behind Horovod (13K stars, 169 contributors); and Dev (CPO) was the lead PM for Kaggle at Google. Chris, Co-Founder and Professor of CS at Stanford, is a serial entrepreneur and ML luminary. You couldn’t ask for a better team building at the cutting edge of ML.

Deploying ML is hard, but it doesn’t have to be. Just like ordering a coffee to your exact specifications at Starbucks, data scientists and non-experts alike can use “declarative” ML systems like Predibase to specify (“declare”) what they want a model to do and then rely on the system to figure out how to do it. Piero built an open-source declarative ML framework at Uber, called Ludwig, and saw the massive impact it had on the business. He and his co-founders started Predibase to bring this benefit to the masses. Made-to-order ML models aren’t a new concept, but declarative ML systems have traditionally been confined to large enterprises like Uber, Meta, and Apple (see Looper, Overton). With Ludwig developed in the open source, we think Predibase has a credible shot at truly democratizing ML. In fact, we think a commitment to fostering an open-source community is a necessary prerequisite to democratizing ML. Predibase offers the right abstraction to simplify model development for all audiences, decreasing the time and cost associated with deploying ML without sacrificing control. We think this is a game changer, and we’ve spoken with leading ML practitioners who agree.
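
To make “declare the what, not the how” concrete, here is a minimal sketch using Ludwig’s open-source Python API (the framework Piero authored at Uber). The column names and file paths are hypothetical placeholders: the configuration only states the inputs and the desired prediction, and the framework works out preprocessing, architecture, and training.

```python
# Minimal declarative-ML sketch using Ludwig's open-source Python API.
# The column names ("review_text", "sentiment") and CSV paths are
# hypothetical placeholders, not a real dataset.
from ludwig.api import LudwigModel

config = {
    "input_features": [
        {"name": "review_text", "type": "text"},    # what the model reads
    ],
    "output_features": [
        {"name": "sentiment", "type": "category"},  # what it should predict
    ],
}

model = LudwigModel(config)

# Ludwig derives preprocessing, model architecture, and the training loop
# from the declarative config above.
train_stats, _, output_dir = model.train(dataset="reviews.csv")
predictions, _ = model.predict(dataset="new_reviews.csv")
```

Swapping the output feature for a numeric column turns the same few lines into a regression model, with no changes to training internals.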

Numerous engineering leaders have confirmed a need for Predibase’s solution. In each conversation we had, three aspects of Predibase’s offering stood out: (1) simplifying multi-modal model development, (2) enabling non-technical users to spin up models with PQL, Predibase’s “predictive query language,” and (3) quickly fine-tuning and privately hosting your own LLM. One Head of Engineering recalled, “Just yesterday, I was talking about an internal ML use case with multi-modal insurance data… That’s where Predibase differentiates itself. Multimodal models are the future.” Another Senior Director of Data Science noted, “Earlier today I was told, ‘we need a regression model for the sales team, and we need predictions within two days!’ I’m very interested in PQL for this use case.” Lastly, when it comes to LLMs, almost every enterprise we speak with is looking to adopt the technology. The three use cases that resonate most are summarization, Q&A and semantic search, and asset generation (text, code, SQL). With 10 lines of configuration code, you can privately host and customize an LLM with Predibase. Whether you’re composing and training state-of-the-art model pipelines on multiple data modalities, unlocking the value of ML for non-technical business users, or experimenting with LLMs, Predibase is the one-stop shop for solving real business problems with ML.
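
For a sense of what those “10 lines of configuration code” might look like, below is a purely hypothetical sketch of a declarative LLM fine-tuning spec, written as a Python dict in the same style as the Ludwig example above. The keys, values, and model identifier are illustrative assumptions, not Predibase’s actual schema.

```python
# Hypothetical illustration only: the keys and values below are assumptions,
# not Predibase's (or Ludwig's) real configuration schema. The point is the
# shape of a declarative spec: name the base model and the task, and let the
# platform handle fine-tuning, serving, and private hosting.
llm_config = {
    "base_model": "an-open-source-llm",                            # assumed identifier
    "input_features": [{"name": "prompt", "type": "text"}],
    "output_features": [{"name": "completion", "type": "text"}],
    "fine_tuning": {"method": "parameter_efficient", "epochs": 3},
    "deployment": {"private": True},                               # stays in your own infrastructure
}
```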

The GA version of Predibase, launching today, adds new capabilities, including: 

  • Data Science Copilot – gives developers expert recommendations on how to improve the performance of their models, as well as explanations and examples in real time as they iterate.
  • Privately Hosted, Customized LLMs – instead of renting costly LLMs from API vendors and giving up control of sensitive data, Predibase allows organizations to deploy LLMs securely within their own enterprise infrastructure, fine-tuned for their specific ML task. Additionally, Predibase provides optimizations that accelerate LLM tuning while reducing costs by 100x.
  • Free Two-Week Trial – the launch of Predibase’s free trial enables engineers everywhere to start reaping the benefits of declarative ML. Predibase Cloud is a limited functionality version hosted by Predibase, while Predibase Virtual Private Cloud provides full functionality for organizations that choose to host in their own environment. Learn more about the free trial here.

Predibase makes deploying custom ML models as easy as ordering a coffee at Starbucks. Felicis has an established track record of backing category-defining companies like Runway, Supabase, Canva, Weights & Biases, and Notion. We think declarative ML platforms like Predibase will catalyze the next wave of ML adoption in the enterprise, creating billions of dollars of value and improving the workflows of millions in the process. We’re thrilled to be part of the journey.