Onehouse Unveils Onehouse Compute Runtime (OCR) for Data Lakehouses
Introduction
Organizations investing in data lakehouses in 2025 may want to check out a new offering unveiled by Onehouse this week. The company, founded by the creator of the Apache Hudi table format, launched Onehouse Compute Runtime (OCR), which it says enables customers to manage and optimize data lakehouse workloads across multiple cloud platforms, query engines, and open table formats.
The Emergence of Data Lakehouses
We’re in the midst of a building boom for data lakehouses, largely due to the industry coalescing around the Apache Iceberg table format in mid-2024, which reduced the odds that customers could choose the “wrong” format and thereby strand their data. The rise of Iceberg would seem to put competing table formats, including Apache Hudi and Databricks Delta Lake, on the back burner. But the folks at Hudi-backer Onehouse see abundant opportunity, and aren’t taking the changes lying down.
Onehouse Compute Runtime (OCR)
While the Hudi-Iceberg comparison is not exactly apples-to-apples (Hudi was originally designed to solve the fast data issue on Uber’s Hadoop cluster), Onehouse is nevertheless adapting to the reality that Iceberg is positioned to be the dominant table format moving forward. One way it’s doing that is by launching OCR.
Key Features of OCR
OCR gives customers the capability to manage their lakehouse environments across multiple cloud platforms (Databricks, Snowflake, AWS, Google Cloud) that use a variety of query engines (Spark, Redshift, BigQuery, Snowflake) on data stored in multiple table formats (Iceberg, Delta Lake, and Hudi). OCR doesn’t concern itself with the execution of the SQL (or other compute) workloads. Rather, it’s focused on automating some of the less glamorous but necessary maintenance work that lakehouses require.
How OCR Works
OCR automatically spins up the required compute resources on various cloud platforms using serverless computing techniques in customers’ own virtual private cloud (VPC) environments. OCR’s Spark-based serverless compute manager enables elastic scaling of the lakehouse maintenance workloads, such as data ingestion, table optimization, and ETL operations. This results in a 2x to 30x performance gain at a cost savings of 20% to 80%, the company says.
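To put the company's claimed ranges in concrete terms, here is a minimal Python sketch that applies them to a made-up baseline job. The function name `projected_range` and the baseline figures are hypothetical, and the sketch assumes that an "Nx performance gain" means the runtime is divided by N; the announcement does not spell out how the gain is measured.

```python
def projected_range(baseline_minutes, baseline_cost,
                    speedups=(2, 30), savings_pct=(20, 80)):
    """Apply the vendor-claimed ranges to a hypothetical baseline job.

    Assumption: an Nx performance gain divides runtime by N.
    Returns (best, worst) pairs for runtime and for cost.
    """
    low_speedup, high_speedup = speedups
    low_saving, high_saving = savings_pct
    return {
        # Highest claimed speedup gives the shortest runtime.
        "runtime_minutes": (baseline_minutes / high_speedup,
                            baseline_minutes / low_speedup),
        # Highest claimed saving gives the lowest cost.
        "cost": (baseline_cost * (100 - high_saving) / 100,
                 baseline_cost * (100 - low_saving) / 100),
    }


# Hypothetical maintenance job: 60 minutes and $100 per run today.
result = projected_range(60.0, 100.0)
print(result)
```

Under those assumptions, a 60-minute, $100 job would land somewhere between 2 and 30 minutes and between $20 and $80 per run. The two ranges are reported separately, so nothing here implies that the largest speedup and the largest saving occur together.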
Benefits of OCR
The goal with OCR is to give customers all the tools they need to take advantage of the growth in lakehouses and openness of table formats, according to Vinoth Chandar, the creator of Hudi and founder and CEO at Onehouse. “While open table formats have emerged as means to open up data across multiple engines, there is great need for a high-performance compute platform that can transform and optimize data across such engines,” says Chandar.
Early Adopter Success
One early adopter of OCR is the digital marketing company Conductor. “Our Onehouse data lakehouse has enabled us to meet the demands of rapid growth while dramatically simplifying our data architecture,” said Emil Emilov, principal software engineer at Conductor. “With automated scaling and resources that adapt to our workloads, Onehouse helps us dedicate our teams to building out our core platform differentiators rather than keeping the data stack continuously optimized.”
Conclusion
Onehouse’s OCR aims to be the decentralized compute platform that Chandar describes. The offering, which Onehouse launched Tuesday, January 14, automatically spins up the required compute resources on various cloud platforms using serverless computing techniques in customers’ own virtual private cloud (VPC) environments.
FAQs
Q: What is Onehouse Compute Runtime (OCR)?
A: OCR is a new offering from Onehouse that enables customers to manage and optimize data lakehouse workloads across multiple cloud platforms, query engines, and open table formats.
Q: What are the key features of OCR?
A: OCR gives customers the capability to manage their lakehouse environments across multiple cloud platforms, query engines, and open table formats.
Q: How does OCR work?
A: OCR automatically spins up the required compute resources on various cloud platforms using serverless computing techniques in customers’ own virtual private cloud (VPC) environments.
Q: What are the benefits of OCR?
A: According to Onehouse, OCR delivers a 2x to 30x performance gain at a cost savings of 20% to 80%, and provides a high-performance compute platform that can transform and optimize data across multiple engines.
Q: Who is an early adopter of OCR?
A: Conductor, a digital marketing company, is an early adopter of OCR.

