Velox -- Meta's Unified Execution Engine (P. Pedreira, et al., VLDB 2022)
Velox: Meta’s Unified Execution Engine
Overview:
- Meta had many specialized engines for different types of data processing workloads. This resulted in duplication of effort and inconsistent interfaces for users.
- Velox is a C++ library that provides reusable, extensible, and high-performance data processing components which teams can use to build execution engines in their specific data management systems. It aims to solve the issues of the siloed data ecosystem Meta observed.
- Velox takes a fully optimized query plan as input and performs the computation described by that plan in the local node. It does not contain a language front end, nor a global query optimizer – it is built specifically to be an execution engine.
Key takeaways:
- Provides the following high level components: Type, Vector, Expression Evaluation, Functions, Operators, I/O, Serializers, and Resource Management. Users can choose what they need and customize functionality using its extensibility APIs.
- Velox plays the role of an execution engine in a modular data management system.
System evaluated:
- Presto – Meta’s distributed query engine, Presto with Velox as its execution engine
Workloads/Benchmarks
- TPC-H and real workloads from production traffic in their company