Data Versioning Explained the Open Source Way
The demand for better versioning of data is growing. There is a plethora of open source projects providing tools for managing data using the best practices we learned from managing code.
In this talk we will go over the differences between these solutions by clustering them according to four main use cases: collaboration over data, managing ML pipelines, the need for mutability, and ACID guarantees over an object storage data lake.
By the end of the talk, you should have a good understanding of how these solutions compare and which you should choose for different types of use cases.