Data versioning in machine learning projects





In machine learning projects, it is easy to lose track of the many versions of your data files. Data Version Control (DVC) is an open-source tool for data science projects, created to address discrepancies between code and data files. It works on top of Git: when you switch between Git branches, it checks out not only the source code but also the matching version of the data files. Slides: https://www.slideshare.net/DmitryPetrov15/pydata-berlin-2018-dvcorg --- www.pydata.org
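
The general idea of keeping data in step with code can be illustrated with a small Python sketch. This is not DVC's implementation or API; it is a toy model under stated assumptions: large data files are copied into a content-addressed cache, and only a short hash (a "pointer") would be committed to Git alongside the code, so switching branches can restore the matching data. The names `snapshot`, `checkout`, and the cache layout are hypothetical and chosen for illustration only.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def snapshot(data_file: Path, cache_dir: Path) -> str:
    """Store a copy of data_file in the cache, keyed by its content hash.

    Returns the hash, which plays the role of a small pointer that could be
    committed to Git in place of the data itself (hypothetical scheme).
    """
    digest = hashlib.sha256(data_file.read_bytes()).hexdigest()
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / digest
    if not cached.exists():  # identical content is stored only once
        shutil.copyfile(data_file, cached)
    return digest


def checkout(pointer: str, cache_dir: Path, data_file: Path) -> None:
    """Restore the data version identified by pointer into the workspace."""
    shutil.copyfile(cache_dir / pointer, data_file)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp)
        cache = workspace / "cache"
        data = workspace / "data.csv"

        # Two versions of the same data file, as they might exist on two branches.
        data.write_text("id,label\n1,cat\n")
        pointer_v1 = snapshot(data, cache)
        data.write_text("id,label\n1,cat\n2,dog\n")
        pointer_v2 = snapshot(data, cache)

        # "Switching branches" amounts to restoring the data the pointer names.
        checkout(pointer_v1, cache, data)
        assert data.read_text() == "id,label\n1,cat\n"
        checkout(pointer_v2, cache, data)
        assert data.read_text() == "id,label\n1,cat\n2,dog\n"
```

Because the pointer is derived from the file's content, two branches that use identical data share one cached copy, while any change to the data produces a new pointer that Git can track like ordinary source code.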