Distributed Hyperparameter Search with sklearn and Kubernetes



While sklearn provides a good interface for hyperparameter search on large and complex models (pipelines), running such a search can take a lot of time. The traditional approach usually involves one beefy machine and a lot of waiting. In other cases, people “manually” split parameter ranges between nodes, but that can also be problematic since the nodes won't talk to each other. Kubernetes is currently the most prominent scheduler and shines at distributing tasks, but it is a pretty complex system in itself.

In this talk, I will show how you can harness the scheduling of Kubernetes to distribute hyperparameter search with sklearn onto a cluster of nodes. This can be achieved quite easily and with just a few changes to the original code, so the data scientist won't be bothered by complex Kubernetes internals.
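To make the idea concrete, here is a minimal sketch of one possible way to split a parameter grid across Kubernetes workers. It is not necessarily the approach presented in the talk. It assumes each worker runs as one pod of a Kubernetes Indexed Job, which exposes the pod's index in the `JOB_COMPLETION_INDEX` environment variable. The `NUM_WORKERS` variable, the pipeline step names `pca` and `clf`, and the helper functions `expand_grid` and `shard` are hypothetical and exist only for illustration. The grid expansion uses only the standard library; in a real sklearn project, `sklearn.model_selection.ParameterGrid` could play the same role.

```python
import itertools
import os


def expand_grid(param_grid):
    """Expand a dict of parameter lists into a list of parameter dicts."""
    keys = sorted(param_grid)
    return [
        dict(zip(keys, values))
        for values in itertools.product(*(param_grid[k] for k in keys))
    ]


def shard(candidates, index, total):
    """Return the candidates assigned to worker `index` out of `total` workers."""
    if not 0 <= index < total:
        raise ValueError("index must satisfy 0 <= index < total")
    # Round-robin slicing: every candidate lands on exactly one worker.
    return candidates[index::total]


# Hypothetical grid for a pipeline with steps named "pca" and "clf".
PARAM_GRID = {
    "clf__C": [0.1, 1.0, 10.0],
    "pca__n_components": [5, 10],
}

if __name__ == "__main__":
    # JOB_COMPLETION_INDEX is set by Kubernetes for Indexed Jobs.
    # NUM_WORKERS is an assumed variable the Job spec would have to provide.
    index = int(os.environ.get("JOB_COMPLETION_INDEX", "0"))
    total = int(os.environ.get("NUM_WORKERS", "1"))
    for params in shard(expand_grid(PARAM_GRID), index, total):
        # A real worker would call pipeline.set_params(**params), fit the
        # pipeline, and write the score to storage shared by all workers.
        print(params)
```

Because the shards are disjoint and together cover the whole grid, no two workers evaluate the same candidate. The workers still do not talk to each other, so each one would need to write its scores to a shared location, where a final step can pick the best parameters.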