Scaling Machine Learning jobs with Kubernetes

Running machine learning jobs at scale places painful operational demands on infrastructure. As the number of jobs increases, easy-to-use infrastructure becomes a necessity. In this talk we will cover how we use Kubernetes at Textkernel as a job manager to scale our TensorFlow-based jobs. We will also explore other solutions such as distributed TensorFlow and Kubeflow.

Talk slides: https://docs.google.com/presentation/d/1-vP70IA_5-1xlUPhC09Pq9utxpNnTXfwCa-T_-zM8SU/edit#slide=id.g3b17409166_0_6

Editor's Note:

I am looking for editors/curators to help with branches of the tree. Please send me an email if you are interested.