Running machine learning jobs at scale places painful operational demands on infrastructure. As the number of jobs increases, easy-to-use infrastructure becomes a necessity. In this talk we will cover how we use Kubernetes at Textkernel as a job manager to scale our TensorFlow-based jobs. We will also explore other solutions such as distributed TensorFlow and Kubeflow.