Apache Airflow in the Cloud: Programmatically orchestrating workloads with Python

Apache Airflow is a pipeline orchestration tool for Python, initially built at Airbnb and later open-sourced. It allows data engineers to configure multi-system workflows that are executed in parallel across any number of workers. A single pipeline may contain one or more operations, such as running Python code, executing Bash commands, or submitting a Spark job to the cloud. Airflow itself is written in Python, and users can write their own custom operators in Python. A data pipeline is a critical component of an effective data science product, and orchestrating pipeline tasks enables simpler development and more robust, scalable engineering. In this tutorial, we will give a practical introduction to Apache Airflow.

Slides: https://speakerdeck.com/kaxil/apache-airflow-in-the-cloud-programmatically-orchestrating-workloads-with-python-pydata-london-2018
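As a rough sketch of what such a pipeline looks like in code (this example is not from the talk; the DAG name, task names, and commands are illustrative placeholders, and the import paths follow the Airflow 1.x style current at the time of this 2018 talk):

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator
    from airflow.operators.python_operator import PythonOperator

    # Default settings applied to every task in the DAG.
    default_args = {
        "owner": "airflow",
        "retries": 1,
        "retry_delay": timedelta(minutes=5),
    }

    def transform():
        # Placeholder for an arbitrary Python transformation step.
        print("transforming data")

    # A DAG (directed acyclic graph) groups tasks and defines their schedule.
    dag = DAG(
        dag_id="example_pipeline",  # hypothetical name
        default_args=default_args,
        start_date=datetime(2018, 1, 1),
        schedule_interval="@daily",
    )

    # Tasks mix operator types: Bash commands and a Python callable.
    extract = BashOperator(task_id="extract", bash_command="echo extracting", dag=dag)
    transform_task = PythonOperator(task_id="transform", python_callable=transform, dag=dag)
    load = BashOperator(task_id="load", bash_command="echo loading", dag=dag)

    # Declare dependencies; tasks with no ordering between them run in parallel.
    extract >> transform_task >> load

Placing a file like this in Airflow's DAGs folder is enough for the scheduler to pick it up and run the tasks in dependency order across the available workers.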
