Apache Airflow is a workflow orchestration tool for Python, originally built at Airbnb and later open-sourced. It lets data engineers define multi-system workflows whose tasks are executed in parallel across any number of workers. A single pipeline may contain one or many kinds of operations, such as running Python functions, executing bash commands, or submitting a Spark job to the cloud. Airflow itself is written in Python, and users can write their own custom operators in Python as well.
A data pipeline is a critical component of an effective data science product, and orchestrating pipeline tasks enables simpler development and more robust and scalable engineering.
In this tutorial, we will give a practical introduction to Apache Airflow.