Created by Zdeněk Šimůnek
about 4 years ago
Question | Answer |
DAG | - Directed Acyclic Graph. - A collection of tasks, their dependencies, and settings. - Defined as code in a .py script. |
XCom | Feature for cross communication between tasks. |
dags_folder | - The folder where Airflow pipelines live. - This path must be absolute. - Airflow looks in your DAGS_FOLDER for modules that contain DAG objects in their GLOBAL NAMESPACE and adds the objects it finds to the DagBag. |
DAG Run | - An instance of a DAG, containing task instances that run for a specific execution_date. - Created by the Airflow scheduler or an external trigger. |
Task | - A Task defines a unit of work within a DAG; it is represented as a node in the DAG graph, and it is written in Python. - Each task is an implementation of an Operator. |
Operator | An operator describes a single task in a workflow. |
Sensor | An Operator that waits (polls) for a certain time, file, database row, S3 key, etc. |
chain(op1, [op2, op3], [op4, op5], op6) | Equivalent to: op1 >> [op2, op3]; op2 >> op4; op3 >> op5; [op4, op5] >> op6 |
Task Instance | An instance of a task that has been assigned to a DAG and has a state associated with a specific DAG run (i.e., for a specific execution_date). |
execution_date | The logical date and time for a DAG Run and its Task Instances. |
Jinja | Jinja is a modern and designer-friendly templating language for Python, modelled after Django’s templates. |
Hooks | - Hooks are interfaces to external platforms and databases like Hive, S3, MySQL, Postgres, HDFS, and Pig. - Hooks implement a common interface when possible, and act as a building block for operators. |
Pools | Airflow pools can be used to limit the execution parallelism on arbitrary sets of tasks. |
Connections | The information needed to connect to external systems is stored in the Airflow metastore database. A conn_id is defined there, with hostname / login / password / schema information attached to it. Airflow pipelines retrieve centrally managed connection information by specifying the relevant conn_id. |
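
The chain card above can be checked with a small model of the pairing rules. This is only a sketch in plain Python: it does not import Airflow, the task names are strings rather than operators, and `chain_edges` is a hypothetical helper written for this illustration, not part of Airflow. It assumes the behaviour the card describes: a single task fans out to a following list, two adjacent lists of equal length are paired element by element, and a list fans in to a following single task.

```python
def chain_edges(*items):
    """Return the (upstream, downstream) pairs implied by chaining the items.

    Hypothetical helper for illustration only; it models the dependency
    pairs described on the chain card and is not Airflow's own code.
    """
    edges = []
    # Walk over each adjacent pair of arguments.
    for up, down in zip(items, items[1:]):
        up_is_list = isinstance(up, list)
        down_is_list = isinstance(down, list)
        if up_is_list and down_is_list:
            # Two lists are paired element-wise, so they must match in length.
            if len(up) != len(down):
                raise ValueError("adjacent lists must have the same length")
            edges.extend(zip(up, down))
        elif up_is_list:
            # List followed by a single task: every list member points to it.
            edges.extend((u, down) for u in up)
        elif down_is_list:
            # Single task followed by a list: it points to every list member.
            edges.extend((up, d) for d in down)
        else:
            edges.append((up, down))
    return edges


edges = chain_edges("op1", ["op2", "op3"], ["op4", "op5"], "op6")
for up, down in edges:
    print(f"{up} >> {down}")
```

The printed pairs match the card: op1 feeds op2 and op3, op2 feeds op4, op3 feeds op5, and both op4 and op5 feed op6. In a real DAG the arguments would be operator instances, and the import location of chain depends on the Airflow version.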