The Airflow Scheduler constantly queries the metadata database. If the database is sluggish due to heavy XCom read/write operations, scheduling latencies skyrocket, causing tasks to stall in queued or scheduled states.
Tasks can also interact with XCom manually by calling xcom_push() and xcom_pull() on the task instance, which is available in the context as context['task_instance'].
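A minimal sketch of those explicit calls, assuming an Airflow 2.x-style context in which context["ti"] is an alias for context["task_instance"]. FakeTaskInstance, the task ids, and the "filename" key are hypothetical; they exist only so the snippet runs outside Airflow. Inside a real DAG, Airflow supplies the task instance.

```python
class FakeTaskInstance:
    """Hypothetical stand-in for Airflow's TaskInstance, for local runs only."""

    def __init__(self, task_id, store):
        self.task_id = task_id
        self._store = store  # shared dict playing the role of the XCom table

    def xcom_push(self, key, value):
        self._store[(self.task_id, key)] = value

    def xcom_pull(self, task_ids, key="return_value"):
        return self._store.get((task_ids, key))


def push_task(**context):
    # Explicit push under a named key instead of relying on the return value.
    context["ti"].xcom_push(key="filename", value="report.csv")


def pull_task(**context):
    # Pull the value written upstream, addressed by task id and key.
    return context["ti"].xcom_pull(task_ids="push_task", key="filename")


store = {}
push_task(ti=FakeTaskInstance("push_task", store))
result = pull_task(ti=FakeTaskInstance("pull_task", store))
```

Naming the key makes it possible for one task to publish several distinct values, which the single implicit return value cannot do.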
# Pushing XCom (implicitly via return)
def push_task(**context):
    return "some_value"
While functional, this approach invites three cardinal sins:
XCom allows tasks to exchange small amounts of data by storing them in the Airflow metadata database. An XCom is essentially a key-value pair associated with a specific task instance, DAG, and execution date. The key is the identifier for the data (e.g., filename).

Key Constraints of Standard XComs
Standard database columns limit payload size (e.g., standard BLOB/TEXT limits). Storing large DataFrames or massive JSON payloads directly in the database degrades orchestration performance.
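One common way to stay within these limits is to write the payload somewhere else and pass only a small reference through XCom. The sketch below assumes a local temporary directory as the external store and an arbitrary size threshold. MAX_XCOM_BYTES, to_xcom_value, and from_xcom_value are hypothetical names for illustration, not Airflow APIs or settings.

```python
import json
import os
import tempfile

MAX_XCOM_BYTES = 48 * 1024  # arbitrary threshold for illustration, not an Airflow setting


def to_xcom_value(payload, storage_dir):
    """Return the payload inline if small, else a reference to a file holding it."""
    encoded = json.dumps(payload).encode("utf-8")
    if len(encoded) <= MAX_XCOM_BYTES:
        return {"inline": payload}
    fd, path = tempfile.mkstemp(suffix=".json", dir=storage_dir)
    with os.fdopen(fd, "wb") as handle:
        handle.write(encoded)
    return {"path": path}


def from_xcom_value(value):
    """Resolve a value produced by to_xcom_value back into the original payload."""
    if "inline" in value:
        return value["inline"]
    with open(value["path"], "rb") as handle:
        return json.loads(handle.read().decode("utf-8"))


storage_dir = tempfile.mkdtemp()
small = to_xcom_value({"rows": 3}, storage_dir)
large = to_xcom_value({"rows": list(range(100_000))}, storage_dir)
```

A downstream task would pull the small dictionary and call from_xcom_value on it. With more than one worker machine, the file would need to live on storage that every worker can reach rather than on a local disk.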
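# Airflow 2.x import path
from airflow.operators.trigger_dagrun import TriggerDagRunOperator

trigger = TriggerDagRunOperator(
    task_id="trigger_child",
    trigger_dag_id="child_dag",
    # conf is a templated field, so the Jinja expression is rendered at runtime
    conf={"xcom_passthrough": "{{ ti.xcom_pull(task_ids='parent_task', key='authorized_key') }}"},
)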
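When we talk about Airflow XCom being "exclusive," we're referring to the fact that XCom lookups are scoped, by default, to the current DAG and DAG run. A task in one DAG does not see XCom values from another DAG unless it asks for them explicitly, for example by passing dag_id to xcom_pull() or by forwarding the value through conf when triggering the other DAG.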
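To make the scoping concrete, here is a toy in-memory model. It is not Airflow's actual implementation. It assumes each value is addressed by DAG id, run id, task id, and key, and that a lookup uses the caller's own DAG and run. ToyXComStore and every id in it are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyXComStore:
    """Toy model: values are addressed by (dag_id, run_id, task_id, key)."""

    rows: dict = field(default_factory=dict)

    def push(self, dag_id, run_id, task_id, key, value):
        self.rows[(dag_id, run_id, task_id, key)] = value

    def pull(self, caller_dag_id, caller_run_id, task_id, key="return_value"):
        # The lookup is scoped to the caller's own DAG and run.
        return self.rows.get((caller_dag_id, caller_run_id, task_id, key))


store = ToyXComStore()
store.push("parent_dag", "run_1", "parent_task", "authorized_key", "token-123")

same_dag = store.pull("parent_dag", "run_1", "parent_task", "authorized_key")
other_dag = store.pull("child_dag", "run_1", "parent_task", "authorized_key")
```

The same task id and key resolve to nothing when the caller is child_dag, because the DAG id is part of the address.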
Tasks can programmatically call ti.xcom_push() and ti.xcom_pull() via the execution context to precisely control keys, values, and target tasks.

2. Under the Hood: The Metadata Database Trap
Relational database columns have strict size limits (e.g., BLOB or TEXT limits). Attempting to push an asset larger than the database column capacity will hard-fail the task with serialization or database write errors.

3. The TaskFlow API: Modernizing XComs