Replies: 5 comments 4 replies
-
Thanks for opening your first issue here! Be sure to follow the issue template!
-
Please upgrade to the latest 2.3 and repeat your investigations. There have been a LOT of optimisations since 2.1 that speed up the home page immensely, not to mention that we now have a new Grid View instead of the Tree View.
-
You are simply looking at quite an outdated version.
-
I repeated the investigations using AirFlow An example again the relevant SQL code:
The issue is with the The irony is avoiding the |
-
I know this post is old, but I wasn't getting anywhere with the same symptoms for the past few weeks. We have local development VMs that were working fine with the homepage UI load, but we ran into issues when migrating to our more secure AWS solution. The setting that reaches out to airflow to pass data is what was causing the problem. Runs, Last Run, and Recent Tasks all spun for minutes before finally showing results. Specifically, I changed the variable AIRFLOW__USAGE_DATA_COLLECTION__ENABLED from True to False. Those fields populate almost instantly now.
-
Apache Airflow version
2.1.2
What happened
When loading the AirFlow v2 home page, subsequent requests include GET /last_dagruns, GET /dag_stats and GET /tasks_stats, which in turn query the database (PostgreSQL in this case). These DB queries (most likely generated by an ORM) have the form:
They have one issue: unless the PostgreSQL query optimizer figures out that those strings are actually PKs in some table, the IN operation is carried out by doing a linear search. When the list of DAGs and the number of rows in the tables are high (e.g. 300 active DAGs and 10M rows), this results in response times on the order of minutes, not seconds.
This situation has unintended consequences:
If you issue a GET /health request you are likely to get a timeout, hence triggering alarms.
What you think should happen instead
I've made the queries run faster by creating a temporary table storing the values in the IN operator and then JOINing. That is:
Then the implicit indexes take care of it all. But that, of course, is a lot to ask of an ORM.
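To make the idea concrete, here is a minimal sketch of the two query shapes being compared: a long IN list versus a temporary table that is then JOINed. It is not the SQL Airflow actually generates. The dag_run table layout, the wanted_dags temp table, and the helper function names are all hypothetical, and it uses Python's built-in sqlite3 module rather than PostgreSQL so that it can run anywhere. It only shows that both forms return the same rows; it says nothing about PostgreSQL's planner or timings.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory stand-in for a dag_run table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dag_run (dag_id TEXT, execution_date TEXT, state TEXT)"
    )
    conn.execute("CREATE INDEX idx_dag_run_dag_id ON dag_run (dag_id)")
    rows = [
        (f"dag_{i}", f"2021-08-{day:02d}", "success")
        for i in range(5)
        for day in range(1, 4)
    ]
    conn.executemany("INSERT INTO dag_run VALUES (?, ?, ?)", rows)
    return conn


def last_runs_with_in(conn, dag_ids):
    """Filter with a literal IN list, one placeholder per DAG id."""
    placeholders = ", ".join("?" for _ in dag_ids)
    sql = (
        "SELECT dag_id, MAX(execution_date) FROM dag_run "
        f"WHERE dag_id IN ({placeholders}) "
        "GROUP BY dag_id ORDER BY dag_id"
    )
    return conn.execute(sql, list(dag_ids)).fetchall()


def last_runs_with_temp_join(conn, dag_ids):
    """Load the ids into a temp table with a primary key, then JOIN on it."""
    conn.execute("CREATE TEMP TABLE wanted_dags (dag_id TEXT PRIMARY KEY)")
    try:
        # set() drops duplicate ids so the primary key is not violated.
        conn.executemany(
            "INSERT INTO wanted_dags (dag_id) VALUES (?)",
            [(dag_id,) for dag_id in set(dag_ids)],
        )
        sql = (
            "SELECT r.dag_id, MAX(r.execution_date) FROM dag_run AS r "
            "JOIN wanted_dags AS w ON w.dag_id = r.dag_id "
            "GROUP BY r.dag_id ORDER BY r.dag_id"
        )
        return conn.execute(sql).fetchall()
    finally:
        conn.execute("DROP TABLE wanted_dags")


if __name__ == "__main__":
    demo = build_demo_db()
    ids = ["dag_1", "dag_3"]
    print(last_runs_with_in(demo, ids))
    print(last_runs_with_temp_join(demo, ids))
```

Whether the JOIN form is actually faster on a given PostgreSQL installation is best checked with EXPLAIN ANALYZE against the real tables.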
How to reproduce
You'll need a setup with a few hundred DAGs. If you run them often, the DB will pretty soon contain tables with millions of entries, which makes the issue appear.
Operating System
Linux
Versions of Apache Airflow Providers
No response
Deployment
Other Docker-based deployment
Deployment details
Docker images whose operations are managed under AWS ECS and Fargate.
Anything else
N/A
Are you willing to submit PR?
Code of Conduct