Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

22.0.0 regression: We need a better default treatment of SCRIPT_NAME #3192

Merged
merged 7 commits into from
Aug 9, 2024

Conversation

pajod
Copy link
Contributor

@pajod pajod commented Apr 24, 2024

As pointed out in #2650, I have introduced a regression in the now released 22.0 version. What commonly worked in the default installation without additional configuration broke: Passing dynamic path information from a front-end when the application lives in a subfolder.

  • ✅ Deployments where the application does not require additional path information from the front-end proxy keep working
  • ✅ Deployments where the SCRIPT_NAME is statically configured keep working
    • e.g. in the application, in its environment, or in the environment of the gunicorn runtime (including in a systemd unit file)
  • ❌ Deployments that depended on the ability of arbitrary requests containing a SCRIPT_NAME (underscore, not dash) header to transmit the path where the application lives are broken after updating to gunicorn 22.0
    • and can at this time only be reverted to the previous behaviour by resorting to the --header-map dangerous escape hatch. Which is bad, because when intending to only allow one particular header from one particular allowed forwarder, this options re-introduces the security concern by permitting all headers from all sources.

Affected Nginx config might look like this:

location /foo {
    proxy_redirect off; proxy_http_version 1.1;
    proxy_set_header SCRIPT_NAME /foo;
    proxy_pass http://unix:/run/www-data/gunicorn_foo.sock;
}
location /bar {
    proxy_redirect off; proxy_http_version 1.1;
    proxy_set_header SCRIPT_NAME /bar;
    proxy_pass http://unix:/run/www-data/gunicorn_bar.sock;
}

This is not necessarily unsafe. When leaving the defaults of underscores_in_headers off; ignore_invalid_headers on; untouched, this header name would not be automatically forwarded, so it has come from Nginx.

Caution

SCRIPT_NAME is not the only header that can lead to dangerous results when an application gets confused about which headers were inserted by an authorized proxy. But is the only one gunicorn absolutely must take care of, being the one gunicorn picks up from headers and places into environ.

I propose we do:

  • Push out a patch release before the affected version even propagates everywhere and people are left with no option other than applying a less specific workaround.
  • By default extend "localhost & unix peers are authorized" treatment of secure scheme headers also to now-banned field names
    • New list-type option, similar to secure scheme headers. There may be other cases where (especially nginx users) want to in-band transmit map{}-ed data to the application.
    • The fact that non-IP peers simple ignore the allowed IP list probably deserves documentation. For code readability, it should also be flipped in the code form "ignore anything that is not a tuple" to "ignore for UNIX sockets, specifically, and only then"
  • check the syntax of the allowed-ip-addrs setting to make that option easier to figure out
  • modify the default of allowed-ip-addrs setting to permit the (singular) IPv6 loopback address as well
  • Documentation adding examples, explanation, and cross-links to popular frameworks on how SCRIPT_NAME works

Related issues

@pajod pajod changed the title A better default for treating SCRIPT_NAME 22.0.0 regression: We need a better default treatment of SCRIPT_NAME Apr 24, 2024
@benoitc benoitc self-assigned this Apr 24, 2024
@pajod pajod marked this pull request as ready for review April 25, 2024 11:04
@jhominal
Copy link

jhominal commented May 5, 2024

How will forwarded-allow-ips work when listening on a UNIX socket?

@pajod
Copy link
Contributor Author

pajod commented May 5, 2024

How will forwarded-allow-ips work when listening on a UNIX socket?

Same as for the secure scheme headers - all connections that are not tuple (ip, port) are trusted unconditionally.

https://github.com/benoitc/gunicorn/pull/3192/files#diff-7e884e24484c7ac02a634681dfd68ba4d1830cdf0204768b2fabbf84dd0adcd3R86

(Yes, that far from obvious line should be flipped from not isinstance(addr, tuple) to sock.family == AF_UNIX and properly documented.)

@pajod pajod force-pushed the patch-allowed-script-name branch from 3c7bb99 to 5bbf373 Compare August 7, 2024 18:17
@pajod
Copy link
Contributor Author

pajod commented Aug 7, 2024

Rebased version contains a doc syntax fix missed in #3217 (rst lists use - dash not * star)

Question to reviewers: What about PATH_INFO, should it also be in the defaults? I think so.

@benoitc
Copy link
Owner

benoitc commented Aug 8, 2024

@pajod it should be made available by default yes.

* PATH_NAME is used like SCRIPT_NAME: include both
* replicate changed forwarded-allow-ips default to proxy_allow_ips
@benoitc
Copy link
Owner

benoitc commented Aug 8, 2024

sounds good for me. Do you plane any other changes?

@pajod
Copy link
Contributor Author

pajod commented Aug 8, 2024

Do you plane any other changes?

In this PR? No.

Before any release is tagged that includes this change? Very likely.
The only makeshift-release-candidate I tested beyond running unit tests deviates quite a bit from what master is at now (still missing gthread fix or rework, and I imagine #3127 won't make it into a release, you mentioned aiming for a week somewhere else).

--
Afterthought: Documenting that PATH_INFO can be overridden from a header.. only makes sense once we actually permit passing that to the application. And doing that in turn only makes sense one we explain how to do it correctly.. despite not knowing a single example where a difference between "the rest of the path, as split by us" and "the rest of the path, as split by nginx" is significant yet still correct.

@benoitc
Copy link
Owner

benoitc commented Aug 9, 2024

Thanks for your answer :) We will launch a 23.1 asap this month with more feature. Let's focus on making a release bringing gunicorn on a par with latest HTTP 1.1 updates for 3.0

PATH_INFO and SCRIPT_NAME are required the wsgi spec : https://peps.python.org/pep-3333/#environ-variables .

I think it's unfortunate that people don't understand this pooint when they talk about security. As for gunicorn I think that we should provide 2 modes:

  1. is running behind a proxy/HTTP server . In this case some secuirtiy could be relaxed and CGI variables shuoudl be passed and secured from this server . This is the advised way until now
  2. gunicorn acting itself as main server. In such case we should ensure only Gunicorn and system env can set this variables .

This let me think that a simple "--mode" variable set as "server" or "gateway" could enforce some rules and options. Ie no CGI-VARIABLES should be accepted when it's come from HTTP in server mode. These variables can only be set using OS env or configuration in such case. In gateway mode we should probably accept them transparently and let the security benig handled by the server on top.

I will merge this change fr now but let's revisit this asap following the definition above in a searate ticket.

@benoitc benoitc merged commit 3f56d76 into benoitc:master Aug 9, 2024
24 checks passed
@benoitc
Copy link
Owner

benoitc commented Aug 9, 2024

merged the patch. Thanks :) Discussion can continue there #3262

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SCRIPT_NAME header is dropped/rejected by header map validation
3 participants