metarefresh fails when source URL responds without HTTP version and response code #42
Comments
Thanks for reporting. Indeed this is unnecessarily unwieldy and therefore not robust. I'm thinking of a different route to resolve this. I think the module should not concern itself with details like parsing out HTTP version codes. It already outsources part of this to SSP's HTTP utility class, but that class still does similar things. There are many excellent libs that can perform an HTTP request, handle all eventualities and return something sane. Something like curl is highly competent at this. I think everyone is much better off if we replace the custom fiddling with a dependency on PHP's curl module. This is not an unworldly dependency for something like SSP to have, in my opinion, and buys us the shared code used by millions of projects. A project like SSP should focus on being good at the SAML part; fetching something over HTTP has been solved by others. Nonetheless, as for your options:
I do note that I just get an HTTP version response back from the URL you give:
I also see an HTTP version, so that's not the problem. I suspect @aaronhark is for some reason receiving an HTTP/2 response which doesn't pass the regexp. I think, given our efforts to move towards PSR-7 in the next major release, it also makes sense to offload this to a proper PSR-18 compatible HTTP client lib like Guzzle instead of using curl directly. Or, maybe even better, the HTTP client from Symfony.
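To make the HTTP/2 suspicion concrete: a pattern written only for `HTTP/1.x` status lines would reject a line such as `HTTP/2 200`, which has a single-digit version and no reason phrase. The sketch below shows one way a status line could be parsed so that both shapes are accepted. `parseStatusCode` is a hypothetical helper, not something taken from the module or from SimpleSAMLphp, and whether this is what actually trips the module is not confirmed here.

```php
<?php

/**
 * Extract the numeric status code from an HTTP status line.
 * Hypothetical helper for illustration; not part of SimpleSAMLphp or metarefresh.
 *
 * Accepts "HTTP/1.1 200 OK", "HTTP/2 200" and "HTTP/2.0 304 Not Modified".
 * Returns null when the line is missing or does not look like a status line.
 */
function parseStatusCode(?string $statusLine): ?int
{
    if ($statusLine === null) {
        return null;
    }

    // Version: one digit with an optional ".digit". The reason phrase is optional.
    if (preg_match('@^HTTP/\d(?:\.\d)?\s+(\d{3})(?:\s|$)@', $statusLine, $m) === 1) {
        return (int) $m[1];
    }

    return null;
}
```

A caller can then compare the returned integer against 200 or 304 and treat `null` as "no usable status line" instead of indexing into a value that may not exist.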
In any case, remove code where SimpleSAMLphp itself is parsing HTTP responses with regexes...
@thijskh Just for my understanding, is it not doing the same load of cached metadata in each part of this if-statement, regardless of what HTTP code is returned?
I agree completely with the idea of not fiddling at the HTTP response level directly in this module. In fact, that's actually how we solved this issue: we moved all of our fetching of metadata to a separate cURL-based script. I was just offering these code modifications in case the desire was to keep the approach as it currently exists.
Describe the bug
The metarefresh module can fetch metadata from a list of provided URLs and compile it into the SimpleSAMLphp flatfile format. If one of those URLs responds but does not include a header with its HTTP version and response code (e.g. HTTP/1.1 200 OK), the entire metarefresh process fails due to an uncaught exception. The exception occurs because the logic in MetaLoader.php that handles the response does not account for the possibility of a null value for this header.
Specifics of Environment
Using SimpleSAMLphp 2.3.5 as an SP on Linux with PHP 8.3.10
To Reproduce
Configure module_metarefresh.php to use these three example URL-based sources and output as flatfile:
Expected Behavior
Fetch the XML metadata from each specified URL, and compile it into the SimpleSAMLphp flatfile format.
Actual Behavior
Because the web server of the second link (https://amidp.drew.edu/nidp/saml2/metadata) returns the XML without an HTTP version and response code, nothing gets loaded into the [0] key of the response array and an exception is subsequently thrown. We can debate whether the web server owner ought to fix its response behavior, but the fact of the matter is there's nothing wrong with the actual XML metadata it serves up. The only reason I know it's not returning an HTTP version and response code is from debugging this issue.
Solution
There are several ways to resolve this; I'm just not sure what the preferred approach would be for the maintainers of this project. Therefore, I'm presenting two potential solutions inline rather than doing a PR. If one of these is preferred, I'm happy to fork and submit a PR. The most straightforward place to effect this change is at lines 118-134 of src/MetaLoader.php.
Option 1: If for some reason it is critical to this module that the HTTP version and response code header be present, more gracefully handle the lack of one by modifying the code to read as follows. With this option, the reproduction steps I provided above would result in the metadata being loaded successfully from URLs 1 and 3, and URL 2 throwing a warning to the logs.
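The snippet that accompanied Option 1 is not preserved in this copy of the thread. As a rough illustration of the idea only, the sketch below classifies a response from its header list and falls back to "skip with a warning" when index 0 is missing. The function name `classifyResponse`, its return values, and the use of `error_log` are assumptions made for this example; the actual code in src/MetaLoader.php may be structured quite differently.

```php
<?php

/**
 * Classify a response from its raw header lines, tolerating a missing status line.
 * Hypothetical helper for illustration; the real logic in src/MetaLoader.php may differ.
 *
 * @param array<int, string>|null $responseHeaders Raw header lines; index 0 is
 *        assumed to hold the status line when the server sent one.
 * @return string One of 'ok', 'not-modified' or 'skip'.
 */
function classifyResponse(?array $responseHeaders, string $url): string
{
    // "??" avoids a warning when the array is null or has no element 0.
    $statusLine = $responseHeaders[0] ?? null;

    if (!is_string($statusLine)) {
        error_log("metarefresh sketch: no HTTP status line from $url, skipping source");
        return 'skip';
    }
    if (preg_match('@^HTTP/\S+\s+304(?:\s|$)@', $statusLine) === 1) {
        return 'not-modified'; // caller would reuse its cached copy
    }
    if (preg_match('@^HTTP/\S+\s+200(?:\s|$)@', $statusLine) === 1) {
        return 'ok'; // caller would parse the fetched metadata
    }

    error_log("metarefresh sketch: unexpected status '$statusLine' from $url, skipping source");
    return 'skip';
}
```

With handling along these lines, one misbehaving source produces a log entry and is skipped, while the remaining sources are still processed.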
Option 2: Rather than testing for the HTTP version and response code -- which is irrelevant to the validity of the content served up -- it seems it might make more sense to look at the content-type header to ensure we actually received text/xml. With this option, the reproduction steps I provided above would result in the metadata being loaded successfully from all three URLs.
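The code for Option 2 is likewise missing from this copy. A minimal sketch of a content-type check is shown below, assuming the headers are available as a list of raw lines. `hasXmlContentType` is a hypothetical name, and the list of accepted media types is an assumption for illustration: the proposal above only mentions text/xml.

```php
<?php

/**
 * Decide whether a list of raw header lines declares an XML body.
 * Hypothetical helper for illustration; not part of the metarefresh module.
 *
 * @param array<int, string> $responseHeaders Lines such as
 *        "Content-Type: text/xml; charset=utf-8".
 */
function hasXmlContentType(array $responseHeaders): bool
{
    // Assumption: these media types are acceptable for SAML metadata.
    $accepted = ['text/xml', 'application/xml', 'application/samlmetadata+xml'];

    $mediaType = null;
    foreach ($responseHeaders as $line) {
        // Header names are case-insensitive; parameters such as charset follow a ";".
        if (preg_match('@^content-type:\s*([^;\s]+)@i', $line, $m) === 1) {
            // Keep the last match, since followed redirects may contribute
            // earlier header blocks to the same list.
            $mediaType = strtolower($m[1]);
        }
    }

    return $mediaType !== null && in_array($mediaType, $accepted, true);
}
```

A response with no Content-Type header at all is rejected by this sketch, so a source that omits both the status line and the content type would still need separate handling.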