Use streaming for raw data requests. #19
Merged
Description
This PR makes it so that, for requests for raw data, if the portal is configured to stream JSON text sequence data back (as it will be in XDMoD 11.0 after ubccr/xdmod#1792 and ubccr/xdmod#1858), the data will be properly iterated over and stored in the data frame.
Determining whether the portal supports streaming is accomplished by first making a request to the rest/warehouse/raw-data/limit endpoint. If the response status code is 404 (as it will be for 11.0 based on changes in ubccr/xdmod#1792), it runs the streaming algorithm. Otherwise, if the portal has the rest/warehouse/raw-data/limit endpoint (i.e., if it is running XDMoD 10.5), it runs the old algorithm of iteratively requesting 10,000 rows (or whatever the portal has as its configured limit).
Once XDMoD 10.5 is no longer supported, we can remove the old algorithm.
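The branching described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the names supports_streaming, iter_json_text_sequence, get_raw_rows, stream_lines, and fetch_page are hypothetical, and the sketch assumes each streamed record is one line holding a JSON value, optionally prefixed by the record separator byte (0x1E) used by JSON text sequences. The HTTP calls are passed in as callables so the logic can be exercised without a portal.

```python
import json
from typing import Callable, Iterable, Iterator, List

# Record separator that prefixes each record in a JSON text sequence.
RS = b"\x1e"


def supports_streaming(limit_status_code: int) -> bool:
    """Assumption: a 404 from the raw-data limit endpoint means the portal streams."""
    return limit_status_code == 404


def iter_json_text_sequence(lines: Iterable[bytes]) -> Iterator[object]:
    """Yield one parsed JSON value per non-empty line of the stream."""
    for line in lines:
        # Drop the record separator and surrounding ASCII whitespace.
        record = line.strip(RS + b" \t\r\n")
        if record:
            yield json.loads(record)


def get_raw_rows(
    limit_status_code: int,
    stream_lines: Callable[[], Iterable[bytes]],  # hypothetical: one streaming request
    fetch_page: Callable[[int], List[list]],      # hypothetical: one page from an offset
) -> List[object]:
    """Collect all rows using whichever algorithm the portal supports."""
    if supports_streaming(limit_status_code):
        return list(iter_json_text_sequence(stream_lines()))
    # Old algorithm: keep requesting pages until an empty page comes back.
    rows: List[object] = []
    offset = 0
    while True:
        page = fetch_page(offset)
        if not page:
            break
        rows.extend(page)
        offset += len(page)
    return rows
```

In a real client, stream_lines would likely wrap something like a streamed HTTP response's line iterator, and fetch_page would issue one request per offset; both are left abstract here because the exact request parameters are not given in this description.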
Motivation and Context
ubccr/xdmod#1792 improves the performance of requests for raw data in the Jobs realm.
Tests performed
In addition to running the automated tests on the existing XDMoD portal, which is running 10.5, I also edited the automated tests to point at my port on xdmod-dev with the changes from ubccr/xdmod#1792, and ran those. They passed except for the test_get_raw_data regression test; on closer inspection, that failure was due to the rows of the data frame being in a different order, which is acceptable.
Types of changes
Checklist:
CHANGELOG.md has been updated
[...] (docs/developing.md) produces no errors
[...] xdmod-notebooks repository as necessary, and the notebooks all run successfully