Pull Data from CKAN API #6

Open
curran opened this issue Aug 23, 2017 · 3 comments


@curran
Contributor

curran commented Aug 23, 2017

Ideally we'd pull the data we need when we need it, so we don't need to download a 2MB file on page load.

@curran
Contributor Author

curran commented Aug 24, 2017

This requires some thought as to the granularity of the data requests.

The source streamgraph and destination streamgraph need different aggregations of the data, after filtering by selected population types. Does this mean that we'll need to make two separate API calls each time we change the population type filter selection?
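
To make the question concrete: the two aggregations apply the same population-type filter and differ only in the grouping key. Here is a minimal sketch, assuming the data arrives as flat rows. The field names (`origin`, `dest`, `type`, `total`) and the sample values are made up for illustration and are not taken from the project's data.

```python
from collections import defaultdict

# Made-up sample rows; field names and values are assumptions for illustration.
ROWS = [
    {"year": 2015, "origin": "SYR", "dest": "TUR", "type": 1, "total": 100},
    {"year": 2015, "origin": "SYR", "dest": "DEU", "type": 1, "total": 40},
    {"year": 2015, "origin": "AFG", "dest": "DEU", "type": 2, "total": 25},
    {"year": 2016, "origin": "AFG", "dest": "DEU", "type": 1, "total": 10},
]


def aggregate(rows, selected_types, key):
    """Sum totals per (year, row[key]) after filtering by population type."""
    sums = defaultdict(int)
    for row in rows:
        if row["type"] in selected_types:
            sums[(row["year"], row[key])] += row["total"]
    return dict(sums)


# Same filter, two grouping keys: one per streamgraph.
by_origin = aggregate(ROWS, {1}, "origin")
by_dest = aggregate(ROWS, {1}, "dest")
```

If the grouping happens on the server, this would be two requests per filter change. If the filtered rows are fetched once and grouped on the client, it would be one request with a larger payload.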

@matthewsmawfield
Contributor

matthewsmawfield commented Aug 24, 2017

Yes, ideally we should pull data from CKAN. I imagine that for the source/destination data, the table schema would look like:

year, country_dest_iso, country_origin_iso, population_type_id, total

The data is then flatter, so it can be queried easily, and grouped and summed through the SQL parameter of the CKAN API call. It would be interesting to compare the size of the flat file, using IDs and ISO codes, with the 2MB processed data.json file. Most importantly, though, from this stage on data.json should be as close as possible to the structure of the future API JSON response. Currently the data.json structure is nested, and it would not be possible to return JSON shaped like data.json from a CKAN endpoint.
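
As a rough sketch of what such a call might look like, the snippet below builds a URL for CKAN's `datastore_search_sql` action with a `GROUP BY` over the flat schema above. The base URL and resource ID are placeholders, the helper names are hypothetical, and integer `population_type_id` values are an assumption. It only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode, urlparse, parse_qs

CKAN_BASE = "https://ckan.example.org"  # placeholder, not the real instance
RESOURCE_ID = "00000000-0000-0000-0000-000000000000"  # placeholder resource ID

GROUP_COLUMNS = ("country_origin_iso", "country_dest_iso")


def build_sql(group_col, population_type_ids):
    """Build a grouped-sum query for one streamgraph (hypothetical helper)."""
    if group_col not in GROUP_COLUMNS:
        raise ValueError(f"unexpected group column: {group_col}")
    if not population_type_ids:
        raise ValueError("at least one population type is required")
    # int() guards against injecting arbitrary text into the SQL string.
    ids = ", ".join(str(int(i)) for i in sorted(population_type_ids))
    return (
        f"SELECT year, {group_col}, SUM(total) AS total "
        f'FROM "{RESOURCE_ID}" '
        f"WHERE population_type_id IN ({ids}) "
        f"GROUP BY year, {group_col} ORDER BY year"
    )


def build_url(sql):
    """Wrap the SQL in a datastore_search_sql request URL."""
    return f"{CKAN_BASE}/api/3/action/datastore_search_sql?" + urlencode({"sql": sql})


dest_url = build_url(build_sql("country_dest_iso", [2, 1]))
origin_url = build_url(build_sql("country_origin_iso", [2, 1]))
```

The two URLs differ only in the grouping column, so each streamgraph would get rows already aggregated to the level it needs.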

@curran
Contributor Author

curran commented Aug 24, 2017

Yes, that schema looks accurate.

I think what's important is to have a place in the code that expects the JSON structure that will be returned from the API. Meaning, there can be an intermediate "unpacking" step that only applies to the packed JSON data, and returns a data structure that will match with the API response. But I think I get what you mean, that there needs to be a place where we can cleanly swap out the data source later. I'll work with that in mind.
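
A minimal sketch of that swap point follows. The nesting shown for the packed data (year, then destination, then origin, then population type) is purely an assumption for illustration, since the actual data.json layout is not described in this thread, and the function names are hypothetical. The `result.records` path follows the usual response envelope of CKAN's datastore actions.

```python
def rows_from_packed(packed):
    """Unpack an ASSUMED nesting year -> dest -> origin -> type -> total
    into flat rows matching the proposed schema."""
    rows = []
    for year, by_dest in packed.items():
        for dest, by_origin in by_dest.items():
            for origin, by_type in by_origin.items():
                for type_id, total in by_type.items():
                    rows.append({
                        "year": int(year),  # JSON object keys are strings
                        "country_dest_iso": dest,
                        "country_origin_iso": origin,
                        "population_type_id": int(type_id),
                        "total": total,
                    })
    return rows


def rows_from_api(response):
    """Extract flat rows from a CKAN datastore response envelope."""
    return response["result"]["records"]


# Downstream code only ever sees flat rows, so the source can be swapped
# by changing which of the two functions produces them.
packed_example = {"2015": {"TUR": {"SYR": {"1": 100}}}}  # made-up value
rows = rows_from_packed(packed_example)
```

Everything after this step can be written against the flat row shape, so the unpacking function is the only piece to discard once the API is available.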

@curran curran mentioned this issue Oct 17, 2017
18 tasks