Oct hqta #1274

Merged 4 commits on Oct 30, 2024
3 changes: 2 additions & 1 deletion open_data/Makefile
@@ -12,7 +12,8 @@ compile_open_data_portal:
python update_fields_fgdc.py # populate fields with data dictionary yml values, run if update_data_dict had changes to incorporate
# Download the zipped shapefiles and metadata.yml and move to local ESRI directory
#python arcgis_script_pro.py #(in ESRI!)
python metadata_update_pro.py # go back into ESRI and update xml
# Bring the ESRI-rewritten XML files into the Hub, drop them into xml/, and allow overwrite(s)
python metadata_update_pro.py # (in Hub)
# Download the overwritten XML files in xml/run_in_esri/ and move to local ESRI directory.
#python arcgis_script_pro.py #(in ESRI!)
python cleanup.py # run after ESRI work done
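
As context for the comment added above, the Hub-side move amounts to copying the ESRI-rewritten XML files into `open_data/xml/`, overwriting what is there, before `python metadata_update_pro.py` runs. A minimal sketch of that copy step, assuming a hypothetical `esri_rewritten_xml/` upload folder (this is an illustration, not a script in the repo):

```python
# Illustrative sketch only -- not part of the repo.
# esri_rewritten_xml/ is a hypothetical folder where the ESRI-exported XML was uploaded.
import shutil
from pathlib import Path

incoming_dir = Path("esri_rewritten_xml")
xml_dir = Path("open_data/xml")

xml_dir.mkdir(parents=True, exist_ok=True)
for xml_file in incoming_dir.glob("*.xml"):
    # copy2 preserves timestamps and overwrites any existing file of the same name
    shutil.copy2(xml_file, xml_dir / xml_file.name)
```
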
15 changes: 11 additions & 4 deletions open_data/README.md
@@ -27,25 +27,32 @@ Traffic Ops had a request for all transit routes and transit stops to be publish
* Download the zipped shapefiles from the Hub to your local filesystem.
1. If there are new datasets to add or changes to make, make them in `metadata.yml` and/or `data_dictionary.yml`.
* If there are changes to make in `metadata.yml`, make them. Afterwards, in terminal, run: `python supplement_meta.py`
1. If there are changes to be made to `metadata.yml` (adding new datasets, changing descriptions, changing contact information, etc.), make them. This is infrequent. An updated analysis date is already automated and does not have to be updated here.
1. In terminal: `python supplement_meta.py`
1. In terminal: `python update_data_dict.py`.
* Check the log results, which tell you whether any columns are missing from `data_dictionary.yml`. These columns and their descriptions need to be added. Every column in the ESRI layer must have a definition, and where there's an external data dictionary website to cite, provide a definition source.
1. In terminal: `python update_fields_fgdc.py`. This populates fields with `data_dictionary.yml` values.
* Only run if `update_data_dict` had changes to incorporate
1. Run [arcgis_pro_script](./arcgis_pro_script.py) to create XML files.
* Open a notebook in Hub and find the `ARCGIS_PATH`
* Hardcode that path for `arcpy.env.workspace = ARCGIS_PATH` (see the sketch after this list).
* The exported XML metadata will be in the file gdb directory.
* Upload the XML metadata into Hub in `open_data/xml/`.
1. If there are new datasets added, open `open_data.py` and modify the script.
1. In terminal: `python open_data.py`.
1. If there are new datasets added, open `update_vars.py` and modify the script.
1. In terminal: `python metadata_update_pro.py`.
* Change into the `open_data` directory: `cd open_data/`.
* The overwritten XML is stored in `open_data/metadata_xml/run_in_esri/`.
* The overwritten XML is stored in `open_data/xml/run_in_esri/`.
* Download the overwritten XML files locally to run in ArcGIS.
1. Run [arcgis_pro_script](./arcgis_pro_script.py) after importing the updated XML metadata for each feature class.
* There are steps to create FGDC templates for each dataset to store field information.
* This only needs to be done once when a new dataset is created.
1. In terminal: `python cleanup.py` to clean up old XML files and remove zipped shapefiles.
* The YAML and XML files that were created or changed get checked into GitHub.
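
A minimal sketch of the ArcGIS Pro step referenced in the list above (hardcoding `ARCGIS_PATH` for `arcpy.env.workspace`), assuming `arcpy` is available inside ArcGIS Pro; the path is a placeholder and the export loop only illustrates what `arcgis_pro_script.py` covers in full:

```python
# Run inside ArcGIS Pro. Illustrative sketch only -- not the repo's arcgis_pro_script.py.
# ARCGIS_PATH is a placeholder; copy the real value from a Hub notebook.
import os
import arcpy

ARCGIS_PATH = r"C:\path\to\staging\file_geodatabase.gdb"
arcpy.env.workspace = ARCGIS_PATH

# Export each feature class's metadata as XML next to the file gdb,
# then upload those XML files into open_data/xml/ in the Hub.
for fc in arcpy.ListFeatureClasses():
    md = arcpy.metadata.Metadata(os.path.join(ARCGIS_PATH, fc))
    md.saveAsXML(os.path.join(os.path.dirname(ARCGIS_PATH), f"{fc}.xml"))
```
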

### Metadata
* [Metadata](./metadata.yml)
* [Data dictionary](./data_dictionary.yml)
* [update_vars](./update_vars.py) and [publish_utils](./publish_utils.py) contain a lot of the variables that would frequently get updated in the publishing process.
* [update_vars](./update_vars.py) contains a lot of the variables that would frequently get updated in the publishing process.
* Apply standardized column names across published datasets, even if they differ from internal keys (`org_id` in favor of `gtfs_dataset_key`, `agency` in favor of `organization_name`).
* Since we do not save multiple versions of published datasets, the columns are renamed prior to exporting the geoparquet as a zipped shapefile.

8 changes: 4 additions & 4 deletions open_data/update_vars.py
@@ -21,8 +21,8 @@
RUN_ME = [
"ca_hq_transit_areas",
"ca_hq_transit_stops",
# "ca_transit_routes",
# "ca_transit_stops",
# "speeds_by_stop_segments",
# "speeds_by_route_time_of_day",
"ca_transit_routes",
"ca_transit_stops",
"speeds_by_stop_segments",
"speeds_by_route_time_of_day",
]
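
With all six datasets uncommented, anything that iterates over `RUN_ME` now publishes the full set. A rough illustration of that pattern (the `process_dataset` function is a hypothetical stand-in, not the repo's actual publishing code):

```python
# Hypothetical illustration of consuming RUN_ME; process_dataset is a stand-in,
# not a function from this repo.
from update_vars import RUN_ME

def process_dataset(name: str) -> None:
    print(f"publishing {name}")

for dataset_name in RUN_ME:
    process_dataset(dataset_name)
```
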
4 changes: 2 additions & 2 deletions open_data/xml/ca_hq_transit_areas.xml
@@ -20,7 +20,7 @@
</ns0:hierarchyLevelName>
<ns0:contact ns1:nilReason="missing"></ns0:contact>
<ns0:dateStamp>
<ns1:Date>2024-10-08</ns1:Date>
<ns1:Date>2024-10-30</ns1:Date>
</ns0:dateStamp>
<ns0:metadataStandardName>
<ns1:CharacterString>ISO 19139 Geographic Information - Metadata - Implementation Specification</ns1:CharacterString>
@@ -85,7 +85,7 @@
<ns0:date>
<ns0:CI_Date>
<ns0:date>
<ns1:Date>2024-09-18</ns1:Date>
<ns1:Date>2024-10-16</ns1:Date>
</ns0:date>
<ns0:dateType>
<ns0:CI_DateTypeCode codeList="http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_DateTypeCode" codeListValue="revision" codeSpace="ISOTC211/19115">
4 changes: 2 additions & 2 deletions open_data/xml/ca_hq_transit_stops.xml
@@ -20,7 +20,7 @@
</ns0:hierarchyLevelName>
<ns0:contact ns1:nilReason="missing"></ns0:contact>
<ns0:dateStamp>
<ns1:Date>2024-10-08</ns1:Date>
<ns1:Date>2024-10-30</ns1:Date>
</ns0:dateStamp>
<ns0:metadataStandardName>
<ns1:CharacterString>ISO 19139 Geographic Information - Metadata - Implementation Specification</ns1:CharacterString>
@@ -85,7 +85,7 @@
<ns0:date>
<ns0:CI_Date>
<ns0:date>
<ns1:Date>2024-09-18</ns1:Date>
<ns1:Date>2024-10-16</ns1:Date>
</ns0:date>
<ns0:dateType>
<ns0:CI_DateTypeCode codeList="http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_DateTypeCode" codeListValue="revision" codeSpace="ISOTC211/19115">
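
Both XML diffs are the same kind of change: the ISO 19139 `dateStamp` and the revision `CI_Date` are bumped when the metadata is regenerated. A minimal sketch of how such a date refresh could be done with the standard library, assuming the usual `gmd`/`gco` namespace URIs behind the `ns0`/`ns1` prefixes (this illustrates the idea only; it is not the repo's `metadata_update_pro.py`):

```python
# Illustrative sketch only -- not metadata_update_pro.py.
# The gmd/gco namespace URIs are assumed to be the standard ISO 19139 ones.
import xml.etree.ElementTree as ET
from datetime import date

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

def bump_date_stamp(xml_path: str, new_date: str | None = None) -> None:
    new_date = new_date or date.today().isoformat()
    tree = ET.parse(xml_path)
    root = tree.getroot()
    # Overwrite the metadata-level dateStamp's gco:Date value.
    date_elem = root.find("gmd:dateStamp/gco:Date", NS)
    if date_elem is not None:
        date_elem.text = new_date
    tree.write(xml_path, xml_declaration=True, encoding="UTF-8")

bump_date_stamp("open_data/xml/ca_hq_transit_areas.xml", "2024-10-30")
```
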