-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
08 add map #9
base: dev
Are you sure you want to change the base?
08 add map #9
Conversation
improve and correct set up instructions and update output file extensions to html.
update local_data_dir to use relative file path from config base.yaml so config file can be updated by users rather than editing scripts.
change output graphs from produce_plots.py from .png to .html files to avoid altair CalledProcessError
- add argparser to allow user to specify local_data_dir, epc-, and mcs-batch - add function to load MCS data directly from S3 bucket - add function to check for EPC data locally and download from S3 if not located - update existing functions to work with above - add global parameters for EPC processing version to base.yaml config file
update instructions in README to explain new way of running script and to record batches for historical analyses
add script to generate stats and charts from asf_senedd_response into produce_plots.py
merge getters from loading.py in asf_senedd_response into get_data.py script
update EPC getter function name to match updates in get_data.py
add files: - translation_config for welsh translations - plotting.py for plotting functions - augmenting.py for data processing and augmentation
update to prevent module not subscriptable error
improve selection of min and max dates
allows user to specify which directory with supplementary data to use, meaning new supplementary data can be added to analysis as it's updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hey @crispy-wonton ,
thanks for this PR - the code for the map looks good. I didn't look to closely at the specific Kepler function calls, but overall looks like it's doing what is supposed to. Also haven't tested if the numbers make sense as in, if areas of higher percentage are correct.
When I open the HTML it shows San Francisco are. I had to zoom out and then look for wales. There my be a way of centering the map in wales. Something like:
wales_map = KeplerGl(...)
wales_map.config = {
'version': 'v1',
'config': {
'mapState': {
'latitude': xxxx, # Specific latitude value
'longitude':xxx, # Specific longitude value
'zoom': 10 # Specify the desired zoom level
}
}
}
I also gave a suggestion of how to re-define the ranges in the color scale, but haven't tested it so i'm not sure it works.
The code doesn't run as is, given the changes done in the other branch, so I had to change a few things around and comment parts of the code so that the map would be generated, but i managed to generate it after those changes. I'm happy to re-run the code to make sure if runs smoothly after you incorporate changes from main
(after merging the other branch).
- You can specify which batch of EPC data to download and MCS data to load from S3 by passing the `--epc_batch` and `--mcs_batch` arguments, both | ||
default to downloading/loading the newest data from S3, respectively. You can also specify which set of supplementary data should be used by passing | ||
the `--supp_data` argument followed by the name of the directory, e.g. data_202310. See the `Historical analyses` section below to see which version was used for each analysis. | ||
Run `python asf_welsh_energy_consultation/analysis/produce_plots.py -h` for more info. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will need to be changed after you merge the PR number 3, as it's now produce_plots_and_stats.py
October 2023 analysis: | ||
|
||
- Supplementary data: data_202310 | ||
- EPC: 2023_Q2_complete (preprocessed) | ||
- mcs_installations_231009.csv | ||
- mcs_installations_epc_full_231009.csv | ||
- off-gas-live-postcodes-2022.xlsx - check [here](https://www.xoserve.com/a-to-z/) for updates | ||
- rurality.ods - 2011 Rural Urban Classification for small area geographies, see [here](https://www.ons.gov.uk/methodology/geography/geographicalproducts/ruralurbanclassifications) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! :)
@@ -445,3 +445,19 @@ def load_wales_hp(wales_epc): | |||
wales_hp = wales_epc.loc[wales_epc.HP_INSTALLED].reset_index(drop=True) | |||
|
|||
return wales_hp | |||
|
|||
|
|||
def pc_to_coords_df(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing the dostrings :)
I've been finding it helpful to have the autodosctring extension in VSCode. It might also exist for your code editor.
@@ -0,0 +1,9 @@ | |||
def arial(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing docstring for this function. Why not use Averta?
@@ -7,6 +7,7 @@ matplotlib | |||
odfpy | |||
selenium==4.2.0 | |||
argparse==1.4.0 | |||
keplergl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be better to set the version?
@@ -277,6 +278,36 @@ def get_mcs_retrofits(): | |||
return mcs_retrofits | |||
|
|||
|
|||
def generate_hex_counts(wales_df, pc_df): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or something similar.. just changing names to make it more obvious what they represent. not required, just a suggestion :)
def generate_hex_counts(wales_df, pc_df): | |
def generate_hex_counts(wales_epc, location_info): |
wales_df_coords = pd.merge( | ||
wales_df, pc_df, on=["POSTCODE"] | ||
) # merge EPC with postcode df | ||
wales_df_hex = add_hex_id(wales_df_coords, 6) # add H3 hex id to each row |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does the 6 mean?
|
||
def plot_kepler_graph(base_data, filename): | ||
hex_map = KeplerGl(height=500) | ||
hex_map.add_data( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To define the ranges appearing in the map i think you can do something like this:
# Define the custom color scale
custom_color_scale = {
"name": "Custom", # You can use any name you prefer
"type": "sequential",
"domain": [0, 0.1, 0.5, 1], # Change this to whatever you want :)
"colors": ["#FF0000", "#FFFF00", "#00FF00", "#0000FF"] # Define colors for each range
}
# Add data with the custom color scale
hex_map.add_data(
data=base_data[["perc_true", "hex_id"]],
name="Heat pump proportions",
color_scale=custom_color_scale
)
def plot_kepler_graph(base_data, filename): | ||
hex_map = KeplerGl(height=500) | ||
hex_map.add_data( | ||
data=base_data[["perc_true", "hex_id"]], name="Heat pump proportions" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one way to go around the name that appears on the plot is to rename "perc_true" before plotting.
file_name=os.path.join(fig_output_path["english"], f"{filename}.html") | ||
) | ||
|
||
print("Saved: " + filename + ".html") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is saving to the english folder. is that what you want?
Something else I remembered is that we might want to be strict and only show percentages when there's enough data for a specific hexagon, e.g. only show data for hexagon if denominator (of %) is above X (where X needs to be pre-set by us). I think we have enough data for each hexagon, but maybe worth checking. Hopefully it's not too much trouble - just checking the values of "total" for each hexagon. |
Fixes #8
Changes:
Instructions for Reviewer
Checklist:
notebooks/
pre-commit
and addressed any issues not automatically fixeddev
README
s