Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plymouth analysis #118

Draft
wants to merge 2 commits into
base: dev
Choose a base branch
from
Draft

Plymouth analysis #118

wants to merge 2 commits into from

Conversation

lizgzil
Copy link
Collaborator

@lizgzil lizgzil commented Jan 28, 2025


  • no weighting applied for features

Description

Please include a summary of the changes.

Fixes # (issue)

Instructions for Reviewer

In order to test the code in this PR you need to ...

Please pay special attention to ...

Checklist:

  • I have refactored my code out from notebooks/
  • I have checked the code runs
  • I have tested the code
  • I have run pre-commit and addressed any issues not automatically fixed
  • I have merged any new changes from dev
  • I have documented the code
    • Major functions have docstrings
    • Appropriate information has been added to READMEs
  • I have explained this PR above
  • I have requested a code review

Copy link
Collaborator

@crispy-wonton crispy-wonton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @lizgzil for working on this analysis!

I ran the code and it worked. The outputs seem ok individually, but the different suitability types seem to be contradictory to me, although maybe this is intended - let me know if so.

I've joined the results for the following columns in the following order, below and printed their value counts:
Overall suitability type - shared/not -> Overall suitability type -> Manual suitability type

Individual suitable only -> ASHP suitable only -> ASHP suitable                                 50
Shared suitable only -> HN suitable only -> Multiple technologies may be feasible               26
Neither suitable -> Neither ASHP or HN suitable -> ASHP suitable                                24
Both individual and shared suitable -> Both ASHP and HN suitable -> ASHP suitable               16
Shared suitable only -> Neither ASHP or HN suitable -> Multiple technologies may be feasible    16
Shared suitable only -> HN suitable only -> ASHP suitable                                       13
Shared suitable only -> HN suitable only -> HN suitable                                         10
Shared suitable only -> Neither ASHP or HN suitable -> ASHP suitable                             7
Individual suitable only -> Neither ASHP or HN suitable -> ASHP suitable                         1
Both individual and shared suitable -> HN suitable only -> ASHP suitable                         1

Beyond that, the other functions look correct and appropriate and the code is easy to read. Just needs some refactoring and docstrings.

Comment on lines +102 to +105
if ashp < 0.82:
return "Multiple technologies may be feasible"
else:
return "ASHP suitable"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be misinterpreting the use case of this function but is there an outcome missing where none is suitable?

E.g. it looks like if ashp == 0 and hn == 0 for example, the function would return "Multiple technologies may be feasible" even though none are.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know what you mean. I think the language is because we don't want to say that HPs are never suitable. And either way, for Plymouth (and I think most places) these values are never that low anyway.

"lsoa",
"LATITUDE",
"LONGITUDE",
"UPRN_duplicated",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be removed from the suitability per property file. Not sure why it's still in there, I thought I had fixed the issue.


plymouth_lsoas_gdf = plymouth_lsoas_gdf.round(3).rename(columns=column_rename_dict)

quantile_thresh = 0.6
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for choosing 0.6?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeh it feels a bit random, it was because a more normal quantile like 0.75 gave no values for one of the categories, so I felt like I needed to lower it in order to get at least a few LSOAs in each category. Either way, we didn't actually use this categorisation in the mapping, so this could be deleted.

Comment on lines +223 to +240
census_prop_type_plymouth = census_prop_type_plymouth.with_columns(
sum=pl.sum_horizontal(
"Detached",
"Semi-detached",
"Terraced (including end-terrace)",
"Flat, maisonette or apartment",
"Caravan or other mobile or temporary structure",
)
).with_columns(
(pl.col("Flat, maisonette or apartment") * 100 / pl.col("sum")).alias(
"census_%flats"
)
)
plymouth_lsoas_gdf = plymouth_lsoas_gdf.merge(
census_prop_type_plymouth.to_pandas()[["lsoa", "census_%flats"]],
how="left",
on="lsoa",
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use these census proportions when it comes to using them, rather than proportions calculated from the EPC data.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know! I think we did in the maps - but will make sure going forward

).round(3)
)

plymouth_per_prop_data_extra.write_csv("plymouth_per_prop_data_extra.csv")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This path could be updated to save to the /outputs dir instead.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very true!

@lizgzil
Copy link
Collaborator Author

lizgzil commented Feb 6, 2025

Thanks a lot for looking at this @crispy-wonton !!

I think the 3 types of categorisations are confusing and unnecessary anyway since we only use one in the map (the other 2 are legacy).

It didn't feel like there were other bugs you noticed (apart from the UPRN_deduplicated which will probably cause problems in the future), so thinking my next steps are

  • clean this notebook up a bit (remove the 3 types of categorisation, save to the outputs folder)
  • run with latest data

Then maybe it can be merged and refactoring steps could happen later if/when we apply this to other regions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants