GeoDataFrame instead of DataFrame for ThermalNetwork.components #32

Open
joroeder opened this issue Mar 16, 2020 · 6 comments
Labels: enhancement (New feature or request)

@joroeder
Member

I am wondering whether we can directly use GeoDataFrames for storing all geo-referenced content of the ThermalNetwork.components.

Then we would not need to care about the coordinate reference system of the lat and lon columns of the components, and we would get the many import, export and plot functionalities of geopandas for free. Especially for the import, I think this is very practical: in most cases, you probably have some GIS files in GeoJSON, shp or another format.

We could then define the geometry type for each component, e.g.:
consumers: points
forks: points
producers: points
edges: lines (no multilines)
...

What do you think? Is there anything against it?
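
A minimal sketch of how such a per-component geometry rule could be checked, assuming the data arrives as GeoJSON-like dictionaries. The mapping `EXPECTED_GEOMETRY` and the helper `find_geometry_mismatches` are hypothetical names for illustration and are not part of dhnx.

```python
# Hypothetical mapping from component name to its expected GeoJSON geometry type.
EXPECTED_GEOMETRY = {
    "consumers": "Point",
    "forks": "Point",
    "producers": "Point",
    "edges": "LineString",  # MultiLineString is deliberately not allowed
}


def find_geometry_mismatches(component, feature_collection):
    """Return the indices of features whose geometry type is not the expected one."""
    expected = EXPECTED_GEOMETRY[component]
    return [
        i
        for i, feature in enumerate(feature_collection["features"])
        if (feature.get("geometry") or {}).get("type") != expected
    ]


# Made-up example data: one valid edge and one multi-line edge.
edges = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {},
         "geometry": {"type": "LineString",
                      "coordinates": [[9.0, 53.0], [9.1, 53.1]]}},
        {"type": "Feature", "properties": {},
         "geometry": {"type": "MultiLineString",
                      "coordinates": [[[9.0, 53.0], [9.1, 53.1]],
                                      [[9.2, 53.2], [9.3, 53.3]]]}},
    ],
}

mismatches = find_geometry_mismatches("edges", edges)
```

Here `mismatches` lists the positions of the features that would need fixing before import.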

@jnnr
Member

jnnr commented Mar 18, 2020

Thanks for this issue! Being able to use information about the geographic location of consumers, producers, forks etc. and the geometries of edges was part of the idea of dhnx, so this is a natural thing to ask. In some use cases you might not have this information, but if you have it, you would want to use it.

At the moment, ThermalNetwork only has a CSVImporter, which allows you to define a network consisting of different nodes and their connectivity. It is assumed that the lengths of the edges are known, as are (optionally) the geocoordinates. In this case, pandas DataFrames are a good choice.

If you have the information (e.g. by downloading it from osm, as shown in the import_osm example), it is more convenient to store it as geopandas.GeoDataFrame. GeoDataFrame inherits from pandas.DataFrame, adding a 'geometry' column which holds data about POINTs and LINESTRINGs. Also, GeoDataFrames have more methods that handle the geographic data.

It makes sense to use geopandas when detailed geo information is present and to use pandas when this is not the case. The class ThermalNetwork should be able to handle both DataFrames and GeoDataFrames polymorphically.
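
As a rough illustration of what handling both types could look like, the sketch below relies only on the fact that `geopandas.GeoDataFrame` subclasses `pandas.DataFrame`. The functions `has_geometry` and `describe` are hypothetical helpers, not part of dhnx, and the coordinates are made up.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point


def has_geometry(components):
    """Return True for a GeoDataFrame and False for a plain DataFrame."""
    return isinstance(components, gpd.GeoDataFrame)


def describe(components):
    """Summarize a component table, whichever of the two types it is."""
    # len() is plain pandas behavior and works for both types,
    # because GeoDataFrame is a subclass of DataFrame.
    kind = "geo-referenced" if has_geometry(components) else "tabular"
    return f"{len(components)} {kind} rows"


# Made-up component tables: one without and one with geometries.
plain = pd.DataFrame({"id": [0, 1]})
geo = gpd.GeoDataFrame(
    {"id": [0, 1]},
    geometry=[Point(8.80, 53.08), Point(8.81, 53.09)],
)
```

Code that only uses the shared pandas interface needs no branching at all; a check like `has_geometry` is only needed where geo-specific methods are called.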

How to do it? When writing a new ShapefileImporter or GeojsonImporter, use geopandas. Also, we might have to rewrite the code at https://github.com/oemof/DHNx/blob/dev/dhnx/network.py#L70, where the component dataframes are initialized as empty pandas.DataFrames. If we can find a way to leave these specifics in the importers/exporters, this should work.

@joroeder
Member Author

Thanks for your comment!
From my perspective, having csv data but no GIS data when doing any dhs optimization/calculation is a rare case. And if you have csv data with coordinates, it is really easy to generate a GeoDataFrame. So we could focus on that as well.
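
For illustration, a table with coordinate columns can be turned into a GeoDataFrame roughly as follows. This is a sketch with made-up data; the column names `lat` and `lon` and the WGS84 reference system (EPSG:4326) are assumptions.

```python
import geopandas as gpd
import pandas as pd

# Made-up consumer table standing in for a CSV file with coordinate columns.
consumers = pd.DataFrame(
    {"id": [0, 1], "lat": [53.08, 53.09], "lon": [8.80, 8.81]}
)

# points_from_xy expects x (longitude) first, then y (latitude).
consumers_gdf = gpd.GeoDataFrame(
    consumers,
    geometry=gpd.points_from_xy(consumers["lon"], consumers["lat"]),
    crs="EPSG:4326",  # assumption: the coordinates are WGS84 lat/lon
)
```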

> The class ThermalNetwork should be able to handle both DataFrames and GeoDataFrames polymorphically.

Sounds nice! But I don't know how to do that ..

> If we can find a way to leave these specifics in the importers/exporters, this should work.

Maybe, I don't know. I have not fully understood the concept of the import/export structure so far, or why we need so many classes for it. But I am certainly not a Python native, so maybe all this makes sense 😉

Generally, I am afraid that we are trying to consider too much right from the start. That makes things too complex, so that development becomes more difficult and, in the end, very slow. For me, the structure is already quite complex. For example, if I now wanted to write a geojson reader, it would be faster for me to write a function that exports the geojson files I want to import into the given .csv structure, and then import that 😉
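
Such a conversion function could look roughly like the sketch below, which uses only the standard library. The function name `point_features_to_csv` is hypothetical, and the `id`, `lat`, `lon` column layout is an assumption rather than the confirmed dhnx .csv structure.

```python
import csv
import io


def point_features_to_csv(feature_collection):
    """Write Point features as CSV text with id, lat, lon columns (assumed layout)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "lat", "lon"])
    for i, feature in enumerate(feature_collection["features"]):
        # GeoJSON stores coordinates as (lon, lat), so swap them for the CSV.
        lon, lat = feature["geometry"]["coordinates"][:2]
        writer.writerow([i, lat, lon])
    return buffer.getvalue()


# Made-up example data with a single consumer.
consumers = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {},
         "geometry": {"type": "Point", "coordinates": [8.8, 53.08]}},
    ],
}

csv_text = point_features_to_csv(consumers)
```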

@jnnr
Member

jnnr commented Apr 8, 2020

I can open a new branch to try the solution that I described above. Coming soon.

@jnnr jnnr self-assigned this Apr 8, 2020
@jnnr jnnr added the enhancement New feature or request label Apr 8, 2020
@jnnr jnnr modified the milestone: Release v0.0.1 Aug 4, 2020
@joroeder
Member Author

> I can open a new branch to try the solution that I described above. Coming soon.

For me, this issue is solved. I am fine with having DataFrames in the thermal network. This might make things easier, as we discussed once. Did you try the solution you were talking about? In my opinion, we can close this issue.

@jnnr
Member

jnnr commented Oct 22, 2020

I would like to leave this open. Let's check the integration with the OSMImporter and see how things work when using GeoDataFrames.

@joroeder
Member Author

Hey, it seems that there is no problem using geopandas.GeoDataFrame in the components - which is very nice I think!

import dhnx

network = dhnx.network.ThermalNetwork()
network.components['pipes'] = geopandas_dataframe_pipes
network.components['forks'] = ...
network.components['consumers'] = ...
network.components['producers'] = ...

network.is_consistent()

Is it intended to work like this without any "setter" method? Or can you also add DataFrames to the components with the ThermalNetwork.add() method?

Here is an example: https://github.com/oemof/DHNx/blob/features/Move_gistools_to_dhnx/examples/investment_optimisation/import_osm_invest/import_osm_invest.py
