Friendly Hello from another parking scraper #229

defgsus · 2021-11-14T18:44:22Z

Hi there. Are you still working on this?

I wrote a couple of parking space scrapers myself at https://github.com/defgsus/parking-scraper

Data is persisted at https://github.com/defgsus/parking-data

Maybe we can join forces.

I'm more interested in recording than in actual use by driving people, but who knows?

kiliankoe · 2021-11-14T19:55:31Z

Hey! It would be great to join forces, adapt the scrapers you've written to our format and dump our historical data into your data repo as well. According to our dumps they go back to July 2015 for many lots, others as soon as their scrapers were introduced, but @jklmnn knows how current the data there is. Also please beware that the archive expands quite a bit when unarchived.

> I'm more interested in recording than in actual use by driving people, but who knows?

Haha, same, who needs a car anyways 😅

defgsus · 2021-11-15T09:33:32Z

Cool, I will just start the port.

Oh wait, it's Monday.. Need to earn some money first.

Only kidding. Thanks for the fast reply. I thought not much was going on here because of the pending pull requests. I will certainly add some more.

Your archive is quite something! Here's the one-year-celebration of mine: https://defgsus.github.io/blog/2021/04/08/one-year-parking.html

jklmnn · 2021-11-15T11:06:08Z

Great work! Our archive is up to date to the end of 2020. I will add this year at the start of 2022. For more current data we also have an API (@kiliankoe, do you remember if and where this is documented?). However, we're not advertising this API as it puts a strain on the server (which is also the reason that the largest request that can be made is limited to a week per lot). This API contains all data since the server was upgraded (09/2020). I try to make sure that all historical data is always available either in the archives or via the API.

About the pending PRs: they're the ones that got stuck at some point in the process; we merged multiple others in the meantime. But you're right, we're currently not super active because, as you said, it's Monday and someone has to earn the money to keep the servers running ;).

defgsus · 2021-11-16T10:27:21Z

Good morning. Earned some money in the meantime? I did, but it's not enough..

However, I had some trouble setting up the local development environment. I'm not the raw SQL type of guy, so I added a few lines to the README.md to help others: #230. (EDIT: actually, I have now found park_api/setupdb.py.)

Actually, I wanted to see if Frankfurt is still failing (#153). I have the same problem in other scrapers. Some websites issue certificates which are fine for a browser but not for the requests library.
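One common cause of this symptom (an assumption here, not confirmed for Frankfurt) is a server that does not send its intermediate certificates: browsers often fill in the gap, Python clients do not. A minimal sketch of one workaround, using only the standard library, is to trust an additional certificate bundle for the affected site. The function name `build_ssl_context` and the bundle path are hypothetical.

```python
import ssl
from typing import Optional


def build_ssl_context(extra_ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Return a verifying TLS context, optionally trusting extra certificates.

    extra_ca_file is a hypothetical path to a PEM file holding the
    intermediate certificate(s) the website fails to send.
    """
    # Start from the platform's default trust store; verification stays on.
    context = ssl.create_default_context()
    if extra_ca_file is not None:
        # Add the missing certificates on top of the default ones.
        context.load_verify_locations(cafile=extra_ca_file)
    return context
```

The resulting context can be passed to `urllib.request.urlopen(url, context=...)`. With requests, the comparable knob is the `verify` argument, which accepts a path to a CA bundle. Either way certificate verification stays enabled, which seems preferable to switching it off for a scraper that runs unattended.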

Anyways, I have two immediate proposals:

Add some command-line args to the scraper, so that one can work on a particular website without needing to scrape all the others as well. As the list of cities grows, that would certainly be helpful.
Switch the whole project to Django. Actually, I'm not sure if Django is the best of choices for a (potentially) high-demand API server, but it's not a bad choice either, and it has a usable database ORM, migrations, an integrated shell and command-line tasks (that can be cron-jobbed) out of the box. I'm a Django fan, admittedly. That would require moving the data to a new database, but that would fit with "Proper database schema" (#224), I guess. I also agree with "Exclude static data / modules." (#144) in the long term.
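To make the first proposal concrete, here is a minimal sketch of such a command-line filter with `argparse`. The `CITIES` list, the `--city` flag and both function names are hypothetical; the actual scraper discovers its city modules in its own way.

```python
import argparse
from typing import List, Optional

# Hypothetical registry of scraper modules, for illustration only.
CITIES = ["Dresden", "Frankfurt", "Hamburg"]


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Scrape parking lot data")
    parser.add_argument(
        "-c", "--city",
        action="append",      # the flag may be repeated: -c Dresden -c Hamburg
        choices=CITIES,       # unknown city names are rejected by argparse
        metavar="CITY",
        help="scrape only this city (default: all cities)",
    )
    return parser.parse_args(argv)


def select_cities(args: argparse.Namespace) -> List[str]:
    # Without any --city flag, args.city is None and every city is scraped.
    return list(args.city) if args.city else list(CITIES)
```

Calling `select_cities(parse_args(["--city", "Frankfurt"]))` would then restrict a run to the one website being worked on.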

Waddjasay?

EDIT

If you ask why I would like to help refactor a whole project that I hadn't known two days ago and whose public website I haven't even loaded without all the script blockers, here are a few motivational points (because I was wondering myself):

I like collecting this kind of data. And I made some mistakes along the way in my own project which are still in there because nobody else seems to look at it.
So it would certainly help to work on a canonical standard as a team.
The legacy code is much like my own Python projects from 6 years ago. I'd do it quite differently today. I know Django quite well and it would help to make the code, data, CI and community involvement more durable... I think. Using the Django REST framework, the API can be automatically documented, throttled, or parts of it restricted via access keys, etc.
In other words: I'd like to refresh the code at a couple of points and would actually rather invest the time to port it to a new framework.

defgsus · 2021-11-18T16:43:16Z

Hello again!

Guess one should not come along and right away call other people's projects legacy code. It's just enthusiasm. I don't want to hijack anything. Still, I'm interested in discussing it.

jklmnn · 2021-11-18T17:07:21Z

Sorry for the late reply. I wanted to point you to park_api/setupdb.py but I see you already found it. Since it's not documented yet, could you change #230 to include that in the setup instructions?

You're certainly right about the state of the code: most of it is legacy and only maintained as needed for it not to break. The main reason for that is the lack of time, energy and people (as far as I can tell, only @kiliankoe and I actively work on the current code base outside of module contributions). For my part, the biggest restriction is time and motivation (both are hard to find for programming after working full time in software engineering). So we'd be really thankful for any help that improves the current implementation.

Tbh I don't have any experience with Django, but if it provides us with a (more or less) easy implementation of the full stack including the database, I'd be in favor of that. @kiliankoe, do you have any opinion on that? About #224, that's the best I could come up with, but I'm certainly no database expert.

About refactoring the project, I'd say rewriting is the proper term. Most of the code has been written by different people (many of whom aren't working actively on that code base anymore) and then updated by me. I think a new implementation would be faster and yield better results than trying to understand the current code base and refactor it, especially since we don't have a specific goal for refactoring (other than "improving", which is not exactly specific).

jklmnn · 2021-11-18T17:21:29Z

Also for better communication you can join #parkendd:matrix.org or I can invite you into the OKF Germany Slack.

defgsus · 2021-11-18T17:51:57Z

Thanks for the reply! So I guessed about right about the state of the code. I don't want to put anyone off, but it looks like it should be rewritten to make further contributions easier. I've been using Django for a few years now, and also use it to earn some of that money we were talking about. After some introduction it's quite a friendly and intuitive framework (until you start extending the admin interface ;-)

My current workload is acceptable and I can certainly spend a few hours each week. I will check the matrix link above.

I will update the #230 pull request. Can you please just clarify the environments a bit? As I understand it:

The unittests will always use testing
Otherwise it defaults to development
And on the live system you probably set the env to staging or production

So calling setupdb.py will probably create the development tables, and to create the testing database I need to set the environment variable env=testing.

Just nod if that's right

jklmnn · 2021-11-18T18:02:31Z

Yes, the unittests use testing and on our server we use production. setupdb.py, like all other scripts, will use the env environment variable to select the environment. I don't really know why we have staging though. It would probably be useful if we had a specific test system, but that is probably not going to happen.
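As a rough illustration of the selection described in this exchange, the sketch below reads the `env` variable and falls back to `development`. The function name `current_env` and the rejection of unknown values are assumptions for illustration, not the project's actual code.

```python
import os
from typing import Mapping, Optional

# Environment names mentioned in this thread.
KNOWN_ENVS = ("development", "testing", "staging", "production")


def current_env(environ: Optional[Mapping[str, str]] = None) -> str:
    """Pick the active environment from the `env` variable."""
    environ = os.environ if environ is None else environ
    # Fall back to development when the variable is not set.
    env = environ.get("env", "development")
    if env not in KNOWN_ENVS:
        # Assumption: fail loudly instead of silently using a wrong database.
        raise ValueError(f"unknown environment: {env!r}")
    return env
```

Under these assumptions, running the setup script as `env=testing python park_api/setupdb.py` would target the testing database, while a plain call would use development.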

defgsus · 2021-11-21T02:22:28Z

Hi there,

here's a new prototype https://github.com/defgsus/ParkAPI2

(don't mind the 2, it's still supposed to become 1 ;)

Right now it's just a basic framework for tying parking lots to cities, states and countries, with geo-coordinates on all entities. The most interesting parts are the models and the store methods in https://github.com/defgsus/ParkAPI2/tree/master/web/park_data/models

There is certainly stuff to discuss. The pool entity from #224 can be added alongside the cities.
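For readers who have not opened the repository, the general shape of such a hierarchy can be sketched with plain dataclasses. This is not the ParkAPI2 model code (which uses Django models); the class name `Place`, its fields and the rough coordinates are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Place:
    """Any entity in the hierarchy: country, state, city or parking lot."""
    name: str
    lat: float                        # every entity carries geo-coordinates
    lon: float
    parent: Optional["Place"] = None  # e.g. a city's parent is its state

    def path(self) -> Tuple[str, ...]:
        # Walk up the parent links and return the names from the top down.
        names = []
        node: Optional["Place"] = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return tuple(reversed(names))


# Rough, illustrative coordinates only.
germany = Place("Germany", 51.2, 10.4)
saxony = Place("Saxony", 51.0, 13.4, parent=germany)
dresden = Place("Dresden", 51.05, 13.74, parent=saxony)
lot = Place("Example Lot", 51.05, 13.74, parent=dresden)
```

Here `lot.path()` walks from the parking lot up to its country, which is the kind of lookup a real schema would answer with foreign keys.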

jklmnn · 2022-02-22T16:52:57Z

Hi @defgsus, sorry for the long reply times. I invited you into a matrix room for a discussion about the future of ParkAPI. Please have a look :).

defgsus added a commit to defgsus/ParkAPI that referenced this issue Nov 16, 2021

add migration step to README.md docs (offenesdresden#229)

ab06473

jklmnn pushed a commit that referenced this issue Nov 19, 2021

add migration step to README.md docs (#229)

83b2b85

defgsus mentioned this issue Nov 20, 2021

Missing steps in setup documentation #186

Closed
