Restore committee data with a new process #7

jamesturk · 2019-03-19T02:02:45Z

jamesturk
Mar 19, 2019
Maintainer

There are a lot of issues on committee information, but I don't think a lot of people are interested in making PRs for it. Committee scrapers are very prone to breakage, maybe the most of all the ones we maintain.

This leaves us in a tough spot: should we continue to provide it? Are people using this data? Is anyone interested in helping maintain it?

rconnorlawson · 2019-03-19T05:05:04Z

rconnorlawson
Mar 19, 2019

I opened openstates/people#135 as I was perusing openstates.org. I'm new to GA and was merely browsing for my own interest when I noticed the bug. While I don't currently have a particular use for such data, I think having easy-to-access, structured data on committee composition - and particularly the reverse map, legislator committee membership - is generally useful. Committee membership is an important aspect of understanding a legislator's politics and is not always readily available from other sources. The value of openstates is that it collects this information for the times when it is useful.

Regarding maintenance: In the past I have contributed an unsolicited PR for the MS scrapers. The functionality in that PR was eventually duplicated by someone on staff, and my PR was rejected. I observed several PRs from various people/states get subsumed in a similar manner during that time. I have no issue with this - the bugs were fixed, which is the ultimate goal - but this time I didn't spend the time to fix the issue when I reported it, expecting a similar response. If it's helpful, I can look at making patches for the GA committee data and scrapers.

0 replies

MH-42 · 2019-03-19T12:52:14Z

MH-42
Mar 19, 2019

I think the committee data is essential. Reps have more responsibility and leverage for work that is in their committees. A bill has to get through committee before non-committee members get a vote. Currently PA has an excellent state site for this information (e.g. https://www.legis.state.pa.us/cfdocs/cteeInfo/StandingCommittees.cfm?CteeBody=H) and RSS feeds. But I suspect, as rconnorlawson said, that not all states have good sites. So openstates provides a service by making that information available consistently, if it can do so accurately.

Maintenance: I'm not currently in a position to promise much but am interested and may be able to help in the future; I'll take a look at the code when I can.

0 replies

jamesturk · 2020-05-08T17:44:52Z

jamesturk
May 8, 2020
Maintainer Author

Given that we don't have a maintainer and the information is very stale, I will begin the process of removing the stale info for now. This isn't meant to preclude future inclusion. In fact, I think the existing data is more of an obstacle than a help, so perhaps the removal of stale info will spur those who need new info to figure out a plan to maintain the data.

0 replies

showerst · 2020-09-29T20:18:02Z

showerst
Sep 29, 2020
Collaborator

I had some more time to look at this, and have a few thoughts:

1. I agree maintaining the full scrapers is too much for our current resources
2. That said, it would be nice to have a list of committees and aliases with some kind of canonical IDs for things like resolving sponsorships and meetings.
3. These don't change very often, and going into 2021 Openstates (or Govhawk for that matter) should have a 95% accurate list of state committees that currently exist.
4. Maybe instead of maintaining scrapers for it, we move to a YAML model like openstates-people and take PRs?
5. I don't think it makes sense to track contact and membership information, unless we can get a sponsor to do the work, but keeping a list of names and aliases seems doable.

Just as a proof of concept I whipped up a quick list of some CA committees in a sample format:

https://gist.github.com/showerst/56509d68dbd7527c531f7e36cebff194

I'm thinking we could do one file per jurisdiction. Haven't thought out 'retirements' (other than an end_date key).

The big obvious downside is that this doesn't come with sourced scrapers to run the updates, but we could provide small utility scripts for adding/retiring a com.
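
A minimal sketch of what such utility scripts might look like, assuming each committee record carries a canonical ID, a name, a list of alias strings, and an optional end_date. The field names, the ID format, and the helper functions here are hypothetical and are not taken from the gist or from any existing Open States code; loading the records from a per-jurisdiction YAML file is left out so the example has no dependencies.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class Committee:
    # All field names are assumptions made for illustration.
    id: str                                   # hypothetical canonical ID
    name: str
    chamber: str
    aliases: List[str] = field(default_factory=list)
    end_date: Optional[str] = None            # ISO date string once retired


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so small spelling variants match."""
    return " ".join(label.lower().split())


def build_index(committees: Iterable[Committee]) -> Dict[str, str]:
    """Map every normalized name and alias to its committee's canonical ID.

    Retired committees stay in the index so older sponsorships and
    meetings can still be resolved.
    """
    index: Dict[str, str] = {}
    for com in committees:
        for label in [com.name, *com.aliases]:
            index[normalize(label)] = com.id
    return index


def resolve(index: Dict[str, str], label: str) -> Optional[str]:
    """Return the canonical ID for a scraped committee name, or None."""
    return index.get(normalize(label))


def retire(com: Committee, end_date: str) -> None:
    """Mark a committee as retired by setting its end_date."""
    com.end_date = end_date
```

Keeping retired committees in the index is one possible choice; a script that only cares about current committees could skip any record whose end_date is set.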

What are people's thoughts on this?

1. Is it worth pursuing this?
2. I could do a one-time dump to get us started, and try to add them as I see new ones, but I'm concerned about it going stale.
3. Contact info was impossible to keep up to date, but a website URL might be manageable? Are there other keys that make sense to include/exclude?

@jamesturk @azban

0 replies

jamesturk · 2020-10-05T16:03:02Z

jamesturk
Oct 5, 2020
Maintainer Author

I'm on board with this. Lots of people have reached out about supporting committees again but balk at the amount of work needed to actually track members, and I don't see that happening any time soon. From my point of view, the most useful thing will wind up being the aliases; I don't see us using many other fields besides those, so this works well from my perspective.

One thought: in other places where we have aliases, they can have an optional start & end date, so we might want to consider supporting that by making the aliases a list of objects instead of just strings (e.g. aliases: - name: Assembly Select Committee on Sustainable and Organic Agriculture), which would give us the flexibility to add end_date later as needed without changing too much.
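
To illustrate the aliases-as-objects idea, here is a rough sketch, assuming each alias has a name plus optional start_date and end_date. The Alias class, its field names, the helper function, and the dates in the usage note are all hypothetical and do not reflect an existing Open States schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Alias:
    # Field names are assumptions; both dates are optional.
    name: str
    start_date: Optional[date] = None
    end_date: Optional[date] = None

    def active_on(self, day: date) -> bool:
        """True if the alias was in use on the given day.

        A missing start_date or end_date means the alias is
        unbounded on that side.
        """
        if self.start_date is not None and day < self.start_date:
            return False
        if self.end_date is not None and day > self.end_date:
            return False
        return True


def active_aliases(aliases: List[Alias], day: date) -> List[str]:
    """Return the alias names that were valid on the given day."""
    return [alias.name for alias in aliases if alias.active_on(day)]
```

With this shape, an alias that starts out as only a name can gain an end_date later without changing the structure of the list, for example `Alias("Assembly Select Committee on Sustainable and Organic Agriculture", end_date=date(2020, 11, 30))` (the date is made up).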


0 replies

jamesturk · 2021-06-30T18:09:58Z

jamesturk
Jun 30, 2021
Maintainer Author

This is now being tackled in OCEP 4 (#18).

0 replies
