This document records how we're translating data from the @unitedstates
project to the Open Civic Data (OCD) spec. For now, we'll have a section per Congress API endpoint.
Article I of the Constitution states, "All legislative Powers herein granted shall be vested in a Congress of the United States, which shall consist of a Senate and House of Representatives." -- Wikipedia
The congress-legislators
repo in the GitHub @unitedstates
organization contains a YAML file that maps to the Open Civic Data Person
object. Headshot images come from the images
repo in the same GitHub @unitedstates
organization.
congress-legislators | OCD Person |
---|---|
name__official_full | name |
construct GitHub URL from bioguide | image |
from last entry in terms | contact_details |
bio__gender | gender |
bio__birthday | birth_date |
id | identifiers |
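As a rough illustration of the table above, the following sketch maps one legislator record (already parsed from YAML into a dict) to a Person-shaped dict. It assumes that the double underscore in `name__official_full` denotes nesting (`name` then `official_full`). The function name `legislator_to_person`, the `image_base_url` parameter, the `.jpg` file naming, the `scheme`/`identifier` key names, and the choice of `phone` and `address` as the contact fields taken from the last term are all assumptions made for illustration, not something either project prescribes.

```python
def legislator_to_person(legislator, image_base_url):
    """Map one congress-legislators record (a dict) to an OCD Person-like dict.

    `image_base_url` is a hypothetical parameter: where headshots are hosted
    and how the files are named is an assumption of this sketch.
    """
    ids = legislator.get("id", {})
    bio = legislator.get("bio", {})
    terms = legislator.get("terms", [])
    last_term = terms[-1] if terms else {}

    # contact_details come from the last entry in terms.
    contact_details = [
        {"type": kind, "value": last_term[key]}
        for key, kind in (("phone", "voice"), ("address", "address"))
        if key in last_term
    ]

    # Each source id becomes one identifier; list-valued ids are flattened
    # into one identifier per value.
    identifiers = []
    for scheme, value in ids.items():
        values = value if isinstance(value, list) else [value]
        identifiers.extend(
            {"scheme": scheme, "identifier": str(v)} for v in values
        )

    bioguide = ids.get("bioguide")
    return {
        "name": legislator.get("name", {}).get("official_full"),
        "image": f"{image_base_url}/{bioguide}.jpg" if bioguide else None,
        "contact_details": contact_details,
        "gender": bio.get("gender"),
        "birth_date": bio.get("birthday"),
        "identifiers": identifiers,
    }
```

Records without a bioguide id get `image` set to `None` rather than a guessed URL.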
Mappings for Membership objects can be found below, in the committee section.
"A congressional committee is a legislative sub-organization in the United States Congress that handles a specific duty (rather than the general duties of Congress)." -- Wikipedia
@unitedstates
Committees map to the Open Civic Data Organization
object. Committee memberships are represented by Membership
objects that connect to the Posts
array on the Organization
object.
***At the federal level, Congress, the House, and the Senate are all organizations. House, Senate, and joint committees will all have parent_ids linking to the Congress organization.***
UnitedStates | OCD Organization |
---|---|
type | handled by parent_id and application logic |
name | name |
url | .links |
thomas_id | .identifiers (scheme: THOMAS) |
phone | .contact_details |
address | .contact_details |
rss_url | .links |
parent_committee | parent_id |
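A minimal sketch of the committee mapping above, assuming the committee has already been loaded into a dict. The function name, the `congress_org_id` and `parent_org_id` parameters, and the `note` values on links are hypothetical. The caller is assumed to have resolved `parent_committee` to an Organization id already.

```python
def committee_to_organization(committee, congress_org_id, parent_org_id=None):
    """Map one @unitedstates committee dict to an OCD Organization-like dict.

    `parent_org_id` is the already-resolved id of the parent committee (for
    subcommittees); top-level committees fall back to the Congress
    organization, as described above.
    """
    contact_details = [
        {"type": kind, "value": committee[key]}
        for key, kind in (("phone", "voice"), ("address", "address"))
        if committee.get(key)
    ]
    links = [
        {"url": committee[key], "note": note}
        for key, note in (("url", "homepage"), ("rss_url", "rss"))
        if committee.get(key)
    ]
    return {
        "name": committee["name"],
        "parent_id": parent_org_id or congress_org_id,
        "identifiers": [
            {"scheme": "THOMAS", "identifier": committee["thomas_id"]}
        ],
        "contact_details": contact_details,
        "links": links,
    }
```

The `type` field is deliberately absent from the output, since the table leaves it to `parent_id` and application logic.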
The @unitedstates
project stores current committee memberships in the committee-membership-current.yml
file. We'll use this to build the OCD Memberships
that connect Organizations' Posts
to OCD Person
objects.
UnitedStates | OCD Post |
---|---|
party (majority or minority) | handled by application logic |
title | label |
| | role |
| | organization_id |
| | division_id |
| | start_date |
| | end_date |
| | contact_details |
| | links |
| | extras |
UnitedStates | OCD Membership |
---|---|
rank | .extras |
| | label |
| | role |
| | person_id |
| | organization_id |
| | post_id |
| | on_behalf_of_id |
| | start_date |
| | end_date |
| | contact_details |
| | links |
| | extras |
| | jurisdiction_id |
| | division_id |
Our current store of committees and committee memberships has no good notion of time, partly because good historical committee data is not available. More details on this issue: unitedstates/congress-legislators#46.
UnitedStates | OCD Bill |
---|---|
bill_id | identifier |
bill_type | classification |
committees hash | * Not carried over * |
introduced_at | actions |
congress | * application logic and/or extras * |
enacted_as | extras |
number | other_identifiers |
official_title | title |
short_title | * disregard * |
popular_title | * disregard * |
titles | other_titles |
status | ? deduce from actions ? |
status_at | ? deduce from actions ? |
subjects | subject |
subjects_top_term | extras |
sponsor | sponsorship with 'sponsor' classification |
cosponsors | sponsorships with 'cosponsor' classification |
related_bills | related_bills |
summary | abstracts |
histories | * disregard * |
amendments | * waiting on OCD field * |
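To make the sponsor and cosponsor rows concrete, here is a small sketch that builds sponsorship entries from a bill dict. It assumes `sponsor` is a dict and `cosponsors` a list of dicts, each with a `name` key. The function name and the output key names are illustrative only.

```python
def bill_sponsorships(bill):
    """Build sponsorship entries from a bill's sponsor and cosponsors."""
    sponsorships = []

    sponsor = bill.get("sponsor")
    if sponsor:
        sponsorships.append(
            {"name": sponsor["name"], "classification": "sponsor"}
        )

    # `cosponsors` may be missing or None; treat both as "no cosponsors".
    for cosponsor in bill.get("cosponsors") or []:
        sponsorships.append(
            {"name": cosponsor["name"], "classification": "cosponsor"}
        )

    return sponsorships
```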
A recorded vote in the United States Congress is often called a roll-call vote.
The specific rules differ between the House and the Senate, and both are afforded considerable freedom in establishing their own rules under Article One of the United States Constitution. -- See Wikipedia for more information.
@unitedstates
Votes map to the Open Civic Data Vote
object. It's nearly a one-to-one attribute mapping. See @unitedstates
votes data format here and OCD votes data format here for documentation of each.
UnitedStates | OCD Vote |
---|---|
vote_id | identifier |
congress | session |
chamber | chamber (h -> lower, s -> upper, etc) |
date | date |
type | motion |
category | type |
result | passed |
source_url | sources |
bill | bill |
? deduce from votes ? | vote_counts |
votes | roll_call |
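Two rows in the table involve a little logic: the chamber translation and deducing `vote_counts` from `votes`. The sketch below assumes `votes` is a dict that maps each vote option (for example "Yea") to a list of voters. The helper names and the `option`/`value` key names are assumptions. The table's "etc" is not spelled out, so only `h` and `s` are handled here.

```python
CHAMBER_MAP = {"h": "lower", "s": "upper"}


def map_chamber(chamber):
    """Translate an @unitedstates chamber code to an OCD chamber name."""
    try:
        return CHAMBER_MAP[chamber]
    except KeyError:
        # Codes beyond "h" and "s" are not defined by the table above.
        raise ValueError(f"unmapped chamber code: {chamber!r}")


def deduce_vote_counts(votes):
    """Count voters per option, preserving the option order of the input."""
    return [
        {"option": option, "value": len(voters)}
        for option, voters in votes.items()
    ]
```

Options with no voters are kept with a count of zero, so the counts always list every option present in the source.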
"Congressional hearings are the principal formal method by which committees collect and analyze information in the early stages of legislative policymaking" -- Wikipedia
@unitedstates
Committee Meetings map to the Open Civic Data Event
object.
UnitedStates | OCD Event |
---|---|
guid/house_event_id/congress/house_meeting_type ??? | identifier |
generate | name |
topic | description |
occurs_at | when |
| | end |
ask paul | status |
room | location (name) |
committee | participants (see mapping below) |
witnesses | agenda (see mapping below) |
bill_ids | agenda (see mapping below) |
meeting_documents | media (see mapping below) |
url | links (see mapping below) |
Should we derive the Person models from committee membership? Should the participating committee be only the subcommittee (if applicable) or both the subcommittee and the parent committee?
committee | OCD Event-participants |
---|---|
chamber | chamber |
derived | note |
default value | type = organization ? |
committee | name |
derived | id |
Should the organization (if applicable) only be associated with the person or both the person and the agenda?
witnesses | OCD Event-agenda |
---|---|
documents | media |
first_name, last_name, witness_type, position | related_entities |
organization | related_entities |
Leave note field on the agenda empty?
bill_ids | OCD Event-agenda |
---|---|
array entries are bill identifiers | related_entities |
This maps fairly nicely.
meeting_documents | OCD Event-media |
---|---|
type | type |
name | description |
date | published_on |
urls | links (mimetype = infer mimetype from url extension, url = url) |
| | offset |
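The "infer mimetype from url extension" step in the table above can be sketched with the standard library's `mimetypes` module. This assumes `urls` is a list of URL strings on each meeting document. The function name is hypothetical, and `offset` is left out because the table gives it no source.

```python
import mimetypes
from urllib.parse import urlparse


def document_to_media(doc):
    """Map one meeting_documents entry to an OCD Event media-like dict."""
    links = []
    for url in doc.get("urls", []):
        # Guess from the URL path only, so a query string cannot hide the
        # file extension. Unknown extensions yield a mimetype of None.
        mimetype, _encoding = mimetypes.guess_type(urlparse(url).path)
        links.append({"mimetype": mimetype, "url": url})

    return {
        "type": doc.get("type"),
        "description": doc.get("name"),
        "published_on": doc.get("date"),
        "links": links,
    }
```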
What should the note be? The links in the US scrapers point to the committee repository, which is basically an HTML representation of the hearing data.
url | OCD Event-links |
---|---|
url | note = Calendar Event??? url=url |
Recent real-time, to-the-minute updates from the House and Senate floor. House floor updates are sourced from XML at the House Clerk, and Senate updates from the Senate Periodical Press Gallery.
House Clerk XML
and Senate Periodical Press Gallery
map to the Open Civic Data Event
object.
The House releases XML for floor updates that we will ingest and translate directly.
XML | OCD Event |
---|---|
derived from chamber, action_time | name |
"floor_update" | classification |
action_time[for-search] | start_time |
?? eastern ?? | timezone |
action_description | description |
capitol (get url/coords of chamber) | location |
False | all_day |
| | end_time |
confirmed | status |
Parse names?? ER? | participants |
a[rel='bill'] | agenda (type=bill) |
a[rel='vote'] | agenda (type=vote) |
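A minimal sketch of the House mapping above, using `xml.etree.ElementTree`. The wrapping element, the idea that each update arrives as its own XML fragment, the function name, and the agenda key names are assumptions. Only `action_time`, its `for-search` attribute, `action_description`, and the `a[rel=...]` links come from the table. The timezone is an open question in the table, so the value here is a placeholder guess, and `start_time` is passed through as the raw attribute string rather than parsed.

```python
import xml.etree.ElementTree as ET


def floor_action_to_event(action_xml, chamber="lower"):
    """Map one House floor action (an XML string) to an OCD Event-like dict."""
    action = ET.fromstring(action_xml)
    time_el = action.find("action_time")
    desc_el = action.find("action_description")

    start_time = time_el.get("for-search") if time_el is not None else None
    # itertext() flattens the description, including text inside <a> links.
    description = (
        "".join(desc_el.itertext()).strip() if desc_el is not None else ""
    )

    # Linked bills and votes become agenda items of the matching type.
    agenda = []
    for rel in ("bill", "vote"):
        for link in action.findall(f".//a[@rel='{rel}']"):
            agenda.append({
                "type": rel,
                "name": (link.text or "").strip(),
                "url": link.get("href"),
            })

    return {
        "name": f"{chamber} floor update {start_time}",
        "classification": "floor_update",
        "start_time": start_time,
        "timezone": "America/New_York",  # assumption: still undecided above
        "description": description,
        "all_day": False,
        "end_time": None,
        "status": "confirmed",
        "agenda": agenda,
    }
```

Participants are omitted because the table leaves name parsing unresolved.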
The Senate does not release floor updates in a machine-readable format, so we must parse the HTML. You can see a current scraper on the Congress API. The mapping table below translates from the Congress API fields, since that scraper will be adapted for OCD.
Senate Periodical Press Gallery | OCD Event |
---|---|
derived from chamber, action time | name |
"floor_update" | classification |
timestamp | start_time |
?? eastern ?? | timezone |
update | description |
capitol (get url/coords of chamber) | location |
False | all_day |
| | end_time |
confirmed | status |
bill_ids | agenda (type=bill) |
vote_ids | agenda (type=vote) |
Parse names?? ER? | participants |