barty
This plugin is known as barty (for BART-year, as opposed to our first BART plugin, which has data for only one day).
BARTy gives users access to (at this writing) one year of BART data: for each hour, and each pair of stations, you can learn the number of people who entered the system at the first station and exited at the second. This works out to over 10 million cases.
The designer's task, therefore, is to give users easy-to-understand access to this large data set, and to give them useful choices about what cases to download, when they can only practically work with a few thousand cases at a time.
The data are more multidimensional than they sound, so users have to think hard about what data to ask for; and then CODAP gives them the chance to fix their mistakes easily and improve what they are doing. Also, because of that multidimensionality, there are many ways to organize data and compute aggregate measures. So there is a lot of flexibility in what data moves are possible or useful.
For example, suppose you have the task of figuring out how many people took BART to attend Pride, which is on a Sunday in June. You need to consider questions such as
- What destination stations should you look at?
- Do "source" stations matter?
- What times should you look at?
- Do you look at arrivals or departures or both?
- Should you compare the data with a non-Pride day? Which day? Only one?
Of course, Tim prefers that we not give students these questions but let them come up with them themselves in a process of looking critically at partial solutions.
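To make the multidimensionality concrete, here is a minimal sketch of how those questions turn into selection criteria. It is not the plugin's code: the function name, the criteria fields, and the record shape are all hypothetical, and the data are just the five sample rows from the 2018 CSV quoted later in this document.

```javascript
// Five sample rows from the 2018 CSV quoted later in this document.
const rows = [
  { date: "2018-01-01", hour: 0, origin: "12TH", destination: "12TH", riders: 3 },
  { date: "2018-01-01", hour: 0, origin: "12TH", destination: "16TH", riders: 1 },
  { date: "2018-01-01", hour: 0, origin: "12TH", destination: "BAYF", riders: 1 },
  { date: "2018-01-01", hour: 0, origin: "12TH", destination: "CAST", riders: 3 },
  { date: "2018-01-01", hour: 0, origin: "12TH", destination: "CIVC", riders: 2 },
];

// Sum riders exiting at any of the given destinations on one date,
// within an inclusive range of hours. All names here are hypothetical.
function totalArrivals(records, { date, destinations, firstHour, lastHour }) {
  return records
    .filter((r) => r.date === date)
    .filter((r) => destinations.includes(r.destination))
    .filter((r) => r.hour >= firstHour && r.hour <= lastHour)
    .reduce((sum, r) => sum + r.riders, 0);
}

const criteria = {
  date: "2018-01-01",
  destinations: ["16TH", "CIVC"],
  firstHour: 0,
  lastHour: 0,
};
console.log(totalArrivals(rows, criteria)); // 3
```

Each question in the list corresponds to one field of the criteria (or to running the same function again with a different date for comparison).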
- barty.html: The overall UI, etc.
- barty.js: this contains the central initialization method, and defines the global barty.
- barty.constants.js: contains barty.constants. Here is where whence and the php paths are defined.
- barty.css: styling the html
- barty.ui.js: vast file that adjusts visibility, contents of controls, formats of strings, etc., and responds to user changes to their selection criteria. In later plugins, some of this would be in a userActions file.
- bartyManager.js: the main controller, defines barty.manager. Importantly, assembles the POST parameters its method doBucketOfData sends to PHP. NOTE: this whole mechanism ought to be updated to use the Fetch interface; we would use that and make a barty.phpConnect file.
- bartyCODAPConnector.js: establishes communication with CODAP, and outputs any data items received from the DB.
- bartyMeetings.js: regulates the meetings parameters and count adjustments if there is a secret meeting.
- bartStations.js: defines the JSON object barty.stations that makes the stations table in the MySQL database obsolete.
- php/getBARTYdata.php: receives a POST from barty.manager, then uses PDO (thankfully) to extract the specified data from the DB.
- php/establishCredentials.php: refers to the publicly-inaccessible credentials file that contains MySQL passwords. Requires that whence be set properly.
- sql/barty entire 2015.sql: A huge (533 MB) SQL file containing all of the data for 2015.
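The note on bartyManager.js suggests moving the POST mechanism to the Fetch interface. A minimal sketch of what that might look like follows. It is not the existing code: the function names and the criteria fields are hypothetical, the real php path is defined in barty.constants.js, and the sketch assumes the PHP script replies with JSON.

```javascript
// Hypothetical endpoint path; the real one is defined in barty.constants.js.
const PHP_URL = "php/getBARTYdata.php";

// Build the arguments for fetch(). The criteria field names are illustrative only.
function buildBucketRequest(criteria) {
  // URLSearchParams produces a form-encoded body, so PHP can read it from $_POST.
  const body = new URLSearchParams(criteria);
  return {
    url: PHP_URL,
    options: { method: "POST", body },
  };
}

// Send the request and parse the reply, assuming the PHP script returns JSON.
async function fetchBucketOfData(criteria) {
  const { url, options } = buildBucketRequest(criteria);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Request failed: ${response.status}`);
  }
  return response.json();
}

const request = buildBucketRequest({ date: "2018-01-01", hour: 0, origin: "12TH" });
console.log(request.options.body.toString()); // date=2018-01-01&hour=0&origin=12TH
```

Keeping the request-building step separate from the network call makes the parameters easy to inspect without contacting the server.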
- hours: The large, 40M-record table, with one record for each hour and each pair of stations.
The database is huge, so it needs special care. How huge? For every hour, we have the number of riders between every pair of stations. There are about 50 stations, so there are 2500 origin-destination pairs. Multiply by 20 hours in a day and you get 50,000 records per day. Times 400 days (rounding up) and you get 20,000,000 records per year. Because many of these cells are blank (and we overestimated), the actual number is more like 10,000,000 records. As a CSV, this amounts to about 260 MB per year; 40 MB zipped. So four years is 160 MB zipped, about 1 GB unzipped.
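The estimate above can be checked with a few lines of arithmetic. This sketch uses only the rounded figures from the text; the variable names are made up for illustration.

```javascript
// Back-of-the-envelope size estimate, using the rounded figures from the text.
const stations = 50;     // "about 50 stations"
const hoursPerDay = 20;  // rounded hours per day
const daysPerYear = 400; // deliberately rounded up from 365

const pairs = stations * stations;                     // 2,500 origin-destination pairs
const recordsPerDay = pairs * hoursPerDay;             // 50,000
const upperBoundPerYear = recordsPerDay * daysPerYear; // 20,000,000

// File sizes quoted in the text, per year of data.
const csvMBPerYear = 260;
const zippedMBPerYear = 40;
const years = 4;

console.log(upperBoundPerYear);       // 20000000
console.log(zippedMBPerYear * years); // 160
console.log(csvMBPerYear * years);    // 1040 (about 1 GB)
```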
First, there is a folder on Google Drive here: https://drive.google.com/drive/u/0/folders/1oHRtK0AwT1V-QplmBc9EF0tpn9TNNN3-
As of February 2019, it contains:
- A zip with four csv files, one for each year from 2015 to 2018.
- A .sql file that, when run, deletes the existing hours table and recreates it, empty (but with the correct structure).
Here is that .sql file:
DROP TABLE IF EXISTS `hours`;
CREATE TABLE `hours` (
`date` date DEFAULT NULL,
`hour` int(11) unsigned DEFAULT NULL,
`origin` varchar(4) DEFAULT NULL,
`destination` varchar(4) DEFAULT NULL,
`riders` int(11) unsigned DEFAULT NULL,
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`),
KEY `date` (`date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
In future years, you should be able to download a new annual file from BART, here: http://64.111.127.166/origin-destination/
To orient you (and to make it so you don't have to open one of these files), the .csv for 2018 starts like this:
2018-01-01,0,12TH,12TH,3
2018-01-01,0,12TH,16TH,1
2018-01-01,0,12TH,BAYF,1
2018-01-01,0,12TH,CAST,3
2018-01-01,0,12TH,CIVC,2
Notice that there are no column headers in the first row. The table set-up SQL is such that the variables are in the right order, although the auto-increment index id is not to be imported.
The order, then, for each line, is: date, hour, origin, destination, riders. Looking at the last line, that means that between midnight and 12:59 AM on New Year's morning, two people got off at Civic Center, having gotten on at 12th Street Oakland.
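As a quick check on that column order, here is a minimal sketch that turns the header-less sample lines into labeled records. The helper name parseHoursLine is hypothetical and is not part of the plugin; the input lines are the ones quoted above.

```javascript
// Column order of the header-less CSV, as described in the text.
const COLUMNS = ["date", "hour", "origin", "destination", "riders"];

// Convert one CSV line into a labeled record (hypothetical helper).
function parseHoursLine(line) {
  const fields = line.trim().split(",");
  if (fields.length !== COLUMNS.length) {
    throw new Error(`Expected ${COLUMNS.length} fields, got ${fields.length}: ${line}`);
  }
  const [date, hour, origin, destination, riders] = fields;
  // hour and riders are integers in the table definition, so convert them.
  return { date, hour: Number(hour), origin, destination, riders: Number(riders) };
}

const sampleLines = [
  "2018-01-01,0,12TH,12TH,3",
  "2018-01-01,0,12TH,16TH,1",
  "2018-01-01,0,12TH,BAYF,1",
  "2018-01-01,0,12TH,CAST,3",
  "2018-01-01,0,12TH,CIVC,2",
];

const records = sampleLines.map(parseHoursLine);
const last = records[records.length - 1];
console.log(`${last.riders} riders from ${last.origin} to ${last.destination} in hour ${last.hour}`);
// 2 riders from 12TH to CIVC in hour 0
```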
- Download and run barty-table-defs.sql.
- Download and unzip barty-hourly-2015to2018.zip. You will get four .csv files.
- IMPORT each .csv into the now-extant hours table; if you use Sequel Pro, you will have a chance to make sure the fields in the csv correspond correctly to the ones in the SQL.