This is a collection of scripts to download, convert, sort and clean submissions to the Zero Carbon Bill so that they can be analysed.
The Ruby and Python scripts here are:
- python download.py to download the PDFs listed in urls.json
- text.rb to convert the PDFs in pdfs/ to text
- sorter.rb to sort the text submissions into types
- clean_text.rb to turn submissions that follow the online submission format into JSON files
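
To illustrate the download step, here is a minimal sketch of what a script like download.py might do. It is not the actual script from this repository. It assumes that urls.json is a flat JSON array of URL strings and that PDFs are saved into pdfs/ under the file name found in each URL; the function names target_path and download_all are hypothetical.

```python
import json
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def target_path(url, out_dir="pdfs"):
    """Map a URL to a local path inside out_dir, keeping the URL's file name."""
    # Fall back to a generic name if the URL path does not end in a file name.
    name = Path(urlparse(url).path).name or "submission.pdf"
    return Path(out_dir) / name


def download_all(url_file="urls.json", out_dir="pdfs"):
    """Download every URL listed in url_file, skipping files already present."""
    # Assumption: url_file holds a JSON array of URL strings.
    urls = json.loads(Path(url_file).read_text())
    Path(out_dir).mkdir(exist_ok=True)
    for url in urls:
        dest = target_path(url, out_dir)
        if dest.exists():
            continue  # already downloaded on a previous run
        urllib.request.urlretrieve(url, str(dest))
```

Calling download_all() from the repository root would read urls.json and fill pdfs/. Skipping files that already exist makes it cheap to re-run the script after an interrupted download.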
The pdfs and txt folders are ignored by git, but the sorted folder is included.