Parse ProQuest Metadata

This notebook includes a Python function that parses newspaper articles downloaded from ProQuest Newsstream into a pandas DataFrame (and saves it to CSV), with metadata and, when available, the full text.
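As a rough illustration of the idea, and not the notebook's actual function, the sketch below parses a plain-text export into one dictionary per article. It assumes records are separated by a line of underscores and that each metadata field sits on a line of the form `Name: value`. That layout, the `FIELDS` list, and the names `parse_records`, `to_dataframe`, and `SAMPLE` are all assumptions made for this example; check them against your own download.

```python
import re

# Assumed layout: each record is delimited by a line of underscores.
RECORD_SEP = re.compile(r"^_{10,}[ \t]*$", flags=re.MULTILINE)

# Hypothetical field labels; adjust to match the labels in your export.
FIELDS = ("Title", "Author", "Publication title", "Publication date", "Full text")

# Made-up sample data in the assumed layout.
SAMPLE = """\
____________________________________________________________
Title: First headline
Publication title: Example Gazette
Publication date: Jan 1, 2020
Full text: First paragraph.
Second paragraph.
____________________________________________________________
Title: Second headline
Publication title: Example Gazette
Publication date: Jan 2, 2020
____________________________________________________________
"""


def parse_records(text):
    """Return a list of dicts, one per record, keyed by field label."""
    rows = []
    for chunk in RECORD_SEP.split(text):
        row, current = {}, None
        for line in chunk.splitlines():
            # A line that starts with a known label opens a new field.
            name = next((f for f in FIELDS if line.startswith(f + ":")), None)
            if name is not None:
                current = name
                row[current] = line[len(name) + 1:].strip()
            elif current is not None and line.strip():
                # Any other non-blank line continues the current field.
                row[current] = (row[current] + "\n" + line.strip()).strip()
        if row:
            rows.append(row)
    return rows


def to_dataframe(rows, csv_path=None):
    """Build a pandas DataFrame from the rows and optionally save it as CSV."""
    import pandas as pd  # imported here so parsing alone needs no pandas

    df = pd.DataFrame(rows)  # fields missing from a record become NaN
    if csv_path is not None:
        df.to_csv(csv_path, index=False)
    return df
```

Calling `to_dataframe(parse_records(SAMPLE), "articles.csv")` would produce a two-row table in which the second article has no value in the `Full text` column, mirroring the case where full text is not available. Matching against a fixed list of labels, rather than any `Word: ...` pattern, keeps colons inside article text from being mistaken for new fields.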

Created by Cody Hennesy and David Naughton (University of Minnesota, Twin Cities, Libraries). Email Cody ([email protected]) with any questions.

For an alternative approach that uses R and saves documents as HTML files, see Jae Yeon Kim's Tidy Ethnic News parser.

See also: Factiva parser