This repository has been archived by the owner on Apr 22, 2024. It is now read-only.

Handle possibility of duplicate courses being inserted to course table (#174) (#175)

* Drop duplicate courses; log number of dropped courses

* Improve logging messages
ssciolla authored May 22, 2020
1 parent 54e9aed commit 700b986
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion course_inventory/inventory.py
@@ -173,10 +173,14 @@ def gather_course_data_from_api(account_id: int, term_ids: Sequence[int]) -> pd.
     num_course_dicts_with_students = len(course_dicts_with_students)

     logger.info(f'Course records with students: {num_course_dicts_with_students}')
-    logger.info(f'Dropped {num_course_dicts - num_course_dicts_with_students} records')
+    logger.info(f'Dropped {num_course_dicts - num_course_dicts_with_students} course record(s) with no students')

     course_df = pd.DataFrame(course_dicts_with_students)
     course_df = course_df.drop(['total_students'], axis='columns')
+    orig_course_count = len(course_df)
+    course_df = course_df.drop_duplicates(subset=['canvas_id'], keep='last')
+    logger.info(f'Dropped {orig_course_count - len(course_df)} duplicate course record(s)')
+
     logger.debug(course_df.head())
     return course_df
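
The de-duplication step in this change can be tried in isolation. The following is a minimal sketch, not the project's code: the helper name `drop_duplicate_courses`, the `name` column, and the sample records are hypothetical and exist only for illustration. It shows that `drop_duplicates(subset=['canvas_id'], keep='last')` keeps the final row for each repeated `canvas_id` and that the logged count is the difference in row counts before and after.

```python
import logging

import pandas as pd

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def drop_duplicate_courses(course_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop rows that repeat a canvas_id, keeping the last one."""
    orig_course_count = len(course_df)
    deduped_df = course_df.drop_duplicates(subset=['canvas_id'], keep='last')
    logger.info(f'Dropped {orig_course_count - len(deduped_df)} duplicate course record(s)')
    return deduped_df


# Made-up sample records; canvas_id 101 appears twice.
sample_df = pd.DataFrame([
    {'canvas_id': 101, 'name': 'Intro to Stats (earlier record)'},
    {'canvas_id': 102, 'name': 'Linear Algebra'},
    {'canvas_id': 101, 'name': 'Intro to Stats'},
])

result_df = drop_duplicate_courses(sample_df)
print(result_df)
```

With `keep='last'`, the later record for a repeated `canvas_id` is retained and the surviving rows stay in their original relative order. Whether the last record is the preferred one depends on the order in which the API returns courses, which this commit does not state.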
