Spotify Stat Wrapper

This script processes and analyzes your Spotify listening history data stored in JSON files. The analysis includes:

First song ever played in each year.
Total music listened to over the years.
Most and least skipped songs.
Most and least played songs.
Saving the analysis results to an Excel file.

Requirements

Python 3.x
Pandas library
openpyxl library
JSON files from your Spotify listening history

You can install the required libraries using pip:

pip install pandas openpyxl

Structure of the Data

The script expects the Spotify data in JSON files with the following structure:

Each JSON file contains an array of listening events.
Each event contains details such as DateTime, Track Name, Artist, Time Played(ms), and whether the track was skipped.

Example JSON structure

[
    {
        "DateTime": "2024-12-21T12:00:00Z",
        "Platform": "Spotify",
        "Time Played(ms)": 200000,
        "Country": "IN",
        "IP_Addr": "192.168.1.1",
        "Track Name": "Track A",
        "Artist": "Artist A",
        "Album": "Album A",
        "URL": "https://spotify.com/track/abc123",
        "episode_name": null,
        "show_name": null,
        "episode_uri": null,
        "reason_start": "user",
        "reason_end": "user",
        "shuffle": false,
        "skipped": false,
        "offline": false,
        "Offline_timestamp": null,
        "Incognito Mode": false
    },
    ...
]
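Before running the analysis, it can help to confirm that every event carries the fields described above. The following is a minimal sketch using only the standard library. The `REQUIRED_FIELDS` set and the `missing_fields` helper are hypothetical names chosen for illustration, and the choice of fields is an assumption based on the description in this section, not taken from the script.

```python
import json

# Assumption: these are the fields the analysis relies on, taken from the
# description above; the script itself may use a different set.
REQUIRED_FIELDS = {"DateTime", "Track Name", "Artist", "Time Played(ms)", "skipped"}


def missing_fields(events):
    """Return a dict mapping event index -> sorted list of missing field names."""
    problems = {}
    for index, event in enumerate(events):
        missing = REQUIRED_FIELDS - set(event)
        if missing:
            problems[index] = sorted(missing)
    return problems


sample = json.loads("""
[
    {"DateTime": "2024-12-21T12:00:00Z", "Track Name": "Track A",
     "Artist": "Artist A", "Time Played(ms)": 200000, "skipped": false},
    {"DateTime": "2024-12-21T12:05:00Z", "Track Name": "Track B"}
]
""")
print(missing_fields(sample))  # {1: ['Artist', 'Time Played(ms)', 'skipped']}
```

An empty result means every event has the expected keys.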

Script Overview

`main()` function:

Reads all JSON files containing Spotify listening data.
Aggregates the data across all files and processes it using pandas.
For each year, it performs the following:
- Extracts the first song played.
- Sums up the total time played.
- Analyzes the most and least skipped songs.
- Analyzes the most and least played songs.
- Saves the results to an Excel file.
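The read-and-aggregate step described above might look roughly like the sketch below. `load_history` and the `Year` column are hypothetical names, and reading every `*.json` file in a folder is an assumption; the actual `main()` may select and combine files differently.

```python
import glob
import json
import os

import pandas as pd


def load_history(folder):
    """Read every *.json file in `folder` and return one combined DataFrame."""
    frames = []
    for path in sorted(glob.glob(os.path.join(folder, "*.json"))):
        with open(path, encoding="utf-8") as handle:
            frames.append(pd.DataFrame(json.load(handle)))
    if not frames:
        raise FileNotFoundError(f"No JSON files found in {folder!r}")
    combined = pd.concat(frames, ignore_index=True)
    # Parse the ISO timestamps so the year can be extracted per event.
    combined["DateTime"] = pd.to_datetime(combined["DateTime"], utc=True)
    combined["Year"] = combined["DateTime"].dt.year
    # Chronological order makes "first song of the year" a simple lookup.
    return combined.sort_values("DateTime").reset_index(drop=True)
```

Because the result is sorted by time, `load_history(folder).groupby("Year").head(1)` would give the first event of each year.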

Analysis Functions:

get_years(df): Extracts unique years from the data.
first_song(i, df): Prints the first song listened to for the year (or overall).
sum_heard_music(i, df): Prints the total time spent listening to music in different units (milliseconds, seconds, minutes, hours, days).
filtering(df): Filters the data and computes the total time played, skip count, and play count for each song.
most_skipped_least_skipped(df): Finds the most and least skipped songs.
most_played_least_played(df): Finds the most and least played songs.
save_to_csv(most_skipped, least_skipped, most_played, least_played, year, run_loop, all_lines_df=None): Saves the analysis results to an Excel file.
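To illustrate the kind of computation these functions describe, here is a small pandas sketch. It is not the script's implementation: `listening_time_units`, `per_song_stats`, and the output column names are hypothetical, and it assumes the `skipped` column holds plain booleans with no missing values.

```python
import pandas as pd


def listening_time_units(total_ms):
    """Express a millisecond total in the units listed for sum_heard_music."""
    seconds = total_ms / 1000
    return {
        "milliseconds": total_ms,
        "seconds": seconds,
        "minutes": seconds / 60,
        "hours": seconds / 3600,
        "days": seconds / 86400,
    }


def per_song_stats(df):
    """Total time played, skip count, and play count per (track, artist)."""
    return (
        df.groupby(["Track Name", "Artist"])
        .agg(
            total_ms=("Time Played(ms)", "sum"),
            skip_count=("skipped", "sum"),  # True counts as 1
            play_count=("Time Played(ms)", "count"),
        )
        .reset_index()
    )


events = pd.DataFrame(
    [
        {"Track Name": "Track A", "Artist": "Artist A", "Time Played(ms)": 200000, "skipped": False},
        {"Track Name": "Track A", "Artist": "Artist A", "Time Played(ms)": 5000, "skipped": True},
        {"Track Name": "Track B", "Artist": "Artist B", "Time Played(ms)": 180000, "skipped": False},
    ]
)
stats = per_song_stats(events)
most_played = stats.sort_values("play_count", ascending=False).head(1)
most_skipped = stats.sort_values("skip_count", ascending=False).head(1)
```

Sorting the same table in ascending order gives the least played and least skipped songs.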

Running the Script

To run the script, execute the Python file:

python main.py

Naming Convention and location of JSON files

The script expects the JSON files to follow the naming convention: Streaming_History_Audio_<Year>_<Unique Digit>.json
Example: Streaming_History_Audio_2022-2023_1.json, Streaming_History_Audio_2024_2.json, etc.
Ensure that your JSON files are placed in a folder named: my_spotify_data\Spotify Extended Streaming History. If your data is stored in a different folder, update the path_to_json variable in the script accordingly.
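As a rough illustration of the naming convention, the sketch below matches file names with a regular expression. The pattern is an assumption inferred from the two examples above (a four-digit year or a year range, followed by a numeric index); `parse_history_name` and `find_history_files` are hypothetical helpers, not functions from the script.

```python
import re
from pathlib import Path

# Assumption: <Year> is a four-digit year or a range such as 2022-2023, and
# <Unique Digit> is one or more digits, inferred from the examples above.
HISTORY_FILE = re.compile(r"^Streaming_History_Audio_(\d{4})(?:-(\d{4}))?_(\d+)\.json$")


def parse_history_name(filename):
    """Return (first_year, last_year, index), or None if the name does not match."""
    match = HISTORY_FILE.match(filename)
    if match is None:
        return None
    first_year = int(match.group(1))
    last_year = int(match.group(2)) if match.group(2) else first_year
    return first_year, last_year, int(match.group(3))


def find_history_files(folder):
    """List matching files in `folder`, ordered by their numeric index."""
    matches = [p for p in Path(folder).iterdir() if parse_history_name(p.name)]
    return sorted(matches, key=lambda p: parse_history_name(p.name)[2])
```

For example, `parse_history_name("Streaming_History_Audio_2022-2023_1.json")` returns `(2022, 2023, 1)`, while a name that does not follow the convention returns `None`.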

Output

The script generates the following output:

Excel files for each year analyzed, saved as .xlsx. Each includes:
- Most skipped tracks.
- Least skipped tracks.
- Most played tracks.
- Least played tracks.
A comprehensive file named Entire History.xlsx, containing the combined analysis of all data.
A raw data CSV file named Entire raw Audio Data from start to end.csv, containing the entire dataset used for analysis.
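One possible way to write several result tables into a single .xlsx file with pandas and openpyxl is sketched below. Putting each table on its own sheet is an assumption made for illustration; the script's actual workbook layout may differ, and `save_tables` is a hypothetical helper.

```python
import pandas as pd


def save_tables(path, tables):
    """Write each DataFrame in `tables` to its own sheet of one .xlsx file."""
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        for sheet_name, frame in tables.items():
            # Excel limits sheet names to 31 characters.
            frame.to_excel(writer, sheet_name=sheet_name[:31], index=False)
```

A call such as `save_tables("2024.xlsx", {"Most played": most_played_df})` would produce one workbook for the year, where `most_played_df` stands for any DataFrame of results.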

Notes

The script converts timestamps to Indian Standard Time (IST).
Make sure your Spotify JSON files contain the necessary fields for accurate analysis.
If no suitable JSON files are found, the script will display an error message in the console.
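The IST conversion mentioned above can be illustrated with the standard library alone. This is a sketch, not the script's code: `to_ist` is a hypothetical helper, and it assumes the timestamps are UTC strings in the format shown in the example JSON.

```python
from datetime import datetime, timedelta, timezone

# IST is a fixed offset of UTC+05:30 with no daylight saving time.
IST = timezone(timedelta(hours=5, minutes=30), name="IST")


def to_ist(timestamp):
    """Convert an ISO 8601 UTC timestamp such as '2024-12-21T12:00:00Z' to IST."""
    # Replace the trailing 'Z' so fromisoformat also works before Python 3.11.
    utc_time = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    return utc_time.astimezone(IST)


print(to_ist("2024-12-21T12:00:00Z").isoformat())  # 2024-12-21T17:30:00+05:30
```

The conversion matters for per-year results: an event at 2023-12-31T20:00:00Z falls on 1 January 2024 in IST, so it counts toward 2024 once timestamps are converted.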

TODO:

A GUI!
- Select multiple timezones for output
- Output to CSV should be optional
- Year filtering chosen by the user
The Skipped/Skipped_Data variable was only added to the extended data in late 2022, so a workaround is needed to identify skipped tracks before 2023.
Integrate with the liked songs playlist and then find out what songs you've played the least, most, etc.
Simplify the initial data frame input workflow

Disclaimer

This script is provided "as is" without any guarantees or warranties. The author is not liable for any errors, data loss, or other issues that may arise from its use. Use it at your own risk.

mohdaadilf/Spotify-Stat-Wrapper
