jorgeluis8ar/Pearl-Jam-Lyrical-Musical-Analysis

Pearl Jam Analysis in R

Musical and lyrical analysis of Pearl Jam's songs

This repository comprises a musical and lyrical analysis of Pearl Jam's songs, using data from the Spotify and Genius APIs. Scripts query the musical properties of the songs on all 11 Pearl Jam albums. I then use a script to scrape the lyrics of several songs and, finally, use the Genius API to download the remaining ones. Related works include Kendrick Lamar sentiment analysis, Gloom Index to find Radiohead's most depressing song by Charlie Thompson, Bob Dylan lyrical analysis by Paul Reiners, tidy sentiment analysis on Prince's music by Debbie Liske and Musical Lyrics Analysis on several artists by Bree McLennan.

The repo is organized as follows:

  • Data Assessment
    • Intellectual Property (Copyrights)
    • Songs selection: Spotify API
    • Lyrics: Scraping and using the Genius API
    • Processing the lyrics
    • Final Data set
  • Data exploration
    • Word counts by album and song
    • Wordclouds
    • Vocabulary diversity
    • Term Frequency Inverse Document Frequency (TF-IDF)
  • Sentiment Analysis and Natural Language Processing
    • NRC Sentiment
    • Bi-grams

Data Assessment

Intellectual Property (Copyrights)

I would like to acknowledge that all content presented in this repository is based on copyrighted songs. All rights to the data used belong to the Spotify API, the Genius API and, above all, the songwriters and composers of the songs. In no way do I intend to present this material as my own. Any mistakes are mine alone.

Songs selection: Spotify API

I first assign all the packages needed to the vector packages. In this section, the important package for querying the Spotify API is spotifyr.

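# Attach every package used in the analysis. require() returns FALSE
# (with a warning) for any package that is not installed.
packages <- c('spotifyr','lubridate','ggplot2','dplyr','tidytext',
              'stringr','tidyr','viridis','wordcloud', "tm",'forcats','fmsb',
              'scales','radarchart','qdap','knitr','geniusr','tidyverse')
lapply(packages, require, character.only = TRUE)
rm(packages)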

The package requires a Client ID and a Client Secret to query the API. To obtain them, the user must have a premium account in order to create a developer account. The process can be completed here. Once this is done, you can pull a Spotify access token into R with get_spotify_access_token(). Note that you can store your ID and secret as system environment variables so the function can read them. For more information on the package, refer to Charlie Thompson's spotifyr GitHub repository.

Sys.setenv(SPOTIFY_CLIENT_ID = 'xxxxxxxxxxxxxxxxxxxxx')
Sys.setenv(SPOTIFY_CLIENT_SECRET = 'xxxxxxxxxxxxxxxxxxxxx')

access_token <- get_spotify_access_token()
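Hard-coded credentials are easy to leak when a script is shared. As a minimal sketch, assuming the two variables are defined outside the script (for example in a user-level .Renviron file), the check below confirms they are present before a token is requested. The helper name credentials_set() is hypothetical and is not part of spotifyr.

```r
# Hypothetical helper (not part of spotifyr): returns TRUE only if every
# named environment variable is set to a non-empty value.
credentials_set <- function(vars = c("SPOTIFY_CLIENT_ID",
                                     "SPOTIFY_CLIENT_SECRET")) {
  all(nzchar(Sys.getenv(vars)))
}

if (!credentials_set()) {
  message("Set SPOTIFY_CLIENT_ID and SPOTIFY_CLIENT_SECRET ",
          "before calling get_spotify_access_token().")
}
```

Sys.getenv() returns an empty string for an unset variable, so nzchar() is enough to tell a missing credential from a present one.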

To get the audio features of Pearl Jam's songs we use the function get_artist_audio_features(). This function queries the Spotify API and returns information on several characteristics, such as danceability, energy, instrumentalness, valence, explicit content, track name and album name. I then keep only the songs that appear on the 11 albums, in order to eliminate live performances and covers and focus on self-written songs. Lastly, we verify that there are no repeated songs in the data set.

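# Query Spotify for the artist's tracks and their audio features.
# The artist argument shown here is an assumption; the original snippet
# starts from an already existing pj.
pj <- get_artist_audio_features('pearl jam')

# Keep only tracks from the 11 albums.
albums <- c("Riot Act", "Ten", "Yield", "Gigaton", "Backspacer", "Vs.", "Pearl Jam", "No Code", "Vitalogy", "Binaural", "Lightning Bolt")
pj <- pj[pj$album_name %in% albums,]

# Flag repeated track names and keep the first occurrence of each.
pj <- mutate(pj, dupli = ifelse(duplicated(track_name), 1, 0))
pj <- subset(pj, dupli == 0)

head(pj,5) %>%
  select(track_name, album_name, artist_name, album_release_date, danceability,
         instrumentalness, energy, valence, key_name, mode_name) %>%
  kable(format = "simple", col.names = str_to_title(gsub("[_]", " ", colnames(.))),
        align = 'lccccccccc', caption = "An example table caption.", digits = 3)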
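The order of the two cleaning steps matters, because duplicated() keeps the first occurrence of each track name. The sketch below uses a small made-up data frame (the album name "Live Album" and the row order are hypothetical) to show that filtering by album first stops a live version from displacing the studio version.

```r
# Toy rows standing in for the Spotify result; made up for illustration.
toy <- data.frame(
  track_name = c("Black", "Black", "Even Flow", "Even Flow"),
  album_name = c("Live Album", "Ten", "Ten", "Ten"),
  stringsAsFactors = FALSE
)

keep_albums <- c("Ten")

# Step 1: keep only rows from the selected albums.
toy <- toy[toy$album_name %in% keep_albums, ]

# Step 2: drop repeated track names, keeping the first occurrence.
toy <- toy[!duplicated(toy$track_name), ]

print(toy)
```

If the steps were swapped, the first "Black" row (from the hypothetical live album) would survive deduplication and then be removed by the album filter, leaving no version of the song at all.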

Lyrics: Genius API

As with Spotify's API, the Genius API requires a developer account in order to query information. To authenticate, the user must: 1. create a Genius API client here, 2. generate a client access token from the API Clients page and 3. set the credential in the system environment variable GENIUS_API_TOKEN by calling the function genius_token(). To fetch the lyrics for each song, I wrote a loop that goes through every song in the data set, collects its lyrics into a single string and stores that string in the newly created variable lyrics2.

pj <- mutate(pj, lyrics2 = "")
for (element in seq_len(nrow(pj))) {

  # Each iteration isolates one song and retrieves its lyrics with
  # get_lyrics_search(), which returns one row per lyric line.

  title <- str_to_title(pj$track_name[element])
  print(title)
  lyrics <- get_lyrics_search(artist_name = "Pearl Jam",
                              song_title = title)

  # If the first query returned nothing, repeat it once so that every
  # song gets a second chance to be matched to its lyrics.

  if (nrow(lyrics) == 0) {
    lyrics <- get_lyrics_search(artist_name = "Pearl Jam",
                                song_title = title)
  }

  # Reduce the retrieved data set to one observation by joining all the
  # lines of the song into a single string. A song with no retrieved
  # lines yields "".

  lyrics2 <- paste(lyrics$line, collapse = " ")
  print(lyrics2)

  # Finally the song is added to the original data

  pj$lyrics2[element] <- lyrics2
}
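Joining the lines of a song into one string can be checked in isolation with paste() and its collapse argument. The sketch below uses made-up placeholder lines in a column named line, mirroring the column the loop reads; no real lyrics are reproduced.

```r
# Placeholder lines standing in for the `line` column of a lyrics result.
lines_df <- data.frame(
  line = c("first line", "second line", "third line"),
  stringsAsFactors = FALSE
)

# collapse joins all elements of one vector into a single string.
one_string <- paste(lines_df$line, collapse = " ")

# An empty vector collapses to an empty string, not to NA.
empty_string <- paste(character(0), collapse = " ")

print(one_string)
```

Because an empty input collapses to "", a song for which no lines were retrieved ends up with an empty lyrics field that is easy to detect afterwards.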

Our data set now contains lyrics for all 146 songs on the 11 albums. Here is a glimpse at my top two Pearl Jam songs (according to Spotify rankings), Black and Even Flow:

pj <- inner_join(pj,lyrics)
colnames(pj)[ncol(pj)] <- 'lyrics'
songs <- c('Black', 'Even Flow')
pj[pj$track_name %in% songs,] %>% select(artist_name, album_name, track_name, lyrics) %>% 
  kable(format = "simple",col.names = str_to_title(gsub("[_]", " ", colnames(.))),align = 'lccccccccc')

Finally, we have to take instrumental songs into account. Even though Pearl Jam doesn't have many instrumental songs, three cases need to be addressed: Arc, Aya Davanita and Cready Stomp. I assign a missing value to the lyrics of these songs so that the following analysis handles them correctly.

songs <- c("Arc","Aya Davanita - Remastered","Cready Stomp - bonus track")
pj[pj$track_name %in% songs,grep('lyrics', colnames(pj))]
pj[pj$track_name %in% songs,grep('lyrics', colnames(pj))] <- NA
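Marking instrumentals as NA, rather than leaving an empty string, lets later steps exclude them with is.na(). The following is a minimal sketch on a made-up two-row data frame; the lyrics values are placeholders and the object names are hypothetical.

```r
# Toy data: one instrumental track and one track with (placeholder) lyrics.
toy <- data.frame(
  track_name = c("Arc", "Black"),
  lyrics = c("", "placeholder text"),
  stringsAsFactors = FALSE
)

instrumental <- c("Arc")

# Assign a missing value to the lyrics of instrumental tracks.
toy$lyrics[toy$track_name %in% instrumental] <- NA

# Later text-analysis steps can then work only on songs with lyrics.
has_lyrics <- toy[!is.na(toy$lyrics), ]
print(has_lyrics$track_name)
```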
