This is a follow-up to my previous article "Appetite for Stats: Insights on the world's most dangerous band" and is based on the second point of this comment by Renaissir on Reddit's Data is Beautiful.
The 44 songs from Appetite for Destruction, GN'R Lies and Use Your Illusion I and II had a lexical diversity of 12%, that is, about 39 unique words per song.
So what I'm doing now is analysing the lexical diversity of each of those songs, calculated as
Unique words per song / Total words per song, and comparing that value to the global 12%. And just for the sake of comparing old to new GnR, I also analyse the lexical diversity of the songs from Chinese Democracy!
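As a rough illustration of that ratio, here is a minimal sketch in plain Python. It is not the script used for this article: the `tokenize` and `lexical_diversity` helpers are hypothetical names, and the regex tokenizer is a simple stand-in for NLTK's tokenization, so the exact percentages it produces may differ from the ones in the tables.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes).

    Assumption: a simple regex stands in for a real tokenizer such as NLTK's.
    """
    return re.findall(r"[a-z']+", text.lower())

def lexical_diversity(text):
    """Return unique words / total words, or 0.0 for empty text."""
    words = tokenize(text)
    if not words:
        return 0.0
    return len(set(words)) / len(words)

# Made-up lyric line: 6 words in total, 3 of them unique.
sample = "la la la hey hey you"
print(f"{lexical_diversity(sample):.2%}")
```

Choices such as lowercasing or how contractions are split change the counts, so values from different tokenizers are not directly comparable.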
So here are the results:
|Album||Song||Lexical Diversity %|
|Appetite for Destruction||Welcome to the Jungle||36.9|
|Appetite for Destruction||It's So Easy||38.81|
|Appetite for Destruction||Nightrain||38.84|
|Appetite for Destruction||Out ta Get Me||41.67|
|Appetite for Destruction||Mr. Brownstone||45.25|
|Appetite for Destruction||Paradise City||50.38|
|Appetite for Destruction||My Michelle||53.23|
|Appetite for Destruction||Think About You||35.45|
|Appetite for Destruction||Sweet Child o' Mine||56.25|
|Appetite for Destruction||You're Crazy||35.29|
|Appetite for Destruction||Anything Goes||67.05|
|Appetite for Destruction||Rocket Queen||45.45|
|GN'R Lies||Used to Love Her||27.01|
|GN'R Lies||One in a Million||39.07|
|Use Your Illusion I||Right Next Door To Hell||53.94|
|Use Your Illusion I||Dust N' Bones||47.87|
|Use Your Illusion I||Don't Cry||33.33|
|Use Your Illusion I||Perfect Crime||51.15|
|Use Your Illusion I||You Ain't The First||67.12|
|Use Your Illusion I||Bad Obsession||28.13|
|Use Your Illusion I||Back OFF Bitch||39.73|
|Use Your Illusion I||Double Talking Jive||61.54|
|Use Your Illusion I||November Rain||45.45|
|Use Your Illusion I||The Garden||50.36|
|Use Your Illusion I||Garden Of Eden||42.17|
|Use Your Illusion I||Don't Damn Me||39.44|
|Use Your Illusion I||Bad Apples||38.31|
|Use Your Illusion I||Dead Horse||42.29|
|Use Your Illusion I||Coma||45.89|
|Use Your Illusion II||Civil War||43.96|
|Use Your Illusion II||14 Years||44.48|
|Use Your Illusion II||Yesterdays||42.46|
|Use Your Illusion II||Get In The Ring||58.11|
|Use Your Illusion II||Shotgun Blues||38.63|
|Use Your Illusion II||Breakdown||53.33|
|Use Your Illusion II||Pretty Tied Up||44.6|
|Use Your Illusion II||Locomotive||38.91|
|Use Your Illusion II||So Fine||35.67|
|Use Your Illusion II||Estranged||44.47|
|Use Your Illusion II||You Could Be Mine||49.21|
|Use Your Illusion II||Don't Cry (Alt)||40.48|
|Use Your Illusion II||My World||70.48|
So, what does all of this mean? You can see that the lexical diversity of each song is actually pretty high in most cases, which means that those songs don't repeat words that much. But what about that global 12% (total unique words in ALL GnR songs / total words in ALL GnR songs)? Well, we can conclude that, even though each song's lexical diversity is higher than 12%, overall, words tend to repeat from song to song.
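To see why the global figure can be so much lower than any single song's, here is a small sketch with three made-up "songs" that share most of their vocabulary. The song titles, lyrics and helper names are all hypothetical, and the regex tokenizer is again an assumption rather than the NLTK-based approach used for the real numbers.

```python
import re

def tokenize(text):
    """Lowercase and split into word tokens (assumed stand-in for NLTK)."""
    return re.findall(r"[a-z']+", text.lower())

def diversity(words):
    """Unique words / total words for a list of tokens."""
    return len(set(words)) / len(words) if words else 0.0

# Hypothetical lyrics that reuse the same few words across songs.
songs = {
    "Song A": "you know you want it all",
    "Song B": "you know it all went wrong",
    "Song C": "you want it all you know",
}

# Per-song diversity: each song is scored on its own words only.
per_song = {title: diversity(tokenize(lyrics)) for title, lyrics in songs.items()}

# Global diversity: pool every word from every song, then take the ratio once.
all_words = [word for lyrics in songs.values() for word in tokenize(lyrics)]
pooled = diversity(all_words)

for title, value in per_song.items():
    print(f"{title}: {value:.1%}")
print(f"All songs pooled: {pooled:.1%}")
```

Each toy song scores above 80% on its own, but the pooled ratio is much lower because the total word count keeps growing while the shared vocabulary barely does. Longer texts also tend to have a lower ratio in general, so part of the gap between the per-song values and the global value comes from length alone.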
This is where the controversy begins. For many die-hard Guns N' Roses fans the band ended in 1993, and the band that we have now is not GnR. But Axl has owned the brand for many years (since the early nineties) and, in my opinion, Chinese Democracy is not a bad album at all.
So, let's see what the lexical diversity of these songs is. I'll also analyse the album's lexical diversity in order to compare it with the Live Era albums.
Chinese Democracy Lexical Diversity: 15% (or roughly 1 out of every 7 words is unique)
|Album||Song||Lexical Diversity %|
|Chinese Democracy||Chinese Democracy||43.87|
|Chinese Democracy||Shackler's Revenge||28.51|
|Chinese Democracy||Street Of Dreams||44.78|
|Chinese Democracy||If The World||31.86|
|Chinese Democracy||There Was A Time||31.43|
|Chinese Democracy||Catcher In The Rye||46.8|
|Chinese Democracy||This I Love||34.35|
As you can see, each song's lexical diversity is also high, but with a global lexical diversity of 15% we can say again that a lot of words are used in more than one song.
Now, let's see what the lexical diversity of the previous albums is and compare that to Chinese Democracy.
The following treemap shows each album's lexical diversity as the composition of its songs' lexical diversity.
Chinese Democracy is the GnR album with the lowest lexical diversity. Does that mean it is a bad album? Well, in my opinion it is a really good album, and music is much more than math. It's about passion and feelings and things that, thankfully, we can't measure. But you know, we judge a book by its cover and read what we want between selected lines (Don't Damn Me).
The song analysis was done with this Python script, which uses the fantastic NLTK natural language processing library. You can get the song list from here and the Chinese Democracy song list from here.
The treemap was made using R's portfolio library.