Appetite for Stats: It's so Lexical

This is a follow-up to my previous article, "Appetite for Stats: Insights on the world's most dangerous band", and is based on the second point of this comment by Renaissir on Reddit's Data is Beautiful.

The 44 songs from Appetite for Destruction, GN'R Lies and Use Your Illusion I and II had a lexical diversity of 12%, that is, about 39 unique words per song.

So what I'm doing now is analysing the lexical diversity of each of those songs, calculated as unique words per song / total words per song, and comparing that value to the global 12%. And just for the sake of comparing old GnR to new, I also analyse the lexical diversity of the songs from Chinese Democracy!
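The per-song ratio described above can be sketched in a few lines of Python. This is only a minimal illustration, not the script used for the article: it tokenizes with a simple regular expression instead of NLTK, and the names `tokenize` and `lexical_diversity`, as well as the sample text, are made up for the example.

```python
import re


def tokenize(lyrics):
    """Lowercase the lyrics and split them into word tokens.

    Assumption: a word is any run of letters, digits or apostrophes.
    The article's own script relies on NLTK and may tokenize differently.
    """
    return re.findall(r"[a-z0-9']+", lyrics.lower())


def lexical_diversity(lyrics):
    """Return unique words / total words, expressed as a percentage."""
    words = tokenize(lyrics)
    if not words:
        return 0.0  # avoid dividing by zero for empty or instrumental tracks
    return 100.0 * len(set(words)) / len(words)


# Placeholder text, not real lyrics: 6 words in total, 3 of them unique.
sample = "la la la hey hey now"
print(lexical_diversity(sample))
```

Different tokenization choices (for example, how contractions or "o'" are split) would shift the percentages slightly, so the numbers from a sketch like this will not necessarily match the tables.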

So here are the results:

| Album | Song | Lexical Diversity % |
|---|---|---|
| Appetite for Destruction | Welcome to the Jungle | 36.9 |
| Appetite for Destruction | It's So Easy | 38.81 |
| Appetite for Destruction | Nightrain | 38.84 |
| Appetite for Destruction | Out ta Get Me | 41.67 |
| Appetite for Destruction | Mr. Brownstone | 45.25 |
| Appetite for Destruction | Paradise City | 50.38 |
| Appetite for Destruction | My Michelle | 53.23 |
| Appetite for Destruction | Think About You | 35.45 |
| Appetite for Destruction | Sweet Child o' Mine | 56.25 |
| Appetite for Destruction | You're Crazy | 35.29 |
| Appetite for Destruction | Anything Goes | 67.05 |
| Appetite for Destruction | Rocket Queen | 45.45 |
| GN'R Lies | Patience | 46.11 |
| GN'R Lies | Used to Love Her | 27.01 |
| GN'R Lies | One in a Million | 39.07 |
| Use Your Illusion I | Right Next Door To Hell | 53.94 |
| Use Your Illusion I | Dust N' Bones | 47.87 |
| Use Your Illusion I | Don't Cry | 33.33 |
| Use Your Illusion I | Perfect Crime | 51.15 |
| Use Your Illusion I | You Ain't The First | 67.12 |
| Use Your Illusion I | Bad Obsession | 28.13 |
| Use Your Illusion I | Back Off Bitch | 39.73 |
| Use Your Illusion I | Double Talking Jive | 61.54 |
| Use Your Illusion I | November Rain | 45.45 |
| Use Your Illusion I | The Garden | 50.36 |
| Use Your Illusion I | Garden Of Eden | 42.17 |
| Use Your Illusion I | Don't Damn Me | 39.44 |
| Use Your Illusion I | Bad Apples | 38.31 |
| Use Your Illusion I | Dead Horse | 42.29 |
| Use Your Illusion I | Coma | 45.89 |
| Use Your Illusion II | Civil War | 43.96 |
| Use Your Illusion II | 14 Years | 44.48 |
| Use Your Illusion II | Yesterdays | 42.46 |
| Use Your Illusion II | Get In The Ring | 58.11 |
| Use Your Illusion II | Shotgun Blues | 38.63 |
| Use Your Illusion II | Breakdown | 53.33 |
| Use Your Illusion II | Pretty Tied Up | 44.6 |
| Use Your Illusion II | Locomotive | 38.91 |
| Use Your Illusion II | So Fine | 35.67 |
| Use Your Illusion II | Estranged | 44.47 |
| Use Your Illusion II | You Could Be Mine | 49.21 |
| Use Your Illusion II | Don't Cry (Alt) | 40.48 |
| Use Your Illusion II | My World | 70.48 |

So, what does all of this mean? The lexical diversity of each song is actually pretty high in most cases, which means those songs don't repeat words that much. But what about that global 12% (total unique words in ALL GnR songs / total words in ALL GnR songs)? Well, we can conclude that, even though each song's lexical diversity is higher than 12%, words tend to repeat from song to song overall.
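To see why the global figure can sit well below every per-song figure, here is a toy sketch in Python. The two "songs" are invented placeholder text, and the helper names (`tokenize`, `diversity`) are hypothetical; the point is only that pooling the songs adds many total words but few new unique ones when the vocabulary overlaps.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (simple regex, not NLTK)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def diversity(words):
    """Unique words / total words, as a percentage."""
    return 100.0 * len(set(words)) / len(words) if words else 0.0


# Invented placeholder lyrics that share most of their vocabulary.
songs = {
    "Song A": "you know I love you baby",
    "Song B": "baby you know I need you",
}

# Per-song diversity: each song is measured on its own.
per_song = {title: diversity(tokenize(text)) for title, text in songs.items()}

# Global diversity: pool every word from every song, then measure once.
pooled_words = [word for text in songs.values() for word in tokenize(text)]
global_diversity = diversity(pooled_words)

for title, value in per_song.items():
    print(f"{title}: {value:.2f}%")
print(f"All songs pooled: {global_diversity:.2f}%")
```

Each toy song uses 5 unique words out of 6, but together they use only 6 unique words out of 12, so the pooled figure drops even though neither song is repetitive on its own.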

Chinese Democracy

This is where the controversy begins. For many die-hard Guns N' Roses fans the band ended in 1993, and the band we have now is not GnR. But Axl has owned the brand for many years (since the early nineties) and, in my opinion, this is not a bad album at all.

So, let's see what the lexical diversity of these songs is. I'll also analyse the album's lexical diversity in order to compare it with the Live Era albums.

Chinese Democracy Lexical Diversity: 15% (or roughly 1 out of every 7 words is unique)

| Album | Song | Lexical Diversity % |
|---|---|---|
| Chinese Democracy | Chinese Democracy | 43.87 |
| Chinese Democracy | Shackler's Revenge | 28.51 |
| Chinese Democracy | Better | 36.87 |
| Chinese Democracy | Street Of Dreams | 44.78 |
| Chinese Democracy | If The World | 31.86 |
| Chinese Democracy | There Was A Time | 31.43 |
| Chinese Democracy | Catcher In The Rye | 46.8 |
| Chinese Democracy | Scraped | 33.12 |
| Chinese Democracy | Sorry | 38.93 |
| Chinese Democracy | I.R.S. | 43.25 |
| Chinese Democracy | Madagascar | 36.5 |
| Chinese Democracy | This I Love | 34.35 |
| Chinese Democracy | Prostitute | 52.07 |

As you can see, each song's lexical diversity is also high, but with a global lexical diversity of 15% we can say again that a lot of words are used in more than one song.
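As a quick check of that gap, the per-song values from the Chinese Democracy table can be averaged and set against the 15% album figure. This is just a sketch of the arithmetic in Python; the variable names are made up, and the 15% is the rounded value quoted above, not something recomputed from the lyrics.

```python
from statistics import mean

# Per-song lexical diversity (%) copied from the Chinese Democracy table.
chinese_democracy = [
    43.87, 28.51, 36.87, 44.78, 31.86, 31.43, 46.8,
    33.12, 38.93, 43.25, 36.5, 34.35, 52.07,
]

# Average of the per-song percentages (each song measured on its own).
average_per_song = mean(chinese_democracy)

# Album-level figure quoted in the article (all songs pooled together).
album_level = 15.0

print(f"Average per-song diversity: {average_per_song:.2f}%")
print(f"Album-level diversity:      {album_level:.2f}%")
print(f"Gap:                        {average_per_song - album_level:.2f} points")
```

The average of the per-song values is not the same thing as the album-level figure: the album-level number counts a word only once no matter how many songs it appears in, which is why it comes out so much lower.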

Now, let's see what the lexical diversity of the previous albums is and compare that to Chinese Democracy.

Appetite for Destruction
GN'R Lies
Use Your Illusion I
Use Your Illusion II

The following treemap shows each album's lexical diversity as the composition of its songs' lexical diversity.

[Treemap: GNR Lexical Diversity]

Chinese Democracy is the GnR album with the lowest lexical diversity. Does that mean it is a bad album? Well, in my opinion it is a really good album, and music is much more than math. It's about passion and feelings and things that, thankfully, we can't measure. But you know, we judge a book by its cover and read what we want between selected lines (Don't Damn Me).

Guns N' Roses

Data and tools

The song analysis was done with this Python script, which uses the fantastic NLTK natural language processing library. You can get the songs list from here and the Chinese Democracy songs list from here.

The treemap was made using R's portfolio library.
