Visualizing Pitchfork Music Reviews

Visualization Project using Tableau and Pitchfork magazine review data

Project Description

For this project I created a set of visualizations using data from a collection of over 18,000 Pitchfork album reviews. The creator of this data-set had already explored some interesting questions with it, such as whether reviewers get tougher with more experience (they supposedly don’t), but I wanted to take it further. My set of visualizations starts with some basic information, including ratings by genre, then looks at artists and their ratings, and finally uses text-mining and natural language processing to visualize how writers describe different kinds of music.

Why Pitchfork?

I’ve generally enjoyed reading Pitchfork reviews every now and then, particularly for artists I appreciate and whom I believe they review honestly. That being said, despite being one of the most well-known music review websites, Pitchfork has a reputation for being very pretentious and for supposedly not understanding certain beloved genres. The pretension is often attributed to the ornate writing style of many of its reviewers, which leaves some readers bewildered by choice metaphors or adjectives that seem to be pulled out of a dated thesaurus. Bias towards certain genres is not something I’ve personally observed, but it seemed like an interesting thing to investigate with this data-set. This led me to the following problem statements.

Problem Statements:

  • Do Pitchfork reviews favor certain genres? How do the scores of albums in one genre compare to those in others?
  • Is there a way to visualize the supposed pretension of these reviews? Perhaps by breaking down and visualizing certain parts-of-speech such as adjectives?

Scope and Impact

The scope of this project is limited to this one data-set and does not involve cross-referencing any outside information about artists, albums or genres. The first couple of visualizations use the data as-is, while the last one uses data produced by text-mining the reviews. As there are over 18,000 review content records in the data-set, the text-mining portion had to be cut down to only reviews that were rated “Best New Music” to reduce processing time.

As for impact, while this is mostly a lighthearted project, it can help illuminate whether some perceptions of Pitchfork are accurate or not. For the question on favoring certain genres in particular, the results of this could provide evidence for discussions on representation and biases among critics in the industry.

The Dataset – Variables and Processing

The original data-set was stored in an SQLite database with several variables for reviews including the information about the album and artist as well as data on the review including written content, publish date, author and more. For this project I only used the following variables:

  • Genre: The genre of music, such as electronic or rock (nominal)
  • Artist: The name of the artist who created the album (nominal)
  • Album: The title of the album being reviewed (nominal)
  • Score: The score given by the reviewer from 0.0 to 10.0 with 10.0 being best (continuous)
  • Best New Music: Whether the album was rated as “best new music”, a seemingly arbitrary status given to music that reviewers felt deserved extra recognition (binary)
  • Label: The record label the artist was signed to for the album’s release (nominal)
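As a rough illustration of pulling these variables out of SQLite, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (reviews, content, reviewid, best_new_music) are assumptions made for this example and may not match the real database, and the rows are made up. The query also applies the “Best New Music” restriction used for the text-mining portion.

```python
import sqlite3

# Hypothetical, simplified schema with made-up rows; the real database
# may name or split its tables differently.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reviews (reviewid INTEGER PRIMARY KEY, artist TEXT,
                          title TEXT, score REAL, best_new_music INTEGER);
    CREATE TABLE content (reviewid INTEGER, content TEXT);
    INSERT INTO reviews VALUES (1, 'Artist A', 'Album A', 8.7, 1),
                               (2, 'Artist B', 'Album B', 6.1, 0);
    INSERT INTO content VALUES (1, 'A lush, patient record.'),
                               (2, 'A thin, forgettable record.');
""")


def best_new_music_reviews(connection):
    """Return (reviewid, artist, title, score, text) for Best New Music rows."""
    query = """
        SELECT r.reviewid, r.artist, r.title, r.score, c.content
        FROM reviews AS r
        JOIN content AS c ON c.reviewid = r.reviewid
        WHERE r.best_new_music = 1
        ORDER BY r.reviewid
    """
    return connection.execute(query).fetchall()


rows = best_new_music_reviews(conn)
```

Joining on the review identifier in the query keeps each review's text attached to its score and artist, which is what the later processing steps rely on.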

In addition to the above, for the purpose of visualizing the adjectives used in reviews I did some additional data-processing on the review contents to get the following variables:

  • Adjective: An adjective found in a review (nominal)
  • Adjective Count: The number of times the adjective appears in a review (continuous)
  • Sentiment Polarity: The sentiment “polarity” score, where scores of 0 mean the word is neutral, below 0 are negative and above are positive (continuous)

Data processing was done in Python with the help of the TextBlob package. After querying the database for the review content, I used TextBlob to give every word in each review its part-of-speech tag. This proved to be very time-consuming when performed on the entire data-set of 18,000+ reviews, so I decided to narrow it down to albums rated “best new music”, which left only 1,476 records. After getting the parts of speech, I extracted only the words that were tagged as adjectives. I then used Counter, from Python’s collections module, to create a dictionary of the adjectives and their occurrences in each review. Finally, I used TextBlob’s sentiment analysis function to get the polarity score for each word.

This was definitely an overly simple approach to sentiment analysis, as analyzing the text word-by-word ignores much-needed context that could completely change the intended meaning. I only took this approach to save some processing time and to avoid creating more convoluted data structures with complete sentences, but ideally the sentiment of each word would be analyzed within its sentence. After processing, I saved the data into a new CSV file and joined it with the non-processed data using the unique reviewID identifiers. All of the data processing code and files used to create the complete data-set are available on my GitHub.
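The adjective-counting step can be sketched roughly as follows. This is not the project's actual code: it is a minimal stand-in that takes already-tagged (word, tag) pairs so it runs without TextBlob installed, and the function name and sample review are made up for illustration. With TextBlob, the pairs would come from TextBlob(review_text).tags, and each adjective's polarity from TextBlob(word).sentiment.polarity.

```python
from collections import Counter

# Penn Treebank tags for adjectives (base, comparative, superlative).
ADJECTIVE_TAGS = {"JJ", "JJR", "JJS"}


def count_adjectives(tagged_tokens):
    """Count the adjectives in one review, given (word, tag) pairs."""
    return Counter(
        word.lower() for word, tag in tagged_tokens if tag in ADJECTIVE_TAGS
    )


# Hand-tagged stand-in for one review (illustrative data only).
sample_tags = [
    ("A", "DT"), ("lush", "JJ"), ("and", "CC"), ("patient", "JJ"),
    ("record", "NN"), ("with", "IN"), ("lush", "JJ"), ("strings", "NNS"),
]
adjective_counts = count_adjectives(sample_tags)
# adjective_counts now maps "lush" to 2 and "patient" to 1
```

Lower-casing before counting is a choice made in this sketch so that "Lush" and "lush" land in the same bucket.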


Part 1: Ratings By Genre

For the first step I wanted to address the question of whether Pitchfork clearly favors some genres over others. I approached this in two ways: by looking at average scores by genre, as well as the distribution of albums rated “Best New Music” by genre. Before developing the visuals, I determined the context, user and task as follows:

  • Context: These visuals are used to understand information about Pitchfork specifically. While they’re developed for online viewing, which enables filtering and interaction, I want the key data to be obvious.
  • User: Anybody interested in music and critical music reviews.
  • Task: Users want to understand which genres are scored best to worst, whether there are seemingly significant differences in how genres are scored, and how the distribution of scores or “best new music” awards relates to the distribution of reviews by genre.

For these visuals I used Tableau to develop a simple bar chart for average scores by genres and a stacked bar chart showing total reviews by genre and those awarded “best new music”. You can look at the distribution by genre dashboard on Tableau or the still image of the visuals below:

The visual on the right was fairly straightforward to develop, as it just aggregates all scores by genre into a bar chart. The chart on the left went through a few iterations. At first, I simply displayed the distribution of “best new music”, or the blue parts of the chart. The problem with this, as clearly shown above, is that some genres like Rock have a lot more total reviews than others, so having the most “best new music” albums doesn’t mean Rock is the favorite genre by rating, only that Pitchfork reviews more rock albums. An alternative was to display the percent of total albums per genre that are “best new music”. On one hand, it would have made the proportion immediately clear, but on the other hand, users wouldn’t be able to see the distribution of genres reviewed, which is interesting additional information. As Tableau allows filter options, I chose to display all the data and let users choose whether to look at all reviews, or only reviews that are or aren’t “best new music”. I’ve enabled the filter for both graphs, so users can also view average scores for titles awarded “best new music” or not.
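To make the count-versus-proportion point concrete, here is a small sketch that computes the average score and the share of “best new music” per genre. The function name and the sample rows are made up for illustration; the real aggregation was done inside Tableau.

```python
from collections import defaultdict


def genre_summary(reviews):
    """Summarize (genre, score, is_best_new_music) tuples per genre."""
    totals = defaultdict(lambda: {"reviews": 0, "score_sum": 0.0, "bnm": 0})
    for genre, score, is_bnm in reviews:
        entry = totals[genre]
        entry["reviews"] += 1
        entry["score_sum"] += score
        entry["bnm"] += 1 if is_bnm else 0
    return {
        genre: {
            "reviews": t["reviews"],
            "bnm": t["bnm"],
            "avg_score": t["score_sum"] / t["reviews"],
            "bnm_share": t["bnm"] / t["reviews"],
        }
        for genre, t in totals.items()
    }


# Made-up rows: rock has more "best new music" albums in absolute terms,
# but a smaller share of its reviews earn the tag.
sample = [
    ("rock", 9.0, True), ("rock", 8.5, True), ("rock", 7.0, False),
    ("rock", 6.5, False), ("rock", 6.0, False), ("rock", 7.0, False),
    ("rock", 5.0, False), ("rock", 7.0, False),
    ("jazz", 8.5, True), ("jazz", 7.0, False),
]
summary = genre_summary(sample)
```

In this toy data rock has two “best new music” albums to jazz's one, yet only a quarter of rock reviews carry the tag compared with half of jazz reviews, which is the distortion that raw counts alone would hide.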

Part 2: Scores and Awards by Artist

The second visual I wanted to create looks at artists, showing which ones had higher scores or were awarded “best new music”. This was trickier because while genres only had 10 categories, there are thousands of artists in the data-set. I first determined the context, user and task as follows:

  • Context: These visuals are used to understand Pitchfork reviews specifically. Given that there are thousands of unique artists, there is no sense in trying to make specific information obvious at first glance, and users are unlikely to expect that anyway. As this is available online, filtering options will help users look at select artists and ranges of scores as desired.
  • User: Anybody interested in music and critical music reviews.
  • Task: Users may want to get a general look at all artists and their scores, possibly filtering by specific artists and score ranges or number of albums awarded “best new music”.

I also used Tableau for these visuals, this time creating a bubble chart with multiple filters: by artist (look-up), total albums awarded best new music (scale), genre (multi-value list), and score (scale). I also color-coded the diagram by genre, which helped it visually and provided additional information. I used a bubble chart because bubble charts are visually appealing and good for exploratory and interactive visuals. With so many data points, the typical option for precise data would be a scatter-plot, but since I only had one meaningful continuous variable (score), a plot just looked poor and had too much overlap. The only issue with this approach was that adding genres separated artists who were listed under multiple genres into several bubbles. My solution was to format the data beforehand so that each artist belongs to a single genre: the one under which most of their albums are tagged. You can look at the Scores by Artist Dashboard on Tableau or the still image of the visuals below:

Artists with Albums reviewed by pitchfork
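The single-genre assignment described above can be sketched as follows. This is an illustrative version rather than the exact code used: the function name and sample rows are made up, and the tie-breaking rule is an assumption.

```python
from collections import Counter, defaultdict


def primary_genre_by_artist(album_rows):
    """Map each artist to the genre tagged on most of their albums."""
    genre_counts = defaultdict(Counter)
    for artist, genre in album_rows:
        genre_counts[artist][genre] += 1
    # most_common(1) returns the top (genre, count) pair; on a tie the
    # genre seen first wins, which is an arbitrary choice in this sketch.
    return {
        artist: counts.most_common(1)[0][0]
        for artist, counts in genre_counts.items()
    }


# Made-up (artist, genre) rows, one per album-genre tag.
album_rows = [
    ("Artist A", "rock"), ("Artist A", "electronic"), ("Artist A", "rock"),
    ("Artist B", "jazz"),
]
primary = primary_genre_by_artist(album_rows)
```

With one genre per artist, coloring by genre no longer splits an artist into multiple bubbles.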

Part 3: How does Pitchfork Describe Music?

The final visual I made was of adjectives used in the album reviews, presumably to describe the music. This first involved data processing as previously described, followed by determining what visual would best present the data. I determined context, user and task for this visual as follows:

  • Context: These visuals are used to understand how Pitchfork writers describe the albums they review. As the number of possible adjectives used would be in the thousands or even tens of thousands, this should be a more exploratory visual that allows for some detailed filtering.
  • User: Anybody interested in music and critical music reviews.
  • Task: Users may want to first see some words pop up as the most common, and then filter by certain attributes: possibly specific artists or genres, or looking at ranges of scores.

I once again used Tableau and created another bubble chart. Each bubble represents an adjective, and its size is determined by the total number of times that adjective appears across reviews. I also took the sentiment polarity calculated for each adjective and applied it to the visual on a color scale from red (negative sentiment) to blue (positive sentiment). The resulting bubble chart with all adjectives is fairly massive, so I added several filters: by artist (lookup), review score (range), sentiment polarity (range), adjective count (range), and genre (multiple selection). You can look at the Adjectives Dashboard on Tableau or the still image of the visuals below:

adjectives in music reviews
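The bubble sizes amount to summing each adjective's per-review counts across all reviews. Tableau does this aggregation itself; the sketch below only illustrates the arithmetic, with a made-up function name and sample counts.

```python
from collections import Counter


def total_adjective_counts(per_review_counts):
    """Sum per-review adjective counts into one total per adjective."""
    totals = Counter()
    for counts in per_review_counts:
        totals.update(counts)  # Counter.update adds counts, it does not overwrite
    return totals


# Made-up per-review counts for two reviews.
per_review = [
    {"lush": 2, "patient": 1},
    {"lush": 1, "bleak": 3},
]
bubble_sizes = total_adjective_counts(per_review)
```

Here "lush" ends up with a total of 3, so it would be drawn as large as "bleak" even though it never appears three times in a single review.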

A bubble chart seemed to be the most sensible chart I could make with Tableau for an exploratory visual. I prefer bubble charts to word clouds because it’s easier to gauge the size of a bubble than the size of words, which differ in character length and dimensions. I had hoped it would be possible to use Tableau to create bubble charts inside certain shapes, such as a guitar, which is something some word cloud generators can do. Tableau doesn’t let you do so, which is understandable since it’s more of a visual flourish at the expense of accurately representing data.


Many things can be done using Tableau, ranging from fairly simple and commonplace graphs such as bar charts to more complex ones. One of the greatest challenges in making visuals with Tableau was taking care to fully understand how every action you take changes the data being presented, often in ways you may not have predicted. For example, adding genre as a color in my artist score visual initially split each artist into several bubbles, so I first had to change the data-set to make sure each artist only had a single genre. Following the CUT-DDV framework helped a lot to narrow down my approach and to stay aware of how my final visualizations would be limited by the choices I made before I even created them. Deciding up front whether a visualization should make specific facts pop out or create a broad, exploratory, interactive experience made it a lot easier to identify which charts to use. Overall, this was a great experience in exploring the different capabilities of Tableau and data processing.