Helpful People: Ryan Petroff

It’s rare you get help from someone who asks for little in return. Ryan Petroff is one of those people.

I express my gratitude for the many times he has assisted me with my projects.

More than just a person to bounce ideas off of (though great at that), he was instrumental in my earliest days of programming.

Before I even knew that I wanted to pursue data science & automation, the gears had already been set in motion thanks to his help.

I was in university when I found myself in a pickle. I was taking a third-year Geography class as an elective, and had asked the professor the easiest way to make a cartogram.

I am a fan of geographic visualizations, and this was one I wanted to create. The professor said that as a non-geography major, I shouldn’t worry about such difficult tasks. That this was above what I was capable of.

So naturally, I had to create one.

I knew I had the potential to pull off this project, but was without the basic technological literacy to get started.

That professor did not expect someone like Ryan Petroff. An educator who will not only help someone out of generosity, but also inspire them while doing so.

Ryan helped me from the ground up with this project. He helped point me in the right direction at every turn. He found the project amusing. A challenge, but one that could be tackled in a single night. I won’t lie, it was a long night. But one that left a lasting impression. For the first time, I had manipulated data with my own hands.

Fast forward a couple of years. I spoke with Ryan about my interest in ‘data science’. He answered my questions and helped me build the knowledge required to secure a job teaching JavaScript and Python.

As I improved, Ryan would still help me with my projects. When facing an issue, he would often have a fix. But more importantly, he taught self-sufficiency. Each time Ryan helped resolve any issue, he’d walk me through the process used to find the answer from the beginning. During this process of dense skill acquisition, he taught me to think like a programmer, and how to think about code development.

His help in these matters inspired part of my philosophy of automation. Well-designed code frees people from unnecessary work. I have since developed several tools to make tedious tasks in my life disappear. I always consult Ryan when starting a new project.

Even now, as I teach students at the University of Toronto the ins and outs of data science, I still remember Ryan’s lessons. I do my best to teach my students the way that Ryan taught me.

So thank you, Ryan Petroff. Your contributions have not gone unnoticed.

Metacritic Reviews: Metascore vs. Userscore

For an upcoming project, I had to determine whether or not Metacritic’s ‘Metascore’ (aka the ‘critic score’) and ‘Userscore’ were different enough that it was worth analyzing both datasets. The data is for video game reviews, and each point represents the final score given for a game.

It turns out, they are similar, but less than you might think.

Metacritic Metascore vs. Userscore
Statisticians and data nerds can note that the correlation had a strength of 0.536, and a p-value of <0.05.


Metacritic Metascores and Userscores have a moderate positive correlation with one another. These results are statistically significant.
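For anyone curious how a correlation strength like this is calculated, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The function name `pearson_r` and the sample scores are made up for illustration; they are not the Metacritic data, and this is not necessarily the tool used for the chart above. It also does not compute the p-value.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of the products of each pair's deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: this divides by zero if either list is constant
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores for six games (NOT real Metacritic data)
metascores = [85, 70, 92, 60, 78, 55]
userscores = [8.1, 7.4, 8.0, 6.9, 5.5, 6.2]

r = pearson_r(metascores, userscores)
print(round(r, 3))
```

A result near 1 means the two scores rise and fall together, a result near 0 means there is little linear relationship, and a result near -1 means one falls as the other rises.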

To simplify that statement: Metascores and Userscores generally line up with one another. Most of the time, critic and user scores are pretty close to one another. When one group gives a high score to a game, generally so does the other. Likewise with low scores, or scores that fall somewhere in the middle.

Because the scores are only ‘moderately’ correlated, there are still a lot of times when the two differ. The little dots that are further away from the main “swarm” are games where one score is vastly different from the other.

With the number of datapoints and their placement, the chance that this result happened due to “dumb luck” is exceptionally low.

You may notice that Critic scores are out of 100, while User scores are only out of 10. Aside from giving a little less ‘wiggle room’ for users, this doesn’t affect the analysis too heavily. Users can only rate something out of 10, but the overall rating on a videogame can still have a decimal point in it (e.g. 8.7/10).
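If you want to compare the two scores directly, for instance to pick out the dots far from the main “swarm”, one option is to put the user score on the same 0-100 scale first. The sketch below assumes a simple multiply-by-10 conversion. The game titles, the scores, and the cutoff of 25 points are all hypothetical choices for illustration.

```python
def score_gap(metascore, userscore):
    """Critic score minus user score, with both on a 0-100 scale."""
    return metascore - userscore * 10

# Hypothetical (title, metascore out of 100, userscore out of 10) rows
games = [
    ("Game A", 88, 8.7),
    ("Game B", 91, 4.2),
    ("Game C", 55, 8.5),
]

THRESHOLD = 25  # arbitrary cutoff for "vastly different", an assumption

# Keep only the games where critics and users disagree by at least THRESHOLD points
disagreements = [
    (title, round(score_gap(meta, user), 1))
    for title, meta, user in games
    if abs(score_gap(meta, user)) >= THRESHOLD
]
print(disagreements)
```

A positive gap means critics liked the game more than users did, and a negative gap means the opposite.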


Even beyond looking at a specific genre, most users develop a ‘taste’ for certain games, based on franchises and game studios. Gamers learn what games they enjoy, and what games to avoid. Critics often lack that freedom, and may be circumstantially forced to play games they may have a negative predisposition towards, or would otherwise not consider better than “middle of the road”.

Critics may have to “take one for the team” and review the occasional game they know is going to be awful. Users just know to avoid them.

Another possible explanation is simply that critics score lower because they are…critics. Their goal is to accurately assess the quality of a game as best as possible.

Theories aside, the results of this correlation told me what I needed to know. ‘Metascores’ and ‘Userscores’ are different enough that I can include both in my next analysis.

Game Sales: a Visual Analysis

This’ll be the first in a series of articles/visualizations on ‘Data Trends’.

Let’s start with something simple: stats about videogame sales. This data was compiled by a website that focuses on sales trends within the gaming industry.

Much like how Hollywood movies have an ‘upward trend’ in the #1 box office sales each year, I expected the same to be true for videogames. I anticipated an upward trend in the number of sales of newer games, with the top-selling games constantly breaking sales records each year.

“Best-seller” of the year:

The following shows the sales of the ‘best-selling’ videogame of each year.

Game Sales by Year

This defied my initial expectations. We see peaks and valleys between the yearly best-sellers. A record of around 40 million was set in 1985, and it was not beaten until 2006.
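Finding the best-seller of each year boils down to grouping games by release year and keeping the one with the highest sales. Here is a minimal sketch of that idea in plain Python. The function name, the row layout, and the sample rows are assumptions for illustration, not the actual dataset or the code behind the chart.

```python
def best_seller_by_year(rows):
    """Map each release year to the (title, sales) pair with the highest sales."""
    best = {}
    for title, year, sales in rows:
        # Keep this game if it is the first seen for its year, or outsells the current best
        if year not in best or sales > best[year][1]:
            best[year] = (title, sales)
    return dict(sorted(best.items()))

# Hypothetical rows: (title, release year, global sales in millions)
rows = [
    ("Game A", 1985, 40.0),
    ("Game B", 1985, 3.1),
    ("Game C", 1986, 6.5),
    ("Game D", 1986, 4.0),
]

for year, (title, sales) in best_seller_by_year(rows).items():
    print(year, title, sales)
```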

Let’s take a second to talk about the top 7 selling videogames of all time:

#1: Wii Sports (2006)

#2: Super Mario Bros. (1985)

#3: Mario Kart Wii (2008)

#4: Wii Sports Resort (2009)

#5: Pokemon (1996)

#6: Tetris (1989)

#7: New Super Mario Bros. (2006)

So what’s similar between all of these games? Aside from being produced by Nintendo (who also hold the record for 22 of the top 25 best-sellers of all time).

I did some googling on these titles, and all 7 games had ‘bundle’ deals, where you could purchase them as part of a bundle with your game console, essentially giving you a free videogame with the purchase of the device to play videogames on.

Successful games would already be strong candidates to be bundled with the consoles, but these package deals may have contributed to making these games all-time best-sellers.

Even with the data in front of us, it’s hard to imagine Wii Sports selling 82.83 million copies. That’s more than twice the population of Canada (36.29 million in 2016).


Average sales per year:

Shocked that best-selling game records aren’t constantly being beaten, I looked to other statistics. If the best-selling video-game isn’t breaking records each year, I considered that perhaps the average game was selling more copies.

Once again, I was completely wrong.

Mean Videogame Sales by year

We clearly see a ‘golden age’ of game sales, starting in the mid-80’s and ending in the early 90’s. Things seem to flatten out after that, but sales are nowhere near the monster they were before.
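The mean sales per year can be worked out by grouping each game’s sales under its release year and averaging each group. A minimal sketch, assuming rows of (release year, sales in millions); the function name and the numbers are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

def mean_sales_by_year(rows):
    """Return {year: mean sales per game} for rows of (year, sales)."""
    by_year = defaultdict(list)
    for year, sales in rows:
        by_year[year].append(sales)
    return {year: mean(values) for year, values in sorted(by_year.items())}

# Hypothetical rows: (release year, global sales in millions)
rows = [(1985, 40.0), (1985, 2.0), (2010, 1.0), (2010, 3.0), (2010, 2.0)]

print(mean_sales_by_year(rows))
```

Keep in mind that a mean is easily pulled upward by a single giant seller, especially in years where only a handful of games were released.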


(Bear in mind this covers only the top ~16,600 games sold, with the bottom of the list getting only around 10k sales globally. The “bottom of the barrel” sellers that never made the list would likely pull the averages lower, even if only slightly.)

This prompts even more questions. I anecdotally hear about how the game industry is growing, and yet these numbers don’t reflect that.


Number of games made:

Games Made Per Year

Finally, something that rises, rather than falls!

While numbers seem to be on the decline from the mid-2000’s forward, the number of games being produced is still far higher than it was in the ‘early days’. It’s quite possible that the number of games on the market now is part of the reason why the average game sold fewer copies.

So, more games are being made, but are people buying them?

From the looks of it, yes. The total number of games SOLD per year is also increasing:


Total Game Sales:

Anecdotes about the gaming industry growing are not unfounded. It’s just that rather than a couple of games dominating in sales, there’s a more diverse marketplace from which to choose.

The ‘average’ game may be selling fewer copies past the mid-90’s, but with more options for games available, it’s not due to dwindling interest. There’s simply more competition, and more diversity in what gamers can buy.
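The arithmetic behind this is simple: the mean is the total sales divided by the number of games, so the total can grow while the mean shrinks, as long as the number of games grows faster. Here is a small sketch with made-up numbers (not the real data) showing exactly that; the function name and row layout are assumptions.

```python
def yearly_summary(rows):
    """Return {year: (games released, total sales, mean sales per game)}."""
    totals = {}
    for year, sales in rows:
        count, total = totals.get(year, (0, 0.0))
        totals[year] = (count + 1, total + sales)
    # mean = total / count for each year
    return {year: (c, t, t / c) for year, (c, t) in sorted(totals.items())}

# Hypothetical rows: (release year, global sales in millions)
# Two big sellers in 1990 versus twenty modest sellers in 2010
rows = [(1990, 10.0), (1990, 6.0)] + [(2010, 2.0)] * 20

for year, (count, total, avg) in yearly_summary(rows).items():
    print(year, count, total, avg)
```

In this invented example, the later year has more than double the total sales, yet the mean per game is a quarter of what it was.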

As they say, variety is the spice of life. When more games are available, more games are being bought.

Film Box Office Sales vs. Game Sales – Upward Trend

I made a comment earlier in this article about how film sales are on an ‘upward trend’. To validate my claim, the following is a trendline of the best-selling film of each year. Just to give some context, you can also see the best-selling videogame of each year below it.

Box Office Sales vs. Game Sales

Sources: Box office grosses:

It’s interesting to consider how best-selling movies are on an incline, while the best-seller for videogames is, by comparison, fairly steady. I would hypothesize that this has to do with the consolidation of film studios, leading to fewer studios competing, and more resources being pooled into marketing for the biggest films of the year, though that would be worth another analysis altogether.


Outliers, oddities, and points of note:

The data used in this analysis was scraped during 2017, likely during the beginning of the year. I chose to remove 2017 data because 2017 isn’t over yet, and many of the games that have come out this year are missing.
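Dropping an incomplete year is a one-line filter. A minimal sketch, assuming each row is (title, release year, sales in millions); the function name and rows are hypothetical.

```python
INCOMPLETE_YEAR = 2017  # the year still in progress when the data was scraped

def drop_incomplete_year(rows, year=INCOMPLETE_YEAR):
    """Return the rows whose release year is not the incomplete year."""
    return [row for row in rows if row[1] != year]

# Hypothetical rows: (title, release year, global sales in millions)
rows = [("Game A", 2015, 5.0), ("Game B", 2017, 0.4), ("Game C", 2016, 2.2)]

print(drop_incomplete_year(rows))
```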

Games that were sold cross-platform were evaluated by their sales for each individual platform. So, say, Grand Theft Auto V would be evaluated by its PS4 sales, its XBOX sales and its PC sales separately.
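This analysis kept each platform as its own entry. For comparison, here is a sketch of the alternative: adding up a game’s sales across all of its platforms. This is not what was done for the charts above, and the function name, titles, and figures are made up for illustration.

```python
from collections import defaultdict

def combine_platforms(rows):
    """Sum sales per title across platforms for rows of (title, platform, sales)."""
    totals = defaultdict(float)
    for title, platform, sales in rows:
        totals[title] += sales
    return dict(totals)

# Hypothetical rows: (title, platform, global sales in millions)
rows = [
    ("Game A", "PS4", 10.0),
    ("Game A", "PC", 1.5),
    ("Game B", "PS4", 2.0),
]

print(combine_platforms(rows))
```

Combining platforms this way would likely shuffle the rankings, since a multi-platform hit would count once with a larger total instead of several times with smaller ones.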

It’s also important to note that these dates mark the year the game debuted, which is not necessarily the year it made the most sales. The length of a game’s production run is therefore going to play a major factor in its global sales.

As a result, I would argue the data is ‘incomplete’ on any game that still hasn’t finished its production run. They still have potential to climb significantly in sales. Especially for games from 2016, which I suspect are still on the shelves.


I hope you enjoyed this analysis. If you’re interested in similar articles, be sure to check the Research & Data section of my site!