Manuscript Wordcounter/Grapher

I’m always looking at ways to make my life easier with code.

Sometimes, it’s through automating a process for myself. Other times, it’s about presenting information that informs my decisions.

This project, a word counter/visualizer, is a bit of both.

The manuscript writing software I use (Scrivener) is an excellent tool, but the Windows edition lacks certain features for tracking word counts over time.

NaNoWriMo, a writing event I had previously participated in, has a wonderful bar graph to track your word count from day to day.
(This is actually from Camp NaNoWriMo, but I don't have a screenshot from the November one.)

I wanted a similar tool for my own writing, outside of just November and outside of just NaNoWriMo.

Thus, I wanted my own wordcounter.

Scrivener works by keeping each project in a file-tree environment, where individual documents can be grouped into folders, and those folders ultimately make up the manuscript at large.

Scrivener gives you the raw numbers for an individual document, but I wanted to see my progress on a day-by-day basis, so I could track how much I was writing over the course of days, weeks, and months.

There are existing tools which graph word counts, but none that seemed to work across multiple files. Rather than manually tallying how much writing I had done across several documents every time I wanted to record my daily word count, I wanted a tool which adds it all up and interprets it for me automatically.

Fortunately, Scrivener has a feature that allows you to ‘sync’ your project and back it up as text files. Like so:

(Screenshot from the Windows version.)

The Python tools:

I made three Python files for this project, each handling a separate step.

Wordcounter.py reads the contents of your sync folder, counts the number of words, then stores that count in a CSV file, along with the date.

This gives me a solid record of my word count per day (as long as I run it each day).
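As a rough sketch of how that step might look (the folder path and CSV layout here are placeholders, not necessarily what Wordcounter.py actually uses):

# wordcounter_sketch.py - a minimal sketch of the logging step.
# The sync-folder path and CSV layout are assumptions for illustration.
import csv
from datetime import date
from pathlib import Path

SYNC_FOLDER = Path("scrivener_sync")  # folder of synced .txt files (assumed)
LOG_FILE = Path("wordcounts.csv")     # running log of (date, word count)

def count_words(folder: Path) -> int:
    """Sum the word counts of every .txt file in the sync folder."""
    total = 0
    for txt_file in folder.rglob("*.txt"):
        text = txt_file.read_text(encoding="utf-8", errors="ignore")
        total += len(text.split())
    return total

if __name__ == "__main__":
    total_words = count_words(SYNC_FOLDER)
    with LOG_FILE.open("a", newline="") as f:
        csv.writer(f).writerow([date.today().isoformat(), total_words])
    print(f"{date.today()}: {total_words} words")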

As for visualizing the records, Wordgrapher.py takes care of this.

Wordgrapher.py simply reads the contents of the CSV, then creates a bar graph visualizing that word count.
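A minimal sketch of that graphing step, assuming the same CSV layout as the counter sketch above (one date and word count per row):

# wordgrapher_sketch.py - a minimal sketch of the graphing step.
# Column names and styling here are placeholders, not my exact script.
import matplotlib.pyplot as plt
import pandas as pd

log = pd.read_csv("wordcounts.csv", names=["date", "words"])
plt.bar(log["date"], log["words"])
plt.xlabel("Date")
plt.ylabel("Word count")
plt.title("Manuscript word count by day")
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()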

I can now tell what my word counts were for each day.

There are tons of features and design tweaks that I will likely add to this tool later. Like my DJ tool, this was designed mostly with my own workflow in mind, though it may be of use to others.

Now that I have a tool to chart my progress, I just need to keep on writing!

DJ Beatmatch Encoder

It’s really rewarding when you develop a script that automates a small part of your life.

Recently, I managed to pull one off which automates part of designing DJ sets.

Before jumping into the program itself, I need to explain a couple of things about how I make my mixes and how I DJ. These are important for understanding how my tool works.

-The music I use to DJ consists of files saved on my hard drive.
-I scan the files with key detection software (Mixed in Key), which helps detect what musical key my music is written in.
-Mixed in Key also detects the ‘Beats Per Minute’ (BPM) of a song, a value representing its tempo.
-My DJing often involves matching the BPM of two songs, allowing them to blend seamlessly.
-Most digital DJ software matches the BPM of two songs for you (called ‘sync’), but my hardware setup (Pioneer CDJ-900s) does not.

I used to beatmatch ‘by ear’, but I discovered an easier way: I can use an algorithm to beatmatch for me. It’s quicker and more precise, but it requires me to write down how much faster or slower each track is than the set’s BPM.

I usually write this info in the filename for each song. This whole process used to require crunching numbers on a calculator, then renaming every file in the set.
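To give a concrete sense of the arithmetic (this assumes the adjustment is expressed relative to the track's own tempo, which is how pitch math normally works, though my exact formula may differ):

# A 150 BPM track in a 160 BPM set needs to be sped up by roughly 6.7%.
set_bpm = 160
track_bpm = 150
adjustment = (set_bpm - track_bpm) / track_bpm * 100
print(f"{adjustment:+.1f}%")  # prints "+6.7%"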

Now, all I have to do is run a single Python script.

Example

Let’s say I’m trying to develop a DJ set.

(Usually a set would be anywhere between 20 and 30 tracks, but to keep things simple, this one’s just two songs.)

I already ordered the songs, so I know I’m going to play ‘Don’t Wanna Cry’ before ‘Great Fairy Fountain’. That means it’s time to use my tool.

The Beatmatch Encoder first needs the BPM of the set (or part of a set) and the folder in which I’ve planned the DJ set. I provide these by altering two variables: one for the DJ directory and one for the ‘base BPM’. In this case, ‘example set’ is the directory and 160 is the set’s BPM.

When run, the script loops over the filenames of all the songs in that folder, reads each song’s BPM, and does some quick math to find the percentage difference between the song’s tempo and the set’s desired speed. The script then writes this percentage into the filename, right after the song’s number.
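Here’s a rough sketch of what such a script might look like. The filename convention (track number, then BPM, then title) and the exact way the percentage gets written back are assumptions for illustration, not necessarily how my actual tool formats things:

# beatmatch_sketch.py - a minimal sketch of the renaming step.
# Assumed filename convention: "01 - 150 - Track Title.mp3"
import re
from pathlib import Path

SET_FOLDER = Path("example set")  # folder holding the planned set
BASE_BPM = 160                    # the set's target BPM

pattern = re.compile(r"^(\d+) - (\d+(?:\.\d+)?) - (.+)$")  # number - BPM - title
for track in sorted(SET_FOLDER.glob("*.mp3")):
    match = pattern.match(track.stem)
    if not match:
        continue  # skip files that don't follow the naming convention
    number, bpm, title = match.group(1), float(match.group(2)), match.group(3)
    # Percentage the track must be sped up (+) or slowed down (-) to hit the set's BPM.
    diff = (BASE_BPM - bpm) / bpm * 100
    new_name = f"{number} ({diff:+.1f}%) - {bpm:g} - {title}{track.suffix}"
    track.rename(track.with_name(new_name))
    print(new_name)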

As you can see, I now have these BPM differences in an easy-to-read format.

Since my DJ equipment reads filenames, all I have to do is load the file on my equipment, and I instantly know what speed to set it to for a perfect match.

Doing this file-editing manually would take ~20 minutes a set. Now it takes 20 seconds.

While I expect this particular tool to be of niche interest to others (you’d have to mirror my workflow for preparing DJ sets to find it practical), hopefully this gave a little insight into how a little code can simplify your life.

Average Is Not the Middle: Ratings & Their Distributions

Have you ever read a review and seen something akin to:

“This movie/film/album was average. It took no risks and was completely middle of the road, 70/100.”

I have seen this general review & its score dozens of times (though usually with more colourful language). It always struck me as odd. A 5/10 or 50% would be the middle ground between an abysmal 0/10 and a perfect 10/10, yet I’ve noticed people seem to land around a 7/10 or 70% when rating a piece of media that is ‘so-so’ but also ‘not bad’.

Is 70% really ‘average’, or have I just been imagining things?

I decided to download several different datasets of internet reviews to find out.

Rather than just sticking to a single website or medium, I decided to look at several different sources. That way, I could be sure that the results weren’t just a by-product of a single website’s scoring system, and instead represented a larger trend in review scores.

Before looking at all of the different reviews normalized and smooshed into one large dataset (which you can see near the bottom), I decided to take a quick look at the datasets individually, just to see whether they stood apart from one another, or whether one of them was completely unlike the rest.


Metacritic: Metascore (Videogames)

Metacritic has two different types of scores it uses: Metascores and User scores.

According to the Metacritic website, their ‘Metascore’ is a weighted average of individual critic scores. They take a bunch of reviews from official critics, assign a ‘weight’ to each critic, and ‘normalize’ all the scores (so a 9/10 becomes a 90/100, a B+ becomes an 83, etc.), then base their Metascore on the result.

Reverse engineering how Metacritic arrives at its Metascores would be worth an entire article on its own (one that has already been written by someone else, no less). Suffice it to say, the ‘meta’ of Metacritic is there because their score is supposed to be the result of many others.
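As a toy illustration of the ‘weighted average of normalized scores’ idea (the reviews and weights below are entirely made up; Metacritic’s real weights and formula aren’t public):

# metascore_sketch.py - a toy weighted average of normalized critic scores.
# All numbers here are invented for illustration.
reviews = [
    {"critic": "Outlet A", "score": 9, "out_of": 10, "weight": 1.5},
    {"critic": "Outlet B", "score": 83, "out_of": 100, "weight": 1.0},
    {"critic": "Outlet C", "score": 7, "out_of": 10, "weight": 0.8},
]
# Normalize every score to a 0-100 scale, then take the weighted mean.
weighted_sum = sum(r["score"] / r["out_of"] * 100 * r["weight"] for r in reviews)
total_weight = sum(r["weight"] for r in reviews)
print(round(weighted_sum / total_weight))  # 83 with these made-up numbers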

Metacritic Scores and # of ratings which fall into them

We see an obvious peak around the mid-70s, but other than that, a fairly normal distribution tapering off in both directions. Consulting the Metacritic website again, ‘Mixed or Average Reviews’ for videogames fall between 50 and 74%. Based on that, I would have expected the peak of the bell curve to fall in the middle of that ‘Average Reviews’ range (62%). Instead, it peaks around the 70-73% range, illustrating that a 70/100 score is, indeed, average. It’s also nice to know that the average Metascore is actually within the boundaries of ‘Mixed or Average Reviews’, even if it’s touching the upper ceiling of that classification.

Metacritic: Userscore (Videogames)

Userscores on Metacritic follow a far simpler process: Metacritic users submit a personal score between 1 and 10, Metacritic averages the submitted scores from all users, and voila, the user score is created.

Here, we’re seeing a peak around the 8/10 range. That’s roughly 80%, compared to the ~70% of Metascores, showing a clear disparity between user scores and critic scores.

I suspect this is because of a self-selection bias: a user can pick and choose what he wants to play, while a critic often cannot.

I looked at the relationship between these Metacritic Metascores and Userscores in a separate article here.


IMDB Film Ratings:

IMDB’s scoring system is also determined by popular vote among users. IMDB mentions that its rating system is a weighted average designed to “eliminate and reduce attempts at vote stuffing by people more interested in changing the current rating of a movie than giving their true opinion of it.”

Suffice it to say, these scores are largely determined by users, though some users (who have presumably proven themselves in some way to be reputable voters) have more of an impact than others.

Histogram of IMDB Review Scores


The average IMDB score is definitely lower than the average Metacritic score. The middle of the bell curve here clocks in around 6.8, and tapers off aggressively below 6.0 and above 7.6.

This is looking ever-so-slightly lower than a ‘70%’ average, but it’s certainly higher than a ‘50%’ one, too.


Pitchfork Music Ratings:

Pitchfork’s rating system is far less ‘by the numbers’ than Metacritic’s or IMDB’s. A single writer for Pitchfork reviews a specific album, giving it a write-up explaining their thoughts about it, along with a numerical score out of 10. The write-up is ultimately meant to inform and give rationale for the 0-10 rating, providing accountability for specific scores.

If Pitchfork has a “formula” for their reviews, they don’t have it explicitly posted. Given that their entire team of reviewers can be viewed on a single page, I would imagine their reviews are more of a practiced art than a science.

Pitchfork Histogram

Of all the individual datasets, this one is easily the most interesting. Pitchfork scores reviews out of 10, but they also give decimal values (rather than just giving an 8/10, they may give an 8.2/10.0). In essence, this means their reviews function similarly to reviews out of 100. However, if you look at the above graph, you may notice that, relative to nearby bars, the tallest peaks are on whole numbers. It seems critics are still more apt to assign a ‘6.0’ than a 6.1 or a 5.9. If you ignore the numbers at the bottom for a moment, you can probably make an educated guess as to where 7.0 would be in relation to the peaks for 6.0 and 8.0.

Another trend worth noting is the sheer number of 8.0 reviews. The median (which, in many ways, signifies the “middle” of our hill of ratings) of this dataset is 7.2, yet 8.0 is the single most common score.

The likely explanation for this phenomenon has to do with Pitchfork’s review categories. The site has a special category for ‘8.0+ reviews’, which only shows you albums that scored 8 or above. I suspect a similar thought process plays out for many reviewers: “I’d give this a 7.8 or 7.9, but I think it deserves to be seen on the high-score list.” Perhaps you also end up with reviewers who feel something is “the worst of the best”: it deserves to be on the high-score list, but only at the very bottom of it.

Whatever the reason may be, we see many ratings of around 80%, and a large pool of reviews around the 70% range, too. Once again, it looks as if 5.0/10 or 50% is far from the average score.

All together:

So far we’ve seen some pretty compelling evidence that the average review for something lies higher than 50/100. But from the looks of it, there’s still a bit of variation from review system to review system as to what exactly IS a “middle” score.

To find the answer, I combined all four datasets together; the result is below. Since Metacritic user scores, IMDB ratings, and Pitchfork reviews were only out of 10, they were multiplied by 10, normalizing them to a 100-point scale.

Also, since we’re looking for the average review (across different websites and creative mediums), I chose not to ‘weight’ any of the datasets differently. This means the smallest dataset (the IMDB film ratings) plays a far smaller role in the end result than the largest dataset (the Pitchfork music ratings). However, since we’re trying to find ‘the average review’ independently of whatever rating system or site it comes from, and are going by the sheer number of reviews, it makes sense to treat every review with equal importance.
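A rough sketch of that combining step (the file names and column names are placeholders for however the downloaded datasets are actually laid out):

# combine_sketch.py - a minimal sketch of normalizing and pooling the datasets.
# File names and column names are assumptions for illustration.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

metascores = pd.read_csv("metacritic_metascores.csv")["metascore"]       # already 0-100
userscores = pd.read_csv("metacritic_userscores.csv")["userscore"] * 10  # 0-10 -> 0-100
imdb = pd.read_csv("imdb_ratings.csv")["rating"] * 10                    # 0-10 -> 0-100
pitchfork = pd.read_csv("pitchfork_reviews.csv")["score"] * 10           # 0-10 -> 0-100

all_reviews = pd.concat([metascores, userscores, imdb, pitchfork], ignore_index=True)

# Weighting each review equally, the bars show the proportion of all reviews
# falling into each bin rather than a raw count.
plt.hist(all_reviews, bins=50, weights=np.ones(len(all_reviews)) / len(all_reviews))
plt.xlabel("Score (out of 100)")
plt.ylabel("Proportion of reviews")
plt.title("Combined review scores")
plt.show()

print(all_reviews.mean(), all_reviews.median())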

Histogram of Reviews from Metacritic, IMDB, and Pitchfork

You may notice that, unlike the previous bar graphs, this one doesn’t measure the number of entries in each bar, but instead shows the proportion of reviews that fall into each bar. Rather than simply showing ‘how many’ reviews gave a specific score, it tells us what percentage of reviews fell into each bar. The tallest peak (80/100) touches the 0.040 tick, and therefore accounts for about 4% of all reviews.

The results here appear to cluster strongly around the mid-70s range. Both the mean (69.98) and the median (72.0) are fairly close to a score of 70/100.

Looking at things from a purely scientific point of view, there are a number of factors which prevent me from generalizing these results to speak about EVERYONE. Mainly, all of these reviews were collected from websites that are operated and designed with an English-speaking audience in mind.

That being said, the implications of these results are fairly clear: A 70/100 review can often be interpreted as ‘average’.

As to why this is the case: I have my theories, but I will leave them for another day.