u/ArcticBeavers

r/letterboxdtopfour's Top Four - A Data Analysis

Hi all,

I did a data analysis of all the top fours posted on r/letterboxdtopfour. Here is a top four based on 343 posts in that subreddit:

  1. 2001: A Space Odyssey
  2. Mulholland Drive
  3. Twin Peaks: Fire Walk With Me
  4. Interstellar

Purpose:

I hate my current job. When I gave my two weeks' notice and started sunsetting my productivity, I decided to put some effort into something I love: movies and data. I chose this project because I got tired of seeing the same goddamn movies on everyone's posts and wanted to make a point about it.

Methodology:

I sorted the subreddit's all-time posts from most to fewest upvotes. I then went through each individual post and recorded every film mentioned in a spreadsheet. I started with the highest-ranked post that gave a true top 4: "I'm 18 (not gay), any recs?". From there, every day I would scroll and add each post's top 4 to the spreadsheet. I wrote down on a notepad where I left off and picked up from there whenever I could. Overall time to complete this task: 2.5 weeks (about 2 hours per day).

I went all the way down until my browser would not allow me to load any more pages. It ended on "What can you infer about me?", which has 36 upvotes. My intention was to keep going until my last few days at work, but this limit ended my data collection early. I did not care enough to get beyond this point.

Exclusions:

There were some exclusions I decided to make. Obvious joke posts like "Any Lasagna Lovers", "Guess my top 4" posts, and posts that only ranked one particular year/decade were excluded. Specialty lists like "10 Most Beautifully Shot Films" were also excluded. Finally, any list that did not clearly identify a top 4 and was just a top 20, 50, or 100 was excluded.
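
The tallying itself was done by hand in a spreadsheet. For anyone who would rather script it, here is a rough sketch of the same idea in Python. The sample posts, the `is_valid_top_four` helper, and the `tally_votes` function are all made up for illustration, and the filter only covers the "not actually a top 4" rule. Joke posts and single-decade lists still need a human to spot them.

```python
from collections import Counter

# Hypothetical sample posts; the real data set was collected by hand.
posts = [
    {"title": "Post A", "films": ["Whiplash", "Parasite", "Alien", "The Thing"]},
    {"title": "Post B", "films": ["Parasite", "Alien", "Fight Club", "La La Land"]},
    {"title": "My top 20", "films": ["Alien"] * 20},  # not a top 4, gets skipped
]


def is_valid_top_four(post):
    """Keep only posts that list exactly four distinct films."""
    films = post["films"]
    return len(films) == 4 and len(set(films)) == 4


def tally_votes(all_posts):
    """Count one vote per film per valid post; return (votes, posts_kept)."""
    votes = Counter()
    kept = 0
    for post in all_posts:
        if not is_valid_top_four(post):
            continue
        votes.update(post["films"])
        kept += 1
    return votes, kept


votes, kept = tally_votes(posts)
for film, count in votes.most_common(4):
    print(film, count)
```

Since every valid post contributes exactly four votes, the total vote count should always be four times the number of posts kept, which is a handy sanity check on a hand-built spreadsheet.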

Findings:

In total, 581 movies were mentioned!

A total of 1372 votes were tallied!

That means I went through 343 posts!

Here are the top 19 movies (every film with at least 10 votes) and their vote totals:

| Film | Votes |
|---|---|
| 2001: A Space Odyssey | 22 |
| Mulholland Drive | 22 |
| Twin Peaks: Fire Walk With Me | 19 |
| Interstellar | 18 |
| Whiplash | 17 |
| Evangelion | 16 |
| Parasite | 16 |
| The Godfather | 14 |
| Paris, Texas | 14 |
| La La Land | 13 |
| No Country for Old Men | 12 |
| Perfect Blue | 12 |
| Fight Club | 11 |
| Shawshank Redemption | 11 |
| There Will Be Blood | 11 |
| The Thing | 11 |
| Alien | 10 |
| Apocalypse Now | 10 |
| Eternal Sunshine of the Spotless Mind | 10 |

Here is the breakdown by decade:

| Decade | # of Films |
|---|---|
| 1920s | 1 |
| 1930s | 8 |
| 1940s | 10 |
| 1950s | 12 |
| 1960s | 34 |
| 1970s | 50 |
| 1980s | 82 |
| 1990s | 100 |
| 2000s | 111 |
| 2010s | 113 |
| 2020s | 60 |
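
To put the decade counts in proportion, here is a small Python sketch that turns them into shares of the 581 films. The numbers are copied from the breakdown above; the variable names are just for illustration.

```python
# Film counts per decade, copied from the breakdown above.
films_by_decade = {
    "1920s": 1, "1930s": 8, "1940s": 10, "1950s": 12,
    "1960s": 34, "1970s": 50, "1980s": 82, "1990s": 100,
    "2000s": 111, "2010s": 113, "2020s": 60,
}

total_films = sum(films_by_decade.values())

# Percentage of all distinct films that each decade accounts for.
shares = {
    decade: round(100 * count / total_films, 1)
    for decade, count in films_by_decade.items()
}

# Films released in 1980 or later.
since_1980 = sum(
    count for decade, count in films_by_decade.items() if int(decade[:4]) >= 1980
)
since_1980_share = round(100 * since_1980 / total_films, 1)

for decade, share in shares.items():
    print(f"{decade}: {share}%")
print(f"1980 onward: {since_1980} of {total_films} films ({since_1980_share}%)")
```

By this count, 466 of the 581 films (about 80%) are from 1980 or later. Keep in mind this counts distinct films, not votes, and the 2020s are only a partial decade.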

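Here is the breakdown by number of votes:

| Vote Total | # of Films |
|---|---|
| 1 | 349 |
| 2 | 84 |
| 3 | 44 |
| 4 | 35 |
| 5 | 19 |
| 6 | 13 |
| 7 | 7 |
| 8 | 7 |
| 9 | 4 |
| 10 | 3 |
| 11 | 4 |
| 12 | 2 |
| 13 | 1 |
| 14 | 2 |
| 15 | 0 |
| 16 | 2 |
| 17 | 1 |
| 18 | 1 |
| 19 | 1 |
| 20 | 0 |
| 21 | 0 |
| 22 | 2 |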
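
As a sanity check, this breakdown can be used to recompute the headline numbers. The sketch below (plain Python, with variable names chosen just for illustration) sums the table and confirms it matches the totals reported above.

```python
# Number of films at each vote total, copied from the breakdown above.
films_per_vote_total = {
    1: 349, 2: 84, 3: 44, 4: 35, 5: 19, 6: 13, 7: 7, 8: 7,
    9: 4, 10: 3, 11: 4, 12: 2, 13: 1, 14: 2, 15: 0, 16: 2,
    17: 1, 18: 1, 19: 1, 20: 0, 21: 0, 22: 2,
}

# Distinct films: add up the "# of Films" column.
total_films = sum(films_per_vote_total.values())

# Total votes: each film contributes its own vote total.
total_votes = sum(votes * films for votes, films in films_per_vote_total.items())

# Every post contributes exactly four votes.
total_posts, remainder = divmod(total_votes, 4)

# Films with at least 10 votes (the ones in the ranked list).
films_with_10_plus = sum(
    films for votes, films in films_per_vote_total.items() if votes >= 10
)

print(total_films, total_votes, total_posts, remainder, films_with_10_plus)
```

The table adds up to 581 films and 1372 votes, which divides evenly into 343 posts of four films each, with 19 films at 10 votes or more.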

Interpretation:

A lot of this data was expected, given the demographics of Letterboxd and Reddit. This clearly introduces an age bias. I was happy when people would include family members like mom/dad in their posts to broaden the data a bit.

Here are some surprises for me:

If you only looked at this data set, you would think Evangelion and Twin Peaks: Fire Walk With Me were some of the most watched and discussed films among cinephiles. Obviously, this is not the case. Additionally, some films that are beloved by Millennials and Gen X feel underrepresented. Pulp Fiction, Jurassic Park, Back to the Future, Lord of the Rings, Goodfellas, and Star Wars come to mind. Granted, Star Wars and Lord of the Rings had their votes split among other movies in the franchise.

Lastly, there were 349 films that received 1 vote, which I found very interesting. There is a lot mixed in here. I think unique taste is good! However, there are some questionable films in this list:

Monsters University

Final Destination 3

Hotwheels: The Speed of Silence

Terrifier 2

Jumanji

The intention of Letterboxd is to allow people to put whatever they want at the top... so who am I to judge?

Conclusion:

I hate my boss.

Raw webpage https://docs.google.com/spreadsheets/d/e/2PACX-1vSM8jx7dvfiupJVjw_ZNvJOfU09leZZGT5Wk4L6ccSJ3f4eeU6ELw6SNiBx4Eqpbw/pubhtml

Raw xls https://docs.google.com/spreadsheets/d/e/2PACX-1vSM8jx7dvfiupJVjw_ZNvJOfU09leZZGT5Wk4L6ccSJ3f4eeU6ELw6SNiBx4Eqpbw/pub?output=xlsx

Let me know if you think there is any more interesting data to be harvested here!

reddit.com
u/ArcticBeavers — 1 day ago
