The problem, though, isn’t the algorithm. It’s the dataset. They don’t have all the data they need.
A simple one-through-five star rating won’t suffice for generating accurate predictions. The definition of “genre” needs to be rewritten. I believe in this enough that I think it’d be my Master’s thesis if I ever had the desire to go back to school. That said, let us argue.
The term “genre” encompasses too much. It includes both the intended emotional response to a film (let’s just call that the “intended emotional response”, since I can’t think of a real word for it) and story-type factors. Instead, I say “genre” should ONLY refer to the story likenesses.
Story likenesses look like the following (remember, these are generalities, or genre-alities):
- A sci-fi includes non-existent technology (and frequently extrapolates the consequences of that technology); it can be set in the present day or the future, on our planet or a distant one.
- A western is set on the 1800s American frontier, with man-vs-man conflicts.
- Film noir includes a femme fatale, a crime, etc., and is generally set in the WWII and just-post-WWII era.
- Cop dramas follow the dark side of a detective’s duty.
I think you get the idea. With the new definition of “genre”, though, you’ll start getting more and more specific — “Zombie movies”, “19th Century Romance”, “Birth of a Superhero”, etc.
But these genres can fall anywhere in the spectrum of intended emotional response.
That’s my first idea of how movies should be mapped — it may have some stuff I haven’t considered.
But note how sci-fis can fit anywhere on the spectrum. To say “Jack and Jill like science fiction” isn’t adequate. Perhaps Jack likes only the most incredible science fiction, whatever the intended emotional response, whereas Jill really only goes for the more believable adventures in the sci-fi genre. Jack’s favor towards sci-fi will make a ring around the outside of the spectrum; Jill’s will make a small blob — maybe an arc — in just one area.
I believe it’s safe to assume that NO MATTER THE GENRE, a person’s like or dislike of a given area of the spectrum will remain constant.
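If you wanted to play with that assumption, here is a minimal sketch in Python. Everything in it is my own invention for illustration: I'm assuming the spectrum can be drawn as a disc, with the angle standing for the kind of emotional response and the radius for how incredible the movie is, and I'm chopping that disc into cells. The function names (`cell`, `taste_map`, `predict`) and the sample ratings are hypothetical, not anything Netflix has.

```python
from collections import defaultdict

def cell(angle_deg, radius, n_sectors=8, n_rings=3):
    """Bin a spectrum position into a (sector, ring) cell.

    Assumed layout (hypothetical): angle_deg is the kind of emotional
    response, radius in [0, 1] is how incredible the movie is.
    """
    sector = int((angle_deg % 360) / (360 / n_sectors))
    ring = min(int(radius * n_rings), n_rings - 1)
    return sector, ring

def taste_map(ratings):
    """Average a user's star ratings per spectrum cell, ignoring genre."""
    totals = defaultdict(lambda: [0.0, 0])
    for angle_deg, radius, stars in ratings:
        entry = totals[cell(angle_deg, radius)]
        entry[0] += stars
        entry[1] += 1
    return {c: total / count for c, (total, count) in totals.items()}

def predict(taste, angle_deg, radius, default=3.0):
    """Look up the user's average for that cell; fall back to a neutral guess."""
    return taste.get(cell(angle_deg, radius), default)

# Made-up ratings as (angle, radius, stars).
# Jack likes the outer ring at any angle; Jill likes one small area.
jack_taste = taste_map([(10, 0.9, 5), (100, 0.95, 5), (200, 0.85, 4),
                        (300, 0.9, 5), (50, 0.2, 2)])
jill_taste = taste_map([(50, 0.4, 5), (60, 0.45, 4), (200, 0.9, 1)])

print(predict(jack_taste, 100, 0.9))  # outer ring: Jack is happy
print(predict(jill_taste, 55, 0.5))   # Jill's blob
print(predict(jill_taste, 200, 0.9))  # outside Jill's blob
```

Note that genre never appears in the lookup. That is the whole point of the assumption: genre would be a separate filter layered on top, and a cell the user has never rated just gets the neutral default.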
Determining where a movie falls in the spectrum is tricky. You can’t take the director’s word for it, because what he finds merely thrilling might be horrific to the majority; every individual has a personal spectrum. (That said, a movie that is intended to be funny, but fails, is still a comedy. Failure to elicit the intended emotional response is a primary factor in a movie’s quality to viewers, but it doesn’t change its position in the spectrum.) For that reason, you probably need to ask more questions in each user’s review to determine how close they are to the average spectrum, and adjust as necessary.
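One rough way to do that adjusting, sketched in Python. This is my own simplification, not a tested method: I'm flattening the spectrum to a single made-up 0-to-10 "intensity" score per movie, and the helper names (`consensus`, `user_offset`, `adjusted`) and the sample users are hypothetical. The idea is to measure how far each person sits from the crowd on movies everyone has placed, then subtract that lean from their placement of a new movie.

```python
from statistics import mean

def consensus(placements):
    """placements: {user: {movie: score}} -> {movie: average score}."""
    by_movie = {}
    for scores in placements.values():
        for movie, score in scores.items():
            by_movie.setdefault(movie, []).append(score)
    return {movie: mean(scores) for movie, scores in by_movie.items()}

def user_offset(user_scores, baseline):
    """Average distance between this user's placements and the consensus."""
    return mean(score - baseline[movie]
                for movie, score in user_scores.items()
                if movie in baseline)

def adjusted(score, offset):
    """Shift a user's placement back toward the average spectrum."""
    return score - offset

# Made-up placements on a hypothetical 0-10 intensity scale.
placements = {
    "ann": {"movie_a": 4, "movie_b": 6},
    "bob": {"movie_a": 6, "movie_b": 8},
    "cat": {"movie_a": 5, "movie_b": 7},
}
baseline = consensus(placements)
ann_offset = user_offset(placements["ann"], baseline)  # Ann runs low
bob_offset = user_offset(placements["bob"], baseline)  # Bob runs high

# Ann and Bob place a new movie differently, but agree once adjusted.
print(adjusted(3, ann_offset), adjusted(5, bob_offset))
```

A constant offset is the crudest possible correction. A real personal spectrum would probably be stretched or skewed in different places, which is exactly why more questions per review would be needed.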
Not that any of that is viable for Netflix. Getting more out of users is unlikely. But at the very least, they could establish the baseline position for a movie in the spectrum of intended emotional response, and start qualifying folks’ ratings from there.
Gimme my million. Or my Master’s.