Youth-anizing the Oscars, a Fact-based Analysis

I hesitate to claim that I am the first person ever interested in reviewing the age of Oscar winners. The idea has been tried before, but it is a worthy subject. The topic cropped up again this year as I was looking at a local film critic’s predictions prior to this past weekend’s Academy Awards. The argument he made was that the Best Actress winner would be under 35 and, furthermore, that Emma Stone would be chosen over Ruth Negga because Negga had just passed her 35th birthday. A magic wall of 35?  That seemed like a prediction worth investigating, so I set out to explore the data and see what fascinating analysis™ might turn up.

First, it’s worth noting that the newspaper critic decided that he could handicap the winner to be under 35 based on the percent from “the last 13 out of 19 years”. It always gives me pause when someone uses a statistic involving an oddball number (like 19); you know that they’re cherry-picking the data. For non-statistical people that means they went out of their way to self-select the best set of facts to fit their hypothesis. Without even looking, I could tell you that if they select 20 years, or 25, 10, 12, or 15 years, the percentage wouldn’t be as high.

For the record, the percentage of Best Actress winners under 35 in the last 19 years is 68%, while the 20-year and 25-year figures are 65% and 64%, respectively. Why cherry-pick when the actual data, using round numbers, gives a 64% that is just as compelling? However, since any given actress has a 20% chance of winning at random (one of five nominees), and even a 20- or 25-year sample is pretty small and kind of lazy, we should really look at a much bigger set of data to understand the interplay of youth and winning.
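To see how much the choice of window matters, the percentage can be recomputed for any trailing number of years. The sketch below is a minimal Python illustration: the function name `share_under` and the ages in `example_ages` are made up for the example and are not the real winner data. (For reference, 13 out of 19 works out to about 68%.)

```python
def share_under(ages, cutoff, window):
    """Fraction of the last `window` winners whose age was below `cutoff`.

    `ages` is ordered oldest ceremony first, most recent last.
    """
    recent = ages[-window:]
    return sum(1 for age in recent if age < cutoff) / len(recent)


# Hypothetical ages, NOT the real winners; swap in the real list to reproduce.
example_ages = [41, 29, 33, 62, 26, 34, 45, 30, 28, 36,
                25, 33, 39, 22, 61, 29, 35, 31, 27, 49]

for window in (5, 10, 15, 19, 20):
    pct = 100 * share_under(example_ages, cutoff=35, window=window)
    print(f"last {window:2d} years: {pct:.0f}% under 35")
```

Running the loop over several windows at once makes it obvious when one particular window was picked because it gives the prettiest number.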

Two existing analyses have done that to some extent. Rachel Hitchcock, a blogger at a fabulous site called Bitch Flicks, did an interesting comparison of Best Actor and Best Actress (and Supporting categories). You can read the analysis here, but the money chart is this one:

20170301-bitchflicks
Source: Rachel Hitchcock, Bitch Flicks

Her data showed the vast majority of Oscar winners for Actress are 39 and younger, while a similar majority of Oscar winners for Actors are 49 and younger. A fairly noticeable percent of Supporting Actors are in the much older categories, and there are a higher number of older women in the Supporting Actress category than Leading Actress category. Older men are given the opportunity to play parts more likely to garner Academy notice.

The Washington Post took a similar tack in a story by Stephanie Merry last year. Merry also looked at the winners over time to put together this time-based view. (The data points are averages per decade.) Actress winners are clearly younger, and even the Supporting Actresses are younger on average than the Actors, who are themselves younger on average than the Supporting Actors:

20170301 Wapost.png
Washington Post data analysis Feb 2016, Stephanie Merry & Zachary Pincus-Roth

Two other noteworthy observations involve the average over time and the shape of the lines. The average age for the Actors has clearly increased over the decades. For the Actresses, it did until the mid-1960s and then appeared to stop. There’s also a really odd blip (the red circle) in the decade of 1997-2006. It turns out that 8 out of the 10 winners in that decade – from Gwyneth Paltrow to Reese Witherspoon – were quite young. (Is there an entirely different blog post to be written about why the older crop of previous nominees was passed over? Perhaps, but that’s another topic to investigate.)

Still, I was interested in predictive ability. The Actresses are younger than the Actors, but is that because the nominee pool is younger, or is it something else? What happens between the nominees and the winners? Are the nominees all young to begin with, or are only the young’uns picked to win? Is there a magical 35-year-old wall? This was as yet unplundered data.

As an important aside, let it not be said that this data materialized out of thin air. As with all analyses worth their salt, the data had to be extracted and beaten into submission. It’s all well and good for Wikipedia to list all the Best Actresses (and Actors and Supporting…) by age, but pulling them into Excel is not the end of the story. Dates like “January 26, 1939” are actually difficult to work with, and the analysis required a great deal of Google searching about date and text formatting, all of which led me to descriptions of the existing Excel @DATEVALUE function, which does not, in fact, work on a date with that format. Not to mention ads for help with my Excel data from all sorts of international (shady) outfits, UCanDoExcel!.com.

Furthermore, Excel pretends that February 29th doesn’t exist. And, for people born prior to the 20th century – many of whom won early Oscars – date math doesn’t work because it’s based on an Excel formula that assumes the world started at 1/1/1900. Not to mention that there is no Wikipedia list of Best Actress nominees by age; I had to enter all 443 of those by hand. Suffice it to say, I grabbed my spiked heels and whip and went at it until Excel was sufficiently tamed. You’re welcome.
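For anyone who would rather skip the whip, here is a hedged alternative to the Excel route (not the method used for this analysis): Python's standard `datetime` module parses dates written like “January 26, 1939” and handles births before 1900 and leap days. The helper names `parse_wikipedia_date` and `age_on` are invented for this sketch, the event date in the example is hypothetical, and the `%B` month parsing assumes an English locale.

```python
from datetime import date, datetime


def parse_wikipedia_date(text):
    """Parse a date written like 'January 26, 1939' (English month names)."""
    return datetime.strptime(text, "%B %d, %Y").date()


def age_on(birth, when):
    """Whole years of age on `when`, counting a birthday only once it has arrived."""
    had_birthday = (when.month, when.day) >= (birth.month, birth.day)
    return when.year - birth.year - (0 if had_birthday else 1)


# Hypothetical event date, used only to exercise the helpers.
born = parse_wikipedia_date("January 26, 1939")
print(age_on(born, date(1967, 4, 10)))

# A pre-1900 leap-day birth date parses without any special handling.
leap_born = parse_wikipedia_date("February 29, 1896")
print(age_on(leap_born, date(1930, 3, 1)))
```

Comparing `(month, day)` pairs avoids dividing day counts by 365.25, which can be off by one right around a birthday.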

I actually started with the most recent fifty years of winners (1967-2016), but, of course, once I had that completed, I had to go pull all the way back to 1928. That in itself proved to show some Interesting Things. Did you know that in the first decade of Oscars, there were sometimes 3, 4, or 6 nominees? Sometimes the winner was awarded for more than one film. Sometimes there were ties. This plays havoc with my data formatting, let me tell you!

20170301-meanmedian
Source: Kaj analysis, Wikipedia dates

The simplest result from this data is a comparison of the averages (the mean and the median). Two things jump out here. First, the Winner – regardless of whether we include all years or more recent years – is younger than the Nominees in all cases except when looking at the mean for the last 50 years. (One reason to look at the median rather than the mean.) Secondly, both the Nominees and the Winners are getting OLDER. I actually ran a simple confidence interval and it turns out I can be 95% confident that the average winner’s age in the most recent 50 years of 38.87 is statistically different from the average age of 36.02 across all years. You read that right. Older. Maybe I need to retitle this column because we’re not Youth-anizing, but we’re Age-anizing.
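For readers who want to try this at home, here is a minimal sketch of comparing the mean with the median and building a 95% confidence interval for the mean. It uses the textbook normal approximation (mean ± 1.96 standard errors), which is an assumption and not necessarily the exact calculation behind the numbers above. The function name `mean_confidence_interval` and the ages in `sample` are hypothetical, not the real winner data.

```python
import math
import statistics


def mean_confidence_interval(values, z=1.96):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    mean = statistics.mean(values)
    std_err = statistics.stdev(values) / math.sqrt(len(values))
    return mean - z * std_err, mean + z * std_err


# Hypothetical ages with a long right tail, NOT the real winners.
sample = [24, 26, 27, 29, 29, 30, 31, 33, 34, 36, 41, 45, 61, 74, 80]

low, high = mean_confidence_interval(sample)
print(f"mean   = {statistics.mean(sample):.2f}")
print(f"median = {statistics.median(sample)}")
print(f"95% CI for the mean: {low:.2f} to {high:.2f}")
```

With a handful of much older values in the list, the mean gets pulled well above the median, which is the skew effect described above. If a reference average falls outside the interval, the two averages are unlikely to differ by chance alone.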

The data gets even more interesting – oh, doesn’t it always? Plotting the data leads to a much richer understanding of what’s going on here. The distribution of both nominees and winners is definitely skewed younger – women’s population in the U.S. as a whole has a long aging tail – though the nominee drop-off is sharper than that of the population as a whole. In non-data terms, that means there are few roles for aging women and even fewer nominated for awards. (An interesting side analysis would be to look at the age of the roles available, but how would I ever get that data?)

20170301 winnersand nominees.png

But there is also this odd bump in the Winner data – even easier to see in a simple bar chart here:

20170301-winners-bar

Women between the ages of 35 and 40 experience a sharp drop in winning roles. There’s a slight uptick for those aged 40-45 and then it disappears again. Something does appear to happen at age 35. It hardly seems likely that the 35th birthday is a magic wall, but the data suggest it’s a barrier. It’s particularly stark in the last fifty years, although it extends backward to all years.
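A bar chart like this comes from sorting each age into a five-year bucket and counting. The sketch below shows one way to do that in Python; the function name `five_year_bins` and the ages are hypothetical, and the bin edges are an assumption (here an age of exactly 35 lands in the 35-39 bucket), so the real chart may draw its boundaries differently.

```python
from collections import Counter


def five_year_bins(ages, width=5):
    """Count ages per bin; each key is the bin's lower edge (35 means 35-39)."""
    counts = Counter((age // width) * width for age in ages)
    return dict(sorted(counts.items()))


# Hypothetical ages, NOT the real winners.
hypothetical_ages = [22, 25, 26, 28, 29, 29, 30, 31, 33, 34, 34, 36, 41, 42, 49, 61]

# Print a quick text histogram, one row per five-year bucket.
for lower, count in five_year_bins(hypothetical_ages).items():
    print(f"{lower}-{lower + 4}: {'#' * count}")
```

Because a dip at one specific age can appear or vanish depending on where the bucket edges fall, it is worth rerunning with a different `width` before declaring a wall.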

One more set of data is worth a look. The first rule of all good analysis is Plot the Data. The second and third rules, by the way, are Plot the Data and Plot the Data. So, while I have been looking at averages and decades and quartiles, I did also plot all winners and all nominees from the beginning of the Academy Awards’ history. The “heartbeat” diagrams that you see below were predictable in their variation, but also illustrate the trend behind the increasing median age. In the more recent timeframe, the opportunity for significantly older women to win has increased (e.g. Katharine Hepburn and Jessica Tandy), but even in general the winning roles for women nearing 60 are more plentiful. Except for that 1997-2006 period where most of the winners were under 34, there has been increasing variation in age for the winners. That seems to me a good thing.

20170301 winneheartbeat.png

This trend is a strong reflection of the nominees – of the population available – as this graph of all nominee ages over time shows. Since the mid-1960s, variation has increased on the high and low end of ages. A nominee in the 1930s and 1940s was never over 55 and hardly ever over 45; now those ages are fairly common. Part of that is Meryl Streep, but even she hasn’t always been over 45 across her twenty nominations.

20170301-nomineeheartbeat
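The widening spread in the heartbeat plots can also be put into numbers by grouping ages by decade and recording the youngest, the oldest, and the gap between them. This is only a sketch of that idea: the function name `spread_by_decade` and the `(year, age)` records are hypothetical, and grouping by calendar decade is an assumption made for the example.

```python
from collections import defaultdict


def spread_by_decade(records):
    """Map each decade to (youngest, oldest, range) for (year, age) records."""
    by_decade = defaultdict(list)
    for year, age in records:
        by_decade[year // 10 * 10].append(age)
    return {
        decade: (min(ages), max(ages), max(ages) - min(ages))
        for decade, ages in sorted(by_decade.items())
    }


# Hypothetical (year, age) pairs, NOT the real nominees.
records = [(1934, 28), (1936, 33), (1939, 41), (1991, 24), (1995, 47), (1998, 80)]

for decade, (youngest, oldest, gap) in spread_by_decade(records).items():
    print(f"{decade}s: youngest {youngest}, oldest {oldest}, range {gap}")
```

A growing range from one decade to the next is the numeric version of the heartbeat line swinging wider.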

To summarize:

  • The Best Actress winner tends to be under 35, in contrast to the nominees, whose ages most frequently fall between 35 and 45.
  • The average winner’s age has increased in the more recent block of years compared with the entire 89-year Academy history.
  • The mean is higher than the median because, in all cases, there is a long tail in the distribution and the data is skewed. Using the median rather than the mean tells more about how the data really looks.
  • Plotting all the data over time shows that there is increasing variation – i.e. opportunity – for winners to come from a larger span of ages, even if there is still a lean towards an under-35 winner.

There’s one more thing. As you flick back through the performances nominated – and you can find all the nominees on Wikipedia under Academy Award Best Actress – you can’t help but recall many of these phenomenal women and their amazing performances. Whether you agree with the choice of winner or like/dislike the actress, the sheer force of all that talent is astounding to contemplate. From a Jennifer Lawrence in Silver Linings Playbook to Helen Mirren in The Queen, from Viola Davis in The Help back to a Bette Davis in Jezebel – these performances, these actresses, are a natural treasure.

As I look across the film list, I’m a little red-faced at how many I haven’t seen. Now that I’m finished plotting the data, I will have to make time to experience the data. Time for a visit to my local library!
