#177 | The Real Problem with AG49 – Part 2
When I say “numerical average,” what picture pops into your head? If you’re into statistics, you imagine a normal distribution – a bell curve. We were all taught about the bell curve in school and it applies nearly universally to natural phenomena. If you let sand drain out of your hand into a pile, the shape of the cross-section of the pile is a bell curve. If you take 500 people in a room and graph their heights, you’d see a bell curve. Our minds naturally process statistics through the lens of the normal distribution. So when folks in our industry make the statement that AG49 is the average, they’re implying (or even explicitly stating) that roughly 50% of the returns are above average and 50% are below. More importantly, they’re also implying that the magnitude of outperformance is identical to the magnitude of underperformance. Why? Because that’s how a normal distribution works. And I have to confess – that’s exactly how I thought the AG49 rates looked, too. But they don’t.
While I was playing around with the historical data in the AG49 maximum rate calculation, I started to notice something odd. Below is a histogram showing all 16,802 25 year geometric returns dating back to 1950 that comprise the AG49 maximum illustrated rate. Each bar represents the number of observations that fall within a given return range, segmented in 0.1% intervals. I’ve highlighted the two cells that border the AG49 average for the returns, which is 6.09%.
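For readers who want to see the mechanics, here is a minimal sketch of how a geometric return for a single 25 year string can be computed and how a set of those returns can be sorted into 0.1% buckets. The rates below are made up for illustration and are not the AG49 dataset; the function names `geometric_return` and `bin_returns` are my own.

```python
import math
from collections import Counter

def geometric_return(annual_rates):
    """Annualized geometric return for a sequence of annual credited rates."""
    growth = math.prod(1.0 + r for r in annual_rates)
    return growth ** (1.0 / len(annual_rates)) - 1.0

def bin_returns(returns, width=0.001):
    """Count observations per bucket; bucket k covers [k*width, (k+1)*width)."""
    return Counter(math.floor(r / width) for r in returns)

# Hypothetical illustration only: three made-up 25 year strings of credited rates.
windows = [
    [0.10, 0.00, 0.10, 0.05, 0.00] * 5,
    [0.10, 0.10, 0.00, 0.08, 0.02] * 5,
    [0.00, 0.00, 0.10, 0.10, 0.04] * 5,
]
geo = [geometric_return(w) for w in windows]
histogram = bin_returns(geo)  # bucket index -> number of observations
```

The average of the values in `geo` would be the analogue of the 6.09% figure, and `histogram` is what the bars in the chart are counting.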
What this graph is telling you is that the distribution of returns that make up the AG49 average is decidedly not symmetrical. For example, the chance of having a return that is more than 1% higher than the AG49 average is just 3.4%. The chance of having a return that is more than 1% lower than the AG49 average? Roughly 10%. The best-case scenario, in terms of overperformance, is exceeding the AG49 maximum illustrated rate by 1.41%. The worst case is having a return 2.3% lower than the AG49 max rate. Yikes. Saying that using the AG49 maximum rate is a reasonable “average expectation” conceals the fact that the worst case sits significantly further from the average than the best case. On the flip side, it also conceals the fact that a more reasonable expectation of performance is actually 6.22%, which is the median return for the dataset. The gap between the average and the median immediately tells you that underperformance is more severe but less common than overperformance.
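To make the mean-versus-median point concrete, here is a small sketch on a made-up sample with a long left tail. These numbers are invented purely to show the pattern and are not the AG49 returns; the 1% threshold simply mirrors the comparison above.

```python
import statistics

# Made-up sample (percent), NOT the AG49 dataset: a tight cluster with a long left tail.
sample = [3.8, 4.5, 5.0, 5.4, 5.8, 6.0, 6.1, 6.2, 6.2,
          6.3, 6.3, 6.4, 6.4, 6.5, 6.6, 6.8, 7.0]

mean = statistics.mean(sample)
median = statistics.median(sample)

def share(values, predicate):
    """Fraction of values that satisfy the predicate."""
    return sum(1 for v in values if predicate(v)) / len(values)

share_above = share(sample, lambda v: v > mean + 1.0)  # more than 1% above the mean
share_below = share(sample, lambda v: v < mean - 1.0)  # more than 1% below the mean
best_gap = max(sample) - mean    # best case relative to the mean
worst_gap = mean - min(sample)   # worst case relative to the mean
```

In a sample shaped like this, the median sits above the mean, big misses to the downside are more frequent than big beats to the upside, and the worst case is further from the mean than the best case.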
And there’s more to the story. Life insurance products don’t just respond to returns – they have policy charges, as well, and those charges interact with the return assumption. So what happens when you overlay the historical AG49 return distribution? That’s where things get really interesting.
This is a bizarre and unexpected result. I was so shocked by it that I called a couple of folks on both sides of the AG49 debate as it was being crafted back in 2014 to see if they’d looked at this analysis. The answer was that no, they hadn’t, at least not like this. They both had a relatively standard normal distribution in their heads as well and were as shocked as I was by the findings. So what is going on with this distribution? Why in the world does it look like a dinosaur, with a humped back and long tail?
On a whim, I decided to split the dataset dating back to 1950 into two equal parts – 25 year periods starting from 1950-1971 and 25 year periods from 1972-1993. Take a look at the results.
This blew my mind. What this graph is showing, basically, is that the reason the return distribution doesn’t look like it should is that it’s actually made up of two individual return distributions depending on the start date for the 25 year periods. Again, these are the 16,802 data points in the AG49 calculation simply segmented by start year. The average of the 8,035 data points from 25 year periods starting in 1950-1971 is just 5.49%. The average of the 7,748 data points for 25 year periods starting in 1972-1993 is 6.63%. The AG49 maximum illustrated rate is the average of all of the data points – but really, it cuts right down the middle of two distinct return distributions.
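A rough way to see why an average can land in the thin part of the data is to pool two simulated regimes. This is only a sketch: the two centers are the regime averages quoted above, but the bell-curve shape, the spread and the sample size are my own assumptions and are not taken from the AG49 data.

```python
import random

rng = random.Random(0)               # fixed seed so the sketch is repeatable
EARLY_MEAN, LATE_MEAN = 5.49, 6.63   # regime averages quoted above (percent)
SPREAD = 0.30                        # hypothetical standard deviation, an assumption
N = 5000                             # hypothetical sample size per regime

early = [rng.gauss(EARLY_MEAN, SPREAD) for _ in range(N)]
late = [rng.gauss(LATE_MEAN, SPREAD) for _ in range(N)]
pooled = early + late
pooled_mean = sum(pooled) / len(pooled)

def share_near(values, center, half_width=0.1):
    """Fraction of values within half_width of center."""
    return sum(1 for v in values if abs(v - center) <= half_width) / len(values)

near_pooled_mean = share_near(pooled, pooled_mean)
near_early_mean = share_near(pooled, EARLY_MEAN)
near_late_mean = share_near(pooled, LATE_MEAN)
```

Under these assumptions, the pooled average falls between the two regimes, and fewer observations sit near it than near either regime’s own average.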
The more I’ve thought about this phenomenon, the more it makes sense both mathematically and theoretically. Mathematically, the problem is that although it feels like we’re using an enormous amount of historical data to draw conclusions, that’s actually not the case. The span of time from 1950 to 2018 is 68 years, which is just 2.7 times the observation period of 25 years. If you think about independent average returns over 25 years, we can only fit 2.7 independent 25 year periods into 68 years of data. The other 16,799 25 year strings share some number of observations greater than zero. So wouldn’t it make sense, then, that the data would create two different distribution curves since that’s basically what can fit independently inside the data?
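The arithmetic here is simple enough to sketch. The span and window length come from the paragraph above; the `shared_fraction` helper is my own illustration of how much two 25 year strings overlap when their start dates are a given number of years apart.

```python
SPAN_YEARS = 2018 - 1950   # 68 years, as stated above
WINDOW_YEARS = 25

# How many non-overlapping 25 year periods fit in the span (about 2.7).
independent_periods = SPAN_YEARS / WINDOW_YEARS

def shared_fraction(offset_years, window_years=WINDOW_YEARS):
    """Fraction of a window shared with another window starting offset_years later."""
    return max(0.0, 1.0 - abs(offset_years) / window_years)

one_year_apart = shared_fraction(1)     # strings starting a year apart
ten_years_apart = shared_fraction(10)   # strings starting a decade apart
no_overlap = shared_fraction(25)        # strings starting 25 years apart
```

Two strings that start a year apart share 24 of their 25 years, so they are nearly the same observation counted twice. Only strings that start at least 25 years apart share nothing.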
Theoretically, the fact that these two distributions are markedly different from one another also makes sense. The time period from, say, 1990 to 2015 is markedly different than 1955 to 1980. I’ve heard many explanations for why there are two regimes. Companies were allowed to buy back stock in the 1980s, which opened the door for inflating share prices by reducing float. The US left the gold standard in the 1970s, which allowed us to move from being a creditor to a debtor (as a country) and lever up financial assets. Financial and technological advances disproportionately benefited US equity prices after 1985. The list goes on. But everyone generally agrees that the world now is not what it once was. For Indexed UL, the data seems to bear that out.
If you’re in the business of promoting Indexed UL at the maximum AG49 rate, then this should be an immensely disturbing conclusion. By using the maximum AG49 rate, you’re implying that it is a reliable projection of future policy performance or, at least, a “fair” valuation for the current cap. But that’s not the case. History will not repeat itself. What’s your take on the future equity regime over the next 25 years? The simple fact is that no one knows the answer to that question. Not the regulators. Not the insurance companies. Certainly not me. And doubly certainly not clients. But the answer to that question seems to matter immensely to what illustrated rate to show on an Indexed UL product.
Except, in reality, it doesn’t. I started this series by pointing out the fact that the AG49 maximum illustrated rate is purely hypothetical in that it relies on the assumption of a constant cap throughout history. That’s why I’ve never, ever, under any circumstance advocated that the AG49 maximum illustrated rate serve as the baseline for a “reasonable” rate for Indexed UL. It’s a purely hypothetical figure. But that view doesn’t sit well with a lot of people selling Indexed UL because they want something concrete to show to their clients, something historically grounded. That’s why the hypothetical historical lookback methodology embedded in AG49 is so appealing. But, as I’ve shown, even the “historical” part of that equation is dicey and can’t be trusted. It’s not what it appears to be and isn’t fed with sufficient data to be reliable, even if you’re willing to suspend reality and assume that the cap never changes. The AG49 maximum illustrated rate formula fails on both counts. That’s why, at the end of the day, the AG49 rate has literally nothing to do with expected performance for an Indexed UL product. It has no bearing. It is equivalent to using a dart board to select an illustrated rate – and a lot less fun.
The fact that AG49 is reliable neither in its hypothetical assumption of a constant cap nor in its skewed and limited historical data raises a tough question. If not the AG49 maximum rate, then what is the “right” illustrated rate for Indexed UL? The simple answer is that there is no “right” illustrated rate, but there is a fair one – the option budget. The maximum AG49 rate represents a single scenario out of an infinite number. It is not to be trusted and was never intended to be trusted. The responsibility for showing reasonable illustrations rests, has always rested and will always rest with the producer. Choose wisely – and conservatively.