#314 | Engineered Index 5 Year Performance

For most folks in the life insurance industry, it probably feels as though the phenomenon of non-standard indices in Indexed UL products has appeared very quickly, almost out-of-nowhere in the wake of AG 49-A. The reality, however, is that these types of indices have been around for over a decade, first in structured CDs and then migrating to FIAs and now making their way into Indexed UL. I’ve written about this new breed of indices multiple times and primarily from the vantage point of education about how they’re built, their structural attributes and how they interact with Indexed UL illustrations. But now it’s time to take a different tack – how do these indices actually perform?

But first, let’s clear up some nomenclature, at least for the purposes of this article. The industry can’t seem to coalesce on what to call these indices, myself included. Some folks call them “volatility controlled” and that’s broadly accurate given that the vast majority of the indices have an explicit volatility target, but not all of them. Some folks call them “proprietary indices” because most of these indices are built and used at specific life insurers, but not all of them.

For this article, I’m going to use another term – engineered indices – that, I think, captures the spirit of what the asset managers and banks are doing when they create these indices. They’re engineering them to deliver particular things such as high participation rates, stable option prices, backtested performance, real-world performance or a particular investment story. At their core, these indices are engineered for something. That’s what separates them from normal indices.

Every month, I get a report of the performance of 135 engineered indices dating back to 2017. The report also includes the salient attributes of the index – the strategy, volatility target, return type and embedded fees. The returns are on a calendar year basis, so not a complete picture of the index but a reasonable directional look at the performance of the indices over time. To keep things simple for this article, I eliminated any index with a volatility target higher than 6%. There are indices with higher volatility targets but they’re relatively few and far between.

Using the calendar year (and 2022 YTD) return data provided on the report, I’m going to work through a few simple questions that relate to core value propositions of engineered indices.

How do they actually perform over time?

When it comes to performance, engineered indices have had a very interesting last 5 years. Take a look at the chart below showing average compounding performance for the engineered indices from the beginning of 2017 through to YTD 2022. Engineered indices are in blue. I also color coded the S&P 500 (yellow), the AGG and 10 year futures index (dark green) and a 60/40 rebalanced portfolio (light green).

On average, engineered indices have delivered returns of 3.55% since the beginning of 2017, quite a bit less than the average S&P 500 return of 12.45% and the 60/40 rebalanced portfolio return of 8.29%. But that’s not quite the whole story. For a vanilla IUL with a 4% option budget, an S&P 500 par rate over this period of time would have averaged around 5%. That same 4% option budget, however, could have produced something like a 170% participation rate for an engineered index. Applying those figures to the averages (with a 0% Floor) produces a much closer story – the S&P 500 par rate strategy would have generated a return of around 8%, spitting distance from the average return of the engineered indices after applying a 170% participation rate.

And this is exactly what we’d expect. In theory, there should be very little difference over a multi-year time horizon between the S&P 500 with a dynamic participation rate and an engineered index that allocates between the S&P 500 and cash but has a stable participation rate. The difference is in the packaging, not in the results, and that’s what we see here on average.
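To make the participation-rate arithmetic concrete, here is a minimal sketch in Python. It assumes the simplest possible crediting formula (participation rate times index return, with a 0% Floor) and applies it to the averages quoted above. That is a shortcut, because real crediting happens year by year rather than on an average. The function name `credited_rate` and the -5% down-year figure are made up for illustration.

```python
def credited_rate(index_return, participation_rate, floor=0.0):
    """Credited return for one period: par rate x index return, never below the floor."""
    return max(floor, participation_rate * index_return)

# Averages quoted in the article (2017 through YTD 2022).
engineered_avg = 0.0355
engineered_par = 1.70  # "something like" 170% on a 4% option budget

engineered_credit = credited_rate(engineered_avg, engineered_par)
print(f"Engineered index at a 170% par rate: {engineered_credit:.2%}")

# The floor only matters in a down year. A hypothetical -5% index return
# is credited at the 0% floor rather than at 170% of the loss.
print(f"Hypothetical down year: {credited_rate(-0.05, engineered_par):.2%}")
```

The result for the engineered indices lands at roughly 6%, which is the figure being compared with the roughly 8% for the S&P 500 par rate strategy.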

This speaks, I think, to the use case for these indices in general. Some folks like to demonize engineered indices because they’re complicated, opaque and often somewhat unruly. All of that is true, but that doesn’t detract from the fact that they can and do actually deliver performance that is generally in-line with the S&P 500 after adjusting for participation rates, with some variation around the mean due to the peculiarities of each index. That’s what this analysis shows.

How does engineered index performance vary by year?

Take a look at the average performance of the crop of engineered indices from 2017 through YTD 2022:

If you were to look at this chart and make a few generalizations, you might say that engineered indices are up when the S&P 500 is up, but the degree to which S&P 500 performance flows through to the index depends on S&P 500 volatility. In 2017, for example, engineered indices posted a massive average return of 11.45%, capturing more than half of the S&P 500’s return of 21.82%. But in 2019, the S&P 500 was up 31.4% and engineered indices captured just 32% of those gains for an average of 10.12%.

Why the difference? In 2017, the VIX averaged just 11.1% but it was 15.4% in 2019. Higher volatility means that engineered indices have to allocate out of risky assets (or deleverage) to maintain their volatility target. As volatility increases – like it did in 2020 and 2021 – we would expect engineered indices to capture less equity-based upside, and that’s exactly what we see. In 2021, for example, VIX averaged 19.7% and the engineered indices managed to capture just 14.5% of the S&P 500 return, posting a paltry 4.15% average return against S&P 500 performance of 28.68%.
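The capture figures above are simple ratios, and a short Python sketch makes the pattern easy to check. It uses only the returns and VIX averages quoted in the text for 2017, 2019 and 2021. The helper name `capture_ratio` is an illustrative choice, not a term from the report.

```python
# (year, S&P 500 return %, average engineered index return %, average VIX)
# All figures are the ones quoted in the article.
years = [
    (2017, 21.82, 11.45, 11.1),
    (2019, 31.40, 10.12, 15.4),
    (2021, 28.68, 4.15, 19.7),
]

def capture_ratio(index_return, benchmark_return):
    """Share of the benchmark's gain that showed up in the index."""
    return index_return / benchmark_return

captures = {year: capture_ratio(eng, sp) for year, sp, eng, vix in years}

for year, sp, eng, vix in years:
    print(f"{year}: VIX {vix:>4.1f}  capture {captures[year]:.1%}")
```

Across these three years, the capture ratio falls as the average VIX rises, which is the relationship the paragraph describes. Three data points are an illustration, not a statistical test.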

But it’s not just equity performance in the mix. If the engineered index is allocating out of equities, then it’s allocating into something else. As I wrote recently, that “something else” is often long-duration fixed income assets. That’s part of the reason why, for example, 2020 was a strong year for engineered indices despite the fact that most of them allocated almost completely out of equities and missed the S&P 500 rally after March. How is that possible? Because interest rates fell through the floor and the Agg posted a gain for the year of 7.5% and even 10 year Treasury futures increased a whopping 8.23%. That’s also why engineered indices have taken a beating in 2022 – yes, the S&P 500 is down (-13.3%), but the carnage in the Agg (-9.5%) and 10 year Treasury futures (-8.63%) is worse relative to the inherent volatility of the asset.

I’m sure that folks who make their living creating and selling these indices would take some umbrage with my generalized intuition above, but consider this – when the average engineered index performance was negative in 2022 and 2018, 88% and 84% of engineered indices were negative (respectively). When the average engineered index performance was positive in 2021, 2020, 2019 and 2017, then 91%, 82%, 98% and an incredible 99% of engineered indices were also up. On average, 90% of indices have the same sign as the average. These indices appear to be far more alike than they are different, at least in aggregate.
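For anyone who wants to run the same check on their own data, here is a rough Python sketch of the calculation. The function `same_sign_share` and the four-index example are hypothetical. The yearly percentages are the ones quoted above. Treating a return of exactly zero as "not positive" is an assumption of this sketch.

```python
def same_sign_share(returns):
    """Fraction of indices whose return has the same sign as the cross-sectional average."""
    average = sum(returns) / len(returns)
    matches = [r for r in returns if (r > 0) == (average > 0)]
    return len(matches) / len(returns)

# Hypothetical returns (%) for four indices in a down year; NOT from the report.
example = [-1.0, -2.0, 3.0, -0.5]
print(f"Hypothetical example: {same_sign_share(example):.0%} share the sign of the average")

# Percentages quoted in the article, by year.
reported = {"2017": 99, "2018": 84, "2019": 98, "2020": 82, "2021": 91, "2022 YTD": 88}
average_agreement = sum(reported.values()) / len(reported)
print(f"Average across the six periods: {average_agreement:.1f}%")
```

The simple average of the six quoted percentages comes out just above 90%, consistent with the figure in the text.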

Why do some indices do better than others?

Despite the common ingredients used in these indices, average performance ranges widely. The best-performing engineered index clocked in at 6.25% and the worst-performing index managed negative performance over the period. As you’ll see in a minute, many of these indices are recent creations, but ironically both the best and worst performing indices came into being prior to 2017. The index performance being shown on the chart is real for these two indices.

That fact, I think, highlights one of the truisms about engineered indices – each index will respond differently to the same economic fact pattern. Or, to put it another way, each engineered index is “tuned” to respond well to a certain series of economic outcomes and potentially poorly when those outcomes don’t happen. That’s why two indices can have such vastly different reactions to the exact same historical period.

There’s no better example of this than the much-maligned Trader Vic index that was released in early 2012, making it just the 3rd engineered index to hit the market in this dataset. That index has been the subject of a class action lawsuit (Ogles v Security Benefit) and in the summary judgement for the lawsuit in favor of the defendants, it states that “the selling point [of the index] was that it might perform favorably when annuities linked to indices based on stocks might not, and its potential was uncapped.” The suit also says that “historical performance simulations showed an upward trend in past years – a trend that did not hold up after Ogles purchased his annuity.”

That’s to be expected because equity performance was absolutely stellar after 2012. Trader Vic was literally the worst performing index in the industry for the next 9 years – until its fortunes turned in 2020 and it ripped off a 6.82% return followed by a 7.47% return in 2021 and a whopping 15.22% YTD in 2022, making it one of the best performers over each one of those periods and the best performer by more than 7% in 2022.

Contrast Trader Vic with the Bloomberg US Dynamic Balance index (BUDBI), created just after Trader Vic in mid-2013 and the 5th engineered index into the market. BUDBI posted massive gains in 2017, 2019 and 2020 that were well in excess of the average – the only index to have returns in the top 15 for those 3 years (and the only index to have 3 top-15 showings for any year). What was unique about 2017, 2019 and 2020? Those 3 years were the only years when both the Agg and the S&P 500 had gains. Why does that matter for BUDBI? Because those are the two constituents of the index. Hence, in 2022 when the S&P is down 13.3% (as of the end of April) and the Agg is down an incredible 9.5%, the BUDBI is down a jaw-dropping 9.87% – a huge swing for an index with a 5% volatility target.

Which index is better, Trader Vic or BUDBI? Well, that depends on how the future plays out. Of the 103 indices in the dataset, only one – BUDBI – ranked in the top 15 for 3 years. Just 19 indices ranked in the top 15 for 2 years. Another 36 had a top 15 showing for at least one year. But the largest slug, a full 47 indices, didn’t have a single top 15 showing since 2017. The future will surely be the same as the past – not in terms of pure economic returns, but in terms of how these indices perform. Some will do well. Some will do not so well. Some may even do quite poorly, especially (as I wrote about recently) in a rising rate environment. And it will be extremely difficult to predict which ones will do which.

To put a final pin on this point, consider the chart below, which shows the average performance for each engineered index from 2018 through YTD 2022 based on its 2017 performance ranking. If performance in any given year – in this case, 2017 – is an indicator of future performance, then we should expect to see some correlation between the rank of the index and its performance. In other words, the lines should generally slope down from left to right.

Instead, what do we see? Virtually no correlation. For example, the best performer after 2017 was the 9th worst performer in 2017. The worst performer in 2017 was the 2nd best performer after 2017. The 2nd best performer in 2017 was the 12th worst performer after 2017. Only 3 indices maintained their ranking in both scenarios – S&P 500 at #1, S&P 500 Daily Risk Control 5% at #19 and Goldman Sachs Dynamo at #76. As they say, past performance is not a predictor of future performance.
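One way to put a number on "virtually no correlation" is a rank correlation between the 2017 ranking and the ranking afterward. The Python sketch below is not the calculation behind the chart. It is a plain Spearman coefficient, valid only when there are no tied values, and the five-index dataset is entirely hypothetical.

```python
def ranks(values):
    """Rank values from 1 (largest) to n (smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman(x, y):
    """Spearman rank correlation for untied data: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Hypothetical returns (%) for five indices; NOT taken from the report.
returns_2017 = [12.0, 9.5, 8.0, 6.5, 3.0]
returns_after = [1.5, 4.0, 0.5, 3.0, 2.0]

rho = spearman(returns_2017, returns_after)
print(f"Rank correlation: {rho:.2f}")
```

A coefficient near +1 would mean that 2017 leaders stayed leaders, near -1 that they became laggards, and near 0 that the 2017 ranking says little about what followed.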

Does more engineering make an engineered index better?

Early in the life of engineered indices, the basic pitch was that the core value of engineering was actually in volatility control. Theoretically, using a volatility trigger to allocate out of equities and into cash (or, later, fixed income) would mean that the index would scale down risk exposure as drawdowns in equities occurred. From this concept came simple indices built by S&P that combined equity exposure with a daily risk control at a specific volatility target (5% for the indices used in this analysis).

There are 4 S&P volatility-controlled indices in this dataset. They are:

  1. S&P 500 Daily Risk Control 5%
  2. S&P 500 Daily Risk Control 5% Excess Return
  3. S&P Low Volatility Daily Risk Control 5%
  4. S&P Low Volatility Price Return Daily Risk Control 5%

All 4 use a volatility control mechanism that allocates to cash, not long-duration fixed income, and none have leverage. These are as vanilla as they get. There are, however, some substantive differences. The 1st and 3rd indices are total return, whereas the 2nd is excess return (total return minus the risk-free rate) and the 4th is a price return. Also, the Low Volatility variants use the S&P Low Volatility Index, which screens the S&P 500 for the 100 lowest volatility stocks, rather than the S&P 500. Theoretically, this allows for more participation in equities because the inherent low volatility of the equities being selected means that there’s more equity exposure. All else being equal, in a rising equity environment, we’d expect the Low Volatility variants to outperform the S&P 500 variants simply because of a larger exposure to equities.
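The mechanism described above can be sketched in a few lines of Python. This is a stylized version, not S&P's published methodology. It assumes the equity weight is simply the volatility target divided by realized volatility, capped at 100% because these indices are described as having no leverage, with the rest in cash. The volatility, equity return and cash return inputs are hypothetical and chosen only to show the direction of the effect.

```python
def equity_weight(target_vol, realized_vol, max_weight=1.0):
    """Equity allocation under a volatility target, capped (no leverage by default)."""
    return min(max_weight, target_vol / realized_vol)

def blended_return(weight, equity_return, cash_return):
    """One-period index return: equity sleeve plus the remainder in cash."""
    return weight * equity_return + (1.0 - weight) * cash_return

TARGET = 0.05  # 5% volatility target, as in the four S&P indices

# Hypothetical realized volatilities, chosen only to illustrate the mechanism.
sp500_vol = 0.16
low_vol_basket_vol = 0.11

w_sp500 = equity_weight(TARGET, sp500_vol)
w_low_vol = equity_weight(TARGET, low_vol_basket_vol)
print(f"S&P 500 weight: {w_sp500:.1%}, Low Volatility weight: {w_low_vol:.1%}")

# Same hypothetical 10% equity return and 1% cash return for both variants.
r_sp500 = blended_return(w_sp500, 0.10, 0.01)
r_low_vol = blended_return(w_low_vol, 0.10, 0.01)
print(f"Index return: {r_sp500:.2%} (S&P 500) vs {r_low_vol:.2%} (Low Volatility)")
```

Holding the equity return equal, the lower-volatility basket earns a larger equity weight and therefore a higher index return in an up market. In practice the two baskets would not post the same equity return, so this shows only the exposure effect.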

So how do these indices perform relative to their peers? Take a look at the performance of the indices since 2017:

In short, the simple, straightforward, “benchmark” S&P 500 indices perform extremely well. All of them rank in the top 35 for average performance over the sample period. What these S&P indices show is that although carriers and their index partners usually lead with some sexy story about asset allocation, the reality is that volatility control is the primary driver of performance, the great equalizer amongst all of the indices. That’s why the indices all tend to move in the same direction and why the S&P indices are reliable indicators of what other indices will do, at least directionally.

Based on this analysis, I would argue that 90%+ of the generic benefits of an engineered index can be gleaned with the simple volatility control mechanism in these benchmark S&P indices. As one index provider commented to me recently, all that really matters in the long run is the volatility control target. Everything else is window dressing – and that’s exactly what this shows, despite all of the time, money, engineering and marketing that firms pour into their creations.

What about participation rates?

However, it’s not the end of the story. We can’t just look at index performance. We also have to look at how the index itself interacts with the product. The fact that two of the S&P indices are Total Return, for example, means that the options on these products will be relatively more expensive than Price Return or Excess Return indices, both of which are designed to make options cheaper to increase the participation rate available in an indexed insurance product.

Carriers have figured out that participation rates sell. Producers (and their clients) seem to be easily convinced that higher participation rates are better than lower participation rates. If all else were equal, then that would be true. But all else is not equal. Let me give you a quick example. Imagine you’re trying to choose an index and you’re presented with the following options, all of which have the same 4% option budget:

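| Index | Volatility Target | Return Type | Embedded Fee | Participation Rate |
|---|---|---|---|---|
| S&P 500 | None | Total | None | 44% |
| S&P 500 | None | Price | None | 51% |
| S&P 500 | None | Excess | None | 58% |
| S&P 500 | 10.00% | Excess | None | 92% |
| S&P 500 | 5.00% | Excess | None | 167% |
| S&P 500 | 5.00% | Excess | 0.50% | 185% |
| S&P 500 | 5.00% | Excess | 1.00% | 206% |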

Which one is the best? Trick question – they’re all the same, at least on an economic basis. But the participation rates are definitely not. Why? Because these strategies are designed to play games with the option pricing formula. Take the final option, which has a 5% volatility control target, subtracts out the risk-free rate and has an embedded fee of 1%. For option pricing purposes, this option has a negative expected return of 1%. Hence, the options are cheap. Very cheap. You’re basically only paying for the nose of the 5% volatility target that will poke out above the 0% return line.
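To see how those levers move the option price, here is a hedged Python sketch using a textbook Black-Scholes formula for a one-year, at-the-money call. Everything numeric in it is an assumption for illustration: a flat 3% risk-free rate, a 1.5% dividend yield, 18% volatility for the uncontrolled S&P 500, and options on volatility-controlled indices priced exactly at the volatility target. It is not meant to reproduce the participation rates in the table, which depend on pricing inputs not given here. The point is the ordering.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def atm_call_price(vol, drift, rate, term=1.0):
    """Black-Scholes at-the-money call per 1.00 of notional.

    `drift` is the index growth rate used for pricing: the risk-free rate for
    total return, rate minus dividends for price return, zero for excess
    return, and minus the fee for excess return with an embedded fee.
    """
    forward = math.exp(drift * term)
    sd = vol * math.sqrt(term)
    d1 = math.log(forward) / sd + 0.5 * sd  # strike is 1.00
    d2 = d1 - sd
    return math.exp(-rate * term) * (forward * norm_cdf(d1) - norm_cdf(d2))

def participation_rate(budget, vol, drift, rate):
    """How much option notional the budget buys."""
    return budget / atm_call_price(vol, drift, rate)

# Assumed inputs (hypothetical, not from the article except the 4% budget).
RATE, DIV_YIELD, BUDGET = 0.03, 0.015, 0.04

strategies = [
    ("Total return, no vol control", 0.18, RATE),
    ("Price return, no vol control", 0.18, RATE - DIV_YIELD),
    ("Excess return, no vol control", 0.18, 0.0),
    ("Excess return, 10% vol target", 0.10, 0.0),
    ("Excess return, 5% vol target", 0.05, 0.0),
    ("Excess return, 5% target, 0.50% fee", 0.05, -0.005),
    ("Excess return, 5% target, 1.00% fee", 0.05, -0.010),
]

par_rates = [participation_rate(BUDGET, vol, drift, RATE) for _, vol, drift in strategies]
for (name, _, _), par in zip(strategies, par_rates):
    print(f"{name:<38} {par:.0%}")
```

Under these assumptions, each step down the list (stripping dividends, subtracting the risk-free rate, lowering volatility, adding a fee) makes the option cheaper and the participation rate higher, even though the underlying equity exposure is the same S&P 500 throughout.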

We can see the effect of these sorts of fees and Excess Return arrangements in the performance data. Only 8 of the 99 engineered indices in this dataset are not Excess Return. 71 of the 99 have some embedded fee, most often 0.50% (50/71) but as high as 1.25% and as low as 0.25%. How do these fees impact performance? Palpably. Take a look at the fees of the indices as ranked by their average performance since 2017:

That light blue line indicates what we would expect to see: in general, indices with higher fees perform worse than indices with lower fees. In other words, these fees don’t exist because they’re increasing the performance of the index, as one might argue with paying a fee for active fund management. These fees exist for the sole reason of increasing participation rates. That’s it. And the tradeoff is that the client, all else being equal, gets a lower performing index. Which one would you rather have: a higher participation rate on a lower performing index or a lower participation rate on a higher performing index? Hopefully, by now, you can see that’s a trick question.

Conclusion

Analyzing this data has confirmed some of my prior views and challenged others. It has confirmed my general view that these indices are something like the menu at a generic, Americanized Mexican restaurant, where the vast majority of the 242 items on the menu consist of meat, beans, veggies, rice and tortillas in different combinations, arrangements and proportions.

However, there are also likely to be a few standout items that go an entirely different path – think chiles en nogada, a poblano pepper stuffed with a complex and flavorful, fragrant and almost fruity meat filling, topped with a sherry, goat cheese and walnut cream sauce and adorned with pomegranates and parsley for a combination that is so surprisingly delicious that it haunts your dreams and makes you want to schedule a trip back to San Antonio just so you can order it at Cuishe again.

Chiles en nogada is the national dish of Mexico. Does that surprise you? It stunned me. It made me realize that I know literally nothing – nothing – about Mexican food. Walnuts? Pomegranates? Goat cheese? These things are not Mexican food. But they are. They are part of the national dish of Mexico. What, then, have I been eating for my entire life? The only conclusion I can come to is that I’ve been eating the Mexican food that Americans like to eat. It’s made, it’s cultivated, for me. The same goes for indices. They are engineered so that the optics look good to producers and to their clients. That does not mean they’re made to be great, or to perform well in the real world, or to deliver reliable results – they’re made to look good, to fit what producers and clients think they want.

That’s why these indices look so incredibly similar. They’re all trying to have great backtests with high participation rates so that they really pop on the illustration, because that’s what producers want. Is that what clients actually want? No, they want actual, real performance. Backtests and high participation rates are not reliable indicators of future performance. They’re almost completely unrelated. Future performance is entirely uncertain.

How do you handle uncertainty? Two ways – get certain or get diversified. Become an expert or admit you’ll never be an expert. Certainty has a siren song. Everyone likes to feel like they’re certain, me included (and maybe me most of all). But with these engineered indices, I don’t think certainty is possible. I’ve heard too many people who create these indices bemoan the original sins of past indices and the indices created by other firms. I’ve heard those people talk too much about things that weren’t “supposed to happen” but did anyway. If the people who create these indices aren’t certain, then who can be?

That leaves us with diversification. If you’re selling Indexed UL and you’re faced with all of these different choices, spread the risk. Acknowledge that you don’t know, but you think there’s merit to diversification itself as a strategy. The only currency we have is credibility. A very easy way to destroy credibility is to make assurances and recommendations about something you don’t control or understand – and right now, on the cusp of this new generation of Indexed UL, there’s nothing that advisors control and understand less than engineered indices. If you’re going to choose, then choose wisely. But perhaps the wisest course is to not choose at all.