Reacting to the conventional wisdom that one's retirement
allocation should depend on one's age, and believing that a stronger
rule of thumb could be developed by accounting for wealth, I developed
a simple two-asset-class model (using bonds and stocks) and ran a
regression. For one set of parameters, I found an ideal preretirement
bond allocation of roughly `45% - (age * .64%) + (wealth *
3.4%)`, where wealth is measured as a multiple of annual
expenses. I have not yet tested how sensitive this formula is to the
parameters I assumed, but I suspect that, by giving two digits for
each coefficient, I have given at least one too many.
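
As a quick illustration of how the rule of thumb would be applied, here is a minimal Java sketch. Only the three coefficients come from the regression above; the class name, method name, and the clamping of the result to the range 0% to 100% are assumptions made for this example.

```java
// Sketch of the rule of thumb quoted above. Names are illustrative;
// only the three coefficients come from the text.
class BondRuleOfThumb {
    /**
     * @param age    age in years (before retirement)
     * @param wealth savings as a multiple of annual expenses
     * @return suggested bond share in percent, clamped to [0, 100]
     *         (the clamping is an assumption, not part of the regression)
     */
    static double bondPercent(double age, double wealth) {
        double raw = 45.0 - 0.64 * age + 3.4 * wealth;
        return Math.max(0.0, Math.min(100.0, raw));
    }

    public static void main(String[] args) {
        System.out.printf("age 30, wealth 2x:  %.1f%% bonds%n", bondPercent(30, 2));
        System.out.printf("age 30, wealth 20x: %.1f%% bonds%n", bondPercent(30, 20));
        System.out.printf("age 60, wealth 2x:  %.1f%% bonds%n", bondPercent(60, 2));
    }
}
```

Under this fit, a 30-year-old with savings of twice annual expenses would hold about 33% in bonds, while one with twenty times annual expenses would hold about 94%.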

Conventional wisdom suggests that a participant in a retirement savings plan put an increasing amount of his portfolio into safer (but lower-return) assets as he gets older. One quantitative variant of this rule suggests that the percentage of one's portfolio in bonds should equal one's age: for example, a 30-year-old should put 30% of his savings in bonds and 70% of his savings in stocks.

I have believed that a more sensible rule would be to move one's
portfolio into lower-risk assets as one gets *wealthier*, rather
than as one gets older. Most people grow older and wealthier in
tandem (especially those who participate in retirement savings plans),
so this rule will often not be much different from the conventional
rule, but a 30-year-old who has saved 20 times his annual expenses can
essentially guarantee a comfortable retirement by buying bonds.

To analyze these rules, I created a model of a participant's saving/spending habits and the investment markets.

I assumed a generic retirement plan participant. This participant will retire at age 70, assuming he reaches that age; will save the same amount (in inflation-adjusted terms) every year until retirement; and will dissave the same amount (again, in real terms) every year after retirement. My main result assumes that contributions are 10% of expenses, but I also have results for 5% and 20%. His probability of surviving each year is determined from the life expectancy table in appendix C of the 2004 version of IRS publication 590. The participant allocates his savings between two asset classes: stocks and bonds.

I have assumed that real returns to stocks and bonds are serially independent and identically distributed over time. I made assumptions for the means and standard deviations of stock and bond returns, and the correlation between them; these assumptions can be seen on the web pages with the results. Based on an allocation, I calculate a portfolio mean and standard deviation and assume that the portfolio return will be distributed according to a logistic distribution with the same mean and standard deviation.
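
The portfolio calculation described above can be sketched as follows. This is not the program's actual code: the return parameters below are hypothetical placeholders (the real assumptions are on the result pages), and the class and method names are invented for the example. The two formulas are the standard ones: the two-asset portfolio variance, and the fact that a logistic distribution with scale `s` has standard deviation `s * pi / sqrt(3)`.

```java
// Sketch: portfolio mean and standard deviation for a bond/stock split,
// plus the quantile function of a logistic distribution matched to them.
class PortfolioReturn {
    // Hypothetical parameters, for illustration only.
    static final double STOCK_MEAN = 0.06, STOCK_SD = 0.18;
    static final double BOND_MEAN = 0.02, BOND_SD = 0.06;
    static final double CORRELATION = 0.2;

    /** Expected real return for a portfolio with the given bond share (0 to 1). */
    static double mean(double bondShare) {
        return bondShare * BOND_MEAN + (1 - bondShare) * STOCK_MEAN;
    }

    /** Standard deviation of the portfolio return for the given bond share. */
    static double sd(double bondShare) {
        double b = bondShare * BOND_SD;
        double s = (1 - bondShare) * STOCK_SD;
        return Math.sqrt(b * b + s * s + 2 * CORRELATION * b * s);
    }

    /** Quantile of a logistic distribution with the given mean and standard deviation. */
    static double logisticQuantile(double p, double mean, double sd) {
        double scale = sd * Math.sqrt(3) / Math.PI; // since sd = scale * pi / sqrt(3)
        return mean + scale * Math.log(p / (1 - p));
    }

    public static void main(String[] args) {
        double bondShare = 0.4;
        double m = mean(bondShare), s = sd(bondShare);
        System.out.printf("mean %.4f, sd %.4f%n", m, s);
        System.out.printf("5th percentile return: %.4f%n", logisticQuantile(0.05, m, s));
        System.out.printf("median return:         %.4f%n", logisticQuantile(0.50, m, s));
    }
}
```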

The model has several weaknesses. They are not precisely in decreasing order of importance, but I've tried to bring up the most serious problems toward the beginning. Correcting for these weaknesses would change the results, but I believe the character of the results would not change.

- Perhaps most importantly, I've ignored options such as delaying retirement and cutting expenses in a crunch. In reality, these options provide some of the best ways to handle the risk of a stock-market crash. I've assumed fairly generic participants — since decisions like this would depend a great deal on the person, the model would have to be significantly more complicated to handle such decisions.
- Contributions before retirement are presumed flat, as are expenses after retirement. In the real world, people have kids, they change jobs (or go a while without one), they have extraordinary one-time expenses or windfalls.
- The objective function may be a strong point of contention. Maximizing the probability of dying solvent implies that a person doesn't care whether he has a dollar or a million; if life span weren't exogenous to the model, it would imply that the best strategy for a solvent person would be suicide.

  That said, the participant's life span *is* exogenous, so the only way to increase the probability of dying solvent is to become more securely solvent. And if I were trying to construct an alternative objective function based on wealth, it would probably have a decreasing slope for increasingly large amounts of wealth, and would be most sensitive in the region where having enough to fund one's retirement is an open question. In other words, it would be similar to the function I've chosen, only more arbitrary.
- I've assumed only two asset classes. I actually don't believe this to be a major weakness, but it may be worth trying again with more asset classes.
- The returns to the asset classes are assumed to be serially uncorrelated and time independent, and a logistic distribution has been assumed. The parameters of the distributions are assumptions as well. I suspect that the long-term variance is overestimated (though the probability of an outlying return may not be). If you buy a long-term inflation-protected zero-coupon bond, there's no need to actually have any variance in the long run, even though year-to-year returns will vary. I also believe that stock returns will be overstated in the long run, because I believe that the intrinsic value of, say, the S&P 500 is less volatile than its market value.
- Probabilities are interpolated piecewise linearly between wealth levels, and the number of iterations used to simulate returns may not be adequate.
- No transaction costs.
- I used a single life expectancy table. It is not adjusted for wealth, income, or time (e.g. someone born in 1950 is assumed to have the same life expectancy in 2000 as a person born in 2000 is assumed to have in 2050), let alone any parameters we don't use (sex, race, lifestyle, etc.).

I created a small, simple Java program, which I will make available at some point. This Java program fills in two tables, one of ideal allocations, and one of probabilities. Each table has a column for each age and a row for each of 76 wealth levels. The levels of wealth are expressed as multiples of annual expenses, from zero to thirty times annual expenses, with an increment of 40% of annual expenses. A prior version of the program investigated higher levels of wealth, and found that none of them offered any significant probability of running out of money.

For each age in the life expectancy table, starting with the oldest, and for each wealth level, the program calculates an ideal allocation and the resulting probability of dying solvent. For each year except the last, this probability of dying solvent for a given allocation is calculated by repeatedly simulating the returns for the following year. After each simulation, the program finds the probability of dying solvent from the previously-calculated portion of the table.

For example, to calculate the probability of dying solvent for someone who is 65 and has a wealth of 2.4 times annual expenses, the program calculates, based on a particular asset allocation and randomly-selected asset market returns, how much money the participant will have at age 66. If the participant has 2.6 times annual expenses at age 66, the probability of dying solvent is assumed to be the average of the probability of dying solvent at age 66 with a wealth of 2.4 and the probability of dying solvent at age 66 with a wealth of 2.8; each of those probabilities is looked up in the table. This assumption of linearity between wealth levels is a simplification, but the wealth levels are close enough together to prevent it from significantly affecting the results (note, in the output probability tables, that any probability is very close to the average of the probabilities for adjacent wealth levels at the same age).
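
The lookup in this example can be sketched as below. The class and method names are invented, and two boundary choices are assumptions rather than anything stated above: negative wealth is treated as insolvent (probability zero), and wealth above the top of the table reuses the top row.

```java
// Sketch: linear interpolation of a probability between adjacent wealth levels.
class WealthTableLookup {
    static final double STEP = 0.4; // spacing between wealth levels, from the text
    static final int LEVELS = 76;   // 0.0, 0.4, ..., 30.0 times annual expenses

    /**
     * @param column probabilities of dying solvent at one age, indexed by wealth level
     * @param wealth wealth as a multiple of annual expenses
     */
    static double probabilityAt(double[] column, double wealth) {
        if (wealth < 0) return 0.0;                 // assumption: insolvent
        double position = wealth / STEP;
        int lower = (int) Math.floor(position);
        if (lower >= column.length - 1) {
            return column[column.length - 1];       // assumption: reuse the top row
        }
        double weight = position - lower;           // 0 at the lower level, 1 at the upper
        return (1 - weight) * column[lower] + weight * column[lower + 1];
    }

    public static void main(String[] args) {
        double[] age66 = new double[LEVELS];
        age66[6] = 0.50; // made-up probability at wealth 2.4
        age66[7] = 0.60; // made-up probability at wealth 2.8
        // Wealth 2.6 lies halfway between the two levels, so the result is their average.
        System.out.printf("P(solvent | age 66, wealth 2.6) = %.3f%n",
                probabilityAt(age66, 2.6));
    }
}
```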

The asset market returns are simulated by dividing the logistic distribution into several thousand segments of equal probability and iterating over the boundaries between each pair of consecutive segments (the exact number of iterations is given in the output). I suspect it would be more statistically sound to use the midpoints of the segments, rather than the endpoints. I also suspect that for large numbers of segments, it doesn't really matter.
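
To make the comparison concrete, here is a rough Java sketch of both sampling schemes. It is not the program's code: the names are invented, the mean and standard deviation are placeholders, and "midpoint" is interpreted here as the midpoint in probability of each segment, which is an assumption.

```java
// Sketch: sample a logistic distribution at the boundaries between
// equal-probability segments, or at the segments' probability midpoints.
class LogisticSegments {
    /** Quantile of a logistic distribution with the given mean and standard deviation. */
    static double quantile(double p, double mean, double sd) {
        double scale = sd * Math.sqrt(3) / Math.PI;
        return mean + scale * Math.log(p / (1 - p));
    }

    /** Returns at the (segments - 1) interior boundaries between segments. */
    static double[] boundaryReturns(int segments, double mean, double sd) {
        double[] r = new double[segments - 1];
        for (int i = 1; i < segments; i++) {
            r[i - 1] = quantile((double) i / segments, mean, sd);
        }
        return r;
    }

    /** Returns at the probability midpoint of each segment. */
    static double[] midpointReturns(int segments, double mean, double sd) {
        double[] r = new double[segments];
        for (int i = 0; i < segments; i++) {
            r[i] = quantile((i + 0.5) / segments, mean, sd);
        }
        return r;
    }

    static double average(double[] v) {
        double sum = 0;
        for (double x : v) sum += x;
        return sum / v.length;
    }

    /** Population standard deviation of the sampled returns. */
    static double spread(double[] v) {
        double m = average(v), sumSq = 0;
        for (double x : v) sumSq += (x - m) * (x - m);
        return Math.sqrt(sumSq / v.length);
    }

    public static void main(String[] args) {
        double mean = 0.05, sd = 0.15; // placeholder parameters
        for (int n : new int[] {10, 100, 5000}) {
            System.out.printf("n=%d  boundaries sd=%.4f  midpoints sd=%.4f  (target %.4f)%n",
                    n, spread(boundaryReturns(n, mean, sd)),
                    spread(midpointReturns(n, mean, sd)), sd);
        }
    }
}
```

Both schemes are symmetric around the mean, so any difference shows up in the spread of the sampled returns rather than in their average.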

Results are available for contributions equal to 5%, 10%, or 20% of expenses.

These pages are generated by my Java program, to ensure that I don't insert one set of parameters when the output is based on a different set. I do plan to add regression statistics, which will be generated by Excel, and are therefore more subject to carelessness. The regressions will only use data points between 10% and 90% and ages younger than retirement.

Everything here should be reproducible, especially once I release my Java code. Please reproduce, and let me know if I've screwed up.