Graham Number Backtest: 25 Years of Buying US Stocks Below Intrinsic Value

Cumulative growth of $10,000: Graham Number strategy vs SPY (2000–2024), NYSE, NASDAQ, AMEX


Benjamin Graham's formula is simple: multiply EPS by book value per share, multiply that by 22.5, and take the square root. If the stock trades below that number, it's cheap by Graham's standard. We ran this screen across US stocks for 25 years. The result is mixed, and worth understanding.

Contents

  1. Method
  2. The Formula
  3. What We Found
  4. Year by Year: Two Very Different Eras
  5. 2000-2012: When Cheap Beat the Market
  6. 2013-2021: The Value Malaise
  7. 2022-2024: Partial Recovery
  8. Why the US Market Is Different
  9. The Full Annual Record
  10. Limitations
  11. Run It Yourself
  12. Part of a Series

CAGR: 6.34%. SPY: 7.64%. Annual shortfall: -1.31%.

A $10,000 investment in January 2000 grew to $46,460 by end of 2024. SPY turned the same $10,000 into $63,071. The Graham Number screen, applied mechanically in the US market, didn't keep up.
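The headline CAGR figures follow directly from the start and end values. A minimal sketch of that arithmetic, using only the dollar figures quoted above (the function name `cagr` is ours, for illustration):

```python
# Figures quoted in the article: $10,000 grows to $46,460 (strategy)
# and $63,071 (SPY) over 25 annual periods.
START_VALUE = 10_000
YEARS = 25


def cagr(end_value: float, start_value: float = START_VALUE, years: int = YEARS) -> float:
    """Compound annual growth rate implied by a start value and an end value."""
    return (end_value / start_value) ** (1 / years) - 1


strategy_cagr = cagr(46_460)
spy_cagr = cagr(63_071)

print(f"Strategy CAGR: {strategy_cagr:.2%}")
print(f"SPY CAGR:      {spy_cagr:.2%}")
print(f"Annual gap:    {strategy_cagr - spy_cagr:.2%}")
```

Because the gap is computed from unrounded rates, it can differ by a hundredth of a point from the difference of the two rounded CAGRs.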

That's the headline. The story underneath it is more useful.


Method

  • Data source: Ceta Research (FMP financial data warehouse)
  • Universe: All US exchanges (NYSE, NASDAQ, AMEX), market cap > $1B
  • Period: 2000-2024 (25 annual rebalance periods)
  • Rebalancing: Annual (January), equal weight, top 30 by discount to Graham Number
  • Ranking: Deepest discount first (lowest price / Graham Number ratio)
  • Benchmark: S&P 500 Total Return (SPY)
  • Cash rule: Hold cash if fewer than 10 stocks qualify
  • Transaction costs: Size-tiered model (0.1% mega-cap to 0.5% mid-cap)
  • Filing lag: 45 days to prevent look-ahead bias
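The filing lag is the easiest of these rules to get wrong, so here is a rough sketch of the idea. It assumes the 45 days are counted from the fiscal period end date; the method summary does not say whether the lag is measured from the period end or from the actual filing date, and the function name and dates below are hypothetical.

```python
from datetime import date, timedelta

FILING_LAG = timedelta(days=45)  # lag stated in the method


def usable_at(period_end: date, rebalance_date: date, lag: timedelta = FILING_LAG) -> bool:
    """True if a report for `period_end` is assumed to be public by `rebalance_date`.

    Assumption: the lag is measured from the fiscal period end.
    """
    return period_end + lag <= rebalance_date


rebalance = date(2010, 1, 4)  # hypothetical January rebalance date

q3_ok = usable_at(date(2009, 9, 30), rebalance)     # period ended 96 days earlier
year_end_ok = usable_at(date(2009, 12, 31), rebalance)  # period ended 4 days earlier

print(q3_ok, year_end_ok)
```

Under this assumption a January rebalance works from third-quarter figures, because the year-end report would not yet have been available to a real investor.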

Full methodology: backtests/METHODOLOGY.md


The Formula

Graham Number = sqrt(22.5 × EPS × Book Value Per Share)

The constant 22.5 comes from Graham's rule that a stock shouldn't exceed 15x earnings or 1.5x book value. Multiply those limits: 15 × 1.5 = 22.5. The screen buys only stocks trading below this theoretical intrinsic value ceiling, ranked by how deep the discount goes.
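A short sketch of the formula, with a hypothetical company chosen so that both of Graham's limits bind at once. The function name and the EPS and book value inputs are illustrative only.

```python
import math
from typing import Optional


def graham_number(eps: float, bvps: float) -> Optional[float]:
    """sqrt(22.5 * EPS * BVPS); undefined unless both inputs are positive."""
    if eps <= 0 or bvps <= 0:
        return None
    return math.sqrt(22.5 * eps * bvps)


# Hypothetical company: EPS of $2.00 and book value of $20.00 per share.
eps, bvps = 2.0, 20.0
gn = graham_number(eps, bvps)

print(f"Graham Number:         {gn:.2f}")
print(f"Price at 15x earnings: {15 * eps:.2f}")
print(f"Price at 1.5x book:    {1.5 * bvps:.2f}")
```

All three values coincide here because the example sits exactly at 15x earnings and 1.5x book. More generally, a price is below the Graham Number whenever P/E multiplied by P/B is below 22.5, so a stock above 15x earnings can still pass if its price-to-book ratio is low enough.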

| Parameter | Value | Purpose |
|---|---|---|
| Formula | sqrt(22.5 × EPS × BVPS) | Graham's combined earnings + book ceiling |
| Max portfolio size | 30 stocks | Concentrated enough to matter |
| Ranking | Price / Graham Number, ascending | Deepest discounts first |
| Market cap floor | > $1B | Investable universe only |
| Cash threshold | < 10 qualifying stocks | Avoids forced exposure |

This is one of the purest value screens in finance. No momentum filters. No quality overlays. No sector caps. Graham's arithmetic, applied literally.
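The selection rules in the table can be sketched in a few lines. This is an illustration of the stated parameters, not the code in the backtests repository; `Candidate`, `select_portfolio`, and the toy universe are hypothetical, and the market cap filter is assumed to have been applied upstream.

```python
from typing import Dict, List, NamedTuple

MAX_HOLDINGS = 30     # "top 30" from the parameter table
MIN_QUALIFIERS = 10   # cash threshold from the parameter table


class Candidate(NamedTuple):
    symbol: str
    price: float
    graham_number: float  # assumed precomputed; <= 0 means undefined


def select_portfolio(candidates: List[Candidate]) -> Dict[str, float]:
    """Return {symbol: weight}; an empty dict means the period is held in cash."""
    below = [c for c in candidates
             if c.graham_number > 0 and c.price < c.graham_number]
    if len(below) < MIN_QUALIFIERS:
        return {}
    # Deepest discount first: lowest price / Graham Number ratio.
    ranked = sorted(below, key=lambda c: c.price / c.graham_number)
    picks = ranked[:MAX_HOLDINGS]
    return {c.symbol: 1 / len(picks) for c in picks}


# Toy universe: 12 stocks with a Graham Number of 20, priced from 10 to 21.
universe = [Candidate(f"S{i}", 10.0 + i, 20.0) for i in range(12)]
portfolio = select_portfolio(universe)
print(portfolio)
```

Ten of the twelve toy stocks trade below their Graham Number, so the sketch returns ten equal weights. Equal weighting across however many stocks qualify is our reading of "equal weight top 30"; it allows fewer than 30 holdings in a given year.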


What We Found

| Metric | Graham Number | S&P 500 |
|---|---|---|
| CAGR | 6.34% | 7.64% |
| Total Return | 364.6% | 530.7% |
| Max Drawdown | -44.39% | -34.90% |
| Volatility (ann.) | 21.24% | 17.51% |
| Sharpe Ratio | 0.204 | 0.322 |
| Sortino Ratio | 0.362 | 0.556 |
| Calmar Ratio | 0.143 | 0.219 |
| VaR 95% | -20.13% | -19.92% |
| Beta | 0.761 | 1.00 |
| Alpha | +0.04% | |
| Up Capture | 86.9% | |
| Down Capture | 72.1% | |
| Win Rate (vs SPY) | 56% | |
| Cash Periods | 0/25 | |
| Avg Stocks Held | 25.9 | |

The down capture of 72.1% is real. In years when the market falls, this portfolio falls only about 72% as much on average. That's genuine downside protection from owning cheap stocks. The problem is the up capture of 86.9%. The portfolio captures most of the upside but not all of it. In a sustained bull market, that gap compounds against you.
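For readers who want to check the capture figures, here is one way to compute them from the annual returns published in this article. The article does not state its capture formula; the sketch assumes the common ratio-of-average-returns definition, split by whether SPY's year was positive or negative, which comes out close to the reported numbers.

```python
# Annual returns in percent, 2000-2024, copied from the article's tables.
PORTFOLIO = [-5.83, 15.49, -5.89, 49.02, 28.37, 27.19, 16.61, 9.83, -44.39,
             46.12, 24.51, -20.13, 23.73, 10.9, -11.1, -7.7, 15.7, 24.6,
             -19.6, 0.3, -6.4, 15.0, -2.9, 21.9, 8.0]
SPY = [-10.50, -9.17, -19.92, 24.12, 10.24, 7.17, 13.65, 4.40, -34.31,
       24.73, 14.31, 2.46, 17.09, 27.77, 14.50, -0.12, 14.45, 21.64,
       -5.15, 32.31, 15.64, 31.26, -18.99, 26.00, 25.28]


def capture(portfolio, benchmark, up: bool) -> float:
    """Average portfolio return divided by average benchmark return,
    restricted to years where the benchmark was up (or down)."""
    pairs = [(p, b) for p, b in zip(portfolio, benchmark) if (b > 0) == up]
    # Both averages share the same year count, so the ratio of sums is equivalent.
    return sum(p for p, _ in pairs) / sum(b for _, b in pairs)


up_capture = capture(PORTFOLIO, SPY, up=True)
down_capture = capture(PORTFOLIO, SPY, up=False)

print(f"Up capture:   {up_capture:.1%}")
print(f"Down capture: {down_capture:.1%}")
```

Small differences from the table are expected, since the later years are published to one decimal place.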


Year by Year: Two Very Different Eras

2000-2012: When Cheap Beat the Market

The dot-com bust changed the landscape for value investing almost immediately. Stocks that had been priced for perfection crashed. Graham Number stocks, already priced conservatively, held up better and recovered faster.

| Year | Portfolio | S&P 500 | Excess |
|---|---|---|---|
| 2000 | -5.83% | -10.50% | +4.67% |
| 2001 | +15.49% | -9.17% | +24.66% |
| 2002 | -5.89% | -19.92% | +14.03% |
| 2003 | +49.02% | +24.12% | +24.89% |
| 2004 | +28.37% | +10.24% | +18.13% |
| 2005 | +27.19% | +7.17% | +20.02% |
| 2006 | +16.61% | +13.65% | +2.96% |
| 2007 | +9.83% | +4.40% | +5.42% |
| 2008 | -44.39% | -34.31% | -10.08% |
| 2009 | +46.12% | +24.73% | +21.39% |
| 2010 | +24.51% | +14.31% | +10.21% |
| 2011 | -20.13% | +2.46% | -22.59% |
| 2012 | +23.73% | +17.09% | +6.64% |

2001 and 2002 were the strategy's finest hours. While the S&P fell about 27% over those two years combined, the Graham Number portfolio gained roughly 9%. The protection came from the margin of safety built into the selection: stocks already priced below their conservative intrinsic value had limited room to fall further.
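The two-year figures come from compounding the annual returns in the table rather than adding them. A small sketch of that arithmetic (the helper name `compound` is ours):

```python
def compound(*annual_returns_pct: float) -> float:
    """Cumulative return, in percent, from a sequence of annual percent returns."""
    growth = 1.0
    for r in annual_returns_pct:
        growth *= 1 + r / 100
    return (growth - 1) * 100


# 2001 and 2002 annual returns from the table above.
spy_2001_2002 = compound(-9.17, -19.92)
portfolio_2001_2002 = compound(15.49, -5.89)

print(f"S&P 500, 2001-2002:   {spy_2001_2002:+.2f}%")
print(f"Portfolio, 2001-2002: {portfolio_2001_2002:+.2f}%")
```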

2003 through 2005 were the payoff years. Banks, industrials, and cyclicals trading below book value doubled and tripled as the economy recovered. The screen caught them before the re-rating. Three consecutive years of roughly 18 to 25 percentage points of excess return.

2008 broke the streak badly. The -44.39% drawdown was the strategy's worst year, worse than SPY by 10 percentage points. Graham Number screens load up on financials and deep cyclicals, the exact sectors that collapsed in the credit crisis. Book values became unreliable overnight. The margin of safety embedded in pre-crisis BVPS figures evaporated when balance sheets were restated.

2009 and 2010 recovered fast. The same sectors that cratered bounced hardest, and the strategy captured +46% in 2009 alone.

2011 was another bad year. The strategy fell 20% in a flat market, hit by European sovereign debt fears that punished financials and cyclicals a second time.

Through 2012, the strategy had beaten SPY in 11 of 13 years. The compound advantage was meaningful. Then the world changed.

2013-2021: The Value Malaise

| Year | Portfolio | S&P 500 | Excess |
|---|---|---|---|
| 2013 | +10.9% | +27.77% | -16.8% |
| 2014 | -11.1% | +14.50% | -25.6% |
| 2015 | -7.7% | -0.12% | -7.6% |
| 2016 | +15.7% | +14.45% | +1.2% |
| 2017 | +24.6% | +21.64% | +3.0% |
| 2018 | -19.6% | -5.15% | -14.4% |
| 2019 | +0.3% | +32.31% | -32.0% |
| 2020 | -6.4% | +15.64% | -22.1% |
| 2021 | +15.0% | +31.26% | -16.2% |

2013 was the inflection point. The S&P gained 27.77%. The Graham Number screen gained 10.9%. That 16.8-point gap was more than one bad year for value stocks. It was a signal that the US equity market had entered a different regime.

Technology companies began dominating index returns in ways that Graham's framework structurally can't capture. Amazon, Google, Microsoft, Meta, and Apple don't show up in Graham Number screens. They trade at high multiples of book value, often with earnings too modest relative to price for the stock to trade anywhere near its Graham Number. As these companies grew from large to enormous, the cost of not owning them compounded for any value screen that excludes them.

2014 was a clear demonstration. The portfolio lost 11.1% while SPY gained 14.5%. The 25.6-point gap is the second-widest negative excess in the dataset, behind only 2019. Financials and energy stocks, which dominate Graham Number screens, were flat to down while technology and biotech led the market.

2019 and 2020 confirmed the pattern. The portfolio gained just 0.3% in 2019 while SPY gained 32.3%. Then COVID hit in 2020: the portfolio lost 6.4% while SPY finished the year up 15.6%, led by the same technology stocks the screen never holds.

2022-2024: Partial Recovery

| Year | Portfolio | S&P 500 | Excess |
|---|---|---|---|
| 2022 | -2.9% | -18.99% | +16.1% |
| 2023 | +21.9% | +26.00% | -4.1% |
| 2024 | +8.0% | +25.28% | -17.3% |

2022 was a genuine value recovery year. Rate hikes punished growth stocks with long-duration earnings. The Graham Number portfolio lost only 2.9% while SPY lost 19%. The down capture worked exactly as expected.

2023 and 2024 snapped back to pattern. The AI-driven rally concentrated gains in Nvidia, Microsoft, and a handful of other technology companies that value screens don't hold. 2024's -17.3% excess gap is the latest confirmation. The structural problem hasn't resolved.


Why the US Market Is Different

The Graham Number screen has a sector problem specific to the US market. The formula requires both positive earnings and positive book value per share. That filters out unprofitable growth companies, which is fine. But it also systematically underweights or excludes asset-light businesses that generate high returns on low book values.

Technology companies often have small book values relative to their earnings power. Amazon's book value doesn't capture the value of AWS. Microsoft's balance sheet doesn't reflect the moat of Azure and Office 365. These companies fail the Graham Number screen not because they're risky, but because their value is in cash flows, not balance sheet assets.
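A worked example makes the mechanism concrete. The numbers below describe a hypothetical asset-light company, not any real one: high earnings on a small equity base, priced at 30x earnings.

```python
import math

# Hypothetical asset-light company (illustrative numbers, not real data).
price, eps, bvps = 300.0, 10.0, 20.0

graham_number = math.sqrt(22.5 * eps * bvps)
pe, pb = price / eps, price / bvps

print(f"Return on equity:     {eps / bvps:.0%}")
print(f"P/E: {pe:.0f}, P/B: {pb:.0f}, product: {pe * pb:.0f} (Graham's ceiling is 22.5)")
print(f"Graham Number:        {graham_number:.2f}")
print(f"Price / Graham Number: {price / graham_number:.2f}")
```

The price-to-Graham-Number ratio equals the square root of (P/E × P/B) / 22.5, so a business earning 50% on equity and trading at 30x earnings lands several times above the threshold. The higher the return on book equity, the harder the screen is to pass at any given earnings multiple.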

The result is a portfolio that concentrates in financials (where book value is meaningful), industrials, and energy. That concentration worked well in the 2000s when those sectors led the market. It has been a structural drag since 2013 as technology's share of the S&P 500 grew from roughly 16% to over 30%.

The down capture of 72.1% is still real and useful. In most bear markets, the strategy's defensive character shows up clearly: in 2000-2002 and again in 2022, cheap stocks with tangible asset backing held up better than expensive growth stocks. (2008, when financials led the decline, was the exception.) The problem is that US bull markets since 2013 have been so technology-dominated that the drag in up years more than offsets the protection in down years.


The Full Annual Record

| Year | Portfolio | S&P 500 | Excess |
|---|---|---|---|
| 2000 | -5.8% | -10.50% | +4.7% |
| 2001 | +15.5% | -9.17% | +24.7% |
| 2002 | -5.9% | -19.92% | +14.0% |
| 2003 | +49.0% | +24.12% | +24.9% |
| 2004 | +28.4% | +10.24% | +18.1% |
| 2005 | +27.2% | +7.17% | +20.0% |
| 2006 | +16.6% | +13.65% | +3.0% |
| 2007 | +9.8% | +4.40% | +5.4% |
| 2008 | -44.4% | -34.31% | -10.1% |
| 2009 | +46.1% | +24.73% | +21.4% |
| 2010 | +24.5% | +14.31% | +10.2% |
| 2011 | -20.1% | +2.46% | -22.6% |
| 2012 | +23.7% | +17.09% | +6.6% |
| 2013 | +10.9% | +27.77% | -16.8% |
| 2014 | -11.1% | +14.50% | -25.6% |
| 2015 | -7.7% | -0.12% | -7.6% |
| 2016 | +15.7% | +14.45% | +1.2% |
| 2017 | +24.6% | +21.64% | +3.0% |
| 2018 | -19.6% | -5.15% | -14.4% |
| 2019 | +0.3% | +32.31% | -32.0% |
| 2020 | -6.4% | +15.64% | -22.1% |
| 2021 | +15.0% | +31.26% | -16.2% |
| 2022 | -2.9% | -18.99% | +16.1% |
| 2023 | +21.9% | +26.00% | -4.1% |
| 2024 | +8.0% | +25.28% | -17.3% |

Win rate vs SPY: 56% (14 of 25 years). The majority of years were wins by count. The losses were bigger on average than the wins, which is how a 56% win rate produces negative excess CAGR.
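The asymmetry between wins and losses can be checked from the Excess column. A minimal sketch using the published figures (two decimals where the era tables give them, one decimal otherwise):

```python
# Annual excess return vs SPY in percentage points, 2000-2024,
# copied from the article's tables.
EXCESS = [4.67, 24.66, 14.03, 24.89, 18.13, 20.02, 2.96, 5.42, -10.08,
          21.39, 10.21, -22.59, 6.64, -16.8, -25.6, -7.6, 1.2, 3.0,
          -14.4, -32.0, -22.1, -16.2, 16.1, -4.1, -17.3]

wins = [x for x in EXCESS if x > 0]
losses = [x for x in EXCESS if x < 0]

win_rate = len(wins) / len(EXCESS)
avg_win = sum(wins) / len(wins)
avg_loss = sum(losses) / len(losses)

print(f"Win rate:     {win_rate:.0%} ({len(wins)} of {len(EXCESS)} years)")
print(f"Average win:  {avg_win:+.2f} points")
print(f"Average loss: {avg_loss:+.2f} points")
```

These are simple averages of annual excess returns, which is not the same calculation as the gap in CAGR, but they point the same way: the typical losing year cost more than the typical winning year earned.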


Limitations

Book value reliability. Graham's formula treats book value as a proxy for liquidation value. That assumption worked in the manufacturing-heavy economy of mid-20th century America. In an economy where software, brand, and network effects drive most corporate value, book value per share captures an increasingly small fraction of what a business is worth. The screen selects well for banks and industrials, less well for the economy at large.

The 2008 drawdown. At -44.4%, the strategy's max drawdown is about 9.5 percentage points deeper than SPY's -34.9%. On the annual figures in this article, SPY's worst peak-to-trough decline was the cumulative 2000-2002 loss; in 2008 itself SPY fell 34.31%, about 10 points less than the strategy. Financial stocks, which frequently appear in Graham Number screens due to low price-to-book ratios, were the epicenter of the crisis. The same balance sheet characteristics that make them look cheap on Graham's formula made them fragile in 2008. Stress-testing against financial sector exposure is necessary before running this strategy live.
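The drawdown figures can be reproduced from the annual returns. The sketch below measures drawdown on year-end values only, which is an assumption on our part; with daily or monthly data the troughs would be deeper.

```python
# Annual returns in percent, 2000-2024, copied from the article's tables.
PORTFOLIO = [-5.83, 15.49, -5.89, 49.02, 28.37, 27.19, 16.61, 9.83, -44.39,
             46.12, 24.51, -20.13, 23.73, 10.9, -11.1, -7.7, 15.7, 24.6,
             -19.6, 0.3, -6.4, 15.0, -2.9, 21.9, 8.0]
SPY = [-10.50, -9.17, -19.92, 24.12, 10.24, 7.17, 13.65, 4.40, -34.31,
       24.73, 14.31, 2.46, 17.09, 27.77, 14.50, -0.12, 14.45, 21.64,
       -5.15, 32.31, 15.64, 31.26, -18.99, 26.00, 25.28]


def max_drawdown(annual_returns_pct) -> float:
    """Worst peak-to-trough decline, in percent, on year-end values."""
    value = peak = 1.0
    worst = 0.0
    for r in annual_returns_pct:
        value *= 1 + r / 100
        peak = max(peak, value)
        worst = min(worst, value / peak - 1)
    return worst * 100


portfolio_dd = max_drawdown(PORTFOLIO)
spy_dd = max_drawdown(SPY)

print(f"Portfolio max drawdown: {portfolio_dd:.2f}%")
print(f"SPY max drawdown:       {spy_dd:.2f}%")
```

On this basis the portfolio's worst decline is the single year 2008, while SPY's is the three-year slide from 2000 through 2002.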

Structural underperformance since 2013. This isn't a rough patch. The strategy trailed SPY in 9 of the 12 years from 2013 through 2024, with only small wins in 2016 and 2017 and one large win in the rate-shock year of 2022. The US equity market's increasing concentration in technology and asset-light businesses is a structural headwind for any book-value-based screen.

No quality filter. A company can have positive EPS and positive BVPS and still be a deteriorating business. The original Graham Number screen has no profitability quality check, no leverage limit, no earnings consistency requirement. Adding quality overlays (ROE > 10%, debt-to-equity < 1.5) would likely improve results, but that becomes a different screen.

Annual rebalancing lag. The 45-day filing lag plus annual rebalancing means the portfolio can hold a deteriorating stock for up to 13 months before rotating out.


Run It Yourself

-- Current Graham Number screen (US exchanges)
SELECT
    p.symbol,
    p.companyName,
    p.sector,
    p.exchange,
    ROUND(k.marketCap / 1e9, 2) AS mktcap_b,
    ROUND(k.grahamNumberTTM, 2) AS graham_number,
    ROUND(p.price, 2) AS current_price,
    ROUND(p.price / k.grahamNumberTTM, 3) AS price_to_gn
FROM profile p
JOIN key_metrics_ttm k ON p.symbol = k.symbol
WHERE p.exchange IN ('NYSE', 'NASDAQ', 'AMEX')
  AND k.marketCap > 1000000000
  AND k.grahamNumberTTM > 0
  AND p.price < k.grahamNumberTTM
ORDER BY price_to_gn ASC
LIMIT 30

Run this screen live on Ceta Research →

git clone https://github.com/ceta-research/backtests.git
cd backtests

# US backtest
python3 graham-number/backtest.py --preset us --output results.json --verbose

# All exchanges
python3 graham-number/backtest.py --global --output results/exchange_comparison.json

# Current screen
python3 graham-number/screen.py --preset us

Part of a Series

This is the US analysis. We tested the Graham Number screen across markets globally:

  • Global comparison: How the strategy performs across 17 exchanges, and where Graham's formula still works
  • Japan analysis: Book-value-rich market where the screen finds a very different opportunity set
  • UK analysis: LSE results and how the financial sector composition differs from the US



Data: Ceta Research (FMP financial data warehouse), 2000-2024. Universe: NYSE + NASDAQ + AMEX. Full methodology: METHODOLOGY.md. Past performance does not guarantee future results.
