The Science of Spaced Repetition: Ebbinghaus, SM-2 and Memory
Forgetting is not a failure of the brain — it is a feature. The human memory system discards information that is not accessed repeatedly, freeing cognitive resources for what matters most. Understanding this mechanism, and working with it rather than against it, is the foundation of spaced repetition: one of the most robustly validated learning techniques in cognitive science.
The Ebbinghaus Forgetting Curve
In 1885, Hermann Ebbinghaus published Über das Gedächtnis (On Memory), documenting the first systematic, quantitative study of human memory. Using himself as the sole subject, he memorised lists of nonsense syllables and measured how quickly he forgot them. The result was the forgetting curve: a mathematical description of how memory retention decays over time.
The modern formulation of the forgetting curve is:
R(t) = e^(−t/S)
where R(t) is the probability of successful recall at time t after the last review, and S is memory stability — a parameter representing the strength of the memory trace. A freshly learned item might have S ≈ 1 day, meaning that after one day retention has fallen to e⁻¹ ≈ 37%. An item reviewed five times correctly might reach S = 30 days, so after a month retention is still e⁻¹ ≈ 37% — but the absolute interval is thirty times longer.
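As a minimal sketch, the curve can be evaluated directly in Python. The function name `retrievability` and the use of days as the time unit are assumptions made for illustration; they are not taken from any particular scheduler.

```python
import math


def retrievability(t_days: float, stability_days: float) -> float:
    """Recall probability after t_days, using R(t) = exp(-t / S)."""
    if stability_days <= 0:
        raise ValueError("stability must be positive")
    return math.exp(-t_days / stability_days)


# A fresh item (S = 1 day) after one day, and a mature item (S = 30 days)
# after thirty days: both sit at e^-1, roughly 37%.
fresh = retrievability(1, 1)
mature = retrievability(30, 30)
```

Both values are identical, which illustrates the point made above: a larger S does not change the shape of the curve, it stretches the time axis.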
Ebbinghaus found that the rate of forgetting slows with repeated successful reactivation. After a single review, forgetting proceeds almost as rapidly as the first time; after several spaced reviews, the curve flattens substantially. This observation, that review timing matters more than review quantity, is the central insight of spaced repetition.
Memory Stability and Retrievability
Modern spaced repetition research (particularly the work of Piotr Woźniak and later researchers using the FSRS — Free Spaced Repetition Scheduler — model) distinguishes two key properties of a memory:
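- Stability (S): how slowly the memory decays. In the exponential form used here, S is the time for retrievability to fall to e⁻¹ ≈ 37%. Schedulers usually review much earlier, when R reaches a target such as 0.9 (90% recall probability), which this curve reaches at t ≈ 0.105 × S. FSRS instead defines stability directly as the time for R to fall to 90% and uses a differently shaped curve.
- Retrievability (R): the current probability of recall, which decays according to R(t) = e^(−t/S).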
Each successful retrieval increases stability, and the increase is larger when the memory was hard to retrieve (low R at review time). This is closely tied to the spacing effect, and it is why reviewing a card shortly before it would be forgotten produces a larger stability gain than reviewing it while it is still fresh.
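Solving R(t) = e^(−t/S) for t gives the time at which retrievability reaches a chosen target: t = −S × ln(R_target). The sketch below assumes the pure exponential form used in this article; the function name `days_until_target` is hypothetical.

```python
import math


def days_until_target(stability_days: float, target_r: float = 0.9) -> float:
    """Days until R(t) = exp(-t / S) decays to target_r."""
    if stability_days <= 0:
        raise ValueError("stability must be positive")
    if not 0 < target_r < 1:
        raise ValueError("target_r must be strictly between 0 and 1")
    return -stability_days * math.log(target_r)


# With S = 30 days, retrievability reaches 0.9 after roughly 3.2 days.
interval_for_90_percent = days_until_target(30, 0.9)
```

Under this form, choosing a higher target retention shortens every interval by the same factor, which is one way to think about the trade-off between workload and recall.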
The SM-2 Algorithm
The algorithm that powered the flashcard revolution is SM-2, developed by Piotr Woźniak in 1987 and released as part of SuperMemo 2. A modified version has long been the default scheduler in Anki, one of the most widely used spaced repetition applications, and it underlies countless derivative systems.
SM-2 assigns each card an ease factor (EF), initialised at 2.5. After each review, the learner scores their recall quality q on a 0–5 scale:
- 5 — Perfect recall, instantaneous
- 4 — Correct, after a slight hesitation
- 3 — Correct, but with significant difficulty
- 2 — Incorrect, but correct answer was easy to remember once shown
- 1 — Incorrect, even after seeing the answer
- 0 — Complete failure to recall
The ease factor updates after each review as:
EF′ = EF + (0.1 − (5 − q) × (0.08 + (5 − q) × 0.02))
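The bracketed term in this formula depends only on q, so it can be tabulated once. The following is a small sketch; the helper name `ease_delta` is an assumption for illustration.

```python
def ease_delta(q: int) -> float:
    """Change applied to the ease factor for recall quality q (0-5)."""
    if not 0 <= q <= 5:
        raise ValueError("q must be between 0 and 5")
    return 0.1 - (5 - q) * (0.08 + (5 - q) * 0.02)


# Print the adjustment for each grade, from best to worst.
for q in range(5, -1, -1):
    print(q, round(ease_delta(q), 2))
```

A score of 5 raises EF by 0.1, a score of 4 leaves it unchanged, and a score of 3 lowers it by 0.14, so a card that is merely "passing" still becomes steadily harder in the scheduler's eyes.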
EF is constrained to a minimum of 1.3. The next review interval is calculated as:
- I(1) = 1 day (first review)
- I(2) = 6 days (second review)
- I(n) = I(n − 1) × EF for n ≥ 3
If q < 3, the card is reset: intervals return to day 1 and the entire sequence restarts. With a stable EF of 2.5 (every review scored 4, which leaves EF unchanged), a card reviewed correctly six times has an interval of 6 × 2.5⁴ ≈ 234 days, nearly eight months. A card consistently scored 3 (minimum passing) loses 0.14 of EF per review and degrades toward the 1.3 floor; once EF sits at 1.3, the same point in the sequence yields only 6 × 1.3⁴ ≈ 17 days.
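The rules above can be sketched as a small Python scheduler. This is one common reading of SM-2 rather than a reference implementation: the `Card` class and `review` function are hypothetical names, intervals are kept as unrounded floats, and the ease factor is assumed to update on failed reviews as well as successful ones.

```python
from dataclasses import dataclass

MIN_EASE = 1.3  # floor on the ease factor


@dataclass
class Card:
    ease: float = 2.5
    repetitions: int = 0        # consecutive successful reviews
    interval_days: float = 0.0  # interval scheduled after the last review


def review(card: Card, q: int) -> Card:
    """Apply one review with quality q (0-5) and return the updated card."""
    if not 0 <= q <= 5:
        raise ValueError("q must be between 0 and 5")
    delta = 0.1 - (5 - q) * (0.08 + (5 - q) * 0.02)
    ease = max(MIN_EASE, card.ease + delta)
    if q < 3:
        # Failed recall: restart the interval sequence from day 1.
        return Card(ease=ease, repetitions=0, interval_days=1.0)
    repetitions = card.repetitions + 1
    if repetitions == 1:
        interval = 1.0
    elif repetitions == 2:
        interval = 6.0
    else:
        interval = card.interval_days * ease
    return Card(ease=ease, repetitions=repetitions, interval_days=interval)


# Six consecutive reviews scored 4 keep the ease factor at 2.5.
card = Card()
for _ in range(6):
    card = review(card, 4)
print(card.interval_days)
```

Under these assumptions the sixth interval comes out at 6 × 2.5⁴ ≈ 234 days. Starting instead from a card already at the 1.3 floor and scoring 3 each time gives about 17 days after six reviews.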
Explore these dynamics interactively in the spaced repetition simulator.
Active Recall vs Passive Rereading
Spaced repetition is only as effective as the retrieval practice it embeds. Highlighting text or re-reading notes are passive activities that create an illusion of fluency: the material feels familiar, but familiarity is not the same as the ability to retrieve information under examination conditions.
Active recall — generating the answer before looking at it — produces a qualitatively different memory trace. The retrieval attempt itself strengthens the neural pathway, a process supported by long-term potentiation (LTP) in hippocampal circuits. A 2011 Science paper by Karpicke and Blunt compared four study strategies. Students who used retrieval practice recalled 50% more material on a final test than those who restudied, and 40% more than those who created elaborate concept maps.
The practical implication: a flashcard session where you flip immediately to the answer without attempting recall confers little of the benefit of spaced repetition. The effort of retrieval — even unsuccessful retrieval — is the mechanism.
The Testing Effect
Closely related to active recall is the testing effect (retrieval practice effect). A landmark 2006 study by Roediger and Karpicke asked students to study a prose passage, then split them into groups: some restudied the passage repeatedly; others took recall tests on it. One week later, the tested group recalled 61% of the material; the restudied group recalled 40%. The tested group achieved this despite reading the passage far fewer times than the restudy group.
Neuroimaging studies (using fMRI) show that successful retrieval activates a broader network of cortical regions than restudy — including prefrontal areas associated with elaborative processing. Each retrieval acts as a form of reconsolidation: the memory is slightly reconstructed and then re-stored in a strengthened form, integrating new context and associations.
Interleaving
Most learners practice topics in blocked fashion: complete all problem type A before moving to type B. Interleaving — mixing problem types within a session — feels harder and produces worse immediate performance. Yet multiple meta-analyses show it produces 25–50% better long-term test performance than blocked practice.
The mechanism is thought to be discriminative contrast: switching between problem types forces the brain to identify which strategy applies to each problem, rather than executing the same procedure repeatedly from working memory. This extra cognitive demand, though uncomfortable, builds flexible retrieval structures that generalise to novel problems.
Spaced repetition systems naturally interleave reviews across topics, since cards from different subjects fall due on the same day. This is a structural advantage over subject-by-subject study sessions.
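A rough sketch of how that interleaving falls out of the scheduling: the review queue is built from due dates, not from subjects. The card layout (dictionaries with `topic`, `front` and `due` keys) and the function name `due_queue` are assumptions for illustration only.

```python
import random
from datetime import date


def due_queue(cards, today, seed=None):
    """Return every card due on or before `today`, shuffled across topics."""
    due = [card for card in cards if card["due"] <= today]
    random.Random(seed).shuffle(due)  # mix topics instead of grouping them
    return due


cards = [
    {"topic": "anatomy", "front": "Largest bone in the body?", "due": date(2024, 3, 1)},
    {"topic": "spanish", "front": "el perro", "due": date(2024, 3, 1)},
    {"topic": "chemistry", "front": "Symbol for sodium?", "due": date(2024, 3, 2)},
    {"topic": "spanish", "front": "la casa", "due": date(2024, 3, 9)},
]

todays_reviews = due_queue(cards, date(2024, 3, 2), seed=1)
```

Because selection depends only on the due date, a single session draws from whichever subjects happen to be due, without the learner having to plan the mix.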
Sleep and Memory Consolidation
Spaced repetition sessions acquire their full power only when supported by adequate sleep. During slow-wave sleep (SWS; stage N3 in current sleep scoring, formerly NREM stages 3–4), the hippocampus replays patterns corresponding to the day's learning, a process called hippocampal-neocortical dialogue. Sharp-wave ripples in the hippocampus, coordinated with cortical slow oscillations and sleep spindles, drive the gradual transfer of episodic memories to distributed neocortical representations. This is systems consolidation.
REM sleep contributes differently: it strengthens associative memories and procedural skills, and is particularly important for insight — connecting newly learned material to prior knowledge. A 2010 study by Stickgold and Walker showed that subjects who slept after learning improved performance by 20–30% compared to waking controls on an identical retention interval.
Practical consequence: a spaced repetition session the evening before sleep is more effective than the same session in the morning if no subsequent sleep occurs that day. Reviewing shortly before sleep, then reviewing again the next morning (the "sleep sandwich" protocol), combines encoding, consolidation, and retrieval practice in an optimal sequence.
Designing an Effective Spaced Repetition System
Knowing the science allows systematic optimisation:
- Card atomicity: each card should test exactly one fact. Compound cards ("What are the three causes of X?") break the retrieval mechanism because partial recall cannot be scored cleanly.
- Cloze deletion: fill-in-the-blank format ("The SM-2 ease factor starts at ___") forces active retrieval of a specific piece of information.
- Image occlusion: for diagrams, anatomy, or maps — hiding a labelled region and recalling the label.
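- Consistent daily reviews: missing a day causes cards to pile up, because the skipped reviews are added to the next day's queue on top of the cards already scheduled for it. Overdue cards are also more likely to be failed, and in SM-2 systems each failure restarts that card at short intervals, so a backlog can grow faster than linearly.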
- New card rate: 10–20 new cards per day is sustainable for most learners. At 20 new cards/day, expect 100–150 daily reviews after 3 months of consistent use.
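To get a feel for the review load, here is a deliberately idealised sketch: every card is always recalled correctly, EF stays at 2.5, and intervals are rounded up to whole days. The function names are hypothetical, and real decks will see more reviews because failed cards restart at short intervals.

```python
import math


def sm2_review_offsets(horizon_days: int, ease: float = 2.5) -> list[int]:
    """Days after introduction on which a never-failed card comes due."""
    offsets = []
    interval = 0
    day = 0
    repetition = 0
    while True:
        repetition += 1
        if repetition == 1:
            interval = 1
        elif repetition == 2:
            interval = 6
        else:
            interval = math.ceil(interval * ease)
        day += interval
        if day > horizon_days:
            return offsets
        offsets.append(day)


def reviews_due(day: int, new_per_day: int = 20, ease: float = 2.5) -> int:
    """Reviews due on `day` when new cards were added on every day 0..day-1."""
    # Each offset <= day matches exactly one earlier cohort that is due today.
    return new_per_day * len(sm2_review_offsets(day, ease))


print(sm2_review_offsets(90))  # offsets at which one card is reviewed
print(reviews_due(90))         # idealised load on day 90 at 20 new cards/day
```

With these assumptions a card is reviewed 1, 7, 22 and 60 days after it is introduced, so day 90 carries 4 × 20 = 80 reviews on top of the 20 new cards. Treat that as a floor: lapses and lower ease factors push the real number higher.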
Limitations and Scope
Spaced repetition is a tool for declarative memory: facts, vocabulary, formulas, dates. It does not substitute for:
- Conceptual understanding, which requires elaborative questioning and problem solving.
- Procedural skills (programming, music, surgery), which require deliberate practice with feedback.
- Creative or analytical tasks, where the goal is synthesis and judgment rather than recall.
Used appropriately — to maintain a large base of accurate, rapidly accessible factual knowledge — spaced repetition is among the most efficient learning interventions ever validated by cognitive science.
Frequently Asked Questions
What is the Ebbinghaus forgetting curve?
The Ebbinghaus forgetting curve describes how memory retention R(t) decays exponentially over time according to R(t) = e^(−t/S), where t is elapsed time and S is memory stability. Hermann Ebbinghaus discovered this pattern in 1885 through self-experimentation with nonsense syllables.
How does the SM-2 algorithm calculate review intervals?
SM-2 starts with intervals of 1 day, then 6 days, then multiplies each subsequent interval by the ease factor (EF), which starts at 2.5 and adjusts based on recall quality scored 0–5. The formula is I(n) = I(n−1) × EF, with EF updated as EF′ = EF + (0.1 − (5−q)(0.08 + (5−q)×0.02)).
What is active recall and why is it better than rereading?
Active recall means retrieving information from memory without looking at the source — answering a question, writing a summary, or using flashcards. Studies consistently show it produces 50–200% better long-term retention than passive rereading, because the act of retrieval strengthens the memory trace through reconsolidation.
What is interleaving in learning?
Interleaving means mixing different topics or problem types within a study session instead of practicing one topic until mastery before moving to the next (blocking). Research shows interleaving improves long-term test performance by 25–50% compared to blocked practice, despite feeling harder during learning.
How does sleep affect memory consolidation?
During slow-wave sleep (SWS), the hippocampus replays newly acquired memories, transferring them to neocortical long-term storage through systems consolidation. REM sleep then strengthens associative and procedural memories. Missing sleep after learning can reduce next-day recall by 20–40%.
What is the testing effect?
The testing effect (also called retrieval practice effect) is the finding that being tested on material produces stronger long-term memory than an equivalent amount of time spent restudying. A 2006 Roediger & Karpicke study showed students who studied then took tests recalled 61% after one week, versus 40% for students who restudied four times.
What is memory stability in spaced repetition?
Memory stability (S) in the forgetting curve equation R(t) = e^(−t/S) represents how slowly a memory decays. A higher S means the memory persists longer before falling below a useful retrieval threshold. Each successful spaced repetition review increases S, which is why the optimal review interval grows with each repetition.
Is spaced repetition effective for all types of learning?
Spaced repetition is most effective for declarative knowledge — facts, vocabulary, dates, formulas. It is less suited for procedural skills or conceptual understanding, which require deliberate practice and elaborative thinking. For STEM subjects, combining spaced repetition with problem-solving practice yields the best results.
How many new cards per day is optimal in a spaced repetition system?
Most practitioners recommend 10–20 new cards per day as a sustainable rate. Daily review load scales roughly in proportion to the new-card rate and keeps building for weeks after cards are added, so a rate that feels light at first can become heavy later. A deck of 20 new cards/day can produce 100–150 reviews/day after 3 months.
Can I use the spaced repetition simulator on this site?
Yes — the interactive spaced repetition simulator at mysimulator.uk lets you visualise the forgetting curve for different memory stability values and see how review timing affects long-term retention. You can experiment with SM-2 intervals and observe how ease factor changes compound over time.