ML · Personalization · EdTech · Spaced Repetition

upGrad Shorts

A learning-science-driven micro-learning product: short-form lessons ranked by an ML retention engine using SM2 spaced repetition and a neural classifier that predicts the optimal time to resurface each concept for each learner.

Client: upGrad · Role: ML & Product Engineer · Period: 2020
Spaced Repetition · Neural Nets · Personalization · ML · EdTech · Feed Ranking
upGrad Shorts micro-learning feed

15%: Lift in retargeting & cross-sell experiments

SM2+: Spaced repetition + neural adaptation

↑ Engagement: Measurable long-term retention signal

The Insight

Short-form content has an engagement problem that short-form entertainment doesn't: the goal isn't infinite scroll, it's retention. Watching 20 micro-lessons on React hooks means nothing if none of it sticks 2 weeks later when you actually need to use it. The algorithmic approaches that optimize for watch-time (TikTok-style feed ranking) actively work against learning science — they surface novel content over review content because novelty generates engagement in the short term.

The SM2 algorithm, originally developed for SuperMemo, provides a principled framework for spacing review intervals based on recall difficulty. A card you found easy gets pushed further into the future; a card you struggled with resurfaces sooner. The problem with vanilla SM2 is that it's per-item, non-contextual, and assumes a consistent learner state. Real learners vary in attention and consistency, and have engagement patterns the base algorithm doesn't model.
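For reference, the standard SM2 update looks like this (a sketch of the published algorithm, not upGrad's exact implementation):

```python
def sm2_update(quality: int, ef: float = 2.5, interval: int = 0, reps: int = 0):
    """Standard SM2 update: return (next_interval_days, new_ef, new_reps).

    quality: recall grade 0-5; ef: easiness factor; reps: consecutive successes.
    """
    if quality < 3:
        # Failed recall: restart the repetition sequence, keep the easiness factor.
        return 1, ef, 0
    # Easiness-factor update, floored at 1.3 so intervals never shrink runaway-fast.
    ef = max(1.3, ef + (0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)))
    reps += 1
    if reps == 1:
        interval = 1        # first successful review: see it again tomorrow
    elif reps == 2:
        interval = 6        # second success: roughly a week out
    else:
        interval = round(interval * ef)  # then grow geometrically by the EF
    return interval, ef, reps
```

Easy cards (quality 4–5) grow their easiness factor and get pushed further out; hard cards (quality < 3) reset to a one-day interval, which is exactly the behavior described above.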

The design decision for upGrad Shorts: use SM2 as the backbone for review scheduling, augmented by a neural classifier that models per-learner state to adapt interval predictions dynamically.

The Retention Engine

SM2 Spaced Repetition Layer

Each concept in the library is a "card" in the SM2 system. When a learner completes a short, they respond with self-reported recall confidence (1–5, embedded as a UX gesture: swipe left for "hard", swipe right for "got it"). SM2 uses this to calculate the next review date. Cards are bucketed by review priority and surfaced in the feed as a mix of new content and scheduled reviews, with the ratio tuned to each learner's engagement profile.
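A minimal sketch of the gesture-to-grade mapping and the review/new blend; the mapping values, function names, and default ratio here are illustrative, not the production ones:

```python
from itertools import islice

# One possible mapping of the swipe gesture onto SM2 recall grades (hypothetical values).
GESTURE_TO_QUALITY = {"swipe_left": 2, "swipe_right": 4}  # "hard" vs "got it"

def blend_feed(due_reviews, new_items, review_ratio=0.4, feed_len=10):
    """Interleave scheduled reviews with new shorts.

    review_ratio is the per-learner tunable: engaged reviewers get a higher
    share of scheduled reviews; new learners see more fresh content.
    """
    n_reviews = round(feed_len * review_ratio)
    feed = list(islice(iter(due_reviews), n_reviews))
    # Fill the remainder of the feed with new content.
    feed += list(islice(iter(new_items), feed_len - len(feed)))
    return feed
```

The key design point is that the ratio, not the SM2 schedule itself, is the per-learner knob: reviews keep their computed due dates regardless of how the feed is mixed.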

Neural Interval Adaptation

The vanilla SM2 interval formula is static — it doesn't know that a learner who typically watches Shorts at 7pm on weekdays and not at all on weekends should have their review intervals adjusted to align with their actual engagement windows. The neural classifier ingests: learner activity patterns (time-of-day/day-of-week engagement history), concept-level recall history, domain difficulty signals, and course progress context. It outputs an interval multiplier that adjusts the SM2 base interval to account for when the learner is likely to be receptive.
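The adapter's interface can be sketched as follows. The real system uses a trained neural classifier; this is a hand-written heuristic standing in for it, and every feature name below is hypothetical:

```python
def interval_multiplier(features: dict) -> float:
    """Map learner-state features to a multiplier on the SM2 base interval.

    Stand-in for the neural classifier; weights and feature names are illustrative.
    """
    m = 1.0
    # Learners active almost daily can tolerate slightly longer gaps.
    m *= 1.0 + 0.2 * features.get("days_active_per_week", 3) / 7
    # Weak recent recall on this concept pulls the review earlier.
    if features.get("recent_recall_rate", 1.0) < 0.5:
        m *= 0.7
    # Harder domains get shorter intervals (difficulty assumed in [0, 1]).
    m *= 1.0 - 0.1 * features.get("domain_difficulty", 0.0)
    # Clamp so adaptation can bend the schedule but never break the spacing effect.
    return min(max(m, 0.5), 1.5)

def adapted_interval(base_interval_days: int, features: dict) -> int:
    return max(1, round(base_interval_days * interval_multiplier(features)))
```

Whatever model sits behind this interface, the clamp matters: the multiplier adjusts SM2's output rather than replacing it, so the spacing structure survives even a badly calibrated prediction.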

Feed Ranking & Blending

The final feed ranking blends: SM2 urgency score (how overdue is this review), novelty score (how many new concepts from the current course module are unseen), and engagement affinity (does this learner historically finish this content type). The blend weights are A/B tested per cohort. The critical constraint: the system must never surface a concept for review significantly before its SM2-scheduled date, to preserve the spacing effect.
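A sketch of the blend with the spacing constraint enforced as a hard filter; the weights and field names are illustrative (the actual blend weights were A/B tested per cohort):

```python
from datetime import date

def rank_score(item: dict, today: date,
               w_urgency=0.5, w_novelty=0.3, w_affinity=0.2) -> float:
    """Blend SM2 urgency, novelty, and engagement affinity into one score."""
    if item["is_review"]:
        days_early = (item["due_date"] - today).days
        # Hard constraint: never surface a review well before its SM2 date,
        # otherwise the spacing effect is destroyed.
        if days_early > 1:
            return float("-inf")
        urgency = max(0, -days_early)  # more overdue -> higher urgency
        return w_urgency * urgency + w_affinity * item["affinity"]
    # New content competes on novelty and affinity only.
    return w_novelty * item["novelty"] + w_affinity * item["affinity"]
```

Modeling the constraint as a score of negative infinity (rather than a penalty term) guarantees that no weight configuration an experiment might try can ever violate the spacing rule.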

Experimentation Infrastructure

The ranking algorithm is modular and fully flag-controlled — every scoring component can be independently enabled, disabled, or weight-adjusted per experiment cohort. I built the experiment harness on top of a feature flag system with Amplitude as the event store. This made it possible to run 4–6 simultaneous experiments on different aspects of the ranking without them contaminating each other, using assignment-level bucketing.
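Assignment-level bucketing of this kind is typically done with salted hashing, so each experiment's assignment is deterministic for a learner but statistically independent of every other experiment's. A minimal sketch (function names are illustrative):

```python
import hashlib

def bucket(learner_id: str, experiment: str, n_buckets: int = 2) -> int:
    """Deterministic bucket for a learner in one experiment.

    Salting the hash with the experiment name decorrelates assignments across
    experiments, which is what keeps simultaneous tests from contaminating
    each other.
    """
    digest = hashlib.sha256(f"{experiment}:{learner_id}".encode()).hexdigest()
    return int(digest, 16) % n_buckets

def variant(learner_id: str, experiment: str) -> str:
    return ["control", "treatment"][bucket(learner_id, experiment)]
```

Because assignment is a pure function of (learner, experiment), no assignment table is needed: any service can recompute a learner's variant, and events logged to the analytics store can be re-bucketed after the fact.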

Growth Partnership & Business Impact

The Growth team owned the top-of-funnel; Shorts was positioned as a re-engagement and cross-sell surface. Learners who completed a program module would be served Shorts from adjacent skill areas — someone finishing a Data Analytics course would see Python ML micro-lessons in their feed. The spaced repetition engine kept learners returning to the feed on a regular cadence, which gave Growth a high-intent, active audience for retargeting campaigns.

The 15% lift in retargeting and cross-sell experiments tied to the Shorts feature was measured through holdout analysis: one group of learners with Shorts disabled versus one with it enabled, with conversion to new program enrollment as the primary metric. The lift was sustained across multiple experiment cycles, validating that the mechanism was the engagement cadence itself rather than short-lived novelty.
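The lift arithmetic itself is simple; the conversion counts below are invented purely to show the calculation, not the actual experiment data:

```python
def relative_lift(holdout_conversions: int, holdout_n: int,
                  treatment_conversions: int, treatment_n: int) -> float:
    """Relative lift of treatment (Shorts enabled) over holdout (Shorts disabled)."""
    p_holdout = holdout_conversions / holdout_n
    p_treatment = treatment_conversions / treatment_n
    return (p_treatment - p_holdout) / p_holdout

# Illustrative only: a 4.0% -> 4.6% conversion change is a 15% relative lift.
example = relative_lift(40, 1000, 46, 1000)
```

Running the comparison repeatedly across experiment cycles, as described above, is what separates a durable mechanism from a one-off novelty bump.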

Stack

Python · scikit-learn · PyTorch · SM2 Algorithm · Node.js · React · TypeScript · Redis · PostgreSQL · A/B Testing Framework · Feature Flags · Amplitude · AWS Lambda