AI Learning Platform
Upload your notes, build question banks, study adaptively, learn with friends. A social learning platform with a multi-provider LLM router, spaced-repetition scoring, and collaborative sharing built on Supabase RLS.
The problem
Students accumulate lecture slides, notes, and readings across a term but rarely have a structured way to test themselves against that material. Flashcard apps require manual entry. Past papers cover the wrong syllabus. The gap between “I've read this” and “I know this” goes unmeasured until the exam.
The idea was simple: upload whatever material you have, specify how many questions and at what difficulty, and get a ready-to-use question bank back. Over a term you accumulate banks per module. Then the app helps you study — not just by testing you, but by noticing which areas you actually struggle with and pushing those back to the surface.
My role
Solo build from idea through to production. I designed the data model, built the document ingestion and LLM generation pipeline, wrote the adaptive scoring algorithm, and implemented the sharing system. The project was also a deliberate exercise in working with multiple LLM providers and building resilience at that layer rather than assuming any single provider is always available.
— Architecture
— Key decisions
Coupling to a single provider meant one rate limit or outage could break the entire generation pipeline. Different models also have different cost/quality trade-offs for different question types.
A thin routing layer maps question-generation tasks to the most suitable available model. If the primary provider is slow or rate-limited, the router falls back transparently to the next provider in the preference list. No generation request surfaces a provider-specific error to the user.
Users needed to share individual questions and entire banks with friends — read-only or editable. A separate permissions service would add significant infrastructure for what is fundamentally a data-access problem.
Postgres row-level security policies enforce share permissions at the database layer. A shares table records (owner, recipient, resource_id, access_level). Every query is automatically scoped — no application-layer permission checks needed, no risk of forgetting one.
Tracking only right/wrong misses questions a user answers correctly but slowly — a sign they're not yet confident. Time alone is noisy (distractions, re-reads). Neither signal is sufficient on its own.
A weighted score combines normalised response time with a rolling error rate. Questions whose score exceeds a set threshold are promoted into a high-priority pool and resurface more frequently. The effect is that genuinely weak areas receive more repetitions without the user having to identify them.
Early designs treated question banks as simple lists — a folder of questions. That made cross-bank quizzes, sharing, and collaborative editing awkward to model.
Banks are their own database entity with many-to-many membership to questions. A question can live in multiple banks. Quiz sessions are composed from one or more banks. Sharing a bank grants access to its member questions without duplicating data.
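The many-to-many shape described above can be sketched as follows. This is a minimal illustration that uses Python's built-in sqlite3 as a stand-in for Postgres; the table and column names (banks, questions, bank_questions) and the helper functions are assumptions made for the example, not the project's actual schema.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not the project's real tables.
SCHEMA = """
CREATE TABLE questions (id INTEGER PRIMARY KEY, prompt TEXT NOT NULL);
CREATE TABLE banks (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Join table: one row per (bank, question) membership.
CREATE TABLE bank_questions (
    bank_id INTEGER NOT NULL REFERENCES banks(id),
    question_id INTEGER NOT NULL REFERENCES questions(id),
    PRIMARY KEY (bank_id, question_id)
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two banks that share one question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO questions (id, prompt) VALUES (?, ?)",
        [(1, "Define entropy."), (2, "State Bayes' rule."), (3, "Define variance.")],
    )
    conn.executemany(
        "INSERT INTO banks (id, title) VALUES (?, ?)",
        [(1, "Statistics"), (2, "Revision mix")],
    )
    # Question 2 belongs to both banks without being duplicated.
    conn.executemany(
        "INSERT INTO bank_questions (bank_id, question_id) VALUES (?, ?)",
        [(1, 1), (1, 2), (2, 2), (2, 3)],
    )
    conn.commit()
    return conn


def questions_for_session(conn: sqlite3.Connection, bank_ids: list[int]) -> list[int]:
    """Return the distinct question ids drawn from one or more banks."""
    placeholders = ",".join("?" for _ in bank_ids)
    rows = conn.execute(
        "SELECT DISTINCT question_id FROM bank_questions "
        f"WHERE bank_id IN ({placeholders}) ORDER BY question_id",
        list(bank_ids),
    ).fetchall()
    return [row[0] for row in rows]
```

Because membership lives in the join table, a cross-bank quiz is a single query over several bank ids, and a question shared between banks appears once in the session.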
— Technical depth
Document ingestion pipeline
Users upload PDFs, Word documents, or plain text. The pipeline extracts text, chunks it into context-window-safe segments, and passes each chunk to the LLM with a structured prompt that specifies question count, difficulty level, and output format. Questions are returned as structured JSON, validated against a schema, and written to Postgres in a single transaction — either the entire bank lands or nothing does.
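The validate-then-write step at the end of that pipeline might look roughly like this. It is a sketch under stated assumptions: the question fields (prompt, answer, difficulty), the difficulty labels, the table names, and the helpers open_db, parse_questions and save_bank are all hypothetical, validation is hand-rolled with the standard library rather than a schema library, and sqlite3 stands in for Postgres.

```python
import json
import sqlite3

# Assumed question shape; the write-up does not show the real schema.
REQUIRED_FIELDS = {"prompt": str, "answer": str, "difficulty": str}
DIFFICULTIES = {"easy", "medium", "hard"}

SCHEMA = """
CREATE TABLE banks (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE questions (
    id INTEGER PRIMARY KEY,
    prompt TEXT NOT NULL,
    answer TEXT NOT NULL,
    difficulty TEXT NOT NULL
);
CREATE TABLE bank_questions (
    bank_id INTEGER NOT NULL REFERENCES banks(id),
    question_id INTEGER NOT NULL REFERENCES questions(id),
    PRIMARY KEY (bank_id, question_id)
);
"""


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def parse_questions(raw: str) -> list[dict]:
    """Parse one chunk's model output and reject anything off-shape."""
    data = json.loads(raw)  # malformed JSON raises json.JSONDecodeError (a ValueError)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of questions")
    for i, question in enumerate(data):
        if not isinstance(question, dict):
            raise ValueError(f"question {i}: expected an object")
        for field, expected_type in REQUIRED_FIELDS.items():
            if not isinstance(question.get(field), expected_type):
                raise ValueError(f"question {i}: missing or invalid '{field}'")
        if question["difficulty"] not in DIFFICULTIES:
            raise ValueError(f"question {i}: unknown difficulty")
    return data


def save_bank(conn: sqlite3.Connection, title: str, chunk_outputs: list[str]) -> int:
    """Validate every chunk first, then write the whole bank in one transaction."""
    questions = [q for raw in chunk_outputs for q in parse_questions(raw)]
    with conn:  # commits on success, rolls back if any statement raises
        bank_id = conn.execute(
            "INSERT INTO banks (title) VALUES (?)", (title,)
        ).lastrowid
        for q in questions:
            question_id = conn.execute(
                "INSERT INTO questions (prompt, answer, difficulty) VALUES (?, ?, ?)",
                (q["prompt"], q["answer"], q["difficulty"]),
            ).lastrowid
            conn.execute(
                "INSERT INTO bank_questions (bank_id, question_id) VALUES (?, ?)",
                (bank_id, question_id),
            )
    return bank_id
```

Validating all chunks before opening the transaction means a single malformed response rejects the bank without leaving partial rows behind.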
LLM routing
The router maintains a ranked list of providers with their current status. On each generation request it selects the highest-ranked available provider, sends the request, and monitors the response time. If the provider exceeds a latency threshold or returns a rate-limit error, the router marks it degraded and retries immediately against the next in the list. The caller never sees a provider-specific error — only success or a single unified failure if all providers are unavailable.
This also made it straightforward to route different task types to different models: cheaper, faster models for simple factual questions; stronger models for analytical or application-level difficulty.
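The fallback behaviour described above can be sketched as a small class. Everything named here (Provider, Router, RateLimited, GenerationFailed, the 10-second threshold) is hypothetical and chosen for illustration. The sketch also simplifies in two ways: latency is measured after a call returns rather than enforced with a request timeout, and a degraded provider is never restored.

```python
import time
from dataclasses import dataclass
from typing import Callable


class RateLimited(Exception):
    """Hypothetical error a provider callable raises when rate-limited."""


class GenerationFailed(Exception):
    """The single unified error surfaced to callers."""


@dataclass
class Provider:
    name: str
    call: Callable[[str], str]  # prompt -> raw model output
    degraded: bool = False


class Router:
    def __init__(self, providers, latency_threshold_s=10.0, clock=time.monotonic):
        self.providers = list(providers)  # ranked, most preferred first
        self.latency_threshold_s = latency_threshold_s
        self.clock = clock  # injectable so tests can fake elapsed time

    def generate(self, prompt: str) -> str:
        for provider in self.providers:
            if provider.degraded:
                continue
            start = self.clock()
            try:
                result = provider.call(prompt)
            except Exception:
                # Rate limits and any other provider error are handled alike:
                # mark the provider degraded and move to the next one.
                provider.degraded = True
                continue
            if self.clock() - start > self.latency_threshold_s:
                # Too slow: mark degraded and retry against the next provider.
                provider.degraded = True
                continue
            return result
        # Callers only ever see this one error type.
        raise GenerationFailed("all providers unavailable")
```

One way to get per-task routing out of this shape would be to keep a separate Router per task type, each with its own ranking, for example cheaper models first for factual questions and stronger models first for analytical ones.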
Adaptive scoring
Each question session records two signals per attempt: whether the answer was correct, and the response time normalised against the user's median for that difficulty band. These combine into a confidence score in which a higher value indicates a weaker question:
score = (error_rate × 0.7) + (slow_rate × 0.3)
Questions above a score threshold are promoted into a high-priority pool. When the quiz engine selects the next question, it samples from the high-priority pool with a higher probability than from the general pool. This means weak questions resurface without any explicit scheduling: the distribution does the work.
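A short sketch of the scoring and pool sampling follows. The 0.7 and 0.3 weights come from the formula above; the threshold of 0.5, the 0.7 probability of drawing from the high-priority pool, and the reading of slow_rate as the fraction of attempts slower than the user's median are assumptions made for illustration, as are the function names.

```python
import random

ERROR_WEIGHT = 0.7  # weights taken from the formula in the text
SLOW_WEIGHT = 0.3


def confidence_score(error_rate: float, slow_rate: float) -> float:
    """Higher means weaker: both inputs are rates in the range 0..1."""
    return error_rate * ERROR_WEIGHT + slow_rate * SLOW_WEIGHT


def split_pools(stats: dict, threshold: float = 0.5):
    """Split question ids into (high_priority, general) by score.

    `stats` maps question id -> (error_rate, slow_rate).
    The threshold value is an assumption, not the project's constant.
    """
    high, general = [], []
    for question_id, (error_rate, slow_rate) in stats.items():
        if confidence_score(error_rate, slow_rate) > threshold:
            high.append(question_id)
        else:
            general.append(question_id)
    return high, general


def pick_next(high, general, rng: random.Random, high_priority_prob: float = 0.7):
    """Pick the next question, favouring the high-priority pool."""
    if not high and not general:
        raise ValueError("no questions available")
    # Fall through to the other pool when one of them is empty.
    if high and (not general or rng.random() < high_priority_prob):
        return rng.choice(high)
    return rng.choice(general)
```

With these assumed values, a question in the high-priority pool is drawn more often than one in the general pool whenever both pools are non-empty, with no explicit review schedule involved.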
Sharing via RLS
Supabase's row-level security lets you attach policies directly to tables. A shares table records who shared what with whom and at what access level. The RLS policies on questions and question_banks check for a matching shares row before allowing a read or write. No application code enforces this; the database does. A leaked API route or a missing auth check can't accidentally expose another user's data because the query will return nothing.
— Outcomes
— What I'd do differently
The document chunking strategy was naive at first — fixed character windows with no regard for semantic boundaries. Chunks split mid-sentence confused the model and produced malformed questions. I'd start with paragraph-aware chunking and add overlap between chunks from the beginning rather than retrofitting it.
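For illustration, paragraph-aware chunking with overlap could be sketched as below. This is not the project's implementation: the function name, the character budget, and the choice to overlap by whole paragraphs are assumptions, and a single paragraph longer than the budget is emitted as one oversized chunk rather than split.

```python
import re


def chunk_paragraphs(text: str, max_chars: int = 2000, overlap_paragraphs: int = 1):
    """Group whole paragraphs into chunks of at most max_chars characters.

    Each new chunk starts with the last `overlap_paragraphs` paragraphs of the
    previous one, so context carries across the boundary.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks = []
    current = []

    def joined_len(parts):
        return len("\n\n".join(parts))

    for para in paragraphs:
        if current and joined_len(current + [para]) > max_chars:
            chunks.append("\n\n".join(current))
            # Slicing with -0 would return the whole list, so guard zero overlap.
            overlap = current[-overlap_paragraphs:] if overlap_paragraphs > 0 else []
            # Keep the overlap only if it still leaves room for the new paragraph.
            current = overlap if joined_len(overlap + [para]) <= max_chars else []
        current.append(para)

    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Because boundaries only fall between paragraphs, no chunk ends mid-sentence, and the repeated paragraph gives the model the context that preceded each chunk.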
The adaptive algorithm weights are hand-tuned constants. They work well in practice but have no principled basis. With more usage data I'd run an offline evaluation against known learning curves to find weights that minimise time-to-mastery rather than guessing.