Building a National-Scale Design System for AI-Driven Early Childhood Assessment

ACER is a Global Education Research Organisation

Published on: May 20, 2026
achieved task completion
70%
improved customer satisfaction
80%
20
reduced support queries
90%
staff satisfaction score

Position: Product Design Lead
Budget: $150,000
Duration: 7 months

ACER (Australian Council for Educational Research) is a globally recognised educational research organisation, trusted by governments, schools, and international bodies to design and evaluate assessments, research programmes, and educational tools. ACER develops complex, data-heavy digital platforms used by educators, students, administrators, and policy makers across Australia and internationally. The organisation's reputation is built on rigour, evidence, and accuracy — the same standards the design practice needed to reflect but hadn't yet operationalised.

Problem & Challenges

The Preschool Outcomes Measure (POM) was developed as a new national digital assessment tool to support educators in observing and understanding preschool children’s learning and development. The challenge was to design a product that translated complex educational research into an intuitive, trusted experience for educators across diverse contexts, while ensuring accessibility, ethical data use, and readiness for scale. I was primarily involved in the Small Scale Trial (SST) and the National Applied Trial (NAT), contributing to the delivery of a validated formative assessment tool aligned with learning progressions.

  • Aligning diverse educational needs: translating complex pedagogy and assessment theory into an intuitive user experience.
  • Designing an adaptable mobile and web tool that fits both structured and exploratory classroom environments.
  • Creating terminology, workflows, and support resources that feel natural to educators.

The core problem:

ACER had strong design intent but no design infrastructure. Designers across squads were rebuilding the same UI components from scratch. The result was growing UX debt and a patchwork of inconsistent interfaces across 8+ product surfaces, with unreliable WCAG compliance, a real risk in ACER's regulated education context. There was no shared system, no research operations framework, and no AI-integrated workflow to keep pace with delivery expectations. I was brought in to build all three.

Users & Audience

ACER's products serve multiple distinct audiences across its education and fintech platforms:

Educators, school administrators and Federal Government

Teachers and school leaders using ACER's assessment platforms to track student progress. Needs: clear data presentation, low-friction workflows, and interfaces that don't require training to use effectively.

Students

Assessment participants, ranging from primary through to tertiary. Needs: accessible, focused interfaces that minimise anxiety and maximise performance conditions.

Policy makers and researchers

Data-heavy users interpreting ACER research outputs. Needs: complex data made legible, with strong information hierarchy and reliable accessibility across assistive technologies.

Internal design team

A growing team of product designers who were the primary audience for the design system and research operations framework I built. Their ability to adopt and maintain what I built was the real measure of success.

Roles & Responsibilities
I was the Lead Product Designer on the Preschool Outcomes Measure (POM) project, holding end-to-end UX and product design ownership across two major trial phases, the Small Scale Trial (SST) and the National Applied Trial (NAT). The team was small and cross-functional: I worked directly alongside a Product Manager, technical leads, developers, and subject matter experts in early childhood education research. There was no separate research lead. I owned the research programme alongside the design work.
The project ran from March to October 2025, operating in design sprints within an agile delivery model. Given the tight timeline and the complexity of the domain — two major assessment domains, ten subdomains, seven progressive competency levels — I had to move between strategic design decisions and hands-on delivery simultaneously.

What I personally owned:

  • Led end-to-end UX strategy, defining the research approach, design principles, and usability benchmarks before any screens were designed
  • Conducted all primary user research: semi-structured interviews with educators and facilitators, focus groups, competitor analysis (including StoryPark and other assessment platforms across the early childhood education space), and internal discovery workshops
  • Translated complex psychometric and pedagogical frameworks into progressive user journeys and interaction models that non-expert users could navigate with confidence
  • Established Design Ops practices from scratch — built a scalable component library in Figma using variables, tokens, and auto-layout, with WCAG accessibility compliance built in
  • Designed and iterated across the full fidelity spectrum — from low-fidelity wireframes and user flows through to high-fidelity, clickable prototypes used in both trial phases
  • Led two rounds of usability testing — Round 1 during the Small Scale Trial, Round 2 during the National Applied Trial — using task success rates, System Usability Scale (SUS) scoring, and qualitative behavioural observation
  • Synthesised research findings using Dovetail with global tagging, ensuring insights were structured and comparable across trial cohorts
  • Collaborated with engineering to integrate AI-driven decision support — implementing logic based on validated psychometric scales to recommend next steps aligned to each child's competency level, and enabling automated report generation and intelligent data pre-filling
  • Managed design-to-development handoff through detailed Figma documentation, design gap meetings, and UAT support
  • Conducted design QA on staged builds to catch and resolve inconsistencies before launch
  • Established post-launch feedback loops — NAT survey, passive observation, and CX feedback cadence — to inform continuous improvement

Process & What I did

1. Phase 1 - Discovery
Competitive analysis of early childhood assessment platforms including StoryPark, plus internal workshops with PM, developers, and education researchers to map the assessment framework before designing anything.

2. Phase 2 - User research
Semi-structured interviews and focus groups with educators across diverse classroom contexts. Built a Dovetail tagging taxonomy before sessions ran so findings were comparable across cohorts and usable directly in sprint planning.

3. Phase 3 - Translating complexity
Two domains, ten subdomains, seven competency levels had to become flows a busy educator could navigate without training. Designed IA around the educator's natural workflow — not the framework's structure. Validated simultaneously with educators and subject matter experts, facilitating where those two groups disagreed.

4. Phase 4 - Design system
Component library in Figma using variables, tokens, and auto-layout. WCAG compliant from component level. Architecture followed the assessment domain structure so components were reused systematically rather than rebuilt per flow.
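
As an illustration of what "WCAG compliant from component level" can mean in practice, the sketch below checks colour tokens against the WCAG 2.1 AA contrast thresholds (4.5:1 for normal text, 3:1 for large text) using the relative luminance and contrast ratio formulas from the WCAG 2.1 definitions. The token names and hex values are hypothetical and are not taken from the POM design system.

```python
# Hypothetical colour tokens; names and values are illustrative only.
TOKENS = {
    "color.text.primary": "#1A1A1A",
    "color.text.muted": "#777777",
    "color.surface.default": "#FFFFFF",
}


def _linearise(channel_8bit):
    """Convert one 8-bit sRGB channel to its linear value (WCAG 2.1 definition)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour):
    """Relative luminance of a '#RRGGBB' colour."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearise(r) + 0.7152 * _linearise(g) + 0.0722 * _linearise(b)


def contrast_ratio(foreground, background):
    """Contrast ratio between two colours, from 1.0 up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background, large_text=False):
    """True if the pair meets WCAG 2.1 AA contrast for text."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


if __name__ == "__main__":
    surface = TOKENS["color.surface.default"]
    for name in ("color.text.primary", "color.text.muted"):
        ratio = contrast_ratio(TOKENS[name], surface)
        print(f"{name}: {ratio:.2f}:1, AA pass = {passes_aa(TOKENS[name], surface)}")
```

A check like this could run whenever tokens change, so a failing text and surface pairing is caught before it reaches any component built on those tokens.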

5. Phase 5 - AI decision support
Integrated AI logic based on validated psychometric scales: real-time performance analysis, next-step recommendations, automated report generation, and intelligent data pre-filling. Designed the recommendation UI in educator language: prompts that built professional confidence, not directives that replaced judgement.
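
The general shape of level-based recommendation logic can be sketched as follows. This is a minimal illustration, not the production logic: the cut scores, function names, and prompt wording are all hypothetical, and the real thresholds would come from the validated psychometric scales. The only details taken from the project are the seven competency levels and the principle that output is phrased as a suggestion to the educator.

```python
from bisect import bisect_right

# Hypothetical cut scores on a scale; six boundaries separate seven levels.
LEVEL_CUTS = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
TOP_LEVEL = len(LEVEL_CUTS) + 1  # 7


def competency_level(scale_score):
    """Map a scale score to a competency level from 1 to 7."""
    return bisect_right(LEVEL_CUTS, scale_score) + 1


def next_step_prompt(subdomain, scale_score):
    """Phrase a suggestion in educator language rather than as a directive."""
    level = competency_level(scale_score)
    if level < TOP_LEVEL:
        return (
            f"{subdomain}: observed at level {level}. "
            f"You might look for moments that build towards level {level + 1}."
        )
    return (
        f"{subdomain}: observed at level {level}. "
        "You might look for ways to extend and consolidate this learning."
    )


if __name__ == "__main__":
    # "Example subdomain" is a placeholder, not a real POM subdomain name.
    print(next_step_prompt("Example subdomain", 0.5))
```

Keeping the level mapping separate from the wording means the thresholds can be owned by the research team while the prompt copy is iterated with educators.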

6. Phase 6 - Usability testing
Two rounds. Round 1 (Small Scale Trial): task success rate, SUS scoring, behavioural observation. Round 2 (National Applied Trial): validation at scale with facilitator debrief interviews to surface gaps between self-reported ease and actual friction.
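
For reference, the two quantitative measures named above can be computed as in the sketch below. SUS scoring follows the standard method: ten items rated 1 to 5, odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5 to give a score from 0 to 100. The function names and sample data are illustrative and are not trial data.

```python
def sus_score(responses):
    """Score one participant's SUS questionnaire (ten answers, each 1 to 5)."""
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("SUS needs exactly ten responses, each from 1 to 5")
    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += response - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - response
    return total * 2.5


def task_success_rate(outcomes):
    """Percentage of task attempts completed successfully."""
    if not outcomes:
        raise ValueError("at least one task attempt is required")
    return 100.0 * sum(1 for done in outcomes if done) / len(outcomes)


if __name__ == "__main__":
    # Made-up sample data for illustration.
    print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))
    print(task_success_rate([True, True, True, True, False]))
```

A study-level SUS figure is the mean of the per-participant scores; the result is a score out of 100 rather than a percentage.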

7. Phase 7 - Delivery
Weekly dev reviews, design gap meetings, UAT support, design QA on staged builds. Post-launch feedback infrastructure built into the delivery plan from day one: NAT survey, Mouseflow passive observation, and a CX feedback loop.

Solution

Key Solutions Delivered:

  • Responsive web and mobile platform optimised for touch interaction and classroom use conditions.
  • Progressive assessment flows translating complex framework (2 domains, 10 subdomains, 7 competency levels) into intuitive educator journeys requiring no prior training.
  • AI-powered decision support: real-time next-step recommendations, automated report generation, and intelligent data pre-filling based on validated psychometric scales.
  • Role-based dashboards and workflows for educators, facilitators, and administrators, each tailored to their context and permissions.
  • Observation capture interfaces built around natural classroom moments: record, tag, and submit without disrupting teaching flow.
  • WCAG 2.1 AA compliant design system across all web and mobile surfaces, with scalable components, tokens, and auto-layout in Figma.

Additional Enhancements:

  • Onboarding flow validated at 80% task success rate during the National Applied Trial; educators self-onboarded with minimal facilitation.
  • Post-launch feedback infrastructure built into delivery from day one: NAT survey, Mouseflow passive observation, and a CX feedback loop.
  • Dovetail tagging system designed before research began, so findings were comparable across cohorts and usable directly in sprint planning.

Result

  • 70+ System Usability Scale (SUS) score
  • 0 -> 55 NPS for new digital assessment
  • 80% task success rate for the assessment flow during the National Applied Trial
  • 90% internal staff satisfaction score
  • 25% faster solution delivery vs prior quarter
  • AI-assisted prototyping shifts validation earlier; reduces late-stage rework
  • Single governed Figma system; 30% fewer UI inconsistencies
  • Strengthened brand perception as a modern, education-focused brand
  • 3 -> 0.5 days for research synthesis, using AI-assisted Dovetail, a structured taxonomy, and prompt libraries
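
For readers unfamiliar with how an NPS figure is derived, the standard calculation is sketched below: respondents rate likelihood to recommend from 0 to 10, and the score is the percentage of promoters (9 or 10) minus the percentage of detractors (0 to 6). The function name and sample ratings are made up for illustration and are not the trial's survey data.

```python
def net_promoter_score(ratings):
    """Net Promoter Score from 0-10 ratings, as a whole number from -100 to 100."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 0 or r > 10 for r in ratings):
        raise ValueError("ratings must be between 0 and 10")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return round(100 * (promoters - detractors) / len(ratings))


if __name__ == "__main__":
    # Made-up sample: 6 promoters, 2 passives, 2 detractors.
    sample = [10, 9, 9, 8, 7, 6, 10, 9, 3, 10]
    print(net_promoter_score(sample))
```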

Lessons learned

  • AI tools are most valuable when you design the workflow around them, not drop them into existing processes. The taxonomy and prompt libraries I built were as important as the tools themselves.

  • Design system adoption is a change management problem, not a design problem. I had to make the new system demonstrably better for individual designers immediately, not just better for the organisation long-term, or it wouldn't be used.

  • Building while delivering is hard but necessary. Waiting for a dedicated infrastructure sprint that never comes means teams carry technical design debt indefinitely. The parallel track approach worked because I made both tracks visible to leadership from day one.
