
01 - THE BUSINESS PROBLEM
After AI Course Creation shipped, the next urgent problem was already waiting.
When AI Course Creation launched, educators were building content faster. Platform supply was growing. But within weeks, a pattern emerged on client calls. Almost every conversation eventually circled back to the same question: what about assignments?
Assignments were central to how educators ran their courses, and managing them at scale was one of the most time-consuming parts of their workflow. Creating them from scratch, defining grading criteria, and evaluating hundreds of submissions manually was unsustainable. The platform had no AI support for any of it.
The trigger


Where the stakeholders disagreed
Three different groups had three different ideas about what we were building, and I was in the middle of all of them throughout the entire project.
Clients actively requesting AI assignment support
No scalable way to give quality feedback
Inconsistent rubrics across educator teams
Full automation push
✕ Some stakeholders wanted AI to grade and close submissions automatically
✕ Business team wanted something impressive for enterprise demos
✕ PM had pressure to position this as a clear competitor differentiator
Educator-in-the-loop approach
✓ Research showed educators needed confidence in grades before releasing them
✓ Students expected feedback that felt considered, not generated
✓ Engineering flagged AI grading accuracy as a real constraint at MVP stage

CEO insights fed leadership meetings, which fed product team synthesis. Strong top-down momentum from day one.
02 - BRIEF VS REALITY
The brief said automate grading. The reality was more complicated than that.
The original ask was straightforward on the surface. But when I mapped what educators actually did during grading, the real problem became much clearer, and it was not the grading itself.
The surface-level brief
“Build an AI system that creates assignments, generates rubrics, and grades submissions automatically so educators don't have to do it manually.”
When I synthesized stakeholder feedback and mapped the real educator workflow, I found something unexpected. The grading step that everyone was focused on automating was not where the time was being lost. The biggest friction was upstream: creating the assignment and defining what a good answer looked like before any submissions arrived.
Educators were slow at grading not because evaluation itself was hard, but because they had no structured framework to grade against. They were building that framework from scratch in their heads right before they started. That was the real bottleneck.
The reframed problem
What changed because of that reframe
Instead of putting everything into automated grading, we designed the system to do three things well: help educators create well-structured assignments quickly, generate rubrics automatically so evaluation criteria existed before grading began, and then use AI to evaluate submissions against those rubrics with the educator in control of the final call.

Image: Eduqat assignment feature for MVP 1.2, before the AI integration was developed
03 - ROOT CAUSE
No competitor had built this before. We were mapping the territory as we walked through it.
Why this was genuinely uncharted
Most AI features in edtech at the time were focused on content generation or course recommendations. Nobody in the independent course platform space had combined AI assignment creation, rubric generation, AI grading, and human validation into one coherent workflow. That meant we had no existing product to learn from, no pattern to borrow from, and no benchmark to test against for this specific capability combination.
The real implication
Questions we had to answer before any wireframe opened
QUESTION 1
How much should AI generate versus how much should educators write?
→ Full generation risked ownership loss. Partial generation required defining exactly where the line was.
QUESTION 2
What happens when the AI grades something incorrectly?
→ Before rubrics existed, early AI grading would sometimes go off-context entirely. The system needed a structured anchor to grade against.
QUESTION 3
How do educators verify that AI feedback is fair to students?
→ Students initially experienced AI feedback as generic and impersonal. Human validation before release became the answer.
QUESTION 4
What controls do educators need to trust the system with professional decisions?
→ Educator accountability for grades is legal and ethical, not just preferential. The design had to reflect that.
So, here are the business goals
04 - DISCOVERY & EVIDENCE
The time problem was well documented. The trust problem was not.
“Secondary research validated the pain points from stakeholder synthesis. The more meaningful finding was about what students actually needed from feedback and how rarely they were getting it under the manual system.”
Why the grading time problem was so well documented
Assessment and feedback account for between 30 and 40 percent of an educator's total working time in higher education and professional training contexts. For large courses with hundreds of students, that number climbs higher. Eduqat's educator base was no exception.
Let's see the data below
40%
Of educator work time spent on assessment
63%
Of students say feedback arrives too late to act on
4x
More likely to complete a course with immediate feedback
71%
Of educators report grading inconsistency as a real concern
The student data mattered as much as the educator data. Students who received feedback days after submitting had already moved on mentally. The learning window was closed. Faster grading was not just about educator efficiency; it was about whether feedback could arrive while it was still useful.
Three directions that came out of the research
1
Rubric-first, always as the mandatory prerequisite before any grading
Without a rubric, AI assessment was unreliable and went out of context. With one, it became consistent and auditable. Rubric generation as the mandatory first step was both a design decision and a quality guarantee.
2
AI-assisted feedback with full educator override
The AI generates a feedback draft. The educator reads it, edits any part, and can add personal context the AI could not know. The result reaches students as the educator's feedback because by the time it does, the educator has approved it.
3
Full autonomous grading, planned for after trust was established
Educators were not ready to delegate final grade decisions to AI in MVP. We put full autonomy on the post-launch roadmap and kept the first version focused on building the trust architecture that would make it possible later.
05 - COMPETITION
No competitor had the full AI creation lifecycle in one flow. That was the opening.
I benchmarked five leading edtech platforms to understand what had been built, what was missing, and where Eduqat could genuinely differentiate rather than just catch up.
Platform | AI Assignment | AI Rubric | AI Grading | Feedback | AI Feedback
Google Classroom | Partial | No | No | Yes | No
Thinkific | Partial | No | No | Yes | No
Kajabi | No | No | No | Yes | No
Teachable | No | No | No | Yes | No
Learnworlds | Partial | No | No | Yes | No
Eduqat (Target) | Yes, full | Yes, full | AI-Assisted | Yes, full | Personalized
No competitor in the independent course platform space combined assignment creation, rubric generation, AI grading, and personalized feedback in one integrated workflow. The real gap was not a feature, but a philosophy:
“Designing AI that educators can inspect, adjust, and genuinely own.”
06 - DESIGN DIRECTION
Three principles written before any wireframe opened. They made every subsequent decision easier to defend.
These were shared with the team before design work began and referenced throughout the project whenever a decision was contested. Stakeholders who wanted to keep changing the structure were anchored back to these to protect the timeline and keep improvisation space for post-launch iterations.
The design mandate
The three directions we considered
Rubric-first, always
→ Rubric generation is the mandatory first step before any submission can be graded. A small slowdown at the rubric stage means much faster, more consistent evaluation downstream and it forces educators to think about what they are evaluating before they start evaluating it.
AI proposes, educator disposes
→ Every AI generated output, including assignment drafts, rubric criteria, scores, and feedback text, is treated as a draft for educators to review. Nothing reaches students without human oversight. This principle is reinforced through the visual design. AI outputs are styled to clearly signal “review me,” not “final.”
Transparency over impressiveness
→ We chose not to make the AI feel like magic. When the system assessed a submission, it showed which rubric criteria were met, which were not, and with what confidence. An AI that shows its reasoning is one educators can trust. An AI that just shows a score is a black box, and black boxes don't get used for things that matter.
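To make the transparency principle concrete, here is a minimal sketch of the per-criterion data a grading view like this implies. Every name in it (CriterionAssessment, AiAssessmentDraft, the confidence and evidence fields) is hypothetical and for illustration only, not Eduqat's actual implementation.

```typescript
// Illustrative shape only: what "show the reasoning, not just the score" implies
// structurally. None of these names come from Eduqat's real codebase.

interface CriterionAssessment {
  criterionId: string;
  criterionLabel: string;   // e.g. "Argument structure"
  weightPercent: number;    // weight defined in the rubric
  met: boolean;             // whether the AI judged the criterion satisfied
  confidence: number;       // 0 to 1, how sure the model is about this judgment
  evidence: string;         // excerpt from the submission the AI points to
}

interface AiAssessmentDraft {
  submissionId: string;
  criteria: CriterionAssessment[];
  draftFeedback: string;                    // text the educator can edit or replace
  status: "draft" | "educator_approved";    // nothing reaches a student while "draft"
}
```

Keeping the per-criterion judgment, confidence, and evidence separate is what lets the interface present the AI's output as something to inspect rather than a single number to accept.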
07 - HOW IT WORKS
Three flows. Every path ends with an educator making the final call before a grade reaches a student.
“The same design principle runs through all three flows: AI does the heavy work, the educator owns the outcome.”
Flow 1 - Create Assignment
The educator starts in the curriculum builder, selects Exam, then Assignment Material, and lands on the Assignment Details and Settings page. From there they can reuse an existing assignment from the library, start from a blank canvas, or generate a draft with AI from a structured brief covering reference materials, tone, difficulty level, and any extra instructions. Whichever path they choose, the content stays fully editable before saving.

Flow 2 - Create Grading Rubric
On the Assignment Details page, the educator adds a grading rubric. With AI assistance, they provide a rubric title, a reference assignment, the number of grading criteria, and any additional notes. The AI generates the rubric. The educator reviews and adjusts criteria weights and descriptions before saving. At this stage they can also enable Auto Grading and configure Feedback Settings.
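As a rough illustration, the sketch below models the structured brief described above as a single object. The field names are assumptions made for the sake of the example, not Eduqat's real schema.

```typescript
// Hypothetical brief an educator fills in before the AI generates a rubric.
// Field names are illustrative, not Eduqat's actual data model.

interface RubricGenerationBrief {
  rubricTitle: string;
  referenceAssignmentId: string;   // the assignment this rubric will grade
  criteriaCount: number;           // how many criteria the AI should propose
  additionalNotes?: string;        // anything the educator wants emphasized
  autoGradingEnabled: boolean;     // Auto Grading can be toggled at this stage
  feedbackSettingsConfigured: boolean;  // Feedback Settings can also be set here
}
```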

Flow 3 - Assignment Grading Dashboard
The educator navigates to the Assignment Library, selects a course, picks an assignment, and opens a student's submission. If Auto Grading is enabled, the AI fills in the score and feedback based on the rubric. Before publishing, a confirmation modal requires the educator to review and validate. Nothing reaches a student without that final approval step.

08 - KEY DESIGN DECISIONS
Every real design decision has something you gave up. Here are the three that defined this product.
“These were not obvious choices. Each one involved competing stakeholder goals, real constraints, and a trade-off I had to be willing to defend with evidence rather than preference.”
Decision 1 - Core Grading Model
AI-assisted grading with mandatory educator review
Most Critical
Situation
Educators did not trust fully automated grading, while stakeholders pushed for maximum automation.
What We Chose
AI generates grades and feedback, but educators must review and approve before release.
Why This Matters
Educators are accountable for grades, so human oversight is required for trust and adoption.
What I Gave Up
A more impressive fully automated experience that would perform better in demos.
Decision 2 - Rubric Generation Timing
Rubric must be defined before any submission can be assessed
Situation
Educators were used to starting grading first and defining criteria later.
What We Chose
Require rubrics to be defined and approved before any grading begins, with AI assisting creation.
Why This Matters
Consistent criteria across all submissions is essential for fair and defensible grading.
What I Gave Up
Flexibility to adapt criteria during the grading process.
One moment where I pushed back directly
The stakeholders wanted to add a one click "approve all" button to bulk-release all AI assessed submissions without individual review. The argument was that when educators trusted the AI's assessments overall, making them click through each one was unnecessary friction.
What I Did
I pushed back with two arguments. First, bulk release without any review exposed Eduqat to real liability if AI errors reached students at scale. Second, the friction was intentional: requiring at least a scan of the AI's work before release was the accountability mechanism that made the whole system trustworthy.
🎥 A Glimpse Into the Future
See how AI features are shaping the way educators work: faster, smarter, and more intuitively.
09 - THE SOLUTION
Every screen annotated with the reasoning and why it works the way it does.
At Eduqat, wireframes are built in the existing design system at near-final fidelity from the start. Because this feature required new components outside the existing library, the development team had to build several from scratch based on our Figma specs, adapted from competitor patterns but rebuilt in Eduqat's visual language.
Eduqat Assignment Feature Before AI
MVP 1 - Before AI assistance was ready
Assessment was only available from a pop-up

MVP 2 - Before AI assistance was ready
A new design, but still just an assessment dashboard without AI

Now, here is MVP 3 with AI assistance ready | Section 1 - Create AI Assignment
Screen 1 and 2
From curriculum builder to the assignment hub
The educator navigates to the curriculum builder, selects Exam, then chooses Assignment Material. This takes them to the Assignment Details and Settings page, the hub where assignment content, rubric setup, and all settings including AI grading come together in one view.


Screen 3 to 5
Reuse what exists, or generate something new with AI
Educators can reuse an existing assignment from the library or create a new one. Creating new offers two equal paths: blank canvas or AI generation. Equal visual weight was intentional; we did not want to push educators toward AI before they were ready. Choosing AI brings up a structured brief form: reference materials, tone, difficulty level, and optional extra instructions.



Screen 6 and 7
Generated content, fully editable before saving
The AI-generated assignment lands in the editor as a draft. The educator can edit the questions, instructions, and any generated text before saving, and nothing is committed to the course until they do. This keeps the “AI proposes, educator disposes” principle visible from the very first interaction.


Section 2 - Create Grading Rubric
Screen 1 to 4
AI-generated criteria, fully editable and locked after publish
The educator creates a new rubric from the Assignment Details page. A modal offers blank or AI-generated. With AI, they input a rubric name, reference assignment, number of criteria, and any notes. The AI generates the rubric criteria, descriptions, and percentage weights, all editable. Once saved and the assignment is published, the rubric locks to ensure all student submissions are evaluated against the same criteria.




Section 3 - Assignment Grading Dashboard
Screen 1 to 3
From assignment library to individual submission review
The Assignment Library shows all courses with active assignments. Clicking into a course reveals the assignment list. From there, selecting an assignment opens the submission list with each student's grading status and score. The grading dashboard shows the student's submission alongside full grade input and feedback controls.



Screen 4 to 6
AI drafts scores and feedback; the educator validates before publishing
When Auto Grading is enabled, the AI fills in the score and feedback automatically. The educator can regenerate feedback or adjust the score manually. The Publish button stays disabled until the educator saves their review; nothing reaches the student without that validation step. When a rubric is active, scores are calculated from rubric point levels, not entered freely, which enforces consistency and keeps scoring criteria visible at all times.
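Because scores come from rubric point levels rather than free entry, the final number is effectively a weighted sum. Here is a minimal sketch of that arithmetic under assumed names; it is not the shipped scoring code, just the calculation the paragraph above describes.

```typescript
// Minimal sketch of rubric-based scoring: each criterion has a weight and a set of
// point levels; the final score is a weighted sum, never a freely typed number.
// All names are illustrative, not Eduqat's implementation.

interface RubricCriterion {
  label: string;
  weightPercent: number;   // weights in a rubric sum to 100
  maxPoints: number;       // highest point level defined for this criterion
}

function rubricScore(criteria: RubricCriterion[], awardedPoints: number[]): number {
  // awardedPoints[i] is the point level selected for criteria[i];
  // the ratio scales that criterion's weight contribution.
  return criteria.reduce((total, criterion, i) => {
    const ratio = awardedPoints[i] / criterion.maxPoints;
    return total + ratio * criterion.weightPercent;
  }, 0);
}

// Example: three criteria weighted 40 / 40 / 20, point levels hit at 4/5, 5/5, 2/5
// gives 0.8 * 40 + 1.0 * 40 + 0.4 * 20 = 80 out of 100.
```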



Section 4 - Student View
Screen 1 and 2
Score, rubric breakdown, and feedback are all in one place
After grading is published, students view their final score alongside overall feedback, per-criterion rubric feedback, the original question, and their submitted answer. Rubric-level feedback is far more actionable than a single number with a generic comment. Knowing exactly which criteria fell short gives students something concrete to improve.


10 - WHAT ALMOST BROKE IT
Three weeks before launch, the AI accuracy numbers came in lower than the team expected.
“This is where most case studies get vague. I will tell you exactly what the constraint was, what I recommended, and what I had to give up to protect the decision that mattered.”
The conflict and what was at stake
Internal testing of AI grading accuracy on open-ended responses came in at 78% agreement with educator grades, meaningfully lower than the 90% threshold the team had been working toward. Engineering projected that reaching 90% within the MVP timeline was not feasible without significantly extended development time.
The options on the table: delay launch until accuracy improved, launch with 78% and manage expectations, or adjust the product design so the accuracy constraint was no longer the critical variable.
What I recommended
I pushed for the third option. If the system surfaced low-confidence assessments clearly and made educator review efficient rather than burdensome, then the accuracy of the underlying model became less important than the quality of the human review layer sitting on top of it. A 78% accurate AI that an educator could quickly review and correct in 30 seconds per submission was more valuable than a 90% accurate AI that educators did not trust and therefore would not use.
What happened next: We redesigned low-confidence submission flagging to surface the 22% of assessments most likely to need adjustment at the top of the grading queue. The effective accuracy of what actually reached students after educator review was over 97%.
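A simple way to picture that redesign is a grading queue ordered by the AI's own confidence, so the submissions most likely to need correction sit at the top. The sketch below uses a confidence threshold and field names that are my own assumptions, not the production logic.

```typescript
// Illustrative only: surface low-confidence AI assessments first in the grading queue.
// The 0.7 threshold and all names are assumptions, not Eduqat's shipped code.

interface QueueItem {
  submissionId: string;
  aiConfidence: number;   // e.g. the lowest per-criterion confidence, 0 to 1
}

const REVIEW_THRESHOLD = 0.7;

function orderGradingQueue(items: QueueItem[]): QueueItem[] {
  const flagged = items.filter((item) => item.aiConfidence < REVIEW_THRESHOLD);
  const confident = items.filter((item) => item.aiConfidence >= REVIEW_THRESHOLD);
  // Least confident first, so the riskiest grades get reviewed while attention is fresh.
  flagged.sort((a, b) => a.aiConfidence - b.aiConfidence);
  return [...flagged, ...confident];
}
```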
The honest trade-off
What I gave up
✕ A more impressive version one announcement
✕ The kind of full automation that wins attention in a competitive market
✕ Some goodwill with the PM who wanted a bigger feature set
What I protected
✓ Educator trust and the feeling that the course belonged to them
✓ A shipping timeline that engineering could actually deliver
✓ A product foundation that could be expanded to full automation in version two
11 - RESULTS
60% faster creation. 4x faster grading. Here is what those numbers actually represent.
Post-launch data from the first months after the AI Smart Assignment system shipped, with honest context about measurement limitations.
These metrics are based on internal workflow benchmarks and early tester cohorts, not a statistically significant controlled experiment. Post-launch cohort tracking is ongoing. I am honest about this distinction because overstating early metric certainty leads to bad product decisions.
12 - REFLECTION
Two decisions I would revisit, one that worked better than expected, and three principles I now carry into every AI product I design.
What I’d change
1. Observe real educators before designing
Most research came from secondary sources and stakeholder synthesis. Watching educators grade real assignments would have revealed critical behavior patterns much earlier.
2. Bring AI capability constraints into design earlier
The accuracy constraint existed long before it reached design. Treating AI capability as a design input from the start would have created better product decisions.
What worked best
Reframing the problem early
We stopped asking “How do we automate grading with AI?” and started asking “How do we help educators grade with more confidence?”
Making rubric creation mandatory proved to be the right decision. It added friction upfront, but created more structure, fairness, and trust throughout the experience.
Principles I carry forward
AI should surface confidence, not certainty
Trust grows when systems clearly show where they are reliable and where human judgment still matters.
Human review is part of the product
Human oversight is not a temporary safeguard. It is what makes the whole experience trustworthy.
Design with technical constraints early
Better AI products emerge when capability limits are part of product thinking from the beginning.
DESIGNER
Muhammad Abqary Nasution | Raffialdo Bayu | Hazrul Aswad
Timeline
3 Months UI Phase, 6 Months Development Completion
Collaboration
CEO, 1 PM, 1 Lead Dev, 4 Full-Stack Dev, 1 QA
Ownership
Owned UX end-to-end for the AI assignment flow




