At HuuliTech, we're building tools to democratize legal education in Mongolia. One of our most challenging and rewarding features is the Case Study Solver and Analyzer - a system that helps law students practice legal reasoning by generating worked solutions to case studies and evaluating student-written solutions.
In this article, I'll walk through the architecture, design decisions, and technical challenges we faced while building this system.
The Problem Space
Legal education in Mongolia relies heavily on case studies. Law students must:
- Analyze fact patterns to identify legal issues
- Research relevant laws and court precedents
- Apply legal frameworks (criminal law, civil law, administrative law, constitutional law)
- Write structured solutions following court-specific methodologies
- Receive feedback on their analysis quality
Traditionally, this process requires:
- Hours of manual research through legal databases
- Deep knowledge of legal methodology
- Access to mentors for feedback
- Practice with real bar exam problems
We wanted to make this process accessible, instant, and scalable using AI.
System Architecture
Our case study system consists of two main flows:
graph TB
User[User Interface] --> Modal[Case Study Modal]
Modal --> |Problem Solving| Solver[Case Study Solver]
Modal --> |Solution Review| Analyzer[Solution Analyzer]
Solver --> Classifier[Court Type Classifier]
Solver --> Examples[Example Selector]
Solver --> Laws[Law Search Service]
Solver --> Courts[Court Ruling Service]
Analyzer --> Classifier2[Court Type Classifier]
Analyzer --> Rubric[Rubric Evaluator]
Analyzer --> Laws2[Law Recommendation Service]
Classifier --> LLM[Gemini LLM]
Examples --> VectorDB[(Vector Embeddings)]
Laws --> VectorDB
Courts --> VectorDB
Laws2 --> VectorDB
Solver --> Prompt[Prompt Builder]
Analyzer --> Prompt2[Evaluation Prompt Builder]
Prompt --> Stream[Streaming Response]
Prompt2 --> Stream
Stream --> ChatUI[Chat Interface]
High-Level Flow
Case Study Solver:
- User inputs a legal problem statement
- System classifies the court type (criminal, civil, administrative, constitutional)
- Fetches relevant context: laws, court rulings, and few-shot examples
- Builds a comprehensive prompt with legal methodology guides
- Streams a structured solution back to the user
Solution Analyzer:
- User inputs both problem and their solution
- System loads court-specific evaluation rubrics
- Fetches relevant laws for recommendations
- Evaluates solution against rubric dimensions
- Streams detailed feedback with scores and improvement suggestions
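To make the analyzer flow concrete, here is a condensed sketch of what that pipeline can look like; all names, types, and signatures below are illustrative rather than our exact production code. The solver follows the same pattern and is shown in full in the implementation section further down.
// Minimal sketch of the analyzer pipeline (illustrative names, not the production API)
interface AnalyzerRequest { problemStatement: string; studentSolution: string; }
interface RubricDimension { dimension: string; description: string; maxScore: number; }
interface AnalyzerDeps {
  classifyCourtType(problem: string): Promise<{ courtType: string }>;
  loadRubric(courtType: string): RubricDimension[];
  fetchRelevantLaws(problem: string, courtType: string): Promise<string[]>;
  buildEvaluationPrompt(req: AnalyzerRequest, rubric: RubricDimension[], laws: string[]):
    { systemPrompt: string; userPrompt: string };
}

async function processAnalysis(request: AnalyzerRequest, deps: AnalyzerDeps) {
  // 1. Pick the rubric by classifying the court type
  const { courtType } = await deps.classifyCourtType(request.problemStatement);
  const rubric = deps.loadRubric(courtType);

  // 2. Fetch relevant laws to ground the recommendations
  const laws = await deps.fetchRelevantLaws(request.problemStatement, courtType);

  // 3. Build the evaluation prompt; the caller streams the LLM response
  const { systemPrompt, userPrompt } = deps.buildEvaluationPrompt(request, rubric, laws);
  return { systemPrompt, userPrompt, courtType, rubric };
}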
Core Design Decisions
1. Court Type Classification
Challenge: Legal analysis differs dramatically across court types. Criminal law focuses on the elements of a crime and criminal responsibility, while civil law emphasizes dispute resolution and the burden of proof.
Solution: We built an automatic classifier using Gemini Flash with structured output:
flowchart LR
Input[Problem Statement] --> Gemini[Gemini Flash LLM]
Gemini --> Schema[Structured JSON Schema]
Schema --> Result{Classification Result}
Result --> Criminal[eruu - Criminal]
Result --> Civil[irgen - Civil]
Result --> Admin[zahirgaa - Administrative]
Result --> Constitutional[undes - Constitutional]
Result --> NotCase[not_case_study]
The classifier returns:
- Court type (with 95%+ accuracy based on our testing)
- Confidence score
- Reasoning for transparency
Users can override the classification if needed, but auto-detection works remarkably well.
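For readers who want a starting point, a classifier along these lines can be sketched with Gemini's structured JSON output; this is a minimal sketch using the @google/generative-ai SDK, with the prompt wording, model name, and field names as illustrative assumptions rather than our exact implementation.
import { GoogleGenerativeAI, SchemaType } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);

// Ask Gemini Flash for a JSON object matching a fixed schema
const classifierModel = genAI.getGenerativeModel({
  model: "gemini-1.5-flash",
  generationConfig: {
    responseMimeType: "application/json",
    responseSchema: {
      type: SchemaType.OBJECT,
      properties: {
        courtType: {
          type: SchemaType.STRING,
          description: "One of: eruu, irgen, zahirgaa, undes, not_case_study",
        },
        confidence: { type: SchemaType.NUMBER },
        reasoning: { type: SchemaType.STRING },
      },
      required: ["courtType", "confidence", "reasoning"],
    },
  },
});

export async function classifyCourtType(problemStatement: string) {
  const result = await classifierModel.generateContent(
    `Classify the following legal problem by court type and explain briefly:\n\n${problemStatement}`
  );
  // The model returns JSON because of the response schema above
  return JSON.parse(result.response.text()) as {
    courtType: string;
    confidence: number;
    reasoning: string;
  };
}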
2. Context Retrieval Pipeline
Challenge: Legal analysis requires multiple types of context:
- Laws (primary legal sources)
- Court rulings (precedents and interpretations)
- Few-shot examples (solved bar exam problems)
Solution: Parallel fetching with semantic search:
sequenceDiagram
participant Service
participant Embeddings
participant LawDB
participant CourtDB
participant ExampleDB
Service->>Embeddings: Generate embedding for problem
Service->>+LawDB: Fetch 10 matching laws
Service->>+CourtDB: Fetch 10 court rulings
Service->>+ExampleDB: Select 2 best examples
LawDB-->>-Service: Relevant laws by type
CourtDB-->>-Service: Summarized rulings
ExampleDB-->>-Service: Similar solved problems
Service->>Service: Build comprehensive prompt
Key optimizations:
- Use the same embedding for all searches (generated once)
- Fetch resources in parallel using Promise.all()
- Load example metadata first, hydrate solutions on demand
- Limit context size to prevent token overflow
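The "generated once" embedding is the cheapest optimization on that list. A minimal sketch of a generateGeminiEmbedding helper using the @google/generative-ai SDK (model name assumed):
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const embedder = genAI.getGenerativeModel({ model: "text-embedding-004" });

// Generate the problem embedding once; every downstream search reuses the same vector.
export async function generateGeminiEmbedding(text: string): Promise<number[]> {
  const result = await embedder.embedContent(text);
  return result.embedding.values;
}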
3. Few-Shot Learning Architecture
Challenge: LLMs need examples of high-quality legal analysis to follow the correct format and depth.
Solution: We maintain a curated library of solved bar exam problems:
backend/src/services/case-study/few-shot/
├── eruu/ # Criminal law examples
│ ├── problem1.txt
│ ├── problem1_solution.txt
│ ├── problem2.txt
│ └── problem2_solution.txt
├── irgen/ # Civil law examples
├── zahirgaa/ # Administrative law examples
└── undes/ # Constitutional law examples
Selection process:
- Load all example metadata (problems only - ~500 tokens each)
- Use Gemini to select 2 most relevant examples
- Hydrate with full solutions only for selected examples
This approach keeps costs low while maintaining quality.
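As a rough sketch, the metadata-first loading can look like this; the file layout matches the tree above, while the helper names are illustrative:
import { promises as fs } from "fs";
import * as path from "path";

const FEW_SHOT_DIR = "backend/src/services/case-study/few-shot";

interface ExampleMeta { id: string; problem: string; }

// Step 1: load problem statements only (~500 tokens each), never the solutions
async function loadExampleMetadata(courtType: string): Promise<ExampleMeta[]> {
  const dir = path.join(FEW_SHOT_DIR, courtType);
  const files = (await fs.readdir(dir)).filter(
    (f) => f.endsWith(".txt") && !f.includes("_solution")
  );
  return Promise.all(
    files.map(async (f) => ({
      id: path.basename(f, ".txt"),
      problem: await fs.readFile(path.join(dir, f), "utf8"),
    }))
  );
}

// Step 3: hydrate full solutions only for the examples the LLM selected
async function hydrateExamples(courtType: string, selectedIds: string[]) {
  const dir = path.join(FEW_SHOT_DIR, courtType);
  return Promise.all(
    selectedIds.map(async (id) => ({
      problem: await fs.readFile(path.join(dir, `${id}.txt`), "utf8"),
      solution: await fs.readFile(path.join(dir, `${id}_solution.txt`), "utf8"),
    }))
  );
}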
4. Domain-Specific Rubrics
Challenge: Evaluating legal reasoning is not a matter of free-form judgment; it must follow the well-defined criteria used in Mongolian bar exams, and the system has to apply those criteria consistently.
Solution: Court-specific rubrics with weighted dimensions:
// Criminal Law Rubric (20 points total)
const eruuRubric = [
  {
    dimension: "Crime composition and classification",
    description: "Correctly identify crime elements...",
    maxScore: 8
  },
  {
    dimension: "Co-participants",
    description: "Identify participant types...",
    maxScore: 4
  },
  {
    dimension: "Criminal responsibility and law application",
    description: "Apply sentencing guidelines...",
    maxScore: 5
  },
  {
    dimension: "Criminal procedure",
    description: "Address procedural requirements...",
    maxScore: 3
  }
];
Each court type has 4 rubric dimensions totaling 20 points. The LLM evaluates each dimension and provides:
- Score for that dimension
- Reasoning for the score
- Specific improvements needed
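For reference, that per-dimension feedback maps naturally onto a small structured type; the field names below are illustrative:
interface DimensionEvaluation {
  dimension: string;      // e.g. "Crime composition and classification"
  score: number;          // points awarded, 0..maxScore
  maxScore: number;       // from the rubric definition
  reasoning: string;      // why this score was given
  improvements: string[]; // concrete suggestions for the student
}

interface SolutionEvaluation {
  courtType: string;
  dimensions: DimensionEvaluation[];
  totalScore: number;     // sum of dimension scores, out of 20
  overallFeedback: string;
}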
5. Constitutional Law Methodology
Challenge: Constitutional law analysis follows specific multi-step methodologies (5-step or 6-step analysis).
Solution: Load methodology guides as part of the system prompt:
graph LR
User[User selects methodology] --> Five[5-Step Analysis]
User --> Six[6-Step Analysis]
Five --> Step1[1. Right Identification]
Five --> Step2[2. Restriction Identification]
Five --> Step3[3. Legal Basis]
Five --> Step4[4. Proportionality Test]
Five --> Step5[5. Conclusion]
Six --> Step1
Six --> Step2
Six --> Step2b[3. Scope of Protection]
Six --> Step3
Six --> Step4
Six --> Step5
These guides are embedded in the system prompt, ensuring the LLM follows the correct analytical framework.
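Loading a guide can be as simple as reading a text file and prepending it to the system prompt; a sketch with assumed file paths:
import { promises as fs } from "fs";

// Methodology guides stored as plain text (file paths assumed for illustration)
const METHODOLOGY_GUIDES: Record<string, string> = {
  "5-step": "guides/undes/five_step_analysis.txt",
  "6-step": "guides/undes/six_step_analysis.txt",
};

async function buildConstitutionalSystemPrompt(
  methodology: "5-step" | "6-step",
  basePrompt: string
): Promise<string> {
  const guide = await fs.readFile(METHODOLOGY_GUIDES[methodology], "utf8");
  // The guide is embedded verbatim so the LLM follows the chosen analytical framework
  return `${basePrompt}\n\nFollow this methodology exactly:\n${guide}`;
}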
Technical Implementation Highlights
Service Layer Architecture
We use a clean service-oriented architecture:
class CaseStudyServiceV2 {
  async processCaseStudy(request: CaseStudyRequest): Promise<CaseStudyResult> {
    // 1. Classify court type
    const classification = await courtService.classifyCourtType(
      request.problemStatement
    );
    const courtType = classification.courtType;

    // 2. Generate embedding once and reuse it for every search
    const embedding = await generateGeminiEmbedding(
      request.problemStatement
    );

    // 3. Parallel resource fetching
    // (exampleMetas = problem-only metadata for this court type, loaded as in the few-shot section)
    const [examples, laws, courtRulings] = await Promise.all([
      selectAndHydrateExamples(exampleMetas, request.problemStatement),
      fetchRelevantLaws(request.problemStatement, courtType, embedding),
      courtService.fetchAndSummarizeForCaseStudy(
        request.problemStatement, courtType, embedding, 10
      )
    ]);

    // 4. Build comprehensive prompt
    const { systemPrompt, userPrompt } = await buildCaseStudyPrompt(
      request, courtType, examples, laws, courtRulings
    );

    return { systemPrompt, userPrompt, courtType, classification };
  }
}
Benefits:
- Clear separation of concerns
- Easy to test each component independently
- Centralized error handling and logging
- Type-safe with TypeScript
Streaming Response Pattern
Legal analysis can be lengthy. We stream responses to improve perceived performance:
// Controller handles streaming
async handleCaseStudy(req, res) {
  const request = req.body as CaseStudyRequest;
  const result = await caseStudyService.processCaseStudy(request);

  // Stream LLM response chunks to the client as they arrive
  const stream = await generateLLMStream(
    result.systemPrompt,
    result.userPrompt
  );
  for await (const chunk of stream) {
    res.write(chunk);
  }
  res.end();
}
Users see analysis appearing in real-time, making the wait feel shorter.
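On the client, the streamed body can be consumed incrementally with the Fetch API; a minimal sketch, with the endpoint path and callback as illustrative assumptions:
// Read the streamed solution chunk by chunk in the browser
async function streamCaseStudy(
  problemStatement: string,
  onChunk: (text: string) => void
): Promise<void> {
  const response = await fetch("/api/case-study", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ problemStatement }),
  });
  if (!response.body) throw new Error("Streaming not supported");

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    onChunk(decoder.decode(value, { stream: true })); // append to the chat UI as it arrives
  }
}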
Vector Search Strategy
We combine vector similarity with exact court-type filtering when searching legal documents:
-- Semantic similarity (pgvector) with a court-type filter
SELECT *
FROM legal_documents
WHERE court_type = $1
  AND 1 - (embedding <=> $2) > 0.7   -- cosine similarity threshold
ORDER BY embedding <=> $2            -- nearest (most similar) first
LIMIT 10
This ensures we get:
- Semantically relevant documents (vector similarity)
- Court-type filtered results (exact match)
- Fast retrieval (indexed searches)
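From the backend, the same query can be issued through node-postgres with the embedding bound as a pgvector literal; the index in the comment is what makes the "fast retrieval" point hold. Table and column names follow the query above; everything else is illustrative.
import { Pool } from "pg";

const pool = new Pool(); // connection details come from the PG* environment variables

// One-time setup (assumed):
//   CREATE INDEX ON legal_documents USING hnsw (embedding vector_cosine_ops);

export async function searchLegalDocuments(courtType: string, queryEmbedding: number[]) {
  const vectorLiteral = `[${queryEmbedding.join(",")}]`; // pgvector accepts '[x,y,...]' text
  const { rows } = await pool.query(
    `SELECT *, 1 - (embedding <=> $2::vector) AS similarity
       FROM legal_documents
      WHERE court_type = $1
        AND 1 - (embedding <=> $2::vector) > 0.7
      ORDER BY embedding <=> $2::vector
      LIMIT 10`,
    [courtType, vectorLiteral]
  );
  return rows;
}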
Performance Considerations
Latency Breakdown
Typical case study analysis takes 8-12 seconds:
| Stage | Time | Optimization |
|---|---|---|
| Classification | 1-2s | Gemini Flash (lightweight model) |
| Embedding | 0.5-1s | Cached when possible |
| Context Retrieval | 2-3s | Parallel fetching |
| Prompt Building | 0.5s | In-memory operations |
| LLM Generation | 5-8s | Streaming for perceived speed |
Optimization strategies:
- Use Gemini Flash for classification (10x faster than Pro)
- Parallel I/O operations wherever possible
- Lazy load example solutions (only when needed)
- Stream responses immediately
Cost Optimization
Running LLMs at scale requires careful cost management:
pie title Token Usage Distribution
"Context (Laws, Examples)" : 4000
"System Prompt (Methodology)" : 2000
"User Problem" : 500
"Generated Solution" : 1500
Cost reduction techniques:
- Limit laws to top 10 (not 50)
- Summarize court rulings before including
- Use example metadata for selection
- Choose appropriate model sizes (Flash vs Pro)
Challenges and Solutions
Challenge 1: Context Window Limits
Problem: Including 10 laws + 10 court rulings + 2 examples + methodology guides = 12,000+ tokens
Solution:
- Prioritize matching-type laws over other-type laws
- Summarize court rulings to key points only
- Use truncation strategies for very long laws
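A simple token-budget guard is enough to enforce the truncation point above; this sketch uses a rough characters-per-token heuristic rather than a real tokenizer:
// Roughly cap a piece of context at a token budget.
// Assumes ~4 characters per token as a crude heuristic; swap in a real tokenizer if needed.
function truncateToTokenBudget(text: string, maxTokens: number): string {
  const approxTokens = Math.ceil(text.length / 4);
  if (approxTokens <= maxTokens) return text;
  const maxChars = maxTokens * 4;
  return text.slice(0, maxChars) + "\n[... truncated to fit the context window ...]";
}

// Example: keep each law under ~800 tokens before adding it to the prompt
// const trimmedLaws = laws.map((law) => truncateToTokenBudget(law, 800));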
Challenge 2: Evaluation Consistency
Problem: LLM evaluations can be inconsistent between runs
Solution:
- Provide detailed rubric with specific scoring criteria
- Use examples of each score level (0, 50%, 100%)
- Request explicit reasoning for each score
- Log evaluations for quality monitoring
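One way to bake the score-level anchors into the prompt is to render each rubric dimension with explicit reference points; a sketch (wording illustrative, and the real prompts are in Mongolian):
interface RubricDimension { dimension: string; description: string; maxScore: number; }

// Turn a rubric dimension into prompt text with explicit score anchors,
// so the LLM grades against the same reference points on every run.
function renderDimension(d: RubricDimension): string {
  return [
    `Dimension: ${d.dimension} (max ${d.maxScore} points)`,
    `Criteria: ${d.description}`,
    `Score 0: the dimension is not addressed at all.`,
    `Score ${Math.round(d.maxScore / 2)}: partially addressed, with gaps or errors.`,
    `Score ${d.maxScore}: fully and correctly addressed.`,
    `Give a score, your reasoning, and specific improvements.`,
  ].join("\n");
}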
Challenge 3: Mongolian Language Support
Problem: Legal terminology in Mongolian requires careful handling
Solution:
- All prompts and rubrics in Mongolian
- Use Gemini models (better multilingual support)
- Test with native Mongolian legal experts
- Maintain glossary of legal terms
Results and Impact
Since launching the case study feature:
- 2,500+ case studies solved
- 85% user satisfaction (based on thumbs up/down)
- Average 10 minutes per case study (vs 60+ minutes manually)
- Students practice 5x more due to instant feedback
The most impactful feature is the solution analyzer - students can now:
- Attempt a problem on their own
- Get detailed rubric-based feedback
- See specific areas for improvement
- Learn from recommended laws and precedents
Future Directions
We're exploring several enhancements:
1. Adaptive Difficulty
Track student performance and suggest appropriately challenging problems
2. Comparative Analysis
Show how other students approached the same problem (anonymized)
3. Interactive Clarification
Allow students to ask follow-up questions about their feedback
4. Multi-Turn Dialogue
Support conversational problem-solving instead of single-shot analysis
5. Personalized Study Plans
Generate practice schedules based on weak rubric dimensions
Key Takeaways
Building an AI-powered legal education tool taught us:
- Domain expertise matters: Understanding legal methodology was crucial for prompt engineering
- Context is king: Quality context (laws, examples, precedents) beats model size
- Structured output: Use JSON schemas and rubrics for consistent evaluations
- Performance optimization: Parallel fetching and streaming make a huge UX difference
- Iterative improvement: Start simple, gather feedback, enhance based on real usage
Conclusion
The Case Study Solver and Analyzer represents our commitment to making legal education accessible to every Mongolian law student. By combining LLMs, vector search, and domain-specific rubrics, we've created a tool that provides instant, structured, and actionable feedback on legal reasoning.
The system is far from perfect - legal reasoning is nuanced and context-dependent. But by focusing on clear methodology, transparent evaluation criteria, and continuous improvement based on user feedback, we're building something that genuinely helps students learn.
If you're building AI-powered education tools, I hope our architecture and design decisions provide useful insights. Feel free to reach out with questions or suggestions!
Want to try the Case Study Solver? Visit huuli.tech and click the "Бодлого" ("Problem") button in the chat interface. It's free to use with our trial program.
Technical Appendix
For those interested in implementation details:
Tech Stack:
- Backend: Express.js + TypeScript
- LLM: Google Gemini (Flash for classification, Pro for generation)
- Vector DB: PostgreSQL with pgvector extension
- Frontend: Next.js 15 + React Query
- Streaming: Server-Sent Events (SSE)
Open Questions:
- How to handle multi-language legal systems? (English common law vs Mongolian civil law)
- Can we fine-tune smaller models on legal reasoning?
- What's the right balance between automation and human feedback?
Resources: