
Building an AI-Powered Legal Case Study Analyzer: Architecture and Design

How we built an intelligent system to help law students solve and analyze legal case studies using LLMs, vector search, and domain-specific rubrics

HuuliTech Engineering Team
2025-11-07
12 minutes
Architecture · AI · LLM · Legal Tech · System Design

At HuuliTech, we're building tools to democratize legal education in Mongolia. One of our most challenging and rewarding features is the Case Study Solver and Analyzer - a system that helps law students practice legal reasoning by solving case studies and evaluating their solutions.

In this article, I'll walk through the architecture, design decisions, and technical challenges we faced while building this system.

The Problem Space

Legal education in Mongolia relies heavily on case studies. Law students must:

  1. Analyze fact patterns to identify legal issues
  2. Research relevant laws and court precedents
  3. Apply legal frameworks (criminal law, civil law, administrative law, constitutional law)
  4. Write structured solutions following court-specific methodologies
  5. Receive feedback on their analysis quality

Traditionally, this process requires:

  • Hours of manual research through legal databases
  • Deep knowledge of legal methodology
  • Access to mentors for feedback
  • Practice with real bar exam problems

We wanted to make this process accessible, instant, and scalable using AI.

System Architecture

Our case study system consists of two main flows:

graph TB
    User[User Interface] --> Modal[Case Study Modal]
    Modal --> |Problem Solving| Solver[Case Study Solver]
    Modal --> |Solution Review| Analyzer[Solution Analyzer]

    Solver --> Classifier[Court Type Classifier]
    Solver --> Examples[Example Selector]
    Solver --> Laws[Law Search Service]
    Solver --> Courts[Court Ruling Service]

    Analyzer --> Classifier2[Court Type Classifier]
    Analyzer --> Rubric[Rubric Evaluator]
    Analyzer --> Laws2[Law Recommendation Service]

    Classifier --> LLM[Gemini LLM]
    Examples --> VectorDB[(Vector Embeddings)]
    Laws --> VectorDB
    Courts --> VectorDB
    Laws2 --> VectorDB

    Solver --> Prompt[Prompt Builder]
    Analyzer --> Prompt2[Evaluation Prompt Builder]

    Prompt --> Stream[Streaming Response]
    Prompt2 --> Stream

    Stream --> ChatUI[Chat Interface]

High-Level Flow

Case Study Solver:

  1. User inputs a legal problem statement
  2. System classifies the court type (criminal, civil, administrative, constitutional)
  3. Fetches relevant context: laws, court rulings, and few-shot examples
  4. Builds a comprehensive prompt with legal methodology guides
  5. Streams a structured solution back to the user

Solution Analyzer:

  1. User inputs both problem and their solution
  2. System loads court-specific evaluation rubrics
  3. Fetches relevant laws for recommendations
  4. Evaluates solution against rubric dimensions
  5. Streams detailed feedback with scores and improvement suggestions

Core Design Decisions

1. Court Type Classification

Challenge: Legal analysis differs dramatically across court types. Criminal law focuses on crime elements and criminal responsibility, while civil law emphasizes dispute resolution and the burden of proof.

Solution: We built an automatic classifier using Gemini Flash with structured output:

flowchart LR
    Input[Problem Statement] --> Gemini[Gemini Flash LLM]
    Gemini --> Schema[Structured JSON Schema]
    Schema --> Result{Classification Result}
    Result --> Criminal[eruu - Criminal]
    Result --> Civil[irgen - Civil]
    Result --> Admin[zahirgaa - Administrative]
    Result --> Constitutional[undes - Constitutional]
    Result --> NotCase[not_case_study]

The classifier returns:

  • Court type (with 95%+ accuracy based on our testing)
  • Confidence score
  • Reasoning for transparency

Users can override the classification if needed, but auto-detection works remarkably well.
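
For illustration, here is a minimal sketch of such a classifier using the @google/generative-ai Node SDK; the model name, prompt wording, and schema details are assumptions rather than our exact production code:

import { GoogleGenerativeAI, SchemaType } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);

// Gemini Flash configured to return structured JSON matching our schema
const classifierModel = genAI.getGenerativeModel({
  model: "gemini-1.5-flash",
  generationConfig: {
    responseMimeType: "application/json",
    responseSchema: {
      type: SchemaType.OBJECT,
      properties: {
        courtType: {
          type: SchemaType.STRING,
          format: "enum",
          enum: ["eruu", "irgen", "zahirgaa", "undes", "not_case_study"],
        },
        confidence: { type: SchemaType.NUMBER },
        reasoning: { type: SchemaType.STRING },
      },
      required: ["courtType", "confidence", "reasoning"],
    },
  },
});

export async function classifyCourtType(problemStatement: string) {
  const result = await classifierModel.generateContent(
    `Classify the court type of this legal case study:\n\n${problemStatement}`
  );
  // The SDK returns the structured JSON as text; parse into a typed result
  return JSON.parse(result.response.text()) as {
    courtType: "eruu" | "irgen" | "zahirgaa" | "undes" | "not_case_study";
    confidence: number;
    reasoning: string;
  };
}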

2. Context Retrieval Pipeline

Challenge: Legal analysis requires multiple types of context:

  • Laws (primary legal sources)
  • Court rulings (precedents and interpretations)
  • Few-shot examples (solved bar exam problems)

Solution: Parallel fetching with semantic search:

sequenceDiagram
    participant Service
    participant Embeddings
    participant LawDB
    participant CourtDB
    participant ExampleDB

    Service->>Embeddings: Generate embedding for problem
    Service->>+LawDB: Fetch 10 matching laws
    Service->>+CourtDB: Fetch 10 court rulings
    Service->>+ExampleDB: Select 2 best examples

    LawDB-->>-Service: Relevant laws by type
    CourtDB-->>-Service: Summarized rulings
    ExampleDB-->>-Service: Similar solved problems

    Service->>Service: Build comprehensive prompt

Key optimizations:

  • Use the same embedding for all searches (generated once; a caching sketch follows this list)
  • Fetch resources in parallel using Promise.all()
  • Load example metadata first, hydrate solutions on demand
  • Limit context size to prevent token overflow
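
Here's a minimal sketch of that single-embedding reuse, with a content-hash cache in front of the embedding call; the cache itself is an illustrative assumption, while generateGeminiEmbedding is the service function that appears later in this article:

import crypto from "crypto";

// Provided elsewhere in our codebase; declared here so the sketch type-checks
declare function generateGeminiEmbedding(text: string): Promise<number[]>;

const embeddingCache = new Map<string, number[]>();

async function getEmbedding(text: string): Promise<number[]> {
  // Key by content hash so identical problems never trigger a second API call
  const key = crypto.createHash("sha256").update(text).digest("hex");
  const cached = embeddingCache.get(key);
  if (cached) return cached;

  const embedding = await generateGeminiEmbedding(text);
  embeddingCache.set(key, embedding);
  return embedding;
}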

3. Few-Shot Learning Architecture

Challenge: LLMs need examples of high-quality legal analysis to follow the correct format and depth.

Solution: We maintain a curated library of solved bar exam problems:

backend/src/services/case-study/few-shot/
├── eruu/           # Criminal law examples
│   ├── problem1.txt
│   ├── problem1_solution.txt
│   ├── problem2.txt
│   └── problem2_solution.txt
├── irgen/          # Civil law examples
├── zahirgaa/       # Administrative law examples
└── undes/          # Constitutional law examples

Selection process:

  1. Load all example metadata (problems only - ~500 tokens each)
  2. Use Gemini to select 2 most relevant examples
  3. Hydrate with full solutions only for selected examples

This approach keeps costs low while maintaining quality.
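
Here's a minimal sketch of the metadata-first loading and hydration steps, assuming the directory layout above; the helper names are illustrative, and the Gemini selection call between the two steps is omitted:

import { promises as fs } from "fs";
import path from "path";

interface ExampleMeta {
  id: string;      // e.g. "problem1"
  problem: string; // problem text only (~500 tokens)
}

// Step 1: load problems only, never solutions
async function loadExampleMetas(courtType: string): Promise<ExampleMeta[]> {
  const dir = path.join(__dirname, "few-shot", courtType);
  const files = await fs.readdir(dir);
  return Promise.all(
    files
      .filter((f) => f.endsWith(".txt") && !f.includes("_solution"))
      .map(async (f) => ({
        id: path.basename(f, ".txt"),
        problem: await fs.readFile(path.join(dir, f), "utf-8"),
      }))
  );
}

// Step 3: read full solutions only for the examples Gemini selected
async function hydrateExamples(courtType: string, selectedIds: string[]) {
  const dir = path.join(__dirname, "few-shot", courtType);
  return Promise.all(
    selectedIds.map(async (id) => ({
      id,
      solution: await fs.readFile(path.join(dir, `${id}_solution.txt`), "utf-8"),
    }))
  );
}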

4. Domain-Specific Rubrics

Challenge: Evaluating legal reasoning isn't subjective - it follows well-defined criteria used in Mongolian bar exams.

Solution: Court-specific rubrics with weighted dimensions:

// Criminal Law Rubric (20 points total)
const eruuRubric = [
  {
    dimension: "Crime composition and classification",
    description: "Correctly identify crime elements...",
    maxScore: 8
  },
  {
    dimension: "Co-participants",
    description: "Identify participant types...",
    maxScore: 4
  },
  {
    dimension: "Criminal responsibility and law application",
    description: "Apply sentencing guidelines...",
    maxScore: 5
  },
  {
    dimension: "Criminal procedure",
    description: "Address procedural requirements...",
    maxScore: 3
  }
];

Each court type has four rubric dimensions totaling 20 points. For each dimension, the LLM provides (a typed sketch follows this list):

  • Score for that dimension
  • Reasoning for the score
  • Specific improvements needed
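
To keep these evaluations machine-readable, each result can be modeled as a typed structure the LLM fills via structured output. A sketch with illustrative field names, not our exact production schema:

interface RubricDimension {
  dimension: string;
  description: string;
  maxScore: number;
}

interface DimensionEvaluation {
  dimension: string;      // must match one rubric dimension
  score: number;          // 0..maxScore for that dimension
  reasoning: string;      // why this score was given
  improvements: string[]; // concrete suggestions for the student
}

interface SolutionEvaluation {
  courtType: "eruu" | "irgen" | "zahirgaa" | "undes";
  dimensions: DimensionEvaluation[];
  totalScore: number;     // sum of dimension scores, out of 20
}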

5. Constitutional Law Methodology

Challenge: Constitutional law analysis follows specific multi-step methodologies (5-step or 6-step analysis).

Solution: Load methodology guides as part of the system prompt:

graph LR
    User[User selects methodology] --> Five[5-Step Analysis]
    User --> Six[6-Step Analysis]

    Five --> F1[1. Right Identification]
    Five --> F2[2. Restriction Identification]
    Five --> F3[3. Legal Basis]
    Five --> F4[4. Proportionality Test]
    Five --> F5[5. Conclusion]

    Six --> S1[1. Right Identification]
    Six --> S2[2. Restriction Identification]
    Six --> S3[3. Scope of Protection]
    Six --> S4[4. Legal Basis]
    Six --> S5[5. Proportionality Test]
    Six --> S6[6. Conclusion]

These guides are embedded in the system prompt, ensuring the LLM follows the correct analytical framework.
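
A minimal sketch of loading a methodology guide into the system prompt; the file layout, helper name, and prompt wording are illustrative assumptions:

import { promises as fs } from "fs";
import path from "path";

type Methodology = "five-step" | "six-step";

async function buildConstitutionalSystemPrompt(methodology: Methodology): Promise<string> {
  // Methodology guides live alongside the other prompt assets
  const guidePath = path.join(__dirname, "methodology", `${methodology}.md`);
  const guide = await fs.readFile(guidePath, "utf-8");

  return [
    "You are an expert in Mongolian constitutional law.",
    "Follow this analysis methodology exactly, step by step:",
    guide,
  ].join("\n\n");
}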

Technical Implementation Highlights

Service Layer Architecture

We use a clean service-oriented architecture:

class CaseStudyServiceV2 {
  async processCaseStudy(request: CaseStudyRequest): Promise<CaseStudyResult> {
    // 1. Classify court type
    const classification = await courtService.classifyCourtType(
      request.problemStatement
    );
    const courtType = classification.courtType;

    // 2. Generate embedding once, reused by every search below
    const embedding = await generateGeminiEmbedding(
      request.problemStatement
    );

    // 3. Load lightweight example metadata (problems only) for this court type
    const exampleMetas = await loadExampleMetas(courtType);

    // 4. Parallel resource fetching
    const [examples, laws, courtRulings] = await Promise.all([
      selectAndHydrateExamples(exampleMetas, request.problemStatement),
      fetchRelevantLaws(request.problemStatement, courtType, embedding),
      courtService.fetchAndSummarizeForCaseStudy(
        request.problemStatement, courtType, embedding, 10
      )
    ]);

    // 5. Build comprehensive prompt
    const { systemPrompt, userPrompt } = await buildCaseStudyPrompt(
      request, courtType, examples, laws, courtRulings
    );

    return { systemPrompt, userPrompt, courtType, classification };
  }
}

Benefits:

  • Clear separation of concerns
  • Easy to test each component independently
  • Centralized error handling and logging
  • Type-safe with TypeScript

Streaming Response Pattern

Legal analysis can be lengthy. We stream responses to improve perceived performance:

// Controller handles streaming
async handleCaseStudy(req: Request, res: Response) {
  const request = req.body as CaseStudyRequest;
  const result = await caseStudyService.processCaseStudy(request);

  // Stream LLM response
  const stream = await generateLLMStream(
    result.systemPrompt,
    result.userPrompt
  );

  // Forward each chunk to the client as soon as it arrives
  for await (const chunk of stream) {
    res.write(chunk);
  }

  res.end();
}

Users see analysis appearing in real-time, making the wait feel shorter.
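
On the client, the stream can be consumed with the fetch API's ReadableStream reader. A minimal sketch; the endpoint path and request body shape are illustrative assumptions:

async function streamCaseStudy(
  problemStatement: string,
  onChunk: (text: string) => void
) {
  const res = await fetch("/api/case-study", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ problemStatement }),
  });

  const reader = res.body!.getReader();
  const decoder = new TextDecoder();

  // Append each decoded chunk to the chat UI as it arrives
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    onChunk(decoder.decode(value, { stream: true }));
  }
}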

Vector Search Strategy

We use hybrid search for legal documents:

-- Combine semantic similarity (pgvector) with an exact court-type filter
SELECT *,
       1 - (embedding <=> $2::vector) AS similarity
FROM legal_documents
WHERE court_type = $1
  AND 1 - (embedding <=> $2::vector) > 0.7
ORDER BY embedding <=> $2::vector
LIMIT 10

This ensures we get (a Node query sketch follows this list):

  • Semantically relevant documents (vector similarity)
  • Court-type filtered results (exact match)
  • Fast retrieval (indexed searches)
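
For completeness, here's how the same query might run from Node with node-postgres; the pool setup and the ::vector parameter cast are assumptions based on common pgvector usage, not our exact data layer:

import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

async function searchLegalDocuments(courtType: string, embedding: number[]) {
  // pgvector accepts a bracketed string literal cast to ::vector
  const vector = `[${embedding.join(",")}]`;
  const { rows } = await pool.query(
    `SELECT *, 1 - (embedding <=> $2::vector) AS similarity
       FROM legal_documents
      WHERE court_type = $1
        AND 1 - (embedding <=> $2::vector) > 0.7
      ORDER BY embedding <=> $2::vector
      LIMIT 10`,
    [courtType, vector]
  );
  return rows;
}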

Performance Considerations

Latency Breakdown

Typical case study analysis takes 8-12 seconds:

Stage               Time      Optimization
Classification      1-2s      Gemini Flash (lightweight model)
Embedding           0.5-1s    Cached when possible
Context Retrieval   2-3s      Parallel fetching
Prompt Building     0.5s      In-memory operations
LLM Generation      5-8s      Streaming for perceived speed

Optimization strategies:

  • Use Gemini Flash for classification (10x faster than Pro)
  • Parallel I/O operations wherever possible
  • Lazy load example solutions (only when needed)
  • Stream responses immediately

Cost Optimization

Running LLMs at scale requires careful cost management:

pie title Token Usage Distribution
    "Context (Laws, Examples)" : 4000
    "System Prompt (Methodology)" : 2000
    "User Problem" : 500
    "Generated Solution" : 1500

Cost reduction techniques:

  • Limit laws to top 10 (not 50)
  • Summarize court rulings before including
  • Use example metadata for selection
  • Choose appropriate model sizes (Flash vs Pro)

Challenges and Solutions

Challenge 1: Context Window Limits

Problem: Including 10 laws + 10 court rulings + 2 examples + methodology guides = 12,000+ tokens

Solution:

  • Prioritize laws matching the classified court type over laws from other domains
  • Summarize court rulings to key points only
  • Use truncation strategies for very long laws (a budget sketch follows this list)
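
Here's a sketch of a simple token-budget guard for long laws; the 4-characters-per-token heuristic and the budget constant are illustrative assumptions, not measured values:

const MAX_CONTEXT_TOKENS = 12_000;

// Rough heuristic; a real tokenizer would be more accurate
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function fitLawsToBudget(laws: string[], budget = MAX_CONTEXT_TOKENS): string[] {
  const kept: string[] = [];
  let used = 0;

  for (const law of laws) {
    const cost = estimateTokens(law);
    if (used + cost <= budget) {
      kept.push(law);
      used += cost;
      continue;
    }
    // Truncate the law that overflows the budget, then stop
    const remainingChars = (budget - used) * 4;
    if (remainingChars > 200) {
      kept.push(law.slice(0, remainingChars));
    }
    break;
  }
  return kept;
}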

Challenge 2: Evaluation Consistency

Problem: LLM evaluations can be inconsistent between runs

Solution:

  • Provide detailed rubric with specific scoring criteria
  • Use examples of each score level (0, 50%, 100%)
  • Request explicit reasoning for each score
  • Log evaluations for quality monitoring

Challenge 3: Mongolian Language Support

Problem: Legal terminology in Mongolian requires careful handling

Solution:

  • All prompts and rubrics in Mongolian
  • Use Gemini models (better multilingual support)
  • Test with native Mongolian legal experts
  • Maintain glossary of legal terms

Results and Impact

Since launching the case study feature:

  • 2,500+ case studies solved
  • 85% user satisfaction (based on thumbs up/down)
  • Average 10 minutes per case study (vs 60+ minutes manually)
  • Students practice 5x more due to instant feedback

The most impactful feature is the solution analyzer - students can now:

  1. Attempt a problem on their own
  2. Get detailed rubric-based feedback
  3. See specific areas for improvement
  4. Learn from recommended laws and precedents

Future Directions

We're exploring several enhancements:

1. Adaptive Difficulty

Track student performance and suggest appropriately challenging problems

2. Comparative Analysis

Show how other students approached the same problem (anonymized)

3. Interactive Clarification

Allow students to ask follow-up questions about their feedback

4. Multi-Turn Dialogue

Support conversational problem-solving instead of single-shot analysis

5. Personalized Study Plans

Generate practice schedules based on weak rubric dimensions

Key Takeaways

Building an AI-powered legal education tool taught us:

  1. Domain expertise matters: Understanding legal methodology was crucial for prompt engineering
  2. Context is king: Quality context (laws, examples, precedents) beats model size
  3. Structured output: Use JSON schemas and rubrics for consistent evaluations
  4. Performance optimization: Parallel fetching and streaming make a huge UX difference
  5. Iterative improvement: Start simple, gather feedback, enhance based on real usage

Conclusion

The Case Study Solver and Analyzer represents our commitment to making legal education accessible to every Mongolian law student. By combining LLMs, vector search, and domain-specific rubrics, we've created a tool that provides instant, structured, and actionable feedback on legal reasoning.

The system is far from perfect - legal reasoning is nuanced and context-dependent. But by focusing on clear methodology, transparent evaluation criteria, and continuous improvement based on user feedback, we're building something that genuinely helps students learn.

If you're building AI-powered education tools, I hope our architecture and design decisions provide useful insights. Feel free to reach out with questions or suggestions!


Want to try the Case Study Solver? Visit huuli.tech and click the "Бодлого" button in the chat interface. It's free to use with our trial program.

Technical Appendix

For those interested in implementation details:

Tech Stack:

  • Backend: Express.js + TypeScript
  • LLM: Google Gemini (Flash for classification, Pro for generation)
  • Vector DB: PostgreSQL with pgvector extension
  • Frontend: Next.js 15 + React Query
  • Streaming: Server-Sent Events (SSE)

Open Questions:

  • How to handle multi-language legal systems? (English common law vs Mongolian civil law)
  • Can we fine-tune smaller models on legal reasoning?
  • What's the right balance between automation and human feedback?

