MedExpertMatch Implementation Plan¶
Last Updated: 2026-01-22
Version: 1.1
Status: MVP Complete ✅
Document Purpose¶
This implementation plan provides a detailed, phase-by-phase guide for implementing MedExpertMatch based on the Product Requirements Document (PRD), Architecture, and Use Cases. The plan follows Test-Driven Development (TDD) principles and uses patterns from the expert-match codebase as reference.
Related Documentation:
- Product Requirements Document - Complete product requirements and specifications
- Architecture - System architecture and design
- Use Cases - Detailed use case workflows with sequence diagrams
- UI Flows and Mockups - User interface wireframes, flows, and UI/UX guidelines
- Vision - Project vision and long-term goals
Implementation Timeline¶
Total Duration: 6 weeks (MVP for MedGemma Impact Challenge)
- Week 1-2: Foundation (Domain Models, Database Schema, Repositories)
- Week 3: Core Services (MedGemma Integration, Case Analysis)
- Week 4: Agent Skills Implementation
- Week 5-6: Integration, Testing, UI, Demo Preparation
Prerequisites¶
Development Environment¶
- Java: Java 21 (LTS)
- Maven: Maven 3.9+
- Docker: Docker and Docker Compose (for database containers)
- IDE: IntelliJ IDEA or VS Code with Java extensions
- Python: Python 3.8+ (for documentation)
Docker Containers¶
The project requires Docker containers for:
- Development Database: PostgreSQL 17 with PgVector and Apache AGE
- Test Database: PostgreSQL 17 with PgVector and Apache AGE (Testcontainers)
- Demo Database: PostgreSQL 17 with PgVector and Apache AGE (for demo/test data)
Phase 1: Foundation (Weeks 1-2)¶
1.1 Project Setup¶
1.1.1 Initialize Spring Boot Project¶
Reference: See expert-match pom.xml structure
Tasks:
- Create Maven project structure
- Configure `pom.xml` with dependencies:
  - Spring Boot 4.0.2
  - Spring AI 2.0.0-M2
  - PostgreSQL Driver
  - Testcontainers
  - Lombok
  - Datafaker (for test data generation)
  - HAPI FHIR R5 (for FHIR resource creation and validation): `ca.uhn.hapi.fhir:hapi-fhir-structures-r5:7.0.0` (or latest version supporting R5)
- Create package structure following domain-driven design
Package Structure:
com.berdachuk.medexpertmatch/
├── core/ # Shared infrastructure
├── doctor/ # Doctor domain module
├── medicalcase/ # Medical case domain module
├── medicalcoding/ # ICD-10, SNOMED codes
├── clinicalexperience/ # Clinical experience domain module
├── query/ # Query processing
├── retrieval/ # Hybrid GraphRAG retrieval
├── llm/ # LLM orchestration
├── embedding/ # Vector embedding generation
├── graph/ # Apache AGE graph management
├── chat/ # Chat conversation management
├── ingestion/ # Data ingestion (test data generator)
└── web/ # Thymeleaf UI controllers
1.1.2 Docker Container Setup¶
Reference: See expert-match docker/Dockerfile.dev, docker/Dockerfile.test, docker-compose.dev.yml
Create Docker Files:
docker/Dockerfile.dev (Development Database):
# Use the official Apache AGE image with PostgreSQL 17 and AGE 1.6.0
FROM apache/age:release_PG17_1.6.0
# Install necessary packages for building pgVector extension
RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential \
git \
ca-certificates \
postgresql-server-dev-17 && \
update-ca-certificates && \
rm -rf /var/lib/apt/lists/*
# Clone, build, and install pgVector 0.8.0
RUN git config --global http.sslverify false && \
git clone --branch v0.8.0 https://github.com/pgvector/pgvector.git /pgvector && \
cd /pgvector && \
make PG_CONFIG=/usr/lib/postgresql/17/bin/pg_config && \
make PG_CONFIG=/usr/lib/postgresql/17/bin/pg_config install && \
cd / && \
rm -rf /pgvector
# Update PostgreSQL configuration to preload both extensions
RUN if [ -f /usr/share/postgresql/postgresql.conf.sample ]; then \
sed -i "s/shared_preload_libraries = 'age'/shared_preload_libraries = 'age,vector'/" /usr/share/postgresql/postgresql.conf.sample || \
echo "shared_preload_libraries = 'age,vector'" >> /usr/share/postgresql/postgresql.conf.sample; \
elif [ -f /usr/share/postgresql/17/postgresql.conf.sample ]; then \
sed -i "s/shared_preload_libraries = 'age'/shared_preload_libraries = 'age,vector'/" /usr/share/postgresql/17/postgresql.conf.sample || \
echo "shared_preload_libraries = 'age,vector'" >> /usr/share/postgresql/17/postgresql.conf.sample; \
else \
echo "shared_preload_libraries = 'age,vector'" >> /etc/postgresql/postgresql.conf || true; \
fi
EXPOSE 5432
CMD ["postgres"]
docker/Dockerfile.test (Test Database - same as dev):
# Same as Dockerfile.dev - used for Testcontainers
FROM apache/age:release_PG17_1.6.0
# (build steps identical to Dockerfile.dev: install build dependencies, compile and
# install pgVector 0.8.0, preload 'age,vector' in the PostgreSQL config sample)
EXPOSE 5432
CMD ["postgres"]
docker-compose.dev.yml (Development Database):
version: '3.8'

services:
  postgres-dev:
    build:
      context: .
      dockerfile: docker/Dockerfile.dev
    image: medexpertmatch-postgres-dev:latest
    container_name: medexpertmatch-postgres-dev
    environment:
      POSTGRES_USER: medexpertmatch
      POSTGRES_PASSWORD: medexpertmatch
      POSTGRES_DB: medexpertmatch
    ports:
      - "5433:5432"  # Map container port 5432 to host port 5433
    volumes:
      - ~/data/medexpertmatch-postgres:/var/lib/postgresql/data
    healthcheck:
      test: [ "CMD-SHELL", "pg_isready -U medexpertmatch" ]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - medexpertmatch-network

  postgres-demo:
    build:
      context: .
      dockerfile: docker/Dockerfile.dev
    image: medexpertmatch-postgres-dev:latest
    container_name: medexpertmatch-postgres-demo
    environment:
      POSTGRES_USER: medexpertmatch
      POSTGRES_PASSWORD: medexpertmatch
      POSTGRES_DB: medexpertmatch_demo
    ports:
      - "5434:5432"  # Map container port 5432 to host port 5434
    volumes:
      - ~/data/medexpertmatch-postgres-demo:/var/lib/postgresql/data
    healthcheck:
      test: [ "CMD-SHELL", "pg_isready -U medexpertmatch" ]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - medexpertmatch-network

networks:
  medexpertmatch-network:
    driver: bridge
scripts/build-test-container.sh:
#!/bin/bash
# Build the test container image for integration tests
# This image includes PostgreSQL 17 with Apache AGE and PgVector extensions
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
echo "Building medexpertmatch-postgres-test Docker image..."
echo "This may take 5-10 minutes on first build..."
cd "$PROJECT_ROOT"
docker build -f docker/Dockerfile.test -t medexpertmatch-postgres-test:latest .
echo ""
echo "✅ Test container image built successfully!"
echo "Image: medexpertmatch-postgres-test:latest"
echo ""
echo "You can now run integration tests with:"
echo " mvn test -Dtest=*IT"
Tasks:
- Create `docker/` directory
- Create `Dockerfile.dev` and `Dockerfile.test`
- Create `docker-compose.dev.yml` with dev and demo database services
- Create `scripts/build-test-container.sh`
- Make script executable: `chmod +x scripts/build-test-container.sh`
- Build test container: `./scripts/build-test-container.sh`
- Start development database: `docker compose -f docker-compose.dev.yml up -d postgres-dev`
- Start demo database: `docker compose -f docker-compose.dev.yml up -d postgres-demo`
1.1.3 Database Schema Design¶
Reference: See expert-match src/main/resources/db/migration/V1__initial_schema.sql
Tasks:
- Create Flyway migration file: `src/main/resources/db/migration/V1__initial_schema.sql`
- Design tables:
  - `doctors` (adapted from `employees`)
  - `medical_cases` (adapted from `projects`)
  - `clinical_experiences` (adapted from `work_experiences`)
  - `icd10_codes` (new)
  - `medical_specialties` (adapted from `technologies`)
  - `facilities` (new)
  - `consultation_matches` (new)
- Create indexes for vector search (PgVector)
- Create graph schema (Apache AGE)
Key Design Decisions:
- Doctor IDs: VARCHAR(74) for external system IDs (supports UUID strings, 19-digit numeric strings, or other formats)
- Medical Case IDs: CHAR(24) for internal MongoDB-compatible IDs
- Vector Columns: Use `vector(1536)` for embeddings (MedGemma dimensions)
- Graph Labels: `Doctor`, `MedicalCase`, `ICD10Code`, `MedicalSpecialty`, `Facility`
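These decisions might translate into a migration excerpt like the following sketch; the columns beyond the ID and vector types are illustrative assumptions, not the final schema:

```sql
-- Illustrative excerpt of V1__initial_schema.sql (not the final schema)
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE doctors (
    id                  VARCHAR(74) PRIMARY KEY,  -- external system ID
    name                TEXT NOT NULL,
    email               TEXT,
    telehealth_enabled  BOOLEAN NOT NULL DEFAULT FALSE,
    availability_status TEXT,
    profile_embedding   vector(1536)              -- MedGemma embedding dimensions
);

CREATE TABLE medical_cases (
    id              CHAR(24) PRIMARY KEY,         -- MongoDB-compatible internal ID
    chief_complaint TEXT,
    urgency_level   TEXT,
    case_embedding  vector(1536)
);

-- HNSW index for vector similarity search (PgVector)
CREATE INDEX idx_doctors_embedding ON doctors
    USING hnsw (profile_embedding vector_cosine_ops);
```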
1.2 Domain Models¶
Reference: See expert-match src/main/java/com/berdachuk/expertmatch/employee/domain/Employee.java
1.2.1 Doctor Domain Model¶
Location: src/main/java/com/berdachuk/medexpertmatch/doctor/domain/Doctor.java
Tasks:
- Create `Doctor` record (adapted from `Employee`)
- Add medical-specific fields:
  - Medical specialties (`List<String>`)
  - Board certifications (`List<String>`)
  - Facility affiliations (`List<String>`)
  - Telehealth capability (boolean)
- Create `MedicalSpecialty` enum/entity
- Create DTOs, filters, wrappers
Example Structure:
package com.berdachuk.medexpertmatch.doctor.domain;
import lombok.Builder;

import java.util.List;

@Builder
public record Doctor(
String id, // External system ID (VARCHAR(74)) - UUID, 19-digit numeric, or other format
String name,
String email,
List<String> specialties, // Medical specialties
List<String> certifications, // Board certifications
List<String> facilityIds, // Facility affiliations
boolean telehealthEnabled,
String availabilityStatus
) {
}
1.2.2 MedicalCase Domain Model¶
Location: src/main/java/com/berdachuk/medexpertmatch/medicalcase/domain/MedicalCase.java
Tasks:
- Create `MedicalCase` record (adapted from `Project`)
- Add medical-specific fields:
  - Patient age (anonymized)
  - Chief complaint
  - Symptoms
  - ICD-10 codes (`List<String>`)
  - SNOMED codes (`List<String>`)
  - Urgency level (enum: CRITICAL, HIGH, MEDIUM, LOW)
  - Required specialty
  - Case type (enum: INPATIENT, SECOND_OPINION, CONSULT_REQUEST)
- Create related entities and DTOs
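The record shape implied by this task list can be sketched as follows; the field names and nested enums are illustrative assumptions, not the final model:

```java
import java.util.List;

// Illustrative sketch; would live in com.berdachuk.medexpertmatch.medicalcase.domain
record MedicalCase(
        String id,                   // internal CHAR(24) MongoDB-compatible ID
        Integer patientAge,          // anonymized
        String chiefComplaint,
        List<String> symptoms,
        List<String> icd10Codes,
        List<String> snomedCodes,
        UrgencyLevel urgencyLevel,
        String requiredSpecialty,
        CaseType caseType
) {
    public enum UrgencyLevel { CRITICAL, HIGH, MEDIUM, LOW }

    public enum CaseType { INPATIENT, SECOND_OPINION, CONSULT_REQUEST }
}
```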
1.2.3 ClinicalExperience Domain Model¶
Location: src/main/java/com/berdachuk/medexpertmatch/clinicalexperience/domain/ClinicalExperience.java
Tasks:
- Create `ClinicalExperience` record (adapted from `WorkExperience`)
- Add medical-specific fields:
  - Case outcomes
  - Procedures performed
  - Complexity level
  - Complications
  - Patient outcomes (anonymized)
- Link to `Doctor` and `MedicalCase`
1.2.4 ICD10Code Domain Model¶
Location: src/main/java/com/berdachuk/medexpertmatch/medicalcoding/domain/ICD10Code.java
Tasks:
- Create `ICD10Code` entity (new)
- Store ICD-10 code hierarchy
- Support code relationships and synonyms
1.3 Repository Layer¶
Reference: See expert-match src/main/java/com/berdachuk/expertmatch/employee/repository/EmployeeRepository.java
and implementation
1.3.1 DoctorRepository¶
Location: src/main/java/com/berdachuk/medexpertmatch/doctor/repository/DoctorRepository.java
Tasks:
- Create `DoctorRepository` interface
- Implement `DoctorRepositoryImpl` with JDBC
- Create `DoctorMapper` (RowMapper)
- Implement methods:
  - `findById(String doctorId)`
  - `findAll()`
  - `findBySpecialty(String specialty)`
  - `findByCondition(String icd10Code)` (for graph queries)
  - `findByConditionWithMetrics(String icd10Code, Period period)` (for analytics)
- Write integration tests: `DoctorRepositoryIT`
Example Structure:
package com.berdachuk.medexpertmatch.doctor.repository;
import com.berdachuk.medexpertmatch.doctor.domain.Doctor;
import java.time.Period;
import java.util.List;
import java.util.Map;
import java.util.Optional;
public interface DoctorRepository {
Optional<Doctor> findById(String doctorId);
List<Doctor> findAll();
List<Doctor> findBySpecialty(String specialty);
List<Doctor> findByCondition(String icd10Code);
Map<String, List<Doctor>> findByConditionWithMetrics(String icd10Code, Period period);
}
1.3.2 MedicalCaseRepository¶
Location: src/main/java/com/berdachuk/medexpertmatch/medicalcase/repository/MedicalCaseRepository.java
Tasks:
- Create `MedicalCaseRepository` interface
- Implement `MedicalCaseRepositoryImpl`
- Create `MedicalCaseMapper`
- Implement methods:
  - `findById(String caseId)`
  - `save(MedicalCase medicalCase)`
  - `findByType(CaseType type)`
  - `findOpenConsultRequests()`
- Write integration tests: `MedicalCaseRepositoryIT`
1.3.3 ClinicalExperienceRepository¶
Location:
src/main/java/com/berdachuk/medexpertmatch/clinicalexperience/repository/ClinicalExperienceRepository.java
Tasks:
- Create `ClinicalExperienceRepository` interface
- Implement `ClinicalExperienceRepositoryImpl`
- Create `ClinicalExperienceMapper`
- Implement batch loading methods:
  - `findByDoctorIds(List<String> doctorIds)` → `Map<String, List<ClinicalExperience>>`
  - `findByCaseIds(List<String> caseIds)` → `Map<String, List<ClinicalExperience>>`
- Write integration tests: `ClinicalExperienceRepositoryIT`
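The batch-loading methods exist to avoid N+1 queries: fetch all rows for the requested IDs in one query, then group in memory. The grouping step can be sketched like this (the `Experience` record is a simplified stand-in for the real domain type):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Simplified stand-in for the ClinicalExperience domain record (illustrative)
record Experience(String doctorId, String caseId, String outcome) {}

class BatchGrouping {
    // After a single query fetching all rows for the given doctor IDs,
    // group them by doctor so callers get Map<doctorId, experiences>
    static Map<String, List<Experience>> byDoctor(List<Experience> rows) {
        return rows.stream().collect(Collectors.groupingBy(Experience::doctorId));
    }
}
```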
1.4 Base Integration Test Setup¶
Reference: See expert-match src/test/java/com/berdachuk/expertmatch/integration/BaseIntegrationTest.java
Location: src/test/java/com/berdachuk/medexpertmatch/integration/BaseIntegrationTest.java
Tasks:
- Create `BaseIntegrationTest` abstract class
- Configure Testcontainers with custom image: `medexpertmatch-postgres-test:latest`
- Set up `@DynamicPropertySource` for database connection
- Implement `@BeforeEach` to clear test data
- Configure container reuse for faster tests
Example Structure:
package com.berdachuk.medexpertmatch.integration;
import lombok.extern.slf4j.Slf4j;
import org.junit.jupiter.api.BeforeEach;
import org.springframework.test.context.DynamicPropertyRegistry;
import org.springframework.test.context.DynamicPropertySource;
import org.testcontainers.containers.PostgreSQLContainer;
import org.testcontainers.utility.DockerImageName;
@Slf4j
public abstract class BaseIntegrationTest {
static final PostgreSQLContainer<?> postgres;
static {
PostgreSQLContainer<?> container = new PostgreSQLContainer<>(
DockerImageName.parse("medexpertmatch-postgres-test:latest")
.asCompatibleSubstituteFor("postgres"))
.withDatabaseName("medexpertmatch_test")
.withUsername("test")
.withPassword("test")
.withReuse(true)
.withLabel("test", "medexpertmatch-integration");
container.start();
postgres = container;
}
@DynamicPropertySource
static void configureProperties(DynamicPropertyRegistry registry) {
registry.add("spring.datasource.url", postgres::getJdbcUrl);
registry.add("spring.datasource.username", postgres::getUsername);
registry.add("spring.datasource.password", postgres::getPassword);
}
@BeforeEach
void setUp() {
// Clear test data before each test
clearTestData();
}
protected abstract void clearTestData();
}
Phase 2: Core Services (Week 3)¶
2.1 CaseAnalysisService¶
Reference: See expert-match query processing and entity extraction patterns
Location: src/main/java/com/berdachuk/medexpertmatch/query/service/CaseAnalysisService.java
Tasks:
- Create `CaseAnalysisService` interface
- Implement `CaseAnalysisServiceImpl` with MedGemma integration
- Implement methods:
  - `analyzeCase(String caseText) → CaseAnalysis`
  - `extractICD10Codes(String caseText) → List<String>`
  - `classifyUrgency(String caseText) → UrgencyLevel`
  - `determineRequiredSpecialty(String caseText) → String`
- Use Spring AI `PromptTemplate` with `.st` files
- Write integration tests: `CaseAnalysisServiceIT`
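The real urgency classification is done by MedGemma, but a deterministic keyword-based fallback is useful for tests or when the model is unavailable. A sketch under that assumption (the keyword lists and the MEDIUM default are illustrative choices, not clinical guidance):

```java
import java.util.List;
import java.util.Locale;

class UrgencyFallback {
    enum UrgencyLevel { CRITICAL, HIGH, MEDIUM, LOW }

    // Illustrative keyword lists; real classification is done by MedGemma
    private static final List<String> CRITICAL_TERMS =
            List.of("st elevation", "cardiac arrest", "anaphylaxis");
    private static final List<String> HIGH_TERMS =
            List.of("chest pain", "acute", "sepsis");

    static UrgencyLevel classify(String caseText) {
        String text = caseText.toLowerCase(Locale.ROOT);
        if (CRITICAL_TERMS.stream().anyMatch(text::contains)) return UrgencyLevel.CRITICAL;
        if (HIGH_TERMS.stream().anyMatch(text::contains)) return UrgencyLevel.HIGH;
        return UrgencyLevel.MEDIUM; // conservative default for unreviewed cases
    }
}
```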
2.2 SemanticGraphRetrievalService (Semantic Graph Retrieval)¶
Location: src/main/java/com/berdachuk/medexpertmatch/retrieval/service/SemanticGraphRetrievalService.java
Tasks:
- Create `SemanticGraphRetrievalService` interface
- Implement `SgrServiceImpl`
- Implement methods:
  - `score(MedicalCase case, Doctor doctor) → ScoreResult`
  - `semanticGraphRetrievalRouteScore(MedicalCase case, Facility facility) → RouteScoreResult`
  - `computePriorityScore(MedicalCase case) → PriorityScore`
- Combine signals:
  - Vector similarity (PgVector embeddings)
  - Graph relationships (Apache AGE)
  - Historical performance (outcomes, ratings)
- Write integration tests: `SgrServiceIT`
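Combining the three signals can be sketched as a weighted sum over normalized scores; the weights below are placeholder assumptions that would be tuned or made configurable:

```java
class SgrScoring {
    // Illustrative weights; real values would be tuned or externalized to config
    static final double W_VECTOR = 0.5;
    static final double W_GRAPH = 0.3;
    static final double W_HISTORY = 0.2;

    // Each input signal is assumed normalized to [0, 1] before combining
    static double combine(double vectorSimilarity, double graphScore, double historyScore) {
        return W_VECTOR * vectorSimilarity + W_GRAPH * graphScore + W_HISTORY * historyScore;
    }
}
```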
2.3 GraphService¶
Location: src/main/java/com/berdachuk/medexpertmatch/graph/service/GraphService.java
Tasks:
- Create `GraphService` interface
- Implement `GraphServiceImpl` with Apache AGE Cypher queries
- Implement methods:
  - `graphQueryTopExperts(String conditionCode, Period period) → List<ExpertMetrics>`
  - `graphQueryCandidateCenters(String conditionCode) → List<FacilityCandidate>`
  - `queryDoctorCaseRelationships(String doctorId, String conditionCode) → List<DoctorCaseRelationship>`
- Write integration tests: `GraphServiceIT`
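With Apache AGE, a query like `graphQueryTopExperts` is Cypher embedded in SQL. A sketch, assuming the graph labels from Phase 1; the graph name and edge labels (`TREATED`, `HAS_CODE`) are assumptions, not the final graph schema:

```sql
-- Session setup assumed: LOAD 'age'; SET search_path = ag_catalog, public;
SELECT * FROM cypher('medexpertmatch_graph', $$
    MATCH (d:Doctor)-[:TREATED]->(c:MedicalCase)-[:HAS_CODE]->(code:ICD10Code {code: 'I21.9'})
    RETURN d.id, count(c) AS case_count
    ORDER BY case_count DESC
    LIMIT 10
$$) AS (doctor_id agtype, case_count agtype);
```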
2.4 MatchingService¶
Location: src/main/java/com/berdachuk/medexpertmatch/retrieval/service/MatchingService.java
Tasks:
- Create `MatchingService` interface
- Implement `MatchingServiceImpl`
- Orchestrate matching across multiple services
- Implement methods:
  - `matchDoctorsToCase(String caseId, MatchOptions options) → List<DoctorMatch>`
  - `matchFacilitiesForCase(String caseId, RoutingOptions options) → List<FacilityMatch>`
- Write integration tests: `MatchingServiceIT`
2.5 FHIR Adapters¶
Location: src/main/java/com/berdachuk/medexpertmatch/ingestion/adapter/FhirBundleAdapter.java
Reference: FHIR R5 specification (v5.0.0) for resource structure and data types
Tasks:
- Create FHIR adapter interfaces
- Implement adapters compatible with FHIR R5 specification (v5.0.0):
  - `FhirBundleAdapter`: Convert FHIR Bundle → MedicalCase
    - Parse Bundle entries (Patient, Condition, Observation, Encounter)
    - Validate Bundle structure and resource references
    - Extract resources following FHIR R5 data types
  - `FhirPatientAdapter`: Extract patient data (anonymized)
    - Extract demographics from Patient resource
    - Ensure no PHI is extracted
  - `FhirConditionAdapter`: Extract conditions, ICD-10 codes
    - Extract Condition.code.coding with ICD-10 system (`http://hl7.org/fhir/sid/icd-10`)
    - Extract condition onset, severity, clinical status
  - `FhirEncounterAdapter`: Extract encounter data
    - Extract encounter type, status, class
    - Extract service provider and participants
  - `FhirObservationAdapter`: Extract observation data
    - Extract observation values, codes, effective dates
- Use HAPI FHIR library for FHIR resource parsing and validation
- Write integration tests: `FhirAdapterIT`
  - Test with FHIR-compliant test data
Phase 3: Agent Skills (Week 4)¶
3.1 Agent Skills Setup¶
Reference: See expert-match .claude/skills/ directory structure
and Architecture - Agent Skills
Tasks:
- Create `.claude/skills/` directory structure
- Create 7 skill directories (see Architecture for skill descriptions):
  - `case-analyzer/` - Analyze cases, extract entities, ICD-10 codes, classify urgency and complexity
  - `doctor-matcher/` - Match doctors to cases, scoring and ranking using multiple signals
  - `evidence-retriever/` - Search guidelines, PubMed, GRADE evidence summaries
  - `recommendation-engine/` - Generate clinical recommendations, diagnostic workup, treatment options
  - `clinical-advisor/` - Differential diagnosis, risk assessment
  - `network-analyzer/` - Network expertise analytics, graph-based expert discovery, aggregate metrics
  - `routing-planner/` - Facility routing optimization, multi-facility scoring, geographic routing
- Create `SKILL.md` file in each directory with domain knowledge and tool invocation guidance
- Configure Spring AI Agent Skills in `application.yml`
- Map skills to use cases (see Architecture - API Layer for endpoint-to-skill mapping)
3.2 Java Tool Methods¶
Reference: See expert-match @Tool method patterns
Tasks:
- ✅ Create `MedicalAgentTools` class with `@Tool` methods
- ✅ Implement tools for each skill:
  - ✅ `case-analyzer`: `analyze_case_text()`, `extract_icd10_codes()`, `classify_urgency()`, `determine_required_specialty()`
  - ✅ `doctor-matcher`: `query_candidate_doctors()`, `score_doctor_match()`, `match_doctors_to_case()`
  - ✅ `evidence-retriever`: `search_clinical_guidelines()` (LLM-based), `query_pubmed()` (NCBI E-utilities API)
  - ✅ `recommendation-engine`: `generate_recommendations()` (DIAGNOSTIC, TREATMENT, FOLLOW_UP)
  - ✅ `clinical-advisor`: `differential_diagnosis()`, `risk_assessment()` (COMPLICATION, MORTALITY, READMISSION)
  - ✅ `network-analyzer`: `graph_query_top_experts()`, `aggregate_metrics()` (DOCTOR, CONDITION, FACILITY)
  - ✅ `routing-planner`: `graph_query_candidate_centers()`, `semantic_graph_retrieval_route_score()`
- ✅ Wire tools to services/repositories (FacilityRepository, GraphService, ClinicalExperienceRepository, PubMedService)
- ✅ Write integration tests: `MedicalAgentToolsIT` (comprehensive test coverage for all tools)
3.3 MedicalAgentService¶
Location: src/main/java/com/berdachuk/medexpertmatch/llm/service/MedicalAgentService.java
Tasks:
- Create `MedicalAgentService` interface
- Implement `MedicalAgentServiceImpl` with Spring AI ChatClient
- Implement agent orchestration:
  - Load skills from `.claude/skills/`
  - Select skills based on intent
  - Invoke tools via skills
  - Format responses
- Write integration tests: `MedicalAgentServiceIT`
3.4 Agent API Endpoints¶
Location: src/main/java/com/berdachuk/medexpertmatch/llm/rest/MedicalAgentController.java
Reference: See Architecture - API Layer and Use Cases for endpoint details
Tasks:
- Create REST controller for agent endpoints
- Implement endpoints (see Use Cases for sequence diagrams):
  - `POST /api/v1/agent/match/{caseId}` - Specialist matching (Use Cases 1 & 2)
    - Skills: case-analyzer, doctor-matcher
    - UI Page: `/match`
  - `POST /api/v1/agent/prioritize-consults` - Queue prioritization (Use Case 3)
    - Skills: case-analyzer
    - UI Page: `/queue`
  - `POST /api/v1/agent/network-analytics` - Network analytics (Use Case 4)
    - Skills: network-analyzer
    - UI Page: `/analytics`
  - `POST /api/v1/agent/analyze-case/{caseId}` - Case analysis (Use Case 5)
    - Skills: case-analyzer, evidence-retriever, recommendation-engine
    - UI Page: `/analyze/{caseId}`
  - `POST /api/v1/agent/recommendations/{matchId}` - Expert recommendations (Use Case 5)
    - Skills: doctor-matcher
    - UI Page: `/analyze/{caseId}`
  - `POST /api/v1/agent/route-case/{caseId}` - Regional routing (Use Case 6)
    - Skills: case-analyzer, routing-planner
    - UI Page: `/routing`
- Write integration tests: `MedicalAgentControllerIT`
- Test each endpoint against the corresponding use case workflow
Phase 4: Test Data Generator (Week 4)¶
4.1 TestDataGeneratorService¶
Reference: See expert-match src/main/java/com/berdachuk/expertmatch/ingestion/service/TestDataGenerator.java
and FHIR R5 specification (v5.0.0)
Location: src/main/java/com/berdachuk/medexpertmatch/ingestion/service/TestDataGenerator.java
FHIR Compliance: All test data must be compatible with FHIR R5 specification (v5.0.0) to ensure interoperability and realistic testing scenarios.
Tasks:
- Create `TestDataGenerator` service
- Use Datafaker library for realistic synthetic data
- Implement FHIR-compliant data generation:
  - Generate FHIR resources (Patient, Condition, Observation, Encounter, Practitioner, Organization)
  - Create FHIR Bundles containing multiple resources
  - Ensure all resources conform to FHIR R5 data types and structure
  - Use valid FHIR resource IDs and references
  - Anonymize patient data (no PHI in test data)
- Implement methods:
  - `generateTestData(String size, boolean clear)`
  - `generateDoctors(int count)` - Creates FHIR Practitioner resources
  - `generateMedicalCases(int count)` - Creates FHIR Bundles (Patient, Condition, Observation, Encounter)
  - `generateClinicalExperiences(int doctorCount, int casesPerDoctor)`
  - `generateFhirBundles(int count)` - Generate FHIR-compliant bundles for test cases
  - `generateEmbeddings()` - Generate embeddings from FHIR resources
  - `buildGraph()` - Build graph relationships from database data (✅ Implemented - automatically called after data generation)
  - `clearTestData()` - Clear all test data
- Support data sizes: tiny, small, medium, large, huge
- Generate medical-specific FHIR-compliant data:
  - FHIR Patient: Anonymized demographics (age, gender, no identifiers)
  - FHIR Practitioner: Doctor/specialist information (name, qualifications, specialties)
  - FHIR Condition: Medical conditions with ICD-10 codes (using `Condition.code.coding` with ICD-10 system)
  - FHIR Observation: Clinical observations, vital signs, lab results
  - FHIR Encounter: Healthcare encounters (inpatient, outpatient, telehealth types)
  - FHIR Organization: Healthcare facilities and organizations
  - FHIR Bundle: Container for multiple resources with proper resource references
- Ensure FHIR resource references are valid:
  - Patient references in Condition (`Condition.subject`)
  - Encounter references in Observation (`Observation.encounter`)
  - Practitioner references in Encounter (`Encounter.participant`)
  - Organization references in Encounter (`Encounter.serviceProvider`)
- Write integration tests: `TestDataGeneratorIT`
  - Test FHIR resource generation and validation
Example Structure:
package com.berdachuk.medexpertmatch.ingestion.service;
import net.datafaker.Faker;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;
import org.hl7.fhir.r5.model.*;
import org.hl7.fhir.r5.model.Bundle.BundleEntryComponent;
@Slf4j
@Service
public class TestDataGenerator {
private static final String[] MEDICAL_SPECIALTIES = {
"Cardiology", "Oncology", "Neurology", "Emergency Medicine",
"Internal Medicine", "Surgery", "Pediatrics", "Psychiatry"
};
private static final String[] ICD10_CODES = {
"I21.9", "C50.9", "G93.1", "J44.0", "E11.9"
};
private static final String FHIR_ICD10_SYSTEM = "http://hl7.org/fhir/sid/icd-10";
private final Faker faker = new Faker();
public void generateTestData(String size, boolean clear) {
// Implementation
}
public void generateDoctors(int count) {
// Generate FHIR Practitioner resources with Datafaker
for (int i = 0; i < count; i++) {
Practitioner practitioner = new Practitioner();
practitioner.setId(IdType.newRandomUuid());
// Add name, qualifications, specialties following FHIR R5 structure
}
}
public void generateMedicalCases(int count) {
// Generate FHIR Bundles containing Patient, Condition, Observation, Encounter
for (int i = 0; i < count; i++) {
Bundle bundle = new Bundle();
bundle.setType(Bundle.BundleType.COLLECTION);
// Create Patient (anonymized)
Patient patient = createAnonymizedPatient();
bundle.addEntry().setResource(patient);
// Create Condition with ICD-10 code
Condition condition = createCondition(patient.getId());
bundle.addEntry().setResource(condition);
// Create Encounter
Encounter encounter = createEncounter(patient.getId());
bundle.addEntry().setResource(encounter);
// Create Observations
Observation observation = createObservation(patient.getId(), encounter.getId());
bundle.addEntry().setResource(observation);
}
}
private Patient createAnonymizedPatient() {
Patient patient = new Patient();
patient.setId(IdType.newRandomUuid());
// Add anonymized demographics (age, gender, no identifiers)
return patient;
}
private Condition createCondition(String patientId) {
Condition condition = new Condition();
condition.setId(IdType.newRandomUuid());
condition.getSubject().setReference("Patient/" + patientId);
// Add ICD-10 code using FHIR coding structure
Coding coding = new Coding();
coding.setSystem(FHIR_ICD10_SYSTEM);
coding.setCode(faker.options().option(ICD10_CODES));
condition.getCode().addCoding(coding);
return condition;
}
private Encounter createEncounter(String patientId) {
Encounter encounter = new Encounter();
encounter.setId(IdType.newRandomUuid());
encounter.getSubject().setReference("Patient/" + patientId);
encounter.setStatus(Encounter.EncounterStatus.FINISHED);
// R5: Encounter.class is a list of CodeableConcept (R4 used a single Coding)
encounter.addClass_(new CodeableConcept(new Coding("http://terminology.hl7.org/CodeSystem/v3-ActCode", "IMP", "inpatient encounter")));
return encounter;
}
private Observation createObservation(String patientId, String encounterId) {
Observation observation = new Observation();
observation.setId(IdType.newRandomUuid());
observation.getSubject().setReference("Patient/" + patientId);
observation.getEncounter().setReference("Encounter/" + encounterId);
observation.setStatus(Observation.ObservationStatus.FINAL);
// Add observation value, code, etc.
return observation;
}
}
FHIR Library: Use HAPI FHIR library for Java to create and validate FHIR resources:
<dependency>
<groupId>ca.uhn.hapi.fhir</groupId>
<artifactId>hapi-fhir-structures-r5</artifactId>
<version>7.0.0</version>
</dependency>
Note: HAPI FHIR R5 support may require version 7.0.0 or later. Check HAPI FHIR releases for latest R5-compatible version.
4.2 TestDataController¶
Location: src/main/java/com/berdachuk/medexpertmatch/ingestion/rest/TestDataController.java
Tasks:
- Create REST controller for test data generation
- Implement endpoints:
  - `POST /api/v1/test-data/generate?size={size}&clear={clear}`
  - `POST /api/v1/test-data/generate-embeddings`
  - `POST /api/v1/test-data/build-graph`
  - `POST /api/v1/test-data/generate-complete-dataset?size={size}&clear={clear}`
- Write integration tests: `TestDataControllerIT`
Phase 5: UI Layer (Week 5)¶
5.1 Thymeleaf Setup¶
Reference: See expert-match Thymeleaf implementation patterns and UI Flows and Mockups for wireframe mockups
Tasks:
- Add `spring-boot-starter-thymeleaf` dependency
- Create template structure:
  - `src/main/resources/templates/fragments/layout.html`
  - `src/main/resources/templates/fragments/header.html`
  - `src/main/resources/templates/fragments/footer.html`
  - `src/main/resources/templates/index.html` (Home Page - see UI Mockup)
  - `src/main/resources/templates/match.html` (Find Specialist - see UI Mockup)
  - `src/main/resources/templates/queue.html` (Consultation Queue - see UI Mockup)
  - `src/main/resources/templates/analytics.html` (Network Analytics - see UI Mockup)
  - `src/main/resources/templates/analyze.html` (Case Analysis - see UI Mockup)
  - `src/main/resources/templates/routing.html` (Regional Routing - see UI Mockup)
  - `src/main/resources/templates/doctors/{doctorId}.html` (Doctor Profile - see UI Mockup)
  - `src/main/resources/templates/admin/test-data.html` (Synthetic Data - see UI Mockup)
- Create static resources: `src/main/resources/static/css/`, `static/js/`
- Follow wireframe mockups from UI Flows and Mockups for visual layout
- Implement UI/UX guidelines: color scheme, typography, spacing, accessibility (see UI/UX Guidelines)
5.2 Web Controllers¶
Location: src/main/java/com/berdachuk/medexpertmatch/web/controller/
Reference: See UI Flows and Mockups for user flows and form requirements
Tasks:
- Create `@Controller` classes (not `@RestController`)
- Implement controllers:
  - `HomeController` - Home page (`/`) - Dashboard with navigation and stats
  - `MatchController` - Find Specialist (`/match`) - Use Cases 1 & 2
  - `QueueController` - Consultation Queue (`/queue`) - Use Case 3
  - `AnalyticsController` - Network Analytics (`/analytics`) - Use Case 4
  - `CaseAnalysisController` - Case Analysis (`/analyze/{caseId}`) - Use Case 5
  - `RoutingController` - Regional Routing (`/routing`) - Use Case 6
  - `DoctorController` - Doctor Profile (`/doctors/{doctorId}`)
  - `TestDataController` - Test Data Generator (`/admin/test-data`) - Admin UI
- Return template names and use `Model` to pass data
- Implement user flows as documented in UI Flows and Mockups
- Follow form field requirements from PRD Section 7.2
- Write integration tests: `*ControllerIT`
Phase 6: Integration & Testing (Week 5-6)¶
6.1 Integration Testing¶
Reference: See Use Cases for detailed sequence diagrams and workflows
Tasks:
- Write integration tests for all use cases
- Test complete workflows (see Use Cases for sequence diagrams):
  - Use Case 1: Specialist Matching - `POST /api/v1/agent/match/{caseId}`
  - Use Case 2: Second Opinion - `POST /api/v1/agent/match/{caseId}`
  - Use Case 3: Queue Prioritization - `POST /api/v1/agent/prioritize-consults`
  - Use Case 4: Network Analytics - `POST /api/v1/agent/network-analytics`
  - Use Case 5: Decision Support - `POST /api/v1/agent/analyze-case/{caseId}` and `POST /api/v1/agent/recommendations/{matchId}`
  - Use Case 6: Regional Routing - `POST /api/v1/agent/route-case/{caseId}`
- Test agent skills integration (7 skills: case-analyzer, doctor-matcher, evidence-retriever, recommendation-engine, clinical-advisor, network-analyzer, routing-planner)
- Test FHIR adapter integration
- Test the test data generator
- Test UI flows (see UI Flows and Mockups - User Flow Diagrams)
6.2 Performance Optimization¶
Tasks:
- Optimize database queries
- Add indexes for vector search
- Implement caching where appropriate
- Optimize graph queries
- Performance testing
6.3 Demo Preparation¶
Reference: See PRD Section 4.2.3 for test data generator requirements
Tasks:
- Generate demo dataset (medium size: 500 doctors, 1000 cases) using test data generator
- Pre-populate embeddings for all entities
- Build graph relationships in Apache AGE
- Create demo scenarios for each use case (see Use Cases):
- Use Case 1: Specialist Matching - Demo with complex inpatient case
- Use Case 2: Second Opinion - Demo with telehealth-enabled doctors
- Use Case 3: Queue Prioritization - Demo with multiple urgency levels
- Use Case 4: Network Analytics - Demo with ICD-10 code I21.9
- Use Case 5: Decision Support - Demo with differential diagnosis
- Use Case 6: Regional Routing - Demo with facility routing
- Prepare demo documentation
- Verify UI flows work correctly (see UI Flows and Mockups)
- Test all 8 UI pages with demo data
Docker Container Management¶
Development Database¶
Start Development Database:
Verify:
```shell
docker ps | grep medexpertmatch-postgres-dev
docker exec -it medexpertmatch-postgres-dev psql -U medexpertmatch -d medexpertmatch -c "SELECT * FROM pg_extension WHERE extname IN ('vector', 'age');"
```
Demo Database¶
Start Demo Database:
Generate Demo Data:
```shell
# After starting the application
curl -X POST "http://localhost:8080/api/v1/test-data/generate-complete-dataset?size=medium&clear=true"
```
Test Database (Testcontainers)¶
Build Test Container:
Run Tests:
Module Implementation Order¶
Week 1: Foundation Modules¶
- core - Shared infrastructure
- doctor - Doctor domain model and repository
- medicalcase - Medical case domain model and repository
- medicalcoding - ICD-10 codes domain model
- clinicalexperience - Clinical experience domain model and repository
Week 2: Infrastructure Modules¶
- embedding - Vector embedding generation
- graph - Apache AGE graph management
- retrieval - Hybrid GraphRAG retrieval (SemanticGraphRetrievalService, MatchingService)
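The hybrid retrieval module combines several signals into a single ranking score. A minimal sketch of such a weighted combination is shown below; the class name, signal names, and weights are illustrative assumptions, not the project's actual `SemanticGraphRetrievalService` formula:

```java
// Hypothetical weighted-score combination for hybrid GraphRAG retrieval.
// The real scoring (vector + graph + historical performance) may use a
// different formula; the weights here are assumptions for illustration.
class HybridScoreSketch {
    static final double W_VECTOR = 0.5;
    static final double W_GRAPH = 0.3;
    static final double W_HISTORY = 0.2;

    // Each input signal is assumed to be normalized to [0, 1], so the
    // combined score also stays in [0, 1].
    static double score(double vectorSimilarity, double graphScore, double historicalPerformance) {
        return W_VECTOR * vectorSimilarity
                + W_GRAPH * graphScore
                + W_HISTORY * historicalPerformance;
    }
}
```

Keeping the weights as named constants makes it easy to tune the balance between semantic similarity, graph proximity, and past outcomes during Phase 6 performance work.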
Week 3: Processing Modules¶
- query - Query processing and case analysis (CaseAnalysisService)
- llm - LLM orchestration and agent skills (MedicalAgentService)
Week 4: Integration Modules¶
- ingestion - Data ingestion and test data generator
- chat - Chat conversation management (if needed)
Week 5: UI Module¶
- web - Thymeleaf UI controllers and templates
Testing Strategy¶
Unit Tests¶
- Purpose: Test pure logic, algorithms, utilities
- Naming: `*Test.java` or `*Tests.java`
- Location: `src/test/java/.../...Test.java`
- Examples: `CaseAnalysisServiceTest`, `SgrServiceTest`
Integration Tests¶
- Purpose: Test complete workflows with real database
- Naming: `*IT.java` or `*ITCase.java`
- Location: `src/test/java/.../...IT.java`
- Base Class: Extend `BaseIntegrationTest`
- Examples: `DoctorRepositoryIT`, `MedicalAgentServiceIT`, `UseCase1IT`
Test Data Strategy¶
- Unit Tests: Use mocks
- Integration Tests: Use Testcontainers with custom image
- Demo: Use separate demo database container
- Test Data Generator: Use Datafaker for realistic synthetic data
Code Examples from expert-match¶
Repository Pattern¶
Reference: `expert-match/src/main/java/com/berdachuk/expertmatch/employee/repository/impl/EmployeeRepositoryImpl.java`
Pattern:
- Interface in `repository/` package
- Implementation in `repository/impl/` package
- RowMapper in `repository/impl/jdbc/` package
- SQL files in `src/main/resources/sql/`
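The interface/implementation split can be sketched as follows. In the real project the implementation wraps Spring's `JdbcTemplate` with a `RowMapper` and SQL loaded from `src/main/resources/sql/`; an in-memory map stands in here so the sketch stays self-contained, and all names are illustrative:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative domain record; the real Doctor model has more fields.
record Doctor(String id, String name, String specialty) {}

// Interface lives in the repository/ package.
interface DoctorRepository {
    Optional<Doctor> findById(String id);

    void save(Doctor doctor);
}

// Implementation lives in repository/impl/. A HashMap replaces the
// JdbcTemplate + RowMapper pair purely to keep the sketch runnable.
class InMemoryDoctorRepository implements DoctorRepository {
    private final Map<String, Doctor> store = new HashMap<>();

    @Override
    public Optional<Doctor> findById(String id) {
        return Optional.ofNullable(store.get(id));
    }

    @Override
    public void save(Doctor doctor) {
        store.put(doctor.id(), doctor);
    }
}
```

Callers depend only on `DoctorRepository`, so the JDBC implementation can be swapped for a test double without touching service code.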
Service Pattern¶
Reference: expert-match/src/main/java/com/berdachuk/expertmatch/employee/service/impl/EmployeeServiceImpl.java
Pattern:
- Interface in `service/` package
- Implementation in `service/impl/` package
- Use `@Transactional` on service methods
- Inject repository interfaces, not implementations
Test Data Generator Pattern¶
Reference: expert-match/src/main/java/com/berdachuk/expertmatch/ingestion/service/TestDataGenerator.java
Pattern:
- Use Datafaker for realistic synthetic data
- Support multiple data sizes
- Generate embeddings and graph relationships
- Provide REST API for data generation
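A dependency-free sketch of that generator shape is shown below. `java.util.Random` stands in for Datafaker so the example runs on its own; only the medium size (500 doctors) comes from this plan, and the other counts, class names, and specialty list are illustrative assumptions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the Datafaker-based test data generator.
class TestDataGeneratorSketch {
    // Doctor counts per size; MEDIUM(500) matches this plan's demo
    // dataset, the others are assumptions.
    enum DataSize {
        TINY(10), SMALL(50), MEDIUM(500);

        final int doctorCount;

        DataSize(int doctorCount) {
            this.doctorCount = doctorCount;
        }
    }

    static List<String> generateDoctors(DataSize size, long seed) {
        String[] specialties = {"cardiology", "neurology", "oncology", "radiology"};
        Random random = new Random(seed); // seeded so demo data is reproducible
        List<String> doctors = new ArrayList<>();
        for (int i = 0; i < size.doctorCount; i++) {
            doctors.add("doctor-" + i + ":" + specialties[random.nextInt(specialties.length)]);
        }
        return doctors;
    }
}
```

Seeding the generator is the key design choice: the same seed yields the same synthetic dataset, which keeps demo scenarios and integration tests repeatable.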
Key Implementation Guidelines¶
TDD Approach¶
- Write Test First: Create test before implementation
- Run Test: Verify it fails (red phase)
- Implement Feature: Write minimal code to make test pass (green phase)
- Refactor: Improve code while keeping tests green
- Verify: Always verify tests pass before moving to next task
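As a tiny illustration of that red-green loop (plain Java assertions stand in for JUnit so the sketch stays dependency-free; `UrgencyScorer` and its thresholds are hypothetical, loosely inspired by the queue-prioritization use case):

```java
// Hypothetical TDD example for a queue-prioritization helper.
// Red phase: the assertions in the test were written first against a
// stub that returned null, and they failed. Green phase: this minimal
// implementation is just enough code to make them pass; refactoring
// happens afterwards with the tests kept green.
class UrgencyScorer {
    static String classify(int urgencyScore) {
        if (urgencyScore >= 8) return "EMERGENCY";
        if (urgencyScore >= 5) return "URGENT";
        return "ROUTINE";
    }
}
```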
Code Style¶
- Follow `.cursorrules` guidelines
- Use Lombok annotations (`@Slf4j`, `@Data`, `@Builder`, etc.)
- Use interface-based design for services and repositories
- Follow domain-driven module organization
- Use 4 spaces for indentation
- Maximum 120 characters per line
Database Testing¶
- Always use Testcontainers: Never use H2 or in-memory databases
- Use custom test container: `medexpertmatch-postgres-test:latest`
- Extend `BaseIntegrationTest`: All database tests should extend this
- Clear data in @BeforeEach: Always clear relevant tables before creating test data
Success Criteria¶
Phase 1 Completion¶
- ✅ All domain models created (Doctor, MedicalCase, ClinicalExperience, ICD10Code, MedicalSpecialty, Facility)
- ✅ Database schema implemented with PgVector and Apache AGE
- ✅ All repositories implemented with tests
- ✅ Docker containers set up and working (dev, demo, test)
Phase 2 Completion¶
- ✅ Core services implemented (MatchingService, SemanticGraphRetrievalService, GraphService, CaseAnalysisService)
- ✅ MedGemma integration working (via OpenAI-compatible providers)
- ✅ SemanticGraphRetrievalService scoring implemented (vector + graph + historical performance)
- ✅ GraphService queries working (Apache AGE Cypher queries)
- ✅ FHIR adapters implemented
Phase 3 Completion¶
- ✅ All 7 agent skills created (case-analyzer, doctor-matcher, evidence-retriever, recommendation-engine, clinical-advisor, network-analyzer, routing-planner)
- ✅ Java tools implemented (all `@Tool` methods)
- ✅ Agent orchestration working (MedicalAgentService)
- ✅ All 6 API endpoints implemented and tested
Phase 4 Completion¶
- ✅ Test data generator implemented (supports tiny, small, medium, large, huge sizes)
- ✅ Demo data generated (medium: 500 doctors, 1000 cases)
- ✅ Embeddings generated for all entities
- ✅ Graph relationships built in Apache AGE
Phase 5 Completion¶
- ✅ Thymeleaf UI implemented (8 pages)
- ✅ All pages created matching wireframe mockups (see UI Flows and Mockups)
- ✅ User flows working (see UI Flows and Mockups - User Flow Diagrams)
- ✅ UI/UX guidelines implemented (color scheme, typography, spacing, accessibility)
Phase 6 Completion¶
- ✅ All integration tests passing (all 6 use cases tested)
- ✅ Performance optimized (sub-second matching, < 100ms vector search, < 500ms graph queries)
- ✅ Demo ready (demo scenarios for all use cases)
- ✅ Documentation complete (PRD, Architecture, Use Cases, UI Flows, Implementation Plan)
Implementation Summary¶
All 6 phases completed successfully! ✅
The MVP is fully implemented with:
- 134 Java source files
- 25 integration test files (219 tests)
- 61 agent tools implemented
- All 6 primary use cases functional
- All 7 agent skills operational
- Complete test coverage
See the codebase and test coverage for detailed metrics and implementation patterns.
Next Steps (Future Enhancements)¶
- Performance Optimization:
    - Caching for frequently accessed data
    - Graph query optimization
    - Batch embedding generation optimization
- UI Enhancements:
    - Real-time updates via WebSocket
    - Advanced filtering and search
    - Visualization improvements
- Feature Additions:
    - Multi-language support
    - Advanced analytics
    - Export capabilities
Related Documentation¶
For detailed information on specific aspects of the implementation:
- Implementation status - Current implementation status, metrics, patterns, and known limitations (see codebase)
- Product Requirements Document - Complete product requirements, functional requirements, UI pages, and API specifications
- Architecture - System architecture, module structure, service layer, API layer, and agent skills
- Use Cases - Detailed use case workflows with sequence diagrams showing API, agent, skill, and service interactions
- UI Flows and Mockups - User interface wireframes (PlantUML Salt), user flows, form mockups, and UI/UX guidelines
- Vision - Project vision, value propositions, success metrics, and long-term goals