Repository Methods Update¶
Last Updated: 2026-01-19
Status: ✅ COMPLETED
Overview¶
Added findAll() methods to repositories to support database existence checks in the synthetic data generator
optimization.
Changes Made¶
1. ICD10CodeRepository¶
File: src/main/java/com/berdachuk/medexpertmatch/medicalcoding/repository/ICD10CodeRepository.java
Added Method:
Implementation:
src/main/java/com/berdachuk/medexpertmatch/medicalcoding/repository/impl/ICD10CodeRepositoryImpl.java
@Override
public List<ICD10Code> findAll() {
return namedJdbcTemplate.query(findAllSql, Map.of(), icd10CodeMapper);
}
SQL File: src/main/resources/sql/medicalcoding/findAll.sql
SELECT id, code, description, category, parent_code, related_codes
FROM medexpertmatch.icd10_codes
ORDER BY code
2. MedicalSpecialtyRepository¶
Status: ✅ Already existed - no changes needed
File: src/main/java/com/berdachuk/medexpertmatch/doctor/repository/MedicalSpecialtyRepository.java
Existing Method:
/**
* Finds all medical specialties.
*
* @return List of all medical specialties
*/
List<MedicalSpecialty> findAll();
Implementation:
src/main/java/com/berdachuk/medexpertmatch/doctor/repository/impl/MedicalSpecialtyRepositoryImpl.java
@Override
public List<MedicalSpecialty> findAll() {
return namedJdbcTemplate.query(findAllSql, medicalSpecialtyMapper);
}
SQL File: src/main/resources/sql/doctor/medicalspecialty/findAll.sql
SELECT id, name, normalized_name, description, icd10_code_ranges, related_specialties
FROM medexpertmatch.medical_specialties
ORDER BY name
Usage in SyntheticDataGenerator¶
Both findAll() methods are used in the synthetic data generator to check if sufficient data already exists in the
database before making LLM calls:
ICD-10 Codes Check¶
// In generateIcd10Codes()
int existingCount = icd10CodeRepository.findAll().size();
int targetCount = calculateTargetIcd10Count(size);
if(existingCount >=targetCount){
log.
info("Sufficient ICD-10 codes exist in database ({} >= {}), loading existing codes and skipping LLM generation",
existingCount, targetCount);
List<ICD10Code> existingCodes = icd10CodeRepository.findAll();
extendedIcd10Codes =existingCodes.
stream()
.
map(ICD10Code::code)
.
distinct()
.
limit(targetCount)
.
collect(Collectors.toList());
return;
}
Medical Specialties Check¶
// In generateMedicalSpecialties()
int existingCount = medicalSpecialtyRepository.findAll().size();
int targetCount = calculateTargetSpecialtyCount(size);
if(existingCount >=targetCount){
log.
info("Sufficient medical specialties exist in database ({} >= {}), skipping LLM generation",
existingCount, targetCount);
return;
}
Benefits¶
- Reduced LLM Calls: Eliminates unnecessary LLM API calls when data already exists
- Database Reuse: Leverages existing data instead of regenerating
- Performance: Faster generation when data is already available
- Cost Savings: Reduces API costs for repeated generations
Testing¶
Both methods are tested through:
- Integration tests in
SyntheticDataGeneratorIT - Database existence checks in
generateIcd10Codes()andgenerateMedicalSpecialties()