Tutorial - Synthea dataset
Download from Official website
- Download fhir r4 synthea data from their official dataset.
- But in this way we only can get 1000 samples, and we cannot change the samples.
Generate from GitHub project
- Generate fhir r4 bundle data from synthea github project.
- Requirements: Long-Term Support (LTS) Java 11 or 17.
- Clone synthea project
- Notice: every time re-run the generate command, should delete the
buildfolder.
bash
git clone https://github.com/synthetichealth/synthea.git
cd syntheagit clone https://github.com/synthetichealth/synthea.git
cd syntheaGenerate data
Generating the population one at a time...
bash./run_synthea./run_syntheaCommand-line arguments may be provided to specify a state, city, population size, or seed for randomization.
bash./run_synthea [-s seed] [-p populationSize] [state [city]]./run_synthea [-s seed] [-p populationSize] [state [city]]
Other options
bash
./run_synthea -h
> Task :run
Usage: run_synthea [options] [state [city]]
Options: [-s seed] [-cs clinicianSeed] [-p populationSize]
[-ps singlePersonSeed]
[-r referenceDate as YYYYMMDD]
[-e endDate as YYYYMMDD]
[-g gender] [-a minAge-maxAge]
[-o overflowPopulation]
[-c localConfigFilePath]
[-d localModulesDirPath]
[-i initialPopulationSnapshotPath]
[-u updatedPopulationSnapshotPath]
[-t updateTimePeriodInDays]
[-f fixedRecordPath]
[-k keepMatchingPatientsPath]
[--config*=value]
* any setting from src/main/resources/synthea.properties
Examples:
run_synthea Massachusetts
run_synthea Alaska Juneau
run_synthea -s 12345
run_synthea -p 1000
run_synthea -s 987 Washington Seattle
run_synthea -s 21 -p 100 Utah "Salt Lake City"
run_synthea -g M -a 60-65
run_synthea -p 10 --exporter.fhir.export=true
run_synthea --exporter.baseDirectory="./output_tx/" Texas./run_synthea -h
> Task :run
Usage: run_synthea [options] [state [city]]
Options: [-s seed] [-cs clinicianSeed] [-p populationSize]
[-ps singlePersonSeed]
[-r referenceDate as YYYYMMDD]
[-e endDate as YYYYMMDD]
[-g gender] [-a minAge-maxAge]
[-o overflowPopulation]
[-c localConfigFilePath]
[-d localModulesDirPath]
[-i initialPopulationSnapshotPath]
[-u updatedPopulationSnapshotPath]
[-t updateTimePeriodInDays]
[-f fixedRecordPath]
[-k keepMatchingPatientsPath]
[--config*=value]
* any setting from src/main/resources/synthea.properties
Examples:
run_synthea Massachusetts
run_synthea Alaska Juneau
run_synthea -s 12345
run_synthea -p 1000
run_synthea -s 987 Washington Seattle
run_synthea -s 21 -p 100 Utah "Salt Lake City"
run_synthea -g M -a 60-65
run_synthea -p 10 --exporter.fhir.export=true
run_synthea --exporter.baseDirectory="./output_tx/" TexasFHIR dataset
- The FHIR R4 dataset will be automatically generated in the
outputfolder. - The dataset should has two json files contains basic infomations of all fhir resources, if we want load the dataset to FHIR server, we must load these two files.
hospitalInformationxxxx.jsonpractitionerInformationxxxx.json
- Then the dataset should be used to the remain tutorials.