Generate simulated lab data
This function creates a synthetic dataset containing a subset of the lab tests found in the GEMINI "lab" table, as described in the GEMINI Data Repository Dictionary. It currently simulates two lab tests, hemoglobin and sodium, because they are often used to identify routine blood work: complete blood count (CBC) and electrolyte tests, respectively. The returned data table is in long format and contains the collection date and time, the test type, the test code, and the test result value.
Usage
dummy_lab_cbc_electrolyte(
nid = 1000,
n_hospitals = 10,
time_period = c(2015, 2023),
cohort = NULL,
seed = NULL
)
Arguments
- nid
(integer) Number of unique encounter IDs to simulate. Because the output is in long format, each ID can occur in multiple rows. It is ignored if cohort is provided.
- n_hospitals
(integer) Number of hospitals in the simulated dataset. It is ignored if cohort is provided.
- time_period
(numeric) Date range of the data, given either as years or as specific dates, in one of these formats: ("yyyy-mm-dd", "yyyy-mm-dd") or (yyyy, yyyy). It is ignored if cohort is provided.
- cohort
(data.frame or data.table) Optional, a data frame or data table with columns:
  - genc_id (integer): Mock encounter ID numbers
  - hospital_num (integer): Mock hospital ID numbers
  - admission_date_time (character): Date and time of IP admission in YYYY-MM-DD HH:MM format
  - discharge_date_time (character): Date and time of IP discharge in YYYY-MM-DD HH:MM format
When cohort is not NULL, nid, n_hospitals, and time_period are ignored.
- seed
(integer) Optional, a number for setting the seed to get reproducible results.
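The cohort argument described above expects four specific columns. As a minimal sketch, the following builds a small cohort by hand in base R and checks it before passing it on. The object name my_cohort and all of its values are hypothetical, and the final call is guarded so that it only runs when dummy_lab_cbc_electrolyte() is already available in the session.

```r
# Hypothetical three-encounter cohort with the four documented columns.
my_cohort <- data.frame(
  genc_id = 1:3,                       # integer encounter IDs
  hospital_num = c(1L, 1L, 2L),        # integer hospital IDs
  admission_date_time = c("2020-01-01 08:30", "2020-02-10 14:05", "2021-06-15 22:45"),
  discharge_date_time = c("2020-01-05 11:00", "2020-02-12 09:20", "2021-06-20 16:10"),
  stringsAsFactors = FALSE
)

# Parse the documented YYYY-MM-DD HH:MM format to compare the two timestamps.
parse_dt <- function(x) as.POSIXct(x, format = "%Y-%m-%d %H:%M", tz = "UTC")

# Sanity checks: column types match the documentation and every
# discharge happens after the corresponding admission.
stopifnot(
  is.integer(my_cohort$genc_id),
  is.integer(my_cohort$hospital_num),
  is.character(my_cohort$admission_date_time),
  is.character(my_cohort$discharge_date_time),
  all(parse_dt(my_cohort$discharge_date_time) > parse_dt(my_cohort$admission_date_time))
)

# Only call the simulator if it is loaded; nid, n_hospitals, and
# time_period are ignored because cohort is supplied.
if (exists("dummy_lab_cbc_electrolyte", mode = "function")) {
  lab <- dummy_lab_cbc_electrolyte(cohort = my_cohort, seed = 1)
}
```

Checking the cohort up front is a design choice for this sketch, not something the function is documented to require; it simply makes format problems visible before simulation.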
Value
(data.table)
A data.table object similar to the "lab" table that contains the following fields:
- genc_id (integer): Mock encounter ID; integers starting from 1 or as seen in cohort
- hospital_num (integer): Mock hospital ID; integers starting from 1 or as seen in cohort
- test_type_mapped_omop (character): Test name and code mapped by GEMINI; currently two tests are available: 3000963 (hemoglobin) and 3019550 (sodium)
- test_name_raw (character): Test name as reported by hospital
- test_code_raw (character): Test code as reported by hospital
- result_value (character): Test results
- collection_date_time (character): Date and time when the sample was collected
Examples
dummy_lab_cbc_electrolyte(nid = 10, n_hospitals = 1, seed = 1)
#> genc_id hospital_num collection_date_time test_type_mapped_omop
#> <int> <int> <char> <num>
#> 1: 1 1 2016-11-06 09:22 3000963
#> 2: 1 1 2016-11-08 08:07 3019550
#> 3: 1 1 2016-11-08 10:25 3000963
#> 4: 1 1 2016-11-06 06:28 3019550
#> 5: 1 1 2016-11-07 09:05 3019550
#> ---
#> 149: 10 1 2021-12-29 05:36 3000963
#> 150: 10 1 2021-12-29 11:06 3019550
#> 151: 10 1 2021-12-29 09:58 3000963
#> 152: 10 1 2021-12-29 03:43 3019550
#> 153: 10 1 2021-12-28 08:33 3000963
#> test_name_raw test_code_raw result_value
#> <char> <char> <char>
#> 1: CBC <NA> 84
#> 2: SODIUM <NA> 139
#> 3: HEMOGLOBIN <NA> 81
#> 4: SODIUM <NA> 135
#> 5: SODIUM <NA> 138
#> ---
#> 149: HEMOGLOBIN HGB 109
#> 150: SODIUM <NA> 135
#> 151: HEMOGLOBIN Hemoglobin 139
#> 152: SODIUM 140
#> 153: Hemoglobin <NA> 89
dummy_lab_cbc_electrolyte(cohort = dummy_admdad())
#> genc_id hospital_num collection_date_time test_type_mapped_omop
#> <int> <int> <char> <num>
#> 1: 1 9 2018-10-13 18:33 3019550
#> 2: 1 9 2018-10-07 10:14 3019550
#> 3: 1 9 2018-10-10 07:24 3000963
#> 4: 1 9 2018-10-31 19:18 3019550
#> 5: 1 9 2018-10-21 05:35 3000963
#> ---
#> 15896: 1000 1 2020-04-27 07:40 3000963
#> 15897: 1000 1 2020-04-27 06:27 3000963
#> 15898: 1000 1 2020-04-26 11:06 3019550
#> 15899: 1000 1 2020-04-29 03:27 3000963
#> 15900: 1000 1 2020-04-25 11:50 3000963
#> test_name_raw test_code_raw result_value
#> <char> <char> <char>
#> 1: SODIUM 133
#> 2: Sodium,Serum,Plasma 131
#> 3: HEMOGLOBIN HGB 97
#> 4: SODIUM 139
#> 5: HEMOGLOBIN 100.06 125
#> ---
#> 15896: HEMOGLOBIN Hb 70
#> 15897: Hemoglobin <NA> 130
#> 15898: SODIUM NAPL 145
#> 15899: HEMOGLOBIN <NA> 76
#> 15900: HEMOGLOBIN HGB 122
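Because result_value is returned as character, it has to be converted before any arithmetic. As a rough sketch of one way to do that, the following retypes the first five rows of the first example's printed output into a base R data frame (so it runs without the package) and averages the results per test. The names lab, omop_labels, and mean_by_test are illustrative only.

```r
# First five rows of the first example's output, retyped by hand.
lab <- data.frame(
  genc_id = rep(1L, 5),
  test_type_mapped_omop = c(3000963, 3019550, 3000963, 3019550, 3019550),
  result_value = c("84", "139", "81", "135", "138"),
  stringsAsFactors = FALSE
)

# result_value is character, so convert it before doing arithmetic.
lab$result_num <- as.numeric(lab$result_value)

# Label the two OMOP codes listed in the Value section.
omop_labels <- c("3000963" = "hemoglobin", "3019550" = "sodium")
lab$test <- unname(omop_labels[as.character(lab$test_type_mapped_omop)])

# Mean result per test across these five rows.
mean_by_test <- tapply(lab$result_num, lab$test, mean)
print(mean_by_test)
```

The same steps should carry over to the full data.table returned by the function, although real output may need extra handling for missing or non-numeric result values.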