Skip to contents

This function processes the output returned by rxnorm_query() into a frequency table to faciliate validation of RxNorm matches by a Subject Matter Expert (SME). It retrieves previously validated "raw_input ~ rxnorm_match" pairs (fields returned by rxnorm_query()) from the lookup_pharmacy_mapping table to identify RxNorm matches that have already been validated by previous projects. The function additionally applies optional hierarchy filtering and small cell suppression. The returned output facilitates further analyses of validated pharmacy entries for the drug(s) of interest.

When the rxnorm_query() object contains unmatched rows of the pharmacy table, the function returns all columns containing drug-containing columns of unmatched pharmacy rows, together with their historical mappings from previous projects.

section @warning The default version of this function requires database versions drm_cleandb_v4_1_1/h4h_template_v5_0_1 or newer due to the dependency on the lookup_pharmacy_mapping table. However, as a workaround, users can also supply a custom lookup_pharmacy_mapping table (e.g., based on a flat file or, for internal GEMINI staff, the pharmacy master mapping DB). For HPC4Health users working with older database versions: Please reach out to our team via the ticketing system if you require access to the pharmacy mapping lookup table.

Usage

prepare_pharm_for_validation(
  dbcon = NULL,
  rxnorm_res,
  hierarchy = TRUE,
  cell_suppression = TRUE,
  outpath = NULL,
  custom_lookup = NULL
)

Arguments

dbcon

(DBIConnection)
DBI connection to a GEMINI database version drm_cleandb_v4_1_1/ h4h_template_v5_0_1 or newer. This is required to query the lookup_pharmacy_mapping table. Alternatively, users may provide a custom_lookup input instead (see below).

rxnorm_res

(data.frame, data.table, list)
Output returned by rxnorm_query() to be processed for validation. If the RxNorm search returned unmatched rows, this object should be provided as a list containing both matched_rows and unmatched_rows.

hierarchy

(logical)
If TRUE (default), the function applies predefined hierarchy filters to retain only the highest-priority drug-containing "raw_input" per pharmacy row (i.e. hierarchy filtering is applied at the level of individual drug orders). The hierarchy rules prioritize drug information in the following order: med_id_generic_name_raw > med_id_brand_name_raw > med_id_hospital_code_raw > med_id_din > med_id_ndc > iv_component_type. For each pharmacy row (identified by row_num of the pharmacy table), only the highest-priority column that was matched to the drug(s) of interest is retained as the "raw_input" for that row. The hierarchy rules handle potentially redundant or conflicting information of different search columns for the same pharmacy order. It ensures consistent drug-containing information per row and streamlines the validation process. Therefore, it is recommended to set hierarchy to TRUE. Only set to FALSE when applying non-standard hierarchy rules or conducting further detailed investigations.

When the rxnorm_res object contains unmatched rows, hierarchy filters are applied to search for historical mappings. For unmatched rows, hierarchy filters cannot be disabled.

cell_suppression

(logical)
When set to TRUE (default), pharmacy entries associated with less than 6 unique encounters are cell-suppressed and shown as "<6" in the output table.

outpath

(character)
Optional file path for saving the output files. Default is NULL, and no file will be exported to folder. If provided, two files are saved:

  1. an .xlsx file for SME review,

  2. an .RDS file for analysts (This file should strictly be used within GEMINI's HPC4Health/NORA environment and should NEVER be pushed to a GitLab repository). Compared to the file for SMEs, the .RDS file for analysts contains an additional column pharm_row_num with all row_num values in the pharmacy table (as a list) corresponding to a given RxNorm match. The IDs in the pharm_row_num serve as identifiers for analysts to merge the validated frequency table back to the pharmacy table. For example, this can be used to extract individual pharmacy orders that contain the SME-validated drug entry and perform additional filtering (e.g., by order date-time). Note: When an outpath is provided, cell_suppression needs to be set to TRUE.

custom_lookup

(data.frame, data.table)
Optional input allowing users to provide a custom version of the lookup_pharmacy_mapping table (e.g., from a flat file/master mapping DB). Needs to contain the following character columns:

  • search_type: Name of column with RxNorm match (e.g., med_id_generic_name_raw)

  • raw_input: Entry from the pharmacy table RxNorm matched to the drug of interest (e.g., amoxicillin 50 mg/ml)

  • rxnorm_match: Drug of interest the pharmacy entry was matched to (e.g., amoxicillin)

  • drug_group: Broader drug group of matched drug (e.g., antibiotic)

  • project_name: Name of the project drug match was validated for

If a custom_lookup table is provided to this function, the dbcon argument will be ignored and the custom table will take priority over the lookup table in the database.

Value

A list containing two or three data tables:

  • sme: Frequency table for SME validation with columns count, search_type, raw_input, rxnorm_match, drug_group, times_validated, SME_confirm (for SME to confirm each "raw_input ~ rxnorm_match" pair; must provide a logical value TRUE/FALSE), SME_comments (for SME to add comments, e.g. whether the drug should be included in the study, regardless of whether the match itself is validated)

  • analyst: Frequency table for analyst use, with columns count, search_type, raw_input, rxnorm_match, drug_group, times_validated, and pharm_row_num (as a list)

  • unmatched_rows: Frequency table for unmatched pharmacy rows if they exist in the input rxnorm_res. Contains pharmacy columns searched by rxnorm_query() and additionally, historically_mapped_to_drug and historically_mapped_to_drug_group (agnostic to drugs user searched for), SME_mapped_search_type, SME_mapped_drug, SME_mapped_drug_group, SME_comment

Additionally, if outpath is provided, two files are saved to the specified folder:

  • "pharmacy_mapping_for_SME.xlsx" for SME review

  • "pharm_res_INTERNAL_USE_ONLY.rds" for analyst to use post SME validation

Details

The function takes results from rxnorm_query() and generates structured frequency tables of pharmacy data for SME review. Once the validation is complete, analysts can use the output table to obtain all validated pharmacy entries for the drugs of interest These are the key processing steps performed by the function:

  • Normalizes pharmacy data by converting all text values to lowercase, applying ASCII encoding, and trimming off leading and trailing white spaces and periods.

  • For rxnorm_match and drug_group, additional normalization is applied to convert plural words to their singular form.

  • Retrieves previous validation and adds the times_validated column to the returned table indicating how many times each "raw_input ~ rxnorm_match" pair has previously been validated by individual projects (0: never been validated, 1: validated by one project, 2+: validated by two or more projects). The function additionally retrieves information about the broader classification of each drug from previous projects. However, please note that the classification is not standardized and should be used as supplementary information only.

  • Applies hierarchy filtering on search_type such that only drug information at the highest priority is retained for consideration per row of pharmacy table (see details in parameter description of hierarchy).

  • Computes number of unique genc_ids associated with each "search_type ~ raw_input ~ rxnorm_match" entry. Occurrences less than 6 are suppressed to "<6".

  • Generates a frequency table for SME to perform validation for each "raw_input ~ rxnorm_match" pair (note: validation should be agnostic to all other fields). Each row of the frequency table is uniquely identified by row_id. When an outpath is provided, the function saves .xlsx/.rds files to the specified directory.

  • Generates a frequency table for analyst use following SME validation. The table is identical to the SME version but includes an additional column pharm_row_num, which is a list storing row_num values (of the pharmacy table) associated with each "search_type ~ raw_input ~ rxnorm_match" entry. The row_num is an identifier allowing analysts to trace each entry back to the original pharmacy table.

  • If rxnorm_query() was executed with return_unmatched = TRUE, the function performs a secondary search on unmatched rows. The secondary search matches each drug-containing field (i.e. search_type) with existing mappings in the pharmacy_master_mapping database in order of hierarchy med_id_generic_name_raw > med_id_brand_name_raw > med_id_hospital_code_raw > med_id_din > med_id_ndc > iv_component_type. The highest priority match is returned per row, along with the match's corresponding drug_group. The search is not specific to the drug searched by rxnorm_query(). Instead, it is a broad search against all existing drug mappings found in the lookup_pharmacy_mapping table (or user-provided custom lookup table). Users may need to apply filters (e.g. via regex) and manual mapping to identify if any entries in the unmatched rows may contain the drug(s) of interest.

Examples

if (FALSE) { # \dontrun{
drv <- dbDriver("PostgreSQL")
# Connect to GEMINI database
dbcon <- DBI::dbConnect(drv,
  dbname = "db",
  host = "domain_name.ca",
  port = 1234,
  user = getPass("Username: "),
  password = getPass("Password: ")
)
rxnorm_res <- rxnorm_query(
  dbcon,
  drug_name = c("warfarin"),
  return_unmatched = FALSE
)

## Prepare RxNorm output for validation
res <- prepare_pharm_for_validation(
  dbcon = dbcon,
  rxnorm_res = rxnorm_res,
  hierarchy = TRUE,
  cell_suppression = TRUE,
  outpath = "/user_path/"
)

res$sme # frequency table to be validated by SME
res$analyst # frequency table with `pharm_row_num` to be used by analysts after SME validation.

## For internal GEMINI staff only:
## Provide custom mapping lookup table based on master mapping DB
# connect to master mapping DB
pharm_db <- DBI::dbConnect(drv,
  dbname = "pharmacy_mapping",
  host = "domain_name.ca",
  port = 1234,
  user = getPass("Username: "),
  password = getPass("Password: ")
)
# query master mapping table
pharm_mapping <- dbGetQuery(
  pharm_db, "SELECT * FROM pharmacy_master_mapping;"
) %>%
  data.table()

res <- prepare_pharm_for_validation(
  rxnorm_res = rxnorm_res,
  custom_lookup = pharm_mapping # no `dbcon` input required
)
} # }