Developing a shared sepsis data infrastructure: a systematic review and concept map to FHIR

“There is an urgent need to improve diagnostic excellence in sepsis [1]. One potential solution is the development of a flexible, scalable, and interoperable data infrastructure to screen and identify sepsis patients across health systems using real-time, granular clinical data available within electronic health records (EHRs). Although clinical data for millions of sepsis or pre-sepsis patients are routinely collected in EHRs, the healthcare enterprise has accessed only a small fraction of these data to improve the clinical understanding of sepsis. A sepsis data backbone could be extended to meet diverse needs, including sepsis translational research, clinical care delivery, machine learning/artificial intelligence deployment, disease surveillance, quality improvement, and health policy. Furthermore, once established in sepsis, a similar data infrastructure could be tailored to include other conditions. For example, awareness of the potential value of this data infrastructure has been magnified by the COVID-19 pandemic (many COVID-19 inpatients met sepsis criteria with systemic inflammation, organ failure, and a high risk of mortality), highlighting the critical need for real-time data interoperability that could inform pandemic response, health system preparedness, translational research, and clinical care. [..]

Here, we discuss phase 1 of Sepsis on FHIR. We first conducted a systematic review of the sepsis literature and identified a comprehensive set of relevant EHR clinical variables as a sharable set of features across sites. We then produced a concept mapping of clinical variables to FHIR resources. [..]

We identified 788 clinical variables from the 55 manuscripts. Recorded data elements included clinical measurements (e.g., heart rate, respiratory rate, white blood cell count), infectious signs and symptoms (e.g., dysuria, abdominal pain), and individual ICD diagnosis and procedure codes, Current Procedural Terminology (CPT) codes, and diagnosis-related group (DRG) codes for sepsis, septicemia, and sepsis syndromes, organ dysfunction (e.g., acute kidney injury, hypotension), and infection (e.g., sepsis due to anaerobes, candidal sepsis).

After the removal of duplicates and unstructured variables, 151 unique clinical variables remained. Variables represented 7 broad domains: patient characteristics, vital signs and laboratory tests, intervention data (e.g., medication administration, diagnostic tests, catheterization, surgical events), fluid balance, location information, healthcare use and outcomes, and administrative/billing codes. All major sepsis consensus definitions were represented by the selected clinical variables, capturing the evolution of sepsis definitions over time. [..]

We mapped the 151 clinical variables to their corresponding FHIR resources and, where appropriate, to specific code values that represented each element. Variables mapped most frequently to the FHIR DiagnosticReport resource (41%), followed by the Observation (33%), Patient (12%), Encounter (9%), and Procedure (5%) resources.

To promote flexibility, we linked variables to several FHIR resource types whenever possible. For example, the variable “serum sodium” is represented as a FHIR Observation resource where the Observation.code data element is mapped to the Logical Observation Identifiers Names and Codes (LOINC) code 2951-2 (referring to “Sodium [Moles/volume] in Serum or Plasma”). LOINC is the most commonly used international healthcare terminology standard, and it describes a reference set of health data and codes for laboratory and clinical observations. FHIR mappings frequently link to more than one LOINC code. For example, blood urea nitrogen (BUN) mapped to both LOINC codes 3094-0 (BUN SerPl-mCnc) and 6299-2 (BUN Bld-mCnc). For variables linked to more than one code, we established an HL7 FHIR-based sepsis value set to represent a set of codes with the same clinical meaning. Most variables (99 out of 151) were FHIR Observation resources with direct mappings to LOINC codes. In contrast, the variable “Admit Time” mapped directly to an existing FHIR data element for healthcare encounter time stamps (Encounter.period.start). [..]
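The mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the LOINC codes and the Encounter.period.start element come from the text, while the helper names (make_observation, in_value_set, admit_time), the value set URL, and the sample measurements are hypothetical.

```python
LOINC_SYSTEM = "http://loinc.org"

# Code quoted in the text for the "serum sodium" variable.
SERUM_SODIUM = {
    "system": LOINC_SYSTEM,
    "code": "2951-2",
    "display": "Sodium [Moles/volume] in Serum or Plasma",
}

# Hypothetical value set grouping the two BUN codes named in the text.
# The URL is a placeholder, not a published sepsis value set.
BUN_VALUE_SET = {
    "resourceType": "ValueSet",
    "url": "http://example.org/fhir/ValueSet/sepsis-bun",
    "status": "draft",
    "compose": {
        "include": [{
            "system": LOINC_SYSTEM,
            "concept": [{"code": "3094-0"}, {"code": "6299-2"}],
        }]
    },
}


def make_observation(coding, value, unit, patient_ref):
    """Build a minimal FHIR Observation as a plain dict."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {"coding": [coding]},  # Observation.code carries the LOINC code
        "subject": {"reference": patient_ref},
        "valueQuantity": {"value": value, "unit": unit},
    }


def in_value_set(observation, value_set):
    """Return True if any coding on the Observation is listed in the value set."""
    allowed = {
        (include["system"], concept["code"])
        for include in value_set["compose"]["include"]
        for concept in include.get("concept", [])
    }
    return any(
        (coding.get("system"), coding.get("code")) in allowed
        for coding in observation["code"]["coding"]
    )


def admit_time(encounter):
    """Read "Admit Time" from Encounter.period.start, or None if absent."""
    return encounter.get("period", {}).get("start")


# Sample values below are invented for illustration.
sodium = make_observation(SERUM_SODIUM, 138, "mmol/L", "Patient/example")
bun = make_observation(
    {"system": LOINC_SYSTEM, "code": "6299-2"}, 14, "mg/dL", "Patient/example"
)
encounter = {"resourceType": "Encounter", "period": {"start": "2022-01-01T08:00:00Z"}}
```

Matching on a value set rather than a single code means a site that reports BUN under either 3094-0 or 6299-2 is recognized as the same clinical variable.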

The development of an automated, flexible infrastructure for sharing sepsis data represents a key step toward achieving diagnostic excellence and improving sepsis outcomes for several reasons. First, compiling high-fidelity health data from millions of patient encounters creates a substantial opportunity to uncover sepsis subgroups. Recent studies have highlighted heterogeneity in treatment effects among specific subgroups, which points to potentially adverse effects of current treatment approaches in some patient subtypes. Thus, establishing representative subtypes and identifying them in real time will be key for initiating clinical trials and targeting treatments. Second, while focused, small repositories of biospecimens exist in sepsis, they cannot be easily appended to detailed clinical data shared across sites to improve their utility and statistical power. Third, as described above, current reporting requirements for the CMS SEP-1 measure are highly resource-intensive and of uncertain benefit for patients. Automated extraction of patient data can improve reporting while also helping to refine the use of policy-driven targets. Fourth, while there are many emerging machine learning and artificial intelligence algorithms designed to improve the prediction of sepsis onset or deterioration, they can be brittle (i.e., predictive performance degrades) when developed in one healthcare context and exported to an external context. A federated learning system built using data from many diverse environments can improve the generalizability, representativeness, and transparency of such algorithms for clinical care. Fifth, as evident in the COVID-19 pandemic, the lack of a national real-time system of interoperable data substantially hampers the identification, mitigation, and eradication of large-scale communicable diseases. Finally, the extension of this platform beyond the United States may allow for standardized international data exchange.”

Full article, EB Brant, JN Kennedy, AJ King, et al. NPJ | Digital Medicine, 2022.4.2