Oct. 25, 2021 – Translating real-world data (RWD) into a standard format that the Food and Drug Administration (FDA) can process, review and archive will pose challenges that are best met with detailed documentation of a sponsor’s data approach, according to the agency’s draft guidance on data standards for drug and biological product submissions containing RWD. The draft guidance was announced Oct. 22 in the Federal Register, and the agency is accepting comments on it until Dec. 21.
The use of RWD to support the approval of a new indication for an approved drug or to satisfy post-approval study requirements was established by the 21st Century Cures Act. The agency also committed to publishing guidance on real-world evidence (RWE) issues under the Prescription Drug User Fee Amendments of 2017.
“FDA recognizes the challenges involved in standardizing study data from RWD sources for inclusion in applicable drug submissions,” the draft guidance states. These challenges include:
- The variety of RWD sources and their inconsistent formats (e.g., electronic health record (EHR), registry);
- The differences in source data captured regionally and globally using different standards, terminologies, and exchange formats for the representation of the same or similar data elements;
- A wide range of methods and algorithms used to create datasets intended to aggregate data; and
- The many aspects of health care data that can affect the overall quality of the data, including business processes and database structure, inconsistent vocabularies and coding systems, and de-identification methodologies used to protect patient data when shared.
The best way to “increase confidence” in the data that emerges during data curation and data transformation is to document and apply the processes utilized, the FDA states in the draft guidance. These processes may include electronic documentation of data additions, deletions or alterations from the source data system to the final study analytic data set(s).
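As a rough illustration of what electronic documentation of additions, deletions and alterations could look like, the Python sketch below compares one source record with its final analytic version and logs every difference. The field names, the `Change` structure and the `diff_record` helper are hypothetical and are not taken from the draft guidance.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Change:
    """One documented difference between source and analytic data (hypothetical structure)."""
    record_id: str
    field_name: str
    action: str      # "add", "delete" or "alter"
    old_value: Any
    new_value: Any
    reason: str


def diff_record(record_id: str, source: Dict[str, Any],
                final: Dict[str, Any], reason: str) -> List[Change]:
    """Return a change log entry for every field that differs between source and final."""
    changes = []
    # Sort field names so the log order is reproducible.
    for name in sorted(set(source) | set(final)):
        if name not in final:
            action = "delete"
        elif name not in source:
            action = "add"
        elif source[name] != final[name]:
            action = "alter"
        else:
            continue  # unchanged field, nothing to document
        changes.append(Change(record_id, name, action,
                              source.get(name), final.get(name), reason))
    return changes


# Illustrative values only.
source = {"sex": "male", "dx": "250.00", "note": "free text"}
final = {"sex": "M", "dx": "250.00", "arm": "A"}
log = diff_record("p1", source, final, "conform to submission standard")
for change in log:
    print(change)
```

A log like this could accompany the analytic data set so that each change remains traceable to the source system.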
“Sponsors should also document any changes to data to conform to the current FDA-supported data standards, and the potential impacts of these changes,” according to the FDA. Sponsors also should discuss planned submission of RWD with the appropriate FDA review division and their approaches for meeting FDA data standards.
“Sponsors should describe these approaches, including in the protocol, data management plan, and/or final study reports,” the draft guidance states. “With adequate documentation of the conformance methods and their rationale,” study data derived from RWD can be transformed to acceptable data standards – the Clinical Data Interchange Standards Consortium’s (CDISC’s) Study Data Tabulation Model (SDTM) – and submitted to the FDA for review.
In its draft guidance, the FDA addresses considerations for mapping RWD to data submission standards. It notes that there is “wide divergence in the terminologies used and their precise meaning between RWD sources and FDA-supported data standards,” even for seemingly simple variables, such as “male/female.” For example, the document states, “sex as a variable may be codified in CDISC’s terminology as a concept based on physical characteristics, whereas EHRs may use gender identity.”
To address these differences, “documentation of the sponsor’s rationale for choosing particular CDISC data elements for RWD and documentation of the differences between the two is critical,” FDA states, further suggesting that the sponsor should provide a description of the general approach and anticipated impact of data mapping as a part of or in an appendix to the Study Data Reviewer’s Guide “to highlight the domains involved.”
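One way to keep the mapping rationale alongside the mapping itself is sketched below in Python. The source values, target codes and rationale text are illustrative assumptions, not values from the draft guidance or from CDISC controlled terminology, and `map_sex` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MappingRule:
    """A single source-to-target mapping with its documented rationale (hypothetical)."""
    source_value: str
    target_value: str
    rationale: str


# Assumed rules for illustration; a real study would define and justify its own.
RULES = {
    "male": MappingRule("male", "M", "Source field records sex; direct match."),
    "female": MappingRule("female", "F", "Source field records sex; direct match."),
}
UNMAPPED_CODE = "U"  # assumed placeholder code for values without a documented rule


def map_sex(source_value: Optional[str]) -> MappingRule:
    """Look up a documented rule; flag anything unmapped instead of guessing."""
    key = (source_value or "").strip().lower()
    rule = RULES.get(key)
    if rule is None:
        return MappingRule(source_value or "", UNMAPPED_CODE,
                           "No documented rule; flagged for manual review.")
    return rule


for value in ["Male", "female", "nonbinary", None]:
    rule = map_sex(value)
    print(repr(value), "->", rule.target_value, "|", rule.rationale)
```

Because every result carries its rationale, the same table could feed the description of the mapping approach that the draft guidance asks sponsors to provide.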
The agency also recommends that sponsors include a data dictionary that documents the definition of every data element used and all relevant information about the element. Technical details should be included in the “Define-XML” file.
When transforming RWD into data consistent with FDA standards, sponsors should consider challenges in the following instances:
- Management of semantic concepts that are present in multiple locations in a health record (e.g., medication information);
- Inconsistent coding or miscoding of concepts (e.g., drugs or diagnoses);
- Changes in data collection or coding practices (e.g., International Classification of Diseases-9) that occurred during the study; or
- Missing information.
All of these cases should be documented and justified, the draft guidance states. The document also provides several examples of mapping healthcare data to SDTM.
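Two of the instances listed above, missing information and a change in coding practice during the study, can be surfaced with a simple check so they can then be documented. The following Python sketch assumes hypothetical field names (`patient_id`, `diagnosis_code`, `coding_system`) and a hypothetical `find_issues` helper; it only flags cases and does not resolve them.

```python
from typing import Any, Dict, List, Optional, Tuple

# Assumed required fields, for illustration only.
REQUIRED_FIELDS = ("patient_id", "diagnosis_code", "coding_system")

Issue = Tuple[Optional[int], str, str]  # (record index or None, issue type, detail)


def find_issues(records: List[Dict[str, Any]]) -> List[Issue]:
    """List missing required values and report whether several coding systems appear."""
    issues: List[Issue] = []
    for index, record in enumerate(records):
        for name in REQUIRED_FIELDS:
            if record.get(name) in (None, ""):
                issues.append((index, "missing", name))
    # More than one coding system suggests coding practice changed during the study.
    systems = {r["coding_system"] for r in records if r.get("coding_system")}
    if len(systems) > 1:
        issues.append((None, "mixed_coding_systems", ", ".join(sorted(systems))))
    return issues


# Illustrative records only.
records = [
    {"patient_id": "p1", "diagnosis_code": "250.00", "coding_system": "ICD-9"},
    {"patient_id": "p2", "diagnosis_code": "", "coding_system": "ICD-10"},
]
issues = find_issues(records)
for issue in issues:
    print(issue)
```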
After reviewing the public comments on the draft guidance, the FDA will issue a final guidance on this issue.