ClinicalTrials.gov

NLM Scrubber: NLM's Software Application to De-identify Clinical Text Documents

The safety and scientific validity of this study is the responsibility of the study sponsor and investigators. Listing a study does not mean it has been evaluated by the U.S. Federal Government. Read our disclaimer for details.
ClinicalTrials.gov Identifier: NCT02795806
Recruitment Status : Enrolling by invitation
First Posted : June 10, 2016
Last Update Posted : May 10, 2019
Sponsor:
National Library of Medicine (NLM)
Collaborators:
National Cancer Institute (NCI)
National Institutes of Health Clinical Center (CC)
Information provided by (Responsible Party):
National Institutes of Health Clinical Center (CC) ( National Library of Medicine (NLM) )

Brief Summary:

Background: Electronic health records contain a vast amount of data about diseases and treatments. Researchers could use these data to test their ideas, but they would need records from more than just their own group of patients. However, access to those records is restricted to protect patient privacy.

The U.S. National Library of Medicine (NLM) has created a computer tool called NLM Scrubber. This program recognizes and deletes personal information from health records. The researchers who developed the program now need access to the original records so that they can see how well it removes personal information from patient records and how they can make it more accurate.

Objectives:

To find ways to improve clinical text de-identification.

Eligibility:

No new participants. Researchers will review data that have already been collected.

Design:

Researchers will collect a random sample of reports. These will be from different doctors in different fields.

Researchers will manually remove personal information from the records.

Researchers will also automatically remove personal information from original records using NLM-Scrubber.

Researchers will compare the results of the computer program versus the manual changes. They will note when the program has not been removing personal information correctly. They will also note when the program has been deleting nonpersonal health information incorrectly.

Researchers will use the results to revise the program. They will keep testing it until the de-identification process is complete.
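The comparison step described above can be sketched in code. This is only an illustration, not the study's actual procedure: it assumes each report is split into tokens and that both the manual annotation and the program's output are available as sets of token positions. The names `compare_redactions`, `manual_pii`, and `auto_redacted` are hypothetical and do not come from NLM Scrubber.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Comparison:
    missed_pii: list[str]      # personal information the program left in place
    over_redacted: list[str]   # non-personal text the program removed


def compare_redactions(tokens: list[str],
                       manual_pii: set[int],
                       auto_redacted: set[int]) -> Comparison:
    """Compare manual and automatic redaction by token position (assumed format)."""
    # Marked as personal by the annotators but not removed by the program.
    missed = [tokens[i] for i in sorted(manual_pii - auto_redacted)]
    # Removed by the program but not marked as personal by the annotators.
    over = [tokens[i] for i in sorted(auto_redacted - manual_pii)]
    return Comparison(missed_pii=missed, over_redacted=over)


# Made-up example sentence, not taken from any real record.
tokens = "Mr. Smith was seen on 03/04/2015 for chest pain".split()
manual_pii = {1, 5}       # annotators marked "Smith" and the date
auto_redacted = {1, 7}    # program removed "Smith" and "chest"
result = compare_redactions(tokens, manual_pii, auto_redacted)
print(result.missed_pii, result.over_redacted)
```

In this made-up case the program missed the date and wrongly removed a clinical term, which are the two kinds of error the researchers will note.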

...


Condition or disease
Personally Identifiable Information

Detailed Description:

This study concerns the quality assessment, improvement, and monitoring of NLM Scrubber, an automatic clinical text de-identification software application developed at the National Library of Medicine (NLM). The application was developed so that clinical reports can be used in secondary scientific studies (i.e., for secondary use) without breaching patient privacy. Research on methods for protecting patient privacy and on the development of NLM Scrubber has been conducted in compliance with, and following the guidelines of, HIPAA and the Privacy Act.

To further develop and improve NLM Scrubber and to assess its de-identification performance effectively, the investigators require the original, unredacted samples from all potential clinical report types and sources. To this end, NLM investigators have been collaborating with entities within NIH, namely the NIH Clinical Center, BTRIS, and NCI, as well as with outside entities: the Kentucky State Registry, administered by the University of Kentucky, and researchers from the University of Pittsburgh, who stated their interest in integrating NLM Scrubber into their application, the Text Information Extraction System. These entities collect samples of various types of clinical reports for assessing and improving NLM Scrubber performance. However, we also need access to the original data in order to assess potential problems and improve the accuracy of NLM Scrubber.


Study Type : Observational
Estimated Enrollment : 1 participant
Observational Model: Other
Time Perspective: Retrospective
Official Title: NLM Scrubber: NLM's Software Application to De-identify Clinical Text Documents
Study Start Date : June 9, 2016
Estimated Primary Completion Date : September 30, 2026
Estimated Study Completion Date : January 29, 2027



Primary Outcome Measures :
  1. The rate of de-identification of PII [ Time Frame: 01/01/2017-01/31/2027 ]
    The HIPAA Privacy Rule defines 18 types of personally identifying information that need to be de-identified, including personal names, addresses, significant dates, and numeric identifiers (such as Social Security numbers). Our annotators label those words and numbers to create a gold standard, and NLM Scrubber tries to recognize and eliminate all of them. The rate of de-identification of PII measures how successfully it does so.


Secondary Outcome Measures :
  1. The rate of erroneously redacted clinical information [ Time Frame: 01/01/2017-01/31/2027 ]
    NLM Scrubber tries to eliminate only PII elements while preserving non-identifying study data, but it inadvertently deletes some of the non-identifying study data elements (non-protected health information) as well. The rate of erroneously redacted clinical information measures the failure of NLM Scrubber to preserve non-identifying health information.
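The two outcome measures above can be illustrated with a short calculation. The record does not state the unit of counting or the exact denominators, so the sketch below assumes token-level counts: the de-identification rate is taken as the share of gold-standard PII tokens that were redacted, and the erroneous redaction rate as the share of non-PII tokens that were redacted. The function names are hypothetical and are not part of NLM Scrubber.

```python
def deidentification_rate(gold_pii: set[int], redacted: set[int]) -> float:
    """Share of gold-standard PII tokens that the program redacted (assumed definition)."""
    if not gold_pii:
        return 1.0  # nothing to de-identify, so nothing was missed
    return len(gold_pii & redacted) / len(gold_pii)


def erroneous_redaction_rate(all_tokens: set[int],
                             gold_pii: set[int],
                             redacted: set[int]) -> float:
    """Share of non-PII tokens that the program redacted by mistake (assumed definition)."""
    non_pii = all_tokens - gold_pii
    if not non_pii:
        return 0.0  # no clinical text that could be redacted in error
    return len(redacted & non_pii) / len(non_pii)


# Made-up report of 10 tokens: 3 are PII in the gold standard.
all_tokens = set(range(10))
gold_pii = {1, 5, 8}
redacted = {1, 5, 7}   # two PII tokens caught, one clinical token removed in error

primary = deidentification_rate(gold_pii, redacted)
secondary = erroneous_redaction_rate(all_tokens, gold_pii, redacted)
print(f"de-identification rate: {primary:.3f}")
print(f"erroneous redaction rate: {secondary:.3f}")
```

A higher value is better for the first rate and a lower value is better for the second; in practice the counts would be summed over all sampled reports rather than computed on a single one.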



Information from the National Library of Medicine

Choosing to participate in a study is an important personal decision. Talk with your doctor and family members or friends about deciding to join a study. To learn more about this study, you or your doctor may contact the study research staff using the contacts provided below. For general information, see Learn About Clinical Studies.


Ages Eligible for Study: 18 Years and older (Adult, Older Adult)
Sexes Eligible for Study: All
Accepts Healthy Volunteers: No
Sampling Method: Probability Sample
Study Population
Everybody for whom a clinical narrative report is created.
Criteria
  • No new participant enrollment. Researchers will review data that have already been collected.

To learn more about this study, you or your doctor may contact the study research staff using the contact information provided by the sponsor.

Please refer to this study by its ClinicalTrials.gov identifier (NCT number): NCT02795806


Locations
United States, Maryland
National Library of Medicine
Bethesda, Maryland, United States
Sponsors and Collaborators
National Library of Medicine (NLM)
National Cancer Institute (NCI)
National Institutes of Health Clinical Center (CC)
Investigators
Principal Investigator: Mehmet M Kayaalp, Ph.D., National Library of Medicine (NLM)