I thought that for this post, I would introduce a new subject on the blog: lab data management. The idea is that in addition to providing witty reflection on how I got to where I am in my career, I would talk a little more about what that career looks like.
Before I can get to my career and what I actually do (still trying to figure that one out), I should provide some background. Lab data management is a subset of clinical data management, so I’ll start there. I am going to use the Wikipedia definition since I got rid of my encyclopedia set decades ago. Clinical data management is a set of processes and procedures that “ensure collection, integration and availability of data at appropriate quality and cost”. The goal of clinical data management is to generate high-quality, reliable and statistically sound data so that the conclusions drawn from research are well supported. So, no pressure…right?
In many clinical trial settings, both in-house and contracted out to contract research organizations (CROs), lab data management is conducted by clinical data managers along with the management of all the other clinical data. There are only a few institutions that I’m aware of that separate out the laboratory data. I should clarify that when I’m talking about lab data, I’m not talking about the safety labs done to monitor the participants during the course of the trial (white blood cell counts, liver enzyme tests, etc.). Those are monitored along with the other clinical data, at least in our organization. Lab data for my team consists of the endpoint data (HIV diagnostic data), pharmacokinetic (PK) data for drug trials and a whole host of immunology assays that are done to assess the immune response to vaccines.
So what do we do with the lab data? I’m so glad you asked. Lab data management for us can be grouped into two broad categories: specimen monitoring/specimen data quality control, and assay data processing. Specimen monitoring and specimen data quality control are essentially the same thing, so for the purposes of this post, I’ll call it specimen monitoring. In all clinical trials, participants have specimens taken. These are usually blood draws, but they can also include tissue biopsies, etc. The metadata around these specimens can end up being entered in two different data streams: the clinical data stream (i.e. the Case Report Form filled out when a participant comes in for a visit), and a Lab Information Management System (LIMS), which is filled out when the specimen is processed in the lab. In order for the specimen to be used for HIV diagnostic testing or immunological testing, the metadata has to match in both places.
Let’s take the example of HIV diagnostic testing. There are testing algorithms in HIV to determine not only whether someone is infected, but whether it is an acute or chronic infection. HIV testing algorithms are not the same for every study. If you are running an HIV vaccine trial, where the whole point is to elicit antibodies against HIV, you need a series of tests to determine whether the antibody responses that show up positive on a diagnostic test are vaccine-elicited or from actual HIV infection. If you are testing an HIV prevention intervention, the testing algorithm will be different. So if the metadata for a specimen at the time of draw says that this blood tube is from visit 4 of protocol 001, then the diagnostic lab knows which testing algorithm to run. If, somewhere between sending the tube to the lab and transferring the information from the clinical database, to a specimen label or lab requisition form, to the LIMS, the metadata gets changed to visit 4 of protocol 002, then the lab will run a different testing algorithm than the one the specimen needs. This would render any data from that testing invalid.
One whole scope of work for my team is to ensure that the metadata for a specimen remains correct throughout the course of the study, no matter what data stream that specimen appears in. We accomplish this by programmatically comparing the different data streams each day and issuing QCs when the data doesn’t match. We then work with the labs and clinics to find the reason for the discrepancy, check the source documentation to determine the real value, and correct the data so the QC can be resolved. This ensures that as many specimens as possible can be used for testing. Participants trust that when they donate blood or tissue, it will be put to good use, and we help to ensure that it will be.
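To make the daily comparison a little more concrete, here is a minimal sketch in Python. It is an illustration only, not the actual code or schema used at SCHARP: the field names (`specimen_id`, `protocol`, `visit`, `draw_date`) and the function `find_discrepancies` are hypothetical, and the two data streams are assumed to have already been loaded as lists of dictionaries.

```python
# Hypothetical field names; the real CRF and LIMS schemas are not described here.
COMPARED_FIELDS = ("protocol", "visit", "draw_date")


def find_discrepancies(crf_records, lims_records, fields=COMPARED_FIELDS):
    """Return one QC entry per metadata difference between the two streams."""
    crf_by_id = {r["specimen_id"]: r for r in crf_records}
    lims_by_id = {r["specimen_id"]: r for r in lims_records}
    qcs = []
    # Walk every specimen seen in either stream, in a stable order.
    for specimen_id in sorted(crf_by_id.keys() | lims_by_id.keys()):
        crf = crf_by_id.get(specimen_id)
        lims = lims_by_id.get(specimen_id)
        if crf is None or lims is None:
            missing = "CRF" if crf is None else "LIMS"
            qcs.append({"specimen_id": specimen_id,
                        "issue": f"missing from {missing}"})
            continue
        # Specimen is in both streams: compare field by field.
        for field in fields:
            if crf.get(field) != lims.get(field):
                qcs.append({
                    "specimen_id": specimen_id,
                    "issue": f"{field} mismatch",
                    "crf_value": crf.get(field),
                    "lims_value": lims.get(field),
                })
    return qcs


# Made-up example data: S-001 has the protocol changed in the LIMS.
crf = [
    {"specimen_id": "S-001", "protocol": "001", "visit": 4, "draw_date": "2020-01-15"},
    {"specimen_id": "S-002", "protocol": "001", "visit": 4, "draw_date": "2020-01-16"},
]
lims = [
    {"specimen_id": "S-001", "protocol": "002", "visit": 4, "draw_date": "2020-01-15"},
    {"specimen_id": "S-003", "protocol": "001", "visit": 5, "draw_date": "2020-01-17"},
]

for qc in find_discrepancies(crf, lims):
    print(qc)
```

In this made-up example, S-001 is flagged for a protocol mismatch (the 001 versus 002 case described above), while S-002 and S-003 are flagged because each appears in only one stream. The sketch only finds the discrepancy; deciding which value is the real one still takes a person and the source documentation.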
The second large scope of work for the team is assay processing. After clinical specimens have been processed and sent to labs for testing, we receive the resulting assay data back into our group. We again check to make sure the specimen metadata is clean, and we also run additional quality checks to evaluate the data for format consistency, logic (if there is supposed to be a numeric value, we check to make sure the values are numeric), ranges, and other assay-specific requirements. This part of our work is important because not only do we want all specimens to be usable for testing, we want all the lab testing data to be usable in the statistical analyses. We provide consistently formatted and clean datasets to the statisticians for their analysis.
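The kinds of checks described above could be sketched roughly as follows, again in Python. This is a hedged illustration under assumptions: the column names (`specimen_id`, `assay`, `result`), the function `check_assay_row`, and the range limits are all hypothetical, and real assay-specific checks would differ from assay to assay.

```python
REQUIRED_COLUMNS = ("specimen_id", "assay", "result")  # hypothetical column names


def check_assay_row(row, low=0.0, high=100000.0):
    """Return a list of problems found in one assay result row.

    `low` and `high` are placeholder limits, not real assay ranges.
    """
    problems = []
    # Format consistency: every required column must be present and non-empty.
    for column in REQUIRED_COLUMNS:
        value = row.get(column)
        if value is None or not str(value).strip():
            problems.append(f"{column} is missing")
    # Logic check: the result is supposed to be numeric.
    raw = row.get("result")
    if raw is not None and str(raw).strip():
        try:
            number = float(str(raw).strip())
        except ValueError:
            problems.append(f"result {raw!r} is not numeric")
        else:
            # Range check: only meaningful once we know the value is numeric.
            if not (low <= number <= high):
                problems.append(f"result {number} is outside the expected range")
    return problems


# Made-up example rows.
rows = [
    {"specimen_id": "S-001", "assay": "PK", "result": "12.5"},
    {"specimen_id": "S-002", "assay": "PK", "result": "<LLOQ"},
    {"specimen_id": "", "assay": "PK", "result": "-3"},
]

for row in rows:
    print(row["specimen_id"] or "(blank)", check_assay_row(row))
```

The first row passes cleanly, the second is flagged because its result is text rather than a number, and the third is flagged both for a blank specimen ID and for a value below the placeholder lower limit.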
In short, lab data management at SCHARP is a group dedicated to preserving high-quality laboratory data for analysis in clinical trials by safeguarding the metadata around clinical specimens and providing consistent and clean laboratory datasets for analysis. If you’re interested, I can go into more detail about how we do this in subsequent posts. I will definitely be doing more posts about why it’s important to think about data management, even in a research setting, and discussing some methods and best practices for how to start implementing lab data management, regardless of the setting.