Data Integrity Matters | Limiting Access to Tools That Could Be Used to Manipulate Data (Part 3)

By January 14, 2019

Integration and reintegration tools

Useful tools can be applied to achieve accurate and consistent results, but in the wrong hands, or with the wrong intent, they might also be used to falsify data. How can you tell the difference?

The ability to optimize peak integration (using the method parameters or manual integration) and to correct peak identification (whether by resetting the method retention time (RT) settings or manually identifying peaks) can be used by unscrupulous staff to fraudulently turn a failing result into a passing one. In other words, in the wrong hands, with the wrong intent, and without a robust training and review process, the ability to alter chromatographic peak processing parameters has been misused by analysts to falsify results.

Experienced scientists also know that chromatographic analysis can be affected by temperature, humidity, column history, and mobile phase preparation, such that one day’s analysis often varies slightly from the previous day’s run. To achieve consistent output and measurement, it is critical to adapt and/or optimize factors such as peak detection threshold or RT expectations to ensure consistent, correct, and accurate peak measurement and identification. But how can reviewers, approvers, quality, or outside auditors distinguish legitimate use of those tools from egregious use?

In my last Data Integrity Matters blog (Part 2), I introduced four scenarios of concern:

  1. System readiness, trial, or equilibration injections, which should only be concerning if these are used to unofficially “pretest” actual samples
  2. Multiple attempts to optimize the integration parameters to achieve accurate integration
  3. The use of manual integration to achieve accurate integration
  4. Suppression of peak integration to eliminate certain peaks from being included in analyses

Here, I discuss scenarios two and three together since they are intimately related. How SOPs are written about one will influence an analyst’s use of the other.

Achieving accurate integration

Let’s first think about the real world. Chromatographic peak shapes and separation can change from injection to injection, from day to day, and from instrument to instrument. A certain degree of method reproducibility and robustness is always challenged during method validation, but almost never in the history of chromatographic method development has method validation been performed with a single, unchanging set of processing parameters. Chromatographers endeavor to validate a single set of acquisition parameters, but now, when striving to adopt a QbD approach, even this would become a design space of relevant parameters. No laboratory, however, would consider putting such restrictions on integration methods. Doing so would imply that using the same parameters every day is better science than using the appropriate parameters to get the most consistent and accurate integration.

The comparison of peak areas between samples and standards is so critical to the result calculation that peak integration must be consistent. With varying peak shapes and retention times (which occur in all but the simplest methods), analysts must optimize the parameters in the integration method to adapt to the slightly varying peak start and end positions. One set of parameters ought to be used for all injections in a sequence – standards as well as samples. However, forcing the same rule for ‘all sequences in a study’ or ‘all batch analysis forever’ is bound to result in inaccurate and inconsistent integration from one day to the next.
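To see why consistency between standards and samples matters more than the absolute parameter values, consider a minimal sketch. It assumes a single-point external standard calculation, and the function name `assay_percent`, the areas, and the concentrations are all hypothetical illustrations rather than anything taken from a real method.

```python
def assay_percent(sample_area, standard_area, standard_conc, nominal_conc):
    """Assay as a percentage of nominal, assuming a single-point external standard."""
    measured_conc = sample_area / standard_area * standard_conc
    return 100.0 * measured_conc / nominal_conc


# Hypothetical "true" peak areas for one sample and one standard.
true_sample, true_standard = 970.0, 1000.0

# Accurate integration of both peaks.
accurate = assay_percent(true_sample, true_standard, 1.0, 1.0)

# The same 2% underestimate applied to both peaks cancels in the ratio.
consistent = assay_percent(true_sample * 0.98, true_standard * 0.98, 1.0, 1.0)

# A 2% underestimate applied to the standard only inflates the sample result.
inconsistent = assay_percent(true_sample, true_standard * 0.98, 1.0, 1.0)

print(f"accurate={accurate:.2f}%  consistent={consistent:.2f}%  "
      f"inconsistent={inconsistent:.2f}%")
```

In this illustration, integrating the standard differently from the sample is enough to move the calculated assay by about two percentage points, which is why a reviewer needs to look at the integration of both standard and sample injections.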

Most chromatography data systems (CDS) can provide integration parameters that work for low as well as high concentration samples, as the majority of parameters used to optimize baseline placement (peak width and threshold) evaluate changes in slope of either the raw data curve or a derivative of the raw data curve. But integration methods also have timed events where certain parameters may be optimized for different sections of the chromatogram. The action of optimizing or adapting these must be logged in an audit trail and, ideally, also saved in interim results created during that iterative process, as well as being clearly saved to the traceable processing method. Once a suitable set of parameters is created, it can then be applied to the entire set of data to calculate trustworthy and consistent peak areas.
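As a rough illustration of how a slope threshold influences baseline placement, here is a simplified sketch. It is not the algorithm of any particular CDS: the function `integrate_first_peak`, the first-difference slope calculation, and the toy data are assumptions made for illustration, and the sketch handles only a single, isolated peak.

```python
def integrate_first_peak(times, signal, slope_threshold):
    """Return (start_index, end_index, area) for a single isolated peak.

    The peak starts where the rising slope first exceeds slope_threshold
    and ends where the falling slope has flattened back to within it.
    The area is the trapezoidal area above a straight start-to-end baseline.
    """
    # First-difference slope between each pair of consecutive points.
    slopes = [
        (signal[i + 1] - signal[i]) / (times[i + 1] - times[i])
        for i in range(len(signal) - 1)
    ]
    start = next((i for i, s in enumerate(slopes) if s > slope_threshold), None)
    if start is None:
        return None  # no slope ever exceeded the threshold

    apex = max(range(start, len(signal)), key=lambda i: signal[i])

    # Walk down the tail until the slope is no longer steeper than the threshold.
    end = len(signal) - 1
    descending = False
    for i in range(apex, len(slopes)):
        if slopes[i] < -slope_threshold:
            descending = True
        elif descending:
            end = i
            break

    raw_area = sum(
        0.5 * (signal[i] + signal[i + 1]) * (times[i + 1] - times[i])
        for i in range(start, end)
    )
    baseline_area = 0.5 * (signal[start] + signal[end]) * (times[end] - times[start])
    return start, end, raw_area - baseline_area


# Toy chromatogram: one symmetric peak on a flat baseline.
times = list(range(11))
signal = [0, 0, 0, 1, 4, 9, 4, 1, 0, 0, 0]

print(integrate_first_peak(times, signal, slope_threshold=0.5))
print(integrate_first_peak(times, signal, slope_threshold=2.0))
```

On this toy data, the higher threshold starts the peak later and ends it earlier, so the baseline is drawn higher up the peak and the reported area is smaller. That sensitivity is the reason every change to such parameters needs to be traceable.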

While technically this is not called “manual integration,” it is, in fact, a manual process requiring human skills to uncover and evaluate the best parameters. Often, depending on the components used for System Readiness injections or System Suitability injections, an analyst might use these injections to optimize integration parameters; however, the parameters could still need further adaptation based on the actual sample chromatograms.

Ensuring true and consistent integration

This process of integration method optimization was, for a long time, considered essential for trustworthy reproducible peak integration. Critically, in an age where peak integration and peak areas were not part of the stored record, it was the only way that the result could be regenerated by the same analyst or a different analyst. Manually placing the peak starts and ends was considered dangerous because a second person would be unlikely to ever place the peak starts and ends in the exact same place, but reapplying a method would always produce the exact same results.

This danger no longer exists with most CDS solutions, where the actual results, whether from automated or manual integration, are stored in a non-editable file. This means there is never a need to reproduce the integration from an earlier “processing” when all processed results are stored and are available for review.

It might also be argued that adapting the automated integration of peaks manually, for the few peaks that a method may integrate “incorrectly,” is more transparent and trustworthy than forcing analysts to repeatedly try multiple iterations of integration parameters for the whole sequence in order to fix a couple of poorly integrated peaks. It permits a reviewer to focus their review on the chromatographic peaks that have been manually integrated, to ensure a true and consistent integration.

However, the intent and motivation during peak integration may be questioned. Were those parameters chosen to accurately perform integration according to the SOP, or were they chosen deliberately to over- or underestimate peak areas in order to force a failing injection to pass?

A casual reviewer is unlikely to be able to make that distinction, but a skilled reviewer, who knows from the SOP what the resulting integration should look like, will spot deliberate under- or overestimations of peak areas. This requires review of both sample and standard injections, as both require consistent integration. Ideally, SOPs will include a picture of how peaks, especially unresolved or grouped peaks, should be integrated.

What is the right integration?

Figure 1. Examples of correct and incorrect integration in an SOP based on how the method was validated. It may have alternative instructions for when the rider peak is small or comparable to the API peak.

It is the actual resultant peak integration that must be consistent, not the use of consistent parameters or settings to achieve that peak integration.

It should be noted, as well, that the bad practice of imperceptible “peak shaving” or dragging baselines under the noise to achieve a larger area can only affect the resultant calculation minutely before it becomes obvious to a trained reviewer. As such, only samples that are “borderline fails” might be brought into a false passing status by tweaking integration without that manipulation becoming obvious. For this reason, focusing reviews on “borderline passes” should allow a trained reviewer to easily spot egregious use of integration to create false results.
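One way a review could be focused on borderline passes is sketched below. This is only an illustration: the `Result` record, the `borderline_passes` function, the specification limits, and the review margin are all hypothetical, and a real laboratory would define its own criteria in an SOP.

```python
from dataclasses import dataclass


@dataclass
class Result:
    sample_id: str
    value: float               # reported result, e.g. assay as % of label claim
    manually_integrated: bool   # True if any peak in the result was manually integrated


def borderline_passes(results, low, high, margin):
    """Return passing results that lie within `margin` of either limit.

    Manually integrated results are listed first so a reviewer sees them first.
    """
    flagged = [
        r for r in results
        if low <= r.value <= high
        and (r.value - low <= margin or high - r.value <= margin)
    ]
    # sorted() is stable, so the original order is kept within each group.
    return sorted(flagged, key=lambda r: not r.manually_integrated)


# Hypothetical results against a hypothetical 98.0 to 102.0 specification.
results = [
    Result("S1", 99.5, False),
    Result("S2", 98.1, True),
    Result("S3", 97.9, False),   # fails, so it is not a borderline pass
    Result("S4", 101.9, False),
]

for r in borderline_passes(results, low=98.0, high=102.0, margin=0.3):
    print(r.sample_id, r.value, "manual" if r.manually_integrated else "automated")
```

A list like this does not indicate that anything is wrong with the flagged results; it only tells the reviewer where a close look at the chromatograms and the integration history is most worthwhile.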

Three potential outcomes for banning manual integration

Banning the use of manual integration is a common response to avoid questions about data integrity. However, there are three outcomes to this crude action:

  1. Laboratories will have to accept poor and inconsistent integration
  2. Analysts will find a workaround that permits them to integrate each run with a different set of integration parameters (this typically involves performing quantitation in a LIMS, or worse, a spreadsheet, without traceability back to the integration methods)
  3. Analysts will be forced to spend hours of their day developing complex and manipulative methods to address variations between chromatograms with a single processing method. Typically this will require many “integration events” that could even include placing peak starts and ends at specific time points; in effect, performing manual integration under the guise of an automated method to deceive the reviewer

In summary, before making decisions about how chromatographic peak integration and reintegration are allowed in a lab, consider the methods that are being run. How complex are the methods or the sample matrices? How robust and repeatable are the methods? How much actual peak resolution is there? From these assessments, go on to determine whether separation methods can be improved to enable right-first-time automated integration.

From there, develop appropriate SOPs for integration and reintegration that clearly guide analysts on how to approach poor peak integration using the prescribed integration method, how to optimize integration, and when manual integration is allowed. These SOPs should also guide the reviewer on how to assess the peak integration in both final and superseded results. Finally, if poorly separated peaks cannot be resolved by method improvements, each analytical method should clearly guide both analysts and reviewers on the expected “integration pattern” for groups of peaks.

Accurate integration should always precede result generation to ensure unbiased calculations and reported results.


In my next Data Integrity Matters blog, I will discuss the use of integration parameters to deliberately suppress certain peaks from being integrated and therefore included in chromatographic calculations.

Read more articles in Heather Longden’s blog series, Data Integrity Matters.

Categories: Data Integrity