|Title:||A KNIME Workflow for Automated Structure Verification|
|Authors:||James A. Lumley, Gary Sharman, Thomas Wilkin,
Matthew Hirst, Carlos Cobas, Michael Goebel
|Date:||First published February 21, 2020|
Adequate characterization of chemical entities made for biological screening in the drug discovery context is critical. Incorrectly characterized structures lead to mistakes in the interpretation of structure–activity relationships and confuse an already multidimensional optimization problem. Mistakes in the later use of these compounds waste money and valuable resources in a discovery process already under cost pressure. Left unidentified, these errors lead to problems in project data packages during quality review. At worst, they put intellectual property and patent integrity at risk. We describe a KNIME workflow for the early and automated identification of these errors during registration of a new chemical entity into the corporate screening catalog. This Automated Structure Verification workflow provides early identification (within 24 hours) of missing or inconsistent analytical data and therefore reduces any mistakes that inevitably get made. Automated identification removes the burden of work from the chemist submitting the compound into the registration system. No additional work is required unless a problem is identified and the submitter alerted. Before implementation, 14% of samples within the existing sample catalog were missing data on initial pass. A year after implementation, only 0.2% were missing data.