Operationalizing Machine Learning in the Laboratory | LiMSforum.com - The Global Laboratory, Informatics, Medical and Science Professional Community

In previous blog posts we showed how the Data Analytics Solution can be used in many different laboratory scenarios to identify bottlenecks using its Business Intelligence capability, and then to reduce testing and forecast sample throughput using its Machine Learning capability. In this post we show how the Machine Learning (ML) models that have been built can be trained automatically, and how they can be operationalized within Thermo Scientific™ SampleManager™ LIMS itself, without the need for external applications or platforms.

Because the Data Analytics Solution is embedded within SampleManager LIMS, it has several advantages from an operationalization perspective:

All data remains within SampleManager LIMS – which aids data governance
The data is always up to date
ML models are version controlled
Access to ML models and data is governed through roles and groups
Retraining can be automatically scheduled from within SampleManager LIMS
Predictions can be run from a workflow, so they are triggered automatically

From the MLOps perspective, the ML model is developed, trained, deployed and monitored within SampleManager LIMS.

Each ML model has several flags which control its visibility at each stage of the MLOps process. For an ML model to be deployed, it must be flagged both as an active model and for automated training.
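The deployment rule can be illustrated with a minimal Python sketch. This is not SampleManager LIMS code: the `ModelRecord` class, its field names, the version numbers and the two extra model names are hypothetical, and only the rule itself (a model must be both active and flagged for automated training) comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class ModelRecord:
    """Hypothetical stand-in for an ML model entry and its flags."""
    name: str
    version: int
    active: bool      # flagged as an active model
    auto_train: bool  # flagged for automated training


def is_deployable(model: ModelRecord) -> bool:
    # Both flags must be set before the model counts as deployed.
    return model.active and model.auto_train


models = [
    ModelRecord("Wine Classification XGBoost", 3, active=True, auto_train=True),
    ModelRecord("Draft model", 1, active=False, auto_train=True),
    ModelRecord("Archived model", 2, active=True, auto_train=False),
]

deployable = [m.name for m in models if is_deployable(m)]
print(deployable)
```

Only the first entry passes the check, because the other two each have one of the flags unset.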

The SampleManager LIMS Background Scheduler is used to retrain the ML models flagged as available for automated training. In this example the models are retrained every Tuesday and Sunday at 16:35.
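To make the schedule concrete, the following Python sketch computes the next retraining time for a "Tuesday and Sunday at 16:35" rule. It only illustrates the timing logic under that assumption. The Background Scheduler itself is configured inside SampleManager LIMS, and the names `RETRAIN_DAYS`, `RETRAIN_AT` and `next_retrain` are hypothetical.

```python
from datetime import datetime, time, timedelta

# Python numbers weekdays from Monday=0, so Tuesday=1 and Sunday=6.
RETRAIN_DAYS = {1, 6}
RETRAIN_AT = time(16, 35)


def next_retrain(now: datetime) -> datetime:
    """Return the first scheduled retraining time strictly after `now`."""
    for offset in range(8):  # a matching slot always exists within a week
        day = now.date() + timedelta(days=offset)
        candidate = datetime.combine(day, RETRAIN_AT)
        if candidate.weekday() in RETRAIN_DAYS and candidate > now:
            return candidate
    raise RuntimeError("no retraining slot found")  # not reachable in practice


# 1 January 2024 was a Monday, so the next slot is Tuesday 2 January at 16:35.
print(next_retrain(datetime(2024, 1, 1, 9, 0)))
```

A call made exactly at a scheduled time returns the following slot, since the comparison is strict.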

Results evaluation

It is important that the performance of each ML model is tracked as new data is used to retrain it. The quality of an ML model depends on the quality of the labelled data used to train it, so its quality may vary over time. The performance of the ML model against the labelled training data can be tracked through the results evaluation page.
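One way to think about this kind of tracking is sketched below in Python. It is an assumption-laden illustration, not a description of how the results evaluation page works: the choice of accuracy as the metric, the `flag_regressions` helper, the 0.05 tolerance and the accuracy values in `history` are all made up for the example.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labelled values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def flag_regressions(history, tolerance=0.05):
    """Return model versions whose accuracy fell more than `tolerance`
    below the best accuracy seen in any earlier version.

    `history` is a list of (version, accuracy) pairs in training order.
    """
    flagged = []
    best = None
    for version, acc in history:
        if best is not None and best - acc > tolerance:
            flagged.append(version)
        best = acc if best is None else max(best, acc)
    return flagged


# Illustrative values only, not real evaluation results.
history = [(1, 0.80), (2, 0.82), (3, 0.74), (4, 0.81)]
print(flag_regressions(history))
```

Here version 3 is flagged because its accuracy is more than 0.05 below the best earlier value, which could happen if a retraining run picked up poorly labelled data.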

When the results for a new sample have been entered, a workflow can be used to automatically perform a prediction using a given ML model.

In this example, when all the results have been entered for the ‘Wine quality’ Test, the ‘Test Completed’ Event is triggered. The ‘Wine Classification XGBoost’ profiling model is then run for that sample to predict its quality classification. A ‘Wine Quality Prediction’ Test is added to the sample and the predicted wine quality value is stored in the ‘Quality Cat’ Result. This graphical approach allows a workflow to be configured for each customer’s needs. For example, it could send an email, or perform a retest if the quality classification is not acceptable.
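The sequence of steps in this workflow can be sketched in Python as an event handler. In SampleManager LIMS the workflow is configured graphically, so this is only a conceptual illustration: the `Sample` and `LabTest` classes, the `on_test_completed` function, the `alcohol` result, the category names and the `stub_model` rule are hypothetical. The stub stands in for the trained model and is not XGBoost.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


@dataclass
class LabTest:
    name: str
    results: Dict[str, object] = field(default_factory=dict)


@dataclass
class Sample:
    sample_id: str
    tests: List[LabTest] = field(default_factory=list)


def on_test_completed(
    sample: Sample,
    completed: LabTest,
    predict: Callable[[Dict[str, object]], str],
    acceptable: Set[str],
    notify: Callable[[str], None],
) -> Optional[str]:
    """Handle a 'Test Completed' event for a sample."""
    if completed.name != "Wine quality":
        return None  # this workflow only reacts to the wine quality test
    category = predict(completed.results)
    # Store the prediction as a new test with a 'Quality Cat' result.
    sample.tests.append(LabTest("Wine Quality Prediction", {"Quality Cat": category}))
    if category not in acceptable:
        # Follow-up action; this could equally be an email or a retest.
        notify(f"Sample {sample.sample_id}: predicted quality '{category}' is not acceptable")
    return category


def stub_model(results: Dict[str, object]) -> str:
    # Hypothetical rule standing in for the trained classifier.
    return "High" if results["alcohol"] >= 10 else "Low"


messages: List[str] = []
wine_test = LabTest("Wine quality", {"alcohol": 9.2})
sample = Sample("S-001", [wine_test])
predicted = on_test_completed(sample, wine_test, stub_model, {"High"}, messages.append)
print(predicted, sample.tests[-1].results, messages)
```

Passing the predictor and the notification function as arguments keeps the handler independent of any particular model or messaging mechanism, which mirrors the idea that each customer can configure the follow-up steps differently.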

Read more about how the Data Analytics Solution for SampleManager LIMS can be applied here.