Selecting a laboratory information management system for biorepositories in low- and middle-income countries: The H3Africa experience and lessons learned

What's important for a biorepository laboratory information management system (LIMS), and what options are out there? What unique constraints in Africa make that selection more difficult? This brief 2017 paper from the Human Heredity and Health in Africa (H3Africa) Consortium outlines their take on finding the right LIMS solution for three of their regional biorepositories in Africa. The group emphasizes in the end that "[c]hoosing a LIMS in low- and middle-income countries requires careful consideration of the various factors that could affect its successful and sustainable deployment and use."

Baobab Laboratory Information Management System: Development of an open-source laboratory information management system for biobanking

This journal article, published in Biopreservation and Biobanking in early 2017, presents the development philosophy and implementation of a custom-modified version of Bika LIMS called Baobab LIMS, designed for biobank clients and researchers. Bendou et al., who enlisted customization help directly from Bika Lab Systems, describe how "[t]he need to implement biobank standard operation procedures as well as stimulate the use of standards for biobank data representation motivated the implementation of Baobab LIMS, an open-source LIMS for biobanking." The group concludes that while the open-source LIMS is quite usable as is, it will require further development of more "generic and configurable workflows." Despite this, the authors anticipate the software to be useful to the biobanking community.

The FAIR Guiding Principles for scientific data management and stewardship

Most scientists know that much of the data created in academic research efforts ends up being locked away in silos, difficult to share with others. But what are scientists doing about it? In this 2016 paper published in Scientific Data, Wilkinson et al. outline a distinct set of principles created to reduce those silos of information: the FAIR Principles. The authors state the primary goal of the FAIR Principles is to "put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals." After describing the principles and giving examples of projects that adhere to them, the authors conclude that the principles have the potential to "guide the implementation of the most basic levels of good Data Management and Stewardship practice, thus helping researchers adhere to the expectations and requirements of their funding agencies."
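
As a minimal sketch of what that machine-actionability looks like in practice (not an example drawn from the paper), the snippet below resolves a globally unique, persistent identifier, here the DOI of the FAIR paper itself, and retrieves standards-based citation metadata over HTTP content negotiation; the particular metadata format requested is just one illustrative choice.

```python
# Illustration of machine-actionable metadata retrieval via a persistent
# identifier (a DOI) and HTTP content negotiation; not code from the paper.
import requests

# DOI of the FAIR Principles paper itself, used here purely as an example.
doi = "10.1038/sdata.2016.18"

response = requests.get(
    f"https://doi.org/{doi}",
    headers={"Accept": "application/vnd.citationstyles.csl+json"},  # machine-readable citation metadata
    timeout=30,
)
response.raise_for_status()

metadata = response.json()
# Findable: globally unique, persistent identifier; Reusable: rich metadata.
print(metadata.get("title"))
print(metadata.get("DOI"), metadata.get("publisher"))
```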

A multi-service data management platform for scientific oceanographic products

The problem? Disparate data sources, from weather and wave forecasts to navigation charts and natural hazard assessments, made oceanography research in Southern Italy more cumbersome. The solution? Create a secure, standardized, and interoperable platform that merges all that and other information into one powerful, easy-to-use system. Thus the TESSA (Development of Technology for Situational Sea Awareness) program was born. D'Anca et al. discuss the creation and use of TESSA as a geospatial tool that merges real-time and archived data to help researchers in Southern Italy. The authors conclude that TESSA is "a valid prototype easily adopted to provide an efficient dissemination of maritime data and a consolidation of the management of operational oceanographic activities," even in other parts of the world.

MASTR-MS: A web-based collaborative laboratory information management system (LIMS) for metabolomics

In development since at least the summer of 2009, the open-source MASTR-MS laboratory information management system (LIMS) was designed to better handle the data and metadata of metabolomics, the study of an entity's metabolites. In this 2017 paper published in Metabolomics, the development team of MASTR-MS discusses the current state of their LIMS, how it's being used, and what the future holds for it. They conclude by stating the software's "comprehensive functions and features enable researchers and facilities to effectively manage a wide range of different project and experimental data types, and it facilitate[s] the mining of new and existing [metabolomic] datasets."

The effect of a test ordering software intervention on the prescription of unnecessary laboratory tests – A randomized controlled trial

When designing something as simple as a menu of laboratory tests in a piece of clinical software, it's easy to overlook the ramifications of what that menu contains. In this 2017 article published in BMC Medical Informatics and Decision Making, Martins et al. argue that there are consequences to what's included in a laboratory test drop-down menu, primarily that the presence or absence of a test type may influence how frequently that test is prescribed. The group concludes that "[r]emoving unnecessary tests from a quick shortcut menu of the diagnosis and laboratory tests ordering system had a significant impact and reduced unnecessary prescription of tests," which in turn led "to the reduction of negative patient effects and to the reduction of unnecessary costs."

The state of open-source electronic health record projects: A software anthropology study

In this journal article published in JMIR Medical Informatics in 2017, Alsaffar et al. review research from mid-2014 that looked at the state of open-source electronic health record (EHR) systems, primarily via SourceForge. The authors, noting a lack of research concerning the demographics and motivation of open-source EHR projects, present their findings, concluding that "lack of a corporate entity in most F/OSS EHR projects translates to a marginal capacity to market the respective F/OSS system and to navigate [HITECH] certification."

PCM-SABRE: A platform for benchmarking and comparing outcome prediction methods in precision cancer medicine

In this 2017 paper published in BMC Bioinformatics, Eyal-Altman et al. explain the use and benefits of their KNIME-based cancer outcome analysis software PCM-SABRE (Precision Cancer Medicine - Survival Analysis Benchmarking, Reporting and Evaluation). The group demonstrates its effectiveness by reconstructing the previous work of Chou et al., showing how the results underscore the need for such a tool to improve reproducibility. The researchers conclude that when used effectively, PCM-SABRE's "resulting pipeline can be shared with others in an intuitive yet executable way, which will improve, if adopted by other investigators, the comparability and interpretability of future works attempting to predict patient survival from gene expression data."

Ten simple rules for cultivating open science and collaborative R&D

This journal article in PLOS Computational Biology's long-running Ten Simple Rules series goes back to 2013, when a collaborative group of eight authors from around the globe pooled their thoughts on the topic of open science and collaborative R&D. The conversations (linked to in this article) provide context and insight into the various projects, from the Gene Wiki initiative to the Open Source Drug Discovery (OSDD) project, that have required a significant departure from the traditional corporate way of conducting business. From "lead as a coach, not a CEO" to "grow the commons," the article's authors provide their thoughts on what best makes for collaborative and open science projects.

Ten simple rules to enable multi-site collaborations through data sharing

In yet another installment of PLOS Computational Biology's Ten Simple Rules series, Boland et al. of Columbia University and the Broad Institute of MIT and Harvard share their thoughts and experiences with multi-site collaborations and data sharing. The group provides practical tips for making data sharing easier and more successful, strengthening collaborations and the scientific process.

Ten simple rules for developing usable software in computational biology

This is another entry in PLOS Computational Biology's long-running Ten Simple Rules series, which breaks computational biology and bioinformatics topics down into a digestible, well-cited format. This 2017 entry by List et al. looks at the typical problems associated with computational biology software development and attempts to provide a clear approach to more usable, efficient software. The authors conclude that even after following these 10 rules, there's more to be done: "...effort is required from both users and developers to further improve a tool. Even engaging with only a few users ... is likely to have a large impact on usability."

The effect of the General Data Protection Regulation on medical research

In this brief paper by Rumbold and Pierscionek, the implications and theoretical impact of the European Union's General Data Protection Regulation (GDPR) are discussed. Addressing in particular claims that the new "consent requirements ... would severely restrict medical data research," the researchers break down the law, which goes into effect in 2018, including the anonymization, consent, and data sharing issues that could potentially affect biomedical data research. They conclude the impact will be minimal: "The GDPR will facilitate medical research, except where it is research not considered in the public interest. In that case, more demanding requirements for anonymization will entail either true anonymization or consent."

Methods for specifying scientific data standards and modeling relationships with applications to neuroscience

Neuroscience, like so many fields of science, is swimming in data, much of it in differing formats. This creates barriers to data sharing and project enactment. Rübel et al. argue that standardization of neuroscience data formats can improve analysis and sharing efforts. "Arguably, the focus of a neuroscience data standard should be on addressing the application-centric needs of organizing scientific data and metadata, rather than on reinventing file storage methods," they state. This late 2016 paper, published in Frontiers in Neuroinformatics, details their effort to create such a standardized framework, called BRAINformat, one that "fill[s] important gaps in the portfolio of available tools for creating advanced standards for modern scientific data."
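
As a loose illustration of that application-centric organization of data and metadata (and not the BRAINformat API itself), the sketch below uses the HDF5 format via h5py to keep experimental data and its descriptive metadata together in one self-describing file; the group names, attributes, and values are invented for the example.

```python
# Generic sketch of keeping scientific data and metadata side by side in a
# self-describing HDF5 file; this is not the BRAINformat API.
import h5py
import numpy as np

with h5py.File("recording_session.h5", "w") as f:
    session = f.create_group("session_001")
    session.attrs["subject_id"] = "mouse_42"       # illustrative metadata fields
    session.attrs["sampling_rate_hz"] = 30000.0

    # Raw data and its descriptive metadata live together in the same file.
    data = session.create_dataset(
        "ephys/raw", data=np.zeros((4, 1000), dtype="int16"), compression="gzip"
    )
    data.attrs["units"] = "microvolts"
    data.attrs["channel_count"] = 4
```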

Data and metadata brokering – Theory and practice from the BCube Project

This 2017 paper by University of Colorado's Siri Jodha Singh Khalsa, published in Data Science Journal, provides background on the successes, challenges, and outcomes of the Brokering Building Block (BCube) project, which aims "to provide [geo]scientists, policy makers and the public with computing resources, analytic tools and educational material, all within an open, interconnected and collaborative environment." It describes the project's infrastructure development, interoperability design, and data testing, as well as the lessons learned along the way, including an analysis of the human elements involved in making data sharing easier and more effective.

A metadata-driven approach to data repository design

Turnkey data repositories such as DSpace have been evolving over the past decade, from housing publication preprints and postprints to handling actual research data management tasks today. But what if this evolving technology could be further refined "to improve the discoverability of the deposited data"? Harvey et al. of Imperial College London explored this topic in their 2017 paper published in Journal of Cheminformatics, developing new insights into repository design and the DataCite metadata schema. They published their results hoping that their work "may in turn assist researchers wishing to deposit data in identifying the repository attributes that can best expose the discoverability and re-use of their data."
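
As a rough illustration of the kind of deposit-time description the authors focus on, the sketch below builds a minimal DataCite-style record in Python; the field names follow the schema's mandatory properties, but the values (and the DOI) are invented for the example and this is not code from the paper.

```python
# A minimal, DataCite-style descriptive record a metadata-driven repository
# might capture at deposit time; all values are invented placeholders.
datacite_record = {
    "identifier": {"identifierType": "DOI", "identifier": "10.1234/example-dataset"},  # placeholder DOI
    "creators": [{"creatorName": "Doe, Jane"}],
    "titles": [{"title": "Raman spectra of example polymer samples"}],
    "publisher": "Example Institutional Repository",
    "publicationYear": "2017",
    "resourceType": {"resourceTypeGeneral": "Dataset", "resourceType": "Spectra"},
}

# Rich, structured metadata like this is what makes deposited data easier
# to discover and reuse.
print(datacite_record["titles"][0]["title"])
```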

DGW: An exploratory data analysis tool for clustering and visualisation of epigenomic marks

Lukauskas et al. present their open-source software package DGW (Dynamic Gene Warping) in this December 2016 paper published in BMC Bioinformatics. Used for the "simultaneous alignment and clustering of multiple epigenomic marks," the software uses a process called dynamic time warping (DTW) to capture epigenomic mark structure. The authors conclude that their research shows "that DGW can be a practical and user-friendly tool for exploratory data analysis of high-throughput epigenomic data sets" and demonstrates "potential as a useful tool for biological hypothesis generation."
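
For readers unfamiliar with the underlying alignment idea, the sketch below is a minimal, generic implementation of dynamic time warping; it illustrates the technique DGW builds on but is not DGW's own code, and the example signals are made up.

```python
# Minimal, generic dynamic time warping (DTW) distance between two 1-D
# signals; illustrates the alignment idea DGW builds on (not DGW's code).
import numpy as np

def dtw_distance(x, y):
    """Return the DTW alignment cost between 1-D sequences x and y."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])              # local distance
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

# Two similar profiles that are shifted/stretched relative to each other
a = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])
b = np.array([0.0, 0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])
print(dtw_distance(a, b))  # small cost despite the shift
```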

SCIFIO: An extensible framework to support scientific image formats

In this short paper published in December 2016, Hiner et al. of the University of Wisconsin-Madison demonstrate their open-source library SCIFIO (SCientific Image Format Input and Output). Built on inspiration from the Bio-Formats library for microscopy image data, SCIFIO attempts to act as "a domain-independent image I/O framework enabling seamless and extensible translation between image metadata models." Rather than leaving researchers to fight with the difficulties of repeating experiments based on data in proprietary formats, SCIFIO's open-source nature helps with the reproducibility of research results and shows the library to be "capable of adapting to the demands of scientific imaging analysis."

Use of application containers and workflows for genomic data analysis

As technology progresses, it allows bioinformaticians to improve the efficiency of their data processing tools and provide better solutions for patient care. As this December 2016 paper by Schulz et al. points out, one way in which dramatic change can potentially occur in science's big data management is through the use of application virtualization. The researchers, based out of Yale, attempt to "demonstrate the potential benefits of containerized applications and application workflows for computational genomics research." They conclude this technology has the potential to "improve pipeline and experimental reproducibility since preconfigured applications can be readily deployed to nearly any host system."
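
To make the idea concrete, here is a minimal sketch, not taken from the paper, of a workflow step that launches a preconfigured, containerized tool from Python; the image name, command, and mount path are hypothetical, while the docker run flags shown (--rm, -v) are standard Docker CLI options.

```python
# Hedged sketch of running one pipeline step inside a container image, in the
# spirit of the reproducibility argument; image, command, and paths are
# hypothetical examples.
import subprocess
from pathlib import Path

def run_containerized_step(image, command, data_dir):
    """Run one workflow step inside a preconfigured container image."""
    data_dir = Path(data_dir).resolve()
    subprocess.run(
        [
            "docker", "run", "--rm",
            "-v", f"{data_dir}:/data",   # mount the host data directory into the container
            image,
        ] + command,
        check=True,
    )

# Hypothetical example: index a reference genome with a pinned tool version.
run_containerized_step(
    image="example/aligner:1.0.0",
    command=["index", "/data/reference.fa"],
    data_dir="./genome_data",
)
```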

The case for a global clinical oncology digital archive

This brief non-peer-reviewed article by Sanjay Joshi, Isilon CTO of Healthcare and Life Sciences at the Dell EMC Emerging Technologies Team, looks at the global state of imaging in oncology clinical trials. His message? "[C]linical trials need to scale this critical imaging infrastructure component (the VNA) globally as a value-add and integrate it with clinical trials standards like the Clinical Data Interchange Standards Consortium (CDISC) along with the large ecosystem of applications that manage trials." In other words, standards, security, and scale are as important as ever in dealing with data, and clinical imaging is no exception.

Informatics metrics and measures for a smart public health systems approach: Information science perspective

In this 2017 paper published in Computational and Mathematical Methods in Medicine, Carney and Shea of the Gillings School of Global Public Health at University of North Carolina - Chapel Hill take a closer look at what drives intelligent public health system characteristics, and they provide insights into measures and capabilities vital to the public health informatician. They conclude that "[a] common set of analytic measures and capabilities that can drive efficiency and viable models can demonstrate how incremental changes in smartness generate corresponding changes in public health performance." This work builds on existing literature and seeks "to establish standardized measures for smart, learning, and adaptive public health systems."