The use of publicly available data repositories is increasingly encouraged by organizations handling academic researchers’ publications. Funding agencies, data organizations, and academic publishers alike are providing their researchers with lists of recommended repositories, though many aren’t certified (as few as six percent of recommended repositories are certified). Husen et al. show that the landscape of recommended and certified data repositories is varied, and they conclude that “common ground for the standardization of data repository requirements” should be a common goal.
In this 2017 paper, Mathews and Marc paint a picture of laboratory information system (LIS) usability, based on a structured survey with more than 250 qualifying responses: LIS design isn’t particularly user-friendly. From ad hoc laboratory reporting to quality control reporting, users consistently found several key LIS tasks either difficult or very difficult, with Orchard Harvest as the demonstrable exception. They conclude “that usability of an LIS is quite poor compared to various systems that have been evaluated using the [System Usability Scale]” though “[f]urther research is warranted to determine the root cause(s) of the difference in perceived usability” between Orchard Harvest and other discussed LIS.
As calls for open scientific data get louder — particularly in Europe — research institutes, organizations, and data management stakeholders are being forced to consider their data management policies and research workflow in an attempt to better answer those calls. Martin et al. of the National Research Institute of Science and Technology for Environment and Agriculture (Irstea) in France present their institute’s efforts towards those goals in this 2017 paper published in LIBER Quarterly. They refer frequently to the scientific and technical information (STI) professionals who must take on new skills, develop new policies, and implement new tools to better meet the goals of open data. Martin et al. conclude with five points, foremost that these changes don’t happen overnight; the necessary change “requires adaptation to technological developments and changes in scientific practices” in order to be successful.
Comprehending the health informatics spectrum: Grappling with system entropy and advancing quality clinical research
In this 2017 paper published in ICT Express, Huang et al., associated with a variety of power research institutes in China, provide an overview of energy informatics and the standardization efforts that have occurred around it and the concept of the “smart grid.” After describing the various energy systems, technical fundamentals, and standardization efforts, the group concludes that “[l]earning from the successful concepts and technologies pioneered by the internet, and establishing an open interconnection protocol are basic methods for the successful development of the future energy system.” They also add that “[i]t is essential to build an intelligent information system (or platform) to promote the interoperability and coordination of different energy systems and [information and communication technology] planes.”
The electronic laboratory notebook (ELN) is implemented in a wide variety of research environments, but what are the special requirements of a public-private partnership? In this 2016 paper published in PeerJ Computer Science, members of the TRANSLOCATION consortium — “a multinational and multisite public–private partnership (PPP) with 15 academic partners, five pharmaceutical companies and seven small- and medium-sized enterprises” — carefully present the process they took in selecting, installing, supporting, and re-evaluating an ELN for their scientific research. The group concludes that selection, implementation, change management, user buy-in, and value-added ability are all vital to the adoption and use of an ELN in a PPP.
This paper published in Journal of Medical Biochemistry looks back on the benefits and errors associated with the implementation of a laboratory information system (LIS) in the Railway Healthcare Institute of Belgrade, Serbia. Author Vera Lukić explains how their first implementation went wrong and how a new LIS quickly helped improve the workflow of Railway’s lab. Lukić finds that system flexibility and the ability to customize to user needs were most important in their implementation. She concludes that their LIS benefited them through the “increased pace of patient admission, prevention of sample identification errors, prevention of test translation errors, permanent results storage in electronic form, prevention of billing errors, improved time savings and better staff organization,” resulting in “a step forward towards optimization of the total testing process.”
This brief viewpoint article by Deliberato et al. looks at the state of note taking in electronic health records (EHRs) and proposes that EHR developers look to artificial intelligence (AI) components to improve note taking and other tasks in their software. Not only would AI improve note taking, they argue, but “AI would provide helpful suggestions to the user about what information is available and how it might influence the next course of action. AI could also function to emphasize or deemphasize certain elements of the record, based on previous results, external databases, and knowledge networks.”
In this 2017 paper published in the Journal of Biomedical Informatics, Panahiazar et al. “propose a novel metadata prediction framework to learn associations from existing metadata that can be used to predict metadata values.” They tested their framework using experimental metadata contained in the Gene Expression Omnibus (GEO), an international public repository for community-contributed genomic datasets. The framework itself uses association rule mining (ARM) algorithms to predict structured metadata, and they found that indeed ARM is a strong tool towards that goal, though with several limitations. They conclude that with tools like theirs, the discovered “[p]redictive metadata can be used both prospectively to facilitate metadata authoring, and retrospectively to improve, correct and augment existing metadata in biomedical databases.”
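To illustrate the general idea behind association rule mining over metadata — counting how often field–value pairs co-occur and keeping the rules with high support and confidence — here is a minimal sketch in Python. The records and thresholds are hypothetical placeholders, not data or parameters from the Panahiazar et al. paper, and a real framework would use far more sophisticated ARM algorithms.

```python
from itertools import combinations

# Hypothetical GEO-style metadata records; field values are illustrative only.
records = [
    {"organism": "Homo sapiens", "tissue": "liver", "molecule": "total RNA"},
    {"organism": "Homo sapiens", "tissue": "liver", "molecule": "total RNA"},
    {"organism": "Mus musculus", "tissue": "brain", "molecule": "total RNA"},
    {"organism": "Homo sapiens", "tissue": "liver", "molecule": "polyA RNA"},
]

def mine_rules(records, min_support=0.5, min_confidence=0.7):
    """Mine simple one-to-one association rules (X -> Y) from metadata records."""
    n = len(records)
    item_counts, pair_counts = {}, {}
    for rec in records:
        items = sorted(rec.items())  # (field, value) pairs act as "items"
        for item in items:
            item_counts[item] = item_counts.get(item, 0) + 1
        for a, b in combinations(items, 2):
            pair_counts[(a, b)] = pair_counts.get((a, b), 0) + 1
    rules = []
    for (a, b), count in pair_counts.items():
        if count / n < min_support:  # prune infrequent pairs
            continue
        for x, y in ((a, b), (b, a)):  # test the rule in both directions
            conf = count / item_counts[x]
            if conf >= min_confidence:
                rules.append((x, y, count / n, conf))
    return rules

for antecedent, consequent, support, confidence in mine_rules(records):
    print(f"{antecedent} -> {consequent} "
          f"(support={support:.2f}, confidence={confidence:.2f})")
```

With these toy records, the only surviving rules link “organism: Homo sapiens” and “tissue: liver” in both directions, which is exactly the kind of association that could then be used prospectively to suggest a metadata value while a curator authors a new record.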
Rapid development of entity-based data models for bioinformatics with persistence object-oriented design and structured interfaces
When it comes to bioinformatics data, databases are one of the more important behind-the-scenes workhorses. Yet inherent challenges of data heterogeneity and context-dependent interconnection in database design have driven the creation of specialized databases, which has, as a byproduct, caused additional problems in their creation. In this paper, Jerusalem College of Technology’s Ezra Tsur proposes “an open-source framework for the curation of specialized databases,” one that demonstrates “integration of the most relevant technologies to OO-based database design in a single framework” as well as extensibility to function with many other bioinformatics tools.
What are the links between pathology in the clinical setting and bioinformatics? Are residents in pathology gaining enough training in bioinformatics? And why should they learn about it in the first place? Clay and Fisher provide their take on these questions in this 2017 review paper published in Cancer Informatics. From their point-of-view, bioinformatics training is vital to practicing pathology “in the ‘information age’ of diagnostic medicine,” and training should be taken more seriously at the residency and fellowship levels. They conclude that “in order for bioinformatics education to firmly integrate into the fabric of resident education, its importance and broad application to the practice of pathology must be recognized and given a prominent seat at the education table.”
This 2015 paper by Faria-Campos et al. of the Brazilian Universidade Federal de Minas Gerais presents the reader with an overview of FluxCTTX, essentially a cytotoxicity module for the Flux laboratory information management system (LIMS). Citing a lack of laboratory informatics tools that can handle the specifics of cytotoxicity assays, the group developed FluxCTTX and tested it in five different laboratory environments, concluding that it can better “guarantee the quality of activities in the process of cytotoxicity tests and enforce the use of good laboratory practices (GLP).”
PathEdEx – Uncovering high-explanatory visual diagnostics heuristics using digital pathology and multiscale gaze data
The visual and non-visual analysis techniques of pathology diagnosis are still in a relative infancy, representing a complex conglomeration of various data sources and techniques “which currently can be described more like a subjective exercise than a well-defined protocol.” Enter Shin et al., who developed the PathEdEx system, an informatics computational framework designed to pair digital pathology images and pathologists’ eye/gaze patterns associated with those images to better understand and develop diagnostics methods and educational material for future pathologists. “All in all,” they conclude, “the PathEdEx informatics tools have great potential to uncover, quantify, and study pathology diagnostic heuristics and can pave a path for precision diagnostics in the era of precision medicine.”
Where are electronic laboratory notebooks (ELNs) going, and what do they lack? How does data from several user groups paint the picture of the ELN and its functionality and shortcomings? In this 2017 paper published in Journal of Cheminformatics, researchers from the University of Southampton and BioSistemika examine the market and users of ELNs/paper laboratory notebooks, intent on identifying areas where ELNs could be improved. They conclude that optimally there should be “an ELN environment that can serve as an interface between paper lab notebooks and the electronic documents that scientists create,” one that is interoperable and utilizes semantic web and cloud technologies, particularly given that “current technology is such that it is desirable that ELN solutions work alongside paper for the foreseeable future.”
Like many other fields of science, earth science is increasingly swimming in data. Unlike other fields, the discussion of earth science data and its analysis hasn’t been as vigorous, comparatively speaking. Kempler and Mathews of NASA speak of their efforts within the Earth Science Information Partners’ (ESIP) earth science data analytics (ESDA) program in this 2017 paper published in Data Science Journal. The duo share their experiences and provide a set of techniques and skills typical to ESDA “that is useful in articulating data management and research needs, as well as a working list of techniques and skills relevant to the different types of ESDA.”
This brief paper published in BMC Bioinformatics provides a sociological examination of the world of bioinformatics and how it’s perceived institutionally. Bartlett et al. argue that, institutionally, less focus is placed on bioinformatics processes and more on their data inputs and outputs, putting the contributions of bioinformaticists into a “black box” and losing scientific credit in the process. The researchers conclude that “[i]n the pursuit of relevance and impact, future scientific careers will increasingly involve playing the role of a fractional scientist … combining a variety of expertise and epistemic aspirations…” to become “tomorrow’s bioinformatic scientists.”
In this 2017 journal article published in the Data Science Journal, data scientist Sabina Leonelli reflects on a paradigm shift in biology where “specific data production technologies [are used] as proxy for assessing data quality,” creating problems along the way, particularly for the open data movement. Leonelli’s major concern: “Ethnographic research carried out in such environments evidences a widespread fear among researchers that providing extensive information about their experimental set-up will affect the perceived quality of their data, making their findings vulnerable to criticisms by better-resourced peers,” hindering data and provenance sharing. And the conclusion? Endorsing specific data production and management technologies as indicators of data quality can cloud the goals of open data initiatives.
In this 2017 article published in Frontiers in Neuroinformatics, Grigis et al., like many before them, note that data in scientific areas of research such as genetics, imaging, and the social sciences has become “massive, heterogeneous, and complex.” Their solution is a Python-based one that integrates the CubicWeb open-source semantic framework and other tools “to overcome the challenges associated with data sharing and collaborative requirements” found in population imaging studies. The resulting data sharing service (DSS) proves to be flexible, integratable, and expandable based on demand, they conclude.
Wikipedia defines bibliometrics as a “statistical analysis of written publications, such as books or articles.” Related to information and library science, bibliometrics has been helping researchers make better sense of the trends and impacts made across numerous fields. In this 2017 paper, Heo et al. use bibliometric methods new and old to examine the field of bioinformatics via related journals over a period of 20 years to better understand how the field has changed in that time. They conclude that “the characteristics of the bioinformatics field become more distinct and more specific, and the supporting role of peripheral fields of bioinformatics, such as conceptual, mathematical, and systems biology, gradually increases over time, though the core fields of proteomics, genomics, and genetics are still the major topics.”
As Khan and Mathelier note in their abstract, one of the more common tasks of a bioinformatician is to take lists of genomes or genomic regions from high-throughput sequencing and compare them visually. Noting the lack of a comprehensive tool to visualize such complex datasets, the authors developed Intervene, a tool for computing intersections of multiple genomic and list sets. They conclude that “Intervene is the first tool to provide three types of visualization approaches for multiple sets of gene or genomic intervals,” and they have made the source code, web app, and documentation freely available to the public.
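The core computation behind such comparisons — tallying how many members are shared by every combination of input sets — can be sketched in a few lines of Python. The gene lists below are hypothetical placeholders (this is not Intervene’s code or API), and the counts are simple inclusive intersections rather than the exclusive partitions an UpSet-style plot would display.

```python
from itertools import combinations

# Hypothetical gene lists keyed by experiment name; contents are illustrative only.
gene_sets = {
    "ChIP-seq": {"TP53", "MYC", "EGFR", "BRCA1"},
    "RNA-seq":  {"TP53", "MYC", "KRAS"},
    "ATAC-seq": {"TP53", "EGFR", "KRAS", "PTEN"},
}

def intersection_counts(named_sets):
    """Count members shared by every combination of the input sets."""
    counts = {}
    names = list(named_sets)
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            shared = set.intersection(*(named_sets[n] for n in combo))
            counts[combo] = len(shared)
    return counts

for combo, size in sorted(intersection_counts(gene_sets).items()):
    print(" & ".join(combo), "->", size)
```

Counts like these are what a visualization layer (Venn diagram, UpSet plot, or pairwise heatmap) would then render; for more than three or four sets, the number of combinations grows quickly, which is precisely why dedicated visualization tools are needed.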