In this 2019 review paper published in Frontiers in Oncology, Bhattacharya et al. describe the state of collaborative, artificial-intelligence-based computational cancer research within various agencies and departments of the United States. The researchers point to three major initiatives that aim “to push the frontiers of computing technologies in specific areas of cancer research” at the cellular, molecular, and population levels. They present details concerning the three initiatives, enacted as pilot programs with specific goals: Pilot One “to develop predictive capabilities of drug response in pre-clinical models of cancer,” Pilot Two “on delivering a validated multiscale model of Ras biology on a cell membrane,” and Pilot Three “to leverage high-performance computing and artificial intelligence to meet the emerging needs of cancer surveillance.” The authors also address emerging opportunities and challenges arising from these pilots before concluding that “opportunities for extreme-scale computing in AI and cancer research extend well beyond these pilots.”
Genomic and sequencing data are inherently complex and have significant storage requirements. They require a robust infrastructure with well-considered policies to make the most of their potential. While North America and Europe have helped lead the way in this goal, Africa is behind them in the adoption of genomic technologies. Parker et al., of the Human Heredity and Health in Africa (H3Africa) program, have taken on the challenge of provisioning and managing the infrastructure required to meet the goals of various Africa-based genetic research projects. This paper describes the H3Africa Data Archive, “the first formalized human genomic data archive on the continent.” The authors discuss their process and findings, noting various challenges that arose during the implementation process, as well as recognizing the various benefits of such a project.
Process variation detection using missing data in a multihospital community practice anatomic pathology laboratory
In this brief journal article by Ochsner Health System’s Gretchen Galliano, a case is made for a programmatic approach to analyzing missing data in various laboratory information systems (LIS) and determining potential correlations with procedural and systemic processes in the health system. Using the R programming language, custom scripts, and existing R packages, Ochsner Health System visualized and analyzed data from more than 70,000 cases of missing timestamp data, splitting the various cases into five pools. Galliano concludes that the process of “evaluating cases with missing predefined process timestamps” has the potential for improving “the ability to detect other data variations and procedure noncompliance in the AP workflow in a prospective fashion.” She adds that as an additional benefit, “[p]eriodically evaluating data patterns can give AP LIS teams and operations teams insight into user–LIS interactions and may help identify areas that need focus or updating.”
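To illustrate the general approach (though not Galliano’s actual scripts, which were written in R), the sketch below flags cases missing predefined process timestamps and pools them by which fields are absent. The case IDs and timestamp field names are invented for the example and do not reflect Ochsner’s actual LIS schema.

```python
# Hypothetical sketch: flag anatomic pathology cases missing predefined
# process timestamps, then pool them by which timestamp(s) are absent.
from collections import defaultdict

REQUIRED_TIMESTAMPS = ["accessioned", "grossed", "processed", "signed_out"]

def pool_missing_timestamps(cases):
    """Group case IDs by the tuple of required timestamps they lack."""
    pools = defaultdict(list)
    for case in cases:
        missing = tuple(ts for ts in REQUIRED_TIMESTAMPS if not case.get(ts))
        if missing:
            pools[missing].append(case["case_id"])
    return dict(pools)

cases = [
    {"case_id": "S19-001", "accessioned": "2019-01-02T08:15",
     "grossed": "2019-01-02T10:40", "processed": None, "signed_out": None},
    {"case_id": "S19-002", "accessioned": "2019-01-02T09:05",
     "grossed": "2019-01-02T11:00", "processed": "2019-01-03T07:30",
     "signed_out": "2019-01-03T16:45"},
]

pools = pool_missing_timestamps(cases)
# S19-001 lands in the pool missing both "processed" and "signed_out"
```

Pooling by missing-field pattern, as above, is what lets recurring patterns be matched back to specific workflow steps.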
Development and validation of a fast gas chromatography–mass spectrometry method for the determination of cannabinoids in Cannabis sativa L.
In this 2018 paper published in the Journal of Food and Drug Analysis, Cardenia et al. discuss their development of a “routine method for determining cannabinoids” in the flowers of Cannabis sativa L. using fast gas chromatography coupled to mass spectrometry (fast GC-MS), with appropriate derivatization approaches that take into account potential decarboxylation. The authors discuss the various problems with other methods and then present their materials and methods. After considerable discussion of their results, they conclude that the procedure is fast (within seven minutes), has good resolution (R > 1.1), and remains cost-effective. Sensitivity was also high, with “a high repeatability and robustness in both cannabinoid standard mixtures and hemp inflorescence samples.”
Clinical data research networks (CDRNs)—consisting of a variety of health care delivery organizations that share deidentified clinical data for clinical research purposes—constitute yet another collaboratory mechanism for scientific researchers to pool data and make new discoveries. However, one of the faults of CDRN data is that it typically comes from electronic health records (EHRs), which are oriented more toward supporting “clinical operations rather than clinical research.” This means data quality is of the utmost importance when pooling and putting to effective use such disparate data sources. In this research, Khare et al. propose a systematic workflow for making quality assessments of CDRN data before use, a workflow that includes hundreds of systematic data checks and a GitHub-based reporting system to track and correct issues in a more timely fashion. They conclude that their publicly available toolkit has definite value, though implementers should be advised that “sufficient resources should be dedicated for investigating problems and optimizing data” due to the time-intensive nature of the entire process.
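The flavor of such systematic data checks can be sketched in a few lines of Python. The check names, fields, and thresholds below are invented for illustration and are not the actual checks from Khare et al.’s toolkit.

```python
# Illustrative data-quality checks of the kind run against pooled,
# EHR-derived CDRN records: completeness and plausibility. All field
# names and thresholds here are hypothetical.
def check_completeness(records, field):
    """Return the fraction of records missing a required field."""
    missing = sum(1 for r in records if r.get(field) in (None, ""))
    return missing / len(records)

def check_plausibility(records, field, low, high):
    """Return IDs of records whose value falls outside a plausible range."""
    return [r["id"] for r in records
            if r.get(field) is not None and not (low <= r[field] <= high)]

def run_checks(records):
    issues = []
    if check_completeness(records, "birth_year") > 0.05:
        issues.append("birth_year: >5% missing")
    bad = check_plausibility(records, "birth_year", 1900, 2019)
    if bad:
        issues.append(f"birth_year: implausible values in {bad}")
    return issues

records = [
    {"id": "p1", "birth_year": 1985},
    {"id": "p2", "birth_year": 1765},   # implausible value
    {"id": "p3", "birth_year": None},   # missing value
]
report = run_checks(records)
```

A production workflow would run hundreds of such checks per data source and file the resulting issue reports (e.g., as GitHub issues) for tracking and correction.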
Identification of Cannabis sativa L. (hemp) retailers by means of multivariate analysis of cannabinoids
In this 2019 article published in Molecules, Palmieri et al. demonstrate their ability to use nine cannabinoids, a specific analytical method, and multivariate analysis—without any other identifying information—to identify the retailer of 161 hemp samples from four retailers. Highlighting the fact that simply using analyses of Δ9-tetrahydrocannabinol (THC) and cannabidiol (CBD) “to extrapolate the phytochemical composition of hemp” may be insufficient in some cases, the researchers turn to high-performance liquid chromatography–tandem mass spectrometry (HPLC-MS/MS) and partial least squares discriminant analysis (PLS-DA) to identify hemp sample origins. The authors note that using their techniques, “92% of the hemp samples were correctly classified by the cannabinoid variables in both fitting and cross-validation.” They conclude “that a simple chemical analysis coupled with a robust chemometric method could be a powerful tool for forensic purposes.”
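As a far simpler stand-in for the PLS-DA used in the paper, the sketch below classifies a hemp sample’s retailer by the nearest centroid of its cannabinoid profile. The retailer names and concentration values are invented, and nearest-centroid classification is a deliberate simplification of the actual chemometric method.

```python
# Hypothetical sketch: assign a cannabinoid profile to the retailer whose
# training-set centroid it lies closest to (a simplified stand-in for
# PLS-DA). Each profile is a vector of cannabinoid concentrations.
import math

training = {
    "RetailerA": [[2.1, 0.4, 1.3], [2.0, 0.5, 1.2]],
    "RetailerB": [[0.6, 1.8, 0.2], [0.7, 1.7, 0.3]],
}

def centroid(samples):
    """Mean profile across a retailer's training samples."""
    return [sum(col) / len(samples) for col in zip(*samples)]

def classify(profile, training):
    """Label of the retailer with the nearest centroid."""
    centroids = {label: centroid(s) for label, s in training.items()}
    return min(centroids, key=lambda lbl: math.dist(profile, centroids[lbl]))

label = classify([2.05, 0.45, 1.25], training)
```

PLS-DA improves on this by projecting the profiles onto latent variables that maximize class separation before classifying, which is what allows the 92% correct-classification rate reported by the authors.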
The concepts of data sharing and open science have been touted more over the past decade, often in the face of claims of lack of reproducibility and the need for more collaboration across scientific disciplines. At times researchers will point to a specific “culture” evident in their organization that helps or hinders the move towards data sharing. But the concept of aligning data cultures—particularly through the lens of identifying and solving the inherent differences between disciplines—isn’t the way to look at data sharing, argue Poirier and Costelloe-Kuehn. Instead, we must “showcase and affirm the diversity of traditions and modes of analysis that have shaped how data gets collected, organized, and interpreted in diverse settings,” they say. In this essay, the authors present their heuristic (a problem-solving and self-discovery method) for sharing data at scale, from the meta level down to the nano level, giving researchers the tools to “affirm and respect the diversity of cultures that guide global and interdisciplinary research practice.”
Design and evaluation of a LIS-based autoverification system for coagulation assays in a core clinical laboratory
In this 2019 paper published in BMC Medical Informatics and Decision Making, Wang et al. of China Medical University demonstrate the results of an attempt to add autoverification mechanisms for coagulation assays into their laboratory information systems (LIS) to improve both operations and patient care. After providing background on coagulation assays and autoverification guidelines, the researchers describe their methodology for programmatically developing autoverification decision rules and implementing them into their laboratory workflow. Additionally, they discuss how best to validate the new system and assess its results. The authors conclude that not only has the new system improved turnaround time in the lab, but it also has improved the level of medical safety in its diagnoses in the affiliated hospitals.
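The core idea of autoverification—releasing in-limit results automatically and holding exceptional ones for a technologist’s review—can be sketched as a small rule table. The analytes, units, and limits below are illustrative placeholders, not the validated rules from Wang et al.’s LIS.

```python
# Hedged sketch of rule-based autoverification for a coagulation panel.
# Critical limits here are invented; a real LIS encodes the laboratory's
# validated decision rules. Limits are (low, high); None means no bound.
RULES = {
    "PT":   {"critical": (None, 30.0)},   # prothrombin time, seconds
    "APTT": {"critical": (None, 70.0)},   # activated partial thromboplastin time
}

def autoverify(result):
    """Return 'auto-release' or 'hold for review' for one result panel."""
    for analyte, value in result.items():
        rule = RULES.get(analyte)
        if rule is None:
            return "hold for review"       # unknown analyte: never auto-release
        low, high = rule["critical"]
        if (low is not None and value < low) or \
           (high is not None and value > high):
            return "hold for review"       # critical value: manual review
    return "auto-release"

status = autoverify({"PT": 12.3, "APTT": 28.0})
```

Real rule sets also layer in delta checks against prior results and instrument error flags; the turnaround-time gains the authors report come from the large fraction of unremarkable panels that rules like these release without human intervention.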
In this 2019 paper published in the International Journal of Online and Biomedical Engineering, authors Hodhod et al. present their expert system CyberMaster, designed “to assist inexperienced instructors with cybersecurity course design.” They highlight the need for improved cybersecurity training in not only universities but also the workplace and give some underlying reasons for why cybersecurity issues seem to be increasing. The authors turn to the National Institute of Standards and Technology (NIST) National Initiative for Cybersecurity Education Framework (NICE Framework) for the development of CyberMaster. After describing its creation and implementation, they conclude that “[t]he system contributes to changing the current status of cybersecurity education by helping instructors anywhere in the world to develop cybersecurity courses.”
Compared to other U.S. states, California arguably has some of the strictest laws regarding the laboratory testing of cannabis. Economically, what have been some of the effects of these regulations? Valdes-Donoso et al. attempt to contribute to that conversation in this 2019 paper published in California Agriculture. Using state regulations, expert opinions, primary data from California’s laboratories, and data from cannabis testing equipment manufacturers, the authors attempt to estimate the cost per pound of testing and sampling under the state’s regulatory framework. This includes what they consider to be particularly costly: cases where cannabis fails testing and is rejected. They conclude their research by discussing the economic and regulatory implications of their findings, including supply and demand issues, costs of legal vs. illegal cannabis, and comparisons to other state-mandated agricultural testing in the state.
In this brief paper published in Frontiers in Marine Science, Armstrong et al. present the details of their OceanWorks integrated data analytics platform (IDAP) (which later was open sourced as the Apache Science Data Analytics Platform [SDAP]). Confronted with disparate data management solutions for performing research on oceanographic data, the authors developed OceanWorks to provide an integrated platform capable of advanced data queries, data analysis, anomaly detection, data matching, data subsetting, and more. Since its creation, OceanWorks has been deployed in multiple NASA environments to handle a wide variety of data management tasks at various deployment intensities. They conclude that under its open-source SDAP iteration, the software platform will “continue to evolve and leverage any new open-source big data technology” in order “to deliver fast, web-accessible services for working with oceanographic measurements.”
Over the last decade, various researchers have proposed numerous methods for improving the security and privacy of critical data on virtual machines hosted on remote cloud servers, particularly when mobile applications are involved. Annane and Ghazali review those research efforts in this 2019 paper published in the International Journal of Interactive Mobile Technologies and detail the various gaps in that research. After carefully looking at the strengths and drawbacks of numerous techniques, the authors additionally propose the three biggest challenges not well addressed in that research: scalability, credibility, and robust communication techniques. They also suggest future research that will address “three secure policies to protect users’ sensitive data against co-resident and hypervisor attacks, as well as preserve the communication of users’ sensitive data when deployed on a different cloud host.”
In this brief article by Greene et al. of the National Institute of Standards and Technology (NIST), details of the organization’s attempt to open access to its scientific data infrastructure are provided. After introducing the details of the type of data held at NIST, as well as the U.S. government’s “open access to research” (OAR) initiative, the authors describe the technological architecture that went towards making NIST’s data more open. They describe the OAR application workflow all the way to the end, where efforts to “enable data interoperability as much as possible in order to maximize the usability of NIST data” were put in place. The authors conclude that their system not only meets OAR requirements, making basic data access a breeze, but it also “will facilitate the use of AI and machine learning applications and help solve many complexities in mining data-rich resources.”
Development of standard operating protocols for the optimization of Cannabis-based formulations for medical purposes
In this 2019 paper by Baratta et al., published in Frontiers in Pharmacology, the authors examine a variety of different procedures for preparing decoctions and oil-based extracts using the Cannabis plant. Using the different formulations as a base, they want to determine the efficacy of the various “standard operating procedures for the preparation and optimization of Cannabis-based galenic formulations.” After discussing the materials and methods, and reviewing the results, the authors discuss the ramifications of their research, particularly in contrast to what the Italian government recommends procedure-wise. They conclude that their “β-4” method of oil preparation yields recoveries “significantly higher [recovered THC and CBD] compared to those with water-based extraction (decoctions) or other current oil-based extraction techniques.”
Where do technology, security, and the DNA synthesis market intersect? Most notable is the need for minimizing biological risks and maximizing the safety and security of DNA synthesis practices around the world. Diggans and Leproust of Twist Bioscience highlight this major intersection and discuss what they and the International Gene Synthesis Consortium (IGSC) view as the most important aspects of DNA synthesis to address in order to mitigate risks and improve global safety and security. Using the IGSC’s Harmonized Screening Protocol as a base, the authors provide background on the subject and address current industry best practices and how they could be improved using cybersecurity and strong software testing methodologies. They also address how research funding priorities should be addressed to build and maintain databases, reduce risks, and “democratize access to sequence screening.” The authors conclude that a multi-faceted approach of practices such as “red teaming,” stronger investments in screening—particularly with oligonucleotide pools—and stronger efforts to “teach and promote the evaluation of the security implications of new synthetic biology techniques or materials” to practicing synthetic biologists will yield positive results for the future of DNA synthesis.
Japan Aerospace Exploration Agency’s public-health monitoring and analysis platform: A satellite-derived environmental information system supporting epidemiological study
In this 2019 paper published in Geospatial Health, Oyashi et al. present details on the Japan Aerospace Exploration Agency’s (JAXA) Public-health Monitoring and Analysis Platform (PMAP). The web-based system was designed to take a wide variety of environmental data and present it in such a way as to allow epidemiologists to make local, regional, national, or even global insights. Using more than 30 years of archived Earth-observed satellite data from multiple sources, the authors demonstrate the system and explain its underlying technology. While recognizing some of the inherent limitations of optical-sensor-based data towards user goals, the team concludes that its PMAP system successfully takes “various satellite-derived environmental information related to epidemiological data and pre-processes the data to improve its accessibility for epidemiological research.”
Where do government regulations, privacy, big data, and the management of the energy grid infrastructure collide? The implementation and management of modern smart grids, that’s where. In this 2019 paper by Hatzakis et al., a major distribution system operator (DSO) in the Netherlands is at the heart of discovering how Europe’s General Data Protection Regulation (GDPR) is affecting modern implementations of smart energy grids. From cybersecurity risks to public concerns about the privacy of energy data, this article reviews various ethical issues, concluding with “the need for clarification in practice of privacy policies (particularly of GDPR) to lift concerns about the capability of organizations to remain within its boundaries without holding back progress.”
This brief perspective article by Anwar et al. of the BHF Centre for Cardiovascular Science at University of Edinburgh examines the potential for using health informatics to improve heart failure outcomes in clinical settings. Noting that “decompensated heart failure accounts for up to five percent of all acute unscheduled hospital admissions and has the longest length of stay of any cardiac condition,” the authors remind readers that despite significant evidence-based practices backed by science, there continues to be a disconnect between those practices and actual clinical practice. They then turn to a 2019 cohort study of nearly 100,000 U.K. patients to provide hints at how collecting, managing, and effectively using heart failure data in contemporary clinical practice can be beneficial to many. The authors conclude that “we need to collate healthcare data from across both primary and secondary care settings in real time and use robust methodology to evaluate major changes in clinical practice or policy decisions” and present a visualization of what such a platform might look like.
Cybersecurity for biopharmaceutical manufacturing? The “rapid pace of innovation dictates that it is not too early to consider the cyberbiosecurity implications” of the potential dangers that can arise in the industry, argue Mantle et al. in this 2019 paper published in Frontiers in Bioengineering and Biotechnology. From compromised master cell banks to intentional corruption of “the design, reading, and writing of DNA sequences to produce pathogenic, self-replicating entities,” the authors demonstrate worst-case scenarios and practical considerations that scientists should be considering when incorporating various forms of automation into biopharmaceutical manufacturing. After presenting their scenarios and considerations, they conclude that “current best practices from industrial manufacturing and state-of-the-art cybersecurity could serve as a starting point to safeguard and mitigate against cyberbiosecurity threats to biomanufacturing.”
A bibliometric analysis of Cannabis publications: Six decades of research and a gap on studies with the plant
What are the trends and glaring holes in scientific publications regarding the Cannabis plant? We know some parts of the world have been more proactive in said research, often thanks to slightly more lax regulations than other countries. But where is the research coming from, and what still must be addressed? Matielo et al. explored these questions in a bibliometric analysis of roughly six decades of publications. They found an increase in the relational study of cannabis and its effects on human genetics but a significant dearth of publications on the genetics of the plant itself. Only since about 2005 have genetic studies of the plant picked up. The authors found several other patterns in their analysis, included in the paper’s “Discussion” section.