• Table of Contents
    {"ID":78063,"post_author":"9208550","post_date":"2018-12-14 14:31:40","post_date_gmt":"0000-00-00 00:00:00","post_content":"","post_title":"LIMSjournal - Fall 2018","post_excerpt":"","post_status":"draft","comment_status":"closed","ping_status":"closed","post_password":"","post_name":"","to_ping":"","pinged":"","post_modified":"2018-12-14 14:31:40","post_modified_gmt":"2018-12-14 19:31:40","post_content_filtered":"","post_parent":0,"guid":"https:\/\/www.limsforum.com\/?post_type=ebook&#038;p=78063","menu_order":0,"post_type":"ebook","post_mime_type":"","comment_count":"0","filter":"","_ebook_metadata":{"enabled":"on","private":"0","guid":"E180E4AA-004E-4410-AE0F-BBC279445A09","title":"LIMSjournal - Fall 2018","subtitle":"Volume 4, Issue 3","cover_theme":"nico_4","cover_image":"https:\/\/www.limsforum.com\/wp-content\/plugins\/rdp-ebook-builder\/pl\/cover.php?cover_style=nico_4&subtitle=Volume+4%2C+Issue+3&editor=Shawn+Douglas&title=LIMSjournal+-+Fall+2018&title_image=https%3A%2F%2Fs3.limsforum.com%2Fwww.limsforum.com%2Fwp-content%2Fuploads%2FFig2_Evans_Informatics2017_4-4.jpg&publisher=LabLynx+Press","editor":"Shawn Douglas","publisher":"LabLynx Press","author_id":"26","image_url":"","items":{"51ebf6ac1bdd905b9c8c8d23fe8b8a29_type":"article","51ebf6ac1bdd905b9c8c8d23fe8b8a29_title":"Password compliance for PACS work stations: Implications for emergency-driven medical environments (Mahlaola and van Dyk 2017)","51ebf6ac1bdd905b9c8c8d23fe8b8a29_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments","51ebf6ac1bdd905b9c8c8d23fe8b8a29_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:Password compliance for PACS work stations: Implications for emergency-driven medical environments\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nPassword 
compliance for PACS work stations: Implications for emergency-driven medical environments\nJournal\n \nSouth African Journal of Bioethics and Law\nAuthor(s)\n \nMahlaola, T.B.; van Dyk, B.\nAuthor affiliation(s)\n \nUniversity of Johannesburg\nYear published\n \n2017\nVolume and issue\n \n10(2)\nPage(s)\n \n62\u20136\nDOI\n \n10.7196\/SAJBL.2017.v10i2.00600\nISSN\n \n1999-7639\nDistribution license\n \nCreative Commons Attribution-NonCommercial 4.0 International\nWebsite\n \nhttps:\/\/www.ajol.info\/index.php\/sajbl\/article\/view\/165242\nDownload\n \nhttps:\/\/www.ajol.info\/index.php\/sajbl\/article\/download\/165242\/154702 (PDF)\n\nContents\n\n1 Abstract \n2 Introduction \n3 Background \n4 Methods \n5 Results \n6 Discussion \n7 Conclusion \n8 Acknowledgements \n\n8.1 Author contributions \n8.2 Funding \n8.3 Conflicts of interest \n\n\n9 References \n10 Notes \n\n\n\nAbstract \nBackground: The effectiveness of password usage in data security remains an area of high scrutiny. Literature findings do not inspire confidence in the use of passwords. Human factors such as the acceptance of and compliance with minimum standards of data security are considered significant determinants of effective data-security practices. However, human and technical factors alone do not provide solutions if they exclude the context in which the technology is applied.\nObjectives: To reflect on the outcome of a dissertation which argues that the minimum standards of effective password use prescribed by the information security sector are not suitable to the emergency-driven medical environment, and that their application as required by law raises new and unforeseen ethical dilemmas.\nMethod: A close-ended questionnaire, the Picture Archiving and Communication System Confidentiality Scale (PAC-CS), was used to collect quantitative data from 115 health professionals employed in both a private radiology and a hospital setting. 
The PAC-CS sought to explore the extent of compliance with accepted minimum standards of effective password usage.\nResults: The percentage compliance with minimum standards was calculated. A statistically significant difference (p&lt;0.05) between the expected and observed data-security practices was recorded.\nConclusion: The study interrogates the suitability of adherence to minimum standards of effective password usage in an emergency-driven medical environment and calls for much-needed debate in this area.\n\nIntroduction \nThe effectiveness of password usage in data security has been heavily criticized. A variety of assumptions regarding password usage have been made, depending on the focus of the literature. From a technical perspective, passwords are considered ineffective in restricting access only to individuals with authorized and legitimate access to data.[1] Engineers suspect that human factors play a significant role in determining the effectiveness of technical safeguards, so that human beings are deemed the weakest link in data security.[2] It remains unclear whether the use of passwords is effective in safeguarding electronic data.\nLiterature findings do not inspire confidence in the usage of passwords for data security. Several quotes taken from various points in time attest to this, for example: \"Boot passwords, put your computer under lock and key\"[3]; \"Goodbye, passwords. You aren\u2019t a good defense\"[4]; and more recently, \"Forget passwords \u2013 use your face instead.\"[5]\nThere is extensive literature focusing on the effectiveness and suitability of password usage in preventing confidentiality breaches within environments such as computer security. The researchers are not aware of similar studies relating to the suitability of password usage within the medical environment. 
The aim of this article is to bring to the fore factors unique to the medical environment that argue against the direct \"copy and paste\" adoption of the minimum standards for effective password usage from computer security into the medical environment.\n\nBackground \nThe use of passwords is ineffective in restricting access only to individuals who are authorized to access data. This popular and easy means of controlling access to data may, in fact, provide the easiest way to breach confidentiality. Information technologists insist that with proper management, passwords are an effective means of protecting the security of data. Measures include, but are not limited to, the use of strong passwords, having individual rather than shared passwords, and changing passwords on a regular basis.[6]\nCompliance with the minimum standards for effective password usage requires knowledge of and to some extent expertise in data security on the part of the healthcare provider.[7] However, the responsibility to comply cannot be placed solely on the healthcare provider. Standards for effective password usage should be well accepted and applied by all users of the technology. At times, factors unique to the medical field may influence the acceptance of security measures. For instance, in a medical emergency, there may be a legitimate need to circumvent the minimum standards of effective password usage in order to save a life.[2][8] It is for this reason that the contributions of both human and technical factors in normative research are noteworthy, but will never be adequate if the context in which technology is applied remains excluded.\nThis paper draws on the assumption that the situated use of technology creates challenges to the inscribed ethics of technology use, resulting in the emergence of new ethical dilemmas. 
Based on this assumption, we argue that the proper management of passwords as described in the environment of computer security is not suitable to the emergency-driven medical environment. In this paper, we reflect on the research outcome of the first author\u2019s dissertation in putting this argument forward.[9]\n\nMethods \nA picture archiving and communication system (PACS) is a digital storage system designed to address the limitations of film and paper records. A conventional storage system imposes disadvantages that become an impediment to the continuity of patient care, because the records could be easily misplaced and therefore difficult to retrieve, resulting in delayed medical treatment.[10] PACS is inherently a radiology archiving system that may be extended to various other sections within a hospital. It allows for remote and instant access to radiology data by a multidisciplinary complement of health professionals (HPs) who are based in different locations within a hospital setting, so that the data of the same patient may be accessed simultaneously by different HPs.[11] PACS has contributed to improved patient care by increasing efficiency and the accessibility of data, and has led to fewer delays in the clinical management of patients.[11] The electronic nature of PACS makes it possible for patients\u2019 data to be accessed, duplicated, and exported without the patient\u2019s knowledge and consent.[12] The use of passwords aids in restricting access to PACS data, to minimize the risk of breaching patient confidentiality.\nThe original research aimed to determine the extent to which the practices of HPs complied with patient-confidentiality principles when using PACS. The study invitation was initially extended to six hospitals in Johannesburg. 
However, owing to a 75% refusal rate among this group, the eventual study sample was drawn instead from a private hospital and radiology setting affiliated with different healthcare-facility groups located in Johannesburg. The selection criteria included HPs who were willing to participate and were using PACS either as part of routine activity or as a means of delivering patient care. The study sample comprised a multidisciplinary complement of HPs such as radiologists, radiographers, student radiographers, doctors, medical specialists, and nurses.\nPrior to data collection, ethical clearance was obtained from the research settings as well as the research committee of the University of Johannesburg (ref. no. HDC67\/02-2011), South Africa (SA). Data were collected from various sections within the hospital, namely radiology, emergency, casualty, theatre, and intensive-care units, including coronary care, acute care, respiratory, trauma intensive care, neurology, and surgical-care units. Data were collected over a period of three months using a self-designed questionnaire, the Picture Archiving and Communication System Confidentiality Scale (PAC-CS). Consent was obtained verbally and was implied through the completion of the PAC-CS. Informed consent was ensured by allowing participants to ask questions relating to the study, and the data were anonymized. Access to study data was restricted to the researchers.\nThe PAC-CS design was informed by the content of the ISO\/IEC 17799:2005[13] standard, from which the constructs, the choice of questions, and the quantification were derived and adapted. The ISO\/IEC 17799:2005 is a model used in information technology to benchmark an organization\u2019s compliance with international standards of data security. The consistency of the PAC-CS design with the ISO 17799 model helped to establish its content validity and reliability. 
A sample size of 115 participants was achieved through the hand-delivery of PAC-CS using a non-probability quota-sampling technique.[14]\nA quantitative, correlational design was deemed suitable for determining the extent of compliance of the situated practices of effective password usage by HPs with minimum standards for effective password usage. The lack of guidelines pertaining to PACS by the Health Professions Council of SA (HPCSA) at the time of this study led to the use of the Health Insurance Portability and Accountability Act (HIPAA)\u2019s security rule of 1996 as an alternative model for compliance with data-security rules.[15][16] The HIPAA security rule is a detailed outline of the national standards and steps necessary to protect electronic health information from inadvertent disclosures through breaches of security. The choice of this U.S. legislation was informed by its reputation as one of the best regulatory rules pertaining to electronic data security, embedded in the fact that it is continually updated in line with technological advances, and most importantly, addresses the security needs of PACS technology explicitly.[16]\nThe participant responses were analyzed by an independent statistician using the Statistical Package for Social Sciences (SPSS, USA) version 16. The quantified responses were expressed in terms of frequency counts and compliance percentage. A 90% benchmark was set for minimum compliance with technical safeguards, whereas a 10% benchmark indicated an intolerable level of non-compliance. 
Statistical significance (p&lt;0.05) was calculated using the One Sample Chi-Square test for non-parametric data, the choice of which was informed by the lack of randomization, the sample size and the type of data collected.[14] While the cross-tabulations were used to determine the degree of statistical significance, the phi coefficient helped to calculate the extent of the correlation, the strength of which was determined by the Pearson Chi-Square test.\nSection A of the PAC-CS focused on the compliance of technical and physical safeguards with international standards. The responses to the close-ended questions regarding technical safeguards in terms of password usage, namely the type of passwords and the frequency of password changes, are presented here.\n\nResults \nThe study results were evaluated in line with the following definition: the situated practices for effective password usage of HPs are conceptually defined as the complete range of functions, activities, roles, responsibilities, and decision-making capabilities that individuals are competent, educated, and authorized to perform within a specified work environment in complying with the minimum standards of effective password usage. Table 1 and Fig. 1 summarize the study questions and the corresponding responses that relate to the effectiveness of passwords when using PACS technology.\n\nTable 1. Summary of effective password usage\nBenchmark\tResponse\tExpected, n (%)\tObserved, n (%)\tExtent of compliance (p=0.000)\nDo you have a unique PACS access code? (N=113)\nYes=90%\tYes\t102 (90%)\t31 (27%)\t27%\n\tNo\t11 (10%)\t82 (73%)\t\nDoes your department have a PACS access code which everybody uses? (N=114)\nNo=90%\tYes\t11 (10%)\t89 (78%)\t22%\n\tNo\t103 (90%)\t25 (22%)\t\nCan you access data from the PACS without using an access code? (N=110)\nNo=90%\tYes\t11 (10%)\t25 (23%)\t77%\n\tNo\t99 (90%)\t85 (77%)\t\nApproximately how long does the PACS work station remain active? (N=113)\n&lt;1 minute=90%\t&lt;1 minute\t102 (90%)\t3 (3%)\t3%\n\t1 minute\t8 (7%)\t4 (4%)\t\n\t&gt;1 minute\t2 (2%)\t44 (39%)\t\n\tAll the time\t1 (1%)\t62 (54%)\t\n\nFigure 1. A sketch of Intervene\u2019s command line interface and web application, and input data type\n\nAccording to Table 1, 102 participants (90% of the sample) were expected (EN) to use individual passwords to access PACS. Only 27% of the participants complied with the use of individual passwords, while 78% reported using a shared departmental code. A further 23% of participants accessed PACS without requiring a password, and only 2% changed their PACS passwords on a monthly basis. Moreover, a mere 3% of the PACS workstations remained active for less than a minute. In determining the extent of the non-usage of passwords, cross-tabulations between the radiology and non-radiology groups were conducted. Fig. 1 demonstrates that staff members in radiology departments accessed the PACS workstations without the use of an access code to a greater extent than their non-radiology counterparts.\n\nDiscussion \nPACS workstations are purposefully designed to provide instant access to data. By design, PACS is inherently a password-driven technology. The password requirement serves (i) as a means of restricting access to data only to authorized PACS users and (ii) to authenticate the person accessing the data. The password-driven nature of PACS in itself is a form of an inscribed ethic designed to protect the confidentiality of patient data stored in the PACS. In protecting confidentiality, patients\u2019 privacy is secured, and the intrinsic value of patients as human beings is recognized. 
Not all types of passwords are considered effective in delivering the ethics inscribed in PACS technology. The minimum standards for effective password usage necessitate that passwords are long and contain a variety of characters that would not be easy to crack.[6] As a gold standard, individual passwords rather than shared passwords are recommended, and these need to be changed frequently. The benefit of effective access restriction is the protection of patient confidentiality, which HPs are obligated to uphold. In the original study, the motivations informing the choice of passwords for the various departments within a hospital setting could not be ascertained. This paper draws on other literature findings to explore possible reasons for poor compliance with the minimum standards of effective passwords, specifically for emergency departments.\nThe study outcomes vary from 27% of participants using individual passwords, to 78% who used shared departmental passwords. In cases where the automatic log-off was disabled, participants accessed PACS without requiring passwords, and this accounted for 23% of the results. The multidisciplinary nature of the study participants introduces a range of functions, activities, roles, and responsibilities that should be considered within a specified work environment when explaining the inconsistency in the types of passwords used. It appears that some sections within the hospital setting used passwords that were unique to each department and shared by all members within that particular section, accounting for the 78% use of shared departmental passwords. It could not be ascertained whether the choice of departmental passwords complied with the requirement for hard-to-crack passwords. 
It may be postulated that the departmental password should be easy to remember, and have predictable features that are not consistent with hard-to-crack passwords.\nPerhaps the staffing issues unique to the medical setting provide compelling reasons for the use of departmental passwords. For instance, nursing departments employ a significant number of temporary staff, while casualty officers and some specialist doctors, such as traumatologists, work on an on-call basis whereby they may rotate within the public and private sectors. Setting up an individual password for each of the temporary and rotational staff may be a costly, time-consuming and futile exercise when a staff member may be employed only for one day. It may not be possible to set up passwords for an urgent replacement organized at the last minute to replace a staff member who called in sick for duty.\nUnlike general wards and intensive-care units where nurses, referring doctors, and radiographers could all access PACS, the radiology department is mainly accessed by radiology staff, making it susceptible to practices of accessing PACS without requiring a password. This practice may be endorsed by the culture of trust that dominates medical environments, in which HPs are considered to be ethical beings who respect confidentiality and therefore require minimal supervision.[7] Emergency and theatre departments may be a further example of environments where passwords are not utilized. In contrast, doctors\u2019 consulting rooms may be suitable for the use of individual passwords, accounting for the 27% reported in this study. The advantage of using individual passwords is that improper conduct relating to data security may be traced back to the offender. Audit trails are mandatory by law, as otherwise, how would violations of confidentiality be punished?\nIn a medical emergency, a patient\u2019s life may be threatened by the sudden and unexpected development of a health condition. 
High unpredictability and the requirement for expedited service delivery are characteristic of a medical emergency department. The need for efficiency raises challenges that require a balance between the right to life, efficiency, and the protection of human dignity. The right to dignity and the right to life are enshrined in sections 10 and 11 of the SA Constitution, respectively.[17] Section 2.3(a) of the Patient\u2019s Rights Charter states that everyone has the right to receive timely emergency care.[18]\nMembers of the emergency team never know what to expect at any given point in time, resulting in feelings of anxiety.[19] When attending to multiple patients at the same time, overcrowding, high noise levels, and fatigue may result in interruptions of the thinking and decision-making process.[20] These factors are cited in the literature as the leading causes of errors in diagnosis associated with clinical emergencies.[21] The need for efficiency in emergency departments induces stress in members of the emergency team. Individuals are likely to forget passwords that are long and contain a variety of characters, especially when working under stressful conditions.[6][22][23] Perhaps considerations regarding the right to life and timely access to emergency care inform some of the practices that result in the accessing of the PACS without requiring a password. Similar reasons may account for the 54% of PACS workstations that were not capable of automatic log-off, causing them to remain active all the time.\n\nConclusion \nThis paper highlights the dilemma in emergency departments between the need for efficient patient treatment and respect for patients\u2019 ethical rights. In a medical environment dominated by a culture of trust, human dignity may not be the primary concern, especially when competing with the supreme right to life. 
However, just because HPs are inclined to trust one another, based on the assumption that HPs are ethical beings who respect patient confidentiality, this does not mean that all HPs are trustworthy. There may be occasions when patients suspect that HPs may abuse their privileges of access to medical records.[24]\nThe protection of patient data requires the fulfillment of diligent security measures, including the use of effective passwords and automatic computer log-off. These measures may be time-consuming and therefore not suitable for the levels of efficiency needed in emergency departments. The use of effective passwords is necessary to protect human dignity, the provision of which is enshrined in section 14(d) of the SA Constitution.[17] Yet, practices that are compliant with the minimum standards of effective passwords stand to threaten the supreme human right to life.\nIn a medical emergency, seconds count. Computers take ~60 seconds to initialize and authenticate the user, excluding the additional time needed to process an image or to call up patient data.[25] Depending on the type of medical emergency, 60 seconds could mean the difference between organ impairment and death. Eliminating the time for computer initialization and authentication could go a long way towards saving lives. At the time of this report, there were no data to suggest that lives have been lost as a result of computer initialization and authentication. However, the lack of data does not mean that incidents have not occurred or will not occur in the future.\nIt remains unclear whether compliance with the minimum standards for effective password usage is suitable to emergency departments. This article may have contributed to normative ethics in asking the question as to whether medical emergency departments ought to be an exception to the minimum standards of effective password usage. The reasons for non-compliance presented in this article are mere suggestions drawn from the literature. 
Future research is needed, firstly, to determine reasons for non-compliance specific to the use of PACS in an emergency department; and secondly, to determine alternative security measures that would aid in preserving patient confidentiality in such departments.\n\nAcknowledgements \nThis article presents the findings of a Master\u2019s dissertation obtained through the University of Johannesburg. Both authors greatly appreciate the permission granted by the research settings and the contribution of the participants.\n\nAuthor contributions \nTBM and BvD worked jointly in preparing and approving the article for submission. In her capacity as the primary researcher, TBM contributed to project design and data collection. BvD was the study supervisor, and made enormous contributions to the conceptual framework and editing of the manuscript.\n\nFunding \nNone.\n\nConflicts of interest \nNone.\n\nReferences \n\n\n\u2191 Dayarathna, R.&#32;(2009).&#32;\"The principle of security safeguards: Unauthorized activities\".&#32;Computer Law &amp; Security Review&#32;25&#32;(2): 165\u201372.&#32;doi:10.1016\/j.clsr.2009.02.012. &#160; \n\n\u2191 2.0 2.1 Ifinedo, P.&#32;(2012).&#32;\"Understanding information systems security policy compliance: An integration of the theory of planned behavior and the protection motivation theory\".&#32;Computers &amp; Security&#32;31&#32;(1): 83\u201395.&#32;doi:10.1016\/j.cose.2011.10.007. &#160; \n\n\u2191 Steers, K.&#32;(2003).&#32;\"Boot passwords, put your PC under lock and key\".&#32;PC World&#32;21&#32;(9): 168. &#160; \n\n\u2191 Stross, R.&#32;(09 August 2008).&#32;\"Goodbye, Passwords. You Aren\u2019t a Good Defense\".&#32;The New York Times.&#32;https:\/\/www.nytimes.com\/2008\/08\/10\/technology\/10digi.html .&#32;Retrieved 27 May 2017 . 
&#160; \n\n\u2191 Graham, J.&#32;(05 January 2015).&#32;\"Forget passwords - use your face instead\".&#32;USA Today.&#32;https:\/\/www.pressreader.com\/usa\/usa-today-us-edition\/20150105\/281801397332402 . &#160; \n\n\u2191 6.0 6.1 6.2 Payton, L.&#32;(2010).&#32;\"Memory for Passwords: The Effects of Varying Number, Type, and Composition\".&#32;PSI CHI Journal of Psychological Research&#32;15&#32;(4): 209\u201313.&#32;doi:10.24839\/1089-4136.JN15.4.209. &#160; \n\n\u2191 7.0 7.1 Williams, P.A.H.&#32;(2008).&#32;\"In a \u2018trusting\u2019 environment, everyone is responsible for information security\".&#32;Information Security Technical Report&#32;13&#32;(4): 207\u201315.&#32;doi:10.1016\/j.istr.2008.10.009. &#160; \n\n\u2191 Robinson, R.&#32;(2016).&#32;\"Moral Distress: A Qualitative Study of Emergency Nurses\".&#32;Dimensions of Critical Care Nursing&#32;35&#32;(4): 235\u201340.&#32;doi:10.1097\/DCC.0000000000000185. &#160; \n\n\u2191 Mahlaola, T.B.&#32;(20 January 2015),&#32;Compliance of health professionals with patient confidentiality when using PACS and RIS,&#32;University of Johannesburg,&#32;https:\/\/ujcontent.uj.ac.za\/vital\/access\/manager\/Repository\/uj:13153 &#160; \n\n\u2191 Beach, J.; Oates, J.&#32;(2014).&#32;\"Maintaining best practice in record-keeping and documentation\".&#32;Nursing Standard&#32;28&#32;(36): 45\u201350.&#32;doi:10.7748\/ns2014.05.28.36.45.e8835. &#160; \n\n\u2191 11.0 11.1 Bolan, C.&#32;(2013).&#32;\"Technology Trends: A view of the future image exchange\".&#32;Applied Radiology&#32;42&#32;(11): 32\u20137.&#32;https:\/\/appliedradiology.com\/articles\/technology-trends-a-view-of-the-future-image-exchange . &#160; \n\n\u2191 Benatar, D.&#32;(2010).&#32;\"Indiscretion and other threats to confidentiality\".&#32;South African Journal of Bioethics &amp; Law&#32;3&#32;(2): 59\u201362.&#32;http:\/\/www.sajbl.org.za\/index.php\/sajbl\/article\/view\/101 . 
&#160; \n\n\u2191 Bird, K.&#32;(20 June 2005).&#32;\"Improved ISO\/IEC 17799 makes information assets even more secure\".&#32;ISO News.&#32;International Organization for Standardization. &#160; \n\n\u2191 14.0 14.1 Daniel, J.&#32;(2012).&#32;Sampling Essentials: Practical Guidelines for Making Sampling Choices.&#32;SAGE Publications.&#32;doi:10.4135\/9781452272047.&#32;ISBN&#160;9781412952217. &#160; \n\n\u2191 Health Professions Council of South Africa&#32;(May 2008).&#32;\"Guidelines for Good Practice in the Health Care Professions - Confidentiality: Protecting and Providing Information\"&#32;(PDF).&#32;http:\/\/www.hpcsa.co.za\/Uploads\/editor\/UserFiles\/downloads\/conduct_ethics\/rules\/generic_ethical_rules\/booklet_10_confidentiality_protecting_and_providing_information.pdf . &#160; \n\n\u2191 16.0 16.1 Cao, F.; Huang, H.K.; Zhou, X.Q.&#32;(2003).&#32;\"Medical image security in a HIPAA mandated PACS environment\".&#32;Computerized Medical Imaging and Graphics&#32;27&#32;(2\u20133): 185\u201396.&#32;doi:10.1016\/S0895-6111(02)00073-3. &#160; \n\n\u2191 17.0 17.1 \"The Constitution of the Republic of South Africa\".&#32;Republic of South Africa.&#32;18 December 1996.&#32;http:\/\/www.justice.gov.za\/legislation\/constitution\/SAConstitution-web-eng.pdf . &#160; \n\n\u2191 Health Professions Council of South Africa&#32;(May 2008).&#32;\"Guidelines for Good Practice in the Health Care Professions - National Patients' Rights Charter\"&#32;(PDF).&#32;http:\/\/www.hpcsa.co.za\/downloads\/conduct_ethics\/rules\/generic_ethical_rules\/booklet_3_patients_rights_charter.pdf . &#160; \n\n\u2191 Croskerry, P.; Cosby, K.S.; Schenkel, S.M.; Wears, R.L.&#32;(2009).&#32;Patient Safety in Emergency Medicine.&#32;Lippincott Williams &amp; Wilkins.&#32;pp.&#160;448.&#32;ISBN&#160;9780781777278. 
&#160; \n\n\u2191 Palmer, L.K.&#32;(2013).&#32;\"The Relationship between Stress, Fatigue, and Cognitive Functioning\".&#32;College Student Journal&#32;47&#32;(2): 312\u201325.&#32;https:\/\/eric.ed.gov\/?id=EJ1022296 . &#160; \n\n\u2191 Baldi, P.L.&#32;(2014).&#32;\"Error risk in the decision making process\".&#32;Emergency Care Journal&#32;10&#32;(1): 37\u201340.&#32;doi:10.4081\/ecj.2014.2119. &#160; \n\n\u2191 Healy, S.; Tyrrell, M.&#32;(2011).&#32;\"Stress in emergency departments: Experiences of nurses and doctors\".&#32;Emergency Nurse&#32;19&#32;(4): 31\u20137.&#32;doi:10.7748\/en2011.07.19.4.31.c8611. &#160; \n\n\u2191 Espa\u00f1a, L.Y.&#32;(2016).&#32;\"Effects of Password Type and Memory Techniques on User Password Memory\".&#32;PSI CHI Journal of Psychological Research&#32;21&#32;(4): 269\u201375.&#32;doi:10.24839\/2164-8204.JN21.4.269. &#160; \n\n\u2191 Aky\u00fcz, E.; Erdemir, F.&#32;(2013).&#32;\"Surgical patients\u2019 and nurses\u2019 opinions and expectations about privacy in care\".&#32;Nursing Ethics&#32;20&#32;(6): e671.&#32;doi:10.1177\/0969733012468931. &#160; \n\n\u2191 Samy, G.N.; Ahmad, R.; Ismail, Z.&#32;(2010).&#32;\"Security threats categories in healthcare information systems\".&#32;Health Informatics Journal&#32;16&#32;(3): 201\u20139.&#32;doi:10.1177\/1460458210377468. &#160; \n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation. 
In some cases important information was missing from the references, and that information was added.\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\">https:\/\/www.limswiki.org\/index.php\/Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments<\/a>\nCategories: LIMSwiki journal articles (added in 2018); LIMSwiki journal articles (all); LIMSwiki journal articles on imaging informatics; LIMSwiki journal articles on software\nThis page was last modified on 30 July 2018, at 22:45.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n","51ebf6ac1bdd905b9c8c8d23fe8b8a29_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Password_compliance_for_PACS_work_stations_Implications_for_emergency-driven_medical_environments skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Password compliance for PACS work stations: Implications for emergency-driven medical environments<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p><b>Background<\/b>: The effectiveness of password usage in data security remains an area of high scrutiny. Literature findings do not inspire confidence in the use of passwords. 
Human factors such as the acceptance of and compliance with minimum standards of data security are considered significant determinants of effective data-security practices. However, human and technical factors alone do not provide solutions if they exclude the context in which the technology is applied.\n<\/p><p><b>Objectives<\/b>: To reflect on the outcome of a dissertation which argues that the minimum standards of effective password use prescribed by the <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a> security sector are not suitable for the emergency-driven medical environment, and that their application as required by law raises new and unforeseen ethical dilemmas.\n<\/p><p><b>Method<\/b>: A closed-ended questionnaire, the Picture Archiving and Communication System Confidentiality Scale (PAC-CS), was used to collect quantitative data from 115 health professionals employed in both a private radiology and a <a href=\"https:\/\/www.limswiki.org\/index.php\/Hospital\" title=\"Hospital\" target=\"_blank\" class=\"wiki-link\" data-key=\"b8f070c66d8123fe91063594befebdff\">hospital<\/a> setting. The PAC-CS sought to explore the extent of compliance with accepted minimum standards of effective password usage.\n<\/p><p><b>Results<\/b>: The percentage compliance with minimum standards was calculated. A statistically significant difference (<i>p<\/i>&lt;0.05) between the expected and observed data-security practices was recorded.\n<\/p><p><b>Conclusion<\/b>: The study interrogates the suitability of adherence to minimum standards of effective password usage in an emergency-driven medical environment and calls for much-needed debate in this area.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>The effectiveness of password usage in data security has been heavily criticized. 
A variety of assumptions regarding password usage have been made, depending on the focus of the literature. From a technical perspective, passwords are considered ineffective in restricting access only to individuals with authorized and legitimate access to data.<sup id=\"rdp-ebb-cite_ref-DayarathnaThePrinc09_1-0\" class=\"reference\"><a href=\"#cite_note-DayarathnaThePrinc09-1\" rel=\"external_link\">[1]<\/a><\/sup> Engineers suspect that human factors play a significant role in determining the effectiveness of technical safeguards, so that human beings are deemed the weakest link in data security.<sup id=\"rdp-ebb-cite_ref-IfinedoUnder12_2-0\" class=\"reference\"><a href=\"#cite_note-IfinedoUnder12-2\" rel=\"external_link\">[2]<\/a><\/sup> It remains unclear whether the use of passwords is effective in safeguarding electronic data.\n<\/p><p>Literature findings do not inspire confidence in the usage of passwords for data security. Several quotes taken from various points in time attest to this fact, for example: \"Boot passwords, put your computer under lock and key\"<sup id=\"rdp-ebb-cite_ref-SteersBoot03_3-0\" class=\"reference\"><a href=\"#cite_note-SteersBoot03-3\" rel=\"external_link\">[3]<\/a><\/sup>; \"Goodbye, passwords. You aren\u2019t a good defense\"<sup id=\"rdp-ebb-cite_ref-StrossGoodbye08_4-0\" class=\"reference\"><a href=\"#cite_note-StrossGoodbye08-4\" rel=\"external_link\">[4]<\/a><\/sup>, and more recently, \"Forget passwords \u2013 use your face instead.\"<sup id=\"rdp-ebb-cite_ref-GrahamForget15_5-0\" class=\"reference\"><a href=\"#cite_note-GrahamForget15-5\" rel=\"external_link\">[5]<\/a><\/sup>\n<\/p><p>There is extensive literature focusing on the effectiveness and suitability of password usage in preventing confidentiality breaches within environments such as computer security. The researchers have no knowledge of similar studies relating to the suitability of password usage within the medical environment. 
The aim of this article is to bring to the fore factors unique to the medical environment that argue against the direct \"copy and paste\" adoption of the minimum standards for effective password usage from computer security into the medical environment.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Background\">Background<\/span><\/h2>\n<p>The use of passwords is ineffective in restricting access only to individuals who are authorized to access data. This popular and easy means of controlling access to data may, in fact, provide the easiest way to breach confidentiality. Information technologists insist that with proper management, passwords are an effective means of protecting the security of data. Measures include, but are not limited to, the use of strong passwords, having individual rather than shared passwords, and changing passwords on a regular basis.<sup id=\"rdp-ebb-cite_ref-PaytonMemory10_6-0\" class=\"reference\"><a href=\"#cite_note-PaytonMemory10-6\" rel=\"external_link\">[6]<\/a><\/sup>\n<\/p><p>Compliance with the minimum standards for effective password usage requires knowledge of and to some extent expertise in data security on the part of the healthcare provider.<sup id=\"rdp-ebb-cite_ref-WilliamsInATrust08_7-0\" class=\"reference\"><a href=\"#cite_note-WilliamsInATrust08-7\" rel=\"external_link\">[7]<\/a><\/sup> However, the responsibility to comply cannot be placed solely on the healthcare provider. Standards for effective password usage should be well accepted and applied by all users of the technology. At times, factors unique to the medical field may influence the acceptance of security measures. 
For instance, in a medical emergency, there may be a legitimate need to circumvent the minimum standards of effective password usage in order to save a life.<sup id=\"rdp-ebb-cite_ref-IfinedoUnder12_2-1\" class=\"reference\"><a href=\"#cite_note-IfinedoUnder12-2\" rel=\"external_link\">[2]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-RobinsonMoral16_8-0\" class=\"reference\"><a href=\"#cite_note-RobinsonMoral16-8\" rel=\"external_link\">[8]<\/a><\/sup> It is for this reason that the contributions of both human and technical factors in normative research are noteworthy, but will never be adequate if the context in which technology is applied remains excluded.\n<\/p><p>This paper draws on the assumption that the situated use of technology creates challenges to the inscribed ethics of technology use, resulting in the emergence of new ethical dilemmas. Based on this assumption, we argue that the proper management of passwords as described in the environment of computer security is not suitable to the emergency-driven medical environment. In this paper, we reflect on the research outcome of the first author\u2019s dissertation in putting this argument forward.<sup id=\"rdp-ebb-cite_ref-MahlaolaCompliance15_9-0\" class=\"reference\"><a href=\"#cite_note-MahlaolaCompliance15-9\" rel=\"external_link\">[9]<\/a><\/sup>\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Methods\">Methods<\/span><\/h2>\n<p>A <a href=\"https:\/\/www.limswiki.org\/index.php\/Picture_archiving_and_communication_system\" title=\"Picture archiving and communication system\" target=\"_blank\" class=\"wiki-link\" data-key=\"523b73ff51fa83663dc0b1d59e6d0f05\">picture archiving and communication system<\/a> (PACS) is a digital storage system designed to address the limitations of film and paper records. 
A conventional storage system imposes disadvantages that become an impediment to the continuity of patient care, because the records could be easily misplaced and therefore difficult to retrieve, resulting in delayed medical treatment.<sup id=\"rdp-ebb-cite_ref-BeachMaint14_10-0\" class=\"reference\"><a href=\"#cite_note-BeachMaint14-10\" rel=\"external_link\">[10]<\/a><\/sup> PACS is inherently a radiology archiving system that may be extended to various other sections within a hospital. It allows for remote and instant access to radiology data by a multidisciplinary complement of health professionals (HPs) who are based in different locations within a hospital setting, so that the data of the same patient may be accessed simultaneously by different HPs.<sup id=\"rdp-ebb-cite_ref-BolanTech13_11-0\" class=\"reference\"><a href=\"#cite_note-BolanTech13-11\" rel=\"external_link\">[11]<\/a><\/sup> PACS has contributed to improved patient care by increasing efficiency and the accessibility of data, and has led to fewer delays in the clinical management of patients.<sup id=\"rdp-ebb-cite_ref-BolanTech13_11-1\" class=\"reference\"><a href=\"#cite_note-BolanTech13-11\" rel=\"external_link\">[11]<\/a><\/sup> The electronic nature of PACS makes it possible for patients\u2019 data to be accessed, duplicated, and exported without the patient\u2019s knowledge and consent.<sup id=\"rdp-ebb-cite_ref-BenetarIndis10_12-0\" class=\"reference\"><a href=\"#cite_note-BenetarIndis10-12\" rel=\"external_link\">[12]<\/a><\/sup> The use of passwords aids in restricting access to PACS data, to minimize the risk of breaching patient confidentiality.\n<\/p><p>The original research aimed to determine the extent to which the practices of HPs complied with patient-confidentiality principles when using PACS. The study invitation was initially extended to six hospitals in Johannesburg. 
However, owing to a 75% refusal rate among this group, the eventual study sample was drawn instead from a private hospital and radiology setting affiliated with different healthcare-facility groups located in Johannesburg. The selection criteria included HPs who were willing to participate and were using PACS either as part of routine activity or as a means of delivering patient care. The study sample comprised a multidisciplinary complement of HPs such as radiologists, radiographers, student radiographers, doctors, medical specialists, and nurses.\n<\/p><p>Prior to data collection, ethical clearance was obtained from the research settings as well as the research committee of the University of Johannesburg (ref. no. HDC67\/02-2011), South Africa (SA). Data were collected from various sections within the hospital, namely radiology, emergency, casualty, theatre, and intensive-care units, including coronary care, acute care, respiratory, trauma intensive care, neurology, and surgical-care units. Data were collected over a period of three months using a self-designed questionnaire, the Picture Archiving and Communication System Confidentiality Scale (PAC-CS). Consent was obtained verbally and was implied through completion of the PAC-CS. Informed consent was ensured by allowing participants to ask questions relating to the study, and the data were anonymized. Access to study data was restricted to the researchers.\n<\/p><p>The PAC-CS design was informed by the content of the ISO\/IEC 17799:2005<sup id=\"rdp-ebb-cite_ref-BirdImproved05_13-0\" class=\"reference\"><a href=\"#cite_note-BirdImproved05-13\" rel=\"external_link\">[13]<\/a><\/sup> standard, from which the constructs, the choice of questions, and the quantification were derived and adapted. The ISO\/IEC 17799:2005 is a model used in information technology to benchmark an organization\u2019s compliance with international standards of data security. 
The consistency of the PAC-CS design with the ISO\/IEC 17799:2005 model helped to establish its content validity and reliability. A sample size of 115 participants was achieved through the hand-delivery of the PAC-CS using a non-probability quota-sampling technique.<sup id=\"rdp-ebb-cite_ref-DanielSampling12_14-0\" class=\"reference\"><a href=\"#cite_note-DanielSampling12-14\" rel=\"external_link\">[14]<\/a><\/sup>\n<\/p><p>A quantitative, correlational design was deemed suitable for determining the extent to which the situated password practices of HPs complied with the minimum standards for effective password usage. The lack of guidelines pertaining to PACS from the Health Professions Council of SA (HPCSA) at the time of this study led to the use of the security rule of the <a href=\"https:\/\/www.limswiki.org\/index.php\/Health_Insurance_Portability_and_Accountability_Act\" title=\"Health Insurance Portability and Accountability Act\" target=\"_blank\" class=\"wiki-link\" data-key=\"b70673a0117c21576016cb7498867153\">Health Insurance Portability and Accountability Act<\/a> (HIPAA) of 1996 as an alternative model for compliance with data-security rules.<sup id=\"rdp-ebb-cite_ref-HPCSAGuide08_15-0\" class=\"reference\"><a href=\"#cite_note-HPCSAGuide08-15\" rel=\"external_link\">[15]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-CaoMedical03_16-0\" class=\"reference\"><a href=\"#cite_note-CaoMedical03-16\" rel=\"external_link\">[16]<\/a><\/sup> The HIPAA security rule is a detailed outline of the national standards and steps necessary to protect electronic health information from inadvertent disclosures through breaches of security. The choice of this U.S. 
legislation was informed by its reputation as one of the best regulatory rules pertaining to electronic data security, grounded in the fact that it is continually updated in line with technological advances and, most importantly, addresses the security needs of PACS technology explicitly.<sup id=\"rdp-ebb-cite_ref-CaoMedical03_16-1\" class=\"reference\"><a href=\"#cite_note-CaoMedical03-16\" rel=\"external_link\">[16]<\/a><\/sup>\n<\/p><p>The participant responses were analyzed by an independent statistician using the Statistical Package for the Social Sciences (SPSS, USA) version 16. The quantified responses were expressed in terms of frequency counts and compliance percentage. A 90% benchmark was set for minimum compliance with technical safeguards, whereas a 10% benchmark indicated an intolerable level of non-compliance. Statistical significance (<i>p<\/i>&lt;0.05) was assessed using the one-sample chi-square test for non-parametric data, the choice of which was informed by the lack of randomization, the sample size, and the type of data collected.<sup id=\"rdp-ebb-cite_ref-DanielSampling12_14-1\" class=\"reference\"><a href=\"#cite_note-DanielSampling12-14\" rel=\"external_link\">[14]<\/a><\/sup> While the cross-tabulations were used to determine the degree of statistical significance, the phi coefficient helped to calculate the extent of the correlation, the strength of which was determined by the Pearson Chi-Square test.\n<\/p><p>Section A of the PAC-CS focused on the compliance of technical and physical safeguards with international standards. 
The responses to the closed-ended questions regarding technical safeguards in terms of password usage, namely the type of passwords and the frequency of password changes, are presented here.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Results\">Results<\/span><\/h2>\n<p>The study results were evaluated in line with the following definition: the situated practices of HPs for effective password usage are conceptually defined as the complete range of functions, activities, roles, responsibilities, and decision-making capabilities that individuals are competent, educated, and authorized to perform within a specified work environment in complying with the minimum standards of effective password usage. Table 1 and Fig. 1 summarize the study questions and the corresponding responses that relate to the effectiveness of passwords when using PACS technology.\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"5\"><b>Table 1.<\/b> Summary of effective password usage\n<\/td><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\">Benchmark\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Response\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Expected, <i>n<\/i> (%)\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Observed, <i>n<\/i> (%)\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Extent of compliance (<i>p<\/i>&lt;0.001)\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"5\">Do you have a unique PACS access code? 
(<i>N<\/i>=113)\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Yes=90%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Yes\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">102 (90%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">31 (27%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">27%\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">No\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">11 (10%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">82 (73%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"5\">Does your department have a PACS access code which everybody uses? 
(<i>N<\/i>=114)\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">No=90%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Yes\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">11 (10%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">89 (78%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">22%\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">No\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">103 (90%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">25 (22%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"5\">Can you access data from the PACS without using an access code? 
(<i>N<\/i>=110)\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">No=90%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Yes\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">11 (10%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">25 (23%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">77%\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">No\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">99 (90%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">85 (77%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"5\">Approximately how long does the PACS work station remain active? 
(<i>N<\/i>=113)\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">&lt;1 minute\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">&lt;1 minute\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">102 (90%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">3 (3%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">3%\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">90%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">1 minute\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">8 (7%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">4 (4%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">&gt;1 minute\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2 (2%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">44 (39%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">All the time\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">1 (1%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">62 (54%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p><br \/>\n<a 
href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Mahlaola_SAJouBioLaw2017_10-2.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"e0a4bebac721eb84e660c2c2f9e85e5b\"><img alt=\"Fig1 Mahlaola SAJouBioLaw2017 10-2.png\" src=\"https:\/\/www.limswiki.org\/images\/1\/1c\/Fig1_Mahlaola_SAJouBioLaw2017_10-2.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1.<\/b> Access to PACS workstations without an access code, radiology versus non-radiology staff<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>According to Table 1, 102 participants (90% of the sample) were expected (EN) to use individual passwords to access PACS. Only 27% of the participants complied with the use of individual passwords, while 78% reported using shared departmental codes. A further 23% of participants accessed PACS without requiring a password, and only 2% changed their PACS passwords on a monthly basis. Moreover, a mere 3% of the PACS workstations remained active for less than a minute. In determining the extent of the non-usage of passwords, cross-tabulations between the radiology and non-radiology groups were conducted. Fig. 1 demonstrates that staff members in radiology departments accessed the PACS workstations without the use of an access code to a greater extent than their non-radiology counterparts.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Discussion\">Discussion<\/span><\/h2>\n<p>PACS workstations are purposefully designed to provide instant access to data. By design, PACS is inherently a password-driven technology. 
The password requirement serves (i) as a means of restricting access to data only to authorized PACS users and (ii) to authenticate the person accessing the data. The password-driven nature of PACS in itself is a form of an inscribed ethic designed to protect the confidentiality of patient data stored in the PACS. In protecting confidentiality, patients\u2019 privacy is secured, and the intrinsic value of patients as human beings is recognized. Not all types of passwords are considered effective in delivering the ethics inscribed in PACS technology. The minimum standards for effective password usage necessitate that passwords are long and contain a variety of characters that would not be easy to crack.<sup id=\"rdp-ebb-cite_ref-PaytonMemory10_6-1\" class=\"reference\"><a href=\"#cite_note-PaytonMemory10-6\" rel=\"external_link\">[6]<\/a><\/sup> As a gold standard, individual passwords rather than shared passwords are recommended, and these need to be changed frequently. The benefit of effective access restriction is the protection of patient confidentiality, which HPs are obligated to uphold. In the original study, the motivations informing the choice of passwords for the various departments within a hospital setting could not be ascertained. This paper draws on other literature findings to explore possible reasons for poor compliance with the minimum standards of effective passwords, specifically for emergency departments.\n<\/p><p>The study outcomes vary from 27% of participants using individual passwords, to 78% who used shared departmental passwords. In cases where the automatic log-off was disabled, participants accessed PACS without requiring passwords, and this accounted for 23% of the results. The multidisciplinary nature of the study participants introduces a range of functions, activities, roles, and responsibilities that should be considered within a specified work environment when explaining the inconsistency in the types of passwords used. 
It appears that some sections within the hospital setting used passwords that were unique to each department and shared by all members within that particular section, accounting for the 78% use of shared departmental passwords. It could not be ascertained whether the choice of departmental passwords complied with the requirement for hard-to-crack passwords. It may be postulated that the departmental password should be easy to remember, and have predictable features that are not consistent with hard-to-crack passwords.\n<\/p><p>Perhaps the staffing issues unique to the medical setting provide compelling reasons for the use of departmental passwords. For instance, nursing departments employ a significant number of temporary staff, while casualty officers and some specialist doctors, such as traumatologists, work on an on-call basis whereby they may rotate within the public and private sectors. Setting up an individual password for each of the temporary and rotational staff may be a costly, time-consuming and futile exercise when a staff member may be employed only for one day. It may not be possible to set up passwords for an urgent replacement organized at the last minute to replace a staff member who called in sick for duty.\n<\/p><p>Unlike general wards and intensive-care units where nurses, referring doctors, and radiographers could all access PACS, the radiology department is mainly accessed by radiology staff, making it susceptible to practices of accessing PACS without requiring a password. This practice may be endorsed by the culture of trust that dominates medical environments, in which HPs are considered to be ethical beings who respect confidentiality and therefore require minimal supervision.<sup id=\"rdp-ebb-cite_ref-WilliamsInATrust08_7-1\" class=\"reference\"><a href=\"#cite_note-WilliamsInATrust08-7\" rel=\"external_link\">[7]<\/a><\/sup> Emergency and theatre departments may be a further example of environments where passwords are not utilized. 
In contrast, doctors\u2019 consulting rooms may be suitable for the use of individual passwords, accounting for the 27% reported in this study. The advantage of using individual passwords is that improper conduct relating to data security may be traced back to the offender. <a href=\"https:\/\/www.limswiki.org\/index.php\/Audit_trail\" title=\"Audit trail\" target=\"_blank\" class=\"wiki-link\" data-key=\"96a617b543c5b2f26617288ba923c0f0\">Audit trails<\/a> are mandatory by law; without them, violations of confidentiality could not be traced and punished.\n<\/p><p>In a medical emergency, a patient\u2019s life may be threatened by the sudden and unexpected development of a health condition. High unpredictability and the requirement for expedited service delivery are characteristic of a medical emergency department. The need for efficiency raises challenges that require a balance between the right to life, efficiency, and the protection of human dignity. The right to dignity and the right to life are enshrined in sections 10 and 11 of the SA Constitution, respectively.<sup id=\"rdp-ebb-cite_ref-SATheConst96_17-0\" class=\"reference\"><a href=\"#cite_note-SATheConst96-17\" rel=\"external_link\">[17]<\/a><\/sup> Section 2.3(a) of the Patient\u2019s Rights Charter states that everyone has the right to receive timely emergency care.<sup id=\"rdp-ebb-cite_ref-HPCSAGuide08-3_18-0\" class=\"reference\"><a href=\"#cite_note-HPCSAGuide08-3-18\" rel=\"external_link\">[18]<\/a><\/sup>\n<\/p><p>Members of the emergency team never know what to expect at any given point in time, resulting in feelings of anxiety.<sup id=\"rdp-ebb-cite_ref-CroskerryPatient09_19-0\" class=\"reference\"><a href=\"#cite_note-CroskerryPatient09-19\" rel=\"external_link\">[19]<\/a><\/sup> When attending to multiple patients at the same time, overcrowding, high noise levels, and fatigue may result in interruptions of the thinking and decision-making process.<sup id=\"rdp-ebb-cite_ref-PalmerTheRel13_20-0\" 
class=\"reference\"><a href=\"#cite_note-PalmerTheRel13-20\" rel=\"external_link\">[20]<\/a><\/sup> These factors are cited in the literature as the leading cause of errors in diagnosis associated with clinical emergencies.<sup id=\"rdp-ebb-cite_ref-BaldiError14_21-0\" class=\"reference\"><a href=\"#cite_note-BaldiError14-21\" rel=\"external_link\">[21]<\/a><\/sup> The need for efficiency in emergency departments induces stress in members of the emergency team. Individuals are likely to forget passwords that are long and contain a variety of characters, especially when working under stressful conditions.<sup id=\"rdp-ebb-cite_ref-PaytonMemory10_6-2\" class=\"reference\"><a href=\"#cite_note-PaytonMemory10-6\" rel=\"external_link\">[6]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-HealyStress11_22-0\" class=\"reference\"><a href=\"#cite_note-HealyStress11-22\" rel=\"external_link\">[22]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-Espa.C3.B1aEffects16_23-0\" class=\"reference\"><a href=\"#cite_note-Espa.C3.B1aEffects16-23\" rel=\"external_link\">[23]<\/a><\/sup> Perhaps considerations regarding the right to life and timely access to emergency care inform some of the practices that result in the accessing of the PACS without requiring a password. Similar reasons may account for the 54% of PACS workstations that were not capable of automatic log-off, causing them to remain active all the time.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusion\">Conclusion<\/span><\/h2>\n<p>This paper highlights the dilemma in emergency departments between the need for efficient patient treatment and respect for patient ethical rights. In a medical environment dominated by a culture of trust, human dignity may not be the primary concern, especially when competing with the supreme right to life. However, just because HPs are inclined to trust one another, based on the assumption that HPs are ethical beings who respect patient confidentiality, this does not mean that all HPs are trustworthy. 
There may be occasions when patients suspect that HPs may abuse their privileges of access to medical records.<sup id=\"rdp-ebb-cite_ref-Aky.C3.BCzSurgical13_24-0\" class=\"reference\"><a href=\"#cite_note-Aky.C3.BCzSurgical13-24\" rel=\"external_link\">[24]<\/a><\/sup>\n<\/p><p>The protection of patient data requires diligent security measures, including the use of effective passwords and automatic computer log-off. These measures may be time-consuming and therefore not suitable for the levels of efficiency needed in emergency departments. The use of effective passwords is necessary to protect human dignity, the provision of which is enshrined in section 14(d) of the SA Constitution.<sup id=\"rdp-ebb-cite_ref-SATheConst96_17-1\" class=\"reference\"><a href=\"#cite_note-SATheConst96-17\" rel=\"external_link\">[17]<\/a><\/sup> Yet, practices that are compliant with the minimum standards of effective passwords stand to threaten the supreme human right to life.\n<\/p><p>In a medical emergency, seconds count. Computers take ~60 seconds to initialize and authenticate the user, excluding the additional time needed to process an image or to call up patient data.<sup id=\"rdp-ebb-cite_ref-SamySecurity10_25-0\" class=\"reference\"><a href=\"#cite_note-SamySecurity10-25\" rel=\"external_link\">[25]<\/a><\/sup> Depending on the type of medical emergency, 60 seconds could mean the difference between organ impairment and death. Eliminating the time for computer initialization and authentication could go a long way towards saving lives. At the time of this report, there were no data to suggest that lives had been lost as a result of computer initialization and authentication. However, the lack of data does not mean that incidents have not occurred or will not occur in the future.\n<\/p><p>It remains unclear whether compliance with the minimum standards for effective password usage is suitable for emergency departments. 
This article may have contributed to normative ethics by asking whether medical emergency departments ought to be an exception to the minimum standards of effective password usage. The reasons for non-compliance presented in this article are mere suggestions drawn from the literature. Future research is needed, firstly, to determine reasons for non-compliance specific to the use of PACS in an emergency department; and secondly, to determine alternative security measures that would aid in preserving patient confidentiality in such departments.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<p>This article presents the findings of a Master\u2019s dissertation completed through the University of Johannesburg. Both authors greatly appreciate the permission granted by the research settings and the contribution of the participants.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Author_contributions\">Author contributions<\/span><\/h3>\n<p>TBM and BvD worked jointly in preparing and approving the article for submission. In her capacity as the primary researcher, TBM contributed to project design and data collection. 
BvD was the study supervisor, and made enormous contributions to the conceptual framework and editing of the manuscript.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Funding\">Funding<\/span><\/h3>\n<p>None.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Conflicts_of_interest\">Conflicts of interest<\/span><\/h3>\n<p>None.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-DayarathnaThePrinc09-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DayarathnaThePrinc09_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Dayarathna, R.&#32;(2009).&#32;\"The principle of security safeguards: Unauthorized activities\".&#32;<i>Computer Law &amp; Security Review<\/i>&#32;<b>25<\/b>&#32;(2): 165\u201372.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.clsr.2009.02.012\" target=\"_blank\">10.1016\/j.clsr.2009.02.012<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+principle+of+security+safeguards%3A+Unauthorized+activities&amp;rft.jtitle=Computer+Law+%26+Security+Review&amp;rft.aulast=Dayarathna%2C+R.&amp;rft.au=Dayarathna%2C+R.&amp;rft.date=2009&amp;rft.volume=25&amp;rft.issue=2&amp;rft.pages=165%E2%80%9372&amp;rft_id=info:doi\/10.1016%2Fj.clsr.2009.02.012&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-IfinedoUnder12-2\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-IfinedoUnder12_2-0\" rel=\"external_link\">2.0<\/a><\/sup> <sup><a href=\"#cite_ref-IfinedoUnder12_2-1\" rel=\"external_link\">2.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ifinedo, P.&#32;(2012).&#32;\"Understanding information systems security policy compliance: An integration of the theory of planned behavior and the protection motivation theory\".&#32;<i>Computers &amp; Security<\/i>&#32;<b>31<\/b>&#32;(1): 83\u201395.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.cose.2011.10.007\" target=\"_blank\">10.1016\/j.cose.2011.10.007<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Understanding+information+systems+security+policy+compliance%3A+An+integration+of+the+theory+of+planned+behavior+and+the+protection+motivation+theory&amp;rft.jtitle=Computers+%26+Security&amp;rft.aulast=Ifinedo%2C+P.&amp;rft.au=Ifinedo%2C+P.&amp;rft.date=2012&amp;rft.volume=31&amp;rft.issue=1&amp;rft.pages=83%E2%80%9395&amp;rft_id=info:doi\/10.1016%2Fj.cose.2011.10.007&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SteersBoot03-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SteersBoot03_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Steers, K.&#32;(2003).&#32;\"Boot passwords, put your PC under lock and key\".&#32;<i>PC World<\/i>&#32;<b>21<\/b>&#32;(9): 168.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Boot+passwords%2C+put+your+PC+under+lock+and+key&amp;rft.jtitle=PC+World&amp;rft.aulast=Steers%2C+K.&amp;rft.au=Steers%2C+K.&amp;rft.date=2003&amp;rft.volume=21&amp;rft.issue=9&amp;rft.pages=168&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-StrossGoodbye08-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-StrossGoodbye08_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Stross, R.&#32;(09 August 2008).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.nytimes.com\/2008\/08\/10\/technology\/10digi.html\" target=\"_blank\">\"Goodbye, Passwords. You Aren\u2019t a Good Defense\"<\/a>.&#32;<i>The New York Times<\/i><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.nytimes.com\/2008\/08\/10\/technology\/10digi.html\" target=\"_blank\">https:\/\/www.nytimes.com\/2008\/08\/10\/technology\/10digi.html<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 27 May 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Goodbye%2C+Passwords.+You+Aren%E2%80%99t+a+Good+Defense&amp;rft.atitle=The+New+York+Times&amp;rft.aulast=Stross%2C+R.&amp;rft.au=Stross%2C+R.&amp;rft.date=09+August+2008&amp;rft_id=https%3A%2F%2Fwww.nytimes.com%2F2008%2F08%2F10%2Ftechnology%2F10digi.html&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-GrahamForget15-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GrahamForget15_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Graham, J.&#32;(05 January 2015).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.pressreader.com\/usa\/usa-today-us-edition\/20150105\/281801397332402\" target=\"_blank\">\"Forget passwords - use your face instead\"<\/a>.&#32;<i>USA Today<\/i><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.pressreader.com\/usa\/usa-today-us-edition\/20150105\/281801397332402\" target=\"_blank\">https:\/\/www.pressreader.com\/usa\/usa-today-us-edition\/20150105\/281801397332402<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Forget+passwords+-+use+your+face+instead&amp;rft.atitle=USA+Today&amp;rft.aulast=Graham%2C+J.&amp;rft.au=Graham%2C+J.&amp;rft.date=05+January+2015&amp;rft_id=https%3A%2F%2Fwww.pressreader.com%2Fusa%2Fusa-today-us-edition%2F20150105%2F281801397332402&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PaytonMemory10-6\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-PaytonMemory10_6-0\" rel=\"external_link\">6.0<\/a><\/sup> <sup><a href=\"#cite_ref-PaytonMemory10_6-1\" rel=\"external_link\">6.1<\/a><\/sup> <sup><a href=\"#cite_ref-PaytonMemory10_6-2\" rel=\"external_link\">6.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Payton, L.&#32;(2010).&#32;\"Memory for Passwords: The Effects of Varying Number, Type, and Composition\".&#32;<i>PSI CHI Journal of Psychological Research<\/i>&#32;<b>15<\/b>&#32;(4): 209\u201313.&#32;<a 
rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.24839%2F1089-4136.JN15.4.209\" target=\"_blank\">10.24839\/1089-4136.JN15.4.209<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Memory+for+Passwords%3A+The+Effects+of+Varying+Number%2C+Type%2C+and+Composition&amp;rft.jtitle=PSI+CHI+Journal+of+Psychological+Research&amp;rft.aulast=Payton%2C+L.&amp;rft.au=Payton%2C+L.&amp;rft.date=2010&amp;rft.volume=15&amp;rft.issue=4&amp;rft.pages=209%E2%80%9313&amp;rft_id=info:doi\/10.24839%2F1089-4136.JN15.4.209&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WilliamsInATrust08-7\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-WilliamsInATrust08_7-0\" rel=\"external_link\">7.0<\/a><\/sup> <sup><a href=\"#cite_ref-WilliamsInATrust08_7-1\" rel=\"external_link\">7.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Williams, P.A.H.&#32;(2008).&#32;\"In a \u2018trusting\u2019 environment, everyone is responsible for information security\".&#32;<i>Information Security Technical Report<\/i>&#32;<b>13<\/b>&#32;(4): 207\u201315.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.istr.2008.10.009\" target=\"_blank\">10.1016\/j.istr.2008.10.009<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=In+a+%E2%80%98trusting%E2%80%99+environment%2C+everyone+is+responsible+for+information+security&amp;rft.jtitle=Information+Security+Technical+Report&amp;rft.aulast=Williams%2C+P.A.H.&amp;rft.au=Williams%2C+P.A.H.&amp;rft.date=2008&amp;rft.volume=13&amp;rft.issue=4&amp;rft.pages=207%E2%80%9315&amp;rft_id=info:doi\/10.1016%2Fj.istr.2008.10.009&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-RobinsonMoral16-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-RobinsonMoral16_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Robinson, R.&#32;(2016).&#32;\"Moral Distress: A Qualitative Study of Emergency Nurses\".&#32;<i>Dimensions of Critical Care Nursing<\/i>&#32;<b>35<\/b>&#32;(4): 235\u201340.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1097%2FDCC.0000000000000185\" target=\"_blank\">10.1097\/DCC.0000000000000185<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Moral+Distress%3A+A+Qualitative+Study+of+Emergency+Nurses&amp;rft.jtitle=Dimensions+of+Critical+Care+Nursing&amp;rft.aulast=Robinson%2C+R.&amp;rft.au=Robinson%2C+R.&amp;rft.date=2016&amp;rft.volume=35&amp;rft.issue=4&amp;rft.pages=235%E2%80%9340&amp;rft_id=info:doi\/10.1097%2FDCC.0000000000000185&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span 
style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MahlaolaCompliance15-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MahlaolaCompliance15_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation\" id=\"rdp-ebb-CITEREFMahlaola.2C_T.B.2015\">Mahlaola, T.B.&#32;(20 January 2015),&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/ujcontent.uj.ac.za\/vital\/access\/manager\/Repository\/uj:13153\" target=\"_blank\"><i>Compliance of health professionals with patient confidentiality when using PACS and RIS<\/i><\/a>,&#32;University of Johannesburg<span class=\"printonly\">,&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/ujcontent.uj.ac.za\/vital\/access\/manager\/Repository\/uj:13153\" target=\"_blank\">https:\/\/ujcontent.uj.ac.za\/vital\/access\/manager\/Repository\/uj:13153<\/a><\/span><\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Compliance+of+health+professionals+with+patient+confidentiality+when+using+PACS+and+RIS&amp;rft.aulast=Mahlaola%2C+T.B.&amp;rft.au=Mahlaola%2C+T.B.&amp;rft.date=20+January+2015&amp;rft.pub=University+of+Johannesburg&amp;rft_id=https%3A%2F%2Fujcontent.uj.ac.za%2Fvital%2Faccess%2Fmanager%2FRepository%2Fuj%3A13153&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BeachMaint14-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BeachMaint14_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Beach, J.; Oates, J.&#32;(2014).&#32;\"Maintaining best practice in record-keeping and documentation\".&#32;<i>Nursing Standard<\/i>&#32;<b>28<\/b>&#32;(36): 45\u201350.&#32;<a 
rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.7748%2Fns2014.05.28.36.45.e8835\" target=\"_blank\">10.7748\/ns2014.05.28.36.45.e8835<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Maintaining+best+practice+in+record-keeping+and+documentation&amp;rft.jtitle=Nursing+Standard&amp;rft.aulast=Beach%2C+J.%3B+Oates%2C+J.&amp;rft.au=Beach%2C+J.%3B+Oates%2C+J.&amp;rft.date=2014&amp;rft.volume=28&amp;rft.issue=36&amp;rft.pages=45%E2%80%9350&amp;rft_id=info:doi\/10.7748%2Fns2014.05.28.36.45.e8835&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BolanTech13-11\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-BolanTech13_11-0\" rel=\"external_link\">11.0<\/a><\/sup> <sup><a href=\"#cite_ref-BolanTech13_11-1\" rel=\"external_link\">11.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bolan, C.&#32;(2013).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/appliedradiology.com\/articles\/technology-trends-a-view-of-the-future-image-exchange\" target=\"_blank\">\"Technology Trends: A view of the future image exchange\"<\/a>.&#32;<i>Applied Radiology<\/i>&#32;<b>42<\/b>&#32;(11): 32\u20137<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/appliedradiology.com\/articles\/technology-trends-a-view-of-the-future-image-exchange\" target=\"_blank\">https:\/\/appliedradiology.com\/articles\/technology-trends-a-view-of-the-future-image-exchange<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Technology+Trends%3A+A+view+of+the+future+image+exchange&amp;rft.jtitle=Applied+Radiology&amp;rft.aulast=Bolan%2C+C.&amp;rft.au=Bolan%2C+C.&amp;rft.date=2013&amp;rft.volume=42&amp;rft.issue=11&amp;rft.pages=32%E2%80%937&amp;rft_id=https%3A%2F%2Fappliedradiology.com%2Farticles%2Ftechnology-trends-a-view-of-the-future-image-exchange&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BenetarIndis10-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BenetarIndis10_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Benatar, D.&#32;(2010).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.sajbl.org.za\/index.php\/sajbl\/article\/view\/101\" target=\"_blank\">\"Indiscretion and other threats to confidentiality\"<\/a>.&#32;<i>South African Journal of Bioethics &amp; Law<\/i>&#32;<b>3<\/b>&#32;(2): 59\u201362<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.sajbl.org.za\/index.php\/sajbl\/article\/view\/101\" target=\"_blank\">http:\/\/www.sajbl.org.za\/index.php\/sajbl\/article\/view\/101<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Indiscretion+and+other+threats+to+confidentiality&amp;rft.jtitle=South+African+Journal+of+Bioethics+%26+Law&amp;rft.aulast=Benatar%2C+D.&amp;rft.au=Benatar%2C+D.&amp;rft.date=2010&amp;rft.volume=3&amp;rft.issue=2&amp;rft.pages=59%E2%80%9362&amp;rft_id=http%3A%2F%2Fwww.sajbl.org.za%2Findex.php%2Fsajbl%2Farticle%2Fview%2F101&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BirdImproved05-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BirdImproved05_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Bird, K.&#32;(20 June 2005).&#32;\"Improved ISO\/IEC 17799 makes information assets even more secure\".&#32;<i>ISO News<\/i>.&#32;International Organization for Standardization.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Improved+ISO%2FIEC+17799+makes+information+assets+even+more+secure&amp;rft.atitle=ISO+News&amp;rft.aulast=Bird%2C+K.&amp;rft.au=Bird%2C+K.&amp;rft.date=20+June+2005&amp;rft.pub=International+Organization+for+Standardization&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DanielSampling12-14\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-DanielSampling12_14-0\" rel=\"external_link\">14.0<\/a><\/sup> <sup><a href=\"#cite_ref-DanielSampling12_14-1\" rel=\"external_link\">14.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation book\">Daniel, 
J.&#32;(2012).&#32;<i>Sampling Essentials: Practical Guidelines for Making Sampling Choices<\/i>.&#32;SAGE Publications.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4135%2F9781452272047\" target=\"_blank\">10.4135\/9781452272047<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9781412952217.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Sampling+Essentials%3A+Practical+Guidelines+for+Making+Sampling+Choices&amp;rft.aulast=Daniel%2C+J.&amp;rft.au=Daniel%2C+J.&amp;rft.date=2012&amp;rft.pub=SAGE+Publications&amp;rft_id=info:doi\/10.4135%2F9781452272047&amp;rft.isbn=9781412952217&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HPCSAGuide08-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HPCSAGuide08_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Health Professions Council of South Africa&#32;(May 2008).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.hpcsa.co.za\/Uploads\/editor\/UserFiles\/downloads\/conduct_ethics\/rules\/generic_ethical_rules\/booklet_10_confidentiality_protecting_and_providing_information.pdf\" target=\"_blank\">\"Guidelines for Good Practice in the Health Care Professions - Confidentiality: Protecting and Providing Information\"<\/a>&#32;(PDF)<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" 
href=\"http:\/\/www.hpcsa.co.za\/Uploads\/editor\/UserFiles\/downloads\/conduct_ethics\/rules\/generic_ethical_rules\/booklet_10_confidentiality_protecting_and_providing_information.pdf\" target=\"_blank\">http:\/\/www.hpcsa.co.za\/Uploads\/editor\/UserFiles\/downloads\/conduct_ethics\/rules\/generic_ethical_rules\/booklet_10_confidentiality_protecting_and_providing_information.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Guidelines+for+Good+Practice+in+the+Health+Care+Professions+-+Confidentiality%3A+Protecting+and+Providing+Information&amp;rft.atitle=&amp;rft.aulast=Health+Professions+Council+of+South+Africa&amp;rft.au=Health+Professions+Council+of+South+Africa&amp;rft.date=May+2008&amp;rft_id=http%3A%2F%2Fwww.hpcsa.co.za%2FUploads%2Feditor%2FUserFiles%2Fdownloads%2Fconduct_ethics%2Frules%2Fgeneric_ethical_rules%2Fbooklet_10_confidentiality_protecting_and_providing_information.pdf&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CaoMedical03-16\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-CaoMedical03_16-0\" rel=\"external_link\">16.0<\/a><\/sup> <sup><a href=\"#cite_ref-CaoMedical03_16-1\" rel=\"external_link\">16.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Cao, F.; Huang, H.K.; Zhou, X.Q.&#32;(2003).&#32;\"Medical image security in a HIPAA mandated PACS environment\".&#32;<i>Computerized Medical Imaging and Graphics<\/i>&#32;<b>27<\/b>&#32;(2\u20133): 185\u201396.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/dx.doi.org\/10.1016%2FS0895-6111%2802%2900073-3\" target=\"_blank\">10.1016\/S0895-6111(02)00073-3<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Medical+image+security+in+a+HIPAA+mandated+PACS+environment&amp;rft.jtitle=Computerized+Medical+Imaging+and+Graphics&amp;rft.aulast=Cao%2C+F.%3B+Huang%2C+H.K.%3B+Zhou%2C+X.Q.&amp;rft.au=Cao%2C+F.%3B+Huang%2C+H.K.%3B+Zhou%2C+X.Q.&amp;rft.date=2003&amp;rft.volume=27&amp;rft.issue=2%E2%80%933&amp;rft.pages=185%E2%80%9396&amp;rft_id=info:doi\/10.1016%2FS0895-6111%2802%2900073-3&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SATheConst96-17\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-SATheConst96_17-0\" rel=\"external_link\">17.0<\/a><\/sup> <sup><a href=\"#cite_ref-SATheConst96_17-1\" rel=\"external_link\">17.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.justice.gov.za\/legislation\/constitution\/SAConstitution-web-eng.pdf\" target=\"_blank\">\"The Constitution of the Republic of South Africa\"<\/a>.&#32;Republic of South Africa.&#32;18 December 1996<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.justice.gov.za\/legislation\/constitution\/SAConstitution-web-eng.pdf\" target=\"_blank\">http:\/\/www.justice.gov.za\/legislation\/constitution\/SAConstitution-web-eng.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=The+Constitution+of+the+Republic+of+South+Africa&amp;rft.atitle=&amp;rft.date=18+December+1996&amp;rft.pub=Republic+of+South+Africa&amp;rft_id=http%3A%2F%2Fwww.justice.gov.za%2Flegislation%2Fconstitution%2FSAConstitution-web-eng.pdf&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HPCSAGuide08-3-18\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HPCSAGuide08-3_18-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Health Professions Council of South Africa&#32;(May 2008).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.hpcsa.co.za\/downloads\/conduct_ethics\/rules\/generic_ethical_rules\/booklet_3_patients_rights_charter.pdf\" target=\"_blank\">\"Guidelines for Good Practice in the Health Care Professions - National Patients' Rights Charter\"<\/a>&#32;(PDF)<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.hpcsa.co.za\/downloads\/conduct_ethics\/rules\/generic_ethical_rules\/booklet_3_patients_rights_charter.pdf\" target=\"_blank\">http:\/\/www.hpcsa.co.za\/downloads\/conduct_ethics\/rules\/generic_ethical_rules\/booklet_3_patients_rights_charter.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Guidelines+for+Good+Practice+in+the+Health+Care+Professions+-+National+Patients%27+Rights+Charter&amp;rft.atitle=&amp;rft.aulast=Health+Professions+Council+of+South+Africa&amp;rft.au=Health+Professions+Council+of+South+Africa&amp;rft.date=May+2008&amp;rft_id=http%3A%2F%2Fwww.hpcsa.co.za%2Fdownloads%2Fconduct_ethics%2Frules%2Fgeneric_ethical_rules%2Fbooklet_3_patients_rights_charter.pdf&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CroskerryPatient09-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CroskerryPatient09_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Croskerry, P.; Cosby, K.S.; Schenkel, S.M.; Wears, R.L.&#32;(2009).&#32;<i>Patient Safety in Emergency Medicine<\/i>.&#32;Lippincott Williams &amp; Wilkins.&#32;pp.&#160;448.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9780781777278.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Patient+Safety+in+Emergency+Medicine&amp;rft.aulast=Croskerry%2C+P.%3B+Cosby%2C+K.S.%3B+Schenkel%2C+S.M.%3B+Wears%2C+R.L.&amp;rft.au=Croskerry%2C+P.%3B+Cosby%2C+K.S.%3B+Schenkel%2C+S.M.%3B+Wears%2C+R.L.&amp;rft.date=2009&amp;rft.pages=pp.%26nbsp%3B448&amp;rft.pub=Lippincott+Williams+%26+Wilkins&amp;rft.isbn=9780781777278&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-PalmerTheRel13-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PalmerTheRel13_20-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Palmer, L.K.&#32;(2013).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/eric.ed.gov\/?id=EJ1022296\" target=\"_blank\">\"The Relationship between Stress, Fatigue, and Cognitive Functioning\"<\/a>.&#32;<i>College Student Journal<\/i>&#32;<b>47<\/b>&#32;(2): 312\u201325<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/eric.ed.gov\/?id=EJ1022296\" target=\"_blank\">https:\/\/eric.ed.gov\/?id=EJ1022296<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+Relationship+between+Stress%2C+Fatigue%2C+and+Cognitive+Functioning&amp;rft.jtitle=College+Student+Journal&amp;rft.aulast=Palmer%2C+L.K.&amp;rft.au=Palmer%2C+L.K.&amp;rft.date=2013&amp;rft.volume=47&amp;rft.issue=2&amp;rft.pages=312%E2%80%9325&amp;rft_id=https%3A%2F%2Feric.ed.gov%2F%3Fid%3DEJ1022296&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BaldiError14-21\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BaldiError14_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Baldi, P.L.&#32;(2014).&#32;\"Error risk in the decision making process\".&#32;<i>Emergency Care Journal<\/i>&#32;<b>10<\/b>&#32;(1): 37\u201340.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4081%2Fecj.2014.2119\" 
target=\"_blank\">10.4081\/ecj.2014.2119<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Error+risk+in+the+decision+making+process&amp;rft.jtitle=Emergency+Care+Journal&amp;rft.aulast=Baldi%2C+P.L.&amp;rft.au=Baldi%2C+P.L.&amp;rft.date=2014&amp;rft.volume=10&amp;rft.issue=1&amp;rft.pages=37%E2%80%9340&amp;rft_id=info:doi\/10.4081%2Fecj.2014.2119&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HealyStress11-22\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HealyStress11_22-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Healy, S.; Tyrrell, M.&#32;(2011).&#32;\"Stress in emergency departments: Experiences of nurses and doctors\".&#32;<i>Emergency Nurse<\/i>&#32;<b>19<\/b>&#32;(4): 31\u20137.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.7748%2Fen2011.07.19.4.31.c8611\" target=\"_blank\">10.7748\/en2011.07.19.4.31.c8611<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Stress+in+emergency+departments%3A+Experiences+of+nurses+and+doctors&amp;rft.jtitle=Emergency+Nurse&amp;rft.aulast=Healy%2C+S.%3B+Tyrrell%2C+M.&amp;rft.au=Healy%2C+S.%3B+Tyrrell%2C+M.&amp;rft.date=2011&amp;rft.volume=19&amp;rft.issue=4&amp;rft.pages=31%E2%80%937&amp;rft_id=info:doi\/10.7748%2Fen2011.07.19.4.31.c8611&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Espa.C3.B1aEffects16-23\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Espa.C3.B1aEffects16_23-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Espa\u00f1a, L.Y.&#32;(2016).&#32;\"Effects of Password Type and Memory Techniques on User Password Memory\".&#32;<i>PSI CHI Journal of Psychological Research<\/i>&#32;<b>21<\/b>&#32;(4): 269\u201375.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.24839%2F2164-8204.JN21.4.269\" target=\"_blank\">10.24839\/2164-8204.JN21.4.269<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Effects+of+Password+Type+and+Memory+Techniques+on+User+Password+Memory&amp;rft.jtitle=PSI+CHI+Journal+of+Psychological+Research&amp;rft.aulast=Espa%C3%B1a%2C+L.Y.&amp;rft.au=Espa%C3%B1a%2C+L.Y.&amp;rft.date=2016&amp;rft.volume=21&amp;rft.issue=4&amp;rft.pages=269%E2%80%9375&amp;rft_id=info:doi\/10.24839%2F2164-8204.JN21.4.269&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Aky.C3.BCzSurgical13-24\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Aky.C3.BCzSurgical13_24-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Aky\u00fcz, E.; Erdemir, F.&#32;(2013).&#32;\"Surgical patients\u2019 and nurses\u2019 opinions and expectations about privacy in care\".&#32;<i>Nursing Ethics<\/i>&#32;<b>20<\/b>&#32;(6): e671.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1177%2F0969733012468931\" target=\"_blank\">10.1177\/0969733012468931<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Surgical+patients%E2%80%99+and+nurses%E2%80%99+opinions+and+expectations+about+privacy+in+care&amp;rft.jtitle=Nursing+Ethics&amp;rft.aulast=Aky%C3%BCz%2C+E.%3B+Erdemir%2C+F.&amp;rft.au=Aky%C3%BCz%2C+E.%3B+Erdemir%2C+F.&amp;rft.date=2013&amp;rft.volume=20&amp;rft.issue=6&amp;rft.pages=e671&amp;rft_id=info:doi\/10.1177%2F0969733012468931&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SamySecurity10-25\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SamySecurity10_25-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Samy, G.N.; Ahmad, R.; Ismail, Z.&#32;(2010).&#32;\"Security threats categories in healthcare information systems\".&#32;<i>Health Informatics Journal<\/i>&#32;<b>16<\/b>&#32;(3): 201\u20139.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1177%2F1460458210377468\" target=\"_blank\">10.1177\/1460458210377468<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Security+threats+categories+in+healthcare+information+systems&amp;rft.jtitle=Health+Informatics+Journal&amp;rft.aulast=Samy%2C+G.N.%3B+Ahmad%2C+R.%3B+Ismail%2C+Z.&amp;rft.au=Samy%2C+G.N.%3B+Ahmad%2C+R.%3B+Ismail%2C+Z.&amp;rft.date=2010&amp;rft.volume=16&amp;rft.issue=3&amp;rft.pages=201%E2%80%939&amp;rft_id=info:doi\/10.1177%2F1460458210377468&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments\">https:\/\/www.limswiki.org\/index.php\/Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","51ebf6ac1bdd905b9c8c8d23fe8b8a29_images":["https:\/\/www.limswiki.org\/images\/1\/1c\/Fig1_Mahlaola_SAJouBioLaw2017_10-2.png"],"51ebf6ac1bdd905b9c8c8d23fe8b8a29_timestamp":1544815915,"9872ac73fcb8d8cb5b8b6de9ced82c60_type":"article","9872ac73fcb8d8cb5b8b6de9ced82c60_title":"How could the ethical management of health data in the medical field inform police use of DNA?
(Krikorian and Vailly 2018)","9872ac73fcb8d8cb5b8b6de9ced82c60_url":"https:\/\/www.limswiki.org\/index.php\/Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F","9872ac73fcb8d8cb5b8b6de9ced82c60_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:How could the ethical management of health data in the medical field inform police use of DNA?\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nHow could the ethical management of health data in the medical field inform police use of DNA?Journal\n \nFrontiers in Public HealthAuthor(s)\n \nKrikorian, Gaelle; Vailly, Jo\u00eblleAuthor affiliation(s)\n \nInstitut de recherche interdisciplinaire sur les enjeux sociaux (IRIS)Primary contact\n \nEmail: gaelle.krikorian@gmail.comEditors\n \nLef\u00e8vre, ThomasYear published\n \n2018Volume and issue\n \n6Page(s)\n \n154DOI\n \n10.3389\/fpubh.2018.00154ISSN\n \n2296-2565Distribution license\n \nCreative Commons Attribution 4.0 InternationalWebsite\n \nhttps:\/\/www.frontiersin.org\/articles\/10.3389\/fpubh.2018.00154\/fullDownload\n \nhttps:\/\/www.frontiersin.org\/articles\/10.3389\/fpubh.2018.00154\/pdf (PDF)\n\nContents\n\n1 Introduction \n2 Nature of the information and genetic data produced in the police sphere \n3 How is this framed legally, politically, and ethically? 
\n4 Conclusion \n5 Acknowledgements \n\n5.1 Author contributions \n5.2 Funding \n5.3 Conflict of interest statement \n\n\n6 Footnotes \n7 References \n8 Notes \n\n\n\nIntroduction \nVarious events paved the way for the production of ethical norms regulating biomedical practices, from the Nuremberg Code (1947)\u2014produced by the international trial of Nazi regime leaders and collaborators\u2014and the Declaration of Helsinki by the World Medical Association (1964) to the invention of the term \u201cbioethics\u201d by American biologist Van Rensselaer Potter.[1] The ethics of biomedicine has given rise to various controversies\u2014particularly in the fields of newborn screening[2], prenatal screening[3], and cloning[4]\u2014resulting in the institutionalization of ethical questions in the biomedical world of genetics. In 1994, France passed legislation (commonly known as the \u201cbioethics laws\u201d) to regulate medical practices in genetics. The medical community has also organized itself in order to manage ethical issues relating to its decisions, with a view to handling \u201cpractices with many strong uncertainties\u201d and enabling clinical judgments and decisions to be made not by individual practitioners but rather by multidisciplinary groups drawing on different modes of judgment and forms of expertise.[5] Thus, the biomedical approach to genetics has been characterized by various debates and the existence of public controversies.\nIn the judicial sphere, the situation is very different. Since the end of the 1990s, developments in biomedical research have led to genetic data being used in police work and legal proceedings. Today, forensic science is omnipresent in investigations, not just in complex criminal cases but also routinely in cases of \u201cminor\u201d or \u201cmass\u201d delinquency. 
Genetics, which certainly receives the most media coverage among the techniques involved[6], has taken on considerable importance.[7] However, although very similar techniques are used in biomedicine and police work (DNA amplification, sequencing, etc.), the forms of collective management surrounding them are very different, as well as the ethico-legal frameworks and their evolution, as this text will demonstrate.\nKeywords: DNA, police, ethics, genetic technologies, criminal investigations\n\nNature of the information and genetic data produced in the police sphere \nIn police work in France, data produced by DNA are currently compiled and used in two different ways: first, to create files on individuals in the FNAEG or Fichier national automatis\u00e9 des empreintes g\u00e9n\u00e9tiques (national automated DNA database) and, second, in order to obtain information about perpetrators of crimes (their appearance, their origin, their kinship links to other individuals).\nPolice use of DNA has been allowed in France since the 1998 law providing for the creation of the FNAEG. A DNA profile corresponds to a \u201cspecific individual alphanumeric combination\u201d[8] that is the numerical encoding of analysis of DNA segments. This profile is the result of analysis of DNA fragments using genetic markers. This analysis can be carried out on a minute amount of genetic material (saliva, blood, sperm, hair, contact, etc.). It identifies the presence of sequences specific to an individual that differentiate them from any other person (with the exception of an identical twin) but that are not supposed to provide any phenotypical information (about appearance, geographical origin, or diseases).[a] Such profiles therefore make individuals \u201cidentifiable in their uniqueness.\u201d[9] During investigations, DNA is collected from suspects or unidentified stains left on crime scenes or people and the results of this analysis are entered into the database. 
Identification through the FNAEG was originally restricted to a limited number of crimes\u2014those of a sexual nature, as part of the law relating to the prevention and punishment of sexual crimes and the protection of minors. This remit has progressively been extended to include the vast majority of crimes and offences[b], leading to the routine use of DNA in investigations.[c] As a result of this evolution, there has been a substantial increase in the number of persons with files in the FNAEG, more than three million as of late 2015.[d]\nNew techniques have also emerged in recent years. It is now possible to obtain indications about an individual's physical appearance based on a sample of his\/her DNA[10][11]: the analyses in question provide statistical information on eye, hair, and skin color, etc. These techniques are more exploratory and aim not to match DNA with an identity by comparison but to determine the characteristics of the perpetrator of a crime. These data result from analysis of several dozen DNA markers that, unlike the FNAEG's data, are selected deliberately so that they can provide information about a person's physical appearance. They are therefore aimed at \u201cgenerating a suspect\u201d[12], but because the information about this person's features is incomplete (e.g., a person with blue eyes, fair skin, light brown hair, and of European \u201cbio-geographical\u201d ancestry), they define \u201ctarget populations of interest\u201d to guide police investigations.[13] Several private and public laboratories in France now produce what professionals often refer to as \u201cDNA photofits\u201d; it is estimated that several dozen such analyses have been carried out since 2014 as part of investigations.\n\nHow is this framed legally, politically, and ethically? \nThe legal framework surrounding how the police and justice system use DNA analysis was devised to follow the creation of the FNAEG.
For this reason, and in order to defuse fears and criticisms, the law only allows analyses using \u201cnon-coding\u201d DNA so as to meet the initial objective of allowing identification without providing information about individuals. French law only allows the police to use DNA for identification purposes \u201cwithin the framework of investigative measures or the preparation of a case during a judicial proceeding,\u201d[e] in cases of missing persons[f], or, more recently, in the context of familial searches to allow \u201csearches for persons directly related to [an] unknown person\u201d who has left a stain at a crime scene (i.e., without determining phenotype).[g]\nConcerning the so-called \u201cDNA Photofit\u201d technique, in June 2014, France's highest court, the Court of Cassation, ruled admissible an expert report charged with providing \u201call useful elements relating to the suspect's visible morphological characteristics\u201d based on stains collected after a rape in an investigation into a series of sexual assaults in Lyon between October 2012 and January 2014. The Court of Cassation's authorization of this practice in DNA analysis was the first in France. For judges and prosecutors, there is now a legal precedent allowing them to authorize \u201cDNA Photofits\u201d when they consider this could help an investigation.\nIn legal terms, the emergence of new technical possibilities and their practical use create conflicting and parallel regimes. On the one hand, \u201cDNA Photofits\u201d do not correspond to the legal frameworks devised in the 1990s. They do not provide identification, per se, but are rather an \u201cassistance to the investigation,\u201d as they use coding DNA. On the other hand, as science evolves, the law is falling out of step with the technical and scientific reality.
New knowledge shows that some of the markers used by the FNAEG may in fact allow further information to be obtained about people regarding their predisposition to certain diseases, their genetic pathologies, and their \u201cethnic origin\u201d (by continent or sub-continent).[h] Moreover, whereas at the FNAEG's inception it was considered unacceptable for the police to use medical information, certain professionals in the police or justice system now recognize that this information (whether genetic or not) can be useful in investigations (providing information about wanted persons' need for medication or healthcare, or about their physical appearance, etc.). Although there are no changes in the legal framework on this matter, the idea is spreading and the red line is, to some extent, and for some of the professionals, fading.\nIt is thus obvious that police uses of DNA data providing information about individuals' characteristics raise novel politico-ethical issues.[17][18] In particular, they bring into play the issue of what constitutes private data[19]\u2014for certain geneticists, where \u201cDNA Photofits\u201d are concerned, externally visible characteristics do not fall into this category because they are visible.[11] Generally, as stated by some professionals during interviews, the question is \u201cto know until where to go.
And where to stop.\u201d Regarding the FNAEG and French law, in a case heard in June 2017, the European Court of Human Rights (ECHR) ruled that \u201cinterference with the applicant's right to respect for his private life had been disproportionate.\u201d[i] The ECHR judgment ruled against France and underscored that French law regarding DNA data storage should be differentiated \u201caccording to the nature and seriousness of the offence committed.\u201d[j]\nIn Germany, a contradictory dialogue between experts took place regarding Forensic DNA Phenotyping, revealing public and political debate on the matter.[20] In France, despite the stakes involved and the spread of new usages of DNA techniques, no public debate has emerged in recent years concerning new uses of DNA in police work. In 2008, a private analysis laboratory offering indicative geo-genetic tests (tests d'origine g\u00e9o-g\u00e9n\u00e9tique or TOGG) providing information about individuals' origin based on their DNA sparked a media debate that complicated the issue[21]; however, the controversy soon died down. A few years later, Ministry of Justice instructions to judges and prosecutors discouraged the use of this technique, with no further debate. Since then, although the Court of Cassation's 2014 decision opened up the possibility of using an unprecedented practice, this has not generated any public debate or controversy. \n\u201cDNA Photofits\u201d have received some media coverage[k], but this has mainly been to underscore the technical process involved, echoing the fiction conveyed by television series that have made the use of genetic techniques in criminal investigations seem commonplace and particularly efficient. Our sociological fieldwork has revealed, however, that there was organized debate among judges and prosecutors between 2013 and 2014.
At the time, the investigating judge who had for the first time ordered the analysis of the suspect's visible morphological characteristics referred the case to the examining chamber himself, to obtain a verdict on whether the expert report he had requested was legal. Although the examining chamber approved the report, the public prosecutor brought the issue before the Court of Cassation\u2014the highest legal authority in France\u2014in order to ensure the final nature of the decision. The Court of Cassation ruled that a judge could have recourse to such analyses. Following this verdict, several bodies consulted by the Ministry of Justice[l] provided opinions underscoring the need for this technique to be written into and regulated by the law. This has not been implemented to date. After being authorized for several years under a temporary protocol, familial searches allowing \u201cgenetic proximity testing\u201d[22] were written into law in 2016. However, the Court of Cassation's judgment on DNA analysis to provide \u201call useful elements relating to a suspect's visible morphological characteristics\u201d has not been brought up for parliamentary debate to be included in the law. There has been no political management of the question at the state level, nor has the issue been included in the general debate organized by the National Consultative Council of Ethics (Comit\u00e9 Consultatif National d'Ethique) in 2018 regarding the revision of laws on bio-ethics.\n\nConclusion \nThe use of these new technological and scientific techniques plays a significant role in guiding how we engage with the world[23], just as it redefines the production of identity translated into information[24] and structures the way sensitive information about individuals is used and circulated. Despite these stakes, and the initial caution that surrounded the creation of the national automated DNA database, it has not gone hand-in-hand with collective political and ethical debate. 
This raises questions about the conditions for the existence or for the absence of political controversies that call for further sociological investigations about the framing of the issue and the social and political logic at play.\nAs the uses of these techniques are developing in police practices, this absence of collective management of the issue leaves professionals to rely on forms of local arbitration. Our fieldwork has shown that they are aware that these practices raise issues and therefore devise ethical frameworks for their own use of DNA. As a consequence, in this field, as is the case in others, ethical issues are addressed in a fragmented manner as endogenous ethical frameworks are \u201ccobbled together\u201d by professionals as a function of their practices and needs. Each institution, laboratory, and in some cases each individual, is crafting a frame and a perimeter of limits to what can be done according to their understanding and appreciation of the legal setting, the practical utility of actions and the ethical constraints perceived.\nThe ECHR's recent ruling against France regarding the FNAEG may force lawmakers to reach a verdict on this issue, thereby triggering what seems like necessary public debate on forensic use of DNA. The new possibilities provided by genetic technologies point to the need for promoting dialogue among the various professionals using this technology in police work (forensic teams and geneticists working with them, police investigators, private laboratories, prosecutors, judges, etc.), but also with healthcare professionals\u2014who already have experience of the institutionalized management of ethical considerations relating to their practices in genetics\u2014and, more broadly, in society as a whole.\n\nAcknowledgements \nThe authors are grateful to Lucy Garnier for translating this article from French.\n\nAuthor contributions \nGK is the main contributor.
JV is the head of the research programme and contributed to the writing of the article.\n\nFunding \nThis research was financed by the National Research Agency (ANR) in France (Project FITEGE, contract: ANR-14-CE29-0014).\n\nConflict of interest statement \nThe authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.\n\nFootnotes \n\n\n\u2191 The Order of 10 August 2015 increased the number of markers analyzed to 21; policemen and analysis laboratories had three years to comply with this new requirement. \n\n\u2191 Act n\u00b098-468 of 17 June 1998 relative to the punishment of sexual crimes and the protection of minors introduced article 706-54 into the Code of Criminal Procedure making provision for the creation of an automated national database to centralize the DNA profiles of persons convicted of offences of a sexual nature. The remit of the database was then extended on several occasions. In 2001, it included serious crimes against persons. In 2003, the law on internal security extended it to persons convicted of or implicated in crimes and offences against persons or property. \n\n\u2191 Collecting DNA samples in investigations is now the rule. An ad hoc body of staff has been trained over the past 15 years that almost systematically processes crime scenes. \n\n\u2191 This figure was provided to the French Parliament by the Ministry of the Interior following a question by parliamentarian Sergio Coronado (member of the \u201cEcologist\u201d parliamentary group) (http:\/\/questions.assemblee-nationale.fr\/q14\/14-79728QE.htm). \n\n\u2191 Art. 16.11 of the Civil Code \n\n\u2191 Art.
26, Domestic Security Guidance and Planning Act n\u00b0 95-73 of 21 January 1995 \n\n\u2191 This possibility was written into law in 2016 in article 796-56-1-1 of Act n\u00b0 2016-731 of 3 June 2016 strengthening provisions for the fight against organized crime, terrorism, and their financing, and improving the efficiency and guarantees of the criminal procedure. \n\n\u2191 For example, according to a study by the Telethon Institute of Genetics and Medicine, D2S1388, one of the markers used by the FNAEG, plays a determining role in the transmission of pseudohyperkalaemia, a rare genetic disease.[14] In 2011, a publication by Chinese researchers highlighted the association between marker D21S11-28.2 and coronary heart disease.[15] A team of Portuguese researchers[16] has developed an online calculator capable of correlating certain markers used in the FNAEG's DNA samples with individual affiliation to population groups (Sub-Saharan Africa, Eurasia, East Asia, North Africa, Near East, North America, South America, and Central America). \n\n\u2191 Case of Aycaguer V. France, 22 June 2017, 8806\/12, ECHR, Court (Fifth Section) \n\n\u2191 See legal summary, available at https:\/\/hudoc.echr.coe.int\/eng#{%22itemid%22:[%22002-11703%22} \n\n\u2191 A search conducted on the press database Europresse for the period 2010 to 2018 brought up around 70 pieces published mentioning the terms \u201cDNA Photofits\u201d or \u201cGenetic photofits\u201d. \n\n\u2191 These bodies were the Commission nationale consultative des droits de l'homme (CNCDH \u2013 National consultative committee on human rights) and the approval committee for people authorized to conduct identification procedures using DNA profiles in the context of legal proceedings or the extrajudicial procedure for identifying deceased persons. 
\n\n\nReferences \n\n\n\u2191 Potter, V.R.&#32;(1970).&#32;\"Bioethics, the science of survival\".&#32;Perspectives in Biology and Medicine&#32;14&#32;(1): 127\u201353.&#32;doi:10.1353\/pbm.1970.0015. &#160; \n\n\u2191 Vailly, J.&#32;(2013).&#32;The Birth of a Genetics Policy: Social Issues of Newborn Screening.&#32;Routledge.&#32;pp.&#160;240.&#32;ISBN&#160;9781472422729. &#160; \n\n\u2191 Isambert, F.A.&#32;(1980).&#32;\"\u00c9thique et g\u00e9n\u00e9tique: De l'utopie eug\u00e9nique au contr\u00f4le des malformations cong\u00e9nitales\".&#32;Revue fran\u00e7aise de sociologie&#32;21&#32;(3): 331\u201354.&#32;doi:10.2307\/3320930. &#160; \n\n\u2191 Pulman, B.&#32;(2005).&#32;\"Les enjeux du clonage\".&#32;Revue fran\u00e7aise de sociologie&#32;46&#32;(3): 413\u201342.&#32;doi:10.3917\/rfs.463.0413. &#160; \n\n\u2191 Bourret, P.; Rabeharisoa, V.&#32;(2008).&#32;\"D\u00e9cision et jugement m\u00e9dicaux en situation de forte incertitude&#160;: l\u2019exemple de deux pratiques cliniques \u00e0 l\u2019\u00e9preuve de la g\u00e9n\u00e9tique\".&#32;Sciences sociales et sant\u00e9&#32;26&#32;(1): 128.&#32;doi:10.3917\/sss.261.0033. &#160; \n\n\u2191 Brewer, P.R.; Ley, B.L.&#32;(2009).&#32;\"Media Use and Public Perceptions of DNA Evidence\".&#32;Science Communication&#32;32&#32;(1): 93\u2013117.&#32;doi:10.1177\/1075547009340343. &#160; \n\n\u2191 Williams, R.; Johnson, P.&#32;(2008).&#32;Genetic Policing: The Uses of DNA in Police Investigations.&#32;Willan.&#32;pp.&#160;208.&#32;ISBN&#160;9781843922049. &#160; \n\n\u2191 Cabal, C.; Le D\u00e9aut, J.-Y.; Revol, H.&#32;(2001).&#32;Rapport sur la valeur scientifique de l'utilisation des empreintes g\u00e9n\u00e9tiques dans le domaine judiciaire.&#32;Assembl\u00e9e nationale.&#32;ISBN&#160;2111150177. &#160; \n\n\u2191 Bonniol, J.-L.; Darlu, P.&#32;(2014).&#32;\"L\u2019ADN au service d\u2019une nouvelle qu\u00eate des anc\u00eatres?\".&#32;Civilisations&#32;63: 201\u201319.&#32;doi:10.4000\/civilisations.3747. 
&#160; \n\n\u2191 Kayser, M.; de Knijff, P.&#32;(2011).&#32;\"Improving human forensics through advances in genetics, genomics and molecular biology\".&#32;Nature Reviews Genetics&#32;12&#32;(3): 179\u201392.&#32;doi:10.1038\/nrg2952.&#32;PMID&#160;21331090. &#160; \n\n\u2191 11.0 11.1 Kayser, M.&#32;(2015).&#32;\"Forensic DNA Phenotyping: Predicting human appearance from crime scene material for investigative purposes\".&#32;Forensic Science International Genetics&#32;18: 33\u201348.&#32;doi:10.1016\/j.fsigen.2015.02.003.&#32;PMID&#160;25716572. &#160; \n\n\u2191 M'charek, A.&#32;(2013).&#32;\"Beyond Fact or Fiction: On the Materiality of Race in Practice\".&#32;Cultural Anthropology&#32;28&#32;(3): 420\u201342.&#32;doi:10.1111\/cuan.12012. &#160; \n\n\u2191 Caliebe, A.; Krawczak, M.; Kayser, M.&#32;(2018).&#32;\"Predictive values in Forensic DNA Phenotyping are not necessarily prevalence-dependent\".&#32;FSI Genetics&#32;33: e7\u2013e8.&#32;doi:10.1016\/j.fsigen.2017.11.006. &#160; \n\n\u2191 Carella, M.; d'Adamo, A.P.; Grootenboer-Mignot, S. et al.&#32;(2004).&#32;\"A second locus mapping to 2q35-36 for familial pseudohyperkalaemia\".&#32;European Journal of Human Genetics&#32;12&#32;(12): 1073\u20136.&#32;doi:10.1038\/sj.ejhg.5201280. &#160; \n\n\u2191 Hui, L.; Jing, Y.; Rui, M.; Weijian, Y.&#32;(2011).&#32;\"Novel association analysis between 9 short tandem repeat loci polymorphisms and coronary heart disease based on a cross-validation design\".&#32;Atherosclerosis&#32;218&#32;(1): 151\u20135.&#32;doi:10.1016\/j.atherosclerosis.2011.05.024.&#32;PMID&#160;21703622. &#160; \n\n\u2191 Pereira, L.; Alshamali, F.; Andreassen, R. et al.&#32;(2011).&#32;\"PopAffiliator: online calculator for individual affiliation to a major population group based on 17 autosomal short tandem repeat genotype profile\".&#32;International Journal of Legal Medicine&#32;125&#32;(5): 629\u201336.&#32;doi:10.1007\/s00414-010-0472-2.&#32;PMID&#160;20552217. 
&#160; \n\n\u2191 M'charek, A.&#32;(2008).&#32;\"Silent witness, articulate collective: DNA evidence and the inference of visible traits\".&#32;Bioethics&#32;22&#32;(9): 519-28.&#32;doi:10.1111\/j.1467-8519.2008.00699.x.&#32;PMID&#160;18959734. &#160; \n\n\u2191 MacLean, C.E.; Lamparello, A.&#32;(2014).&#32;\"Forensic DNA phenotyping in criminal investigations and criminal courts: Assessing and mitigating the dilemmas inherent in the science\".&#32;Recent Advances in DNA and Gene Sequences&#32;8&#32;(2): 104-12.&#32;PMID&#160;25687339. &#160; \n\n\u2191 Toom, V.; Wienroth, M.; M'charek, A. et al.&#32;(2016).&#32;\"Approaching ethical, legal and social issues of emerging forensic DNA phenotyping (FDP) technologies comprehensively: Reply to 'Forensic DNA phenotyping: Predicting human appearance from crime scene material for investigative purposes' by Manfred Kayser\".&#32;Forensic Science International Genetics&#32;22: e1\u2013e4.&#32;doi:10.1016\/j.fsigen.2016.01.010.&#32;PMID&#160;26832996. &#160; \n\n\u2191 Buchanan, N.; Staubach, F.; Wienroth, M. et al.&#32;(2018).&#32;\"Forensic DNA phenotyping legislation cannot be based on \u201cIdeal FDP\u201d\u2014A response to Caliebe, Krawczak and Kayser (2017)\".&#32;FSI Genetics&#32;34: e13\u2013e14.&#32;doi:10.1016\/j.fsigen.2018.01.009. &#160; \n\n\u2191 Vailly, J.&#32;(2017).&#32;\"The politics of suspects\u2019 geo-genetic origin in France: The conditions, expression, and effects of problematisation\".&#32;BioSocieties&#32;12&#32;(1): 66\u201388.&#32;doi:10.1057\/s41292-016-0028-x. &#160; \n\n\u2191 Prainsack, B.&#32;(2010).&#32;\"Chapter 2: Key issues in DNA profiling and databasing: Implications for governance\".&#32;In&#32;Hindmarsh, R.; Prainsack, B..&#32;Genetic Suspects: Global Governance of Forensic DNA Profiling and Databasing.&#32;Cambridge University Press.&#32;pp.&#160;15\u201339.&#32;ISBN&#160;9780521519434. 
&#160; \n\n\u2191 Williams, R.; Wienroth, M.&#32;(2017).&#32;\"Social and ethical aspects of forensic genetics: A critical review\".&#32;Forensic Science Review&#32;29&#32;(2): 145\u201369.&#32;PMID&#160;28691916. &#160; \n\n\u2191 Aas, K.F.&#32;(2006).&#32;\"\u2018The body does not lie\u2019: Identity, risk and trust in technoculture\".&#32;Crime, Media, Culture: An International Journal&#32;2&#32;(2): 143-158.&#32;doi:10.1177\/1741659006065401. &#160; \n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. Footnotes were originally numbered but have been converted to lowercase alpha for this version. The link in footnote j had to be applied to Google Shortener because the HUDOC uses invalid characters in their URLs, and this wiki's footnote system breaks when the original URL is used.\n\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\">https:\/\/www.limswiki.org\/index.php\/Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F<\/a>\nCategories: LIMSwiki journal articles (added in 2018); LIMSwiki journal articles (all); LIMSwiki journal articles on forensic science; LIMSwiki journal articles on health informatics\nThis page was last modified on 11 September 2018, at 19:26.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n\n","9872ac73fcb8d8cb5b8b6de9ced82c60_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div 
id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:How could the ethical management of health data in the medical field inform police use of DNA?<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>Various events paved the way for the production of ethical norms regulating biomedical practices, from the Nuremberg Code (1947)\u2014produced by the international trial of Nazi regime leaders and collaborators\u2014and the Declaration of Helsinki by the World Medical Association (1964) to the invention of the term \u201cbioethics\u201d by American biologist Van Rensselaer Potter.<sup id=\"rdp-ebb-cite_ref-PotterBio70_1-0\" class=\"reference\"><a href=\"#cite_note-PotterBio70-1\" rel=\"external_link\">[1]<\/a><\/sup> The ethics of biomedicine has given rise to various controversies\u2014particularly in the fields of newborn screening<sup id=\"rdp-ebb-cite_ref-2\" class=\"reference\"><a href=\"#cite_note-2\" rel=\"external_link\">[2]<\/a><\/sup>, prenatal screening<sup id=\"rdp-ebb-cite_ref-Isambert.C3.89thique80_3-0\" class=\"reference\"><a href=\"#cite_note-Isambert.C3.89thique80-3\" rel=\"external_link\">[3]<\/a><\/sup>, and cloning<sup id=\"rdp-ebb-cite_ref-PulmanLesEnjeux05_4-0\" class=\"reference\"><a href=\"#cite_note-PulmanLesEnjeux05-4\" rel=\"external_link\">[4]<\/a><\/sup>\u2014resulting in the institutionalization of ethical questions in the biomedical world of genetics. In 1994, France passed legislation (commonly known as the \u201cbioethics laws\u201d) to regulate medical practices in genetics. 
The medical community has also organized itself in order to manage ethical issues relating to its decisions, with a view to handling \u201cpractices with many strong uncertainties\u201d and enabling clinical judgments and decisions to be made not by individual practitioners but rather by multidisciplinary groups drawing on different modes of judgment and forms of expertise.<sup id=\"rdp-ebb-cite_ref-BourretD.C3.A9cision08_5-0\" class=\"reference\"><a href=\"#cite_note-BourretD.C3.A9cision08-5\" rel=\"external_link\">[5]<\/a><\/sup> Thus, the biomedical approach to genetics has been characterized by various debates and the existence of public controversies.\n<\/p><p>In the judicial sphere, the situation is very different. Since the end of the 1990s, developments in biomedical research have led to genetic data being used in police work and legal proceedings. Today, <a href=\"https:\/\/www.limswiki.org\/index.php\/Forensic_science\" title=\"Forensic science\" target=\"_blank\" class=\"wiki-link\" data-key=\"415d36a7b65494677b6d2873d5febec1\">forensic science<\/a> is omnipresent in investigations, not just in complex criminal cases but also routinely in cases of \u201cminor\u201d or \u201cmass\u201d delinquency. 
Genetics, which certainly receives the most media coverage among the techniques involved<sup id=\"rdp-ebb-cite_ref-BrewerMedia09_6-0\" class=\"reference\"><a href=\"#cite_note-BrewerMedia09-6\" rel=\"external_link\">[6]<\/a><\/sup>, has taken on considerable importance.<sup id=\"rdp-ebb-cite_ref-WilliamsGenetic08_7-0\" class=\"reference\"><a href=\"#cite_note-WilliamsGenetic08-7\" rel=\"external_link\">[7]<\/a><\/sup> However, although very similar techniques are used in biomedicine and police work (DNA amplification, <a href=\"https:\/\/www.limswiki.org\/index.php\/Sequencing\" title=\"Sequencing\" class=\"mw-disambig wiki-link\" target=\"_blank\" data-key=\"e36167a9eb152ca16a0c4c4e6d13f323\">sequencing<\/a>, etc.), the forms of collective management surrounding them are very different, as well as the ethico-legal frameworks and their evolution, as this text will demonstrate.\n<\/p><p><b>Keywords<\/b>: DNA, police, ethics, genetic technologies, criminal investigations\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Nature_of_the_information_and_genetic_data_produced_in_the_police_sphere\">Nature of the information and genetic data produced in the police sphere<\/span><\/h2>\n<p>In police work in France, data produced by DNA are currently compiled and used in two different ways: first, to create files on individuals in the FNAEG or <i>Fichier national automatis\u00e9 des empreintes g\u00e9n\u00e9tiques<\/i> (national automated DNA database) and, second, in order to obtain <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a> about perpetrators of crimes (their appearance, their origin, their kinship links to other individuals).\n<\/p><p>Police use of DNA has been allowed in France since the 1998 law providing for the creation of the FNAEG. 
A DNA profile corresponds to a \u201cspecific individual alphanumeric combination\u201d<sup id=\"rdp-ebb-cite_ref-CabalRapport01_8-0\" class=\"reference\"><a href=\"#cite_note-CabalRapport01-8\" rel=\"external_link\">[8]<\/a><\/sup> that is the numerical encoding of analysis of DNA segments. This profile is the result of analysis of DNA fragments using genetic markers. This analysis can be carried out on a minute amount of genetic material (saliva, blood, sperm, hair, contact, etc.). It identifies the presence of sequences specific to an individual that differentiate them from any other person (with the exception of an identical twin) but that are not supposed to provide any phenotypical information (about appearance, geographical origin, or diseases).<sup id=\"rdp-ebb-cite_ref-9\" class=\"reference\"><a href=\"#cite_note-9\" rel=\"external_link\">[a]<\/a><\/sup> Such profiles therefore make individuals \u201cidentifiable in their uniqueness.\u201d<sup id=\"rdp-ebb-cite_ref-BonniolL.27ADN14_10-0\" class=\"reference\"><a href=\"#cite_note-BonniolL.27ADN14-10\" rel=\"external_link\">[9]<\/a><\/sup> During investigations, DNA is collected from suspects or unidentified stains left on crime scenes or people and the results of this analysis are entered into the database. Identification through the FNAEG was originally restricted to a limited number of crimes\u2014those of a sexual nature, as part of the law relating to the prevention and punishment of sexual crimes and the protection of minors. 
This remit has progressively been extended to include the vast majority of crimes and offences<sup id=\"rdp-ebb-cite_ref-11\" class=\"reference\"><a href=\"#cite_note-11\" rel=\"external_link\">[b]<\/a><\/sup>, leading to the routine use of DNA in investigations.<sup id=\"rdp-ebb-cite_ref-12\" class=\"reference\"><a href=\"#cite_note-12\" rel=\"external_link\">[c]<\/a><\/sup> As a result of this evolution, there has been a substantial increase in the number of persons with files in the FNAEG, more than three million as of late 2015.<sup id=\"rdp-ebb-cite_ref-13\" class=\"reference\"><a href=\"#cite_note-13\" rel=\"external_link\">[d]<\/a><\/sup>\n<\/p><p>New techniques have also emerged in recent years. It is now possible to obtain indications about an individual's physical appearance based on a sample of his\/her DNA<sup id=\"rdp-ebb-cite_ref-KayserImproving11_14-0\" class=\"reference\"><a href=\"#cite_note-KayserImproving11-14\" rel=\"external_link\">[10]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-KayserForensic15_15-0\" class=\"reference\"><a href=\"#cite_note-KayserForensic15-15\" rel=\"external_link\">[11]<\/a><\/sup>: the analyses in question provide statistical information on eye, hair, and skin color, etc. These techniques are more exploratory and aim not to match DNA with an identity by comparison but to determine the characteristics of the perpetrator of a crime. These data result from <a href=\"https:\/\/www.limswiki.org\/index.php\/Data_analysis\" title=\"Data analysis\" target=\"_blank\" class=\"wiki-link\" data-key=\"545c95e40ca67c9e63cd0a16042a5bd1\">analysis<\/a> of several dozen DNA markers that, unlike the FNAEG's data, are selected deliberately so that they can provide information about a person's physical appearance. 
They are therefore aimed at \u201cgenerating a suspect\u201d<sup id=\"rdp-ebb-cite_ref-M.27charekBeyond13_16-0\" class=\"reference\"><a href=\"#cite_note-M.27charekBeyond13-16\" rel=\"external_link\">[12]<\/a><\/sup>, but because the information about this person's features is incomplete (e.g., a person with blue eyes, fair skin, light brown hair, and of European \u201cbio-geographical\u201d ancestry), they define \u201ctarget populations of interest\u201d to guide police investigations.<sup id=\"rdp-ebb-cite_ref-CaliebePredictive18_17-0\" class=\"reference\"><a href=\"#cite_note-CaliebePredictive18-17\" rel=\"external_link\">[13]<\/a><\/sup> Several private and public laboratories in France now produce what professionals often refer to as \u201cDNA photofits\u201d; it is estimated that several dozen such analyses have been carried out since 2014 as part of investigations.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"How_is_this_framed_legally.2C_politically.2C_and_ethically.3F\">How is this framed legally, politically, and ethically?<\/span><\/h2>\n<p>The legal framework surrounding how the police and justice system use DNA analysis was devised to follow the creation of the FNAEG. For this reason, and in order to defuse fears and criticisms, the law only allows analyses using \u201cnon-coding\u201d DNA so as to meet the initial objective of allowing identification without providing information about individuals. 
French law allows police use of DNA only for identification purposes \u201cwithin the framework of investigative measures or the preparation of a case during a judicial proceeding,\u201d<sup id=\"rdp-ebb-cite_ref-18\" class=\"reference\"><a href=\"#cite_note-18\" rel=\"external_link\">[e]<\/a><\/sup> in cases of missing persons<sup id=\"rdp-ebb-cite_ref-19\" class=\"reference\"><a href=\"#cite_note-19\" rel=\"external_link\">[f]<\/a><\/sup>, or, more recently, in the context of familial searches to allow \u201csearches for persons directly related to [an] unknown person\u201d who has left a stain at a crime scene (i.e., without determining phenotype).<sup id=\"rdp-ebb-cite_ref-20\" class=\"reference\"><a href=\"#cite_note-20\" rel=\"external_link\">[g]<\/a><\/sup>\n<\/p><p>Concerning the so-called \u201cDNA Photofit\u201d technique, in June 2014, France's highest court, the Court of Cassation, ruled admissible an expert report charged with providing \u201call useful elements relating to the suspect's visible morphological characteristics\u201d based on stains collected after a rape in an investigation into a series of sexual assaults in Lyon between October 2012 and January 2014. The Court of Cassation's authorization of this practice in DNA analysis was the first in France. For judges and prosecutors, there is now a legal precedent allowing them to authorize \u201cDNA Photofits\u201d when they consider this could help an investigation.\n<\/p><p>In legal terms, the emergence of new technical possibilities and their practical use create conflicting and parallel regimes. On one hand, \u201cDNA Photofits\u201d do not correspond to the legal frameworks devised in the 1990s. They do not provide identification, per se, but are rather an \u201cassistance to the investigation,\u201d as they use coding DNA. On the other hand, as science evolves, the law is falling out of step with the technical and scientific reality. 
New knowledge shows that some of the markers used by the FNAEG may in fact allow further information to be obtained about people regarding their predisposition to certain diseases, their genetic pathologies, and their \u201cethnic origin\u201d (by continent or sub-continent).<sup id=\"rdp-ebb-cite_ref-24\" class=\"reference\"><a href=\"#cite_note-24\" rel=\"external_link\">[h]<\/a><\/sup> Moreover, whereas at the FNAEG's inception it was considered unacceptable for the police to use medical information, certain professionals in the police or justice system now recognize that this information (whether genetic or not) can be useful in investigations (providing information about wanted persons' need for medication or healthcare, or about their physical appearance, etc.). Although there are no changes in the legal framework on this matter, the idea is spreading and the red line is, to some extent, and for some professionals, fading.\n<\/p><p>It is thus obvious that police uses of DNA data providing information about individuals' characteristics raise novel politico-ethical issues.<sup id=\"rdp-ebb-cite_ref-M.27charekSilent08_25-0\" class=\"reference\"><a href=\"#cite_note-M.27charekSilent08-25\" rel=\"external_link\">[17]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-MacLeanForensic14_26-0\" class=\"reference\"><a href=\"#cite_note-MacLeanForensic14-26\" rel=\"external_link\">[18]<\/a><\/sup> In particular, these uses bring into play the issue of what constitutes private data<sup id=\"rdp-ebb-cite_ref-ToomApproaching16_27-0\" class=\"reference\"><a href=\"#cite_note-ToomApproaching16-27\" rel=\"external_link\">[19]<\/a><\/sup>; for certain geneticists, where \u201cDNA Photofits\u201d are concerned, externally visible characteristics do not fall into this category because they are visible.<sup id=\"rdp-ebb-cite_ref-KayserForensic15_15-1\" class=\"reference\"><a href=\"#cite_note-KayserForensic15-15\" rel=\"external_link\">[11]<\/a><\/sup> Generally, as stated by some professionals 
during interviews, the question is \u201cto know until where to go. And where to stop.\u201d Regarding the FNAEG and French law, in a case heard in June 2017, the European Court of Human Rights (ECHR) ruled that \u201cinterference with the applicant's right to respect for his private life had been disproportionate.\u201d<sup id=\"rdp-ebb-cite_ref-28\" class=\"reference\"><a href=\"#cite_note-28\" rel=\"external_link\">[i]<\/a><\/sup> The ECHR judgment ruled against France and underscored that French law regarding DNA data storage should be differentiated \u201caccording to the nature and seriousness of the offence committed.\u201d<sup id=\"rdp-ebb-cite_ref-29\" class=\"reference\"><a href=\"#cite_note-29\" rel=\"external_link\">[j]<\/a><\/sup>\n<\/p><p>In Germany, a contradictory dialogue between experts took place regarding forensic DNA phenotyping, revealing public and political debate on the matter.<sup id=\"rdp-ebb-cite_ref-BuchananForensic18_30-0\" class=\"reference\"><a href=\"#cite_note-BuchananForensic18-30\" rel=\"external_link\">[20]<\/a><\/sup> In France, despite the stakes involved and the spread of new usages of DNA techniques, no public debate has emerged in recent years concerning new uses of DNA in police work. In 2008, a private analysis <a href=\"https:\/\/www.limswiki.org\/index.php\/Laboratory\" title=\"Laboratory\" target=\"_blank\" class=\"wiki-link\" data-key=\"c57fc5aac9e4abf31dccae81df664c33\">laboratory<\/a> offering indicative geo-genetic tests (<i>tests d'origine g\u00e9o-g\u00e9n\u00e9tique<\/i> or TOGG) providing information about individuals' origin based on their DNA sparked a media debate that complicated the issue<sup id=\"rdp-ebb-cite_ref-VaillyThePolitics17_31-0\" class=\"reference\"><a href=\"#cite_note-VaillyThePolitics17-31\" rel=\"external_link\">[21]<\/a><\/sup>; however, the controversy soon died down. 
A few years later, Ministry of Justice instructions to judges and prosecutors discouraged the use of this technique, with no further debate. Since then, although the Court of Cassation's 2014 decision opened up the possibility of using an unprecedented practice, this has not generated any public debate or controversy. \n<\/p><p>\u201cDNA Photofits\u201d have received some media coverage<sup id=\"rdp-ebb-cite_ref-32\" class=\"reference\"><a href=\"#cite_note-32\" rel=\"external_link\">[k]<\/a><\/sup>, but this has mainly been to underscore the technical process involved, echoing the fiction conveyed by television series that have made the use of genetic techniques in criminal investigations seem commonplace and particularly efficient. Our sociological fieldwork has revealed, however, that there was organized debate among judges and prosecutors between 2013 and 2014. At the time, the investigating judge who had for the first time ordered the analysis of the suspect's visible morphological characteristics referred the case to the examining chamber himself, to obtain a verdict on whether the expert report he had requested was legal. Although the examining chamber approved the report, the public prosecutor brought the issue before the Court of Cassation\u2014the highest legal authority in France\u2014in order to ensure the final nature of the decision. The Court of Cassation ruled that a judge could have recourse to such analyses. Following this verdict, several bodies consulted by the Ministry of Justice<sup id=\"rdp-ebb-cite_ref-33\" class=\"reference\"><a href=\"#cite_note-33\" rel=\"external_link\">[l]<\/a><\/sup> provided opinions underscoring the need for this technique to be written into and regulated by the law. This has not been implemented to date. 
After being authorized for several years under a temporary protocol, familial searches allowing \u201cgenetic proximity testing\u201d<sup id=\"rdp-ebb-cite_ref-PrainsackGenetic10_34-0\" class=\"reference\"><a href=\"#cite_note-PrainsackGenetic10-34\" rel=\"external_link\">[22]<\/a><\/sup> were written into law in 2016. However, the Court of Cassation's judgment on DNA analysis to provide \u201call useful elements relating to a suspect's visible morphological characteristics\u201d has not been brought up for parliamentary debate to be included in the law. There has been no political management of the question at the state level, nor has the issue been included in the general debate organized by the National Consultative Council of Ethics (Comit\u00e9 Consultatif National d'Ethique) in 2018 regarding the revision of laws on bio-ethics.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusion\">Conclusion<\/span><\/h2>\n<p>The use of these new technological and scientific techniques plays a significant role in guiding how we engage with the world<sup id=\"rdp-ebb-cite_ref-WilliamsSocial17_35-0\" class=\"reference\"><a href=\"#cite_note-WilliamsSocial17-35\" rel=\"external_link\">[23]<\/a><\/sup>, just as it redefines the production of identity translated into information<sup id=\"rdp-ebb-cite_ref-AasTheBody06_36-0\" class=\"reference\"><a href=\"#cite_note-AasTheBody06-36\" rel=\"external_link\">[24]<\/a><\/sup> and structures the way sensitive information about individuals is used and circulated. Despite these stakes, and the initial caution that surrounded the creation of the national automated DNA database, it has not gone hand-in-hand with collective political and ethical debate. 
This raises questions about the conditions for the existence or absence of political controversies, questions that call for further sociological investigation into the framing of the issue and the social and political logic at play.\n<\/p><p>As the use of these techniques develops in police practice, this absence of collective management of the issue leaves professionals to rely on forms of local arbitration. Our fieldwork has shown that they are aware that these practices raise issues and therefore devise ethical frameworks for their own use of DNA. As a consequence, in this field, as is the case in others, ethical issues are addressed in a fragmented manner as endogenous ethical frameworks are \u201ccobbled together\u201d by professionals as a function of their practices and needs. Each institution, each laboratory, and in some cases each individual is crafting a frame and a perimeter of limits to what can be done according to their understanding and appreciation of the legal setting, the practical utility of actions, and the ethical constraints perceived.\n<\/p><p>The ECHR's recent ruling against France regarding the FNAEG may force lawmakers to reach a verdict on this issue, thereby triggering what seems like necessary public debate on forensic use of DNA. 
The new possibilities provided by genetic technologies point to the need for promoting dialogue among the various professionals using this technology in police work (forensic teams and geneticists working with them, police investigators, private laboratories, prosecutors, judges, etc.), but also with healthcare professionals, who already have experience of the institutionalized management of ethical considerations relating to their practices in genetics, and, more broadly, with society as a whole.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<p>The authors are grateful to Lucy Garnier for translating this article from French.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Author_contributions\">Author contributions<\/span><\/h3>\n<p>GK is the main contributor. JV is the head of the research programme and collaborated on the writing of the article.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Funding\">Funding<\/span><\/h3>\n<p>This research was financed by the National Research Agency (ANR) in France (Project FITEGE, contract: ANR-14-CE29-0014).\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Conflict_of_interest_statement\">Conflict of interest statement<\/span><\/h3>\n<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Footnotes\">Footnotes<\/span><\/h2>\n<div class=\"reflist\" style=\"list-style-type: lower-alpha;\">\n<ol class=\"references\">\n<li id=\"cite_note-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-9\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">The Order of 10 August 2015 increased the number of markers analyzed to 21; police officers and analysis laboratories had three years to comply with this new requirement.<\/span>\n<\/li>\n<li id=\"cite_note-11\"><span class=\"mw-cite-backlink\"><a 
href=\"#cite_ref-11\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">Act n\u00b098-468 of 17 June 1998 relative to the punishment of sexual crimes and the protection of minors introduced article 706-54 into the Code of Criminal Procedure making provision for the creation of an automated national database to centralize the DNA profiles of persons convicted of offences of a sexual nature. The remit of the database was then extended on several occasions. In 2001, it included serious crimes against persons. In 2003, the law on internal security extended it to persons convicted of or implicated in crimes and offences against persons or property.<\/span>\n<\/li>\n<li id=\"cite_note-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-12\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">Collecting DNA samples in investigations is now the rule. An <i>ad hoc<\/i> body of staff has been trained over the past 15 years that almost systematically processes crime scenes.<\/span>\n<\/li>\n<li id=\"cite_note-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-13\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">This figure was provided to the French Parliament by the Ministry of the Interior following a question by parliamentarian Sergio Coronado (member of the \u201cEcologist\u201d parliamentary group) (<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/questions.assemblee-nationale.fr\/q14\/14-79728QE.htm\" target=\"_blank\">http:\/\/questions.assemblee-nationale.fr\/q14\/14-79728QE.htm<\/a>).<\/span>\n<\/li>\n<li id=\"cite_note-18\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-18\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">Art. 16.11 of the Civil Code<\/span>\n<\/li>\n<li id=\"cite_note-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-19\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">Art. 
26, Domestic Security Guidance and Planning Act n\u00b0 95-73 of 21 January 1995<\/span>\n<\/li>\n<li id=\"cite_note-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-20\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">This possibility was written into law in 2016 in article 796-56-1-1 of Act n\u00b0 2016-731 of 3 June 2016 strengthening provisions for the fight against organized crime, terrorism, and their financing, and improving the efficiency and guarantees of the criminal procedure.<\/span>\n<\/li>\n<li id=\"cite_note-24\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-24\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">For example, according to a study by the Telethon Institute of Genetics and Medicine, D2S1388, one of the markers used by the FNAEG, plays a determining role in the transmission of pseudohyperkalaemia, a rare genetic disease.<sup id=\"rdp-ebb-cite_ref-CarellaASecond04_21-0\" class=\"reference\"><a href=\"#cite_note-CarellaASecond04-21\" rel=\"external_link\">[14]<\/a><\/sup> In 2011, a publication by Chinese researchers highlighted the association between marker D21S11-28.2 and coronary heart disease.<sup id=\"rdp-ebb-cite_ref-HuiNovel11_22-0\" class=\"reference\"><a href=\"#cite_note-HuiNovel11-22\" rel=\"external_link\">[15]<\/a><\/sup> A team of Portuguese researchers<sup id=\"rdp-ebb-cite_ref-PereiraPop11_23-0\" class=\"reference\"><a href=\"#cite_note-PereiraPop11-23\" rel=\"external_link\">[16]<\/a><\/sup> has developed an online calculator capable of correlating certain markers used in the FNAEG's DNA samples with individual affiliation to population groups (Sub-Saharan Africa, Eurasia, East Asia, North Africa, Near East, North America, South America, and Central America).<\/span>\n<\/li>\n<li id=\"cite_note-28\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-28\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">Case of Aycaguer V. 
France, 22 June 2017, 8806\/12, ECHR, Court (Fifth Section)<\/span>\n<\/li>\n<li id=\"cite_note-29\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-29\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">See legal summary, available at <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/goo.gl\/FcyuUM\" target=\"_blank\">https:\/\/hudoc.echr.coe.int\/eng#{%22itemid%22:[%22002-11703%22]}<\/a><\/span>\n<\/li>\n<li id=\"cite_note-32\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-32\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">A search of the press database Europresse for the period 2010 to 2018 returned around 70 published pieces mentioning the terms \u201cDNA Photofits\u201d or \u201cGenetic photofits\u201d.<\/span>\n<\/li>\n<li id=\"cite_note-33\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-33\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">These bodies were the Commission nationale consultative des droits de l'homme (CNCDH \u2013 National consultative committee on human rights) and the approval committee for people authorized to conduct identification procedures using DNA profiles in the context of legal proceedings or the extrajudicial procedure for identifying deceased persons.<\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-PotterBio70-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PotterBio70_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Potter, V.R.&#32;(1970).&#32;\"Bioethics, the science of survival\".&#32;<i>Perspectives in Biology and Medicine<\/i>&#32;<b>14<\/b>&#32;(1): 127\u201353.&#32;<a
rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1353%2Fpbm.1970.0015\" target=\"_blank\">10.1353\/pbm.1970.0015<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Bioethics%2C+the+science+of+survival&amp;rft.jtitle=Perspectives+in+Biology+and+Medicine&amp;rft.aulast=Potter%2C+V.R.&amp;rft.au=Potter%2C+V.R.&amp;rft.date=1970&amp;rft.volume=14&amp;rft.issue=1&amp;rft.pages=127%E2%80%9353&amp;rft_id=info:doi\/10.1353%2Fpbm.1970.0015&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-2\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Vailly, J.&#32;(2013).&#32;<i>The Birth of a Genetics Policy: Social Issues of Newborn Screening<\/i>.&#32;Routledge.&#32;pp.&#160;240.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9781472422729.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=The+Birth+of+a+Genetics+Policy%3A+Social+Issues+of+Newborn+Screening&amp;rft.aulast=Vailly%2C+J.&amp;rft.au=Vailly%2C+J.&amp;rft.date=2013&amp;rft.pages=pp.%26nbsp%3B240&amp;rft.pub=Routledge&amp;rft.isbn=9781472422729&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: 
none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Isambert.C3.89thique80-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Isambert.C3.89thique80_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Isambert, F.A.&#32;(1980).&#32;\"\u00c9thique et g\u00e9n\u00e9tique: De l'utopie eug\u00e9nique au contr\u00f4le des malformations cong\u00e9nitales\".&#32;<i>Revue fran\u00e7aise de sociologie<\/i>&#32;<b>21<\/b>&#32;(3): 331\u201354.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.2307%2F3320930\" target=\"_blank\">10.2307\/3320930<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=%C3%89thique+et+g%C3%A9n%C3%A9tique%3A+De+l%27utopie+eug%C3%A9nique+au+contr%C3%B4le+des+malformations+cong%C3%A9nitales&amp;rft.jtitle=Revue+fran%C3%A7aise+de+sociologie&amp;rft.aulast=Isambert%2C+F.A.&amp;rft.au=Isambert%2C+F.A.&amp;rft.date=1980&amp;rft.volume=21&amp;rft.issue=3&amp;rft.pages=331%E2%80%9354&amp;rft_id=info:doi\/10.2307%2F3320930&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PulmanLesEnjeux05-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PulmanLesEnjeux05_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Pulman, B.&#32;(2005).&#32;\"Les enjeux du clonage\".&#32;<i>Revue fran\u00e7aise de sociologie<\/i>&#32;<b>46<\/b>&#32;(3): 413\u201342.&#32;<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3917%2Frfs.463.0413\" target=\"_blank\">10.3917\/rfs.463.0413<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Les+enjeux+du+clonage&amp;rft.jtitle=Revue+fran%C3%A7aise+de+sociologie&amp;rft.aulast=Pulman%2C+B.&amp;rft.au=Pulman%2C+B.&amp;rft.date=2005&amp;rft.volume=46&amp;rft.issue=3&amp;rft.pages=413%E2%80%9342&amp;rft_id=info:doi\/10.3917%2Frfs.463.0413&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BourretD.C3.A9cision08-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BourretD.C3.A9cision08_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bourret, P.; Rabeharisoa, V.&#32;(2008).&#32;\"D\u00e9cision et jugement m\u00e9dicaux en situation de forte incertitude&#160;: l\u2019exemple de deux pratiques cliniques \u00e0 l\u2019\u00e9preuve de la g\u00e9n\u00e9tique\".&#32;<i>Sciences sociales et sant\u00e9<\/i>&#32;<b>26<\/b>&#32;(1): 128.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3917%2Fsss.261.0033\" target=\"_blank\">10.3917\/sss.261.0033<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=D%C3%A9cision+et+jugement+m%C3%A9dicaux+en+situation+de+forte+incertitude+%3A+l%E2%80%99exemple+de+deux+pratiques+cliniques+%C3%A0+l%E2%80%99%C3%A9preuve+de+la+g%C3%A9n%C3%A9tique&amp;rft.jtitle=Sciences+sociales+et+sant%C3%A9&amp;rft.aulast=Bourret%2C+P.%3B+Rabeharisoa%2C+V.&amp;rft.au=Bourret%2C+P.%3B+Rabeharisoa%2C+V.&amp;rft.date=2008&amp;rft.volume=26&amp;rft.issue=1&amp;rft.pages=128&amp;rft_id=info:doi\/10.3917%2Fsss.261.0033&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BrewerMedia09-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BrewerMedia09_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Brewer, P.R.; Ley, B.L.&#32;(2009).&#32;\"Media Use and Public Perceptions of DNA Evidence\".&#32;<i>Science Communication<\/i>&#32;<b>32<\/b>&#32;(1): 93\u2013117.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1177%2F1075547009340343\" target=\"_blank\">10.1177\/1075547009340343<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Media+Use+and+Public+Perceptions+of+DNA+Evidence&amp;rft.jtitle=Science+Communication&amp;rft.aulast=Brewer%2C+P.R.%3B+Ley%2C+B.L.&amp;rft.au=Brewer%2C+P.R.%3B+Ley%2C+B.L.&amp;rft.date=2009&amp;rft.volume=32&amp;rft.issue=1&amp;rft.pages=93%E2%80%93117&amp;rft_id=info:doi\/10.1177%2F1075547009340343&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WilliamsGenetic08-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WilliamsGenetic08_7-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Williams, R.; Johnson, P.&#32;(2008).&#32;<i>Genetic Policing: The Uses of DNA in Police Investigations<\/i>.&#32;Willan.&#32;pp.&#160;208.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9781843922049.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Genetic+Policing%3A+The+Uses+of+DNA+in+Police+Investigations&amp;rft.aulast=Williams%2C+R.%3B+Johnson%2C+P.&amp;rft.au=Williams%2C+R.%3B+Johnson%2C+P.&amp;rft.date=2008&amp;rft.pages=pp.%26nbsp%3B208&amp;rft.pub=Willan&amp;rft.isbn=9781843922049&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CabalRapport01-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CabalRapport01_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation 
book\">Cabal, C.; Le D\u00e9aut, J.-Y.; Revol, H.&#32;(2001).&#32;<i>Rapport sur la valeur scientifique de l'utilisation des empreintes g\u00e9n\u00e9tiques dans le domaine judiciaire<\/i>.&#32;Assembl\u00e9e nationale.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;2111150177.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Rapport+sur+la+valeur+scientifique+de+l%27utilisation+des+empreintes+g%C3%A9n%C3%A9tiques+dans+le+domaine+judiciaire&amp;rft.aulast=Cabal%2C+C.%3B+Le+D%C3%A9aut%2C+J.-Y.%3B+Revol%2C+H.&amp;rft.au=Cabal%2C+C.%3B+Le+D%C3%A9aut%2C+J.-Y.%3B+Revol%2C+H.&amp;rft.date=2001&amp;rft.pub=Assembl%C3%A9e+nationale&amp;rft.isbn=2111150177&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BonniolL.27ADN14-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BonniolL.27ADN14_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bonniol, J.-L.; Darlu, P.&#32;(2014).&#32;\"L\u2019ADN au service d\u2019une nouvelle qu\u00eate des anc\u00eatres?\".&#32;<i>Civilisations<\/i>&#32;<b>63<\/b>: 201\u201319.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4000%2Fcivilisations.3747\" target=\"_blank\">10.4000\/civilisations.3747<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=L%E2%80%99ADN+au+service+d%E2%80%99une+nouvelle+qu%C3%AAte+des+anc%C3%AAtres%3F&amp;rft.jtitle=Civilisations&amp;rft.aulast=Bonniol%2C+J.-L.%3B+Darlu%2C+P.&amp;rft.au=Bonniol%2C+J.-L.%3B+Darlu%2C+P.&amp;rft.date=2014&amp;rft.volume=63&amp;rft.pages=201%E2%80%9319&amp;rft_id=info:doi\/10.4000%2Fcivilisations.3747&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KayserImproving11-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KayserImproving11_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kayser, M.; de Knijff, P.&#32;(2011).&#32;\"Improving human forensics through advances in genetics, genomics and molecular biology\".&#32;<i>Nature Reviews Genetics<\/i>&#32;<b>12<\/b>&#32;(3): 179\u201392.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnrg2952\" target=\"_blank\">10.1038\/nrg2952<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/21331090\" target=\"_blank\">21331090<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Improving+human+forensics+through+advances+in+genetics%2C+genomics+and+molecular+biology&amp;rft.jtitle=Nature+Reviews+Genetics&amp;rft.aulast=Kayser%2C+M.%3B+de+Knijff%2C+P.&amp;rft.au=Kayser%2C+M.%3B+de+Knijff%2C+P.&amp;rft.date=2011&amp;rft.volume=12&amp;rft.issue=3&amp;rft.pages=179%E2%80%9392&amp;rft_id=info:doi\/10.1038%2Fnrg2952&amp;rft_id=info:pmid\/21331090&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KayserForensic15-15\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-KayserForensic15_15-0\" rel=\"external_link\">11.0<\/a><\/sup> <sup><a href=\"#cite_ref-KayserForensic15_15-1\" rel=\"external_link\">11.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kayser, M.&#32;(2015).&#32;\"Forensic DNA Phenotyping: Predicting human appearance from crime scene material for investigative purposes\".&#32;<i>Forensic Science International Genetics<\/i>&#32;<b>18<\/b>: 33\u201348.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.fsigen.2015.02.003\" target=\"_blank\">10.1016\/j.fsigen.2015.02.003<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25716572\" target=\"_blank\">25716572<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Forensic+DNA+Phenotyping%3A+Predicting+human+appearance+from+crime+scene+material+for+investigative+purposes&amp;rft.jtitle=Forensic+Science+International+Genetics&amp;rft.aulast=Kayser%2C+M.&amp;rft.au=Kayser%2C+M.&amp;rft.date=2015&amp;rft.volume=18&amp;rft.pages=33%E2%80%9348&amp;rft_id=info:doi\/10.1016%2Fj.fsigen.2015.02.003&amp;rft_id=info:pmid\/25716572&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-M.27charekBeyond13-16\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-M.27charekBeyond13_16-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">M'charek, A.&#32;(2013).&#32;\"Beyond Fact or Fiction: On the Materiality of Race in Practice\".&#32;<i>Cultural Anthropology<\/i>&#32;<b>28<\/b>&#32;(3): 420\u201342.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1111%2Fcuan.12012\" target=\"_blank\">10.1111\/cuan.12012<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Beyond+Fact+or+Fiction%3A+On+the+Materiality+of+Race+in+Practice&amp;rft.jtitle=Cultural+Anthropology&amp;rft.aulast=M%27charek%2C+A.&amp;rft.au=M%27charek%2C+A.&amp;rft.date=2013&amp;rft.volume=28&amp;rft.issue=3&amp;rft.pages=420%E2%80%9342&amp;rft_id=info:doi\/10.1111%2Fcuan.12012&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: 
none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CaliebePredictive18-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CaliebePredictive18_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Caliebe, A.; Krawczak, M.; Kayser, M.&#32;(2018).&#32;\"Predictive values in Forensic DNA Phenotyping are not necessarily prevalence-dependent\".&#32;<i>FSI Genetics<\/i>&#32;<b>33<\/b>: e7\u2013e8.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.fsigen.2017.11.006\" target=\"_blank\">10.1016\/j.fsigen.2017.11.006<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Predictive+values+in+Forensic+DNA+Phenotyping+are+not+necessarily+prevalence-dependent&amp;rft.jtitle=FSI+Genetics&amp;rft.aulast=Caliebe%2C+A.%3B+Krawczak%2C+M.%3B+Kayser%2C+M.&amp;rft.au=Caliebe%2C+A.%3B+Krawczak%2C+M.%3B+Kayser%2C+M.&amp;rft.date=2018&amp;rft.volume=33&amp;rft.pages=e7%E2%80%93e8&amp;rft_id=info:doi\/10.1016%2Fj.fsigen.2017.11.006&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CarellaASecond04-21\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CarellaASecond04_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Carella, M.; d'Adamo, A.P.; Grootenboer-Mignot, S. 
et al.&#32;(2004).&#32;\"A second locus mapping to 2q35-36 for familial pseudohyperkalaemia\".&#32;<i>European Journal of Human Genetics<\/i>&#32;<b>12<\/b>&#32;(12): 1073\u20136.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fsj.ejhg.5201280\" target=\"_blank\">10.1038\/sj.ejhg.5201280<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+second+locus+mapping+to+2q35-36+for+familial+pseudohyperkalaemia&amp;rft.jtitle=European+Journal+of+Human+Genetics&amp;rft.aulast=Carella%2C+M.%3B+d%27Adamo%2C+A.P.%3B+Grootenboer-Mignot%2C+S.+et+al.&amp;rft.au=Carella%2C+M.%3B+d%27Adamo%2C+A.P.%3B+Grootenboer-Mignot%2C+S.+et+al.&amp;rft.date=2004&amp;rft.volume=12&amp;rft.issue=12&amp;rft.pages=1073%E2%80%936&amp;rft_id=info:doi\/10.1038%2Fsj.ejhg.5201280&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HuiNovel11-22\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HuiNovel11_22-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Hui, L.; Jing, Y.; Rui, M.; Weijian, Y.&#32;(2011).&#32;\"Novel association analysis between 9 short tandem repeat loci polymorphisms and coronary heart disease based on a cross-validation design\".&#32;<i>Atherosclerosis<\/i>&#32;<b>218<\/b>&#32;(1): 151\u20135.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/dx.doi.org\/10.1016%2Fj.atherosclerosis.2011.05.024\" target=\"_blank\">10.1016\/j.atherosclerosis.2011.05.024<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/21703622\" target=\"_blank\">21703622<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Novel+association+analysis+between+9+short+tandem+repeat+loci+polymorphisms+and+coronary+heart+disease+based+on+a+cross-validation+design&amp;rft.jtitle=Atherosclerosis&amp;rft.aulast=Hui%2C+L.%3B+Jing%2C+Y.%3B+Rui%2C+M.%3B+Weijian%2C+Y.&amp;rft.au=Hui%2C+L.%3B+Jing%2C+Y.%3B+Rui%2C+M.%3B+Weijian%2C+Y.&amp;rft.date=2011&amp;rft.volume=218&amp;rft.issue=1&amp;rft.pages=151%E2%80%935&amp;rft_id=info:doi\/10.1016%2Fj.atherosclerosis.2011.05.024&amp;rft_id=info:pmid\/21703622&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PereiraPop11-23\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PereiraPop11_23-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Pereira, L.; Alshamali, F.; Andreassen, R. 
et al.&#32;(2011).&#32;\"PopAffiliator: online calculator for individual affiliation to a major population group based on 17 autosomal short tandem repeat genotype profile\".&#32;<i>International Journal of Legal Medicine<\/i>&#32;<b>125<\/b>&#32;(5): 629\u201336.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs00414-010-0472-2\" target=\"_blank\">10.1007\/s00414-010-0472-2<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20552217\" target=\"_blank\">20552217<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=PopAffiliator%3A+online+calculator+for+individual+affiliation+to+a+major+population+group+based+on+17+autosomal+short+tandem+repeat+genotype+profile&amp;rft.jtitle=International+Journal+of+Legal+Medicine&amp;rft.aulast=Pereira%2C+L.%3B+Alshamali%2C+F.%3B+Andreassen%2C+R.+et+al.&amp;rft.au=Pereira%2C+L.%3B+Alshamali%2C+F.%3B+Andreassen%2C+R.+et+al.&amp;rft.date=2011&amp;rft.volume=125&amp;rft.issue=5&amp;rft.pages=629%E2%80%9336&amp;rft_id=info:doi\/10.1007%2Fs00414-010-0472-2&amp;rft_id=info:pmid\/20552217&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-M.27charekSilent08-25\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-M.27charekSilent08_25-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">M'charek, A.&#32;(2008).&#32;\"Silent 
witness, articulate collective: DNA evidence and the inference of visible traits\".&#32;<i>Bioethics<\/i>&#32;<b>22<\/b>&#32;(9): 519\u201328.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1111%2Fj.1467-8519.2008.00699.x\" target=\"_blank\">10.1111\/j.1467-8519.2008.00699.x<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/18959734\" target=\"_blank\">18959734<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Silent+witness%2C+articulate+collective%3A+DNA+evidence+and+the+inference+of+visible+traits&amp;rft.jtitle=Bioethics&amp;rft.aulast=M%27charek%2C+A.&amp;rft.au=M%27charek%2C+A.&amp;rft.date=2008&amp;rft.volume=22&amp;rft.issue=9&amp;rft.pages=519-28&amp;rft_id=info:doi\/10.1111%2Fj.1467-8519.2008.00699.x&amp;rft_id=info:pmid\/18959734&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MacLeanForensic14-26\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MacLeanForensic14_26-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">MacLean, C.E.; Lamparello, A.&#32;(2014).&#32;\"Forensic DNA phenotyping in criminal investigations and criminal courts: Assessing and mitigating the dilemmas inherent in the science\".&#32;<i>Recent Advances in DNA and Gene Sequences<\/i>&#32;<b>8<\/b>&#32;(2): 104\u201312.&#32;<a rel=\"external_link\" class=\"external text\"
href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25687339\" target=\"_blank\">25687339<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Forensic+DNA+phenotyping+in+criminal+investigations+and+criminal+courts%3A+Assessing+and+mitigating+the+dilemmas+inherent+in+the+science&amp;rft.jtitle=Recent+Advances+in+DNA+and+Gene+Sequences&amp;rft.aulast=MacLean%2C+C.E.%3B+Lamparello%2C+A.&amp;rft.au=MacLean%2C+C.E.%3B+Lamparello%2C+A.&amp;rft.date=2014&amp;rft.volume=8&amp;rft.issue=2&amp;rft.pages=104-12&amp;rft_id=info:pmid\/25687339&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ToomApproaching16-27\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ToomApproaching16_27-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Toom, V.; Wienroth, M.; M'charek, A. 
et al.&#32;(2016).&#32;\"Approaching ethical, legal and social issues of emerging forensic DNA phenotyping (FDP) technologies comprehensively: Reply to 'Forensic DNA phenotyping: Predicting human appearance from crime scene material for investigative purposes' by Manfred Kayser\".&#32;<i>Forensic Science International Genetics<\/i>&#32;<b>22<\/b>: e1\u2013e4.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.fsigen.2016.01.010\" target=\"_blank\">10.1016\/j.fsigen.2016.01.010<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26832996\" target=\"_blank\">26832996<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Approaching+ethical%2C+legal+and+social+issues+of+emerging+forensic+DNA+phenotyping+%28FDP%29+technologies+comprehensively%3A+Reply+to+%27Forensic+DNA+phenotyping%3A+Predicting+human+appearance+from+crime+scene+material+for+investigative+purposes%27+by+Manfred+Kayser&amp;rft.jtitle=Forensic+Science+International+Genetics&amp;rft.aulast=Toom%2C+V.%3B+Wienroth%2C+M.%3B+M%27charek%2C+A.+et+al.&amp;rft.au=Toom%2C+V.%3B+Wienroth%2C+M.%3B+M%27charek%2C+A.+et+al.&amp;rft.date=2016&amp;rft.volume=22&amp;rft.pages=e1%E2%80%93e4&amp;rft_id=info:doi\/10.1016%2Fj.fsigen.2016.01.010&amp;rft_id=info:pmid\/26832996&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BuchananForensic18-30\"><span 
class=\"mw-cite-backlink\"><a href=\"#cite_ref-BuchananForensic18_30-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Buchanan, N.; Staubach, F.; Wienroth, M. et al.&#32;(2018).&#32;\"Forensic DNA phenotyping legislation cannot be based on \u201cIdeal FDP\u201d\u2014A response to Caliebe, Krawczak and Kayser (2017)\".&#32;<i>FSI Genetics<\/i>&#32;<b>34<\/b>: e13\u2013e14.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.fsigen.2018.01.009\" target=\"_blank\">10.1016\/j.fsigen.2018.01.009<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Forensic+DNA+phenotyping+legislation+cannot+be+based+on+%E2%80%9CIdeal+FDP%E2%80%9D%E2%80%94A+response+to+Caliebe%2C+Krawczak+and+Kayser+%282017%29&amp;rft.jtitle=FSI+Genetics&amp;rft.aulast=Buchanan%2C+N.%3B+Staubach%2C+F.%3B+Wienroth%2C+M.+et+al.&amp;rft.au=Buchanan%2C+N.%3B+Staubach%2C+F.%3B+Wienroth%2C+M.+et+al.&amp;rft.date=2018&amp;rft.volume=34&amp;rft.pages=e13%E2%80%93e14&amp;rft_id=info:doi\/10.1016%2Fj.fsigen.2018.01.009&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-VaillyThePolitics17-31\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-VaillyThePolitics17_31-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Vailly, J.&#32;(2017).&#32;\"The politics of suspects\u2019 geo-genetic origin in France: The conditions, expression, and effects of problematisation\".&#32;<i>BioSocieties<\/i>&#32;<b>12<\/b>&#32;(1): 66\u201388.&#32;<a 
rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1057%2Fs41292-016-0028-x\" target=\"_blank\">10.1057\/s41292-016-0028-x<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+politics+of+suspects%E2%80%99+geo-genetic+origin+in+France%3A+The+conditions%2C+expression%2C+and+effects+of+problematisation&amp;rft.jtitle=BioSocieties&amp;rft.aulast=Vailly%2C+J.&amp;rft.au=Vailly%2C+J.&amp;rft.date=2017&amp;rft.volume=12&amp;rft.issue=1&amp;rft.pages=66%E2%80%9388&amp;rft_id=info:doi\/10.1057%2Fs41292-016-0028-x&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PrainsackGenetic10-34\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PrainsackGenetic10_34-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Prainsack, B.&#32;(2010).&#32;\"Chapter 2: Key issues in DNA profiling and databasing: Implications for governance\".&#32;In&#32;Hindmarsh, R.; Prainsack, B..&#32;<i>Genetic Suspects: Global Governance of Forensic DNA Profiling and Databasing<\/i>.&#32;Cambridge University Press.&#32;pp.&#160;15\u201339.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9780521519434.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Chapter+2%3A+Key+issues+in+DNA+profiling+and+databasing%3A+Implications+for+governance&amp;rft.atitle=Genetic+Suspects%3A+Global+Governance+of+Forensic+DNA+Profiling+and+Databasing&amp;rft.aulast=Prainsack%2C+B.&amp;rft.au=Prainsack%2C+B.&amp;rft.date=2010&amp;rft.pages=pp.%26nbsp%3B15%E2%80%9339&amp;rft.pub=Cambridge+University+Press&amp;rft.isbn=9780521519434&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WilliamsSocial17-35\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WilliamsSocial17_35-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Williams, R.; Wienroth, M.&#32;(2017).&#32;\"Social and ethical aspects of forensic genetics: A critical review\".&#32;<i>Forensic Science Review<\/i>&#32;<b>29<\/b>&#32;(2): 145\u201369.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/28691916\" target=\"_blank\">28691916<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Social+and+ethical+aspects+of+forensic+genetics%3A+A+critical+review&amp;rft.jtitle=Forensic+Science+Review&amp;rft.aulast=Williams%2C+R.%3B+Wienroth%2C+M.&amp;rft.au=Williams%2C+R.%3B+Wienroth%2C+M.&amp;rft.date=2017&amp;rft.volume=29&amp;rft.issue=2&amp;rft.pages=145%E2%80%9369&amp;rft_id=info:pmid\/28691916&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AasTheBody06-36\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AasTheBody06_36-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Aas, K.F.&#32;(2006).&#32;\"\u2018The body does not lie\u2019: Identity, risk and trust in technoculture\".&#32;<i>Crime, Media, Culture: An International Journal<\/i>&#32;<b>2<\/b>&#32;(2): 143-158.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1177%2F1741659006065401\" target=\"_blank\">10.1177\/1741659006065401<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=%E2%80%98The+body+does+not+lie%E2%80%99%3A+Identity%2C+risk+and+trust+in+technoculture&amp;rft.jtitle=Crime%2C+Media%2C+Culture%3A+An+International+Journal&amp;rft.aulast=Aas%2C+K.F.&amp;rft.au=Aas%2C+K.F.&amp;rft.date=2006&amp;rft.volume=2&amp;rft.issue=2&amp;rft.pages=143-158&amp;rft_id=info:doi\/10.1177%2F1741659006065401&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\"><span 
style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. Footnotes were originally numbered but have been converted to lowercase alpha for this version. The link in footnote j had to be applied to Google Shortener because the HUDOC uses invalid characters in their URLs, and this wiki's footnote system breaks when the original URL is used.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" 
href=\"https:\/\/www.limswiki.org\/index.php\/Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F\">https:\/\/www.limswiki.org\/index.php\/Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","9872ac73fcb8d8cb5b8b6de9ced82c60_images":[],"9872ac73fcb8d8cb5b8b6de9ced82c60_timestamp":1544815914,"e29139b9d43cc4915ffca40cbc15f91c_type":"article","e29139b9d43cc4915ffca40cbc15f91c_title":"Big data in the era of health information exchanges: Challenges and opportunities for public health (Baseman et al. 2017)","e29139b9d43cc4915ffca40cbc15f91c_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health","e29139b9d43cc4915ffca40cbc15f91c_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:Big data in the era of health information exchanges: Challenges and opportunities for public health\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nBig data in the era of health information exchanges: Challenges and opportunities for public healthJournal\n \nInformaticsAuthor(s)\n \nBaseman, Janet G.; Revere, Debra; Painter, IanAuthor affiliation(s)\n \nUniversity of WashingtonPrimary contact\n \nEmail: jbaseman at uw dot eduEditors\n \nGe, Mouzhi; Dohnal, VlastislavYear published\n \n2017Volume and issue\n \n4(4)Page(s)\n \n39DOI\n \n10.3390\/informatics4040039ISSN\n \n2227-9709Distribution license\n \nCreative Commons Attribution 4.0 InternationalWebsite\n 
\nhttp:\/\/www.mdpi.com\/2227-9709\/4\/4\/39\/htmDownload\n \nhttp:\/\/www.mdpi.com\/2227-9709\/4\/4\/39\/pdf (PDF)\n\nContents\n\n1 Abstract \n2 Introduction \n3 Objective \n4 Methods \n\n4.1 Ethics \n4.2 Data source \n4.3 Analysis \n\n\n5 Results \n\n5.1 Challenge 1 \n5.2 Challenge 2 \n5.3 Challenge 3 \n5.4 Challenge 4 \n5.5 Challenge 5 \n\n\n6 Discussion \n7 Conclusions \n8 Acknowledgements \n\n8.1 Author contributions \n8.2 Conflicts of interest \n\n\n9 References \n10 Notes \n\n\n\nAbstract \nPublic health surveillance of communicable diseases depends on timely, complete, accurate, and useful data that are collected across a number of health care and public health systems. Health information exchanges (HIEs) which support electronic sharing of data and information between health care organizations are recognized as a source of \"big data\" in health care and have the potential to provide public health with a single stream of data collated across disparate systems and sources. However, given these data are not collected specifically to meet public health objectives, it is unknown whether a public health agency\u2019s (PHA\u2019s) secondary use of the data is supportive of or presents additional barriers to meeting disease reporting and surveillance needs. To explore this issue, we conducted an assessment of big data that is available to a PHA\u2014laboratory test results and clinician-generated notifiable condition report data\u2014through its participation in an HIE.\nKeywords: big data, communicable diseases, data mining, data quality, epidemiology, health information exchange, infectious diseases, population surveillance, public health\n\nIntroduction \nWe evaluated two datasets\u2014for sexually-transmitted infections (STIs) and non-STIs\u2014for the time period of January 1, 2012 to September 15, 2013 used by a PHA that is part of one of the largest and oldest HIE infrastructures in the U.S. 
The two datasets were independently analyzed for their data quality, utility, and appropriateness for meeting public health surveillance objectives: (1) timeliness, defined as the difference between earliest date of a disease report and date the report is received at the PHA; (2) volume, defined as the number of disease report cases received by the PHA; and (3) completion, defined as the number of days to close a disease case report.\nOur assessment uncovered the following challenges for effective utilization of big data by public health:\n\n While PHAs almost exclusively rely on secondary use data for surveillance, big data that has been collected for clinical purposes omits data fields of high value for public health.\n Big data is not always smart data, especially when the context within which the data is collected is absent.\n Data collected by disparate, varying systems and sources can introduce uncertainties and limit trustworthiness in the data, which may diminish its value for public health purposes.\n The process by which data is obtained needs to be evident in order for big data to be useful to public health.\n Big data for public health purposes needs to answer both \"what\" and \"why\" questions.\nDespite these and other issues\u2014such as measurement error and confounding, well-known challenges to both big and small data\u2014strategies traditionally employed by public health epidemiologists and other public health professionals can uncover limitations and contribute to the design of solutions in collection, integration, warehousing, and analysis of big data so its value and utility to public health can be optimized.\nIn recognition of the 10 year anniversary of the incorporation of the internet search firm Google, the journal Nature issued a special supplement on big data and what the availability of large datasets meant and will mean for scientists and researchers.[1] In particular, the supplement focused on the opportunities that will be possible 
when issues such as interoperable data infrastructures, security, data standardization, storage and transfer requirements, and data governance are resolved. Now, nearly 10 years later, users of big data\u2014characterized by the 5 Vs (huge volume, high velocity, high variety, low veracity, and high value)\u2014still encounter the issues presented in the Nature special supplement.[2] In particular, the primary challenges to utilizing big data center around the diversity of data types (variety), the resources required to handle data collection, storage and processing (velocity), and uncertainties inherent in mixing and cleaning data from varied data streams that generates unpredictability in the data (veracity).[3]\nNevertheless, within the health care sector, despite these challenges, big data also promises great opportunities to improve quality of health care delivery, population management, early detection of disease, decision-making, and cost reduction.[4] Major contributors to the explosion of big data are investments in information technology (IT), such as increased adoption of electronic medical record systems[5], and the creation of health information exchanges (HIEs)[6] which facilitate sharing of electronic data and information between health care organizations.[7] While the focus of HIEs has been on sharing patient information between clinics, hospitals, pharmacies, laboratories, and payers, public health agencies (PHAs) are increasingly included in HIEs.[8] PHA participation in a HIE provides a single stream of data collated across disparate systems and sources for public health.\nPublic health is a data-intensive and -driven field. 
Data is a highly valued currency for assessing the health of the community; providing guidance to stakeholders for handling a foodborne illness outbreak; forecasting the burden of seasonal influenza to allow sufficient time to vaccinate vulnerable populations; and innumerable other efforts that aim to prevent disease, prolong life, promote human health, and mitigate unnecessary suffering.[9] Within the context of big data, public health efforts include linking information technology systems to conduct population-based cancer research and surveillance[10], more effectively identify behaviors that can build healthier communities[11], and improve targeted and timely epidemiologic surveillance of communicable and infectious disease.[12]\nEffective public health surveillance of communicable diseases relies on time-sensitive, complete, accurate, and useful data that are collected across a number of healthcare and public health systems. It could be assumed that PHA participation in a HIE would support and potentially improve surveillance efforts, as data collected within the clinical encounter could be shared with public health more rapidly and be integrated into PHA decision support systems to meet public health practice needs. However, given that these data are not collected specifically to meet public health objectives, it is unknown whether a PHA\u2019s secondary use of the data is supportive of or presents additional barriers to meeting disease reporting and surveillance needs. 
To explore this issue, we conducted an assessment of big data that is available to a PHA\u2014laboratory test results and clinician-generated notifiable condition report data\u2014through its participation in a HIE, and we discuss the extent to which its value impacts the rationale for investing in the infrastructure, including workforce training, that is required to collect and interpret this data and ultimately inform measurable improvements in the health of public health community stakeholders.\n\nObjective \nTo explore challenges and opportunities for utilizing public health big data available through PHA participation in a HIE.\n\nMethods \nEthics \nThis study was approved by the Indiana University Institutional Review Board with cross-institutional and concurrent IRB deferral from the University of Washington.\n\nData source \nDatasets for the time period of January 1, 2012 through September 15, 2013 were pulled from two public health surveillance systems: (1) the Statewide Information Management Surveillance System (SWIMSS), which collects electronic lab reports (ELRs) and communicable disease reports (CDRs) for STIs; and (2) InSight, the county\u2019s core population health data system, which collects ELRs and CDRs of non-STI data for public health surveillance activities. The SWIMSS data pull was limited to the most prevalent and highly-reported conditions: chlamydia, gonorrhea, and syphilis. 
The InSight data pull was limited to acute hepatitis B, chronic hepatitis C, and salmonella.\n\nAnalysis \nThe two datasets were independently analyzed for their data quality, utility, and appropriateness for meeting public health surveillance objectives, including: (1) timeliness, defined as the difference between the earliest date of a disease report and the date the report is received at the PHA; (2) volume, defined as the number of disease report cases received by the PHA; and (3) completion, defined as the number of days until a case report is marked as closed by the investigator.\nEach dataset was separately reviewed for data quality issues. Duplicate records were removed and missing data rates tabulated. Patterns of missing data were visualized over time, and change point analysis[13] was used to estimate time points at which underlying process changes may have occurred. Processing times (time to receipt of test results and PHA time to process results) were calculated in calendar days. Metadata on which days the PHA conducted work was not available, so working days were estimated from the data as the days on which any cases were closed; this estimated metadata was used to calculate the number of work days required to close each case. Analyses of factors associated with time to receive and time to process cases were conducted after removal of atypical times. We aggregated case counts by disease and month to examine seasonal patterns of disease counts, and we aggregated case counts by disease and week to examine possible outbreaks and associations between outbreaks of different disease types. Occurrences of possible outbreaks were examined using a threshold of three standard deviations above a 31-day moving average.\n\nResults \nThe final SWIMSS dataset included chlamydia (n = 28018); gonorrhea (n = 7791); syphilis (n = 810); and syphilis, reactor (n = 3118). 
The final InSight dataset included acute hepatitis B (n = 563); chronic hepatitis C (n = 2160); histoplasmosis (n = 73); and salmonella (n = 210). Table 1 summarizes data exclusions resulting from the data quality analysis.\n\nTable 1. SWIMSS and InSight data quality summary (exclusions comprise missing data and date anomalies)\n\nDataset | Total Number of Records in Initial Data Pull | Date before 01\/01\/2012 or Could Not Calculate | No Diagnosis | No Lab Tests | \"Time to Receipt\" Anomalies | Lab Test Date Anomalies | Public Health Activity Date Anomalies | \"Time to Close\" Anomalies | Final Number of Records in Dataset\nSWIMSS | 48,250 | 0 | 0 | 5392 | 325 | 1178 | 909 | 709 | 39,737\nInSight | 3719 | 321 | 4 | 0 | 163 | 0 | 12 | 213 | 3006\n\nWe identified five specific challenges to secondary use of HIE data for meeting public health communicable disease surveillance needs. These challenges are illustrated by accompanying analyses.\n\nChallenge 1 \nWhile PHAs almost exclusively rely on secondary use data for surveillance, big data that has been collected for clinical purposes omits data fields of high value for public health.\nFor example, demographic characteristics such as race\/ethnicity are highly valued for understanding population level disparities in health and health care. Detailed spatial data (for example, zip code level or finer) are valuable for population-based forecasting and for targeted development of health promotion materials and resource allocation but are little used by clinicians; we observed lower data quality for these fields in our analysis. As seen in Figure 1, this information is not reliably collected, which can diminish the secondary use of this big data. This is observed in other population level databases; for example, ethnicity information in Medicare enrollment data has low sensitivity and specificity.[14]\n\nFigure 1. 
Missing value rates for ethnicity field, SWIMSS database\n\nChallenge 2 \nBig data is not always smart data, especially when the context within which the data is collected is absent. While big data is suitable for detecting an increase in volume of a particular variable of public health interest, it also presents classic, well-known outbreak detection problems such as unknown or fluctuating denominators (for example, where only positive test results are known and the underlying number of tests performed is unknown) and signal-noise problems (for example, where early detection of outbreaks requires detecting low numbers of cases with non-specific symptoms from much larger volumes of health care encounters).\nAn illustration of this challenge is our observation in the data of an increase in the volume of salmonella cases (Figure 2). An initial interpretation would be that there is a probable salmonella outbreak. However, we learned that during the volume upticks, there was a shigella outbreak in the community. The observed increase may then be attributed to heightened clinical awareness and testing for any gastrointestinal illness symptoms, rather than a true increase in salmonella cases. Alternatively, what appears to be an uptick may reflect the true prevalence of salmonella in the community and be interpreted as an indicator of low clinician reporting of a communicable disease.\n\nFigure 2. Salmonella counts by week with alert thresholds, InSight database.\n\nChallenge 3 \nData collected by disparate, varying systems and sources can introduce uncertainties and limit trustworthiness in the data, which may diminish its value for public health purposes.\nFor example, in the case of laboratory reports, a positive lab test result can be generated by numerous different types of lab tests. 
A positive case of acute hepatitis B can be reported under any one of 22 different lab test codes, representing multiple types of lab tests. Chronic hepatitis B has 31 different lab test codes, while chronic hepatitis C has 48 different lab codes. We identified considerable variation in use over time for some tests (tests 2, 3, 8, 10, and 11) as illustrated in Figure 3. Different lab tests may have different sensitivity and specificity characteristics, and so changes in lab test composition over time complicate interpretation of trends.\n\nFigure 3. Lab test code used for positive hepatitis C reports by time for lab test codes with more than 30 reports. Each row represents a different lab test code, with vertical bars representing when reported cases occurred.\n\nChallenge 4 \nThe process by which data is obtained needs to be evident in order for big data to be useful to public health. Changes in the data generation and collection processes that underlie testing for disease and collection of test data can have large impacts on the value of data for public health (examples could include changes in the type of test used at a facility or changes in personnel resulting in changing patterns of coding usage).\nFor example, Figure 4 shows a curious parallel double bump in counts for three diseases. The parallel increase suggests a change in the underlying process of testing or acquiring data rather than in the disease processes. The date range for the increase in disease counts suggests that a change in the processes of disease testing associated with December holidays may have contributed. However, the previous year saw no pattern of increases during the same time period.\n\nFigure 4. 
Counts of positive test results for chlamydia, syphilis reactor, and gonorrhea aggregated by week.\n\nChallenge 5 \nUnlike many other domains in which big data is used, big data for public health purposes needs to answer both \"what\" and \"why\" questions. Also, unlike some other health care fields, PHAs are not only responsible for the health of the communities they serve but also accountable to other government agencies and elected officials who must make decisions and enact policies based on public health surveillance observations. Incorporating metadata about a big data source can help guide answers to \"what\" and \"why\" questions that can arise when analyzing and interpreting findings.\nAn illustration of this challenge is presented in Figure 5, a timeliness analysis which identified substantial differences by day of the week for lab test ordering and processing. These differences by day of the week appear to impact delivery of lab results to the PHA. It is unknown whether this could be accounted for by differences among labs in processing protocols, how a lab combines different test codes to generate a final test report, or other factors that might elucidate why this difference occurred. In turn, this timeliness difference could impact the timing for issuing a public health advisory to the community or to health care providers regarding an increased volume of, for example, acute hepatitis B. Needed metadata about lab processing and reporting practices could make the difference in timing for an advisory and also help elected officials feel more confident about a finding that could require policy decisions to stop the spread of a communicable disease in the community.\n\nFigure 5. Time to receive case report by public health by disease and day of week, InSight database.\n\nTable 2 is another illustration of the need for metadata, this one focused on clinician reporting. 
We identified significant variation across the days of the week on which case reports are received at the PHA, as well as considerable variation in reporting by condition. However, in the absence of contextual factors that can influence reporting variation, such as seasonal fluctuations in illness (for example, higher prevalence of influenza during winter months), interpretation of this finding requires more information.\n\nTable 2. Variation in reporting by condition and day of week report received\n\nCondition | Monday N | Monday % | Tuesday N | Tuesday % | Wednesday N | Wednesday % | Thursday N | Thursday % | Friday N | Friday %\nHEPBA | 36 | 24.2 | 30 | 20.1 | 31 | 20.8 | 22 | 14.8 | 30 | 20.1\nHEPBC | 132 | 24.5 | 78 | 14.5 | 127 | 23.6 | 98 | 18.2 | 104 | 19.3\nHEPC | 699 | 28.7 | 457 | 18.8 | 406 | 16.7 | 460 | 18.9 | 414 | 17.0\nHISTO | 29 | 35.8 | 14 | 17.3 | 14 | 17.3 | 11 | 13.6 | 13 | 16.0\nSAL | 67 | 30.0 | 37 | 16.6 | 34 | 15.2 | 40 | 17.9 | 45 | 20.2\n\nDiscussion \nAccording to Khoury and Ioannidis, effective utilization of big data in public health centers on two challenges: addressing the trade-off between access and accuracy and the task of separating true signal from large and varied noise.[15] Our assessment of a large dataset available to public health not only provides examples of these challenges but also points to pathways for turning these challenges into opportunities.\nChallenge 1: While PHAs almost exclusively rely on secondary use data for surveillance, big data that has been collected for clinical purposes omits data fields of high value for public health.\nOpportunity: As important as secondary use data is for public health surveillance, public health lacks mechanisms to enforce completeness of fields or timely reporting. 
Our example of missing race\/ethnicity data is a compelling case, as without this information a PHA will not be able to target health promotion efforts to the most affected or vulnerable populations. Public health is recognized as chronically underfunded; PHAs are not only unlikely to offer incentives for data collection, they also need to use scarce resources wisely. Conducting an STI prevention program in a community that does not experience high levels of chlamydia, for example, would be wasteful and could cause friction in community relations. In recent years, some mechanisms, such as \"meaningful use,\"[16] have been enacted to expand current case reporting between hospitals\/providers and public health and increase capacity for data management and analysis. Figure 1 shows evidence of improvement in the completeness rates of the ethnicity field for one database, largely resulting from changes in the underlying process of collecting data for this field. However, enforcing compliance with complete and timely reporting may be outside the resources of public health.\nChallenge 2: Big data is not always smart data, especially when the context within which the data is collected is absent.\nOpportunity: A constant issue with notifiable condition reporting systems is the lack of a denominator for the number of positive test results, in part due to privacy reasons that are difficult to avoid. This lack of context limits the value of reportable systems for disease detection, mainly by increasing the rate of false positive alerts. Big data methods to determine context from other data sources would be of great value for public health. 
The opportunity here is to make use of the experience big data has with processing unstructured data and data from multiple sources to use big data methods to help understand the context of the clinical data.\nChallenge 3: Data collected by disparate, varying systems and sources can introduce uncertainties and limit trustworthiness in the data, which may diminish its value for public health purposes.\nOpportunity: The further away the use of the data gets from the original purpose for its collection, the higher the potential for data quality, integrity, and value problems. There is the opportunity for public health to play a role providing population health-level situational awareness information back to the data originators. This would show value to data originators of data fields that they collect but do not directly use. An example of population health-level situational awareness information would be obesity rates within populations that match characteristics of the provider\u2019s panel population.\nChallenge 4: The process by which data is obtained needs to be evident in order for big data to be useful to public health.\nOpportunity: Big data methods which can detect and adjust for underlying changes in the process that govern the collection of public health data would be beneficial. Three areas relating to metadata would be useful:\n\n Techniques for automatically identifying where metadata is needed would be useful (for example automatically identifying and flagging changes in data suggestive of underlying changes in the data generation process).\n Techniques for generating metadata from the data itself (for example, we used counts of cases processed on each day to generate metadata that indicates which days were days public health work was being performed).\n Techniques that adjust analyses based on metadata, especially with regard to data quality. 
In situations where public health (PH) entities have little recourse for improving data quality (DQ), methods that adjust for DQ need to be developed. For example, nowcasting methods (predicting the present state based on the incomplete data at hand) can account for data which accrues over time.[17][18][19]\nChallenge 5: Big data for public health purposes needs to answer both \"what\" and \"why\" questions.\nOpportunity: PH use of big data is unique in that it is constrained by risk of failure. If PH fails to stop an outbreak, preventable accidents and deaths can result (e.g., Ebola surveillance, detection, and prediction failure). If PH predicts an outbreak that does not materialize, the costs can include damaged relationships with stakeholders, media, and the public. In addition, PH has a responsibility to monitor any data sources that it does receive; thus, data of unclear value to public health consumes resources that may be better invested elsewhere.\n\nConclusions \nDespite these and other issues\u2014such as measurement error and confounding, well-known challenges to both big and small data\u2014strategies traditionally employed by public health epidemiologists and other public health professionals can uncover limitations and contribute to the design of solutions in collection, integration, warehousing, and analysis of big data so its value and utility to public health can be optimized.\n\nAcknowledgements \nThis study was conducted as part of the \u201cLeveraging a HIE to Improve Public Health Disease Investigation\u201d research project (RWJF Award #70338; PI: J Baseman, University of Washington, Seattle WA, USA). The content is solely the responsibility of the authors and does not necessarily represent the official views of the Robert Wood Johnson Foundation.\n\nAuthor contributions \nJ.G.B., D.R. and I.P. conceived this paper; I.P. analyzed the data; J.G.B., D.R. and I.P. wrote the paper. 
All authors reviewed and approved revisions.\n\nConflicts of interest \nThe authors declare no conflict of interest.\n\nReferences \n\n\n\u2191 Miller, E. (2008). \"Community cleverness required\". Nature 455 (1). doi:10.1038\/455001a.\n\n\u2191 Kruse, C.S.; Goswamy, R.; Raval, Y.; Marawi, S. (2016). \"Challenges and Opportunities of Big Data in Health Care: A Systematic Review\". JMIR Medical Informatics 4 (4): e38. doi:10.2196\/medinform.5359. PMC PMC5138448. PMID 27872036. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5138448\n\n\u2191 Jin, X.; Wah, B.W.; Cheng, X.; Wang, Y. (2015). \"Significance and Challenges of Big Data Research\". Big Data Research 2 (2). doi:10.1016\/j.bdr.2015.01.006.\n\n\u2191 Nambiar, R.; Bhardwaj, R.; Sethi, A.; Vargheese, R. (2013). \"A look at challenges and opportunities of big data analytics in healthcare\". Proceedings from the 2013 IEEE International Conference on Big Data: 17\u201322. doi:10.1109\/BigData.2013.6691753.\n\n\u2191 Joseph, S.; Sow, M.; Furukawa, M.F. et al. (2014). \"HITECH spurs EHR vendor competition and innovation, resulting in increased adoption\". American Journal of Managed Care 20 (9): 734-40. PMID 25365748.\n\n\u2191 Roski, J.; Bo-Linn, G.W.; Andrews, T.A. (2014). \"Creating value in health care through big data: Opportunities and policy implications\". Health Affairs 33 (7): 1115-22. doi:10.1377\/hlthaff.2014.0147. PMID 25006136.\n\n\u2191 Groves, P.; Kayyali, B.; Knott, D.; Van Kuiken, S. (January 2013). \"The 'big data' revolution in healthcare: Accelerating value and innovation\". McKinsey & Company. https:\/\/www.mckinsey.com\/~\/media\/mckinsey\/industries\/healthcare%20systems%20and%20services\/our%20insights\/the%20big%20data%20revolution%20in%20us%20health%20care\/the_big_data_revolution_in_healthcare.ashx\n\n\u2191 Shah, G.H.; Leider, J.P.; Luo, H.; Kaur, R. (2016). \"Interoperability of Information Systems Managed and Used by the Local Health Departments\". Journal of Public Health Management and Practice 22 (Suppl. 6): S34-S43. doi:10.1097\/PHH.0000000000000436. PMC PMC5049946. PMID 27684616. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5049946\n\n\u2191 \"What Is Public Health?\". CDC Foundation. https:\/\/www.cdcfoundation.org\/what-public-health. Retrieved 12 October 2017.\n\n\u2191 Barrett, M.A.; Humblet, O.; Hiatt, R.A.; Adler, N.E. (2013). \"Big Data and Disease Prevention: From Quantified Self to Quantified Communities\". Big Data 1 (3): 168-75. doi:10.1089\/big.2013.0027. PMID 27442198.\n\n\u2191 Meyer, A.M.; Olshan, A.F.; Green, L. et al. (2014). \"Big data for population-based cancer research: the integrated cancer information and surveillance system\". North Carolina Medical Journal 75 (4): 265\u20139. doi:10.1089\/big.2013.0027. PMC PMC4766858. PMID 25046092. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4766858\n\n\u2191 Salath\u00e9, M.; Bengtsson, L.; Bodnar, T.J. et al. (2012). \"Digital epidemiology\". PLoS Computational Biology 8 (7): e1002616. doi:10.1371\/journal.pcbi.1002616. PMC PMC3406005. PMID 22844241. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3406005\n\n\u2191 Painter, I.; Eaton, J.; Lober, B. (2013). \"Using Change Point Detection for Monitoring the Quality of Aggregate Data\". Online Journal of Public Health Informatics 5 (1): e186. doi:10.5210\/ojphi.v5i1.4597.\n\n\u2191 Zaslavsky, A.M.; Ayanian, J.Z.; Zaborski, L.B. (2012). \"The validity of race and ethnicity in enrollment data for Medicare beneficiaries\". Health Services Research 47 (3 Pt. 2): 1300\u201321. doi:10.1111\/j.1475-6773.2012.01411.x. PMC PMC3349013. PMID 22515953. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3349013\n\n\u2191 Khoury, M.J.; Ioannidis, J.P. (2014). \"Medicine: Big data meets public health\". Science 346 (6213): 1054\u20135. doi:10.1126\/science.aaa2709. PMC PMC4684636. PMID 25430753. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4684636\n\n\u2191 \"Meaningful Use\". CDC A\u2013Z Index. Centers for Disease Control and Prevention. https:\/\/www.cdc.gov\/ehrmeaningfuluse\/. Retrieved 12 October 2017.\n\n\u2191 Johansson, M.A.; Powers, A.M.; Pesik, N. et al. \"Nowcasting the spread of chikungunya virus in the Americas\". PLoS One 9 (8): e104915. doi:10.1371\/journal.pone.0104915. PMC PMC4128737. PMID 25111394. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4128737 
\n\n\u2191 Preis, T.; Moat, H.S. \"Adaptive nowcasting of influenza outbreaks using Google searches\". Royal Society Open Science 1 (2): 140095. doi:10.1098\/rsos.140095. PMC PMC4448892. PMID 26064532. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4448892\n\n\u2191 Althouse, B.M.; Scarpino, S.V.; Meyers, L.A. et al. \"Enhancing disease surveillance with novel data streams: Challenges and opportunities\". EPJ Data Science 4: 17. doi:10.1140\/epjds\/s13688-015-0054-0. PMC PMC5156315. PMID 27990325. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5156315\n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. Several URLs from the original were dead, and more current URLs were substituted.\n\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\">https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health<\/a>\nCategories: LIMSwiki journal articles (added in 2018); LIMSwiki journal articles (all); LIMSwiki journal articles on data quality; LIMSwiki journal articles on public health informatics\n\nThis page was last modified on 6 November 2018, at 15:39.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n","e29139b9d43cc4915ffca40cbc15f91c_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Big_data_in_the_era_of_health_information_exchanges_Challenges_and_opportunities_for_public_health skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div 
id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Big data in the era of health information exchanges: Challenges and opportunities for public health<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p>Public health surveillance of communicable diseases depends on timely, complete, accurate, and useful data that are collected across a number of health care and public health systems. <a href=\"https:\/\/www.limswiki.org\/index.php\/Health_information_exchange\" title=\"Health information exchange\" target=\"_blank\" class=\"wiki-link\" data-key=\"7ea2eaa504cfa533dd9340169625c4ba\">Health information exchanges<\/a> (HIEs) which support electronic sharing of data and <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a> between health care organizations are recognized as a source of \"big data\" in health care and have the potential to provide public health with a single stream of data collated across disparate systems and sources. However, given these data are not collected specifically to meet public health objectives, it is unknown whether a public health agency\u2019s (PHA\u2019s) secondary use of the data is supportive of or presents additional barriers to meeting disease reporting and surveillance needs. 
To explore this issue, we conducted an assessment of big data that is available to a PHA\u2014<a href=\"https:\/\/www.limswiki.org\/index.php\/Public_health_laboratory\" title=\"Public health laboratory\" target=\"_blank\" class=\"wiki-link\" data-key=\"34ffb658cb79bf322c65efaad95996f5\">laboratory<\/a> test results and clinician-generated notifiable condition report data\u2014through its participation in an HIE.\n<\/p><p><b>Keywords<\/b>: big data, communicable diseases, data mining, data quality, epidemiology, health information exchange, infectious diseases, population surveillance, public health\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>We evaluated two datasets\u2014for sexually-transmitted infections (STIs) and non-STIs\u2014for the time period of January 1, 2012 to September 15, 2013 used by a PHA that is part of one of the largest and oldest HIE infrastructures in the U.S. The two datasets were independently analyzed for their data quality, utility, and appropriateness for meeting public health surveillance objectives: (1) timeliness, defined as the difference between earliest date of a disease report and date the report is received at the PHA; (2) volume, defined as the number of disease report cases received by the PHA; and (3) completion, defined as the number of days to close a disease case report.\n<\/p><p>Our assessment uncovered the following challenges for effective utilization of big data by public health:\n<\/p>\n<ol><li> While PHAs almost exclusively rely on secondary use data for surveillance, big data that has been collected for clinical purposes omits data fields of high value for public health.<\/li>\n<li> Big data is not always smart data, especially when the context within which the data is collected is absent.<\/li>\n<li> Data collected by disparate, varying systems and sources can introduce uncertainties and limit trustworthiness in the data, which may diminish its value for public health 
purposes.<\/li>\n<li> The process by which data is obtained needs to be evident in order for big data to be useful to public health.<\/li>\n<li> Big data for public health purposes needs to answer both \"what\" and \"why\" questions.<\/li><\/ol>\n<p>Despite these and other issues\u2014such as measurement error and confounding, well-known challenges to both big and small data\u2014strategies traditionally employed by public health epidemiologists and other public health professionals can uncover limitations and contribute to the design of solutions in collection, integration, warehousing, and analysis of big data so its value and utility to public health can be optimized.\n<\/p><p>In recognition of the 10-year anniversary of the incorporation of the internet search firm Google, the journal <i>Nature<\/i> issued a special supplement on big data and what the availability of large datasets meant and will mean for scientists and researchers.<sup id=\"rdp-ebb-cite_ref-MillerComm08_1-0\" class=\"reference\"><a href=\"#cite_note-MillerComm08-1\" rel=\"external_link\">[1]<\/a><\/sup> In particular, the supplement focused on the opportunities that will be possible when issues such as interoperable <a href=\"https:\/\/www.limswiki.org\/index.php\/Information_management\" title=\"Information management\" target=\"_blank\" class=\"wiki-link\" data-key=\"f8672d270c0750a858ed940158ca0a73\">data infrastructures<\/a>, <a href=\"https:\/\/www.limswiki.org\/index.php\/Information_security\" title=\"Information security\" target=\"_blank\" class=\"wiki-link\" data-key=\"9eff362d944224ff1d4ffe3a149d7cff\">security<\/a>, data standardization, storage and transfer requirements, and data governance are resolved. 
Now, nearly 10 years later, users of big data\u2014characterized by the 5 Vs (huge volume, high velocity, high variety, low veracity, and high value)\u2014still encounter the issues presented in the <i>Nature<\/i> special supplement.<sup id=\"rdp-ebb-cite_ref-KruseChallenges16_2-0\" class=\"reference\"><a href=\"#cite_note-KruseChallenges16-2\" rel=\"external_link\">[2]<\/a><\/sup> In particular, the primary challenges to utilizing big data center around the diversity of data types (variety), the resources required to handle data collection, storage and processing (velocity), and uncertainties inherent in mixing and cleaning data from varied data streams that generates unpredictability in the data (veracity).<sup id=\"rdp-ebb-cite_ref-JinSignif15_3-0\" class=\"reference\"><a href=\"#cite_note-JinSignif15-3\" rel=\"external_link\">[3]<\/a><\/sup>\n<\/p><p>Nevertheless, within the health care sector, despite these challenges, big data also promises great opportunities to improve quality of health care delivery, population management, early detection of disease, decision-making, and cost reduction.<sup id=\"rdp-ebb-cite_ref-NambiarALook13_4-0\" class=\"reference\"><a href=\"#cite_note-NambiarALook13-4\" rel=\"external_link\">[4]<\/a><\/sup> Major contributors to the explosion of big data are investments in information technology (IT), such as increased adoption of <a href=\"https:\/\/www.limswiki.org\/index.php\/Electronic_medical_record\" title=\"Electronic medical record\" target=\"_blank\" class=\"wiki-link\" data-key=\"99a695d2af23397807da0537d29d0be7\">electronic medical record<\/a> systems<sup id=\"rdp-ebb-cite_ref-JosephHITECH14_5-0\" class=\"reference\"><a href=\"#cite_note-JosephHITECH14-5\" rel=\"external_link\">[5]<\/a><\/sup>, and the creation of health information exchanges (HIEs)<sup id=\"rdp-ebb-cite_ref-RoskiCreating14_6-0\" class=\"reference\"><a href=\"#cite_note-RoskiCreating14-6\" rel=\"external_link\">[6]<\/a><\/sup> which facilitate sharing of 
electronic data and information between health care organizations.<sup id=\"rdp-ebb-cite_ref-GrovesTheBig13_7-0\" class=\"reference\"><a href=\"#cite_note-GrovesTheBig13-7\" rel=\"external_link\">[7]<\/a><\/sup> While the focus of HIEs has been on sharing patient information between clinics, <a href=\"https:\/\/www.limswiki.org\/index.php\/Hospital\" title=\"Hospital\" target=\"_blank\" class=\"wiki-link\" data-key=\"b8f070c66d8123fe91063594befebdff\">hospitals<\/a>, pharmacies, <a href=\"https:\/\/www.limswiki.org\/index.php\/Laboratory\" title=\"Laboratory\" target=\"_blank\" class=\"wiki-link\" data-key=\"c57fc5aac9e4abf31dccae81df664c33\">laboratories<\/a>, and payers, public health agencies (PHAs) are increasingly included in HIEs.<sup id=\"rdp-ebb-cite_ref-ShahInter16_8-0\" class=\"reference\"><a href=\"#cite_note-ShahInter16-8\" rel=\"external_link\">[8]<\/a><\/sup> PHA participation in a HIE provides a single stream of data collated across disparate systems and sources for public health.\n<\/p><p>Public health is a data-intensive and -driven field. 
Data is a highly valued currency for assessing the health of the community; providing guidance to stakeholders for handling a foodborne illness outbreak; forecasting the burden of seasonal influenza to enable sufficient timing to vaccinate vulnerable populations; and innumerable other efforts that aim to prevent disease, prolong life, promote human health, and mitigate unnecessary suffering.<sup id=\"rdp-ebb-cite_ref-CDCFWhatIs_9-0\" class=\"reference\"><a href=\"#cite_note-CDCFWhatIs-9\" rel=\"external_link\">[9]<\/a><\/sup> Within the context of big data, public health efforts include linking information technology systems to conduct population-based cancer research and surveillance<sup id=\"rdp-ebb-cite_ref-BarrettBig13_10-0\" class=\"reference\"><a href=\"#cite_note-BarrettBig13-10\" rel=\"external_link\">[10]<\/a><\/sup>, more effectively identify behaviors that can build healthier communities<sup id=\"rdp-ebb-cite_ref-MeyerBig14_11-0\" class=\"reference\"><a href=\"#cite_note-MeyerBig14-11\" rel=\"external_link\">[11]<\/a><\/sup>, and improve targeted and timely epidemiologic surveillance of communicable and infectious disease.<sup id=\"rdp-ebb-cite_ref-Salath.C3.A9Digital12_12-0\" class=\"reference\"><a href=\"#cite_note-Salath.C3.A9Digital12-12\" rel=\"external_link\">[12]<\/a><\/sup>\n<\/p><p>Specific to public health surveillance of communicable diseases, effective surveillance relies on time-sensitive, complete, accurate, and useful data that are collected across a number of healthcare and public health systems. 
It could be assumed that PHA participation in a HIE would support and potentially improve surveillance efforts, as data collected within the clinical encounter could be shared with public health more rapidly and be integrated into PHA <a href=\"https:\/\/www.limswiki.org\/index.php\/Clinical_decision_support_system\" title=\"Clinical decision support system\" target=\"_blank\" class=\"wiki-link\" data-key=\"095141425468d057aa977016869ca37d\">decision support systems<\/a> to meet public health practice needs. However, given that these data are not collected specifically to meet public health objectives, it is unknown whether a PHA\u2019s secondary use of the data is supportive of or presents additional barriers to meeting disease reporting and surveillance needs. To explore this issue, we conducted an assessment of big data that is available to a PHA\u2014laboratory test results and clinician-generated notifiable condition report data\u2014through its participation in a HIE, and we discuss the extent to which its value impacts the rationale for investing in the infrastructure, including workforce training, that is required to collect and interpret this data and ultimately inform measurable improvements in the health of public health community stakeholders.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Objective\">Objective<\/span><\/h2>\n<p>To explore challenges and opportunities for utilizing public health big data available through PHA participation in a HIE.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Methods\">Methods<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Ethics\">Ethics<\/span><\/h3>\n<p>This study was approved by the Indiana University Institutional Review Board with cross-institutional and concurrent IRB deferral from the University of Washington.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Data_source\">Data source<\/span><\/h3>\n<p>Datasets for the time period of January 1, 2012 through September 15, 2013 were pulled from two public health 
surveillance systems: (1) the Statewide Information Management Surveillance System (SWIMSS), which collects electronic lab reports (ELRs) and communicable disease reports (CDRs) for STIs; and (2) InSight, the county\u2019s core population health data system, which collects ELRs and CDRs of non-STI data for public health surveillance activities. The SWIMSS data pull was limited to the most prevalent and highly-reported conditions: chlamydia, gonorrhea, and syphilis. The InSight data pull was limited to acute hepatitis B, chronic hepatitis C, and salmonella.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Analysis\">Analysis<\/span><\/h3>\n<p>The two datasets were independently analyzed for their data quality, utility, and appropriateness for meeting public health surveillance objectives, including: (1) timeliness, defined as the difference between earliest date of a disease report and date the report is received at the PHA; (2) volume, defined as the number of disease report cases received by the PHA; and (3) completion, defined as the number of days until a case report is marked as closed by the investigator.\n<\/p><p>Each dataset was separately reviewed for data quality issues. Duplicate records were removed and missing data rates tabulated. Patterns of missing data were visualized over time, and change point analysis<sup id=\"rdp-ebb-cite_ref-PainterUsing13_13-0\" class=\"reference\"><a href=\"#cite_note-PainterUsing13-13\" rel=\"external_link\">[13]<\/a><\/sup> was used to estimate time points at which underlying process changes may have occurred. Processing times (time to receipt of test results and PHA time to process results) were calculated in calendar days. Metadata was not available on which days the PHA conducted work, so this was estimated from the data based on days on which any cases were closed. This estimated metadata was then used to calculate the number of work days required to close each case. 
Analyses of factors associated with time to receive and time to process cases were conducted after removal of atypical times. We aggregated case counts by disease and month to examine seasonal patterns of disease counts, and we aggregated case counts by disease and week to examine possible outbreaks and associations between outbreaks of different disease types. Occurrences of possible outbreaks were examined using a threshold of three standard deviations above a 31-day moving average.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Results\">Results<\/span><\/h2>\n<p>The final SWIMSS dataset included chlamydia (<i>n<\/i> = 28018); gonorrhea (<i>n<\/i> = 7791); syphilis (<i>n<\/i> = 810); and syphilis, reactor (<i>n<\/i> = 3118). The final InSight dataset included acute hepatitis B (<i>n<\/i> = 563); chronic hepatitis C (<i>n<\/i> = 2160); histoplasmosis (<i>n<\/i> = 73); and salmonella (<i>n<\/i> = 210). Table 1 summarizes data exclusions resulting from the <a href=\"https:\/\/www.limswiki.org\/index.php\/Data_analysis\" title=\"Data analysis\" target=\"_blank\" class=\"wiki-link\" data-key=\"545c95e40ca67c9e63cd0a16042a5bd1\">data quality analysis<\/a>.\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"10\"><b>Table 1.<\/b> SWIMSS and InSight data quality summary\n<\/td><\/tr>\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" colspan=\"2\">\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" colspan=\"7\">Exclusions\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">\n<\/th><\/tr>\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" rowspan=\"2\">Dataset\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; 
padding-right:10px;\" rowspan=\"2\">Total Number of Records in Initial Data Pull\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" colspan=\"3\">Missing Data\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" colspan=\"4\">Date Anomalies\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" rowspan=\"2\">Final Number of Records in Dataset\n<\/th><\/tr>\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Date before 01\/01\/2012 or Could Not Calculate\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">No Diagnosis\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">No Lab Tests\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">\"Time to Receipt\" Anomalies\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Lab Test Date Anomalies\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Public Health Activity Date Anomalies\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">\"Time to Close\" Anomalies\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">SWIMSS\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">48,250\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">5392\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">325\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">1178\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">909\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">709\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">39,737\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">InSight\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">3719\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">321\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">4\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">163\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">12\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">213\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">3006\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>We identified five specific challenges to secondary use of HIE data for meeting public health communicable disease surveillance needs. These challenges are illustrated by accompanying analyses.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Challenge_1\">Challenge 1<\/span><\/h3>\n<p>While PHAs almost exclusively rely on secondary use data for surveillance, big data that has been collected for clinical purposes omits data fields of high value for public health.\n<\/p><p>For example, demographic characteristics such as race\/ethnicity are highly valued for understanding population level disparities in health and health care. 
Detailed spatial data (for example, zip code level or finer) are valuable for population-based forecasting, targeted development of health promotion materials, and resource allocation but are little used by clinicians; we observed lower data quality for these fields in our analysis. However, as seen in Figure 1 for the ethnicity field, this information is not reliably collected, which can diminish the secondary use of this big data. This is also observed in other population-level databases; for example, ethnicity information in Medicare enrollment data has low sensitivity and specificity.<sup id=\"rdp-ebb-cite_ref-ZaslavskyTheValid12_14-0\" class=\"reference\"><a href=\"#cite_note-ZaslavskyTheValid12-14\" rel=\"external_link\">[14]<\/a><\/sup>\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Baseman_Informatics2017_4-4.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"bc4ef3022784cecb747b54e976432bda\"><img alt=\"Fig1 Baseman Informatics2017 4-4.png\" src=\"https:\/\/www.limswiki.org\/images\/f\/f7\/Fig1_Baseman_Informatics2017_4-4.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1.<\/b> Missing value rates for ethnicity field, SWIMSS database<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Challenge_2\">Challenge 2<\/span><\/h3>\n<p>Big data is not always smart data, especially when the context within which the data is collected is absent. 
While big data is suitable for detecting an increase in volume of a particular variable of public health interest, it also presents classic, well-known outbreak detection problems such as unknown or fluctuating denominators (for example, where only positive test results are known and the underlying number of tests performed is unknown) and signal-noise problems (for example, where early detection of outbreaks requires detecting low numbers of cases with non-specific symptoms from much larger volumes of health care encounters).\n<\/p><p>An illustration of this challenge is our observation in the data of an increase in the volume of salmonella cases (Figure 2). An initial interpretation would be that there is a probable salmonella outbreak. However, we learned that during the volume upticks, there was a shigella outbreak in the community. The observed increase may then be attributed to heightened clinical awareness and testing for any gastrointestinal illness symptoms, rather than a true increase in salmonella cases. 
Alternatively, what appears to be an uptick may reflect the true prevalence of salmonella in the community and may be interpreted as an indicator of low clinician reporting of a communicable disease.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig2_Baseman_Informatics2017_4-4.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"e41399e3c5e07ce4c0d0ef6550366751\"><img alt=\"Fig2 Baseman Informatics2017 4-4.png\" src=\"https:\/\/www.limswiki.org\/images\/7\/70\/Fig2_Baseman_Informatics2017_4-4.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 2.<\/b> Salmonella counts by week with alert thresholds, InSight database.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Challenge_3\">Challenge 3<\/span><\/h3>\n<p>Data collected by disparate, varying systems and sources can introduce uncertainties and limit trustworthiness in the data, which may diminish its value for public health purposes.\n<\/p><p>For example, in the case of laboratory reports, a positive lab test result can be generated by numerous different types of lab tests. A lab report of a positive case of acute hepatitis B can come from any one of 22 different lab test codes, representing multiple types of lab tests. Chronic hepatitis B has 31 different lab test codes, while chronic hepatitis C has 48 different lab test codes. We identified considerable variation in use over time for some tests (tests 2, 3, 8, 10, and 11), as illustrated in Figure 3. 
Different lab tests may have different sensitivity and specificity characteristics, and so changes in lab test composition over time complicate interpretation of trends.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig3_Baseman_Informatics2017_4-4.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"41b41f3b8a44c4c5f08cdfb5a0135118\"><img alt=\"Fig3 Baseman Informatics2017 4-4.png\" src=\"https:\/\/www.limswiki.org\/images\/5\/5d\/Fig3_Baseman_Informatics2017_4-4.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 3.<\/b> Lab test code used for positive hepatitis C reports by time for lab test codes with more than 30 reports. Each row represents a different lab test code, with vertical bars representing when reported cases occurred.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Challenge_4\">Challenge 4<\/span><\/h3>\n<p>The process by which data is obtained needs to be evident in order for big data to be useful to public health. Changes in the data generation and collection processes that underlie testing for disease and collection of test data can have big impacts on the value of data for public health (examples could include changes in the type of test used at a facility or changes in personnel resulting in changing patterns of coding usage).\n<\/p><p>For example, Figure 4 shows a curious parallel double bump in counts for three diseases. The parallel increase suggests a change in the underlying process of testing or acquiring data rather than in the disease processes. 
The date range for the increase in disease counts suggests that a change in the processes of disease testing associated with December holidays may have contributed. However, the previous year saw no pattern of increases during the same time period.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig4_Baseman_Informatics2017_4-4.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"00b278b5e3f7a4d6aa89f96ae17ff0db\"><img alt=\"Fig4 Baseman Informatics2017 4-4.png\" src=\"https:\/\/www.limswiki.org\/images\/3\/32\/Fig4_Baseman_Informatics2017_4-4.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 4.<\/b> Counts of positive test results for chlamydia, syphilis reactor, and gonorrhea aggregated by week.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Challenge_5\">Challenge 5<\/span><\/h3>\n<p>Unlike many other domains in which big data is used, big data for public health purposes needs to answer both \"what\" and \"why\" questions. Also, unlike some other health care fields, PHAs are responsible not only for the health of the communities they serve but also accountable to other government agencies and elected officials who must make decisions and enact policies based on public health surveillance observations. Incorporating metadata about a big data source can help guide answers to \"what\" and \"why\" questions that can arise when analyzing and interpreting findings.\n<\/p><p>An illustration of this challenge is presented in Figure 5, a timeliness analysis which identified substantial differences by day of the week for lab test ordering and processing. 
These differences by day of the week appear to impact delivery of lab results to the PHA. It is unknown whether this could be accounted for by differences among labs in processing protocols, by how a lab combines different test codes to generate a final test report, or by other factors that might elucidate why this difference occurred. In turn, this timeliness difference could impact the timing for issuing a public health advisory to the community or to health care providers regarding an increased volume of, for example, acute hepatitis B. Needed metadata about lab processing and reporting practices could make the difference in timing for an advisory and also help elected officials feel more confident about a finding that could require policy decisions to stop the spread of a communicable disease in the community.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig5_Baseman_Informatics2017_4-4.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"3e93f4add4b24e5a1e7eb7cb9bea1da2\"><img alt=\"Fig5 Baseman Informatics2017 4-4.png\" src=\"https:\/\/www.limswiki.org\/images\/6\/69\/Fig5_Baseman_Informatics2017_4-4.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 5.<\/b> Time to receive case report by public health by disease and day of week, InSight database.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Table 2 is another illustration of the need for metadata, this one focused on clinician reporting. We identified significant variation by the day of the week on which a case report is received at the PHA, as well as considerable variation in reporting by condition. 
However, in the absence of contextual factors that can influence reporting variation, such as seasonal fluctuations in illness (for example, higher prevalence of influenza during winter months), interpretation of this finding requires more information.\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"11\"><b>Table 2.<\/b> Variation in reporting by condition and day of week report received\n<\/td><\/tr>\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" rowspan=\"2\">\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" colspan=\"2\">Monday\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" colspan=\"2\">Tuesday\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" colspan=\"2\">Wednesday\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" colspan=\"2\">Thursday\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" colspan=\"2\">Friday\n<\/th><\/tr>\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\"><i>N<\/i>\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">%\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\"><i>N<\/i>\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">%\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\"><i>N<\/i>\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">%\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\"><i>N<\/i>\n<\/th>\n<th style=\"background-color:#e2e2e2; 
padding-left:10px; padding-right:10px;\">%\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\"><i>N<\/i>\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">%\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">HEPBA\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">36\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">24.2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">30\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20.1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">31\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20.8\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">22\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">14.8\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">30\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20.1\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">HEPBC\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">132\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">24.5\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">78\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">14.5\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">127\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">23.6\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">98\n<\/td>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">18.2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">104\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">19.3\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">HEPC\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">699\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">28.7\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">457\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">18.8\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">406\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">16.7\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">460\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">18.9\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">414\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">17.0\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">HISTO\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">29\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">35.8\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">14\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">17.3\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">14\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">17.3\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">11\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">13.6\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">13\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">16.0\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">SAL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">67\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">30.0\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">37\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">16.6\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">34\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">15.2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">40\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">17.9\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">45\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20.2\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h2><span class=\"mw-headline\" id=\"Discussion\">Discussion<\/span><\/h2>\n<p>According to Khoury and Ioannidis, effective utilization of big data in public health centers on two challenges: addressing the trade-off between access and accuracy and the task of separating true signal from large and varied noise.<sup id=\"rdp-ebb-cite_ref-KhouryMedicine14_15-0\" class=\"reference\"><a href=\"#cite_note-KhouryMedicine14-15\" rel=\"external_link\">[15]<\/a><\/sup> Our assessment of a large dataset available to public health not only provides examples of these challenges but also points to pathways for turning these challenges into 
opportunities.\n<\/p><p><b>Challenge 1<\/b>: While PHAs almost exclusively rely on secondary use data for surveillance, big data that has been collected for clinical purposes omits data fields of high value for public health.\n<\/p><p>Opportunity: As important as secondary use data is for public health surveillance, public health lacks mechanisms to enforce completeness of fields or timely reporting. Our example of missing race\/ethnicity data is a compelling case, as without this information a PHA will not be able to target health promotion efforts to the most affected or vulnerable populations. Public health is recognized as chronically underfunded; PHAs are not only unlikely to offer incentives for data collection, but they also need to use scarce resources wisely. Conducting an STI prevention program in a community that does not experience high levels of chlamydia, for example, would be wasteful and could cause friction in community relations. In recent years, some mechanisms, such as \"meaningful use,\"<sup id=\"rdp-ebb-cite_ref-CDCMeaningful_16-0\" class=\"reference\"><a href=\"#cite_note-CDCMeaningful-16\" rel=\"external_link\">[16]<\/a><\/sup> have been enacted to expand current case reporting between hospitals\/providers and public health and increase capacity for data management and analysis. Figure 1 shows evidence of improvement in the completeness rates of the ethnicity field for one database, largely resulting from changes in the underlying process of collecting data for this field. However, enforcing compliance with complete and timely reporting may be beyond the resources of public health.\n<\/p><p><b>Challenge 2<\/b>: Big data is not always smart data, especially when the context within which the data is collected is absent.\n<\/p><p>Opportunity: A constant issue with notifiable condition reporting systems is the lack of a denominator for the number of positive test results, in part due to privacy reasons that are difficult to avoid. 
This lack of context limits the value of reportable systems for disease detection, mainly in terms of increasing the rate of false positive alerts. Big data methods to determine context from other data sources would be of great value for public health. The opportunity here is to draw on the experience the big data field has with processing unstructured data and data from multiple sources, and to use big data methods to help understand the context of the clinical data.\n<\/p><p><b>Challenge 3<\/b>: Data collected by disparate, varying systems and sources can introduce uncertainties and limit trustworthiness in the data, which may diminish its value for public health purposes.\n<\/p><p>Opportunity: The further away the use of the data gets from the original purpose for its collection, the higher the potential for data quality, integrity, and value problems. There is an opportunity for public health to play a role in providing population health-level situational awareness information back to the data originators. This would show data originators the value of data fields that they collect but do not directly use. An example of population health-level situational awareness information would be obesity rates within populations that match characteristics of the provider\u2019s panel population.\n<\/p><p><b>Challenge 4<\/b>: The process by which data is obtained needs to be evident in order for big data to be useful to public health.\n<\/p><p>Opportunity: Big data methods that can detect and adjust for underlying changes in the processes that govern the collection of public health data would be beneficial. 
Three areas of work relating to metadata would be useful:\n<\/p>\n<ol><li> Techniques for automatically identifying where metadata is needed (for example, automatically identifying and flagging changes in data suggestive of underlying changes in the data generation process).<\/li>\n<li> Techniques for generating metadata from the data itself (for example, we used counts of cases processed on each day to generate metadata that indicates on which days public health work was being performed).<\/li>\n<li> Techniques that adjust analyses based on metadata, especially with regard to data quality. In situations where PH entities have little recourse for improving DQ, methods that adjust for DQ need to be developed. For example, nowcasting methods (predicting the present state based on the incomplete data at hand) can account for data that accrues over time.<sup id=\"rdp-ebb-cite_ref-JohanssonNowcast14_17-0\" class=\"reference\"><a href=\"#cite_note-JohanssonNowcast14-17\" rel=\"external_link\">[17]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-PreisAdaptive14_18-0\" class=\"reference\"><a href=\"#cite_note-PreisAdaptive14-18\" rel=\"external_link\">[18]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-AlthouseEnhancing15_19-0\" class=\"reference\"><a href=\"#cite_note-AlthouseEnhancing15-19\" rel=\"external_link\">[19]<\/a><\/sup><\/li><\/ol>\n<p><b>Challenge 5<\/b>: Big data for public health purposes needs to answer both \"what\" and \"why\" questions.\n<\/p><p>Opportunity: PH use of big data is unique in that it is constrained by risk of failure. If PH fails to stop an outbreak, preventable accidents and deaths can result (e.g., Ebola surveillance, detection, and prediction failure). If PH predicts an outbreak that does not materialize, the costs can include damaged relationships with stakeholders, media, and the public. 
In addition, PH has a responsibility to monitor any data sources that it does receive; thus, data of unclear value to public health uses resources that may be better invested elsewhere.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusions\">Conclusions<\/span><\/h2>\n<p>Despite these and other issues\u2014such as measurement error and confounding, well-known challenges to both big and small data\u2014strategies traditionally employed by public health epidemiologists and other public health professionals can uncover limitations and contribute to the design of solutions in collection, integration, warehousing, and analysis of big data so its value and utility to public health can be optimized.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<p>This study was conducted as part of the \u201cLeveraging a HIE to Improve Public Health Disease Investigation\u201d research project (RWJF Award #70338; PI: J Baseman, University of Washington, Seattle WA, USA). The content is solely the responsibility of the authors and does not necessarily represent the official views of the Robert Wood Johnson Foundation.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Author_contributions\">Author contributions<\/span><\/h3>\n<p>J.G.B., D.R. and I.P. conceived this paper; I.P. analyzed the data; J.G.B., D.R. and I.P. wrote the paper. 
All authors reviewed and approved revisions.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Conflicts_of_interest\">Conflicts of interest<\/span><\/h3>\n<p>The authors declare no conflict of interest.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-MillerComm08-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MillerComm08_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Miller, E.&#32;(2008).&#32;\"Community cleverness required\".&#32;<i>Nature<\/i>&#32;<b>455<\/b>&#32;(1).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2F455001a\" target=\"_blank\">10.1038\/455001a<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Community+cleverness+required&amp;rft.jtitle=Nature&amp;rft.aulast=Miller%2C+E.&amp;rft.au=Miller%2C+E.&amp;rft.date=2008&amp;rft.volume=455&amp;rft.issue=1&amp;rft_id=info:doi\/10.1038%2F455001a&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KruseChallenges16-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KruseChallenges16_2-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kruse, C.S.; Goswamy, R.; Raval, Y.; Marawi, S.&#32;(2016).&#32;<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5138448\" target=\"_blank\">\"Challenges and Opportunities of Big Data in Health Care: A Systematic Review\"<\/a>.&#32;<i>JMIR Medical Informatics<\/i>&#32;<b>4<\/b>&#32;(4): e38.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.2196%2Fmedinform.5359\" target=\"_blank\">10.2196\/medinform.5359<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5138448\/\" target=\"_blank\">PMC5138448<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27872036\" target=\"_blank\">27872036<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5138448\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC5138448<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Challenges+and+Opportunities+of+Big+Data+in+Health+Care%3A+A+Systematic+Review&amp;rft.jtitle=JMIR+Medical+Informatics&amp;rft.aulast=Kruse%2C+C.S.%3B+Goswamy%2C+R.%3B+Raval%2C+Y.%3B+Marawi%2C+S.&amp;rft.au=Kruse%2C+C.S.%3B+Goswamy%2C+R.%3B+Raval%2C+Y.%3B+Marawi%2C+S.&amp;rft.date=2016&amp;rft.volume=4&amp;rft.issue=4&amp;rft.pages=e38&amp;rft_id=info:doi\/10.2196%2Fmedinform.5359&amp;rft_id=info:pmc\/PMC5138448&amp;rft_id=info:pmid\/27872036&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5138448&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-JinSignif15-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-JinSignif15_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Jin, X.; Wah, B.W.; Cheng, X.; Wang, Y.&#32;(2015).&#32;\"Significance and Challenges of Big Data Research\".&#32;<i>Big Data Research<\/i>&#32;<b>2<\/b>&#32;(2).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.bdr.2015.01.006\" target=\"_blank\">10.1016\/j.bdr.2015.01.006<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Significance+and+Challenges+of+Big+Data+Research&amp;rft.jtitle=Big+Data+Research&amp;rft.aulast=Jin%2C+X.%3B+Wah%2C+B.W.%3B+Cheng%2C+X.%3B+Wang%2C+Y.&amp;rft.au=Jin%2C+X.%3B+Wah%2C+B.W.%3B+Cheng%2C+X.%3B+Wang%2C+Y.&amp;rft.date=2015&amp;rft.volume=2&amp;rft.issue=2&amp;rft_id=info:doi\/10.1016%2Fj.bdr.2015.01.006&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NambiarALook13-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NambiarALook13_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Nambiar, R.; Bhardwaj, R.; Sethi, A.; Vargheese, R.&#32;(2013).&#32;\"A look at challenges and opportunities of big data analytics in healthcare\".&#32;<i>Proceedings from the 2013 IEEE International Conference on Big Data<\/i>: 17\u201322.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FBigData.2013.6691753\" target=\"_blank\">10.1109\/BigData.2013.6691753<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+look+at+challenges+and+opportunities+of+big+data+analytics+in+healthcare&amp;rft.jtitle=Proceedings+from+the+2013+IEEE+International+Conference+on+Big+Data&amp;rft.aulast=Nambiar%2C+R.%3B+Bhardwaj%2C+R.%3B+Sethi%2C+A.%3B+Vargheese%2C+R.&amp;rft.au=Nambiar%2C+R.%3B+Bhardwaj%2C+R.%3B+Sethi%2C+A.%3B+Vargheese%2C+R.&amp;rft.date=2013&amp;rft.pages=17%E2%80%9322&amp;rft_id=info:doi\/10.1109%2FBigData.2013.6691753&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-JosephHITECH14-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-JosephHITECH14_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Joseph, S.; Sow, M.; Furukawa, M.F. 
et al.&#32;(2014).&#32;\"HITECH spurs EHR vendor competition and innovation, resulting in increased adoption\".&#32;<i>American Journal of Managed Care<\/i>&#32;<b>20<\/b>&#32;(9): 734-40.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25365748\" target=\"_blank\">25365748<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=HITECH+spurs+EHR+vendor+competition+and+innovation%2C+resulting+in+increased+adoption&amp;rft.jtitle=American+Journal+of+Managed+Care&amp;rft.aulast=Joseph%2C+S.%3B+Sow%2C+M.%3B+Furukawa%2C+M.F.+et+al.&amp;rft.au=Joseph%2C+S.%3B+Sow%2C+M.%3B+Furukawa%2C+M.F.+et+al.&amp;rft.date=2014&amp;rft.volume=20&amp;rft.issue=9&amp;rft.pages=734-40&amp;rft_id=info:pmid\/25365748&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-RoskiCreating14-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-RoskiCreating14_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Roski, J.; Bo-Linn, G.W.; Andrews, T.A.&#32;(2014).&#32;\"Creating value in health care through big data: Opportunities and policy implications\".&#32;<i>Health Affairs<\/i>&#32;<b>33<\/b>&#32;(7): 1115-22.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1377%2Fhlthaff.2014.0147\" target=\"_blank\">10.1377\/hlthaff.2014.0147<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25006136\" target=\"_blank\">25006136<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Creating+value+in+health+care+through+big+data%3A+Opportunities+and+policy+implications&amp;rft.jtitle=Health+Affairs&amp;rft.aulast=Roski%2C+J.%3B+Bo-Linn%2C+G.W.%3B+Andrews%2C+T.A.&amp;rft.au=Roski%2C+J.%3B+Bo-Linn%2C+G.W.%3B+Andrews%2C+T.A.&amp;rft.date=2014&amp;rft.volume=33&amp;rft.issue=7&amp;rft.pages=1115-22&amp;rft_id=info:doi\/10.1377%2Fhlthaff.2014.0147&amp;rft_id=info:pmid\/25006136&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GrovesTheBig13-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GrovesTheBig13_7-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Groves, P.; Kayyali, B.; Knott, D.; Van Kuiken, S.&#32;(January 2013).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.mckinsey.com\/~\/media\/mckinsey\/industries\/healthcare%20systems%20and%20services\/our%20insights\/the%20big%20data%20revolution%20in%20us%20health%20care\/the_big_data_revolution_in_healthcare.ashx\" target=\"_blank\">\"The 'big data' revolution in healthcare: Accelerating value and innovation\"<\/a>.&#32;McKinsey &amp; Company<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.mckinsey.com\/~\/media\/mckinsey\/industries\/healthcare%20systems%20and%20services\/our%20insights\/the%20big%20data%20revolution%20in%20us%20health%20care\/the_big_data_revolution_in_healthcare.ashx\" 
target=\"_blank\">https:\/\/www.mckinsey.com\/~\/media\/mckinsey\/industries\/healthcare%20systems%20and%20services\/our%20insights\/the%20big%20data%20revolution%20in%20us%20health%20care\/the_big_data_revolution_in_healthcare.ashx<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=The+%27big+data%27+revolution+in+healthcare%3A+Accelerating+value+and+innovation&amp;rft.atitle=&amp;rft.aulast=Groves%2C+P.%3B+Kayyali%2C+B.%3B+Knott%2C+D.%3B+Van+Kuiken%2C+S.&amp;rft.au=Groves%2C+P.%3B+Kayyali%2C+B.%3B+Knott%2C+D.%3B+Van+Kuiken%2C+S.&amp;rft.date=January+2013&amp;rft.pub=McKinsey+%26+Company&amp;rft_id=https%3A%2F%2Fwww.mckinsey.com%2F%7E%2Fmedia%2Fmckinsey%2Findustries%2Fhealthcare%2520systems%2520and%2520services%2Four%2520insights%2Fthe%2520big%2520data%2520revolution%2520in%2520us%2520health%2520care%2Fthe_big_data_revolution_in_healthcare.ashx&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ShahInter16-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ShahInter16_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Shah, G.H.; Leider, J.P.; Luo, H.; Kaur, R.&#32;(2016).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5049946\" target=\"_blank\">\"Interoperability of Information Systems Managed and Used by the Local Health Departments\"<\/a>.&#32;<i>Journal of Public Health Management and Practice<\/i>&#32;<b>22<\/b>&#32;(Suppl. 
6): S34-S43.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1097%2FPHH.0000000000000436\" target=\"_blank\">10.1097\/PHH.0000000000000436<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5049946\/\" target=\"_blank\">PMC5049946<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27684616\" target=\"_blank\">27684616<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5049946\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC5049946<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Interoperability+of+Information+Systems+Managed+and+Used+by+the+Local+Health+Departments&amp;rft.jtitle=Journal+of+Public+Health+Management+and+Practice&amp;rft.aulast=Shah%2C+G.H.%3B+Leider%2C+J.P.%3B+Luo%2C+H.%3B+Kaur%2C+R.&amp;rft.au=Shah%2C+G.H.%3B+Leider%2C+J.P.%3B+Luo%2C+H.%3B+Kaur%2C+R.&amp;rft.date=2016&amp;rft.volume=22&amp;rft.issue=Suppl.+6&amp;rft.pages=S34-S43&amp;rft_id=info:doi\/10.1097%2FPHH.0000000000000436&amp;rft_id=info:pmc\/PMC5049946&amp;rft_id=info:pmid\/27684616&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5049946&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CDCFWhatIs-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CDCFWhatIs_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.cdcfoundation.org\/what-public-health\" target=\"_blank\">\"What Is Public Health?\"<\/a>.&#32;CDC Foundation<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.cdcfoundation.org\/what-public-health\" target=\"_blank\">https:\/\/www.cdcfoundation.org\/what-public-health<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 12 October 2017<\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=What+Is+Public+Health%3F&amp;rft.atitle=&amp;rft.pub=CDC+Foundation&amp;rft_id=https%3A%2F%2Fwww.cdcfoundation.org%2Fwhat-public-health&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BarrettBig13-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BarrettBig13_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Barrett, M.A.; Humblet, O.; Hiatt, R.A.; Adler, N.E.&#32;(2013).&#32;\"Big Data and Disease Prevention: From Quantified Self to Quantified Communities\".&#32;<i>Big Data<\/i>&#32;<b>1<\/b>&#32;(3): 168-75.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1089%2Fbig.2013.0027\" target=\"_blank\">10.1089\/big.2013.0027<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27442198\" target=\"_blank\">27442198<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Big+Data+and+Disease+Prevention%3A+From+Quantified+Self+to+Quantified+Communities&amp;rft.jtitle=Big+Data&amp;rft.aulast=Barrett%2C+M.A.%3B+Humblet%2C+O.%3B+Hiatt%2C+R.A.%3B+Adler%2C+N.E.&amp;rft.au=Barrett%2C+M.A.%3B+Humblet%2C+O.%3B+Hiatt%2C+R.A.%3B+Adler%2C+N.E.&amp;rft.date=2013&amp;rft.volume=1&amp;rft.issue=3&amp;rft.pages=168-75&amp;rft_id=info:doi\/10.1089%2Fbig.2013.0027&amp;rft_id=info:pmid\/27442198&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MeyerBig14-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MeyerBig14_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Meyer, A.M.; Olshan, A.F.; Green, L. 
et al.&#32;(2014).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4766858\" target=\"_blank\">\"Big data for population-based cancer research: the integrated cancer information and surveillance system\"<\/a>.&#32;<i>North Carolina Medical Journal<\/i>&#32;<b>75<\/b>&#32;(4): 265\u20139.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1089%2Fbig.2013.0027\" target=\"_blank\">10.1089\/big.2013.0027<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4766858\/\" target=\"_blank\">PMC4766858<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25046092\" target=\"_blank\">25046092<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4766858\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4766858<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Big+data+for+population-based+cancer+research%3A+the+integrated+cancer+information+and+surveillance+system&amp;rft.jtitle=North+Carolina+Medical+Journal&amp;rft.aulast=Meyer%2C+A.M.%3B+Olshan%2C+A.F.%3B+Green%2C+L.+et+al.&amp;rft.au=Meyer%2C+A.M.%3B+Olshan%2C+A.F.%3B+Green%2C+L.+et+al.&amp;rft.date=2014&amp;rft.volume=75&amp;rft.issue=4&amp;rft.pages=265%E2%80%939&amp;rft_id=info:doi\/10.1089%2Fbig.2013.0027&amp;rft_id=info:pmc\/PMC4766858&amp;rft_id=info:pmid\/25046092&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4766858&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Salath.C3.A9Digital12-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Salath.C3.A9Digital12_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Salath\u00e9, M.; Bengtsson, L.; Bodnar, T.J. 
et al.&#32;(2012).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3406005\" target=\"_blank\">\"Digital epidemiology\"<\/a>.&#32;<i>PLoS Computational Biology<\/i>&#32;<b>8<\/b>&#32;(7): e1002616.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pcbi.1002616\" target=\"_blank\">10.1371\/journal.pcbi.1002616<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3406005\/\" target=\"_blank\">PMC3406005<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22844241\" target=\"_blank\">22844241<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3406005\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3406005<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Digital+epidemiology&amp;rft.jtitle=PLoS+Computational+Biology&amp;rft.aulast=Salath%C3%A9%2C+M.%3B+Bengtsson%2C+L.%3B+Bodnar%2C+T.J.+et+al.&amp;rft.au=Salath%C3%A9%2C+M.%3B+Bengtsson%2C+L.%3B+Bodnar%2C+T.J.+et+al.&amp;rft.date=2012&amp;rft.volume=8&amp;rft.issue=7&amp;rft.pages=e1002616&amp;rft_id=info:doi\/10.1371%2Fjournal.pcbi.1002616&amp;rft_id=info:pmc\/PMC3406005&amp;rft_id=info:pmid\/22844241&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3406005&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PainterUsing13-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PainterUsing13_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Painter, I.; Eaton,
J.; Lober, B.&#32;(2013).&#32;\"Using Change Point Detection for Monitoring the Quality of Aggregate Data\".&#32;<i>Online Journal of Public Health Informatics<\/i>&#32;<b>5<\/b>&#32;(1): e186.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.5210%2Fojphi.v5i1.4597\" target=\"_blank\">10.5210\/ojphi.v5i1.4597<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Using+Change+Point+Detection+for+Monitoring+the+Quality+of+Aggregate+Data&amp;rft.jtitle=Online+Journal+of+Public+Health+Informatics&amp;rft.aulast=Painter%2C+I.%3B+Eaton.+J.%3B+Lober%2C+B.&amp;rft.au=Painter%2C+I.%3B+Eaton.+J.%3B+Lober%2C+B.&amp;rft.date=2013&amp;rft.volume=5&amp;rft.issue=1&amp;rft.pages=e186&amp;rft_id=info:doi\/10.5210%2Fojphi.v5i1.4597&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ZaslavskyTheValid12-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ZaslavskyTheValid12_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Zaslavsky, A.M.; Ayanian, J.Z.; Zaborski, L.B.&#32;(2012).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3349013\" target=\"_blank\">\"The validity of race and ethnicity in enrollment data for Medicare beneficiaries\"<\/a>.&#32;<i>Health Services Research<\/i>&#32;<b>47<\/b>&#32;(3 Pt. 
2): 1300\u201321.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1111%2Fj.1475-6773.2012.01411.x\" target=\"_blank\">10.1111\/j.1475-6773.2012.01411.x<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3349013\/\" target=\"_blank\">PMC3349013<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22515953\" target=\"_blank\">22515953<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3349013\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3349013<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+validity+of+race+and+ethnicity+in+enrollment+data+for+Medicare+beneficiaries&amp;rft.jtitle=Health+Services+Research&amp;rft.aulast=Zaslavsky%2C+A.M.%3B+Ayanian%2C+J.Z.%3B+Zaborski%2C+L.B.&amp;rft.au=Zaslavsky%2C+A.M.%3B+Ayanian%2C+J.Z.%3B+Zaborski%2C+L.B.&amp;rft.date=2012&amp;rft.volume=47&amp;rft.issue=3+Pt.+2&amp;rft.pages=1300%E2%80%9321&amp;rft_id=info:doi\/10.1111%2Fj.1475-6773.2012.01411.x&amp;rft_id=info:pmc\/PMC3349013&amp;rft_id=info:pmid\/22515953&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3349013&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KhouryMedicine14-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KhouryMedicine14_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Khoury, M.J.; Ioannidis, J.P.&#32;(2014).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4684636\" target=\"_blank\">\"Medicine: Big data meets public health\"<\/a>.&#32;<i>Science<\/i>&#32;<b>346<\/b>&#32;(6213): 1054\u20135.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1126%2Fscience.aaa2709\" target=\"_blank\">10.1126\/science.aaa2709<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4684636\/\" target=\"_blank\">PMC4684636<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25430753\" target=\"_blank\">25430753<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4684636\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4684636<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Medicine%3A+Big+data+meets+public+health&amp;rft.jtitle=Science&amp;rft.aulast=Khoury%2C+M.J.%3B+Ioannidis%2C+J.P.&amp;rft.au=Khoury%2C+M.J.%3B+Ioannidis%2C+J.P.&amp;rft.date=2014&amp;rft.volume=346&amp;rft.issue=6213&amp;rft.pages=1054%E2%80%935&amp;rft_id=info:doi\/10.1126%2Fscience.aaa2709&amp;rft_id=info:pmc\/PMC4684636&amp;rft_id=info:pmid\/25430753&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4684636&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CDCMeaningful-16\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CDCMeaningful_16-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.cdc.gov\/ehrmeaningfuluse\/\" target=\"_blank\">\"Meaningful Use\"<\/a>.&#32;<i>CDC A\u2013Z Index<\/i>.&#32;Centers for Disease Control and Prevention<span class=\"printonly\">.&#32;<a 
rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.cdc.gov\/ehrmeaningfuluse\/\" target=\"_blank\">https:\/\/www.cdc.gov\/ehrmeaningfuluse\/<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 12 October 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Meaningful+Use&amp;rft.atitle=CDC+A%E2%80%93Z+Index&amp;rft.pub=Centers+for+Disease+Control+and+Prevention&amp;rft_id=https%3A%2F%2Fwww.cdc.gov%2Fehrmeaningfuluse%2F&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-JohanssonNowcast14-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-JohanssonNowcast14_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Johansson, M.A.; Powers, A.M.; Pesik, N. 
et al.&#32;(2014).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4128737\" target=\"_blank\">\"Nowcasting the spread of chikungunya virus in the Americas\"<\/a>.&#32;<i>PLoS One<\/i>&#32;<b>9<\/b>&#32;(8): e104915.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pone.0104915\" target=\"_blank\">10.1371\/journal.pone.0104915<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4128737\/\" target=\"_blank\">PMC4128737<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25111394\" target=\"_blank\">25111394<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4128737\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4128737<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Nowcasting+the+spread+of+chikungunya+virus+in+the+Americas&amp;rft.jtitle=PLoS+One&amp;rft.aulast=Johansson%2C+M.A.%3B+Powers%2C+A.M.%3B+Pesik%2C+N.+et+al.&amp;rft.au=Johansson%2C+M.A.%3B+Powers%2C+A.M.%3B+Pesik%2C+N.+et+al.&amp;rft.volume=9&amp;rft.issue=8&amp;rft.pages=e104915&amp;rft_id=info:doi\/10.1371%2Fjournal.pone.0104915&amp;rft_id=info:pmc\/PMC4128737&amp;rft_id=info:pmid\/25111394&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4128737&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PreisAdaptive14-18\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PreisAdaptive14_18-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Preis, T.; Moat, H.S.&#32;(2014).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4448892\" target=\"_blank\">\"Adaptive nowcasting of influenza outbreaks using Google searches\"<\/a>.&#32;<i>Royal Society Open Science<\/i>&#32;<b>1<\/b>&#32;(2): 140095.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1098%2Frsos.140095\" target=\"_blank\">10.1098\/rsos.140095<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4448892\/\" target=\"_blank\">PMC4448892<\/a>.&#32;<a 
rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26064532\" target=\"_blank\">26064532<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4448892\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4448892<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Adaptive+nowcasting+of+influenza+outbreaks+using+Google+searches&amp;rft.jtitle=Royal+Society+Open+Science&amp;rft.aulast=Preis%2C+T.%3B+Moat%2C+H.S.&amp;rft.au=Preis%2C+T.%3B+Moat%2C+H.S.&amp;rft.volume=1&amp;rft.issue=2&amp;rft.pages=140095&amp;rft_id=info:doi\/10.1098%2Frsos.140095&amp;rft_id=info:pmc\/PMC4448892&amp;rft_id=info:pmid\/26064532&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4448892&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AlthouseEnhancing15-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AlthouseEnhancing15_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Althouse, B.M.; Scarpino, S.V.; Meyers, L.A. 
et al.&#32;(2015).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5156315\" target=\"_blank\">\"Enhancing disease surveillance with novel data streams: Challenges and opportunities\"<\/a>.&#32;<i>EPJ Data Science<\/i>&#32;<b>4<\/b>: 17.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1140%2Fepjds%2Fs13688-015-0054-0\" target=\"_blank\">10.1140\/epjds\/s13688-015-0054-0<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5156315\/\" target=\"_blank\">PMC5156315<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27990325\" target=\"_blank\">27990325<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5156315\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC5156315<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Enhancing+disease+surveillance+with+novel+data+streams%3A+Challenges+and+opportunities&amp;rft.jtitle=EPJ+Data+Science&amp;rft.aulast=Althouse%2C+B.M.%3B+Scarpino%2C+S.V.%3B+Meyers%2C+L.A.+et+al.&amp;rft.au=Althouse%2C+B.M.%3B+Scarpino%2C+S.V.%3B+Meyers%2C+L.A.+et+al.&amp;rft.volume=4&amp;rft.pages=17&amp;rft_id=info:doi\/10.1140%2Fepjds%2Fs13688-015-0054-0&amp;rft_id=info:pmc\/PMC5156315&amp;rft_id=info:pmid\/27990325&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5156315&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. 
Several URLs from the original were dead, and more current URLs were substituted.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health\">https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","e29139b9d43cc4915ffca40cbc15f91c_images":["https:\/\/www.limswiki.org\/images\/f\/f7\/Fig1_Baseman_Informatics2017_4-4.png","https:\/\/www.limswiki.org\/images\/7\/70\/Fig2_Baseman_Informatics2017_4-4.png","https:\/\/www.limswiki.org\/images\/5\/5d\/Fig3_Baseman_Informatics2017_4-4.png","https:\/\/www.limswiki.org\/images\/3\/32\/Fig4_Baseman_Informatics2017_4-4.png","https:\/\/www.limswiki.org\/images\/6\/69\/Fig5_Baseman_Informatics2017_4-4.png"],"e29139b9d43cc4915ffca40cbc15f91c_timestamp":1544815913,"c443b688b80703848e965b29dc3cba01_type":"article","c443b688b80703848e965b29dc3cba01_title":"GeoFIS: An open-source decision support tool for precision agriculture data (Leroux et al. 2018)","c443b688b80703848e965b29dc3cba01_url":"https:\/\/www.limswiki.org\/index.php\/Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data","c443b688b80703848e965b29dc3cba01_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:GeoFIS: An open-source decision support tool for precision agriculture data\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nGeoFIS: An open-source decision support tool for precision agriculture dataJournal\n \nAgricultureAuthor(s)\n \nLeroux, Corentin; Jones, Haza\u00ebl; Pichon, L\u00e9o; Guillaume, Serge; Lamour, Julien;\r\nTaylor, James; Naud, Olivier; Crestey, Thomas; Lablee, Jean-Luc; Tisseyre, BrunoAuthor affiliation(s)\n \nUniversity of Montpellier, SMAG, Compagnie Fruiti\u00e8rePrimary contact\n \nEmail: cleroux at smag-group dot comYear published\n \n2018Volume and issue\n \n8(6)Page(s)\n \n73DOI\n \n10.3390\/agriculture8060073ISSN\n \n2077-0472Distribution license\n \nCreative Commons Attribution 4.0 InternationalWebsite\n \nhttp:\/\/www.mdpi.com\/2077-0472\/8\/6\/73\/htmDownload\n 
\nhttp:\/\/www.mdpi.com\/2077-0472\/8\/6\/73\/pdf (PDF)\n\nContents\n\n1 Abstract \n2 Introduction \n3 The GeoFIS software \n\n3.1 Aim of the GeoFIS project \n3.2 Architecture and design of GeoFIS \n3.3 Functionalities implemented in GeoFIS \n\n\n4 Case studies \n\n4.1 Case study 1 \n\n4.1.1 Rationale and description \n4.1.2 Application in GeoFIS \n4.1.3 Results and discussion \n\n\n4.2 Case study 2 \n\n4.2.1 Rationale and description \n4.2.2 Application in GeoFIS \n4.2.3 Results and discussion \n\n\n4.3 Case study 3 \n\n4.3.1 Rationale and description \n4.3.2 Application in GeoFIS \n4.3.3 Results and discussion \n\n\n\n\n5 Conclusions \n6 Acknowledgements \n\n6.1 Author contributions \n6.2 Funding \n6.3 Conflicts of interest \n\n\n7 References \n8 Notes \n\n\n\nAbstract \nThe world we live in is an increasingly spatial and temporal data-rich environment, and the agriculture industry is no exception. However, data needs to be processed in order to first get information and then make informed management decisions. The concepts of \"precision agriculture\" and \"smart agriculture\" can and will be fully effective when methods and tools are available to practitioners to support this transformation. An open-source program called GeoFIS has been designed with this objective. It was designed to cover the whole process from spatial data to spatial information and decision support. The purpose of this paper is to evaluate the abilities of GeoFIS along with its embedded algorithms to address the main features required by farmers, advisors, or spatial analysts when dealing with precision agriculture data. Three case studies are investigated in the paper: (i) mapping of the spatial variability in the data, (ii) evaluation and cross-comparison of the opportunity for site-specific management in multiple fields, and (iii) delineation of within-field zones for variable-rate applications when these latter are considered opportune. 
These case studies were applied to three contrasting crop types: banana, wheat, and grapes. These were chosen to highlight the diversity of applications and data characteristics that might be handled with GeoFIS. For each case study, up-to-date algorithms arising from research studies and implemented in GeoFIS were used to process these precision agriculture data. Areas for future development and possible relations with existing geographic information systems (GIS) software are also discussed.\nKeywords: decision making, GeoFIS, geostatistics, open-source software, precision agriculture, spatial analysis\n\nIntroduction \nWithin-field variability is now a phenomenon widely accepted and reported by the precision agriculture community.[1][2] Geolocalized data are effectively collected intensively within the fields by sensors embedded on agricultural machinery, satellites, flying platforms, static stations, or humans among others, to make sure that this variability is considered and accounted for.[3][4][5] Spatial data have particular characteristics that are worth careful consideration during analysis. 
First of all, their spatial resolution (density) is of interest as it defines the capacity to identify short- and long-scale spatial variability.[6][7] Spatial records are often associated with a high level of noise that originates from multiple sources, such as the plant-to-plant variability, the accuracy of the sensor, or the conditions of data acquisition.[8] Except for images in which data are regularly distributed on a grid of pixels, many spatial observations collected in agriculture are irregular and do not follow a fixed pattern within the fields.[9] This feature is of great concern because many image processing algorithms cannot be directly used on these irregular data.\nTo benefit from this increasing flow of data, users should be provided with software or tools that allow them to:\n\nvisualize the data they have collected (simple or low-level functions),\nprocess these data (advanced or high-level functions), and\nincorporate the knowledge they have on these data into the data processing.\nIt is acknowledged that basic visualization tools\u2014e.g., data import, georeferencing, data display\u2014are available in many general (e.g., Quantum Geographic Information System (QGIS), gvSIG, Google Earth, Whitebox Geospatial Analysis Tools) and more specific[10][11] open-source platforms, including those not specific to agricultural applications. It is clear that such functionalities are of major importance for handling spatial data. However, when it comes to making informed management decisions, these visualization functions are not sufficient. It is necessary to provide users with more advanced or high-level functions so that they can turn this raw spatial data into information and decision layers. 
The most commonly required procedures in the precision agriculture domain are functions such as:\n\nfiltering, to ensure the quality of the datasets[12][13],\ninterpolation, to provide a continuous mapping of the property of interest[14][15][16],\nzoning, to define within-field zones for site-specific management[17][18], or\naggregation so that multiple layers of information can be combined.[19][20]\nTo foster the adoption of such tools, all the aforementioned functions have to be specifically dedicated to the processing of agricultural data from potentially very differing productions systems. This is an important consideration as these data come with a lot of associated knowledge that has to be considered when processing these data. More specifically, significant local expertise to support decision making might be available as users, e.g., farmers, advisors and\/or technicians, have normally been scouting the fields during all the growing season.[21][22][23] Site-specific management also requires the use of agricultural machinery with specific characteristics that have to be considered in these processing functions. This is to ensure that planned differential management is in accordance with the practical and operational limitations of machinery, e.g., working width, lag time, and application speed.[24][25]\nFrom a general perspective, there are only a few dedicated software programs available to explicitly process precision agriculture data and incorporate expert knowledge into the process. Moreover, very few of them are open-source. Some freeware and shareware tools have been developed and proposed by the precision agriculture community, but these generally focus on specific processing tasks or on a particular type of data. For example, the Vesper program[26], developed by the University of Sydney, provides users with a graphical interface to spatially interpolate their data. 
Despite the quite advanced functions that are available, e.g., local punctual and block kriging, users only end up with a continuous map of their data without much more practical information. The Yield Editor software from the United States Department of Agriculture[13][27] deals effectively with the filtering of within-field yield datasets that are known to contain many defective observations[28], but it does not perform interpolation or other high-level functions. Another interesting example is a QGIS plugin that was put into place to process spatial data of vine shoot diameter arising from the mounted sensor Physiocap (E.RE.C.A, Vaulx-en-Velin, France). This tool mainly incorporates functions to filter these highly noisy datasets. Other platforms have been proposed by agronomists to give farmers access to crop models, but they are very specific in terms of crop, data, and use.[29] An open-source platform that takes raw data through to a decision point is not yet available to the precision agriculture community.\nThe aim of this paper is to present the GeoFIS software (https:\/\/www.geofis.org\/), developed by a joint team from IRSTEA, INRA, and Montpellier SupAgro in France.[30] The goal of this platform is to provide users with up-to-date and reliable algorithms to process their precision agriculture data and incorporate expert knowledge from the fields. GeoFIS has been mainly developed for academic and research purposes, i.e., investigators and students willing to process their data, but also to a lesser extent for agronomists and advisors with a sufficient background in spatial analysis. The objective of this interface-based platform is to support users who do not necessarily have programming skills and to show that high-level functions can be introduced in a GIS and could be integrated within precision agriculture programs. The first section introduces this open-source tool along with its architecture, design, interface, and main processing functions. 
Three different case studies on various crops are then considered to evaluate the ability of this software to answer most of the issues that are faced by the agricultural sector for processing their spatial data. The last section highlights the needs for future developments to promote precision agriculture adoption and the possibility to create connections with existing GIS software programs.\n\nThe GeoFIS software \nAim of the GeoFIS project \nGeoFIS has been designed to facilitate the movement from spatial data to spatial information, and to spatial decision making. It is an open-source program that proposes a simple and easy-to-use interface to build decision support systems (DSS) from spatial data.[30] While its development has been inspired by agri-environmental applications, the framework itself is open and accessible to applications in other domains. It is designed to be adaptable to different usages and for different end users, mostly for academic and research applications, for student and teaching applications, and, to a lesser extent, for GIS-skilled agronomists and advisors.\nGeoFIS deviates from other GIS software, e.g., QGIS, in the sense that specific tools have been implemented to answer the main expectations of agricultural professionals when it comes to processing precision agriculture data. These will be presented later on. It is acknowledged that multiple other open-source spatial programs (e.g., QGIS) or languages (e.g., R and Python) are available to process spatial and temporal data. However, these open-source tools do not have specific functions dedicated to the processing of precision agriculture data (as listed in the introduction section) and usually require users to have skills in programming. This is a major limiting factor for the practical use of spatial modelling in agriculture. Another strength of GeoFIS is that attention has been paid to the incorporation of expert knowledge into data analysis. 
This is not available in other related spatial processing tools. Agricultural professionals have significant local expert knowledge on their production system that needs to be taken into account. By incorporating this qualitative expert knowledge, the quality of the processing should be improved and the adoption of precision agriculture technologies should be enhanced.\n\nArchitecture and design of GeoFIS \nIn the proposed GeoFIS architecture, all the open-source toolboxes and libraries have been selected for their ability to handle spatial data and to incorporate expert knowledge (Figure 1). Statistical and geostatistical functions dedicated to precision agriculture data (see next subsection) are implemented in R (https:\/\/www.r-project.org). Outside these specific functions, spatial data are handled through two open-source libraries, i.e., Geotools (http:\/\/www.geotools.org) and CGAL (Computational Geometry Algorithms Library, https:\/\/www.cgal.org). Geotools is used because its Java implementation allows the design of user-friendly interfaces. CGAL was chosen for its ability to provide very efficient and reliable geometric algorithms, as its functions are developed in C++. Finally, the incorporation of expert knowledge is made possible with FisPro (https:\/\/www.fispro.org), a system that uses fuzzy sets for conceptual modeling.[30]\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 1: The GeoFIS architecture[30]. CGAL, Computational Geometry Algorithms Library; DSS, Decision Support Systems; GIS, Geographic Information System; 1D, One dimension\n\n\n\nGeoFIS is available in four languages (French, English, Spanish, and Portuguese). The interface is designed with a man-machine cooperation objective. The goal is to facilitate the relationships between data, learning algorithms, and expert knowledge. 
Documentation, scientific papers, and video tutorials are available to better understand the implemented function and to facilitate the adoption of the GeoFIS software (https:\/\/www.geofis.org\/). Notifications are made when a new version of the software is available.\n\nFunctionalities implemented in GeoFIS \nGeoFIS contains a series of low and high-level non-spatial and spatial functionalities to interrogate spatial data. The general functionalities are introduced here and then expanded in several case studies in the following section. Figure 2 shows the generic flow required in precision agriculture, from raw data processing to decision making, with the functionalities within GeoFIS at each stage indicated. In agricultural systems, data are available in different formats (points, polygons, rasters) and at different scales. The quality of the data is also variable, with some sensors being inherently noisy and others less so. Different data need potentially different approaches to (i) data validation and clean-up (quality control), (ii) data display (visualization), and, when necessary, (iii) interpolation. These steps transform data into information layers. Within GeoFIS, data can be easily imported (Step 0) and displayed as a map (in its geographical space) and as a histogram (in its attribute space). This allows the user to \"expertly\" identify global outliers in both the geographical and attribute space and remove any erroneous data (Step 1). Interpolation is possible using inverse distance weighting (for small data sets) and via punctual kriging with a global variogram for larger data sets (&gt;100 points). The kriging method includes the ability to plot the experimental variogram and specify a theoretical variogram, which is then passed to the kriging function. 
Interpolated outputs can be directly displayed as rasters within the display (Step 2).\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 2: Generic flow of data in precision agriculture with main processing steps from raw data processing to decision-making.\n\n\n\n\"Precision agriculture\" or \"smart agriculture\" is only effective when effective decisions are made. End users can transform these information layers into decision layers to improve the management of their fields. Three main functionalities for management (practical) applications have been incorporated within GeoFIS to address this. Firstly, practitioners are provided with a method to delineate within-field homogeneous zones (Step 3.1). Zoning is of importance for precision agriculture data, as the identified zones will (i) facilitate spatial data visualization and interpretation and (ii) provide a spatial resolution that is practical and effective for many differential field operations. GeoFIS uses a segmentation algorithm to \"zone\" data layers.[18] The segmentation algorithm operates either on irregular or gridded (interpolated) data to generate potential management zones.\nSecondly, while data\/information collection tends to be focused around production issues, there is no restriction on its use. It can equally be used for strategic as well as tactical decision making. The example of the technical opportunity index (TOI)[31], which is implemented in GeoFIS, is a case in point. The TOI uses the production data to assess a field\u2019s suitability for site-specific management given machinery constraints and the observed production variation (Step 3.2). The algorithm processes the within-field data with a mathematical morphological filter based on erosion and dilation.[31] This filter allows end users to account for the passes of the agricultural machinery in the field and especially the minimum area (kernel) within which it can operate reliably. 
As the algorithm requires the data to be organized regularly on a grid, interpolating the data might therefore be required as a pre-processing step (Step 2).\nFinally, in the majority of cases, practical agronomic decisions are multi-variate in nature. Decision support therefore requires dedicated data fusion methods to merge multiple information layers into a single decision layer (Step 3.3). For instance, when available, historical yield data (high spatial resolution point information), as-applied historical fertilizer maps (polygon data), recent point soil testing (low spatial resolution point data), and early season satellite imagery (high resolution raster) should collectively feed into a decision on mid-season spatial fertilizer inputs, i.e., a prescription fertilizer map (normally a polygon layer). In the previous example, the prescription fertilization map (the decision layer) is based on a set of inputs (information layers) that are all related through expert rules. An example of a possible expert rule could be that if, on a given location in space, the observed yield is high and the soil fertilizer level is low, then it might be relevant to apply more fertilizer inputs. Within GeoFIS, the goal of the data aggregation process is to implement the expert rules so that the final spatial decision layer (that answers the question \"how much fertilizer input should be applied at this particular place at this particular time?\") can be obtained. Expert rules are implemented one at a time, as each rule leads to a practical agronomic decision.\nData aggregation in GeoFIS is a two-step process. First, each information layer is transformed into an expert layer, i.e., the numerical agronomic values in each information layer are transformed into degree values (from 0 to 1) according to the expert rule to be implemented. 
The transformation from an information layer to an expert layer is done using a fuzzy set-based function.[32] Secondly, all the expert layers are combined using an aggregation operator to respect the expert rules. Two aggregation operators are currently implemented in GeoFIS. The first operator is the Weighted Arithmetic Mean (WAM), which attributes a weight to each information source, e.g., the yield information layer may be given twice as much weight as the soil fertilizer level layer. The second operator is the Ordered Weighted Average (OWA)[33], where the weighting is slightly more complex. For a given location in space, the degree values associated with each layer involved in the expert rule are ordered, and the weights assigned to each layer will depend on their position in this ordering. This operator is of interest as it enables the implementation of logical operations, such as:\n\n \"OR,\" where the expert rule applies as soon as the highest degree associated with the layers is high, and\n \"AND,\" where the expert rule applies only when the lowest degree associated with the layers is high, i.e., when all the degrees are high.\nThe result of the aggregation process is a single decision layer. The uniqueness of the GeoFIS approach is in its ability to incorporate the expert knowledge developed by farmers and advisors on the data and their fields directly into the data fusion process. The implemented data aggregation methods require the data to be collocated, either on irregular or regular grids.\n\nCase studies \nThe previous section introduced the GeoFIS framework, including the functionalities implemented and how they could be adapted to the individual needs of each end user (who will have their own unique constraints on management). The following subsections provide more detailed illustrations on the main processing steps in the context of precision agriculture applications. 
More specifically, the three cases deal with the typical tasks that advisors and farmers may face in their daily job:\n\nthe mapping of spatial data (Steps 0, 1 and 2),\nthe evaluation and cross-comparison of the opportunity for site-specific management in their fields (Step 3.2), and\nthe delineation of within-field zones for variable-rate applications where zoning is considered opportune (Steps 3.1 and 3.3).\nSteps 0 to 2 will be exemplified through medium spatial resolution manual measurements performed over a banana field to map the plant vigor. High resolution yield data across several wheat fields will be used to illustrate the value of Step 3.2 to rank the fields from the most to the least suitable for site-specific management. Steps 3.1 and 3.3 will be applied to a precision viticulture example aimed at defining zones for differential irrigation management. The overall objective is to demonstrate how GeoFIS has the ability to address the main issues of data processing in precision agriculture. As the three case studies are performed on different crops (banana, wheat, and grapes), each exhibiting unique characteristics, the applicability and genericity of this open-source software will also be demonstrated.\n\nCase study 1 \nRationale and description \nMapping the spatial organization in the data\u2014An example of the vegetative response of an asynchronous plant, the banana\nVariography and mapping are two very important processing steps in the precision agriculture domain. The former helps evaluate the spatial structure in the data by quantifying the proportions of (i) spatially-structured variability or large-scale variations and (ii) spatially unstructured variability or small-scale variations within the field. 
The latter is mainly used for the correct display of the observed spatial variability and to facilitate the process of decision making.\nIn this case study, GeoFIS was used to investigate and map the spatial variability in the pseudostem (trunk) circumference of banana crops. The proposed analysis was carried out on this crop for two major reasons. First of all, the spatial variability in the agronomic properties of banana crops has been poorly reported in the literature.[34] Secondly, this crop is known to be asynchronous in its production cycle, which means that spatial analyses are to be handled differently from what is commonly done in annual crops, e.g., wheat, canola, or perennial ones, e.g., grapes.[34] The proposed analysis (i) estimates the proportion of spatially-structured variability in pseudostem circumferences, i.e., the proportion of variance that is mainly due to spatially-structured environmental properties[15]; (ii) determines the proportion of spatially unstructured variability that is due to non-spatially structured phenomena, e.g., the inter-plant variability, plant competition, replanting, and measurement accuracy among others; and (iii) maps the overall within-field variability of trunk circumference in the plantation.\nThe plot under study is situated in a commercial banana plantation in Njombe, Cameroon (WGS84: E: 4.612, N: 9.639) in its fifteenth flowering cycle. The pseudostem circumference measurements were only taken on plants where vegetative growth had ceased, i.e., plants that were either flowering or at a later phenological stage. There were 551 measurements taken using a tape measure at 1-m height and georeferenced with a trail type hand-held GPS (Table 1). 
The proposed analysis in GeoFIS consisted of the following steps: (i) the dataset was imported within GeoFIS (Step 0), (ii) pseudostem circumference values were filtered to ensure the quality of the dataset (Step 1), and (iii) variograms were fitted to the filtered datasets and interpolation was performed using kriging with a local neighborhood onto a 1\u00d71 meter grid.\n\n\n\n\n\n\n\nTable 1. Description of the plot under investigation\n\n\nSurface (ha)\n\nTotal Number of Plant Observations\n\nNumber of Plants that Have Reached at Least the Flowering Stage\n\nTrunk Circumference (cm)\n\n\nMean\n\nVariance\n\n\n0.85\n\n1287\n\n551\n\n74.7\n\n69.7\n\n\n\nApplication in GeoFIS \nThe global distribution of the data was filtered within GeoFIS (Figure 3). Users can select the attribute to be filtered at the top of the window. Below the histogram, two threshold values that represent the two tails of the distribution can be changed, by either typing specific values or moving a slide bar. Observations outside these thresholds are then removed from the dataset. Note that there were two low values in this data set that were considered outside the normal distribution by the user (Figure 3). The lower threshold allowed the user to eliminate these non-compliant values.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 3: Filtering of the pseudostem circumference values based on distribution of response in the attribute space\n\n\n\nThe spatial structure of the data can then be evaluated by plotting an experimental variogram, here using the within-field pseudostem circumferences. The number of lags and the maximum lag distance can be set in the left-hand corner of the window to make sure that the variogram is relevant. The interface (Figure 4) enables the user to specify and fit a theoretical variogram model to the experimental variogram. 
A theoretical variogram is automatically fitted, after which users can interactively change the values of the variogram parameters, i.e., nugget, partial sill, and range to improve the fit. The quality of the fit can be assessed with the root mean square error (RMSE) value that is detailed in the top right-hand corner of the interface. The theoretical model can then be saved and used later to perform interpolation by kriging.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 4: Screenshot from GeoFIS illustrating the calculation of the experimental variogram and the fitting of a theoretical variogram model to the within-field pseudostem circumference spatial data\n\n\n\nResults and discussion \nThe spatial locations of the measurements are displayed in Figure 5. It clearly shows that the spatial observations are irregularly-spaced within the plot. This aspect can be simply explained by the fact that not all the banana plants had reached the flowering phenological stage (only 551 out of the 1287 plants had). In the plot under study, the pseudostem circumference exhibits a quite strong spatial autocorrelation, the ratio of autocorrelated variance being close to 55% (Table 2). This finding demonstrates that spatially-structured environmental properties, e.g., soil physical and chemical characteristics, are likely in this case to exert a relatively strong influence on the pseudostem circumference of the banana plants. The determination of the factors affecting the pseudostem circumference is beyond the scope of this study. Further analyses of, e.g., soil and plant records, might help to answer this question.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 5: Spatial measurements of pseudostem circumference divided in five quantiles within the plot under study\n\n\n\n\r\n\n\n\n\n\n\n\n\nTable 2. 
Spatial statistics of pseudostem circumference in the plot under investigation\n\n\nNugget Variance (C0)\n\nPartial-Sill Variance (C1)\n\nSill Variance (C0 + C1)\n\nRatio of Autocorrelated Variance (C1\/(C0 + C1))\n\n\n35.2\n\n43.4\n\n78.6\n\n55.2%\n\n\n\nTable 2 also shows that the proportion of spatially unstructured variability (C0) is not negligible. In this case study, it can be mainly explained by (i) the inherent within-plant variability that might be exacerbated by competition among neighbors, and (ii) the accuracy of the measurements which might be affected by Global Navigation Satellite Systems (GNSS) accuracy issues or operator errors.\nFigure 6 provides a surface (map) of the within-field pseudostem circumference after interpolation (ordinary kriging). This smooths the data in Figure 5 using information on spatial variability contained in the same data. The circumferences appear to be much lower (less than 70 cm) in the northeastern and southern portions of the plots. The larger pseudostems, those for which the circumference exceeded 87 cm, can be mainly found in the northern part of the field. Some local effects\u2014e.g., small sites of low circumference surrounded by high pseudostem circumferences\u2014are also visible on the maps. Those might be explained by several phenomena having a localized effect on plants, such as pest damage or replanting. It is worth recalling that this final map is not a map of circumferences of all pseudostems; rather, it is a map of potential circumference at flowering, as not all the banana plants have reached the flowering stage. This map is an alternative representation of the information displayed in Figure 5 and provides predictions for plants that were not measured in the original survey. As for Figure 6, this map may be very useful in locating sampling sites to perform further soil and\/or plant analyses and to better characterize the within-field pseudostem circumference variability. 
It has the advantage over the raw data plot (Figure 5) of making the main patterns in the field easier for the human eye to interpret.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 6: Kriged map of the potential pseudostem circumference within the field under study. The map represents a potential rather than an exhaustive analysis of plants because not all the plants have reached the flowering stage.\n\n\n\nGeoFIS proved to be a relevant tool to model the spatial variability in the banana pseudostem circumference data and for continuous mapping of this property of interest. However, a couple of limitations are worth discussing. Firstly, even if the filtering interface is user-friendly, it only provides a global filtering of the data. Only the tails of the distribution can be trimmed. Spatial data may exhibit not only global but also local outliers. This was not a problem here, but removing local outliers would be a useful function in the software program. When present, local outliers (inliers) will affect the quality of interpolation procedures. Secondly, GeoFIS does not yet allow the fitting of nested variogram models. This was a potential issue in this case study. In Figure 4, it could be argued that there is a short-range spatial structure within the first 10 meters and a second spatial structure from 10 to 30 meters (with a longer range). Nested spatial structures are not common but do occur in agricultural data. Thirdly, regarding the continuous mapping of the data, GeoFIS only provides a kriged map of the property of interest. The mean estimates are given, but the error (kriging variance) associated with these estimates is not provided. 
This is a potential limitation for assessing the mapping accuracy and for interpreting uncertainty in future analyses with the interpolated data.\n\nCase study 2 \nRationale and description \nEvaluating and comparing the opportunity for site-specific management within-field\nSite-specific management requires a strong investment in time, money, and technical skills from growers. This investment requires certain conditions to be met. Firstly, the within-field variability has to be strong enough to justify differentiated management. Secondly, this variability has to be spatially structured or organized enough within the field to be manageable by agricultural machinery.[2] Farmers, therefore, are in need of tools that will help them to evaluate this opportunity for site-specific management. To make decisions at a larger level than the field, i.e., the whole farm, this opportunity also has to be cross-compared between fields. Farmers should preferentially commit their efforts towards the fields that are the most opportune for site-specific management. These are most likely to have the largest returns on investment in agri-technology, which should minimize the risk of investment for the farmer.\nIn this case study, GeoFIS was used to evaluate and compare the opportunity for adopting site-specific management across multiple fields using a defined opportunity index.[31] Opportunity indices are a way of assessing if the amount and structure of variation in a field makes site-specific management a potentially feasible option.[2][25] Seven yield datasets arising from two different farms located near Evreux, in the northwestern part of France (Farm 1\u2014WGS84: E: 0.779, N: 48.955; Farm 2\u2014WGS84: E: 1.032, N: 48.828) were used. Fields were cropped in wheat and harvested with various combines, primarily New Holland (Turin, Italy) and Claas (Harsewinkel, Germany) combines. 
Yield datasets are considered particularly relevant for this case study because the yield is directly related to the field economic returns. Quantifying the amount and structure of yield variance should therefore be a valuable indicator of whether site-specific management is opportune. Structured spatial variation in yield would indicate a potential for structured spatial crop management, particularly fertilizer and agrichemicals.\nThis case study also demonstrates the use of GeoFIS with dense sensor-derived spatial observations, in contrast to the spatial manual measurements presented in the first case study. Yield data are collected with on-board sensors at 1 Hz as the combine traverses the field. These observations are therefore irregularly-distributed in space because (i) the intra-row and inter-row distances are different and (ii) the acquisition conditions, such as the GNSS accuracy or variable combine speed, can impact the spatial distribution of the observations. The yield information is very dense (thousands of points per hectare) and very noisy because of stochastic error in sensor operation, the intrinsic local variability in production, and errors associated with the combine harvester passing through the field.[13][28]\nThese seven fields were selected because they exhibit various degrees of yield autocorrelation within the same systems (farms) and, as such, should represent a different opportunity for variable-rate applications. Within this case study, several functions of GeoFIS were used to arrive at a solution that ranks and compares the seven fields in terms of a technical opportunity for site-specific management. 
More specifically, (i) global outliers were filtered out (Step 1); (ii) variograms were fitted to the previously filtered yield datasets, and ordinary kriging with a global variogram and local neighborhood was performed onto a 3\u00d73 meter grid (Step 2); and (iii) the TOI was computed (see Section 2.3 Functionalities implemented in GeoFIS) (Step 3.2). To account for technical and operational constraints during the TOI computation, the following operational characteristics were assumed: a working width of 20 meters, a mean speed of three meters per second, and a delay rate of change between two different treatments of two seconds. These could be, for instance, the characteristics of a fertilizer spreader performing variable-rate application. The major yield statistics of the seven fields under consideration after data cleanup are reported in Table 3.\n\n\n\n\n\n\n\nTable 3. Principal descriptive and spatial statistics of the seven yield datasets under consideration. The nugget to sill ratio can be calculated after variograms are fitted to the cleaned data in GeoFIS.\n\n\nField\n\nSize (ha)\n\nMean (t ha\u22121)\n\nCV (%)\n\nNugget to Sill Ratio (%)\n\n\n1\n\n8.9\n\n8.3\n\n8.7\n\n53.8\n\n\n2\n\n12.9\n\n7.0\n\n24.6\n\n46.3\n\n\n3\n\n8.9\n\n7.8\n\n11.6\n\n36.0\n\n\n4\n\n11.2\n\n6.1\n\n9.1\n\n37.5\n\n\n5\n\n18.1\n\n7.1\n\n14.5\n\n22.4\n\n\n6\n\n24.1\n\n9.6\n\n15.9\n\n19.9\n\n\n7\n\n32.5\n\n9.5\n\n15.4\n\n15.1\n\n\n\nApplication in GeoFIS \nThe filtering and interpolation procedures have already been detailed in the first case study and will not be discussed here. The technical opportunity index (TOI) can be computed in the Opportunity Index toolbar of the GeoFIS software. Figure 7 displays the window that appears when this menu is selected. The window is composed of three main sections. 
In the top drop-down menu (Border), users are asked to select the attribute on which the metric should be computed, e.g., yield, and to provide the field boundaries to make sure that the calculation of the TOI is restricted to the field of interest. Note that the boundary can be automatically derived with a convex hull; however, this may not be a good option for fields with an irregular geometric shape. In the second drop-down menu (Machine Footprint) the technical and operational constraints of future site-specific management can be specified. More specifically, users can provide the working width of machinery, its speed, the delay in the rate of change between two levels of outputs (management strategies), and the uncertainty in the GNSS positioning of the machine. The third drop-down menu (Interpolation) ensures that all observations are reported on a fixed grid and the TOI is calculated using the grid data. Users can select the size of the interpolation grid along with the interpolation procedure, i.e., inverse distance weighting or kriging. Note that both interpolation approaches need to be parameterized and require some user input.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 7: Screenshot of output from the computation of the Technical Opportunity Index (TOI) in GeoFIS for Field 7\n\n\n\nWhen all this information has been specified by a user, the TOI can be calculated. The window displays two major outputs: (i) the TOI value associated with the data along with the corresponding error rate of application, and (ii) the potential management zone map with the different strategies that should be applied (in the case of Figure 7, there are two strategies presented). This latter map can be exported and used in other GIS software if needed.\n\nResults and discussion \nFigure 8 shows the seven fields in the study, ranked by their respective TOI values along with the corresponding variable-rate application map for a two-management strategy. 
It clearly shows that the fields have different levels of yield spatial structure, from the lowest for Field 1 to the strongest for Field 7. Note that, in this case study, the order of the TOI values is consistent with the order of nugget to sill ratios (Table 3). The TOI values are however very close in absolute terms (Figure 8), with a range from 0.888 to 0.965. As the TOI value can theoretically range from 0 to 1, all the fields here are exhibiting high TOI values, indicating that site-specific management is opportune for all of these fields. All the maps have spatially-structured patterns, in accordance with the technical and operational constraints of a future possible machine pass (Figure 8). These maps could be directly incorporated into a machinery system to perform site-specific management.\n\r\n\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 8: Ranking of the seven yield datasets in terms of the associated TOI value: (a) Field 1; (b) Field 2; (c) Field 3; (d) Field 4; (e) Field 5; (f) Field 6; (g) Field 7. Cleaned yield values and corresponding potential variable application maps are also displayed for each field. TOI: technical opportunity index\n\n\n\nThe high TOI values for these fields are due to two principal reasons: (i) the data interpolation and (ii) the operational constraints that were set. The computation of the TOI requires the data to be regularly distributed over the field, which is why a prior interpolation procedure is put into place. In this case study, the interpolation by kriging generated a relatively strong data smoothing that artificially increased the TOI values, as the TOI is calculated on the interpolated data. Indeed, as the small-scale variations are smoothed, the yield patterns appear much more organized in space, and site-specific management is consequently considered more opportune. The settings of the operational characteristics in these fields also facilitated high TOI values. 
As the minimal size of field management (working width of the machinery) decreases, the opportunity for variable-rate application will increase. Smaller machinery means that smaller areas of spatial variation become potentially manageable. In contrast, if field management were done at a coarser level, e.g., the working width of the machinery was set to 40 meters, then the opportunity for site-specific management would decrease, and there would likely be larger differences among the seven studied fields (data not shown). As can be seen in Figure 8, only two management strategies are proposed for each field. Even if this two-class categorization appears sufficient in some case studies, the current computation of the TOI does not allow for alternative management strategies (three, four, or more classes) to be simultaneously considered. This aspect will be investigated in further studies.\nThe TOI is a valuable metric to evaluate and rank fields with respect to the opportunity for site-specific management. GeoFIS is an interesting tool to perform this case study because all the steps required to compute the TOI can be performed within the program. Note that potential management zone maps are also provided and can be simply exported through the easy-to-use interface (however, the target rates are not yet determined at this point; see the next case study). This should foster the adoption of precision agriculture technologies. Users must however be cautious when computing and interpreting the TOI, as this metric is particularly sensitive to the interpolation of the cleaned data and the setting of the technical and operational constraints for site-specific management. Users should be able to perform a series of tests within GeoFIS to evaluate the impact of their parametrization on the TOI values and management zone maps. 
To cross-compare this opportunity for differentiated application among fields, the authors strongly advocate applying the exact same process with similar settings for the calculation of the final TOI metric.\n\nCase study 3 \nRationale and description \nDelineating within-field zones for variable-rate applications using expert knowledge\nThe delineation of within-field zones is an important procedure in precision agriculture studies because it enables growers to perform variable-rate applications, or at least makes them easier. The creation of these zones is a complex process for multiple reasons: (i) there is a need to account for spatial relationships in the data, (ii) very often multiple layers of spatial information must be combined, and (iii) the decision rules associated with agronomic applications are complex and require the grower\u2019s knowledge to be involved in the processing. In this case study, GeoFIS is used to delineate within-field zones prior to the management of irrigation and fertilization in a Spanish vineyard using several layers of information and incorporating expert knowledge. This case study is an extension of previous work by Santesteban et al.[35] Interested readers are referred to this document for more information.\nThe study was carried out on a 90 hectare commercial vineyard containing 27 contiguous fields (Figure 9) located in Southern Navarre, Spain (WGS84: E: 1.405, N: 42.254). The vine vigor, soil, and water availability in the field were considered to be of major interest by the vine manager to manage irrigation and fertilization practices.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 9: Maps of the whole-vineyard showing the spatial variability in (a) elevation; (b) soil apparent conductivity (ECa); and (c) vegetative expression (normalized difference vegetation index (NDVI)). 
Points in (a, b) indicate sampling locations (n = 256) (reproduced with permission from Santesteban et al.).[35]\n\n\n\nGrapevine vigor was estimated using the normalized difference vegetation index (NDVI) on a 3\u00d73 meter raster layer derived from a Multi-spectral Airborne image acquired in August 2007 and provided and processed by the Geosys-Spain Company (Leica ADS40 sensor). Measurements of soil apparent electrical conductivity (ECa) on a 30\u00d730 meter grid (256 sampling points) were performed using a handheld ground conductivity meter (EM38, Geonics Ltd., Mississauga, ON, Canada) to map soil spatial variability. The same sample sites were used to create a digital terrain model from elevation data obtained with a laser Tachymeter (TPS 1001, Leica, Heerbrugg, Switzerland). Both ECa and elevation data were kriged onto a three-meter grid. Additional monitoring was performed to provide more information on the vine vigor, soil, and water variation.[35] As these additional observations were more expensive and\/or cumbersome to collect, only 64 out of the 256 sampling sites were monitored. These monitoring sites were selected using the high-resolution data layers. Additional observations were related to the (i) soil, e.g., observation of soil pits; (ii) plant, e.g., plant water status, pruning weight of wood, and yield; and (iii) production, e.g., berry size, berry composition, and yield characteristics. The analysis of all these data layers led to an explanatory reasoning summarized as[35]:\n\n Hydromorphic soils and wetlands are well defined by the ECa information. 
Their presence is mainly explained by variations in elevation.\n Vine vegetative expression is too high (and harvest quality too low) on the zones at the highest elevations, characterized by light and deep soils (low ECa values).\n Vine vegetative expression is too weak on the zones at the lowest elevations, characterized by clay soils, which suffer from water logging after rainfall events (high ECa values).\nBased on this explanatory reasoning, the vineyard manager defined several decision rules to identify the situations in which the current management practices were sub-optimal regarding grape quality and quantity at harvest. An example of one of these rules was: If NDVI is high (>70) and ECa is low (<180 mS m\u22121) and elevation is high (>360 m), then the risk of having sub-optimal management practices is high.\nThis latter rule was modelled in GeoFIS to provide a map showing the risk of having sub-optimal management practices within the vineyard. First, the three data layers involved in the expert rule were transformed into risk maps using risk functions (Step 3.3). The parametrization of these risk functions was done with the vineyard manager. All the univariate risk maps were then combined into a final risk map using the OWA aggregator, which was again parameterized with the vineyard manager (see Section 2.3 Functionalities implemented in GeoFIS) (Step 3.3). Finally, a segmentation algorithm was applied to this last risk map to provide within-field risk zones (Step 3.1).\n\nApplication in GeoFIS \nFocusing on the computation of the risk functions and on the zoning of the resulting risk map, for each layer of information (ECa, NDVI, Elevation), risk functions can be defined within GeoFIS by implementing fuzzy rules as displayed in Figure 10. Here, a semi-trapezoidal function was used to model the risk of having sub-optimal practices by solely relying on the ECa layer. 
In this interface, the form of the risk function can be changed along with the associated fuzzy parameters, i.e., the kernel and support. Once the risk functions have been set for all the layers of interest, all the risks can be aggregated with respect to the aforementioned expert rule(s). This aggregation procedure can be performed through the interfaces displayed in Figure 11 where (i) the layers can be selected and the aggregation operator can be chosen (OWA aggregator here) and, (ii) the parameters associated to the OWA aggregator can be stated.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 10: Implementation of the risk function associated with the ECa information layer\n\n\n\n\n\n\n\n\n\n\n\n\n Figure 11: Parameterization of the Ordered Weighted Average (OWA) aggregator: (a) Selection of the layers to be aggregated; (b) setting of the OWA aggregator parameters. The weights for the minimum, medium and maximum values of univariate risk are respectively 0.7, 0.2, and 0.1.\n\n\n\nAfter the aggregation procedure has been run, practitioners end up with a continuous map of the global risk of having sub-optimal practices within the vineyard. To facilitate the interpretation of the map and the process of decision-making, the risk map can be zoned using the interface displayed in Figure 12. Before zoning, users must (i) define the boundary of the map, either by importing a predefined boundary or by using a default convex hull algorithm (that is proposed in GeoFIS) to generate a boundary and (ii) set the neighborhood associated to each spatial observation so that zones can be expanded using spatial neighbors. The zoning procedure can then be applied to the OWA risk map using the zoning algorithm implemented in GeoFIS.[18] Users can then display a risk map with a number of zones that they consider relevant.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 12: Delimitation of within-field yield zones of the risk of having sub-optimal management practices. 
(Map details described in Figure 13).\n\n\n\nResults and discussion \nThe map of the risk of arriving at sub-optimal management practices using a combination of available information and expert rules derived from local knowledge is displayed in Figure 13. This map shows five zones, three of which are relatively large, with specific risk levels. The highest risk area (dark red) is located on the western part of the vineyard and characterized by low ECa, high NDVI, and high elevation (Figure 13). In this part of the vineyard, it is likely that current management practices are not well adapted. Grape quality and quantity at harvest are not optimized in this area, and \u201cnitrogen applications should be avoided; water availability should be reduced by the introduction of a cover crop; and Regulated Deficit Irrigation strategies should held in order to moderate shoot growth and fertility.\u201d[35] In order to simplify the presentation of this example, only one rule has been taken into account. It would have been possible to introduce additional rules based on the work presented by Santesteban et al.[35]\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 13: Aggregated risk zones of sub-optimal management practices derived using the NDVI, ECa, and elevation layers together with local expert knowledge\n\n\n\nIt is interesting to note that the aggregation procedure through the OWA operator using the NDVI, ECa, and elevation layers (Figure 13) has resulted in a risk map that is different from that which would have been obtained by interpreting each layer of information independently (Figure 14). For instance, if only the ECa layer had been used to generate the risk map, the highest-risk area would have covered a much larger area of the vineyard.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 14: Maps of risk zones of sub-optimal management practices derived in the univariate space with variate-specific local expert rules. 
ECa (left); NDVI (middle); and Elevation (right)\n\n\n\nThis case study illustrates that the expertise of farm managers and advisors can be incorporated into a data-fusing algorithm to generate decision layers. Indeed, GeoFIS enables users to incorporate their own expertise, i.e., through the use of univariate risk functions\/fuzzy rules, into the generation of risk maps. The use of fuzzy rules to account for this expertise is of interest as it makes it possible to avoid abrupt changes in risk and generates a more gradual variation in potential risk (Figure 10). The GeoFIS interface enables users to calibrate the risk and aggregation functions empirically by offering users the ability to test a calibration, visualize the resulting risk maps, and possibly adjust it to their convenience. However, it must be stated that this will require farmers and advisors to be supported so that their expertise can be translated correctly into the data aggregation algorithms.\nThe calibration of the OWA index presented in this case study (weight of 0.7 for the minimum value of univariate risk, 0.2 for the median value, and 0.1 for the maximum value) resulted from an iterative calibration process led by the vineyard manager. This aggregation setting has strong similarities with the logical operation \u201cAND,\u201d i.e., the resulting risk is high if the minimum value of univariate risk is also high because it has the strongest weight. In other words, all the univariate risks are then high, because the median and maximum values of univariate risk are necessarily at least as high as the minimum value. Note that the real logical operation \u201cAND\u201d would be reproduced by changing the set of weights to (1;0;0). By changing these weights, practitioners might also be able to reproduce the logical operation \u201cOR\u201d (0;0;1) for which the resulting risk is high as soon as the maximum value of a univariate risk is high. 
It would also be possible to perform a simple average of the different univariate risks by using the same weights for each layer.\nFrom a more general perspective, GeoFIS simplifies the processing of the three layers of information, as the entire process was done within a single software platform. It can be compared to the data processing by Santesteban et al.[35] in which data were cleaned with Excel, interpolated with Vesper, analyzed with Matlab, and represented with ArcGIS.\n\nConclusions \nThe increasing flow of precision agriculture data requires the development of free and open-source processing software to manage and make use of these data and promote precision agriculture adoption. As such, GeoFIS has been specifically designed to facilitate the movement from spatial data to spatial information and to spatial decision making. The application of GeoFIS on some example case studies that agricultural professionals may face when dealing with spatial data has demonstrated the potential of this software. GeoFIS is a released product; however, it is important to state that all the functionalities currently implemented in GeoFIS are still areas of active investigation by the scientific community. GeoFIS will be updated when, and if, improved methodologies become available. It is one of the strengths of the GeoFIS platform that it is able to integrate the latest research developments to make sure that users are provided with the most up-to-date, reliable, and powerful processing algorithms.\nAs it is, GeoFIS is an excellent tool to promote teaching in precision agriculture. Indeed, GeoFIS has already been used within many higher education institutions in France to teach researchers and professionals how to process spatial data. 
The user-friendly interface effectively facilitates the understanding of some major precision agriculture concepts.\nThe analysis of the three case studies has been an opportunity to also evaluate the limits of the current algorithms and to propose areas for future development within the software. For instance, the data filtering procedure focuses solely on global outliers, while spatial datasets may contain outliers more deeply rooted within the data and sometimes referred to as spatial outliers. A second example is that the variography analysis is limited to single data layers, while cross-variography studies might be relevant to evaluate the spatial relationships between multiple layers of information. To foster the adoption of GeoFIS, the authors are more than open to collaboration and are ready to integrate relevant algorithms for processing precision agriculture data.\nAnother possibility to promote the processing of precision agriculture data would be to create links between GeoFIS and existing GIS programs such as QGIS, an open-source GIS already widely used by many communities working on spatial data. There is a possibility to integrate all the algorithms of GeoFIS directly within this open-source GIS software to benefit from the display and processing algorithms already implemented in QGIS. This would however require users to process their precision agriculture data in a more complex environment for which specific GIS skills are necessary. Another option is to transform GeoFIS into a web-based service, rather than its current download and desktop application structure, so that users would not have to care about the R installation, Java updates, and compatibility between different operating systems.\n\nAcknowledgements \nAuthor contributions \nJ.-L.L. and S.G. developed the GeoFIS software; B.T., J.T., O.N., H.J. and S.G. conceived and designed the experiments; J.L., C.L., and L.P. 
performed the experiments and analyzed the data; all the authors contributed to reagents\/materials\/analysis tools; C.L. organized the writing of the paper.\n\nFunding \nThis research received no external funding.\n\nConflicts of interest \nThe authors declare no conflict of interest.\n\nReferences \n\n\n1. Oliver, M.A., ed. (2010). Geostatistical Applications for Precision Agriculture. Springer. pp. 331. doi:10.1007\/978-90-481-9133-8. ISBN 9789048191321.\n\n2. Pringle, M.J.; McBratney, A.B.; Whelan, B.M.; Taylor, J.M. (2003). \"A preliminary approach to assessing the opportunity for site-specific crop management in a field, using yield monitor data\". Agricultural Systems 76 (1): 273\u201392. doi:10.1016\/S0308-521X(02)00005-7.\n\n3. Acevedo-Opazo, C.; Tisseyre, B.; Guillaume, S.; Ojeda, H. (2008). \"The potential of high spatial resolution information to define within-vineyard zones related to vine water status\". Precision Agriculture 9 (5): 285\u2013302. doi:10.1007\/s11119-008-9073-1.\n\n4. Bramley, R.G.V. (2005). \"Understanding variability in winegrape production systems 2. Within vineyard variation in quality over several vintages\". Australian Journal of Grape and Wine Research 11 (1): 33\u201342. doi:10.1111\/j.1755-0238.2005.tb00277.x.\n\n5. Verdugo-V\u00e1squez, N.; Acevedo-Opazo, C.; Vald\u00e9s-G\u00f3mez, H. et al. (2016). \"Spatial variability of phenology in two irrigated grapevine cultivar growing under semi-arid conditions\". Precision Agriculture 17 (2): 218\u201345. doi:10.1007\/s11119-015-9418-5.\n\n
6. Baluja, J.; Diago, M.P.; Goovaerts, P.; Tardaguila, J. (2012). \"Assessment of the spatial variability of anthocyanins in grapes using a fluorescence sensor: Relationships with vine vigour and yield\". Precision Agriculture 13 (4): 457\u201372. doi:10.1007\/s11119-012-9261-x.\n\n7. Debuisson, S.; Germain, C.; Garcia, O. et al. (2010). \"Using Multiplex And Greenseeker To Manage Spatial Variation Of Vine Vigor In Champagne\". Proceedings of the 10th International Conference on Precision Agriculture. https:\/\/www.ispag.org\/proceedings\/?action=abstract&id=197.\n\n8. Taylor, J.A.; Acevedo-Opazo, C.; Ojeda, H.; Tisseyre, B. (2010). \"Identification and significance of sources of spatial variation in grapevine water status\". Australian Journal of Grape and Wine Research 16 (1): 218\u201326. doi:10.1111\/j.1755-0238.2009.00066.x.\n\n9. Taylor, J.A.; McBratney, A.B.; Whelan, B.M. (2007). \"Establishing Management Classes for Broadacre Agricultural Production\". Agronomy Journal 99 (5): 1366\u201376. doi:10.2134\/agronj2007.0070.\n\n10. Jeong, J.S.; Garc\u00eda-Moruno, L.; Hern\u00e1ndez-Blanco, J. (2012). \"Integrating buildings into a rural landscape using a multi-criteria spatial decision analysis in GIS-enabled web environment\". Biosystems Engineering 112 (2): 82\u201392. doi:10.1016\/j.biosystemseng.2012.03.002.\n\n11. Yalew, S.G.; van Griensven, A.; van der Zaag, P. (2016). \"AgriSuit: A web-based GIS-MCDA framework for agricultural land suitability assessment\". Computers and Electronics in Agriculture 128 (10): 1\u20138. doi:10.1016\/j.compag.2016.08.008.\n\n
12. Leroux, C.; Jones, H.; Clenet, A. et al. (2018). \"A general method to filter out defective spatial observations from yield mapping datasets\". Precision Agriculture: 1\u201320. doi:10.1007\/s11119-017-9555-0.\n\n13. Sudduth, K.A.; Drummond, S.T. (2006). \"Yield Editor\". Agronomy Journal 99 (6): 1471\u201382. doi:10.2134\/agronj2006.0326.\n\n14. Hengl, T.; Heuvelink, G.B.M.; Stein, A. (2004). \"A generic framework for spatial prediction of soil variables based on regression-kriging\". Geoderma 120 (1\u20132): 75\u201393. doi:10.1016\/j.geoderma.2003.08.018.\n\n15. Oliver, M.A.; Webster, R. (2014). \"A tutorial guide to geostatistics: Computing and modelling variograms and kriging\". CATENA 113 (2): 56\u201369. doi:10.1016\/j.catena.2013.09.006.\n\n16. Robinson, T.P.; Metternicht, G. (2006). \"Testing the performance of spatial interpolation techniques for mapping soil properties\". Computers and Electronics in Agriculture 50 (2): 97\u2013108. doi:10.1016\/j.compag.2005.07.003.\n\n17. Cid-Garcia, N.M.; Albornoz, V.; Rios-Solis, Y.A.; Ortega, R. (2013). \"Rectangular shape management zone delineation using integer linear programming\". Computers and Electronics in Agriculture 93 (4): 1\u20139. doi:10.1016\/j.compag.2013.01.009.\n\n18. Pedroso, M.; Taylor, J.; Tisseyre, B. et al. (2010). \"A segmentation algorithm for the delineation of agricultural management zones\". Computers and Electronics in Agriculture 70 (1): 199\u2013208. doi:10.1016\/j.compag.2009.10.007.\n\n19. Blackmore, S.; Godwin, R.J.; Fountas, S. (2003). \"The Analysis of Spatial and Temporal Trends in Yield Map Data over Six Years\". Biosystems Engineering 84 (4): 455\u201366. doi:10.1016\/S1537-5110(03)00038-2.\n\n
20. Li, Y.; Shi, Z.; Li, F.; Li, H.-Y. (2007). \"Delineation of site-specific management zones using fuzzy clustering analysis in a coastal saline land\". Computers and Electronics in Agriculture 56 (2): 174\u201386. doi:10.1016\/j.compag.2007.01.013.\n\n21. Oliver, Y.M.; Robertson, M.J.; Wong, M.T.F. (2010). \"Integrating farmer knowledge, precision agriculture tools, and crop simulation modelling to evaluate management options for poor-performing patches in cropping fields\". European Journal of Agronomy 32 (1): 40\u201350. doi:10.1016\/j.eja.2009.05.002.\n\n22. Pichon, L.; Besqueut, G.; Tisseyre, B. (2017). \"A systemic approach to identify relevant information provided by UAV in precision viticulture\". Advances in Animal Biosciences 8 (2): 823\u20137. doi:10.1017\/S2040470017001194.\n\n23. Schenatto, K.; de Souza, E.G.; Bazzi, C.L. et al. (2017). \"Use of the farmer\u2019s experience variable in the generation of management zones\". Semina, Ci\u00eancias Agr\u00e1rias 38 (4): 2305\u201321. doi:10.5433\/1679-0359.2017v38n4Supl1p2305.\n\n24. Leroux, C.; Jones, H.; Clenet, A.; Tisseyre, B. (2017). \"A new approach for zoning irregularly-spaced, within-field data\". Computers and Electronics in Agriculture 141 (9): 196\u2013206. doi:10.1016\/j.compag.2017.07.025.\n\n25. Roudier, P.; Tisseyre, B.; Poilv\u00e9, H.; Roger, J.-M. (2008). \"Management zone delineation using a modified watershed algorithm\". Precision Agriculture 9: 233. doi:10.1007\/s11119-008-9067-z.\n\n
26. Whelan, B.M.; McBratney, A.B.; Minasny, B. (2001). \"Vesper\u2014Spatial prediction software for precision agriculture\". ECPA 2001, Proceedings of the 3rd European Conference on Precision Agriculture: 139\u201344. https:\/\/www.semanticscholar.org\/paper\/Vesper-%E2%80%93-Spatial-Prediction-Software-for-Precision-Whelan-Mcbratney\/52caaed8c82c943d760e3166e75d783c26d3dfe4.\n\n27. Sudduth, K.A.; Drummond, S.T.; Myers, D.B. (2012). \"Yield Editor 2.0: Software for Automated Removal of Yield Map Errors\". Proceedings of the 2012 ASABE Annual International Meeting: 1\u201314. http:\/\/extension.missouri.edu\/sare\/documents\/asabeyieldeditor2012.pdf.\n\n28. Simbahan, G.C.; Dobermann, A.; Ping, J.L. (2003). \"Screening Yield Monitor Data Improves Grain Yield Maps\". Agronomy Journal 96 (4): 1091\u2013102. doi:10.2134\/agronj2004.1091.\n\n29. Krishnan, P.; Sharma, R.K.; Dass, A. et al. (2016). \"Web-based crop model: Web InfoCrop \u2013 Wheat to simulate the growth and yield of wheat\". Computers and Electronics in Agriculture 127 (9): 324\u201335. doi:10.1016\/j.compag.2016.06.008.\n\n30. Guillaume, S.; Charnomordic, B.; Tisseyre, B.; Taylor, J. (2013). \"Soft computing-based decision support tools for spatial data\". International Journal of Computational Intelligence Systems 6 (Sup. 1): 18\u201333. doi:10.1080\/18756891.2013.818185.\n\n31. Tisseyre, B.; McBratney, A.B. (2008). \"A technical opportunity index based on mathematical morphology for site-specific management: An application to viticulture\". Precision Agriculture 9 (1\u20132): 101\u201313. doi:10.1007\/s11119-008-9053-5.\n\n
32. Guillaume, S.; Charnomordic, B.; Loisel, P. (2013). \"Fuzzy partitions: A way to integrate expert knowledge into distance calculations\". Information Sciences 245 (10): 76\u201395. doi:10.1016\/j.ins.2012.07.045.\n\n33. Yager, R.R. (1988). \"On ordered weighted averaging aggregation operators in multicriteria decisionmaking\". IEEE Transactions on Systems, Man, and Cybernetics 18 (1): 183\u201390. doi:10.1109\/21.87068.\n\n34. Lamour, J.; Naud, O.; Lechaudel, M.; Tisseyre, B. (2017). \"Mapping properties of an asynchronous crop: The example of time interval between flowering and maturity of banana\". Advances in Animal Biosciences 8 (2): 481\u20136. doi:10.1017\/S2040470017000449.\n\n35. Santesteban, L.G.; Guillaume, S.; Royo, J.B.; Tisseyre, B. (2013). \"Are precision agriculture tools and methods relevant at the whole-vineyard scale?\". Precision Agriculture 14 (1): 2\u201317. doi:10.1007\/s11119-012-9268-3. 
\n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to grammar, spelling, and presentation, including the addition of PMCID and DOI when they were missing from the original reference.\n\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\">https:\/\/www.limswiki.org\/index.php\/Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data<\/a>\n\t\t\t\t\tCategories: LIMSwiki journal articles (added in 2018)LIMSwiki journal articles (all)LIMSwiki journal articles on agricultureLIMSwiki journal articles on big dataLIMSwiki journal articles on software\n\nThis page was last modified on 10 July 2018, at 23:01.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n\n","c443b688b80703848e965b29dc3cba01_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_GeoFIS_An_open-source_decision_support_tool_for_precision_agriculture_data skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:GeoFIS: An open-source decision support tool for precision agriculture data<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p>The world we live in is an increasingly spatial and temporal data-rich environment, and the <a href=\"https:\/\/www.limswiki.org\/index.php\/Agriculture_industry\" title=\"Agriculture industry\" target=\"_blank\" class=\"wiki-link\" 
data-key=\"4882fd1b1f6fb6017adf6f0c0741eafc\">agriculture industry<\/a> is no exception. However, data needs to be processed in order to first get <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a> and then make informed management decisions. The concepts of \"precision agriculture\" and \"smart agriculture\" can and will be fully effective when methods and tools are available to practitioners to support this transformation. An open-source program called GeoFIS has been designed with this objective. It was designed to cover the whole process from spatial data to spatial information and decision support. The purpose of this paper is to evaluate the abilities of GeoFIS along with its embedded algorithms to address the main features required by farmers, advisors, or spatial analysts when dealing with precision agriculture data. Three case studies are investigated in the paper: (i) mapping of the spatial variability in the data, (ii) evaluation and cross-comparison of the opportunity for site-specific management in multiple fields, and (iii) delineation of within-field zones for variable-rate applications when these latter are considered opportune. These case studies were applied to three contrasting crop types: banana, wheat, and grapes. These were chosen to highlight the diversity of applications and data characteristics that might be handled with GeoFIS. For each case-study, up-to-date algorithms arising from research studies and implemented in GeoFIS were used to process these precision agriculture data. 
Areas for future development and possible relations with existing <a href=\"https:\/\/www.limswiki.org\/index.php\/Geographic_information_system\" title=\"Geographic information system\" target=\"_blank\" class=\"wiki-link\" data-key=\"8981ab93f8ebf0730c3b38949b39ad99\">geographic information systems<\/a> (GIS) software are also discussed.\n<\/p><p><b>Keywords<\/b>: decision making, GeoFIS, geostatistics, open-source software, precision agriculture, spatial analysis\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>Within-field variability is now a widely accepted and reported phenomenon by the precision agriculture community.<sup id=\"rdp-ebb-cite_ref-OliverGeo10_1-0\" class=\"reference\"><a href=\"#cite_note-OliverGeo10-1\" rel=\"external_link\">[1]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-PringleAPrelim03_2-0\" class=\"reference\"><a href=\"#cite_note-PringleAPrelim03-2\" rel=\"external_link\">[2]<\/a><\/sup> Geolocalized data are effectively collected intensively within the fields by sensors embedded on agricultural machinery, satellites, flying platforms, static stations, or humans among others, to make sure that this variability is considered and accounted for.<sup id=\"rdp-ebb-cite_ref-Acevedo-OpazoThePot08_3-0\" class=\"reference\"><a href=\"#cite_note-Acevedo-OpazoThePot08-3\" rel=\"external_link\">[3]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-BramleyUnder05_4-0\" class=\"reference\"><a href=\"#cite_note-BramleyUnder05-4\" rel=\"external_link\">[4]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-Verdugo-V.C3.A1squezSpatial16_5-0\" class=\"reference\"><a href=\"#cite_note-Verdugo-V.C3.A1squezSpatial16-5\" rel=\"external_link\">[5]<\/a><\/sup> Spatial data have particular characteristics that are worth careful consideration during analysis. 
First of all, their spatial resolution (density) is of interest as it defines the capacity to identify short- and long-scale spatial variability.<sup id=\"rdp-ebb-cite_ref-BalujaAss12_6-0\" class=\"reference\"><a href=\"#cite_note-BalujaAss12-6\" rel=\"external_link\">[6]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-DebuissonUsing10_7-0\" class=\"reference\"><a href=\"#cite_note-DebuissonUsing10-7\" rel=\"external_link\">[7]<\/a><\/sup> Spatial records are often associated with a high level of noise that arises from multiple sources, such as plant-to-plant variability, the accuracy of the sensor, or the conditions of data acquisition.<sup id=\"rdp-ebb-cite_ref-TaylorIdent10_8-0\" class=\"reference\"><a href=\"#cite_note-TaylorIdent10-8\" rel=\"external_link\">[8]<\/a><\/sup> Except for images in which data are regularly distributed on a grid of pixels, many spatial observations collected in agriculture are irregular and do not follow a fixed pattern within the fields.<sup id=\"rdp-ebb-cite_ref-TaylorEstab07_9-0\" class=\"reference\"><a href=\"#cite_note-TaylorEstab07-9\" rel=\"external_link\">[9]<\/a><\/sup> This feature is of great concern because many image processing algorithms cannot be directly used on these irregular data.\n<\/p><p>To benefit from this increasing flow of data, users should be provided with software or tools that allow them to:\n<\/p>\n<ol><li><a href=\"https:\/\/www.limswiki.org\/index.php\/Data_visualization\" title=\"Data visualization\" target=\"_blank\" class=\"wiki-link\" data-key=\"4a3b86cba74bc7bb7471aa3fc2fcccc3\">visualize the data<\/a> they have collected (simple or low-level functions),<\/li>\n<li>process these data (advanced or high-level functions), and<\/li>\n<li>incorporate the knowledge they have about these data into the data processing.<\/li><\/ol>\n<p>It is acknowledged that basic visualization tools\u2014e.g., data import, georeferencing, data display\u2014are available in many general (e.g., Quantum Geographic Information 
System (QGIS), gvSIG, Google Earth, Whitebox Geospatial Analysis Tools) and more specific<sup id=\"rdp-ebb-cite_ref-JeongInteg12_10-0\" class=\"reference\"><a href=\"#cite_note-JeongInteg12-10\" rel=\"external_link\">[10]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-YalewAgri16_11-0\" class=\"reference\"><a href=\"#cite_note-YalewAgri16-11\" rel=\"external_link\">[11]<\/a><\/sup> open-source platforms, including those not specific to agricultural applications. It is clear that such functionalities are of major importance for handling spatial data. However, when it comes to making informed management decisions, these visualization functions are not sufficient. It is necessary to provide users with more advanced or high-level functions so that they can turn these raw spatial data into information and decision layers. The most commonly required procedures in the precision agriculture domain are functions such as:\n<\/p>\n<ol><li>filtering, to ensure the quality of the datasets<sup id=\"rdp-ebb-cite_ref-LerouxAGen18_12-0\" class=\"reference\"><a href=\"#cite_note-LerouxAGen18-12\" rel=\"external_link\">[12]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-SudduthYield06_13-0\" class=\"reference\"><a href=\"#cite_note-SudduthYield06-13\" rel=\"external_link\">[13]<\/a><\/sup>,<\/li>\n<li>interpolation, to provide a continuous mapping of the property of interest<sup id=\"rdp-ebb-cite_ref-HenglAGen04_14-0\" class=\"reference\"><a href=\"#cite_note-HenglAGen04-14\" rel=\"external_link\">[14]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-OliverATut14_15-0\" class=\"reference\"><a href=\"#cite_note-OliverATut14-15\" rel=\"external_link\">[15]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-RobinsonTesting06_16-0\" class=\"reference\"><a href=\"#cite_note-RobinsonTesting06-16\" rel=\"external_link\">[16]<\/a><\/sup>,<\/li>\n<li>zoning, to define within-field zones for site-specific management<sup id=\"rdp-ebb-cite_ref-Cid-GarciaRect13_17-0\" class=\"reference\"><a href=\"#cite_note-Cid-GarciaRect13-17\" 
rel=\"external_link\">[17]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-PedrosoASeg10_18-0\" class=\"reference\"><a href=\"#cite_note-PedrosoASeg10-18\" rel=\"external_link\">[18]<\/a><\/sup>, or<\/li>\n<li>aggregation so that multiple layers of information can be combined.<sup id=\"rdp-ebb-cite_ref-BlackmoreTheAnal03_19-0\" class=\"reference\"><a href=\"#cite_note-BlackmoreTheAnal03-19\" rel=\"external_link\">[19]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-YanDeline07_20-0\" class=\"reference\"><a href=\"#cite_note-YanDeline07-20\" rel=\"external_link\">[20]<\/a><\/sup><\/li><\/ol>\n<p>To foster the adoption of such tools, all the aforementioned functions have to be specifically dedicated to the processing of agricultural data from potentially very different production systems. This is an important consideration as these data come with a lot of associated knowledge that has to be considered when processing them. More specifically, significant local expertise to support decision making might be available as users, e.g., farmers, advisors and\/or technicians, have normally been scouting the fields throughout the growing season.<sup id=\"rdp-ebb-cite_ref-OliverInteg10_21-0\" class=\"reference\"><a href=\"#cite_note-OliverInteg10-21\" rel=\"external_link\">[21]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-PichonASystem17_22-0\" class=\"reference\"><a href=\"#cite_note-PichonASystem17-22\" rel=\"external_link\">[22]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-SchenattoUseOf17_23-0\" class=\"reference\"><a href=\"#cite_note-SchenattoUseOf17-23\" rel=\"external_link\">[23]<\/a><\/sup> Site-specific management also requires the use of agricultural machinery with specific characteristics that have to be considered in these processing functions. 
This is to ensure that planned differential management is in accordance with the practical and operational limitations of machinery, e.g., working width, lag time, and application speed.<sup id=\"rdp-ebb-cite_ref-LerouxANew17_24-0\" class=\"reference\"><a href=\"#cite_note-LerouxANew17-24\" rel=\"external_link\">[24]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-RoudierManage08_25-0\" class=\"reference\"><a href=\"#cite_note-RoudierManage08-25\" rel=\"external_link\">[25]<\/a><\/sup>\n<\/p><p>From a general perspective, there are only a few dedicated software programs available to explicitly process precision agriculture data and incorporate expert knowledge into the process. Moreover, very few of them are open-source. Some freeware and shareware tools have been developed and proposed by the precision agriculture community, but these generally focus on specific processing tasks or on a particular type of data. For example, the Vesper program<sup id=\"rdp-ebb-cite_ref-WhelanVesper01_26-0\" class=\"reference\"><a href=\"#cite_note-WhelanVesper01-26\" rel=\"external_link\">[26]<\/a><\/sup>, developed by the University of Sydney, provides users with a graphical interface to spatially interpolate their data. Despite the quite advanced functions that are available, e.g., local punctual and block kriging, users only end up with a continuous map of their data without much more practical information. 
The Yield Editor software from the United States Department of Agriculture<sup id=\"rdp-ebb-cite_ref-SudduthYield06_13-1\" class=\"reference\"><a href=\"#cite_note-SudduthYield06-13\" rel=\"external_link\">[13]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-SudduthYield12_27-0\" class=\"reference\"><a href=\"#cite_note-SudduthYield12-27\" rel=\"external_link\">[27]<\/a><\/sup> deals effectively with the filtering of within-field yield datasets that are known to contain many defective observations<sup id=\"rdp-ebb-cite_ref-SimbahanScreen03_28-0\" class=\"reference\"><a href=\"#cite_note-SimbahanScreen03-28\" rel=\"external_link\">[28]<\/a><\/sup>, but it does not perform interpolation or other high-level functions. Another interesting example is a QGIS plugin that was put into place to process spatial data of vine shoot diameter arising from the mounted sensor Physiocap (E.RE.C.A, Vaulx-en-Velin, France). This tool mainly incorporates functions to filter these highly noisy datasets. Other platforms have been proposed by agronomists to give farmers access to crop models, but they are very specific in terms of crop, data, and use.<sup id=\"rdp-ebb-cite_ref-KrishnanWeb16_29-0\" class=\"reference\"><a href=\"#cite_note-KrishnanWeb16-29\" rel=\"external_link\">[29]<\/a><\/sup> An open-source platform that takes raw data through to a decision point is not yet available to the precision agriculture community.\n<\/p><p>The aim of this paper is to present the GeoFIS software (<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.geofis.org\/\" target=\"_blank\">https:\/\/www.geofis.org\/<\/a>), developed by a joint team from IRSTEA, INRA, and Montpellier SupAgro in France.<sup id=\"rdp-ebb-cite_ref-GuillaumeSoft13_30-0\" class=\"reference\"><a href=\"#cite_note-GuillaumeSoft13-30\" rel=\"external_link\">[30]<\/a><\/sup> The goal of this platform is to provide users with up-to-date and reliable algorithms to process their precision agriculture data and incorporate 
expert knowledge from the fields. GeoFIS has been mainly developed for academic and research purposes, i.e., investigators and students wishing to process their data, but also to a lesser extent for agronomists and advisors with a sufficient background in spatial analysis. The objective of this interface-based platform is to support users who do not necessarily have programming skills and to show that high-level functions can be introduced in a GIS and could be integrated within precision agriculture programs. The first section introduces this open-source tool along with its architecture, design, interface, and main processing functions. Three different case studies on various crops are then considered to evaluate the ability of this software to answer most of the issues that are faced by the agricultural sector for processing their spatial data. The last section highlights the need for future developments to promote precision agriculture adoption and the possibility to create connections with existing GIS software programs.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"The_GeoFIS_software\">The GeoFIS software<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Aim_of_the_GeoFIS_project\">Aim of the GeoFIS project<\/span><\/h3>\n<p>GeoFIS has been designed to facilitate the movement from spatial data to spatial information, and to spatial decision making. It is an open-source program that offers a simple and easy-to-use interface to build decision support systems (DSS) from spatial data.<sup id=\"rdp-ebb-cite_ref-GuillaumeSoft13_30-1\" class=\"reference\"><a href=\"#cite_note-GuillaumeSoft13-30\" rel=\"external_link\">[30]<\/a><\/sup> While its development has been inspired by agri-environmental applications, the framework itself is open and accessible to applications in other domains. 
It is designed to be adaptable to different usages and for different end users, mostly for academic and research applications, for student and teaching applications, and, to a lesser extent, for GIS-skilled agronomists and advisors.\n<\/p><p>GeoFIS differs from other GIS software, e.g., QGIS, in the sense that specific tools have been implemented to answer the main expectations of agricultural professionals when it comes to processing precision agriculture data. These will be presented later on. It is acknowledged that multiple other open-source spatial programs (e.g., QGIS) or languages (e.g., R and Python) are available to process spatial and temporal data. However, these open-source tools do not have specific functions dedicated to the processing of precision agriculture data (as listed in the introduction section) and usually require users to have skills in programming. This is a major limiting factor for the practical use of spatial modelling in agriculture. Another strength of GeoFIS is that attention has been paid to the incorporation of expert knowledge into data analysis. This is not available in other related spatial processing tools. Agricultural professionals have significant local expert knowledge on their production system that needs to be taken into account. By incorporating this qualitative expert knowledge, the quality of the processing should be improved and the adoption of precision agriculture technologies should be enhanced.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Architecture_and_design_of_GeoFIS\">Architecture and design of GeoFIS<\/span><\/h3>\n<p>In the proposed GeoFIS architecture, all the open-source toolboxes and libraries have been selected for their ability to handle spatial data and to incorporate expert knowledge (Figure 1). 
Statistical and geostatistical functions dedicated to precision agriculture data (see next subsection) are implemented in R (<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.r-project.org\" target=\"_blank\">https:\/\/www.r-project.org<\/a>). Outside these specific functions, spatial data are handled through two open-source libraries, i.e., Geotools (<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.geotools.org\" target=\"_blank\">http:\/\/www.geotools.org<\/a>) and CGAL (Computational Geometry Algorithms Library, <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.cgal.org\" target=\"_blank\">https:\/\/www.cgal.org<\/a>). Geotools is used because its Java implementation allows the design of user-friendly interfaces. CGAL was chosen for its ability to provide very efficient and reliable geometric algorithms, as its functions are developed in C++. Finally, the incorporation of expert knowledge is made possible with FisPro (<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.fispro.org\" target=\"_blank\">https:\/\/www.fispro.org<\/a>), a system that uses fuzzy sets for conceptual modeling.<sup id=\"rdp-ebb-cite_ref-GuillaumeSoft13_30-2\" class=\"reference\"><a href=\"#cite_note-GuillaumeSoft13-30\" rel=\"external_link\">[30]<\/a><\/sup>\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Leroux_Agri2018_8-6.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"50ee0b33580d3c923669e03026bc19b3\"><img alt=\"Fig1 Leroux Agri2018 8-6.jpg\" src=\"https:\/\/www.limswiki.org\/images\/2\/28\/Fig1_Leroux_Agri2018_8-6.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 
1:<\/b> The GeoFIS architecture<sup id=\"rdp-ebb-cite_ref-GuillaumeSoft13_30-3\" class=\"reference\"><a href=\"#cite_note-GuillaumeSoft13-30\" rel=\"external_link\">[30]<\/a><\/sup>. CGAL, Computational Geometry Algorithms Library; DSS, Decision Support Systems; GIS, Geographic Information System; 1D, One dimension<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>GeoFIS is available in four languages (French, English, Spanish, and Portuguese). The interface is designed with a man-machine cooperation objective. The goal is to facilitate the relationships between data, learning algorithms, and expert knowledge. Documentation, scientific papers, and video tutorials are available to better understand the implemented functions and to facilitate the adoption of the GeoFIS software (<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.geofis.org\/\" target=\"_blank\">https:\/\/www.geofis.org\/<\/a>). Notifications are made when a new version of the software is available.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Functionalities_implemented_in_GeoFIS\">Functionalities implemented in GeoFIS<\/span><\/h3>\n<p>GeoFIS contains a series of low- and high-level non-spatial and spatial functionalities to interrogate spatial data. The general functionalities are introduced here and then expanded in several case studies in the following section. Figure 2 shows the generic flow required in precision agriculture, from raw data processing to decision making, with the functionalities within GeoFIS at each stage indicated. In agricultural systems, data are available in different formats (points, polygons, rasters) and at different scales. The quality of the data is also variable, with some sensors being inherently noisy and others less so. Different data need potentially different approaches to (i) data validation and clean-up (quality control), (ii) data display (visualization), and, when necessary, (iii) interpolation. 
These steps transform data into information layers. Within GeoFIS, data can be easily imported (Step 0) and displayed as a map (in its geographical space) and as a histogram (in its attribute space). This allows the user to \"expertly\" identify global outliers in both the geographical and attribute space and remove any erroneous data (Step 1). Interpolation is possible using inverse distance weighting (for small data sets) and via punctual kriging with a global variogram for larger data sets (&gt;100 points). The kriging method includes the ability to plot the experimental variogram and specify a theoretical variogram, which is then passed to the kriging function. Interpolated outputs can be directly displayed as rasters within the display (Step 2).\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig2_Leroux_Agri2018_8-6.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"e9b971200a86f9942b00ff40e5960d90\"><img alt=\"Fig2 Leroux Agri2018 8-6.jpg\" src=\"https:\/\/www.limswiki.org\/images\/0\/0d\/Fig2_Leroux_Agri2018_8-6.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 2:<\/b> Generic flow of data in precision agriculture with main processing steps from raw data processing to decision-making.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>\"Precision agriculture\" or \"smart agriculture\" is only effective when effective decisions are made. End users can transform these information layers into decision layers to improve the management of their fields. Three main functionalities for management (practical) applications have been incorporated within GeoFIS to address this. 
Firstly, practitioners are provided with a method to delineate within-field homogeneous zones (Step 3.1). Zoning is of importance for precision agriculture data, as the identified zones will (i) facilitate spatial data visualization and interpretation and (ii) provide a spatial resolution that is practical and effective for many differential field operations. GeoFIS uses a segmentation algorithm to \"zone\" data layers.<sup id=\"rdp-ebb-cite_ref-PedrosoASeg10_18-1\" class=\"reference\"><a href=\"#cite_note-PedrosoASeg10-18\" rel=\"external_link\">[18]<\/a><\/sup> The segmentation algorithm operates either on irregular or gridded (interpolated) data to generate potential management zones.\n<\/p><p>Secondly, while data\/information collection tends to be focused around production issues, there is no restriction on its use. It can equally be used for strategic as well as tactical decision making. The example of the technical opportunity index (TOI)<sup id=\"rdp-ebb-cite_ref-TisseyreATech08_31-0\" class=\"reference\"><a href=\"#cite_note-TisseyreATech08-31\" rel=\"external_link\">[31]<\/a><\/sup>, which is implemented in GeoFIS, is a case in point. The TOI uses the production data to assess a field\u2019s suitability for site-specific management given machinery constraints and the observed production variation (Step 3.2). The algorithm processes the within-field data with a mathematical morphological filter based on erosion and dilation.<sup id=\"rdp-ebb-cite_ref-TisseyreATech08_31-1\" class=\"reference\"><a href=\"#cite_note-TisseyreATech08-31\" rel=\"external_link\">[31]<\/a><\/sup> This filter allows end users to account for the passes of the agricultural machinery in the field and especially the minimum area (kernel) within which it can operate reliably. 
As the algorithm requires the data to be organized regularly on a grid, interpolating the data might therefore be required as a pre-processing step (Step 2).\n<\/p><p>Finally, in the majority of cases, practical agronomic decisions are multivariate in nature. Decision support therefore requires dedicated data fusion methods to merge multiple information layers into a single decision layer (Step 3.3). For instance, when available, historical yield data (high spatial resolution point information), as-applied historical fertilizer maps (polygon data), recent point soil testing (low spatial resolution point data), and early season satellite imagery (high resolution raster) should collectively feed into a decision on mid-season spatial fertilizer inputs, i.e., a prescription fertilizer map (normally a polygon layer). In the previous example, the prescription fertilization map (the decision layer) is based on a set of inputs (information layers) that are all related through expert rules. An example of a possible expert rule could be that if, at a given location in space, the observed yield is high and the soil fertilizer level is low, then it might be relevant to apply more fertilizer inputs. Within GeoFIS, the goal of the data aggregation process is to implement the expert rules so that the final spatial decision layer (that answers the question \"how much fertilizer input should be applied at this particular place at this particular time?\") can be obtained. Expert rules are implemented one at a time, as each rule leads to a practical agronomic decision.\n<\/p><p>Data aggregation in GeoFIS is a two-step process. First, each information layer is transformed into an expert layer, i.e., the numerical agronomic values in each information layer are transformed into degree values (from 0 to 1) according to the expert rule to be implemented. 
The transformation from an information layer to an expert layer is done using a fuzzy set-based function.<sup id=\"rdp-ebb-cite_ref-GuillaumeFuzzy13_32-0\" class=\"reference\"><a href=\"#cite_note-GuillaumeFuzzy13-32\" rel=\"external_link\">[32]<\/a><\/sup> Secondly, all the expert layers are combined using an aggregation operator to respect the expert rules. Two aggregation operators are currently implemented in GeoFIS. The first operator is the Weighted Arithmetic Mean (WAM), which attributes a weight to each information source, e.g., the yield information layer may be given twice as much weight as the soil fertilizer level layer. The second operator is the Ordered Weighted Average (OWA)<sup id=\"rdp-ebb-cite_ref-YagerOnOrdered88_33-0\" class=\"reference\"><a href=\"#cite_note-YagerOnOrdered88-33\" rel=\"external_link\">[33]<\/a><\/sup>, where the weighting is slightly more complex. For a given location in space, the degree values associated with the layers involved in the expert rule are ordered, and the weight applied to each degree depends on its position in this ordering rather than on the layer it comes from. This operator is of interest as it enables the implementation of logical operations, such as:\n<\/p>\n<ul><li> \"OR,\" where the expert rule applies as soon as at least one of the degrees associated with the layers is high (i.e., the highest degree is high), and<\/li>\n<li> \"AND,\" where the expert rule applies only when all of the degrees associated with the layers are high (i.e., the lowest degree is high).<\/li><\/ul>\n<p>The result of the aggregation process is a single decision layer. The uniqueness of the GeoFIS approach is in its ability to incorporate the expert knowledge developed by farmers and advisors on the data and their fields directly into the data fusion process. 
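<\/p><p>The two-step aggregation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GeoFIS code: the piecewise-linear membership function, its thresholds, the input values, and the function names are all hypothetical. In Yager's formulation of the OWA, putting all the weight on the largest degree returns the maximum (an OR-like behavior), and putting it on the smallest degree returns the minimum (an AND-like behavior).<\/p>

```python
def ramp_up(x, low, high):
    '''Piecewise-linear membership: 0 at or below low, 1 at or above high.'''
    return max(0.0, min(1.0, (x - low) / (high - low)))


def wam(degrees, weights):
    '''Weighted arithmetic mean; weight i is attached to layer i.'''
    return sum(w * d for w, d in zip(weights, degrees)) / sum(weights)


def owa(degrees, weights):
    '''Ordered weighted average; weight i is attached to the i-th largest degree.'''
    ordered = sorted(degrees, reverse=True)
    return sum(w * d for w, d in zip(weights, ordered)) / sum(weights)


# Hypothetical values at one location (thresholds are illustrative only).
yield_is_high = ramp_up(9.0, 6.0, 10.0)        # 0.75
soil_is_low = 1.0 - ramp_up(40.0, 10.0, 50.0)  # 0.25
degrees = [yield_is_high, soil_is_low]

decision_wam = wam(degrees, [2.0, 1.0])  # yield counts twice as much
decision_or = owa(degrees, [1.0, 0.0])   # all weight on the largest degree
decision_and = owa(degrees, [0.0, 1.0])  # all weight on the smallest degree
```

<p>Applying the same functions to every collocated location would yield one decision value per location, i.e., a decision layer.<\/p><p>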
The implemented data aggregation methods require the data to be collocated, either on irregular or regular grids.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Case_studies\">Case studies<\/span><\/h2>\n<p>The previous section introduced the GeoFIS framework, including the functionalities implemented and how they could be adapted to the individual needs of each end user (who will have their own unique constraints on management). The following subsections provide more detailed illustrations of the main processing steps in the context of precision agriculture applications. More specifically, the three cases deal with the typical tasks that advisors and farmers may face in their daily work:\n<\/p>\n<ol><li>the mapping of spatial data (Steps 0, 1 and 2),<\/li>\n<li>the evaluation and cross-comparison of the opportunity for site-specific management in their fields (Step 3.2), and<\/li>\n<li>the delineation of within-field zones for variable-rate applications where zoning is considered opportune (Steps 3.1 and 3.3).<\/li><\/ol>\n<p>Steps 0 to 2 will be exemplified through medium spatial resolution manual measurements performed over a banana field to map the plant vigor. High resolution yield data across several wheat fields will be used to illustrate the value of Step 3.2 to rank the fields from the most to the least suitable for site-specific management. Steps 3.1 and 3.3 will be applied to a precision viticulture example aimed at defining zones for differential irrigation management. The overall objective is to demonstrate how GeoFIS can address the main issues of data processing in precision agriculture. 
As the three case studies are performed on different crops (banana, wheat, and grapes), each exhibiting unique characteristics, the applicability and genericity of this open-source software will also be demonstrated.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Case_study_1\">Case study 1<\/span><\/h3>\n<h4><span class=\"mw-headline\" id=\"Rationale_and_description\">Rationale and description<\/span><\/h4>\n<p><b>Mapping the spatial organization in the data\u2014An example of the vegetative response of an asynchronous plant, the banana<\/b>\n<\/p><p>Variography and mapping are two very important processing steps in the precision agriculture domain. The former helps evaluate the spatial structure in the data by quantifying the proportions of (i) spatially-structured variability or large-scale variations and (ii) spatially unstructured variability or small-scale variations within the field. The latter is mainly used to display the observed spatial variability correctly and to facilitate decision making.\n<\/p><p>In this case study, GeoFIS was used to investigate and map the spatial variability in the pseudostem (trunk) circumference of banana crops. The proposed analysis was carried out on this crop for two major reasons. 
First of all, the spatial variability in the agronomic properties of banana crops has been poorly reported in the literature.<sup id=\"rdp-ebb-cite_ref-LamourMapping17_34-0\" class=\"reference\"><a href=\"#cite_note-LamourMapping17-34\" rel=\"external_link\">[34]<\/a><\/sup> Secondly, this crop is known to be asynchronous in its production cycle, which means that spatial analyses are to be handled differently from what is commonly done in annual crops, e.g., wheat, canola, or perennial ones, e.g., grapes.<sup id=\"rdp-ebb-cite_ref-LamourMapping17_34-1\" class=\"reference\"><a href=\"#cite_note-LamourMapping17-34\" rel=\"external_link\">[34]<\/a><\/sup> The proposed analysis (i) estimates the proportion of spatially-structured variability in pseudostem circumferences, i.e., the proportion of variance that is mainly due to spatially-structured environmental properties<sup id=\"rdp-ebb-cite_ref-OliverATut14_15-1\" class=\"reference\"><a href=\"#cite_note-OliverATut14-15\" rel=\"external_link\">[15]<\/a><\/sup>; (ii) determines the proportion of spatially unstructured variability that is due to non-spatially structured phenomena, e.g., the inter-plant variability, plant competition, replanting, and measurement accuracy among others; and (iii) maps the overall within-field variability of trunk circumference in the plantation.\n<\/p><p>The plot under study is situated in a commercial banana plantation in Njombe, Cameroon (WGS84: E: 4.612, N: 9.639) in its fifteenth flowering cycle. The pseudostem circumference measurements were only taken on plants where vegetative growth had ceased, i.e., plants that were either flowering or at a later phenological stage. There were 551 measurements taken using a tape measure at 1-m height and georeferenced with a trail type hand-held GPS (Table 1). 
The proposed analysis in GeoFIS consisted of the following steps: (i) the dataset was imported within GeoFIS (Step 0), (ii) pseudostem circumference values were filtered to ensure the quality of the dataset (Step 1), and (iii) variograms were fitted to the filtered datasets and interpolation was performed using kriging with a local neighborhood onto a 1\u00d71 meter grid.\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"5\"><b>Table 1.<\/b> Description of the plot under investigation\n<\/td><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\" rowspan=\"2\">Surface (ha)\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\" rowspan=\"2\">Total Number of Plant Observations\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\" rowspan=\"2\">Number of Plants that Have Reached at Least the Flowering Stage\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\" colspan=\"2\">Trunk Circumference (cm)\n<\/th><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\">Mean\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Variance\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.85\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">1287\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">551\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">74.7\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">69.7\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h4><span class=\"mw-headline\" id=\"Application_in_GeoFIS\">Application in GeoFIS<\/span><\/h4>\n<p>The global distribution of the data was filtered within GeoFIS (Figure 3). 
Users can select the attribute to be filtered at the top of the window. Below the histogram, two threshold values that represent the two tails of the distribution can be changed, by either typing specific values or moving a slide bar. Observations outside these thresholds are then removed from the dataset. Note that there were two low values in this data set that were considered outside the normal distribution by the user (Figure 3). The lower threshold allowed the user to eliminate these non-compliant values.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig3_Leroux_Agri2018_8-6.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"dd17483b485205007e0732339a97bd25\"><img alt=\"Fig3 Leroux Agri2018 8-6.png\" src=\"https:\/\/www.limswiki.org\/images\/5\/59\/Fig3_Leroux_Agri2018_8-6.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 3:<\/b> Filtering of the pseudostem circumference values based on distribution of response in the attribute space<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>The spatial structure of the data can then be evaluated by plotting an experimental variogram, here using the within-field pseudostem circumferences. The number of lags and the maximum lag distance can be set in the left-hand corner of the window to make sure that the variogram is relevant. The interface (Figure 4) enables the user to specify and fit a theoretical variogram model to the experimental variogram. A theoretical variogram is automatically fitted, after which users can interactively change the values of the variogram parameters, i.e., nugget, partial sill, and range to improve the fit. 
The quality of the fit can be assessed with the root mean square error (RMSE) value that is detailed in the top right-hand corner of the interface. The theoretical model can then be saved and used later to perform interpolation by kriging.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig4_Leroux_Agri2018_8-6.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"d7162d4c52762f51eabc6a47e51fa5d8\"><img alt=\"Fig4 Leroux Agri2018 8-6.png\" src=\"https:\/\/www.limswiki.org\/images\/4\/42\/Fig4_Leroux_Agri2018_8-6.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 4:<\/b> Screenshot from GeoFIS illustrating the calculation of the experimental variogram and the fitting of a theoretical variogram model to the within-field pseudostem circumference spatial data<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h4><span class=\"mw-headline\" id=\"Results_and_discussion\">Results and discussion<\/span><\/h4>\n<p>The spatial locations of the measurements are displayed in Figure 5. It clearly shows that the spatial observations are irregularly-spaced within the plot. This aspect can be simply explained by the fact that not all the banana plants had reached the flowering phenological stage (only 551 out of the 1287 plants had). In the plot under study, the pseudostem circumference exhibits a quite strong spatial autocorrelation, the ratio of autocorrelated variance being close to 55% (Table 2). This finding demonstrates that spatially-structured environmental properties, e.g., soil physical and chemical characteristics, are likely in this case to exert a relatively strong influence on the pseudostem circumference of the banana plants. 
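<\/p><p>The ratio of autocorrelated variance can be reproduced from the nugget and partial sill reported in Table 2, as in the short Python sketch below. The spherical model form and the 30 m range are assumptions made for illustration only (the text does not state which model or range was fitted), and the function names are hypothetical.<\/p>

```python
def spherical(h, nugget, partial_sill, rng):
    '''Spherical variogram model; zero at lag 0, flat at the sill beyond the range.'''
    if h == 0:
        return 0.0
    t = min(h / rng, 1.0)
    return nugget + partial_sill * (1.5 * t - 0.5 * t ** 3)


def rmse(observed, predicted):
    '''Root mean square error between two equally long sequences.'''
    n = len(observed)
    return (sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n) ** 0.5


nugget, partial_sill = 35.2, 43.4  # C0 and C1 from Table 2
sill = nugget + partial_sill
structured_ratio = partial_sill / sill  # C1 over (C0 + C1)

hypothetical_range = 30.0  # metres; assumed, not reported in the text
lags = [5.0, 10.0, 20.0, 30.0, 40.0]
model = [spherical(h, nugget, partial_sill, hypothetical_range) for h in lags]
```

<p>With these values, the structured share of the sill is about 0.55. The <code>rmse<\/code> helper could be applied to experimental variogram values and the corresponding model values to score a candidate fit.<\/p><p>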
The determination of the factors affecting the pseudostem circumference is beyond the scope of this study. Further analyses of, e.g., soil and plant records, might help to answer this question.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig5_Leroux_Agri2018_8-6.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"5ddb22f54d4ef900f25f8f018348a7de\"><img alt=\"Fig5 Leroux Agri2018 8-6.jpg\" src=\"https:\/\/www.limswiki.org\/images\/a\/a0\/Fig5_Leroux_Agri2018_8-6.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 5:<\/b> Spatial measurements of pseudostem circumference divided into five quantiles within the plot under study<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p><br \/>\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"4\"><b>Table 2.<\/b> Spatial statistics of pseudostem circumference in the plot under investigation\n<\/td><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\">Nugget Variance (C<sub>0<\/sub>)\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Partial-Sill Variance (C<sub>1<\/sub>)\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Sill Variance (C<sub>0<\/sub> + C<sub>1<\/sub>)\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Ratio of Autocorrelated Variance (C<sub>1<\/sub>\/(C<sub>0<\/sub> + C<sub>1<\/sub>))\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">35.2\n<\/td>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">43.4\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">78.6\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">55.2%\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Table 2 also shows that the proportion of spatially unstructured variability (C<sub>0<\/sub>) is not negligible. In this case study, it can be mainly explained by (i) the inherent inter-plant variability, which might be exacerbated by competition among neighbors, and (ii) the accuracy of the measurements, which might be affected by Global Navigation Satellite Systems (GNSS) accuracy issues or operator errors.\n<\/p><p>Figure 6 provides a surface (map) of the within-field pseudostem circumference after interpolation (ordinary kriging). This smooths the data in Figure 5 using information on spatial variability contained in the same data. The circumferences appear to be much lower (less than 70 cm) in the northeastern and southern portions of the plot. The larger pseudostems, those for which the circumference exceeded 87 cm, can be mainly found in the northern part of the field. Some local effects, e.g., small sites of low circumference surrounded by high pseudostem circumferences, are also visible on the map. Those might be explained by several phenomena having a localized effect on plants, such as pest damage or replanting. It is worth recalling that this final map is not a map of circumferences of all pseudostems; rather, it is a map of potential circumference at flowering, as not all the banana plants have reached the flowering stage. This map is an alternative representation of the information displayed in Figure 5 and provides predictions for plants that were not measured in the original survey. 
As with Figure 5, this map may be very useful in locating sampling sites to perform further soil and\/or plant analyses and to better characterize the within-field pseudostem circumference variability. Its advantage over the raw data plot (Figure 5) is that the main patterns in the field are easier for the human eye to interpret.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig6_Leroux_Agri2018_8-6.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"277c61f1832b17dbd0a801bc7ba3a3f9\"><img alt=\"Fig6 Leroux Agri2018 8-6.jpg\" src=\"https:\/\/www.limswiki.org\/images\/c\/cf\/Fig6_Leroux_Agri2018_8-6.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 6:<\/b> Kriged map of the potential pseudostem circumference within the field under study. The map represents a potential rather than an exhaustive analysis of plants because not all the plants have reached the flowering stage.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>GeoFIS proved to be a relevant tool for modeling the spatial variability in the banana pseudostem circumference data and for continuous mapping of this property of interest. However, a few limitations are worth discussing. Firstly, even if the filtering interface is user-friendly, it only provides a global filtering of the data. Only the tails of the distribution can be trimmed. Spatial data may, however, exhibit not only global but also local outliers. This was not a problem here, but removing local outliers would be a useful function in the software program. When present, local outliers (inliers) will affect the quality of interpolation procedures. 
Secondly, GeoFIS does not yet allow the fitting of nested variogram models. This was a potential issue in this case study. In Figure 4, it could be argued that there is a short-range spatial structure within the first 10 meters and a second spatial structure from 10 to 30 meters (with a longer range). Nested spatial structures are not common but do occur in agricultural data. Thirdly, regarding the continuous mapping of the data, GeoFIS only provides a kriged map of the property of interest. The mean estimates are given, but the error (kriging variance) associated with these estimates is not provided. This is a potential limitation for assessing the mapping accuracy and for interpreting uncertainty in future analyses with the interpolated data.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Case_study_2\">Case study 2<\/span><\/h3>\n<h4><span class=\"mw-headline\" id=\"Rationale_and_description_2\">Rationale and description<\/span><\/h4>\n<p><b>Evaluating and comparing the opportunity for site-specific management within-field<\/b>\n<\/p><p>Site-specific management requires growers to invest heavily in time, money, and technical skills. This investment requires certain conditions to be met. Firstly, the within-field variability has to be strong enough to justify differentiated management. Secondly, this variability has to be spatially structured or organized enough within the field to be manageable by agricultural machinery.<sup id=\"rdp-ebb-cite_ref-PringleAPrelim03_2-1\" class=\"reference\"><a href=\"#cite_note-PringleAPrelim03-2\" rel=\"external_link\">[2]<\/a><\/sup> Farmers therefore need tools that will help them to evaluate this opportunity for site-specific management. To make decisions at a larger level than the field, i.e., the whole farm, this opportunity also has to be cross-compared between fields. Farmers should preferentially commit their efforts towards the fields that are the most opportune for site-specific management. 
These are most likely to have the largest returns on investment in agri-technology, which should minimize the risk of investment for the farmer.\n<\/p><p>In this case study, GeoFIS was used to evaluate and compare the opportunity for adopting site-specific management across multiple fields using a defined opportunity index.<sup id=\"rdp-ebb-cite_ref-TisseyreATech08_31-2\" class=\"reference\"><a href=\"#cite_note-TisseyreATech08-31\" rel=\"external_link\">[31]<\/a><\/sup> Opportunity indices are a way of assessing if the amount and structure of variation in a field makes site-specific management a potentially feasible option.<sup id=\"rdp-ebb-cite_ref-PringleAPrelim03_2-2\" class=\"reference\"><a href=\"#cite_note-PringleAPrelim03-2\" rel=\"external_link\">[2]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-RoudierManage08_25-1\" class=\"reference\"><a href=\"#cite_note-RoudierManage08-25\" rel=\"external_link\">[25]<\/a><\/sup> Seven yield datasets arising from two different farms located near Evreux, in the northwestern part of France (Farm 1\u2014WGS84: E: 0.779, N: 48.955; Farm 2\u2014WGS84: E: 1.032, N: 48.828) were used. Fields were cropped in wheat and harvested with various combines, primarily New Holland (Turin, Italy) and Claas (Harsewinkel, Germany) combines. Yield datasets are considered particularly relevant for this case study because the yield is directly related to the field economic returns. Quantifying the amount and structure of yield variance should therefore be a valuable indicator of whether site-specific management is opportune. Structured spatial variation in yield would indicate a potential for structured spatial crop management, particularly fertilizer and agrichemicals.\n<\/p><p>This case study also demonstrates the use of GeoFIS with dense sensor-derived spatial observations, in contrast to the spatial manual measurements presented in the first case study. Yield data are collected with on-board sensors at 1 Hz as the combine traverses the field. 
These observations are therefore irregularly-distributed in space because (i) the intra-row and inter-row distances are different and (ii) the acquisition conditions, such as the GNSS accuracy or variable combine speed, can impact the spatial distribution of the observations. The yield information is very dense (thousands of points per hectare) and very noisy because of stochastic error in sensor operation, the intrinsic local variability in production, and errors associated with the combine harvester passing through the field.<sup id=\"rdp-ebb-cite_ref-SudduthYield06_13-2\" class=\"reference\"><a href=\"#cite_note-SudduthYield06-13\" rel=\"external_link\">[13]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-SimbahanScreen03_28-1\" class=\"reference\"><a href=\"#cite_note-SimbahanScreen03-28\" rel=\"external_link\">[28]<\/a><\/sup>\n<\/p><p>These seven fields were selected because they exhibit various degrees of yield autocorrelation within the same systems (farms) and, as such, should represent different opportunities for variable-rate applications. Within this case study, several functions of GeoFIS were used to arrive at a solution that ranks and compares the seven fields in terms of a technical opportunity for site-specific management. More specifically, (i) global outliers were filtered out (Step 1); (ii) variograms were fitted to the previously filtered yield datasets, and ordinary kriging with a global variogram and local neighborhood was performed onto a 3\u00d73 meter grid (Step 2); and (iii) the TOI was computed (see Section 2.3 Functionalities implemented in GeoFIS) (Step 3.2). To account for technical and operational constraints during the TOI computation, the following operational characteristics were assumed: a working width of 20 meters, a mean speed of three meters per second, and a delay of two seconds when changing between two different treatments. These could be, for instance, the characteristics of a fertilizer spreader performing variable-rate application. 
The major yield statistics of the seven fields under consideration after data clean up are reported in Table 3.\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"5\"><b>Table 3.<\/b> Principal descriptive and spatial statistics of the seven yield datasets under consideration. The nugget to sill ratio can be calculated after variograms are fitted to the cleaned data in GeoFIS.\n<\/td><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\">Field\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Size (ha)\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Mean (t ha<sup>\u22121<\/sup>)\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">CV (%)\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Nugget to Sill Ratio (%)\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">8.9\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">8.3\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">8.7\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">53.8\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">12.9\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">7.0\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">24.6\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">46.3\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">3\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">8.9\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">7.8\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">11.6\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">36.0\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">4\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">11.2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">6.1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">9.1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">37.5\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">5\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">18.1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">7.1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">14.5\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">22.4\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">6\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">24.1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">9.6\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">15.9\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">19.9\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">7\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">32.5\n<\/td>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">9.5\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">15.4\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">15.1\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h4><span class=\"mw-headline\" id=\"Application_in_GeoFIS_2\">Application in GeoFIS<\/span><\/h4>\n<p>The filtering and interpolation procedures have already been detailed in the first case study and will not be discussed here. The technical opportunity index (TOI) can be computed in the Opportunity Index toolbar of the GeoFIS software. Figure 7 displays the window that appears when this menu is selected. The window is composed of three main sections. In the top drop-down menu (Border), users are asked to select the attribute on which the metric should be computed, e.g., yield, and to provide the field boundaries to make sure that the calculation of the TOI is restricted to the field of interest. Note that the boundary can be automatically derived with a convex hull; however, this may not be a good option for fields with an irregular geometric shape. In the second drop-down menu (Machine Footprint), the technical and operational constraints of future site-specific management can be specified. More specifically, users can provide the working width of machinery, its speed, the delay when changing between two levels of output (management strategies), and the uncertainty in the GNSS positioning of the machine. The third drop-down menu (Interpolation) ensures that all observations are reported on a fixed grid and the TOI is calculated using the grid data. Users can select the size of the interpolation grid along with the interpolation procedure, i.e., inverse distance weighting or kriging. 
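<\/p><p>As an illustration of the simpler of the two procedures, the sketch below implements a basic inverse distance weighting estimate in Python. It is not the GeoFIS code: the function name, the default exponent of 2, the sample values, and the 3 m cell size are assumptions chosen for illustration, and no search neighborhood is applied.<\/p>

```python
import math


def idw(samples, x, y, power=2.0):
    '''Inverse distance weighted estimate at (x, y) from (x, y, value) samples.'''
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(sx - x, sy - y)
        if distance == 0.0:
            return value  # exactly on a sample: return it unchanged
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical yield samples (x in m, y in m, yield in t per ha).
samples = [(0.0, 0.0, 8.0), (10.0, 0.0, 10.0), (0.0, 10.0, 9.0)]

# Estimate the yield on a regular grid of 3 m cells covering the samples.
cell = 3.0
grid = [[idw(samples, col * cell, row * cell) for col in range(4)]
        for row in range(4)]
```

<p>The exponent controls how quickly the influence of a sample decays with distance, which is one example of the kind of setting a user has to choose.<\/p><p>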
Note that both interpolation approaches need to be parameterized and require some user input.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig7_Leroux_Agri2018_8-6.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"9546aac5332760f587a644c70f34cf10\"><img alt=\"Fig7 Leroux Agri2018 8-6.png\" src=\"https:\/\/www.limswiki.org\/images\/3\/3b\/Fig7_Leroux_Agri2018_8-6.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 7:<\/b> Screenshot of output from the computation of the Technical Opportunity Index (TOI) in GeoFIS for Field 7<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Once the user has specified all this information, the TOI can be calculated. The window displays two major outputs: (i) the TOI value associated with the data along with the corresponding error rate of application, and (ii) the potential management zone map with the different strategies that should be applied (in the case of Figure 7, there are two strategies presented). This latter map can be exported and used in other GIS software if needed.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Results_and_discussion_2\">Results and discussion<\/span><\/h4>\n<p>Figure 8 shows the seven fields in the study, ranked by their respective TOI values along with the corresponding variable-rate application map for two management strategies. It clearly shows that the fields have different levels of yield spatial structure, from the weakest for Field 1 to the strongest for Field 7. Note that, in this case study, the order of the TOI values is consistent with the order of nugget to sill ratios (Table 3). 
The TOI values are however very close in absolute terms (Figure 8), with a range from 0.888 to 0.965. As the TOI value can theoretically range from 0 to 1, all the fields here are exhibiting high TOI values, indicating that a site-specific management is opportune for all of these fields. All the maps have spatially-structured patterns, in accordance with the technical and operational constraints of a future possible machine pass (Figure 8). These maps could be directly incorporated into a machinery system to perform site-specific management.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig8_Leroux_Agri2018_8-6.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"c0c7de7b9ee3fed2562f7c0433febfe2\"><img alt=\"Fig8 Leroux Agri2018 8-6.png\" src=\"https:\/\/www.limswiki.org\/images\/5\/5e\/Fig8_Leroux_Agri2018_8-6.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig8b_Leroux_Agri2018_8-6.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"4d1251dd84dfaa4f8bea80861f6e5b65\"><img alt=\"Fig8b Leroux Agri2018 8-6.png\" src=\"https:\/\/www.limswiki.org\/images\/6\/6f\/Fig8b_Leroux_Agri2018_8-6.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 8:<\/b> Ranking of the seven yield datasets in terms of the associated TOI value: (<b>a<\/b>) Field 1; (<b>b<\/b>) Field 2; (<b>c<\/b>) Field 3; (<b>d<\/b>) Field 4; (<b>e<\/b>) Field 5; (<b>f<\/b>) Field 6; (<b>g<\/b>) Field 7. Cleaned yield values and corresponding potential variable application maps are also displayed for each field. 
TOI: technical opportunity index<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>The high TOI values for these fields are due to two principal reasons: (i) the data interpolation and (ii) the operational constraints that were set. The computation of the TOI requires the data to be regularly distributed over the field, which is why a prior interpolation procedure is put into place. In this case study, the interpolation by kriging generated a relatively strong data smoothing that artificially increased the TOI values, as the TOI is calculated on the interpolated data. Indeed, as the small-scale variations are smoothed, the yield patterns appear much more organized in space, and site-specific management is consequently considered more opportune. The settings of the operational characteristics in these fields also facilitated high TOI values. As the minimal size of field management (working width of the machinery) decreases, the opportunity for variable-rate application will increase. Smaller machinery means that smaller areas of spatial variation become potentially manageable. In contrast, if field management were done at a coarser level, e.g., if the working width of the machinery were set to 40 meters, then the opportunity for site-specific management would decrease, and there would likely be larger differences among the seven studied fields (data not shown). As can be seen in Figure 8, only two management strategies are proposed for each field. Even if this two-class categorization appears sufficient in some case studies, the current computation of the TOI does not allow for alternative management strategies (three, four, or more classes) to be simultaneously considered. This aspect will be investigated in further studies.\n<\/p><p>The TOI is a valuable metric to evaluate and rank fields with respect to the opportunity for site-specific management. 
GeoFIS is well suited to this case study because all the steps required to compute the TOI can be performed within the program. Note that potential management zone maps are also provided and can be simply exported through the easy-to-use interface (however, the target rates are not yet determined at this point; see the next case study). This should foster the adoption of precision agriculture technologies. Users must, however, be cautious when computing and interpreting the TOI, as this metric is particularly sensitive to the interpolation of the cleaned data and the setting of the technical and operational constraints for site-specific management. Users should be able to perform a series of tests within GeoFIS to evaluate the impact of their parametrization on the TOI values and management zone maps. To compare the opportunity for differentiated application across fields, the authors strongly advocate applying the exact same process with similar settings for the calculation of the final TOI metric.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Case_study_3\">Case study 3<\/span><\/h3>\n<h4><span class=\"mw-headline\" id=\"Rationale_and_description_3\">Rationale and description<\/span><\/h4>\n<p><b>Delineating within-field zones for variable-rate applications using expert knowledge<\/b>\n<\/p><p>The delineation of within-field zones is an important procedure in precision agriculture studies because it enables growers to perform variable-rate applications, or at least makes such applications easier. The creation of these zones is a complex process for multiple reasons: (i) there is a need to account for spatial relationships in the data, (ii) very often multiple layers of spatial information must be combined, and (iii) the decision rules associated with agronomic applications are complex and require the grower\u2019s knowledge to be involved in the processing. 
In this case study, GeoFIS is used to delineate within-field zones prior to the management of irrigation and fertilization in a Spanish vineyard using several layers of information and incorporating expert knowledge. This case study is an extension of previous work by Santesteban <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-SantestebanArePrec13_35-0\" class=\"reference\"><a href=\"#cite_note-SantestebanArePrec13-35\" rel=\"external_link\">[35]<\/a><\/sup> Interested readers are referred to this document for more information.\n<\/p><p>The study was carried out on a 90 hectare commercial vineyard containing 27 contiguous fields (Figure 9) located in Southern Navarre, Spain (WGS84: E: 1.405, N: 42.254). The vine vigor, soil, and water availability in the field were considered to be of major interest by the vine manager to manage irrigation and fertilization practices.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig9_Leroux_Agri2018_8-6.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"d1ae612ae09451d798c3911e1323f4e1\"><img alt=\"Fig9 Leroux Agri2018 8-6.png\" src=\"https:\/\/www.limswiki.org\/images\/1\/1f\/Fig9_Leroux_Agri2018_8-6.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 9:<\/b> Maps of the whole-vineyard showing the spatial variability in (<b>a<\/b>) elevation; (<b>b<\/b>) soil apparent conductivity (ECa); and (<b>c<\/b>) vegetative expression (normalized difference vegetation index (NDVI)). 
Points in (<b>a<\/b>, <b>b<\/b>) indicate sampling locations (<i>n<\/i> = 256) (reproduced with permission from Santesteban <i>et al.<\/i>).<sup id=\"rdp-ebb-cite_ref-SantestebanArePrec13_35-1\" class=\"reference\"><a href=\"#cite_note-SantestebanArePrec13-35\" rel=\"external_link\">[35]<\/a><\/sup><\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Grapevine vigor was estimated using the normalized difference vegetation index (NDVI) on a 3\u00d73 meter raster layer derived from a Multi-spectral Airborne image acquired in August 2007 and provided and processed by the Geosys-Spain Company (Leica ADS40 sensor). Measurements of soil apparent electrical conductivity (ECa) on a 30\u00d730 meter grid (256 sampling points) were performed using a handheld ground conductivity meter (EM38, Geonics Ltd., Mississauga, ON, Canada) to map soil spatial variability. The same sample sites were used to create a digital terrain model from elevation data obtained with a laser Tachymeter (TPS 1001, Leica, Heerbrugg, Switzerland). Both ECa and elevation data were kriged onto a three-meter grid. Additional monitoring was performed to provide more information on the vine vigor, soil, and water variation.<sup id=\"rdp-ebb-cite_ref-SantestebanArePrec13_35-2\" class=\"reference\"><a href=\"#cite_note-SantestebanArePrec13-35\" rel=\"external_link\">[35]<\/a><\/sup> As these additional observations were more expensive and\/or cumbersome to collect, only 64 out of the 256 sampling sites were monitored. These monitoring sites were selected using the high-resolution data layers. Additional observations were related to the (i) soil, e.g., observation of soil pits; (ii) plant, e.g., plant water status, pruning weight of wood, and yield; and (iii) production, e.g., berry size, berry composition, and yield characteristics. 
The analysis of all these data layers led to an explanatory reasoning summarized as<sup id=\"rdp-ebb-cite_ref-SantestebanArePrec13_35-3\" class=\"reference\"><a href=\"#cite_note-SantestebanArePrec13-35\" rel=\"external_link\">[35]<\/a><\/sup>:\n<\/p>\n<ul><li> Hydromorphic soils and wetlands are well defined by the EC<sub>a<\/sub> information. Their presence is mainly explained by variations in elevation.<\/li>\n<li> Vine vegetative expression is too high (and harvest quality too low) on the zones at the highest elevations, characterized by light and deep soils (low EC<sub>a<\/sub> values).<\/li>\n<li> Vine vegetative expression is too weak on the zones at the lowest elevations, characterized by clay soils, which suffer from water logging after rainfall events (high EC<sub>a<\/sub> values).<\/li><\/ul>\n<p>Based on this explanatory reasoning, the vineyard manager defined several decision rules to identify the situations in which the current management practices were sub-optimal regarding grape quality and quantity at harvest. An example of one of these rules was: If NDVI is high (&gt;70) and EC<sub>a<\/sub> is low (&lt;180 mS m<sup>\u22121<\/sup>) and elevation is high (&gt;360 m), then the risk of having sub-optimal management practices is high.\n<\/p><p>This latter rule was modelled in GeoFIS to provide a map showing the risk of having sub-optimal management practices within the vineyard. First, the three data layers involved in the expert rule were transformed into risk maps using risk functions (Step 3.3). The parametrization of these risk functions was done with the vineyard manager. All the univariate risk maps were then combined into a final risk map using the OWA aggregator, which was again parameterized with the vineyard manager (see Section 2.3 Functionalities implemented in GeoFIS) (Step 3.3). 
Finally, a segmentation algorithm was applied to this last risk map to provide within-field risk zones (Step 3.1).\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Application_in_GeoFIS_3\">Application in GeoFIS<\/span><\/h4>\n<p>This section focuses on the computation of the risk functions and on the zoning of the resulting risk map. For each layer of information (EC<sub>a<\/sub>, NDVI, elevation), a risk function can be defined within GeoFIS by implementing fuzzy rules, as displayed in Figure 10. Here, a semi-trapezoidal function was used to model the risk of having sub-optimal practices, relying solely on the EC<sub>a<\/sub> layer. In this interface, the form of the risk function can be changed along with the associated fuzzy parameters, i.e., the kernel and support. Once the risk functions have been set for all the layers of interest, all the risks can be aggregated with respect to the aforementioned expert rule(s). This aggregation procedure can be performed through the interfaces displayed in Figure 11, where (i) the layers can be selected and the aggregation operator can be chosen (OWA aggregator here) and (ii) the parameters associated with the OWA aggregator can be set.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig10_Leroux_Agri2018_8-6.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"190a11dc5892b9c2a87816935a59e628\"><img alt=\"Fig10 Leroux Agri2018 8-6.png\" src=\"https:\/\/www.limswiki.org\/images\/4\/4c\/Fig10_Leroux_Agri2018_8-6.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 10:<\/b> Implementation of the risk function associated with the EC<sub>a<\/sub> information 
layer<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p><a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig11_Leroux_Agri2018_8-6.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"91e637ce5caac6b7c50b92390e54acf2\"><img alt=\"Fig11 Leroux Agri2018 8-6.png\" src=\"https:\/\/www.limswiki.org\/images\/b\/b3\/Fig11_Leroux_Agri2018_8-6.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 11:<\/b> Parameterization of the Ordered Weighted Average (OWA) aggregator: (<b>a<\/b>) Selection of the layers to be aggregated; (<b>b<\/b>) setting of the OWA aggregator parameters. The weights for the minimum, median, and maximum values of univariate risk are 0.7, 0.2, and 0.1, respectively.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>After the aggregation procedure has been run, practitioners obtain a continuous map of the global risk of having sub-optimal practices within the vineyard. To facilitate the interpretation of the map and the process of decision-making, the risk map can be zoned using the interface displayed in Figure 12. Before zoning, users must (i) define the boundary of the map, either by importing a predefined boundary or by using the default convex hull algorithm provided in GeoFIS to generate a boundary, and (ii) set the neighborhood associated with each spatial observation so that zones can be expanded using spatial neighbors. 
The zoning procedure can then be applied to the OWA risk map using the zoning algorithm implemented in GeoFIS.<sup id=\"rdp-ebb-cite_ref-PedrosoASeg10_18-2\" class=\"reference\"><a href=\"#cite_note-PedrosoASeg10-18\" rel=\"external_link\">[18]<\/a><\/sup> Users can then display a risk map with a number of zones that they consider relevant.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig12_Leroux_Agri2018_8-6.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"2adb8cac6a3639dffd8dd424c94a8e1c\"><img alt=\"Fig12 Leroux Agri2018 8-6.png\" src=\"https:\/\/www.limswiki.org\/images\/b\/bc\/Fig12_Leroux_Agri2018_8-6.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 12:<\/b> Delimitation of within-field yield zones of the risk of having sub-optimal management practices. (Map details described in Figure 13).<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h4><span class=\"mw-headline\" id=\"Results_and_discussion_3\">Results and discussion<\/span><\/h4>\n<p>The map of the risk of arriving at sub-optimal management practices using a combination of available information and expert rules derived from local knowledge is displayed in Figure 13. This map shows five zones, three of which are relatively large, with specific risk levels. The highest risk area (dark red) is located on the western part of the vineyard and characterized by low EC<sub>a<\/sub>, high NDVI, and high elevation (Figure 13). In this part of the vineyard, it is likely that current management practices are not well adapted. 
Grape quality and quantity at harvest are not optimized in this area, and \u201cnitrogen applications should be avoided; water availability should be reduced by the introduction of a cover crop; and Regulated Deficit Irrigation strategies should held in order to moderate shoot growth and fertility.\u201d<sup id=\"rdp-ebb-cite_ref-SantestebanArePrec13_35-4\" class=\"reference\"><a href=\"#cite_note-SantestebanArePrec13-35\" rel=\"external_link\">[35]<\/a><\/sup> In order to simplify the presentation of this example, only one rule has been taken into account. It would have been possible to introduce additional rules based on the work presented by Santesteban <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-SantestebanArePrec13_35-5\" class=\"reference\"><a href=\"#cite_note-SantestebanArePrec13-35\" rel=\"external_link\">[35]<\/a><\/sup>\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig13_Leroux_Agri2018_8-6.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"435e2f6c64053d958cea946cf153ef89\"><img alt=\"Fig13 Leroux Agri2018 8-6.png\" src=\"https:\/\/www.limswiki.org\/images\/4\/4e\/Fig13_Leroux_Agri2018_8-6.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 13:<\/b> Aggregated risk zones of sub-optimal management practices derived using the NDVI, EC<sub>a<\/sub>, and elevation layers together with local expert knowledge<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>It is interesting to note that the aggregation procedure through the OWA operator using the NDVI, EC<sub>a<\/sub>, and elevation layers (Figure 13) has resulted in a risk map that is different from that which would have been obtained by interpreting each layer of 
information independently (Figure 14). For instance, if only the EC<sub>a<\/sub> layer had been used to generate the risk map, the highest-risk area would have covered a much larger area of the vineyard.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig14_Leroux_Agri2018_8-6.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"67428880c7abb3f438c8497a333c5f62\"><img alt=\"Fig14 Leroux Agri2018 8-6.png\" src=\"https:\/\/www.limswiki.org\/images\/1\/14\/Fig14_Leroux_Agri2018_8-6.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 14:<\/b> Maps of risk zones of sub-optimal management practices derived in the univariate space with variate-specific local expert rules. EC<sub>a<\/sub> (<b>left<\/b>); NDVI (<b>middle<\/b>); and Elevation (<b>right<\/b>)<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>This case study illustrates that the expertise of farm managers and advisors can be incorporated into a data-fusing algorithm to generate decision layers. Indeed, GeoFIS enables users to incorporate their own expertise, i.e., through the use of univariate risk functions\/fuzzy rules, into the generation of risk maps. The use of fuzzy rules to account for this expertise is of interest as it makes it possible to avoid abrupt changes in risk and generates a more gradual variation in potential risk (Figure 10). The GeoFIS interface enables users to calibrate the risk and aggregation functions empirically by letting them test a calibration, visualize the resulting risk maps, and adjust the calibration as needed. 
However, it must be stated that this will require farmers and advisors to be supported so that their expertise can be translated correctly into the data aggregation algorithms.\n<\/p><p>The calibration of the OWA index presented in this case study (weight of 0.7 for the minimum value of univariate risk, 0.2 for the median value, and 0.1 for the maximum value) resulted from an iterative calibration process led by the vineyard manager. This aggregation setting has strong similarities with the logical operation \u201cAND,\u201d i.e., the resulting risk is high if the minimum value of univariate risk is also high because it has the strongest weight. In other words, a high minimum value implies that all the univariate risks are high, because the median and maximum values of univariate risk are necessarily at least as high as the minimum value. Note that the real logical operation \u201cAND\u201d would be reproduced by changing the set of weights to (1;0;0). By changing these weights, practitioners might also be able to reproduce the logical operation \u201cOR\u201d (0;0;1), for which the resulting risk is high as soon as the maximum value of a univariate risk is high. It would also be possible to perform a simple average of the different univariate risks by giving equal weight to each of the three ordered values.\n<\/p><p>From a more general perspective, GeoFIS simplifies the processing of the three layers of information, as the entire process was done within a single software platform. 
This contrasts with the data processing by Santesteban <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-SantestebanArePrec13_35-6\" class=\"reference\"><a href=\"#cite_note-SantestebanArePrec13-35\" rel=\"external_link\">[35]<\/a><\/sup>, in which data were cleaned with Excel, interpolated with Vesper, analyzed with Matlab, and represented with ArcGIS.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusions\">Conclusions<\/span><\/h2>\n<p>The increasing flow of precision agriculture data requires the development of free and open-source processing software to manage and make use of these data and promote precision agriculture adoption. As such, GeoFIS has been specifically designed to facilitate the movement from spatial data to spatial information and to spatial decision making. The application of GeoFIS to some example case studies that agricultural professionals may face when dealing with spatial data has demonstrated the potential of this software. GeoFIS is a released product; however, it is important to state that all the functionalities currently introduced and implemented in GeoFIS are still areas of active investigation by the scientific community. GeoFIS will be updated when, and if, improved methodologies become available. It is one of the strengths of the GeoFIS platform that it is able to integrate the latest research developments to make sure that users are provided with the most up-to-date, reliable, and powerful processing algorithms.\n<\/p><p>As it is, GeoFIS is an excellent tool to promote teaching in precision agriculture. Indeed, GeoFIS has already been used within many higher education institutions in France to teach researchers and professionals how to process spatial data. 
The user-friendly interface effectively facilitates the understanding of some major precision agriculture concepts.\n<\/p><p>The analysis of the three case studies has also been an opportunity to evaluate the limits of the current algorithms and to propose areas for future development within the software. For instance, the data filtering procedure focuses solely on global outliers, while spatial datasets may contain outliers that are more deeply rooted within the data, sometimes referred to as spatial outliers. A second example is that the variography analysis is limited to single data layers, while cross-variography studies might be relevant to evaluate the spatial relationships between multiple layers of information. To foster the adoption of GeoFIS, the authors are more than open to collaboration and are ready to integrate relevant algorithms for processing precision agriculture data.\n<\/p><p>Another possibility to promote the processing of precision agriculture data would be to create links between GeoFIS and existing GIS programs such as QGIS, an open-source GIS already widely used by many communities working on spatial data. All the algorithms of GeoFIS could be integrated directly within this open-source GIS software to benefit from the display and processing algorithms already implemented in QGIS. This would, however, require users to process their precision agriculture data in a more complex environment for which specific GIS skills are necessary. Another option is to transform GeoFIS into a web-based service, rather than its current download and desktop application structure, so that users would not have to worry about the R installation, Java updates, and compatibility between different operating systems.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Author_contributions\">Author contributions<\/span><\/h3>\n<p>J.-L.L. and S.G. 
developed the GeoFIS software; B.T., J.T., O.N., H.J. and S.G. conceived and designed the experiments; J.L., C.L., and L.P. performed the experiments and analyzed the data; all the authors contributed to reagents\/materials\/analysis tools; C.L. organized the writing of the paper.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Funding\">Funding<\/span><\/h3>\n<p>This research received no external funding.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Conflicts_of_interest\">Conflicts of interest<\/span><\/h3>\n<p>The authors declare no conflict of interest.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-OliverGeo10-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-OliverGeo10_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Oliver, M.A., ed.&#32;(2010).&#32;<i>Geostatistical Applications for Precision Agriculture<\/i>.&#32;Springer.&#32;pp.&#160;331.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2F978-90-481-9133-8\" target=\"_blank\">10.1007\/978-90-481-9133-8<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9789048191321.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Geostatistical+Applications+for+Precision+Agriculture&amp;rft.date=2010&amp;rft.pages=pp.%26nbsp%3B331&amp;rft.pub=Springer&amp;rft_id=info:doi\/10.1007%2F978-90-481-9133-8&amp;rft.isbn=9789048191321&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PringleAPrelim03-2\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-PringleAPrelim03_2-0\" rel=\"external_link\">2.0<\/a><\/sup> <sup><a href=\"#cite_ref-PringleAPrelim03_2-1\" rel=\"external_link\">2.1<\/a><\/sup> <sup><a href=\"#cite_ref-PringleAPrelim03_2-2\" rel=\"external_link\">2.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Pringle, M.J.; McBratney, A.B.; Whelan, B.M.; Taylor, J.M.&#32;(2003).&#32;\"A preliminary approach to assessing the opportunity for site-specific crop management in a field, using yield monitor data\".&#32;<i>Agricultural Systems<\/i>&#32;<b>76<\/b>&#32;(1): 273\u201392.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2FS0308-521X%2802%2900005-7\" target=\"_blank\">10.1016\/S0308-521X(02)00005-7<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+preliminary+approach+to+assessing+the+opportunity+for+site-specific+crop+management+in+a+field%2C+using+yield+monitor+data&amp;rft.jtitle=Agricultural+Systems&amp;rft.aulast=Pringle%2C+M.J.%3B+McBratney%2C+A.B.%3B+Whelan%2C+B.M.%3B+Taylor%2C+J.M.&amp;rft.au=Pringle%2C+M.J.%3B+McBratney%2C+A.B.%3B+Whelan%2C+B.M.%3B+Taylor%2C+J.M.&amp;rft.date=2003&amp;rft.volume=76&amp;rft.issue=1&amp;rft.pages=273%E2%80%9392&amp;rft_id=info:doi\/10.1016%2FS0308-521X%2802%2900005-7&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Acevedo-OpazoThePot08-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Acevedo-OpazoThePot08_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Acevedo-Opazo, C.; Tisseyre, B.; Guillaume, S.; Ojeda, H.&#32;(2008).&#32;\"The potential of high spatial resolution information to define within-vineyard zones related to vine water status\".&#32;<i>Precision Agriculture<\/i>&#32;<b>9<\/b>&#32;(5): 285\u2013302.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs11119-008-9073-1\" target=\"_blank\">10.1007\/s11119-008-9073-1<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+potential+of+high+spatial+resolution+information+to+define+within-vineyard+zones+related+to+vine+water+status&amp;rft.jtitle=Precision+Agriculture&amp;rft.aulast=Acevedo-Opazo%2C+C.%3B+Tisseyre%2C+B.%3B+Guillaume%2C+S.%3B+Ojeda%2C+H.&amp;rft.au=Acevedo-Opazo%2C+C.%3B+Tisseyre%2C+B.%3B+Guillaume%2C+S.%3B+Ojeda%2C+H.&amp;rft.date=2008&amp;rft.volume=9&amp;rft.issue=5&amp;rft.pages=285%E2%80%93302&amp;rft_id=info:doi\/10.1007%2Fs11119-008-9073-1&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BramleyUnder05-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BramleyUnder05_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bramley, R.G.V.&#32;(2005).&#32;\"Understanding variability in winegrape production systems 2. 
Within vineyard variation in quality over several vintages\".&#32;<i>Australian Journal of Grape and Wine Research<\/i>&#32;<b>11<\/b>&#32;(1): 33\u201342.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1111%2Fj.1755-0238.2005.tb00277.x\" target=\"_blank\">10.1111\/j.1755-0238.2005.tb00277.x<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Understanding+variability+in+winegrape+production+systems+2.+Within+vineyard+variation+in+quality+over+several+vintages&amp;rft.jtitle=Australian+Journal+of+Grape+and+Wine+Research&amp;rft.aulast=Bramley%2C+R.G.V.&amp;rft.au=Bramley%2C+R.G.V.&amp;rft.date=2005&amp;rft.volume=11&amp;rft.issue=1&amp;rft.pages=33%E2%80%9342&amp;rft_id=info:doi\/10.1111%2Fj.1755-0238.2005.tb00277.x&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Verdugo-V.C3.A1squezSpatial16-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Verdugo-V.C3.A1squezSpatial16_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Verdugo-V\u00e1squez, N.; Acevedo-Opazo, C.; Vald\u00e9s-G\u00f3mez, H. 
et al.&#32;(2016).&#32;\"Spatial variability of phenology in two irrigated grapevine cultivar growing under semi-arid conditions\".&#32;<i>Precision Agriculture<\/i>&#32;<b>17<\/b>&#32;(2): 218\u201345.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs11119-015-9418-5\" target=\"_blank\">10.1007\/s11119-015-9418-5<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Spatial+variability+of+phenology+in+two+irrigated+grapevine+cultivar+growing+under+semi-arid+conditions&amp;rft.jtitle=Precision+Agriculture&amp;rft.aulast=Verdugo-V%C3%A1squez%2C+N.%3B+Acevedo-Opazo%2C+C.%3B+Vald%C3%A9s-G%C3%B3mez%2C+H.+et+al.&amp;rft.au=Verdugo-V%C3%A1squez%2C+N.%3B+Acevedo-Opazo%2C+C.%3B+Vald%C3%A9s-G%C3%B3mez%2C+H.+et+al.&amp;rft.date=2016&amp;rft.volume=17&amp;rft.issue=2&amp;rft.pages=218%E2%80%9345&amp;rft_id=info:doi\/10.1007%2Fs11119-015-9418-5&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BalujaAss12-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BalujaAss12_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Baluja, J.; Diago, M.P.; Goovaerts, P.; Tardaguila, J.&#32;(2012).&#32;\"Assessment of the spatial variability of anthocyanins in grapes using a fluorescence sensor: Relationships with vine vigour and yield\".&#32;<i>Precision Agriculture<\/i>&#32;<b>13<\/b>&#32;(4): 457\u201372.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" 
class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs11119-012-9261-x\" target=\"_blank\">10.1007\/s11119-012-9261-x<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Assessment+of+the+spatial+variability+of+anthocyanins+in+grapes+using+a+fluorescence+sensor%3A+Relationships+with+vine+vigour+and+yield&amp;rft.jtitle=Precision+Agriculture&amp;rft.aulast=Baluja%2C+J.%3B+Diago%2C+M.P.%3B+Goovaerts%2C+P.%3B+Tardaguila%2C+J.&amp;rft.au=Baluja%2C+J.%3B+Diago%2C+M.P.%3B+Goovaerts%2C+P.%3B+Tardaguila%2C+J.&amp;rft.date=2012&amp;rft.volume=13&amp;rft.issue=4&amp;rft.pages=457%E2%80%9372&amp;rft_id=info:doi\/10.1007%2Fs11119-012-9261-x&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DebuissonUsing10-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DebuissonUsing10_7-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Debuisson, S.; Germain, C.; Garcia, O. 
et al.&#32;(2010).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.ispag.org\/proceedings\/?action=abstract&id=197\" target=\"_blank\">\"Using Multiplex And Greenseeker To Manage Spatial Variation Of Vine Vigor In Champagne\"<\/a>.&#32;<i>Proceedings of the 10th International Conference on Precision Agriculture<\/i><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.ispag.org\/proceedings\/?action=abstract&id=197\" target=\"_blank\">https:\/\/www.ispag.org\/proceedings\/?action=abstract&amp;id=197<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Using+Multiplex+And+Greenseeker+To+Manage+Spatial+Variation+Of+Vine+Vigor+In+Champagne&amp;rft.jtitle=Proceedings+of+the+10th+International+Conference+on+Precision+Agriculture&amp;rft.aulast=Debuisson%2C+S.%3B+Germain%2C+C.%3B+Garcia%2C+O.+et+al.&amp;rft.au=Debuisson%2C+S.%3B+Germain%2C+C.%3B+Garcia%2C+O.+et+al.&amp;rft.date=2010&amp;rft_id=https%3A%2F%2Fwww.ispag.org%2Fproceedings%2F%3Faction%3Dabstract%26id%3D197&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TaylorIdent10-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-TaylorIdent10_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Taylor, J.A.; Acevedo\u2013Opazo, C.; Ojeda, H.; Tisseyre, B.&#32;(2010).&#32;\"Identification and significance of sources of spatial variation in grapevine water status\".&#32;<i>Australian Journal of Grape and Wine Research<\/i>&#32;<b>16<\/b>&#32;(1): 218\u201326.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a 
rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1111%2Fj.1755-0238.2009.00066.x\" target=\"_blank\">10.1111\/j.1755-0238.2009.00066.x<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Identification+and+significance+of+sources+of+spatial+variation+in+grapevine+water+status&amp;rft.jtitle=Australian+Journal+of+Grape+and+Wine+Research&amp;rft.aulast=Taylor%2C+J.A.%3B+Acevedo%E2%80%93Opazo%2C+C.%3B+Ojeda%2C+H.%3B+Tisseyre%2C+B.&amp;rft.au=Taylor%2C+J.A.%3B+Acevedo%E2%80%93Opazo%2C+C.%3B+Ojeda%2C+H.%3B+Tisseyre%2C+B.&amp;rft.date=2010&amp;rft.volume=16&amp;rft.issue=1&amp;rft.pages=218%E2%80%9326&amp;rft_id=info:doi\/10.1111%2Fj.1755-0238.2009.00066.x&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TaylorEstab07-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-TaylorEstab07_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Taylor, J.A.; McBratney, A.B.; Whelan, B.M.&#32;(2007).&#32;\"Establishing Management Classes for Broadacre Agricultural Production\".&#32;<i>Agronomy Journal<\/i>&#32;<b>99<\/b>&#32;(5): 1366-76.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.2134%2Fagronj2007.0070\" target=\"_blank\">10.2134\/agronj2007.0070<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Establishing+Management+Classes+for+Broadacre+Agricultural+Production&amp;rft.jtitle=Agronomy+Journal&amp;rft.aulast=Taylor%2C+J.A.%3B+McBratney%2C+A.B.%3B+Whelan%2C+B.M.&amp;rft.au=Taylor%2C+J.A.%3B+McBratney%2C+A.B.%3B+Whelan%2C+B.M.&amp;rft.date=2007&amp;rft.volume=99&amp;rft.issue=5&amp;rft.pages=1366-76&amp;rft_id=info:doi\/10.2134%2Fagronj2007.0070&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-JeongInteg12-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-JeongInteg12_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Jeong, J.S.; Garc\u00eda-Moruno, L.; Hern\u00e1ndez-Blanco, J.&#32;(2012).&#32;\"Integrating buildings into a rural landscape using a multi-criteria spatial decision analysis in GIS-enabled web environment\".&#32;<i>Biosystems Engineering<\/i>&#32;<b>112<\/b>&#32;(2): 82\u201392.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.biosystemseng.2012.03.002\" target=\"_blank\">10.1016\/j.biosystemseng.2012.03.002<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Integrating+buildings+into+a+rural+landscape+using+a+multi-criteria+spatial+decision+analysis+in+GIS-enabled+web+environment&amp;rft.jtitle=Biosystems+Engineering&amp;rft.aulast=Jeong%2C+J.S.%3B+Garc%C3%ADa-Moruno%2C+L.%3B+Hern%C3%A1ndez-Blanco%2C+J.&amp;rft.au=Jeong%2C+J.S.%3B+Garc%C3%ADa-Moruno%2C+L.%3B+Hern%C3%A1ndez-Blanco%2C+J.&amp;rft.date=2012&amp;rft.volume=112&amp;rft.issue=2&amp;rft.pages=82%E2%80%9392&amp;rft_id=info:doi\/10.1016%2Fj.biosystemseng.2012.03.002&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-YalewAgri16-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-YalewAgri16_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Yalew, S.G.; van Griensven, A.; van der Zaag, P.&#32;(2016).&#32;\"AgriSuit: A web-based GIS-MCDA framework for agricultural land suitability assessment\".&#32;<i>Computers and Electronics in Agriculture<\/i>&#32;<b>128<\/b>&#32;(10): 1\u20138.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.compag.2016.08.008\" target=\"_blank\">10.1016\/j.compag.2016.08.008<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=AgriSuit%3A+A+web-based+GIS-MCDA+framework+for+agricultural+land+suitability+assessment&amp;rft.jtitle=Computers+and+Electronics+in+Agriculture&amp;rft.aulast=Yalew%2C+S.G.%3B+van+Griensven%2C+A.%3B+van+der+Zaag%2C+P.&amp;rft.au=Yalew%2C+S.G.%3B+van+Griensven%2C+A.%3B+van+der+Zaag%2C+P.&amp;rft.date=2016&amp;rft.volume=128&amp;rft.issue=10&amp;rft.pages=1%E2%80%938&amp;rft_id=info:doi\/10.1016%2Fj.compag.2016.08.008&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LerouxAGen18-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LerouxAGen18_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Leroux, C.; Jones, H.; Clenet, A. et al.&#32;(2018).&#32;\"A general method to filter out defective spatial observations from yield mapping datasets\".&#32;<i>Precision Agriculture<\/i>: 1\u201320.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs11119-017-9555-0\" target=\"_blank\">10.1007\/s11119-017-9555-0<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+general+method+to+filter+out+defective+spatial+observations+from+yield+mapping+datasets&amp;rft.jtitle=Precision+Agriculture&amp;rft.aulast=Leroux%2C+C.%3B+Jones%2C+H.%3B+Clenet%2C+A.+et+al.&amp;rft.au=Leroux%2C+C.%3B+Jones%2C+H.%3B+Clenet%2C+A.+et+al.&amp;rft.date=2018&amp;rft.pages=1%E2%80%9320&amp;rft_id=info:doi\/10.1007%2Fs11119-017-9555-0&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SudduthYield06-13\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-SudduthYield06_13-0\" rel=\"external_link\">13.0<\/a><\/sup> <sup><a href=\"#cite_ref-SudduthYield06_13-1\" rel=\"external_link\">13.1<\/a><\/sup> <sup><a href=\"#cite_ref-SudduthYield06_13-2\" rel=\"external_link\">13.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Sudduth, K.A.; Drummond, S.T.&#32;(2006).&#32;\"Yield Editor\".&#32;<i>Agronomy Journal<\/i>&#32;<b>99<\/b>&#32;(6): 1471\u201382.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.2134%2Fagronj2006.0326\" target=\"_blank\">10.2134\/agronj2006.0326<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Yield+Editor&amp;rft.jtitle=Agronomy+Journal&amp;rft.aulast=Sudduth%2C+K.A.%3B+Drummond%2C+S.T.&amp;rft.au=Sudduth%2C+K.A.%3B+Drummond%2C+S.T.&amp;rft.date=2006&amp;rft.volume=99&amp;rft.issue=6&amp;rft.pages=1471%E2%80%9382&amp;rft_id=info:doi\/10.2134%2Fagronj2006.0326&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HenglAGen04-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HenglAGen04_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Hengl, T.; Heuvelink, G.B.M.; Stein, A.&#32;(2004).&#32;\"A generic framework for spatial prediction of soil variables based on regression-kriging\".&#32;<i>Geoderma<\/i>&#32;<b>120<\/b>&#32;(1\u20132): 75\u201393.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.geoderma.2003.08.018\" target=\"_blank\">10.1016\/j.geoderma.2003.08.018<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+generic+framework+for+spatial+prediction+of+soil+variables+based+on+regression-kriging&amp;rft.jtitle=Geoderma&amp;rft.aulast=Hengl%2C+T.%3B+Heuvelink%2C+G.B.M.%3B+Stein%2C+A.&amp;rft.au=Hengl%2C+T.%3B+Heuvelink%2C+G.B.M.%3B+Stein%2C+A.&amp;rft.date=2004&amp;rft.volume=120&amp;rft.issue=1%E2%80%932&amp;rft.pages=75%E2%80%9393&amp;rft_id=info:doi\/10.1016%2Fj.geoderma.2003.08.018&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span 
style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-OliverATut14-15\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-OliverATut14_15-0\" rel=\"external_link\">15.0<\/a><\/sup> <sup><a href=\"#cite_ref-OliverATut14_15-1\" rel=\"external_link\">15.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Oliver, M.A.; Webster, R.&#32;(2014).&#32;\"A tutorial guide to geostatistics: Computing and modelling variograms and kriging\".&#32;<i>CATENA<\/i>&#32;<b>113<\/b>&#32;(2): 56\u201369.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.catena.2013.09.006\" target=\"_blank\">10.1016\/j.catena.2013.09.006<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+tutorial+guide+to+geostatistics%3A+Computing+and+modelling+variograms+and+kriging&amp;rft.jtitle=CATENA&amp;rft.aulast=Oliver%2C+M.A.%3B+Webster%2C+R.&amp;rft.au=Oliver%2C+M.A.%3B+Webster%2C+R.&amp;rft.date=2014&amp;rft.volume=113&amp;rft.issue=2&amp;rft.pages=56%E2%80%9369&amp;rft_id=info:doi\/10.1016%2Fj.catena.2013.09.006&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-RobinsonTesting06-16\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-RobinsonTesting06_16-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Robinson, T.P.; Metternicht, G.&#32;(2006).&#32;\"Testing the performance of spatial interpolation techniques for mapping soil properties\".&#32;<i>Computers and Electronics in Agriculture<\/i>&#32;<b>50<\/b>&#32;(2): 
97\u2013108.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.compag.2005.07.003\" target=\"_blank\">10.1016\/j.compag.2005.07.003<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Testing+the+performance+of+spatial+interpolation+techniques+for+mapping+soil+properties&amp;rft.jtitle=Computers+and+Electronics+in+Agriculture&amp;rft.aulast=Robinson%2C+T.P.%3B+Metternicht%2C+G.&amp;rft.au=Robinson%2C+T.P.%3B+Metternicht%2C+G.&amp;rft.date=2006&amp;rft.volume=50&amp;rft.issue=2&amp;rft.pages=97%E2%80%93108&amp;rft_id=info:doi\/10.1016%2Fj.compag.2005.07.003&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Cid-GarciaRect13-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Cid-GarciaRect13_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Cid-Garcia, N.M.; Albornoz, V.; Rios-Solis, Y.A.; Ortega, R.&#32;(2013).&#32;\"Rectangular shape management zone delineation using integer linear programming\".&#32;<i>Computers and Electronics in Agriculture<\/i>&#32;<b>93<\/b>&#32;(4): 1\u20139.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.compag.2013.01.009\" target=\"_blank\">10.1016\/j.compag.2013.01.009<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Rectangular+shape+management+zone+delineation+using+integer+linear+programming&amp;rft.jtitle=Computers+and+Electronics+in+Agriculture&amp;rft.aulast=Cid-Garcia%2C+N.M.%3B+Albornoz%2C+V.%3B+Rios-Solis%2C+Y.A.%3B+Ortega%2C+R.&amp;rft.au=Cid-Garcia%2C+N.M.%3B+Albornoz%2C+V.%3B+Rios-Solis%2C+Y.A.%3B+Ortega%2C+R.&amp;rft.date=2013&amp;rft.volume=93&amp;rft.issue=4&amp;rft.pages=1%E2%80%939&amp;rft_id=info:doi\/10.1016%2Fj.compag.2013.01.009&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PedrosoASeg10-18\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-PedrosoASeg10_18-0\" rel=\"external_link\">18.0<\/a><\/sup> <sup><a href=\"#cite_ref-PedrosoASeg10_18-1\" rel=\"external_link\">18.1<\/a><\/sup> <sup><a href=\"#cite_ref-PedrosoASeg10_18-2\" rel=\"external_link\">18.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Pedroso, M.; Taylor, J.; Tisseyre, B. 
et al.&#32;(2010).&#32;\"A segmentation algorithm for the delineation of agricultural management zones\".&#32;<i>Computers and Electronics in Agriculture<\/i>&#32;<b>70<\/b>&#32;(1): 199\u2013208.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.compag.2009.10.007\" target=\"_blank\">10.1016\/j.compag.2009.10.007<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+segmentation+algorithm+for+the+delineation+of+agricultural+management+zones&amp;rft.jtitle=Computers+and+Electronics+in+Agriculture&amp;rft.aulast=Pedroso%2C+M.%3B+Taylor%2C+J.%3B+Tisseyre%2C+B.+et+al.&amp;rft.au=Pedroso%2C+M.%3B+Taylor%2C+J.%3B+Tisseyre%2C+B.+et+al.&amp;rft.date=2010&amp;rft.volume=70&amp;rft.issue=1&amp;rft.pages=199%E2%80%93208&amp;rft_id=info:doi\/10.1016%2Fj.compag.2009.10.007&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BlackmoreTheAnal03-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BlackmoreTheAnal03_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Blackmore, S.; Godwin, R.J.; Fountas, S.&#32;(2003).&#32;\"The Analysis of Spatial and Temporal Trends in Yield Map Data over Six Years\".&#32;<i>Biosystems Engineering<\/i>&#32;<b>84<\/b>&#32;(4): 455\u201366.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2FS1537-5110%2803%2900038-2\" 
target=\"_blank\">10.1016\/S1537-5110(03)00038-2<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+Analysis+of+Spatial+and+Temporal+Trends+in+Yield+Map+Data+over+Six+Years&amp;rft.jtitle=Biosystems+Engineering&amp;rft.aulast=Blackmore%2C+S.%3B+Godwin%2C+R.J.%3B+Fountas%2C+S.&amp;rft.au=Blackmore%2C+S.%3B+Godwin%2C+R.J.%3B+Fountas%2C+S.&amp;rft.date=2003&amp;rft.volume=84&amp;rft.issue=4&amp;rft.pages=455%E2%80%9366&amp;rft_id=info:doi\/10.1016%2FS1537-5110%2803%2900038-2&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-YanDeline07-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-YanDeline07_20-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Li, Y.; Shi, Z.; Li, F.; Li, H.-Y.&#32;(2007).&#32;\"Delineation of site-specific management zones using fuzzy clustering analysis in a coastal saline land\".&#32;<i>Computers and Electronics in Agriculture<\/i>&#32;<b>56<\/b>&#32;(2): 174\u201386.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.compag.2007.01.013\" target=\"_blank\">10.1016\/j.compag.2007.01.013<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Delineation+of+site-specific+management+zones+using+fuzzy+clustering+analysis+in+a+coastal+saline+land&amp;rft.jtitle=Computers+and+Electronics+in+Agriculture&amp;rft.aulast=Li%2C+Y.%3B+Shi%2C+Z.%3B+Li%2C+F.%3B+Li%2C+H.-Y.&amp;rft.au=Li%2C+Y.%3B+Shi%2C+Z.%3B+Li%2C+F.%3B+Li%2C+H.-Y.&amp;rft.date=2007&amp;rft.volume=56&amp;rft.issue=2&amp;rft.pages=174%E2%80%9386&amp;rft_id=info:doi\/10.1016%2Fj.compag.2007.01.013&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-OliverInteg10-21\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-OliverInteg10_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Oliver, Y.M.; Robertson, M.J.; Wong, M.T.F.&#32;(2010).&#32;\"Integrating farmer knowledge, precision agriculture tools, and crop simulation modelling to evaluate management options for poor-performing patches in cropping fields\".&#32;<i>European Journal of Agronomy<\/i>&#32;<b>32<\/b>&#32;(1): 40\u201350.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.eja.2009.05.002\" target=\"_blank\">10.1016\/j.eja.2009.05.002<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Integrating+farmer+knowledge%2C+precision+agriculture+tools%2C+and+crop+simulation+modelling+to+evaluate+management+options+for+poor-performing+patches+in+cropping+fields&amp;rft.jtitle=European+Journal+of+Agronomy&amp;rft.aulast=Oliver%2C+Y.M.%3B+Robertson%2C+M.J.%3B+Wong%2C+M.T.F.&amp;rft.au=Oliver%2C+Y.M.%3B+Robertson%2C+M.J.%3B+Wong%2C+M.T.F.&amp;rft.date=2010&amp;rft.volume=32&amp;rft.issue=1&amp;rft.pages=40%E2%80%9350&amp;rft_id=info:doi\/10.1016%2Fj.eja.2009.05.002&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PichonASystem17-22\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PichonASystem17_22-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Pichon, L.; Besqueut, G.; Tisseyre, B.&#32;(2017).&#32;\"A systemic approach to identify relevant information provided by UAV in precision viticulture\".&#32;<i>Advances in Animal Biosciences<\/i>&#32;<b>8<\/b>&#32;(2): 823\u20137.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1017%2FS2040470017001194\" target=\"_blank\">10.1017\/S2040470017001194<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+systemic+approach+to+identify+relevant+information+provided+by+UAV+in+precision+viticulture&amp;rft.jtitle=Advances+in+Animal+Biosciences&amp;rft.aulast=Pichon%2C+L.%3B+Besqueut%2C+G.%3B+Tisseyre%2C+B.&amp;rft.au=Pichon%2C+L.%3B+Besqueut%2C+G.%3B+Tisseyre%2C+B.&amp;rft.date=2017&amp;rft.volume=8&amp;rft.issue=2&amp;rft.pages=823%E2%80%937&amp;rft_id=info:doi\/10.1017%2FS2040470017001194&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SchenattoUseOf17-23\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SchenattoUseOf17_23-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Schenatto, K.; de Souza, E.G.; Bazzi, C.L. et al.&#32;(2017).&#32;\"Use of the farmer\u2019s experience variable in the generation of management zones\".&#32;<i>Semina, Ci\u00eancias Agr\u00e1rias<\/i>&#32;<b>38<\/b>&#32;(4): 2305\u201321.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.5433%2F1679-0359.2017v38n4Supl1p2305\" target=\"_blank\">10.5433\/1679-0359.2017v38n4Supl1p2305<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Use+of+the+farmer%E2%80%99s+experience+variable+in+the+generation+of+management+zones&amp;rft.jtitle=Semina%2C+Ci%C3%AAncias+Agr%C3%A1rias&amp;rft.aulast=Schenatto%2C+K.%3B+de+Souza%2C+E.G.%3B+Bazzi%2C+C.L.+et+al.&amp;rft.au=Schenatto%2C+K.%3B+de+Souza%2C+E.G.%3B+Bazzi%2C+C.L.+et+al.&amp;rft.date=2017&amp;rft.volume=38&amp;rft.issue=4&amp;rft.pages=2305%E2%80%9321&amp;rft_id=info:doi\/10.5433%2F1679-0359.2017v38n4Supl1p2305&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LerouxANew17-24\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LerouxANew17_24-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Leroux, C.; Jones, H.; Clenet, A.; Tisseyre, B.&#32;(2017).&#32;\"A new approach for zoning irregularly-spaced, within-field data\".&#32;<i>Computers and Electronics in Agriculture<\/i>&#32;<b>141<\/b>&#32;(9): 196\u2013206.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.compag.2017.07.025\" target=\"_blank\">10.1016\/j.compag.2017.07.025<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+new+approach+for+zoning+irregularly-spaced%2C+within-field+data&amp;rft.jtitle=Computers+and+Electronics+in+Agriculture&amp;rft.aulast=Leroux%2C+C.%3B+Jones%2C+H.%3B+Clenet%2C+A.%3B+Tisseyre%2C+B.&amp;rft.au=Leroux%2C+C.%3B+Jones%2C+H.%3B+Clenet%2C+A.%3B+Tisseyre%2C+B.&amp;rft.date=2017&amp;rft.volume=141&amp;rft.issue=9&amp;rft.pages=196%E2%80%93206&amp;rft_id=info:doi\/10.1016%2Fj.compag.2017.07.025&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-RoudierManage08-25\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-RoudierManage08_25-0\" rel=\"external_link\">25.0<\/a><\/sup> <sup><a href=\"#cite_ref-RoudierManage08_25-1\" rel=\"external_link\">25.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Roudier, P.; Tisseyre, B.; Poilv\u00e9, H.; Roger, J.-M.&#32;(2008).&#32;\"Management zone delineation using a modified watershed algorithm\".&#32;<i>Precision Agriculture<\/i>&#32;<b>9<\/b>: 233.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs11119-008-9067-z\" target=\"_blank\">10.1007\/s11119-008-9067-z<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Management+zone+delineation+using+a+modified+watershed+algorithm&amp;rft.jtitle=Precision+Agriculture&amp;rft.aulast=Roudier%2C+P.%3B+Tisseyre%2C+B.%3B+Poilv%C3%A9%2C+H.%3B+Roger%2C+J.-M.&amp;rft.au=Roudier%2C+P.%3B+Tisseyre%2C+B.%3B+Poilv%C3%A9%2C+H.%3B+Roger%2C+J.-M.&amp;rft.date=2008&amp;rft.volume=9&amp;rft.pages=233&amp;rft_id=info:doi\/10.1007%2Fs11119-008-9067-z&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WhelanVesper01-26\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WhelanVesper01_26-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Whelan, B.M.; McBratney, A.B.; Minasny, B.&#32;(2001).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.semanticscholar.org\/paper\/Vesper-%E2%80%93-Spatial-Prediction-Software-for-Precision-Whelan-Mcbratney\/52caaed8c82c943d760e3166e75d783c26d3dfe4\" target=\"_blank\">\"Vesper\u2014Spatial prediction software for precision agriculture\"<\/a>.&#32;<i>ECPA 2001, Proceedings of the 3rd European Conference on Precision Agriculture<\/i>: 139\u201344<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.semanticscholar.org\/paper\/Vesper-%E2%80%93-Spatial-Prediction-Software-for-Precision-Whelan-Mcbratney\/52caaed8c82c943d760e3166e75d783c26d3dfe4\" target=\"_blank\">https:\/\/www.semanticscholar.org\/paper\/Vesper-%E2%80%93-Spatial-Prediction-Software-for-Precision-Whelan-Mcbratney\/52caaed8c82c943d760e3166e75d783c26d3dfe4<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Vesper%E2%80%94Spatial+prediction+software+for+precision+agriculture&amp;rft.jtitle=ECPA+2001%2C+Proceedings+of+the+3rd+European+Conference+on+Precision+Agriculture&amp;rft.aulast=Whelan%2C+B.M.%3B+McBratney%2C+A.B.%3B+Minasny%2C+B.&amp;rft.au=Whelan%2C+B.M.%3B+McBratney%2C+A.B.%3B+Minasny%2C+B.&amp;rft.date=2001&amp;rft.pages=139%E2%80%9344&amp;rft_id=https%3A%2F%2Fwww.semanticscholar.org%2Fpaper%2FVesper-%25E2%2580%2593-Spatial-Prediction-Software-for-Precision-Whelan-Mcbratney%2F52caaed8c82c943d760e3166e75d783c26d3dfe4&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SudduthYield12-27\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SudduthYield12_27-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Sudduth, K.A.; Drummond, S.T.; Myers, D.B.&#32;(2012).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/extension.missouri.edu\/sare\/documents\/asabeyieldeditor2012.pdf\" target=\"_blank\">\"Yield Editor 2.0: Software for Automated Removal of Yield Map Errors\"<\/a>.&#32;<i>Proceedings of the 2012 ASABE Annual International Meeting<\/i>: 1\u201314<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/extension.missouri.edu\/sare\/documents\/asabeyieldeditor2012.pdf\" target=\"_blank\">http:\/\/extension.missouri.edu\/sare\/documents\/asabeyieldeditor2012.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Yield+Editor+2.0%3A+Software+for+Automated+Removal+of+Yield+Map+Errors&amp;rft.jtitle=Proceedings+of+the+2012+ASABE+Annual+International+Meeting&amp;rft.aulast=Sudduth%2C+K.A.%3B+Drummond%2C+S.T.%3B+Myers%2C+D.B.&amp;rft.au=Sudduth%2C+K.A.%3B+Drummond%2C+S.T.%3B+Myers%2C+D.B.&amp;rft.date=2012&amp;rft.pages=1%E2%80%9314&amp;rft_id=http%3A%2F%2Fextension.missouri.edu%2Fsare%2Fdocuments%2Fasabeyieldeditor2012.pdf&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SimbahanScreen03-28\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-SimbahanScreen03_28-0\" rel=\"external_link\">28.0<\/a><\/sup> <sup><a href=\"#cite_ref-SimbahanScreen03_28-1\" rel=\"external_link\">28.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Simbahan, G.C.; Dobermann, A.; Ping, J.L.&#32;(2003).&#32;\"Screening Yield Monitor Data Improves Grain Yield Maps\".&#32;<i>Agronomy Journal<\/i>&#32;<b>96<\/b>&#32;(4): 1091\u2013102.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.2134%2Fagronj2004.1091\" target=\"_blank\">10.2134\/agronj2004.1091<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Screening+Yield+Monitor+Data+Improves+Grain+Yield+Maps&amp;rft.jtitle=Agronomy+Journal&amp;rft.aulast=Simbahan%2C+G.C.%3B+Dobermann%2C+A.%3B+Ping%2C+J.L.&amp;rft.au=Simbahan%2C+G.C.%3B+Dobermann%2C+A.%3B+Ping%2C+J.L.&amp;rft.date=2003&amp;rft.volume=96&amp;rft.issue=4&amp;rft.pages=1091%E2%80%93102&amp;rft_id=info:doi\/10.2134%2Fagronj2004.1091&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KrishnanWeb16-29\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KrishnanWeb16_29-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Krishnan, P.; Sharma, R.K.; Dass, A. et al.&#32;(2016).&#32;\"Web-based crop model: Web InfoCrop \u2013 Wheat to simulate the growth and yield of wheat\".&#32;<i>Computers and Electronics in Agriculture<\/i>&#32;<b>127<\/b>&#32;(9): 324\u201335.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.compag.2016.06.008\" target=\"_blank\">10.1016\/j.compag.2016.06.008<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Web-based+crop+model%3A+Web+InfoCrop+%E2%80%93+Wheat+to+simulate+the+growth+and+yield+of+wheat&amp;rft.jtitle=Computers+and+Electronics+in+Agriculture&amp;rft.aulast=Krishnan%2C+P.%3B+Sharma%2C+R.K.%3B+Dass%2C+A.+et+al.&amp;rft.au=Krishnan%2C+P.%3B+Sharma%2C+R.K.%3B+Dass%2C+A.+et+al.&amp;rft.date=2016&amp;rft.volume=127&amp;rft.issue=9&amp;rft.pages=324%E2%80%9335&amp;rft_id=info:doi\/10.1016%2Fj.compag.2016.06.008&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GuillaumeSoft13-30\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-GuillaumeSoft13_30-0\" rel=\"external_link\">30.0<\/a><\/sup> <sup><a href=\"#cite_ref-GuillaumeSoft13_30-1\" rel=\"external_link\">30.1<\/a><\/sup> <sup><a href=\"#cite_ref-GuillaumeSoft13_30-2\" rel=\"external_link\">30.2<\/a><\/sup> <sup><a href=\"#cite_ref-GuillaumeSoft13_30-3\" rel=\"external_link\">30.3<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Guillaume, S.; Charnomordic, B.; Tisseyre, B.; Taylor, J.&#32;(2013).&#32;\"Soft computing-based decision support tools for spatial data\".&#32;<i>International Journal of Computational Intelligence Systems<\/i>&#32;<b>6<\/b>&#32;(Sup. 
1): 18\u201333.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1080%2F18756891.2013.818185\" target=\"_blank\">10.1080\/18756891.2013.818185<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Soft+computing-based+decision+support+tools+for+spatial+data&amp;rft.jtitle=International+Journal+of+Computational+Intelligence+Systems&amp;rft.aulast=Guillaume%2C+S.%3B+Charnomordic%2C+B.%3B+Tisseyre%2C+B.%3B+Taylor%2C+J.&amp;rft.au=Guillaume%2C+S.%3B+Charnomordic%2C+B.%3B+Tisseyre%2C+B.%3B+Taylor%2C+J.&amp;rft.date=2013&amp;rft.volume=6&amp;rft.issue=Sup.+1&amp;rft.pages=18%E2%80%9333&amp;rft_id=info:doi\/10.1080%2F18756891.2013.818185&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TisseyreATech08-31\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-TisseyreATech08_31-0\" rel=\"external_link\">31.0<\/a><\/sup> <sup><a href=\"#cite_ref-TisseyreATech08_31-1\" rel=\"external_link\">31.1<\/a><\/sup> <sup><a href=\"#cite_ref-TisseyreATech08_31-2\" rel=\"external_link\">31.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Tisseyre, B.; McBratney, A.B.&#32;(2008).&#32;\"A technical opportunity index based on mathematical morphology for site-specific management: An application to viticulture\".&#32;<i>Precision Agriculture<\/i>&#32;<b>9<\/b>&#32;(1\u20132): 101\u201313.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/dx.doi.org\/10.1007%2Fs11119-008-9053-5\" target=\"_blank\">10.1007\/s11119-008-9053-5<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+technical+opportunity+index+based+on+mathematical+morphology+for+site-specific+management%3A+An+application+to+viticulture&amp;rft.jtitle=Precision+Agriculture&amp;rft.aulast=Tisseyre%2C+B.%3B+McBratney%2C+A.B.&amp;rft.au=Tisseyre%2C+B.%3B+McBratney%2C+A.B.&amp;rft.date=2008&amp;rft.volume=9&amp;rft.issue=1%E2%80%932&amp;rft.pages=101%E2%80%9313&amp;rft_id=info:doi\/10.1007%2Fs11119-008-9053-5&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GuillaumeFuzzy13-32\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GuillaumeFuzzy13_32-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Guillaume, S.; Charnomordic, B.; Loisel, P.&#32;(2013).&#32;\"Fuzzy partitions: A way to integrate expert knowledge into distance calculations\".&#32;<i>Information Sciences<\/i>&#32;<b>245<\/b>&#32;(10): 76\u201395.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.ins.2012.07.045\" target=\"_blank\">10.1016\/j.ins.2012.07.045<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Fuzzy+partitions%3A+A+way+to+integrate+expert+knowledge+into+distance+calculations&amp;rft.jtitle=Information+Sciences&amp;rft.aulast=Guillaume%2C+S.%3B+Charnomordic%2C+B.%3B+Loisel%2C+P.&amp;rft.au=Guillaume%2C+S.%3B+Charnomordic%2C+B.%3B+Loisel%2C+P.&amp;rft.date=2013&amp;rft.volume=245&amp;rft.issue=10&amp;rft.pages=76%E2%80%9395&amp;rft_id=info:doi\/10.1016%2Fj.ins.2012.07.045&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-YagerOnOrdered88-33\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-YagerOnOrdered88_33-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Yager, R.R.&#32;(1988).&#32;\"On ordered weighted averaging aggregation operators in multicriteria decisionmaking\".&#32;<i>IEEE Transactions on Systems, Man, and Cybernetics<\/i>&#32;<b>18<\/b>&#32;(1): 183\u201390.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2F21.87068\" target=\"_blank\">10.1109\/21.87068<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=On+ordered+weighted+averaging+aggregation+operators+in+multicriteria+decisionmaking&amp;rft.jtitle=IEEE+Transactions+on+Systems%2C+Man%2C+and+Cybernetics&amp;rft.aulast=Yager%2C+R.R.&amp;rft.au=Yager%2C+R.R.&amp;rft.date=1988&amp;rft.volume=18&amp;rft.issue=1&amp;rft.pages=183%E2%80%9390&amp;rft_id=info:doi\/10.1109%2F21.87068&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LamourMapping17-34\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-LamourMapping17_34-0\" rel=\"external_link\">34.0<\/a><\/sup> <sup><a href=\"#cite_ref-LamourMapping17_34-1\" rel=\"external_link\">34.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Lamour, J.; Naud, O.; Lechaudel, M.; Tisseyre, B.&#32;(2017).&#32;\"Mapping properties of an asynchronous crop: The example of time interval between flowering and maturity of banana\".&#32;<i>Advances in Animal Biosciences<\/i>&#32;<b>8<\/b>&#32;(2): 481\u20136.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1017%2FS2040470017000449\" target=\"_blank\">10.1017\/S2040470017000449<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Mapping+properties+of+an+asynchronous+crop%3A+The+example+of+time+interval+between+flowering+and+maturity+of+banana&amp;rft.jtitle=Advances+in+Animal+Biosciences&amp;rft.aulast=Lamour%2C+J.%3B+Naud%2C+O.%3B+Lechaudel%2C+M.%3B+Tisseyre%2C+B.&amp;rft.au=Lamour%2C+J.%3B+Naud%2C+O.%3B+Lechaudel%2C+M.%3B+Tisseyre%2C+B.&amp;rft.date=2017&amp;rft.volume=8&amp;rft.issue=2&amp;rft.pages=481%E2%80%936&amp;rft_id=info:doi\/10.1017%2FS2040470017000449&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SantestebanArePrec13-35\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-SantestebanArePrec13_35-0\" rel=\"external_link\">35.0<\/a><\/sup> <sup><a href=\"#cite_ref-SantestebanArePrec13_35-1\" rel=\"external_link\">35.1<\/a><\/sup> <sup><a href=\"#cite_ref-SantestebanArePrec13_35-2\" rel=\"external_link\">35.2<\/a><\/sup> <sup><a href=\"#cite_ref-SantestebanArePrec13_35-3\" rel=\"external_link\">35.3<\/a><\/sup> <sup><a href=\"#cite_ref-SantestebanArePrec13_35-4\" rel=\"external_link\">35.4<\/a><\/sup> <sup><a href=\"#cite_ref-SantestebanArePrec13_35-5\" rel=\"external_link\">35.5<\/a><\/sup> <sup><a href=\"#cite_ref-SantestebanArePrec13_35-6\" rel=\"external_link\">35.6<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Santesteban, L.G.; Guillaume, S.; Royo, J.B.; Tisseyre, B.&#32;(2013).&#32;\"Are precision agriculture tools and methods relevant at the whole-vineyard scale?\".&#32;<i>Precision Agriculture<\/i>&#32;<b>14<\/b>&#32;(1): 2\u201317.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/dx.doi.org\/10.1007%2Fs11119-012-9268-3\" target=\"_blank\">10.1007\/s11119-012-9268-3<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Are+precision+agriculture+tools+and+methods+relevant+at+the+whole-vineyard+scale%3F&amp;rft.jtitle=Precision+Agriculture&amp;rft.aulast=Santesteban%2C+L.G.%3B+Guillaume%2C+S.%3B+Royo%2C+J.B.%3B+Tisseyre%2C+B.&amp;rft.au=Santesteban%2C+L.G.%3B+Guillaume%2C+S.%3B+Royo%2C+J.B.%3B+Tisseyre%2C+B.&amp;rft.date=2013&amp;rft.volume=14&amp;rft.issue=1&amp;rft.pages=2%E2%80%9317&amp;rft_id=info:doi\/10.1007%2Fs11119-012-9268-3&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to grammar, spelling, and presentation, including the addition of PMCID and DOI when they were missing from the original reference.\n<\/p>\n<!-- \nNewPP limit report\nCached time: 20181214193153\nCache expiry: 86400\nDynamic content: false\nCPU time usage: 0.814 seconds\nReal time usage: 0.864 seconds\nPreprocessor visited node count: 26792\/1000000\nPreprocessor generated node count: 36395\/1000000\nPost\u2010expand include size: 186595\/2097152 bytes\nTemplate argument size: 63603\/2097152 bytes\nHighest expansion depth: 15\/40\nExpensive parser function count: 0\/100\n-->\n\n<!-- \nTransclusion expansion time report (%,ms,calls,template)\n100.00% 778.919 1 - -total\n 85.66% 667.205 1 - Template:Reflist\n 75.44% 587.639 35 - Template:Citation\/core\n 73.96% 576.120 34 - Template:Cite_journal\n 7.80% 60.727 1 - Template:Infobox_journal_article\n 7.47% 58.170 1 - Template:Infobox\n 6.10% 47.525 33 - Template:Citation\/identifier\n 5.60% 43.632 1 - 
Template:Cite_book\n 4.48% 34.863 80 - Template:Infobox\/row\n 3.37% 26.288 35 - Template:Citation\/make_link\n-->\n\n<!-- Saved in parser cache with key limswiki:pcache:idhash:10669-0!*!0!!en!5!* and timestamp 20181214193152 and revision id 33573\n -->\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data\">https:\/\/www.limswiki.org\/index.php\/Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","c443b688b80703848e965b29dc3cba01_images":["https:\/\/www.limswiki.org\/images\/2\/28\/Fig1_Leroux_Agri2018_8-6.jpg","https:\/\/www.limswiki.org\/images\/0\/0d\/Fig2_Leroux_Agri2018_8-6.jpg","https:\/\/www.limswiki.org\/images\/5\/59\/Fig3_Leroux_Agri2018_8-6.png","https:\/\/www.limswiki.org\/images\/4\/42\/Fig4_Leroux_Agri2018_8-6.png","https:\/\/www.limswiki.org\/images\/a\/a0\/Fig5_Leroux_Agri2018_8-6.jpg","https:\/\/www.limswiki.org\/images\/c\/cf\/Fig6_Leroux_Agri2018_8-6.jpg","https:\/\/www.limswiki.org\/images\/3\/3b\/Fig7_Leroux_Agri2018_8-6.png","https:\/\/www.limswiki.org\/images\/5\/5e\/Fig8_Leroux_Agri2018_8-6.png","https:\/\/www.limswiki.org\/images\/6\/6f\/Fig8b_Leroux_Agri2018_8-6.png","https:\/\/www.limswiki.org\/images\/1\/1f\/Fig9_Leroux_Agri2018_8-6.png","https:\/\/www.limswiki.org\/images\/4\/4c\/Fig10_Leroux_Agri2018_8-6.png","https:\/\/www.limswiki.org\/images\/b\/b3\/Fig11_Leroux_Agri2018_8-6.png","https:\/\/www.limswiki.org\/images\/b\/bc\/Fig12_Leroux_Agri2018_8-6.png","https:\/\/www.limswiki.org\/images\/4\/4e\/Fig13_Leroux_Agri2018_8-6.png","https:\/\/www.limswik
i.org\/images\/1\/14\/Fig14_Leroux_Agri2018_8-6.png"],"c443b688b80703848e965b29dc3cba01_timestamp":1544815912,"69cd9560f847d37e95c0bf5ffc36d532_type":"article","69cd9560f847d37e95c0bf5ffc36d532_title":"Wireless positioning in IoT: A look at current and future trends (e Silva et al. 2018)","69cd9560f847d37e95c0bf5ffc36d532_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends","69cd9560f847d37e95c0bf5ffc36d532_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:Wireless positioning in IoT: A look at current and future trends\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nWireless positioning in IoT: A look at current and future trendsJournal\n \nSensorsAuthor(s)\n \ne Silva, Pedro Figueiredo; Kaseva, Ville; Lohan, Elena SimonaAuthor affiliation(s)\n \nTampere University of Technology, WirepasPrimary contact\n \nEmail: pedro.figs.silva@gmail.comYear published\n \n2018Volume and issue\n \n18(8)Page(s)\n \n2470DOI\n \n10.3390\/s18082470ISSN\n \n1424-8220Distribution license\n \nCreative Commons Attribution 4.0 InternationalWebsite\n \nhttp:\/\/www.mdpi.com\/1424-8220\/18\/8\/2470\/htmDownload\n \nhttp:\/\/www.mdpi.com\/1424-8220\/18\/8\/2470\/pdf (PDF)\n\n\n\n\n \n This article contains rendered mathematical formulae. You may require the Math Anywhere plugin for Chrome or the Native MathML add-on and fonts for Firefox if they don't render properly for you. 
\n\n\nContents\n\n1 Abstract \n2 Introduction \n3 Related work \n4 Designing an IoT positioning system \n\n4.1 Positioning domains \n\n4.1.1 Power domain \n4.1.2 Time domain \n4.1.3 Space domain \n\n\n4.2 IoT classifications \n4.3 IoT system parameters \n\n4.3.1 Topology \n4.3.2 Range \n4.3.3 Channel bandwidth \n4.3.4 Carrier frequency \n4.3.5 Modulation types \n4.3.6 Positioning signaling\/data exchange \n4.3.7 Roaming \n4.3.8 Network ownership \n4.3.9 Power consumption \n\n\n4.4 Comparing IoT technologies and IoT enablers \n\n\n5 Simulation-based performance metrics \n\n5.1 Case Study 1: 802.11az IoT enabler, simulation-based results, time domain \n5.2 Case-Study 2: LoRa, simulation-based results, time domain \n\n\n6 Measurement-based performance metrics with Wirepas IoT platform \n7 Conclusions \n8 Acknowledgements \n\n8.1 Author contributions \n8.2 Funding \n8.3 Conflicts of interest \n\n\n9 References \n10 Notes \n\n\n\nAbstract \nConnectivity solutions for the internet of things (IoT) aim to support the needs imposed by several applications or use cases across multiple sectors, such as logistics, agriculture, asset management, or smart lighting. Each of these applications has its own challenges to solve, such as dealing with large or massive networks, low and ultra-low latency requirements, long battery life requirements (i.e., more than ten years operation on battery), continuously monitoring of the location of certain nodes, security, and authentication. Hence, a part of picking a connectivity solution for a certain application depends on how well its features solve the specific needs of the end application. One key feature that we see as a need for future IoT networks is the ability to provide location-based information for large-scale IoT applications. 
The goal of this paper is to highlight the importance of positioning features for IoT applications and to provide means of comparing and evaluating different connectivity protocols in terms of their positioning capabilities. Our compact and unified analysis ends with several case studies, both simulation-based and measurement-based, which show that high positioning accuracy on low-cost low-power devices is feasible if one designs the system properly.\nKeywords: internet of things (IoT), wireless positioning, indoor location\n\nIntroduction \nNowadays, the amount of connected wireless devices is growing, e.g., smart watches, smart light bulbs, smart toothbrushes, smart coffee mugs, etc. The trend in the information technology industry is towards connecting and extracting analytics from a variety of inter-connected wireless devices.\nWhile many IoT applications have so far focused on the consumer realm, more and more industrial applications are also appearing, such as utilities measurement (e.g., water, electricity, and gas), industrial lighting, logistics, and smart agriculture. Enabling such industrial applications means that IoT networks need to support large amounts of devices, multiple years of operation on battery, different latency requirements, and low costs per unit.\nWe believe that, on top of the communications and reliability requirements of a wireless link, many IoT applications will require or benefit from knowing the location of certain devices. Such location information will be needed seamlessly, both indoors and outdoors, and without the battery-draining Global Navigation Satellite Systems (GNSS) chipsets. The need for localization and tracking appears not only from the network management point of view, but also from a business perspective, driving new business models and new business avenues.\nNevertheless, enabling or creating a positioning system with an IoT network is not a trivial task. 
The reason behind this is that industrial applications seek a low per-unit cost for their IoT devices, which results in devices with very limited hardware components, such as CPU, memory, and battery. The limited hardware has an impact on the number of devices that a single device can serve and how fast it can process network and application requests. However, while CPU and memory will have an important impact on the scale of the network, the biggest challenge for enabling a positioning system lies in the proper management of the devices\u2019 radio.\nThe need for proper radio management becomes evident as there are devices with known coordinates, which will broadcast specific payloads on a regular basis, and other devices whose locations are to be determined, which will need to scan the spectrum frequently. Hence, too frequent broadcasting will lead to spectrum congestion and increased packet collision, whereas frequent scanning leads to high battery consumption, which is particularly problematic for battery-operated devices.\nOverall, the biggest challenge to tackle for an IoT positioning network is to balance the power consumption against the performance of the system. A very reactive system will have to rely on frequent scanning and broadcasting by its members, which means that devices will need to draw large quantities of power. A less reactive system will draw less power, with devices scanning very seldom.\nThe goal of this paper is to provide insight into the positioning capabilities of the current IoT technologies and other relevant IoT-enabling wireless technologies, such as WiFi. The paper starts by classifying three domains of positioning and discussing the main shortcomings of each of these domains for IoT devices. It then classifies the different IoT solutions according to six classification criteria, and it provides a discussion on the main system parameters relevant to positioning and tracking purposes. 
This discussion acts as a basis for comparison between the different IoT wireless solutions. To further complement this discussion, we present positioning results based on simulation-based scenarios and field experiments with a platform built on top of the Wirepas Mesh connectivity solution. In the end, we provide a short summary and conclusions of our findings.\n\nRelated work \nAt this moment, to the authors\u2019 best knowledge, there are no comprehensive comparisons in the literature between different IoT protocols in terms of their positioning capabilities. There are, however, other studies that compare specific IoT technologies and which look at IoT from the communications point of view, as well as studies focusing on positioning with a particular technology, such as narrow-band IoT (NB-IoT) or BLE. In this section, we highlight the related work from literature studies.\nA survey of localization methods for 5G, containing a short section also on IoT positioning, has been recently published as a white paper by the European Cooperation in Science &amp; Technology (COST).[1] It has also been emphasized in this paper that localization will become a key component of future 5G systems, though accurate future localization solutions in 5G should exploit the multipath and non-line-of-sight information and should put more emphasis on heterogeneous data fusion mechanisms. However, such advanced solutions would also increase the power consumption on the devices and are not well-suited for the majority of IoT systems. Distinct from COST's work[1], our paper focuses mostly on low-cost low-power consumption IoT solutions.\nLin et al.[2] focus on the Long-Term Evolution (LTE) Machine type communications (LTE-M) and Narrow Band Internet of Things (NB-IoT) protocols and their positioning capabilities. 
The authors demonstrate that at 46 dBm power of the transmit AN, positioning accuracy goes to around 10 m and that NB-IoT protocol supports better positioning accuracy than LTE-M protocol.[2] A similar study focuses on indoor localization via improved received signal strength (RSS) fingerprinting in generic IoT devices.[3] The results are based on 802.11b\/g\/n signals where location errors below 5 m are achieved in more than 50 percent of the studied cases.\ndel Peral-Rosado et al. investigate a time-domain based positioning with additional frequency hopping for the NB-IoT system. The obtained positioning accuracy is down to 30\u201350 m under strong signal-to-noise ratio conditions, and it deteriorates quickly for medium and low signal-to-noise ratios.[4]\nChen et al.[5] released a study complementary to our work, looking at IoT positioning from the perspective of security, privacy, and robustness of the localization technology. No positioning results were reported in the study. Another complementary study by Singh and Kapoor[6] focuses on existing and emerging software and hardware platforms for IoT applications, but positioning was not part of that study. IoT positioning has recently been considered by Zhang et al.[7] from the point of view of spoofing resistance in time of arrival (TOA) ultra-wideband (UWB) for IoT systems.\nOther complementary comprehensive studies, focusing solely on the communication aspects of IoT, are authored by Al-Sarawi et al.[8] and Raza et al.[9]\n\nDesigning an IoT positioning system \nAt its core, a positioning system translates a set of measurements from well-known reference points into a coordinate pair. The reference points\u2014known as anchors in localization terminology or Access Nodes (AN) in IoT terminology\u2014act as a means for the device of interest, a mobile or an IoT tag, to be in a local or global reference frame. 
Depending on who takes the measurements, the positioning is considered to be network-centric (i.e., when the anchors make the positioning-related measurements) or device-centric (i.e., when the IoT end nodes or tags perform the positioning-related measurements).\nThese two types of positioning have very different implications for security and privacy, which should always be carefully considered with regard to the final application. For example, privacy-preserving positioning solutions are easier to achieve in a device-centric approach than in a network-centric approach, as the device would not need to disclose its position to the network.\n\nPositioning domains \nIn terms of measurements, there are multiple domains from which they can be extracted, as long as there are means to do so in the devices. For that reason, we briefly present three of the main domains we consider of interest for an IoT positioning system:\n\n power or signal strength-based\n time-based\n space-based\nOther domains, such as natural or artificial fields (e.g., geo-magnetic field, light, sounds, or smell), are out of the scope of our study, but they could also serve as relevant sources of information for future IoT positioning systems.\nThe following subsections provide a short summary of the main challenges in each of these three positioning domains and their system-wide impacts.\n\nPower domain \nSignal strength measurements are derived from the protocol operation, which most of the time results in a measurement at no additional cost to the device or its battery consumption. However, positioning solutions in the power domain must tackle several challenges, in particular those related to the fast fluctuations of the Received Signal Strength (RSS) or of the backscattered power (BP), due to fading and shadowing caused by the surrounding environment. 
Modeling the RSS measurements relies on understanding, with a given degree of accuracy, how the signal power changes with the surrounding environment. Models that express the signal power as a function of the distance between the transmitter and the receiver are known as path-loss models.[10][11] A typical empirical log-distance model is the single-slope path-loss model[11]:\n\n $$P_{r}(d)=P_{r}(d_{0})-10\\eta \\log _{10}\\left({\\frac {d}{d_{0}}}\\right)+w$$,\nwhere $P_{r}(\\cdot )&lt;0$ is the received signal power in logarithmic scale as a function of the distance $d$, $d_{0}$ is a reference distance (usually 1 m), $\\eta &gt;0$ is the path-loss exponent, and $w\\sim \\log \\left(N\\left(0,\\sigma ^{2}\\right)\\right)$ is a log-normally distributed random variable that models the slow fading phenomenon and possible RSS measurement errors (e.g., due to quantization). Both $\\eta $ and $w$ depend on the propagation environment and, typically, on the device type. 
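As a minimal sketch of the single-slope model above (not code from the original article), the following Python functions evaluate the model and invert it to obtain a distance estimate from an RSS reading. The function names and the numeric values (a reference power of -40 dBm at 1 m and a path-loss exponent of 2) are illustrative assumptions, and the inversion simply ignores the fading term w.

```python
import math


def rss_dbm(d, p0_dbm, eta, d0=1.0, w=0.0):
    """Single-slope log-distance model:
    P_r(d) = P_r(d0) - 10 * eta * log10(d / d0) + w   (all powers in dBm).
    """
    return p0_dbm - 10.0 * eta * math.log10(d / d0) + w


def distance_from_rss(p_dbm, p0_dbm, eta, d0=1.0):
    """Invert the model for d, assuming w = 0 (no fading or measurement error)."""
    return d0 * 10.0 ** ((p0_dbm - p_dbm) / (10.0 * eta))


# Hypothetical values: -40 dBm at the 1 m reference distance, eta = 2.
P0_DBM = -40.0
ETA = 2.0

rss_at_10m = rss_dbm(10.0, P0_DBM, ETA)
estimated_d = distance_from_rss(rss_at_10m, P0_DBM, ETA)

# Effect of a 6 dB error in the RSS reading on the distance estimate.
biased_d = distance_from_rss(rss_at_10m - 6.0, P0_DBM, ETA)
```

Under these assumed values, a reading that is 6 dB too low roughly doubles the estimated distance, which illustrates why coarse RSS quantization can noticeably degrade power-domain positioning.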
In addition, w can depend on factors such as device orientation and the number of people present in the measurement area at the time of acquisition.\nIn terms of an IoT positioning system, this information can be extracted directly from the communication signal, which means that there is no additional cost for the device. In terms of battery, the cost will depend on the number of location requests per second. Ideally, if the requirement is to have an opportunistic location, based on the sporadic communication of the device, acquiring the RSS-based positioning will have no impact on the battery life. However, if the device or the infrastructure has to listen periodically for a specific pilot signal, acquiring the RSS-based positioning will place further demands on battery consumption. One limitation of RSS-based approaches is that some current IoT standards support only a coarse RSS measurement (e.g., in steps of 6 dB), which can adversely impact the positioning accuracy, as the noise variance $\\sigma ^{2}$ will increase.\nAnother interesting aspect of the RSS measurements is that, based on simulations, RSS-based positioning errors are shown to be frequency independent (as shown later in Figure 1). However, one would expect different levels of location-based service at different frequency ranges. The frequency ranges can be coarsely divided into three categories: sub-GHz (i.e., carrier frequencies less than 1 GHz), GHz (1 to 30 GHz), and mmWave (above 30 GHz). Scattering becomes more prominent as frequency increases, thus one would expect a different target positioning accuracy for each frequency range. In addition, as the operating frequency increases, the antenna\u2019s effective area becomes smaller, and the signal coverage decreases. 
This can be seen in Figure 2, where the ideal signal propagation is drawn over a 100 by 100 square area, based on the Friis equation and assuming zero system gains, G,\n\n$$P_{r}\\left(d\\right)=P_{t}+G+20\\log _{10}\\left({\\frac {c}{4\\pi f}}\\right)-20\\log _{10}\\left(d\\right),$$\n\nwhere $P_{t}$ is the transmission power, f the operating frequency and c the speed of light.\n\nFigure 1. Comparative analysis of RSS-based estimates at various carrier frequencies and various AN densities\n\nFigure 2. Ideal radio signal propagation at 0.5, 2.4, 30 and 60 GHz\n\nBased on the signal\u2019s behavior, it is easy to understand that a sparser infrastructure at higher frequencies will likely result in a degradation of the positioning performance (as shown in Figure 1).\n\nTime domain \nPosition estimation from timing information relies on estimating the time-of-arrival (TOA) or the time-difference-of-arrival (TDOA) from three or more fixed access nodes and then converting those timing estimates into distances. For example, 3D location based on TOA is possible with three synchronized measurements from three known devices. 
The goal is to solve the following set of equations and find the node\u2019s location, $\\xi _{n}=\\left(x_{n},y_{n},z_{n}\\right)$, assuming that $\\xi _{a}=\\left(x_{a},y_{a},z_{a}\\right)$ are known coordinates:\n\n$$\\mathbf {L} _{n}={\\sqrt {\\left(\\xi _{a}-\\xi _{n}\\right)^{2}}}.$$\n\nFor TDOA, the range is now a difference of ranges, based on the TOA at the measurement device. Hence, the TDOA from a node n to a measurement device m would be written as\n\n$$\\mathbf {L} _{nm}=\\mathbf {L} _{n}-\\mathbf {L} _{m}.$$\n\nDue to this difference, the range measurement is free of errors imposed by the measurement device\u2019s clock, since the clock error cancels out when subtracting the two TOA measurements.\nOverall, the time measurements require synchronized clocks, either at the receiver or at the transmitter side, leading to a significant burden on device cost. This does not suit IoT applications, which are driven by the need for low-cost devices.\nIt is also important to keep in mind the relationship between bandwidth and accuracy for TOA measurements. This is illustrated in Figure 3, where the positioning error is plotted against the available channel bandwidth at different signal-to-noise ratio (SNR) values. 
Clearly, sub-meter positioning accuracy with time-based approaches is achievable only with high bandwidths (of the order of 100 MHz), and it is very challenging for narrowband and ultra-narrowband systems even at very high SNR.\n\nFigure 3. Comparative analysis of TOA-based position estimates at various bandwidths\n\nSpace domain \nIn the space domain, the ranges are estimated by measuring the angle (or direction) of arrival (AoA) of the signal of interest. Often, this is done by means of an antenna array or a sectorized antenna. For a given device n, it is possible to describe its measurement at m as\n\n$$\\left(x_{n},y_{n}\\right)=r_{m}\\cos \\left(\\theta _{m}\\right)+r_{m}\\sin \\left(\\theta _{m}\\right),$$\n\nwhere $r_{m}$ is the distance from m to n and $\\theta _{m}$ the angle of arrival determined at m. Hence, by solving for the unknown coordinates, one can obtain a range estimate.\nIn summary, AoA is particularly interesting for IoT, as the major constraint for achieving angle measurements relies only on the antenna design. However, its major drawback is that the error increases with the distance to the transmitter, which means that a small deviation in the angle results in a large error for the devices at the service edge.\n\nIoT classifications \nWhile there are several domains from which to extract measurements for building knowledge of a device\u2019s location, several limitations arise from the actual IoT system on which the positioning is built. 
The goal of this subsection is to introduce the IoT technologies by classifying them into six main categories (see Figure 4):\n\n licensed versus unlicensed, which refers to the operation in a protected band, such as cellular bands, versus operation in unlicensed bands, such as industrial, scientific and medical (ISM) bands\n operating frequency bands, which refers to the carrier frequency of each IoT technology; here, we divide the frequency spectrum into three parts: sub-GHz, GHz, and mmWave bands, with some IoT technologies spreading over multiple ranges\n protocols versus enablers, which refers to whether a technology is seen as a specific IoT communication protocol or a possible wireless positioning enabler\n range-based classification, which refers to short-, medium-, or long-range operation\n rate-based classification, which refers to Low-Rate (LR) or High-Rate (HR) data rates; typically, most IoT connectivity solutions are meant for LR, high-delay applications, while solutions such as WiFi and 5G cover HR and low-latency applications\n power-based classification, which refers to Low-Power (LP) versus High-Power (HP) operation; typically, LP approaches go hand in hand with LR approaches, while HP approaches go hand in hand with HR approaches; in LP operation, the devices can function for several years on batteries\n\nFigure 4. Classification of IoT networks\n\nIoT system parameters \nThis subsection discusses the relevant parameters for a positioning fit in an IoT system.\n\nTopology \nTopology relates to how a message passes from one node to another and to the possibility of discovering new nodes in the network. The network topology, illustrated in Figure 5, has a significant impact on how nodes with known locations are discovered by others. In a mesh topology, any node can be set as a reference node, whereas, in a star topology, only the access nodes can be defined as such. 
The density of the fixed nodes also plays an important role in the location accuracy. For example, a denser network with a well-spread distribution of nodes is likely to provide a better location accuracy than a network with few reference nodes all placed in the same direction from the device to be located. An IoT network typically has a star or mesh topology. In a star topology, devices can only talk to their parent device, while, in a mesh topology, nodes can exchange messages between each other. Star topologies are susceptible to single points of failure, since losing the connection to the parent means that the node will be outside the network. In a mesh topology, if a link fault occurs, the device can look for any other neighbor to connect to. Thus, mesh networks provide better coverage and, implicitly, they are likely to offer better positioning accuracy than star networks.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 5. Comparison of a simplified IoT mesh and star network topology\n\n\n\nRange \nRange of an IoT system is important in the sense that it defines an upper bound of the positioning error, which cannot be larger than the communication range. In this aspect, mesh-capable networks have a better footing for positioning purposes as any device can extend service without the need to have specific and dedicated infrastructure.\n\nChannel bandwidth \nChannel bandwidth is directly related to the achievable accuracy in positioning when a TOA-based estimation is used. 
The Cram\u00e9r\u2013Rao lower bound for any unbiased estimator[12] of a time delay $\\tau _{0}$ of a signal S is given as\n\n$$\\mathrm {var} \\left({\\hat {\\tau }}_{0}\\right)\\geq {\\frac {1}{{\\frac {\\varepsilon }{N_{0}\/2}}\\,{\\frac {\\int _{-\\infty }^{+\\infty }\\left(2\\pi f\\right)^{2}\\left|S\\left(f\\right)\\right|^{2}df}{\\int _{-\\infty }^{+\\infty }\\left|S\\left(f\\right)\\right|^{2}df}}}},$$\n\nwhere \u03b5 is the signal energy, $N_{0}$ the noise spectral density and $\\int _{-\\infty }^{+\\infty }\\left(2\\pi f\\right)^{2}\\left|S\\left(f\\right)\\right|^{2}df$ is the mean square bandwidth of the signal. 
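To give a feel for the bound above, the sketch below evaluates it for an idealized signal whose spectrum is flat over a bandwidth B centered at zero frequency. This spectrum shape is an assumption made only for illustration (the text does not specify one); for it, the ratio of the two integrals equals (2π)²B²/12. The function name and the example SNR value are likewise hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def toa_crlb_range_std(bandwidth_hz, snr_db):
    # Lower bound on the ranging standard deviation, in meters.
    # Assumes a flat spectrum over [-B/2, B/2], so the normalized mean square
    # bandwidth is (2*pi)^2 * B^2 / 12, and takes snr_db as the ratio
    # energy / (N0 / 2) expressed in dB. Both are assumptions of this sketch.
    snr = 10.0 ** (snr_db / 10.0)
    beta_sq = (2.0 * math.pi) ** 2 * bandwidth_hz ** 2 / 12.0
    var_tau = 1.0 / (snr * beta_sq)
    return C * math.sqrt(var_tau)


if __name__ == '__main__':
    for bw in (125e3, 1e6, 20e6, 100e6):
        print(f'{bw / 1e6:8.3f} MHz -> {toa_crlb_range_std(bw, 10.0):10.2f} m')
```

Under these assumptions the bound scales inversely with bandwidth, so a tenfold wider channel gives a tenfold smaller bound at the same SNR.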
However, since we do not have all the necessary information to accurately determine each IoT signal\u2019s spectral density, we provide instead the multipath resolution or time-frequency resolution, defined as follows:\n\n$$\\Delta t\\geq {\\frac {2\\pi }{\\Delta \\omega }}.$$\n\nThe above equation determines how the time duration $\\Delta t$ and the spectral bandwidth $\\Delta \\omega $ relate to each other. The spectral bandwidth is defined as the bandwidth that includes most of the signal\u2019s energy. In this study, we assume it to be equal to the channel bandwidth. Overall, what both equations in this subsection show is that, for time-based approaches, it is favorable to have signals with high SNR and short time duration (i.e., higher bandwidth).\n\nCarrier frequency \nCarrier frequency is inversely proportional to the signal wavelength and strongly influences the path losses exhibited by the signal. As we move from sub-GHz carriers towards mmWave carriers, the path losses become increasingly stronger, which results in smaller communication ranges. The differences in path losses are due to a multitude of phenomena, but, as frequency increases, they are especially due to the smaller effective area of the devices\u2019 antennae. Overall, combining lower carrier frequencies and mesh topologies results in an enhanced service coverage.\n\nModulation types \nModulation types in IoT systems span various digital modulations, from Ultra Narrow Band (UNB), defined as systems with bandwidths below 1 kHz, to Ultra Wide Band (UWB) modulations, i.e., bandwidths above 500 MHz. 
In addition, spread spectrum (SS) or Orthogonal Frequency Division Multiplexing (OFDM) modulations are also widely encountered. The modulation type plays a big role in the achievable positioning accuracy when TOA, TDOA, or AoA methods are used, but it has little or no impact when RSS methods are used. Certain modulation-based characteristics can be exploited for positioning purposes. For example, this is the case of SS signals (e.g., LoRa, ZigBee), where the spreading pseudo-random sequence can be used to infer the signal\u2019s travel time in a similar fashion to GNSS.\n\nPositioning signaling\/data exchange \nPositioning signaling or data exchange is the ability to use either pilot signals or sequences of data packets to provide the location of nearby devices. However, few of the existing IoT technologies support positioning-related signaling, except for most of the cellular IoT technologies (e.g., NB-IoT), which rely on the observed time difference of arrival (OTDOA), introduced in the LTE radio. Apart from the cellular IoT technologies, the future WiFi 802.11az standards also showcase a dedicated data exchange regarding the time-of-flight information to determine the location of its devices.\n\nRoaming \nRoaming is the ability to provide continuity of service across multiple networks, owned or not by a single entity. As mobility is a keystone of most positioning applications, it is important to take note of this when looking at IoT systems. In this aspect, protocols such as Sigfox or Ingenu are at an advantage, as they operate similarly to cellular systems and they offer service across multiple continents. Despite that, even proprietary solutions are starting to provide open application interface specifications and open guest periods in the radio access, which facilitate the exchange of data across multiple vendors and technologies.\n\nNetwork ownership \nNetwork ownership raises security and privacy concerns. 
Security is becoming a strong requirement in IoT systems, especially as the data access, transport, and storage become increasingly regulated by international and European bodies.[13] Technologies such as Ingenu and Sigfox own the entirety of the network, meaning that the transportation of data is under their full responsibility. Thus, positioning solutions enabled by such systems will be protected by the system provider, as the infrastructure device\u2019s location will not be known to the user.\n\nPower consumption \nPower consumption is a main topic for all IoT technologies. For positioning applications, low-power consumption is crucial for the viability of several systems, especially when the goal is to continuously track and monitor inexpensive items. For example, low-power consumption is mandatory in several use cases from the logistics and construction sectors.\n\nComparing IoT technologies and IoT enablers \nAfter discussing the positioning domain and the main system parameters relevant to positioning, here we present two comparative tables between 29 IoT solutions (see Table 1 and Table 2), with the goal of summing up the key points mentioned so far and enabling an easy comparison between the different technologies. Throughout the rest of this subsection, our goal is to make comparisons and drive the reader towards a better understanding of how a certain technology would fare as the backbone of a positioning system, in a GNSS-free case.\nTable 1 presents for each technology, from left to right, the network topology, network type, the impact of each measurement domain on the device battery life and cost, the achievable positioning accuracy, the most suitable domain, and reported accuracy studies.\n\n\n\n\n\n\n\nTable 1. 
Summary of key positioning-related aspects for several IoT protocols and IEEE 802.11\u2217 family protocols\n\u00b9 (+, +): low impact, (++, ++): medium impact, (+++, +++): high impact\n\u00b2 assuming implementation without external sensors, such as GNSS\n\nTechnology | Network Topology | Network Type | Time-Based Positioning (Battery, Device Cost)\u00b9 | Power-Based Positioning (Battery, Device Cost)\u00b9 | Space-Based Positioning (Battery, Device Cost)\u00b9 | Achievable Positioning Accuracy\u00b2 | Most Suitable Domain | Accuracy Studies\n5G | star | HR\/HP-Short range | +, + | +, + | +, + | High | Time | [14][15]\nANT+ | mesh | LR\/LP-Short range | +, +++ | +, + | ++, ++ | Low | Power | \nBLEmesh | mesh | LR\/LP-Short range | +, +++ | +, + | ++, + | Medium | Power | [11][16][17]\nDash7 | star | LR\/LP-Long range | +, +++ | +, + | ++, ++ | Low | Power or space | \nEC-GSM-IOT | star | HR\/LP-Long range | +, +++ | +, + | ++, ++ | Low | Power | \nEnOcean | mesh | LR\/LP-Long range | +, +++ | +, + | ++, ++ | Low | Power or space | \nIngenu \/RPMA | star | LR\/LP-Long range | +, +++ | +, + | ++, ++ | Medium | Power or space | \nISA101.11a | mesh | LR\/LP-Short range | +, +++ | +, + | ++, ++ | Medium | Power or space | \nLoRa | star | LR\/LP-Long range | +, ++ | +, + | ++, ++ | Medium | Power | [18]\nLTE-M | star | LR\/LP-Long range | +, + | +, + | ++, ++ | Medium | Time | [2]\nMiWi | mesh | LR\/LP-Long range | +, +++ | +, + | ++, ++ | Medium | Power | \nNB-IoT | star | LR\/LP-Long range | +, + | +, + | ++, ++ | Medium | Time | [2]\nRFID | star | LR\/LP-Short range | +, +++ | +, + | ++, ++ | Medium | Power | [19][20][21][22][23]\nSigfox | star | LR\/LP-Long range | +, +++ | +, + | ++, ++ | Medium | Power | [24][25]\nTelensa | star | LR\/LP-Long range | +, +++ | +, + | ++, ++ | Low | Power or space | \nThread | mesh | LR\/LP-Short range | +, +++ | +, + | ++, ++ | Medium | Power | \nWeightless-N | star | LR\/LP-Long range | +, +++ | +, + | ++, ++ | Medium | Power or space | \nWeightless-P | star | LR\/LP-Long range | +, +++ | +, + | ++, ++ | Low | Power or space | \nWeightless-W | star | LR\/LP-Long range | +, +++ | +, + | ++, ++ | Medium | Power | \nWirelessHART | mesh | LR\/LP-Short range | +, +++ | +, + | ++, ++ | Medium | Power | \nWiFi802.11af | star | HR\/HP-Long range | +, + | +, + | ++, ++ | High | Time | \nWiFi802.11ah\/HaLoW | star | LR\/LP-Long range | +, + | +, + | ++, ++ | High | Time | \nWiFi802.11az | star | HR\/HP-Short range | +, + | +, + | ++, ++ | High | Time | \nWiFi802.11p (V2X) | mesh | HR\/HP-Short range | +, + | +, + | ++, ++ | High | Time | [26][27]\nWirepas | mesh | HR-Long range | +, +++ | +, + | ++, ++ | Medium | Power | \nWiSUN | mesh | LR\/LP-Long range | +, +++ | +, + | ++, ++ | Medium | Power | \nZigBee\/ZigBee-NaN | mesh | LR\/LP-Long range | +, +++ | +, + | ++, ++ | Medium | Power | [28][29][30]\nZ-Wave | mesh | LR\/LP-Long range | +, +++ | +, + | ++, ++ | Medium | Power or space | \n\nThe Network Topology column maps each technology\u2019s topology to either a star or mesh topology.\nThe Network Type column presents the network type, where each entry starts with the rate, power consumption, and maximum operating range offered by the technology. The operating range has a correlation with the frequency bands in Table 2.\nThe three Impact columns show the impact on battery consumption and device cost for each measurement domain in discussion. While most of the technologies do not offer such capabilities, this classification assumes that it would be possible to couple the necessary measurement units to provide such information. 
Hence, the classifications of low impact (+), medium impact (++), or high impact (+++) are based on what the authors expect to be the additional burden in terms of device cost and battery consumption. The power domain is seen to be the one with the smallest impact, because it would be easily available to all of these technologies.\nThe Achievable Positioning Accuracy column provides a qualitative indicator for the expected accuracy based on what the technology currently offers. When available, this information is based on the related studies.\nThe Most Suitable Domain column states the most suitable measurement domain to use with each technology. The domain is attributed based on the technology\u2019s signal characteristics presented in Table 2 and its current capabilities.\n\nTable 2. Summary of key physical layer parameters for several IoT protocols\n\nTechnology | Frequency Bands | Channel Bandwidth (MHz) | Modulation Type (UNB\/NB\/SS\/OFDM\/UWB)\n5G | GHz, mmWave | <100 | OFDM\nANT+ | GHz | 1 | NB\nBLEmesh | GHz | 1 | NB\nDash7 | sub-GHz | 0.025, 0.200 | NB\nEC-GSM-IOT | sub-GHz | 0.2 | NB\nEnOcean | sub-GHz | 0.0625 | NB\nIngenu \/RPMA | sub-GHz and GHz | 1 | SS\nISA101.11a | GHz | 5 | SS\nLoRa | sub-GHz | 0.125, 0.500 | SS\nLTE-M | sub-GHz and GHz | 1.08, 1.4 | OFDM\nMiWi | sub-GHz and GHz | 0.040, 0.250 | NB\nNB-IoT | sub-GHz and GHz | 0.18 | NB, OFDM\nRFID | sub-GHz and GHz | 0.2 | NB\nSigfox | sub-GHz | 0.2 | UNB\nTelensa | sub-GHz | 0.1 | NB\nThread | GHz | 5 | NB\nWeightless-N | sub-GHz | 0.2 | UNB\nWeightless-P | sub-GHz | 0.0125 | NB\nWeightless-W | sub-GHz | 5 | SS\nWirelessHART | GHz | 0.25 | SS\nWiFi802.11af | sub-GHz | 8 | OFDM\nWiFi802.11ah\/HaLoW | sub-GHz | 1, 2, 4, 8, 16 | OFDM\nWiFi802.11az | GHz, mmWave | 20, 40, 60, 80, 160 | OFDM\nWiFi802.11p (V2X) | GHz | 10 | OFDM\nWirepas | sub-GHz and GHz | 0.126, 0.5 | NB\nWiSUN | sub-GHz and GHz | 0.2\u20131.2 | NB, SS, and OFDM\nZigBee | sub-GHz and GHz | 0.6, 1.2, 2 | SS\nZigBee-NaN | sub-GHz | 0.6, 1.2, 2 | SS\nZ-Wave | sub-GHz | 0.2 | NB\n\nTable 2 describes each technology\u2019s key physical aspects, such as frequency bands, channel bandwidth and modulation type.\nPositioning services often have high power demands. Operating a positioning-based infrastructure is often tied to the need for a fully plugged-in (powered) infrastructure. However, there are several industrial applications that would benefit from a fully battery-operated network, especially where an electricity network might not yet be present, e.g., construction sites, or for ease of service extension and maintainability.\nIn terms of positioning, we found that most IoT systems are yet to offer specific signaling to support accurate measurements for localization. A few of the existing IoT systems have already raised interest in the academic field in terms of their positioning capabilities, as shown in the last column of Table 1. Most of the existing studies focus on RSS-based approaches, and several of them rely on low-cost probabilistic methods requiring an underlying path loss model. The few studies that focus on time-based and space-based approaches mostly target the current and future cellular IoT signals derived from LTE, such as LTE-M, which retain some of LTE\u2019s positioning characteristics, such as positioning-specific signaling. In addition, future 5G networks are likely to rely on time-based and space-based positioning approaches. 
Our paper further contributes additional results based on RSS and time-based approaches, as shown in the next sections.\nIn addition, we have found that network-centric positioning solutions are being favored as opposed to device-centric ones, which is often related to the limited resources at the end nodes. However, a centralized architecture places an additional burden on the network capacity and latency as the number of devices grows. For many of the IoT systems, a centralized architecture will have difficulties accommodating real-time location systems, especially due to the strict latency requirements of such systems. Integration with other high-capacity technologies, such as WiFi and 5G, could decrease the latency at the expense of per-unit cost and power consumption. The support of positioning updates at very sparse intervals ought to be feasible for many IoT technologies, which will certainly find its application in several niche markets, especially if the positioning system is supported fully by battery-powered networks over a span of multiple years.\nTo further complement our study, we end with a perspective on what the achievable positioning accuracy is. The next two sections focus on simulation-based and measurement-based studies, respectively. We introduce simulation-based results from two systems whose performance was difficult to find as benchmarks in the existing literature, namely IEEE 802.11az and LoRa. 
Then, we present measurement-based results from an office environment of a positioning system built on top of Wirepas\u2019 mesh solution.\n\nSimulation-based performance metrics \nCase Study 1: 802.11az IoT enabler, simulation-based results, time domain \nIn 802.11az, a position estimate is obtained by solving the hyperbolic location problem based on the measured TOA $t_{A_{a}}$ at the mobile side from several ANs, where a is the AN index, $a=1,\\ldots ,N_{AN}$:\n\n$$t_{A_{a}}=t_{s}+{\\frac {d\\left(Tag,AN_{a}\\right)}{c}}+\\sum \\limits _{a=2}^{N_{AN}}\\left({\\frac {d\\left(AN_{a-1},AN_{a}\\right)}{c}}+t_{f}\\right),$$\n\nwhere $t_{s}$ is the starting time of transmission from one AN in the network, taken arbitrarily as the first AN ($AN_{1}$), c is the speed of light, $d\\left(Tag,AN_{a}\\right)$ is the geometric distance between the mobile device and the a-th AN, $t_{f}$ is the forwarding time of the signaling message between two access nodes, and $d\\left(AN_{a-1},AN_{a}\\right)$ is the geometric distance between the $(a-1)$-th AN and the a-th AN. With several noisy observations of the measured times of arrival, the IoT device can compute its position (as well as the unknown $t_{s}$). It is assumed that the AN positions and the forwarding time are known and transmitted in the signaling message. In addition, a minimum of four synchronized access points is needed to estimate the four unknowns $\\left(x,y,z,t_{s}\\right)$, with $\\left(x,y,z\\right)$ being the device location.\nTo understand the achievable location performance, we defined a simulation over a square area of 0.4 km2 at the highest bandwidth available (160 MHz). We observe in Figure 6 that this solution would be able to offer sub-meter accuracy 80 percent of the time when at least seven ANs are available.\n\nFigure 6. Example of 802.11az performance for various numbers of access nodes at a signal-to-noise ratio (SNR) of 10 dB\n\nCase Study 2: LoRa, simulation-based results, time domain \nA chirp spread spectrum (CSS) system with a 125 kHz bandwidth and a spreading factor of seven was used in the simulations. It was assumed that we have a single-floor square indoor area of 200 m \u00d7 200 m size, in which $N_{AN}$ access nodes are distributed uniformly, with $N_{AN}$ between 3 and 100. Ten thousand Monte Carlo iterations were used to randomly generate the positions of the ANs and of the IoT device. 
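The simulation geometry described above can be sketched as follows: in each Monte Carlo iteration, the access nodes and the IoT device are placed uniformly at random in the square area, and the noise-free propagation delays follow from the geometric distances. This is only an illustrative outline under simplifying assumptions (2D coordinates, hypothetical function names, a fixed random seed); it does not reproduce the chirp spread spectrum signal processing or the estimator behind the reported results.

```python
import math
import random

C = 299_792_458.0  # speed of light in m/s


def draw_scenario(n_an, side_m=200.0, rng=random):
    # Place n_an access nodes and one device uniformly in a square area.
    ans = [(rng.uniform(0.0, side_m), rng.uniform(0.0, side_m)) for _ in range(n_an)]
    device = (rng.uniform(0.0, side_m), rng.uniform(0.0, side_m))
    return ans, device


def true_toas(ans, device):
    # Noise-free propagation delays (seconds) between the device and each AN.
    return [math.dist(an, device) / C for an in ans]


if __name__ == '__main__':
    rng = random.Random(0)  # fixed seed, chosen only for repeatability
    for _ in range(3):  # the text uses ten thousand iterations
        ans, device = draw_scenario(n_an=3, rng=rng)
        print([round(t * 1e9, 1) for t in true_toas(ans, device)])
```

A full simulation would add measurement noise to these delays and feed them to a position estimator in each iteration.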
The positioning was based on the TOA principle, where the TOA was estimated based on the correlation between the incoming signal and a reference CSS code. The results are shown in Figure 7 in terms of the cumulative distribution function (CDF) of the error for different numbers of LoRa access nodes. For three access nodes and at an SNR = \u221218 dB, the positioning error is higher than 50 m in more than 50 percent of cases. On the other hand, with 100 access nodes distributed in the 0.4 km2 area, we can reach below 10 m accuracy in more than 50 percent of cases.\n\nFigure 7. Example of LoRa performance at various numbers of access nodes\n\nMeasurement-based performance metrics with Wirepas IoT platform \nThis section presents experimental results from an IoT positioning system built with the Wirepas IoT mesh solution. The results presented in this section were obtained at Wirepas\u2019 offices and are based on power measurements.\nThe environment where the measurements took place, with a total area of 180 m2 (10 by 18 meters), is a typical work environment with a few small rooms and large open areas (see Figure 8). Several battery-powered devices were placed across the floor, extending the network coverage in and outside the rooms. Some of these devices acted as known reference points and others as tracked devices. The reference points are identified in Figure 8 as routing devices (blue squares) and the measurement devices as yellow dots. All the devices were operating in the 2.4 GHz band using Nordic\u2019s nRF51 as the radio chipset.\n\nFigure 8. Office environment where the measurements were acquired\n\nIn this setup, the measurement devices were statically collecting information about network beacons broadcast periodically by the routing devices. The information about the routing devices\u2019 beacons, as seen by the measurement devices, was sent regularly towards the network sink. 
In turn, the network sink and gateway communicated the measurements to a positioning engine running on a local computer. The positioning engine provided a location estimate based on the known location of the routing devices and the RSS observed by the measurement devices. A location estimate was calculated by one-shot runs of a weighted centroid algorithm, meaning that no averaging or filtering was applied to the location estimates. However, the RSS measurements were averaged over a window of time to mitigate the channel propagation effects.\nIn addition to a location estimate on a global or local reference frame, the positioning engine also provided an area-based location. The area-based location consists of matching the location estimate to a set of geographical areas of interest (shaded areas in Figure 8). A device was considered to be in such an area when its location estimate was found to be inside the geographic area defined by the four coordinate points of that area.\nThe results in Table 3 show the probability of correctly classifying the measurement devices in the areas of interest. The percentage is calculated by comparing the number of location estimates in the node\u2019s correct area with the total number of location estimates in any other area of interest.\n\nTable 3. 
Experimental results with an IoT testbed using Wirepas connectivity with 60 fixes per second and static nodes\n\nArea (m\u00b2) | Office Hours | Outside Office Hours | All Day\n% of Correct Location Area Classification\n10 | 95.41 | 89.47 | 91.16\n10 | 96.16 | 97.59 | 97.18\n2 | 91.56 | 94.85 | 93.90\n3 | 96.76 | 99.20 | 98.50\nMean | 95.51 | 95.61 | 95.57\n\nThe results in Table 3 show that, over a full day, the devices were correctly located inside the logical area where they were known to be more than 90 percent of the time.\n\nConclusions \nWe believe positioning is important not only for IoT end applications, but also to support network self-management. Our paper addresses the lack of comprehensive studies comparing IoT solutions and their suitability for positioning applications. The paper first covered three possible measurement domains from which IoT devices could derive their location. Afterwards, we focused on the classification of the IoT solutions and discussed several system parameters that should be considered when designing a positioning system. We concluded our study with a comparative table and a discussion of multiple IoT and other wireless solutions. We also provided an overview of achievable system performance with unique results for three positioning systems built on top of IEEE 802.11az, LoRa, and Wirepas.\nOverall, based on our study, we conclude that power-domain positioning currently offers the best trade-off between implementation cost and positioning accuracy for low-power systems. Dedicated positioning signaling as well as space-based approaches are some of the feasible ways to push for higher accuracy and still offer low-power operation. Cooperation with other wireless technologies, such as WiFi and 5G, could allow for mobility support and the ability to operate at large scales when low-power operation is not critical.\n\nAcknowledgements \nAuthor contributions \nConceptualization, P.F.eS. 
and V.K.; Funding acquisition, E.-S.L.; Investigation, P.F.eS.; Software, P.F.eS. and E.-S.L.; Supervision, V.K. and E.-S.L.; Writing\u2014Review and Editing, P.F.eS., V.K. and E.-S.L.\n\nFunding \nThis work has been partly supported by EU FP7 Marie Curie Initial Training Network MULTI-POS (Multi-technology Positioning Professionals) under Grant No. 316528 and by the Academy of Finland, project numbers 303576 and 313039.\n\nConflicts of interest \nTwo of the authors are currently full time employees at Wirepas. Regardless of their relationship with Wirepas, the authors have kept the reporting objective and fair for all the technologies under examination. Furthermore the authors have not received any additional financial incentive to execute or complete this work. Wirepas contributed with hardware for the activities performed in this report, which was part of the research visit arranged within the MULTI-POS project.\n\nReferences \n\n1. del Peral-Rosado, J.A.; Seco-Granados, G.; Raulefs, R. et al. (April 2018). \"Whitepaper on New Localization Methods for 5G Wireless Systems and the Internet-of-Things\". In Witrisal, K.; Ant\u00f3n-Haro, C. (PDF). COST. http:\/\/www.iracon.org\/wp-content\/uploads\/2018\/03\/IRACON-WP2.pdf\n\n2. Lin, X.; Bergman, J.; Gunnarsson, F. et al. (2017). \"Positioning for the Internet of Things: A 3GPP Perspective\". IEEE Communications Magazine 55 (12): 179\u201385. doi:10.1109\/MCOM.2017.1700269.\n\n3. Lin, K.; Chen, M.; Deng, J. et al. (2016). \"Enhanced Fingerprinting and Trajectory Prediction for IoT Localization in Smart Buildings\". IEEE Transactions on Automation Science and Engineering 13 (3): 1294\u2013307. doi:10.1109\/TASE.2016.2543242.\n\n4. del Peral-Rosado, J.A.; L\u00f3pez-Salcedo, J.A.; Seco-Granados, G. (2017). \"Impact of frequency-hopping NB-IoT positioning in 4G and future 5G networks\". IEEE International Conference on Communications Workshops: 815\u201320. doi:10.1109\/ICCW.2017.7962759.\n\n5. Chen, L.; Thombre, S.; J\u00e4rvinen, K. et al. (2017). \"Robustness, Security and Privacy in Location-Based Services for Future IoT: A Survey\". IEEE Access 5: 8956\u201377. doi:10.1109\/ACCESS.2017.2695525.\n\n6. Singh, K.J.; Kapoor, D.S. (2017). \"Create Your Own Internet of Things: A survey of IoT platforms\". IEEE Consumer Electronics Magazine 6 (2): 57\u201368. doi:10.1109\/MCE.2016.2640718.\n\n7. Zhang, P.; Nagarajan, S.G.; Nevat, I. (2017). \"Secure Location of Things (SLOT): Mitigating Localization Spoofing Attacks in the Internet of Things\". IEEE Internet of Things Journal 4 (6): 2199\u2013206. doi:10.1109\/JIOT.2017.2753579.\n\n8. Al-Sarawi, S.; Anbar, M.; Alieyan, K. et al. (2017). \"Internet of Things (IoT) communication protocols: Review\". Proceedings from the 8th International Conference on Information Technology: 685\u201390. doi:10.1109\/ICITECH.2017.8079928.\n\n9. Raza, U.; Kulkarni, P.; Sooriyabandara, M. (2017). \"Low Power Wide Area Networks: An Overview\". IEEE Communications Surveys & Tutorials 19 (2): 855\u201373. doi:10.1109\/COMST.2017.2652320.\n\n10. Zanella, A. (2016). \"Best Practice in RSS Measurements and Ranging\". IEEE Communications Surveys & Tutorials 18 (4): 2662\u201386. doi:10.1109\/COMST.2016.2553452.\n\n11. Lohan, E.S.; Talvitie, J.; e Silva, P.F. et al. (2015). \"Received signal strength models for WLAN and BLE-based indoor positioning in multi-floor buildings\". Proceedings from the 2015 International Conference on Location and GNSS: 1\u20136. doi:10.1109\/ICL-GNSS.2015.7217154.\n\n12. Kay, S.M. (1993). Fundamentals of Statistical Signal Processing: Estimation Theory. 1. Prentice Hall. ISBN 9780133457117.\n\n13. \"Regulation (EU) 2016\/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95\/46\/EC (General Data Protection Regulation)\". EUR-Lex. European Union. 27 April 2016. https:\/\/eur-lex.europa.eu\/legal-content\/EN\/TXT\/?uri=CELEX%3A32016R0679\n\n14. \"Joint 3D Positioning and Network Synchronization in 5G Ultra-Dense Networks Using UKF and EKF\". Proceedings of the 2016 IEEE Globecom Workshops: 1\u20137. 2016. doi:10.1109\/GLOCOMW.2016.7848938.\n\n15. \"Location-aware 5G communications and Doppler compensation for high-speed train networks\". Proceedings of the 2017 European Conference on Networks and Communications: 1\u20136. 2017. doi:10.1109\/EuCNC.2017.7980755.\n\n16. \"Location Fingerprinting With Bluetooth Low Energy Beacons\". IEEE Journal on Selected Areas in Communications 33 (11): 2418\u201328. 2015. doi:10.1109\/JSAC.2015.2430281.\n\n17. \"A Survey of Selected Indoor Positioning Methods for Smartphones\". IEEE Communications Surveys & Tutorials 19 (2): 1347\u201370. 2017. doi:10.1109\/COMST.2016.2637663.\n\n18. \"GPS-free geolocation using LoRa in low-power WANs\". Proceedings of the 2017 Global Internet of Things Summit: 1\u20136. 2017. doi:10.1109\/GIOTS.2017.8016251.\n\n19. \"Localization of RFID Tags Using Stochastic Tunneling\". IEEE Transactions on Mobile Computing 12 (6): 1225\u201335. 2013. doi:10.1109\/TMC.2012.80.\n\n20. \"Hybrid WLAN-RFID Indoor Localization Solution Utilizing Textile Tag\". IEEE Antennas and Wireless Propagation Letters 14: 1358\u201361. 2015. doi:10.1109\/LAWP.2015.2406951.\n\n21. \"BackPos: High Accuracy Backscatter Positioning System\". IEEE Transactions on Mobile Computing 15 (3): 586\u201398. 2016. doi:10.1109\/TMC.2015.2424437.\n\n22. \"Fusion of RSS and Phase Shift Using the Kalman Filter for RFID Tracking\". IEEE Sensors Journal 17 (11): 3551\u201358. 2017. doi:10.1109\/JSEN.2017.2696054.\n\n23. \"The Optimization for Hyperbolic Positioning of UHF Passive RFID Tags\". IEEE Transactions on Automation Science and Engineering 14 (4): 1590\u20131600. 2017. doi:10.1109\/TASE.2017.2656947.\n\n24. \"Localization in long-range ultra narrow band IoT networks using RSSI\". Proceedings of the 2017 IEEE International Conference on Communications: 1\u20136. 2017. doi:10.1109\/ICC.2017.7997195.\n\n25. \"Localization in Low Power Wide Area Networks Using Wi-Fi Fingerprints\". Applied Sciences 7 (9): 936. 2017. doi:10.3390\/app7090936.\n\n26. \"Neighbor-Aided Localization in Vehicular Networks\". IEEE Transactions on Intelligent Transportation Systems 18 (10): 2693\u2013702. 2017. doi:10.1109\/TITS.2017.2655146.\n\n27. \"OFDM-Based Ranging Approach for Vehicular Safety Applications\". Proceedings of the 2013 IEEE 78th Vehicular Technology Conference: 1\u20135. 2013. doi:10.1109\/VTCFall.2013.6692309.\n\n28. \"A ZigBee position technique for indoor localization based on proximity learning\". Proceedings of the 2017 IEEE International Conference on Mechatronics and Automation: 875\u201380. 2017. doi:10.1109\/ICMA.2017.8015931.\n\n29. \"Implementation of indoor fingerprint positioning based on ZigBee\". Proceedings of the 2017 29th Chinese Control And Decision Conference: 2654\u201359. 2017. doi:10.1109\/CCDC.2017.7978963.\n\n30. \"IEEE 802.15.4 ZigBee-Based Time-of-Arrival Estimation for Wireless Sensor Networks\". Sensors 16 (2): 203. 2016. doi:10.3390\/s16020203. PMC4801579. PMID 26861331. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4801579\n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. 
Some grammar, punctuation, and minor wording issues have been corrected.\n\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\">https:\/\/www.limswiki.org\/index.php\/Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends<\/a>\nCategories: LIMSwiki journal articles (added in 2018); LIMSwiki journal articles (all); LIMSwiki journal articles on internet of things\nThis page was last modified on 8 August 2018, at 00:30.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n","69cd9560f847d37e95c0bf5ffc36d532_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Wireless_positioning_in_IoT_A_look_at_current_and_future_trends skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Wireless positioning in IoT: A look at current and future trends<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p>Connectivity solutions for the <a href=\"https:\/\/www.limswiki.org\/index.php\/Internet_of_things\" title=\"Internet of things\" target=\"_blank\" class=\"wiki-link\" data-key=\"13e0b826fa1770fe4bea72e3cb942f0f\">internet of things<\/a> (IoT) aim to support the needs imposed by several applications or use cases across multiple sectors, such as logistics, <a href=\"https:\/\/www.limswiki.org\/index.php\/Agriculture_industry\" title=\"Agriculture industry\" target=\"_blank\" 
class=\"wiki-link\" data-key=\"4882fd1b1f6fb6017adf6f0c0741eafc\">agriculture<\/a>, asset management, or smart lighting. Each of these applications has its own challenges to solve, such as dealing with large or massive networks, low and ultra-low latency requirements, long battery life requirements (i.e., more than ten years of operation on battery), continuous monitoring of the location of certain nodes, security, and authentication. Hence, the choice of a connectivity solution for a given application depends in part on how well its features solve the specific needs of the end application. One key feature that we see as a need for future IoT networks is the ability to provide location-based <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a> for large-scale IoT applications. The goal of this paper is to highlight the importance of positioning features for IoT applications and to provide means of comparing and evaluating different connectivity protocols in terms of their positioning capabilities. Our compact and unified analysis ends with several case studies, both simulation-based and measurement-based, which show that high positioning accuracy on low-cost low-power devices is feasible if one designs the system properly.\n<\/p><p><b>Keywords<\/b>: internet of things (IoT), wireless positioning, indoor location\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>Nowadays, the number of connected wireless devices is growing, e.g., smart watches, smart light bulbs, smart toothbrushes, smart coffee mugs, etc. 
The trend in the information technology industry is towards connecting and extracting analytics from a variety of inter-connected wireless devices.\n<\/p><p>While many IoT applications have so far focused on the consumer realm, more and more industrial applications are also appearing, such as utilities measurement (e.g., water, electricity, and gas), industrial lighting, logistics, and smart agriculture. Enabling such industrial applications means that IoT networks need to support large numbers of devices, multiple years of operation on battery, different latency requirements, and low costs per unit.\n<\/p><p>We believe that, on top of the communications and reliability requirements of a wireless link, many IoT applications will require or benefit from knowing the location of certain devices. Such location information will be needed seamlessly, both indoors and outdoors, and without the battery-draining Global Navigation Satellite Systems (GNSS) chipsets. The need for localization and tracking arises not only from the network management point of view, but also from a business perspective, driving new business models and new business avenues.\n<\/p><p>Nevertheless, enabling or creating a positioning system with an IoT network is not a trivial task. The reason is that industrial applications seek a low per-unit cost for their IoT devices, which results in devices with very limited hardware components, such as CPU, memory, and battery. The limited hardware has an impact on the number of devices that a single device can serve and how fast it can process network and application requests. 
However, while CPU and memory will have an important impact on the scale of the network, the biggest challenge for enabling a positioning system lies in the proper management of the devices\u2019 radio.\n<\/p><p>The need for proper radio management becomes evident as there are devices with known coordinates which will broadcast specific payloads on a regular basis and other devices whose locations are to be determined, which will need to scan the spectrum frequently. Hence, overly frequent broadcasting will lead to spectrum congestion and increased packet collisions, whereas frequent scanning leads to high battery consumption, which is particularly problematic for battery-operated devices.\n<\/p><p>Overall, the biggest challenge to tackle for an IoT positioning network is to balance the power consumption against the performance of the system. A very reactive system will have to rely on frequent scanning and broadcasting by its members, which means that devices will need to draw large quantities of power. A less reactive system will draw less power, with devices scanning only seldom.\n<\/p><p>The goal of this paper is to provide insight into the positioning capabilities of current IoT technologies and other relevant IoT-enabling wireless technologies, such as WiFi. The paper starts by classifying three domains of positioning and discussing the main shortcomings of each of these domains for IoT devices. It then classifies the different IoT solutions according to six classification criteria, and it provides a discussion on the main system parameters relevant to positioning and tracking purposes. This discussion acts as a basis for comparison between the different IoT wireless solutions. To further complement this discussion, we present positioning results based on simulation-based scenarios and field experiments with a platform built on top of the Wirepas Mesh connectivity solution. 
Finally, we provide a short summary and conclusions of our findings.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Related_work\">Related work<\/span><\/h2>\n<p>To the best of the authors\u2019 knowledge, there are currently no comprehensive comparisons in the literature between different IoT protocols in terms of their positioning capabilities. There are, however, other studies that compare specific IoT technologies and look at IoT from the communications point of view, as well as studies focusing on positioning with a particular technology, such as narrow-band IoT (NB-IoT) or BLE. In this section, we highlight the related work from the literature.\n<\/p><p>A survey of localization methods for 5G, which also contains a short section on IoT positioning, has recently been published as a white paper by the European Cooperation in Science &amp; Technology (COST).<sup id=\"rdp-ebb-cite_ref-delPeral-RosadoWhite18_1-0\" class=\"reference\"><a href=\"#cite_note-delPeral-RosadoWhite18-1\" rel=\"external_link\">[1]<\/a><\/sup> The white paper also emphasizes that localization will become a key component of future 5G systems, though accurate future localization solutions in 5G should exploit the multipath and non-line-of-sight information and should put more emphasis on heterogeneous data fusion mechanisms. However, such advanced solutions would also increase the power consumption on the devices and are not well-suited for the majority of IoT systems. 
Distinct from COST's work<sup id=\"rdp-ebb-cite_ref-delPeral-RosadoWhite18_1-1\" class=\"reference\"><a href=\"#cite_note-delPeral-RosadoWhite18-1\" rel=\"external_link\">[1]<\/a><\/sup>, our paper focuses mostly on low-cost low-power consumption IoT solutions.\n<\/p><p>Lin <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-LinPositioning17_2-0\" class=\"reference\"><a href=\"#cite_note-LinPositioning17-2\" rel=\"external_link\">[2]<\/a><\/sup> focus on the Long-Term Evolution (LTE) Machine type communications (LTE-M) and Narrow Band Internet of Things (NB-IoT) protocols and their positioning capabilities. The authors demonstrate that at 46 dBm power of the transmit AN, positioning accuracy goes to around 10 m and that NB-IoT protocol supports better positioning accuracy than LTE-M protocol.<sup id=\"rdp-ebb-cite_ref-LinPositioning17_2-1\" class=\"reference\"><a href=\"#cite_note-LinPositioning17-2\" rel=\"external_link\">[2]<\/a><\/sup> A similar study focuses on indoor localization via improved received signal strength (RSS) fingerprinting in generic IoT devices.<sup id=\"rdp-ebb-cite_ref-LinEnhanced16_3-0\" class=\"reference\"><a href=\"#cite_note-LinEnhanced16-3\" rel=\"external_link\">[3]<\/a><\/sup> The results are based on 802.11b\/g\/n signals where location errors below 5 m are achieved in more than 50 percent of the studied cases.\n<\/p><p>del Peral-Rosado <i>et al.<\/i> investigate a time-domain based positioning with additional frequency hopping for the NB-IoT system. 
The obtained positioning accuracy is down to 30\u201350 m under strong signal-to-noise ratio conditions, and it deteriorates quickly for medium and low signal-to-noise ratios.<sup id=\"rdp-ebb-cite_ref-delPeral-RosadoImpact17_4-0\" class=\"reference\"><a href=\"#cite_note-delPeral-RosadoImpact17-4\" rel=\"external_link\">[4]<\/a><\/sup>\n<\/p><p>Chen <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-ChenRobust17_5-0\" class=\"reference\"><a href=\"#cite_note-ChenRobust17-5\" rel=\"external_link\">[5]<\/a><\/sup> released a study complementary to our work, looking at IoT positioning from the perspective of security, privacy, and robustness of the localization technology. No positioning results were reported in the study. Another complementary study by Singh and Kapoor<sup id=\"rdp-ebb-cite_ref-SinghCreate17_6-0\" class=\"reference\"><a href=\"#cite_note-SinghCreate17-6\" rel=\"external_link\">[6]<\/a><\/sup> focuses on existing and emerging software and hardware platforms for IoT applications, but positioning was not part of that study. 
IoT positioning has recently been considered by Zhang <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-ZhangSecure17_7-0\" class=\"reference\"><a href=\"#cite_note-ZhangSecure17-7\" rel=\"external_link\">[7]<\/a><\/sup> from the point of view of spoofing resistance in time of arrival (TOA) ultra-wideband (UWB) for IoT systems.\n<\/p><p>Other complementary comprehensive studies, focusing solely on the communication aspects of IoT, are authored by Al-Sarawi <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-Al-SarawiInternet17_8-0\" class=\"reference\"><a href=\"#cite_note-Al-SarawiInternet17-8\" rel=\"external_link\">[8]<\/a><\/sup> and Raza <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-RazaLow17_9-0\" class=\"reference\"><a href=\"#cite_note-RazaLow17-9\" rel=\"external_link\">[9]<\/a><\/sup>\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Designing_an_IoT_positioning_system\">Designing an IoT positioning system<\/span><\/h2>\n<p>At its core, a positioning system translates a set of measurements from well-known reference points into a coordinate pair. The reference points\u2014known as anchors in localization terminology or Access Nodes (AN) in IoT terminology\u2014act as a means for the device of interest, a mobile or an IoT tag, to be in a local or global reference frame. Depending on who takes the measurements, the positioning is considered to be network-centric (i.e., when the anchors make the positioning-related measurements) or device-centric (i.e., when the IoT end nodes or tags perform the positioning-related measurements).\n<\/p><p>These two types of positioning have very different implications on security and privacy, which should always be carefully considered regarding the final application. 
For example, privacy-preserving positioning solutions are easier to achieve in a device-centric approach than in a network-centric approach, as the device would not need to disclose its position to the network.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Positioning_domains\">Positioning domains<\/span><\/h3>\n<p>Measurements can be extracted from multiple domains, as long as the devices have the means to do so. For that reason, we briefly present three of the main domains we consider of interest for an IoT positioning system:\n<\/p>\n<ul><li> power or signal strength-based<\/li>\n<li> time-based<\/li>\n<li> space-based<\/li><\/ul>\n<p>Other domains, such as natural or artificial fields, e.g., geo-magnetic field, light, sounds, or smell, are out of the scope of our study, but they could also serve as relevant sources of information for future IoT positioning systems.\n<\/p><p>The following subsections provide a short summary of the main challenges in each of these three positioning domains and their system-wide impacts.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Power_domain\">Power domain<\/span><\/h4>\n<p>Signal strength measurements are derived from the protocol operation, which most of the time means that a measurement comes at no additional cost to the device or its battery consumption. However, positioning solutions in the power domain must tackle several challenges, in particular those related to the fast fluctuations of the Received Signal Strength (RSS) or of the backscattered power (BP), due to fading and shadowing caused by the surrounding environment. One key factor in modeling the RSS measurements is understanding, with a given degree of accuracy, how the signal power changes in its surrounding environment. 
The signal power models as a function of the distance between the transmitter and the receiver are known as path-loss models.<sup id=\"rdp-ebb-cite_ref-ZanellaBest16_10-0\" class=\"reference\"><a href=\"#cite_note-ZanellaBest16-10\" rel=\"external_link\">[10]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-LohanReceived15_11-0\" class=\"reference\"><a href=\"#cite_note-LohanReceived15-11\" rel=\"external_link\">[11]<\/a><\/sup> A typical empirical Log distance model is the single-slope path loss model<sup id=\"rdp-ebb-cite_ref-LohanReceived15_11-1\" class=\"reference\"><a href=\"#cite_note-LohanReceived15-11\" rel=\"external_link\">[11]<\/a><\/sup>:\n<\/p><p><span id=\"rdp-ebb-M1\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M1\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/c5cf714f660e8dd5603fcd52b8d27b9921f3369b&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -2.505ex; width:39.529ex; height:6.176ex;\" \/><\/span>,\n<\/p><p>where <span id=\"rdp-ebb-M2\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M2\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/acf8f144720df8180a949d0d8dddf53594fd77bf&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.838ex; width:8.923ex; height:2.843ex;\" \/><\/span> is the received signal power in logarithmic scale dependent on distance <i>d<\/i>, <span id=\"rdp-ebb-M3\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M3\" aria-hidden=\"true\" style=\"background-image: 
url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/4740381c16ea98c4132510daa642e93c1e42c049&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.671ex; width:2.263ex; height:2.509ex;\" \/><\/span> is a reference distance (usually 1 m), <span id=\"rdp-ebb-M4\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M4\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/ce70a0b6474dcb5aaeea68b799038e8f60b54ef1&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.838ex; width:5.43ex; height:2.676ex;\" \/><\/span> is the path-loss exponent, and <span id=\"rdp-ebb-M5\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M5\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/39ebe94b7ce7f0f9dd43c39ab3525ed0a0b9ba6e&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -1.005ex; width:19.378ex; height:3.343ex;\" \/><\/span> is a log-normally distributed random variable that models the slow fading phenomenon and possible RSS measurements errors (e.g., due to quantization). 
Both <span id=\"rdp-ebb-M6\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M6\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/e4d701857cf5fbec133eebaf94deadf722537f64&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.838ex; width:1.169ex; height:2.176ex;\" \/><\/span> and <i>w<\/i> depend on the propagation environment and typically vary with the device type and environment type. In addition, <i>w<\/i> can depend on factors such as device orientation and the number of people present in the measurement area at the time of acquisition.\n<\/p><p>For an IoT positioning system, this information can be extracted directly from the communication signal, which means that there is no additional cost for the device. In terms of battery, the cost will depend on the number of location requests demanded per second. Ideally, if the requirement is to have an opportunistic location, based on the sporadic communication of the device, acquiring the RSS-based positioning will have no impact on the battery life. However, if the device or the infrastructure has to listen periodically for a specific pilot signal, acquiring the RSS-based positioning will place further demands on battery consumption. 
One limitation of RSS-based approaches is that some current IoT standards support only a coarse RSS measurement (e.g., in steps of 6 dB), which can adversely impact the positioning accuracy, as the noise variance <span id=\"rdp-ebb-M7\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M7\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/53a5c55e536acf250c1d3e0f754be5692b843ef5&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.338ex; width:2.385ex; height:2.676ex;\" \/><\/span> will increase.\n<\/p><p>Another interesting aspect of the RSS measurements is that, based on simulations, RSS-based positioning errors are shown to be frequency independent (as shown later in Figure 1). However, one would expect different levels of location-based service at different frequency ranges. The frequency ranges can be coarsely divided into three categories: sub-GHz (i.e., carrier frequencies less than 1 GHz), GHz (1 to 30 GHz) and mmWave (above 30 GHz). The scattering becomes more prominent as frequency increases, thus one would expect different target positioning accuracy according to the frequency range. In addition, as the operating frequency increases, the antenna\u2019s effective area is smaller, and the signal coverage decreases. 
This can be seen in Figure 2, where the ideal signal propagation is drawn over a 100 by 100 square area, based on the Friis equation and assuming zero system gains, <i>G<\/i>,\n<\/p><p><span id=\"rdp-ebb-M8\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M8\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/aade15a7787ae08c9af30292f5267fe7b7600c8b&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -2.505ex; width:49.883ex; height:6.176ex;\" \/><\/span>,\n<\/p><p>where <span id=\"rdp-ebb-M9\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M9\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/7bdf246d27d8dd80dc45c1a1eaac69d42ce532d6&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.671ex; width:2.318ex; height:2.509ex;\" \/><\/span> is the transmission power, <i>f<\/i> the operating frequency and <i>c<\/i> the speed of light.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_eSilva_Sensors2018_18-8.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"c37db9baba3a4cac3bdda14b8a6c828d\"><img alt=\"Fig1 eSilva Sensors2018 18-8.jpg\" src=\"https:\/\/www.limswiki.org\/images\/9\/99\/Fig1_eSilva_Sensors2018_18-8.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1.<\/b> Comparative analysis of RSS-based estimates at 
various carrier frequencies and various AN densities<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p><a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig2_eSilva_Sensors2018_18-8.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"d53b9e26d8797d94c26169f53c207c8e\"><img alt=\"Fig2 eSilva Sensors2018 18-8.jpg\" src=\"https:\/\/www.limswiki.org\/images\/b\/b1\/Fig2_eSilva_Sensors2018_18-8.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 2.<\/b> Ideal radio signal propagation at 0.5, 2.4, 30 and 60 GHz<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Based on the signal\u2019s behavior, it is easy to understand that a sparser infrastructure at higher frequencies will likely result in a degradation of the positioning performance (as shown in Figure 1).\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Time_domain\">Time domain<\/span><\/h4>\n<p>Position estimation from timing information relies on estimating the time-of-arrival (TOA) or the time-difference-of-arrival (TDOA) from three or more fixed access nodes and then converting those timing estimates into distances. For example, 3D location based on TOA is possible with three synchronized measurements from three known devices. 
The goal is to solve the following set of equations and find out the node\u2019s location, <span id=\"rdp-ebb-M10\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M10\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/3a30528269129d375af88d7b903c36267842ab55&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.838ex; width:16.418ex; height:2.843ex;\" \/><\/span>, assuming that <span id=\"rdp-ebb-M11\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M11\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/22915a8c8e6bceacfae6550235b74a73dac610b7&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.838ex; width:16.068ex; height:2.843ex;\" \/><\/span> are known coordinates:\n<\/p><p><span id=\"rdp-ebb-M12\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M12\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/9038190889eee3a6769719c835ae6a50abe69bd5&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -1.671ex; width:18.31ex; height:4.843ex;\" \/><\/span>.\n<\/p><p>For TDOA, the range is now a difference of ranges, based on the TOA at the measurement device. 
Hence, the TDOA from a node <i>n<\/i> to a measurement device <i>m<\/i> would be written as\n<\/p><p><span id=\"rdp-ebb-M13\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M13\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/e213132af5bdc5ecbd4b594dfb6b89dc26fda7e1&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.671ex; width:16.319ex; height:2.509ex;\" \/><\/span>.\n<\/p><p>Due to this difference, the range measurement is free of errors imposed by the measurement device\u2019s clock, since the clock error cancels out when the two TOA measurements are subtracted.\n<\/p><p>Overall, the time measurements require synchronized clocks, either at the receiver or at the transmitter side, which places a significant burden on device cost. This does not suit IoT applications, which are driven by the need for low-cost devices.\n<\/p><p>It is also important to keep in mind the relationship between bandwidth and accuracy for TOA measurements. This is illustrated in Figure 3, where the positioning error is plotted against the available channel bandwidth at different signal-to-noise ratio (SNR) values. 
Clearly, sub-meter positioning accuracy with time-based approaches is achievable only with high bandwidths (on the order of 100 MHz), and it is very challenging for narrowband and ultra-narrowband systems even at very high SNR.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig3_eSilva_Sensors2018_18-8.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"970aad36eab976fead1e14c1328b92f1\"><img alt=\"Fig3 eSilva Sensors2018 18-8.jpg\" src=\"https:\/\/www.limswiki.org\/images\/e\/e4\/Fig3_eSilva_Sensors2018_18-8.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 3.<\/b> Comparative analysis of TOA-based position estimates at various bandwidths<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h4><span class=\"mw-headline\" id=\"Space_domain\">Space domain<\/span><\/h4>\n<p>In the space domain, the ranges are estimated by measuring the angle (or direction) of arrival (AoA) of the signal of interest. Often, this is done by means of an antenna array or a sectorized antenna. 
For a given device <i>n<\/i>, it is possible to describe its measurement at <i>m<\/i> as\n<\/p><p><span id=\"rdp-ebb-M14\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M14\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/7dfca2b4d5302e264a8192acffd509d2b275d1cb&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.838ex; width:35.572ex; height:2.843ex;\" \/><\/span>,\n<\/p><p>where <span id=\"rdp-ebb-M15\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M15\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/6d890e9041463a4a1cf563bf55f3943ca3b318d5&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.671ex; width:2.724ex; height:2.009ex;\" \/><\/span> is the distance from <i>m<\/i> to <i>n<\/i> and <span id=\"rdp-ebb-M16\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M16\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/6e5ab2664b422d53eb0c7df3b87e1360d75ad9af&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.338ex; width:1.09ex; height:2.176ex;\" \/><\/span> is the angle of arrival determined at <i>m<\/i>. Hence, by solving for the unknown coordinates, one can obtain a range estimate.\n<\/p><p>In summary, AoA is particularly interesting for IoT, as the main constraint on obtaining angle measurements is the antenna design. 
However, its major drawback is that the error increases with the distance to the transmitter, which means that a small deviation in the angle results in a large error for the devices at the service edge.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"IoT_classifications\">IoT classifications<\/span><\/h3>\n<p>While there are several domains from which to extract measurements for building knowledge of a device\u2019s location, several limitations arise from the actual IoT system on which the positioning is built. The goal of this subsection is to introduce the IoT technologies by classifying them into six main categories (see Figure 4):\n<\/p>\n<ul><li> <i>licensed versus unlicensed<\/i>, which refers to the operation in a protected band, such as cellular bands, versus operation in unlicensed bands, such as industrial, scientific and medical (ISM) bands<\/li>\n<li> <i>operating frequency bands<\/i>, which refers to the carrier frequency of each IoT technology; here, we divide the frequency spectrum into three parts: sub-GHz, GHz, and mmWave bands, with some IoT technologies spreading over multiple ranges<\/li>\n<li> <i>protocols versus enablers<\/i>, which refers to whether a technology is seen as a specific IoT communication protocol or a possible wireless positioning enabler<\/li>\n<li> <i>range-based classification<\/i>, which refers to short-, medium-, or long-range operation<\/li>\n<li> <i>rate-based classification<\/i>, which refers to Low-Rate (LR) or High-Rate (HR) data rates; typically, most IoT connectivity solutions are meant for LR, high-delay applications, while solutions such as WiFi and 5G cover HR and low-latency applications<\/li>\n<li> <i>power-based classification<\/i>, which refers to Low-Power (LP) versus High-Power (HP) operation; typically, LP approaches go hand in hand with LR approaches, while HP approaches go hand in hand with HR approaches; in LP operation, the devices can function for several years on batteries<\/li><\/ul>\n<p><br \/>\n<a 
href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig4_eSilva_Sensors2018_18-8.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"5ffc446b6d66d7ccdc3b53b98dce5585\"><img alt=\"Fig4 eSilva Sensors2018 18-8.jpg\" src=\"https:\/\/www.limswiki.org\/images\/6\/62\/Fig4_eSilva_Sensors2018_18-8.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 4.<\/b> Classification of IoT networks<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"IoT_system_parameters\">IoT system parameters<\/span><\/h3>\n<p>This subsection discusses the parameters of an IoT system that are relevant to positioning.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Topology\">Topology<\/span><\/h4>\n<p>Topology relates to how a message passes from one node to another and to the possibility of discovering new nodes in the network. The network topology, illustrated in Figure 5, has a significant impact on how nodes with known locations are discovered by others. In a mesh topology, any node can be set as a reference node, whereas, in a star topology, only the access nodes can be defined as such. The density of the fixed nodes also plays an important role in the location accuracy. For example, a denser network with a well-spread distribution of nodes is likely to provide a better location accuracy than a network with few reference nodes all placed in the same direction from the device to be located. An IoT network typically has a star or mesh topology. In a star topology, devices can only talk to their parent device, while, in a mesh topology, nodes can exchange messages with each other. 
Star topologies are susceptible to single points of failure, since losing the connection to the parent means that the node will be outside the network. In a mesh topology, if a link fault occurs, the device can look for any other neighbor to connect to. Thus, mesh networks provide better coverage and, implicitly, they are likely to offer better positioning accuracy than star networks.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig5_eSilva_Sensors2018_18-8.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"508114076c4db2679f692521479f03a7\"><img alt=\"Fig5 eSilva Sensors2018 18-8.jpg\" src=\"https:\/\/www.limswiki.org\/images\/c\/cf\/Fig5_eSilva_Sensors2018_18-8.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 5.<\/b> Comparison of a simplified IoT mesh and star network topology<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h4><span class=\"mw-headline\" id=\"Range\">Range<\/span><\/h4>\n<p>Range of an IoT system is important in the sense that it defines an upper bound of the positioning error, which cannot be larger than the communication range. In this aspect, mesh-capable networks have a better footing for positioning purposes as any device can extend service without the need to have specific and dedicated infrastructure.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Channel_bandwidth\">Channel bandwidth<\/span><\/h4>\n<p>Channel bandwidth is directly related to the achievable accuracy in positioning when a TOA-based estimation is used. 
The Cram\u00e9r\u2013Rao lower bound for any unbiased estimator<sup id=\"rdp-ebb-cite_ref-KayFund93_12-0\" class=\"reference\"><a href=\"#cite_note-KayFund93-12\" rel=\"external_link\">[12]<\/a><\/sup> of a time delay <span id=\"rdp-ebb-M17\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M17\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/56dad457e274b970f5d98b9dc40bef7f895c7f6f&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.671ex; width:2.07ex; height:2.009ex;\" \/><\/span> of a signal <i>S<\/i> is given as\n<\/p><p><span id=\"rdp-ebb-M18\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M18\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/d9e7914f7b7284490a3dcaee20d607ed9fb3c8ee&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -3.171ex; width:31.87ex; height:7.509ex;\" \/><\/span>\n<\/p><p>where <i>\u03b5<\/i> is the signal energy, <span id=\"rdp-ebb-M19\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M19\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/5b6328fbe0cded37216c90735c89ee188be26a30&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.671ex; width:2.92ex; height:2.509ex;\" \/><\/span> the noise spectral density and <span id=\"rdp-ebb-M20\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M20\" 
aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/cd3f68ae6f81f64aee74e694c94ef001070b8a27&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -2.505ex; width:22.908ex; height:6.176ex;\" \/><\/span> is the mean square bandwidth of the signal. However, since we have do not have all the necessary information to accurately determine each IoT signal\u2019s spectrum density, we provide instead the multipath resolution or time-frequency resolution defined as follows:\n<\/p><p><span id=\"rdp-ebb-M21\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M21\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/8d0c55af5ae3b6bf4d5e2c523f13180efe6ab767&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -1.338ex; width:10.007ex; height:3.843ex;\" \/><\/span>.\n<\/p><p>The above equation determines how the time duration <span id=\"rdp-ebb-M22\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M22\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/8c28867ecd34e2caed12cf38feadf6a81a7ee542&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.338ex; width:2.775ex; height:2.176ex;\" \/><\/span> and the spectral bandwidth <span id=\"rdp-ebb-M23\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M23\" aria-hidden=\"true\" style=\"background-image: 
url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/b32f8e32a72fd8afb871214c670c130ea3e7e325&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.338ex; width:3.382ex; height:2.176ex;\" \/><\/span> relate to each other. The spectral bandwidth is defined as the bandwidth that includes most of the signal\u2019s energy. In this study, we assume it to be equal to the channel bandwidth. Overall, both equations in this subsection show that, for time-based approaches, it is favorable to have signals with high SNR and short time duration (i.e., higher bandwidth).\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Carrier_frequency\">Carrier frequency<\/span><\/h4>\n<p>Carrier frequency is inversely proportional to the signal wavelength and strongly influences the path losses exhibited by the signal. As we move from sub-GHz carriers towards mmWave carriers, the path losses become increasingly strong, which results in smaller communication ranges. The differences in path losses are due to a multitude of phenomena, but, as frequency increases, they are especially due to the smaller effective area of the devices\u2019 antennae. Overall, combining lower carrier frequencies and mesh topologies results in an enhanced service coverage.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Modulation_types\">Modulation types<\/span><\/h4>\n<p>IoT systems rely on various digital modulation types, from Ultra Narrow Band (UNB), defined as systems with bandwidths below 1 kHz, to Ultra Wide Band (UWB) modulations, i.e., bandwidths above 500 MHz. In addition, spread spectrum (SS) or Orthogonal Frequency Division Multiplexing (OFDM) modulations are also widely encountered. The modulation type plays a big role in the achievable positioning accuracy when TOA, TDOA, or AoA methods are used, but it has little or no impact when RSS methods are used. Certain modulation-based characteristics can be exploited for positioning purposes. 
For example, this is the case for SS signals (e.g., LoRa and ZigBee), where the spreading pseudo-random sequence can be used to infer the signal\u2019s travel time in a similar fashion to GNSS.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Positioning_signaling.2Fdata_exchange\">Positioning signaling\/data exchange<\/span><\/h4>\n<p>Positioning signaling or data exchange is the ability to use either pilot signals or sequences of data packets to provide the location of nearby devices. However, few of the existing IoT technologies support positioning-related signaling, except for most of the cellular IoT technologies (e.g., NB-IoT), which rely on the observed time difference of arrival (OTDOA), introduced in the LTE radio. Apart from the cellular IoT technologies, the future WiFi 802.11az standard also features a dedicated data exchange regarding the time-of-flight information to determine the location of its devices.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Roaming\">Roaming<\/span><\/h4>\n<p>Roaming is the ability to provide continuity of service across multiple networks, whether or not they are owned by a single entity. As mobility is a keystone of most positioning applications, it is important to take note of this when looking at IoT systems. In this aspect, protocols such as Sigfox or Ingenu are at an advantage, as they operate similarly to cellular systems and they offer service across multiple continents. Nevertheless, even proprietary solutions are starting to provide open application interface specifications and open guest periods in the radio access, which facilitate the exchange of data across multiple vendors and technologies.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Network_ownership\">Network ownership<\/span><\/h4>\n<p>Network ownership raises security and privacy concerns. 
Security is becoming a strong requirement in IoT systems, especially as data access, transport, and storage become increasingly regulated by international and European bodies.<sup id=\"rdp-ebb-cite_ref-EUGeneralData16_13-0\" class=\"reference\"><a href=\"#cite_note-EUGeneralData16-13\" rel=\"external_link\">[13]<\/a><\/sup> Technologies such as Ingenu and Sigfox own the entirety of the network, meaning that the transportation of data is under their full responsibility. Thus, positioning solutions enabled by such systems will be protected by the system provider, as the infrastructure device\u2019s location will not be known to the user.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Power_consumption\">Power consumption<\/span><\/h4>\n<p>Power consumption is a central concern for all IoT technologies. For positioning applications, low power consumption is crucial for the viability of several systems, especially when the goal is to continuously track and monitor inexpensive items. For example, low power consumption is mandatory in several use cases from the logistics and construction sectors.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Comparing_IoT_technologies_and_IoT_enablers\">Comparing IoT technologies and IoT enablers<\/span><\/h3>\n<p>Having discussed the positioning domains and the main system parameters relevant to positioning, we present here two comparative tables covering 29 IoT solutions (see Table 1 and Table 2), with the goal of summing up the key points mentioned so far and enabling an easy comparison between the different technologies. 
Throughout the rest of this subsection, our goal is to make comparisons and drive the reader towards a better understanding of how a certain technology would fare as the backbone of a positioning system, in a GNSS-free case.\n<\/p><p>Table 1 presents for each technology, from left to right, the network topology, network type, the impact of each measurement domain on the device battery life and cost, the achievable positioning accuracy, the most suitable domain, and reported accuracy studies.\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"9\"><b>Table 1.<\/b> Summary of key positioning related aspects for several IoT protocols and IEEE 802.11\u2217 family protocols<br \/><sup>1<\/sup>(+, +): low impact, (++, ++): medium impact, (+++, +++): high impact<br \/><sup>2<\/sup> assuming implementation without external sensors, such as GNSS\n<\/td><\/tr>\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" rowspan=\"2\">Technology\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" rowspan=\"2\">Network Topology\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" rowspan=\"2\">Network Type\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" colspan=\"3\">Impact on (Battery, Device Cost) per Domain<sup>1<\/sup>\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" rowspan=\"2\">Achievable Positioning Accuracy<sup>2<\/sup>\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" rowspan=\"2\">Most Suitable Domain\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" rowspan=\"2\">Accuracy Studies\n<\/th><\/tr>\n<tr>\n<th 
style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Time-Based Positioning\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Power-Based Positioning\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Space-Based Positioning\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">5G\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">star\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">HR\/HP-Short range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">High\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Time\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><sup id=\"rdp-ebb-cite_ref-KoivistoJoint16_14-0\" class=\"reference\"><a href=\"#cite_note-KoivistoJoint16-14\" rel=\"external_link\">[14]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-LevanenLocation17_15-0\" class=\"reference\"><a href=\"#cite_note-LevanenLocation17-15\" rel=\"external_link\">[15]<\/a><\/sup>\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ANT+\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">mesh\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Short range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; 
padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Low\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">BLEmesh\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">mesh\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Short range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><sup id=\"rdp-ebb-cite_ref-LohanReceived15_11-2\" class=\"reference\"><a href=\"#cite_note-LohanReceived15-11\" rel=\"external_link\">[11]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-FaragherLocation15_16-0\" class=\"reference\"><a href=\"#cite_note-FaragherLocation15-16\" rel=\"external_link\">[16]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-DavidsonASurv16_17-0\" class=\"reference\"><a href=\"#cite_note-DavidsonASurv16-17\" rel=\"external_link\">[17]<\/a><\/sup>\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Dash7\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">star\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Low\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power or space\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">EC-GSM-IOT\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">star\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">HR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Low\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">EnOcean\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">mesh\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">Low\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power or space\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Ingenu\/RPMA\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">star\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power or space\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ISA100.11a\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">mesh\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Short range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power or space\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; 
padding-left:10px; padding-right:10px;\">LoRa\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">star\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><sup id=\"rdp-ebb-cite_ref-FargasGPS17_18-0\" class=\"reference\"><a href=\"#cite_note-FargasGPS17-18\" rel=\"external_link\">[18]<\/a><\/sup>\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LTE-M\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">star\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Time\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><sup id=\"rdp-ebb-cite_ref-LinPositioning17_2-2\" class=\"reference\"><a href=\"#cite_note-LinPositioning17-2\" rel=\"external_link\">[2]<\/a><\/sup>\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; 
padding-left:10px; padding-right:10px;\">MiWi\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">mesh\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NB-IoT\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">star\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Time\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><sup id=\"rdp-ebb-cite_ref-LinPositioning17_2-3\" class=\"reference\"><a href=\"#cite_note-LinPositioning17-2\" rel=\"external_link\">[2]<\/a><\/sup>\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">RFID\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">star\n<\/td>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Short range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><sup id=\"rdp-ebb-cite_ref-BasheerLocal13_19-0\" class=\"reference\"><a href=\"#cite_note-BasheerLocal13-19\" rel=\"external_link\">[19]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-HasaniHybrid15_20-0\" class=\"reference\"><a href=\"#cite_note-HasaniHybrid15-20\" rel=\"external_link\">[20]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-LiuBackPos16_21-0\" class=\"reference\"><a href=\"#cite_note-LiuBackPos16-21\" rel=\"external_link\">[21]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-MaFusion17_22-0\" class=\"reference\"><a href=\"#cite_note-MaFusion17-22\" rel=\"external_link\">[22]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-MaTheOpt17_23-0\" class=\"reference\"><a href=\"#cite_note-MaTheOpt17-23\" rel=\"external_link\">[23]<\/a><\/sup>\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Sigfox\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">star\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; 
padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><sup id=\"rdp-ebb-cite_ref-SallouhaLocal17_24-0\" class=\"reference\"><a href=\"#cite_note-SallouhaLocal17-24\" rel=\"external_link\">[24]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-JanssenLocal17_25-0\" class=\"reference\"><a href=\"#cite_note-JanssenLocal17-25\" rel=\"external_link\">[25]<\/a><\/sup>\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Telensa\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">star\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Low\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power or space\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Thread\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">mesh\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Short range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Weightless-N\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">star\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power or space\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Weightless-P\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">star\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Low\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power or space\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Weightless-W\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">star\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">WirelessHART\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">mesh\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Short range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">WiFi802.11af\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">star\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">HR\/HP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">High\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Time\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">WiFi802.11ah\/HaLoW\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">star\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">High\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Time\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">WiFi802.11az\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">star\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">HR\/HP-Short range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">High\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Time\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">WiFi802.11p (V2X)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">mesh\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">HR\/HP-Short range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">High\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Time\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><sup id=\"rdp-ebb-cite_ref-CruzNeigh17_26-0\" class=\"reference\"><a href=\"#cite_note-CruzNeigh17-26\" rel=\"external_link\">[26]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-KalverkampOFDM13_27-0\" class=\"reference\"><a href=\"#cite_note-KalverkampOFDM13-27\" rel=\"external_link\">[27]<\/a><\/sup>\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Wirepas\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">mesh\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">HR-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">WiSUN\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">mesh\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ZigBee\/ZigBee-NaN\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">mesh\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\"><sup id=\"rdp-ebb-cite_ref-OuAZig17_28-0\" class=\"reference\"><a href=\"#cite_note-OuAZig17-28\" rel=\"external_link\">[28]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-DongImp17_29-0\" class=\"reference\"><a href=\"#cite_note-DongImp17-29\" rel=\"external_link\">[29]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-CheonIEEE16_30-0\" class=\"reference\"><a href=\"#cite_note-CheonIEEE16-30\" rel=\"external_link\">[30]<\/a><\/sup>\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Z-Wave\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">mesh\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LR\/LP-Long range\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">+, +\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">++, ++\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Medium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Power or space\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>The Network Topology column maps each technology\u2019s topology to either a star or a mesh topology.\n<\/p><p>The Network Type column presents the network type, where each entry lists the rate, power consumption, and maximum operating range offered by the technology. The operating range correlates with the frequency bands in Table 2.\n<\/p><p>The three Impact columns show the impact on battery consumption and device cost for each measurement domain under discussion. 
While most of the technologies do not offer such capabilities, this classification assumes that it would be possible to couple the necessary measurement units to provide such information. Hence, the classifications of low impact (+), medium impact (++), and high impact (+++) are based on the additional burden the authors expect in terms of device cost and battery consumption. The power domain is seen to be the one with the smallest impact, because power measurements would be easily available to all of these technologies.\n<\/p><p>The Achievable Positioning Accuracy column provides a qualitative indicator of the expected accuracy based on what the technology currently offers. When available, this information is based on the related studies.\n<\/p><p>The Most Suitable Domain column states the most suitable measurement domain to use with each technology. The domain is assigned based on the technology\u2019s signal characteristics presented in Table 2 and its current capabilities.\n<\/p><p><br \/>\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"4\"><b>Table 2.<\/b> Summary of key physical layer parameters for several IoT protocols\n<\/td><\/tr>\n\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Technology\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Frequency Bands\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Channel Bandwidth (MHz)\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Modulation Type (UNB\/NB\/SS\/OFDM\/UWB)\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">5G\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">GHz, mmWave\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">&lt;100\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">OFDM\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ANT+\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NB\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">BLEmesh\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NB\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Dash7\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.025, 0.200\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NB\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">EC-GSM-IOT\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NB\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">EnOcean\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">0.0625\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NB\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Ingenu\/RPMA\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz and GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">SS\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ISA100.11a\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">5\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">SS\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LoRa\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.125, 0.500\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">SS\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">LTE-M\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz and GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">1.08, 1.4\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">OFDM\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">MiWi\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz and GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.040, 0.250\n<\/td>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">NB\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NB-IoT\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz and GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.18\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NB, OFDM\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">RFID\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz and GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NB\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Sigfox\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">UNB\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Telensa\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NB\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Thread\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">5\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">NB\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Weightless-N\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">UNB\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Weightless-P\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.0125\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NB\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Weightless-W\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">5\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">SS\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">WirelessHART\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.25\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">SS\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">WiFi802.11af\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">8\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">OFDM\n<\/td><\/tr>\n<tr>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">WiFi802.11ah\/HaLoW\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">1, 2, 4, 8, 16\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">OFDM\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">WiFi802.11az\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">GHz, mmWave\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20, 40, 60, 80, 160\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">OFDM\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">WiFi802.11p (V2X)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">10\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">OFDM\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Wirepas\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz and GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.126, 0.5\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NB\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">WiSUN\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz and GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.2\u20131.2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NB, SS, and OFDM\n<\/td><\/tr>\n<tr>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">ZigBee\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz and GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.6, 1.2, 2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">SS\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ZigBee-NaN\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.6, 1.2, 2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">SS\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Z-Wave\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">sub-GHz\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NB\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Table 2 describes each technology\u2019s key physical aspects, such as frequency bands, channel bandwidth, and modulation type.\n<\/p><p>Positioning services often have high power demands. Operating a positioning-based infrastructure is often tied to the need for a fully plugged-in (powered) infrastructure. However, there are several industrial applications that would benefit from a fully battery-operated network, especially where an electricity network might not yet be present (e.g., construction sites), or for ease of service extension and maintenance.\n<\/p><p>In terms of positioning, we found that most IoT systems have yet to offer specific signaling to support accurate measurements for localization. 
A few of the existing IoT systems have already attracted academic interest in their positioning capabilities, as shown in the last column of Table 1. Most of the existing studies focus on RSS-based approaches, and several of them rely on low-cost probabilistic methods requiring an underlying path loss model. The few studies that focus on time-based and space-based approaches mostly target current and future cellular IoT signals derived from LTE, such as LTE-M, which retain some of LTE\u2019s positioning characteristics, such as positioning-specific signaling. In addition, future 5G networks are likely to rely on time-based and space-based positioning approaches. Our paper contributes additional results based on RSS and time-based approaches, as shown in the next sections.\n<\/p><p>In addition, we have found that network-centric positioning solutions are being favored over device-centric ones, which is often related to the limited resources at the end nodes. However, a centralized architecture places an additional burden on the network capacity and latency as the number of devices grows. For many of the IoT systems, a centralized architecture will have difficulties accommodating real-time location systems, especially due to the strict latency requirements of such systems. Integration with other high-capacity technologies, such as WiFi and 5G, could decrease the latency at the expense of per-unit cost and power consumption. Supporting positioning updates at very sparse intervals ought to be feasible for many IoT technologies and will certainly find application in several niche markets, especially if the positioning system is supported fully by battery-powered networks over a span of multiple years.\n<\/p><p>To further complement our study, we end with a perspective on the achievable positioning accuracy. 
The next two sections focus on simulation-based and measurement-based studies, respectively. We introduce simulation-based results from two systems for which performance benchmarks were difficult to find in the existing literature, namely IEEE 802.11az and LoRa. Then, we present measurement-based results from an office environment of a positioning system built on top of Wirepas\u2019 mesh solution.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Simulation-based_performance_metrics\">Simulation-based performance metrics<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Case_Study_1:_802.11az_IoT_enabler.2C_simulation-based_results.2C_time_domain\">Case Study 1: 802.11az IoT enabler, simulation-based results, time domain<\/span><\/h3>\n<p>In 802.11az, a position estimate is obtained by solving the hyperbolic location problem based on the TOA <span id=\"rdp-ebb-M24\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M24\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/8f5d6a09963f0dc49d02727104757dd9dfdc3b19&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.838ex; width:3.175ex; height:2.509ex;\" \/><\/span> measured at the mobile side from several ANs, where <i>a<\/i> is the AN index, <span id=\"rdp-ebb-M25\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M25\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/c9bed9e0fc61ff46e0f5fc833fb93e4393af5219&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.671ex; width:15.459ex; height:2.509ex;\" \/><\/span>\n<\/p><p><span id=\"rdp-ebb-M26\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" 
style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M26\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/041f1a1519946d8fd5cbb035eee702549c105f24&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -3.005ex; width:57.146ex; height:7.509ex;\" \/><\/span>,\n<\/p><p>where <span id=\"rdp-ebb-M27\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M27\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/ddf19ec057a1467dc4b6c7452aaa9a35bb099fea&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.671ex; width:1.843ex; height:2.343ex;\" \/><\/span> is the starting time of transmission from one AN in the network, taken arbitrarily as the first AN (<span id=\"rdp-ebb-M28\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M28\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/b88d42cbb53eee4d504cdb44c810a7bd03c8f2e1&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.671ex; width:4.664ex; height:2.509ex;\" \/><\/span>), <i>c<\/i> is the speed of light, <span id=\"rdp-ebb-M29\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M29\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/c00ada76023ea43806f9fc327a2b14f8e21a9d07&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.838ex; width:13.139ex; 
height:2.843ex;\" \/><\/span> is the geometric distance between the mobile device and the <i>a<\/i>-th AN, <span id=\"rdp-ebb-M30\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M30\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/c9bed9e0fc61ff46e0f5fc833fb93e4393af5219&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.671ex; width:15.459ex; height:2.509ex;\" \/><\/span> is the forwarding time of the signaling message between two access nodes, and <span id=\"rdp-ebb-31\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-31\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/d5855d552e125689de393afcabce750737ca0007&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.838ex; width:15.969ex; height:2.843ex;\" \/><\/span> is the geometric distance between the <span id=\"rdp-ebb-M32\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M32\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/6104442ed30596ef4d7795d3186273f68d796ea4&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.338ex; width:5.491ex; height:2.176ex;\" \/><\/span>-th AN and <i>a<\/i>-th AN. 
With several noisy observations of the measured times of arrival, the IoT device can compute its position (as well as the unknown <span id=\"rdp-ebb-M27\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M27\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/ddf19ec057a1467dc4b6c7452aaa9a35bb099fea&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.671ex; width:1.843ex; height:2.343ex;\" \/><\/span>). It is assumed that the AN positions and the forwarding time are known and transmitted in the signaling message. In addition, a minimum of four synchronized access points is needed to estimate the four unknowns <span id=\"rdp-ebb-M33\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M33\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/60c130b74411a391c7c9313f00b237e1f68374e0&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.838ex; width:10.327ex; height:2.843ex;\" \/><\/span>, with <span id=\"rdp-ebb-M33\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M33\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/bfbaae38455f94395b7f2d71b896f13be1dcaff6&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.838ex; width:7.45ex; height:2.843ex;\" \/><\/span> being the device location.\n<\/p><p>To understand the achievable location performance, we defined a simulation over a square area of 0.4 km<sup>2<\/sup> at the highest bandwidth 
available (160 MHz). We observe in Figure 6 that this solution would be able to offer sub-meter accuracy 80 percent of the time when at least seven ANs are available.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig6_eSilva_Sensors2018_18-8.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"5ddd13097799d8748cb4bef11b85b58d\"><img alt=\"Fig6 eSilva Sensors2018 18-8.jpg\" src=\"https:\/\/www.limswiki.org\/images\/d\/db\/Fig6_eSilva_Sensors2018_18-8.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 6.<\/b> Example of 802.11az performance for various numbers of access nodes at a signal-to-noise ratio (SNR) of 10 dB<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Case-Study_2:_LoRa.2C_simulation-based_results.2C_time_domain\">Case Study 2: LoRa, simulation-based results, time domain<\/span><\/h3>\n<p>A chirp spread spectrum (CSS) system with a 125 kHz bandwidth and a spreading factor of seven was used in the simulations. 
We assumed a single-floor square indoor area of 200 m \u00d7 200 m, in which <span id=\"rdp-ebb-34\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-34\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/863cbb4afaa1513a32bb0c3c9f4b25b027e7e49f&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.671ex; width:4.79ex; height:2.509ex;\" \/><\/span> access nodes are distributed uniformly, with <span id=\"rdp-ebb-34\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-34\" aria-hidden=\"true\" style=\"background-image: url(&#039;https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/863cbb4afaa1513a32bb0c3c9f4b25b027e7e49f&#039;); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -0.671ex; width:4.79ex; height:2.509ex;\" \/><\/span> between 3 and 100. Ten thousand Monte Carlo iterations were used to randomly generate the positions of the ANs and of the IoT device. The positioning was based on the TOA principle, where the TOA was estimated from the correlation between the incoming signal and a reference CSS code. The results are shown in Figure 7 in terms of the cumulative distribution function (CDF) of the positioning error for different numbers of LoRa access nodes. For three access nodes and at an SNR = \u221218 dB, the positioning error is higher than 50 m in more than 50 percent of cases. 
On the other hand, with 100 access nodes distributed in the 0.4 km<sup>2<\/sup> area, we can achieve an error below 10 m in more than 50 percent of cases.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig7_eSilva_Sensors2018_18-8.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"abc7803e75fdb040b2cfb807a078acdc\"><img alt=\"Fig7 eSilva Sensors2018 18-8.jpg\" src=\"https:\/\/www.limswiki.org\/images\/b\/b9\/Fig7_eSilva_Sensors2018_18-8.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 7.<\/b> Example of LoRa performance at various numbers of access nodes<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h2><span class=\"mw-headline\" id=\"Measurement-based_performance_metrics_with_Wirepas_IoT_platform\">Measurement-based performance metrics with Wirepas IoT platform<\/span><\/h2>\n<p>This section presents experimental results from an IoT positioning system built with the Wirepas IoT mesh solution. The results presented in this section were obtained at Wirepas\u2019 offices and are based on power measurements.\n<\/p><p>The environment where measurements took place, with a total area of 180 m<sup>2<\/sup> (10 by 18 meters), is a typical work environment with a few small rooms and a large open area (see Figure 8). Several battery-operated devices were placed across the floor, extending the network coverage inside and outside the rooms. Some of these devices acted as known reference points, while others acted as tracked devices. The reference points are identified in Figure 8 as routing devices (blue squares) and the measurement devices as yellow dots. 
All the devices were operating in the 2.4 GHz band using Nordic\u2019s nRF51 as the radio chipset.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig8_eSilva_Sensors2018_18-8.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"7a36a2364db51a3771fe6fe5ff5cc42e\"><img alt=\"Fig8 eSilva Sensors2018 18-8.jpg\" src=\"https:\/\/www.limswiki.org\/images\/a\/a9\/Fig8_eSilva_Sensors2018_18-8.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 8.<\/b> Office environment where the measurements were acquired<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>In this setup, the measurement devices, which remained static, collected information about the network beacons broadcast periodically by the routing devices. The information about the routing devices\u2019 beacons, as seen by the measurement devices, was sent regularly towards the network sink. In turn, the network sink and gateway communicated the measurements to a positioning engine running on a local computer. The positioning engine provided a location estimate based on the known location of the routing devices and the RSS observed by the measurement devices. A location estimate was calculated by one-shot runs of a weighted centroid algorithm, meaning that no averaging or filtering was applied to the location estimates. However, the RSS measurements were averaged over a window of time to mitigate the channel propagation effects.\n<\/p><p>In addition to a location estimate in a global or local reference frame, the positioning engine also provided an area-based location. 
The area-based location consists of matching the location estimate to a set of geographical areas of interest (shaded areas in Figure 8). A device was considered to be in such an area when its location estimate was found to be inside the geographic area defined by the four coordinate points of that area.\n<\/p><p>The results in Table 3 show the probability of correctly classifying the measurement devices in the areas of interest. The percentage is calculated by comparing the number of location estimates in the node\u2019s correct area with the total number of location estimates in any other area of interest.\n<\/p><p><br \/>\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"4\"><b>Table 3.<\/b> Experimental results with an IoT testbed using Wirepas connectivity with 60 fixes per second and static nodes\n<\/td><\/tr>\n\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" rowspan=\"2\">Area (m<sup>2<\/sup>)\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Office Hours\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Outside Office Hours\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">All Day\n<\/th><\/tr>\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" colspan=\"3\">% of Correct Location Area Classification\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">10\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">95.41\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">89.47\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">91.16\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">10\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">96.16\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">97.59\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">97.18\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">91.56\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">94.85\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">93.90\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">3\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">96.76\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">99.20\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">98.50\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Mean\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">95.51\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">95.61\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">95.57\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>The results in Table 3 show that, over a full day, the devices were correctly located inside the logical area where they were known to be more than 90 percent of the time.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusions\">Conclusions<\/span><\/h2>\n<p>We believe positioning is important not only for IoT end applications, but also to support network self-management. 
Our paper addresses the lack of comprehensive studies comparing IoT solutions and their fitness for positioning applications. The paper first covered three possible measurement domains from which IoT devices could derive their location. Afterwards, we focused on the classification of IoT solutions and discussed several system parameters that should be considered when designing a positioning system. We concluded our study with a comparative table and a discussion of multiple IoT and other wireless solutions. We also provided an overview of achievable system performance with unique results for three positioning systems built on top of IEEE 802.11az, LoRa, and Wirepas.\n<\/p><p>Overall, based on our study, we conclude that power-domain positioning currently offers the best trade-off between implementation cost and positioning accuracy for low-power systems. Dedicated positioning signaling as well as space-based approaches are some of the feasible ways to push for higher accuracy and still offer low-power operation. Cooperation with other wireless technologies, such as WiFi and 5G, could allow for mobility support and the ability to operate at large scale when low-power operation is not critical.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Author_contributions\">Author contributions<\/span><\/h3>\n<p>Conceptualization, P.F.eS. and V.K.; Funding acquisition, E.-S.L.; Investigation, P.F.eS.; Software, P.F.eS. and E.-S.L.; Supervision, V.K. and E.-S.L.; Writing\u2014Review and Editing, P.F.eS., V.K. and E.-S.L.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Funding\">Funding<\/span><\/h3>\n<p>This work has been partly supported by EU FP7 Marie Curie Initial Training Network MULTI-POS (Multi-technology Positioning Professionals) under Grant No. 
316528 and by the Academy of Finland, project numbers 303576 and 313039.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Conflicts_of_interest\">Conflicts of interest<\/span><\/h3>\n<p>Two of the authors are currently full-time employees at Wirepas. Regardless of their relationship with Wirepas, the authors have kept the reporting objective and fair for all the technologies under examination. Furthermore, the authors have not received any additional financial incentive to execute or complete this work. Wirepas contributed hardware for the activities performed in this report, which was part of the research visit arranged within the MULTI-POS project.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-delPeral-RosadoWhite18-1\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-delPeral-RosadoWhite18_1-0\" rel=\"external_link\">1.0<\/a><\/sup> <sup><a href=\"#cite_ref-delPeral-RosadoWhite18_1-1\" rel=\"external_link\">1.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation web\">del Peral-Rosado, J.A.; Seco-Granados, G.; Raulefs, R. 
et al.&#32;(April 2018).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.iracon.org\/wp-content\/uploads\/2018\/03\/IRACON-WP2.pdf\" target=\"_blank\">\"Whitepaper on New Localization Methods for 5G Wireless Systems and the Internet-of-Things\"<\/a>.&#32;In&#32;Witrisal, K.; Ant\u00f3n-Haro, C.&#32;(PDF).&#32;COST<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.iracon.org\/wp-content\/uploads\/2018\/03\/IRACON-WP2.pdf\" target=\"_blank\">http:\/\/www.iracon.org\/wp-content\/uploads\/2018\/03\/IRACON-WP2.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Whitepaper+on+New+Localization+Methods+for+5G+Wireless+Systems+and+the+Internet-of-Things&amp;rft.atitle=&amp;rft.aulast=del+Peral-Rosado%2C+J.A.%3B+Seco-Granados%2C+G.%3B+Raulefs%2C+R.+et+al.&amp;rft.au=del+Peral-Rosado%2C+J.A.%3B+Seco-Granados%2C+G.%3B+Raulefs%2C+R.+et+al.&amp;rft.date=April+2018&amp;rft.pub=COST&amp;rft_id=http%3A%2F%2Fwww.iracon.org%2Fwp-content%2Fuploads%2F2018%2F03%2FIRACON-WP2.pdf&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LinPositioning17-2\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-LinPositioning17_2-0\" rel=\"external_link\">2.0<\/a><\/sup> <sup><a href=\"#cite_ref-LinPositioning17_2-1\" rel=\"external_link\">2.1<\/a><\/sup> <sup><a href=\"#cite_ref-LinPositioning17_2-2\" rel=\"external_link\">2.2<\/a><\/sup> <sup><a href=\"#cite_ref-LinPositioning17_2-3\" rel=\"external_link\">2.3<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Lin, X.; Bergman, J.; Gunnarsson, F. 
et al.&#32;(2017).&#32;\"Positioning for the Internet of Things: A 3GPP Perspective\".&#32;<i>IEEE Communications Magazine<\/i>&#32;<b>55<\/b>&#32;(12): 179\u201385.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FMCOM.2017.1700269\" target=\"_blank\">10.1109\/MCOM.2017.1700269<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Positioning+for+the+Internet+of+Things%3A+A+3GPP+Perspective&amp;rft.jtitle=IEEE+Communications+Magazine&amp;rft.aulast=Lin%2C+X.%3B+Bergman%2C+J.%3B+Gunnarsson%2C+F.+et+al.&amp;rft.au=Lin%2C+X.%3B+Bergman%2C+J.%3B+Gunnarsson%2C+F.+et+al.&amp;rft.date=2017&amp;rft.volume=55&amp;rft.issue=12&amp;rft.pages=179%E2%80%9385&amp;rft_id=info:doi\/10.1109%2FMCOM.2017.1700269&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LinEnhanced16-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LinEnhanced16_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Lin, K.; Chen, M.; Deng, J. 
et al.&#32;(2016).&#32;\"Enhanced Fingerprinting and Trajectory Prediction for IoT Localization in Smart Buildings\".&#32;<i>IEEE Transactions on Automation Science and Engineering<\/i>&#32;<b>13<\/b>&#32;(3): 1294\u2013307.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FTASE.2016.2543242\" target=\"_blank\">10.1109\/TASE.2016.2543242<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Enhanced+Fingerprinting+and+Trajectory+Prediction+for+IoT+Localization+in+Smart+Buildings&amp;rft.jtitle=IEEE+Transactions+on+Automation+Science+and+Engineering&amp;rft.aulast=Lin%2C+K.%3B+Chen%2C+M..%3B+Deng%2C+J.+et+al.&amp;rft.au=Lin%2C+K.%3B+Chen%2C+M..%3B+Deng%2C+J.+et+al.&amp;rft.date=2016&amp;rft.volume=13&amp;rft.issue=3&amp;rft.pages=1294%E2%80%93307&amp;rft_id=info:doi\/10.1109%2FTASE.2016.2543242&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-delPeral-RosadoImpact17-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-delPeral-RosadoImpact17_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">del Peral-Rosado, J.A.; L\u00f3pez-Salcedo, J.A.; Seco-Granados, G.&#32;(2017).&#32;\"Impact of frequency-hopping NB-IoT positioning in 4G and future 5G networks\".&#32;<i>IEEE International Conference on Communications Workshops<\/i>: 815\u201320.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/dx.doi.org\/10.1109%2FICCW.2017.7962759\" target=\"_blank\">10.1109\/ICCW.2017.7962759<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Impact+of+frequency-hopping+NB-IoT+positioning+in+4G+and+future+5G+networks&amp;rft.jtitle=IEEE+International+Conference+on+Communications+Workshops&amp;rft.aulast=del+Peral-Rosado%2C+J.A.%3B+L%C3%B3pez-Salcedo%2C+J.A.%3B+Seco-Granados%2C+G.&amp;rft.au=del+Peral-Rosado%2C+J.A.%3B+L%C3%B3pez-Salcedo%2C+J.A.%3B+Seco-Granados%2C+G.&amp;rft.date=2017&amp;rft.pages=815%E2%80%9320&amp;rft_id=info:doi\/10.1109%2FICCW.2017.7962759&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ChenRobust17-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ChenRobust17_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Chen, L.; Thombre, S.; J\u00e4rvinen, K. 
et al.&#32;(2017).&#32;\"Robustness, Security and Privacy in Location-Based Services for Future IoT: A Survey\".&#32;<i>IEEE Access<\/i>&#32;<b>5<\/b>: 8956\u201377.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FACCESS.2017.2695525\" target=\"_blank\">10.1109\/ACCESS.2017.2695525<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Robustness%2C+Security+and+Privacy+in+Location-Based+Services+for+Future+IoT%3A+A+Survey&amp;rft.jtitle=IEEE+Access&amp;rft.aulast=Chen%2C+L.%3B+Thombre%2C+S.%3B+J%C3%A4rvinen%2C+K.+et+al.&amp;rft.au=Chen%2C+L.%3B+Thombre%2C+S.%3B+J%C3%A4rvinen%2C+K.+et+al.&amp;rft.date=2017&amp;rft.volume=5&amp;rft.pages=8956%E2%80%9377&amp;rft_id=info:doi\/10.1109%2FACCESS.2017.2695525&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SinghCreate17-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SinghCreate17_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Singh, K.J.; Kapoor, D.S.&#32;(2017).&#32;\"Create Your Own Internet of Things: A survey of IoT platforms\".&#32;<i>IEEE Consumer Electronics Magazine<\/i>&#32;<b>6<\/b>&#32;(2): 57\u201368.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FMCE.2016.2640718\" target=\"_blank\">10.1109\/MCE.2016.2640718<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Create+Your+Own+Internet+of+Things%3A+A+survey+of+IoT+platforms&amp;rft.jtitle=IEEE+Consumer+Electronics+Magazine&amp;rft.aulast=Singh%2C+K.J.%3B+Kapoor%2C+D.S.&amp;rft.au=Singh%2C+K.J.%3B+Kapoor%2C+D.S.&amp;rft.date=2017&amp;rft.volume=6&amp;rft.issue=2&amp;rft.pages=57%E2%80%9368&amp;rft_id=info:doi\/10.1109%2FMCE.2016.2640718&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ZhangSecure17-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ZhangSecure17_7-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Zhang, P.; Nagarajan, S.G.; Nevat, I.&#32;(2017).&#32;\"Secure Location of Things (SLOT): Mitigating Localization Spoofing Attacks in the Internet of Things\".&#32;<i>IEEE Internet of Things Journal<\/i>&#32;<b>4<\/b>&#32;(6): 2199\u2013206.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FJIOT.2017.2753579\" target=\"_blank\">10.1109\/JIOT.2017.2753579<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Secure+Location+of+Things+%28SLOT%29%3A+Mitigating+Localization+Spoofing+Attacks+in+the+Internet+of+Things&amp;rft.jtitle=IEEE+Internet+of+Things+Journal&amp;rft.aulast=Zhang%2C+P.%3B+Nagarajan%2C+S.G.%3B+Nevat%2C+I.&amp;rft.au=Zhang%2C+P.%3B+Nagarajan%2C+S.G.%3B+Nevat%2C+I.&amp;rft.date=2017&amp;rft.volume=4&amp;rft.issue=6&amp;rft.pages=2199%E2%80%93206&amp;rft_id=info:doi\/10.1109%2FJIOT.2017.2753579&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Al-SarawiInternet17-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Al-SarawiInternet17_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Al-Sarawi, S.; Anbar, M.; Alieyan, K. et al.&#32;(2017).&#32;\"Internet of Things (IoT) communication protocols: Review\".&#32;<i>Proceedings from the 8th International Conference on Information Technology<\/i>: 685\u201390.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FICITECH.2017.8079928\" target=\"_blank\">10.1109\/ICITECH.2017.8079928<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Internet+of+Things+%28IoT%29+communication+protocols%3A+Review&amp;rft.jtitle=Proceedings+from+the+8th+International+Conference+on+Information+Technology&amp;rft.aulast=Al-Sarawi%2C+S.%3B+Anbar%2C+M.%3B+Alieyan%2C+K.+et+al.&amp;rft.au=Al-Sarawi%2C+S.%3B+Anbar%2C+M.%3B+Alieyan%2C+K.+et+al.&amp;rft.date=2017&amp;rft.pages=685%E2%80%9390&amp;rft_id=info:doi\/10.1109%2FICITECH.2017.8079928&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-RazaLow17-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-RazaLow17_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Raza, U.; Kulkarni, P.; Sooriyabandara, M.&#32;(2017).&#32;\"Low Power Wide Area Networks: An Overview\".&#32;<i>IEEE Communications Surveys &amp; Tutorials<\/i>&#32;<b>19<\/b>&#32;(2): 855\u201373.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FCOMST.2017.2652320\" target=\"_blank\">10.1109\/COMST.2017.2652320<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Low+Power+Wide+Area+Networks%3A+An+Overview&amp;rft.jtitle=IEEE+Communications+Surveys+%26+Tutorials&amp;rft.aulast=Raza%2C+U.%3B+Kulkarni%2C+P.%3B+Sooriyabandara%2C+M.&amp;rft.au=Raza%2C+U.%3B+Kulkarni%2C+P.%3B+Sooriyabandara%2C+M.&amp;rft.date=2017&amp;rft.volume=19&amp;rft.issue=2&amp;rft.pages=855%E2%80%9373&amp;rft_id=info:doi\/10.1109%2FCOMST.2017.2652320&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ZanellaBest16-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ZanellaBest16_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Zanella, A.&#32;(2016).&#32;\"Best Practice in RSS Measurements and Ranging\".&#32;<i>IEEE Communications Surveys &amp; Tutorials<\/i>&#32;<b>18<\/b>&#32;(4): 2662\u201386.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FCOMST.2016.2553452\" target=\"_blank\">10.1109\/COMST.2016.2553452<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Best+Practice+in+RSS+Measurements+and+Ranging&amp;rft.jtitle=IEEE+Communications+Surveys+%26+Tutorials&amp;rft.aulast=Zanella%2C+A.&amp;rft.au=Zanella%2C+A.&amp;rft.date=2016&amp;rft.volume=18&amp;rft.issue=4&amp;rft.pages=2662%E2%80%9386&amp;rft_id=info:doi\/10.1109%2FCOMST.2016.2553452&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-LohanReceived15-11\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-LohanReceived15_11-0\" rel=\"external_link\">11.0<\/a><\/sup> <sup><a href=\"#cite_ref-LohanReceived15_11-1\" rel=\"external_link\">11.1<\/a><\/sup> <sup><a href=\"#cite_ref-LohanReceived15_11-2\" rel=\"external_link\">11.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Lohan, E.S.; Talvitie, J.; e Silva, P.F. et al.&#32;(2015).&#32;\"Received signal strength models for WLAN and BLE-based indoor positioning in multi-floor buildings\".&#32;<i>Proceedings from the 2015 International Conference on Location and GNSS<\/i>: 1\u20136.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FICL-GNSS.2015.7217154\" target=\"_blank\">10.1109\/ICL-GNSS.2015.7217154<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Received+signal+strength+models+for+WLAN+and+BLE-based+indoor+positioning+in+multi-floor+buildings&amp;rft.jtitle=Proceedings+from+the+2015+International+Conference+on+Location+and+GNSS&amp;rft.aulast=Lohan%2C+E.S.%3B+Talvitie%2C+J.%3B+e+Silva%2C+P.F.+et+al.&amp;rft.au=Lohan%2C+E.S.%3B+Talvitie%2C+J.%3B+e+Silva%2C+P.F.+et+al.&amp;rft.date=2015&amp;rft.pages=1%E2%80%936&amp;rft_id=info:doi\/10.1109%2FICL-GNSS.2015.7217154&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KayFund93-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KayFund93_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Kay, S.M.&#32;(1993).&#32;<i>Fundamentals of 
Statistical Signal Processing: Estimation Theory<\/i>.&#32;<b>1<\/b>.&#32;Prentice Hall.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9780133457117.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Fundamentals+of+Statistical+Signal+Processing%3A+Estimation+Theory&amp;rft.aulast=Kay%2C+S.M.&amp;rft.au=Kay%2C+S.M.&amp;rft.date=1993&amp;rft.volume=1&amp;rft.pub=Prentice+Hall&amp;rft.isbn=9780133457117&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-EUGeneralData16-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-EUGeneralData16_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/eur-lex.europa.eu\/legal-content\/EN\/TXT\/?uri=CELEX%3A32016R0679\" target=\"_blank\">\"Regulation (EU) 2016\/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95\/46\/EC (General Data Protection Regulation)\"<\/a>.&#32;<i>EUR-Lex<\/i>.&#32;European Union.&#32;27 April 2016<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/eur-lex.europa.eu\/legal-content\/EN\/TXT\/?uri=CELEX%3A32016R0679\" target=\"_blank\">https:\/\/eur-lex.europa.eu\/legal-content\/EN\/TXT\/?uri=CELEX%3A32016R0679<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Regulation+%28EU%29+2016%2F679+of+the+European+Parliament+and+of+the+Council+of+27+April+2016+on+the+protection+of+natural+persons+with+regard+to+the+processing+of+personal+data+and+on+the+free+movement+of+such+data%2C+and+repealing+Directive+95%2F46%2FEC+%28General+Data+Protection+Regulation%29&amp;rft.atitle=EUR-Lex&amp;rft.date=27+April+2016&amp;rft.pub=European+Union&amp;rft_id=https%3A%2F%2Feur-lex.europa.eu%2Flegal-content%2FEN%2FTXT%2F%3Furi%3DCELEX%253A32016R0679&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KoivistoJoint16-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KoivistoJoint16_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">\"Joint 3D Positioning and Network Synchronization in 5G Ultra-Dense Networks Using UKF and EKF\".&#32;<i>Proceedings of the 2016 IEEE Globecom Workshops<\/i>: 1\u20137.&#32;2016.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FGLOCOMW.2016.7848938\" target=\"_blank\">10.1109\/GLOCOMW.2016.7848938<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Joint+3D+Positioning+and+Network+Synchronization+in+5G+Ultra-Dense+Networks+Using+UKF+and+EKF&amp;rft.jtitle=Proceedings+of+the+2016+IEEE+Globecom+Workshops&amp;rft.date=2016&amp;rft.pages=1%E2%80%937&amp;rft_id=info:doi\/10.1109%2FGLOCOMW.2016.7848938&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LevanenLocation17-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LevanenLocation17_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">\"Location-aware 5G communications and Doppler compensation for high-speed train networks\".&#32;<i>Proceedings of the 2017 European Conference on Networks and Communications<\/i>: 1\u20136.&#32;2017.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FEuCNC.2017.7980755\" target=\"_blank\">10.1109\/EuCNC.2017.7980755<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Location-aware+5G+communications+and+Doppler+compensation+for+high-speed+train+networks&amp;rft.jtitle=Proceedings+of+the+2017+European+Conference+on+Networks+and+Communications&amp;rft.date=2017&amp;rft.pages=1%E2%80%936&amp;rft_id=info:doi\/10.1109%2FEuCNC.2017.7980755&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-FaragherLocation15-16\"><span class=\"mw-cite-backlink\"><a 
href=\"#cite_ref-FaragherLocation15_16-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">\"Location Fingerprinting With Bluetooth Low Energy Beacons\".&#32;<i>IEEE Journal on Selected Areas in Communications<\/i>&#32;<b>33<\/b>&#32;(11): 2418\u201328.&#32;2015.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FJSAC.2015.2430281\" target=\"_blank\">10.1109\/JSAC.2015.2430281<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Location+Fingerprinting+With+Bluetooth+Low+Energy+Beacons&amp;rft.jtitle=IEEE+Journal+on+Selected+Areas+in+Communications&amp;rft.date=2015&amp;rft.volume=33&amp;rft.issue=11&amp;rft.pages=2418%E2%80%9328&amp;rft_id=info:doi\/10.1109%2FJSAC.2015.2430281&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DavidsonASurv16-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DavidsonASurv16_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">\"A Survey of Selected Indoor Positioning Methods for Smartphones\".&#32;<i>IEEE Communications Surveys &amp; Tutorials<\/i>&#32;<b>19<\/b>&#32;(2): 1347\u201370.&#32;2017.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FCOMST.2016.2637663\" target=\"_blank\">10.1109\/COMST.2016.2637663<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+Survey+of+Selected+Indoor+Positioning+Methods+for+Smartphones&amp;rft.jtitle=IEEE+Communications+Surveys+%26+Tutorials&amp;rft.date=2017&amp;rft.volume=19&amp;rft.issue=2&amp;rft.pages=1347%E2%80%9370&amp;rft_id=info:doi\/10.1109%2FCOMST.2016.2637663&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-FargasGPS17-18\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-FargasGPS17_18-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">\"GPS-free geolocation using LoRa in low-power WANs\".&#32;<i>Proceedings of the 2017 Global Internet of Things Summit<\/i>: 1\u20136.&#32;2017.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FGIOTS.2017.8016251\" target=\"_blank\">10.1109\/GIOTS.2017.8016251<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=GPS-free+geolocation+using+LoRa+in+low-power+WANs&amp;rft.jtitle=Proceedings+of+the+2017+Global+Internet+of+Things+Summit&amp;rft.date=2017&amp;rft.pages=1%E2%80%936&amp;rft_id=info:doi\/10.1109%2FGIOTS.2017.8016251&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BasheerLocal13-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BasheerLocal13_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation 
Journal\">\"Localization of RFID Tags Using Stochastic Tunneling\".&#32;<i>IEEE Transactions on Mobile Computing<\/i>&#32;<b>12<\/b>&#32;(6): 1225\u201335.&#32;2013.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FTMC.2012.80\" target=\"_blank\">10.1109\/TMC.2012.80<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Localization+of+RFID+Tags+Using+Stochastic+Tunneling&amp;rft.jtitle=IEEE+Transactions+on+Mobile+Computing&amp;rft.date=2013&amp;rft.volume=12&amp;rft.issue=6&amp;rft.pages=1225%E2%80%9335&amp;rft_id=info:doi\/10.1109%2FTMC.2012.80&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HasaniHybrid15-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HasaniHybrid15_20-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">\"Hybrid WLAN-RFID Indoor Localization Solution Utilizing Textile Tag\".&#32;<i>IEEE Antennas and Wireless Propagation Letters<\/i>&#32;<b>14<\/b>: 1358\u201361.&#32;2015.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FLAWP.2015.2406951\" target=\"_blank\">10.1109\/LAWP.2015.2406951<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Hybrid+WLAN-RFID+Indoor+Localization+Solution+Utilizing+Textile+Tag&amp;rft.jtitle=IEEE+Antennas+and+Wireless+Propagation+Letters&amp;rft.date=2015&amp;rft.volume=14&amp;rft.pages=1358%E2%80%9361&amp;rft_id=info:doi\/10.1109%2FLAWP.2015.2406951&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LiuBackPos16-21\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LiuBackPos16_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">\"BackPos: High Accuracy Backscatter Positioning System\".&#32;<i>IEEE Transactions on Mobile Computing<\/i>&#32;<b>15<\/b>&#32;(3): 586\u201398.&#32;2016.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FTMC.2015.2424437\" target=\"_blank\">10.1109\/TMC.2015.2424437<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=BackPos%3A+High+Accuracy+Backscatter+Positioning+System&amp;rft.jtitle=IEEE+Transactions+on+Mobile+Computing&amp;rft.date=2016&amp;rft.volume=15&amp;rft.issue=3&amp;rft.pages=586%E2%80%9398&amp;rft_id=info:doi\/10.1109%2FTMC.2015.2424437&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MaFusion17-22\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MaFusion17_22-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation 
Journal\">\"Fusion of RSS and Phase Shift Using the Kalman Filter for RFID Tracking\".&#32;<i>IEEE Sensors Journal<\/i>&#32;<b>17<\/b>&#32;(11): 3551\u201358.&#32;2017.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FJSEN.2017.2696054\" target=\"_blank\">10.1109\/JSEN.2017.2696054<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Fusion+of+RSS+and+Phase+Shift+Using+the+Kalman+Filter+for+RFID+Tracking&amp;rft.jtitle=IEEE+Sensors+Journal&amp;rft.date=2017&amp;rft.volume=17&amp;rft.issue=11&amp;rft.pages=3551%E2%80%9358&amp;rft_id=info:doi\/10.1109%2FJSEN.2017.2696054&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MaTheOpt17-23\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MaTheOpt17_23-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">\"The Optimization for Hyperbolic Positioning of UHF Passive RFID Tags\".&#32;<i>IEEE Transactions on Automation Science and Engineering<\/i>&#32;<b>14<\/b>&#32;(4): 1590\u20131600.&#32;2017.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FTASE.2017.2656947\" target=\"_blank\">10.1109\/TASE.2017.2656947<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+Optimization+for+Hyperbolic+Positioning+of+UHF+Passive+RFID+Tags&amp;rft.jtitle=IEEE+Transactions+on+Automation+Science+and+Engineering&amp;rft.date=2017&amp;rft.volume=14&amp;rft.issue=4&amp;rft.pages=1590%E2%80%931600&amp;rft_id=info:doi\/10.1109%2FTASE.2017.2656947&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SallouhaLocal17-24\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SallouhaLocal17_24-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">\"Localization in long-range ultra narrow band IoT networks using RSSI\".&#32;<i>Proceedings of the 2017 IEEE International Conference on Communications<\/i>: 1\u20136.&#32;2017.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FICC.2017.7997195\" target=\"_blank\">10.1109\/ICC.2017.7997195<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Localization+in+long-range+ultra+narrow+band+IoT+networks+using+RSSI&amp;rft.jtitle=Proceedings+of+the+2017+IEEE+International+Conference+on+Communications&amp;rft.date=2017&amp;rft.pages=1%E2%80%936&amp;rft_id=info:doi\/10.1109%2FICC.2017.7997195&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-JanssenLocal17-25\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-JanssenLocal17_25-0\" 
rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">\"Localization in Low Power Wide Area Networks Using Wi-Fi Fingerprints\".&#32;<i>Applied Sciences<\/i>&#32;<b>7<\/b>&#32;(9): 936.&#32;2017.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3390%2Fapp7090936\" target=\"_blank\">10.3390\/app7090936<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Localization+in+Low+Power+Wide+Area+Networks+Using+Wi-Fi+Fingerprints&amp;rft.jtitle=Applied+Sciences&amp;rft.date=2017&amp;rft.volume=7&amp;rft.issue=9&amp;rft.pages=936&amp;rft_id=info:doi\/10.3390%2Fapp7090936&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CruzNeigh17-26\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CruzNeigh17_26-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">\"Neighbor-Aided Localization in Vehicular Networks\".&#32;<i>IEEE Transactions on Intelligent Transportation Systems<\/i>&#32;<b>18<\/b>&#32;(10): 2693\u2013702.&#32;2017.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FTITS.2017.2655146\" target=\"_blank\">10.1109\/TITS.2017.2655146<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Neighbor-Aided+Localization+in+Vehicular+Networks&amp;rft.jtitle=IEEE+Transactions+on+Intelligent+Transportation+Systems&amp;rft.date=2017&amp;rft.volume=18&amp;rft.issue=10&amp;rft.pages=2693%E2%80%93702&amp;rft_id=info:doi\/10.1109%2FTITS.2017.2655146&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KalverkampOFDM13-27\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KalverkampOFDM13_27-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">\"OFDM-Based Ranging Approach for Vehicular Safety Applications\".&#32;<i>Proceedings of the 2013 IEEE 78th Vehicular Technology Conference<\/i>: 1\u20135.&#32;2013.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FVTCFall.2013.6692309\" target=\"_blank\">10.1109\/VTCFall.2013.6692309<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=OFDM-Based+Ranging+Approach+for+Vehicular+Safety+Applications&amp;rft.jtitle=Proceedings+of+the+2013+IEEE+78th+Vehicular+Technology+Conference&amp;rft.date=2013&amp;rft.pages=1%E2%80%935&amp;rft_id=info:doi\/10.1109%2FVTCFall.2013.6692309&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-OuAZig17-28\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-OuAZig17_28-0\" rel=\"external_link\">\u2191<\/a><\/span> <span 
class=\"reference-text\"><span class=\"citation Journal\">\"A ZigBee position technique for indoor localization based on proximity learning\".&#32;<i>Proceedings of the 2017 IEEE International Conference on Mechatronics and Automation<\/i>: 875\u201380.&#32;2017.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FICMA.2017.8015931\" target=\"_blank\">10.1109\/ICMA.2017.8015931<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+ZigBee+position+technique+for+indoor+localization+based+on+proximity+learning&amp;rft.jtitle=Proceedings+of+the+2017+IEEE+International+Conference+on+Mechatronics+and+Automation&amp;rft.date=2017&amp;rft.pages=875%E2%80%9380&amp;rft_id=info:doi\/10.1109%2FICMA.2017.8015931&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DongImp17-29\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DongImp17_29-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">\"Implementation of indoor fingerprint positioning based on ZigBee\".&#32;<i>Proceedings of the 2017 29th Chinese Control And Decision Conference<\/i>: 2654\u201359.&#32;2017.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FCCDC.2017.7978963\" target=\"_blank\">10.1109\/CCDC.2017.7978963<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Implementation+of+indoor+fingerprint+positioning+based+on+ZigBee&amp;rft.jtitle=Proceedings+of+the+2017+29th+Chinese+Control+And+Decision+Conference&amp;rft.date=2017&amp;rft.pages=2654-59&amp;rft_id=info:doi\/10.1109%2FCCDC.2017.7978963&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CheonIEEE16-30\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CheonIEEE16_30-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4801579\" target=\"_blank\">\"IEEE 802.15.4 ZigBee-Based Time-of-Arrival Estimation for Wireless Sensor Networks\"<\/a>.&#32;<i>Sensors<\/i>&#32;<b>16<\/b>&#32;(2): 203.&#32;2016.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3390%2Fs16020203\" target=\"_blank\">10.3390\/s16020203<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4801579\/\" target=\"_blank\">PMC4801579<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26861331\" target=\"_blank\">26861331<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" 
class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4801579\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4801579<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=IEEE+802.15.4+ZigBee-Based+Time-of-Arrival+Estimation+for+Wireless+Sensor+Networks&amp;rft.jtitle=Sensors&amp;rft.date=2016&amp;rft.volume=16&amp;rft.issue=2&amp;rft.pages=203&amp;rft_id=info:doi\/10.3390%2Fs16020203&amp;rft_id=info:pmc\/PMC4801579&amp;rft_id=info:pmid\/26861331&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4801579&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. 
Some grammar, punctuation, and minor wording issues have been corrected.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends\">https:\/\/www.limswiki.org\/index.php\/Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","69cd9560f847d37e95c0bf5ffc36d532_images":["https:\/\/www.limswiki.org\/images\/9\/99\/Fig1_eSilva_Sensors2018_18-8.jpg","https:\/\/www.limswiki.org\/images\/b\/b1\/Fig2_eSilva_Sensors2018_18-8.jpg","https:\/\/www.limswiki.org\/images\/e\/e4\/Fig3_eSilva_Sensors2018_18-8.jpg","https:\/\/www.limswiki.org\/images\/6\/62\/Fig4_eSilva_Sensors2018_18-8.jpg","https:\/\/www.limswiki.org\/images\/c\/cf\/Fig5_eSilva_Sensors2018_18-8.jpg","https:\/\/www.limswiki.org\/images\/d\/db\/Fig6_eSilva_Sensors2018_18-8.jpg","https:\/\/www.limswiki.org\/images\/b\/b9\/Fig7_eSilva_Sensors2018_18-8.jpg","https:\/\/www.limswiki.org\/images\/a\/a9\/Fig8_eSilva_Sensors2018_18-8.jpg"],"69cd9560f847d37e95c0bf5ffc36d532_timestamp":1544815910,"d6135e8d32b77d11c05c7b261fe72044_type":"article","d6135e8d32b77d11c05c7b261fe72044_title":"systemPipeR: NGS workflow and report generation environment (Backman and Girke 2016)","d6135e8d32b77d11c05c7b261fe72044_url":"https:\/\/www.limswiki.org\/index.php\/Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment","d6135e8d32b77d11c05c7b261fe72044_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:systemPipeR: NGS workflow and report generation environment\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\t \n\nFull article title\n \nsystemPipeR: NGS workflow and report generation environmentJournal\n \nBMC BioinformaticsAuthor(s)\n \nBackman, Tyler W.H.; Girke, ThomasAuthor affiliation(s)\n \nUniversity of California, RiversidePrimary contact\n \nEmail: thomas dot girke at ucr dot eduYear published\n \n2016Volume and issue\n \n17Page(s)\n \n388DOI\n \n10.1186\/s12859-016-1241-0ISSN\n \n1471-2105Distribution license\n \nCreative Commons Attribution 4.0 InternationalWebsite\n 
\nhttps:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-016-1241-0Download\n \nhttps:\/\/bmcbioinformatics.biomedcentral.com\/track\/pdf\/10.1186\/s12859-016-1241-0 (PDF)\n\nContents\n\n1 Abstract \n2 Background \n3 Implementation \n\n3.1 Environment \n3.2 Workflow design \n3.3 Command-line software support \n3.4 Parallel evaluation \n3.5 Automated analysis reports \n\n\n4 Results and discussion \n\n4.1 Overview \n4.2 Workflow templates \n4.3 Add-on tools \n4.4 Performance and scalability \n4.5 Need for an R-based NGS workflow environment \n\n\n5 Conclusion \n6 Availability and requirements \n7 Abbreviations \n8 Declarations \n\n8.1 Acknowledgements \n8.2 Funding \n8.3 Authors\u2019 contributions \n8.4 Competing interests \n8.5 Additional files \n\n\n9 References \n10 Notes \n\n\n\nAbstract \nBackground: Next-generation sequencing (NGS) has revolutionized how research is carried out in many areas of biology and medicine. However, the analysis of NGS data remains a major obstacle to the efficient utilization of the technology, as it requires complex multi-step processing of big data, demanding considerable computational expertise from users. While substantial effort has been invested on the development of software dedicated to the individual analysis steps of NGS experiments, insufficient resources are currently available for integrating the individual software components within the widely used R\/Bioconductor environment into automated workflows capable of running the analysis of most types of NGS applications from start-to-finish in a time-efficient and reproducible manner.\nResults: To address this need, we have developed the R\/Bioconductor package systemPipeR. It is an extensible environment for both building and running end-to-end analysis workflows with automated report generation for a wide range of NGS applications. 
Its unique features include a uniform workflow interface across different NGS applications, automated report generation, and support for running both R and command-line software on local computers and computer clusters. A flexible sample annotation infrastructure efficiently handles complex sample sets and experimental designs. To simplify the analysis of widely used NGS applications, the package provides pre-configured workflows and reporting templates for RNA-Seq, ChIP-Seq, VAR-Seq, and Ribo-Seq. Additional workflow templates will be provided in the future.\nConclusions: systemPipeR accelerates the extraction of reproducible analysis results from NGS experiments. By combining the capabilities of many R\/Bioconductor and command-line tools, it makes efficient use of existing software resources without limiting the user to a set of predefined methods or environments. systemPipeR is freely available for all common operating systems from Bioconductor (http:\/\/bioconductor.org\/packages\/devel\/systemPipeR).\nKeywords: analysis workflow, next generation sequencing (NGS), Ribo-Seq, ChIP-Seq, RNA-Seq, VAR-Seq\n\nBackground \nBy allowing scientists to rapidly sequence and quantify DNA and RNA molecules, next-generation sequencing (NGS) technology has transformed biology into one of the most data-intensive research disciplines. In the past, experiments were performed on a gene-by-gene basis, while NGS has introduced an age where it has become routine to sequence entire transcriptomes, genomes, or epigenomes rather than their isolated parts of interest. It will soon be possible to conduct these experiments on large numbers of single cell samples[1][2] for a wide range of time points, treatments, and genetic backgrounds to study biological systems with greater resolution and precision. 
Sequencing the genetic material of each individual within entire populations of organisms of the same species or genus will enable the study of adaptation processes[3], disease progression, and micro-evolution in real time.[4] This technological shift empowers researchers to address questions at a genome-wide scale, for example by profiling the mRNA, miRNA, and DNA methylation states of a large set of biological samples in parallel.[5]\nThe success of NGS-driven research has led to a data explosion of increasing size and complexity, making it more time-consuming and challenging for researchers to extract knowledge from their experiments. Rapid processing of the results is essential to test, refine, and formulate new hypotheses for designing follow-up experiments. As a result, biologists now have to dedicate substantial time to data analysis tasks, effectively training themselves as genome data scientists rather than focusing on experimentation as they did in the past.\nIn recent years, a considerable number of algorithms, statistical methods, and software tools have been developed to perform the individual analysis steps of different NGS applications. These include short read pre-processors, aligners, variant and peak callers, as well as statistical methods for the analysis of genomic regions that are differentially expressed[6][7], bound[8], or methylated.[9][10] Also essential are tools for processing short read alignments[11], genomic intervals[12][13], and annotations.[14] However, most data analysis routines of NGS applications are very complex, involving multiple software tools for their many processing steps. As a result, there is a great need for flexible software environments that connect the individual software components into automated workflows in order to perform complex genome-wide analyses in an efficient and reproducible manner. 
While many workflow management resources exist[15][16][17][18][19][20][21][22][23][24] for a variety of data analysis programming languages (for details see below), only a few general-purpose NGS workflow solutions are currently available for the popular R programming language. R and the affiliated Bioconductor environment provide a substantial number of widely used tools with a large user base in this area.[10] Thus, a workflow framework for federating NGS applications from within R will have many benefits for experimental and computational scientists who use R for NGS data analysis.\nTo address this need, we designed systemPipeR as a Bioconductor package for building and running workflows for most NGS applications, with support for integrating a wide array of command-line and R\/Bioconductor software.\n\nImplementation \nEnvironment \nsystemPipeR has been implemented as an open-source Bioconductor package using the R programming language for statistical computing and graphics. R was chosen as the core development platform for systemPipeR for the following reasons. (i) R is currently one of the most popular statistical data analysis and programming environments in bioinformatics. (ii) Its external language bindings support the implementation of computationally time-consuming analysis steps in high-performance languages such as C\/C++. (iii) It supports advanced parallel computation on multi-core machines and computer clusters. (iv) A well-developed infrastructure interfaces R with several other popular programming languages such as Python. (v) R provides advanced graphical and visualization utilities for scientific computing. (vi) It offers access to a vast landscape of statistical and machine learning tools. (vii) Its integration with the Bioconductor project promotes reusability of genomics software components, while also making efficient use of a large number of existing NGS packages that are well tested and widely used by the community. 
To support long-term reproducibility of analysis outcomes, systemPipeR is also distributed as a Docker image of Bioconductor\u2019s sequencing division. Docker containers provide an efficient solution for packaging complex software together with all its system dependencies to ensure it will run the same in the future across different environments, including different operating systems and cloud-based solutions.\n\nWorkflow design \nsystemPipeR workflows (Fig. 1) can be run from start-to-finish with a single command, or stepwise in interactive mode from the R console. New workflows are constructed, or existing ones modified, by connecting so-called SYSargs workflow control modules (R S4 class). Each SYSargs module contains instructions needed for processing a set of input files with a specific command-line or R software; as well as the paths to the corresponding outputs generated by a specific NGS tool such as a read preprocessor (trimmed\/filtered FASTQ files), aligner (SAM\/BAM files), read counter, variant caller (VCF\/BCF files), peak caller (BED\/WIG files), or statistical function. Typically, the only input the user needs to provide for running workflows is a single tabular targets file containing the paths to the initial sample input files (e.g. FASTQ) along with sample labels, and if appropriate biological replicate and contrast information for controlling differential abundance analyses (e.g., gene expression). Downstream derivatives of these targets files along with the corresponding SYSargs instances (see Fig. 1) are created automatically within each workflow. \n\r\n\n\n\n\n\n\n\n\n\n\n Figure 1. Workflow steps with input\/output file operations are controlled by SYSargs objects. Each SYSargs instance is constructed from a targets and a param file. The only input required from the user is the initial targets file. Subsequent instances are created automatically. 
Any number of predefined or custom workflow steps is supported.\n\n\n\nThe parameters required for running command-line software are provided by parameter (param) files, described below. For R-based workflow steps, param files are not required but can be useful for operations importing and\/or exporting sample-level files. This modular design has several advantages. First, it provides a high level of flexibility for designing workflows, such as allowing the user to start workflows from the very beginning or anywhere in between (e.g. FASTQ or BAM level). Second, it is straightforward to add custom workflow steps without requiring computational expert knowledge from users. Workflows can also have any number of steps including branch points. Lastly, it minimizes errors, as all input and output files are registered, and sample labels specified in the initial targets file will be consistently used throughout all workflow results, including plots, tables, and workflow reports.\n\nCommand-line software support \nAn important feature of systemPipeR is support for running command-line software directly from R on both single machines and computer clusters. This offers several advantages, such as seamless integration of most command-line software available in the NGS field with the extensive genome analysis resources provided by R\/Bioconductor. The user interface for running command-line software has been generalized as a single function for ease of use, while only one additional command will run the same tool in parallel mode on a computer cluster (see below). Examples of command-line software used by systemPipeR\u2019s preconfigured workflow templates (see below) include the aligners BWA-MEM[25], Bowtie2[26], TopHat2[27], HISAT2[28], as well as the peak\/variant callers MACS[29], GATK[30], and BCFtools.[11] Support for additional command-line NGS software can be added by simply providing the argument settings of the chosen software tool in a tabular param file. 
If appropriate, new param files can be permanently included in the package to share them with the community. Functionality for creating param files automatically will be provided in the future. This will allow users to create new param instances simply by providing an example of the command-line syntax of a chosen software tool. \nMajor advantages of running command-line software from within systemPipeR include: a uniform sample management infrastructure within and across workflows; integration of BatchJobs\u2019[31] efficient error management infrastructure for job submissions on computer clusters; the simplicity of restarting failed processes; as well as seamless addition of new samples (e.g., FASTQ or BAM files). In case of a restart, the system will skip the analysis steps of already completed samples and only perform the analysis of the missing ones. If required, any workflow step can be rerun on demand for all or a subset of samples. When submitting command-line software to computer clusters, BatchJobs monitors the status of job submissions and alerts users of exceptions, while recording warning and error messages for each process in a log directory with a database-like structure that is accessible from within R or the command-line. This organization helps to diagnose and resolve errors.\n\nParallel evaluation \nThe processing time for NGS experiments can be greatly reduced by making use of parallel evaluation across several CPU cores on single machines, or multiple nodes of computer clusters and cloud-based systems. systemPipeR simplifies these parallelization tasks without creating any limitations for users who do not have access to high-performance computer (HPC) resources by providing the option to run workflows in serial or parallel mode. 
The parallelization functionalities available in systemPipeR are largely based on existing and well-maintained R packages, mainly BatchJobs and BiocParallel.[31] By making use of cluster template files, most schedulers and queuing systems are also supported (e.g., Torque, Sun Grid Engine, Slurm). If required, entire workflows can be executed in parallel mode by issuing a single command, while simultaneously generating a detailed analysis report (for details see below). If sufficient parallel computer resources are available, systemPipeR can complete the entire analysis workflow of several complex NGS experiments, each containing large numbers of FASTQ files, within hours rather than days or weeks, as can be the case for non-parallelized workflows.\n\nAutomated analysis reports \nsystemPipeR generates automated analysis reports with knitr and R markdown.[32] These modern reporting environments integrate R code with LaTeX or Markdown. During the evaluation of the R code, reports are dynamically generated in PDF or HTML format. A caching system allows selected workflow reporting steps to be re-executed without repeating unnecessary components. This way, one can generate reports that resemble a research paper where user-generated text is combined with analysis results. This includes support for citations, autogenerated bibliographies, code chunks with syntax highlighting, and inline evaluation of variables to update text content. Data components in a report such as tables and figures are updated automatically when rebuilding the document and\/or rerunning workflows partially or entirely.\n\nResults and discussion \nOverview \nsystemPipeR provides utilities for building and running NGS analysis workflows. To adapt to community standards, widely used R\/Bioconductor packages are integrated where possible. 
This includes the Bioconductor packages ShortRead, Biostrings, and Rsamtools for processing sequence and alignment files[33]; GenomicRanges, GenomicAlignments, and GenomicFeatures for handling genomic range operations, read counting, and annotation data[12]; edgeR and DESeq2 for differential abundance analysis[6][7]; and VariantTools and VariantAnnotation for filtering and annotating genome variants.[34] If necessary, one can substitute most of these packages with alternative R or command-line tools. \nBecause many NGS applications share overlapping analysis needs (Fig. 2 a), certain workflow steps are conceptualized in systemPipeR by a single generic function, with support for application-specific parameter settings (Table 1). For instance, most NGS applications involve a short read alignment step (see Fig. 2 b), but with very distinct mapping requirements, such as splice junction awareness and variant tolerance for RNA-Seq and VAR-Seq, respectively. To simplify their execution for the user, the different aligners can be run with the same runCommandline function where the software, and its parameter settings are specified in the corresponding SYSargs instance (see above and Fig. 1).\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 2. Workflow Steps and Graphical Features. Relevant workflow steps of several NGS applications (a) are illustrated in form of a simplified flowchart (b). 
Examples of systemPipeR\u2019s functionalities are given under (c) including: (1) eight different plots for summarizing the quality and diversity of short reads provided as FASTQ files; (2) strand-specific read count summaries for all feature types provided by a genome annotation; (3) summary plots of read depth coverage for any number of transcripts with nucleotide resolution upstream\/downstream of their start and stop codons, as well as binned coverage for their coding regions; (4) enumeration of up- and down-regulated DEGs for user defined sample comparisons; (5) similarity clustering of sample profiles; (6) 2-5-way Venn diagrams for DEGs, peak and variant sets; (7) gene-wise clustering with a wide range of algorithms; and (8) support for plotting read pileups and variants in the context of genome annotations along with genome browser support.\n\n\n\n\n\n\n\n\n\nTable 1. Selected functions. The table lists a subset of over 50 methods and functions defined by systemPipeR. Usage instructions are provided in the corresponding help pages and vignettes of the package.\n\n\n\nFunction name\n\nDescription\n\n\ngenWorkenvir\n\nGenerates workflow templates provided by systemPipeRdata helper package\n\n\nsystemArgs\n\nConstructs SYSargs workflow control module (S4 object) from targets and param files\n\n\nrunCommandline\n\nExecutes command-line software on samples and parameters specified in SYSargs\n\n\nclusterRun\n\nRuns command-line software in parallel mode on a computer cluster\n\n\npreprocessReads\n\nFiltering and\/or trimming of short reads using predefined or custom parameters\n\n\nseeFastq\/seeFastqPlot\n\nGenerates quality reports for any number of FASTQ files\n\n\nalignStats\n\nGenerates alignment statistics, such as total number of reads and alignment frequency\n\n\nrun_edgeR\/run_DESeq2\n\nRuns edgeR or DESeq2 for any number of pairwise sample 
comparisons\n\n\nfilterDEGs\n\nFilters and plots DEG results based on user-defined parameters\n\n\noverLapper\/vennPlot\n\nComputation of Venn intersects for 2-20 or more samples and 2-5 way Venn diagrams\n\n\nGOCluster_Report\n\nGO term enrichment analysis for large numbers of gene sets\n\n\nvariantReport\n\nGenerates a variant report containing genomic annotations and confidence statistics\n\n\npredORF\n\nPrediction of short open reading frames in DNA sequences\n\n\nfeaturetypeCounts\n\nComputes and plots read distribution for many feature types at once\n\n\nfeatureCoverage\n\nComputes and plots read depth coverage from many transcripts\n\n\n\nWorkflow templates \nsystemPipeR also provides end-to-end workflow templates for RNA-Seq, Ribo-Seq, ChIP-Seq, and VAR-Seq analysis. A detailed vignette (manual) is provided for each workflow, while an overview vignette introduces the general design concepts. Templates for additional NGS applications will be made available in the future. To test workflows quickly or design new ones from existing templates, users can generate, with a single command (genWorkenvir), workflow instances fully populated with the sample data and parameter files required for running a chosen workflow. The corresponding sample data are provided by the affiliated data package systemPipeRdata, also available from Bioconductor. To illustrate the utilities of systemPipeR\u2019s workflow templates, a case study has been included as Additional file 1 that guides the reader through the most important steps of a sample workflow. A typical gene-level RNA-Seq analysis was chosen here because it is currently one of the most widely used applications in the NGS field.\n\nAdd-on tools \nIn addition to providing a framework for running NGS analysis workflows, systemPipeR includes many functions and methods that expand and enhance its workflows. The following gives selected examples of these utilities (also illustrated in Fig. 2 c and Table 1). 
A read pre-processor function (preprocessReads) addresses the often very sophisticated quality filtering and adaptor trimming needs of specialized NGS applications such as Ribo-Seq or smallRNA-Seq. The functions seeFastq and seeFastqPlot generate and plot detailed quality reports for FASTQ files (Fig. 2 c1). These reports are easy to generate and designed to facilitate the visual inspection of large numbers of FASTQ files in a single report. The featuretypeCounts function computes and plots the distribution of reads across all features available in a genome annotation rather than just a single one (Fig. 2 c2). The featureCoverage function generates from genome-level alignments read depth coverage summaries for all or a subset of transcripts with nucleotide resolution upstream\/downstream of their start and stop codons, as well as binned coverage for their coding regions (Fig. 2 c3). Additional utilities include functions to automate the analysis of differentially expressed genes (DEGs) with edgeR or DESeq2 (Fig. 2 c4), to compute Venn intersects for large numbers of sample sets (e.g. 2-20 or as many as available memory allows) with plotting functionalities for 2-5 way Venn diagrams (Fig. 2 c6), and to run gene set enrichment analyses in batch mode on large numbers of gene sets. The modular design of the systemPipeR environment allows users to easily substitute any of these built-in tools with alternative R-based or command-line software, such as using FastQC[35], FASTX-Toolkit[36], or MultiQC[37] for quality reporting, read trimming or result aggregation, respectively.\n\nPerformance and scalability \nsystemPipeR has been optimized to run workflows in a time and memory efficient manner even on very large read sets from complex genomes (e.g., mammalian genomes). This is achieved by making heavy use of indexing, file streaming, and parallelization functionalities. 
For instance, users can limit the RAM requirements of several workflow steps by specifying the maximum number of reads or alignments to stream into memory at any time. This enables analysis of very large files with tens of GBs of storage space on systems with limited RAM resources, making it possible to run systemPipeR workflows even on laptops or smaller workstations, provided they have the required software installed and enough disk space available for storing large NGS input and result files. The processing time of non-parallelized analysis steps depends on the time performance of a specific software tool chosen for a workflow step. For instance, in the RNA-Seq workflow described under Additional file 1 the alignment step will run on a single sample (FASTQ file) with the native time performance of the chosen aligner Bowtie2\/Tophat2. Using the much faster HISAT2 aligner instead would accelerate the alignment step proportionally to the time improvements provided by this aligner without the need for additional parallel computer resources.[28] \nOn a computer cluster, parallelized systemPipeR workflows scale nearly linearly in time with the number of sample files (i.e., FASTQ files) since every step can be parallelized at the sample level. In practice, this means the runtime of an analysis of 100 FASTQ files can be accelerated by 10- or 100-fold when using 10 or 100 CPU cores, respectively, instead of a single CPU core. For example, the RNA-Seq workflow in Additional file 1 can process 100 FASTQ files, each with 30\u201340 million reads from a mammalian genome, in six to eight hours using 100 CPU cores (CPU Model: AMD 6376, 2.3 GHz) and a maximum RAM requirement of less than 10 GB per node. Since the alignment step with Bowtie2\/Tophat2 accounts for most of the compute time of the entire workflow, the use of faster RNA-Seq aligners, such as Rsubread or HISAT2, can reduce the compute time to less than three hours. 
With comparable parallel computer resources available, one can complete with systemPipeR the end-to-end analysis of several complex NGS experiments each containing 50\u2013100 FASTQ files in less than a day rather than many days or weeks as is common in non-parallelized workflows.\n\nNeed for an R-based NGS workflow environment \nSeveral related software tools with NGS workflow functionality are available. These include Galaxy[15][38], Snakemake[16], Taverna[17], BioBlend[39], bcbio-nextgen[18], Knime[19], Ruffus[20], Kepler[21], Wasp[22], ViennaNGS[23], Mercury[24], RAP[40], and LONI[41] among others. Additionally, general purpose utilities for workflow management and design are provided by Rabix[42] and WDL.[43] \nThese tools provide infrastructure for streamlining the analysis of NGS data in a variety of data analysis environments and computer languages. However, only limited resources are available for designing and running analysis workflows for a wide range of NGS applications directly from within R as is possible with systemPipeR. One of the few exceptions is QuasR.[44] This Bioconductor package supports the initial analysis steps of several NGS applications, but it lacks an interface to integrate external command-line software and functionalities to build new workflows. Other existing R\/Bioconductor resources for analyzing NGS data address the needs in this area only partially. For instance, many of them are limited to certain NGS applications, or cover only a subset of the processing steps required for complete workflows; do not support command-line software; or lack workflow design functionalities for different NGS applications. systemPipeR has been designed to address these requirements. However, it is important to mention here that well established community workflow environments like Galaxy provide several additional features not available in systemPipeR. 
A small selection of them includes: (i) a web interface to support non-expert users who are not familiar with data analysis programming environments like R; (ii) support for a wider range of data types outside of the NGS field; (iii) a well-established infrastructure and community for archiving and sharing workflow protocols; and (iv) support for additional reporting technologies such as iPython notebooks. To take advantage of this powerful infrastructure, Galaxy-compatible versions of systemPipeR\u2019s NGS workflows will be released in the future. This will allow biologists to run them from an easy-to-use web interface, while also being able to access additional functionalities available in Galaxy\u2019s large ecosystem of analysis tools.\n\nConclusion \nThe systemPipeR package unites R\/Bioconductor resources with external command-line software to standardize and automate the analysis of a wide range of NGS applications. Its functionalities reduce the complexity and time required to translate NGS data into interpretable research results, while a built-in reporting feature improves reproducibility. The environment provides sufficient flexibility to choose the optimal software for each step in complex NGS workflows, customize workflows, and design new workflows. Pre-configured workflow templates are included for several NGS applications. 
Templates for additional NGS applications are under development and will be added to the package in the near future.\n\nAvailability and requirements \nProject name: systemPipeR workflow environment\nProject home page: https:\/\/bioconductor.org\/packages\/systemPipeR\/\nArchived version: systemPipeR\nOperating system(s): Platform-independent\nProgramming language: R\nOther requirements: R version \u22653.2, Bioconductor version \u22653.2\nLicense: Artistic-2.0\nAny restrictions to use by non-academics: none\n\nAbbreviations \nBAM: Binary version of sequence alignment map format\nChIP-Seq: Chromatin immunoprecipitation sequencing\nDEG: Differentially expressed genes\nFASTQ: short read sequence file format\nNGS: Next generation sequencing\nRibo-Seq: NGS profiling of mRNA populations bound to ribosomes\nRNA-Seq: NGS profiling of mRNA\nSAM: Sequence alignment map format\nVAR-Seq: NGS-based variant detection\n\nDeclarations \nAcknowledgements \nWe acknowledge the Bioconductor core team and community for providing valuable input for developing systemPipeR.\n\nFunding \nThis work was supported by grants from the National Science Foundation (ABI-0957099, MCB-1021969, IOS-1546879), the National Institutes of Health (U24AG051129, R01-AI36959), and the National Institute of Food and Agriculture (2011-68004-30154).\n\nAuthors\u2019 contributions \nTB and TG conceived the idea for systemPipeR. TG developed the methods, implemented the R package, and wrote the article. Both authors read and approved the final manuscript.\n\nCompeting interests \nThe authors declare that they have no competing interests.\n\nAdditional files \nAdditional file 1: RNA-Seq Workflow Example. Case study to illustrate the utilities of systemPipeR using an RNA-Seq workflow as example. (PDF 89 kb)\n\nReferences 
\n\n1. Kalisky, T.; Quake, S.R. (2011). \"Single-cell genomics\". Nature Methods 8 (4): 311\u20134. doi:10.1038\/nmeth0411-311. PMID 21451520.\n\n2. Trapnell, C.; Cacchiarelli, D.; Grimsby, J. et al. (2014). \"The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells\". Nature Biotechnology 32 (4): 381\u201386. doi:10.1038\/nbt.2859. PMC4122333. PMID 24658644. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4122333\n\n3. Lindblad-Toh, K.; Garber, M.; Zuk, O. et al. (2011). \"A high-resolution map of human evolutionary constraint using 29 mammals\". Nature 478 (7370): 476\u201382. doi:10.1038\/nature10530. PMC3207357. PMID 21993624. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3207357\n\n4. Kato-Maeda, M.; Ho, C.; Passarelli, B. et al. (2013). \"Use of whole genome sequencing to determine the microevolution of Mycobacterium tuberculosis during an outbreak\". PLoS One 8 (3): e58235. doi:10.1371\/journal.pone.0058235. PMC3589338. PMID 23472164. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3589338\n\n5. Holt, R.A.; Jones, S.J. (2008). \"The new paradigm of flow cell sequencing\". Genome Research 18 (6): 839\u201346. doi:10.1101\/gr.073262.107. PMID 18519653.\n\n6. Robinson, M.D.; McCarthy, D.J.; Smyth, G.K. (2010). \"edgeR: A Bioconductor package for differential expression analysis of digital gene expression data\". Bioinformatics 26 (1): 139\u201340. doi:10.1093\/bioinformatics\/btp616. PMC2796818. PMID 19910308. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2796818
\n\n7. Love, M.I.; Huber, W.; Anders, S. (2014). \"Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2\". Genome Biology 15 (12): 550. doi:10.1186\/s13059-014-0550-8. PMC4302049. PMID 25516281. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4302049\n\n8. Kharchenko, P.V.; Tolstorukov, M.Y.; Park, P.J. (2008). \"Design and analysis of ChIP-seq experiments for DNA-binding proteins\". Nature Biotechnology 26 (12): 1351\u20139. doi:10.1038\/nbt.1508. PMC2597701. PMID 19029915. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2597701\n\n9. Akalin, A.; Kormaksson, M.; Li, S. et al. (2012). \"methylKit: a comprehensive R package for the analysis of genome-wide DNA methylation profiles\". Genome Biology 13 (10): R87. doi:10.1186\/gb-2012-13-10-r87. PMC3491415. PMID 23034086. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3491415\n\n10. Huber, W.; Carey, V.J.; Gentleman, R. et al. (2015). \"Orchestrating high-throughput genomic analysis with Bioconductor\". Nature Methods 12 (2): 115\u201321. doi:10.1038\/nmeth.3252. PMC4509590. PMID 25633503. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4509590\n\n11. Li, H.; Handsaker, B.; Wysoker, A. et al. (2009). \"The Sequence Alignment\/Map format and SAMtools\". Bioinformatics 25 (16): 2078\u20139. doi:10.1093\/bioinformatics\/btp352. PMC2723002. PMID 19505943. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2723002
\n\n12. Lawrence, M.; Huber, W.; Pag\u00e8s, H. et al. (2013). \"Software for computing and annotating genomic ranges\". PLoS Computational Biology 9 (8): e1003118. doi:10.1371\/journal.pcbi.1003118. PMC3738458. PMID 23950696. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3738458\n\n13. Quinlan, A.R.; Hall, I.M. (2010). \"BEDTools: a flexible suite of utilities for comparing genomic features\". Bioinformatics 26 (6): 841\u20132. doi:10.1093\/bioinformatics\/btq033. PMC2832824. PMID 20110278. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2832824\n\n14. Durinck, S.; Moreau, Y.; Kasprzyk, A. (2005). \"BioMart and Bioconductor: A powerful link between biological databases and microarray data analysis\". Bioinformatics 21 (16): 3439\u201340. doi:10.1093\/bioinformatics\/bti525. PMID 16082012.\n\n15. Goecks, J.; Nekrutenko, A.; Taylor, J.; The Galaxy Team (2010). \"Galaxy: A comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences\". Genome Biology 11 (8): R86. doi:10.1186\/gb-2010-11-8-r86. PMC2945788. PMID 20738864. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2945788\n\n16. K\u00f6ster, J.; Rahmann, S. (2012). \"Snakemake: A scalable bioinformatics workflow engine\". Bioinformatics 28 (19): 2520\u20132. doi:10.1093\/bioinformatics\/bts480. PMID 22908215.
\n\n17. Wolstencroft, K.; Haines, R.; Fellows, D. et al. (2013). \"The Taverna workflow suite: Designing and executing workflows of Web Services on the desktop, web or in the cloud\". Nucleic Acids Research 41 (W1): W557\u2013W561. doi:10.1093\/nar\/gkt328. PMC3692062. PMID 23640334. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3692062\n\n18. Guimera, R.V. (2012). \"bcbio-nextgen: Automated, distributed next-gen sequencing pipeline\". EMBnet.journal 17 (B): 30. doi:10.14806\/ej.17.B.286.\n\n19. Warr, W.A. (2012). \"Scientific workflow systems: Pipeline Pilot and KNIME\". Journal of Computer-aided Molecular Design 26 (7): 801\u20134. doi:10.1007\/s10822-012-9577-7. PMC3414708. PMID 22644661. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3414708\n\n20. Goodstadt, L. (2010). \"Ruffus: A lightweight Python library for computational pipelines\". Bioinformatics 26 (21): 2778\u20139. doi:10.1093\/bioinformatics\/btq524. PMID 20847218.\n\n21. Stropp, T.; McPhillips, T.; Lud\u00e4scher, B.; Bieda, M. (2012). \"Workflows for microarray data processing in the Kepler environment\". BMC Bioinformatics 13: 102. doi:10.1186\/1471-2105-13-102. PMC3431220. PMID 22594911. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3431220\n\n22. McLellan, A.S.; Dubin, R.; Jing, Q. et al. (2012). \"The Wasp System: An open source environment for managing and analyzing genomic data\". Genomics 100 (6): 345\u201351. doi:10.1016\/j.ygeno.2012.08.005. PMID 22944616.
\n\n23. Wolfinger, M.T.; Fallmann, J.; Eggenhofer, F.; Amman, F. (2015). \"ViennaNGS: A toolbox for building efficient next-generation sequencing analysis pipelines\". F1000Research 4: 50. doi:10.12688\/f1000research.6157.2. PMC4513691. PMID 26236465. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4513691\n\n24. Reid, J.G.; Carroll, A.; Veeraraghavan, N. et al. (2014). \"Launching genomics into the cloud: Deployment of Mercury, a next generation sequence analysis pipeline\". BMC Bioinformatics 15: 30. doi:10.1186\/1471-2105-15-30. PMC3922167. PMID 24475911. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3922167\n\n25. Li, H. (26 May 2013). \"Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM\". arXiv.org. Cornell University Library. https:\/\/arxiv.org\/abs\/1303.3997\n\n26. Langmead, B.; Salzberg, S.L. (2012). \"Fast gapped-read alignment with Bowtie 2\". Nature Methods 9 (4): 357\u20139. doi:10.1038\/nmeth.1923. PMC3322381. PMID 22388286. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3322381\n\n27. Kim, D.; Pertea, G.; Trapnell, C. et al. (2013). \"TopHat2: Accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions\". Genome Biology 14: R36. doi:10.1186\/gb-2013-14-4-r36. PMC4053844. PMID 23618408. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4053844
\n\n28. Kim, D.; Langmead, B.; Salzberg, S.L. (2015). \"HISAT: A fast spliced aligner with low memory requirements\". Nature Methods 12 (4): 357\u201360. doi:10.1038\/nmeth.3317. PMC4655817. PMID 25751142. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4655817\n\n29. Zhang, Y.; Liu, T.; Meyer, C.A. et al. (2008). \"Model-based analysis of ChIP-Seq (MACS)\". Genome Biology 9 (9): R137. doi:10.1186\/gb-2008-9-9-r137. PMC2592715. PMID 18798982. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2592715\n\n30. McKenna, A.; Hanna, M.; Banks, E. et al. (2010). \"The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data\". Genome Research 20: 1297\u20131303. doi:10.1101\/gr.107524.110.\n\n31. Bischl, B.; Lang, M.; Mersmann, O. et al. (2012). \"BatchJobs and BatchExperiments: Abstraction Mechanisms for Using R in Batch Environments\". Journal of Statistical Software 64 (11): 1\u201325. doi:10.18637\/jss.v064.i11.\n\n32. Xie, Y. (2013). Dynamic Documents with R and knitr (1st ed.). Chapman and Hall\/CRC. pp. 216. ISBN 9781482203530.\n\n33. Morgan, M.; Anders, S.; Lawrence, M. et al. (2009). \"ShortRead: A bioconductor package for input, quality assessment and exploration of high-throughput sequence data\". Bioinformatics 25 (19): 2607\u20138. doi:10.1093\/bioinformatics\/btp450. PMC2752612. PMID 19654119. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2752612
\n\n34. Obenchain, V.; Lawrence, M.; Carey, V. et al. (2014). \"VariantAnnotation: a Bioconductor package for exploration and annotation of genetic variants\". Bioinformatics 30 (14): 2076\u20138. doi:10.1093\/bioinformatics\/btu168. PMC4080743. PMID 24681907. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4080743\n\n35. \"FastQC\". Babraham Bioinformatics. http:\/\/www.bioinformatics.babraham.ac.uk\/projects\/fastqc\/ (retrieved 15 September 2015).\n\n36. \"FASTX-Toolkit\". Hannon Laboratory. http:\/\/hannonlab.cshl.edu\/fastx_toolkit\/index.html (retrieved 17 September 2015).\n\n37. Ewels, P.; Magnusson, M.; Lundin, S.; K\u00e4ller, M. (2016). \"MultiQC: summarize analysis results for multiple tools and samples in a single report\". Bioinformatics 32 (19): 3047\u20133048. doi:10.1093\/bioinformatics\/btw354. PMC5039924. PMID 27312411. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5039924\n\n38. Afgan, E.; Baker, D.; Coraor, N. et al. (2011). \"Harnessing cloud computing with Galaxy Cloud\". Nature Biotechnology 29 (11): 972\u20134. doi:10.1038\/nbt.2028. PMC3868438. PMID 22068528. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3868438\n\n39. Sloggett, C.; Goonasekera, N.; Afgan, E. (2013). \"BioBlend: automating pipeline analyses within Galaxy and CloudMan\". Bioinformatics 29 (13): 1685\u20136. doi:10.1093\/bioinformatics\/btt199. PMC4288140. PMID 23630176. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4288140
\n\n40. D'Antonio, M.; D'Onorio De Meo, P.; Pallocca, M. et al. (2015). \"RAP: RNA-Seq Analysis Pipeline, a new cloud-based NGS web application\". BMC Genomics 16: S3. doi:10.1186\/1471-2164-16-S6-S3. PMC4461013. PMID 26046471. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4461013\n\n41. Torri, F.; Dinov, I.D.; Zamanyan, A. et al. (2012). \"Next generation sequence analysis and computational genomics using graphical pipeline workflows\". Genes 3 (3): 545\u201375. doi:10.3390\/genes3030545. PMC3490498. PMID 23139896. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3490498\n\n42. \"Rabix\". Seven Bridges Genomics, Inc. http:\/\/rabix.io\/\n\n43. Broad Institute. \"broadinstitute\/wdl\". GitHub. https:\/\/github.com\/broadinstitute\/wdl (retrieved 16 September 2015).\n\n44. Gaidatzis, D.; Lerch, A.; Hahne, F. et al. (2015). \"QuasR: Quantification and annotation of short reads in R\". Bioinformatics 31: 7. doi:10.1093\/bioinformatics\/btu781. PMC4382904. PMID 25417205. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4382904\n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. 
The original URL to Rabix was dead, and it was replaced with a current one for this version.\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\">https:\/\/www.limswiki.org\/index.php\/Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment<\/a>\nCategories: LIMSwiki journal articles (added in 2018); LIMSwiki journal articles (all); LIMSwiki journal articles on bioinformatics; LIMSwiki journal articles on software\n
This page was last modified on 27 August 2018, at 20:15.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n","d6135e8d32b77d11c05c7b261fe72044_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_SystemPipeR_NGS_workflow_and_report_generation_environment skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:systemPipeR: NGS workflow and report generation environment<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\"><p><span><\/span>\n<\/p>\n\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p><b>Background<\/b>: Next-generation sequencing (NGS) has revolutionized how research is carried out in many areas of biology and medicine. However, the analysis of NGS data remains a major obstacle to the efficient utilization of the technology, as it requires complex multi-step processing of big data, demanding considerable computational expertise from users. 
While substantial effort has been invested in the development of software dedicated to the individual analysis steps of NGS experiments, insufficient resources are currently available for integrating the individual software components within the widely used R\/Bioconductor environment into automated <a href=\"https:\/\/www.limswiki.org\/index.php\/Workflow\" title=\"Workflow\" target=\"_blank\" class=\"wiki-link\" data-key=\"92bd8748272e20d891008dcb8243e8a8\">workflows<\/a> capable of running the analysis of most types of NGS applications from start to finish in a time-efficient and reproducible manner.\n<\/p><p><b>Results<\/b>: To address this need, we have developed the R\/Bioconductor package systemPipeR. It is an extensible environment for both building and running end-to-end analysis workflows with automated report generation for a wide range of NGS applications. Its unique features include a uniform workflow interface across different NGS applications, automated report generation, and support for running both R and command-line software on local computers and computer clusters. A flexible sample annotation infrastructure efficiently handles complex sample sets and experimental designs. To simplify the analysis of widely used NGS applications, the package provides pre-configured workflows and reporting templates for RNA-Seq, ChIP-Seq, VAR-Seq, and Ribo-Seq. Additional workflow templates will be provided in the future.\n<\/p><p><b>Conclusions<\/b>: systemPipeR accelerates the extraction of reproducible analysis results from NGS experiments. By combining the capabilities of many R\/Bioconductor and command-line tools, it makes efficient use of existing software resources without limiting the user to a set of predefined methods or environments. 
systemPipeR is freely available for all common operating systems from Bioconductor (<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/bioconductor.org\/packages\/devel\/systemPipeR\" target=\"_blank\">http:\/\/bioconductor.org\/packages\/devel\/systemPipeR<\/a>).\n<\/p><p><b>Keywords<\/b>: analysis workflow, next generation sequencing (NGS), Ribo-Seq, ChIP-Seq, RNA-Seq, VAR-Seq\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Background\">Background<\/span><\/h2>\n<p>By allowing scientists to rapidly sequence and quantify DNA and RNA molecules, next-generation sequencing (NGS) technology has transformed biology into one of the most data intensive research disciplines. In the past, experiments were performed on a gene-by-gene basis, whereas NGS has introduced an age in which it has become routine to sequence entire transcriptomes, genomes, or epigenomes rather than only their isolated parts of interest. It will soon be possible to conduct these experiments on large numbers of single cell samples<sup id=\"rdp-ebb-cite_ref-KaliskySingle11_1-0\" class=\"reference\"><a href=\"#cite_note-KaliskySingle11-1\" rel=\"external_link\">[1]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-TrapnellTheDynamics14_2-0\" class=\"reference\"><a href=\"#cite_note-TrapnellTheDynamics14-2\" rel=\"external_link\">[2]<\/a><\/sup> for a wide range of time points, treatments, and genetic backgrounds to study biological systems with greater resolution and precision. 
Sequencing the genetic material of each individual within entire populations of organisms of the same species or genus will enable the study of adaptation processes<sup id=\"rdp-ebb-cite_ref-Lindblad-TohAHigh11_3-0\" class=\"reference\"><a href=\"#cite_note-Lindblad-TohAHigh11-3\" rel=\"external_link\">[3]<\/a><\/sup>, disease progression, and micro-evolution in real time.<sup id=\"rdp-ebb-cite_ref-Kato-MaedaUseOf13_4-0\" class=\"reference\"><a href=\"#cite_note-Kato-MaedaUseOf13-4\" rel=\"external_link\">[4]<\/a><\/sup> This technological shift empowers researchers to address questions at a <a href=\"https:\/\/www.limswiki.org\/index.php\/Genomics\" title=\"Genomics\" target=\"_blank\" class=\"wiki-link\" data-key=\"96a82dabf51cf9510dd00c5a03396c44\">genome-wide<\/a> scale, for example by profiling the mRNA, miRNA, and DNA methylation states of a large set of biological samples in parallel.<sup id=\"rdp-ebb-cite_ref-HoltTheNew08_5-0\" class=\"reference\"><a href=\"#cite_note-HoltTheNew08-5\" rel=\"external_link\">[5]<\/a><\/sup>\n<\/p><p>The success of NGS-driven research has led to a data explosion of increasing size and complexity, making it now more time-consuming and challenging for researchers to extract knowledge from their experiments. Rapid processing of the results is essential to test, refine, and formulate new hypotheses for designing follow-up experiments. 
As a result, biologists nowadays have to dedicate substantial time to <a href=\"https:\/\/www.limswiki.org\/index.php\/Data_analysis\" title=\"Data analysis\" target=\"_blank\" class=\"wiki-link\" data-key=\"545c95e40ca67c9e63cd0a16042a5bd1\">data analysis<\/a> tasks while effectively training themselves as <a href=\"https:\/\/www.limswiki.org\/index.php\/Genome_informatics\" title=\"Genome informatics\" target=\"_blank\" class=\"wiki-link\" data-key=\"304bd6698e9a01cab6157ba518743ff1\">genome data scientists<\/a> rather than focusing on experimentation as they did in the past.\n<\/p><p>In recent years, a considerable number of algorithms, statistical methods, and software tools have been developed to perform the individual analysis steps of different NGS applications. These include short read pre-processors, aligners, variant and peak callers, as well as statistical methods for the analysis of genomic regions that are differentially expressed<sup id=\"rdp-ebb-cite_ref-RobinsonEdgeR10_6-0\" class=\"reference\"><a href=\"#cite_note-RobinsonEdgeR10-6\" rel=\"external_link\">[6]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-LoveModerated14_7-0\" class=\"reference\"><a href=\"#cite_note-LoveModerated14-7\" rel=\"external_link\">[7]<\/a><\/sup>, bound<sup id=\"rdp-ebb-cite_ref-KharchenkoDesign08_8-0\" class=\"reference\"><a href=\"#cite_note-KharchenkoDesign08-8\" rel=\"external_link\">[8]<\/a><\/sup>, or methylated.<sup id=\"rdp-ebb-cite_ref-AkalinMethyl12_9-0\" class=\"reference\"><a href=\"#cite_note-AkalinMethyl12-9\" rel=\"external_link\">[9]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-HuberOrch15_10-0\" class=\"reference\"><a href=\"#cite_note-HuberOrch15-10\" rel=\"external_link\">[10]<\/a><\/sup> Also essential are tools for processing short read alignments<sup id=\"rdp-ebb-cite_ref-LiTheSeq09_11-0\" class=\"reference\"><a href=\"#cite_note-LiTheSeq09-11\" rel=\"external_link\">[11]<\/a><\/sup>, genomic intervals<sup id=\"rdp-ebb-cite_ref-LawrenceSoft13_12-0\" 
class=\"reference\"><a href=\"#cite_note-LawrenceSoft13-12\" rel=\"external_link\">[12]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-QuinlanBED10_13-0\" class=\"reference\"><a href=\"#cite_note-QuinlanBED10-13\" rel=\"external_link\">[13]<\/a><\/sup>, and annotations.<sup id=\"rdp-ebb-cite_ref-DurinckBioMart_14-0\" class=\"reference\"><a href=\"#cite_note-DurinckBioMart-14\" rel=\"external_link\">[14]<\/a><\/sup> However, most data analysis routines of NGS applications are very complex, involving multiple software tools for their many processing steps. As a result, there is a great need for flexible software environments connecting the individual software components to automated workflows in order to perform complex genome-wide analyses in an efficient and reproducible manner. While many workflow management resources exist<sup id=\"rdp-ebb-cite_ref-GoecksGalaxy10_15-0\" class=\"reference\"><a href=\"#cite_note-GoecksGalaxy10-15\" rel=\"external_link\">[15]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-K.C3.B6sterSnake12_16-0\" class=\"reference\"><a href=\"#cite_note-K.C3.B6sterSnake12-16\" rel=\"external_link\">[16]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-WolstencroftTheTav13_17-0\" class=\"reference\"><a href=\"#cite_note-WolstencroftTheTav13-17\" rel=\"external_link\">[17]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-GuimeraBcbio12_18-0\" class=\"reference\"><a href=\"#cite_note-GuimeraBcbio12-18\" rel=\"external_link\">[18]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-WarrScient12_19-0\" class=\"reference\"><a href=\"#cite_note-WarrScient12-19\" rel=\"external_link\">[19]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-GoodstadtRuffus10_20-0\" class=\"reference\"><a href=\"#cite_note-GoodstadtRuffus10-20\" rel=\"external_link\">[20]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-StroppWorkflows12_21-0\" class=\"reference\"><a href=\"#cite_note-StroppWorkflows12-21\" rel=\"external_link\">[21]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-McLellanTheWasp12_22-0\" class=\"reference\"><a 
href=\"#cite_note-McLellanTheWasp12-22\" rel=\"external_link\">[22]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-WolfingerVienna15_23-0\" class=\"reference\"><a href=\"#cite_note-WolfingerVienna15-23\" rel=\"external_link\">[23]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-ReidLaunch14_24-0\" class=\"reference\"><a href=\"#cite_note-ReidLaunch14-24\" rel=\"external_link\">[24]<\/a><\/sup> for a variety of data analysis programming languages (for details see below), the general-purpose NGS workflow solutions currently available for the popular <a href=\"https:\/\/www.limswiki.org\/index.php\/R_(programming_language)\" title=\"R (programming language)\" target=\"_blank\" class=\"wiki-link\" data-key=\"1b0aa598f071aca4c5b4ee08d8bb2bde\">R programming language<\/a> remain insufficient. R and the affiliated Bioconductor environment provide a substantial number of widely used tools with a large user base in this area.<sup id=\"rdp-ebb-cite_ref-HuberOrch15_10-1\" class=\"reference\"><a href=\"#cite_note-HuberOrch15-10\" rel=\"external_link\">[10]<\/a><\/sup> Thus, a workflow framework for federating NGS applications from within R will have many benefits for experimental and computational scientists who use R for NGS data analysis.\n<\/p><p>To address this need, we designed systemPipeR as a Bioconductor package for building and running workflows for most NGS applications, with support for integrating a wide array of command-line and R\/Bioconductor software.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Implementation\">Implementation<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Environment\">Environment<\/span><\/h3>\n<p>systemPipeR has been implemented as an open-source Bioconductor package using the R programming language for statistical computing and graphics. R was chosen as the core development platform for systemPipeR for the following reasons. 
(i) R is currently one of the most popular statistical data analysis and programming environments in <a href=\"https:\/\/www.limswiki.org\/index.php\/Bioinformatics\" title=\"Bioinformatics\" target=\"_blank\" class=\"wiki-link\" data-key=\"8f506695fdbb26e3f314da308f8c053b\">bioinformatics<\/a>. (ii) Its external language bindings support the implementation of computationally time-consuming analysis steps in high-performance languages such as C\/C++. (iii) It supports advanced parallel computation on multi-core machines and computer clusters. (iv) A well-developed infrastructure interfaces R with several other popular programming languages such as Python. (v) R provides advanced graphical and visualization utilities for scientific computing. (vi) It offers access to a vast landscape of statistical and machine learning tools. (vii) Its integration with the Bioconductor project promotes reusability of genomics software components, while also making efficient use of a large number of existing NGS packages that are well tested and widely used by the community. To support long-term reproducibility of analysis outcomes, systemPipeR is also distributed as a Docker image of Bioconductor\u2019s sequencing division. Docker containers provide an efficient solution for packaging complex software together with all its system dependencies to ensure it will run the same in the future across different environments, including different operating systems and <a href=\"https:\/\/www.limswiki.org\/index.php\/Cloud_computing\" title=\"Cloud computing\" target=\"_blank\" class=\"wiki-link\" data-key=\"fcfe5882eaa018d920cedb88398b604f\">cloud-based<\/a> solutions.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Workflow_design\">Workflow design<\/span><\/h3>\n<p>systemPipeR workflows (Fig. 1) can be run from start-to-finish with a single command, or stepwise in interactive mode from the R console. 
New workflows are constructed, or existing ones modified, by connecting so-called SYSargs workflow control modules (R S4 class). Each SYSargs module contains the instructions needed for processing a set of input files with a specific command-line or R software, as well as the paths to the corresponding outputs generated by a specific NGS tool such as a read preprocessor (trimmed\/filtered FASTQ files), aligner (SAM\/BAM files), read counter, variant caller (VCF\/BCF files), peak caller (BED\/WIG files), or statistical function. Typically, the only input the user needs to provide for running workflows is a single tabular targets file containing the paths to the initial sample input files (e.g., FASTQ) along with sample labels and, if appropriate, biological replicate and contrast information for controlling differential abundance analyses (e.g., gene expression). Downstream derivatives of these targets files along with the corresponding SYSargs instances (see Fig. 1) are created automatically within each workflow. \n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Backman_BMCBio2016_17.gif\" class=\"image wiki-link\" target=\"_blank\" data-key=\"b5fec52d7e631c5525f7e26f1389ed82\"><img alt=\"Fig1 Backman BMCBio2016 17.gif\" src=\"https:\/\/www.limswiki.org\/images\/8\/89\/Fig1_Backman_BMCBio2016_17.gif\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1.<\/b> Workflow steps with input\/output file operations are controlled by SYSargs objects. Each SYSargs instance is constructed from a targets and a param file. The only input required from the user is the initial targets file. Subsequent instances are created automatically. 
Any number of predefined or custom workflow steps is supported.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>The parameters required for running command-line software are provided by parameter (param) files, described below. For R-based workflow steps, param files are not required but can be useful for operations importing and\/or exporting sample-level files. This modular design has several advantages. First, it provides a high level of flexibility for designing workflows, such as allowing the user to start workflows from the very beginning or anywhere in-between (e.g., FASTQ or BAM level). Second, it is straightforward to add custom workflow steps without requiring computational expert knowledge from users. Workflows can also have any number of steps including branch points. Lastly, it minimizes errors as all input and output files are registered, and sample labels specified in the initial targets file will be consistently used throughout all workflow results, including plots, tables, and workflow reports.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Command-line_software_support\">Command-line software support<\/span><\/h3>\n<p>An important feature of systemPipeR is support for running command-line software directly from R on both single machines and computer clusters. This offers several advantages, such as seamless integration of most command-line software available in the NGS field with the extensive genome analysis resources provided by R\/Bioconductor. The user interface for running command-line software has been generalized as a single function for ease of use, while only one additional command will run the same tool in parallel mode on a computer cluster (see below). 
Examples of command-line software used by systemPipeR\u2019s preconfigured workflow templates (see below) include the aligners BWA-MEM<sup id=\"rdp-ebb-cite_ref-LiAligning13_25-0\" class=\"reference\"><a href=\"#cite_note-LiAligning13-25\" rel=\"external_link\">[25]<\/a><\/sup>, Bowtie2<sup id=\"rdp-ebb-cite_ref-LangmeadFast12_26-0\" class=\"reference\"><a href=\"#cite_note-LangmeadFast12-26\" rel=\"external_link\">[26]<\/a><\/sup>, TopHat2<sup id=\"rdp-ebb-cite_ref-KimTopHat2_13_27-0\" class=\"reference\"><a href=\"#cite_note-KimTopHat2_13-27\" rel=\"external_link\">[27]<\/a><\/sup>, HISAT2<sup id=\"rdp-ebb-cite_ref-KimHISAT15_28-0\" class=\"reference\"><a href=\"#cite_note-KimHISAT15-28\" rel=\"external_link\">[28]<\/a><\/sup>, as well as the peak\/variant callers MACS<sup id=\"rdp-ebb-cite_ref-ZhangModel08_29-0\" class=\"reference\"><a href=\"#cite_note-ZhangModel08-29\" rel=\"external_link\">[29]<\/a><\/sup>, GATK<sup id=\"rdp-ebb-cite_ref-McKennaTheGenome10_30-0\" class=\"reference\"><a href=\"#cite_note-McKennaTheGenome10-30\" rel=\"external_link\">[30]<\/a><\/sup>, and BCFtools.<sup id=\"rdp-ebb-cite_ref-LiTheSeq09_11-1\" class=\"reference\"><a href=\"#cite_note-LiTheSeq09-11\" rel=\"external_link\">[11]<\/a><\/sup> Support for additional command-line NGS software can be added by simply providing the argument settings of a chosen software in a tabular param file. If appropriate, new param files can be permanently included in the package to share them with the community. Functionality for creating param files automatically will be provided in the future. This will allow users to create new param instances simply by providing an example of the command-line syntax of a chosen software tool. 
\n<\/p><p>Major advantages of running command-line software from within systemPipeR include: a uniform sample management infrastructure within and across workflows; integration of BatchJobs\u2019<sup id=\"rdp-ebb-cite_ref-BischlBatch12_31-0\" class=\"reference\"><a href=\"#cite_note-BischlBatch12-31\" rel=\"external_link\">[31]<\/a><\/sup> efficient error management infrastructure for job submissions on computer clusters; the simplicity of restarting failed processes; as well as seamless addition of new samples (e.g., FASTQ or BAM files). In case of a restart, the system will skip the analysis steps of already completed samples and only perform the analysis of the missing ones. If required, any workflow step can be rerun on demand for all or a subset of samples. When submitting command-line software to computer clusters, BatchJobs monitors the status of job submissions and alerts users of exceptions, while recording warning and error messages for each process in a log directory with a database-like structure that is accessible from within R or the command-line. This organization helps to diagnose and resolve errors.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Parallel_evaluation\">Parallel evaluation<\/span><\/h3>\n<p>The processing time for NGS experiments can be greatly reduced by making use of parallel evaluation across several CPU cores on single machines, or multiple nodes of computer clusters and cloud-based systems. systemPipeR simplifies these parallelization tasks without creating any limitations for users who do not have access to high-performance computer (HPC) resources by providing the option to run workflows in serial or parallel mode. 
The parallelization functionalities available in systemPipeR are largely based on existing and well-maintained R packages, mainly BatchJobs and BiocParallel.<sup id=\"rdp-ebb-cite_ref-BischlBatch12_31-1\" class=\"reference\"><a href=\"#cite_note-BischlBatch12-31\" rel=\"external_link\">[31]<\/a><\/sup> By making use of cluster template files, most schedulers and queuing systems are also supported (e.g., Torque, Sun Grid Engine, Slurm). If required, entire workflows can be executed in parallel mode by issuing a single command, while simultaneously generating a detailed analysis report (for details see below). If sufficient parallel computer resources are available, systemPipeR can complete the entire analysis workflow of several complex NGS experiments, each containing large numbers of FASTQ files, within hours rather than days or weeks, as can be the case for non-parallelized workflows.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Automated_analysis_reports\">Automated analysis reports<\/span><\/h3>\n<p>systemPipeR generates automated analysis reports with knitr and R markdown.<sup id=\"rdp-ebb-cite_ref-XieDynamic13_32-0\" class=\"reference\"><a href=\"#cite_note-XieDynamic13-32\" rel=\"external_link\">[32]<\/a><\/sup> These modern reporting environments integrate R code with LaTeX or Markdown. During the evaluation of the R code, reports are dynamically generated in PDF or HTML format. A caching system allows users to re-execute selected workflow reporting steps without repeating unnecessary components. This way one can generate reports that resemble a research paper where user-generated text is combined with analysis results. This includes support for citations, autogenerated bibliographies, code chunks with syntax highlighting, and inline evaluation of variables to update text content. 
Data components in a report such as tables and figures are updated automatically when rebuilding the document and\/or rerunning workflows partially or entirely.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Results_and_discussion\">Results and discussion<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Overview\">Overview<\/span><\/h3>\n<p>systemPipeR provides utilities for building and running NGS analysis workflows. To adapt to community standards, widely used R\/Bioconductor packages are integrated where possible. This includes the Bioconductor packages ShortRead, Biostrings, and Rsamtools for processing sequence and alignment files<sup id=\"rdp-ebb-cite_ref-MorganShortRead09_33-0\" class=\"reference\"><a href=\"#cite_note-MorganShortRead09-33\" rel=\"external_link\">[33]<\/a><\/sup>; GenomicRanges, GenomicAlignments, and GenomicFeatures for handling genomic range operations, read counting, and annotation data<sup id=\"rdp-ebb-cite_ref-LawrenceSoft13_12-1\" class=\"reference\"><a href=\"#cite_note-LawrenceSoft13-12\" rel=\"external_link\">[12]<\/a><\/sup>; edgeR and DESeq2 for differential abundance analysis<sup id=\"rdp-ebb-cite_ref-RobinsonEdgeR10_6-1\" class=\"reference\"><a href=\"#cite_note-RobinsonEdgeR10-6\" rel=\"external_link\">[6]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-LoveModerated14_7-1\" class=\"reference\"><a href=\"#cite_note-LoveModerated14-7\" rel=\"external_link\">[7]<\/a><\/sup>; and VariantTools and VariantAnnotation for filtering and annotating genome variants.<sup id=\"rdp-ebb-cite_ref-ObenchainVariant14_34-0\" class=\"reference\"><a href=\"#cite_note-ObenchainVariant14-34\" rel=\"external_link\">[34]<\/a><\/sup> If necessary, one can substitute most of these packages with alternative R or command-line tools. \n<\/p><p>Because many NGS applications share overlapping analysis needs (Fig. 
2 a), certain workflow steps are conceptualized in systemPipeR by a single generic function, with support for application-specific parameter settings (Table 1). For instance, most NGS applications involve a short read alignment step (see Fig. 2 b), but with very distinct mapping requirements, such as splice junction awareness and variant tolerance for RNA-Seq and VAR-Seq, respectively. To simplify their execution for the user, the different aligners can be run with the same runCommandline function, where the software and its parameter settings are specified in the corresponding SYSargs instance (see above and Fig. 1).\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig2_Backman_BMCBio2016_17.gif\" class=\"image wiki-link\" target=\"_blank\" data-key=\"7f7545bf19942a2a766543d3517130be\"><img alt=\"Fig2 Backman BMCBio2016 17.gif\" src=\"https:\/\/www.limswiki.org\/images\/7\/7f\/Fig2_Backman_BMCBio2016_17.gif\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 2.<\/b> Workflow Steps and Graphical Features. Relevant workflow steps of several NGS applications (<b>a<\/b>) are illustrated in the form of a simplified flowchart (<b>b<\/b>). 
Examples of systemPipeR\u2019s functionalities are given under (<b>c<\/b>) including: (1) eight different plots for summarizing the quality and diversity of short reads provided as FASTQ files; (2) strand-specific read count summaries for all feature types provided by a genome annotation; (3) summary plots of read depth coverage for any number of transcripts with nucleotide resolution upstream\/downstream of their start and stop codons, as well as binned coverage for their coding regions; (4) enumeration of up- and down-regulated DEGs for user defined sample comparisons; (5) similarity clustering of sample profiles; (6) 2-5-way Venn diagrams for DEGs, peak and variant sets; (7) gene-wise clustering with a wide range of algorithms; and (8) support for plotting read pileups and variants in the context of genome annotations along with genome browser support.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"2\"><b>Table 1.<\/b> Selected functions. The table lists a subset of over 50 methods and functions defined by systemPipeR. 
Usage instructions are provided in the corresponding help pages and vignettes of the package.\n<\/td><\/tr>\n\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Function name\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Description\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><tt>genWorkenvir<\/tt>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Generates workflow templates provided by systemPipeRdata helper package\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><tt>systemArgs<\/tt>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Constructs SYSargs workflow control module (S4 object) from targets and param files\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><tt>runCommandline<\/tt>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Executes command-line software on samples and parameters specified in SYSargs\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><tt>clusterRun<\/tt>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Runs command-line software in parallel mode on a computer cluster\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><tt>preprocessReads<\/tt>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Filtering and\/or trimming of short reads using predefined or custom 
parameters\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><tt>seeFastq\/seeFastqPlot<\/tt>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Generates quality reports for any number of FASTQ files\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><tt>alignStats<\/tt>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Generates alignment statistics, such as total number of reads and alignment frequency\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><tt>run_edgeR\/run_DESeq2<\/tt>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Runs edgeR or DESeq2 for any number of pairwise sample comparisons\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><tt>filterDEGs<\/tt>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Filters and plots DEG results based on user-defined parameters\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><tt>overLapper\/vennPlot<\/tt>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Computation of Venn intersects for 2-20 or more samples and 2-5 way Venn diagrams\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><tt>GOCluster_Report<\/tt>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">GO term enrichment analysis for large numbers of gene sets\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><tt>variantReport<\/tt>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Generates a variant report containing genomic annotations and confidence 
statistics\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><tt>predORF<\/tt>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Prediction of short open reading frames in DNA sequences\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><tt>featuretypeCounts<\/tt>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Computes and plots read distribution for many feature types at once\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><tt>featureCoverage<\/tt>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Computes and plots read depth coverage from many transcripts\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Workflow_templates\">Workflow templates<\/span><\/h3>\n<p>systemPipeR also provides end-to-end workflow templates for RNA-Seq, Ribo-Seq, ChIP-Seq, and VAR-Seq analysis. A detailed vignette (manual) is provided for each workflow, while an overview vignette introduces the general design concepts. Templates for additional NGS applications will be made available in the future. To test workflows quickly or design new ones from existing templates, users can generate, with a single command (genWorkenvir), workflow instances fully populated with the sample data and parameter files required for running a chosen workflow. The corresponding sample data are provided by the affiliated data package systemPipeRdata, also available from Bioconductor. To illustrate the utilities of systemPipeR\u2019s workflow templates, a case study has been included as Additional file 1 that guides the reader through the most important steps of a sample workflow. 
A typical gene-level RNA-Seq analysis was chosen here because it is currently one of the most widely used applications in the NGS field.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Add-on_tools\">Add-on tools<\/span><\/h3>\n<p>In addition to providing a framework for running NGS analysis workflows, systemPipeR includes many functions and methods that expand and enhance its workflows. The following gives selected examples of these utilities (also illustrated in Fig. 2 c and Table 1). A read pre-processor function (<tt>preprocessReads<\/tt>) addresses the often very sophisticated quality filtering and adaptor trimming needs of specialized NGS applications such as Ribo-Seq or smallRNA-Seq. The functions <tt>seeFastq<\/tt> and <tt>seeFastqPlot<\/tt> generate and plot detailed quality reports for FASTQ files (Fig. 2 c1). These reports are easy to generate and designed to facilitate the visual inspection of large numbers of FASTQ files in a single report. The <tt>featuretypeCounts<\/tt> function computes and plots the distribution of reads across all features available in a genome annotation rather than just a single one (Fig. 2 c2). The <tt>featureCoverage<\/tt> function generates from genome-level alignments read depth coverage summaries for all or a subset of transcripts with nucleotide resolution upstream\/downstream of their start and stop codons, as well as binned coverage for their coding regions (Fig. 2 c3). Additional utilities include functions to automate the analysis of differentially expressed genes (DEGs) with edgeR or DESeq2 (Fig. 2 c4), to compute Venn intersects for large numbers of sample sets (e.g. 2-20 or as many as available memory allows) with plotting functionalities for 2-5 way Venn diagrams (Fig. 2 c6), and to run gene set enrichment analyses in batch mode on large numbers of gene sets. 
The modular design of the systemPipeR environment allows users to easily substitute any of these built-in tools with alternative R-based or command-line software, such as using FastQC<sup id=\"rdp-ebb-cite_ref-BBFastQC_35-0\" class=\"reference\"><a href=\"#cite_note-BBFastQC-35\" rel=\"external_link\">[35]<\/a><\/sup>, FASTX-Toolkit<sup id=\"rdp-ebb-cite_ref-HannonFASTX_36-0\" class=\"reference\"><a href=\"#cite_note-HannonFASTX-36\" rel=\"external_link\">[36]<\/a><\/sup>, or MultiQC<sup id=\"rdp-ebb-cite_ref-EwelsMultiQC16_37-0\" class=\"reference\"><a href=\"#cite_note-EwelsMultiQC16-37\" rel=\"external_link\">[37]<\/a><\/sup> for quality reporting, read trimming or result aggregation, respectively.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Performance_and_scalability\">Performance and scalability<\/span><\/h3>\n<p>systemPipeR has been optimized to run workflows in a time and memory efficient manner even on very large read sets from complex genomes (e.g., mammalian genomes). This is achieved by making heavy use of indexing, file streaming, and parallelization functionalities. For instance, users can limit the RAM requirements of several workflow steps by specifying the maximum number of reads or alignments to stream into memory at any time. This enables analysis of very large files with tens of GBs of storage space on systems with limited RAM resources, making it possible to run systemPipeR workflows even on laptops or smaller workstations, provided they have the required software installed and enough disk space available for storing large NGS input and result files. The processing time of non-parallelized analysis steps depends on the time performance of a specific software tool chosen for a workflow step. For instance, in the RNA-Seq workflow described under Additional file 1 the alignment step will run on a single sample (FASTQ file) with the native time performance of the chosen aligner Bowtie2\/TopHat2. 
Using the much faster HISAT2 aligner instead would accelerate the alignment step proportionally to the time improvements provided by this aligner without the need for additional parallel computer resources.<sup id=\"rdp-ebb-cite_ref-KimHISAT15_28-1\" class=\"reference\"><a href=\"#cite_note-KimHISAT15-28\" rel=\"external_link\">[28]<\/a><\/sup> \n<\/p><p>On a computer cluster, parallelized systemPipeR workflows scale nearly linearly in time with the number of sample files (i.e., FASTQ files) since every step can be parallelized at the sample level. In practice, this means that the runtime of an analysis of 100 FASTQ files can be accelerated 10- or 100-fold by using 10 or 100 CPU cores, respectively, instead of a single CPU core. For example, the RNA-Seq workflow in Additional file 1 can process 100 FASTQ files, each with 30\u201340 million reads from a mammalian genome, in six to eight hours using 100 CPU cores (CPU Model: AMD 6376, 2.3 GHz) and a maximum RAM requirement of less than 10 GB per node. Since the alignment step with Bowtie2\/TopHat2 accounts for most of the compute time of the entire workflow, the use of faster RNA-Seq aligners, such as Rsubread or HISAT2, can reduce the compute time to less than three hours. With comparable parallel computer resources available, one can use systemPipeR to complete the end-to-end analysis of several complex NGS experiments, each containing 50\u2013100 FASTQ files, in less than a day rather than the many days or weeks common in non-parallelized workflows.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Need_for_an_R-based_NGS_workflow_environment\">Need for an R-based NGS workflow environment<\/span><\/h3>\n<p>Several related software tools with NGS workflow functionality are available. 
These include Galaxy<sup id=\"rdp-ebb-cite_ref-GoecksGalaxy10_15-1\" class=\"reference\"><a href=\"#cite_note-GoecksGalaxy10-15\" rel=\"external_link\">[15]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-AfganHarness11_38-0\" class=\"reference\"><a href=\"#cite_note-AfganHarness11-38\" rel=\"external_link\">[38]<\/a><\/sup>, Snakemake<sup id=\"rdp-ebb-cite_ref-K.C3.B6sterSnake12_16-1\" class=\"reference\"><a href=\"#cite_note-K.C3.B6sterSnake12-16\" rel=\"external_link\">[16]<\/a><\/sup>, Taverna<sup id=\"rdp-ebb-cite_ref-WolstencroftTheTav13_17-1\" class=\"reference\"><a href=\"#cite_note-WolstencroftTheTav13-17\" rel=\"external_link\">[17]<\/a><\/sup>, BioBlend<sup id=\"rdp-ebb-cite_ref-SloggettBioBlend13_39-0\" class=\"reference\"><a href=\"#cite_note-SloggettBioBlend13-39\" rel=\"external_link\">[39]<\/a><\/sup>, bcbio-nextgen<sup id=\"rdp-ebb-cite_ref-GuimeraBcbio12_18-1\" class=\"reference\"><a href=\"#cite_note-GuimeraBcbio12-18\" rel=\"external_link\">[18]<\/a><\/sup>, Knime<sup id=\"rdp-ebb-cite_ref-WarrScient12_19-1\" class=\"reference\"><a href=\"#cite_note-WarrScient12-19\" rel=\"external_link\">[19]<\/a><\/sup>, Ruffus<sup id=\"rdp-ebb-cite_ref-GoodstadtRuffus10_20-1\" class=\"reference\"><a href=\"#cite_note-GoodstadtRuffus10-20\" rel=\"external_link\">[20]<\/a><\/sup>, Kepler<sup id=\"rdp-ebb-cite_ref-StroppWorkflows12_21-1\" class=\"reference\"><a href=\"#cite_note-StroppWorkflows12-21\" rel=\"external_link\">[21]<\/a><\/sup>, Wasp<sup id=\"rdp-ebb-cite_ref-McLellanTheWasp12_22-1\" class=\"reference\"><a href=\"#cite_note-McLellanTheWasp12-22\" rel=\"external_link\">[22]<\/a><\/sup>, ViennaNGS<sup id=\"rdp-ebb-cite_ref-WolfingerVienna15_23-1\" class=\"reference\"><a href=\"#cite_note-WolfingerVienna15-23\" rel=\"external_link\">[23]<\/a><\/sup>, Mercury<sup id=\"rdp-ebb-cite_ref-ReidLaunch14_24-1\" class=\"reference\"><a href=\"#cite_note-ReidLaunch14-24\" rel=\"external_link\">[24]<\/a><\/sup>, RAP<sup id=\"rdp-ebb-cite_ref-D.27AntonioRAP15_40-0\" 
class=\"reference\"><a href=\"#cite_note-D.27AntonioRAP15-40\" rel=\"external_link\">[40]<\/a><\/sup>, and LONI<sup id=\"rdp-ebb-cite_ref-TorriNext12_41-0\" class=\"reference\"><a href=\"#cite_note-TorriNext12-41\" rel=\"external_link\">[41]<\/a><\/sup> among others. Additionally, general purpose utilities for workflow management and design are provided by Rabix<sup id=\"rdp-ebb-cite_ref-SBGRabix_42-0\" class=\"reference\"><a href=\"#cite_note-SBGRabix-42\" rel=\"external_link\">[42]<\/a><\/sup> and WDL.<sup id=\"rdp-ebb-cite_ref-BI_WDL_43-0\" class=\"reference\"><a href=\"#cite_note-BI_WDL-43\" rel=\"external_link\">[43]<\/a><\/sup> \n<\/p><p>These tools provide infrastructure for streamlining the analysis of NGS data in a variety of data analysis environments and computer languages. However, only limited resources are available for designing and running analysis workflows for a wide range of NGS applications directly from within R as is possible with systemPipeR. One of the few exceptions is QuasR.<sup id=\"rdp-ebb-cite_ref-GaidatzisQuasR15_44-0\" class=\"reference\"><a href=\"#cite_note-GaidatzisQuasR15-44\" rel=\"external_link\">[44]<\/a><\/sup> This Bioconductor package supports the initial analysis steps of several NGS applications, but it lacks an interface to integrate external command-line software and functionalities to build new workflows. Other existing R\/Bioconductor resources for analyzing NGS data address the needs in this area only partially. For instance, many of them are limited to certain NGS applications, or cover only a subset of the processing steps required for complete workflows; do not support command-line software; or lack workflow design functionalities for different NGS applications. systemPipeR has been designed to address these requirements. However, it is important to mention here that well established community workflow environments like Galaxy provide several additional features not available in systemPipeR. 
A small sub-selection of them includes: (i) a web interface to support non-expert users who are not familiar with data analysis programming environments like R; (ii) support for a wider range of data types outside of the NGS field; (iii) a well-established infrastructure and community for archiving and sharing workflow protocols; and (iv) support for additional reporting technologies such as iPython notebooks. To take advantage of this powerful infrastructure, Galaxy-compatible versions of systemPipeR\u2019s NGS workflows will be released in the future. This will allow biologists to run them from an easy-to-use web interface, while also being able to access additional functionalities available in Galaxy\u2019s large ecosystem of analysis tools.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusion\">Conclusion<\/span><\/h2>\n<p>The systemPipeR package unites R\/Bioconductor resources with external command-line software to standardize and automate the analysis of a wide range of NGS applications. Its functionalities reduce the complexity and time required to translate NGS data into interpretable research results, while a built-in reporting feature improves reproducibility. The environment provides sufficient flexibility to choose the optimal software for each step in complex NGS workflows, customize workflows, and design new workflows. Pre-configured workflow templates are included for several NGS applications. 
Templates for additional NGS applications are under development and will be added to the package in the near future.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Availability_and_requirements\">Availability and requirements<\/span><\/h2>\n<p><b>Project name<\/b>: systemPipeR workflow environment\n<\/p><p><b>Project home page<\/b>: <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/bioconductor.org\/packages\/systemPipeR\/\" target=\"_blank\">https:\/\/bioconductor.org\/packages\/systemPipeR\/<\/a>\n<\/p><p><b>Archived version<\/b>: systemPipeR\n<\/p><p><b>Operating system(s)<\/b>: Platform-independent\n<\/p><p><b>Programming language<\/b>: R\n<\/p><p><b>Other requirements<\/b>: R version \u22653.2, Bioconductor version \u22653.2\n<\/p><p><b>License<\/b>: Artistic-2.0\n<\/p><p><b>Any restrictions to use by non-academics<\/b>: none\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Abbreviations\">Abbreviations<\/span><\/h2>\n<p><b>BAM<\/b>: Binary version of sequence alignment map format\n<\/p><p><b>ChIP-Seq<\/b>: Chromatin immunoprecipitation sequencing\n<\/p><p><b>DEG<\/b>: Differentially expressed genes\n<\/p><p><b>FASTQ<\/b>: Short read sequence file format\n<\/p><p><b>NGS<\/b>: Next-generation sequencing\n<\/p><p><b>Ribo-Seq<\/b>: NGS profiling of mRNA populations bound to ribosomes\n<\/p><p><b>RNA-Seq<\/b>: NGS profiling of mRNA\n<\/p><p><b>SAM<\/b>: Sequence alignment map format\n<\/p><p><b>VAR-Seq<\/b>: NGS-based variant detection\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Declarations\">Declarations<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h3>\n<p>We acknowledge the Bioconductor core team and community for providing valuable input for developing systemPipeR.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Funding\">Funding<\/span><\/h3>\n<p>This work was supported by grants from the National Science Foundation (ABI-0957099, MCB-1021969, IOS-1546879), the National Institutes of Health 
(U24AG051129, R01-AI36959), and the National Institute of Food and Agriculture (2011-68004-30154).\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Authors.E2.80.99_contributions\">Authors\u2019 contributions<\/span><\/h3>\n<p>TB and TG conceived the idea for systemPipeR. TG developed the methods, implemented the R package, and wrote the article. Both authors read and approved the final manuscript.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Competing_interests\">Competing interests<\/span><\/h3>\n<p>The authors declare that they have no competing interests.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Additional_files\">Additional files<\/span><\/h3>\n<p><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/static-content.springer.com\/esm\/art%3A10.1186%2Fs12859-016-1241-0\/MediaObjects\/12859_2016_1241_MOESM1_ESM.pdf\" target=\"_blank\">Additional file 1<\/a>: RNA-Seq Workflow Example. Case study to illustrate the utilities of systemPipeR using an RNA-Seq workflow as example. 
(PDF 89 kb)\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-KaliskySingle11-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KaliskySingle11_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kalisky, T.; Quake, S.R.&#32;(2011).&#32;\"Single-cell genomics\".&#32;<i>Nature Methods<\/i>&#32;<b>8<\/b>&#32;(4): 311\u20134.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnmeth0411-311\" target=\"_blank\">10.1038\/nmeth0411-311<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/21451520\" target=\"_blank\">21451520<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Single-cell+genomics&amp;rft.jtitle=Nature+Methods&amp;rft.aulast=Kalisky%2C+T.%3B+Quake%2C+S.R.&amp;rft.au=Kalisky%2C+T.%3B+Quake%2C+S.R.&amp;rft.date=2011&amp;rft.volume=8&amp;rft.issue=4&amp;rft.pages=311%E2%80%934&amp;rft_id=info:doi\/10.1038%2Fnmeth0411-311&amp;rft_id=info:pmid\/21451520&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TrapnellTheDynamics14-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-TrapnellTheDynamics14_2-0\" 
rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Trapnell, C.; Cacchiarelli, D.; Grimsby, J. et al.&#32;(2014).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4122333\" target=\"_blank\">\"The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells\"<\/a>.&#32;<i>Nature Biotechnology<\/i>&#32;<b>32<\/b>&#32;(4): 381\u201386.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnbt.2859\" target=\"_blank\">10.1038\/nbt.2859<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4122333\/\" target=\"_blank\">PMC4122333<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24658644\" target=\"_blank\">24658644<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4122333\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4122333<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+dynamics+and+regulators+of+cell+fate+decisions+are+revealed+by+pseudotemporal+ordering+of+single+cells&amp;rft.jtitle=Nature+Biotechnology&amp;rft.aulast=Trapnell%2C+C.%3B+Cacchiarelli%2C+D.%3B+Grimsby%2C+J.+et+al.&amp;rft.au=Trapnell%2C+C.%3B+Cacchiarelli%2C+D.%3B+Grimsby%2C+J.+et+al.&amp;rft.date=2014&amp;rft.volume=32&amp;rft.issue=4&amp;rft.pages=381%E2%80%9386&amp;rft_id=info:doi\/10.1038%2Fnbt.2859&amp;rft_id=info:pmc\/PMC4122333&amp;rft_id=info:pmid\/24658644&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4122333&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Lindblad-TohAHigh11-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Lindblad-TohAHigh11_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Lindblad-Toh, K.; Garber, M.; Zuk, O. 
et al.&#32;(2011).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3207357\" target=\"_blank\">\"A high-resolution map of human evolutionary constraint using 29 mammals\"<\/a>.&#32;<i>Nature<\/i>&#32;<b>478<\/b>&#32;(7370): 476\u201382.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnature10530\" target=\"_blank\">10.1038\/nature10530<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3207357\/\" target=\"_blank\">PMC3207357<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/21993624\" target=\"_blank\">21993624<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3207357\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3207357<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+high-resolution+map+of+human+evolutionary+constraint+using+29+mammals&amp;rft.jtitle=Nature&amp;rft.aulast=Lindblad-Toh%2C+K.%3B+Garber%2C+M.%3B+Zuk%2C+O.+et+al.&amp;rft.au=Lindblad-Toh%2C+K.%3B+Garber%2C+M.%3B+Zuk%2C+O.+et+al.&amp;rft.date=2011&amp;rft.volume=478&amp;rft.issue=7370&amp;rft.pages=476%E2%80%9382&amp;rft_id=info:doi\/10.1038%2Fnature10530&amp;rft_id=info:pmc\/PMC3207357&amp;rft_id=info:pmid\/21993624&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3207357&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Kato-MaedaUseOf13-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Kato-MaedaUseOf13_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kato-Maeda, M.; Ho, C.; Passarelli, B. 
et al.&#32;(2013).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3589338\" target=\"_blank\">\"Use of whole genome sequencing to determine the microevolution of Mycobacterium tuberculosis during an outbreak\"<\/a>.&#32;<i>PLoS One<\/i>&#32;<b>8<\/b>&#32;(3): e58235.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pone.0058235\" target=\"_blank\">10.1371\/journal.pone.0058235<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3589338\/\" target=\"_blank\">PMC3589338<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23472164\" target=\"_blank\">23472164<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3589338\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3589338<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Use+of+whole+genome+sequencing+to+determine+the+microevolution+of+Mycobacterium+tuberculosis+during+an+outbreak&amp;rft.jtitle=PLoS+One&amp;rft.aulast=Kato-Maeda%2C+M.%3B+Ho%2C+C.%3B+Passarelli%2C+B.+et+al.&amp;rft.au=Kato-Maeda%2C+M.%3B+Ho%2C+C.%3B+Passarelli%2C+B.+et+al.&amp;rft.date=2013&amp;rft.volume=8&amp;rft.issue=3&amp;rft.pages=e58235&amp;rft_id=info:doi\/10.1371%2Fjournal.pone.0058235&amp;rft_id=info:pmc\/PMC3589338&amp;rft_id=info:pmid\/23472164&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3589338&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HoltTheNew08-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HoltTheNew08_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Holt, R.A.; Jones, S.J.&#32;(2008).&#32;\"The new paradigm of flow cell sequencing\".&#32;<i>Genome Research<\/i>&#32;<b>18<\/b>&#32;(6): 839-46.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1101%2Fgr.073262.107\" target=\"_blank\">10.1101\/gr.073262.107<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/18519653\" target=\"_blank\">18519653<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+new+paradigm+of+flow+cell+sequencing&amp;rft.jtitle=Genome+Research&amp;rft.aulast=Holt%2C+R.A.%3B+Jones%2C+S.J.&amp;rft.au=Holt%2C+R.A.%3B+Jones%2C+S.J.&amp;rft.date=2008&amp;rft.volume=18&amp;rft.issue=6&amp;rft.pages=839-46&amp;rft_id=info:doi\/10.1101%2Fgr.073262.107&amp;rft_id=info:pmid\/18519653&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-RobinsonEdgeR10-6\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-RobinsonEdgeR10_6-0\" rel=\"external_link\">6.0<\/a><\/sup> <sup><a href=\"#cite_ref-RobinsonEdgeR10_6-1\" rel=\"external_link\">6.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Robinson, M.D.; McCarthy, D.J.; Smyth, G.K.&#32;(2010).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2796818\" target=\"_blank\">\"edgeR: A Bioconductor package for differential expression analysis of digital gene expression data\"<\/a>.&#32;<i>Bioinformatics<\/i>&#32;<b>26<\/b>&#32;(1): 139\u201340.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fbioinformatics%2Fbtp616\" target=\"_blank\">10.1093\/bioinformatics\/btp616<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2796818\/\" target=\"_blank\">PMC2796818<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/19910308\" target=\"_blank\">19910308<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2796818\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC2796818<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=edgeR%3A+A+Bioconductor+package+for+differential+expression+analysis+of+digital+gene+expression+data&amp;rft.jtitle=Bioinformatics&amp;rft.aulast=Robinson%2C+M.D.%3B+McCarthy%2C+D.J.%3B+Smyth%2C+G.K.&amp;rft.au=Robinson%2C+M.D.%3B+McCarthy%2C+D.J.%3B+Smyth%2C+G.K.&amp;rft.date=2010&amp;rft.volume=26&amp;rft.issue=1&amp;rft.pages=139%E2%80%9340&amp;rft_id=info:doi\/10.1093%2Fbioinformatics%2Fbtp616&amp;rft_id=info:pmc\/PMC2796818&amp;rft_id=info:pmid\/19910308&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC2796818&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LoveModerated14-7\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-LoveModerated14_7-0\" rel=\"external_link\">7.0<\/a><\/sup> <sup><a href=\"#cite_ref-LoveModerated14_7-1\" rel=\"external_link\">7.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Love, M.I.; Huber, W.; Anders, S.&#32;(2014).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4302049\" target=\"_blank\">\"Moderated estimation of fold 
change and dispersion for RNA-seq data with DESeq2\"<\/a>.&#32;<i>Genome Biology<\/i>&#32;<b>15<\/b>&#32;(12): 550.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2Fs13059-014-0550-8\" target=\"_blank\">10.1186\/s13059-014-0550-8<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4302049\/\" target=\"_blank\">PMC4302049<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25516281\" target=\"_blank\">25516281<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4302049\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4302049<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Moderated+estimation+of+fold+change+and+dispersion+for+RNA-seq+data+with+DESeq2&amp;rft.jtitle=Genome+Biology&amp;rft.aulast=Love%2C+M.I.%3B+Huber%2C+W.%3B+Anders%2C+S.&amp;rft.au=Love%2C+M.I.%3B+Huber%2C+W.%3B+Anders%2C+S.&amp;rft.date=2014&amp;rft.volume=15&amp;rft.issue=12&amp;rft.pages=550&amp;rft_id=info:doi\/10.1186%2Fs13059-014-0550-8&amp;rft_id=info:pmc\/PMC4302049&amp;rft_id=info:pmid\/25516281&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4302049&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KharchenkoDesign08-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KharchenkoDesign08_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kharchenko, P.V.; Tolstorukov, M.Y.; Park, P.J.&#32;(2008).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2597701\" target=\"_blank\">\"Design and analysis of ChIP-seq experiments for DNA-binding proteins\"<\/a>.&#32;<i>Nature Biotechnology<\/i>&#32;<b>26<\/b>&#32;(12): 1351\u20139.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnbt.1508\" target=\"_blank\">10.1038\/nbt.1508<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2597701\/\" 
target=\"_blank\">PMC2597701<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/19029915\" target=\"_blank\">19029915<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2597701\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC2597701<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Design+and+analysis+of+ChIP-seq+experiments+for+DNA-binding+proteins&amp;rft.jtitle=Nature+Biotechnology&amp;rft.aulast=Kharchenko%2C+P.V.%3B+Tolstorukov%2C+M.Y.%3B+Park%2C+P.J.&amp;rft.au=Kharchenko%2C+P.V.%3B+Tolstorukov%2C+M.Y.%3B+Park%2C+P.J.&amp;rft.date=2008&amp;rft.volume=26&amp;rft.issue=12&amp;rft.pages=1351%E2%80%939&amp;rft_id=info:doi\/10.1038%2Fnbt.1508&amp;rft_id=info:pmc\/PMC2597701&amp;rft_id=info:pmid\/19029915&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC2597701&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AkalinMethyl12-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AkalinMethyl12_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Akalin, A.; Kormaksson, M.; Li, S. 
et al.&#32;(2012).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3491415\" target=\"_blank\">\"methylKit: a comprehensive R package for the analysis of genome-wide DNA methylation profiles\"<\/a>.&#32;<i>Genome Biology<\/i>&#32;<b>13<\/b>&#32;(10): R87.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2Fgb-2012-13-10-r87\" target=\"_blank\">10.1186\/gb-2012-13-10-r87<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3491415\/\" target=\"_blank\">PMC3491415<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23034086\" target=\"_blank\">23034086<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3491415\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3491415<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=methylKit%3A+a+comprehensive+R+package+for+the+analysis+of+genome-wide+DNA+methylation+profiles&amp;rft.jtitle=Genome+Biology&amp;rft.aulast=Akalin%2C+A.%3B+Kormaksson%2C+M.%3B+Li%2C+S.+et+al.&amp;rft.au=Akalin%2C+A.%3B+Kormaksson%2C+M.%3B+Li%2C+S.+et+al.&amp;rft.date=2012&amp;rft.volume=13&amp;rft.issue=10&amp;rft.pages=R87&amp;rft_id=info:doi\/10.1186%2Fgb-2012-13-10-r87&amp;rft_id=info:pmc\/PMC3491415&amp;rft_id=info:pmid\/23034086&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3491415&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HuberOrch15-10\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-HuberOrch15_10-0\" rel=\"external_link\">10.0<\/a><\/sup> <sup><a href=\"#cite_ref-HuberOrch15_10-1\" rel=\"external_link\">10.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Huber, W.; Carey, V.J.; Gentleman, R. 
et al.&#32;(2015).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4509590\" target=\"_blank\">\"Orchestrating high-throughput genomic analysis with Bioconductor\"<\/a>.&#32;<i>Nature Methods<\/i>&#32;<b>12<\/b>&#32;(2): 115\u201321.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnmeth.3252\" target=\"_blank\">10.1038\/nmeth.3252<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4509590\/\" target=\"_blank\">PMC4509590<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25633503\" target=\"_blank\">25633503<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4509590\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4509590<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Orchestrating+high-throughput+genomic+analysis+with+Bioconductor&amp;rft.jtitle=Nature+Methods&amp;rft.aulast=Huber%2C+W.%3B+Carey%2C+V.J.%3B+Gentleman%2C+R.+et+al.&amp;rft.au=Huber%2C+W.%3B+Carey%2C+V.J.%3B+Gentleman%2C+R.+et+al.&amp;rft.date=2015&amp;rft.volume=12&amp;rft.issue=2&amp;rft.pages=115%E2%80%9321&amp;rft_id=info:doi\/10.1038%2Fnmeth.3252&amp;rft_id=info:pmc\/PMC4509590&amp;rft_id=info:pmid\/25633503&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4509590&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LiTheSeq09-11\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-LiTheSeq09_11-0\" rel=\"external_link\">11.0<\/a><\/sup> <sup><a href=\"#cite_ref-LiTheSeq09_11-1\" rel=\"external_link\">11.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Li, H.; Handsaker, B.; Wysoker, A. 
et al.&#32;(2009).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2723002\" target=\"_blank\">\"The Sequence Alignment\/Map format and SAMtools\"<\/a>.&#32;<i>Bioinformatics<\/i>&#32;<b>25<\/b>&#32;(16): 2078\u20139.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fbioinformatics%2Fbtp352\" target=\"_blank\">10.1093\/bioinformatics\/btp352<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2723002\/\" target=\"_blank\">PMC2723002<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/19505943\" target=\"_blank\">19505943<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2723002\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC2723002<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+Sequence+Alignment%2FMap+format+and+SAMtools&amp;rft.jtitle=Bioinformatics&amp;rft.aulast=Li%2C+H.%3B+Handsaker%2C+B.%3B+Wysoker%2C+A.+et+al.&amp;rft.au=Li%2C+H.%3B+Handsaker%2C+B.%3B+Wysoker%2C+A.+et+al.&amp;rft.date=2009&amp;rft.volume=25&amp;rft.issue=16&amp;rft.pages=2078%E2%80%939&amp;rft_id=info:doi\/10.1093%2Fbioinformatics%2Fbtp352&amp;rft_id=info:pmc\/PMC2723002&amp;rft_id=info:pmid\/19505943&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC2723002&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LawrenceSoft13-12\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-LawrenceSoft13_12-0\" rel=\"external_link\">12.0<\/a><\/sup> <sup><a href=\"#cite_ref-LawrenceSoft13_12-1\" rel=\"external_link\">12.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Lawrence, M.; Huber, W.; Pag\u00e8s, H. 
et al.&#32;(2013).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3738458\" target=\"_blank\">\"Software for computing and annotating genomic ranges\"<\/a>.&#32;<i>PLoS Computational Biology<\/i>&#32;<b>9<\/b>&#32;(8): e1003118.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pcbi.1003118\" target=\"_blank\">10.1371\/journal.pcbi.1003118<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3738458\/\" target=\"_blank\">PMC3738458<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23950696\" target=\"_blank\">23950696<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3738458\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3738458<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Software+for+computing+and+annotating+genomic+ranges&amp;rft.jtitle=PLoS+Computational+Biology&amp;rft.aulast=Lawrence%2C+M.%3B+Huber%2C+W.%3B+Pag%C3%A8s%2C+H.+et+al.&amp;rft.au=Lawrence%2C+M.%3B+Huber%2C+W.%3B+Pag%C3%A8s%2C+H.+et+al.&amp;rft.date=2013&amp;rft.volume=9&amp;rft.issue=8&amp;rft.pages=e1003118&amp;rft_id=info:doi\/10.1371%2Fjournal.pcbi.1003118&amp;rft_id=info:pmc\/PMC3738458&amp;rft_id=info:pmid\/23950696&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3738458&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-QuinlanBED10-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-QuinlanBED10_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Quinlan, A.R.; Hall, I.M.&#32;(2010).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2832824\" target=\"_blank\">\"BEDTools: a flexible suite of utilities for comparing genomic features\"<\/a>.&#32;<i>Bioinformatics<\/i>&#32;<b>26<\/b>&#32;(6): 841-2.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fbioinformatics%2Fbtq033\" target=\"_blank\">10.1093\/bioinformatics\/btq033<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2832824\/\" 
target=\"_blank\">PMC2832824<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20110278\" target=\"_blank\">20110278<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2832824\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC2832824<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=BEDTools%3A+a+flexible+suite+of+utilities+for+comparing+genomic+features&amp;rft.jtitle=Bioinformatics&amp;rft.aulast=Quinlan%2C+A.R.%3B+Hall%2C+I.M.&amp;rft.au=Quinlan%2C+A.R.%3B+Hall%2C+I.M.&amp;rft.date=2010&amp;rft.volume=26&amp;rft.issue=6&amp;rft.pages=841-2&amp;rft_id=info:doi\/10.1093%2Fbioinformatics%2Fbtq033&amp;rft_id=info:pmc\/PMC2832824&amp;rft_id=info:pmid\/20110278&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC2832824&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DurinckBioMart-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DurinckBioMart_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Durinck, S.; Moreau, Y.; Kasprzyk, A.&#32;(2005).&#32;\"BioMart and Bioconductor: A powerful link between biological databases and microarray data analysis\".&#32;<i>Bioinformatics<\/i>&#32;<b>21<\/b>&#32;(16): 3439-40.&#32;<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fbioinformatics%2Fbti525\" target=\"_blank\">10.1093\/bioinformatics\/bti525<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/16082012\" target=\"_blank\">16082012<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=BioMart+and+Bioconductor%3A+A+powerful+link+between+biological+databases+and+microarray+data+analysis&amp;rft.jtitle=Bioinformatics&amp;rft.aulast=Durinck%2C+S.%3B+Moreau%2C+Y.%3B+Kasprzyk%2C+A.&amp;rft.au=Durinck%2C+S.%3B+Moreau%2C+Y.%3B+Kasprzyk%2C+A.&amp;rft.date=2005&amp;rft.volume=21&amp;rft.issue=16&amp;rft.pages=3439-40&amp;rft_id=info:doi\/10.1093%2Fbioinformatics%2Fbti525&amp;rft_id=info:pmid\/16082012&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GoecksGalaxy10-15\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-GoecksGalaxy10_15-0\" rel=\"external_link\">15.0<\/a><\/sup> <sup><a href=\"#cite_ref-GoecksGalaxy10_15-1\" rel=\"external_link\">15.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Goecks, Jeremy; Nekrutenko, Anton; Taylor, James; The Galaxy Team&#32;(2010).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2945788\" target=\"_blank\">\"Galaxy: A comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life 
sciences\"<\/a>.&#32;<i>Genome Biology<\/i>&#32;<b>11<\/b>&#32;(8): R86.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2Fgb-2010-11-8-r86\" target=\"_blank\">10.1186\/gb-2010-11-8-r86<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2945788\/\" target=\"_blank\">PMC2945788<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20738864\" target=\"_blank\">20738864<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2945788\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC2945788<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Galaxy%3A+A+comprehensive+approach+for+supporting+accessible%2C+reproducible%2C+and+transparent+computational+research+in+the+life+sciences&amp;rft.jtitle=Genome+Biology&amp;rft.aulast=Goecks%2C+Jeremy%3B+Nekrutenko%2C+Anton%3B+Taylor%2C+James%3B+The+Galaxy+Team&amp;rft.au=Goecks%2C+Jeremy%3B+Nekrutenko%2C+Anton%3B+Taylor%2C+James%3B+The+Galaxy+Team&amp;rft.date=2010&amp;rft.volume=11&amp;rft.issue=8&amp;rft.pages=R86&amp;rft_id=info:doi\/10.1186%2Fgb-2010-11-8-r86&amp;rft_id=info:pmc\/PMC2945788&amp;rft_id=info:pmid\/20738864&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC2945788&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-K.C3.B6sterSnake12-16\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-K.C3.B6sterSnake12_16-0\" rel=\"external_link\">16.0<\/a><\/sup> <sup><a href=\"#cite_ref-K.C3.B6sterSnake12_16-1\" rel=\"external_link\">16.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">K\u00f6ster, J.; Rahmann, S.&#32;(2012).&#32;\"Snakemake: A scalable bioinformatics workflow engine\".&#32;<i>Bioinformatics<\/i>&#32;<b>28<\/b>&#32;(19): 2520-2.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fbioinformatics%2Fbts480\" target=\"_blank\">10.1093\/bioinformatics\/bts480<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22908215\" target=\"_blank\">22908215<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Snakemake%3A+A+scalable+bioinformatics+workflow+engine&amp;rft.jtitle=Bioinformatics&amp;rft.aulast=K%C3%B6ster%2C+J.%3B+Rahmann%2C+S.&amp;rft.au=K%C3%B6ster%2C+J.%3B+Rahmann%2C+S.&amp;rft.date=2012&amp;rft.volume=28&amp;rft.issue=19&amp;rft.pages=2520-2&amp;rft_id=info:doi\/10.1093%2Fbioinformatics%2Fbts480&amp;rft_id=info:pmid\/22908215&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WolstencroftTheTav13-17\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-WolstencroftTheTav13_17-0\" rel=\"external_link\">17.0<\/a><\/sup> <sup><a href=\"#cite_ref-WolstencroftTheTav13_17-1\" rel=\"external_link\">17.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wolstencroft, K.; Haines, R.; Fellows, D. 
et al.&#32;(2013).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3692062\" target=\"_blank\">\"The Taverna workflow suite: Designing and executing workflows of Web Services on the desktop, web or in the cloud\"<\/a>.&#32;<i>Nucleic Acids Research<\/i>&#32;<b>41<\/b>&#32;(W1): W557-W561.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fnar%2Fgkt328\" target=\"_blank\">10.1093\/nar\/gkt328<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3692062\/\" target=\"_blank\">PMC3692062<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23640334\" target=\"_blank\">23640334<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3692062\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3692062<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+Taverna+workflow+suite%3A+Designing+and+executing+workflows+of+Web+Services+on+the+desktop%2C+web+or+in+the+cloud&amp;rft.jtitle=Nucleic+Acids+Research&amp;rft.aulast=Wolstencroft%2C+K.%3B+Haines%2C+R.%3B+Fellows%2C+D.+et+al.&amp;rft.au=Wolstencroft%2C+K.%3B+Haines%2C+R.%3B+Fellows%2C+D.+et+al.&amp;rft.date=2013&amp;rft.volume=41&amp;rft.issue=W1&amp;rft.pages=W557-W561&amp;rft_id=info:doi\/10.1093%2Fnar%2Fgkt328&amp;rft_id=info:pmc\/PMC3692062&amp;rft_id=info:pmid\/23640334&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3692062&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GuimeraBcbio12-18\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-GuimeraBcbio12_18-0\" rel=\"external_link\">18.0<\/a><\/sup> <sup><a href=\"#cite_ref-GuimeraBcbio12_18-1\" rel=\"external_link\">18.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Guimera, R.V.&#32;(2012).&#32;\"bcbio-nextgen: Automated, distributed next-gen sequencing pipeline\".&#32;<i>EMBnet.journal<\/i>&#32;<b>17<\/b>&#32;(B): 30.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.14806%2Fej.17.B.286\" target=\"_blank\">10.14806\/ej.17.B.286<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=bcbio-nextgen%3A+Automated%2C+distributed+next-gen+sequencing+pipeline&amp;rft.jtitle=EMBnet.journal&amp;rft.aulast=Guimera%2C+R.V.&amp;rft.au=Guimera%2C+R.V.&amp;rft.date=2012&amp;rft.volume=17&amp;rft.issue=B&amp;rft.pages=30&amp;rft_id=info:doi\/10.14806%2Fej.17.B.286&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WarrScient12-19\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-WarrScient12_19-0\" rel=\"external_link\">19.0<\/a><\/sup> <sup><a href=\"#cite_ref-WarrScient12_19-1\" rel=\"external_link\">19.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Warr, W.A.&#32;(2012).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3414708\" target=\"_blank\">\"Scientific workflow systems: Pipeline Pilot and KNIME\"<\/a>.&#32;<i>Journal of Computer-aided Molecular Design<\/i>&#32;<b>26<\/b>&#32;(7): 801\u20134.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs10822-012-9577-7\" target=\"_blank\">10.1007\/s10822-012-9577-7<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3414708\/\" target=\"_blank\">PMC3414708<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a 
rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22644661\" target=\"_blank\">22644661<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3414708\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3414708<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Scientific+workflow+systems%3A+Pipeline+Pilot+and+KNIME&amp;rft.jtitle=Journal+of+Computer-aided+Molecular+Design&amp;rft.aulast=Warr%2C+W.A.&amp;rft.au=Warr%2C+W.A.&amp;rft.date=2012&amp;rft.volume=26&amp;rft.issue=7&amp;rft.pages=801%E2%80%934&amp;rft_id=info:doi\/10.1007%2Fs10822-012-9577-7&amp;rft_id=info:pmc\/PMC3414708&amp;rft_id=info:pmid\/22644661&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3414708&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GoodstadtRuffus10-20\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-GoodstadtRuffus10_20-0\" rel=\"external_link\">20.0<\/a><\/sup> <sup><a href=\"#cite_ref-GoodstadtRuffus10_20-1\" rel=\"external_link\">20.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Goodstadt, L.&#32;(2010).&#32;\"Ruffus: A lightweight Python library for computational pipelines\".&#32;<i>Bioinformatics<\/i>&#32;<b>26<\/b>&#32;(21): 2778-9.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fbioinformatics%2Fbtq524\" 
target=\"_blank\">10.1093\/bioinformatics\/btq524<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20847218\" target=\"_blank\">20847218<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Ruffus%3A+A+lightweight+Python+library+for+computational+pipelines&amp;rft.jtitle=Bioinformatics&amp;rft.aulast=Goodstadt%2C+L.&amp;rft.au=Goodstadt%2C+L.&amp;rft.date=2010&amp;rft.volume=26&amp;rft.issue=21&amp;rft.pages=2778-9&amp;rft_id=info:doi\/10.1093%2Fbioinformatics%2Fbtq524&amp;rft_id=info:pmid\/20847218&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-StroppWorkflows12-21\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-StroppWorkflows12_21-0\" rel=\"external_link\">21.0<\/a><\/sup> <sup><a href=\"#cite_ref-StroppWorkflows12_21-1\" rel=\"external_link\">21.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Stropp, T.; McPhillips, T.; Lud\u00e4scher, B.; Bieda, M.&#32;(2012).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3431220\" target=\"_blank\">\"Workflows for microarray data processing in the Kepler environment\"<\/a>.&#32;<i>BMC Bioinformatics<\/i>&#32;<b>13<\/b>: 102.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2F1471-2105-13-102\" target=\"_blank\">10.1186\/1471-2105-13-102<\/a>.&#32;<a 
rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3431220\/\" target=\"_blank\">PMC3431220<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22594911\" target=\"_blank\">22594911<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3431220\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3431220<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Workflows+for+microarray+data+processing+in+the+Kepler+environment&amp;rft.jtitle=BMC+Bioinformatics&amp;rft.aulast=Stropp%2C+T.%3B+McPhillips%2C+T.%3B++Lud%C3%A4scher%2C+B.%3B+Bieda%2C+M.&amp;rft.au=Stropp%2C+T.%3B+McPhillips%2C+T.%3B++Lud%C3%A4scher%2C+B.%3B+Bieda%2C+M.&amp;rft.date=2012&amp;rft.volume=13&amp;rft.pages=102&amp;rft_id=info:doi\/10.1186%2F1471-2105-13-102&amp;rft_id=info:pmc\/PMC3431220&amp;rft_id=info:pmid\/22594911&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3431220&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-McLellanTheWasp12-22\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-McLellanTheWasp12_22-0\" rel=\"external_link\">22.0<\/a><\/sup> <sup><a href=\"#cite_ref-McLellanTheWasp12_22-1\" 
rel=\"external_link\">22.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">McLellan, A.S.; Dubin, R.; Jing, Q. et al.&#32;(2012).&#32;\"The Wasp System: An open source environment for managing and analyzing genomic data\".&#32;<i>Genomics<\/i>&#32;<b>100<\/b>&#32;(6): 345-51.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.ygeno.2012.08.005\" target=\"_blank\">10.1016\/j.ygeno.2012.08.005<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22944616\" target=\"_blank\">22944616<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+Wasp+System%3A+An+open+source+environment+for+managing+and+analyzing+genomic+data&amp;rft.jtitle=Genomics&amp;rft.aulast=McLellan%2C+A.S.%3B+Dubin%2C+R.%3B+Jing%2C+Q.+et+al.&amp;rft.au=McLellan%2C+A.S.%3B+Dubin%2C+R.%3B+Jing%2C+Q.+et+al.&amp;rft.date=2012&amp;rft.volume=100&amp;rft.issue=6&amp;rft.pages=345-51&amp;rft_id=info:doi\/10.1016%2Fj.ygeno.2012.08.005&amp;rft_id=info:pmid\/22944616&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WolfingerVienna15-23\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-WolfingerVienna15_23-0\" rel=\"external_link\">23.0<\/a><\/sup> <sup><a href=\"#cite_ref-WolfingerVienna15_23-1\" rel=\"external_link\">23.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wolfinger, M.T.; Fallmann, J.; 
Eggenhofer, F.; Amman, F.&#32;(2015).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4513691\" target=\"_blank\">\"ViennaNGS: A toolbox for building efficient next-generation sequencing analysis pipelines\"<\/a>.&#32;<i>F1000Research<\/i>&#32;<b>4<\/b>: 50.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.12688%2Ff1000research.6157.2\" target=\"_blank\">10.12688\/f1000research.6157.2<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4513691\/\" target=\"_blank\">PMC4513691<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26236465\" target=\"_blank\">26236465<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4513691\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4513691<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=ViennaNGS%3A+A+toolbox+for+building+efficient+next-generation+sequencing+analysis+pipelines&amp;rft.jtitle=F1000Research&amp;rft.aulast=Wolfinger%2C+M.T.%3B+Fallmann%2C+J.%3B+Eggenhofer%2C+F.%3B+Amman%2C+F.&amp;rft.au=Wolfinger%2C+M.T.%3B+Fallmann%2C+J.%3B+Eggenhofer%2C+F.%3B+Amman%2C+F.&amp;rft.date=2015&amp;rft.volume=4&amp;rft.pages=50&amp;rft_id=info:doi\/10.12688%2Ff1000research.6157.2&amp;rft_id=info:pmc\/PMC4513691&amp;rft_id=info:pmid\/26236465&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4513691&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ReidLaunch14-24\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-ReidLaunch14_24-0\" rel=\"external_link\">24.0<\/a><\/sup> <sup><a href=\"#cite_ref-ReidLaunch14_24-1\" rel=\"external_link\">24.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Reid, J.G.; Carroll, A.; Veeraraghavan, N. 
et al.&#32;(2014).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3922167\" target=\"_blank\">\"Launching genomics into the cloud: Deployment of Mercury, a next generation sequence analysis pipeline\"<\/a>.&#32;<i>BMC Bioinformatics<\/i>&#32;<b>15<\/b>: 30.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2F1471-2105-15-30\" target=\"_blank\">10.1186\/1471-2105-15-30<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3922167\/\" target=\"_blank\">PMC3922167<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24475911\" target=\"_blank\">24475911<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3922167\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3922167<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Launching+genomics+into+the+cloud%3A+Deployment+of+Mercury%2C+a+next+generation+sequence+analysis+pipeline&amp;rft.jtitle=BMC+Bioinformatics&amp;rft.aulast=Reid%2C+J.G.%3B+Carroll%2C+A.%3B+Veeraraghavan%2C+N.+et+al.&amp;rft.au=Reid%2C+J.G.%3B+Carroll%2C+A.%3B+Veeraraghavan%2C+N.+et+al.&amp;rft.date=2014&amp;rft.volume=15&amp;rft.pages=30&amp;rft_id=info:doi\/10.1186%2F1471-2105-15-30&amp;rft_id=info:pmc\/PMC3922167&amp;rft_id=info:pmid\/24475911&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3922167&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LiAligning13-25\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LiAligning13_25-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Li, H.&#32;(26 May 2013).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/arxiv.org\/abs\/1303.3997\" target=\"_blank\">\"Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM\"<\/a>.&#32;<i>arXiv.org<\/i>.&#32;Cornell University Library<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/arxiv.org\/abs\/1303.3997\" target=\"_blank\">https:\/\/arxiv.org\/abs\/1303.3997<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Aligning+sequence+reads%2C+clone+sequences+and+assembly+contigs+with+BWA-MEM&amp;rft.atitle=arXiv.org&amp;rft.aulast=Li%2C+H.&amp;rft.au=Li%2C+H.&amp;rft.date=26+May+2013&amp;rft.pub=Cornell+University+Library&amp;rft_id=https%3A%2F%2Farxiv.org%2Fabs%2F1303.3997&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LangmeadFast12-26\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LangmeadFast12_26-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Langmead, B.; Salzberg, S.L.&#32;(2012).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3322381\" target=\"_blank\">\"Fast gapped-read alignment with Bowtie 2\"<\/a>.&#32;<i>Nature Methods<\/i>&#32;<b>9<\/b>&#32;(4): 357-9.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnmeth.1923\" target=\"_blank\">10.1038\/nmeth.1923<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3322381\/\" target=\"_blank\">PMC3322381<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22388286\" target=\"_blank\">22388286<\/a><span class=\"printonly\">.&#32;<a 
rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3322381\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3322381<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Fast+gapped-read+alignment+with+Bowtie+2&amp;rft.jtitle=Nature+Methods&amp;rft.aulast=Langmead%2C+B.%3B+Salzberg%2C+S.L.&amp;rft.au=Langmead%2C+B.%3B+Salzberg%2C+S.L.&amp;rft.date=2012&amp;rft.volume=9&amp;rft.issue=4&amp;rft.pages=357-9&amp;rft_id=info:doi\/10.1038%2Fnmeth.1923&amp;rft_id=info:pmc\/PMC3322381&amp;rft_id=info:pmid\/22388286&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3322381&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KimTopHat2_13-27\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KimTopHat2_13_27-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kim, D.; Pertea, G.; Trapnell, C. 
et al.&#32;(2013).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4053844\" target=\"_blank\">\"TopHat2: Accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions\"<\/a>.&#32;<i>Genome Biology<\/i>&#32;<b>14<\/b>: R36.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2Fgb-2013-14-4-r36\" target=\"_blank\">10.1186\/gb-2013-14-4-r36<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4053844\/\" target=\"_blank\">PMC4053844<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23618408\" target=\"_blank\">23618408<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4053844\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4053844<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=TopHat2%3A+Accurate+alignment+of+transcriptomes+in+the+presence+of+insertions%2C+deletions+and+gene+fusions&amp;rft.jtitle=Genome+Biology&amp;rft.aulast=Kim%2C+D.%3B+Pertea%2C+G.%3B+Trapnell%2C+C.+et+al.&amp;rft.au=Kim%2C+D.%3B+Pertea%2C+G.%3B+Trapnell%2C+C.+et+al.&amp;rft.date=2013&amp;rft.volume=14&amp;rft.pages=R36&amp;rft_id=info:doi\/10.1186%2Fgb-2013-14-4-r36&amp;rft_id=info:pmc\/PMC4053844&amp;rft_id=info:pmid\/23618408&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4053844&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KimHISAT15-28\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-KimHISAT15_28-0\" rel=\"external_link\">28.0<\/a><\/sup> <sup><a href=\"#cite_ref-KimHISAT15_28-1\" rel=\"external_link\">28.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kim, D.; Langmead, B.; Salzberg, S.L.&#32;(2015).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4655817\" target=\"_blank\">\"HISAT: A fast spliced aligner with low memory requirements\"<\/a>.&#32;<i>Nature Methods<\/i>&#32;<b>12<\/b>&#32;(4): 357-60.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnmeth.3317\" target=\"_blank\">10.1038\/nmeth.3317<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4655817\/\" target=\"_blank\">PMC4655817<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25751142\" target=\"_blank\">25751142<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4655817\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4655817<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=HISAT%3A+A+fast+spliced+aligner+with+low+memory+requirements&amp;rft.jtitle=Nature+Methods&amp;rft.aulast=Kim%2C+D.%3B+Langmead%2C+B.%3B+Salzberg%2C+S.L.&amp;rft.au=Kim%2C+D.%3B+Langmead%2C+B.%3B+Salzberg%2C+S.L.&amp;rft.date=2015&amp;rft.volume=12&amp;rft.issue=4&amp;rft.pages=357-60&amp;rft_id=info:doi\/10.1038%2Fnmeth.3317&amp;rft_id=info:pmc\/PMC4655817&amp;rft_id=info:pmid\/25751142&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4655817&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ZhangModel08-29\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ZhangModel08_29-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Zhang, Y.; Liu, T.; Meyer, C.A. 
et al.&#32;(2008).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2592715\" target=\"_blank\">\"Model-based analysis of ChIP-Seq (MACS)\"<\/a>.&#32;<i>Genome Biology<\/i>&#32;<b>9<\/b>&#32;(9): R137.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2Fgb-2008-9-9-r137\" target=\"_blank\">10.1186\/gb-2008-9-9-r137<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2592715\/\" target=\"_blank\">PMC2592715<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/18798982\" target=\"_blank\">18798982<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2592715\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC2592715<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Model-based+analysis+of+ChIP-Seq+%28MACS%29&amp;rft.jtitle=Genome+Biology&amp;rft.aulast=Zhang%2C+Y.%3B+Liu%2C+T.%3B+Meyer%2C+C.A.+et+al.&amp;rft.au=Zhang%2C+Y.%3B+Liu%2C+T.%3B+Meyer%2C+C.A.+et+al.&amp;rft.date=2008&amp;rft.volume=9&amp;rft.issue=9&amp;rft.pages=R137&amp;rft_id=info:doi\/10.1186%2Fgb-2008-9-9-r137&amp;rft_id=info:pmc\/PMC2592715&amp;rft_id=info:pmid\/18798982&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC2592715&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-McKennaTheGenome10-30\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-McKennaTheGenome10_30-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">McKenna, A.; Hanna, M.; Banks, E. 
et al.&#32;(2010).&#32;\"The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data\".&#32;<i>Genome Research<\/i>&#32;<b>20<\/b>: 1297-1303.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1101%2Fgr.107524.110\" target=\"_blank\">10.1101\/gr.107524.110<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+Genome+Analysis+Toolkit%3A+A+MapReduce+framework+for+analyzing+next-generation+DNA+sequencing+data&amp;rft.jtitle=Genome+Research&amp;rft.aulast=McKenna%2C+A.%3B+Hanna%2C+M.%3B+Banks%2C+E.+et+al.&amp;rft.au=McKenna%2C+A.%3B+Hanna%2C+M.%3B+Banks%2C+E.+et+al.&amp;rft.date=2010&amp;rft.volume=20&amp;rft.pages=1297-1303&amp;rft_id=info:doi\/10.1101%2Fgr.107524.110&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BischlBatch12-31\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-BischlBatch12_31-0\" rel=\"external_link\">31.0<\/a><\/sup> <sup><a href=\"#cite_ref-BischlBatch12_31-1\" rel=\"external_link\">31.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bischl, B.; Lang, M.; Mersmann, O. 
et al.&#32;(2015).&#32;\"BatchJobs and BatchExperiments: Abstraction Mechanisms for Using R in Batch Environments\".&#32;<i>Journal of Statistical Software<\/i>&#32;<b>64<\/b>&#32;(11): 1\u201325.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.18637%2Fjss.v064.i11\" target=\"_blank\">10.18637\/jss.v064.i11<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=BatchJobs+and+BatchExperiments%3A+Abstraction+Mechanisms+for+Using+R+in+Batch+Environments&amp;rft.jtitle=Journal+of+Statistical+Software&amp;rft.aulast=Bischl%2C+B.%3B+Lang%2C+M.%3B+Mersmann%2C+O.+et+al.&amp;rft.au=Bischl%2C+B.%3B+Lang%2C+M.%3B+Mersmann%2C+O.+et+al.&amp;rft.date=2015&amp;rft.volume=64&amp;rft.issue=11&amp;rft.pages=1%E2%80%9325&amp;rft_id=info:doi\/10.18637%2Fjss.v064.i11&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-XieDynamic13-32\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-XieDynamic13_32-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Xie, Y.&#32;(2013).&#32;<i>Dynamic Documents with R and knitr<\/i>&#32;(1st ed.).&#32;Chapman and Hall\/CRC.&#32;pp.&#160;216.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9781482203530.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Dynamic+Documents+with+R+and+knitr&amp;rft.aulast=Xie%2C+Y.&amp;rft.au=Xie%2C+Y.&amp;rft.date=2013&amp;rft.pages=pp.%26nbsp%3B216&amp;rft.edition=1st&amp;rft.pub=Chapman+and+Hall%2FCRC&amp;rft.isbn=9781482203530&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MorganShortRead09-33\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MorganShortRead09_33-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Morgan, M.; Anders, S.; Lawrence, M. et al.&#32;(2009).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2752612\" target=\"_blank\">\"ShortRead: A bioconductor package for input, quality assessment and exploration of high-throughput sequence data\"<\/a>.&#32;<i>Bioinformatics<\/i>&#32;<b>25<\/b>&#32;(19): 2607-8.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fbioinformatics%2Fbtp450\" target=\"_blank\">10.1093\/bioinformatics\/btp450<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2752612\/\" target=\"_blank\">PMC2752612<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/19654119\" 
target=\"_blank\">19654119<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2752612\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC2752612<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=ShortRead%3A+A+bioconductor+package+for+input%2C+quality+assessment+and+exploration+of+high-throughput+sequence+data&amp;rft.jtitle=Bioinformatics&amp;rft.aulast=Morgan%2C+M.%3B+Anders%2C+S.%3B+Lawrence%2C+M.+et+al.&amp;rft.au=Morgan%2C+M.%3B+Anders%2C+S.%3B+Lawrence%2C+M.+et+al.&amp;rft.date=2009&amp;rft.volume=25&amp;rft.issue=19&amp;rft.pages=2607-8&amp;rft_id=info:doi\/10.1093%2Fbioinformatics%2Fbtp450&amp;rft_id=info:pmc\/PMC2752612&amp;rft_id=info:pmid\/19654119&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC2752612&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ObenchainVariant14-34\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ObenchainVariant14_34-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Obenchain, V.; Lawrence, M.; Carey, V. 
et al.&#32;(2014).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4080743\" target=\"_blank\">\"VariantAnnotation: a Bioconductor package for exploration and annotation of genetic variants\"<\/a>.&#32;<i>Bioinformatics<\/i>&#32;<b>30<\/b>&#32;(14): 2076-8.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fbioinformatics%2Fbtu168\" target=\"_blank\">10.1093\/bioinformatics\/btu168<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4080743\/\" target=\"_blank\">PMC4080743<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24681907\" target=\"_blank\">24681907<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4080743\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4080743<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=VariantAnnotation%3A+a+Bioconductor+package+for+exploration+and+annotation+of+genetic+variants&amp;rft.jtitle=Bioinformatics&amp;rft.aulast=Obenchain%2C+V.%3B+Lawrence%2C+M.%3B+Carey%2C+V.+et+al.&amp;rft.au=Obenchain%2C+V.%3B+Lawrence%2C+M.%3B+Carey%2C+V.+et+al.&amp;rft.date=2014&amp;rft.volume=30&amp;rft.issue=14&amp;rft.pages=2076-8&amp;rft_id=info:doi\/10.1093%2Fbioinformatics%2Fbtu168&amp;rft_id=info:pmc\/PMC4080743&amp;rft_id=info:pmid\/24681907&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4080743&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BBFastQC-35\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BBFastQC_35-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.bioinformatics.babraham.ac.uk\/projects\/fastqc\/\" target=\"_blank\">\"FastQC\"<\/a>.&#32;Babraham Bioinformatics<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.bioinformatics.babraham.ac.uk\/projects\/fastqc\/\" target=\"_blank\">http:\/\/www.bioinformatics.babraham.ac.uk\/projects\/fastqc\/<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 15 September 2015<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=FastQC&amp;rft.atitle=&amp;rft.pub=Babraham+Bioinformatics&amp;rft_id=http%3A%2F%2Fwww.bioinformatics.babraham.ac.uk%2Fprojects%2Ffastqc%2F&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: 
none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HannonFASTX-36\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HannonFASTX_36-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/hannonlab.cshl.edu\/fastx_toolkit\/index.html\" target=\"_blank\">\"FASTX-Toolkit\"<\/a>.&#32;Hannon Laboratory<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/hannonlab.cshl.edu\/fastx_toolkit\/index.html\" target=\"_blank\">http:\/\/hannonlab.cshl.edu\/fastx_toolkit\/index.html<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 17 September 2015<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=FASTX-Toolkit&amp;rft.atitle=&amp;rft.pub=Hannon+Laboratory&amp;rft_id=http%3A%2F%2Fhannonlab.cshl.edu%2Ffastx_toolkit%2Findex.html&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-EwelsMultiQC16-37\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-EwelsMultiQC16_37-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ewels, P.; Magnusson, M.; Lundin, S.; K\u00e4ller, M.&#32;(2016).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5039924\" target=\"_blank\">\"MultiQC: summarize analysis results for multiple tools and samples in a single report\"<\/a>.&#32;<i>Bioinformatics<\/i>&#32;<b>32<\/b>&#32;(19): 3047\u20133048.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" 
class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fbioinformatics%2Fbtw354\" target=\"_blank\">10.1093\/bioinformatics\/btw354<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5039924\/\" target=\"_blank\">PMC5039924<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27312411\" target=\"_blank\">27312411<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5039924\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC5039924<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=MultiQC%3A+summarize+analysis+results+for+multiple+tools+and+samples+in+a+single+report&amp;rft.jtitle=Bioinformatics&amp;rft.aulast=Ewels%2C+P.%3B+Magnusson%2C+M.%3B+Lundin%2C+S.%3B+K%C3%A4ller%2C+M.&amp;rft.au=Ewels%2C+P.%3B+Magnusson%2C+M.%3B+Lundin%2C+S.%3B+K%C3%A4ller%2C+M.&amp;rft.date=2016&amp;rft.volume=32&amp;rft.issue=19&amp;rft.pages=3047%E2%80%933048&amp;rft_id=info:doi\/10.1093%2Fbioinformatics%2Fbtw354&amp;rft_id=info:pmc\/PMC5039924&amp;rft_id=info:pmid\/27312411&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5039924&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AfganHarness11-38\"><span 
class=\"mw-cite-backlink\"><a href=\"#cite_ref-AfganHarness11_38-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Afgan, E.; Baker, D.; Coraor, N. et al.&#32;(2011).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3868438\" target=\"_blank\">\"Harnessing cloud computing with Galaxy Cloud\"<\/a>.&#32;<i>Nature Biotechnology<\/i>&#32;<b>29<\/b>&#32;(11): 972-4.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnbt.2028\" target=\"_blank\">10.1038\/nbt.2028<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3868438\/\" target=\"_blank\">PMC3868438<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22068528\" target=\"_blank\">22068528<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3868438\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3868438<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Harnessing+cloud+computing+with+Galaxy+Cloud&amp;rft.jtitle=Nature+Biotechnology&amp;rft.aulast=Afgan%2C+E.%3B+Baker%2C+D.%3B+Coraor%2C+N.+et+al.&amp;rft.au=Afgan%2C+E.%3B+Baker%2C+D.%3B+Coraor%2C+N.+et+al.&amp;rft.date=2011&amp;rft.volume=29&amp;rft.issue=11&amp;rft.pages=972-4&amp;rft_id=info:doi\/10.1038%2Fnbt.2028&amp;rft_id=info:pmc\/PMC3868438&amp;rft_id=info:pmid\/22068528&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3868438&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SloggettBioBlend13-39\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SloggettBioBlend13_39-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Sloggett, C.; Goonasekera, N.; Afgan, E.&#32;(2013).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4288140\" target=\"_blank\">\"BioBlend: automating pipeline analyses within Galaxy and CloudMan\"<\/a>.&#32;<i>Bioinformatics<\/i>&#32;<b>29<\/b>&#32;(13): 1685-6.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fbioinformatics%2Fbtt199\" target=\"_blank\">10.1093\/bioinformatics\/btt199<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4288140\/\" target=\"_blank\">PMC4288140<\/a>.&#32;<a 
rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23630176\" target=\"_blank\">23630176<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4288140\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4288140<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=BioBlend%3A+automating+pipeline+analyses+within+Galaxy+and+CloudMan&amp;rft.jtitle=Bioinformatics&amp;rft.aulast=Sloggett%2C+C.%3B+Goonasekera%2C+N.%3B+Afgan%2C+E.&amp;rft.au=Sloggett%2C+C.%3B+Goonasekera%2C+N.%3B+Afgan%2C+E.&amp;rft.date=2013&amp;rft.volume=29&amp;rft.issue=13&amp;rft.pages=1685-6&amp;rft_id=info:doi\/10.1093%2Fbioinformatics%2Fbtt199&amp;rft_id=info:pmc\/PMC4288140&amp;rft_id=info:pmid\/23630176&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4288140&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-D.27AntonioRAP15-40\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-D.27AntonioRAP15_40-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">D'Antonio, M.; D'Onorio De Meo, P.; Pallocca, M. 
et al.&#32;(2015).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4461013\" target=\"_blank\">\"RAP: RNA-Seq Analysis Pipeline, a new cloud-based NGS web application\"<\/a>.&#32;<i>BMC Genomics<\/i>&#32;<b>16<\/b>: S3.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2F1471-2164-16-S6-S3\" target=\"_blank\">10.1186\/1471-2164-16-S6-S3<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4461013\/\" target=\"_blank\">PMC4461013<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26046471\" target=\"_blank\">26046471<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4461013\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4461013<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=RAP%3A+RNA-Seq+Analysis+Pipeline%2C+a+new+cloud-based+NGS+web+application&amp;rft.jtitle=BMC+Genomics&amp;rft.aulast=D%27Antonio%2C+M.%3B+D%27Onorio+De+Meo%2C+P.%3B+Pallocca%2C+M.+et+al.&amp;rft.au=D%27Antonio%2C+M.%3B+D%27Onorio+De+Meo%2C+P.%3B+Pallocca%2C+M.+et+al.&amp;rft.date=2015&amp;rft.volume=16&amp;rft.pages=S3&amp;rft_id=info:doi\/10.1186%2F1471-2164-16-S6-S3&amp;rft_id=info:pmc\/PMC4461013&amp;rft_id=info:pmid\/26046471&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4461013&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TorriNext12-41\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-TorriNext12_41-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Torri, F.; Dinov, I.D.; Zamanyan, A. 
et al.&#32;(2012).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3490498\" target=\"_blank\">\"Next generation sequence analysis and computational genomics using graphical pipeline workflows\"<\/a>.&#32;<i>Genes<\/i>&#32;<b>3<\/b>&#32;(3): 545\u201375.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3390%2Fgenes3030545\" target=\"_blank\">10.3390\/genes3030545<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3490498\/\" target=\"_blank\">PMC3490498<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23139896\" target=\"_blank\">23139896<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3490498\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3490498<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Next+generation+sequence+analysis+and+computational+genomics+using+graphical+pipeline+workflows&amp;rft.jtitle=Genes&amp;rft.aulast=Torri%2C+F.%3B+Dinov%2C+I.D.%3B+Zamanyan%2C+A.+et+al.&amp;rft.au=Torri%2C+F.%3B+Dinov%2C+I.D.%3B+Zamanyan%2C+A.+et+al.&amp;rft.date=2012&amp;rft.volume=3&amp;rft.issue=3&amp;rft.pages=545%E2%80%9375&amp;rft_id=info:doi\/10.3390%2Fgenes3030545&amp;rft_id=info:pmc\/PMC3490498&amp;rft_id=info:pmid\/23139896&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3490498&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SBGRabix-42\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SBGRabix_42-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/rabix.io\/\" target=\"_blank\">\"Rabix\"<\/a>.&#32;Seven Bridges Genomics, Inc<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/rabix.io\/\" target=\"_blank\">http:\/\/rabix.io\/<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Rabix&amp;rft.atitle=&amp;rft.pub=Seven+Bridges+Genomics%2C+Inc&amp;rft_id=http%3A%2F%2Frabix.io%2F&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BI_WDL-43\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BI_WDL_43-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Broad 
Institute.&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/github.com\/broadinstitute\/wdl\" target=\"_blank\">\"broadinstitute\/wdl\"<\/a>.&#32;<i>GitHub<\/i><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/github.com\/broadinstitute\/wdl\" target=\"_blank\">https:\/\/github.com\/broadinstitute\/wdl<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 16 September 2015<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=broadinstitute%2Fwdl&amp;rft.atitle=GitHub&amp;rft.aulast=Broad+Institute&amp;rft.au=Broad+Institute&amp;rft_id=https%3A%2F%2Fgithub.com%2Fbroadinstitute%2Fwdl&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GaidatzisQuasR15-44\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GaidatzisQuasR15_44-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Gaidatzis, D.; Lerch, A.; Hahne, F. 
et al.&#32;(2015).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4382904\" target=\"_blank\">\"QuasR: Quantification and annotation of short reads in R\"<\/a>.&#32;<i>Bioinformatics<\/i>&#32;<b>31<\/b>: 7.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fbioinformatics%2Fbtu781\" target=\"_blank\">10.1093\/bioinformatics\/btu781<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4382904\/\" target=\"_blank\">PMC4382904<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25417205\" target=\"_blank\">25417205<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4382904\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4382904<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=QuasR%3A+Quantification+and+annotation+of+short+reads+in+R&amp;rft.jtitle=Bioinformatics&amp;rft.aulast=Gaidatzis%2C+D.%3B+Lerch%2C+A.%3B+Hahne%2C+F.+et+al.&amp;rft.au=Gaidatzis%2C+D.%3B+Lerch%2C+A.%3B+Hahne%2C+F.+et+al.&amp;rft.date=2015&amp;rft.volume=31&amp;rft.pages=7&amp;rft_id=info:doi\/10.1093%2Fbioinformatics%2Fbtu781&amp;rft_id=info:pmc\/PMC4382904&amp;rft_id=info:pmid\/25417205&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4382904&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. 
The original URL to Rabix was dead, and it was replaced with a current one for this version.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment\">https:\/\/www.limswiki.org\/index.php\/Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","d6135e8d32b77d11c05c7b261fe72044_images":["https:\/\/www.limswiki.org\/images\/8\/89\/Fig1_Backman_BMCBio2016_17.gif","https:\/\/www.limswiki.org\/images\/7\/7f\/Fig2_Backman_BMCBio2016_17.gif"],"d6135e8d32b77d11c05c7b261fe72044_timestamp":1544815908,"bbf9b02ac710d05d03c94083fa4e01e0_type":"article","bbf9b02ac710d05d03c94083fa4e01e0_title":"Promoting data sharing among Indonesian scientists: A proposal of a generic university-level research data management plan (RDMP) (Irawan and Rachmi 2018)","bbf9b02ac710d05d03c94083fa4e01e0_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)","bbf9b02ac710d05d03c94083fa4e01e0_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:Promoting data sharing among Indonesian scientists: A proposal of a generic university-level research data management plan (RDMP)\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nPromoting data sharing among Indonesian scientists: A proposal of a generic university-level research data management plan (RDMP)Journal\n \nResearch Ideas and OutcomesAuthor(s)\n \nIrawan, Dasapta E.; Rachmi, Cut N.Author affiliation(s)\n \nInstitut Teknologi Bandung, Universitas PadjadjaranPrimary contact\n \nEmail: dasaptaerwin at outlook dot co dot idYear published\n \n2018Volume and issue\n \n4Page(s)\n \ne28163DOI\n \n10.3897\/rio.4.e28163ISSN\n \n2367-7163Distribution license\n \nCreative Commons Attribution 4.0 InternationalWebsite\n \nhttps:\/\/riojournal.com\/articles.php?id=28163Download\n \nhttps:\/\/riojournal.com\/article\/28163\/download\/pdf\/ (PDF)\n\nAbstract \nEvery researcher needs data in their working ecosystem, but despite the resources (funding, time, and energy) they have 
spent to get the data, only a few are putting more real attention into data management. This paper mainly describes our recommendation of a research data management plan (RDMP) at the university level. This paper is an extension of our initiative, to be developed at the university or national level, while also in-line with current developments in scientific practices mandating data sharing and data re-use.\nResearchers can use this article as an assessment form to describe the setting of their research and data management. Researchers can also develop a more detailed RDMP to cater to a specific project's environment. In this RDMP, we propose three levels of storage: offline working storage, offline backup storage, and online-cloud backup storage, located on a shared-repository. We also propose two kinds of cloud repository: a dynamic repository to store live data and a static repository to keep a copy of final data.\nHopefully, this RDMP could solve problems on data sharing and preservation, and additionally it could improve researchers' awareness about data management to increase the value and impact of their research efforts.\nKeywords: research data management plan, open data, data sharing, data repository, reproducible research\n\nIntroduction \nGood data management is capable of supporting scientific discovery[1], yet we have been observing a cultural barrier on data sharing.[2] More insights about data sharing and the diverse perceptions among scientists in various fields have been endlessly discussed.[3][4][5][6]\nEvery researcher needs data in their working ecosystem, but despite the resources (funding, time, and energy) they have spent to get the data, only a few are putting more real attention into data management.[7][8] A data management strategy is not just an administrative document; it also plays an important role in guiding researchers in storing, backing up, preserving, and sharing their research data in a proper and sustainable manner.\nThis paper 
describes guidelines for building a university-level research data management plan (RDMP) and how such a plan can promote data sharing among scientists. This RDMP would be the first to be developed at the university level in Indonesia. This project is in line with current developments in scientific practices mandating data sharing and data re-use. The goals of this RDMP project are to build awareness of data sharing and preservation among scientists, especially academic staff, and to build a practical and simple tool to help them manage their research data. More generally, the goal of an RDMP is to guide researchers in managing their data, including curating, storing, sharing, and preserving it for immediate and future use.\nThis RDMP proposal is largely drawn from our experience in developing an RDMP for an international research collaboration funded by RCUK (Research Councils UK).[9]\n\nDescription \nGeneral overview \nThe need for a proper RDMP was prompted by the difficulties researchers face in finding data from other researchers or previous research and in extracting data from reports. A further problem is the lack of guidelines, especially in Indonesia, on how to appropriately manage research data, store them, and keep them available in the long run. Clearly, scientists have difficulty knowing how to re-use datasets from prior research, how to cite them in their own work, and what the limitations of such re-use are.\nGiven the large effort required to obtain data in terms of funding, time, and energy, the longevity of data should be more than the one or two years that we find to be the general case in the Indonesian research ecosystem (Fig. 1).[9][10][11][12] Another important point to address is the set of barriers to data sharing, which include the fear of getting scooped, a lack of knowledge concerning intellectual property rights (IPR), and questions of data ownership. 
Therefore, by developing this document, we hope to address these barriers and, at the same time, offer another way to increase the value of research data instead of looking only at mainstream metrics.\n\nFigure 1. Current situation of the data lifecycle\n\nHow to use this article as a set of guidelines \nResearchers can use this article as an assessment form to describe the setting of their research and the data management requirements of a potential funder. Researchers can also develop a more detailed RDMP to cater to a specific project's environment. They should justify the setting of their research and the requirements of the funder regarding data sharing and data preservation.\n\nSeven components in RDMP \nThe proposed RDMP is divided into seven components:\n\nData collection\nDocumentation and metadata\nStorage and backup\nPreservation\nSharing and re-use\nResponsibilities and resources\nEthics and legal compliance\nReferences \nGiven the different nature of research, funders, and DMP standards, we refer to the following sources in developing this RDMP:\n\n Data sharing culture (Neylon 2017a[10], Neylon 2017b[11])\n Open data principles and reproducible research (Irawan et al. 2017[13])\n RDMP checklists or rubrics (Digital Curation Center 2014[14], Teperek et al. 2017[15], University of California Curation Center 2018[16])\n RDMP case studies from various fields of science (Neylon 2017c[17], Traynor 2017[18], Wael 2017[19], Woolfrey 2017[20])\nComponent 1: Data collection \nWhat types of data will you collect, create, link to, acquire and\/or record? 
\nThis RDMP covers the following types of data or documents, which are considered data sources:\n\n Raw data that may come in the following forms:\n any field or laboratory measurements collected during research\n any voice recording, and its transcript, from an interview or any other form of data collection\n any vector- or raster-based images\n any video recording, and its text caption, from an interview or any other form of data collection\n survey form responses from participants\n field notes or laboratory records\n Grant proposals: funders may request that researchers submit their research plan as a pre-registration document on platforms such as OSF or Curate Science\n Project-level RDMP: some funders, such as RCUK, mandate the submission of a final RDMP before the project begins\n Shared text, voice, or video recordings of communication between team members\n Reports: may appear as a preliminary report, mid-term report, final report, or short communications\n Preprints: preprints are accepted as research output by several funders[21]\n Maps\nWhat file formats will your data be collected in? Will these formats allow for data re-use, sharing and long-term access to the data? \nAlthough most researchers use Microsoft-based applications, and most open repositories accept and provide a native viewer for many formats, the following are our recommended formats. You may refer to University of Sydney RDMP file formats or Cornell University\u2019s preservation file formats for more information.\n\nSpreadsheets \nThey should be written in a text format, e.g., .csv (comma-separated values) or .txt (using tab-separated values). 
Data creators should format the spreadsheet in a \"database\" format by:\n\n starting the data immediately in cell (1,1);\n avoiding merging rows or columns; and\n using a correct and consistent cell format, e.g., number, string, date, time, and category.\nDocuments \nWe recommend a text-based (ASCII) file, e.g., .txt, Markdown, or any other text format that can be created and read using a plain text reader like Notepad.\n\nAudio\/video recordings \n Audio recordings: .wav or .mp3\n Video recordings: .mp4 or .mpg\nImages and maps \n General image: .jpg, .png, .bmp, .tiff\n Raster: GeoTIFF\n Vector: .shp\nEmails (project communications) \nAlthough most researchers now use proprietary email clients like Microsoft Outlook or Apple Mail, they still need to store selected emails in plain text as well.\n\nWhat conventions and procedures will you use to structure, name and version control your files to help you and others better understand how your data are organized? \nFiles are uploaded to an online repository and organized into folders by phase or by work package. If the file organization gets too complicated to fit a set folder structure, then it should be split into separate structures that are linked together. We recommend the following set of folders to organize the files.\nroot folder:\n\n data:\n raw\n processed\n analysis\n code (or script)\n tables\n figure or image\n output\n report\n presentation\n article (or manuscript)\nSome fields of research may have other specific folder arrangements, but generally they should have the components listed above. If some team members choose to maintain a Google Drive, Dropbox, OneDrive, or other cloud service, then they should make an accessible link to the drives or folders and register the links in the data repository. 
To accommodate limited storage, the principal investigator (PI), co-PI, and team members may also maintain an open repository, such as OSF, Figshare, Zenodo, GitHub, GitLab, and other similar services, given that such services offer version control and access options. All services should be linked together to a central repository. The team may also maintain a dedicated project website to store the data and related research documents, to keep track of the activities, and to store the project's repository or storage structure.\n\nComponent 2: Data documentation and metadata \nWhat documentation will be needed for the data to be read and interpreted correctly in the future? \nAll data will be preserved in open formats to ensure their readability in the future. Metadata should be attached to each data file or, in some instances, to a data folder. A README file should be included in the root folder, containing the folder structure, a general overview, and some context for the data.\n\nHow will you make sure that documentation is created or captured consistently throughout your project? \nAll deliverables (data, reports, presentations, preprints, etc.) should be recorded, listed, and stored in the project repository. A README file may be useful to describe the context, time frame, location, structure, and status of the files. Data staff (DS) may be assigned to check the status of the documentation.\n\nWhat metadata standard will be needed to describe your data? \nWe recommend the following minimum metadata schema for general data:\n\n Title of the dataset (see example)\n Abstract (to give context)\n Creator\n Contributor\n Publisher\n Funder\n Date of publication\n Resource type\n Location\n License\/rights\n Data structure\n Data size\n File format\nFor geospatial datasets, we refer to the ISO 19115-1:2003 geospatial metadata standard, which is also used by Badan Informasi Geospasial of Indonesia (the Indonesian Geospatial Information Agency). 
A minimum metadata schema for general datasets and general geodatasets can be found here.\n\nComponent 3: Storage and backup \nWhat are the anticipated storage requirements for your project, in terms of storage space (in megabytes, gigabytes, terabytes, etc.) and the length of time you will be storing it? \nWe anticipate less than five gigabytes of data and documents to be generated by the project. As far as possible, data will be deposited in long-term archives. A minimum of 10 years of preservation should be considered, but there are open repositories that provide longer preservation times, e.g., up to 50 years or more. Depositing data should begin at the start of the project and end by the time the final report is submitted to the project funder. An embargo period (maximum of two years) may be assigned if needed. Following the end of the embargo period, assigned data staff must keep the data publicly available for a minimum of 10 years.\n\nHow and where will your data be stored and backed up during your research project? 
\nData and documents are stored at three storage levels:\n\n working offline storage and at least one offline backup using a portable hard drive\n an online dynamic data repository using the university's available institutional repository and\/or open repository services like the OSF (maintained by the Center for Open Science), Figshare (maintained by Digital Science), or Zenodo (maintained by CERN)\n an online static data repository; an institutional repository can be used to store the final dataset and other documents\nWe suggest the following backup strategies:\n\n backups from offline working storage to portable media must be performed immediately; a daily backup is highly recommended\n back up to cloud storage or a repository at least once a week\n team members are advised to use a backup application such as Apple Time Machine or FreeFileSync\nHow will the research team and other collaborators access, modify, and contribute data throughout the project? \nThe research team, relevant members of the research team, and project participants will be granted access to the data repository and to other online services. Until the embargo period ends, access will be controlled through a unique user ID and password system. The minimum access for the above-mentioned parties will be \"read-write\" access, while an \"administrator\" role should be given to the PI and at least two other team members: one co-PI and one data staff member. After the embargo period ends, the data repository will be made public.\n\nComponent 4: Preservation \nWhere will you deposit your data for long-term preservation and access at the end of your research project? 
\nSelection of material \nThe following final materials will be kept available in the institutional repository and the OSF dynamic repository:\n\n data:\n raw data\n final processed data\n reports:\n preliminary report\n mid-term report\n final report\nAll intermediate and ongoing files, including data and other documents, will be made available in the OSF dynamic repository.\n\nPreservation \nLong-term preservation of publicly available data will be through appropriate repositories, including an institutional repository. More than one archive may be selected, using the LOCKSS principle or the FAIR principles for data sharing as the main criteria. In this case, we recommend the OSF dynamic repository and a static institutional repository.\n\nIndicate how you will ensure your data is preservation-ready. Consider preservation-friendly file formats, ensuring file integrity, anonymization and de-identification, inclusion of supporting documentation. \nFor all data generated from research, we may ask the data creator to convert them from any proprietary file formats to open formats for long-term preservation. Another option would be to have data staff (DS) assigned to work on file conversion. The data creator or DS should ensure the anonymization\/de-identification of sensitive data.\n\nComponent 5: Sharing and reuse \nWhat data will you be sharing and in what form (e.g., raw, processed, analyzed, final)? \nIn general, we recommend sharing raw, processed, analyzed, and final datasets. However, given the nature of the project, PIs may request another form of data sharing. They could complete a data assessment form in order to come up with an appropriate data sharing mechanism. 
PIs may have to:\n\n choose which types of data they think can be safely shared without breaching a data release agreement with other parties, and\n separate primary data obtained from another institution from the new primary data acquired by team members.\nHave you considered what type of end-user license to include with your data? \nWe recommend using moderate licenses, e.g., a CC-BY license, MIT license, or Academic Free License, as the default license for data and also for all resulting documents. However, the PI may propose another more lenient license such as the CC0 waiver or CC-BY-SA license. For sensitive data, PIs may suggest a more restrictive license.\n\nWhat steps will be taken to help the research community know that your data exists? \nAll data and the associated data repository should be discoverable through at least one indexing service, e.g., Google Scholar. Common repositories are now accessible via BASE and ONESearch (a feature from the Indonesia National Library and Archive). So that the data can be formally cited, we also recommend the use of a persistent identifier, e.g., a DOI from Crossref or DataCite.\n\nComponent 6: Responsibilities and resources \nIdentify who will be responsible for managing this project's data during and after the project and the major data management tasks for which they will be responsible. \nThe PI and an assigned DS are responsible for research data management. This includes converting files and classifying and managing the various research outputs identified in this RDMP, throughout the research cycle and during the lifetime of the data.\n\nHow will responsibilities for managing data activities be handled in case substantive changes happen in the personnel overseeing the project's data, including a change of principal investigator? \nIn the case of a change of PI or DS, responsibility will be transferred to one of the co-PIs or to a DS assigned by the PI or institution.\n\nWhat resources will you require to implement your data management plan? 
What do you estimate the overall cost for data management to be? \nAside from the data collection phase, the major costs of data management for the project are for the management and storage components. The management components should be funded by the research project, while storage is the responsibility of the university, or a PI may select a free, open repository.\n\nComponent 7: Ethics and legal compliance \nAn intellectual property rights (IPR) officer at the university level is very much needed in this case, but researchers should also have enough basic knowledge regarding this subject.\n\nIf your research project includes sensitive data, how will you ensure that it is securely managed and accessible only to approved members of the project? \nOne university-level or several faculty-level data stewards (DS) should be assigned to oversee the management of sensitive data and data management in general. Access to such data may be restricted to the PI, one of the co-PIs, and the DS. The DS will have a checklist form to help them assess the situation.\n\nIf applicable, what strategies will you undertake to address secondary uses of sensitive data? \nUsers must register to access the data or contact the university DS, filling out a sensitive data usage form. The form will then be evaluated by a university-level or faculty\/school-level DS, who should also consult with the data creator or original researcher.\n\nHow will you manage legal, ethical, and intellectual property issues? \nIP rights for the project are largely held by the university, or there could be joint IPR management for joint research activity. 
It should be clearly mentioned in the data agreement.\n\nAcknowledgements \nWe thank the following persons for their feedback and corrections to this article: the repository team of Institut Teknologi Bandung, Willem Vervoort and Gene Melzack (from University of Sydney), Driajana, Akhmad Riqqi, and Yudi Darma (from UDARA team), and also Sarah Lindley (from University of Manchester), open science community and INArxiv preprint server users.\n\nHosting institution \nThe university solely, or in case of a joint research, the hosting institution should be clearly stated in the data sharing and ownership agreement.\n\nAuthor contributions \nBoth authors contributed evenly to this article.\n\nConflicts of interest \nBoth authors declare no competing interest upon the publishing of this paper.\n\nReferences \n\n\n\u2191 Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J. et al.&#32;(2016).&#32;\"The FAIR Guiding Principles for scientific data management and stewardship\".&#32;Scientific Data&#32;3: 160018.&#32;doi:10.1038\/sdata.2016.18. &#160; \n\n\u2191 Davidson, J.; Jones, S.; Molloy, L. et al.&#32;(2014).&#32;\"Emerging Good Practice in Managing Research Data and Research Information within UK Universities\".&#32;Procedia Computer Science&#32;33: 215\u201322.&#32;doi:10.1016\/j.procs.2014.06.035. &#160; \n\n\u2191 Tenopir, C.; Allard, S.; Douglass, K. et al.&#32;(2011).&#32;\"Data sharing by scientists: Practices and perceptions\".&#32;PLoS One&#32;6&#32;(6): e21101.&#32;doi:10.1371\/journal.pone.0021101.&#32;PMC&#160;PMC3126798.&#32;PMID&#160;21738610.&#32;http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3126798 . &#160; \n\n\u2191 Tenopir, C.; Dalton, E.D.; Allard, S. 
et al. (2015). \"Changes in Data Sharing and Data Reuse Practices and Perceptions among Scientists Worldwide\". PLoS One 10 (8): e0134826. doi:10.1371\/journal.pone.0134826. PMC4550246. PMID 26308551. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4550246\n\n\u2191 van Panhuis, W.G.; Paul, P.; Emerson, C. et al. (2014). \"A systematic review of barriers to data sharing in public health\". BMC Public Health 14: 1144. doi:10.1186\/1471-2458-14-1144. PMC4239377. PMID 25377061. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4239377\n\n\u2191 Wallis, J.C.; Rolando, E.; Borgman, C.L. (2013). \"If we share data, will anyone use them? Data sharing and reuse in the long tail of science and technology\". PLoS One 8 (7): e67332. doi:10.1371\/journal.pone.0067332. PMC3720779. PMID 23935830. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3720779\n\n\u2191 Irawan, D.E. (24 April 2018). \"RDM policy and data archiving at university level -- technical bits -- an example from ITB\". Figshare. https:\/\/figshare.com\/articles\/RDM_policy_and_data_archiving_at_university_level_--_technical_bits_--_an_example_from_ITB\/6179084\/1\n\n\u2191 Irawan, D.E. (19 September 2017). \"A light introduction to research data management\". Figshare. https:\/\/figshare.com\/articles\/A_light_introduction_to_research_data_management\/5418694\/1\n\n\u2191 9.0 9.1 Irawan, D.E.; Rachmi, C.N. (15 May 2018). \"Promoting data sharing among Indonesian scientists: A proposal of generic university-level RDMP\". Open Science Framework. doi:10.17605\/OSF.IO\/59VCN. https:\/\/osf.io\/59vcn\/
\n\n\u2191 10.0 10.1 Neylon, C. (2017). \"Compliance Culture or Culture Change? The role of funders in improving data management and sharing practice amongst researchers\". Research Ideas and Outcomes 3: e14673. doi:10.3897\/rio.3.e14673.\n\n\u2191 11.0 11.1 Neylon, C. (2017). \"Building a Culture of Data Sharing: Policy Design and Implementation for Research Data Management in Development Research\". Research Ideas and Outcomes 3: e21773. doi:10.3897\/rio.3.e21773.\n\n\u2191 Neylon, C. (2017). \"Support Your Data: A Research Data Management Guide for Researchers\". Research Ideas and Outcomes 4: e26439. doi:10.3897\/rio.4.e26439.\n\n\u2191 Irawan, D.E.; Vervoort, R.W.; Melzack, G. (19 December 2017). \"Open Data Workshop SSEAC Usyd - ITB\". Open Science Framework. doi:10.17605\/OSF.IO\/S76GU. https:\/\/osf.io\/s76gu\/\n\n\u2191 Digital Curation Center (2014). \"Checklist for a Data Management Plan\". http:\/\/www.dcc.ac.uk\/resources\/data-management-plans\/checklist\n\n\u2191 Teperek, M.; Mollitt, B.; Southall, J.; Donaldson, M. (23 January 2017). \"Wellcome DMP assessment rubric v2.0\". Zenodo. doi:10.5281\/zenodo.257650. https:\/\/zenodo.org\/record\/257650\n\n\u2191 University of California Curation Center (2018). \"DMPTool\". Regents of the University of California. https:\/\/dmptool.org\/\n\n\u2191 Neylon, C. (2017). \"Data Management Plan: IDRC Data Sharing Pilot Project\". Research Ideas and Outcomes 3: e14672. doi:10.3897\/rio.3.e14672.\n\n\u2191 Traynor, C. (2017). \"Data Management Plan: Empowering Indigenous Peoples and Knowledge Systems Related to Climate Change and Intellectual Property Rights\". Research Ideas and Outcomes 3: e15111. doi:10.3897\/rio.3.e15111.
\n\n\u2191 Wael, R. (2017). \"Data Management Plan: HarassMap\". Research Ideas and Outcomes 3: e15133. doi:10.3897\/rio.3.e15133.\n\n\u2191 Woolfrey, L. (2017). \"Data Management Plan: Opening access to economic data to prevent tobacco related diseases in Africa\". Research Ideas and Outcomes 3: e14837. doi:10.3897\/rio.3.e14837.\n\n\u2191 Bourne, P.E.; Polka, J.K.; Vale, R.D.; Kiley, R. (2017). \"Ten simple rules to consider regarding preprint submission\". PLoS Computational Biology 13 (5): e1005473. doi:10.1371\/journal.pcbi.1005473. PMC5417409. PMID 28472041. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5417409\n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation, and grammar for improved readability. In some cases important information was missing from the references, and that information was added. 
The original article listed citations in alphabetical order, while this wiki lists them by order of appearance, by design.\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\">https:\/\/www.limswiki.org\/index.php\/Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)<\/a>\nCategories: LIMSwiki journal articles (added in 2018); LIMSwiki journal articles (all); LIMSwiki journal articles on data management and sharing; LIMSwiki journal articles on open data; LIMSwiki journal articles on research\n\n
This page was last modified on 4 September 2018, at 20:33.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n\n","bbf9b02ac710d05d03c94083fa4e01e0_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Promoting_data_sharing_among_Indonesian_scientists_A_proposal_of_a_generic_university-level_research_data_management_plan_RDMP skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Promoting data sharing among Indonesian scientists: A proposal of a generic university-level research data management plan (RDMP)<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p>Every researcher needs data in their working ecosystem, but despite the resources (funding, time, and energy) they have spent to get the 
data, only a few pay real attention to <a href=\"https:\/\/www.limswiki.org\/index.php\/Information_management\" title=\"Information management\" target=\"_blank\" class=\"wiki-link\" data-key=\"f8672d270c0750a858ed940158ca0a73\">data management<\/a>. This paper describes our recommendation for a research data management plan (RDMP) at the university level. It extends our initiative, which is to be developed at the university or national level, and is in line with current developments in scientific practices mandating data sharing and data re-use.\n<\/p><p>Researchers can use this article as an assessment form to describe the setting of their research and data management. Researchers can also develop a more detailed RDMP to cater to a specific project's environment. In this RDMP, we propose three levels of storage: offline working storage, offline backup storage, and online cloud backup storage located in a shared repository. We also propose two kinds of cloud repository: a dynamic repository to store live data and a static repository to keep a copy of final data.\n<\/p><p>We hope this RDMP can help solve problems in data sharing and preservation and improve researchers' awareness of data management, increasing the value and impact of their research efforts.\n<\/p><p><b>Keywords<\/b>: research data management plan, open data, data sharing, data repository, reproducible research\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>Good data management supports scientific discovery<sup id=\"rdp-ebb-cite_ref-WilkinsonTheFAIR16_1-0\" class=\"reference\"><a href=\"#cite_note-WilkinsonTheFAIR16-1\" rel=\"external_link\">[1]<\/a><\/sup>, yet we have observed a cultural barrier to data sharing.<sup id=\"rdp-ebb-cite_ref-DavidsonEmerging14_2-0\" class=\"reference\"><a href=\"#cite_note-DavidsonEmerging14-2\" rel=\"external_link\">[2]<\/a><\/sup> More 
insights about data sharing and the diverse perceptions among scientists in various fields have been discussed extensively.<sup id=\"rdp-ebb-cite_ref-TenopirData11_3-0\" class=\"reference\"><a href=\"#cite_note-TenopirData11-3\" rel=\"external_link\">[3]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-TenopirChanges15_4-0\" class=\"reference\"><a href=\"#cite_note-TenopirChanges15-4\" rel=\"external_link\">[4]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-VanPanhuisASys14_5-0\" class=\"reference\"><a href=\"#cite_note-VanPanhuisASys14-5\" rel=\"external_link\">[5]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-WallisIfWe13_6-0\" class=\"reference\"><a href=\"#cite_note-WallisIfWe13-6\" rel=\"external_link\">[6]<\/a><\/sup>\n<\/p><p>Every researcher needs data in their working ecosystem, but despite the resources (funding, time, and energy) they have spent to get the data, only a few pay real attention to data management.<sup id=\"rdp-ebb-cite_ref-IrawanRDM18_7-0\" class=\"reference\"><a href=\"#cite_note-IrawanRDM18-7\" rel=\"external_link\">[7]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-IrawanALight17_8-0\" class=\"reference\"><a href=\"#cite_note-IrawanALight17-8\" rel=\"external_link\">[8]<\/a><\/sup> A data management strategy is not just an administrative document; it also plays an important role in guiding researchers in storing, backing up, preserving, and sharing their research data in a proper and sustainable manner.\n<\/p><p>This paper describes guidelines for building a university-level research data management plan (RDMP) and how it can promote data sharing among scientists. This RDMP would be the first one to be developed at the university level in Indonesia. This project is in line with current developments in scientific practices mandating data sharing and data re-use. 
The goals of this RDMP project are to build awareness of data sharing and preservation among scientists, especially academic staff, and to build a practical and simple tool to help them manage their research data. More generally, an RDMP guides researchers in managing their data, including curating, storing, sharing, and preserving it for immediate and future use.\n<\/p><p>This RDMP proposal is largely drawn from our experience in developing an RDMP for an international research collaboration funded by RCUK (Research Councils UK).<sup id=\"rdp-ebb-cite_ref-IrawanPromoting18_9-0\" class=\"reference\"><a href=\"#cite_note-IrawanPromoting18-9\" rel=\"external_link\">[9]<\/a><\/sup>\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Description\">Description<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"General_overview\">General overview<\/span><\/h3>\n<p>The need for a proper RDMP was prompted by the difficulties researchers face in finding data from other researchers or previous research and in extracting data from reports. Another problem is the lack of guidelines, especially in Indonesia, on how to appropriately manage research data, store them, and keep them available in the long run. Scientists clearly struggle with how to re-use datasets from prior research, how to cite them in their own work, and how to recognize the limitations of such re-use.\n<\/p><p>Given the large effort required to obtain data in terms of funding, time, and energy, data should remain usable for longer than the one or two years that we find to be the general case in the Indonesian research ecosystem (Fig. 
1).<sup id=\"rdp-ebb-cite_ref-IrawanPromoting18_9-1\" class=\"reference\"><a href=\"#cite_note-IrawanPromoting18-9\" rel=\"external_link\">[9]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-NeylonCompliance17_10-0\" class=\"reference\"><a href=\"#cite_note-NeylonCompliance17-10\" rel=\"external_link\">[10]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-NeylonBuilding17_11-0\" class=\"reference\"><a href=\"#cite_note-NeylonBuilding17-11\" rel=\"external_link\">[11]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-BorghiSupport17_12-0\" class=\"reference\"><a href=\"#cite_note-BorghiSupport17-12\" rel=\"external_link\">[12]<\/a><\/sup> Another important point to address is the set of barriers to data sharing: the fear of getting scooped and the lack of knowledge concerning intellectual property rights (IPR) and data ownership. By developing this document, we hope to lower these barriers and, at the same time, offer another way to increase the value of research data beyond mainstream metrics.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Irawan_ResIdeasOut2018_4.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"39627f3b818a2237d4aa3a8359cf5358\"><img alt=\"Fig1 Irawan ResIdeasOut2018 4.png\" src=\"https:\/\/www.limswiki.org\/images\/0\/06\/Fig1_Irawan_ResIdeasOut2018_4.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1.<\/b> Current situation of the data lifecycle<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"How_to_use_this_article_as_a_set_of_guidelines\">How to use this article as a set of guidelines<\/span><\/h3>\n<p>Researchers can use this article as an 
assessment form to describe the setting of their research and data management requirements from a potential funder. Researchers can also develop a more detailed RDMP to cater to a specific project's environment. They should justify the setting of their research and requirement of the funder regarding data sharing and data preservation.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Seven_components_in_RDMP\">Seven components in RDMP<\/span><\/h3>\n<p>The proposed RDMP is divided into seven components:\n<\/p>\n<ol><li>Data collection<\/li>\n<li>Documentation and metadata<\/li>\n<li>Storage and backup<\/li>\n<li>Preservation<\/li>\n<li>Sharing and re-use<\/li>\n<li>Responsibilities and resources<\/li>\n<li>Ethics and legal compliance<\/li><\/ol>\n<h3><span class=\"mw-headline\" id=\"References\">References<\/span><\/h3>\n<p>Given the different nature of research, funders, and DMP standards, we refer to the following sources in developing this RDMP:\n<\/p>\n<ul><li> Data sharing culture (Neylon 2017a<sup id=\"rdp-ebb-cite_ref-NeylonCompliance17_10-1\" class=\"reference\"><a href=\"#cite_note-NeylonCompliance17-10\" rel=\"external_link\">[10]<\/a><\/sup>, Neylon 2017b<sup id=\"rdp-ebb-cite_ref-NeylonBuilding17_11-1\" class=\"reference\"><a href=\"#cite_note-NeylonBuilding17-11\" rel=\"external_link\">[11]<\/a><\/sup>)<\/li>\n<li> Open data principles and reproducible research (Irawan <i>et al.<\/i> 2017<sup id=\"rdp-ebb-cite_ref-IrawanOpenData17_13-0\" class=\"reference\"><a href=\"#cite_note-IrawanOpenData17-13\" rel=\"external_link\">[13]<\/a><\/sup>)<\/li>\n<li> RDMP check lists or rubric (Digital Curation Center 2014<sup id=\"rdp-ebb-cite_ref-DCCChecklist_14-0\" class=\"reference\"><a href=\"#cite_note-DCCChecklist-14\" rel=\"external_link\">[14]<\/a><\/sup>, Teperek <i>et al.<\/i> 2017<sup id=\"rdp-ebb-cite_ref-TeperekWellcome17_15-0\" class=\"reference\"><a href=\"#cite_note-TeperekWellcome17-15\" rel=\"external_link\">[15]<\/a><\/sup>, University of California 
Curation Center 2018<sup id=\"rdp-ebb-cite_ref-UCCC_DMPTool_16-0\" class=\"reference\"><a href=\"#cite_note-UCCC_DMPTool-16\" rel=\"external_link\">[16]<\/a><\/sup>)<\/li>\n<li> RDMP case studies from various fields of science (Neylon 2017c<sup id=\"rdp-ebb-cite_ref-NeylonDataMan17_17-0\" class=\"reference\"><a href=\"#cite_note-NeylonDataMan17-17\" rel=\"external_link\">[17]<\/a><\/sup>, Traynor 2017<sup id=\"rdp-ebb-cite_ref-TraynorDataMan17_18-0\" class=\"reference\"><a href=\"#cite_note-TraynorDataMan17-18\" rel=\"external_link\">[18]<\/a><\/sup>, Wael 2017<sup id=\"rdp-ebb-cite_ref-WaelDataMan17_19-0\" class=\"reference\"><a href=\"#cite_note-WaelDataMan17-19\" rel=\"external_link\">[19]<\/a><\/sup>, Woolfrey 2017<sup id=\"rdp-ebb-cite_ref-WoolfreyDataMan17_20-0\" class=\"reference\"><a href=\"#cite_note-WoolfreyDataMan17-20\" rel=\"external_link\">[20]<\/a><\/sup>)<\/li><\/ul>\n<h2><span class=\"mw-headline\" id=\"Component_1:_Data_collection\">Component 1: Data collection<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"What_types_of_data_will_you_collect.2C_create.2C_link_to.2C_acquire_and.2For_record.3F\">What types of data will you collect, create, link to, acquire and\/or record?<\/span><\/h3>\n<p>This RDMP covers the following types of data or documents, which are considered data sources:\n<\/p>\n<ul><li> Raw data that may come in the following forms:\n<ul><li> any field or <a href=\"https:\/\/www.limswiki.org\/index.php\/Laboratory\" title=\"Laboratory\" target=\"_blank\" class=\"wiki-link\" data-key=\"c57fc5aac9e4abf31dccae81df664c33\">laboratory<\/a> measurements collected during research<\/li>\n<li> any voice recording, and its transcript, of an interview or any other form of data collection<\/li>\n<li> any vector- or raster-based images<\/li>\n<li> any video recording, and its text captions, of an interview or any other form of data collection<\/li>\n<li> survey form responses from participants<\/li>\n<li> field notes or laboratory 
records<\/li><\/ul><\/li>\n<li> Grant proposals: funders may request researchers to submit their research plan as a pre-registration document on platforms such as OSF or Curate Science<\/li>\n<li> Project-level RDMP: some funders, such as RCUK, mandate the submission of a final RDMP before the project begins<\/li>\n<li> Shared text, voice, or video recordings of communication between team members<\/li>\n<li> Reports: may appear as a preliminary report, mid-term report, final report, or short communications<\/li>\n<li> Preprints: preprints have been accepted as research outputs by several funders<sup id=\"rdp-ebb-cite_ref-BourneTenSimp17_21-0\" class=\"reference\"><a href=\"#cite_note-BourneTenSimp17-21\" rel=\"external_link\">[21]<\/a><\/sup><\/li>\n<li> Maps<\/li><\/ul>\n<h3><span class=\"mw-headline\" id=\"What_file_formats_will_your_data_be_collected_in.3F_Will_these_formats_allow_for_data_re-use.2C_sharing_and_long-term_access_to_the_data.3F\">What file formats will your data be collected in? Will these formats allow for data re-use, sharing and long-term access to the data?<\/span><\/h3>\n<p>Although most researchers use Microsoft-based applications, and most open repositories accept and provide a native viewer for many formats, we recommend the following formats. You may refer to <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/library.sydney.edu.au\/research\/data-management\/research-data-management-plans.html\" target=\"_blank\">University of Sydney RDMP file formats<\/a> or <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/guides.library.cornell.edu\/ecommons\/formats\" target=\"_blank\">Cornell University\u2019s preservation file formats<\/a> for more information.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Spreadsheets\">Spreadsheets<\/span><\/h4>\n<p>Spreadsheets should be saved in a text format, e.g., .csv (comma-separated values) or .txt (tab-separated values). 
Data creators should format the spreadsheet in a \"database\" format by:\n<\/p>\n<ul><li> starting the data immediately in cell (1,1);<\/li>\n<li> avoiding merging rows or columns; and<\/li>\n<li> using a correct and consistent cell format, e.g., number, string, date, time, and category.<\/li><\/ul>\n<h4><span class=\"mw-headline\" id=\"Documents\">Documents<\/span><\/h4>\n<p>We recommend a text-based (ASCII) file, e.g., .txt, Markdown, or any other text format that can be created and read using a plain text editor like Notepad.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Audio.2Fvideo_recordings\">Audio\/video recordings<\/span><\/h4>\n<ul><li> Audio recordings: .wav or .mp3<\/li>\n<li> Video recordings: .mp4 or .mpg<\/li><\/ul>\n<h4><span class=\"mw-headline\" id=\"Images_and_maps\">Images and maps<\/span><\/h4>\n<ul><li> General image: .jpg, .png, .bmp, .tiff<\/li>\n<li> Raster: GeoTIFF<\/li>\n<li> Vector: .shp<\/li><\/ul>\n<h4><span class=\"mw-headline\" id=\"Emails_.28project_communications.29\">Emails (project communications)<\/span><\/h4>\n<p>Although most researchers now use proprietary email clients like Microsoft Outlook or Apple Mail, they should still store selected emails in plain text as well.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"What_conventions_and_procedures_will_you_use_to_structure.2C_name_and_version_control_your_files_to_help_you_and_others_better_understand_how_your_data_are_organized.3F\">What conventions and procedures will you use to structure, name and version control your files to help you and others better understand how your data are organized?<\/span><\/h3>\n<p>Files are uploaded to an online repository and organized into folders by phase or by work package. If the file organization gets too complicated to fit a set folder structure, then it should be split into separate parts that are linked together. 
We recommend the following set of folders to organize the files.\n<\/p><p>root folder:\n<\/p>\n<ul><li> data\n<ul><li> raw<\/li>\n<li> processed<\/li><\/ul><\/li>\n<li> analysis\n<ul><li> code (or script)<\/li>\n<li> tables<\/li>\n<li> figure or image<\/li><\/ul><\/li>\n<li> output\n<ul><li> report<\/li>\n<li> presentation<\/li>\n<li> article (or manuscript)<\/li><\/ul><\/li><\/ul>\n<p>Some fields of research may have other specific folder arrangements, but generally they should have the components listed above. If some team members choose to maintain a Google Drive, Dropbox, OneDrive, or other cloud service, then they should create an accessible link to the drives or folders and register the links in the data repository. To accommodate limited storage, the principal investigator (PI), co-PI, and team members may also maintain an open repository, such as OSF, Figshare, Zenodo, GitHub, GitLab, or other similar services, provided that such services offer version control and access control features. All services should be linked to a central repository. The team may also maintain a dedicated project website to store the data and related research documents, to keep track of the activities, and to store the project's repository or storage structure.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Component_2:_Data_documentation_and_metadata\">Component 2: Data documentation and metadata<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"What_documentation_will_be_needed_for_the_data_to_be_read_and_interpreted_correctly_in_the_future.3F\">What documentation will be needed for the data to be read and interpreted correctly in the future?<\/span><\/h3>\n<p>All data will be preserved in open formats to ensure its readability in the future. Metadata should be attached to each data file or, in some instances, to a data folder. 
A README file should be included in the root folder, describing the folder structure, a general overview, and some context for the data.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"How_will_you_make_sure_that_documentation_is_created_or_captured_consistently_throughout_your_project.3F\">How will you make sure that documentation is created or captured consistently throughout your project?<\/span><\/h3>\n<p>All deliverables (data, reports, presentations, preprints, etc.) should be recorded, listed, and stored in the project repository. A README file may be useful to describe the context, time frame, location, structure, and status of the files. Data staff (DS) may be assigned to check the status of the documentation.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"What_metadata_standard_will_be_needed_to_describe_your_data.3F\">What metadata standard will be needed to describe your data?<\/span><\/h3>\n<p>We recommend the following minimum metadata schema for general data:\n<\/p>\n<ul><li> Title of the dataset (see example)<\/li>\n<li> Abstract (to give context)<\/li>\n<li> Creator<\/li>\n<li> Contributor<\/li>\n<li> Publisher<\/li>\n<li> Funder<\/li>\n<li> Date of publication<\/li>\n<li> Resource type<\/li>\n<li> Location<\/li>\n<li> License\/rights<\/li>\n<li> Data structure<\/li>\n<li> Data size<\/li>\n<li> File format<\/li><\/ul>\n<p>For geospatial datasets, we refer to the ISO 19115-1:2003 geospatial metadata standard, which is also used by Badan Informasi Geospasial of Indonesia (Indonesia Board of Geospatial Information). 
A minimum metadata schema for general datasets and general geodatasets can be <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/docs.google.com\/spreadsheets\/d\/1L2jyIILIIEBuhyFpOREyYZOolx0tn0jkiZNsrCa4Y6M\/edit#gid=0\" target=\"_blank\">found here<\/a>.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Component_3:_Storage_and_backup\">Component 3: Storage and backup<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"What_are_the_anticipated_storage_requirements_for_your_project.2C_in_terms_of_storage_space_.28in_megabytes.2C_gigabytes.2C_terabytes.2C_etc..29_and_the_length_of_time_you_will_be_storing_it.3F\">What are the anticipated storage requirements for your project, in terms of storage space (in megabytes, gigabytes, terabytes, etc.) and the length of time you will be storing it?<\/span><\/h3>\n<p>We anticipate less than five gigabytes of data and documents to be generated by the project. As far as possible, data will be deposited in long-term archives. A minimum of 10 years of preservation should be considered, but there are open repositories that provide longer preservation times, e.g., up to 50 years or more. Data deposition should start at the beginning of the project and be completed by the time the final report is submitted to the project funder. An embargo period (maximum of two years) may be assigned if needed. 
Following the end of the embargo period, assigned data staff must make the data publicly available for a minimum of 10 years.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"How_and_where_will_your_data_be_stored_and_backed_up_during_your_research_project.3F\">How and where will your data be stored and backed up during your research project?<\/span><\/h3>\n<p>Data and documents are stored at three storage levels:\n<\/p>\n<ul><li> working offline storage and at least one offline backup using a portable hard drive<\/li>\n<li> an online dynamic data repository using the university's available institutional repository and\/or open repository services like the OSF (maintained by Center for Open Science), Figshare (maintained by Digital Science), or Zenodo (maintained by CERN)<\/li>\n<li> an online static data repository; an institutional repository can be used to store the final dataset and other documents<\/li><\/ul>\n<p>We suggest the following backup strategies:\n<\/p>\n<ul><li> backup from offline working storage to portable media must be performed immediately; daily backup is highly recommended<\/li>\n<li> back up to cloud storage or a repository at least once a week<\/li>\n<li> team members are encouraged to use a backup application such as Apple Time Machine or FreeFileSync<\/li><\/ul>\n<h3><span class=\"mw-headline\" id=\"How_will_the_research_team_and_other_collaborators_access.2C_modify.2C_and_contribute_data_throughout_the_project.3F\">How will the research team and other collaborators access, modify, and contribute data throughout the project?<\/span><\/h3>\n<p>The research team, relevant members of the research team, and project participants will be granted access to the data repository and to other online services. Access will be controlled through a unique user ID and password system until the embargo period ends. 
The minimum access for the above-mentioned parties will be \"read-write\" access, while an \"administrator\" role should be given to the PI and at least two other team members: one co-PI and one member of the data staff. After the embargo period ends, the data repository will be made public.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Component_4:_Preservation\">Component 4: Preservation<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Where_will_you_deposit_your_data_for_long-term_preservation_and_access_at_the_end_of_your_research_project.3F\">Where will you deposit your data for long-term preservation and access at the end of your research project?<\/span><\/h3>\n<h4><span class=\"mw-headline\" id=\"Selection_of_material\">Selection of material<\/span><\/h4>\n<p>The following final materials will be kept available in the institutional repository and the OSF dynamic repository:\n<\/p>\n<ul><li> data:\n<ul><li> raw data<\/li>\n<li> final processed data<\/li><\/ul><\/li>\n<li> reports:\n<ul><li> preliminary report<\/li>\n<li> mid-term report<\/li>\n<li> final report<\/li><\/ul><\/li><\/ul>\n<p>All intermediate and ongoing files, including data and other documents, will be made available in the OSF dynamic repository.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Preservation\">Preservation<\/span><\/h4>\n<p>Long-term preservation of publicly available data will be through appropriate repositories, including the institutional repository. More than one archive may be selected using the <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.lockss.org\/about\/principles\/\" target=\"_blank\">LOCKSS principles<\/a> or the <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.force11.org\/group\/fairgroup\/fairprinciples\" target=\"_blank\">FAIR principles for data sharing<\/a> as the main criteria. 
In this case, we recommend the OSF dynamic repository and static institutional repository.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Indicate_how_you_will_ensure_your_data_is_preservation-ready._Consider_preservation-friendly_file_formats.2C_ensuring_file_integrity.2C_anonymization_and_de-identification.2C_inclusion_of_supporting_documentation.\">Indicate how you will ensure your data is preservation-ready. Consider preservation-friendly file formats, ensuring file integrity, anonymization and de-identification, inclusion of supporting documentation.<\/span><\/h3>\n<p>For all data generated from research, we may ask the data creator to convert it from any proprietary file formats to open formats for long term preservation. Another option would be to have data staff (DS) assigned to work on file conversion. The data creator or DS should ensure the anonymization\/de-identification of sensitive data.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Component_5:_Sharing_and_reuse\">Component 5: Sharing and reuse<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"What_data_will_you_be_sharing_and_in_what_form_.28e.g..2C_raw.2C_processed.2C_analyzed.2C_final.29.3F\">What data will you be sharing and in what form (e.g., raw, processed, analyzed, final)?<\/span><\/h3>\n<p>In a general sense, we recommend sharing raw, processed, analyzed, and final datasets. However, given the nature of the project, PIs may appeal for another form of data sharing. They could complete a data assessment form in order to come up with an appropriate data sharing mechanism. 
PIs may have to:\n<\/p>\n<ul><li> choose which types of data can be safely shared without breaching a data release agreement with other parties, and<\/li>\n<li> separate primary data obtained from another institution from the new primary data acquired by team members.<\/li><\/ul>\n<h3><span class=\"mw-headline\" id=\"Have_you_considered_what_type_of_end-user_license_to_include_with_your_data.3F\">Have you considered what type of end-user license to include with your data?<\/span><\/h3>\n<p>We recommend a moderate license, e.g., the CC-BY license, MIT license, or Academic Free License, as the default license for data and also for all resulting documents. However, the PI may propose a different license, such as the more lenient CC0 waiver or the CC-BY-SA license. For sensitive data, PIs may suggest a more restrictive license.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"What_steps_will_be_taken_to_help_the_research_community_know_that_your_data_exists.3F\">What steps will be taken to help the research community know that your data exists?<\/span><\/h3>\n<p>All data and the associated data repository should be discoverable through at least one indexing service, e.g., Google Scholar. Common repositories are now accessible via <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.base-search.net\/about\/en\/\" target=\"_blank\">BASE<\/a> and <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/onesearch.id\/\" target=\"_blank\">ONESearch<\/a> (a feature from the Indonesia National Library and Archive). 
So that the data can be formally cited, we also recommend the use of a persistent link, e.g., a DOI from <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/crossref.org\/\" target=\"_blank\">Crossref<\/a> or <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/datacite.org\/\" target=\"_blank\">DataCite<\/a>.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Component_6:_Responsibilities_and_resources\">Component 6: Responsibilities and resources<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Identify_who_will_be_responsible_for_managing_this_project.27s_data_during_and_after_the_project_and_the_major_data_management_tasks_for_which_they_will_be_responsible.\">Identify who will be responsible for managing this project's data during and after the project and the major data management tasks for which they will be responsible.<\/span><\/h3>\n<p>The PI and an assigned DS are responsible for research data management. This includes converting files and classifying and managing the various research outputs identified in this RDMP, throughout the research cycle and during the lifetime of the data.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"How_will_responsibilities_for_managing_data_activities_be_handled_in_case_substantive_changes_happen_in_the_personnel_overseeing_the_project.27s_data.2C_including_a_change_of_principal_investigator.3F\">How will responsibilities for managing data activities be handled in case substantive changes happen in the personnel overseeing the project's data, including a change of principal investigator?<\/span><\/h3>\n<p>In the case of a change of PI or DS, responsibility will be transferred to one of the co-PIs or to a DS assigned by the PI or institution.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"What_resources_will_you_require_to_implement_your_data_management_plan.3F_What_do_you_estimate_the_overall_cost_for_data_management_to_be.3F\">What resources will you require to implement your data management plan? 
What do you estimate the overall cost for data management to be?<\/span><\/h3>\n<p>Aside from the data collection phase, the major costs of data management for the project are for management and storage components. The management components should be funded by the research project, while storage is the responsibility of the university; alternatively, a PI may select a free, open repository.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Component_7:_Ethics_and_legal_compliance\">Component 7: Ethics and legal compliance<\/span><\/h2>\n<p>An intellectual property rights (IPR) officer at the university level is very much needed in this case, but researchers should also have enough basic knowledge regarding this subject.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"If_your_research_project_includes_sensitive_data.2C_how_will_you_ensure_that_it_is_securely_managed_and_accessible_only_to_approved_members_of_the_project.3F\">If your research project includes sensitive data, how will you ensure that it is securely managed and accessible only to approved members of the project?<\/span><\/h3>\n<p>A university-level data steward (DS) or several faculty-level DSs should be assigned to oversee the management of sensitive data and data management in general. Access to such data may be restricted to the PI, one of the co-PIs, and the DS. The DS will have a checklist form to help them assess the situation.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"If_applicable.2C_what_strategies_will_you_undertake_to_address_secondary_uses_of_sensitive_data.3F\">If applicable, what strategies will you undertake to address secondary uses of sensitive data?<\/span><\/h3>\n<p>Users must register to access the data, or contact the university DS, and fill out a sensitive data usage form. 
The form will then be evaluated by a university-level or faculty\/school-level DS, who should also consult the data creator or original researcher.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"How_will_you_manage_legal.2C_ethical.2C_and_intellectual_property_issues.3F\">How will you manage legal, ethical, and intellectual property issues?<\/span><\/h3>\n<p>IP rights for the project are largely held by the university, or there could be joint IPR management for joint research activity. The arrangement should be clearly stated in the data agreement.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<p>We thank the following persons for their feedback and corrections to this article: the repository team of Institut Teknologi Bandung, Willem Vervoort and Gene Melzack (from the University of Sydney), Driajana, Akhmad Riqqi, and Yudi Darma (from the UDARA team), Sarah Lindley (from the University of Manchester), the open science community, and INArxiv preprint server users.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Hosting_institution\">Hosting institution<\/span><\/h3>\n<p>The university is the sole hosting institution; in the case of joint research, the hosting institution should be clearly stated in the data sharing and ownership agreement.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Author_contributions\">Author contributions<\/span><\/h3>\n<p>Both authors contributed equally to this article.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Conflicts_of_interest\">Conflicts of interest<\/span><\/h3>\n<p>Both authors declare no competing interests regarding the publication of this paper.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References_2\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-WilkinsonTheFAIR16-1\"><span class=\"mw-cite-backlink\"><a 
href=\"#cite_ref-WilkinsonTheFAIR16_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J. et al.&#32;(2016).&#32;\"The FAIR Guiding Principles for scientific data management and stewardship\".&#32;<i>Scientific Data<\/i>&#32;<b>3<\/b>: 160018.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fsdata.2016.18\" target=\"_blank\">10.1038\/sdata.2016.18<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+FAIR+Guiding+Principles+for+scientific+data+management+and+stewardship&amp;rft.jtitle=Scientific+Data&amp;rft.aulast=Wilkinson%2C+M.D.%3B+Dumontier%2C+M.%3B+Aalbersberg%2C+I.J.+et+al.&amp;rft.au=Wilkinson%2C+M.D.%3B+Dumontier%2C+M.%3B+Aalbersberg%2C+I.J.+et+al.&amp;rft.date=2016&amp;rft.volume=3&amp;rft.pages=160018&amp;rft_id=info:doi\/10.1038%2Fsdata.2016.18&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DavidsonEmerging14-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DavidsonEmerging14_2-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Davidson, J.; Jones, S.; Molloy, L. 
et al.&#32;(2014).&#32;\"Emerging Good Practice in Managing Research Data and Research Information within UK Universities\".&#32;<i>Procedia Computer Science<\/i>&#32;<b>33<\/b>: 215\u201322.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.procs.2014.06.035\" target=\"_blank\">10.1016\/j.procs.2014.06.035<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Emerging+Good+Practice+in+Managing+Research+Data+and+Research+Information+within+UK+Universities&amp;rft.jtitle=Procedia+Computer+Science&amp;rft.aulast=Davidson%2C+J.%3B+Jones%2C+S.%3B+Molloy%2C+L.+et+al.&amp;rft.au=Davidson%2C+J.%3B+Jones%2C+S.%3B+Molloy%2C+L.+et+al.&amp;rft.date=2014&amp;rft.volume=33&amp;rft.pages=215%E2%80%9322&amp;rft_id=info:doi\/10.1016%2Fj.procs.2014.06.035&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TenopirData11-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-TenopirData11_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Tenopir, C.; Allard, S.; Douglass, K. 
et al.&#32;(2011).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3126798\" target=\"_blank\">\"Data sharing by scientists: Practices and perceptions\"<\/a>.&#32;<i>PLoS One<\/i>&#32;<b>6<\/b>&#32;(6): e21101.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pone.0021101\" target=\"_blank\">10.1371\/journal.pone.0021101<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3126798\/\" target=\"_blank\">PMC3126798<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/21738610\" target=\"_blank\">21738610<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3126798\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3126798<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Data+sharing+by+scientists%3A+Practices+and+perceptions&amp;rft.jtitle=PLoS+One&amp;rft.aulast=Tenopir%2C+C.%3B+Allard%2C+S.%3B+Douglass%2C+K.+et+al.&amp;rft.au=Tenopir%2C+C.%3B+Allard%2C+S.%3B+Douglass%2C+K.+et+al.&amp;rft.date=2011&amp;rft.volume=6&amp;rft.issue=6&amp;rft.pages=e21101&amp;rft_id=info:doi\/10.1371%2Fjournal.pone.0021101&amp;rft_id=info:pmc\/PMC3126798&amp;rft_id=info:pmid\/21738610&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3126798&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TenopirChanges15-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-TenopirChanges15_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Tenopir, C.; Dalton, E.D.; Allard, S. 
et al.&#32;(2015).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4550246\" target=\"_blank\">\"Changes in Data Sharing and Data Reuse Practices and Perceptions among Scientists Worldwide\"<\/a>.&#32;<i>PLoS One<\/i>&#32;<b>10<\/b>&#32;(8): e0134826.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pone.0134826\" target=\"_blank\">10.1371\/journal.pone.0134826<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4550246\/\" target=\"_blank\">PMC4550246<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26308551\" target=\"_blank\">26308551<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4550246\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4550246<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Changes+in+Data+Sharing+and+Data+Reuse+Practices+and+Perceptions+among+Scientists+Worldwide&amp;rft.jtitle=PLoS+One&amp;rft.aulast=Tenopir%2C+C.%3B+Dalton%2C+E.D.%3B+Allard%2C+S.+et+al.&amp;rft.au=Tenopir%2C+C.%3B+Dalton%2C+E.D.%3B+Allard%2C+S.+et+al.&amp;rft.date=2015&amp;rft.volume=10&amp;rft.issue=8&amp;rft.pages=e0134826&amp;rft_id=info:doi\/10.1371%2Fjournal.pone.0134826&amp;rft_id=info:pmc\/PMC4550246&amp;rft_id=info:pmid\/26308551&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4550246&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-VanPanhuisASys14-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-VanPanhuisASys14_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">van Panhuis, W.G.; Paul, P.; Emerson, C. 
et al.&#32;(2014).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4239377\" target=\"_blank\">\"A systematic review of barriers to data sharing in public health\"<\/a>.&#32;<i>BMC Public Health<\/i>&#32;<b>14<\/b>: 1144.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2F1471-2458-14-1144\" target=\"_blank\">10.1186\/1471-2458-14-1144<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4239377\/\" target=\"_blank\">PMC4239377<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25377061\" target=\"_blank\">25377061<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4239377\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4239377<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+systematic+review+of+barriers+to+data+sharing+in+public+health&amp;rft.jtitle=BMC+Public+Health&amp;rft.aulast=van+Panhuis%2C+W.G.%3B+Paul%2C+P.%3B+Emerson%2C+C.+et+al.&amp;rft.au=van+Panhuis%2C+W.G.%3B+Paul%2C+P.%3B+Emerson%2C+C.+et+al.&amp;rft.date=2014&amp;rft.volume=14&amp;rft.pages=1144&amp;rft_id=info:doi\/10.1186%2F1471-2458-14-1144&amp;rft_id=info:pmc\/PMC4239377&amp;rft_id=info:pmid\/25377061&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4239377&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WallisIfWe13-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WallisIfWe13_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wallis, J.C.; Rolando, E.; Borgman, C.L.&#32;(2013).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3720779\" target=\"_blank\">\"If we share data, will anyone use them? 
Data sharing and reuse in the long tail of science and technology\"<\/a>.&#32;<i>PLoS One<\/i>&#32;<b>8<\/b>&#32;(7): e67332.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pone.0067332\" target=\"_blank\">10.1371\/journal.pone.0067332<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3720779\/\" target=\"_blank\">PMC3720779<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23935830\" target=\"_blank\">23935830<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3720779\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3720779<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=If+we+share+data%2C+will+anyone+use+them%3F+Data+sharing+and+reuse+in+the+long+tail+of+science+and+technology&amp;rft.jtitle=PLoS+One&amp;rft.aulast=Wallis%2C+J.C.%3B+Rolando%2C+E.%3B+Borgman%2C+C.L.&amp;rft.au=Wallis%2C+J.C.%3B+Rolando%2C+E.%3B+Borgman%2C+C.L.&amp;rft.date=2013&amp;rft.volume=8&amp;rft.issue=7&amp;rft.pages=e67332&amp;rft_id=info:doi\/10.1371%2Fjournal.pone.0067332&amp;rft_id=info:pmc\/PMC3720779&amp;rft_id=info:pmid\/23935830&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3720779&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-IrawanRDM18-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-IrawanRDM18_7-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Irawan, D.E.&#32;(24 April 2018).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/figshare.com\/articles\/RDM_policy_and_data_archiving_at_university_level_--_technical_bits_--_an_example_from_ITB\/6179084\/1\" target=\"_blank\">\"RDM policy and data archiving at university level -- technical bits -- an example from ITB\"<\/a>.&#32;<i>Figshare<\/i><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/figshare.com\/articles\/RDM_policy_and_data_archiving_at_university_level_--_technical_bits_--_an_example_from_ITB\/6179084\/1\" target=\"_blank\">https:\/\/figshare.com\/articles\/RDM_policy_and_data_archiving_at_university_level_--_technical_bits_--_an_example_from_ITB\/6179084\/1<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=RDM+policy+and+data+archiving+at+university+level+--+technical+bits+--+an+example+from+ITB&amp;rft.atitle=Figshare&amp;rft.aulast=Irawan%2C+D.E.&amp;rft.au=Irawan%2C+D.E.&amp;rft.date=24+April+2018&amp;rft_id=https%3A%2F%2Ffigshare.com%2Farticles%2FRDM_policy_and_data_archiving_at_university_level_--_technical_bits_--_an_example_from_ITB%2F6179084%2F1&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-IrawanALight17-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-IrawanALight17_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Irawan, D.E.&#32;(19 September 2017).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/figshare.com\/articles\/A_light_introduction_to_research_data_management\/5418694\/1\" target=\"_blank\">\"A light introduction to research data management\"<\/a>.&#32;<i>Figshare<\/i><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/figshare.com\/articles\/A_light_introduction_to_research_data_management\/5418694\/1\" target=\"_blank\">https:\/\/figshare.com\/articles\/A_light_introduction_to_research_data_management\/5418694\/1<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=A+light+introduction+to+research+data+management&amp;rft.atitle=Figshare&amp;rft.aulast=Irawan%2C+D.E.&amp;rft.au=Irawan%2C+D.E.&amp;rft.date=19+September+2017&amp;rft_id=https%3A%2F%2Ffigshare.com%2Farticles%2FA_light_introduction_to_research_data_management%2F5418694%2F1&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-IrawanPromoting18-9\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-IrawanPromoting18_9-0\" rel=\"external_link\">9.0<\/a><\/sup> <sup><a href=\"#cite_ref-IrawanPromoting18_9-1\" rel=\"external_link\">9.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation web\">Irawan, D.E.; Rachmi, C.N.&#32;(15 May 2018).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/osf.io\/59vcn\/\" target=\"_blank\">\"Promoting data sharing among Indonesian scientists: A proposal of generic university-level RDMP\"<\/a>.&#32;<i>Open Science Framework<\/i>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.17605%2FOSF.IO%2F59VCN\" target=\"_blank\">10.17605\/OSF.IO\/59VCN<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/osf.io\/59vcn\/\" target=\"_blank\">https:\/\/osf.io\/59vcn\/<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Promoting+data+sharing+among+Indonesian+scientists%3A+A+proposal+of+generic+university-level+RDMP&amp;rft.atitle=Open+Science+Framework&amp;rft.aulast=Irawan%2C+D.E.%3B+Rachmi%2C+C.N.&amp;rft.au=Irawan%2C+D.E.%3B+Rachmi%2C+C.N.&amp;rft.date=15+May+2018&amp;rft_id=info:doi\/10.17605%2FOSF.IO%2F59VCN&amp;rft_id=https%3A%2F%2Fosf.io%2F59vcn%2F&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NeylonCompliance17-10\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-NeylonCompliance17_10-0\" rel=\"external_link\">10.0<\/a><\/sup> <sup><a href=\"#cite_ref-NeylonCompliance17_10-1\" rel=\"external_link\">10.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Neylon, C.&#32;(2017).&#32;\"Compliance Culture or Culture Change? 
The role of funders in improving data management and sharing practice amongst researchers\".&#32;<i>Research Ideas and Outcomes<\/i>&#32;<b>3<\/b>: e14673.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3897%2Frio.3.e14673\" target=\"_blank\">10.3897\/rio.3.e14673<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Compliance+Culture+or+Culture+Change%3F+The+role+of+funders+in+improving+data+management+and+sharing+practice+amongst+researchers&amp;rft.jtitle=Research+Ideas+and+Outcomes&amp;rft.aulast=Neylon%2C+C.&amp;rft.au=Neylon%2C+C.&amp;rft.date=2017&amp;rft.volume=3&amp;rft.pages=e14673&amp;rft_id=info:doi\/10.3897%2Frio.3.e14673&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NeylonBuilding17-11\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-NeylonBuilding17_11-0\" rel=\"external_link\">11.0<\/a><\/sup> <sup><a href=\"#cite_ref-NeylonBuilding17_11-1\" rel=\"external_link\">11.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Neylon, C.&#32;(2017).&#32;\"Building a Culture of Data Sharing: Policy Design and Implementation for Research Data Management in Development Research\".&#32;<i>Research Ideas and Outcomes<\/i>&#32;<b>3<\/b>: e21773.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3897%2Frio.3.e21773\" 
target=\"_blank\">10.3897\/rio.3.e21773<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Building+a+Culture+of+Data+Sharing%3A+Policy+Design+and+Implementation+for+Research+Data+Management+in+Development+Research&amp;rft.jtitle=Research+Ideas+and+Outcomes&amp;rft.aulast=Neylon%2C+C.&amp;rft.au=Neylon%2C+C.&amp;rft.date=2017&amp;rft.volume=3&amp;rft.pages=e21773&amp;rft_id=info:doi\/10.3897%2Frio.3.e21773&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BorghiSupport17-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BorghiSupport17_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Borghi, J.; Abrams, S.; Lowenberg, D. et al.&#32;(2018).&#32;\"Support Your Data: A Research Data Management Guide for Researchers\".&#32;<i>Research Ideas and Outcomes<\/i>&#32;<b>4<\/b>: e26439.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3897%2Frio.4.e26439\" target=\"_blank\">10.3897\/rio.4.e26439<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Support+Your+Data%3A+A+Research+Data+Management+Guide+for+Researchers&amp;rft.jtitle=Research+Ideas+and+Outcomes&amp;rft.aulast=Borghi%2C+J.%3B+Abrams%2C+S.%3B+Lowenberg%2C+D.+et+al.&amp;rft.au=Borghi%2C+J.%3B+Abrams%2C+S.%3B+Lowenberg%2C+D.+et+al.&amp;rft.date=2018&amp;rft.volume=4&amp;rft.pages=e26439&amp;rft_id=info:doi\/10.3897%2Frio.4.e26439&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-IrawanOpenData17-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-IrawanOpenData17_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Irawan, D.E.; Vervoort, R.W.; Melzack, G.&#32;(19 December 2017).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/osf.io\/s76gu\/\" target=\"_blank\">\"Open Data Workshop SSEAC Usyd - ITB\"<\/a>.&#32;<i>Open Science Framework<\/i>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.17605%2FOSF.IO%2FS76GU\" target=\"_blank\">10.17605\/OSF.IO\/S76GU<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/osf.io\/s76gu\/\" target=\"_blank\">https:\/\/osf.io\/s76gu\/<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Open+Data+Workshop+SSEAC+Usyd+-+ITB&amp;rft.atitle=Open+Science+Framework&amp;rft.aulast=Irawan%2C+D.E.%3B+Vervoort%2C+R.W.%3B+Melzack%2C+G.&amp;rft.au=Irawan%2C+D.E.%3B+Vervoort%2C+R.W.%3B+Melzack%2C+G.&amp;rft.date=19+December+2017&amp;rft_id=info:doi\/10.17605%2FOSF.IO%2FS76GU&amp;rft_id=https%3A%2F%2Fosf.io%2Fs76gu%2F&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DCCChecklist-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DCCChecklist_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Digital Curation Center&#32;(2014).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.dcc.ac.uk\/resources\/data-management-plans\/checklist\" target=\"_blank\">\"Checklist for a Data Management Plan\"<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.dcc.ac.uk\/resources\/data-management-plans\/checklist\" target=\"_blank\">http:\/\/www.dcc.ac.uk\/resources\/data-management-plans\/checklist<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Checklist+for+a+Data+Management+Plan&amp;rft.atitle=&amp;rft.aulast=Digital+Curation+Center&amp;rft.au=Digital+Curation+Center&amp;rft.date=2014&amp;rft_id=http%3A%2F%2Fwww.dcc.ac.uk%2Fresources%2Fdata-management-plans%2Fchecklist&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: 
none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TeperekWellcome17-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-TeperekWellcome17_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Teperek, M.; Mollitt, B.; Southall, J.; Donaldson, M.&#32;(23 January 2017).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/zenodo.org\/record\/257650\" target=\"_blank\">\"Wellcome DMP assessment rubric v2.0\"<\/a>.&#32;<i>Zenodo<\/i>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.5281%2Fzenodo.257650\" target=\"_blank\">10.5281\/zenodo.257650<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/zenodo.org\/record\/257650\" target=\"_blank\">https:\/\/zenodo.org\/record\/257650<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Wellcome+DMP+assessment+rubric+v2.0&amp;rft.atitle=Zenodo&amp;rft.aulast=Teperek%2C+M.%3B+Mollitt%2C+B.%3B+Southall%2C+J.%3B+Donaldson%2C+M.&amp;rft.au=Teperek%2C+M.%3B+Mollitt%2C+B.%3B+Southall%2C+J.%3B+Donaldson%2C+M.&amp;rft.date=23+January+2017&amp;rft_id=info:doi\/10.5281%2Fzenodo.257650&amp;rft_id=https%3A%2F%2Fzenodo.org%2Frecord%2F257650&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-UCCC_DMPTool-16\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-UCCC_DMPTool_16-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">University of 
California Curation Center&#32;(2018).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/dmptool.org\/\" target=\"_blank\">\"DMPTool\"<\/a>.&#32;Regents of the University of California<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/dmptool.org\/\" target=\"_blank\">https:\/\/dmptool.org\/<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=DMPTool&amp;rft.atitle=&amp;rft.aulast=University+of+California+Curation+Center&amp;rft.au=University+of+California+Curation+Center&amp;rft.date=2018&amp;rft.pub=Regents+of+the+University+of+California&amp;rft_id=https%3A%2F%2Fdmptool.org%2F&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NeylonDataMan17-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NeylonDataMan17_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Neylon, C.&#32;(2017).&#32;\"Data Management Plan: IDRC Data Sharing Pilot Project\".&#32;<i>Research Ideas and Outcomes<\/i>&#32;<b>3<\/b>: e14672.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3897%2Frio.3.e14672\" target=\"_blank\">10.3897\/rio.3.e14672<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Data+Management+Plan%3A+IDRC+Data+Sharing+Pilot+Project&amp;rft.jtitle=Research+Ideas+and+Outcomes&amp;rft.aulast=Neylon%2C+C.&amp;rft.au=Neylon%2C+C.&amp;rft.date=2017&amp;rft.volume=3&amp;rft.pages=e14672&amp;rft_id=info:doi\/10.3897%2Frio.3.e14672&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TraynorDataMan17-18\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-TraynorDataMan17_18-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Traynor, C.&#32;(2017).&#32;\"Data Management Plan: Empowering Indigenous Peoples and Knowledge Systems Related to Climate Change and Intellectual Property Rights\".&#32;<i>Research Ideas and Outcomes<\/i>&#32;<b>3<\/b>: e15111.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3897%2Frio.3.e15111\" target=\"_blank\">10.3897\/rio.3.e15111<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Data+Management+Plan%3A+Empowering+Indigenous+Peoples+and+Knowledge+Systems+Related+to+Climate+Change+and+Intellectual+Property+Rights&amp;rft.jtitle=Research+Ideas+and+Outcomes&amp;rft.aulast=Traynor%2C+C.&amp;rft.au=Traynor%2C+C.&amp;rft.date=2017&amp;rft.volume=3&amp;rft.pages=e15111&amp;rft_id=info:doi\/10.3897%2Frio.3.e15111&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WaelDataMan17-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WaelDataMan17_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wael, R.&#32;(2017).&#32;\"Data Management Plan: HarassMap\".&#32;<i>Research Ideas and Outcomes<\/i>&#32;<b>3<\/b>: e15133.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3897%2Frio.3.e15133\" target=\"_blank\">10.3897\/rio.3.e15133<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Data+Management+Plan%3A+HarassMap&amp;rft.jtitle=Research+Ideas+and+Outcomes&amp;rft.aulast=Wael%2C+R.&amp;rft.au=Wael%2C+R.&amp;rft.date=2017&amp;rft.volume=3&amp;rft.pages=e15133&amp;rft_id=info:doi\/10.3897%2Frio.3.e15133&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-WoolfreyDataMan17-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WoolfreyDataMan17_20-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Woolfrey, L.&#32;(2017).&#32;\"Data Management Plan: Opening access to economic data to prevent tobacco related diseases in Africa\".&#32;<i>Research Ideas and Outcomes<\/i>&#32;<b>3<\/b>: e14837.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3897%2Frio.3.e14837\" target=\"_blank\">10.3897\/rio.3.e14837<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Data+Management+Plan%3A+Opening+access+to+economic+data+to+prevent+tobacco+related+diseases+in+Africa&amp;rft.jtitle=Research+Ideas+and+Outcomes&amp;rft.aulast=Woolfrey%2C+L.&amp;rft.au=Woolfrey%2C+L.&amp;rft.date=2017&amp;rft.volume=3&amp;rft.pages=e14837&amp;rft_id=info:doi\/10.3897%2Frio.3.e14837&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BourneTenSimp17-21\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BourneTenSimp17_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bourne, P.E.; Polka, J.K.; Vale, R.D.; Kiley, R.&#32;(2017).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5417409\" target=\"_blank\">\"Ten simple rules to consider regarding preprint submission\"<\/a>.&#32;<i>PLoS Computational Biology<\/i>&#32;<b>13<\/b>&#32;(5): 
e1005473.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pcbi.1005473\" target=\"_blank\">10.1371\/journal.pcbi.1005473<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5417409\/\" target=\"_blank\">PMC5417409<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/28472041\" target=\"_blank\">28472041<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5417409\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC5417409<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Ten+simple+rules+to+consider+regarding+preprint+submission&amp;rft.jtitle=PLoS+Computational+Biology&amp;rft.aulast=Bourne%2C+P.E.%3B+Polka%2C+J.K.%3B+Vale%2C+R.D.%3B+Kiley%2C+R.&amp;rft.au=Bourne%2C+P.E.%3B+Polka%2C+J.K.%3B+Vale%2C+R.D.%3B+Kiley%2C+R.&amp;rft.date=2017&amp;rft.volume=13&amp;rft.issue=5&amp;rft.pages=e1005473&amp;rft_id=info:doi\/10.1371%2Fjournal.pcbi.1005473&amp;rft_id=info:pmc\/PMC5417409&amp;rft_id=info:pmid\/28472041&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5417409&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation, and grammar for improved readability. In some cases important information was missing from the references, and that information was added. 
The original article listed citations in alphabetical order, while this wiki lists them by order of appearance, by design.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)\">https:\/\/www.limswiki.org\/index.php\/Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","bbf9b02ac710d05d03c94083fa4e01e0_images":["https:\/\/www.limswiki.org\/images\/0\/06\/Fig1_Irawan_ResIdeasOut2018_4.png"],"bbf9b02ac710d05d03c94083fa4e01e0_timestamp":1544815908,"5084f989065d7c37f4ccf170c3f09ee7_type":"article","5084f989065d7c37f4ccf170c3f09ee7_title":"Support Your Data: A research data management guide for researchers (Borghi et al. 2018)","5084f989065d7c37f4ccf170c3f09ee7_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers","5084f989065d7c37f4ccf170c3f09ee7_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:Support Your Data: A research data management guide for researchers\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nSupport Your Data: A research data management guide for researchersJournal\n \nResearch Ideas and OutcomesAuthor(s)\n \nBorghi, John A.; Abrams, Stephen; Lowenberg, Daniella; Simms, Stephanie; Chodacki, JohnAuthor affiliation(s)\n \nUniversity of California Curation CenterPrimary contact\n \nEmail: john dot borghi at ucop dot eduYear published\n \n2018Volume and issue\n \n4Page(s)\n \ne26439DOI\n \n10.3897\/rio.4.e26439ISSN\n \n2367-7163Distribution license\n \nCreative Commons Attribution 4.0 InternationalWebsite\n \nhttps:\/\/riojournal.com\/articles.php?id=26439Download\n \nhttps:\/\/riojournal.com\/article\/26439\/download\/pdf\/ (PDF)\n\nContents\n\n1 Abstract \n2 Introduction \n3 Project development \n4 The support your data materials \n5 RDM rubric \n6 One-page guides \n7 Using the Support Your Data materials \n8 Next steps \n9 Supplementary material \n10 Acknowledgements \n\n10.1 Hosting institution \n10.2 Author contributions \n10.3 Conflicts of interest \n\n\n11 Footnotes \n12 References \n13 Notes \n\n\n\nAbstract \nResearchers are faced with 
rapidly evolving expectations about how they should manage and share their data, code, and other research materials. To help them meet these expectations and generally manage and share their data more effectively, we are developing a suite of tools which we are currently referring to as \"Support Your Data.\" These tools\u2014 which include a rubric designed to enable researchers to self-assess their current data management practices and a series of short guides which provide actionable information about how to advance practices as necessary or desired\u2014are intended to be easily customizable to meet the needs of researchers working in a variety of institutional and disciplinary contexts.\nKeywords: research data management, RDM, data sharing, open data, open science\n\nIntroduction \nResearch data management (RDM), a term that encompasses activities related to the storage, organization, documentation, and dissemination of data[a], is central to efforts aimed at maximizing the value of scientific investment (e.g., the Holdren memorandum[1]) and addressing concerns related to the integrity of the research process (e.g., Collins and Tabak's discussion on reproducibility[2]). Unfortunately, when surveyed directly, researchers often acknowledge that they lack the skills and experience needed to manage and share their data effectively.[3][4][5] This disconnect demonstrates the need for tools that bridge the communication gap that exists between the research community, data service providers, and other local, national, and international data stakeholder groups. The development of one such tool, which we are tentatively referring to as \u201cSupport Your Data,\u201d is the subject of this project report.\nAs demonstrated by visualizations such as the research data lifecycle[6][7], RDM is continuous, iterative, and embedded throughout the course of a research project. 
Well-thought-out RDM practices make the research process more efficient, facilitate collaboration, and help prevent the loss of data (see Lowndes et al. 2017[8]). Effective RDM is also crucial to establishing the accessibility of data after a project\u2019s conclusion, which is increasingly required by data stakeholders such as research funding agencies and scholarly publishers. Steps must be taken early in the research process to ensure that data can be shared later. For example, the sharing of data from human participants must be approved by an institutional review board (IRB) and described in informed consent documents before any data is collected.[9] More generally, data that are made available are only useful if formatted, documented, and organized in a manner that enables examination and reuse by others. Related guidance (e.g., from Goodman et al.[10]) and standards (e.g., FAIR Guiding Principles[11]) highlight that proper data management is a key factor in enabling effective data sharing, which is itself a key factor in establishing research transparency and reproducibility.\nComplementing calls for improved data management and more widespread data sharing by transparency and reproducibility-related initiatives within the research community[12][13], RDM has increasingly become a focus for academic libraries. Though offerings vary considerably between institutions, library RDM programs generally emphasize skills training and assisting researchers in complying with data-related policies and mandates.[14][15][5] Guidance provided to researchers by library-based data service providers often focuses on topics such as data management planning, metadata and documentation, data organization, storage and backup procedures, and long-term preservation. Though \u201cbest practice\u201d documents written by researchers often cover similar topics, they generally do not reference the work of data service providers. 
A recent effort to bridge these two perspectives through a survey of data management practices in the field of human brain imaging (neuroimaging) demonstrates that many researchers are unaware of or do not make use of library-based RDM resources. Furthermore, their RDM practices are highly variable, often described using hypothesis or workflow-specific terminology, and rooted in immediate and practical concerns (e.g., \u201cI want to prevent the loss of data.\u201d).[16] Therefore, for data service providers, crossing this communication gap and effectively engaging with researchers on the topic of RDM requires not only overcoming differences in language, terminology, and priorities between and within different research areas, but also placing related concepts within the context of a researcher\u2019s day-to-day work with data.\nThere are several existing tools that bring together the perspectives of data service providers and researchers to evaluate RDM practices. However, because these tools are often oriented towards data service providers, they have not seen widespread adoption by researchers who may have minimal contact with library-based RDM programs. 
For example, the Data Curation Profiles toolkit, which consists of a structured interview designed to elucidate data-related practices and needs in different academic disciplines, was designed to launch discussions between librarians and researchers and facilitate the development of data services that address the needs of researchers.[17] Other RDM assessment tools draw heavily from the capability maturity model (CMM) framework, which describes practices based on their degree of formality and optimization.[18] A maturity model specific to the management of scientific data characterizes research groups on the basis of how well their procedures related to data acquisition, description, dissemination, and preservation are defined, documented, and generalized.[19] The DMVitals tool[20] combines elements of the Data Curation Profiles and maturity-based tools to systematically assess a researcher\u2019s data management practices and generate customized and actionable recommendations based on institutional and domain standards.\nThis brief review of the current RDM landscape highlights several significant trends:\n\n Researchers face an evolving array of expectations related to how they manage and share data. Unfortunately, there is a significant communication gap between researchers and library-based data service providers.\n Overcoming this communication gap requires placing RDM in the context of a researcher\u2019s day-to-day work with data and overcoming differences in language, terminology, and priorities between and within different research communities.\n There is currently no user-friendly guide that allows researchers to assess and advance their own data management practices.\nThe intention of the Support Your Data project is to address these trends by developing materials that frame activities related to research data management so that they can be easily understood and acted upon by researchers. 
At present, these materials consist of a rubric designed to allow researchers to self assess their own RDM practices over the course of a research project and a complementary set of guides that direct researchers towards RDM-related services at their institution and provide actionable information about how to advance their practices as necessary or desired. To meet the needs of researchers in different institutional and disciplinary contexts, all of these materials have been designed to be easily customizable.\n\nProject development \nThe development process for the Support Your Data project drew upon a large number of sources. An initial point of inspiration was the \u201cHowOpenIsIt?\u201d guide developed by SPARC, PLOS, and the Open Access Scholarly Publishers Association (OASPA).[21] The format of this guide, in which a number of topics (e.g., author posting rights, reuse rights) are described on a spectrum from closed to open access, allows for a number of complex and interrelated issues to be presented in a relatively simple and easy to understand manner. This prompted us to consider how to present research data management, a topic sufficiently complex as to be labelled a \u201cwicked problem,\u201d[22] in a similar manner.\nA literature search and analysis of existing RDM evaluation tools revealed that the majority were either designed to benchmark RDM services at the institutional level (e.g., the Australian National Data Service's data management framework[23] and the Digital Curation Center's CARDIO effort[24]) or intended to foster communication between researchers and library based data service providers.[20][17] For this reason, we decided that our yet unnamed project should focus on developing materials for researchers. 
Working under the assumption that researchers in different institutional and disciplinary contexts might have a range of RDM-related priorities and access to different levels of RDM-related services, we decided at the outset of the development process that our materials should be developed with an eye towards customization.\nOne major early difficulty was determining how to describe the research process. While we wanted to draw from the workflow-based organization of visualizations such as the research data lifecycle, we also wanted to avoid presenting the progression of a research project using models or terminology that would be unfamiliar or unappealing to researchers. After conducting an informal survey of what words researchers associate with given activities (e.g., \u201cWhat term(s) do you use to describe the stage of your research that involves acquiring, accumulating, or measuring data?\u201d) and examining related work on the topic (e.g., Mattern et al. 2015[25]), we decided to focus on describing RDM-related practices rather than project stages. Even so, terminology proved to be a significant problem as we quickly determined that phrases such as \u201cdata management planning\u201d and \u201cdata sharing\u201d had significantly different meanings to different audiences. Our efforts to reduce jargon would continue throughout the development process.\nAs with other RDM evaluation tools, we adopted elements of the capability maturity model framework to describe different data management-related activities on a continuum from \u201cad hoc\u201d to \u201crefined and optimized.\u201d This early conception of an \u201cRDM Maturity Guide\u201d was described in early blog posts intended to elicit feedback from members of the data services and research communities. However, as the project progressed, we moved away from explicitly referencing the concept of practice maturity. 
Informal feedback received during the development of a parallel project, in which researchers were asked to provide quantitative RDM maturity ratings for themselves and their field as a whole[16], revealed that the concept needed constant clarification and that researchers were resistant to the connotation that their practices could be considered \u201cimmature.\u201d\nThe general structure of what would become the Support Your Data rubric was therefore refined to include a series of RDM-related activities described at different levels of definition and optimization. Because the rubric was to be designed to allow researchers to self-assess the current state of their RDM practices, we quickly decided that the rubric should be complemented by a series of short guides designed to provide information about how to advance practices as necessary or desired. In a series of biweekly meetings, we then set out to draft content for these materials. Feedback from the broader community was sought throughout this process through additional blog posts and presentations at research data-focused conferences (e.g., see Borghi et al. 2017[26] and Borghi et al. 2018[27]).\nInitially, development of the content for the rubric and the guides progressed in parallel. Informed by informal surveys of researchers and data service providers (e.g., \u201cWhat activities do you consider part of \u2018planning for data\u2019?\u201d), we reviewed draft materials, worked to clarify language, and added relevant information as necessary. Though the activities described in the rows of the rubric (and expanded upon further in the guides) remained largely consistent throughout the development process, the earliest iterations of the rubric did not use set labels to describe a researcher\u2019s practices related to each activity. This was intentional, as we wanted to resist quantification of a researcher\u2019s practices into a score of their RDM maturity. 
However, after an initial round of revisions, we determined that the rubric was becoming unbalanced. The lack of labels meant that different activities were being described at different levels of specificity, which made interpretation difficult, thus defeating the entire purpose of the project.\nIn response, we refined the structure of the rubric further so that a researcher\u2019s RDM-related activities were described using one of four labels (see next section). After taking care that these labels were descriptive and not evaluative, we then completed a draft version of the entire rubric. We decided to use declarative statements to describe each RDM-related activity under each label in order to maximize the degree to which a researcher would identify a description with their own practices. We then proceeded to refine the content and structure of the guides. The materials presented in the next section are the result of this most recent round of revision.\n\nThe support your data materials \nAt present, the Support Your Data materials consist of a rubric designed to allow researchers to self-assess their own RDM practices and a complementary series of one-page guides intended to provide researchers access to RDM-related expertise (including local RDM-related resources) and advance practices as necessary or desired. All of these materials are intended to be customizable in order to meet the needs of researchers in different institutional or disciplinary contexts.\nThe aim of the Support Your Data project is to be descriptive rather than prescriptive. Neither the rubric nor the guides assume that every researcher will want, need, or be able to achieve the same level of data management practices. Rather, the intent of these materials is to help researchers understand where they are with regard to RDM and, when appropriate, how to get to where they want or need to be.\n\nRDM rubric \nA schematic version of the RDM rubric is shown in Table 1. 
Different RDM-related activities occurring over the course of a research project are represented in separate rows. Though the order from top to bottom loosely follows the progression of a research project, it is very likely that these activities will occur in a different order or simultaneously in a researcher's day-to-day work with data. The six activities described in the rubric (planning, organizing, saving, preparing, analyzing, and sharing) are intentionally general in order to make the rubric applicable to as wide a population as possible. Future versions of the rubric, adapted to specific disciplinary or institutional contexts, could incorporate greater, fewer, or altogether different activities. \n\n\n\n\n\n\n\nTable 1. The Support Your Data RDM rubric. The language used throughout the rubric is intended to describe RDM-related activities such as data management planning, organizing data, saving data, preparing data, analyzing data, and sharing data in a researcher-friendly fashion. A formatted version is available as Suppl. 
material 1.\n\n\n\n\n\nAd Hoc\n\nOne-Time\n\nActive and Informative\n\nOptimized for Re-Use\n\n\nPlanning your project\n\nWhen it comes to my data, I have a \"way of doing things\" but no standard or documented plans.\n\nI create some formal plans about how I will manage my data at the start of a project, but I generally don't refer back to them.\n\nI develop detailed plans about how I will manage my data that I actively revisit and revise over the course of a project.\n\nI have created plans for managing my data that are designed to streamline its future use by myself or others.\n\n\nOrganizing your data\n\nI don\u2019t follow a consistent approach for keeping my data organized, so it often takes time to find things.\n\nI have an approach for organizing my data, but I only put it into action after my project is complete.\n\nI have an approach for organizing my data that I implement prospectively, but it is not necessarily standardized.\n\nI organize my data so that others can navigate, understand, and use it without me being present.\n\n\nSaving and backing up your data\n\nI decide what data is important while I am working on it and typically save it in a single location.\n\nI know what data needs to be saved and I back it up after I'm done working on it to reduce the risk of loss.\n\nI have a system for regularly saving important data while I am working on it. 
I have multiple backups.\n\nI save my data in a manner and location designed to maximize opportunities for re-use by myself and others.\n\n\nGetting your data ready for analysis\n\nI don't have a standardized or well documented process for preparing my data for analysis.\n\nI have thought about how I will need to prepare my data, but I handle each case in a different manner.\n\nMy process for preparing data is standardized and well documented.\n\nI prepare my data in such a way as to facilitate use by both myself and others in the future.\n\n\nAnalyzing your data and handling the outputs\n\nI often have to redo my analyses or examine their products to determine what procedures or parameters were applied.\n\nAfter I finish my analysis, I document the specific parameters, procedures, and protocols applied.\n\nI regularly document the specifics of both my analysis workflow and decision making process while I am analyzing my data.\n\nI have ensured that the specifics of my analysis workflow and decision making process can be understood and put into action by others.\n\n\nSharing and publishing your data\n\nI share the results of my research, but generally I do not share the underlying data.\n\nI share my data only when I'm required to do so or in response to direct requests from other researchers.\n\nI regularly share the data that underlies my results and conclusions in a form that enables use by others.\n\nBecause of my excellent data management practices, I am able to efficiently share my data whenever I need to with whomever I need to.\n\n\n\nProceeding left to right, a series of declarative statements describe each activity in terms of how well they are designed to foster access to and use of data in the future. 
The four levels, \u201cad hoc,\u201d \u201cone-time,\u201d \u201cactive and informative,\u201d and \u201coptimized for re-use,\u201d are intended to be descriptive, not prescriptive.\n\n Ad hoc - Refers to circumstances in which practices are neither standardized nor documented. Every time a researcher has to manage their data they have to design new practices and procedures from scratch.\n One-time - Refers to circumstances in which data management occurs only when it is necessary, such as in direct response to a mandate from a funder or publisher. Practices or procedures implemented at one phase of a project are not designed with later phases in mind.\n Active and informative - Refers to circumstances in which data management is a regular part of the research process. Practices and procedures are standardized, well documented, and well integrated with those implemented at other phases.\n Optimized for re-use - Refers to circumstances in which all data management activities are designed to facilitate the re-use of data in the future.\nIt should be noted that \u201cre-use\u201d in the context of the Support Your Data project is not necessarily meant as an endorsement of data sharing or other open science practices but is representative of the close link between effective sharing and effective research data management. It is very likely that the person who will need to examine or re-use a given dataset will be the researcher who collected or analyzed it in the first place.\n\nOne-page guides \nPreliminary versions of the guides associated with each row of the RDM rubric are available as Suppl. materials 2, 3, 4, 5, 6, and 7. Designed to be easily customizable to fit the terminology, practices, and services associated with different disciplinary and institutional communities, the guides all follow a similar structure.\n\n Abstract - A brief summary of the contents of the guide.\n What does it mean? 
- Provides an operational definition of the activity covered by the guide. For some guides (Planning, Preparing), this consists of a sentence or two describing the activity. For others (e.g. Saving, Preparing, Analyzing, Sharing) this involves a more detailed breakdown of what each activity involves in practice.\n Requirements and how to meet them - Provides a brief summary of how to meet expectations or mandates related to each activity. Because data-related requirements and services are highly discipline- and institution-specific, the contents of these sections are designed to be easily customizable.\n Things to think about - Contains notes and recommendations that do not fit into the other sections.\nBoth the rubric and the guides are intended for easy customization to reflect the terminology, tools, best practices, and services specific to different disciplinary and institutional communities. In the template guides, some suggested points of customization are highlighted in yellow (discipline-specific) and red (institution-specific). Discipline-specific versions may incorporate the jargon, workflow, standards, and priorities of researchers working in a particular domain (e.g., neuroscience[28]). Institution-specific versions may also incorporate links to available data management, curation, and preservation tools and services.\n\nUsing the Support Your Data materials \nWe envision several use cases for the Support Your Data materials. The most likely is one in which these materials are used to facilitate discussion between an individual researcher or research group and a data service provider. In such a case, the researcher or research group can use the RDM rubric to identify the difference between where they are with regard to RDM and where they want or need to be, and then a data service provider can use the guides, customized to highlight available services and tools, to provide information about how to move forward. 
Another probable use case is one in which a particular research community uses these materials as part of a broader effort to improve practices related to data management (including data sharing). In this case, the organization and content of both the RDM rubric and the guides can be customized, with the assistance of data service providers, to include community-specific activities, requirements, and terminology. Though we were careful to ensure that our materials are merely descriptive, such customized versions could be more prescriptive in adhering to institutional or discipline-specific norms or policies.\nThough helping researchers respond to evolving expectations related to the management and sharing of their data was a major driving force behind the project, the Support Your Data materials, at least in their current iteration, are not designed to increase compliance with specific policies or requirements. For example, though a researcher using these materials would be directed to local RDM services and tools (e.g., a local DMPTool instance) related to the creation of data management plans (DMPs), neither the rubric nor the \u201cplanning for data\u201d guide gives specific guidance on how to comply with the DMP requirements of different funding agencies. However, in helping researchers assess and advance their data management practices, the Support Your Data materials may indirectly help them comply more effectively with data-related requirements throughout the lifecycle of a research project.\n\nNext steps \nNow that we have a complete set of draft materials, the next step of the Support Your Data project is to focus on design and adoption. Moving forward, we will work with internal and external partners on the visual presentation of the materials and to develop pamphlets, postcards, and other collateral. 
As has been the case throughout the project, we will also continue to invite feedback and explore partnerships with stakeholders interested in developing customized materials.\n\nSupplementary material \n Suppl. material 1: A formatted version of the Support Your Data RDM rubric (.odp file)\n Suppl. material 2: A draft guide that corresponds with the \"Planning your project\" row of the RDM rubric. Suggested points of customization are highlighted in yellow (discipline-specific) and red (institution-specific). (.odt file)\n Suppl. material 3: A draft guide that corresponds with the \"Organizing your data\" row of the RDM rubric. Suggested points of customization are highlighted in yellow (discipline-specific) and red (institution-specific). (.odt file)\n Suppl. material 4: A draft guide that corresponds with the \"Saving and backing up your data\" row of the RDM rubric. Suggested points of customization are highlighted in yellow (discipline-specific) and red (institution-specific). (.odt file)\n Suppl. material 5: A draft guide that corresponds with the \"Getting your data ready for analysis\" row of the RDM rubric. Suggested points of customization are highlighted in yellow (discipline-specific) and red (institution-specific). (.odt file)\n Suppl. material 6: A draft guide that corresponds with the \"Analyzing your data and handling the outputs\" row of the RDM rubric. Suggested points of customization are highlighted in yellow (discipline-specific) and red (institution-specific). (.odt file)\n Suppl. material 7: A draft guide that corresponds with the \"Sharing and publishing your data\" row of the RDM rubric. Suggested points of customization are highlighted in yellow (discipline-specific) and red (institution-specific). (.odt file)\nAcknowledgements \nHosting institution \nUC Curation Center, California Digital Library\n\nAuthor contributions \nJB drafted the manuscript and led the development of the materials. 
SA, DL, SS, and JC co-developed the materials and reviewed the manuscript.\n\nConflicts of interest \nThe authors declare no conflicts of interest.\n\nFootnotes \n\n\n\u2191 For the purposes of this report we are using the term \u201cdata\u201d broadly to refer to the inputs or outputs required to evaluate, reproduce, or build upon the analyses or conclusions of a given research project. This includes, but is not limited to, raw data, processed data, research-related code, and documentation pertaining to study parameters and procedures. \n\n\nReferences \n\n\n\u2191 Holdren, J.P. (22 February 2013). \"Increasing Access to the Results of Federally Funded Scientific Research\". Office of Science and Technology Policy. https:\/\/obamawhitehouse.archives.gov\/sites\/default\/files\/microsites\/ostp\/ostp_public_access_memo_2013.pdf\n\n\u2191 Collins, F.S.; Tabak, L.A. (2014). \"Policy: NIH plans to enhance reproducibility\". Nature 505 (7485): 612\u201313. doi:10.1038\/505612a.\n\n\u2191 Barone, L.; Williams, J.; Micklos, D. (2017). \"Unmet needs for analyzing biological big data: A survey of 704 NSF principal investigators\". PLOS Computational Biology 13 (11): e1005858. doi:10.1371\/journal.pcbi.1005755. PMC PMC5654259. PMID 29049281. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5654259\n\n\u2191 Federer, L.M.; Lu, Y.L.; Joubert, D.J. et al. (2015). \"Biomedical Data Sharing and Reuse: Attitudes and Practices of Clinical and Scientific Research Staff\". PLoS One 10 (6): e0129506. doi:10.1371\/journal.pone.0129506. PMC PMC4481309. PMID 26107811. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4481309
\n\n\u2191 Tenopir, C.; Sandusky, R.J.; Allard, S.; Birch, B. (2014). \"Research data management services in academic research libraries and perceptions of librarians\". Library & Information Science Research 36 (2): 84\u201390. doi:10.1016\/j.lisr.2013.11.003.\n\n\u2191 Carlson, J. (2014). \"The use of lifecycle models in developing and supporting data services\". In Ray, J.M. Research Data Management: Practical Strategies for Information Professionals. Purdue University Press. ISBN 9781557536648.\n\n\u2191 Cox, A.M.; Tam, W.W.T. (2018). \"A critical analysis of lifecycle models of the research process and research data management\". Aslib Journal of Information Management 70 (2): 142\u201357. doi:10.1108\/AJIM-11-2017-0251.\n\n\u2191 Lowndes, J.S.S.; Best, B.D.; Scarborough, C. et al. (2017). \"Our path to better science in less time using open data science tools\". Nature Ecology and Evolution 1: 0160. doi:10.1038\/s41559-017-0160.\n\n\u2191 Meyer, M.N. (2018). \"Practical Tips for Ethical Data Sharing\". Advances in Methods and Practices in Psychological Science 1 (1): 131\u2013144. doi:10.1177\/2515245917747656.\n\n\u2191 Goodman, A.; Pepe, A.; Blocker, A.W. et al. (2014). \"Ten Simple Rules for the Care and Feeding of Scientific Data\". PLoS Computational Biology 10 (4): e1003542. doi:10.1371\/journal.pcbi.1003542.\n\n\u2191 Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J. et al. (2016). \"The FAIR Guiding Principles for scientific data management and stewardship\". Scientific Data 3: 160018. doi:10.1038\/sdata.2016.18. PMC PMC4792175. PMID 26978244. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4792175
\n\n\u2191 Ioannidis, J.P.A. (2014). \"How to Make More Published Research True\". PLoS Medicine 11 (10): e1001747. doi:10.1371\/journal.pmed.1001747.\n\n\u2191 Munaf\u00f2, M.R.; Nosek, B.A.; Bishop, D.V.M. et al. (2017). \"A manifesto for reproducible science\". Nature Human Behaviour 1: 0021. doi:10.1038\/s41562-016-0021.\n\n\u2191 Cox, A.M.; Kennan, M.A.; Lyon, L. et al. (2017). \"Developments in research data management in academic libraries: Towards an understanding of research data service maturity\". Journal of the Association for Information Science and Technology 68 (9): 2182\u20132200. doi:10.1002\/asi.23781.\n\n\u2191 Flores, J.R.; Brodeur, J.J.; Daniels, M.G. et al. (2015). \"Libraries and the Research Data Management Landscape\". In Maclachlan, J.C.; Waraksa, E.A.; Williford, C. The Process of Discovery: The CLIR Postdoctoral Fellowship Program and the Future of the Academy. Council on Library and Information Resources. ISBN 9781932326529.\n\n\u2191 Borghi, J.A.; Van Gulick, A.E. (2018). \"Data management and sharing in neuroimaging: Practices and perceptions of MRI researchers\". bioRxiv. doi:10.1371\/journal.pone.0200562.\n\n\u2191 Witt, M.; Carlson, J.; Brandt, D.S.; Cragin, M.H. (2009). \"Constructing Data Curation Profiles\". International Journal of Digital Curation 4 (3): 93\u2013103. doi:10.2218\/ijdc.v4i3.117.\n\n\u2191 Paulk, M.C.; Curtis, B.; Chrissis, M.B.; Weber, C.V. (1993). \"Capability maturity model, version 1.1\". IEEE Software 10 (4): 18\u201327. doi:10.1109\/52.219617.
\n\n\u2191 Crowston, K.; Qin, J. (2012). \"A capability maturity model for scientific data management: Evidence from the literature\". Proceedings of the American Society for Information Science and Technology 48 (1): 1\u20139. doi:10.1002\/meet.2011.14504801036.\n\n\u2191 Sallans, A.; Lake, S. (2014). \"Data management assessment and planning tools\". In Ray, J.M. Research Data Management: Practical Strategies for Information Professionals. Purdue University Press. ISBN 9781557536648.\n\n\u2191 \"HowOpenIsIt? A Guide for Evaluating the Openness of Journals\". New Venture Fund. 2013. https:\/\/sparcopen.org\/our-work\/howopenisit\/\n\n\u2191 Awre, C.; Baxter, J.; Clifford, B. et al. (2015). \"Research Data Management as a \u201cwicked problem\u201d\". Library Review 64 (4\/5): 356\u2013371. doi:10.1108\/LR-04-2015-0043.\n\n\u2191 \"Creating a data management framework\". Australian National Data Service. 2011. http:\/\/www.ands.org.au\/guides\/creating-a-data-management-framework\n\n\u2191 \"Collaborative Assessment of Research Data Infrastructure and Objectives (CARDIO)\". Digital Curation Centre. 2013. https:\/\/cardio.dcc.ac.uk\/about\/\n\n\u2191 Mattern, E.; Jeng, W.; He, D. (2015). \"Using participatory design and visual narrative inquiry to investigate researchers\u2019 data challenges and recommendations for library research data services\". Program: electronic library and information systems 49 (4): 408\u2013423. doi:10.1108\/PROG-01-2015-0012.\n\n\u2191 Borghi, J.A.; Abrams, S.; Chodacki, J. et al. (22 September 2017). \"Developing a Data Management Guide for Researchers\". Zenodo. doi:10.5281\/zenodo.1213384. https:\/\/zenodo.org\/record\/1213384\n\n\u2191 Borghi, J.A.; Abrams, S.; Lowenberg, D. 
et al. (21 March 2018). \"Support Your Data: A Data Management Guide for Researchers\". Zenodo. doi:10.5281\/zenodo.1204885. https:\/\/zenodo.org\/record\/1204885\n\n\u2191 Nichols, T.E.; Das, S.; Eickhoff, S.B. et al. (2017). \"Best practices in data analysis and sharing in neuroimaging using MRI\". Nature Neuroscience 20: 299\u2013303. doi:10.1038\/nn.4500.\n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. Footnotes were originally numbered but have been converted to lowercase alpha for this version. The original article lists references alphabetically, but this version\u2014by design\u2014lists them in order of appearance.\n\n\nSource: https:\/\/www.limswiki.org\/index.php\/Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\nCategories: LIMSwiki journal articles (added in 2018); LIMSwiki journal articles (all); LIMSwiki journal articles on data management and sharing; LIMSwiki journal articles on open data; LIMSwiki journal articles on research\n\nThis page was last modified on 24 September 2018, at 14:26.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n\n","5084f989065d7c37f4ccf170c3f09ee7_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Support_Your_Data_A_research_data_management_guide_for_researchers skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Support Your Data: A research data management guide for 
researchers<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p>Researchers are faced with rapidly evolving expectations about how they should manage and share their data, code, and other <a href=\"https:\/\/www.limswiki.org\/index.php\/Research\" title=\"Research\" target=\"_blank\" class=\"wiki-link\" data-key=\"409634fd90113f119362927fe222f549\">research<\/a> materials. To help them meet these expectations and generally manage and share their data more effectively, we are developing a suite of tools which we are currently referring to as \"Support Your Data.\" These tools\u2014 which include a rubric designed to enable researchers to self-assess their current <a href=\"https:\/\/www.limswiki.org\/index.php\/Information_management\" title=\"Information management\" target=\"_blank\" class=\"wiki-link\" data-key=\"f8672d270c0750a858ed940158ca0a73\">data management<\/a> practices and a series of short guides which provide actionable <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a> about how to advance practices as necessary or desired\u2014are intended to be easily customizable to meet the needs of researchers working in a variety of institutional and disciplinary contexts.\n<\/p><p><b>Keywords<\/b>: research data management, RDM, data sharing, open data, open science\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>Research data management (RDM), a term that encompasses activities related to the storage, organization, documentation, and dissemination of data<sup id=\"rdp-ebb-cite_ref-1\" 
class=\"reference\"><a href=\"#cite_note-1\" rel=\"external_link\">[a]<\/a><\/sup>, is central to efforts aimed at maximizing the value of scientific investment (e.g., the Holdren memorandum<sup id=\"rdp-ebb-cite_ref-HoldrenIncreasing13_2-0\" class=\"reference\"><a href=\"#cite_note-HoldrenIncreasing13-2\" rel=\"external_link\">[1]<\/a><\/sup>) and addressing concerns related to the integrity of the research process (e.g., Collins and Tabak's discussion on reproducibility<sup id=\"rdp-ebb-cite_ref-CollinsPolicy14_3-0\" class=\"reference\"><a href=\"#cite_note-CollinsPolicy14-3\" rel=\"external_link\">[2]<\/a><\/sup>). Unfortunately, when surveyed directly, researchers often acknowledge that they lack the skills and experience needed to manage and share their data effectively.<sup id=\"rdp-ebb-cite_ref-BaroneUnmet17_4-0\" class=\"reference\"><a href=\"#cite_note-BaroneUnmet17-4\" rel=\"external_link\">[3]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-FedererBiomedical15_5-0\" class=\"reference\"><a href=\"#cite_note-FedererBiomedical15-5\" rel=\"external_link\">[4]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-TenopirResearch14_6-0\" class=\"reference\"><a href=\"#cite_note-TenopirResearch14-6\" rel=\"external_link\">[5]<\/a><\/sup> This disconnect demonstrates the need for tools that bridge the communication gap that exists between the research community, data service providers, and other local, national, and international data stakeholder groups. 
The development of one such tool, which we are tentatively referring to as \u201cSupport Your Data,\u201d is the subject of this project report.\n<\/p><p>As demonstrated by visualizations such as the research data lifecycle<sup id=\"rdp-ebb-cite_ref-CarlsonResearch14_7-0\" class=\"reference\"><a href=\"#cite_note-CarlsonResearch14-7\" rel=\"external_link\">[6]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-CoxACritical18_8-0\" class=\"reference\"><a href=\"#cite_note-CoxACritical18-8\" rel=\"external_link\">[7]<\/a><\/sup>, RDM is continuous, iterative, and embedded throughout the course of a research project. Well thought out RDM practices make the research process more efficient, facilitate collaboration, and help prevent the loss of data (see Lowndes <i>et al.<\/i> 2017<sup id=\"rdp-ebb-cite_ref-LowndesOurPath17_9-0\" class=\"reference\"><a href=\"#cite_note-LowndesOurPath17-9\" rel=\"external_link\">[8]<\/a><\/sup>). Effective RDM is also crucial to establishing the accessibility of data after a project\u2019s conclusion, which is increasingly required by data stakeholders such as research funding agencies and scholarly publishers. Steps must be taken early in the research process to ensure that data can be shared later. For example, the sharing of data from human participants must be approved by an institutional review board (IRB) and described in informed consent documents before any data is collected.<sup id=\"rdp-ebb-cite_ref-MeyerPractical18_10-0\" class=\"reference\"><a href=\"#cite_note-MeyerPractical18-10\" rel=\"external_link\">[9]<\/a><\/sup> More generally, data that are made available are only useful if formatted, documented, and organized in a manner that enables examination and reuse by others. 
Related guidance (e.g., from Goodman <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-GoodmanTen14_11-0\" class=\"reference\"><a href=\"#cite_note-GoodmanTen14-11\" rel=\"external_link\">[10]<\/a><\/sup>) and standards (e.g., FAIR Guiding Principles<sup id=\"rdp-ebb-cite_ref-WilkinsonTheFAIR16_12-0\" class=\"reference\"><a href=\"#cite_note-WilkinsonTheFAIR16-12\" rel=\"external_link\">[11]<\/a><\/sup>) highlight that proper data management is a key factor in enabling effective data sharing, which is itself a key factor in establishing research transparency and reproducibility.\n<\/p><p>Complementing calls for improved data management and more widespread data sharing by transparency and reproducibility-related initiatives within the research community<sup id=\"rdp-ebb-cite_ref-IoannidisHowTo14_13-0\" class=\"reference\"><a href=\"#cite_note-IoannidisHowTo14-13\" rel=\"external_link\">[12]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-Munaf.C3.B2AManifesto17_14-0\" class=\"reference\"><a href=\"#cite_note-Munaf.C3.B2AManifesto17-14\" rel=\"external_link\">[13]<\/a><\/sup>, RDM has increasingly become a focus for academic libraries. 
Though offerings vary considerably between institutions, library RDM programs generally emphasize skills training and assisting researchers in complying with data-related policies and mandates.<sup id=\"rdp-ebb-cite_ref-CoxDevelop17_15-0\" class=\"reference\"><a href=\"#cite_note-CoxDevelop17-15\" rel=\"external_link\">[14]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-FloresLibraries15_16-0\" class=\"reference\"><a href=\"#cite_note-FloresLibraries15-16\" rel=\"external_link\">[15]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-TenopirResearch14_6-1\" class=\"reference\"><a href=\"#cite_note-TenopirResearch14-6\" rel=\"external_link\">[5]<\/a><\/sup> Guidance provided to researchers by library-based data service providers often focuses on topics such as data management planning, metadata and documentation, data organization, storage and backup procedures, and long-term preservation. Though \u201cbest practice\u201d documents written by researchers often cover similar topics, they generally do not reference the work of data service providers. A recent effort to bridge these two perspectives through a survey of data management practices in the field of human brain imaging (neuroimaging) demonstrates that many researchers are unaware of or do not make use of library-based RDM resources. 
Furthermore, their RDM practices are highly variable, often described using hypothesis or workflow-specific terminology, and rooted in immediate and practical concerns (e.g., \u201cI want to prevent the loss of data.\u201d).<sup id=\"rdp-ebb-cite_ref-BorghiData18_17-0\" class=\"reference\"><a href=\"#cite_note-BorghiData18-17\" rel=\"external_link\">[16]<\/a><\/sup> Therefore, for data service providers, crossing this communication gap and effectively engaging with researchers on the topic of RDM requires not only overcoming differences in language, terminology, and priorities between and within different research areas, but also placing related concepts within the context of a researcher\u2019s day-to-day work with data.\n<\/p><p>There are several existing tools that bring together the perspectives of data service providers and researchers to evaluate RDM practices. However, because these tools are often oriented towards data service providers, they have not seen widespread adoption by researchers who may have minimal contact with library-based RDM programs. 
For example, the Data Curation Profiles toolkit, which consists of a structured interview designed to elucidate data-related practices and needs in different academic disciplines, was designed to launch discussions between librarians and researchers and facilitate the development of data services that address the needs of researchers.<sup id=\"rdp-ebb-cite_ref-WittContruct09_18-0\" class=\"reference\"><a href=\"#cite_note-WittContruct09-18\" rel=\"external_link\">[17]<\/a><\/sup> Other RDM assessment tools draw heavily from the capability maturity model (CMM) framework, which describes practices based on their degree of formality and optimization.<sup id=\"rdp-ebb-cite_ref-PaulkCapability93_19-0\" class=\"reference\"><a href=\"#cite_note-PaulkCapability93-19\" rel=\"external_link\">[18]<\/a><\/sup> A maturity model specific to the management of scientific data characterizes research groups on the basis of how well their procedures related to data acquisition, description, dissemination, and preservation are defined, documented, and generalized.<sup id=\"rdp-ebb-cite_ref-CrowstonACapability12_20-0\" class=\"reference\"><a href=\"#cite_note-CrowstonACapability12-20\" rel=\"external_link\">[19]<\/a><\/sup> The DMVitals tool<sup id=\"rdp-ebb-cite_ref-SallansResearch14_21-0\" class=\"reference\"><a href=\"#cite_note-SallansResearch14-21\" rel=\"external_link\">[20]<\/a><\/sup> combines elements of the Data Curation Profiles and maturity-based tools to systematically assess a researcher\u2019s data management practices and generate customized and actionable recommendations based on institutional and domain standards.\n<\/p><p>This brief review of the current RDM landscape highlights several significant trends:\n<\/p>\n<ol><li> Researchers face an evolving array of expectations related to how they manage and share data. 
Unfortunately, there is a significant communication gap between researchers and library-based data service providers.<\/li>\n<li> Overcoming this communication gap requires placing RDM in the context of a researcher\u2019s day-to-day work with data and overcoming differences in language, terminology, and priorities between and within different research communities.<\/li>\n<li> There is currently no user-friendly guide that allows researchers to assess and advance their own data management practices.<\/li><\/ol>\n<p>The intention of the Support Your Data project is to address these trends by developing materials that frame activities related to research data management so that they can be easily understood and acted upon by researchers. At present, these materials consist of a rubric designed to allow researchers to self-assess their own RDM practices over the course of a research project and a complementary set of guides that direct researchers towards RDM-related services at their institution and provide actionable information about how to advance their practices as necessary or desired. To meet the needs of researchers in different institutional and disciplinary contexts, all of these materials have been designed to be easily customizable.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Project_development\">Project development<\/span><\/h2>\n<p>The development process for the Support Your Data project drew upon a large number of sources. 
An initial point of inspiration was the \u201cHowOpenIsIt?\u201d guide developed by SPARC, PLOS, and the Open Access Scholarly Publishers Association (OASPA).<sup id=\"rdp-ebb-cite_ref-SPARCHowOpen_22-0\" class=\"reference\"><a href=\"#cite_note-SPARCHowOpen-22\" rel=\"external_link\">[21]<\/a><\/sup> The format of this guide, in which a number of topics (e.g., author posting rights, reuse rights) are described on a spectrum from closed to open access, allows for a number of complex and interrelated issues to be presented in a relatively simple and easy-to-understand manner. This prompted us to consider how to present research data management, a topic sufficiently complex as to be labelled a \u201cwicked problem,\u201d<sup id=\"rdp-ebb-cite_ref-AwreResearch15_23-0\" class=\"reference\"><a href=\"#cite_note-AwreResearch15-23\" rel=\"external_link\">[22]<\/a><\/sup> in a similar manner.\n<\/p><p>A literature search and analysis of existing RDM evaluation tools revealed that the majority were either designed to benchmark RDM services at the institutional level (e.g., the Australian National Data Service's data management framework<sup id=\"rdp-ebb-cite_ref-ANDSCreating_24-0\" class=\"reference\"><a href=\"#cite_note-ANDSCreating-24\" rel=\"external_link\">[23]<\/a><\/sup> and the Digital Curation Centre's CARDIO effort<sup id=\"rdp-ebb-cite_ref-DCC_CARDIO_25-0\" class=\"reference\"><a href=\"#cite_note-DCC_CARDIO-25\" rel=\"external_link\">[24]<\/a><\/sup>) or intended to foster communication between researchers and library-based data service providers.<sup id=\"rdp-ebb-cite_ref-SallansResearch14_21-1\" class=\"reference\"><a href=\"#cite_note-SallansResearch14-21\" rel=\"external_link\">[20]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-WittContruct09_18-1\" class=\"reference\"><a href=\"#cite_note-WittContruct09-18\" rel=\"external_link\">[17]<\/a><\/sup> For this reason, we decided that our as-yet-unnamed project should focus on developing materials for researchers. 
Working under the assumption that researchers in different institutional and disciplinary contexts might have a range of RDM-related priorities and access to different levels of RDM-related services, we decided at the outset of the development process that our materials should be developed with an eye towards customization.\n<\/p><p>One major early difficulty was determining how to describe the research process. While we wanted to draw from the workflow-based organization of visualizations such as the research data lifecycle, we also wanted to avoid presenting the progression of a research project using models or terminology that would be unfamiliar or unappealing to researchers. After conducting an informal survey of what words researchers associate with given activities (e.g., \u201cWhat term(s) do you use to describe the stage of your research that involves acquiring, accumulating, or measuring data?\u201d) and examining related work on the topic (e.g., Mattern <i>et al.<\/i> 2015<sup id=\"rdp-ebb-cite_ref-MatternUsing15_26-0\" class=\"reference\"><a href=\"#cite_note-MatternUsing15-26\" rel=\"external_link\">[25]<\/a><\/sup>), we decided to focus on describing RDM-related practices rather than project stages. Even so, terminology proved to be a significant problem, as we quickly determined that phrases such as \u201cdata management planning\u201d and \u201cdata sharing\u201d had significantly different meanings to different audiences. Our efforts to reduce jargon would continue throughout the development process.\n<\/p><p>As with other RDM evaluation tools, we adopted elements of the capability maturity model framework to describe different data management-related activities on a continuum from \u201cad hoc\u201d to \u201crefined and optimized.\u201d This early conception of an \u201cRDM Maturity Guide\u201d was described in early blog posts intended to elicit feedback from members of the data services and research communities. 
However, as the project progressed, we moved away from explicitly referencing the concept of practice maturity. Informal feedback received during the development of a parallel project, in which researchers were asked to provide quantitative RDM maturity ratings for themselves and their field as a whole<sup id=\"rdp-ebb-cite_ref-BorghiData18_17-1\" class=\"reference\"><a href=\"#cite_note-BorghiData18-17\" rel=\"external_link\">[16]<\/a><\/sup>, revealed that the concept needed constant clarification and that researchers were resistant to the connotation that their practices could be considered \u201cimmature.\u201d\n<\/p><p>The general structure of what would become the Support Your Data rubric was therefore refined to include a series of RDM-related activities described at different levels of definition and optimization. Because the rubric was to be designed to allow researchers to self-assess the current state of their RDM practices, we quickly decided that the rubric should be complemented by a series of short guides designed to provide information about how to advance practices as necessary or desired. In a series of biweekly meetings, we then set out to draft content for these materials. Feedback from the broader community was sought throughout this process through additional blog posts and presentations at research data-focused conferences (e.g., see Borghi <i>et al.<\/i> 2017<sup id=\"rdp-ebb-cite_ref-BorghiDeveloping17_27-0\" class=\"reference\"><a href=\"#cite_note-BorghiDeveloping17-27\" rel=\"external_link\">[26]<\/a><\/sup> and Borghi <i>et al.<\/i> 2018<sup id=\"rdp-ebb-cite_ref-BorghiSupport18_28-0\" class=\"reference\"><a href=\"#cite_note-BorghiSupport18-28\" rel=\"external_link\">[27]<\/a><\/sup>).\n<\/p><p>Initially, development of the content for the rubric and the guides progressed in parallel. Informed by informal surveys of researchers and data service providers (e.g. 
\u201cWhat activities do you consider part of \u2018planning for data\u2019?\u201d), we reviewed draft materials, worked to clarify language, and added relevant information as necessary. Though the activities described in the rows of the rubric (and expanded upon further in the guides) remained largely consistent throughout the development process, the earliest iterations of the rubric did not use set labels to describe a researcher\u2019s practices related to each activity. This was intentional, as we wanted to resist quantification of a researcher\u2019s practices into a score of their RDM maturity. However, after an initial round of revisions, we determined that the rubric was becoming unbalanced. The lack of labels meant that different activities were being described at different levels of specificity, which made interpretation difficult, thus defeating the entire purpose of the project.\n<\/p><p>In response, we refined the structure of the rubric further so that a researcher\u2019s RDM-related activities were described using one of four labels (see next section). After taking care that these labels were descriptive and not evaluative, we then completed a draft version of the entire rubric. We decided to use declarative statements to describe each RDM-related activity under each label in order to maximize the degree to which a researcher would identify a description with their own practices. We then proceeded to refine the content and structure of the guides. 
The materials presented in the next section are the result of this most recent round of revision.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"The_support_your_data_materials\">The Support Your Data materials<\/span><\/h2>\n<p>At present, the Support Your Data materials consist of a rubric designed to allow researchers to self-assess their own RDM practices and a complementary series of one-page guides intended to provide researchers access to RDM-related expertise (including local RDM-related resources) and advance practices as necessary or desired. All of these materials are intended to be customizable in order to meet the needs of researchers in different institutional or disciplinary contexts.\n<\/p><p>The aim of the Support Your Data project is to be descriptive rather than prescriptive. Neither the rubric nor the guides assumes that every researcher will want, need, or be able to achieve the same level of data management practices. Rather, the intent of these materials is to help researchers understand where they are with regard to RDM and, when appropriate, how to get to where they want or need to be.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"RDM_rubric\">RDM rubric<\/span><\/h2>\n<p>A schematic version of the RDM rubric is shown in Table 1. Different RDM-related activities occurring over the course of a research project are represented in separate rows. Though the order from top to bottom loosely follows the progression of a research project, it is very likely that these activities will occur in a different order or simultaneously in a researcher's day-to-day work with data. The six activities described in the rubric (planning, organizing, saving, preparing, analyzing, and sharing) are intentionally general in order to make the rubric applicable to as wide a population as possible. Future versions of the rubric, adapted to specific disciplinary or institutional contexts, could incorporate greater, fewer, or altogether different activities. 
\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"5\"><b>Table 1.<\/b> The Support Your Data RDM rubric. The language used throughout the rubric is intended to describe RDM-related activities such as data management planning, organizing data, saving data, preparing data, analyzing data, and sharing data in a researcher-friendly fashion. A formatted version is available as Suppl. material 1.\n<\/td><\/tr>\n\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Ad Hoc\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">One-Time\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Active and Informative\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Optimized for Re-Use\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Planning your project\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">When it comes to my data, I have a \"way of doing things\" but no standard or documented plans.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I create some formal plans about how I will manage my data at the start of a project, but I generally don't refer back to them.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I develop detailed plans about how I will manage my data that I actively revisit and revise over the course of a project.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I have created plans for managing my data that are designed to streamline 
its future use by myself or others.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Organizing your data\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I don\u2019t follow a consistent approach for keeping my data organized, so it often takes time to find things.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I have an approach for organizing my data, but I only put it into action after my project is complete.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I have an approach for organizing my data that I implement prospectively, but it is not necessarily standardized.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I organize my data so that others can navigate, understand, and use it without me being present.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Saving and backing up your data\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I decide what data is important while I am working on it and typically save it in a single location.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I know what data needs to be saved and I back it up after I'm done working on it to reduce the risk of loss.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I have a system for regularly saving important data while I am working on it. 
I have multiple backups.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I save my data in a manner and location designed to maximize opportunities for re-use by myself and others.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Getting your data ready for analysis\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I don't have a standardized or well documented process for preparing my data for analysis.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I have thought about how I will need to prepare my data, but I handle each case in a different manner.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">My process for preparing data is standardized and well documented.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I prepare my data in such a way as to facilitate use by both myself and others in the future.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Analyzing your data and handling the outputs\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I often have to redo my analyses or examine their products to determine what procedures or parameters were applied.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">After I finish my analysis, I document the specific parameters, procedures, and protocols applied.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I regularly document the specifics of both my analysis workflow and decision making process while I am analyzing my data.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I have ensured that the specifics of my analysis workflow and decision making process can be understood and put into action 
by others.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Sharing and publishing your data\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I share the results of my research, but generally I do not share the underlying data.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I share my data only when I'm required to do so or in response to direct requests from other researchers.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">I regularly share the data that underlies my results and conclusions in a form that enables use by others.\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Because of my excellent data management practices, I am able to efficiently share my data whenever I need to with whomever I need to.\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Proceeding left to right, a series of declarative statements describe each activity in terms of how well they are designed to foster access to and use of data in the future. The four levels, \u201cad hoc,\u201d \u201cone-time,\u201d \u201cactive and informative,\u201d and \u201coptimized for re-use,\u201d are intended to be descriptive, not prescriptive.\n<\/p>\n<ul><li> <b>Ad hoc<\/b> - Refers to circumstances in which practices are neither standardized nor documented. Every time a researcher has to manage their data they have to design new practices and procedures from scratch.<\/li><\/ul>\n<ul><li> <b>One-time<\/b> - Refers to circumstances in which data management occurs only when it is necessary, such as in direct response to a mandate from a funder or publisher. 
Practices or procedures implemented at one phase of a project are not designed with later phases in mind.<\/li><\/ul>\n<ul><li> <b>Active and informative<\/b> - Refers to circumstances in which data management is a regular part of the research process. Practices and procedures are standardized, well documented, and well integrated with those implemented at other phases.<\/li><\/ul>\n<ul><li> <b>Optimized for re-use<\/b> - Refers to circumstances in which all data management activities are designed to facilitate the re-use of data in the future.<\/li><\/ul>\n<p>It should be noted that \u201cre-use\u201d in the context of the Support Your Data project is not necessarily meant as an endorsement of data sharing or other open science practices but is representative of the close link between effective sharing and effective research data management. It is very likely that the person who will need to examine or re-use a given dataset will be the researcher who collected or analyzed it in the first place.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"One-page_guides\">One-page guides<\/span><\/h2>\n<p>Preliminary versions of the guides associated with each row of the RDM rubric are available as Suppl. materials 2, 3, 4, 5, 6, and 7. Designed to be easily customizable to fit the terminology, practices, and services associated with different disciplinary and institutional communities, the guides all follow a similar structure.\n<\/p>\n<ul><li> <b>Abstract<\/b> - A brief summary of the contents of the guide.<\/li>\n<li> <b>What does it mean?<\/b> - Provides an operational definition of the activity covered by the guide. For some guides (Planning, Preparing), this consists of a sentence or two describing the activity. For others (e.g. 
Saving, Preparing, Analyzing, Sharing) this involves a more detailed breakdown of what each activity involves in practice.<\/li>\n<li> <b>Requirements and how to meet them<\/b> - Provides a brief summary of how to meet expectations or mandates related to each activity. Because data-related requirements and services are highly discipline- and institution-specific, the contents of these sections are designed to be easily customizable.<\/li>\n<li> <b>Things to think about<\/b> - Contains notes and recommendations that do not fit into the other sections.<\/li><\/ul>\n<p>Both the rubric and the guides are intended for easy customization to reflect the terminology, tools, best practices, and services specific to different disciplinary and institutional communities. In the template guides, some suggested points of customization are highlighted in yellow (discipline-specific) and red (institution-specific). Discipline-specific versions may incorporate the jargon, workflow, standards, and priorities of researchers working in a particular domain (e.g., neuroscience<sup id=\"rdp-ebb-cite_ref-NicholsBest17_29-0\" class=\"reference\"><a href=\"#cite_note-NicholsBest17-29\" rel=\"external_link\">[28]<\/a><\/sup>). Institution-specific versions may also incorporate links to available data management, curation, and preservation tools and services.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Using_the_Support_Your_Data_materials\">Using the Support Your Data materials<\/span><\/h2>\n<p>We envision several use cases for the Support Your Data materials. The most likely is one in which these materials are used to facilitate discussion between an individual researcher or research group and a data service provider. 
In such a case, the researcher or research group can use the RDM rubric to identify the difference between where they are with regard to RDM and where they want or need to be, and then a data service provider can use the guides, customized to highlight available services and tools, to provide information about how to move forward. Another probable use case is one in which a particular research community uses these materials as part of a broader effort to improve practices related to data management (including data sharing). In this case, the organization and content of both the RDM rubric and the guides can be customized, with the assistance of data service providers, to include community-specific activities, requirements, and terminology. Though we were careful to ensure that our materials are merely descriptive, such customized versions could be more prescriptive in adhering to institutional or discipline-specific norms or policies.\n<\/p><p>Though helping researchers respond to evolving expectations related to the management and sharing of their data was a major driving force behind the project, the Support Your Data materials, at least in their current iteration, are not designed to increase compliance with specific policies or requirements. For example, though a researcher using these materials would be directed to local RDM services and tools (e.g., a local DMPTool instance) related to the creation of data management plans (DMPs), neither the rubric nor the \u201cplanning for data\u201d guide gives specific guidance on how to comply with the DMP requirements of different funding agencies. 
However, in helping researchers assess and advance their data management practices, the Support Your Data materials may indirectly help them comply more effectively with data-related requirements throughout the lifecycle of a research project.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Next_steps\">Next steps<\/span><\/h2>\n<p>Now that we have a complete set of draft materials, the next step of the Support Your Data project is to focus on design and adoption. Moving forward, we will work with internal and external partners on the visual presentation of the materials and to develop pamphlets, postcards, and other collateral. As has been the case throughout the project, we will also continue to invite feedback and explore partnerships with stakeholders interested in developing customized materials.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Supplementary_material\">Supplementary material<\/span><\/h2>\n<ul><li> <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/riojournal.com\/article\/download\/suppl\/4348984\/\" target=\"_blank\">Suppl. material 1<\/a>: A formatted version of the Support Your Data RDM rubric (.odp file)<\/li>\n<li> <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/riojournal.com\/article\/download\/suppl\/4349005\/\" target=\"_blank\">Suppl. material 2<\/a>: A draft guide that corresponds with the \"Planning your project\" row of the RDM rubric. Suggested points of customization are highlighted in yellow (discipline-specific) and red (institution-specific). (.odt file)<\/li>\n<li> <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/riojournal.com\/article\/download\/suppl\/4349007\/\" target=\"_blank\">Suppl. material 3<\/a>: A draft guide that corresponds with the \"Organizing your data\" row of the RDM rubric. Suggested points of customization are highlighted in yellow (discipline-specific) and red (institution-specific). 
(.odt file)<\/li>\n<li> <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/riojournal.com\/article\/download\/suppl\/4349008\/\" target=\"_blank\">Suppl. material 4<\/a>: A draft guide that corresponds with the \"Saving and backing up your data\" row of the RDM rubric. Suggested points of customization are highlighted in yellow (discipline-specific) and red (institution-specific). (.odt file)<\/li>\n<li> <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/riojournal.com\/article\/download\/suppl\/4349037\/\" target=\"_blank\">Suppl. material 5<\/a>: A draft guide that corresponds with the \"Getting your data ready for analysis\" row of the RDM rubric. Suggested points of customization are highlighted in yellow (discipline-specific) and red (institution-specific). (.odt file)<\/li>\n<li> <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/riojournal.com\/article\/download\/suppl\/4349038\/\" target=\"_blank\">Suppl. material 6<\/a>: A draft guide that corresponds with the \"Analyzing your data and handling the outputs\" row of the RDM rubric. Suggested points of customization are highlighted in yellow (discipline-specific) and red (institution-specific). (.odt file)<\/li>\n<li> <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/riojournal.com\/article\/download\/suppl\/4349039\/\" target=\"_blank\">Suppl. material 7<\/a>: A draft guide that corresponds with the \"Sharing and publishing your data\" row of the RDM rubric. Suggested points of customization are highlighted in yellow (discipline-specific) and red (institution-specific). 
(.odt file)<\/li><\/ul>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Hosting_institution\">Hosting institution<\/span><\/h3>\n<p>UC Curation Center, California Digital Library\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Author_contributions\">Author contributions<\/span><\/h3>\n<p>JB drafted the manuscript and led the development of the materials. SA, DL, SS, and JC co-developed the materials and reviewed the manuscript.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Conflicts_of_interest\">Conflicts of interest<\/span><\/h3>\n<p>The authors declare no conflicts of interest.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Footnotes\">Footnotes<\/span><\/h2>\n<div class=\"reflist\" style=\"list-style-type: lower-alpha;\">\n<ol class=\"references\">\n<li id=\"cite_note-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-1\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">For the purposes of this report we are using the term \u201cdata\u201d broadly to refer to the inputs or outputs required to evaluate, reproduce, or build upon the analyses or conclusions of a given research project. 
This includes, but is not limited to, raw data, processed data, research-related code, and documentation pertaining to study parameters and procedures.<\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-HoldrenIncreasing13-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HoldrenIncreasing13_2-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Holdren, J.P.&#32;(22 February 2013).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/obamawhitehouse.archives.gov\/sites\/default\/files\/microsites\/ostp\/ostp_public_access_memo_2013.pdf\" target=\"_blank\">\"Increasing Access to the Results of Federally Funded Scientific Research\"<\/a>.&#32;Office of Science and Technology Policy<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/obamawhitehouse.archives.gov\/sites\/default\/files\/microsites\/ostp\/ostp_public_access_memo_2013.pdf\" target=\"_blank\">https:\/\/obamawhitehouse.archives.gov\/sites\/default\/files\/microsites\/ostp\/ostp_public_access_memo_2013.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Increasing+Access+to+the+Results+of+Federally+Funded+Scientific+Research&amp;rft.atitle=&amp;rft.aulast=Holdren%2C+J.P.&amp;rft.au=Holdren%2C+J.P.&amp;rft.date=22+February+2013&amp;rft.pub=Office+of+Science+and+Technology+Policy&amp;rft_id=https%3A%2F%2Fobamawhitehouse.archives.gov%2Fsites%2Fdefault%2Ffiles%2Fmicrosites%2Fostp%2Fostp_public_access_memo_2013.pdf&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CollinsPolicy14-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CollinsPolicy14_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Collins, F.S.; Tabak, L.A.&#32;(2014).&#32;\"Policy: NIH plans to enhance reproducibility\".&#32;<i>Nature<\/i>&#32;<b>505<\/b>&#32;(7485): 612\u201313.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2F505612a\" target=\"_blank\">10.1038\/505612a<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Policy%3A+NIH+plans+to+enhance+reproducibility&amp;rft.jtitle=Nature&amp;rft.aulast=Collins%2C+F.S.%3B+Tabak%2C+L.A.&amp;rft.au=Collins%2C+F.S.%3B+Tabak%2C+L.A.&amp;rft.date=2014&amp;rft.volume=505&amp;rft.issue=7485&amp;rft.pages=612%E2%80%9313&amp;rft_id=info:doi\/10.1038%2F505612a&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BaroneUnmet17-4\"><span 
class=\"mw-cite-backlink\"><a href=\"#cite_ref-BaroneUnmet17_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Barone, L.; Williams, J.; Micklos, D.&#32;(2017).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5654259\" target=\"_blank\">\"Unmet needs for analyzing biological big data: A survey of 704 NSF principal investigators\"<\/a>.&#32;<i>PLOS Computational Biology<\/i>&#32;<b>13<\/b>&#32;(11): e1005858.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pcbi.1005755\" target=\"_blank\">10.1371\/journal.pcbi.1005755<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5654259\/\" target=\"_blank\">PMC5654259<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/29049281\" target=\"_blank\">29049281<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5654259\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC5654259<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Unmet+needs+for+analyzing+biological+big+data%3A+A+survey+of+704+NSF+principal+investigators&amp;rft.jtitle=PLOS+Computational+Biology&amp;rft.aulast=Barone%2C+L.%3B+Williams%2C+J.%3B+Micklos%2C+D.&amp;rft.au=Barone%2C+L.%3B+Williams%2C+J.%3B+Micklos%2C+D.&amp;rft.date=2017&amp;rft.volume=13&amp;rft.issue=11&amp;rft.pages=e1005858&amp;rft_id=info:doi\/10.1371%2Fjournal.pcbi.1005755&amp;rft_id=info:pmc\/PMC5654259&amp;rft_id=info:pmid\/29049281&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5654259&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-FedererBiomedical15-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-FedererBiomedical15_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Federer, L.M.; Lu, Y.L.; Joubert, D.J. 
et al.&#32;(2015).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4481309\" target=\"_blank\">\"Biomedical Data Sharing and Reuse: Attitudes and Practices of Clinical and Scientific Research Staff\"<\/a>.&#32;<i>PLoS One<\/i>&#32;<b>10<\/b>&#32;(6): e0129506.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pone.0129506\" target=\"_blank\">10.1371\/journal.pone.0129506<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4481309\/\" target=\"_blank\">PMC4481309<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26107811\" target=\"_blank\">26107811<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4481309\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4481309<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Biomedical+Data+Sharing+and+Reuse%3A+Attitudes+and+Practices+of+Clinical+and+Scientific+Research+Staff&amp;rft.jtitle=PLoS+One&amp;rft.aulast=Federer%2C+L.M.%3B+Lu%2C+Y.L.%3B+Joubert%2C+D.J.+et+al.&amp;rft.au=Federer%2C+L.M.%3B+Lu%2C+Y.L.%3B+Joubert%2C+D.J.+et+al.&amp;rft.date=2015&amp;rft.volume=10&amp;rft.issue=6&amp;rft.pages=e0129506&amp;rft_id=info:doi\/10.1371%2Fjournal.pone.0129506&amp;rft_id=info:pmc\/PMC4481309&amp;rft_id=info:pmid\/26107811&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4481309&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TenopirResearch14-6\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-TenopirResearch14_6-0\" rel=\"external_link\">5.0<\/a><\/sup> <sup><a href=\"#cite_ref-TenopirResearch14_6-1\" rel=\"external_link\">5.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Tenopir, C.; Sandusky, R.J.; Allard, S.; Birch, B.&#32;(2014).&#32;\"Research data management services in academic research libraries and perceptions of librarians\".&#32;<i>Library &amp; Information Science Research<\/i>&#32;<b>36<\/b>&#32;(2): 84\u201390.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.lisr.2013.11.003\" target=\"_blank\">10.1016\/j.lisr.2013.11.003<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Research+data+management+services+in+academic+research+libraries+and+perceptions+of+librarians&amp;rft.jtitle=Library+%26+Information+Science+Research&amp;rft.aulast=Tenopir%2C+C.%3B+Sandusky%2C+R.J.%3B+Allard%2C+S.%3B+Birch%2C+B.&amp;rft.au=Tenopir%2C+C.%3B+Sandusky%2C+R.J.%3B+Allard%2C+S.%3B+Birch%2C+B.&amp;rft.date=2014&amp;rft.volume=36&amp;rft.issue=2&amp;rft.pages=84%E2%80%9390&amp;rft_id=info:doi\/10.1016%2Fj.lisr.2013.11.003&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CarlsonResearch14-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CarlsonResearch14_7-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Carlson, J.&#32;(2014).&#32;\"The use of lifecycle models in developing and supporting data services\".&#32;In&#32;Ray, J.M..&#32;<i>Research Data Management: Practical Strategies for Information Professionals<\/i>.&#32;Purdue University Press.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9781557536648.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=The+use+of+lifecycle+models+in+developing+and+supporting+data+services&amp;rft.atitle=Research+Data+Management%3A+Practical+Strategies+for+Information+Professionals&amp;rft.aulast=Carlson%2C+J.&amp;rft.au=Carlson%2C+J.&amp;rft.date=2014&amp;rft.pub=Purdue+University+Press&amp;rft.isbn=9781557536648&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: 
none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CoxACritical18-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CoxACritical18_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Cox, A.M.; Tam, W.W.T..&#32;\"A critical analysis of lifecycle models of the research process and research data management\".&#32;<i>Aslib Journal of Information Management<\/i>&#32;<b>70<\/b>&#32;(2): 142-57.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1108%2FAJIM-11-2017-0251\" target=\"_blank\">10.1108\/AJIM-11-2017-0251<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+critical+analysis+of+lifecycle+models+of+the+research+process+and+research+data+management&amp;rft.jtitle=Aslib+Journal+of+Information+Management&amp;rft.aulast=Cox%2C+A.M.%3B+Tam%2C+W.W.T.&amp;rft.au=Cox%2C+A.M.%3B+Tam%2C+W.W.T.&amp;rft.volume=70&amp;rft.issue=2&amp;rft.pages=142-57&amp;rft_id=info:doi\/10.1108%2FAJIM-11-2017-0251&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LowndesOurPath17-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LowndesOurPath17_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Lowndes, J.S.S.; Best, B.D.; Scarborough, C. 
et al.&#32;(2017).&#32;\"Our path to better science in less time using open data science tools\".&#32;<i>Nature Ecology and Evolution<\/i>&#32;<b>1<\/b>: 0160.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fs41559-017-0160\" target=\"_blank\">10.1038\/s41559-017-0160<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Our+path+to+better+science+in+less+time+using+open+data+science+tools&amp;rft.jtitle=Nature+Ecology+and+Evolution&amp;rft.aulast=Lowndes%2C+J.S.S.%3B+Best%2C+B.D.%3B+Scarborough%2C+C.+et+al.&amp;rft.au=Lowndes%2C+J.S.S.%3B+Best%2C+B.D.%3B+Scarborough%2C+C.+et+al.&amp;rft.date=2017&amp;rft.volume=1&amp;rft.pages=0160&amp;rft_id=info:doi\/10.1038%2Fs41559-017-0160&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MeyerPractical18-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MeyerPractical18_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Meyer, M.N.&#32;(2018).&#32;\"Practical Tips for Ethical Data Sharing\".&#32;<i>Advances in Methods and Practices in Psychological Science<\/i>&#32;<b>1<\/b>&#32;(1): 131-144.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1177%2F2515245917747656\" target=\"_blank\">10.1177\/2515245917747656<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Practical+Tips+for+Ethical+Data+Sharing&amp;rft.jtitle=Advances+in+Methods+and+Practices+in+Psychological+Science&amp;rft.aulast=Meyer%2C+M.N.&amp;rft.au=Meyer%2C+M.N.&amp;rft.date=2018&amp;rft.volume=1&amp;rft.issue=1&amp;rft.pages=131-144&amp;rft_id=info:doi\/10.1177%2F2515245917747656&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GoodmanTen14-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GoodmanTen14_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Goodman, A.; Pepe, A.; Blocker, A.W. et al..&#32;\"Ten Simple Rules for the Care and Feeding of Scientific Data\".&#32;<i>PLoS Computational Biology<\/i>&#32;<b>10<\/b>&#32;(4): e1003542.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pcbi.1003542\" target=\"_blank\">10.1371\/journal.pcbi.1003542<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Ten+Simple+Rules+for+the+Care+and+Feeding+of+Scientific+Data&amp;rft.jtitle=PLoS+Computational+Biology&amp;rft.aulast=Goodman%2C+A.%3B+Pepe%2C+A.%3B+Blocker%2C+A.W.+et+al.&amp;rft.au=Goodman%2C+A.%3B+Pepe%2C+A.%3B+Blocker%2C+A.W.+et+al.&amp;rft.volume=10&amp;rft.issue=4&amp;rft.pages=e1003542&amp;rft_id=info:doi\/10.1371%2Fjournal.pcbi.1003542&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-WilkinsonTheFAIR16-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WilkinsonTheFAIR16_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J. et al.&#32;(2016).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4792175\" target=\"_blank\">\"The FAIR Guiding Principles for scientific data management and stewardship\"<\/a>.&#32;<i>Scientific Data<\/i>&#32;<b>3<\/b>: 160018.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fsdata.2016.18\" target=\"_blank\">10.1038\/sdata.2016.18<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4792175\/\" target=\"_blank\">PMC4792175<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26978244\" target=\"_blank\">26978244<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4792175\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4792175<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+FAIR+Guiding+Principles+for+scientific+data+management+and+stewardship&amp;rft.jtitle=Scientific+Data&amp;rft.aulast=Wilkinson%2C+M.D.%3B+Dumontier%2C+M.%3B+Aalbersberg%2C+I.J.+et+al.&amp;rft.au=Wilkinson%2C+M.D.%3B+Dumontier%2C+M.%3B+Aalbersberg%2C+I.J.+et+al.&amp;rft.date=2016&amp;rft.volume=3&amp;rft.pages=160018&amp;rft_id=info:doi\/10.1038%2Fsdata.2016.18&amp;rft_id=info:pmc\/PMC4792175&amp;rft_id=info:pmid\/26978244&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4792175&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-IoannidisHowTo14-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-IoannidisHowTo14_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ioannidis, J.P.A..&#32;\"How to Make More Published Research True\".&#32;<i>PLoS Medicine<\/i>&#32;<b>11<\/b>&#32;(10): e1001747.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pmed.1001747\" target=\"_blank\">10.1371\/journal.pmed.1001747<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=How+to+Make+More+Published+Research+True&amp;rft.jtitle=PLoS+Medicine&amp;rft.aulast=Ioannidis%2C+J.P.A.&amp;rft.au=Ioannidis%2C+J.P.A.&amp;rft.volume=11&amp;rft.issue=10&amp;rft.pages=e1001747&amp;rft_id=info:doi\/10.1371%2Fjournal.pmed.1001747&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Munaf.C3.B2AManifesto17-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Munaf.C3.B2AManifesto17_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Munaf\u00f2, M.R.; Nosek, B.A.; Bishop, D.V.M. et al.&#32;(2017).&#32;\"A manifesto for reproducible science\".&#32;<i>Nature Human Behaviour<\/i>&#32;<b>1<\/b>: 0021.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fs41562-016-0021\" target=\"_blank\">10.1038\/s41562-016-0021<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+manifesto+for+reproducible+science&amp;rft.jtitle=Nature+Human+Behaviour&amp;rft.aulast=Munaf%C3%B2%2C+M.R.%3B+Nosek%2C+B.A.%3B+Bishop%2C+D.V.M.+et+al.&amp;rft.au=Munaf%C3%B2%2C+M.R.%3B+Nosek%2C+B.A.%3B+Bishop%2C+D.V.M.+et+al.&amp;rft.date=2017&amp;rft.volume=1&amp;rft.pages=0021&amp;rft_id=info:doi\/10.1038%2Fs41562-016-0021&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CoxDevelop17-15\"><span class=\"mw-cite-backlink\"><a 
href=\"#cite_ref-CoxDevelop17_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Cox, A.M.; Kennan, M.A.; Lyon, L. et al.&#32;(2017).&#32;\"Developments in research data management in academic libraries: Towards an understanding of research data service maturity\".&#32;<i>Journal of the Association for Information Science and Technology<\/i>&#32;<b>68<\/b>&#32;(9): 2182-2200.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1002%2Fasi.23781\" target=\"_blank\">10.1002\/asi.23781<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Developments+in+research+data+management+in+academic+libraries%3A+Towards+an+understanding+of+research+data+service+maturity&amp;rft.jtitle=Journal+of+the+Association+for+Information+Science+and+Technology&amp;rft.aulast=Cox%2C+A.M.%3B+Kennan%2C+M.A.%3B+Lyon%2C+L.+et+al.&amp;rft.au=Cox%2C+A.M.%3B+Kennan%2C+M.A.%3B+Lyon%2C+L.+et+al.&amp;rft.date=2017&amp;rft.volume=68&amp;rft.issue=9&amp;rft.pages=2182-2200&amp;rft_id=info:doi\/10.1002%2Fasi.23781&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-FloresLibraries15-16\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-FloresLibraries15_16-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Flores, J.R.; Brodeur, J.J.; Daniels, M.G. 
et al.&#32;(2015).&#32;\"Libraries and the Research Data Management Landscape\".&#32;In&#32;Maclachlan, J.C.; Waraksa, E.A.; Williford, C..&#32;<i>The Process of Discovery: The CLIR Postdoctoral Fellowship Program and the Future of the Academy<\/i>.&#32;Council on Library and Information Resources.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9781932326529.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Libraries+and+the+Research+Data+Management+Landscape&amp;rft.atitle=The+Process+of+Discovery%3A+The+CLIR+Postdoctoral+Fellowship+Program+and+the+Future+of+the+Academy&amp;rft.aulast=Flores%2C+J.R.%3B+Brodeur%2C+J.J.%3B+Daniels%2C+M.G.+et+al.&amp;rft.au=Flores%2C+J.R.%3B+Brodeur%2C+J.J.%3B+Daniels%2C+M.G.+et+al.&amp;rft.date=2015&amp;rft.pub=Council+on+Library+and+Information+Resources&amp;rft.isbn=9781932326529&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BorghiData18-17\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-BorghiData18_17-0\" rel=\"external_link\">16.0<\/a><\/sup> <sup><a href=\"#cite_ref-BorghiData18_17-1\" rel=\"external_link\">16.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Borghi, J.A.; Van Gulick, A.E.&#32;(2018).&#32;\"Data management and sharing in neuroimaging: Practices and perceptions of MRI researchers\".&#32;<i>bioRxiv<\/i>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pone.0200562\" 
target=\"_blank\">10.1371\/journal.pone.0200562<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Data+management+and+sharing+in+neuroimaging%3A+Practices+and+perceptions+of+MRI+researchers&amp;rft.jtitle=bioRxiv&amp;rft.aulast=Borghi%2C+J.A.%3B+Van+Gulick%2C+A.E.&amp;rft.au=Borghi%2C+J.A.%3B+Van+Gulick%2C+A.E.&amp;rft.date=2018&amp;rft_id=info:doi\/10.1371%2Fjournal.pone.0200562&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WittContruct09-18\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-WittContruct09_18-0\" rel=\"external_link\">17.0<\/a><\/sup> <sup><a href=\"#cite_ref-WittContruct09_18-1\" rel=\"external_link\">17.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Witt, M.; Carlson, J.; Brandt, D.S.; Cragin, M.H.&#32;(2009).&#32;\"Constructing Data Curation Profiles\".&#32;<i>International Journal of Digital Curation<\/i>&#32;<b>4<\/b>&#32;(3): 93-103.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.2218%2Fijdc.v4i3.117\" target=\"_blank\">10.2218\/ijdc.v4i3.117<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Constructing+Data+Curation+Profiles&amp;rft.jtitle=International+Journal+of+Digital+Curation&amp;rft.aulast=Witt%2C+M.%3B+Carlson%2C+J.%3B+Brandt%2C+D.S.%3B+Cragin%2C+M.H.&amp;rft.au=Witt%2C+M.%3B+Carlson%2C+J.%3B+Brandt%2C+D.S.%3B+Cragin%2C+M.H.&amp;rft.date=2009&amp;rft.volume=4&amp;rft.issue=3&amp;rft.pages=93-103&amp;rft_id=info:doi\/10.2218%2Fijdc.v4i3.117&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PaulkCapability93-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PaulkCapability93_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Paulk, M.C.; Curtis, B.; Chrissis, M.B.; Weber, C.V.&#32;(1993).&#32;\"Capability maturity model, version 1.1\".&#32;<i>IEEE Software<\/i>&#32;<b>10<\/b>&#32;(4): 18-27.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2F52.219617\" target=\"_blank\">10.1109\/52.219617<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Capability+maturity+model%2C+version+1.1&amp;rft.jtitle=IEEE+Software&amp;rft.aulast=Paulk%2C+M.C.%3B+Curtis%2C+B.%3B+Chrissis%2C+M.B.%3B+Weber%2C+C.V.&amp;rft.au=Paulk%2C+M.C.%3B+Curtis%2C+B.%3B+Chrissis%2C+M.B.%3B+Weber%2C+C.V.&amp;rft.date=1993&amp;rft.volume=10&amp;rft.issue=4&amp;rft.pages=18-27&amp;rft_id=info:doi\/10.1109%2F52.219617&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: 
none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CrowstonACapability12-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CrowstonACapability12_20-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Crowston, K.; Qin, J.&#32;(2012).&#32;\"A capability maturity model for scientific data management: Evidence from the literature\".&#32;<i>Proceedings of the American Society for Information Science and Technology<\/i>&#32;<b>48<\/b>&#32;(1): 1-9.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1002%2Fmeet.2011.14504801036\" target=\"_blank\">10.1002\/meet.2011.14504801036<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+capability+maturity+model+for+scientific+data+management%3A+Evidence+from+the+literature&amp;rft.jtitle=Proceedings+of+the+American+Society+for+Information+Science+and+Technology&amp;rft.aulast=Crowston%2C+K.%3B+Qin%2C+J.&amp;rft.au=Crowston%2C+K.%3B+Qin%2C+J.&amp;rft.date=2012&amp;rft.volume=48&amp;rft.issue=1&amp;rft.pages=1-9&amp;rft_id=info:doi\/10.1002%2Fmeet.2011.14504801036&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SallansResearch14-21\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-SallansResearch14_21-0\" rel=\"external_link\">20.0<\/a><\/sup> <sup><a href=\"#cite_ref-SallansResearch14_21-1\" rel=\"external_link\">20.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation book\">Sallans, A.; Lake, S.&#32;(2014).&#32;\"Data management assessment and planning tools\".&#32;In&#32;Ray, 
J.M..&#32;<i>Research Data Management: Practical Strategies for Information Professionals<\/i>.&#32;Purdue University Press.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9781557536648.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Data+management+assessment+and+planning+tools&amp;rft.atitle=Research+Data+Management%3A+Practical+Strategies+for+Information+Professionals&amp;rft.aulast=Sallans%2C+A.%3B+Lake%2C+S.&amp;rft.au=Sallans%2C+A.%3B+Lake%2C+S.&amp;rft.date=2014&amp;rft.pub=Purdue+University+Press&amp;rft.isbn=9781557536648&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SPARCHowOpen-22\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SPARCHowOpen_22-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/sparcopen.org\/our-work\/howopenisit\/\" target=\"_blank\">\"HowOpenIsIt? 
A Guide for Evaluating the Openness of Journals\"<\/a>.&#32;New Venture Fund.&#32;2013<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/sparcopen.org\/our-work\/howopenisit\/\" target=\"_blank\">https:\/\/sparcopen.org\/our-work\/howopenisit\/<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=HowOpenIsIt%3F+A+Guide+for+Evaluating+the+Openness+of+Journals&amp;rft.atitle=&amp;rft.date=2013&amp;rft.pub=New+Venture+Fund&amp;rft_id=https%3A%2F%2Fsparcopen.org%2Four-work%2Fhowopenisit%2F&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AwreResearch15-23\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AwreResearch15_23-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Awre, C.; Baxter, J.; Clifford, B. 
et al.&#32;(2015).&#32;\"Research Data Management as a \u201cwicked problem\u201d\".&#32;<i>Library Review<\/i>&#32;<b>64<\/b>&#32;(4\/5): 356-371.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1108%2FLR-04-2015-0043\" target=\"_blank\">10.1108\/LR-04-2015-0043<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Research+Data+Management+as+a+%E2%80%9Cwicked+problem%E2%80%9D&amp;rft.jtitle=Library+Review&amp;rft.aulast=Awre%2C+C.%3B+Baxter%2C+J.%3B+Clifford%2C+B.+et+al.&amp;rft.au=Awre%2C+C.%3B+Baxter%2C+J.%3B+Clifford%2C+B.+et+al.&amp;rft.date=2015&amp;rft.volume=64&amp;rft.issue=4%2F5&amp;rft.pages=356-371&amp;rft_id=info:doi\/10.1108%2FLR-04-2015-0043&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ANDSCreating-24\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ANDSCreating_24-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ands.org.au\/guides\/creating-a-data-management-framework\" target=\"_blank\">\"Creating a data management framework\"<\/a>.&#32;Australian National Data Service.&#32;2011<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.ands.org.au\/guides\/creating-a-data-management-framework\" target=\"_blank\">http:\/\/www.ands.org.au\/guides\/creating-a-data-management-framework<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Creating+a+data+management+framework&amp;rft.atitle=&amp;rft.date=2011&amp;rft.pub=Australian+National+Data+Service&amp;rft_id=http%3A%2F%2Fwww.ands.org.au%2Fguides%2Fcreating-a-data-management-framework&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DCC_CARDIO-25\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DCC_CARDIO_25-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/cardio.dcc.ac.uk\/about\/\" target=\"_blank\">\"Collaborative Assessment of Research Data Infrastructure and Objectives (CARDIO)\"<\/a>.&#32;Digital Curation Centre.&#32;2013<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/cardio.dcc.ac.uk\/about\/\" target=\"_blank\">https:\/\/cardio.dcc.ac.uk\/about\/<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Collaborative+Assessment+of+Research+Data+Infrastructure+and+Objectives+%28CARDIO%29&amp;rft.atitle=&amp;rft.date=2013&amp;rft.pub=Digital+Curation+Centre&amp;rft_id=https%3A%2F%2Fcardio.dcc.ac.uk%2Fabout%2F&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MatternUsing15-26\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MatternUsing15_26-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Mattern, E.; Jeng, W.; He, D.&#32;(2015).&#32;\"Using participatory design and visual 
narrative inquiry to investigate researchers\u2019 data challenges and recommendations for library research data services\".&#32;<i>Program: electronic library and information systems<\/i>&#32;<b>49<\/b>&#32;(4): 408-423.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1108%2FPROG-01-2015-0012\" target=\"_blank\">10.1108\/PROG-01-2015-0012<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Using+participatory+design+and+visual+narrative+inquiry+to+investigate+researchers%E2%80%99+data+challenges+and+recommendations+for+library+research+data+services&amp;rft.jtitle=Program%3A+electronic+library+and+information+systems&amp;rft.aulast=Mattern%2C+E.%3B+Jeng%2C+W.%3B+He%2C+D.&amp;rft.au=Mattern%2C+E.%3B+Jeng%2C+W.%3B+He%2C+D.&amp;rft.date=2015&amp;rft.volume=49&amp;rft.issue=4&amp;rft.pages=408-423&amp;rft_id=info:doi\/10.1108%2FPROG-01-2015-0012&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BorghiDeveloping17-27\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BorghiDeveloping17_27-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Borghi, J.A.; Abrams, S.; Chodacki, J. 
et al.&#32;(22 September 2017).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/zenodo.org\/record\/1213384\" target=\"_blank\">\"Developing a Data Management Guide for Researchers\"<\/a>.&#32;<i>Zenodo<\/i>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.5281%2Fzenodo.1213384\" target=\"_blank\">10.5281\/zenodo.1213384<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/zenodo.org\/record\/1213384\" target=\"_blank\">https:\/\/zenodo.org\/record\/1213384<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Developing+a+Data+Management+Guide+for+Researchers&amp;rft.atitle=Zenodo&amp;rft.aulast=Borghi%2C+J.A.%3B+Abrams%2C+S.%3B+Chodacki%2C+J.+et+al.&amp;rft.au=Borghi%2C+J.A.%3B+Abrams%2C+S.%3B+Chodacki%2C+J.+et+al.&amp;rft.date=22+September+2017&amp;rft_id=info:doi\/10.5281%2Fzenodo.1213384&amp;rft_id=https%3A%2F%2Fzenodo.org%2Frecord%2F1213384&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BorghiSupport18-28\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BorghiSupport18_28-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Borghi, J.A.; Abrams, S.; Lowenberg, D. 
et al.&#32;(21 March 2018).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/zenodo.org\/record\/1204885\" target=\"_blank\">\"Support Your Data: A Data Management Guide for Researchers\"<\/a>.&#32;<i>Zenodo<\/i>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.5281%2Fzenodo.1204885\" target=\"_blank\">10.5281\/zenodo.1204885<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/zenodo.org\/record\/1204885\" target=\"_blank\">https:\/\/zenodo.org\/record\/1204885<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Support+Your+Data%3A+A+Data+Management+Guide+for+Researchers&amp;rft.atitle=Zenodo&amp;rft.aulast=Borghi%2C+J.A.%3B+Abrams%2C+S.%3B+Lowenberg%2C+D.+et+al.&amp;rft.au=Borghi%2C+J.A.%3B+Abrams%2C+S.%3B+Lowenberg%2C+D.+et+al.&amp;rft.date=21+March+2018&amp;rft_id=info:doi\/10.5281%2Fzenodo.1204885&amp;rft_id=https%3A%2F%2Fzenodo.org%2Frecord%2F1204885&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NicholsBest17-29\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NicholsBest17_29-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Nichols. T.E.; Das, S.; Eickhoff, S.B. 
et al.&#32;(2017).&#32;\"Best practices in data analysis and sharing in neuroimaging using MRI\".&#32;<i>Nature Neuroscience<\/i>&#32;<b>20<\/b>: 299\u2013303.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnn.4500\" target=\"_blank\">10.1038\/nn.4500<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Best+practices+in+data+analysis+and+sharing+in+neuroimaging+using+MRI&amp;rft.jtitle=Nature+Neuroscience&amp;rft.aulast=Nichols.+T.E.%3B+Das%2C+S.%3B+Eickhoff%2C+S.B.+et+al.&amp;rft.au=Nichols.+T.E.%3B+Das%2C+S.%3B+Eickhoff%2C+S.B.+et+al.&amp;rft.date=2017&amp;rft.volume=20&amp;rft.pages=299%E2%80%93303&amp;rft_id=info:doi\/10.1038%2Fnn.4500&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. Footnotes were originally numbered but have been converted to lowercase alpha for this version. 
The original article lists references alphabetically, but this version\u2014by design\u2014lists them in order of appearance.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers\">https:\/\/www.limswiki.org\/index.php\/Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","5084f989065d7c37f4ccf170c3f09ee7_images":[],"5084f989065d7c37f4ccf170c3f09ee7_timestamp":1544815907,"8468ac745333952ccc234d2243224725_type":"article","8468ac745333952ccc234d2243224725_title":"Technology transfer and true transformation: Implications for open data (Bezuidenhout 2017)","8468ac745333952ccc234d2243224725_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data","8468ac745333952ccc234d2243224725_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:Technology transfer and true transformation: Implications for open data\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nTechnology transfer and true transformation: Implications for open dataJournal\n \nData Science JournalAuthor(s)\n \nBezuidenhout, LouiseAuthor affiliation(s)\n \nUniversity of OxfordPrimary contact\n \nEmail: louise dot bezuidenhout at insis dot ox dot ac dot ukYear published\n \n2017Volume and issue\n \n16Page(s)\n \n26DOI\n \n10.5334\/dsj-2017-026ISSN\n \n1683-1470Distribution license\n \nCreative Commons Attribution 4.0 InternationalWebsite\n \nhttps:\/\/datascience.codata.org\/articles\/10.5334\/dsj-2017-026\/Download\n \nhttps:\/\/datascience.codata.org\/articles\/10.5334\/dsj-2017-026\/galley\/678\/download\/ (PDF)\n\nContents\n\n1 Abstract \n2 Introduction \n\n2.1 Unpacking the links between laboratory equipment and open data \n\n\n3 It's not just the equipment \n\n3.1 The equipment is ... \n\n3.1.1 ... not there ... \n3.1.2 ... broken ... \n3.1.3 ... not running ... \n\n\n\n\n4 Open data, technological difficulties and the slow pace of research \n\n4.1 A need for speed \n4.2 Data quality \n\n\n5 Technology transfer: The solution to pace and openness? 
\n\n5.1 Problems with technology transfer \n5.2 Gerson\u2019s model of technology transfer \n5.3 Avoiding \"insidious inequalities\" \n\n\n6 Building capacity in research ... and in open data \n\n6.1 Recognizing problems, designing solutions \n6.2 Creating \u201csafe spaces\u201d to discuss these issues \n6.3 More holistic approaches to capacity building \n\n\n7 Conclusion \n8 Additional file \n9 Footnotes \n10 Acknowledgements \n\n10.1 Competing interests \n\n\n11 References \n12 Notes \n\n\n\nAbstract \nWhen considering the \u201copenness\u201d of data, it is unsurprising that most conversations focus on the online environment\u2014how data is collated, moved, and recombined for multiple purposes. Nonetheless, it is important to recognize that the movements online are only part of the data lifecycle. Indeed, considering where and how data are created\u2014namely, the research setting\u2014is of key importance to open data initiatives. In particular, such insights offer key understandings of how and why scientists engage in practices of openness, and how data transitions from personal control to public ownership.\nThis paper examines research settings in low\/middle-income countries (LMIC) to better understand how resource limitations influence open data buy-in. Using empirical fieldwork in Kenyan and South African laboratories, it draws attention to some key issues currently overlooked in open data discussions. First, many of the hesitations raised by the scientists about sharing data were as much tied to the speed of their research as to any other factor. Thus, it would seem that the longer it takes for individual scientists to create data, the more hesitant they are about sharing it. Second, the pace of research is a multifaceted bind involving many different challenges relating to laboratory equipment and infrastructure. 
Indeed, it is unlikely that one single solution (such as equipment donation) will ameliorate these \u201cbinds of pace.\u201d Third, these \u201cbinds of pace\u201d were used by the scientists to construct \u201cnarratives of exclusion\u201d through which they remove themselves from responsibility for data sharing.\nUsing an adapted model of technology first proposed by Elihu Gerson, the paper then offers key ways in which these critical \u201cbinds of pace\u201d can be addressed in open data discourse. In particular, it calls for an expanded understanding of laboratory equipment and research speed to include all aspects of the research environment. It also advocates for better engagement with LMIC scientists regarding these challenges and the adoption of frugal\/responsible design principles in future open data initiatives.\nKeywords: technology, low\/middle-income countries, data sharing, research, pace \n\nIntroduction \nThe issue of increasing the openness of data online is a global priority. Indeed, open data is increasingly featuring on agendas of both high- and low\/middle-income country development plans.[1] Nevertheless, data sharing in low\/middle-income countries (LMICs) is challenged by a number of widely-recognized issues. These include a lack of resources for sharing activities[2] as well as for research activities more generally. Strategically increasing research capacity in LMICs\u2014and thus the ability of LMIC researchers to participate in the open data movement\u2014is intrinsically tied (at least in part) to the need for increasing the availability of laboratory and ICT equipment.\n\nUnpacking the links between laboratory equipment and open data \nIt is recognized that the lack of up-to-date laboratory equipment hampers not only the ability to conduct certain types of research, but has an overall impact on the pace and efficiency of research. 
How to best address this lack of physical research resources is becoming a topic for directed intervention, and a number of different organizations have been set up to address issues relating to equipment provision. These include databases of equipment[a], equipment donation schemes[b], or equipment collaborations, as well as increased equipment budgets in many funded grants.[c]\nDespite the value of these initiatives, a coordinated and sustained approach to research equipment in LMICs remains elusive for two key reasons. First, a lack of empirical evidence detailing the contextual heterogeneity of LMIC research environments challenges targeted interventions. Second, the absence of LMIC scientists in more general discussions on scientific research practices makes it difficult to pinpoint key issues that may be prevalent within these research settings. Thus, capacity building initiatives are often challenged by the absence of a clear picture of what equipment is needed and how it is best deployed in LMIC regions. It is therefore highly possible that other interventions are critically needed if this resource shortfall is to be effectively addressed.\nThe challenges of increasing research capacity through equipment-related interventions have far-reaching implications for LMIC research. In this special edition, and in related papers[3][4][5], we argue for a stronger connection between the discussions of open data and the research environment in which data are generated. The physical, social, and regulatory aspects of research environments influence how scientists are able to create, curate, and disseminate data, and thus the ability of scientists to contribute and re-use data online. 
Moreover\u2014and often overlooked\u2014the characteristics and challenges of personal research environments can influence the importance that scientists attach to the open data movement.[3][4][5]\nNonetheless, in many discussions on open data there is an absence of robust discussion on the influence of the physical research environment on data engagement activities. This paper examines this issue in more detail by examining four interlinking questions. First, to what extent do issues relating to technology affect the pace of research in these laboratories? Second, could these issues of pace be ameliorated by the directed provision of more equipment\u2014particularly high-level, specialized machinery? Third, how can reflecting on issues to do with technology contribute towards more inclusive discussion surrounding open data? Finally, how can a better understanding of research technologies enable more contextually-sensitive discussions about data engagement?\nIn order to unpack these questions in detail, the paper discusses qualitative fieldwork conducted in four African laboratories between 2014 and 2015. This fieldwork was designed to investigate data engagement activities among scientists working in resource-limited environments. From these interviews, the paper highlights how issues of data engagement and issues of equipment provision were inextricably intertwined and often interdependent. If these issues are to be effectively addressed in open data discussions, the paper suggests that an expanded definition of \u201cresearch technologies\u201d is necessary. 
Using a model proposed by Elihu Gerson, the paper then offers key ways in which the critical issues of technological contextuality can be effectively implemented into open data discourse.\n\nIt's not just the equipment \nWhen considering laboratory equipment and research it is tempting to make the assumption that more\u2014and newer\u2014equipment leads to more productive research that is conducted at a faster speed with increased outputs (such as data). Indeed, such assumptions drive many of the equipment-focused initiatives mentioned above. Similarly, it is tempting to extend such assumptions to open data conversations. If more equipment will facilitate the faster production of increased amounts of data, the argument would go, then scientists will be more able (and willing) to share their data online.\nWhile these arguments make a compelling case, examination of the current status quo indicates a need for caution. Indeed, if the causal links between equipment provision, increased research pace, and improved open outputs were that straightforward, data sharing should be markedly increased by the provision of (any) laboratory equipment. Such questions motivated a period of embedded fieldwork in Kenya and South Africa between 2014 and 2015. I wanted to examine how scientists in low-resourced research settings engaged in open data activities and discussions\u2014and whether their physical laboratory environment had any influence over this engagement.[d] Over the course of the year I spent three to six weeks in four different chemistry laboratories and conducted 56 semi-structured interviews with researchers and postgraduate students to find out what was working in their research environments, and what challenged their ability to generate, curate, store, share, and re-use data online.\nUpon analyzing the interviews, the issue of pace in research was unavoidable. Indeed, it was everywhere. 
Concerns about the slowness of research, and the pressure to speed it up, pervaded how the scientists talked about their research, valued their data, identified threats to their sovereignty and acquisition of credit, positioned themselves within the scientific community, and evaluated the international community\u2019s efforts to assist them. These issues have been discussed in other papers[3][4][5] and will not be covered here. Instead, this paper takes a step back to look at why there was this overwhelming awareness of pace in these laboratories. What aspects of the laboratory equipment played key roles in controlling the pace of research, and consequently the engagement of scientists in open data activities?\n\nThe equipment is ... \nThe laboratories that I visited were not members of high-profile consortia or integrated into well-funded foreign research networks. Rather, they were good examples of home-grown science. They produced high-quality research but were dependent on funding from multiple national and international sources. Moreover, their facilities\u2014and the budget to maintain or upgrade them\u2014were provided by their host institutions. This created a bind for the researchers, as the facilities provided were often minimal and\/or badly maintained, and their institutions did not have large amounts of \u201ccore funding\u201d for upgrades. As one Kenyan participant said:\n\nWe get no funding from the government. We get paid from the government, we get bills of power and water by the government but otherwise, other than that, the materials that we need for research we have to source from funding agencies. (KY1:8)\nSimilarly, as most of the funding for their research came from project-specific grants, the researchers had few opportunities to secure money for standard laboratory equipment or general laboratory maintenance. 
A participant in South Africa eloquently said, when talking about her research that:\n\n[it] is a challenge because the university doesn\u2019t offer a start-up fund for equipment. \u2026 I would need to pay bit by bit and one by one. When I have funding then buy one piece of equipment and maybe after five years I would have my lab. (SA2:11)\nMoreover, even when the money was there, many of the participants said that they experienced problems accessing it, or using it to address the challenges that they identified in their daily research environment. This is evident in a quote by another South African participant who said:\n\nIt\u2019s really bad \u2013 the bureaucracy of it. It\u2019s how the money is transferred, technical services, procurement, all those \u2026 but those are like \u201cgrand problems\u201d that you can\u2019t solve. (SA2:6)\nThus, a lot of the discussions I had about research and data engagement became discussions about equipment and research environments. The researchers I interviewed highlighted a number of key issues that affected the pace of their research in comparison (in their opinion) to well-resourced laboratories. In particular, the statements related to the \u201cun-usability\u201d of the equipment that was available for them to use. These statements are broadly grouped under the headings below.\n\n... not there ... \nOne of the most common complaints I heard in all four laboratories was that the equipment available for research curtailed the types of research that could be done by the researchers. While this is, of course, an issue for scientists around the world, for many of the researchers that I interviewed this was almost a deal-breaking aspect of their research plans. As one Kenyan participant observed:\n\nthe lack of equipment limits the extent to which you can do research \u2013 and even the type of research that you want to do. And you ask yourself, ok, so I want to do this kind of research but do I have the machinery? 
(KY2:3)\nSimilarly, a participant from the other Kenyan site said: \n\n[o]ur labs are not even there for synthesis \u2013 synthetic work \u2013 the environment is not there. So when it comes to that I either have to skip it or I have to go to a lab that has such facilities. (KY1:3)\nThese constraints not only shaped the research being conducted in these environments, but they also necessitated that a number of researchers change the direction of their research in order to fit in with the equipment available. Particularly in Kenya there were a number of lecturers and professors who had done postgraduate training in the U.K. or U.S., but they were unable to capitalize on their research experiences back home. This was described by one Kenyan professor who said:\n\nthe kind of research which is taking place here is a bit different from what I was doing \u2013 like in the UK I was doing synthetic organic chemistry. And the kind of equipment and the rest, it was purely on silicone chemistry and the reagents and the rest I couldn\u2019t get them here. So what I had to do was to look for things which are relevant for this institution. (KY1:1)\nIn addition to shaping the types\u2014and thus the broad pace\u2014of research, the lack of equipment also had an impact on the daily pace of research activities in the laboratory. This is evident in the exchange below, where the participant (a postgraduate student) explains day-to-day practices within the laboratory. In particular, he highlights how sharing basic equipment plays a highly influential role on how much he can work on a day-to-day basis, and thus how much data he can produce. As there were six postgraduate students sharing one evaporator, one can only imagine their frustration.\n\nParticipant: The solvents and reagents we have all, but the equipment\u2013some equipments are missing. But we do the best we can.\r\nLB: And with so many in the lab there must be high competition to use the equipment.\r\nParticipant: Yeah! 
For example, this evaporator, we all use it. So we have to use it at a certain time and you when you leave it the other person wants to use it and so on and so on.\r\nLB: So there is a schedule.\r\nParticipant: So for us to work very well, so everyone should have at least an evaporator like this so that you can use it at any time. In that case it can become very easier, instead of sharing \u2013 it\u2019s not easy. (KY1:6)\nThe absence of laboratory equipment thus created two different pace-related binds for the researchers that were interviewed. Not only did it shape the types of research that could be conducted, thus affecting the long-term pace of research, but it also shaped the pace of daily research. In this, it was often the absence of multiple copies of generic equipment\u2014evaporators, Gilson pipettes, glassware, water baths and so forth\u2014that played key roles in slowing down the amount of experiments that could be done by one individual on a daily basis.\n\n... broken ... \nAt one of the Kenyan universities I was given a tour around the laboratories and shown the available equipment for research. I took the following note in my field diary after some discussion with my guides:\n\nThis department has been donated an NMR machine by a laboratory in the U.S.A. When it arrived it needed to be calibrated and set up. It would also seem that some parts needed to be replaced in order to get it working. However, there is no technical support for this make and model [it is an older version of the current one on the market] in East Africa, and the only place with spare parts and a qualified technician is in South Africa. This creates situation in which they are expected to be grateful for donations, but age of machine and lack of funds for upkeep makes it obsolete before it is delivered. (KY2 field diary: day 3)\nIndeed, the NMR machine had never been in use, as the laboratory lacked the money to fly the technician from South Africa. 
Such lack of technical support and funds available for maintenance and upkeep were often key issues for the researchers interviewed. It was apparent that even the equipment that was bought using project funds was vulnerable to this situation after the end of the grant. Thus, while it may be assumed that many of the laboratories in sub-Saharan Africa possess quality research equipment, the lack of technical support\u2014together with the rapid obsolescence of models of research equipment\u2014cause this equipment to stand un-used.\nIt must be noted that many of the participants made use of some sort of equipment sharing, either by partnering with geographically close institutions or by sending samples away. One South African participant described this, saying:\n\n[i]t\u2019s only now we are starting collaboration in terms with sharing equipment because previously they didn\u2019t have any equipment so they were using ours but now ours is broken down and we are going back to them. (SA2:2)\nNonetheless, every single participant who discussed equipment sharing mentioned the time and frustration of not being able to do experiments in situ\u2014and the waste of time and resources necessary to take experiments to a different laboratory.\nThe inability to make full use of the equipment available was a source of considerable frustration to many of the scientists interviewed. Moreover, they perceived a lack of agency in being able to ameliorate these situations due to the constraints of project-specific funding, lack of core funding, and an absence of other pots of money that could be tapped into for repairs and maintenance. As one PI in South Africa said:\n\n[y]ou know they call us to meetings and they say we have funding for this and that. And I think \u201cgreat stuff\u201d, but I wish they would ask me what the real issues are. I\u2019ll probably tell you 100 other things outside of the money [permitted to be spent on the grant]. (SA2:1)\n... not running ... 
\nRelated to the problems experienced with broken equipment was another: not having the reagents or infrastructure to use working equipment. This was eloquently described by one of the Kenyan participants, who said:\n\n[o]ur equipment is not running or idle. We have an AS that is not operating, because we have no fume hood and now no acetylene gas. Because of this it has been idle for six years. (KY1:9)\nSimilarly, in South Africa, one participant described the challenges of working in a geographically-isolated university, saying:\n\nit has been very challenging [having the NMR machine] \u2013 it\u2019s a baby that you have to nurse all the time. Also for the liquid nitrogen that we need at first we couldn\u2019t get a source of liquid nitrogen north of [a major metropolitan area six hours away]. (SA2:6)\nThe difficulties of ensuring regular supplies of reagents, electricity, and internet connection often had a significant impact on the ability of the researchers to run what equipment was available to them. Consequently, the pace of their research slowed down almost as much as if the equipment were broken or missing.\n\nOpen data, technological difficulties and the slow pace of research \nAs the fieldwork above details, the scientists in the laboratories I visited often experienced challenges to their ability to work effectively. Absent, broken, or poorly maintained laboratory equipment slowed down their research and delayed the production and subsequent analysis of research data. Interestingly, these challenges played a big part in how they discussed their involvement\u2014or lack thereof\u2014in open data activities. Indeed, while most of the interviewees were supportive of data engagement activities in theory, there was not much data engagement occurring on a daily basis. 
These issues are elaborated on below.\n\nA need for speed \nMany of the scientists that I interviewed believed that the slower pace of their research (in comparison to high-income countries [HICs]) left them at a disadvantage when it came to data release, particularly in terms of pre-publication data release. This is evident in the exchange below:\n\nParticipant: But no in the fact that maybe I\u2019m here in the lab doing something and someone is out there in Europe and they do the same research as me and published before me so my work will be null and void.\r\nLB: So you\u2019re concerned that by making your research available other people might beat you to the post.\r\nParticipant: Yes. Because it may be null and void but you\u2019ve been in the lab for almost a year.\r\nLB: Do you think it is influenced by the resource difference between the North and the South?\r\nParticipant: We\u2019re in Africa, right. That is the West\u2014they definitely have more advanced stuff than us. So if I\u2019m doing this research for one year, someone in Europe of the U.S. they can do it in 3 or 4 months. So that is where now the issue. (KY1\/4)\nThis concern about speed of data analysis has been reported by other researchers[6][7] and has already influenced a number of data release expectations by funders and consortia (such as the MalariaGEN and H3Africa consortia). These initiatives focus predominantly on ensuring that scientists in LMICs get extended periods of time on completion of the project to process and analyze the data generated.[6][7]\nWhile extended data moratoria at the end of the project are undoubtedly valuable to enable the maximum number of publications from a research project, the quote above highlights that more is needed. 
What became apparent from the conversations with fieldwork participants was that they were conscious of the pace of their research throughout\u2014and that being slow at producing data was as pertinent as taking longer to analyze the final product. This highlights a key oversight in current data discussions, where there is no sensitivity to how mid-project data releases can be safeguarded for researchers who necessarily take longer to complete their research projects due to resource limitations.\nWhat the fieldwork identifies is the need for corresponding efforts to address the issues relating to the varying pace of data generation. More reflection and productive policies are needed that address the multitude of issues that cause this slower pace in daily research activities. Specifically, this links directly to the types, availability, maintenance, and provisioning for the equipment in the laboratory. If the entire research process occurs at a slower pace, it is unlikely (as the quotes show), that many researchers will risk sharing pre-publication data, methodologies, and other resources. This is of particular importance for scientists not involved in international research networks, and who do not have extensive support systems to draw on.\n\nData quality \nAnother key theme that emerged from the discussions about data sharing was that many of the participants were concerned that\u2014even if they did release the data\u2014data would not be re-used by their international peers. One researcher in Kenya highlighted this, saying:\n\n[t]here is a constraint. Even the conditions aren\u2019t right, so you cannot work as fast. One of the limitations is of facilities. I mean facilities that can\u2019t be considered credible for some publication. If the instruments that are there are really elementary so you have to search for instruments that aren\u2019t here and that takes some time. 
(KY2:15)\nThis was eloquently reiterated by another of his peers, who said:\n\nhow much can we do to develop our own data? What processes do we need to convince people that the data are good? (KY2:13)\nSuch statements show a distinct anxiety over the data that are being produced that is linked to the types of equipment being used to produce it. If, as is suggested in the quotes above, the equipment is older, and the methodologies are more basic, how will the data be viewed by international peers? Would it, as some of the participants suggested, not be viewed as of equal value if it is shared? In other words, would the data created in low-resourced settings be re-used at all if it is released online? Such observations link the pace of technological change within research communities to perceptions of data sharing, something that has not yet been examined in open data discussions.\nPerceptions of data becoming obsolete based on the equipment and methods used to generate it are contentious, and the validity of such positions may be argued. Nonetheless, it has far-reaching consequences for the open data movement. In a way, it may be said to offer an example of the Thomas theorem.[e] If the researchers believe that their data will be judged based on the age of their equipment and methods, they will be less inclined to go through the effort of sharing. This is particularly the case if they believe that their work will be heavily scrutinized, overlooked, or rendered obsolete upon arrival. Consequently, the pace of research is slowed down by researchers not sharing data, or delaying its release.\nTogether, these two issues were highly influential in mediating the interviewees\u2019 predominant lack of involvement in data engagement activities. 
In particular, these two issues had key effects on the lack of pre-publication release of data and the participation in online knowledge transfer (through posting of presentations online, contribution to discussion forums, release of methodologies, and so forth). The specter of \u201cbeing scooped\u201d due to the slower pace of research, coupled with the helplessness of changing the pace at which data were generated thus led to a situation in which the scientists recognized the value of increased openness in science but did little to engage.\n\nTechnology transfer: The solution to pace and openness? \nThe fieldwork described above clearly highlighted two key issues. First, the speed of research in the laboratories that I visited was influenced by the technologies available. This impacted research productivity, data production, research efficiency, and the optimal use of funding resources. Second, the issues of pace in research were intimately connected to how scientists valued their research, and subsequently how they conceived their responsibilities to be involved in data engagement activities.\nIn a way, the researchers who were interviewed constructed a \u201cnarrative of exclusion\u201d in which they (in)voluntarily opted out of open data activities. This narrative was constructed around perceptions of the pace of high-income country science and their inability to match this pace in their own research. The existence of these perceptions, and the preferred exclusion that the narrators often choose, is rarely acknowledged in open data discussions.\nThe obvious solution to these problems\u2014the solution to slower research in LMICs, to less data engagement, to more visibility of LMIC research\u2014would appear to be investment in the equipment present in these low-resourced laboratories. 
Providing more equipment to researchers currently working with the pressures described by the interview participants would seem the logical step out of this current conundrum \u2026 or would it?\n\nProblems with technology transfer \nInitiatives such as Seeding Labs[f] and the Sustainable Sciences Institute[g] have been influential in partnering HIC donors of equipment with LMIC applicants, and have considerable testimonials to bear witness to their positive impact. Nonetheless, a recent article on Seeding Labs noted that \u201c[r]ecipients pay a fraction of the equipment\u2019s cost to offset the logistical expenses \u2014 although Seeding Labs refuses to say how much \u2014 and, as buyers, they assume responsibility for setting up and maintaining the equipment\u201d.[8]\nIn light of the difficulties experienced by the Kenyan and South African laboratories in setting up their NMR machines (see fieldnote above and SA2:6), or the Kenyan laboratory\u2019s struggles with their AS machine (KY1:9), it is important not to see these initiatives as blueprint examples for generic success. Rather, the careful matching of donors and recipients, mentoring during the process of donation, and a careful analysis of what is required from the target sites are all necessary to ensure success.\nEfforts to create equipment databases to facilitate inter-institutional sharing in LMICs have struggled with similar problems. Informal discussions with scientists regarding such initiatives have brought to light key contextual concerns, such as how the user\/provider relationship will cope with issues such as payment and sourcing of reagents, maintenance, technical support, and possible damage instances. Such concerns, it must be noted, are not unique to LMICs and have similarly been discussed in relation to the EPSRC equipment sharing portal for U.K. universities.[h]\nThese problems highlight two key concerns. 
First, current approaches to technology transfer often do not take into consideration the limitations of the context in which it will be used. Providing equipment without the researchers having a sustained ability to get the reagents necessary to run it is highly problematic. Second, current approaches to technological transfer often do not take into consideration the difficulties of moving technologies across different contexts. As described by my field notes from the first Kenyan society, there is no value in a piece of equipment that cannot be calibrated or maintained due to a lack of qualified technical support.\nWhile the implications for broader, capacity-focused discussions are apparent, these observations also have important consequences for future open data discussions. What the evidence clearly suggests is that relying on LMIC researchers receiving more equipment will not necessarily influence the speed of their research, nor their willingness to share data. Thus, structuring projections on LMIC involvement in the open data movement based on a linear model of technology transfer\/research productivity is highly problematic. What open data discussions need is a new model of technology\/pace that takes these issues into account.\n\nGerson\u2019s model of technology transfer \nFirst and foremost, it is important to critique exactly what is needed from a definition of \u201ctechnology.\u201d It is often tempting to equate \u201ctechnologies\u201d to the equipment used within the laboratory, in particular the specialized (and often high-tech) machines such as NMR, polymerase chain reaction (PCR), chromatographic apparatus and so forth. In contrast, as evident from the fieldwork, these pieces of equipment are not the only causes of the pace issues experienced by the researchers interviewed. 
Indeed, the student discussing the lack of multiple evaporators in his lab (KY1:6), or the researcher who struggled to get liquid nitrogen for the AS machine (SA2:6) experienced similar problems of pace. Moreover, dissociating discussions of equipment from those relating to their running costs and infrastructural requirements is evidently limiting.\nWith this in mind, it is helpful to make use of a recent definition of \u201ctechnologies\u201d proposed by Elihu Gerson in his 2015 lecture at the International Society for the History, Philosophy and Social Studies of Biology (ISHPSSB). He proposed that \"[t]echnology can include instruments, specialized materials such as cell lines, model organisms, enzymes, antibodies etc. It also includes specialized codified procedures, such as those used in psychology, field observations etc.\"[9]\nThis expanded notion of technologies allows us to draw in the many different issues that were raised by the fieldwork participants, including the difficulties of getting reagents, of setting up laboratories and protocols, of having instruments available, and also having the expertise necessary to utilize the equipment.\nSecond, Gerson draws attention to the difficulties of moving technologies across research contexts. He highlights six ways in which attempts to introduce technologies into new environments may be problematic. These include:\n\n Materials and equipment are recalcitrant.\n Researchers can\u2019t anticipate every contingency in a situation.\n Resituating new technologies requires coordination between source and target sites.\n Repertoires for work at the target must be developed.\n New technologies must be registered at the target site.\n New technologies address phenomena in new ways.\nMoreover, it is apparent that a failure to address these concerns can result in incomplete resituation of technologies that significantly slows down research processes and stops effective data engagement. 
This is evident in many of the quotes from the fieldwork, such as the idle AS machine at one of the Kenyan sites (KY1:9), or my field note description of the donated NMR machine. It would thus appear that effective data engagement\u2014generation, storage, curation, analysis, dissemination, and re-use\u2014is directly tied not only to issues of technology provision, but also of the (in)effective manner in which the technology is introduced into the new context. Thus, without careful attention to the contexts in which technologies are to be deployed, making assumptions about their efficacy is highly problematic.\nSuch observations are of particular importance to open data discussions that should, ideally, consider the process of data production in its entirety. Figure 1 outlines this in detail, identifying key areas of the research data cycle. While each stage is associated with technologies that could speed up research, introducing these technologies without correctly situating them within the specific context could undermine these efforts.\n\nFigure 1. Technologies associated with data engagement\n\nFigure 1\u2014and the accompanying empirical evidence\u2014highlights the potential issues that could accompany well-meaning technology provision and undermine effective data engagement. Using the adapted version of Gerson\u2019s model clearly highlights how open data discussions\u2014instead of relying on the provision of high-level laboratory equipment as a means of speeding up research and encouraging sharing\u2014need to carefully consider the research contexts, use expanded interpretations of \u201ctechnologies,\u201d and pay attention to the repertoires necessary to use available technologies. Such awareness also needs to be reflected in the design of future initiatives, so as to safeguard against the hidden binds of pace that accompany inappropriate re-situation of technologies. 
Without such an expanded focus, it is likely that LMIC scientists will continue to underperform in open data initiatives.\n\nAvoiding \"insidious inequalities\" \nIssues associated with the pace of research and incomplete technology transfer also extend beyond the sites of data creation in LMICs. While many LMICs are rapidly expanding their internet capabilities, incomplete integration of these capabilities into immediate working environments\u2014through lack of expertise, older equipment, poor maintenance and technical support, and infrastructural challenges (such as power provision)\u2014continues to challenge effective usage. As one Kenyan participant observed:\n\n[i]n Kenya people say that we have internet everywhere but really how much can you download, and you have to have the equipment to be able to. You bought the data bundle but what you have is not enough for you to download any publication or anything like that. Some areas in Kenya we know that people can\u2019t even access. Although we know the networking has been done but there is an assumption that everyone can access. (KY1:2)\nFor LMIC scientists, it would therefore seem that the connection between \u201cpace\u201d and technology is inescapable. Without directed interventions that address the integration of technologies in situ it is likely that scientists will continue to operate at the slower speed that currently characterizes much of the research in these areas. The evidence from the fieldwork presented in this paper suggests that this has multiple implications for the open data movement, both in terms of practical engagement and ideological buy-in.\nCo-partnership with other initiatives aimed at addressing equipment provision, the integration of responsive design principles into data platforms, and the provision of funds to ameliorate hurdles to data generation and dissemination will all assist in changing this current paradigm of insidious inequality. 
As said by a PI in South Africa:\n\n[t]he disadvantaged are still disadvantaged and that is the fact of the matter. The government may be willing to address the gaps, but there are still gaps. (SA2:1)\nBut with an increased awareness of the contextuality of data engagement, it is likely that the open data movement can move beyond such accusations.\n\nBuilding capacity in research ... and in open data \nCombining the fieldwork above with Gerson\u2019s model of technology transfer makes a compelling case for prioritizing the issue of pace in open data discussions. Moreover, it clearly highlights the need for further, expanded discussions on technical capabilities necessary for data engagement. It thus becomes important to ask: what can be done to effect such a change?\n\nRecognizing problems, designing solutions \nWhen discussing the issues of equipment limitations, a South African researcher made a telling remark. She said:\n\n[b]ut where I find it difficult is people don\u2019t understand our situation\u2014it\u2019s not bad will, it\u2019s just not being able to figure it out. (SA2:12)\nIn this comment she was specifically talking about the difficulty of registering online for international conferences where the high-resolution of the websites made it difficult to use in her low-bandwidth area. She described a long and annoying process of attempting to get the site to work, after which she ended up giving up and emailing the organizers. Nonetheless, as she said, creating websites that were not usable in low-bandwidth areas was not an intentional slight by the organizers. Rather, they were not aware that it could be problematic for many researchers in low-resourced environments.\nUnpacking this comment leads to a number of different issues. First and foremost, it draws attention to the lack of awareness by scientists, funders, and affiliated stakeholders in high-income countries about these problems. 
While many are aware that there are some \u201cequipment issues\u201d in low-resourced research settings, very few have a consistent and coherent impression of what these problems actually are, particularly when using an extended interpretation of technologies.\nThis lack of awareness is compounded by the rarity of detailed ethnographic analyses of technological challenges in low-resourced settings. While a number of studies on working conditions in low-resourced environments are gradually emerging[10][11][12][3], the focus of these studies is diverse and a systematic collation of the evidence with regard to research technologies is urgently needed.\nSuch awareness will be of critical importance to future open data discussions. As evident from the fieldwork, issues of the availability of effective technologies\u2014and the resultant impact on research pace\u2014were key contributors to the lack of involvement in data engagement issues in all four institutions that I visited. Not only did the lack of effective and integrated technologies slow down the pace of data generation, but the lack of up-to-date information and communications technologies (and the corresponding infrastructures and repertoires) also caused difficulties in all aspects of data engagement.\nWithout support for these issues\u2014and policies that directly address them\u2014it is difficult to see how LMIC researchers can be effectively drawn into open data discussions. Indeed, current policies do little to assuage the fears that they have relating to pace, leading (at least in part) to the lack of daily data engagement evident from the scientists I interviewed. 
As a result, the lack of contextually-sensitive policies not only led to lower levels of data contribution and re-use from scientists in these regions, but also influenced the manner in which the ideals and responsibilities of open data were discussed in these settings.\nTo counter this, open data discussions need to expand discussions on responsible innovation to include the expanded version of research technologies suggested by Gerson. Key lessons from the \u201cfrugal innovation\u201d movement[13] could also be effectively incorporated to promote cost-effective and efficient design of research technologies that will more easily adapt to low-resourced settings. Similarly, best practice by consortia such as the Global Health Network[i]\u2014where the capabilities of the \u201clowest common denominator\u201d become the guiding set of criteria for which software and webpages are designed\u2014offers important lessons that could be extended further into discussions on open data. What issues, it is necessary to ask, are really slowing down research and altering data engagement, and how can current policies and initiatives be designed to address these issues more productively?\n\nCreating \u201csafe spaces\u201d to discuss these issues \nMost of the fieldwork participants I interviewed were very forthcoming about the challenges of their research environments. This forthrightness was consistent with the numerous informal discussions I\u2019ve had with LMIC scientists at conferences, socially and in related projects. Nonetheless, the same issues that they were so willing to discuss\u2014and had such robust opinions on\u2014were rarely raised to the university governance, funders, collaborators, and stakeholders that might be able to make some difference in ameliorating them.\nThis led to many discussions about why the fieldwork participants were both so willing and unwilling to discuss their contextual challenges. 
One Kenyan scientist put it very succinctly, saying:\n\nI was worried about applying for international funding because the facilities are poor and we have to deliver. (KY2:2)\nThe fieldwork participants described a tension inherent in drawing attention to the limitations of their environment, particularly that it may have some negative impact on their ability to secure funding, disseminate results, or form collaborations.\nChallenges with the effective situation of technologies within research contexts, and the accompanying binds of pace that they create, thus rarely get raised to those in positions to effect change. It is thus apparent that scientists in LMICs need to feel comfortable and confident to raise these issues. How this can be done, of course, is open for discussion and by no means apparent. Nonetheless, creating a \u201csafe space\u201d in which they can do so is urgent. It may be possible that by operationalizing the micro-finance scheme described by Brian Rappert in this issue such a space may be created. A failure to encourage LMIC scientists to effectively discuss these binds of pace and technology has significant impact on the research in these areas. Moreover, it also has downstream effects on the uptake of open data ideals and the contribution of these scientists to the online data milieu.[3][5]\nIn order to stimulate LMIC scientists\u2019 involvement with open data initiatives it is thus vital that they are able to raise the technical challenges in their work without fear of reprisal. It must be noted that such fears may be largely unfounded, and funders, networks, and collaborators may be happy to engage with LMIC scientists on ways to ameliorate these issues. 
However, the current stalemate between lack of awareness on the former\u2019s part and lack of forthrightness on the latter\u2019s can only be broken if stakeholders are explicit about their intention to help, and their willingness to engage in productive discussion without any suggestion of reprisal.\n\nMore holistic approaches to capacity building \nIn a similar vein to Rappert\u2019s article in this issue, the fieldwork evidence makes a very strong case for the need for alternative approaches to capacity building and research funding. Combining this observation with Gerson\u2019s extended view on technology, it is evident that what is needed are funding avenues that allow researchers to address the difficulties of technology transfer and the establishment of robust technology landscapes.\nWhile project-specific funding is, of course, of vital importance to LMIC research, it is important to recognize that without corresponding support on research structure, infrastructure, technology re-situation, and establishment of social practices, research in these settings will continue at a slower pace than that of its HIC counterparts. This has significant implications for capacity building initiatives, and for the advancement of research agendas in these regions.\nSimilarly, this has significant implications for advancing open data ideals in these regions. This was recognized by a number of researchers, who drew strong correlations between the slow pace of research and the idea of open data. As one researcher in South Africa observed:\n\n[y]es we end up spending much more time than it would be in a western country. But even the power failures, for example, they are not part of data sharing but are part of the vicious circle. 
(SA2:12)\nSimilarly, the incomplete integration of technologies into the in situ research context\u2014particularly ICTs\u2014was highly problematic for many researchers who would otherwise have been interested in participating both as a data contributor and user.\nIt would therefore be of particular importance that open data initiatives start problematizing the range of research technologies necessary for effective data engagement. It is possible that considerable traction for promoting openness in research might be gained from partnering on equipment sharing initiatives, or on funding avenues that facilitate sample or staff exchanges between resource-limited settings. By tapping in to initiatives that already address technological insufficiencies and their effect on the pace of research, it is possible that a new area of discussion on openness can be leveraged.\n\nConclusion \nAll in all, this paper makes an argument for the importance of considering the technologies that underpin data engagement activities, and the ways in which they insidiously control the speed of research. By controlling the pace of research, in turn they control buy-in from science communities in ways that are usually overlooked. Without directed attempts to address the insidious inequalities in data engagement that accompany the binds of technology\/pace, it is likely that LMIC scientists will continue to struggle to make full use of the open data movement, and that in turn will undermine the egalitarian ideals of the open data movement. In particular, a multifaceted understanding of the binds of \u201cpace\u201d is needed to ensure the visibility of scholarly products from LMICs. 
Without careful and sensitive attention to these issues, it is likely that LMIC scholars will continue to exclude themselves from opportunities to share data, thus missing out on improved visibility online.[14] This impacts not only on their credibility[15] as scientists, but also precludes the effective re-use of their research efforts.[16][17]\n\nAdditional file \nAppendix: Methodologies used in study - DOI: https:\/\/doi.org\/10.5334\/dsj-2017-026.s1\n\nFootnotes \n\n\n\u2191 Such as the EPSRC\u2019s database https:\/\/equipment.data.ac.uk\/ (discussed later) \n\n\u2191 Such as Seeding Labs (discussed later) \n\n\u2191 For example, see http:\/\/www.esrc.ac.uk\/funding\/guidance-for-applicants\/changes-to-equipment-funding\/ \n\n\u2191 A full description of the methodology is given in the appendix. \n\n\u2191 The Thomas theorem was formulated in 1928 by W. I. Thomas and D. S. Thomas and states that \u201cif men define situations as real, they are real in their consequences.\u201d See The child in America: Behavior problems and programs. W.I. Thomas and D.S. Thomas. New York: Knopf, 1928: 571\u2013572. \n\n\u2191 http:\/\/seedinglabs.org\/ \n\n\u2191 http:\/\/sustainablesciences.org\/ \n\n\u2191 equipment.data.ac.uk and personal communication with U.K. institutional research officers delegated to collate institutional equipment lists \n\n\u2191 https:\/\/tghn.org and personal communication with The Global Health Network website developers \n\n\nAcknowledgements \nThanks to Brian Rappert, Sabina Leonelli, and Ann Kelly for their collaboration on the project Beyond the Digital Divide: Sharing Data Across Developing and Developed Countries Leverhulme Trust, (RPG-2013-153). 
Thanks to the two reviewers whose comments strengthened the initial manuscript.\n\nCompeting interests \nThe author has no competing interests to declare.\n\nReferences \n\n\n\u2191 Schwegmann, C.&#32;(February 2013).&#32;\"Open Data in Developing Countries\"&#32;(PDF).&#32;EPSI Platform.&#32;https:\/\/www.europeandataportal.eu\/sites\/default\/files\/2013_open_data_in_developing_countries.pdf .&#32;Retrieved 02 May 2017 . &#160; \n\n\u2191 Bull, S.&#32;(October 2016).&#32;\"Ensuring Global Equity in Open Research\".&#32;Wellcome Trust.&#32;doi:10.6084\/m9.figshare.4055181.&#32;https:\/\/figshare.com\/articles\/Review_Ensuring_global_equity_in_open_research\/4055181 .&#32;Retrieved 02 May 2017 . &#160; \n\n\u2191 3.0 3.1 3.2 3.3 3.4 Bezuidenhout, L.; Kelly, A.H.; Leonelli, S.; Rappert, B.&#32;(2016).&#32;\"\u2018$100 Is Not Much To You\u2019: Open Science and neglected accessibilities for scientific research in Africa\".&#32;Critical Public Health&#32;27&#32;(1): 39\u201349.&#32;doi:10.1080\/09581596.2016.1252832. &#160; \n\n\u2191 4.0 4.1 4.2 Bezuidenhout, L.; Rappert, B.&#32;(2016).&#32;\"What hinders data sharing in African science?\".&#32;Fourth CODESRIA Conference on Electronic Publishing: 1\u201313.&#32;http:\/\/www.codesria.org\/spip.php?article2564&amp;lang=en . &#160; \n\n\u2191 5.0 5.1 5.2 5.3 Bezuidenhout, L.; Leonelli, S.; Kelly, A.H.; Rappert, B.&#32;(2017).&#32;\"Beyond the digital divide: Towards a situated approach to open data\".&#32;Science and Public Policy&#32;44&#32;(4): 464\u201375.&#32;doi:10.1093\/scipol\/scw036. &#160; \n\n\u2191 6.0 6.1 Malaria Genomic Epidemiology Network&#32;(2008).&#32;\"A global network for investigating the genomic epidemiology of malaria\".&#32;Nature&#32;456&#32;(7223): 732\u20137.&#32;doi:10.1038\/nature07632.&#32;PMC&#160;PMC3758999.&#32;PMID&#160;19079050.&#32;http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3758999 . &#160; \n\n\u2191 7.0 7.1 Parker, M.; Bull, S.J.; de Vries, J. 
et al.&#32;(2009).&#32;\"Ethical data release in genome-wide association studies in developing countries\".&#32;PLoS Medicine&#32;6&#32;(11): e1000143.&#32;doi:10.1371\/journal.pmed.1000143.&#32;PMC&#160;PMC2771895.&#32;PMID&#160;19956792.&#32;http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC2771895 . &#160; \n\n\u2191 Venkatraman, V.&#32;(13 September 2011).&#32;\"Recycled kit equips African labs\".&#32;SciDev.Net.&#32;https:\/\/www.scidev.net\/global\/capacity-building\/feature\/recycled-kit-equips-african-labs-.html .&#32;Retrieved 03 May 2017 . &#160; \n\n\u2191 Gerson, E.M.&#32;(July 2015).&#32;\"Resituating new data collection technology\".&#32;Tremont Research Institute.&#32;https:\/\/www.academia.edu\/20947756\/Resituating_new_data_collection_technology .&#32;Retrieved 02 May 2017 . &#160; \n\n\u2191 Bezuidenhout, L.&#32;(2015).&#32;\"Ethics in the minutiae: examining the role of the physical laboratory environment in ethical discourse\".&#32;Science and Engineering Ethics&#32;21&#32;(1): 51-73.&#32;doi:10.1007\/s11948-013-9506-8.&#32;PMID&#160;24510311. &#160; \n\n\u2191 Fine, J.C.&#32;(2007).&#32;\"Investing in STI in Sub-Saharan Africa: Lessons from Collaborative Initiatives in Research and Higher Education\"&#32;(PDF).&#32;Global Forum: Building Science, Technology and Innovation Capacity For Sustainable Growth and Poverty Reduction.&#32;http:\/\/siteresources.worldbank.org\/INTSTIGLOFOR\/Resources\/Investing_in_STI_Paper_Feb06.pdf .&#32;Retrieved 02 May 2017 . &#160; \n\n\u2191 Harle, J.&#32;(November 2010).&#32;\"Growing Knowledge: Access to Research in East and Southern African Universities\".&#32;The Association of Commonwealth Universities.&#32;https:\/\/www.acu.ac.uk\/focus-areas\/arcadia-growing-knowledge .&#32;Retrieved 02 May 2017 . &#160; \n\n\u2191 Radjou, N.; Prabhu, J.; Polman, P.&#32;(2015).&#32;Frugal Innovation: How to do more with less.&#32;The Economist.&#32;pp.&#160;272.&#32;ISBN&#160;9781610395052. 
&#160; \n\n\u2191 Neylon, C.; Willmers, M.; Thomas, K.&#32;(February 2014).&#32;\"Illustrating Impact: Applying Altmetrics to Southern African Research\".&#32;University of Cape Town, Scholarly Communication in Africa Programme.&#32;http:\/\/hdl.handle.net\/11427\/2316 .&#32;Retrieved 02 May 2017 . &#160; \n\n\u2191 Piwowar, H.; Priem, J.&#32;(2013).&#32;\"The power of altmetrics on a CV\".&#32;Bulletin of the American Society for Information Science and Technology&#32;39&#32;(4): 10\u201313.&#32;doi:10.1002\/bult.2013.1720390405. &#160; \n\n\u2191 OECD&#32;(2007).&#32;\"OECD Principles and Guidelines for Access to Research Data from Public Funding\".&#32;OECD Publishing.&#32;https:\/\/www.oecd.org\/sti\/sci-tech\/38500813.pdf .&#32;Retrieved 02 May 2017 . &#160; \n\n\u2191 Piwowar, H.A.; Vision, T.J.&#32;(2013).&#32;\"Data reuse and the open data citation advantage\".&#32;PeerJ&#32;1: e175.&#32;doi:10.7717\/peerj.175. &#160; \n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. The original article lists references alphabetically, but this version\u2014by design\u2014lists them in order of appearance. Footnotes have been changed from numbers to letters as citations are currently using numbers. \"Bezuidenhout et al forthcoming\" (from the original) has since been published, and this version includes the updated citation. 
One footnote was turned into a more appropriate citation.\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\">https:\/\/www.limswiki.org\/index.php\/Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data<\/a>\nCategories: LIMSwiki journal articles (added in 2018); LIMSwiki journal articles (all); LIMSwiki journal articles on informatics; LIMSwiki journal articles on open data\n
\r\n\n\t\n\t\r\n\n\t\n\t\r\n\n\t\r\n\n\t\r\n\n\t\r\n\t\t\n\t\t\n\t\t\t\n\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t This page was last modified on 2 July 2018, at 22:51.\n\t\t\t\t\t\t\t\t\tThis page has been accessed 240 times.\n\t\t\t\t\t\t\t\t\tContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n\t\t\t\t\t\t\t\t\tPrivacy policy\n\t\t\t\t\t\t\t\t\tAbout LIMSWiki\n\t\t\t\t\t\t\t\t\tDisclaimers\n\t\t\t\t\t\t\t\n\t\t\n\t\t\n\t\t\n\n","8468ac745333952ccc234d2243224725_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Technology_transfer_and_true_transformation_Implications_for_open_data skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Technology transfer and true transformation: Implications for open data<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p>When considering the \u201copenness\u201d of data, it is unsurprising that most conversations focus on the online environment\u2014how data is collated, moved, and recombined for multiple purposes. Nonetheless, it is important to recognize that the movements online are only part of the data lifecycle. Indeed, considering where and how data are created\u2014namely, the research setting\u2014are of key importance to open data initiatives. 
In particular, such insights offer key understandings of how and why scientists engage in practices of openness, and how data transition from personal control to public ownership.\n<\/p><p>This paper examines research settings in low\/middle-income countries (LMICs) to better understand how resource limitations influence open data buy-in. Using empirical fieldwork in Kenyan and South African <a href=\"https:\/\/www.limswiki.org\/index.php\/Laboratory\" title=\"Laboratory\" target=\"_blank\" class=\"wiki-link\" data-key=\"c57fc5aac9e4abf31dccae81df664c33\">laboratories<\/a>, it draws attention to some key issues currently overlooked in open data discussions. First, many of the hesitations raised by the scientists about sharing data were as much tied to the speed of their research as to any other factor. Thus, it would seem that the longer it takes for individual scientists to create data, the more hesitant they are about sharing it. Second, the pace of research is a multifaceted bind involving many different challenges relating to laboratory equipment and infrastructure. Indeed, it is unlikely that one single solution (such as equipment donation) will ameliorate these \u201cbinds of pace.\u201d Third, these \u201cbinds of pace\u201d were used by the scientists to construct \u201cnarratives of exclusion\u201d through which they remove themselves from responsibility for data sharing.\n<\/p><p>Using an adapted model of technology first proposed by Elihu Gerson, the paper then offers key ways in which these critical \u201cbinds of pace\u201d can be addressed in open data discourse. In particular, it calls for an expanded understanding of laboratory equipment and research speed to include all aspects of the research environment. 
It also advocates for better engagement with LMIC scientists regarding these challenges and the adoption of frugal\/responsible design principles in future open data initiatives.\n<\/p><p><b>Keywords<\/b>: technology, low\/middle-income countries, data sharing, research, pace \n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>The issue of increasing the openness of data online is a global priority. Indeed, open data is increasingly featuring on agendas of both high- and low\/middle-income country development plans.<sup id=\"rdp-ebb-cite_ref-SchwegmannOpen113_1-0\" class=\"reference\"><a href=\"#cite_note-SchwegmannOpen113-1\" rel=\"external_link\">[1]<\/a><\/sup> Nevertheless, data sharing in low\/middle-income countries (LMICs) is challenged by a number of widely-recognized issues. These include a lack of resources for sharing activities<sup id=\"rdp-ebb-cite_ref-BullEnsuring16_2-0\" class=\"reference\"><a href=\"#cite_note-BullEnsuring16-2\" rel=\"external_link\">[2]<\/a><\/sup> as well as for research activities more generally. Strategically increasing research capacity in LMICs\u2014and thus the ability of LMIC researchers to participate in the open data movement\u2014is intrinsically tied (at least in part) to the need for increasing the availability of laboratory and ICT equipment.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Unpacking_the_links_between_laboratory_equipment_and_open_data\">Unpacking the links between laboratory equipment and open data<\/span><\/h3>\n<p>It is recognized that the lack of up-to-date laboratory equipment hampers not only the ability to conduct certain types of research, but has an overall impact on the pace and efficiency of research. How to best address this lack of physical research resources is becoming a topic for directed intervention, and a number of different organizations have been set up to address issues relating to equipment provision. 
These include databases of equipment<sup id=\"rdp-ebb-cite_ref-3\" class=\"reference\"><a href=\"#cite_note-3\" rel=\"external_link\">[a]<\/a><\/sup>, equipment donation schemes<sup id=\"rdp-ebb-cite_ref-4\" class=\"reference\"><a href=\"#cite_note-4\" rel=\"external_link\">[b]<\/a><\/sup>, or equipment collaborations, as well as increased equipment budgets in many funded grants.<sup id=\"rdp-ebb-cite_ref-5\" class=\"reference\"><a href=\"#cite_note-5\" rel=\"external_link\">[c]<\/a><\/sup>\n<\/p><p>Despite the value of these initiatives, a coordinated and sustained approach to research equipment in LMICs remains elusive for two key reasons. First, a lack of empirical evidence detailing the contextual heterogeneity of LMIC research environments challenges targeted interventions. Second, the absence of LMIC scientists in more general discussions on scientific research practices makes it difficult to pinpoint key issues that may be prevalent within these research settings. Thus, capacity building initiatives are often challenged by the absence of a clear picture of what equipment are needed and best deployed in LMIC regions. It is therefore highly possible that other interventions are critically needed if this resource shortfall is to be effectively addressed.\n<\/p><p>The challenges of increasing research capacity through equipment-related interventions have far-reaching implications for LMIC research. 
In this special edition, and in related papers<sup id=\"rdp-ebb-cite_ref-Bezuidenhout.24100-16_6-0\" class=\"reference\"><a href=\"#cite_note-Bezuidenhout.24100-16-6\" rel=\"external_link\">[3]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-BezuidenhoutWhat16_7-0\" class=\"reference\"><a href=\"#cite_note-BezuidenhoutWhat16-7\" rel=\"external_link\">[4]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-BezuidenhoutBeyond17_8-0\" class=\"reference\"><a href=\"#cite_note-BezuidenhoutBeyond17-8\" rel=\"external_link\">[5]<\/a><\/sup>, we argue for a stronger connection between the discussions of open data and the research environment in which data are generated. The physical aspects of research environments, as well as the social and regulatory ones, influence how scientists are able to create, curate, and disseminate data, and thus the ability of scientists to contribute and re-use data online. Moreover\u2014and often overlooked\u2014the characteristics and challenges of personal research environments can influence the importance that scientists attach to the open data movement.<sup id=\"rdp-ebb-cite_ref-Bezuidenhout.24100-16_6-1\" class=\"reference\"><a href=\"#cite_note-Bezuidenhout.24100-16-6\" rel=\"external_link\">[3]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-BezuidenhoutWhat16_7-1\" class=\"reference\"><a href=\"#cite_note-BezuidenhoutWhat16-7\" rel=\"external_link\">[4]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-BezuidenhoutBeyond17_8-1\" class=\"reference\"><a href=\"#cite_note-BezuidenhoutBeyond17-8\" rel=\"external_link\">[5]<\/a><\/sup>\n<\/p><p>Nonetheless, many discussions on open data lack robust consideration of the influence of the physical research environment on data engagement activities. This paper examines this issue in more detail by addressing four interlinking questions. First, to what extent do issues relating to technology affect the pace of research in these laboratories? 
Second, could these issues of pace be ameliorated by the directed provision of more equipment\u2014particularly high-level, specialized machinery? Moreover, how can reflecting on issues to do with technology contribute towards more inclusive discussion surrounding open data? Finally, how can a better understanding of research technologies enable more contextually-sensitive discussions about data engagement?\n<\/p><p>In order to unpack these questions in detail, the paper discusses qualitative fieldwork conducted in four African laboratories between 2014 and 2015. This fieldwork was designed to investigate data engagement activities among scientists working in resource-limited environments. From these interviews, the paper highlights how issues of data engagement and issues of equipment provision were inextricably intertwined and often interdependent. If these issues are to be effectively addressed in open data discussions, the paper suggests that an expanded definition of \u201cresearch technologies\u201d is necessary. Using a model proposed by Elihu Gerson, the paper then offers key ways in which the critical issues of technological contextuality can be effectively implemented into open data discourse.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"It.27s_not_just_the_equipment\">It's not just the equipment<\/span><\/h2>\n<p>When considering laboratory equipment and research it is tempting to make the assumption that more\u2014and newer\u2014equipment leads to more productive research that is conducted at a faster speed with increased outputs (such as data). Indeed, such assumptions drive many of the equipment-focused initiatives mentioned above. Similarly, it is tempting to extend such assumptions to open data conversations. 
If more equipment will facilitate the faster production of increased amounts of data, the argument would go, then scientists will be more able (and willing) to share their data online.\n<\/p><p>While these arguments make a compelling case, examination of the current status quo indicates a need for caution. Indeed, if the causal links between equipment provision, increased research pace, and improved open outputs were that straightforward, data sharing should be markedly increased by the provision of (any) laboratory equipment. Such questions motivated a period of embedded fieldwork in Kenya and South Africa between 2014 and 2015. I wanted to examine how scientists in low-resourced research settings engaged in open data activities and discussions\u2014and whether their physical laboratory environment had any influence over this engagement.<sup id=\"rdp-ebb-cite_ref-9\" class=\"reference\"><a href=\"#cite_note-9\" rel=\"external_link\">[d]<\/a><\/sup> Over the course of the year I spent three to six weeks in four different chemistry laboratories and conducted 56 semi-structured interviews with researchers and postgraduate students to find out what was working in their research environments, and what challenged their ability to generate, curate, store, share, and re-use data online.\n<\/p><p>Upon analyzing the interviews, the issue of pace in research was unavoidable. Indeed, it was everywhere. Concerns about the slowness of research, and the pressure to speed it up, pervaded how the scientists talked about their research, valued their data, identified threats to their sovereignty and acquisition of credit, positioned themselves within the scientific community, and evaluated the international community\u2019s efforts to assist them. 
These issues have been discussed in other papers<sup id=\"rdp-ebb-cite_ref-Bezuidenhout.24100-16_6-2\" class=\"reference\"><a href=\"#cite_note-Bezuidenhout.24100-16-6\" rel=\"external_link\">[3]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-BezuidenhoutWhat16_7-2\" class=\"reference\"><a href=\"#cite_note-BezuidenhoutWhat16-7\" rel=\"external_link\">[4]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-BezuidenhoutBeyond17_8-2\" class=\"reference\"><a href=\"#cite_note-BezuidenhoutBeyond17-8\" rel=\"external_link\">[5]<\/a><\/sup> and will not be covered here. Instead, this paper takes a step back to look at why there was this overwhelming awareness of pace in these laboratories. Which aspects of the laboratory equipment played key roles in controlling the pace of research, and consequently the engagement of scientists in open data activities?\n<\/p>\n<h3><span class=\"mw-headline\" id=\"The_equipment_is_...\">The equipment is ...<\/span><\/h3>\n<p>The laboratories that I visited were not members of high-profile consortia or integrated into well-funded foreign research networks. Rather, they were good examples of home-grown science. They produced high-quality research but depended for their funding on multiple national and international sources. Moreover, their facilities\u2014and the budget to maintain or upgrade them\u2014were provided by their host institutions. This created a bind for the researchers, as the facilities provided were often minimal and\/or badly maintained, and their institutions did not have large amounts of \u201ccore funding\u201d for upgrades. As one Kenyan participant said:\n<\/p>\n<blockquote>We get no funding from the government. We get paid from the government, we get bills of power and water by the government but otherwise, other than that, the materials that we need for research we have to source from funding agencies. 
(KY1:8)<\/blockquote>\n<p>Similarly, as most of the funding for their research came from project-specific grants, the researchers had few opportunities to secure money for standard laboratory equipment or general laboratory maintenance. A participant in South Africa eloquently said, when talking about her research that:\n<\/p>\n<blockquote>[it] is a challenge because the university doesn\u2019t offer a start-up fund for equipment. \u2026 I would need to pay bit by bit and one by one. When I have funding then buy one piece of equipment and maybe after five years I would have my lab. (SA2:11)<\/blockquote>\n<p>Moreover, even when the money was there, many of the participants said that they experienced problems accessing it, or using it to address the challenges that they identified in their daily research environment. This is evident in a quote by another South African participant who said:\n<\/p>\n<blockquote>It\u2019s really bad \u2013 the bureaucracy of it. It\u2019s how the money is transferred, technical services, procurement, all those \u2026 but those are like \u201cgrand problems\u201d that you can\u2019t solve. (SA2:6)<\/blockquote>\n<p>Thus, a lot of the discussions I had about research and data engagement became discussions about equipment and research environments. The researchers I interviewed highlighted a number of key issues that affected the pace of their research in comparison (in their opinion) to well-resourced laboratories. In particular, the statements related to the \u201cun-usability\u201d of the equipment that was available for them to use. These statements are broadly grouped under the headings below.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"..._not_there_...\">... not there ...<\/span><\/h4>\n<p>One of the most common complaints I heard in all four laboratories was that the equipment available for research curtailed the types of research that could be done by the researchers. 
While this is, of course, an issue for scientists around the world, for many of the researchers that I interviewed this was almost a deal-breaking aspect of their research plans. As one Kenyan participant observed:\n<\/p>\n<blockquote>the lack of equipment limits the extent to which you can do research \u2013 and even the type of research that you want to do. And you ask yourself, ok, so I want to do this kind of research but do I have the machinery? (KY2:3)<\/blockquote>\n<p>Similarly, a participant from the other Kenyan site said: \n<\/p>\n<blockquote>[o]ur labs are not even there for synthesis \u2013 synthetic work \u2013 the environment is not there. So when it comes to that I either have to skip it or I have to go to a lab that has such facilities. (KY1:3)<\/blockquote>\n<p>These constraints not only shaped the research being conducted in these environments, but they also necessitated that a number of researchers change the direction of their research in order to fit in with the equipment available. Particularly in Kenya there were a number of lecturers and professors who had done postgraduate training in the U.K. or U.S., but they were unable to capitalize on their research experiences back home. This was described by one Kenyan professor who said:\n<\/p>\n<blockquote>the kind of research which is taking place here is a bit different from what I was doing \u2013 like in the UK I was doing synthetic organic chemistry. And the kind of equipment and the rest, it was purely on silicone chemistry and the reagents and the rest I couldn\u2019t get them here. So what I had to do was to look for things which are relevant for this institution. (KY1:1)<\/blockquote>\n<p>In addition to shaping the types\u2014and thus the broad pace\u2014of research, the lack of equipment also had an impact on the daily pace of research activities in the laboratory. 
This is evident in the exchange below, where the participant (a postgraduate student) explains day-to-day practices within the laboratory. In particular, he highlights how sharing basic equipment plays a highly influential role in how much he can work on a day-to-day basis, and thus how much data he can produce. As there were six postgraduate students sharing one evaporator, one can only imagine their frustration.\n<\/p>\n<blockquote><b>Participant<\/b>: The solvents and reagents we have all, but the equipment\u2013some equipments are missing. But we do the best we can.<br \/><b>LB<\/b>: And with so many in the lab there must be high competition to use the equipment.<br \/><b>Participant<\/b>: Yeah! For example, this evaporator, we all use it. So we have to use it at a certain time and you when you leave it the other person wants to use it and so on and so on.<br \/><b>LB<\/b>: So there is a schedule.<br \/><b>Participant<\/b>: So for us to work very well, so everyone should have at least an evaporator like this so that you can use it at any time. In that case it can become very easier, instead of sharing \u2013 it\u2019s not easy. (KY1:6)<\/blockquote>\n<p>The absence of laboratory equipment thus created two different pace-related binds for the researchers that were interviewed. Not only did it shape the types of research that could be conducted, thus affecting the long-term pace of research, but it also shaped the pace of daily research. 
In this, it was often the absence of multiple copies of generic equipment\u2014<a href=\"https:\/\/www.limswiki.org\/index.php\/Evaporator\" title=\"Evaporator\" target=\"_blank\" class=\"wiki-link\" data-key=\"c6fc7479817783e12a2b8d3c14283a92\">evaporators<\/a>, Gilson <a href=\"https:\/\/www.limswiki.org\/index.php\/Pipette\" title=\"Pipette\" target=\"_blank\" class=\"wiki-link\" data-key=\"9006c0a0d32c697fea0447269ce369e5\">pipettes<\/a>, glassware, <a href=\"https:\/\/www.limswiki.org\/index.php\/Laboratory_water_bath\" title=\"Laboratory water bath\" target=\"_blank\" class=\"wiki-link\" data-key=\"4ab3914ad70c829d612a550bcd23feff\">water baths<\/a> and so forth\u2014that played key roles in limiting the number of experiments that could be done by one individual on a daily basis.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"..._broken_...\">... broken ...<\/span><\/h4>\n<p>At one of the Kenyan universities I was given a tour around the laboratories and shown the available equipment for research. I took the following note in my field diary after some discussion with my guides:\n<\/p>\n<blockquote>This department has been donated an <a href=\"https:\/\/www.limswiki.org\/index.php\/Nuclear_magnetic_resonance_spectroscopy\" title=\"Nuclear magnetic resonance spectroscopy\" target=\"_blank\" class=\"wiki-link\" data-key=\"a05c6a4eb8775761248c099371cdb82f\">NMR machine<\/a> by a laboratory in the U.S.A. When it arrived it needed to be calibrated and set up. It would also seem that some parts needed to be replaced in order to get it working. However, there is no technical support for this make and model [it is an older version of the current one on the market] in East Africa, and the only place with spare parts and a qualified technician is in South Africa. This creates situation in which they are expected to be grateful for donations, but age of machine and lack of funds for upkeep makes it obsolete before it is delivered. 
(KY2 field diary: day 3)<\/blockquote>\n<p>Indeed, the NMR machine had never been in use, as the laboratory lacked the money to fly the technician from South Africa. Such lack of technical support and funds available for maintenance and upkeep were often key issues for the researchers interviewed. It was apparent that even the equipment that was bought using project funds was vulnerable to this situation after the end of the grant. Thus, while it may be assumed that many of the laboratories in sub-Saharan Africa possess quality research equipment, the lack of technical support\u2014together with the rapid obsolescence of models of research equipment\u2014causes this equipment to stand unused.\n<\/p><p>It must be noted that many of the participants made use of some sort of equipment sharing, either by partnering with geographically close institutions or by sending samples away. One South African participant described this, saying:\n<\/p>\n<blockquote>[i]t\u2019s only now we are starting collaboration in terms with sharing equipment because previously they didn\u2019t have any equipment so they were using ours but now ours is broken down and we are going back to them. (SA2:2)<\/blockquote>\n<p>Nonetheless, every single participant who discussed equipment sharing mentioned the time and frustration of not being able to do experiments <i>in situ<\/i>\u2014and the waste of time and resources necessary to take experiments to a different laboratory.\n<\/p><p>The inability to make full use of the equipment available was a source of considerable frustration to many of the scientists interviewed. Moreover, they perceived a lack of agency in being able to ameliorate these situations due to the constraints of project-specific funding, lack of core funding, and an absence of other pots of money that could be tapped into for repairs and maintenance. 
As one PI in South Africa said:\n<\/p>\n<blockquote>[y]ou know they call us to meetings and they say we have funding for this and that. And I think \u201cgreat stuff\u201d, but I wish they would ask me what the real issues are. I\u2019ll probably tell you 100 other things outside of the money [permitted to be spent on the grant]. (SA2:1)<\/blockquote>\n<h4><span class=\"mw-headline\" id=\"..._not_running_...\">... not running ...<\/span><\/h4>\n<p>Related to the problems experienced with broken equipment was another: not having the reagents or infrastructure to use working equipment. This was eloquently described by one of the Kenyan participants, who said:\n<\/p>\n<blockquote>[o]ur equipment is not running or idle. We have an AS that is not operating, because we have no fume hood and now no acetylene gas. Because of this it has been idle for six years. (KY1:9)<\/blockquote>\n<p>Similarly, in South Africa, one participant described the challenges of working in a geographically-isolated university, saying:\n<\/p>\n<blockquote>it has been very challenging [having the NMR machine] \u2013 it\u2019s a baby that you have to nurse all the time. Also for the liquid nitrogen that we need at first we couldn\u2019t get a source of liquid nitrogen north of [a major metropolitan area six hours away]. (SA2:6)<\/blockquote>\n<p>The difficulties of ensuring regular supplies of reagents, electricity, and internet connection often had a significant impact on the ability of the researchers to run what equipment was available to them. 
Consequently, the pace of their research slowed down almost as much as if the equipment were broken or missing.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Open_data.2C_technological_difficulties_and_the_slow_pace_of_research\">Open data, technological difficulties and the slow pace of research<\/span><\/h2>\n<p>As detailed from the fieldwork above, the scientists in the laboratories I visited often experienced challenges to their ability to work effectively. Absent, broken, or poorly maintained laboratory equipment slowed down their research and delayed the production and subsequent analysis of research data. Interestingly, these challenges played a big part in how they discussed their involvement\u2014or lack thereof\u2014in open data activities. Indeed, while most of the interviewees were supportive of data engagement activities in theory, there was not much data engagement occurring on a daily basis. These issues are elaborated on below.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"A_need_for_speed\">A need for speed<\/span><\/h3>\n<p>Many of the scientists that I interviewed believed that the slower pace of their research (in comparison to high-income countries [HICs]) left them at a disadvantage when it came to data release, particularly in terms of pre-publication data release. This is evident in the exchange below:\n<\/p>\n<blockquote><b>Participant<\/b>: But no in the fact that maybe I\u2019m here in the lab doing something and someone is out there in Europe and they do the same research as me and published before me so my work will be null and void.<br \/><b>LB<\/b>: So you\u2019re concerned that by making your research available other people might beat you to the post.<br \/><b>Participant<\/b>: Yes. Because it may be null and void but you\u2019ve been in the lab for almost a year.<br \/><b>LB<\/b>: Do you think it is influenced by the resource difference between the North and the South?<br \/><b>Participant<\/b>: We\u2019re in Africa, right. 
That is the West\u2014they definitely have more advanced stuff than us. So if I\u2019m doing this research for one year, someone in Europe of the U.S. they can do it in 3 or 4 months. So that is where now the issue. (KY1:4)<\/blockquote>\n<p>This concern about speed of data analysis has been reported by other researchers<sup id=\"rdp-ebb-cite_ref-MGENAGlobal08_10-0\" class=\"reference\"><a href=\"#cite_note-MGENAGlobal08-10\" rel=\"external_link\">[6]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-ParkerEthical09_11-0\" class=\"reference\"><a href=\"#cite_note-ParkerEthical09-11\" rel=\"external_link\">[7]<\/a><\/sup> and has already influenced a number of data release expectations by funders and consortia (such as MalariaGEN and H3Africa consortia). These initiatives focus predominantly on ensuring that scientists in LMICs get extended periods of time on completion of the project to process and analyze the data generated.<sup id=\"rdp-ebb-cite_ref-MGENAGlobal08_10-1\" class=\"reference\"><a href=\"#cite_note-MGENAGlobal08-10\" rel=\"external_link\">[6]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-ParkerEthical09_11-1\" class=\"reference\"><a href=\"#cite_note-ParkerEthical09-11\" rel=\"external_link\">[7]<\/a><\/sup>\n<\/p><p>While extended data moratoria at the end of the project are undoubtedly valuable to enable the maximum number of publications from a research project, the quote above highlights that more is needed. What became apparent from the conversations with fieldwork participants was that they were conscious of the pace of their research throughout\u2014and that being slow at producing data was as pertinent as taking longer to analyze the final product. 
This highlights a key oversight in current data discussions, where there is no sensitivity to how mid-project data releases can be safeguarded for researchers who necessarily take longer to complete their research projects due to resource limitations.\n<\/p><p>What the fieldwork identifies is the need for corresponding efforts to address the issues relating to the varying pace of data generation. More reflection and productive policies are needed that address the multitude of issues that cause this slower pace in daily research activities. Specifically, this links directly to the types, availability, maintenance, and provisioning for the equipment in the laboratory. If the entire research process occurs at a slower pace, it is unlikely (as the quotes show), that many researchers will risk sharing pre-publication data, methodologies, and other resources. This is of particular importance for scientists not involved in international research networks, and who do not have extensive support systems to draw on.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Data_quality\">Data quality<\/span><\/h3>\n<p>Another key theme that emerged from the discussions about data sharing was that many of the participants were concerned that\u2014even if they did release the data\u2014data would not be re-used by their international peers. One researcher in Kenya highlighted this, saying:\n<\/p>\n<blockquote>[t]here is a constraint. Even the conditions aren\u2019t right, so you cannot work as fast. One of the limitations is of facilities. I mean facilities that can\u2019t be considered credible for some publication. If the instruments that are there are really elementary so you have to search for instruments that aren\u2019t here and that takes some time. (KY2:15)<\/blockquote>\n<p>This was eloquently reiterated by another of his peers, who said:\n<\/p>\n<blockquote>how much can we do to develop our own data? 
What processes do we need to convince people that the data are good? (KY2:13)<\/blockquote>\n<p>Such statements show a distinct anxiety over the data that are being produced that is linked to the types of equipment being used to produce it. If, as is suggested in the quotes above, the equipment is older, and the methodologies are more basic, how will the data be viewed by international peers? Would it, as some of the participants suggested, not be viewed as of equal value if it is shared? In other words, would the data created in low-resourced settings be re-used at all if it is released online? Such observations link the pace of technological change within research communities to perceptions of data sharing, something that has not yet been examined in open data discussions.\n<\/p><p>Perceptions of data becoming obsolete based on the equipment and methods used to generate it are contentious, and the validity of such positions may be argued. Nonetheless, such perceptions have far-reaching consequences for the open data movement. In a way, they may be said to offer an example of the Thomas theorem.<sup id=\"rdp-ebb-cite_ref-12\" class=\"reference\"><a href=\"#cite_note-12\" rel=\"external_link\">[e]<\/a><\/sup> If the researchers believe that their data will be judged based on the age of their equipment and methods, they will be less inclined to go through the effort of sharing. This is particularly the case if they believe that their work will be heavily scrutinized, overlooked, or rendered obsolete upon arrival. Consequently, the pace of research is slowed down by researchers not sharing data, or delaying its release.\n<\/p><p>Together, these two issues were highly influential in mediating the interviewees\u2019 predominant lack of involvement in data engagement activities. 
In particular, these two issues had key effects on the lack of pre-publication release of data and the participation in online knowledge transfer (through posting of presentations online, contribution to discussion forums, release of methodologies, and so forth). The specter of \u201cbeing scooped\u201d due to the slower pace of research, coupled with a sense of helplessness about changing the pace at which data were generated, thus led to a situation in which the scientists recognized the value of increased openness in science but did little to engage.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Technology_transfer:_The_solution_to_pace_and_openness.3F\">Technology transfer: The solution to pace and openness?<\/span><\/h2>\n<p>The fieldwork described above clearly highlighted two key issues. First, the speed of research in the laboratories that I visited was influenced by the technologies available. This impacted research productivity, data production, research efficiency, and the optimal use of funding resources. Second, the issues of pace in research were intimately connected to how scientists valued their research, and subsequently how they conceived their responsibilities to be involved in data engagement activities.\n<\/p><p>In a way, the researchers who were interviewed constructed a \u201cnarrative of exclusion\u201d in which they (in)voluntarily opted out of open data activities. This narrative was constructed around perceptions of the pace of high-income country science and their inability to match this pace in their own research. The existence of these perceptions, and the preferred exclusion that the narrators often choose, is rarely acknowledged in open data discussions.\n<\/p><p>The obvious solution to these problems\u2014the solution to slower research in LMICs, to less data engagement, to more visibility of LMIC research\u2014would appear to be investment in the equipment present in these low-resourced laboratories. 
Providing more equipment to researchers currently working with the pressures described by the interview participants would seem the logical step out of this current conundrum \u2026 or would it?\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Problems_with_technology_transfer\">Problems with technology transfer<\/span><\/h3>\n<p>Initiatives such as Seeding Labs<sup id=\"rdp-ebb-cite_ref-13\" class=\"reference\"><a href=\"#cite_note-13\" rel=\"external_link\">[f]<\/a><\/sup> and the Sustainable Sciences Institute<sup id=\"rdp-ebb-cite_ref-14\" class=\"reference\"><a href=\"#cite_note-14\" rel=\"external_link\">[g]<\/a><\/sup> have been influential in partnering HIC donors of equipment with LMIC applicants, and have considerable testimonials to bear witness to their positive impact. Nonetheless, a recent article on Seeding Labs noted that \u201c[r]ecipients pay a fraction of the equipment\u2019s cost to offset the logistical expenses \u2014 although Seeding Labs refuses to say how much \u2014 and, as buyers, they assume responsibility for setting up and maintaining the equipment\u201d.<sup id=\"rdp-ebb-cite_ref-VenkatramanRecycled11_15-0\" class=\"reference\"><a href=\"#cite_note-VenkatramanRecycled11-15\" rel=\"external_link\">[8]<\/a><\/sup>\n<\/p><p>In light of the difficulties experienced by the Kenyan and South African laboratories in setting up their NMR machines (see fieldnote above and SA2:6), or the Kenyan laboratory\u2019s struggles with their AS machine (KY1:9), it is important not to see these initiatives as blueprint examples for generic success. Rather, the careful matching of donors and recipients, mentoring during the process of donation, and a careful analysis of what is required from the target sites are all necessary to ensure success.\n<\/p><p>Efforts to create equipment databases to facilitate inter-institutional sharing in LMICs have struggled with similar problems. 
Informal discussions with scientists regarding such initiatives have brought to light key contextual concerns, such as how the user\/provider relationship will cope with issues such as payment and sourcing of reagents, maintenance, technical support, and possible instances of damage. Such concerns, it must be noted, are not unique to LMICs and have similarly been discussed in relation to the EPSRC equipment sharing portal for U.K. universities.<sup id=\"rdp-ebb-cite_ref-16\" class=\"reference\"><a href=\"#cite_note-16\" rel=\"external_link\">[h]<\/a><\/sup>\n<\/p><p>These problems highlight two key concerns. First, current approaches to technology transfer often do not take into consideration the limitations of the context in which the technology will be used. Providing equipment without the researchers having a sustained ability to get the reagents necessary to run it is highly problematic. Second, current approaches to technology transfer often do not take into consideration the difficulties of moving technologies across different contexts. As described in my field notes from the first Kenyan society, there is no value in a piece of equipment that cannot be calibrated or maintained due to a lack of qualified technical support.\n<\/p><p>While the implications for broader, capacity-focused discussions are apparent, these observations also have important consequences for future open data discussions. What the evidence clearly suggests is that relying on LMIC researchers receiving more equipment will not necessarily influence the speed of their research, nor their willingness to share data. Thus, structuring projections on LMIC involvement in the open data movement based on a linear model of technology transfer\/research productivity is highly problematic. 
What open data discussions need is a new model of technology\/pace that takes these issues into account.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Gerson.E2.80.99s_model_of_technology_transfer\">Gerson\u2019s model of technology transfer<\/span><\/h3>\n<p>First and foremost, it is important to critique exactly what is needed from a definition of \u201ctechnology.\u201d It is often tempting to equate \u201ctechnologies\u201d to the equipment used within the laboratory, in particular the specialized (and often high-tech) machines such as NMR, polymerase chain reaction (PCR), <a href=\"https:\/\/www.limswiki.org\/index.php\/Chromatography\" title=\"Chromatography\" target=\"_blank\" class=\"wiki-link\" data-key=\"2615535d1f14c6cffdfad7285999ad9d\">chromatographic<\/a> apparatus and so forth. In contrast, as evident from the fieldwork, these pieces of equipment are not the only causes of the pace issues experienced by the researchers interviewed. Indeed, the student discussing the lack of multiple evaporators in his lab (KY1:6), or the researcher who struggled to get liquid nitrogen for the AS machine (SA2:6) experienced similar problems of pace. Moreover, dissociating discussions of equipment from those relating to their running costs and infrastructural requirements is evidently limiting.\n<\/p><p>With this in mind, it is helpful to make use of a recent definition of \u201ctechnologies\u201d proposed by Elihu Gerson in his 2015 lecture at the International Society for the History, Philosophy and Social Studies of Biology (ISHPSSB). He proposed that \"[t]echnology can include instruments, specialized materials such as cell lines, model organisms, enzymes, antibodies etc. 
It also includes specialized codified procedures, such as those used in psychology, field observations etc.\"<sup id=\"rdp-ebb-cite_ref-GearsonResit15_17-0\" class=\"reference\"><a href=\"#cite_note-GearsonResit15-17\" rel=\"external_link\">[9]<\/a><\/sup>\n<\/p><p>This expanded notion of technologies allows us to draw in the many different issues that were raised by the fieldwork participants, including the difficulties of getting reagents, of setting up laboratories and protocols, of having instruments available, and also having the expertise necessary to utilize the equipment.\n<\/p><p>Second, Gerson draws attention to the difficulties of moving technologies across research contexts. He highlights six ways in which attempts to introduce technologies into new environments may be problematic. These include:\n<\/p>\n<ul><li> Materials and equipment are recalcitrant.<\/li>\n<li> Researchers can\u2019t anticipate every contingency in a situation.<\/li>\n<li> Resituating new technologies requires coordination between source and target sites.<\/li>\n<li> Repertoires for work at the target must be developed.<\/li>\n<li> New technologies must be registered at the target site.<\/li>\n<li> New technologies address phenomena in new ways.<\/li><\/ul>\n<p>Moreover, it is apparent that a failure to address these concerns can result in incomplete resituation of technologies that significantly slows down research processes and stops effective data engagement. This is evident in many of the quotes from the fieldwork, such as the idle AS machine at one of the Kenyan sites (KY1:9), or my field notes description of the donated NMR machine. It would thus appear that effective data engagement\u2014generation, storage, curation, analysis, dissemination, and re-use\u2014is directly tied not only to issues of technology provision, but also of the (in)effective manner in which the technology is introduced into the new context. 
Thus, without careful attention to the contexts in which technologies are to be deployed, making assumptions about its efficacy is highly problematic.\n<\/p><p>Such observations are of particular importance to open data discussions that should, ideally, consider the process of data production in its entirety. Figure 1 outlines this in detail, identifying key areas of the research data cycle. While each stage is associated with technologies that could speed up research, introducing these technologies without correctly situating them within the specific context could undermine these efforts.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_BezuidenhoutDataSciJo2017_16.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"12792c6ffe4c16e432624dde64fa1a72\"><img alt=\"Fig1 BezuidenhoutDataSciJo2017 16.png\" src=\"https:\/\/www.limswiki.org\/images\/8\/82\/Fig1_BezuidenhoutDataSciJo2017_16.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1.<\/b> Technologies associated with data engagement<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Figure 1\u2014and the accompanying empirical evidence\u2014highlights the potential issues of what could accompany well-meaning technology provision and undermine effective data engagement. 
Using the adapted version of Gerson\u2019s model clearly highlights how open data discussions\u2014instead of relying on the provision of high-level laboratory equipment as a means of speeding up research and encouraging sharing\u2014need to carefully consider the research contexts, use expanded interpretations of \u201ctechnologies,\u201d and pay attention to the repertoires necessary to use available technologies. Such awareness also needs to be reflected in the design of future initiatives, so as to safeguard against the hidden binds of pace that accompany inappropriate re-situation of technologies. Without such an expanded focus, it is likely that LMIC scientists will continue to underperform in open data initiatives.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Avoiding_.22insidious_inequalities.22\">Avoiding \"insidious inequalities\"<\/span><\/h3>\n<p>Issues associated with the pace of research and incomplete technology transfer also extend beyond the sites of data creation in LMICs. While many LMICs are rapidly expanding their internet capabilities, incomplete integration of these capabilities into immediate working environments\u2014through lack of expertise, older equipment, poor maintenance and technical support, and infrastructural challenges (such as power provision)\u2014continues to challenge effective usage. As one Kenyan participant observed:\n<\/p>\n<blockquote>[i]n Kenya people say that we have internet everywhere but really how much can you download, and you have to have the equipment to be able to. You bought the data bundle but what you have is not enough for you to download any publication or anything like that. Some areas in Kenya we know that people can\u2019t even access. Although we know the networking has been done but there is an assumption that everyone can access. (KY1:2)<\/blockquote>\n<p>For LMIC scientists, it would therefore seem that the connection between \u201cpace\u201d and technology is inescapable. 
Without directed interventions that address the integration of technologies <i>in situ<\/i> it is likely that scientists will continue to operate at the slower speed that currently characterizes much of the research in these areas. The evidence from the fieldwork presented in this paper suggests that this has multiple implications for the open data movement, both in terms of practical engagement and ideological buy-in.\n<\/p><p>Co-partnership with other initiatives aimed at addressing equipment provision, the integration of responsive design principles into data platforms, and the provision of funds to ameliorate hurdles to data generation and dissemination will all assist in changing this current paradigm of insidious inequality. As said by a PI in South Africa:\n<\/p>\n<blockquote>[t]he disadvantaged are still disadvantaged and that is the fact of the matter. The government may be willing to address the gaps, but there are still gaps. (SA2:1)<\/blockquote>\n<p>But with an increased awareness of the contextuality of data engagement, it is likely that the open data movement can move beyond such accusations.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Building_capacity_in_research_..._and_in_open_data\">Building capacity in research ... and in open data<\/span><\/h2>\n<p>Combining the fieldwork above with Gerson\u2019s model of technology transfer makes a compelling case for prioritizing the issue of pace in open data discussions. Moreover, it clearly highlights the need for further, expanded discussions on technical capabilities necessary for data engagement. It thus becomes important to ask: what can be done to effect such a change?\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Recognizing_problems.2C_designing_solutions\">Recognizing problems, designing solutions<\/span><\/h3>\n<p>When discussing the issues of equipment limitations, a South African researcher made a telling remark. 
She said:\n<\/p>\n<blockquote>[b]ut where I find it difficult is people don\u2019t understand our situation\u2014it\u2019s not bad will, it\u2019s just not being able to figure it out. (SA2:12)<\/blockquote>\n<p>In this comment she was specifically talking about the difficulty of registering online for international conferences where the high-resolution of the websites made it difficult to use in her low-bandwidth area. She described a long and annoying process of attempting to get the site to work, after which she ended up giving up and emailing the organizers. Nonetheless, as she said, creating websites that were not usable in low-bandwidth areas was not an intentional slight by the organizers. Rather, they were not aware that it could be problematic for many researchers in low-resourced environments.\n<\/p><p>Unpacking this comment leads to a number of different issues. First and foremost, it draws attention to the lack of awareness by scientists, funders, and affiliated stakeholders in high-income countries about these problems. While many are aware that there are some \u201cequipment issues\u201d in low-resourced research settings, very few have a consistent and coherent impression of what these problems actually are, particularly when using an extended interpretation of technologies.\n<\/p><p>This lack of awareness is compounded by the rarity of detailed ethnographic analyses of technological challenges in low-resourced settings. 
While a number of studies on working conditions in low-resourced environments are gradually emerging<sup id=\"rdp-ebb-cite_ref-BezuidenhoutEthics15_18-0\" class=\"reference\"><a href=\"#cite_note-BezuidenhoutEthics15-18\" rel=\"external_link\">[10]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-FineInvest07_19-0\" class=\"reference\"><a href=\"#cite_note-FineInvest07-19\" rel=\"external_link\">[11]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-HarleGrowing10_20-0\" class=\"reference\"><a href=\"#cite_note-HarleGrowing10-20\" rel=\"external_link\">[12]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-Bezuidenhout.24100-16_6-3\" class=\"reference\"><a href=\"#cite_note-Bezuidenhout.24100-16-6\" rel=\"external_link\">[3]<\/a><\/sup>, the foci of these studies are diverse and a systematic collation of the evidence with regard to research technologies is urgently needed.\n<\/p><p>Such awareness will be of critical importance to future open data discussions. As evident from the fieldwork, issues of the availability of effective technologies\u2014and the resultant impact on research pace\u2014were key contributors to the lack of involvement in data engagement issues in all four institutions that I visited. Not only did the lack of effective and integrated technologies slow down the pace of data generation, but the lack of up-to-date information and communications technologies (and the corresponding infrastructures and repertoires) also caused difficulties in all aspects of data engagement.\n<\/p><p>Without support for these issues\u2014and policies that directly address these issues\u2014it is difficult to see how LMIC researchers can be effectively drawn into open data discussions. Indeed, current policies do little to assuage the fears that they have relating to the pace, leading (at least in part) to the lack of daily data engagement evident from the scientists I interviewed. 
As a result, the lack of contextually-sensitive policies not only led to lower levels of data contribution and re-use from scientists in these regions, but also influenced the manner in which the ideals and responsibilities of open data were discussed in these settings.\n<\/p><p>To counter this, open data discussions need to expand discussions on responsible innovation to include the expanded version of research technologies suggested by Gerson. Key lessons from the \u201cfrugal innovation\u201d movement<sup id=\"rdp-ebb-cite_ref-RadjouFrugal15_21-0\" class=\"reference\"><a href=\"#cite_note-RadjouFrugal15-21\" rel=\"external_link\">[13]<\/a><\/sup> could also be effectively incorporated to promote cost-effective and efficient design of research technologies that will more easily adapt to low-resourced settings. Similarly, best practice by consortia such as the Global Health Network<sup id=\"rdp-ebb-cite_ref-22\" class=\"reference\"><a href=\"#cite_note-22\" rel=\"external_link\">[i]<\/a><\/sup>\u2014where the capabilities of the \u201clowest common denominator\u201d become the guiding set of criteria for which software and webpages are designed\u2014offers important lessons that could be extended further into discussions on open data. What issues, it is necessary to ask, are really slowing down research and altering data engagement, and how can current policies and initiatives be designed to address these issues more productively?\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Creating_.E2.80.9Csafe_spaces.E2.80.9D_to_discuss_these_issues\">Creating \u201csafe spaces\u201d to discuss these issues<\/span><\/h3>\n<p>Most of the fieldwork participants I interviewed were very forthcoming about the challenges of their research environments. This forthrightness was consistent with the numerous informal discussions I\u2019ve had with LMIC scientists at conferences, socially and in related projects. 
Nonetheless, the same issues that they were so willing to discuss\u2014and had such robust opinions on\u2014were rarely raised to the university governance, funders, collaborators, and stakeholders that might be able to make some difference in ameliorating them.\n<\/p><p>This led to many discussions about why the fieldwork participants were both so willing and unwilling to discuss their contextual challenges. One Kenyan scientist put it very succinctly, saying:\n<\/p>\n<blockquote>I was worried about applying for international funding because the facilities are poor and we have to deliver. (KY2:2)<\/blockquote>\n<p>The fieldwork participants described a tension inherent in drawing attention to the limitations of their environment, particularly that it may have some negative impact on their ability to secure funding, disseminate results, or form collaborations.\n<\/p><p>Challenges with the effective situation of technologies within research contexts, and the accompanying binds of pace that they create, thus rarely get raised to those in positions to effect change. It is thus apparent that scientists in LMICs need to feel comfortable and confident to raise these issues. How this can be done, of course, is open for discussion and by no means apparent. Nonetheless, creating a \u201csafe space\u201d in which they can do so is urgent. It may be possible that by operationalizing the micro-finance scheme described by Brian Rappert in this issue such a space may be created. A failure to encourage LMIC scientists to effectively discuss these binds of pace and technology has significant impact on the research in these areas. 
Moreover, it also has downstream effects on the uptake of open data ideals and the contribution of these scientists to the online data milieu.<sup id=\"rdp-ebb-cite_ref-Bezuidenhout.24100-16_6-4\" class=\"reference\"><a href=\"#cite_note-Bezuidenhout.24100-16-6\" rel=\"external_link\">[3]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-BezuidenhoutBeyond17_8-3\" class=\"reference\"><a href=\"#cite_note-BezuidenhoutBeyond17-8\" rel=\"external_link\">[5]<\/a><\/sup>\n<\/p><p>In order to stimulate LMIC scientists\u2019 involvement with open data initiatives it is thus vital that they are able to raise the technical challenges in their work without fear of reprisal. It must be noted that such fears may be largely unfounded, and funders, networks, and collaborators may be happy to engage with LMIC scientists on ways to ameliorate these issues. However, the current stalemate between lack of awareness on the former\u2019s part and lack of forthrightness on the latter\u2019s can only be broken if stakeholders are explicit about their intention to help, and their willingness to engage in productive discussion without any suggestion of reprisal.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"More_holistic_approaches_to_capacity_building\">More holistic approaches to capacity building<\/span><\/h3>\n<p>In a similar vein to Rappert\u2019s article in this issue, the fieldwork evidence makes a very strong case for the need for alternative approaches to capacity building and research funding. 
Combining this observation with Gerson\u2019s extended view on technology, it is evident that what is needed are funding avenues that allow researchers to address the difficulties of technology transfer and the establishment of robust technology landscapes.\n<\/p><p>While project-specific funding is, of course, of vital importance to LMIC research, it is important to recognize that without corresponding support on research structure, infrastructure, technology re-situation, and establishment of social practices, research in these settings will continue at a slower pace than that of their HIC counterparts. This has significant implications for capacity building initiatives, and for the advancement of research agendas in these regions.\n<\/p><p>Similarly, this has significant implications for advancing open data ideals in these regions. This was recognized by a number of researchers, who drew strong connections between the slow pace of research and the idea of open data. As one researcher in South Africa observed:\n<\/p>\n<blockquote>[y]es we end up spending much more time than it would be in a western country. But even the power failures, for example, they are not part of data sharing but are part of the vicious circle. (SA2:12)<\/blockquote>\n<p>Similarly, the incomplete integration of technologies into the <i>in situ<\/i> research context\u2014particularly ICTs\u2014was highly problematic for many researchers who would otherwise have been interested in participating as both data contributors and users.\n<\/p><p>It would therefore be of particular importance that open data initiatives start problematizing the range of research technologies necessary for effective data engagement. It is possible that considerable traction for promoting openness in research might be gained from partnering on equipment sharing initiatives, or on funding avenues that facilitate sample or staff exchanges between resource-limited settings. 
By tapping into initiatives that already address technological insufficiencies and their effect on the pace of research, it is possible that a new area of discussion on openness can be leveraged.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusion\">Conclusion<\/span><\/h2>\n<p>All in all, this paper makes an argument for the importance of considering the technologies that underpin data engagement activities, and the ways in which they insidiously control the speed of research. By controlling the pace of research, they in turn control buy-in from science communities in ways that are usually overlooked. Without directed attempts to address the insidious inequalities in data engagement that accompany the binds of technology\/pace, it is likely that LMIC scientists will continue to struggle to make full use of the open data movement, and that in turn will undermine the egalitarian ideals of the open data movement. In particular, a multifaceted understanding of the binds of \u201cpace\u201d is needed to ensure the visibility of scholarly products from LMICs. 
Without careful and sensitive attention to these issues, it is likely that LMIC scholars will continue to exclude themselves from opportunities to share data, thus missing out on improved visibility online.<sup id=\"rdp-ebb-cite_ref-NeylonIllust14_23-0\" class=\"reference\"><a href=\"#cite_note-NeylonIllust14-23\" rel=\"external_link\">[14]<\/a><\/sup> This impacts not only on their credibility<sup id=\"rdp-ebb-cite_ref-PiwowarThePower13_24-0\" class=\"reference\"><a href=\"#cite_note-PiwowarThePower13-24\" rel=\"external_link\">[15]<\/a><\/sup> as scientists, but also precludes the effective re-use of their research efforts.<sup id=\"rdp-ebb-cite_ref-OECDOECD07_25-0\" class=\"reference\"><a href=\"#cite_note-OECDOECD07-25\" rel=\"external_link\">[16]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-PiwowarData13_26-0\" class=\"reference\"><a href=\"#cite_note-PiwowarData13-26\" rel=\"external_link\">[17]<\/a><\/sup>\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Additional_file\">Additional file<\/span><\/h2>\n<p>Appendix: Methodologies used in study - DOI: <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/doi.org\/10.5334\/dsj-2017-026.s1\" target=\"_blank\">https:\/\/doi.org\/10.5334\/dsj-2017-026.s1<\/a>\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Footnotes\">Footnotes<\/span><\/h2>\n<div class=\"reflist\" style=\"list-style-type: lower-alpha;\">\n<ol class=\"references\">\n<li id=\"cite_note-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-3\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">Such as the EPSRC\u2019s database <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/equipment.data.ac.uk\/\" target=\"_blank\">https:\/\/equipment.data.ac.uk\/<\/a> (discussed later)<\/span>\n<\/li>\n<li id=\"cite_note-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-4\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">Such as Seeding Labs (discussed later)<\/span>\n<\/li>\n<li 
id=\"cite_note-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-5\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">For example, see <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.esrc.ac.uk\/funding\/guidance-for-applicants\/changes-to-equipment-funding\/\" target=\"_blank\">http:\/\/www.esrc.ac.uk\/funding\/guidance-for-applicants\/changes-to-equipment-funding\/<\/a><\/span>\n<\/li>\n<li id=\"cite_note-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-9\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">A full description of the methodology is given in the appendix.<\/span>\n<\/li>\n<li id=\"cite_note-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-12\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\">The Thomas theorem was formulated in 1928 by W. I. Thomas and D. S. Thomas and states that \u201cif men define situations as real, they are real in their consequences.\u201d See <i>The child in America: Behavior problems and programs<\/i>. W.I. Thomas and D.S. Thomas. 
New York: Knopf, 1928: 571\u2013572.<\/span>\n<\/li>\n<li id=\"cite_note-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-13\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><a rel=\"external_link\" class=\"external free\" href=\"http:\/\/seedinglabs.org\/\" target=\"_blank\">http:\/\/seedinglabs.org\/<\/a><\/span>\n<\/li>\n<li id=\"cite_note-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-14\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><a rel=\"external_link\" class=\"external free\" href=\"http:\/\/sustainablesciences.org\/\" target=\"_blank\">http:\/\/sustainablesciences.org\/<\/a><\/span>\n<\/li>\n<li id=\"cite_note-16\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-16\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/equipment.data.ac.uk\/\" target=\"_blank\">equipment.data.ac.uk<\/a> and personal communication with U.K. institutional research officers delegated to collate institutional equipment lists<\/span>\n<\/li>\n<li id=\"cite_note-22\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-22\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><a rel=\"external_link\" class=\"external free\" href=\"https:\/\/tghn.org\" target=\"_blank\">https:\/\/tghn.org<\/a> and personal communication with The Global Health Network website developers<\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<p>Thanks to Brian Rappert, Sabina Leonelli, and Ann Kelly for their collaboration on the project <i>Beyond the Digital Divide: Sharing Data Across Developing and Developed Countries<\/i> Leverhulme Trust, (RPG-2013-153). 
Thanks to the two reviewers whose comments strengthened the initial manuscript.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Competing_interests\">Competing interests<\/span><\/h3>\n<p>The author has no competing interests to declare.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-SchwegmannOpen113-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SchwegmannOpen113_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Schwegmann, C.&#32;(February 2013).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.europeandataportal.eu\/sites\/default\/files\/2013_open_data_in_developing_countries.pdf\" target=\"_blank\">\"Open Data in Developing Countries\"<\/a>&#32;(PDF).&#32;EPSI Platform<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.europeandataportal.eu\/sites\/default\/files\/2013_open_data_in_developing_countries.pdf\" target=\"_blank\">https:\/\/www.europeandataportal.eu\/sites\/default\/files\/2013_open_data_in_developing_countries.pdf<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 02 May 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Open+Data+in+Developing+Countries&amp;rft.atitle=&amp;rft.aulast=Schwegmann%2C+C.&amp;rft.au=Schwegmann%2C+C.&amp;rft.date=February+2013&amp;rft.pub=EPSI+Platform&amp;rft_id=https%3A%2F%2Fwww.europeandataportal.eu%2Fsites%2Fdefault%2Ffiles%2F2013_open_data_in_developing_countries.pdf&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: 
none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BullEnsuring16-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BullEnsuring16_2-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Bull, S.&#32;(October 2016).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/figshare.com\/articles\/Review_Ensuring_global_equity_in_open_research\/4055181\" target=\"_blank\">\"Ensuring Global Equity in Open Research\"<\/a>.&#32;Wellcome Trust.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.6084%2Fm9.figshare.4055181\" target=\"_blank\">10.6084\/m9.figshare.4055181<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/figshare.com\/articles\/Review_Ensuring_global_equity_in_open_research\/4055181\" target=\"_blank\">https:\/\/figshare.com\/articles\/Review_Ensuring_global_equity_in_open_research\/4055181<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 02 May 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Ensuring+Global+Equity+in+Open+Research&amp;rft.atitle=&amp;rft.aulast=Bull%2C+S.&amp;rft.au=Bull%2C+S.&amp;rft.date=October+2016&amp;rft.pub=Wellcome+Trust&amp;rft_id=info:doi\/10.6084%2Fm9.figshare.4055181&amp;rft_id=https%3A%2F%2Ffigshare.com%2Farticles%2FReview_Ensuring_global_equity_in_open_research%2F4055181&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Bezuidenhout.24100-16-6\"><span class=\"mw-cite-backlink\">\u2191 <sup><a 
href=\"#cite_ref-Bezuidenhout.24100-16_6-0\" rel=\"external_link\">3.0<\/a><\/sup> <sup><a href=\"#cite_ref-Bezuidenhout.24100-16_6-1\" rel=\"external_link\">3.1<\/a><\/sup> <sup><a href=\"#cite_ref-Bezuidenhout.24100-16_6-2\" rel=\"external_link\">3.2<\/a><\/sup> <sup><a href=\"#cite_ref-Bezuidenhout.24100-16_6-3\" rel=\"external_link\">3.3<\/a><\/sup> <sup><a href=\"#cite_ref-Bezuidenhout.24100-16_6-4\" rel=\"external_link\">3.4<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bezuidenhout, L.; Kelly, A.H.; Leonelli, S.; Rappert, B.&#32;(2016).&#32;\"\u2018$100 Is Not Much To You\u2019: Open Science and neglected accessibilities for scientific research in Africa\".&#32;<i>Critical Public Health<\/i>&#32;<b>27<\/b>&#32;(1): 39\u201349.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1080%2F09581596.2016.1252832\" target=\"_blank\">10.1080\/09581596.2016.1252832<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=%E2%80%98%24100+Is+Not+Much+To+You%E2%80%99%3A+Open+Science+and+neglected+accessibilities+for+scientific+research+in+Africa&amp;rft.jtitle=Critical+Public+Health&amp;rft.aulast=Bezuidenhout%2C+L.%3B+Kelly%2C+A.H.%3B+Leonelli%2C+S.%3B+Rappert%2C+B.&amp;rft.au=Bezuidenhout%2C+L.%3B+Kelly%2C+A.H.%3B+Leonelli%2C+S.%3B+Rappert%2C+B.&amp;rft.date=2016&amp;rft.volume=27&amp;rft.issue=1&amp;rft.pages=39%E2%80%9349&amp;rft_id=info:doi\/10.1080%2F09581596.2016.1252832&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BezuidenhoutWhat16-7\"><span class=\"mw-cite-backlink\">\u2191 <sup><a 
href=\"#cite_ref-BezuidenhoutWhat16_7-0\" rel=\"external_link\">4.0<\/a><\/sup> <sup><a href=\"#cite_ref-BezuidenhoutWhat16_7-1\" rel=\"external_link\">4.1<\/a><\/sup> <sup><a href=\"#cite_ref-BezuidenhoutWhat16_7-2\" rel=\"external_link\">4.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bezuidenhout, L.; Rappert, B.&#32;(2016).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.codesria.org\/spip.php?article2564&lang=en\" target=\"_blank\">\"What hinders data sharing in African science?\"<\/a>.&#32;<i>Fourth CODESRIA Conference on Electronic Publishing<\/i>: 1\u201313<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.codesria.org\/spip.php?article2564&lang=en\" target=\"_blank\">http:\/\/www.codesria.org\/spip.php?article2564&amp;lang=en<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=What+hinders+data+sharing+in+African+science%3F&amp;rft.jtitle=Fourth+CODESRIA+Conference+on+Electronic+Publishing&amp;rft.aulast=Bezuidenhout%2C+L.%3B+Rappert%2C+B.&amp;rft.au=Bezuidenhout%2C+L.%3B+Rappert%2C+B.&amp;rft.date=2016&amp;rft.pages=1%E2%80%9313&amp;rft_id=http%3A%2F%2Fwww.codesria.org%2Fspip.php%3Farticle2564%26lang%3Den&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BezuidenhoutBeyond17-8\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-BezuidenhoutBeyond17_8-0\" rel=\"external_link\">5.0<\/a><\/sup> <sup><a href=\"#cite_ref-BezuidenhoutBeyond17_8-1\" rel=\"external_link\">5.1<\/a><\/sup> <sup><a href=\"#cite_ref-BezuidenhoutBeyond17_8-2\" rel=\"external_link\">5.2<\/a><\/sup> <sup><a href=\"#cite_ref-BezuidenhoutBeyond17_8-3\" 
rel=\"external_link\">5.3<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bezuidenhout, L.; Leonelli, S.; Kelly, A.H.; Rappert, B.&#32;(2017).&#32;\"Beyond the digital divide: Towards a situated approach to open data\".&#32;<i>Science and Public Policy<\/i>&#32;<b>44<\/b>&#32;(4): 464\u201375.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fscipol%2Fscw036\" target=\"_blank\">10.1093\/scipol\/scw036<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Beyond+the+digital+divide%3A+Towards+a+situated+approach+to+open+data&amp;rft.jtitle=Science+and+Public+Policy&amp;rft.aulast=Bezuidenhout%2C+L.%3B+Leonelli%2C+S.%3B+Kelly%2C+A.H.%3B+Rappert%2C+B.&amp;rft.au=Bezuidenhout%2C+L.%3B+Leonelli%2C+S.%3B+Kelly%2C+A.H.%3B+Rappert%2C+B.&amp;rft.date=2017&amp;rft.volume=44&amp;rft.issue=4&amp;rft.pages=464%E2%80%9375&amp;rft_id=info:doi\/10.1093%2Fscipol%2Fscw036&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MGENAGlobal08-10\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-MGENAGlobal08_10-0\" rel=\"external_link\">6.0<\/a><\/sup> <sup><a href=\"#cite_ref-MGENAGlobal08_10-1\" rel=\"external_link\">6.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Malaria Genomic Epidemiology Network&#32;(2008).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3758999\" target=\"_blank\">\"A global network for investigating the genomic epidemiology of 
malaria\"<\/a>.&#32;<i>Nature<\/i>&#32;<b>456<\/b>&#32;(7223): 732\u20137.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnature07632\" target=\"_blank\">10.1038\/nature07632<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3758999\/\" target=\"_blank\">PMC3758999<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/19079050\" target=\"_blank\">19079050<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3758999\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3758999<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+global+network+for+investigating+the+genomic+epidemiology+of+malaria&amp;rft.jtitle=Nature&amp;rft.aulast=Malaria+Genomic+Epidemiology+Network&amp;rft.au=Malaria+Genomic+Epidemiology+Network&amp;rft.date=2008&amp;rft.volume=456&amp;rft.issue=7223&amp;rft.pages=732%E2%80%937&amp;rft_id=info:doi\/10.1038%2Fnature07632&amp;rft_id=info:pmc\/PMC3758999&amp;rft_id=info:pmid\/19079050&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3758999&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ParkerEthical09-11\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-ParkerEthical09_11-0\" rel=\"external_link\">7.0<\/a><\/sup> <sup><a href=\"#cite_ref-ParkerEthical09_11-1\" rel=\"external_link\">7.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Parker, M.; Bull, S.J.; de Vries, J. 
et al.&#32;(2009).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2771895\" target=\"_blank\">\"Ethical data release in genome-wide association studies in developing countries\"<\/a>.&#32;<i>PLoS Medicine<\/i>&#32;<b>6<\/b>&#32;(11): e1000143.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pmed.1000143\" target=\"_blank\">10.1371\/journal.pmed.1000143<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2771895\/\" target=\"_blank\">PMC2771895<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/19956792\" target=\"_blank\">19956792<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2771895\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC2771895<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Ethical+data+release+in+genome-wide+association+studies+in+developing+countries&amp;rft.jtitle=PLoS+Medicine&amp;rft.aulast=Parker%2C+M.%3B+Bull%2C+S.J.%3B+de+Vries%2C+J.+et+al.&amp;rft.au=Parker%2C+M.%3B+Bull%2C+S.J.%3B+de+Vries%2C+J.+et+al.&amp;rft.date=2009&amp;rft.volume=6&amp;rft.issue=11&amp;rft.pages=e1000143&amp;rft_id=info:doi\/10.1371%2Fjournal.pmed.1000143&amp;rft_id=info:pmc\/PMC2771895&amp;rft_id=info:pmid\/19956792&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC2771895&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-VenkatramanRecycled11-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-VenkatramanRecycled11_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Venkatraman, V.&#32;(13 September 2011).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.scidev.net\/global\/capacity-building\/feature\/recycled-kit-equips-african-labs-.html\" target=\"_blank\">\"Recycled kit equips African labs\"<\/a>.&#32;<i>SciDev.Net<\/i><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.scidev.net\/global\/capacity-building\/feature\/recycled-kit-equips-african-labs-.html\" target=\"_blank\">https:\/\/www.scidev.net\/global\/capacity-building\/feature\/recycled-kit-equips-african-labs-.html<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 03 May 2017<\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Recycled+kit+equips+African+labs&amp;rft.atitle=SciDev.Net&amp;rft.aulast=Venkatraman%2C+V.&amp;rft.au=Venkatraman%2C+V.&amp;rft.date=13+September+2011&amp;rft_id=https%3A%2F%2Fwww.scidev.net%2Fglobal%2Fcapacity-building%2Ffeature%2Frecycled-kit-equips-african-labs-.html&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GearsonResit15-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GearsonResit15_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Gerson, E.M.&#32;(July 2015).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.academia.edu\/20947756\/Resituating_new_data_collection_technology\" target=\"_blank\">\"Resituating new data collection technology\"<\/a>.&#32;Tremont Research Institute<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.academia.edu\/20947756\/Resituating_new_data_collection_technology\" target=\"_blank\">https:\/\/www.academia.edu\/20947756\/Resituating_new_data_collection_technology<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 02 May 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Resituating+new+data+collection+technology&amp;rft.atitle=&amp;rft.aulast=Gerson%2C+E.M.&amp;rft.au=Gerson%2C+E.M.&amp;rft.date=July+2015&amp;rft.pub=Tremont+Research+Institute&amp;rft_id=https%3A%2F%2Fwww.academia.edu%2F20947756%2FResituating_new_data_collection_technology&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: 
none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BezuidenhoutEthics15-18\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BezuidenhoutEthics15_18-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bezuidenhout, L.&#32;(2015).&#32;\"Ethics in the minutiae: examining the role of the physical laboratory environment in ethical discourse\".&#32;<i>Science and Engineering Ethics<\/i>&#32;<b>21<\/b>&#32;(1): 51-73.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs11948-013-9506-8\" target=\"_blank\">10.1007\/s11948-013-9506-8<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24510311\" target=\"_blank\">24510311<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Ethics+in+the+minutiae%3A+examining+the+role+of+the+physical+laboratory+environment+in+ethical+discourse&amp;rft.jtitle=Science+and+Engineering+Ethics&amp;rft.aulast=Bezuidenhout%2C+L.&amp;rft.au=Bezuidenhout%2C+L.&amp;rft.date=2015&amp;rft.volume=21&amp;rft.issue=1&amp;rft.pages=51-73&amp;rft_id=info:doi\/10.1007%2Fs11948-013-9506-8&amp;rft_id=info:pmid\/24510311&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-FineInvest07-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-FineInvest07_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation 
Journal\">Fine, J.C.&#32;(2007).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/siteresources.worldbank.org\/INTSTIGLOFOR\/Resources\/Investing_in_STI_Paper_Feb06.pdf\" target=\"_blank\">\"Investing in STI in Sub-Saharan Africa: Lessons from Collaborative Initiatives in Research and Higher Education\"<\/a>&#32;(PDF).&#32;<i>Global Forum: Building Science, Technology and Innovation Capacity For Sustainable Growth and Poverty Reduction<\/i><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/siteresources.worldbank.org\/INTSTIGLOFOR\/Resources\/Investing_in_STI_Paper_Feb06.pdf\" target=\"_blank\">http:\/\/siteresources.worldbank.org\/INTSTIGLOFOR\/Resources\/Investing_in_STI_Paper_Feb06.pdf<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 02 May 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Investing+in+STI+in+Sub-Saharan+Africa%3A+Lessons+from+Collaborative+Initiatives+in+Research+and+Higher+Education&amp;rft.jtitle=Global+Forum%3A+Building+Science%2C+Technology+and+Innovation+Capacity+For+Sustainable+Growth+and+Poverty+Reduction&amp;rft.aulast=Fine%2C+J.C.&amp;rft.au=Fine%2C+J.C.&amp;rft.date=2007&amp;rft_id=http%3A%2F%2Fsiteresources.worldbank.org%2FINTSTIGLOFOR%2FResources%2FInvesting_in_STI_Paper_Feb06.pdf&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HarleGrowing10-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HarleGrowing10_20-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Harle, J.&#32;(November 2010).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.acu.ac.uk\/focus-areas\/arcadia-growing-knowledge\" 
target=\"_blank\">\"Growing Knowledge: Access to Research in East and Southern African Universities\"<\/a>.&#32;The Association of Commonwealth Universities<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.acu.ac.uk\/focus-areas\/arcadia-growing-knowledge\" target=\"_blank\">https:\/\/www.acu.ac.uk\/focus-areas\/arcadia-growing-knowledge<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 02 May 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Growing+Knowledge%3A+Access+to+Research+in+East+and+Southern+African+Universities&amp;rft.atitle=&amp;rft.aulast=Harle%2C+J.&amp;rft.au=Harle%2C+J.&amp;rft.date=November+2010&amp;rft.pub=The+Association+of+Commonwealth+Universities&amp;rft_id=https%3A%2F%2Fwww.acu.ac.uk%2Ffocus-areas%2Farcadia-growing-knowledge&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-RadjouFrugal15-21\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-RadjouFrugal15_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Radjou, N.; Prabhu, J.; Polman, P.&#32;(2015).&#32;<i>Frugal Innovation: How to do more with less<\/i>.&#32;The Economist.&#32;pp.&#160;272.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9781610395052.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Frugal+Innovation%3A+How+to+do+more+with+less&amp;rft.aulast=Radjou%2C+N.%3B+Prabhu%2C+J.%3B+Polman%2C+P.&amp;rft.au=Radjou%2C+N.%3B+Prabhu%2C+J.%3B+Polman%2C+P.&amp;rft.date=2015&amp;rft.pages=pp.%26nbsp%3B272&amp;rft.pub=The+Economist&amp;rft.isbn=9781610395052&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NeylonIllust14-23\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NeylonIllust14_23-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Neylon, C.; Willmers, M.; Thomas, K.&#32;(February 2014).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/hdl.handle.net\/11427\/2316\" target=\"_blank\">\"Illustrating Impact: Applying Altmetrics to Southern African Research\"<\/a>.&#32;University of Cape Town, Scholarly Communication in Africa Programme<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/hdl.handle.net\/11427\/2316\" target=\"_blank\">http:\/\/hdl.handle.net\/11427\/2316<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 02 May 2017<\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Illustrating+Impact%3A+Applying+Altmetrics+to+Southern+African+Research&amp;rft.atitle=&amp;rft.aulast=Neylon%2C+C.%3B+Willmers%2C+M.%3B+Thomas%2C+K.&amp;rft.au=Neylon%2C+C.%3B+Willmers%2C+M.%3B+Thomas%2C+K.&amp;rft.date=February+2014&amp;rft.pub=University+of+Cape+Town%2C+Scholarly+Communication+in+Africa+Programme&amp;rft_id=http%3A%2F%2Fhdl.handle.net%2F11427%2F2316&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PiwowarThePower13-24\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PiwowarThePower13_24-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Piwowar, H.; Priem, J.&#32;(2013).&#32;\"The power of altmetrics on a CV\".&#32;<i>Bulletin of the American Society for Information Science and Technology<\/i>&#32;<b>39<\/b>&#32;(4): 10\u201313.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1002%2Fbult.2013.1720390405\" target=\"_blank\">10.1002\/bult.2013.1720390405<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+power+of+altmetrics+on+a+CV&amp;rft.jtitle=Bulletin+of+the+American+Society+for+Information+Science+and+Technology&amp;rft.aulast=Piwowar%2C+H.%3B+Priem%2C+J.&amp;rft.au=Piwowar%2C+H.%3B+Priem%2C+J.&amp;rft.date=2013&amp;rft.volume=39&amp;rft.issue=4&amp;rft.pages=10%E2%80%9313&amp;rft_id=info:doi\/10.1002%2Fbult.2013.1720390405&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-OECDOECD07-25\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-OECDOECD07_25-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">OECD&#32;(2007).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.oecd.org\/sti\/sci-tech\/38500813.pdf\" target=\"_blank\">\"OECD Principles and Guidelines for Access to Research Data from Public Funding\"<\/a>.&#32;OECD Publishing<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.oecd.org\/sti\/sci-tech\/38500813.pdf\" target=\"_blank\">https:\/\/www.oecd.org\/sti\/sci-tech\/38500813.pdf<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 02 May 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=OECD+Principles+and+Guidelines+for+Access+to+Research+Data+from+Public+Funding&amp;rft.atitle=&amp;rft.aulast=OECD&amp;rft.au=OECD&amp;rft.date=2007&amp;rft.pub=OECD+Publishing&amp;rft_id=https%3A%2F%2Fwww.oecd.org%2Fsti%2Fsci-tech%2F38500813.pdf&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-PiwowarData13-26\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PiwowarData13_26-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Piwowar, H.A.; Vision, T.J.&#32;(2013).&#32;\"Data reuse and the open data citation advantage\".&#32;<i>PeerJ<\/i>&#32;<b>1<\/b>: e175.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.7717%2Fpeerj.175\" target=\"_blank\">10.7717\/peerj.175<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Data+reuse+and+the+open+data+citation+advantage&amp;rft.jtitle=PeerJ&amp;rft.aulast=Piwowar%2C+H.A.%3B+Vision%2C+T.J.&amp;rft.au=Piwowar%2C+H.A.%3B+Vision%2C+T.J.&amp;rft.date=2013&amp;rft.volume=1&amp;rft.pages=e175&amp;rft_id=info:doi\/10.7717%2Fpeerj.175&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. The original article lists references alphabetically, but this version\u2014by design\u2014lists them in order of appearance. Footnotes have been changed from numbers to letters as citations are currently using numbers. \"Bezuidenhout et al forthcoming\" (from the original) has since been published, and this version includes the updated citation. 
One footnote was turned into a more appropriate citation.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data\">https:\/\/www.limswiki.org\/index.php\/Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","8468ac745333952ccc234d2243224725_images":["https:\/\/www.limswiki.org\/images\/8\/82\/Fig1_BezuidenhoutDataSciJo2017_16.png"],"8468ac745333952ccc234d2243224725_timestamp":1544815906,"2c1bea416fe89e4530ea8d302ad49dbc_type":"article","2c1bea416fe89e4530ea8d302ad49dbc_title":"How big data, comparative effectiveness research, and rapid-learning health care systems can transform patient care in radiation oncology (Sanders and Showalter 2018)","2c1bea416fe89e4530ea8d302ad49dbc_url":"https:\/\/www.limswiki.org\/index.php\/Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology","2c1bea416fe89e4530ea8d302ad49dbc_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:How big data, comparative effectiveness research, and rapid-learning health care systems can transform patient care in radiation oncology\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nHow big data, comparative effectiveness research, and rapid-learning health care\r\nsystems can transform patient care in radiation oncologyJournal\n \nFrontiers in OncologyAuthor(s)\n \nSanders, Jason C.; Showalter, Timothy N.Author affiliation(s)\n \nUniversity of Virginia School of MedicinePrimary contact\n \nEmail: tns3b@virginia.eduEditors\n \nDeng, JunYear published\n \n2018Volume and issue\n \n8Page(s)\n \n155DOI\n \n10.3389\/fonc.2018.00155ISSN\n \n2234-943XDistribution license\n \nCreative Commons Attribution 4.0 InternationalWebsite\n \nhttps:\/\/www.frontiersin.org\/articles\/10.3389\/fonc.2018.00155\/fullDownload\n \nhttps:\/\/www.frontiersin.org\/articles\/10.3389\/fonc.2018.00155\/pdf (PDF)\n\nContents\n\n1 Introduction \n2 Comparative effectiveness research (CER) and big data \n3 Rapid-learning health care system (RLHCS) and 
personalized medicine \n4 Integrating an RLHCS with oncology \n5 Implications for radiation oncology \n\n5.1 Patient reported outcomes (PROs) \n5.2 Dose selection and radiosensitivity \n5.3 Personalized treatment recommendations \n\n\n6 Conclusion \n7 Acknowledgements \n\n7.1 Author contributions \n7.2 Conflict of interest statement \n\n\n8 References \n9 Notes \n\n\n\nIntroduction \nBig data and comparative effectiveness research methodologies can be applied within the framework of a rapid-learning health care system (RLHCS) to accelerate discovery and to help turn the dream of fully personalized medicine into a reality. We synthesize recent advances in genomics with trends in big data to provide a forward-looking perspective on the potential of new advances to usher in an era of personalized radiation therapy, with emphases on the power of RLHCS to accelerate discovery and the future of individualized radiation treatment planning.\nKeywords: big data, radiation oncology, comparative effectiveness research, rapid-learning health care system, personalized radiation therapy\n\nComparative effectiveness research (CER) and big data \nThe Committee on CER Prioritization was created by the Institute of Medicine in 2009. The committee defined CER as \u201ca strategy that focuses on the practical comparison of two or more health intervention to discern what works best for which patients and populations.\u201d[1] In essence, the goal of CER is to identify \u201cwhich treatment will work best, in which patient, under what circumstances.\u201d[2] Big data refers to data sets that are so large that they cannot be analyzed directly by individuals or traditional processing software. 
Big data analytics (BDA) is a growing field whose many methods are being used in sectors ranging from business to medicine.[3] The advent of the electronic medical record (EMR) has resulted in the digitization of massive data sets of medical information, including clinic encounters, laboratory values, imaging data sets and reports, pathology reports, patient outcomes, and family history, as well as genomic and biological data.\nTo help with the analysis of big data, the National Institutes of Health (NIH) has created the Big Data to Knowledge (BD2K) program, which has invested over $200 million in grant awards to foster the development of methods and tools to analyze big data in biomedical research.[4] Additionally, the BD2K program aims to ensure that biomedical big data is \u201cfindable, accessible, interoperable, and reusable\u201d (FAIR).[4] Over the past decade, CER methodologies have become increasingly prevalent in radiation oncology research, and there is much enthusiasm surrounding BDA.\n\nRapid-learning health care system (RLHCS) and personalized medicine \nThe number of articles on big data in health care has increased exponentially from under 500 articles in 2005 to over 2500 articles in 2015.[5] As the amount of biomedical big data and our ability to analyze these data continue to grow, so will the implications and uses of the information we are able to extract. One of the most important steps toward advancing our ability to analyze these big data for biomedical discovery is the creation of RLHCS, which will allow for the sharing of patient data between EMRs, ideally in real time.[6] An ideal RLHCS would take patient data that was routinely generated as part of standard patient care and compile that data into a large data system.[6][7][8] This aggregate data would then be available for both BDA to accelerate identification of new hypotheses and CER to rapidly generate evidence through hypothesis-testing studies. 
Clinical data from patient records can be used readily to identify novel relationships among clinical factors and patient outcomes, or to evaluate treatment effectiveness in specific subgroups, that cannot be studied adequately in randomized, controlled trials. The extreme power of RLHCS, though, is even more exciting when one considers the possibility of adding biospecimens to accelerate discovery in genomics and proteomics. As RLHCSs are created and their data sets are expanded, we will continue to identify specific genomic and proteomic data to help define cohorts and stratify patients into risk groups and treatment response groups, and potentially to help design highly tailored therapy regimens.[9] In this sense, the RLHCS would usher in a more fertile era for improving biomedical research than ever before. BDA and CER provide the research methodologies needed to rapidly generate evidence from the RLHCS. It should be noted, however, that there are substantial practical obstacles that must be addressed to achieve the vision of the RLHCS. 
These include patient concerns regarding privacy and security of sensitive information, interconnectivity among different health records, and regulatory barriers to the exchange of health information.\n\nIntegrating an RLHCS with oncology \nThe integration of CER, big data, and BDA is especially important in the field of oncology, where multiple groups are investing significant time and resources in efforts to expand the availability of data and advance the methods used to extract meaningful information from that data.[4][10][11][12][13][14] The American Society of Clinical Oncology started their own RLHCS, CancerLinQ, to overcome the lack of interoperability between EMRs and accomplish their goal of being able to \u201canalyze and share data on every patient with cancer.\u201d[15] While the vision of RLHCS has not yet been fully achieved, the potential impact on society has stimulated enthusiasm toward this effort.\n\nImplications for radiation oncology \nPatient reported outcomes (PROs) \nPatient reported outcomes and quality-of-life (QoL) have become a major area of focus in health care overall, particularly in oncology. The availability of PROs within EMRs provides the foundation for an RLHCS that can be leveraged to expand insights into how cancer treatments impact patient QoL. By incorporating the PROs for massive numbers of patients, RLHCS will be able to identify small variations and subgroups of patients that might be missed in the smaller number of patients included in traditional randomized controlled trials. 
These PROs and QoL domains can then be incorporated into clinical decision-making to help guide both providers and patients.[16] In doing this, PROs can act as a link between the objective clinical data and the subjective patient outcomes and experiences to help improve the overall care of the patient.[17] One may also conceive of potential genomics-based determinants of QoL that could be identified using BDA if RLHCSs include biospecimens linked to clinical data and PROs. Finally, surveillance of an RLHCS may also be performed to identify temporal trends in PROs to estimate outcomes after implementation of new technologies.\n\nDose selection and radiosensitivity \nThe use of tumor-specific genes and radiosensitivity to guide treatment decisions has already been established in human papilloma virus-associated squamous-cell carcinoma of the oropharynx.[18] Numerous studies have looked at identifying genes that may have implications for tumor radiosensitivity or patient toxicity.[19][20][21][22] The identification of these genes and their potential implications has led to the creation of the fields of radiogenetics and radiogenomics. Efforts are currently underway to generate meaningful gene assays that will help predict tumor response to radiation. Eschrich et al. created a 10-gene model to calculate a radiosensitivity index and applied this to patients with head-and-neck, rectal, and esophageal cancer to help stratify patients into either responders or non-responders, with 80% sensitivity and 82% specificity.[22] Similarly, Zhao et al. retrospectively created a 24-gene assay and applied this to risk matched patients who either received postoperative radiation or no radiation following prostatectomy. 
Patients with a high score on the gene index who received postoperative radiation were less likely to have distant metastasis at 10 years.[23]\nAs efforts to identify genes and gene assays that may be predictors of radiosensitivity continue to be validated, we will potentially be able to integrate these findings in dose selection and toxicity prediction for individual patients based on their native and tumor genetics. Scott and colleagues have recently described a genomics-based strategy for personalizing radiation therapy dose, which would support dose de-escalation for radiosensitive tumors.[24] While the clinical implications of radiosensitivity assays are still developing, big data will be key to developing future assays rapidly, as well as incorporating the genomics tools into clinical decision-making. Big data provides an opportunity to refine molecular signatures based upon real-world data and to merge genomic assay results with other clinical data elements to optimize predictive analytics. An RLHCS would provide the ideal substrate for leveraging big data and CER to accelerate genomics-based discovery to make precision radiation oncology a reality.\n\nPersonalized treatment recommendations \nRadiation oncology is unique in that treatment plans for patients are often already technically and physically personalized due to patient-specific variations in anatomy, tumor characteristics, and stage. Since a patient\u2019s treatment plan is usually based upon a CT scan in treatment position, radiation can be considered an inherently personalized form of medicine. However, treatment planning approaches and radiation doses are generally selected based upon class solution, with technical details such as beam arrangements and dose\u2013volume constraints adherent to generalized rules. 
Multiple studies have already begun to look at how BDA methods such as machine learning and neural networks can be used to aid in dose optimization and toxicity prediction modeling in radiation oncology[17][25][26][27], which could provide more optimal treatment plan alternatives for individual patients. As the data and technology behind RLHCS continue to progress, we will likely be able to utilize a full spectrum of patient-specific clinical factors, PROs, genomics, patient preferences and priorities, and a menu of treatment plan alternatives in order to optimize an individual patient\u2019s radiation therapy. In order to deliver high-quality, high-impact insights into radiation oncology, it is important that large datasets include detailed technical information.\n\nConclusion \nMuch of the excitement regarding big data has centered on potential for genomic discovery, high-level radiation treatment planning, and leveraging EMRs to identify associations among factors that may provide new insights into potential causal relationships that can be further studied to accelerate progress in cancer care. Although these are certainly promising areas for discovery, we most eagerly anticipate the power of big data to connect a broad range of characteristics to accelerate evidence generation and inform personalized decision-making. 
We envision the use of big data and CER methods to inform the individual decisions of patients and providers by synthesizing clinical and genomic data and querying an RLHCS for the latest data on effectiveness of treatment options in relevant subgroups of patients.\n\nAcknowledgements \nAuthor contributions \nBoth authors contributed to the development and editing of the manuscript and approved the final submitted version.\n\nConflict of interest statement \nThe authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.\n\nReferences \n\n\n[1] Institute of Medicine of the National Academies (2009). Initial National Priorities for Comparative Effectiveness Research. National Academies Press. ISBN 9780309138369. https:\/\/www.nap.edu\/catalog\/12648\/initial-national-priorities-for-comparative-effectiveness-research\n\n[2] Greenfield, S.; Rich, E. (2012). \"Welcome to the Journal of Comparative Effectiveness Research\". Journal of Comparative Effectiveness Research 1 (1): 1\u20133. doi:10.2217\/cer.11.13. PMID 24237290.\n\n[3] Sivarajah, U.; Kamal, M.M.; Irani, Z.; Weerakkody, V. (2017). \"Critical analysis of Big Data challenges and analytical methods\". Journal of Business Research 70: 263\u201386. doi:10.1016\/j.jbusres.2016.08.001.\n\n[4] Margolis, R.; Derr, L.; Dunn, M. et al. (2014). \"The National Institutes of Health's Big Data to Knowledge (BD2K) initiative: Capitalizing on biomedical big data\". JAMIA 21 (6): 957\u20138. doi:10.1136\/amiajnl-2014-002974. PMC4215061. PMID 25008006. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4215061 
\n\n[5] de la Torre D\u00edez, I.; Cosgava, H.M.; Garcia-Zapirain, B.; L\u00f3pez-Coronado, M. (2016). \"Big Data in Health: a Literature Review from the Year 2005\". Journal of Medical Systems 40 (9): 209. doi:10.1007\/s10916-016-0565-7. PMID 27520614.\n\n[6] Ginsburg, G.S.; Kuderer, N.M. (2012). \"Comparative effectiveness research, genomics-enabled personalized medicine, and rapid learning health care: A common bond\". Journal of Clinical Oncology 30 (34): 4233\u201342. doi:10.1200\/JCO.2012.42.6114. PMC3504328. PMID 23071236. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3504328\n\n[7] Ginsburg, G.S.; Staples, J.; Abernethy, A.P. (2011). \"Academic medical centers: Ripe for rapid-learning personalized health care\". Science Translational Medicine 3 (101): 101cm27. doi:10.1126\/scitranslmed.3002386. PMID 21937754.\n\n[8] Abernethy, A.P.; Etheredge, L.M.; Ganz, P.A. et al. (2010). \"Rapid-learning system for cancer care\". Journal of Clinical Oncology 28 (27): 4268\u201374. doi:10.1200\/JCO.2010.28.5478. PMC2953977. PMID 20585094. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2953977\n\n[9] Ramsey, S.D.; Veenstra, D.; Tunis, S.R. et al. (2011). \"How comparative effectiveness research can help advance 'personalized medicine' in cancer treatment\". Health Affairs 30 (12): 2259\u201368. doi:10.1377\/hlthaff.2010.0637. PMC3477796. PMID 22147853. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3477796\n\n[10] Helft, M. (2014). \"Can big data cure cancer?\". Fortune 170 (2): 70\u20134, 76, 78. PMID 25318238. 
\n\n[11] Williams, A.M.; Liu, Y.; Regner, K.R. et al. (2018). \"Artificial intelligence, physiological genomics, and precision medicine\". Physiological Genomics 50 (4): 237\u201343. doi:10.1152\/physiolgenomics.00119.2017. PMC5966805. PMID 29373082. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5966805\n\n[12] Savage, N. (2014). \"Big data versus the big C\". Scientific American 311 (1): S20\u20131. PMID 24974705.\n\n[13] Shah, A.; Stewart, A.K.; Kolacevski, A. et al. (2016). \"Building a rapid learning health care system for oncology: Why CancerLinQ collects identifiable health information to achieve its vision\". Journal of Clinical Oncology 34 (7): 756\u201363. doi:10.1200\/JCO.2015.65.0598. PMID 26755519.\n\n[14] Trifiletti, D.M.; Showalter, T.N. (2015). \"Big Data and Comparative Effectiveness Research in Radiation Oncology: Synergy and Accelerated Discovery\". Frontiers in Oncology 5: 274. doi:10.3389\/fonc.2015.00274. PMC4672039. PMID 26697409. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4672039\n\n[15] \"Shaping the Future of Oncology: Envisioning Cancer Care in 2030: Outcomes of the ASCO Board of Directors Strategic Planning and Visioning Process, 2011-2012\". American Society of Clinical Oncology. 2011. https:\/\/www.asco.org\/sites\/default\/files\/shapingfuture-lowres.pdf\n\n[16] Sarin, R. (2014). \"Big Data V4 for integrating patient reported outcomes and quality-of-life indices in clinical practice\". Journal of Cancer Research and Therapeutics 10 (3): 453\u20135. doi:10.4103\/0973-1482.142741. PMID 25313720.\n\n[17] Kim, K.H.; Lee, S.; Shim, J.B. 
et al. (2017). \"Predictive modelling analysis for development of a radiotherapy decision support system in prostate cancer: A preliminary study\". Journal of Radiotherapy in Practice 16 (2): 161\u201370. doi:10.1017\/S1460396916000583.\n\n[18] Chen, A.M.; Felix, C.; Wang, P.C. et al. (2017). \"Reduced-dose radiotherapy for human papillomavirus-associated squamous-cell carcinoma of the oropharynx: A single-arm, phase 2 study\". The Lancet Oncology 18 (6): 803\u201311. doi:10.1016\/S1470-2045(17)30246-2. PMID 28434660.\n\n[19] West, C.M.; Barnett, G.C. (2011). \"Genetics and genomics of radiotherapy toxicity: Towards prediction\". Genome Medicine 3 (8): 52. doi:10.1186\/gm268. PMC3238178. PMID 21861849. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3238178\n\n[20] Torres-Roca, J.F.; Eschrich, S.; Zhao, H. et al. (2005). \"Prediction of radiation sensitivity using a gene expression classifier\". Cancer Research 65 (16): 7169\u201376. doi:10.1158\/0008-5472.CAN-05-0656. PMID 16103067.\n\n[21] Chistiakov, D.A.; Voronova, N.V.; Chistiakov, P.A. (2008). \"Genetic variations in DNA repair genes, radiosensitivity to cancer and susceptibility to acute tissue reactions in radiotherapy-treated cancer patients\". Acta Oncologica 47 (5): 809\u201324. doi:10.1080\/02841860801885969. PMID 18568480.\n\n[22] Eschrich, S.A.; Pramana, J.; Zhang, H. 
et al. (2009). \"A gene expression model of intrinsic tumor radiosensitivity: Prediction of response and prognosis after chemoradiation\". International Journal of Radiation Oncology, Biology, and Physics 75 (2): 489\u201396. doi:10.1016\/j.ijrobp.2009.06.014. PMC3038688. PMID 19735873. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3038688\n\n[23] Zhao, S.G.; Chang, S.L.; Spratt, D.E. et al. (2016). \"Development and validation of a 24-gene predictor of response to postoperative radiotherapy in prostate cancer: A matched, retrospective analysis\". The Lancet Oncology 17 (11): 1612\u201320. doi:10.1016\/S1470-2045(16)30491-0. PMID 27743920.\n\n[24] Scott, J.G.; Berglund, A.; Schell, M.J. et al. (2017). \"A genome-based model for adjusting radiotherapy dose (GARD): A retrospective, cohort-based study\". The Lancet Oncology 18 (2): 202\u2013211. doi:10.1016\/S1470-2045(16)30648-9. PMID 27993569.\n\n[25] Kim, K.H.; Lee, S.; Shim, J.B. et al. (2017). \"A text-based data mining and toxicity prediction modeling system for a clinical decision support in radiation oncology: A preliminary study\". Journal of the Korean Physical Society 71 (4): 231\u20137. doi:10.3938\/jkps.71.231.\n\n[26] Arimura, H.; Nakamoto, T. (2016). \"Applications of Machine Learning for Radiation Therapy\". Igaku Butsuri 36 (1): 35\u20138. doi:10.11323\/jjmp.36.1_35. PMID 28428495.\n\n[27] Nicolae, A.; Morton, G.; Chung, H. et al. (2017). \"Evaluation of a Machine-Learning Algorithm for Treatment Planning in Prostate Low-Dose-Rate Brachytherapy\". International Journal of Radiation Oncology, Biology, and Physics 97 (4): 822\u2013829. doi:10.1016\/j.ijrobp.2016.11.036. PMID 28244419. 
\n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added.\n\n\n\n\n\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\">https:\/\/www.limswiki.org\/index.php\/Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology<\/a>\n\t\t\t\t\tCategories: LIMSwiki journal articles (added in 2018)LIMSwiki journal articles (all)LIMSwiki journal articles on big dataLIMSwiki journal articles on health informatics\n\nThis page was last modified on 13 August 2018, at 18:05.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n\n","2c1bea416fe89e4530ea8d302ad49dbc_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_How_big_data_comparative_effectiveness_research_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:How big data, comparative effectiveness research, and rapid-learning health care systems can transform patient care in radiation oncology<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>Big data and comparative effectiveness research methodologies can be 
applied within the framework of a rapid-learning health care system (RLHCS) to accelerate discovery and to help turn the dream of fully personalized medicine into a reality. We synthesize recent advances in <a href=\"https:\/\/www.limswiki.org\/index.php\/Genomics\" title=\"Genomics\" target=\"_blank\" class=\"wiki-link\" data-key=\"96a82dabf51cf9510dd00c5a03396c44\">genomics<\/a> with trends in big data to provide a forward-looking perspective on the potential of new advances to usher in an era of personalized radiation therapy, with emphases on the power of RLHCS to accelerate discovery and the future of individualized radiation treatment planning.\n<\/p><p><b>Keywords<\/b>: big data, radiation oncology, comparative effectiveness research, rapid-learning health care system, personalized radiation therapy\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Comparative_effectiveness_research_.28CER.29_and_big_data\">Comparative effectiveness research (CER) and big data<\/span><\/h2>\n<p>The Committee on CER Prioritization was created by the Institute of Medicine in 2009. They defined CER as \u201ca strategy that focuses on the practical comparison of two or more health intervention to discern what works best for which patients and populations.\u201d<sup id=\"rdp-ebb-cite_ref-IoMInitial09_1-0\" class=\"reference\"><a href=\"#cite_note-IoMInitial09-1\" rel=\"external_link\">[1]<\/a><\/sup> In essence, the goal of CER is to identify \u201cwhich treatment will work best, in which patient, under what circumstances.\u201d<sup id=\"rdp-ebb-cite_ref-GreenfieldWelcome12_2-0\" class=\"reference\"><a href=\"#cite_note-GreenfieldWelcome12-2\" rel=\"external_link\">[2]<\/a><\/sup> Big data refers to data sets that are so large that they cannot be analyzed directly by individuals or traditional processing software. 
Big data analytics (BDA) is a growing field with a multitude of methods that is being utilized in various sectors from business to medicine.<sup id=\"rdp-ebb-cite_ref-SivarajahCritical17_3-0\" class=\"reference\"><a href=\"#cite_note-SivarajahCritical17-3\" rel=\"external_link\">[3]<\/a><\/sup> The advent of the <a href=\"https:\/\/www.limswiki.org\/index.php\/Electronic_medical_record\" title=\"Electronic medical record\" target=\"_blank\" class=\"wiki-link\" data-key=\"99a695d2af23397807da0537d29d0be7\">electronic medical record<\/a> (EMR) has resulted in the digitization of massive data sets of medical information, including clinic encounters, <a href=\"https:\/\/www.limswiki.org\/index.php\/Laboratory\" title=\"Laboratory\" target=\"_blank\" class=\"wiki-link\" data-key=\"c57fc5aac9e4abf31dccae81df664c33\">laboratory<\/a> values, imaging data sets and reports, pathology reports, patient outcomes, and family history, as well as genomic and biological data, etc.\n<\/p><p>To help with the <a href=\"https:\/\/www.limswiki.org\/index.php\/Data_analysis\" title=\"Data analysis\" target=\"_blank\" class=\"wiki-link\" data-key=\"545c95e40ca67c9e63cd0a16042a5bd1\">analysis<\/a> of big data, the <a href=\"https:\/\/www.limswiki.org\/index.php\/National_Institutes_of_Health\" title=\"National Institutes of Health\" target=\"_blank\" class=\"wiki-link\" data-key=\"e5c215c48e73ae58b0695dc2af951cd0\">National Institutes of Health<\/a> (NIH) has created the Big Data to Knowledge (BD2K) program, which has invested over $200 million in grant awards to foster the development of methods and tools to analyze big data in biomedical research.<sup id=\"rdp-ebb-cite_ref-MargolisTheNat14_4-0\" class=\"reference\"><a href=\"#cite_note-MargolisTheNat14-4\" rel=\"external_link\">[4]<\/a><\/sup> Additionally, the BD2K program will move to make sure that biomedical big data is \u201cfindable, accessible, interoperable, and reusable\u201d (FAIR).<sup 
id=\"rdp-ebb-cite_ref-MargolisTheNat14_4-1\" class=\"reference\"><a href=\"#cite_note-MargolisTheNat14-4\" rel=\"external_link\">[4]<\/a><\/sup> Over the past decade, CER methodologies have become increasingly prevalent in radiation oncology research, and there is much enthusiasm surrounding BDA.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Rapid-learning_health_care_system_.28RLHCS.29_and_personalized_medicine\">Rapid-learning health care system (RLHCS) and personalized medicine<\/span><\/h2>\n<p>The number of articles on big data in health care has increased exponentially from under 500 articles in 2005 to over 2500 articles in 2015.<sup id=\"rdp-ebb-cite_ref-deLaTorreD.C3.ADezBig16_5-0\" class=\"reference\"><a href=\"#cite_note-deLaTorreD.C3.ADezBig16-5\" rel=\"external_link\">[5]<\/a><\/sup> As the amount of biomedical big data and our ability to analyze these data continue to advance, so will the implications and use of the <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a> we are able to extract. 
One of the most important steps toward advancing our ability to analyze these big data for biomedical discovery is the creation of RLHCS, which will allow for the sharing of patient data between EMRs, ideally in real time.<sup id=\"rdp-ebb-cite_ref-GinsburgCompar12_6-0\" class=\"reference\"><a href=\"#cite_note-GinsburgCompar12-6\" rel=\"external_link\">[6]<\/a><\/sup> An ideal RLHCS would take patient data that was routinely generated as part of standard patient care and compile that data into a large data system.<sup id=\"rdp-ebb-cite_ref-GinsburgCompar12_6-1\" class=\"reference\"><a href=\"#cite_note-GinsburgCompar12-6\" rel=\"external_link\">[6]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-GinsburgAcademic11_7-0\" class=\"reference\"><a href=\"#cite_note-GinsburgAcademic11-7\" rel=\"external_link\">[7]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-AbernathyRapid10_8-0\" class=\"reference\"><a href=\"#cite_note-AbernathyRapid10-8\" rel=\"external_link\">[8]<\/a><\/sup> This aggregate data would then be available for both BDA to accelerate identification of new hypotheses and CER to rapidly generate evidence through hypothesis-testing studies. Clinical data from patient records can be used readily to identify novel relationships among clinical factors and patient outcomes, or to evaluate treatment effectiveness in specific subgroups, that cannot be studied adequately in randomized, controlled trials. The extreme power of RLHCS, though, is even more exciting when one considers the possibility of adding biospecimens to accelerate discovery in genomics and proteomics. 
As RLHCSs are created and their data sets are expanded, we will continue to identify specific genomic and proteomic data to help define cohorts and stratify patients into risk groups and treatment response groups, and potentially to help design highly tailored therapy regimens.<sup id=\"rdp-ebb-cite_ref-RamseyHow11_9-0\" class=\"reference\"><a href=\"#cite_note-RamseyHow11-9\" rel=\"external_link\">[9]<\/a><\/sup> In this sense, the RLHCS would usher in a more fertile era for improving biomedical research than ever before. BDA and CER provide the research methodologies needed to rapidly generate evidence from the RLHCS. It should be noted, however, that there are substantial practical obstacles that must be addressed to achieve the vision of the RLHCS. These include patient concerns regarding privacy and security of sensitive information, interconnectivity among different health records, and regulatory barriers to the exchange of health information.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Integrating_an_RLHCS_with_oncology\">Integrating an RLHCS with oncology<\/span><\/h2>\n<p>The integration of CER, big data, and BDA is especially important in the field of oncology, where multiple groups are investing significant time and resources in efforts to expand the availability of data and advance the methods used to extract meaningful information from that data.<sup id=\"rdp-ebb-cite_ref-MargolisTheNat14_4-2\" class=\"reference\"><a href=\"#cite_note-MargolisTheNat14-4\" rel=\"external_link\">[4]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-HelftCanBig14_10-0\" class=\"reference\"><a href=\"#cite_note-HelftCanBig14-10\" rel=\"external_link\">[10]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-WilliamsArtificial18_11-0\" class=\"reference\"><a href=\"#cite_note-WilliamsArtificial18-11\" rel=\"external_link\">[11]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-SavageBigData14_12-0\" class=\"reference\"><a href=\"#cite_note-SavageBigData14-12\" rel=\"external_link\">[12]<\/a><\/sup><sup 
id=\"rdp-ebb-cite_ref-ShahBuilding16_13-0\" class=\"reference\"><a href=\"#cite_note-ShahBuilding16-13\" rel=\"external_link\">[13]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-TrifilettiBigData15_14-0\" class=\"reference\"><a href=\"#cite_note-TrifilettiBigData15-14\" rel=\"external_link\">[14]<\/a><\/sup> The American Society of Clinical Oncology started their own RLHCS, CancerLinQ, to overcome the lack of interoperability between EMRs and accomplish their goal of being able to \u201canalyze and share data on every patient with cancer.\u201d<sup id=\"rdp-ebb-cite_ref-ASCOShaping11_15-0\" class=\"reference\"><a href=\"#cite_note-ASCOShaping11-15\" rel=\"external_link\">[15]<\/a><\/sup> While the vision of RLHCS has not yet been fully achieved, the potential impact on society has stimulated enthusiasm toward this effort.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Implications_for_radiation_oncology\">Implications for radiation oncology<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Patient_reported_outcomes_.28PROs.29\">Patient reported outcomes (PROs)<\/span><\/h3>\n<p>Patient reported outcomes and quality-of-life (QoL) have become a major area of focus in health care overall, particularly in oncology. The availability of PROs within EMRs provides the foundation for an RLHCS that can be leveraged to expand insights into how cancer treatments impact patient QoL. By incorporating the PROs for massive numbers of patients, RLHCS will be able to identify small variations and subgroups of patients that might be missed in the smaller number of patients included in traditional randomized controlled trials. 
These PROs and QoL domains can then be incorporated into clinical decision-making to help guide both providers and patients.<sup id=\"rdp-ebb-cite_ref-SarinBigData14_16-0\" class=\"reference\"><a href=\"#cite_note-SarinBigData14-16\" rel=\"external_link\">[16]<\/a><\/sup> In doing this, PROs can act as a link between the objective clinical data and the subjective patient outcomes and experiences to help improve the overall care of the patient.<sup id=\"rdp-ebb-cite_ref-KimPredict17_17-0\" class=\"reference\"><a href=\"#cite_note-KimPredict17-17\" rel=\"external_link\">[17]<\/a><\/sup> One may also conceive of potential genomics-based determinants of QoL that could be identified using BDA if RLHCSs include biospecimens linked to clinical data and PROs. Finally, surveillance of an RLHCS may also be performed to identify temporal trends in PROs to estimate outcomes after implementation of new technologies.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Dose_selection_and_radiosensitivity\">Dose selection and radiosensitivity<\/span><\/h3>\n<p>The use of tumor-specific genes and radiosensitivity to guide treatment decisions has already been established in human papilloma virus-associated squamous-cell carcinoma of the oropharynx.<sup id=\"rdp-ebb-cite_ref-ChenReduced17_18-0\" class=\"reference\"><a href=\"#cite_note-ChenReduced17-18\" rel=\"external_link\">[18]<\/a><\/sup> Numerous studies have looked at identifying genes that may have implications for tumor radiosensitivity or patient toxicity.<sup id=\"rdp-ebb-cite_ref-WestGenetics11_19-0\" class=\"reference\"><a href=\"#cite_note-WestGenetics11-19\" rel=\"external_link\">[19]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-Torres-RocaPredict05_20-0\" class=\"reference\"><a href=\"#cite_note-Torres-RocaPredict05-20\" rel=\"external_link\">[20]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-ChistiakovGenetic08_21-0\" class=\"reference\"><a href=\"#cite_note-ChistiakovGenetic08-21\" rel=\"external_link\">[21]<\/a><\/sup><sup 
id=\"rdp-ebb-cite_ref-EschrichAGene09_22-0\" class=\"reference\"><a href=\"#cite_note-EschrichAGene09-22\" rel=\"external_link\">[22]<\/a><\/sup> The identification of these genes and their potential implications has led to the creation of the fields of radiogenetics and radiogenomics. Efforts are currently underway to generate meaningful gene assays that will help predict tumor response to radiation. Eschrich <i>et al.<\/i> created a 10-gene model to calculate a radiosensitivity index and applied it to patients with head-and-neck, rectal, and esophageal cancer to help stratify patients into either responders or non-responders, with 80% sensitivity and 82% specificity.<sup id=\"rdp-ebb-cite_ref-EschrichAGene09_22-1\" class=\"reference\"><a href=\"#cite_note-EschrichAGene09-22\" rel=\"external_link\">[22]<\/a><\/sup> Similarly, Zhao <i>et al.<\/i> retrospectively created a 24-gene assay and applied it to risk-matched patients who received either postoperative radiation or no radiation following prostatectomy. Patients with a high score on the gene index who received postoperative radiation were less likely to have distant metastasis at 10 years.<sup id=\"rdp-ebb-cite_ref-ZhaoDevelop16_23-0\" class=\"reference\"><a href=\"#cite_note-ZhaoDevelop16-23\" rel=\"external_link\">[23]<\/a><\/sup>\n<\/p><p>As genes and gene assays that may predict radiosensitivity continue to be identified and validated, we will potentially be able to integrate these findings into dose selection and toxicity prediction for individual patients based on their native and tumor genetics. 
Scott and colleagues have recently described a genomics-based strategy for personalizing radiation therapy dose, which would support dose de-escalation for radiosensitive tumors.<sup id=\"rdp-ebb-cite_ref-ScottAGenome17_24-0\" class=\"reference\"><a href=\"#cite_note-ScottAGenome17-24\" rel=\"external_link\">[24]<\/a><\/sup> While the clinical implications of radiosensitivity assays are still developing, big data will be key to developing future assays rapidly, as well as incorporating the genomics tools into clinical decision-making. Big data provides an opportunity to refine molecular signatures based upon real-world data and to merge genomic assay results with other clinical data elements to optimize predictive analytics. An RLHCS would provide the ideal substrate for leveraging big data and CER to accelerate genomics-based discovery to make precision radiation oncology a reality.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Personalized_treatment_recommendations\">Personalized treatment recommendations<\/span><\/h3>\n<p>Radiation oncology is unique in that treatment plans for patients are often already technically and physically personalized due to patient-specific variations in anatomy, tumor characteristics, and stage. Since a patient\u2019s treatment plan is usually based upon a CT scan in treatment position, radiation can be considered an inherently personalized form of medicine. However, treatment planning approaches and radiation doses are generally selected based upon class solutions, with technical details such as beam arrangements and dose\u2013volume constraints adhering to generalized rules. 
Multiple studies have already begun to look at how BDA methods such as machine learning and neural networks can be used to aid in dose optimization and toxicity prediction modeling in radiation oncology<sup id=\"rdp-ebb-cite_ref-KimPredict17_17-1\" class=\"reference\"><a href=\"#cite_note-KimPredict17-17\" rel=\"external_link\">[17]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-KimAText17_25-0\" class=\"reference\"><a href=\"#cite_note-KimAText17-25\" rel=\"external_link\">[25]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-ArimuraApplications16_26-0\" class=\"reference\"><a href=\"#cite_note-ArimuraApplications16-26\" rel=\"external_link\">[26]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-NicolaeEval17_27-0\" class=\"reference\"><a href=\"#cite_note-NicolaeEval17-27\" rel=\"external_link\">[27]<\/a><\/sup>, which could provide better treatment plan alternatives for individual patients. As the data and technology behind RLHCS continue to progress, we will likely be able to utilize a full spectrum of patient-specific clinical factors, PROs, genomics, patient preferences and priorities, and a menu of treatment plan alternatives in order to optimize an individual patient\u2019s radiation therapy. In order to deliver high-quality, high-impact insights into radiation oncology, it is important that large datasets include detailed technical information.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusion\">Conclusion<\/span><\/h2>\n<p>Much of the excitement regarding big data has centered on the potential for genomic discovery, high-level radiation treatment planning, and leveraging EMRs to identify associations among factors that may provide new insights into potential causal relationships that can be further studied to accelerate progress in cancer care. Although these are certainly promising areas for discovery, we most eagerly anticipate the power of big data to connect a broad range of characteristics to accelerate evidence generation and inform personalized decision-making. 
We envision the use of big data and CER methods to inform the individual decisions of patients and providers by synthesizing clinical and genomic data and querying an RLHCS for the latest data on effectiveness of treatment options in relevant subgroups of patients.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Author_contributions\">Author contributions<\/span><\/h3>\n<p>Both authors contributed to the development and editing of the manuscript and approved the final submitted version.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Conflict_of_interest_statement\">Conflict of interest statement<\/span><\/h3>\n<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-IoMInitial09-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-IoMInitial09_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Institute of Medicine of the National Academies&#32;(2009).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.nap.edu\/catalog\/12648\/initial-national-priorities-for-comparative-effectiveness-research\" target=\"_blank\"><i>Initial National Priorities for Comparative Effectiveness Research<\/i><\/a>.&#32;National Academies Press.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9780309138369<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" 
href=\"https:\/\/www.nap.edu\/catalog\/12648\/initial-national-priorities-for-comparative-effectiveness-research\" target=\"_blank\">https:\/\/www.nap.edu\/catalog\/12648\/initial-national-priorities-for-comparative-effectiveness-research<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Initial+National+Priorities+for+Comparative+Effectiveness+Research&amp;rft.aulast=Institute+of+Medicine+of+the+National+Academies&amp;rft.au=Institute+of+Medicine+of+the+National+Academies&amp;rft.date=2009&amp;rft.pub=National+Academies+Press&amp;rft.isbn=9780309138369&amp;rft_id=https%3A%2F%2Fwww.nap.edu%2Fcatalog%2F12648%2Finitial-national-priorities-for-comparative-effectiveness-research&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GreenfieldWelcome12-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GreenfieldWelcome12_2-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Greenfield, S.; Rich, E.&#32;(2012).&#32;\"Welcome to the Journal of Comparative Effectiveness Research\".&#32;<i>Journal of Comparative Effectiveness Research<\/i>&#32;<b>1<\/b>&#32;(1): 1\u20133.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.2217%2Fcer.11.13\" target=\"_blank\">10.2217\/cer.11.13<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24237290\" target=\"_blank\">24237290<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Welcome+to+the+Journal+of+Comparative+Effectiveness+Research&amp;rft.jtitle=Journal+of+Comparative+Effectiveness+Research&amp;rft.aulast=Greenfield%2C+S.%3B+Rich%2C+E.&amp;rft.au=Greenfield%2C+S.%3B+Rich%2C+E.&amp;rft.date=2012&amp;rft.volume=1&amp;rft.issue=1&amp;rft.pages=1%E2%80%933&amp;rft_id=info:doi\/10.2217%2Fcer.11.13&amp;rft_id=info:pmid\/24237290&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SivarajahCritical17-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SivarajahCritical17_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Sivarajah, U.; Kamal, M.M.; Irani, Z.; Weerakkody, V.&#32;(2017).&#32;\"Critical analysis of Big Data challenges and analytical methods\".&#32;<i>Journal of Business Research<\/i>&#32;<b>70<\/b>: 263\u201386.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.jbusres.2016.08.001\" target=\"_blank\">10.1016\/j.jbusres.2016.08.001<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Critical+analysis+of+Big+Data+challenges+and+analytical+methods&amp;rft.jtitle=Journal+of+Business+Research&amp;rft.aulast=Sivarajah%2C+U.%3B+Kamal%2C+M.M.%3B+Irani%2C+Z.%3B+Weerakkody%2C+V.&amp;rft.au=Sivarajah%2C+U.%3B+Kamal%2C+M.M.%3B+Irani%2C+Z.%3B+Weerakkody%2C+V.&amp;rft.date=2017&amp;rft.volume=70&amp;rft.pages=263%E2%80%9386&amp;rft_id=info:doi\/10.1016%2Fj.jbusres.2016.08.001&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MargolisTheNat14-4\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-MargolisTheNat14_4-0\" rel=\"external_link\">4.0<\/a><\/sup> <sup><a href=\"#cite_ref-MargolisTheNat14_4-1\" rel=\"external_link\">4.1<\/a><\/sup> <sup><a href=\"#cite_ref-MargolisTheNat14_4-2\" rel=\"external_link\">4.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Margolis, R.; Derr, L.; Dunn, M. 
et al.&#32;(2014).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4215061\" target=\"_blank\">\"The National Institutes of Health's Big Data to Knowledge (BD2K) initiative: Capitalizing on biomedical big data\"<\/a>.&#32;<i>JAMIA<\/i>&#32;<b>21<\/b>&#32;(6): 957\u20138.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1136%2Famiajnl-2014-002974\" target=\"_blank\">10.1136\/amiajnl-2014-002974<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4215061\/\" target=\"_blank\">PMC4215061<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25008006\" target=\"_blank\">25008006<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4215061\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4215061<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+National+Institutes+of+Health%27s+Big+Data+to+Knowledge+%28BD2K%29+initiative%3A+Capitalizing+on+biomedical+big+data&amp;rft.jtitle=JAMIA&amp;rft.aulast=Margolis%2C+R.%3B+Derr%2C+L.%3B+Dunn%2C+M.+et+al.&amp;rft.au=Margolis%2C+R.%3B+Derr%2C+L.%3B+Dunn%2C+M.+et+al.&amp;rft.date=2014&amp;rft.volume=21&amp;rft.issue=6&amp;rft.pages=957%E2%80%938&amp;rft_id=info:doi\/10.1136%2Famiajnl-2014-002974&amp;rft_id=info:pmc\/PMC4215061&amp;rft_id=info:pmid\/25008006&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4215061&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-deLaTorreD.C3.ADezBig16-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-deLaTorreD.C3.ADezBig16_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">de la Torre D\u00edez, I.; Cosgava, H.M.; Garcia-Zapirain, B.; L\u00f3pez-Coronado, M.&#32;(2016).&#32;\"Big Data in Health: a Literature Review from the Year 2005\".&#32;<i>Journal of Medical Systems<\/i>&#32;<b>40<\/b>&#32;(9): 209.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs10916-016-0565-7\" target=\"_blank\">10.1007\/s10916-016-0565-7<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27520614\" 
target=\"_blank\">27520614<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Big+Data+in+Health%3A+a+Literature+Review+from+the+Year+2005&amp;rft.jtitle=Journal+of+Medical+Systems&amp;rft.aulast=de+la+Torre+D%C3%ADez%2C+I.%3B+Cosgava%2C+H.M.%3B+Garcia-Zapirain%2C+B.%3B+L%C3%B3pez-Coronado%2C+M.&amp;rft.au=de+la+Torre+D%C3%ADez%2C+I.%3B+Cosgava%2C+H.M.%3B+Garcia-Zapirain%2C+B.%3B+L%C3%B3pez-Coronado%2C+M.&amp;rft.date=2016&amp;rft.volume=40&amp;rft.issue=9&amp;rft.pages=209&amp;rft_id=info:doi\/10.1007%2Fs10916-016-0565-7&amp;rft_id=info:pmid\/27520614&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GinsburgCompar12-6\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-GinsburgCompar12_6-0\" rel=\"external_link\">6.0<\/a><\/sup> <sup><a href=\"#cite_ref-GinsburgCompar12_6-1\" rel=\"external_link\">6.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ginsburg, G.S.; Kuderer, N.M.&#32;(2012).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3504328\" target=\"_blank\">\"Comparative effectiveness research, genomics-enabled personalized medicine, and rapid learning health care: A common bond\"<\/a>.&#32;<i>Journal of Clinical Oncology<\/i>&#32;<b>30<\/b>&#32;(34): 4233-42.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1200%2FJCO.2012.42.6114\" target=\"_blank\">10.1200\/JCO.2012.42.6114<\/a>.&#32;<a rel=\"external_link\" 
class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3504328\/\" target=\"_blank\">PMC3504328<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23071236\" target=\"_blank\">23071236<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3504328\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3504328<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Comparative+effectiveness+research%2C+genomics-enabled+personalized+medicine%2C+and+rapid+learning+health+care%3A+A+common+bond&amp;rft.jtitle=Journal+of+Clinical+Oncology&amp;rft.aulast=Ginsburg%2C+G.S.%3B+Kuderer%2C+N.M.&amp;rft.au=Ginsburg%2C+G.S.%3B+Kuderer%2C+N.M.&amp;rft.date=2012&amp;rft.volume=30&amp;rft.issue=34&amp;rft.pages=4233-42&amp;rft_id=info:doi\/10.1200%2FJCO.2012.42.6114&amp;rft_id=info:pmc\/PMC3504328&amp;rft_id=info:pmid\/23071236&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3504328&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GinsburgAcademic11-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GinsburgAcademic11_7-0\" rel=\"external_link\">\u2191<\/a><\/span> <span 
class=\"reference-text\"><span class=\"citation Journal\">Ginsburg, G.S.; Staples, J.; Abernethy, A.P.&#32;(2011).&#32;\"Academic medical centers: Ripe for rapid-learning personalized health care\".&#32;<i>Science Translational Medicine<\/i>&#32;<b>3<\/b>&#32;(101): 101cm27.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1126%2Fscitranslmed.3002386\" target=\"_blank\">10.1126\/scitranslmed.3002386<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/21937754\" target=\"_blank\">21937754<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Academic+medical+centers%3A+Ripe+for+rapid-learning+personalized+health+care&amp;rft.jtitle=Science+Translational+Medicine&amp;rft.aulast=Ginsburg%2C+G.S.%3B+Staples%2C+J.%3B+Abernethy%2C+A.P.&amp;rft.au=Ginsburg%2C+G.S.%3B+Staples%2C+J.%3B+Abernethy%2C+A.P.&amp;rft.date=2011&amp;rft.volume=3&amp;rft.issue=101&amp;rft.pages=101cm27&amp;rft_id=info:doi\/10.1126%2Fscitranslmed.3002386&amp;rft_id=info:pmid\/21937754&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AbernathyRapid10-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AbernathyRapid10_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Abernethy, A.P.; Etheredge, L.M.; Ganz, P.A. 
et al.&#32;(2010).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2953977\" target=\"_blank\">\"Rapid-learning system for cancer care\"<\/a>.&#32;<i>Journal of Clinical Oncology<\/i>&#32;<b>28<\/b>&#32;(27): 4268-74.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1200%2FJCO.2010.28.5478\" target=\"_blank\">10.1200\/JCO.2010.28.5478<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2953977\/\" target=\"_blank\">PMC2953977<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20585094\" target=\"_blank\">20585094<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2953977\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC2953977<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Rapid-learning+system+for+cancer+care&amp;rft.jtitle=Journal+of+Clinical+Oncology&amp;rft.aulast=Abernethy%2C+A.P.%3B+Etheredge%2C+L.M.%3B+Ganz%2C+P.A.+et+al.&amp;rft.au=Abernethy%2C+A.P.%3B+Etheredge%2C+L.M.%3B+Ganz%2C+P.A.+et+al.&amp;rft.date=2010&amp;rft.volume=28&amp;rft.issue=27&amp;rft.pages=4268-74&amp;rft_id=info:doi\/10.1200%2FJCO.2010.28.5478&amp;rft_id=info:pmc\/PMC2953977&amp;rft_id=info:pmid\/20585094&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC2953977&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-RamseyHow11-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-RamseyHow11_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ramsey, S.D.; Veenstra, D.; Tunis, S.R. 
et al.&#32;(2011).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3477796\" target=\"_blank\">\"How comparative effectiveness research can help advance 'personalized medicine' in cancer treatment\"<\/a>.&#32;<i>Health Affairs<\/i>&#32;<b>30<\/b>&#32;(12): 2259\u201368.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1377%2Fhlthaff.2010.0637\" target=\"_blank\">10.1377\/hlthaff.2010.0637<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3477796\/\" target=\"_blank\">PMC3477796<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22147853\" target=\"_blank\">22147853<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3477796\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3477796<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=How+comparative+effectiveness+research+can+help+advance+%27personalized+medicine%27+in+cancer+treatment&amp;rft.jtitle=Health+Affairs&amp;rft.aulast=Ramsey%2C+S.D.%3B+Veenstra%2C+D.%3B+Tunis%2C+S.R.+et+al.&amp;rft.au=Ramsey%2C+S.D.%3B+Veenstra%2C+D.%3B+Tunis%2C+S.R.+et+al.&amp;rft.date=2011&amp;rft.volume=30&amp;rft.issue=12&amp;rft.pages=2259%E2%80%9368&amp;rft_id=info:doi\/10.1377%2Fhlthaff.2010.0637&amp;rft_id=info:pmc\/PMC3477796&amp;rft_id=info:pmid\/22147853&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3477796&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HelftCanBig14-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HelftCanBig14_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Helft, M.&#32;(2014).&#32;\"Can big data cure cancer?\".&#32;<i>Fortune<\/i>&#32;<b>170<\/b>&#32;(2): 70\u20134, 76, 78.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25318238\" target=\"_blank\">25318238<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Can+big+data+cure+cancer%3F&amp;rft.jtitle=Fortune&amp;rft.aulast=Helft%2C+M.&amp;rft.au=Helft%2C+M.&amp;rft.date=2014&amp;rft.volume=170&amp;rft.issue=2&amp;rft.pages=70%E2%80%934%2C+76%2C+78&amp;rft_id=info:pmid\/25318238&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WilliamsArtificial18-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WilliamsArtificial18_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Williams, A.M.; Liu, Y.; Regner, K.R. et al.&#32;(2018).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5966805\" target=\"_blank\">\"Artificial intelligence, physiological genomics, and precision medicine\"<\/a>.&#32;<i>Physiological Genomics<\/i>&#32;<b>50<\/b>&#32;(4): 237\u201343.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1152%2Fphysiolgenomics.00119.2017\" target=\"_blank\">10.1152\/physiolgenomics.00119.2017<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5966805\/\" target=\"_blank\">PMC5966805<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" 
class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/29373082\" target=\"_blank\">29373082<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5966805\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC5966805<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Artificial+intelligence%2C+physiological+genomics%2C+and+precision+medicine&amp;rft.jtitle=Physiological+Genomics&amp;rft.aulast=Williams%2C+A.M.%3B+Liu%2C+Y.%3B+Regner%2C+K.R.+et+al.&amp;rft.au=Williams%2C+A.M.%3B+Liu%2C+Y.%3B+Regner%2C+K.R.+et+al.&amp;rft.date=2018&amp;rft.volume=50&amp;rft.issue=4&amp;rft.pages=237%E2%80%9343&amp;rft_id=info:doi\/10.1152%2Fphysiolgenomics.00119.2017&amp;rft_id=info:pmc\/PMC5966805&amp;rft_id=info:pmid\/29373082&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5966805&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SavageBigData14-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SavageBigData14_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Savage, N.&#32;(2014).&#32;\"Big data versus the big C\".&#32;<i>Scientific American<\/i>&#32;<b>311<\/b>&#32;(1): S20\u20131.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24974705\" 
target=\"_blank\">24974705<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Big+data+versus+the+big+C&amp;rft.jtitle=Scientific+American&amp;rft.aulast=Savage%2C+N.&amp;rft.au=Savage%2C+N.&amp;rft.date=2014&amp;rft.volume=311&amp;rft.issue=1&amp;rft.pages=S20%E2%80%931&amp;rft_id=info:pmid\/24974705&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ShahBuilding16-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ShahBuilding16_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Shah, A.; Stewart, A.K.; Kolacevski, A. et al.&#32;(2016).&#32;\"Building a rapid learning health care system for oncology: Why CancerLinQ collects identifiable health information to achieve its vision\".&#32;<i>Journal of Clinical Oncology<\/i>&#32;<b>34<\/b>&#32;(7): 756\u201363.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1200%2FJCO.2015.65.0598\" target=\"_blank\">10.1200\/JCO.2015.65.0598<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26755519\" target=\"_blank\">26755519<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Building+a+rapid+learning+health+care+system+for+oncology%3A+Why+CancerLinQ+collects+identifiable+health+information+to+achieve+its+vision&amp;rft.jtitle=Journal+of+Clinical+Oncology&amp;rft.aulast=Shah%2C+A.%3B+Stewart%2C+A.K.%3B+Kolacevski%2C+A.+et+al.&amp;rft.au=Shah%2C+A.%3B+Stewart%2C+A.K.%3B+Kolacevski%2C+A.+et+al.&amp;rft.date=2016&amp;rft.volume=34&amp;rft.issue=7&amp;rft.pages=756%E2%80%9363&amp;rft_id=info:doi\/10.1200%2FJCO.2015.65.0598&amp;rft_id=info:pmid\/26755519&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TrifilettiBigData15-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-TrifilettiBigData15_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Trifiletti, D.M.; Showalter, T.N.&#32;(2015).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4672039\" target=\"_blank\">\"Big Data and Comparative Effectiveness Research in Radiation Oncology: Synergy and Accelerated Discovery\"<\/a>.&#32;<i>Frontiers in Oncology<\/i>&#32;<b>5<\/b>: 274.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3389%2Ffonc.2015.00274\" target=\"_blank\">10.3389\/fonc.2015.00274<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4672039\/\" target=\"_blank\">PMC4672039<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26697409\" target=\"_blank\">26697409<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4672039\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC4672039<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Big+Data+and+Comparative+Effectiveness+Research+in+Radiation+Oncology%3A+Synergy+and+Accelerated+Discovery&amp;rft.jtitle=Frontiers+in+Oncology&amp;rft.aulast=Trifiletti%2C+D.M.%3B+Showalter%2C+T.N.&amp;rft.au=Trifiletti%2C+D.M.%3B+Showalter%2C+T.N.&amp;rft.date=2015&amp;rft.volume=5&amp;rft.pages=274&amp;rft_id=info:doi\/10.3389%2Ffonc.2015.00274&amp;rft_id=info:pmc\/PMC4672039&amp;rft_id=info:pmid\/26697409&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4672039&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ASCOShaping11-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ASCOShaping11_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.asco.org\/sites\/default\/files\/shapingfuture-lowres.pdf\" target=\"_blank\">\"Shaping the Future 
of Oncology: Envisioning Cancer Care in 2030: Outcomes of the ASCO Board of Directors Strategic Planning and Visioning Process, 2011-2012\"<\/a>.&#32;American Society of Clinical Oncology.&#32;2011<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.asco.org\/sites\/default\/files\/shapingfuture-lowres.pdf\" target=\"_blank\">https:\/\/www.asco.org\/sites\/default\/files\/shapingfuture-lowres.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Shaping+the+Future+of+Oncology%3A+Envisioning+Cancer+Care+in+2030%3A+Outcomes+of+the+ASCO+Board+of+Directors+Strategic+Planning+and+Visioning+Process%2C+2011-2012&amp;rft.atitle=&amp;rft.date=2011&amp;rft.pub=American+Society+of+Clinical+Oncology&amp;rft_id=https%3A%2F%2Fwww.asco.org%2Fsites%2Fdefault%2Ffiles%2Fshapingfuture-lowres.pdf&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SarinBigData14-16\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SarinBigData14_16-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Sarin, R.&#32;(2014).&#32;\"Big Data V4 for integrating patient reported outcomes and quality-of-life indices in clinical practice\".&#32;<i>Journal of Cancer Research and Therapies<\/i>&#32;<b>10<\/b>&#32;(3): 453-5.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4103%2F0973-1482.142741\" target=\"_blank\">10.4103\/0973-1482.142741<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25313720\" target=\"_blank\">25313720<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Big+Data+V4+for+integrating+patient+reported+outcomes+and+quality-of-life+indices+in+clinical+practice&amp;rft.jtitle=Journal+of+Cancer+Research+and+Therapies&amp;rft.aulast=Sarin%2C+R.&amp;rft.au=Sarin%2C+R.&amp;rft.date=2014&amp;rft.volume=10&amp;rft.issue=3&amp;rft.pages=453-5&amp;rft_id=info:doi\/10.4103%2F0973-1482.142741&amp;rft_id=info:pmid\/25313720&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KimPredict17-17\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-KimPredict17_17-0\" rel=\"external_link\">17.0<\/a><\/sup> <sup><a href=\"#cite_ref-KimPredict17_17-1\" rel=\"external_link\">17.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kim, K.H.; Lee, S.; Shim, J.B. 
et al.&#32;(2017).&#32;\"Predictive modelling analysis for development of a radiotherapy decision support system in prostate cancer: A preliminary study\".&#32;<i>Journal of Radiotherapy in Practice<\/i>&#32;<b>16<\/b>&#32;(2): 161\u201370.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1017%2FS1460396916000583\" target=\"_blank\">10.1017\/S1460396916000583<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Predictive+modelling+analysis+for+development+of+a+radiotherapy+decision+support+system+in+prostate+cancer%3A+A+preliminary+study&amp;rft.jtitle=Journal+of+Radiotherapy+in+Practice&amp;rft.aulast=Kim%2C+K.H.%3B+Lee%2C+S.%3B+Shim%2C+J.B.+et+al.&amp;rft.au=Kim%2C+K.H.%3B+Lee%2C+S.%3B+Shim%2C+J.B.+et+al.&amp;rft.date=2017&amp;rft.volume=16&amp;rft.issue=2&amp;rft.pages=161%E2%80%9370&amp;rft_id=info:doi\/10.1017%2FS1460396916000583&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ChenReduced17-18\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ChenReduced17_18-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Chen, A.M.; Felix, C.; Wang, P.C. 
et al.&#32;(2017).&#32;\"Reduced-dose radiotherapy for human papillomavirus-associated squamous-cell carcinoma of the oropharynx: A single-arm, phase 2 study\".&#32;<i>The Lancet, Oncology<\/i>&#32;<b>18<\/b>&#32;(6): 803\u201311.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2FS1470-2045%2817%2930246-2\" target=\"_blank\">10.1016\/S1470-2045(17)30246-2<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/28434660\" target=\"_blank\">28434660<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Reduced-dose+radiotherapy+for+human+papillomavirus-associated+squamous-cell+carcinoma+of+the+oropharynx%3A+A+single-arm%2C+phase+2+study&amp;rft.jtitle=The+Lancet%2C+Oncology&amp;rft.aulast=Chen%2C+A.M.%3B+Felix%2C+C.%3B+Wang%2C+P.C.+et+al.&amp;rft.au=Chen%2C+A.M.%3B+Felix%2C+C.%3B+Wang%2C+P.C.+et+al.&amp;rft.date=2017&amp;rft.volume=18&amp;rft.issue=6&amp;rft.pages=803%E2%80%9311&amp;rft_id=info:doi\/10.1016%2FS1470-2045%2817%2930246-2&amp;rft_id=info:pmid\/28434660&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WestGenetics11-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WestGenetics11_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">West, C.M.; Barnett, G.C.&#32;(2011).&#32;<a rel=\"external_link\" 
class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3238178\" target=\"_blank\">\"Genetics and genomics of radiotherapy toxicity: Towards prediction\"<\/a>.&#32;<i>Genome Medicine<\/i>&#32;<b>3<\/b>&#32;(8): 52.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2Fgm268\" target=\"_blank\">10.1186\/gm268<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3238178\/\" target=\"_blank\">PMC3238178<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/21861849\" target=\"_blank\">21861849<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3238178\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3238178<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Genetics+and+genomics+of+radiotherapy+toxicity%3A+Towards+prediction&amp;rft.jtitle=Genome+Medicine&amp;rft.aulast=West%2C+C.M.%3B+Barnett%2C+G.C.&amp;rft.au=West%2C+C.M.%3B+Barnett%2C+G.C.&amp;rft.date=2011&amp;rft.volume=3&amp;rft.issue=8&amp;rft.pages=52&amp;rft_id=info:doi\/10.1186%2Fgm268&amp;rft_id=info:pmc\/PMC3238178&amp;rft_id=info:pmid\/21861849&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3238178&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Torres-RocaPredict05-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Torres-RocaPredict05_20-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Torres-Roca, J.F.; Eschrich, S.; Zhao, H. 
et al.&#32;(2005).&#32;\"Prediction of radiation sensitivity using a gene expression classifier\".&#32;<i>Cancer Research<\/i>&#32;<b>65<\/b>&#32;(16): 7169-76.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1158%2F0008-5472.CAN-05-0656\" target=\"_blank\">10.1158\/0008-5472.CAN-05-0656<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/16103067\" target=\"_blank\">16103067<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Prediction+of+radiation+sensitivity+using+a+gene+expression+classifier&amp;rft.jtitle=Cancer+Research&amp;rft.aulast=Torres-Roca%2C+J.F.%3B+Eschrich%2C+S.%3B+Zhao%2C+H.+et+al.&amp;rft.au=Torres-Roca%2C+J.F.%3B+Eschrich%2C+S.%3B+Zhao%2C+H.+et+al.&amp;rft.date=2005&amp;rft.volume=65&amp;rft.issue=16&amp;rft.pages=7169-76&amp;rft_id=info:doi\/10.1158%2F0008-5472.CAN-05-0656&amp;rft_id=info:pmid\/16103067&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ChistiakovGenetic08-21\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ChistiakovGenetic08_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Chistiakov, D.A.; Voronova, N.V.; Chistiakov, P.A.&#32;(2008).&#32;\"Genetic variations in DNA repair genes, radiosensitivity to cancer and susceptibility to acute tissue reactions in 
radiotherapy-treated cancer patients\".&#32;<i>Acta Oncologica<\/i>&#32;<b>47<\/b>&#32;(5): 809-24.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1080%2F02841860801885969\" target=\"_blank\">10.1080\/02841860801885969<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/18568480\" target=\"_blank\">18568480<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Genetic+variations+in+DNA+repair+genes%2C+radiosensitivity+to+cancer+and+susceptibility+to+acute+tissue+reactions+in+radiotherapy-treated+cancer+patients&amp;rft.jtitle=Acta+Oncologica&amp;rft.aulast=Chistiakov%2C+D.A.%3B+Voronova%2C+N.V.%3B+Chistiakov%2C+P.A.&amp;rft.au=Chistiakov%2C+D.A.%3B+Voronova%2C+N.V.%3B+Chistiakov%2C+P.A.&amp;rft.date=2008&amp;rft.volume=47&amp;rft.issue=5&amp;rft.pages=809-24&amp;rft_id=info:doi\/10.1080%2F02841860801885969&amp;rft_id=info:pmid\/18568480&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-EschrichAGene09-22\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-EschrichAGene09_22-0\" rel=\"external_link\">22.0<\/a><\/sup> <sup><a href=\"#cite_ref-EschrichAGene09_22-1\" rel=\"external_link\">22.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Eschrich, S.A.; Pramana, J.; Zhang, H. 
et al.&#32;(2009).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3038688\" target=\"_blank\">\"A gene expression model of intrinsic tumor radiosensitivity: Prediction of response and prognosis after chemoradiation\"<\/a>.&#32;<i>International Journal of Radiation Oncology, Biology, and Physics<\/i>&#32;<b>75<\/b>&#32;(2): 489-96.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.ijrobp.2009.06.014\" target=\"_blank\">10.1016\/j.ijrobp.2009.06.014<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3038688\/\" target=\"_blank\">PMC3038688<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/19735873\" target=\"_blank\">19735873<\/a><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3038688\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&amp;artid=PMC3038688<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+gene+expression+model+of+intrinsic+tumor+radiosensitivity%3A+Prediction+of+response+and+prognosis+after+chemoradiation&amp;rft.jtitle=International+Journal+of+Radiation+Oncology%2C+Biology%2C+and+Physics&amp;rft.aulast=Eschrich%2C+S.A.%3B+Pramana%2C+J.%3B+Zhang%2C+H.+et+al.&amp;rft.au=Eschrich%2C+S.A.%3B+Pramana%2C+J.%3B+Zhang%2C+H.+et+al.&amp;rft.date=2009&amp;rft.volume=75&amp;rft.issue=2&amp;rft.pages=489-96&amp;rft_id=info:doi\/10.1016%2Fj.ijrobp.2009.06.014&amp;rft_id=info:pmc\/PMC3038688&amp;rft_id=info:pmid\/19735873&amp;rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3038688&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ZhaoDevelop16-23\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ZhaoDevelop16_23-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Zhao, S.G.; Chang, S.L.; Spratt, D.E. 
et al.&#32;(2016).&#32;\"Development and validation of a 24-gene predictor of response to postoperative radiotherapy in prostate cancer: A matched, retrospective analysis\".&#32;<i>The Lancet, Oncology<\/i>&#32;<b>17<\/b>&#32;(11): 1612\u201320.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2FS1470-2045%2816%2930491-0\" target=\"_blank\">10.1016\/S1470-2045(16)30491-0<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27743920\" target=\"_blank\">27743920<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Development+and+validation+of+a+24-gene+predictor+of+response+to+postoperative+radiotherapy+in+prostate+cancer%3A+A+matched%2C+retrospective+analysis&amp;rft.jtitle=The+Lancet%2C+Oncology&amp;rft.aulast=Zhao%2C+S.G.%3B+Chang%2C+S.L.%3B+Spratt%2C+D.E.+et+al.&amp;rft.au=Zhao%2C+S.G.%3B+Chang%2C+S.L.%3B+Spratt%2C+D.E.+et+al.&amp;rft.date=2016&amp;rft.volume=17&amp;rft.issue=11&amp;rft.pages=1612%E2%80%9320&amp;rft_id=info:doi\/10.1016%2FS1470-2045%2816%2930491-0&amp;rft_id=info:pmid\/27743920&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ScottAGenome17-24\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ScottAGenome17_24-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Scott, J.G.; Berglund, A.; 
Schell, M.J. et al.&#32;(2017).&#32;\"A genome-based model for adjusting radiotherapy dose (GARD): A retrospective, cohort-based study\".&#32;<i>The Lancet, Oncology<\/i>&#32;<b>18<\/b>&#32;(2): 202-211.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2FS1470-2045%2816%2930648-9\" target=\"_blank\">10.1016\/S1470-2045(16)30648-9<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27993569\" target=\"_blank\">27993569<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+genome-based+model+for+adjusting+radiotherapy+dose+%28GARD%29%3A+A+retrospective%2C+cohort-based+study&amp;rft.jtitle=The+Lancet%2C+Oncology&amp;rft.aulast=Scott%2C+J.G.%3B+Berglund%2C+A.%3B+Schell%2C+M.J.+et+al.&amp;rft.au=Scott%2C+J.G.%3B+Berglund%2C+A.%3B+Schell%2C+M.J.+et+al.&amp;rft.date=2017&amp;rft.volume=18&amp;rft.issue=2&amp;rft.pages=202-211&amp;rft_id=info:doi\/10.1016%2FS1470-2045%2816%2930648-9&amp;rft_id=info:pmid\/27993569&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KimAText17-25\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KimAText17_25-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kim, K.H.; Lee, S.; Shim, J.B. 
et al.&#32;(2017).&#32;\"A text-based data mining and toxicity prediction modeling system for a clinical decision support in radiation oncology: A preliminary study\".&#32;<i>Journal of the Korean Physical Society<\/i>&#32;<b>71<\/b>&#32;(4): 231\u20137.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3938%2Fjkps.71.231\" target=\"_blank\">10.3938\/jkps.71.231<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A+text-based+data+mining+and+toxicity+prediction+modeling+system+for+a+clinical+decision+support+in+radiation+oncology%3A+A+preliminary+study&amp;rft.jtitle=Journal+of+the+Korean+Physical+Society&amp;rft.aulast=Kim%2C+K.H.%3B+Lee%2C+S.%3B+Shim%2C+J.B.+et+al.&amp;rft.au=Kim%2C+K.H.%3B+Lee%2C+S.%3B+Shim%2C+J.B.+et+al.&amp;rft.date=2017&amp;rft.volume=71&amp;rft.issue=4&amp;rft.pages=231%E2%80%937&amp;rft_id=info:doi\/10.3938%2Fjkps.71.231&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ArimuraApplications16-26\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ArimuraApplications16_26-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Arimura, H.; Nakamoto, T.&#32;(2016).&#32;\"Applications of Machine Learning for Radiation Therapy\".&#32;<i>Igaku Butsuri<\/i>&#32;<b>36<\/b>&#32;(1): 35\u20138.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/dx.doi.org\/10.11323%2Fjjmp.36.1_35\" target=\"_blank\">10.11323\/jjmp.36.1_35<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/28428495\" target=\"_blank\">28428495<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Applications+of+Machine+Learning+for+Radiation+Therapy&amp;rft.jtitle=Igaku+Butsuri&amp;rft.aulast=Arimura%2C+H.%3B+Nakamoto%2C+T.&amp;rft.au=Arimura%2C+H.%3B+Nakamoto%2C+T.&amp;rft.date=2016&amp;rft.volume=36&amp;rft.issue=1&amp;rft.pages=35%E2%80%938&amp;rft_id=info:doi\/10.11323%2Fjjmp.36.1_35&amp;rft_id=info:pmid\/28428495&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NicolaeEval17-27\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NicolaeEval17_27-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Nicolae, A.; Morton, G.; Chung, H. 
et al.&#32;(2017).&#32;\"Evaluation of a Machine-Learning Algorithm for Treatment Planning in Prostate Low-Dose-Rate Brachytherapy\".&#32;<i>International Journal of Radiation Oncology, Biology, and Physics<\/i>&#32;<b>97<\/b>&#32;(4): 822-829.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.ijrobp.2016.11.036\" target=\"_blank\">10.1016\/j.ijrobp.2016.11.036<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/28244419\" target=\"_blank\">28244419<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Evaluation+of+a+Machine-Learning+Algorithm+for+Treatment+Planning+in+Prostate+Low-Dose-Rate+Brachytherapy&amp;rft.jtitle=International+Journal+of+Radiation+Oncology%2C+Biology%2C+and+Physics&amp;rft.aulast=Nicolae%2C+A.%3B+Morton%2C+G.%3B+Chung%2C+H.+et+al.&amp;rft.au=Nicolae%2C+A.%3B+Morton%2C+G.%3B+Chung%2C+H.+et+al.&amp;rft.date=2017&amp;rft.volume=97&amp;rft.issue=4&amp;rft.pages=822-829&amp;rft_id=info:doi\/10.1016%2Fj.ijrobp.2016.11.036&amp;rft_id=info:pmid\/28244419&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation. 
In some cases important information was missing from the references, and that information was added.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology\">https:\/\/www.limswiki.org\/index.php\/Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","2c1bea416fe89e4530ea8d302ad49dbc_images":[],"2c1bea416fe89e4530ea8d302ad49dbc_timestamp":1544815905,"795feead44bb9c43869be23a90bf9d75_type":"article","795feead44bb9c43869be23a90bf9d75_title":"The development of data science: Implications for education, employment, research, and the data revolution for sustainable development (Murtagh and Devlin 2018)","795feead44bb9c43869be23a90bf9d75_url":"https:\/\/www.limswiki.org\/index.php\/Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development","795feead44bb9c43869be23a90bf9d75_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:The development of data science: Implications for education, employment, research, and the data revolution for sustainable development\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nThe development of data science: Implications for education, employment, research, and the data revolution for sustainable development\nJournal\n \nBig Data and Cognitive Computing\nAuthor(s)\n \nMurtagh, Fionn; Devlin, Keith\nAuthor affiliation(s)\n \nUniversity of Huddersfield, Stanford University\nPrimary contact\n \nEmail: fmurtagh at acm dot org\nYear published\n \n2018\nVolume and issue\n \n2(2)\nPage(s)\n \n14\nDOI\n \n10.3390\/bdcc2020014\nISSN\n \n2504-2289\nDistribution license\n \nCreative Commons Attribution 4.0 International\nWebsite\n \nhttp:\/\/www.mdpi.com\/2504-2289\/2\/2\/14\/htm\nDownload\n \nhttp:\/\/www.mdpi.com\/2504-2289\/2\/2\/14\/pdf (PDF)\n\nContents\n\n1 Abstract \n2 1. Introduction: Data science as the convergence and bridging of disciplines \n3 2. 
Historical development of data science and some contemporary examples of cross-disciplinarity \n\n3.1 2.1 Historical prominence of data science in recent times \n3.2 2.2 Practical association of disciplines and sub-disciplines \n\n\n4 3. Open data, reproducibility, and the data curation challenge \n5 4. Integration of data and analytics: Context of applications \n6 5. Short review of contemporary data science in education and in employment \n\n6.1 5.1. Teaching and learning for data science \n6.2 5.2. Employment requirements in data science \n\n\n7 6. Data science methodology to address: Selection bias, scale and aggregation effects, and qualitative evaluation of decision-making impact \n8 7. Benefits of high profiling of data science \n9 8. Important new research challenges from data \n10 9. Information space theory for big data analytics in internet of things and smart environments \n\n10.1 9.1 Context, situation theory, and completion diagrams \n\n\n11 10. Conclusions \n12 Acknowledgements \n\n12.1 Author contributions \n12.2 Conflicts of interest \n\n\n13 References \n14 Notes \n\n\n\nAbstract \nIn data science, we are concerned with the integration of relevant sciences in observed and empirical contexts. This results in the unification of analytical methodologies, and of observed and empirical data contexts. Given the dynamic nature of convergence, the origins and many evolutions of the data science theme are described. The following are covered in this article: the rapidly growing post-graduate university course provisioning for data science; a preliminary study of employability requirements; and how past eminent work in the social sciences and other areas, certainly mathematics, can be of immediate and direct relevance and benefit for innovative methodology, and for facing and addressing the ethical aspect of big data analytics, relating to data aggregation and scale effects. 
Associated also with data science is how direct and indirect outcomes and consequences of data science include decision support and policy making, and both qualitative as well as quantitative outcomes. For such reasons, the importance is noted of how data science builds collaboratively on other domains, potentially with innovative methodologies and practice. Further sections point towards some of the major current research issues.\nKeywords: big data training and learning, company and business requirements, ethics, impact, decision support, data engineering, open data, smart homes, smart cities, IoT\n\n1. Introduction: Data science as the convergence and bridging of disciplines \nThe context of our problem solving and analytics will always be quite fundamental, very specific, and particularly oriented. (Section 4 of this article draws some interesting and relevant implications of this.) This article is oriented towards commonality and mutual influence of methodologies, and of analytical processes and procedures. A nice example of the parallel nature of such things is how \"big data analytics\" is often considered a synonym of \"data science.\" In Section 2.2, it is mentioned how public transport may well use smartphone and mobile phone wireless connection data to observe locations of individuals. This close association or, perhaps even, identity of big data analytics and data science will have growing importance with the internet of things (IoT), and smart cities and smart homes, and so on (as noted in Section 8). The McKinsey Global Institute provided an outstanding perspective on this idea in their paper The age of analytics: Competing in a data-driven world.[1]\nIn Section 8 and Section 9 of this article, very important developments are at issue, encompassing newly oriented and pursued methodologies, and the integration of research domains. Section 7 notes how important all of the content here is to sustainable development. 
The phrase \"data revolution\" is based here on ongoing work by the United Nations, and by so many of us in this domain, and from national authorities in Africa and the Middle East discussing issues here at the most recent (2017) World Statistics Congress.\nThis converging and bridging of disciplines is increasingly important. For example, Mahabal et al.[2] discuss the parallels between astronomy and Earth science data, methodology transfer, and metadata and ontologies characterized as being crucial. They claim the convergence or bridging of disciplines must address \u201cnon-homogeneous observables, and varied spatial, temporal coverage at different resolutions.\u201d[2] This quotation is very familiar to us in regard to how NoSQL databases are now widely used, as well as traditional relational databases. Another example is how text mining, social media, and many other domains have become so very important in many contexts. Then, given computational support, \u201cit is the complexity more than the data volume that proves to be a bigger challenge.\u201d[2] Further benefits of this data science convergence are termed here \"tractability\" and \"reproducibility.\" Mahabal et al.[2] also discuss the complexity relating to resolution and distributions. In a separate work, Murtagh[3] characterized this in terms of data encoding. Plenty of work now emphasizes the importance of p-adic data encoding (binary or ternary when p = 2 or 3), compared with real-valued encoding (m-adic, especially when m = 10).\nThe convergence and bridging of disciplines is fully emphasized by Mahabal et al. as such[2]: \n\nMethodology transfer can almost never be unidirectional. Diverse fields grow by learning tricks employed by other disciplines. 
The important thing is to abstract data\u2014described by meaningful metadata\u2014and the metadata in turn connected by a good ontology.\nFurther description is at issue in regard to collaboration in data science[2]: \n\nWe have described here a few techniques from astroinformatics that are finding use in geoinformatics. There would be many from earth science that space science would do well to emulate. Even other disciplines like bioinformatics provide ample opportunities for methodology transfer and collaboration. With growing data volumes, and more importantly the increasing complexity, data science is our only refuge. Collaboration in data science will be beneficial to all sciences.\n2. Historical development of data science and some contemporary examples of cross-disciplinarity \nA short historical perspective that follows is with reference to such disciplines as computer and information sciences, mathematics and statistics, physics, and, implicitly, social sciences. In concluding this description, a key point will be how data science encompasses and embraces all of the following: cross-disciplinarity, interdisciplinarity, and multidisciplinarity.\n\n2.1 Historical prominence of data science in recent times \nThe origins of data science are largely due to Chikio Hayashi and others. Hayashi[4] says \u201cI will present 'data science' as a new concept,\u201d followed by a relevant introduction to the science of data: \u201cData Science consists of three phases: design for data, collection of data and analysis on data.\u201d[4] In Ohsumi[5], the abstract has this: \u201cIn 1992, the author argued the urgency of the need to grasp the concept 'data science'. 
Despite the emergence of concepts such as data mining, this issue has not been addressed.\u201d\nEscoufier et al.[6] note how data science arises from the convergence of computer science and statistics, which \"gives birth to a new science at its core.\" They conclude that \"[t]o take data as a starting point provides a complementary vision of theory and practice, and avoids creating an unfortunate gap between two steps, both of which are essential in any scientific process.\"[6]\nCao provides a comprehensive overview of data science[7], noting how the \u201cfirst conference to adopt 'data science' as a topic\u201d was the International Federation of Classification Societies (IFCS) 1996 conference, in Kobe, Japan. This was fully consistent with our work as participants, then and now (IFCS 2017, in Tokyo, Japan, also had \"data science\" in its title). Ueno[8] makes a similar point about IFCS 1996 as the first conference with \"data science\" in its title, and he also claims that the journal Behaviormetrika is \"the oldest journal addressing the topic of data science,\" when it started in 1974. He describes data science as \"an interdisciplinary field that includes the use of statistical methods to extract meaningful knowledge from data in various forms: either structured or unstructured.\"[8]\nCao[7] provides additional historical perspectives, with the section heading \"The Data Science journey,\" relating largely to work in the 1960s and 1970s. This includes \"information discovery\" as a continuing key objective in data science. 
Englmeier and Murtagh[9] also make note of this objective, emphasizing the \u201csemantic dimension of data science,\u201d through the information discovery lifecycle, and the \u201cdiscovery lifecycle in text mining.\u201d They also emphasize cooperation and cross-disciplinarity, stating: we see the data scientist\u2019s responsibility...\n\n in the design of an overarching semantic layer addressing data and analysis tools,\n in identifying suitable data sources and data patterns that correspond to the appearance of structured and unstructured data, and\n in the management of the information discovery lifecycle and discovery teams.\nAn ever more important issue arises from the data sources that are employed. As a summary expression, data science is, firstly, the integration of data sources and analytical and related data processing methodologies, and, secondly and quite fundamentally, it arises from the convergence of disciplines. Convergence of disciplines can be quite beneficial in practice, particularly in regard to addressing and solving problems, and also in regard to the cooperation yielded by cross-disciplinarity. See Section 5, below, for some current discussion on how the problems and challenges to be addressed can and should, quite naturally, arise out of all aspects of data science.\nThe current era of data science can be considered as a culmination of previous epochs that gave rise to major digital technology advances, with implications in all social domains. Largely, the first epoch (in the 1980s) brought about laptop and desktop computers, and the second epoch (in the 1990s) gave rise to the internet and the World Wide Web.\n\n2.2 Practical association of disciplines and sub-disciplines \nCao[7] also makes mention of data science being centered on the following disciplines: statistics, informatics, sociology, and management science. 
Clearly there is emphasis on \u201csynergy of several research disciplines\u201d and how \u201cinterdisciplinary initiatives are necessary to bridge the gaps between the respective disciplines.\u201d[7] This is exciting and not least because of how there is convergence of disciplines or subdisciplines. We may consider, for example, how the digital humanities can incorporate relevant areas of a few disciplines, how computational psychoanalysis can come to the fore.[10] With a major focus on psychometrics, Coombs[11] has chapters that proceed from \u201cBasic Concepts\u201d to \u201cOn Methods of Collecting Data,\u201d and \u201cPreferential Choice Data.\u201d\nNow, data is so very central to all of our sciences, and to all aspects of our engineering and technology. Murtagh[3] defines just what data is, which includes the concept of data coding, or perhaps also, this should be termed data encoding. After all, data is measurement. This underscores the importance of the mathematical underpinnings in data science. Implications that follow include the relevance and importance for new, innovative directions to be followed, and from effective problem solving. The mathematical view of what measurement means is all important, as well as in the discipline of physics. 
Murtagh[3] cites eminent physicist Paul Dirac as to how mathematics underpins all of physics, and how the work of eminent psychoanalyst Ignacio Matte Blanco has mathematics being integral to psychoanalysis.\nFrom a major study of big data and surveying by the American Association for Public Opinion Research comes the following[12]: \u201cThe classic statistical paradigm was one in which researchers formulated a hypothesis, identified a population frame, designed a survey and a sampling technique and then analyzed the results \u2026 The new paradigm means it is now possible to digitally capture, semantically reconcile, aggregate, and correlate data.\u201d\nAbbany[13] notes that wireless connection data is forming a basis for public transport management. Such big data sources can be associated with, or even integrated with, personal and social behavioral patterns and activities. \u201cBetter living through data?\u201d asks Abbany, followed by a very critical statement: \u201cThe other thing I need to declare is that I\u2019m no fan of our contemporary belief that life can only get better the more data we have at our disposal.\u201d[13] A response to this would be that data science, as the science of data, is everything relating to the path and trajectory connecting data, information, knowledge, and wisdom.\nDarabi[14] reports that \u201cThe UK\u2019s next census will be its last,\u201d with administrative, governmental authorities\u2019 data replacing the national census. This is acknowledged: \u201cCollecting the data itself is only half the work. A great deal of effort must go into combining it with other sources, in order to answer real questions.\u201d That can be understood as undertaking scientific investigation of such data, and other potentially relevant data. The cross-disciplinarity inherent in that also can, and perhaps must, lead to new interdisciplinary linkages. 
Arising out of the ending of the national census is the recognition that how the \"government counts its people is changing, and it could transform policy.\u201d\nOne issue here has been how mathematics underpins so much, across disciplines, and also in the commercial and in most social domains. Many universities in the recent past shut down their mathematics departments and no longer provide teaching in mathematics. However, this is being reversed, with university courses again being provided in mathematics.\n\n3. Open data, reproducibility, and the data curation challenge \nWhile generally recognized as so important for innovation in both application outcomes and in regard to analytics and methodologies, open data plays a key role for data scientists. (Information and news about open data is well provided by the organization Open Data Institute: https:\/\/theodi.org).\nOne major aspect of how big data analytics are quite central to data science is the increasing availability of open data. Cao[7] associates this with methodology, through \u201cthe open model rather than a closed one.\u201d This concept was central to a May 2017 London presentation by Dr. Robert Hanisch, Director, Office of Data and Informatics, National Institute of Standards and Technology. Dr. Hanisch worked for 30 years on the Hubble Space Telescope (HST) project. Due to open access to observed data, from our cosmos, Dr. Hanisch noted that three times the number of people directly engaged in HST work were working on HST data. As such, there were three times the benefits drawn from HST data.\nDr. Hanisch noted how important the national metrology institutes were to their efforts. Arising from this was, and is, the importance of reproducibility and interoperability of all of analytics comprising data science. Underpinning these very important themes in data science work is data curation. Data curation is still a major challenge to be addressed. Noted in Dr. 
Hanisch\u2019s presentation is the contemporary \u201ccrisis\u201d of reproducibility. At issue is to support data management from acquisition to publication, whether it occurs in business, medical, governmental or other sectors. The computing expert will recognize this crucial theme of data curation as associated with metadata and evolving ontologies.\nFor the latter, i.e., evolving ontologies, Murtagh et al.[15] discuss in a broad and general context the very important and central role of evolving ontology, research publishing, and research funding. While challenges remain to be pursued and addressed, it is important to note that astronomy and astrophysics offer interesting paradigms for open data, and, in many ways, for data curation. Certainly, further research will be carried out on data curation, as well as evolving and interacting ontologies, all of which are core issues for metrology, hence the very basis of all manufacturing technology, and, as described in the latter citation, for research publications and research funding.\nCao[7] also discusses \u201cthe open model and open data.\u201d From that discussion results the concept of multidisciplinarity\u2014expressed here as the convergence of disciplines\u2014which can be aided and facilitated by the openness of analytics, data management, and all data science methodologies. Keeping methodologies open allows domain experts to link up with and perhaps even, if feasible, to integrate with all that is at issue in other relevant domains. As such, a plea for openness of data science as a discipline continues to grow, particularly when viewed as a convergence of disciplines.\n\n4. Integration of data and analytics: Context of applications \nThe integration of data and analytics in data science has resulted in a strong need to acknowledge and address challenges and other issues with data and the underpinning or contextual reality of the data. 
Informally expressed, our data represents reality or the context from which the measurements arose, i.e., the data numeric values or qualitative representations. This requires data scientists to focus on quality and standards of work.\nHand[16] contributes numerous important points relevant to the discussion here, describing the problems of data quality\u2014in the big data context\u2014relating to administrative data. He notes that data curation is relevant for reproducibility of analytics. The implications for analytics[16]: \n\n... the fact that data are often not of the highest quality has led to the development of relevant statistical methods and tools, such as detection methods based on integrity checks and on statistical properties ... However, this emphasis has often not been matched within the realm of machine learning, which places more emphasis on the final modelling stage of data analysis. This can be unfortunate: feed data into an algorithm and a number will emerge, whether or not it makes sense. However, even within the statistical community, most teaching implicitly assumes perfect data ... Challenge 1. Statistics teaching should cover data quality issues.\nOur analytics should not be a \u201cblack box,\u201d a term that was informally used in regard to neural networks in earlier times. Rather, transparency should always be a key property of analytical methodologies.\nThe view offered by Anderson[17], and discussed by Murtagh[18], quoting Peter Norvig, Google\u2019s research director[18]:\n\nPetabytes allow us to say: \"Correlation is enough.\" We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot. \nHowever, this interesting view, inspired by contemporary search engine technology, is provocative. 
The author maintains that \"[c]orrelation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all.\"\nIt cannot be accepted that correlation supersedes causation, i.e., that analytics can be automated fully, and thereby obfuscate, or make redundant, data science as well as health and well-being analytics. Englmeier and Murtagh[19] reflect similarly on the above, stating their case for comprehensive information governance, encompassing fully the contextualization of all the analytics that are being carried out. Murtagh and Farid[20] discuss quite a good deal of the contextualization of analytics of health and well-being data. In the discussion accompanying the seminal work by Allin and Hand[21] in statistical perspectives on health and well-being, the authors responded to our comments: \u201cWe agree with Murtagh that \u2018big data\u2019 may offer insights, provided that there are appropriate analytics.\u201d\nIt is quite relevant to note here that data science has inherent and integral involvement in the sourcing and in the origins of data, i.e., selection and measurement. Wessel[22] makes this point clear while discussing software applications. This implies full integration of the analytics with what data is selected and sourced, and that may well imply what and how measurement is carried out.\nThe takeaway here may be the priority to be accorded to induction-based, i.e., inductive, reasoning (cf., Murtagh[3]). This could be a minor argument for the importance of approaches that follow from data mining, unsupervised classification, latent semantic analysis, and various other themes. Clearly, however, all of one\u2019s studying and teaching, as well as one\u2019s work for companies, government agencies, and health and other authorities should and really must be properly focused on the aims and objectives. 
The latter, of course, may need, partially in any case, to be determined by the expert data scientist.\n\n5. Short review of contemporary data science in education and in employment \nIt is quite clear to all involved with many businesses and educational institutions that data science is becoming one of the most important employment prospects. A comparison of employment salaries in the U.S. has data science as having the highest median salary in 2017.[23] It follows that data science and big data will certainly be studied in university courses, and these will of course be related to the prospects and potential for the students.\nIn this section, two themes relate to the contemporary context: higher education in data science and company employment advertisements. We turn to accessible and available data to discuss these themes. Of course, an expert data scientist is very likely to be involved in many discussions and debates with current and potential students, company executives, and many others. It can even be seen that most disciplines have to be integrated into data science, requiring pedagogical innovation in education.[24]\n\n5.1. Teaching and learning for data science \nBriefly considered here are current higher education post-graduate programs (usually termed MSc courses) in data science. This is also possibly relevant for undergraduate programs, and certainly relevant for undergraduate projects and company placements of students.\nIn universities in all countries worldwide, in recent years, there has been a great increase in graduate level courses in data science, and increasingly also in undergraduate level courses. 
Press[25] maintains a listing of graduate courses, in some cases, but not in all, with the title \"data science.\" This listing, with links to the host institute, contains 102 MSc courses, 19 online courses, 11 free online courses, and eight for a fee, for a total of 140 graduate level courses.\nThe theme of having data and analytics well integrated is again reiterated. Consider the most essential requirements of a data scientist, as described by Englmeier and Murtagh[19], and then note the close linkage between data science and big data. Emphasized is the great need to avoid false positives coming from the data science analytics that is carried out. This arises from treating the data without fully linking and even integrating the analytics with the context, the relevance, and all that is to do with application and problem conceptualization. They note the well-known errors arising out of the Google Flu Trends, arising from Google searching, and service usage patterns obtained from taxi company, Uber. These were outcomes that produced false positives. There must be fully comprehensive information governance, encompassing all levels of information discovery, through conceptualization that can benefit significantly if collectively undertaken.\n\n5.2. Employment requirements in data science \nMany employment opportunities are now on offer for data scientists. For example, the comprehensive review of data science by Cao[7] has a section entitled \u201cData economy: Data industrialization and services\u201d that describes the expanding business opportunities available to budding data scientists. More examples of opportunities can be found at the increasingly popular web service DataScientistJobs: https:\/\/datascientistjobs.co.uk.\nThe stated requirements for data science jobs vary. 
While one must give the fullest perspective to the companies that one works with, and the university data science courses that one teaches, what follows is both a consideration and a selection of job requirement data, and preliminary results. This preliminary study of requirements for data science roles, some of them in senior management, represents what will become expanded research over time, for the benefit of our data science students, so that they may be better prepared for work at, and association with, companies both nationally and globally.\nOnline discussions of data science and big data have become commonplace, and they often include discussion of surveys that have been conducted. Examples include NewVantage Partners' Big Data Executive Survey[26] of senior corporate executives, which found the dominant sector was financial services. This survey concerned internal investment and organizational matters, as well as business practices and plans. Another survey by Hayes[27] asked questions of more than 620 data professionals in regard to skills required and at issue in data science. The author provides an interesting summary and presentation of results obtained using factor analysis.\nDescriptions of new data scientist job listings from 2015 to 2017 were considered, all from the distribution list StatsJobs (sometimes indirectly, through links) in England. In most cases, specific programming languages or software environments at issue were indicated. In a few cases, the job advertisements did not explicitly list these details. Retained for use here were 72 such job descriptions. 
The frequent (four or more advertisements) software languages and software environments were as follows: R (50), Python (44), SQL (30), SAS (28), Hadoop (25), Matlab (17), SPSS (17), Java (16), Hive (14), Excel (9), MapReduce (9), NoSQL (8), Spark (7), C++ (6), Pig (6), Tableau (6), HBase (5), C# (4), Mahout (4), QlikView (4), and Scala (4).\nTo have sufficient comparability of software languages or environments, the 21 listed above, each required by at least four employers, were selected. Since some employers failed to list any software language or environment requirements, and indeed about three of the set of 72 had no explanation at all of what was required, the set of employers was reduced to 60. Thus, a cross-tabulation of 60 employers seeking to employ a data scientist was used, with a few requirements or desires for expertise in particular software languages and software environments. The manner of expression was most often \"one or another or another again.\"\nCorrespondence analysis takes the employer set, and the software set, in the dual multidimensional spaces, both endowed with the chi-squared metric, and maps both clouds into a factor space endowed with the Euclidean metric. Hierarchical clustering was carried out from the full dimensionality (therefore with no loss of, or decrease in, information content) factor space. The goal was to see what associations of software are most likely to be the case from these data scientist job advertisements.\nFigure 1 displays the principal factor plane, and Figure 2 the hierarchical clustering, of the software languages or environments. It is our intention to take such a mapping much further, with supplementary elements, also termed contextual elements, to locate them in the factor space, and these would include country or location of the job, industrial sector or government agency, or global corporate firm for the job. 
The objective would be to determine sector or regional preferences in the skills and abilities of the data scientists employed.\n\n Figure 1: From 60 data scientist job advertisements, non-empty from the initial set of 72 employers, with use of 21 software languages or environments. The latter were required by at least four employers. Displayed is the principal factor plane.\n\n Figure 2: Hierarchical clustering, and derived three-class partition, of the 21 software languages or environments at issue here, based on the full dimensionality, Euclidean-metric endowed, factor space. See Section 3. The agglomerative criterion is the Ward minimum variance method.\n\n6. Data science methodology to address: Selection bias, scale and aggregation effects, and qualitative evaluation of decision-making impact \nAfter reviewing some of the challenges of big data analytics, involving selection bias and replacement of individual attributes with aggregated attributes (hence, the collective attributes of groups to which the individual can belong), it's time to point to innovative new methodological perspectives that can both address such issues and challenges and benefit from the context, for example that of using big data sources. Murtagh[28] uses as an example a case study involving work for a major company, with its own example of how aggregated data can be used, if required, for individual-related analysis.\nEthical as well as methodological issues arise in scale effects, representation and expression, and particular context effect. 
Here, both the ethical implications and the potential for qualitatively and quantitatively evaluating the impact of decision-making and policy-making are summarized.\nThe frequent lack of coordination, alignment, and integration of methodology, including modelling, with data sourcing is pointed to by Hand[29], who notes that \u201cignorance of selection mechanisms has led to mistakes,\u201d and that selection or distortion processes can apply to not only \u201chuman interactions\u2014where it has been suggested that the notion that \u2018data=all\u2019 can replace the need for careful theorising and statistical modelling\u2014but also in the hard sciences and medicine.\u201d\nKeiding and Louis[30] point out how one case study discussed \u201cshows the value of using \u2018big data\u2019 to conduct research on surveys (as distinct from survey research).\u201d The limitations, though, are clear: \u201cAlthough randomization in some form is very beneficial, it is by no means a panacea. Trial participants are commonly very different from the external ... pool, in part because of self-selection ...\u201d\nThe authors address these contemporary issues by noting \u201c[w]hen informing policy, inference to identified reference populations is key.\u201d[30] This is part of the bridge which is needed, between data analytics technology and deployment of outcomes.\nThey also warn of additional caveats. \u201cIn all situations, modelling is needed to accommodate non-response, dropouts and other forms of missing data.\u201d[30] Noting that \u201c[r]epresentativity should be avoided,\u201d here is a fundamental way to address what we need to address: \u201cAssessment of external validity, i.e. 
generalization to the population from which the study subjects originated or to other populations, will in principle proceed via formulation of abstract laws of nature similar to physical laws.\u201d[30] When considering Keiding and Louis' words, it is worth noting how, in relation to eminent social scientist Pierre Bourdieu\u2019s work, homology between fields of study offers clear perspectives on how beneficial innovative practice can be pursued.\nThis incorporates our need to \u201crehabilitate the individual\u201d in our analytics, and not simply replace the individual by the mean of some group. Many case studies of the latter are provided by eminent mathematical data scientist Cathy O\u2019Neil.[31] Le Roux and Lebaron[32] have a similar sentiment: \u201cRehabilitation of individuals. The context model is always formulated at the individual level, being opposed therefore to modelling at an aggregate level for which the individuals are only an \u2018error term\u2019 of the model.\u201d\nCalibrating surveys and other data sources, through use of big data, has been at issue in addressing challenges and obstacles described in Keiding and Louis.[30] In regard to decision-making and policy-making, the analysis of discourse in a data-driven way can provide relevant or necessary contextualization. Without having such an approach, a limited capability on the part of those in authority emerges: \u201ctop-down communication campaigns both predominate and are advised by those involved in social marketing ... However, this rarely manifests itself through measurable behaviour change ...\u201d[33]\nInstead, mediated by the latent semantic mapping of the discourse, we develop semantic distance measures between deliberative actions and the aggregate social effect.[33] We let the data speak in regard to influence, impact, and reach. Impact is defined in terms of semantic distance between the initiating action and the net aggregate outcome. This can be statistically tested. 
It can also be visualized and further evaluated.\nFor research and for all engagement in data science, it is motivating both to address innovative methodology and to have significant achievements in it.\n\n7. Benefits of high profiling of data science \nMany blog postings declare \u201cbig data is dead.\u201d (A Google query of the phrase, dated 2017-12-29, lists 153,000 results.) At issue is just this: complete priority is to be given to the problems to be solved and the challenges to be addressed. Cao's extensive and outstanding detailing of many aspects of data science[7] acknowledges that there is much that is still currently \u201ctremendous hype and buzz,\u201d and \u201cengendering enormous hype and even bewilderment.\u201d There is also this perspective, which would apply if the sole aim were for data science to automate data analytics in all domains of application: counterposed to advanced analytics, \u201cdummy analytics is becoming the default setting of management and operational systems.\u201d[7]\nFully in line with the context of those perspectives, a major theme of this article is that the convergence of disciplines in the data science framework builds on cooperative and collaborative expertise, and thus does not seek to replace or supplant such expertise. A major conclusion is not to replace current disciplines (mathematics, statistics, computing, engineering, physics and chemistry, arts and humanities, social and psychological sciences, and so on) but\u2014where relevant and where appropriate, and also where motivated and where justified\u2014to re-orientate and to bridge primary as well as foundational levels of disciplines.\nIn somewhat humorous fashion, in the sense of revolution versus evolution, let the following be noted. 
At the 61st World Statistics Congress, in July 2017, in Marrakech, Morocco, there was a session organized jointly by the High Commission for Planning (HCP) of Morocco and the Ministry of Development Planning and Statistics (MDPS) of Qatar. This session was entitled \u201cThe Data Revolution for the Sustainable Development Goals.\u201d One comment raised in the question and answer session was a request for evolution to be at issue rather than revolution. At the same time, it's interesting to note how there is an important advisory group in the United Nations, called the Data Revolution Group (see http:\/\/www.undatarevolution.org) which seeks \u201c[m]obilising the data revolution for sustainable development.\u201d For data science, it is clear that there is great inspiration here. Some other organizational initiatives will now be mentioned. This is to complement a great deal that is being done already by major organizations in statistics, classification and data research, engineering, and explicitly in data science.\nIn European research funding, i.e., Horizon 2020, an important supported project is that of the European Data Science Academy or EDSA (http:\/\/edsa-project.eu), which dates back to 2005. There could well be an important role for such an organization in the future, in regard to sponsoring fellowship levels of organizational memberships, and it would be interesting to promote chartered membership. In the European Commission context, dating from July 2014, we find the \"best practice guidelines for public authorities and open data\u201d under the scope of governments embracing the \"potential of big data.\u201d[34]\nAt the U.K. national level, an important initiative, directly or indirectly related to much that was under discussion in this article (in Section 3, in particular) is open data. The Open Data Institute (see https:\/\/theodi.org) in the U.K. was founded in 2012 by Sir Tim Berners-Lee and Sir Nigel Shadbolt. 
In welcoming membership applications, there is this: \u201cMembership: Join the data revolution.\u201d There is this prominent statement too: \u201cData is changing our world.\u201d\nIn a practical sense, focused on data to begin with and entirely relevant for data curation now and in the future, we find in comparison the Research Data Alliance or RDA (see https:\/\/www.rd-alliance.org). RDA is supported by the E.U., by the NSF (National Science Foundation) and NIST (National Institute of Standards and Technology) in the U.S., by the JISC (Joint Information Systems Committee) and other agencies in the U.K., and by Australia and Japan.\n\n8. Important new research challenges from data \nThis and the following section engage with major new developments, for problem solving, and for data science and big data analytics, with the partial or complete integration of relevant sciences and technologies, and methodologies, in observed and empirical contexts.\nData science\u2014integrating potentially all application domains, with mathematical foundations for methodology as befits observational science, and integrated observational and experimental science\u2014fully relates data to all that is accomplished and achieved from the data sources. This results in the great importance of the contemporary increasing orientation towards, and requirement for, open data. Mahabal et al.[2] offer a good explanation of this development in data science, and of the potential here for application transfers, in parallel with methodology transfers.\nThe Open Universe initiative (http:\/\/www.openuniverse.asi.it) was established by the United Nations.[35]\nThe initiative stated that \"acknowledging that open data access drives innovation and productivity is a well-established principle in every scientific discipline. 
However, there is still a considerable degree of unevenness in the services currently offered by providers of data ...\" Among six objectives is the goal of \u201cadvancing calibration quality and statistical integrity,\u201d with outcomes for education, globally, and private sector involvement. Here, and through transference to each and all domains for data science, what is required of open data and all associated open information is that it be findable, accessible, interoperable, and reusable (the FAIR Principles), as well as reproducible.[36]\nSupporting the FAIR principles is ESASKY (European Space Agency, Sky, accessible from http:\/\/sci.esa.int\/home), described as \u201ca discovery portal that provides full access to the entire sky. This open-science application allows computer, tablet and mobile users to visualise cosmic objects near and far across the electromagnetic spectrum.\u201d The interesting new research challenges in data science relate foremost to the transfer, to many domains, of FAIR-based open science discovery portals.\nAn important application domain in this regard will be emerging smart technologies, which encompass smart homes, smart cities, smart environments in general, and the internet of things. An important \"situation theory\" methodology, in an information space that is mathematically based, furnishing a comprehensive representational system, is proposed by Devlin.[37] Associated with this are the social, legal, and economic aspects of emerging smart technologies in real-life applications.\n\n9. Information space theory for big data analytics in internet of things and smart environments \nContext is extremely important in big data analytics, and in many other domains.[20] Situation theory provides humans (generally, trained domain experts) with powerful, flexible representations that enable them to perform better, both as analysts and decision makers. Systems such as the one outlined in Figure 3 for the U.S. 
Army (sourced from Devlin[38]) have a software back-end, possibly including artificial intelligence (AI), but they are in no way \u201ccalculators\u201d or expert systems for making decisions. What was done was to harness the power of mathematics primarily as a representational system, rather than for its computational capacity. While the back-end software can manipulate the network\u2014each completion diagram is a structurally identical piece of code\u2014perhaps permitting the eventual application of familiar-looking network-optimization algorithms, many of those completion diagrams represent inherently human thoughts, intentions, and actions, and, for the foreseeable future, the human mind remains the best tool to handle them. This work for the United States Army used situation theory to develop a first-iteration specification for a workstation to be used by a field commander, in both mission planning and real-time control. This work takes into account the many different ontologies in a modern battlefield. The role of ontologies is very central in qualitative analysis of research, cf. Darabi[14].\n\nFigure 3: The completion-diagram network is complex. In the screenshot, the aerial view (taken from a previous mission used in training) is of an urban battlefield. 
The back-end system links the elements in each completion diagram to a corresponding feature in the aerial view, permitting the user to work fluidly with the two representations, having the benefit of two very different views, one spatial, the other human-structural, so the user can explore the domain from (literally) different perspectives.\n\n\n\n9.1 Context, situation theory, and completion diagrams \nIn the early 1980s, a group of researchers at or connected to Stanford University started to develop an analogous mathematically-based representation of communicating humans, looking deeper than the mere fact of communication (captured by the network model used by the telecommunication engineers) to take account of what was being communicated. (Part of the challenge was to decide how far it is possible to go into categorizing that \u201cwhat\u201d in order to achieve a representation that is useful in analyzing communication and designing communication-based activities such as work.) That approach is generally referred to as situation theory. Devlin was one of those early pioneers, who wrote a theoretical book on the subject, Logic and Information.[37] Subsequently, the techniques developed by the Stanford group were applied by Devlin and Rosenberg[39][40] to solve an actual workplace problem involving communication in the workplace.\nThe representation[40] used was (of necessity) similar to that used by telecommunication engineers, Google, the postal system, UPS, and FedEx in that the domain is represented by a network. 
However, whereas those earlier examples had networks of point nodes, the nodes in this network were more complicated objects, which were termed \u201ccompletion diagrams.\u201d See the right-hand side of Figure 3, where \u201csituation s1\u201d results in \u201ctype T1\u201d, and \u201csituation s2\u201d results in \u201ctype T2\u201d, so that transition from \u201csituation s1\u201d to \u201csituation s2\u201d has the related association between \u201ctype T1\u201d and \u201ctype T2\u201d. The entities in such a completion diagram can be considered as capturing the key elements of a basic human act, here military and managerial action, including a communicative act. Much of Logic and Information is devoted to the development and explication of such a completion diagram. It has its origins in work by Barwise and Perry.[41]\nInformation is a vehicle for the use of a big data approach to underpin the study of interaction and communication in smart environments (e.g., cities, workplaces, and homes). \"Information space theory\" is to provide the focus for building an inter-disciplinary community concerned with social and technological issues associated with recent technological advances. Relevant emerging research and innovation disciplines include the internet of things, internet of everything, and big data analytics, among others, that contribute to the design, development, and effective implementation of smart environments in real life.\nResearch projects related to both \u201cinformation space theory\u201d and \u201cinteraction space theory\u201d include SANE, \u201cSustainable Accommodation for the New Economy\u201d, a European Framework 5 research project with very innovative aims and outcomes for research and for industrial companies. 
It's described as \"a multi-disciplinary and multi-cultural R&D project that takes a location independent approach to the design of a sustainable workplace to ensure compatibility between fixed and mobile, local and remote work areas\" and one that will specify, prototype, and develop a set of ICT tools with \"emphasis being on the innovative application of emerging technologies and services.\"[42] Another project involving universities in the U.K. and in Germany was IS-VIT, \u201cInteraction Space of the Virtual IT Workplace\u201d. Related outcomes of these projects are described by Rosenberg et al.[43] and Walkowski et al.[44]\nInformation space theory takes into account the following: (i) People who inhabit smart environments and spontaneously generate data and information in the course of their day-to-day activities; (ii) Place which can be public (smart cities), privileged (workplaces) or private (homes) with varying degrees of privacy and security constraints that shape information sharing; and (iii) Patterns of interaction between people and technology that is an integral part of smart environments and influences human\u2013human, human\u2013device and device\u2013device interaction.\nA summary follows of inter-disciplinary information space theory and its application in smart environments: (i) an introduction to studies of information, data and interaction; (ii) big data analytics as a tool for the development of information space theory; (iii) information space theory and its impact on the design of smart environments; (iv) information space and human communication research, involving an account of the evolution of smart interaction systems; (v) further refinement of information space theory informed by cross-disciplinary perspectives and requirements of application in smart systems and emerging technologies, including contribution to the application of big data analytics in real-life smart environments; and (vi) the concept of information space as a 
distinct feature of human context that makes it possible for people to achieve coordination and reciprocity of perspectives through smart interaction systems that safeguard their privacy and security.\nSuch work builds on the work of an inter-disciplinary group of researchers within mathematics, computer, and social sciences who are attempting to address key research questions: How do emerging smart technologies influence information sharing in interaction between people and technology in smart environments? What are the social, legal, and economic impacts of emerging smart technologies in real-life application?\nTo this end, the concept of information space will guide the investigation into interactions that occur within smart environments, taking account of human\u2013human, human\u2013device, and device\u2013device interaction in a uniform framework. Special attention is given to information sharing\u2014pathways, enablers and gatekeepers\u2014to incorporate security and privacy concerns that urgently need to be addressed in order to optimize the technology potential in real-life applications of smart environments. The working assumption behind this approach is that inter-disciplinary, formal, and theoretical understanding of the nature of these interactions is essential for these concerns to be addressed and resolved.\nIn this context, mathematics plays a crucial role in developing and using a mathematically-based representation framework for the analysis and design of work in the era of the internet of things. Both in life and in scientific studies, what we can achieve depends on, and is constrained by, the representational system we use. The greater the complexity of the domain, the more significant is the representation at our disposal\u2014representations are what make it possible for us to understand and reason about the world. 
For instance, trade, commerce, and financial activity in Europe were revolutionized by the introduction of the Hindu-Arabic, decimal arithmetic system (\u201cmodern arithmetic\u201d) in the thirteenth century, which made it possible for anyone to become proficient in arithmetic after just a few weeks\u2019 practice. A similar revolution occurred in the 1980s, when the introduction of the modern, windows\u2013icons\u2013mouse interface for personal computers made it possible for ordinary people to use what had until then been a tool for trained experts. Long before those two examples, the introduction of numbers themselves, in the form of a monetary system, transformed human life by providing a simple, quantitative representation system for property ownership and social indebtedness.\nThe rise of natural science involved a new representation system that assigned numerical values to various features of the environment (features given names such as length, area, volume, mass, temperature, momentum, etc.) and shifted the focus from trying to understand why things occurred to simply measuring how one quantified feature varied with another\u2014an approach that proved to be extremely fruitful for society. The representation systems of the natural sciences have all been based on mathematics to a considerable extent. In the social realm, mathematically-based representation systems are less common, but when they have been developed, they have proved to be extremely powerful. (Money is a particularly dramatic example.) Indeed, one of the most widespread applications of mathematics in today\u2019s world is the optimization of various human activities. Computer queries depend on optimization in a mathematical space that treats every living human as a node in a simple mathematical structure called a graph. 
\u201cModelling\u201d a person as a point node in a mathematical network omits all information about a person save for one factor: the connections of that human to all other humans. However, for questions that hinge on that one factor, the representation enables mathematical algorithms to be applied that provide society with one of its most important tools.\nAnother example is provided by the algorithms that route our telephone calls, our internet communication, our mail and package delivery systems, and our transportation systems. In those cases, whereas a search engine like Google represents the human domain as a two-dimensional network of nodes and edges, the domains of communicating devices such as phones or computers, of letters and packages in shipment, and of travelers are represented as high dimensional \u201cpolytopes,\u201d generalizations of the familiar polygons of high school geometry to higher dimensions, to which mathematical methods such as the Simplex Method or Karmarkar\u2019s algorithm can be applied to determine optimal routing. These representations work by ignoring almost everything about the entities in the domain apart from the one or two features that are germane to the task. The result is that the power of mathematics can be brought to bear on a problem that, on the face of it, is part of the complex web of human activity that defies the methods of science in terms of its complexity and (local) unpredictability.\n\n10. Conclusions \nHaving indicated a few highly important and relatively recent organizational initiatives, we conclude that data science\u2014viewed as the convergence of disciplines, or, in practice, sub-disciplines\u2014should very much incorporate open methodology, open data, and transparency, reproducibility, and interoperability.\nThis article has sought to form a foundation for further study of the specific content of data science education and training, and of business sector importance. 
After all, progress and impact ensure development and evolution over time. As noted above, too, we may, if we wish, refer to the contemporary data revolution.\nBoth challenges and impactful potential are prominent, and it is good to see them as predominant in our rapidly growing discipline of data science. There are also important directions (in new research challenges and application of information space theory) to both follow and to incorporate in other domains.\n\nAcknowledgements \nAuthor contributions \nSection 9 by K.D. and other sections by F.M. All represent our extensive research work, teaching, and some consultancy also.\n\nConflicts of interest \nThe authors declare no conflict of interest.\n\nReferences \n\n\n\u2191 Henke, N.; Bughin, J.; Chui, M. et al. (December 2016). \"The age of analytics: Competing in a data-driven world\". McKinsey & Company. pp. 136. https:\/\/www.mckinsey.com\/business-functions\/mckinsey-analytics\/our-insights\/the-age-of-analytics-competing-in-a-data-driven-world . Retrieved 18 June 2018.\n\n\u2191 2.0 2.1 2.2 2.3 2.4 2.5 2.6 Mahabal, A.A.; Crichton, D.; Djorgovski, S.G. et al. (2017). \"From Sky to Earth: Data Science Methodology Transfer\". Proceedings of the International Astronomical Union: 1\u201310. doi:10.1017\/S1743921317000060.\n\n\u2191 3.0 3.1 3.2 3.3 Murtagh, F. (2017). Data Science Foundations: Geometry and Topology of Complex Hierarchic Systems and Big Data Analytics. CRC Press. pp. 206. ISBN 9781498763936.\n\n\u2191 4.0 4.1 Hayashi, C. (1998). \"What is Data Science? Fundamental concepts and a heuristic example\". In Hayashi, C.; Yajima, K.; Bock, H.H. et al. Data Science, Classification, and Related Methods. Springer. pp. 40\u201351. ISBN 9784431702085. 
\n\n\u2191 Ohsumi, N. (2000). \"From data analysis to data science\". In Kiers, H.A.L.; Rasson, J.-P.; Groenen, P.J.F. et al. Data Science, Classification, and Related Methods. Springer. pp. 329\u201334. ISBN 9783540675211.\n\n\u2191 6.0 6.1 Escoufier, Y.; Fichet, B.; Lebart, L. et al., ed. (1995). Data Science and Its Applications. Academic Press.\n\n\u2191 7.0 7.1 7.2 7.3 7.4 7.5 7.6 7.7 7.8 Cao, L. (2017). \"Data Science: A Comprehensive Overview\". ACM Computing Surveys 50 (3): 43. doi:10.1145\/3076253.\n\n\u2191 8.0 8.1 Ueno, M. (2017). \"As the oldest journal of data science\". Behaviormetrika 44 (1): 1\u20132. doi:10.1007\/s41237-016-0011-7.\n\n\u2191 Englmeier, K.; Murtagh, F. (2017). \"Data scientist - Manager of the discovery lifecycle\". Proceedings of the 6th International Conference on Data Science, Technology and Applications: 133\u2013140. doi:10.5220\/0006393801330140.\n\n\u2191 Murtagh, F. (2017). \"Chapter 8: Geometry and Topology of Matte Blanco's Bi-Logic in Psychoanalytics\". Data Science Foundations: Geometry and Topology of Complex Hierarchic Systems and Big Data Analytics. CRC Press. pp. 147\u201362. ISBN 9781498763936.\n\n\u2191 Coombs, C.H. (1964). A Theory of Data. Wiley.\n\n\u2191 Japec, L.; Kreuter, F.; Berg, M. et al. (12 February 2015). \"AAPOR Report: Big Data\". AAPOR. https:\/\/www.aapor.org\/Education-Resources\/Reports\/Big-Data.aspx . Retrieved 18 June 2018.\n\n\u2191 13.0 13.1 Abbany, Z. (27 November 2017). \"A public transport model built on open data\". DW. Deutsche Welle. https:\/\/www.dw.com\/en\/a-public-transport-model-built-on-open-data\/a-41546053 . Retrieved 27 November 2017. 
\n\n\u2191 14.0 14.1 Darabi, A. (05 December 2017). \"The UK\u2019s next census will be its last\u2014here\u2019s why\". Apolitical. Apolitical Group Limited. https:\/\/apolitical.co\/solution_article\/uks-next-census-will-last-heres\/ .\n\n\u2191 Murtagh, F.; Orlov, M.; Mirkin, B. (2018). \"Qualitative Judgement of Research Impact: Domain Taxonomy as a Fundamental Framework for Judgement of the Quality of Research\". Journal of Classification 35 (1): 5\u201328. doi:10.1007\/s00357-018-9247-0.\n\n\u2191 16.0 16.1 Hand, D.J. (2018). \"Statistical challenges of administrative and transaction data\". Statistics in Society Series A 181 (3): 555\u2013605. doi:10.1111\/rssa.12315.\n\n\u2191 Anderson, C. (23 June 2008). \"The End of Theory: The Data Deluge Makes The Scientific Method Obsolete\". Wired. Cond\u00e9 Nast. https:\/\/www.wired.com\/2008\/06\/pb-theory\/ .\n\n\u2191 18.0 18.1 Murtagh, F. (2008). \"Origins of Modern Data Analysis Linked to the Beginnings and Early Development of Computer Science and Information Engineering\". Electronic Journal for History of Probability and Statistics 4 (2): 1\u201326. https:\/\/arxiv.org\/abs\/0811.2519 .\n\n\u2191 19.0 19.1 Englmeier, K.; Murtagh, F. (2017). \"Editorial: What Can We Expect from Data Scientists?\". Journal of Theoretical and Applied Electronic Commerce Research 12 (1): 1\u20135. doi:10.4067\/S0718-18762017000100001.\n\n\u2191 20.0 20.1 Murtagh, F.; Farid, M. (2017). \"Contextualizing Geometric Data Analysis and Related Data Analytics: A Virtual Microscope for Big Data Analytics\". Journal of Interdisciplinary Methodologies and Issues in Sciences 3 (Digital Contextualization): 1\u201319. doi:10.18713\/JIMIS-010917-3-1. 
\n\n\u2191 Allin, P.; Hand, D.J. (2016). \"New statistics for old?\u2014Measuring the wellbeing of the UK\". Statistics in Society Series A 180 (1): 3\u201343. doi:10.1111\/rssa.12188.\n\n\u2191 Wessel, M. (03 November 2016). \"You Don't Need Big Data - You Need the Right Data\". Harvard Business Review. https:\/\/hbr.org\/2016\/11\/you-dont-need-big-data-you-need-the-right-data . Retrieved 18 June 2018.\n\n\u2191 \"Jobs Rated Report 2017: Ranking 200 Jobs\". CareerCast.com. 2017. https:\/\/www.careercast.com\/jobs-rated\/2017-jobs-rated-report . Retrieved 18 June 2018.\n\n\u2191 Daniel, B.K. (2018). \"Reimaging Research Methodology as Data Science\". Big Data and Cognitive Computing 2 (1): 4. doi:10.3390\/bdcc2010004.\n\n\u2191 Press, G. (28 February 2018). \"Graduate Programs in Data Science and Big Data Analytics\". What's the Big Data?. https:\/\/whatsthebigdata.com\/2012\/08\/09\/graduate-programs-in-big-data-and-data-science\/ . Retrieved 18 June 2018.\n\n\u2191 \"Big Data Executive Survey 2017\" (PDF). NewVantage Partners LLC. January 2017. http:\/\/newvantage.com\/wp-content\/uploads\/2017\/01\/Big-Data-Executive-Survey-2017-Executive-Summary.pdf . Retrieved 18 June 2018.\n\n\u2191 Hayes, B. (18 January 2016). \"Empirically-Based Approach to Understanding the Structure of Data Science\". Business Over Broadway. http:\/\/businessoverbroadway.com\/empirically-based-approach-to-understanding-the-structure-of-data-science . Retrieved 18 June 2018.\n\n\u2191 Murtagh, F. (2018). \"Security and ethics in Big Data: Analytical foundations for surveys\". Archives of Data Science Submitted. 
\n\n\u2191 Hand, D. (06 April 2017). \"The dangers of not seeing what isn\u2019t there: Selection bias in statistical modelling\". Irish Statistical Association (ISA) Gossett Lecture 2017. https:\/\/www.ucc.ie\/en\/matsci\/news\/irish-statistical-association-isa-gossett-lecture-2017.html .\n\n\u2191 30.0 30.1 30.2 30.3 30.4 Keiding, N.; Louis, T.A. (2016). \"Perils and potentials of self\u2010selected entry to epidemiological studies and surveys\". Statistics in Society Series A 179 (2): 319\u201376. doi:10.1111\/rssa.12136.\n\n\u2191 O'Neil, C. (2016). Weapons of Math Destruction. Crown. pp. 272. ISBN 9780553418811.\n\n\u2191 \"Chapitre 1. Id\u00e9es\u2013clefs de l\u2019analyse g\u00e9om\u00e9trique des donn\u00e9es\". La M\u00e9thodologie de Pierre Bourdieu en Action: Espace Culturel, Espace Social et Analyse des Donn\u00e9es. Dunod. 2015. pp. 3\u201320. doi:10.3917\/dunod.lebar.2015.01.0003. ISBN 9782100703845.\n\n\u2191 33.0 33.1 Murtagh, F.; Pianosi, M.; Bull, R. (2016). \"Semantic mapping of discourse and activity, using Habermas\u2019s theory of communicative action to analyze process\". Quality & Quantity 50 (4): 1675\u20131694. doi:10.1007\/s11135-015-0228-7.\n\n\u2191 \"Commission urges governments to embrace potential of Big Data\". Press Release Database. European Commission. 02 July 2014. http:\/\/europa.eu\/rapid\/press-release_IP-14-769_en.htm . 
\n\n\u2191 Committee on the Peaceful Uses of Outer Space (14 June 2016). \"\u201cOpen Universe\u201d proposal, an initiative under the auspices of the Committee on the Peaceful Uses of Outer Space for expanding availability of and accessibility to open source space science data\" (PDF). United Nations Office for Outer Space Affairs. http:\/\/www.unoosa.org\/res\/oosadoc\/data\/documents\/2016\/aac_1052016crp\/aac_1052016crp_6_0_html\/AC105_2016_CRP06E.pdf . Retrieved 18 June 2018.\n\n\u2191 Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J. et al. (2016). \"The FAIR Guiding Principles for scientific data management and stewardship\". Scientific Data 3: 160018. doi:10.1038\/sdata.2016.18.\n\n\u2191 37.0 37.1 Devlin, K. (1991). Logic and Information. Cambridge University Press. ISBN 0521499712.\n\n\u2191 Devlin, K. (July 2011). \"A uniform framework for describing and analyzing the modern battlefield\" (PDF). Stanford University. http:\/\/web.stanford.edu\/~kdevlin\/Papers\/Army_report_0711.pdf . Retrieved 18 June 2018.\n\n\u2191 Devlin, K.; Rosenberg, D. (1996). Language at Work: Analyzing Communication Breakdown in the Workplace to Inform Systems Design. Center for the Study of Language and Information. pp. 212. ISBN 9781575860510.\n\n\u2191 40.0 40.1 Devlin, K.; Rosenberg, D. (2008). \"Information in the Study of Human Interaction\". In Adriaans, P.; van Benthem, J.; Gabbay, D. et al. Philosophy of Information. Elsevier. pp. 685\u2013709. doi:10.1016\/B978-0-444-51726-5.50021-2.\n\n\u2191 Barwise, J.; Perry, J. (1999). Situations and Attitudes. Center for the Study of Language and Information. pp. 376. ISBN 9781575861937. 
\n\n\u2191 \"Sustainable Accommodation in the New Economy\". CORDIS. European Commission. 15 May 2008. https:\/\/cordis.europa.eu\/project\/rcn\/58059_en.html . Retrieved 18 June 2018.\n\n\u2191 Rosenberg, D.; Foley, S.; Lievonen, M. et al. (2005). \"Interaction spaces in computer-mediated communication\". AI & Society 19 (1): 22\u201333. doi:10.1007\/s00146-004-0299-9.\n\n\u2191 Walkowski, S.; D\u00f6rner, R.; Lievonen, M. et al. (2011). \"Using a game controller for relaying deictic gestures in computer-mediated communication\". International Journal of Human-Computer Studies 69: 362\u201374. doi:10.1016\/j.ijhcs.2011.01.002.\n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to grammar, spelling, and presentation, including the addition of PMCID and DOI when they were missing from the original reference. The original inline citation method was unorthodox; these inline citations have been made clearer with the addition of the author of the citation. This often required sentences containing inline citations to be reconstructed. Several URL mentions in the text were turned into full citations. 
Several vanity statements and irrelevant comments were removed for improved readability.\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\">https:\/\/www.limswiki.org\/index.php\/Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development<\/a>\nCategories: LIMSwiki journal articles (added in 2018); LIMSwiki journal articles (all); LIMSwiki journal articles on big data; LIMSwiki journal articles on education\n\nThis page was last modified on 17 July 2018, at 23:27.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n","795feead44bb9c43869be23a90bf9d75_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_The_development_of_data_science_Implications_for_education_employment_research_and_the_data_revolution_for_sustainable_development skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:The development of data science: Implications for education, employment, research, and the data revolution for sustainable development<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p>In data science, we are concerned with the integration of relevant sciences in observed and empirical contexts. 
This results in the unification of analytical methodologies, and of observed and empirical data contexts. Given the dynamic nature of convergence, the origins and many evolutions of the data science theme are described. The following are covered in this article: the rapidly growing post-graduate university course provisioning for data science; a preliminary study of employability requirements; and how past eminent work in the social sciences and other areas, certainly mathematics, can be of immediate and direct relevance and benefit for innovative methodology, and for facing and addressing the ethical aspect of big data <a href=\"https:\/\/www.limswiki.org\/index.php\/Data_analysis\" title=\"Data analysis\" target=\"_blank\" class=\"wiki-link\" data-key=\"545c95e40ca67c9e63cd0a16042a5bd1\">analytics<\/a>, relating to data aggregation and scale effects. Associated also with data science is how direct and indirect outcomes and consequences of data science include decision support and policy making, and both qualitative as well as quantitative outcomes. For such reasons, the importance is noted of how data science builds collaboratively on other domains, potentially with innovative methodologies and practice. Further sections point towards some of the major current research issues.\n<\/p><p><b>Keywords<\/b>: big data training and learning, company and business requirements, ethics, impact, decision support, data engineering, open data, smart homes, smart cities, IoT\n<\/p>\n<h2><span class=\"mw-headline\" id=\"1._Introduction:_Data_science_as_the_convergence_and_bridging_of_disciplines\">1. Introduction: Data science as the convergence and bridging of disciplines<\/span><\/h2>\n<p>The context of our problem solving and analytics will always be quite fundamental, very specific, and particularly oriented. (Section 4 of this article draws some interesting and relevant implications of this.) 
This article is oriented towards commonality and mutual influence of methodologies, and of analytical processes and procedures. A nice example of the parallel nature of such things is how \"big data analytics\" is often considered a synonym of \"data science.\" In Section 2.2, it is mentioned how public transport may well use smartphone and mobile phone wireless connection data to observe locations of individuals. This close association or, perhaps even, identity of big data analytics and data science will have growing importance with the internet of things (IoT), and smart cities and smart homes, and so on (as noted in Section 8). The McKinsey Global Institute provided an outstanding perspective on this idea in their paper <i>The age of analytics: Competing in a data-driven world<\/i>.<sup id=\"rdp-ebb-cite_ref-HenkeTheAge16_1-0\" class=\"reference\"><a href=\"#cite_note-HenkeTheAge16-1\" rel=\"external_link\">[1]<\/a><\/sup>\n<\/p><p>In Section 8 and Section 9 of this article, very important developments are at issue, encompassing newly oriented and pursued methodologies, and the integration of research domains. Section 7 notes how important all of the content here is to sustainable development. The phrase \"data revolution\" is based here on ongoing work by the United Nations, and by so many of us in this domain, and from national authorities in Africa and the Middle East discussing issues here at the most recent (2017) World Statistics Congress.\n<\/p><p>This converging and bridging of disciplines is increasingly important. For example, Mahabal <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-MahabalFromSky17_2-0\" class=\"reference\"><a href=\"#cite_note-MahabalFromSky17-2\" rel=\"external_link\">[2]<\/a><\/sup> discuss the parallels between astronomy and Earth science data, methodology transfer, and metadata and ontologies characterized as being crucial. 
They claim the convergence or bridging of disciplines must address \u201cnon-homogeneous observables, and varied spatial, temporal coverage at different resolutions.\u201d<sup id=\"rdp-ebb-cite_ref-MahabalFromSky17_2-1\" class=\"reference\"><a href=\"#cite_note-MahabalFromSky17-2\" rel=\"external_link\">[2]<\/a><\/sup> This quotation is very familiar to us in regard to how NoSQL databases are now widely used, as well as traditional relational databases. Another example is how text mining, social media, and many other domains have become so very important in many contexts. Then, given computational support, \u201cit is the complexity more than the data volume that proves to be a bigger challenge.\u201d<sup id=\"rdp-ebb-cite_ref-MahabalFromSky17_2-2\" class=\"reference\"><a href=\"#cite_note-MahabalFromSky17-2\" rel=\"external_link\">[2]<\/a><\/sup> Further benefits of this data science convergence are termed here \"tractability\" and \"reproducibility.\" Mahabal <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-MahabalFromSky17_2-3\" class=\"reference\"><a href=\"#cite_note-MahabalFromSky17-2\" rel=\"external_link\">[2]<\/a><\/sup> also discuss the complexity relating to resolution and distributions. In a separate work, Murtagh<sup id=\"rdp-ebb-cite_ref-MurtaghData17_3-0\" class=\"reference\"><a href=\"#cite_note-MurtaghData17-3\" rel=\"external_link\">[3]<\/a><\/sup> characterized this in terms of data encoding. Plenty of work now emphasizes the importance of <i>p<\/i>-adic data encoding (binary or ternary when <i>p<\/i> = 2 or 3), compared with real-valued encoding (<i>m<\/i>-adic, especially when <i>m<\/i> = 10).\n<\/p><p>The convergence and bridging of disciplines is fully emphasized by Mahabal <i>et al.<\/i> as such<sup id=\"rdp-ebb-cite_ref-MahabalFromSky17_2-4\" class=\"reference\"><a href=\"#cite_note-MahabalFromSky17-2\" rel=\"external_link\">[2]<\/a><\/sup>: \n<\/p>\n<blockquote>Methodology transfer can almost never be unidirectional. 
Diverse fields grow by learning tricks employed by other disciplines. The important thing is to abstract data\u2014described by meaningful metadata\u2014and the metadata in turn connected by a good ontology.<\/blockquote>\n<p>Further description is at issue in regard to collaboration in data science<sup id=\"rdp-ebb-cite_ref-MahabalFromSky17_2-5\" class=\"reference\"><a href=\"#cite_note-MahabalFromSky17-2\" rel=\"external_link\">[2]<\/a><\/sup>: \n<\/p>\n<blockquote>We have described here a few techniques from astroinformatics that are finding use in <a href=\"https:\/\/www.limswiki.org\/index.php\/Geoinformatics\" title=\"Geoinformatics\" target=\"_blank\" class=\"wiki-link\" data-key=\"2dc37de467d4af308f4b02d8e2ba12d1\">geoinformatics<\/a>. There would be many from earth science that space science would do well to emulate. Even other disciplines like <a href=\"https:\/\/www.limswiki.org\/index.php\/Bioinformatics\" title=\"Bioinformatics\" target=\"_blank\" class=\"wiki-link\" data-key=\"8f506695fdbb26e3f314da308f8c053b\">bioinformatics<\/a> provide ample opportunities for methodology transfer and collaboration. With growing data volumes, and more importantly the increasing complexity, data science is our only refuge. Collaboration in data science will be beneficial to all sciences.<\/blockquote>\n<h2><span class=\"mw-headline\" id=\"2._Historical_development_of_data_science_and_some_contemporary_examples_of_cross-disciplinarity\">2. Historical development of data science and some contemporary examples of cross-disciplinarity<\/span><\/h2>\n<p>A short historical perspective that follows is with reference to such disciplines as computer and information sciences, mathematics and statistics, physics, and, implicitly, social sciences. 
In concluding this description, a key point will be how data science encompasses and embraces all of the following: cross-disciplinarity, interdisciplinarity, and multidisciplinarity.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"2.1_Historical_prominence_of_data_science_in_recent_times\">2.1 Historical prominence of data science in recent times<\/span><\/h3>\n<p>The origins of data science are largely due to Chikio Hayashi and others. Hayashi<sup id=\"rdp-ebb-cite_ref-HayashiWhatIs98_4-0\" class=\"reference\"><a href=\"#cite_note-HayashiWhatIs98-4\" rel=\"external_link\">[4]<\/a><\/sup> says \u201cI will present 'data science' as a new concept,\u201d followed by a relevant introduction to the science of data: \u201cData Science consists of three phases: design for data, collection of data and analysis on data.\u201d<sup id=\"rdp-ebb-cite_ref-HayashiWhatIs98_4-1\" class=\"reference\"><a href=\"#cite_note-HayashiWhatIs98-4\" rel=\"external_link\">[4]<\/a><\/sup> In Ohsumi<sup id=\"rdp-ebb-cite_ref-OhsumiFromData00_5-0\" class=\"reference\"><a href=\"#cite_note-OhsumiFromData00-5\" rel=\"external_link\">[5]<\/a><\/sup>, the abstract has this: \u201cIn 1992, the author argued the urgency of the need to grasp the concept 'data science'. 
Despite the emergence of concepts such as data mining, this issue has not been addressed.\u201d\n<\/p><p>Escoufier <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-EscoufierDataSci95_6-0\" class=\"reference\"><a href=\"#cite_note-EscoufierDataSci95-6\" rel=\"external_link\">[6]<\/a><\/sup> note how data science arises from the convergence of computer science and statistics, which \"gives birth to a new science at its core.\" They conclude that \"[t]o take data as a starting point provides a complementary vision of theory and practice, and avoids creating an unfortunate gap between two steps, both of which are essential in any scientific process.\"<sup id=\"rdp-ebb-cite_ref-EscoufierDataSci95_6-1\" class=\"reference\"><a href=\"#cite_note-EscoufierDataSci95-6\" rel=\"external_link\">[6]<\/a><\/sup>\n<\/p><p>Cao provides a comprehensive overview of data science<sup id=\"rdp-ebb-cite_ref-CaoData17_7-0\" class=\"reference\"><a href=\"#cite_note-CaoData17-7\" rel=\"external_link\">[7]<\/a><\/sup>, noting how the \u201cfirst conference to adopt 'data science' as a topic\u201d was the International Federation of Classification Societies (IFCS) 1996 conference, in Kobe, Japan. This was fully consistent with our work as participants, then and now (IFCS 2017, in Tokyo, Japan, also had \"data science\" in its title). Ueno<sup id=\"rdp-ebb-cite_ref-UenoAsThe17_8-0\" class=\"reference\"><a href=\"#cite_note-UenoAsThe17-8\" rel=\"external_link\">[8]<\/a><\/sup> makes a similar point about IFCS 1996 as the first conference with \"data science\" in its title, and he also claims that the journal <i>Behaviormetrika<\/i> is \"the oldest journal addressing the topic of data science,\" when it started in 1974. 
He describes data science as \"an interdisciplinary field that includes the use of statistical methods to extract meaningful knowledge from data in various forms: either structured or unstructured.\"<sup id=\"rdp-ebb-cite_ref-UenoAsThe17_8-1\" class=\"reference\"><a href=\"#cite_note-UenoAsThe17-8\" rel=\"external_link\">[8]<\/a><\/sup>\n<\/p><p>Cao<sup id=\"rdp-ebb-cite_ref-CaoData17_7-1\" class=\"reference\"><a href=\"#cite_note-CaoData17-7\" rel=\"external_link\">[7]<\/a><\/sup> provides additional historical perspectives, with the section heading \"The Data Science journey,\" relating largely to work in the 1960s and 1970s. This includes \"information discovery\" as a continuing key objective in data science. Englmeier and Murtagh<sup id=\"rdp-ebb-cite_ref-EnglmeierData17_9-0\" class=\"reference\"><a href=\"#cite_note-EnglmeierData17-9\" rel=\"external_link\">[9]<\/a><\/sup> also make note of this objective, emphasizing the \u201csemantic dimension of data science,\u201d through the information discovery lifecycle, and the \u201cdiscovery lifecycle in text mining.\u201d While also emphasizing cooperation and cross-disciplinarity, there is this: we see the data scientist\u2019s responsibility...\n<\/p>\n<ul><li> in the design of an overarching semantic layer addressing data and analysis tools,<\/li>\n<li> in identifying suitable data sources and data patterns that correspond to the appearance of structured and unstructured data, and<\/li>\n<li> in the management of the information discovery lifecycle and discovery teams.<\/li><\/ul>\n<p>An ever-more important issue arises from the data sources that are employed. As a summary expression, data science is, firstly, the integration of data sources and analytical and related data processing methodologies, and, secondly and quite fundamentally, a product of the convergence of disciplines. 
Convergence of disciplines can be quite beneficial in practice, particularly in regard to addressing and solving problems, and also in regard to the cooperation yielded by cross-disciplinarity. See Section 5, below, for some current discussion on how the problems and challenges to be addressed can and should be, quite naturally, arising out of all aspects of data science.\n<\/p><p>The current era of data science can be considered as a culmination of previous epochs that gave rise to major digital technology advances, with implications in all social domains. Largely, the first epoch (in the 1980s) brought about laptop and desktop computers, and the second epoch (in the 1990s) gave rise to the internet and the World Wide Web.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"2.2_Practical_association_of_disciplines_and_sub-disciplines\">2.2 Practical association of disciplines and sub-disciplines<\/span><\/h3>\n<p>Cao<sup id=\"rdp-ebb-cite_ref-CaoData17_7-2\" class=\"reference\"><a href=\"#cite_note-CaoData17-7\" rel=\"external_link\">[7]<\/a><\/sup> also makes mention of data science being centered on the following disciplines: statistics, <a href=\"https:\/\/www.limswiki.org\/index.php\/Informatics\" title=\"Informatics\" class=\"mw-disambig wiki-link\" target=\"_blank\" data-key=\"ea0ff624ac3a644c35d2b51d39047bdf\">informatics<\/a>, sociology, and management science. Clearly there is emphasis on \u201csynergy of several research disciplines\u201d and how \u201cinterdisciplinary initiatives are necessary to bridge the gaps between the respective disciplines.\u201d<sup id=\"rdp-ebb-cite_ref-CaoData17_7-3\" class=\"reference\"><a href=\"#cite_note-CaoData17-7\" rel=\"external_link\">[7]<\/a><\/sup> This is exciting and not least because of how there is convergence of disciplines or subdisciplines. 
We may consider, for example, how the digital humanities can incorporate relevant areas of a few disciplines, and how computational psychoanalysis can come to the fore.<sup id=\"rdp-ebb-cite_ref-MurtaghData17-8_10-0\" class=\"reference\"><a href=\"#cite_note-MurtaghData17-8-10\" rel=\"external_link\">[10]<\/a><\/sup> With a major focus on psychometrics, Coombs<sup id=\"rdp-ebb-cite_ref-CoombsATheory64_11-0\" class=\"reference\"><a href=\"#cite_note-CoombsATheory64-11\" rel=\"external_link\">[11]<\/a><\/sup> has chapters that proceed from \u201cBasic Concepts\u201d to \u201cOn Methods of Collecting Data,\u201d and \u201cPreferential Choice Data.\u201d\n<\/p><p>Now, data is so very central to all of our sciences, and to all aspects of our engineering and technology. Murtagh<sup id=\"rdp-ebb-cite_ref-MurtaghData17_3-1\" class=\"reference\"><a href=\"#cite_note-MurtaghData17-3\" rel=\"external_link\">[3]<\/a><\/sup> defines just what data is, which includes the concept of data coding, which should perhaps instead be termed data encoding. After all, data is measurement. This underscores the importance of the mathematical underpinnings in data science. Implications that follow include the relevance and importance of new, innovative directions to be followed, and of effective problem solving. The mathematical view of what measurement means is all-important, including in the discipline of physics. 
Murtagh<sup id=\"rdp-ebb-cite_ref-MurtaghData17_3-2\" class=\"reference\"><a href=\"#cite_note-MurtaghData17-3\" rel=\"external_link\">[3]<\/a><\/sup> cites eminent physicist Paul Dirac as to how mathematics underpins all of physics, and how the work of eminent psychoanalyst Ignacio Matte Blanco has mathematics being integral to psychoanalysis.\n<\/p><p>From a major study of big data and surveying by the American Association for Public Opinion Research comes the following<sup id=\"rdp-ebb-cite_ref-JapecAAPOR15_12-0\" class=\"reference\"><a href=\"#cite_note-JapecAAPOR15-12\" rel=\"external_link\">[12]<\/a><\/sup>: \u201cThe classic statistical paradigm was one in which researchers formulated a hypothesis, identified a population frame, designed a survey and a sampling technique and then analyzed the results \u2026 The new paradigm means it is now possible to digitally capture, semantically reconcile, aggregate, and correlate data.\u201d\n<\/p><p>Abbany<sup id=\"rdp-ebb-cite_ref-AbbanyAPublic17_13-0\" class=\"reference\"><a href=\"#cite_note-AbbanyAPublic17-13\" rel=\"external_link\">[13]<\/a><\/sup> notes that wireless connection data is forming a basis for public transport management. Such big data sources can be associated with, or even integrated with, personal and social behavioral patterns and activities. 
\u201cBetter living through data?\u201d asks Abbany, followed by a very critical statement: \u201cThe other thing I need to declare is that I\u2019m no fan of our contemporary belief that life can only get better the more data we have at our disposal.\u201d<sup id=\"rdp-ebb-cite_ref-AbbanyAPublic17_13-1\" class=\"reference\"><a href=\"#cite_note-AbbanyAPublic17-13\" rel=\"external_link\">[13]<\/a><\/sup> A response to this would be that data science, as the science of data, is everything relating to the path and trajectory connecting data, information, knowledge, and wisdom.\n<\/p><p>Darabi<sup id=\"rdp-ebb-cite_ref-DarabiTheUK17_14-0\" class=\"reference\"><a href=\"#cite_note-DarabiTheUK17-14\" rel=\"external_link\">[14]<\/a><\/sup> reports that \u201cThe UK\u2019s next census will be its last,\u201d with administrative, governmental authorities\u2019 data replacing the national census. The article acknowledges: \u201cCollecting the data itself is only half the work. A great deal of effort must go into combining it with other sources, in order to answer real questions.\u201d That can be understood as undertaking scientific investigation of such data, and other potentially relevant data. The cross-disciplinarity inherent in that also can, and perhaps must, lead to new interdisciplinary linkages. Arising out of the ending of the national census is the recognition that how the \u201cgovernment counts its people is changing, and it could transform policy.\u201d\n<\/p><p>One issue here has been how mathematics underpins so much, across disciplines, and also in the commercial and in most social domains. Many universities in the recent past shut down their mathematics departments and no longer provide teaching in mathematics. However, this is being reversed, with university courses again being provided in mathematics.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"3._Open_data.2C_reproducibility.2C_and_the_data_curation_challenge\">3. 
Open data, reproducibility, and the data curation challenge<\/span><\/h2>\n<p>While generally recognized as so important for innovation in both application outcomes and in regard to analytics and methodologies, open data plays a key role for data scientists. (Information and news about open data is well provided by the organization Open Data Institute: <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/theodi.org\" target=\"_blank\">https:\/\/theodi.org<\/a>).\n<\/p><p>One major aspect of how big data analytics are quite central to data science is the increasing availability of open data. Cao<sup id=\"rdp-ebb-cite_ref-CaoData17_7-4\" class=\"reference\"><a href=\"#cite_note-CaoData17-7\" rel=\"external_link\">[7]<\/a><\/sup> associates this with methodology, through \u201cthe open model rather than a closed one.\u201d This concept was central to a May 2017 London presentation by Dr. Robert Hanisch, Director, Office of Data and Informatics, National Institute of Standards and Technology. Dr. Hanisch worked for 30 years on the Hubble Space Telescope (HST) project. Due to open access to observed data, from our cosmos, Dr. Hanisch noted that three times the number of people directly engaged in HST work were working on HST data. As such, there were three times the benefits drawn from HST data.\n<\/p><p>Dr. Hanisch noted how important the national metrology institutes were to their efforts. Arising from this was, and is, the importance of reproducibility and interoperability of all of analytics comprising data science. Underpinning these very important themes in data science work is data curation. Data curation is still a major challenge to be addressed. Noted in Dr. Hanisch\u2019s presentation is the contemporary \u201ccrisis\u201d of reproducibility. At issue is to support data management from acquisition to publication, whether it occurs in business, medical, governmental or other sectors. 
The computing expert will recognize this crucial theme of data curation as associated with metadata and evolving ontologies.\n<\/p><p>On the latter, i.e., evolving ontologies, Murtagh <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-MurtaghQual18_15-0\" class=\"reference\"><a href=\"#cite_note-MurtaghQual18-15\" rel=\"external_link\">[15]<\/a><\/sup> discuss, in a broad and general context, the very important and central role of evolving ontology, research publishing, and research funding. While challenges remain to be pursued and addressed, it is important to note that astronomy and astrophysics offer interesting paradigms for open data, and, in many ways, for data curation. Certainly, further research will be carried out on data curation, as well as evolving and interacting ontologies, all of which are core issues for metrology, and hence for the very basis of all manufacturing technology, and, as described in Murtagh <i>et al.<\/i>, for research publications and research funding.\n<\/p><p>Cao<sup id=\"rdp-ebb-cite_ref-CaoData17_7-5\" class=\"reference\"><a href=\"#cite_note-CaoData17-7\" rel=\"external_link\">[7]<\/a><\/sup> also discusses \u201cthe open model and open data.\u201d From that discussion emerges the concept of multidisciplinarity, expressed here as the convergence of disciplines, which can be aided and facilitated by the openness of analytics, data management, and all data science methodologies. Keeping methodologies open allows domain experts to link up with and perhaps even, if feasible, to integrate with all that is at issue in other relevant domains. As such, a plea for openness of data science as a discipline continues to grow, particularly when viewed as a convergence of disciplines.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"4._Integration_of_data_and_analytics:_Context_of_applications\">4. 
Integration of data and analytics: Context of applications<\/span><\/h2>\n<p>The integration of data and analytics in data science has resulted in a strong need to acknowledge and address challenges and other issues with data and the underpinning or contextual reality of the data. Informally expressed, our data represents reality or the context from which the measurements arose, i.e., the data numeric values or qualitative representations. This requires data scientists to focus on quality and standards of work.\n<\/p><p>Hand<sup id=\"rdp-ebb-cite_ref-HandStat18_16-0\" class=\"reference\"><a href=\"#cite_note-HandStat18-16\" rel=\"external_link\">[16]<\/a><\/sup> contributes numerous important points relevant to the discussion here, describing the problems of data quality\u2014in the big data context\u2014relating to administrative data. He notes that data curation is relevant for reproducibility of analytics. The implications for analytics<sup id=\"rdp-ebb-cite_ref-HandStat18_16-1\" class=\"reference\"><a href=\"#cite_note-HandStat18-16\" rel=\"external_link\">[16]<\/a><\/sup>: \n<\/p>\n<blockquote>... the fact that data are often not of the highest quality has led to the development of relevant statistical methods and tools, such as detection methods based on integrity checks and on statistical properties ... However, this emphasis has often not been matched within the realm of machine learning, which places more emphasis on the final modelling stage of data analysis. This can be unfortunate: feed data into an algorithm and a number will emerge, whether or not it makes sense. However, even within the statistical community, most teaching implicitly assumes perfect data ... Challenge 1. Statistics teaching should cover data quality issues.<\/blockquote>\n<p>Our analytics should not be a \u201cblack box,\u201d a term that was informally used in regard to neural networks in earlier times. 
Rather, transparency should always be a key property of analytical methodologies.\n<\/p><p>The view offered by Anderson<sup id=\"rdp-ebb-cite_ref-AndersonTheEnd08_17-0\" class=\"reference\"><a href=\"#cite_note-AndersonTheEnd08-17\" rel=\"external_link\">[17]<\/a><\/sup>, and discussed by Murtagh<sup id=\"rdp-ebb-cite_ref-MurtaghOrigins08_18-0\" class=\"reference\"><a href=\"#cite_note-MurtaghOrigins08-18\" rel=\"external_link\">[18]<\/a><\/sup>, quoting Peter Norvig, Google\u2019s research director<sup id=\"rdp-ebb-cite_ref-MurtaghOrigins08_18-1\" class=\"reference\"><a href=\"#cite_note-MurtaghOrigins08-18\" rel=\"external_link\">[18]<\/a><\/sup>:\n<\/p>\n<blockquote>Petabytes allow us to say: \"Correlation is enough.\" We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.<\/blockquote> \n<p>However, this interesting view, inspired by contemporary search engine technology, is provocative. The author maintains that \"[c]orrelation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all.\"\n<\/p><p>It cannot be accepted that correlation supersedes causation, i.e., that analytics can be automated fully, and thereby obfuscate, or make redundant, data science as well as health and well-being analytics. Englmeier and Murtagh<sup id=\"rdp-ebb-cite_ref-EnglmeierEdit17_19-0\" class=\"reference\"><a href=\"#cite_note-EnglmeierEdit17-19\" rel=\"external_link\">[19]<\/a><\/sup> reflect similarly on the above, stating their case for comprehensive information governance, encompassing fully the contextualization of all the analytics that are being carried out. 
Murtagh and Farid<sup id=\"rdp-ebb-cite_ref-MurtaghContext17_20-0\" class=\"reference\"><a href=\"#cite_note-MurtaghContext17-20\" rel=\"external_link\">[20]<\/a><\/sup> discuss quite a good deal of the contextualization of analytics of health and well-being data. In the discussion accompanying the seminal work by Allin and Hand<sup id=\"rdp-ebb-cite_ref-AllinNew16_21-0\" class=\"reference\"><a href=\"#cite_note-AllinNew16-21\" rel=\"external_link\">[21]<\/a><\/sup> in statistical perspectives on health and well-being, the authors responded to our comments: \u201cWe agree with Murtagh that \u2018big data\u2019 may offer insights, provided that there are appropriate analytics.\u201d\n<\/p><p>It is quite relevant to note here that data science has inherent and integral involvement in the sourcing and in the origins of data, i.e., selection and measurement. Wessel<sup id=\"rdp-ebb-cite_ref-WesselYou16_22-0\" class=\"reference\"><a href=\"#cite_note-WesselYou16-22\" rel=\"external_link\">[22]<\/a><\/sup> makes this point clear while discussing software applications. This implies full integration of the analytics with what data is selected and sourced, and that may well imply what and how measurement is carried out.\n<\/p><p>The takeaway here may be the priority to be accorded to induction-based, i.e., inductive, reasoning (cf., Murtagh<sup id=\"rdp-ebb-cite_ref-MurtaghData17_3-3\" class=\"reference\"><a href=\"#cite_note-MurtaghData17-3\" rel=\"external_link\">[3]<\/a><\/sup>). This could be a minor argument for the importance of approaches that follow from data mining, unsupervised classification, latent semantic analysis, and various other themes. Clearly, however, all of one\u2019s studying and teaching, as well as one\u2019s work for companies, government agencies, and health and other authorities should and really must be properly focused on the aims and objectives. 
The latter, of course, may need, partially in any case, to be determined by the expert data scientist.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"5._Short_review_of_contemporary_data_science_in_education_and_in_employment\">5. Short review of contemporary data science in education and in employment<\/span><\/h2>\n<p>It is quite clear to all involved with many businesses and educational institutions that data science is becoming one of the most important employment prospects. A comparison of employment salaries in the U.S. has data science as having the highest median salary in 2017.<sup id=\"rdp-ebb-cite_ref-CCJobs17_23-0\" class=\"reference\"><a href=\"#cite_note-CCJobs17-23\" rel=\"external_link\">[23]<\/a><\/sup> It follows that data science and big data will certainly be studied in university courses, and these will of course be related to the prospects and potential for the students.\n<\/p><p>In this section, two themes relate to the contemporary context: higher education in data science and company employment advertisements. We turn to accessible and available data to discuss these themes. Of course, an expert data scientist is very likely to be involved in many discussions and debates with current and potential students, company executives, and many others. It can even be seen that most disciplines have to be integrated into data science, requiring pedagogical innovation in education.<sup id=\"rdp-ebb-cite_ref-DanielReimaging18_24-0\" class=\"reference\"><a href=\"#cite_note-DanielReimaging18-24\" rel=\"external_link\">[24]<\/a><\/sup>\n<\/p>\n<h3><span class=\"mw-headline\" id=\"5.1._Teaching_and_learning_for_data_science\">5.1. Teaching and learning for data science<\/span><\/h3>\n<p>Briefly considered here are current higher education post-graduate programs (usually termed MSc courses) in data science. 
This is also possibly relevant for undergraduate programs, and certainly relevant for undergraduate projects and company placements of students.\n<\/p><p>In universities worldwide, in recent years, there has been a great increase in graduate level courses in data science, and increasingly also in undergraduate level courses. Press<sup id=\"rdp-ebb-cite_ref-PRessGrad18_25-0\" class=\"reference\"><a href=\"#cite_note-PRessGrad18-25\" rel=\"external_link\">[25]<\/a><\/sup> maintains a listing of graduate courses, in some cases, but not in all, with the title \"data science.\" This listing, with links to the host institute, contains 102 MSc courses, 19 online courses, 11 free online courses, and eight for a fee, for a total of 140 graduate level courses.\n<\/p><p>The theme of having data and analytics well integrated arises again here. Consider the most essential requirements of a data scientist, as described by Englmeier and Murtagh<sup id=\"rdp-ebb-cite_ref-EnglmeierEdit17_19-1\" class=\"reference\"><a href=\"#cite_note-EnglmeierEdit17-19\" rel=\"external_link\">[19]<\/a><\/sup>, and then note the close linkage between data science and big data. Emphasized is the great need to avoid false positives coming from the data science analytics that is carried out. These arise from treating the data without fully linking and even integrating the analytics with the context, the relevance, and all that is to do with application and problem conceptualization. Englmeier and Murtagh note the well-known errors arising out of Google Flu Trends, which drew on Google search activity, and out of service usage patterns obtained from the taxi company Uber. These were outcomes that produced false positives. There must be fully comprehensive information governance, encompassing all levels of information discovery, through conceptualization that can benefit significantly if collectively undertaken.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"5.2._Employment_requirements_in_data_science\">5.2. 
Employment requirements in data science<\/span><\/h3>\n<p>Many employment opportunities are now on offer for data scientists. For example, the comprehensive review of data science by Cao<sup id=\"rdp-ebb-cite_ref-CaoData17_7-6\" class=\"reference\"><a href=\"#cite_note-CaoData17-7\" rel=\"external_link\">[7]<\/a><\/sup> has a section entitled \u201cData economy: Data industrialization and services\u201d that describes the expanding business opportunities available to budding data scientists. More examples of opportunities can be found at the increasingly popular web service DataScientistJobs: <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/datascientistjobs.co.uk\" target=\"_blank\">https:\/\/datascientistjobs.co.uk<\/a>.\n<\/p><p>The stated requirements for data science jobs vary. While one must give the fullest perspective to the companies that one works with, and the university data science courses that one teaches, what follows is both a consideration and a selection of job requirement data, and preliminary results. This preliminary study of requirements for data science roles, some of them in senior management, represents what will become expanded research over time, for the benefit of our data science students so they may be better prepared for work and association with companies both nationally and globally.\n<\/p><p>Online discussions of data science and big data have become commonplace, and they often include discussion of surveys that have been conducted. Examples include NewVantage Partners' Big Data Executive Survey<sup id=\"rdp-ebb-cite_ref-NVPBigData17_26-0\" class=\"reference\"><a href=\"#cite_note-NVPBigData17-26\" rel=\"external_link\">[26]<\/a><\/sup> of senior corporate executives, which found the dominant sector was financial services. This survey concerned internal investment and organizational matters, as well as business practices and plans. 
Another survey by Hayes<sup id=\"rdp-ebb-cite_ref-HayesEmperically16_27-0\" class=\"reference\"><a href=\"#cite_note-HayesEmperically16-27\" rel=\"external_link\">[27]<\/a><\/sup> asked questions of more than 620 data professionals in regard to skills required and at issue in data science. The author provides an interesting summary and presentation of results obtained using factor analysis.\n<\/p><p>Descriptions of new data scientist job listings from 2015 to 2017 were considered, all from the distribution list StatsJobs (sometimes indirectly, through links) in England. In most cases, specific programming languages or software environments at issue were indicated. In a few cases, the job advertisements did not explicitly list these details. Retained for use here were 72 such job descriptions. The frequent (four or more advertisements) software languages and software environments were as follows: R (50), Python (44), SQL (30), SAS (28), Hadoop (25), Matlab (17), SPSS (17), Java (16), Hive (14), Excel (9), MapReduce (9), NoSQL (8), Spark (7), C++ (6), Pig (6), Tableau (6), HBase (5), C# (4), Mahout (4), QlikView (4), and Scala (4).\n<\/p><p>To have sufficient comparability of software languages or environments, the 21 listed above were selected, each being required by at least four employers. Since some employers failed to list any software language or environment requirements, and indeed about three of the set of 72 had no explanation at all of what was required, the set of employers was reduced to 60. Thus, a cross-tabulation of 60 employers seeking to employ a data scientist was used, with a few requirements or desires for expertise in particular software languages and software environments. 
The manner of expression was most often \"one or another or another again.\"\n<\/p><p>Correspondence analysis takes the employer set, and the software set, in the dual multidimensional spaces, both endowed with the chi squared metric, and maps both clouds into a factor space endowed with the Euclidean metric. Hierarchical clustering was carried out from the full dimensionality (therefore with no loss of, or decrease in, information content) factor space. The goal was to see what associations of software are most likely to be the case from these data scientist job advertisements.\n<\/p><p>Figure 1 and Figure 2 display the clustering of the software languages or environments. It is our intention to take such a mapping much further, with supplementary elements, also termed contextual elements, to locate them in the factor space, and these would include country or location of the job, industrial sector or government agency, or global corporate firm for the job. The objective would be to determine sector or regional preferences in the skills and abilities of the data scientists employed.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Murtagh_BigDataCogComp2018_2-2.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"163115b4814f2b99528674804e4577d0\"><img alt=\"Fig1 Murtagh BigDataCogComp2018 2-2.jpg\" src=\"https:\/\/www.limswiki.org\/images\/3\/34\/Fig1_Murtagh_BigDataCogComp2018_2-2.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1:<\/b> From 60 data scientist job advertisements, non-empty from the initial set of 72 employers, with use of 21 software languages or environments. The latter were required by at least four employers. 
Displayed is the principal factor plane.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p><a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig2_Murtagh_BigDataCogComp2018_2-2.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"346bfaf9983583dd7a108d7abd1232d5\"><img alt=\"Fig2 Murtagh BigDataCogComp2018 2-2.jpg\" src=\"https:\/\/www.limswiki.org\/images\/d\/df\/Fig2_Murtagh_BigDataCogComp2018_2-2.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 2:<\/b> Hierarchical clustering, and derived three-class partition, of the 21 software languages or environments at issue here, based on the full dimensionality, Euclidean-metric endowed, factor space. See Section 3. The agglomerative criterion is the Ward minimum variance method.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h2><span class=\"mw-headline\" id=\"6._Data_science_methodology_to_address:_Selection_bias.2C_scale_and_aggregation_effects.2C_and_qualitative_evaluation_of_decision-making_impact\">6. Data science methodology to address: Selection bias, scale and aggregation effects, and qualitative evaluation of decision-making impact<\/span><\/h2>\n<p>After reviewing some of the challenges of big data analytics, involving selection bias and replacement of individual attributes with aggregated attributes (hence, the collective attributes of groups to which the individual can belong), it's time to point to innovative methodological perspectives that can not only address such issues and challenges but also benefit from the context, for example, of using big data sources. 
Murtagh<sup id=\"rdp-ebb-cite_ref-MurtaghSec_28-0\" class=\"reference\"><a href=\"#cite_note-MurtaghSec-28\" rel=\"external_link\">[28]<\/a><\/sup> uses as an example a case study involving work for a major company, with its own example of how aggregated data can be used, if required, for individual-related analysis.\n<\/p><p>Ethical as well as methodological issues arise in scale effects, representation and expression, and particular context effects. Here, both the ethical implications, and the potential for qualitatively and quantitatively evaluating impact of decision-making and policy-making are summarized.\n<\/p><p>The frequent lack of coordination, alignment, and integration of methodology, including modelling, with data sourcing, is pointed to by Hand<sup id=\"rdp-ebb-cite_ref-HandTheDangers17_29-0\" class=\"reference\"><a href=\"#cite_note-HandTheDangers17-29\" rel=\"external_link\">[29]<\/a><\/sup>, noting that \u201cignorance of selection mechanisms has led to mistakes,\u201d and that selection or distortion processes can apply to not only \u201chuman interactions\u2014where it has been suggested that the notion that \u2018data=all\u2019 can replace the need for careful theorising and statistical modelling\u2014but also in the hard sciences and medicine.\u201d\n<\/p><p>Keiding and Louis<sup id=\"rdp-ebb-cite_ref-KeidingPerils16_30-0\" class=\"reference\"><a href=\"#cite_note-KeidingPerils16-30\" rel=\"external_link\">[30]<\/a><\/sup> point out how one case study discussed \u201cshows the value of using \u2018big data\u2019 to conduct research on surveys (as distinct from survey research).\u201d Limitations though are clear: \u201cAlthough randomization in some form is very beneficial, it is by no means a panacea. Trial participants are commonly very different from the external ... 
pool, in part because of self-selection ...\u201d\n<\/p><p>The authors address these contemporary issues by noting \u201c[w]hen informing policy, inference to identified reference populations is key.\u201d<sup id=\"rdp-ebb-cite_ref-KeidingPerils16_30-1\" class=\"reference\"><a href=\"#cite_note-KeidingPerils16-30\" rel=\"external_link\">[30]<\/a><\/sup> This is part of the bridge that is needed between data analytics technology and deployment of outcomes.\n<\/p><p>They also warn of additional caveats: \u201cIn all situations, modelling is needed to accommodate non-response, dropouts and other forms of missing data.\u201d<sup id=\"rdp-ebb-cite_ref-KeidingPerils16_30-2\" class=\"reference\"><a href=\"#cite_note-KeidingPerils16-30\" rel=\"external_link\">[30]<\/a><\/sup> Noting that \u201c[r]epresentativity should be avoided,\u201d here is a fundamental way to address what we need to address: \u201cAssessment of external validity, i.e. generalization to the population from which the study subjects originated or to other populations, will in principle proceed via formulation of abstract laws of nature similar to physical laws.\u201d<sup id=\"rdp-ebb-cite_ref-KeidingPerils16_30-3\" class=\"reference\"><a href=\"#cite_note-KeidingPerils16-30\" rel=\"external_link\">[30]<\/a><\/sup> When considering Keiding and Louis' words, it is worth noting how\u2014related to eminent social scientist Pierre Bourdieu\u2019s work\u2014homology between fields of study offers clear perspectives on how beneficial innovative practice can be pursued.\n<\/p><p>This incorporates our need to \u201crehabilitate the individual\u201d in our analytics, and not simply replace the individual by the mean of some group. 
Many case studies of the latter are provided by eminent mathematical data scientist Cathy O\u2019Neil.<sup id=\"rdp-ebb-cite_ref-O.27NeilWeapons16_31-0\" class=\"reference\"><a href=\"#cite_note-O.27NeilWeapons16-31\" rel=\"external_link\">[31]<\/a><\/sup> Le Roux and Lebaron<sup id=\"rdp-ebb-cite_ref-LeRouxLeMeth15_32-0\" class=\"reference\"><a href=\"#cite_note-LeRouxLeMeth15-32\" rel=\"external_link\">[32]<\/a><\/sup> have a similar sentiment: \u201cRehabilitation of individuals. The context model is always formulated at the individual level, being opposed therefore to modelling at an aggregate level for which the individuals are only an \u2018error term\u2019 of the model.\u201d\n<\/p><p>Calibrating surveys and other data sources, through use of big data, has been at issue in addressing challenges and obstacles described in Keiding and Louis.<sup id=\"rdp-ebb-cite_ref-KeidingPerils16_30-4\" class=\"reference\"><a href=\"#cite_note-KeidingPerils16-30\" rel=\"external_link\">[30]<\/a><\/sup> In regard to decision-making and policy-making, the analysis of discourse in a data-driven way can provide relevant or necessary contextualization. Without having such an approach, a limited capability on the part of those in authority emerges: \u201ctop-down communication campaigns both predominate and are advised by those involved in social marketing ... 
However, this rarely manifests itself through measurable behaviour change ...\u201d<sup id=\"rdp-ebb-cite_ref-MurtaghTracking16_33-0\" class=\"reference\"><a href=\"#cite_note-MurtaghTracking16-33\" rel=\"external_link\">[33]<\/a><\/sup>\n<\/p><p>Instead, mediated by the latent semantic mapping of the discourse, we develop semantic distance measures between deliberative actions and the aggregate social effect.<sup id=\"rdp-ebb-cite_ref-MurtaghTracking16_33-1\" class=\"reference\"><a href=\"#cite_note-MurtaghTracking16-33\" rel=\"external_link\">[33]<\/a><\/sup> We let the data speak in regard to influence, impact, and reach. Impact is defined in terms of semantic distance between the initiating action and the net aggregate outcome. This can be statistically tested. It can be visualized, and then further evaluated.\n<\/p><p>For research and for all engagement in data science, it is motivational to both address and have significant achievements in regard to innovative methodology.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"7._Benefits_of_high_profiling_of_data_science\">7. Benefits of high profiling of data science<\/span><\/h2>\n<p>Many blog postings declare \u201cbig data is dead.\u201d (A Google query of the phrase, dated 2017-12-29, lists 153,000 results.) At issue is just this: complete priority is to be given to the problems to be solved and the challenges to be addressed. 
Cao's extensive and outstanding detailing of many aspects of data science<sup id=\"rdp-ebb-cite_ref-CaoData17_7-7\" class=\"reference\"><a href=\"#cite_note-CaoData17-7\" rel=\"external_link\">[7]<\/a><\/sup> acknowledges that there is much that is still currently \u201ctremendous hype and buzz,\u201d and \u201cengendering enormous hype and even bewilderment.\u201d There is this perspective, too, which can be a viewpoint if the sole aim were for data science to automate data analytics in all domains of application: counterposed to advanced analytics, \u201cdummy analytics is becoming the default setting of management and operational systems.\u201d<sup id=\"rdp-ebb-cite_ref-CaoData17_7-8\" class=\"reference\"><a href=\"#cite_note-CaoData17-7\" rel=\"external_link\">[7]<\/a><\/sup>\n<\/p><p>Fully in line with the context of those perspectives, a major theme of this article is that the convergence of disciplines in the data science framework builds on cooperative and collaborative expertise, and thus does not seek to replace or supplant such expertise. A major conclusion is not to replace current disciplines (mathematics, statistics, computing, engineering, physics and chemistry, arts and humanities, social and psychological sciences, and so on) but\u2014where relevant and where appropriate, and also where motivated and where justified\u2014to re-orientate and to bridge primary as well as foundational levels of disciplines.\n<\/p><p>In somewhat humorous fashion, in the sense of revolution versus evolution, let the following be noted. At the 61st World Statistics Congress, in July 2017, in Marrakech, Morocco, there was a session organized jointly by the High Commission for Planning (HCP) of Morocco and the Ministry of Development Planning and Statistics (MDPS) of Qatar. 
This session was entitled \u201cThe Data Revolution for the Sustainable Development Goals.\u201d One comment raised in the question and answer session was a request for evolution to be at issue rather than revolution. At the same time, it's interesting to note how there is an important advisory group in the United Nations, called the Data Revolution Group (see <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.undatarevolution.org\" target=\"_blank\">http:\/\/www.undatarevolution.org<\/a>) which seeks \u201c[m]obilising the data revolution for sustainable development.\u201d For data science, it is clear that there is great inspiration here. Some other organizational initiatives will now be mentioned. This is to complement a great deal that is being done already by major organizations in statistics, classification and data research, engineering, and explicitly in data science.\n<\/p><p>In European research funding, i.e., Horizon 2020, an important supported project is that of the European Data Science Academy or EDSA (<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/edsa-project.eu\" target=\"_blank\">http:\/\/edsa-project.eu<\/a>), which dates back to 2015. There could well be an important role for such an organization in the future, in regard to sponsoring fellowship levels of organizational memberships, and it would be interesting to promote chartered membership. In the European Commission context, dating from July 2014, we find the \u201cbest practice guidelines for public authorities and open data\u201d under the scope of governments embracing the \u201cpotential of big data.\u201d<sup id=\"rdp-ebb-cite_ref-ECCommission14_34-0\" class=\"reference\"><a href=\"#cite_note-ECCommission14-34\" rel=\"external_link\">[34]<\/a><\/sup>\n<\/p><p>At the U.K. national level, an important initiative, directly or indirectly related to much that was under discussion in this article (in Section 3, in particular) is open data. 
The Open Data Institute (see <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/theodi.org\" target=\"_blank\">https:\/\/theodi.org<\/a>) in the U.K. was founded in 2012 by Sir Tim Berners-Lee and Sir Nigel Shadbolt. In welcoming membership applications, there is this: \u201cMembership: Join the data revolution.\u201d There is this prominent statement too: \u201cData is changing our world.\u201d\n<\/p><p>In a practical sense, focused on data to begin with and entirely relevant for data curation now and in the future, we find in comparison the Research Data Alliance or RDA (see <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.rd-alliance.org\" target=\"_blank\">https:\/\/www.rd-alliance.org<\/a>). RDA is supported by the E.U., by the NSF (National Science Foundation) and NIST (National Institute of Standards and Technology) in the U.S., by the JISC (Joint Information Systems Committee) and other agencies in the U.K., and by Australia and Japan.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"8._Important_new_research_challenges_from_data\">8. Important new research challenges from data<\/span><\/h2>\n<p>This and the following section engage with major new developments, for problem solving, and for data science and big data analytics, with the partial or complete integration of relevant sciences and technologies, and methodologies, in observed and empirical contexts.\n<\/p><p>Data science\u2014integrating potentially all application domains, with mathematical foundations for methodology as befits observational science, and integrated observational and experimental science\u2014fully relates data to all that is accomplished and achieved from the data sources. This results in the great importance of the contemporary increasing orientation towards, and requirement for, open data. 
Mahabal <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-MahabalFromSky17_2-6\" class=\"reference\"><a href=\"#cite_note-MahabalFromSky17-2\" rel=\"external_link\">[2]<\/a><\/sup> offer a good explanation of this development in data science, and of the potential here for application transfers, in parallel with methodology transfers.\n<\/p><p>The Open Universe initiative (<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.openuniverse.asi.it\" target=\"_blank\">http:\/\/www.openuniverse.asi.it<\/a>) was established by the United Nations.<sup id=\"rdp-ebb-cite_ref-CPUOUOpen16_35-0\" class=\"reference\"><a href=\"#cite_note-CPUOUOpen16-35\" rel=\"external_link\">[35]<\/a><\/sup>\n<\/p><p>The initiative stated that \"acknowledging that open data access drives innovation and productivity is a well-established principle in every scientific discipline. However, there is still a considerable degree of unevenness in the services currently offered by providers of data ...\" Among six objectives is the goal of \"advancing calibration quality and statistical integrity,\u201d with outcomes for education, globally, and private sector involvement. Here, and through transference to each and all domains for data science, what is required for open data and all associated open information, is that it must be findable, accessible, interoperable, and reusable (the FAIR Principles), as well as reproducible.<sup id=\"rdp-ebb-cite_ref-WilkinsonTheFAIR16_36-0\" class=\"reference\"><a href=\"#cite_note-WilkinsonTheFAIR16-36\" rel=\"external_link\">[36]<\/a><\/sup>\n<\/p><p>Supporting the FAIR principles is ESASKY (European Space Agency, Sky, accessible from <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/sci.esa.int\/home\" target=\"_blank\">http:\/\/sci.esa.int\/home<\/a>), described as \"a discovery portal that provides full access to the entire sky. 
This open-science application allows computer, tablet and mobile users to visualise cosmic objects near and far across the electromagnetic spectrum.\u201d The interesting new research challenges in data science can be stated as relating foremost to the transfer, to many domains, of FAIR-based open-science discovery portals.\n<\/p><p>An important application domain in this regard will be emerging smart technologies, which encompass smart homes, smart cities, smart environments in general, and the internet of things. An important \"situation theory\" methodology, in an information space that is mathematically based, furnishing a comprehensive representational system, is proposed by Devlin.<sup id=\"rdp-ebb-cite_ref-DevlinLogic91_37-0\" class=\"reference\"><a href=\"#cite_note-DevlinLogic91-37\" rel=\"external_link\">[37]<\/a><\/sup> Associated with this are the social, legal, and economic aspects of emerging smart technologies in real-life applications.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"9._Information_space_theory_for_big_data_analytics_in_internet_of_things_and_smart_environments\">9. Information space theory for big data analytics in internet of things and smart environments<\/span><\/h2>\n<p>Context is extremely important in big data analytics, and in many other domains.<sup id=\"rdp-ebb-cite_ref-MurtaghContext17_20-1\" class=\"reference\"><a href=\"#cite_note-MurtaghContext17-20\" rel=\"external_link\">[20]<\/a><\/sup> Situation theory provides humans (generally, trained domain experts) with powerful, flexible representations that enable them to perform better, both as analysts and decision makers. Systems such as the one outlined in Figure 3 for the U.S. 
Army (sourced from Devlin<sup id=\"rdp-ebb-cite_ref-DevlinAUniform11_38-0\" class=\"reference\"><a href=\"#cite_note-DevlinAUniform11-38\" rel=\"external_link\">[38]<\/a><\/sup>) have a software back-end, possibly including artificial intelligence (AI), but they are in no way \u201ccalculators\u201d or expert systems for making decisions. What was done was to harness the power of mathematics primarily as a representational system, rather than for its computational capacity. While the back-end software can manipulate the network\u2014each completion diagram is a structurally identical piece of code\u2014perhaps permitting the eventual application of familiar-looking network-optimization algorithms, many of those completion diagrams represent inherently human thoughts, intentions, and actions, and, for the foreseeable future, the human mind remains the best tool to handle them. This work for the United States Army used situation theory to develop a first-iteration specification for a workstation to be used by a field commander, in both mission planning and real-time control. This work takes into account the many different ontologies in a modern battlefield. 
The role of ontologies is very central in qualitative analysis of research, cf., Darabi<sup id=\"rdp-ebb-cite_ref-DarabiTheUK17_14-1\" class=\"reference\"><a href=\"#cite_note-DarabiTheUK17-14\" rel=\"external_link\">[14]<\/a><\/sup>.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig3_Murtagh_BigDataCogComp2018_2-2.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"6779d9263618b7ee7296c7a548f94506\"><img alt=\"Fig3 Murtagh BigDataCogComp2018 2-2.jpg\" src=\"https:\/\/www.limswiki.org\/images\/7\/77\/Fig3_Murtagh_BigDataCogComp2018_2-2.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 3:<\/b> The completion-diagram network is complex. In the screenshot, the aerial view (taken from a previous mission used in training) is of an urban battlefield. 
The back-end system links the elements in each completion diagram to a corresponding feature in the aerial view, permitting the user to work fluidly with the two representations, having the benefit of two very different views, one spatial, the other human-structural, so the user can explore the domain from (literally) different perspectives.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"9.1_Context.2C_situation_theory.2C_and_completion_diagrams\">9.1 Context, situation theory, and completion diagrams<\/span><\/h3>\n<p>In the early 1980s, a group of researchers at or connected to Stanford University started to develop an analogous mathematically-based representation of communicating humans, looking deeper than the mere fact of communication (captured by the network model used by the telecommunication engineers) to take account of what was being communicated. (Part of the challenge was to decide how far it is possible to go into categorizing that \u201cwhat\u201d in order to achieve a representation that is useful in analyzing communication and designing communication-based activities such as work.) That approach is generally referred to as situation theory. 
Devlin, one of those early pioneers, wrote a theoretical book on the subject, <i>Logic and Information<\/i>.<sup id=\"rdp-ebb-cite_ref-DevlinLogic91_37-1\" class=\"reference\"><a href=\"#cite_note-DevlinLogic91-37\" rel=\"external_link\">[37]<\/a><\/sup> Subsequently, the techniques developed by the Stanford group were applied by Devlin and Rosenberg<sup id=\"rdp-ebb-cite_ref-DevlinAnal96_39-0\" class=\"reference\"><a href=\"#cite_note-DevlinAnal96-39\" rel=\"external_link\">[39]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-DevlinInfo08_40-0\" class=\"reference\"><a href=\"#cite_note-DevlinInfo08-40\" rel=\"external_link\">[40]<\/a><\/sup> to solve an actual problem involving communication in the workplace.\n<\/p><p>The representation<sup id=\"rdp-ebb-cite_ref-DevlinInfo08_40-1\" class=\"reference\"><a href=\"#cite_note-DevlinInfo08-40\" rel=\"external_link\">[40]<\/a><\/sup> used was (of necessity) similar to that used by telecommunication engineers, Google, the postal system, UPS, and FedEx in that the domain is represented by a network. However, whereas those earlier examples had networks of point nodes, the nodes in this network were more complicated objects, which were termed \u201ccompletion diagrams.\u201d See the right-hand side of Figure 3, where \u201csituation s1\u201d results in \u201ctype T1\u201d, and \u201csituation s2\u201d results in \u201ctype T2\u201d, so that transition from \u201csituation s1\u201d to \u201csituation s2\u201d has the related association between \u201ctype T1\u201d and \u201ctype T2\u201d. As to the exact nature of the entities in such a completion diagram, they can be considered as capturing the key elements of a basic human act, here military and managerial action, including a communicative act. Much of <i>Logic and Information<\/i> is devoted to the development and explication of such a completion diagram. 
It has its origins in work by Barwise and Perry.<sup id=\"rdp-ebb-cite_ref-BarwiseSituations99_41-0\" class=\"reference\"><a href=\"#cite_note-BarwiseSituations99-41\" rel=\"external_link\">[41]<\/a><\/sup>\n<\/p><p>Information space theory is a vehicle for using a big data approach to underpin the study of interaction and communication in smart environments (e.g., cities, workplaces, and homes). The theory is intended to provide the focus for building an inter-disciplinary community concerned with social and technological issues associated with recent technological advances. Relevant emerging research and innovation disciplines include the internet of things, internet of everything, and big data analytics, among others, that contribute to the design, development, and effective implementation of smart environments in real life.\n<\/p><p>Research projects related to both \u201cinformation space theory\u201d and \u201cinteraction space theory\u201d include SANE, \u201cSustainable Accommodation for the New Economy\u201d, a European Framework 5 research project with very innovative aims and outcomes for research and for industrial companies. It is described as \"a multi-disciplinary and multi-cultural R&amp;D project that takes a location independent approach to the design of a sustainable workplace to ensure compatibility between fixed and mobile, local and remote work areas\" and one that will specify, prototype, and develop a set of ICT tools with \"emphasis being on the innovative application of emerging technologies and services.\"<sup id=\"rdp-ebb-cite_ref-CORDIS-SANE00_42-0\" class=\"reference\"><a href=\"#cite_note-CORDIS-SANE00-42\" rel=\"external_link\">[42]<\/a><\/sup> Another project involving universities in the U.K. and in Germany was IS-VIT, \u201cInteraction Space of the Virtual IT Workplace\u201d. 
Related outcomes of these projects are described by Rosenberg <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-RosenbergInteraction05_43-0\" class=\"reference\"><a href=\"#cite_note-RosenbergInteraction05-43\" rel=\"external_link\">[43]<\/a><\/sup> and Walkowski <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-WalkowskiUsing11_44-0\" class=\"reference\"><a href=\"#cite_note-WalkowskiUsing11-44\" rel=\"external_link\">[44]<\/a><\/sup>\n<\/p><p>Information space theory takes into account the following: (i) People who inhabit smart environments and spontaneously generate data and information in the course of their day-to-day activities; (ii) Place which can be public (smart cities), privileged (workplaces) or private (homes) with varying degrees of privacy and security constraints that shape information sharing; and (iii) Patterns of interaction between people and technology that is an integral part of smart environments and influences human\u2013human, human\u2013device and device\u2013device interaction.\n<\/p><p>A summary follows of inter-disciplinary information space theory and its application in smart environments: (i) an introduction to studies of information, data and interaction; (ii) big data analytics as a tool for the development of information space theory; (iii) information space theory and its impact on the design of smart environments; (iv) information space and human communication research, involving an account of the evolution of smart interaction systems; (v) further refinement of information space theory informed by cross-disciplinary perspectives and requirements of application in smart systems and emerging technologies, including contribution to the application of big data analytics in real-life smart environments; and (vi) the concept of information space as a distinct feature of human context that makes it possible for people to achieve coordination and reciprocity of perspectives through smart interaction systems that safeguard their privacy and 
security.\n<\/p><p>Such work builds on the work of an inter-disciplinary group of researchers within mathematics, computer, and social sciences who are attempting to address key research questions: How do emerging smart technologies influence information sharing in interaction between people and technology in smart environments? What are the social, legal, and economic impacts of emerging smart technologies in real-life application?\n<\/p><p>To this end, the concept of information space will guide the investigation into interactions that occur within smart environments, taking account of human\u2013human, human\u2013device, and device\u2013device interaction in a uniform framework. Special attention is given to information sharing\u2014pathways, enablers and gatekeepers\u2014to incorporate security and privacy concerns that urgently need to be addressed in order to optimize the technology potential in real-life applications of smart environments. The working assumption behind this approach is that inter-disciplinary, formal, and theoretical understanding of the nature of these interactions is essential for these concerns to be addressed and resolved.\n<\/p><p>In this context, mathematics plays a crucial role in developing and using a mathematically-based representation framework for the analysis and design of work in the era of the internet of things. Both in life and in scientific studies, what we can achieve depends on, and is constrained by, the representational system we use. The greater the complexity of the domain, the more significant is the representation at our disposal\u2014representations are what make it possible for us to understand and reason about the world. 
For instance, trade, commerce, and financial activity in Europe were revolutionized by the introduction of the Hindu-Arabic, decimal arithmetic system (\u201cmodern arithmetic\u201d) in the thirteenth century, which made it possible for anyone to become proficient in arithmetic after just a few weeks\u2019 practice. A similar revolution occurred in the 1980s, when the introduction of the modern, windows\u2013icons\u2013mouse interface for personal computers made it possible for ordinary people to use what had until then been a tool for trained experts. Long before those two examples, the introduction of numbers themselves, in the form of a monetary system, transformed human life by providing a simple, quantitative representation system for property ownership and social indebtedness.\n<\/p><p>The rise of natural science involved a new representation system that assigned numerical values to various features of the environment (features given names such as length, area, volume, mass, temperature, momentum, etc.) and shifted the focus from trying to understand why things occurred to simply measuring how one quantified feature varied with another\u2014an approach that proved to be extremely fruitful for society. The representation systems of the natural sciences have all been based on mathematics to a considerable extent. In the social realm, mathematically-based representation systems are less common, but when they have been developed, they have proved to be extremely powerful. (Money is a particularly dramatic example.) Indeed, one of the most widespread applications of mathematics in today\u2019s world is the optimization of various human activities. Computer queries depend on optimization in a mathematical space that treats every living human as a node in a simple mathematical structure called a graph. 
\u201cModelling\u201d a person as a point node in a mathematical network omits all information about a person save for one factor: the connections of that human to all other humans. However, for questions that hinge on that one factor, the representation enables mathematical algorithms to be applied that provide society with one of its most important tools.\n<\/p><p>Another example is provided by the algorithms that route our telephone calls, our internet communication, our mail and package delivery systems, and our transportation systems. In those cases, whereas a search engine like Google represents the human domain as a two-dimensional network of nodes and edges, the domains of communicating devices such as phones or computers, of letters and packages in shipment, and of travelers are represented as high-dimensional \u201cpolytopes,\u201d generalizations of the familiar polygons of high school geometry to higher dimensions, to which mathematical methods such as the Simplex Method or Karmarkar\u2019s algorithm can be applied to determine optimal routing. These representations work by ignoring almost everything about the entities in the domain apart from the one or two features that are germane to the task. The result is that the power of mathematics can be brought to bear on a problem that, on the face of it, is part of the complex web of human activity that defies the methods of science in terms of its complexity and (local) unpredictability.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"10._Conclusions\">10. 
Conclusions<\/span><\/h2>\n<p>Having indicated a few highly important and relatively recent organizational initiatives, we conclude that data science (viewed as the convergence of disciplines or, in practice, sub-disciplines) should incorporate open methodology, open data, transparency, reproducibility, and interoperability.\n<\/p><p>This article has sought to form a foundation for further study of the specific content of data science education and training, and of its importance to the business sector. After all, progress and impact ensure development and evolution over time. As also noted above, we may, if we wish, refer to the contemporary data revolution.\n<\/p><p>Both the challenges and the potential for impact are prominent in our rapidly growing discipline of data science, and it is good to see them so. There are also important directions (in new research challenges and in the application of information space theory) both to follow and to incorporate in other domains.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Author_contributions\">Author contributions<\/span><\/h3>\n<p>Section 9 was written by K.D. and the other sections by F.M. All draw on our extensive research work, teaching, and some consultancy.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Conflicts_of_interest\">Conflicts of interest<\/span><\/h3>\n<p>The authors declare no conflict of interest.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-HenkeTheAge16-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HenkeTheAge16_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Henke, N.; Bughin, J.; Chui, M. 
et al.&#32;(December 2016).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.mckinsey.com\/business-functions\/mckinsey-analytics\/our-insights\/the-age-of-analytics-competing-in-a-data-driven-world\" target=\"_blank\">\"The age of analytics: Competing in a data-driven world\"<\/a>.&#32;McKinsey &amp; Company.&#32;pp. 136<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.mckinsey.com\/business-functions\/mckinsey-analytics\/our-insights\/the-age-of-analytics-competing-in-a-data-driven-world\" target=\"_blank\">https:\/\/www.mckinsey.com\/business-functions\/mckinsey-analytics\/our-insights\/the-age-of-analytics-competing-in-a-data-driven-world<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 18 June 2018<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=The+age+of+analytics%3A+Competing+in+a+data-driven+world&amp;rft.atitle=&amp;rft.aulast=Henke%2C+N.%3B+Bughin%2C+J.%3B+Chui%2C+M.+et+al.&amp;rft.au=Henke%2C+N.%3B+Bughin%2C+J.%3B+Chui%2C+M.+et+al.&amp;rft.date=December+2016&amp;rft.pages=pp.+136&amp;rft.pub=McKinsey+%26+Company&amp;rft_id=https%3A%2F%2Fwww.mckinsey.com%2Fbusiness-functions%2Fmckinsey-analytics%2Four-insights%2Fthe-age-of-analytics-competing-in-a-data-driven-world&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MahabalFromSky17-2\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-MahabalFromSky17_2-0\" rel=\"external_link\">2.0<\/a><\/sup> <sup><a href=\"#cite_ref-MahabalFromSky17_2-1\" rel=\"external_link\">2.1<\/a><\/sup> <sup><a href=\"#cite_ref-MahabalFromSky17_2-2\" rel=\"external_link\">2.2<\/a><\/sup> <sup><a 
href=\"#cite_ref-MahabalFromSky17_2-3\" rel=\"external_link\">2.3<\/a><\/sup> <sup><a href=\"#cite_ref-MahabalFromSky17_2-4\" rel=\"external_link\">2.4<\/a><\/sup> <sup><a href=\"#cite_ref-MahabalFromSky17_2-5\" rel=\"external_link\">2.5<\/a><\/sup> <sup><a href=\"#cite_ref-MahabalFromSky17_2-6\" rel=\"external_link\">2.6<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Mahabal, A.A.; Crichton, D.; Djorgovski, S.G. et al.&#32;(2017).&#32;\"From Sky to Earth: Data Science Methodology Transfer\".&#32;<i>Proceedings of the International Astronomical Union<\/i>: 1\u201310.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1017%2FS1743921317000060\" target=\"_blank\">10.1017\/S1743921317000060<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=From+Sky+to+Earth%3A+Data+Science+Methodology+Transfer&amp;rft.jtitle=Proceedings+of+the+International+Astronomical+Union&amp;rft.aulast=Mahabal%2C+A.A.%3B+Crichton%2C+D.%3B+Djorgovski%2C+S.G.+et+al.&amp;rft.au=Mahabal%2C+A.A.%3B+Crichton%2C+D.%3B+Djorgovski%2C+S.G.+et+al.&amp;rft.date=2017&amp;rft.pages=1%E2%80%9310&amp;rft_id=info:doi\/10.1017%2FS1743921317000060&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MurtaghData17-3\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-MurtaghData17_3-0\" rel=\"external_link\">3.0<\/a><\/sup> <sup><a href=\"#cite_ref-MurtaghData17_3-1\" rel=\"external_link\">3.1<\/a><\/sup> <sup><a href=\"#cite_ref-MurtaghData17_3-2\" 
rel=\"external_link\">3.2<\/a><\/sup> <sup><a href=\"#cite_ref-MurtaghData17_3-3\" rel=\"external_link\">3.3<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation book\">Murtagh, F.&#32;(2017).&#32;<i>Data Science Foundations: Geometry and Topology of Complex Hierarchic Systems and Big Data Analytics<\/i>.&#32;CRC Press.&#32;pp.&#160;206.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9781498763936.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Data+Science+Foundations%3A+Geometry+and+Topology+of+Complex+Hierarchic+Systems+and+Big+Data+Analytics&amp;rft.aulast=Murtagh%2C+F.&amp;rft.au=Murtagh%2C+F.&amp;rft.date=2017&amp;rft.pages=pp.%26nbsp%3B206&amp;rft.pub=CRC+Press&amp;rft.isbn=9781498763936&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HayashiWhatIs98-4\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-HayashiWhatIs98_4-0\" rel=\"external_link\">4.0<\/a><\/sup> <sup><a href=\"#cite_ref-HayashiWhatIs98_4-1\" rel=\"external_link\">4.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation book\">Hayashi, C.&#32;(1998).&#32;\"What is Data Science? Fundamental concepts and a heuristic example\".&#32;In&#32;Hayashi, C.; Yajima, K.; Bock H.H. 
et al..&#32;<i>Data Science, Classification, and Related Methods<\/i>.&#32;Springer.&#32;pp.&#160;40\u201351.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9784431702085.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=What+is+Data+Science%3F+Fundamental+concepts+and+a+heuristic+example&amp;rft.atitle=Data+Science%2C+Classification%2C+and+Related+Methods&amp;rft.aulast=Hayashi%2C+C.&amp;rft.au=Hayashi%2C+C.&amp;rft.date=1998&amp;rft.pages=pp.%26nbsp%3B40%E2%80%9351&amp;rft.pub=Springer&amp;rft.isbn=9784431702085&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-OhsumiFromData00-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-OhsumiFromData00_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Ohsumi, N.&#32;(2000).&#32;\"From data analysis to data science\".&#32;In&#32;Kiers, H.A.L.; Rasson, J.-P.; Groenen, P.J.F. 
et al..&#32;<i>Data Science, Classification, and Related Methods<\/i>.&#32;Springer.&#32;pp.&#160;329\u201334.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9783540675211.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=From+data+analysis+to+data+science&amp;rft.atitle=Data+Science%2C+Classification%2C+and+Related+Methods&amp;rft.aulast=Ohsumi%2C+N.&amp;rft.au=Ohsumi%2C+N.&amp;rft.date=2000&amp;rft.pages=pp.%26nbsp%3B329%E2%80%9334&amp;rft.pub=Springer&amp;rft.isbn=9783540675211&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-EscoufierDataSci95-6\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-EscoufierDataSci95_6-0\" rel=\"external_link\">6.0<\/a><\/sup> <sup><a href=\"#cite_ref-EscoufierDataSci95_6-1\" rel=\"external_link\">6.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation book\">Escoufier, Y.; Fichet, B.; Lebart, L. 
et al., ed.&#32;(1995).&#32;<i>Data Science and Its Applications<\/i>.&#32;Academic Press.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Data+Science+and+Its+Applications&amp;rft.date=1995&amp;rft.pub=Academic+Press&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CaoData17-7\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-CaoData17_7-0\" rel=\"external_link\">7.0<\/a><\/sup> <sup><a href=\"#cite_ref-CaoData17_7-1\" rel=\"external_link\">7.1<\/a><\/sup> <sup><a href=\"#cite_ref-CaoData17_7-2\" rel=\"external_link\">7.2<\/a><\/sup> <sup><a href=\"#cite_ref-CaoData17_7-3\" rel=\"external_link\">7.3<\/a><\/sup> <sup><a href=\"#cite_ref-CaoData17_7-4\" rel=\"external_link\">7.4<\/a><\/sup> <sup><a href=\"#cite_ref-CaoData17_7-5\" rel=\"external_link\">7.5<\/a><\/sup> <sup><a href=\"#cite_ref-CaoData17_7-6\" rel=\"external_link\">7.6<\/a><\/sup> <sup><a href=\"#cite_ref-CaoData17_7-7\" rel=\"external_link\">7.7<\/a><\/sup> <sup><a href=\"#cite_ref-CaoData17_7-8\" rel=\"external_link\">7.8<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Cao, L.&#32;(2017).&#32;\"Data Science: A Comprehensive Overview\".&#32;<i>ACM Computing Surveys<\/i>&#32;<b>50<\/b>&#32;(3): 43.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1145%2F3076253\" target=\"_blank\">10.1145\/3076253<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Data+Science%3A+A+Comprehensive+Overview&amp;rft.jtitle=ACM+Computing+Surveys&amp;rft.aulast=Cao%2C+L.&amp;rft.au=Cao%2C+L.&amp;rft.date=2017&amp;rft.volume=50&amp;rft.issue=3&amp;rft.pages=43&amp;rft_id=info:doi\/10.1145%2F3076253&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-UenoAsThe17-8\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-UenoAsThe17_8-0\" rel=\"external_link\">8.0<\/a><\/sup> <sup><a href=\"#cite_ref-UenoAsThe17_8-1\" rel=\"external_link\">8.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ueno, M.&#32;(2017).&#32;\"As the oldest journal of data science\".&#32;<i>Behaviormetrika<\/i>&#32;<b>44<\/b>&#32;(1): 1\u20132.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs41237-016-0011-7\" target=\"_blank\">10.1007\/s41237-016-0011-7<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=As+the+oldest+journal+of+data+science&amp;rft.jtitle=Behaviormetrika&amp;rft.aulast=Ueno%2C+M.&amp;rft.au=Ueno%2C+M.&amp;rft.date=2017&amp;rft.volume=44&amp;rft.issue=1&amp;rft.pages=1%E2%80%932&amp;rft_id=info:doi\/10.1007%2Fs41237-016-0011-7&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: 
none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-EnglmeierData17-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-EnglmeierData17_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Englmeier, K.; Murtagh, F.&#32;(2017).&#32;\"Data scientist - Manager of the discovery lifecycle\".&#32;<i>Proceedings of the 6th International Conference on Data Science, Technology and Applications<\/i>: 133\u2013140.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.5220%2F0006393801330140\" target=\"_blank\">10.5220\/0006393801330140<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Data+scientist+-+Manager+of+the+discovery+lifecycle&amp;rft.jtitle=Proceedings+of+the+6th+International+Conference+on+Data+Science%2C+Technology+and+Applications&amp;rft.aulast=Englmeier%2C+K.%3B+Murtagh%2C+F.&amp;rft.au=Englmeier%2C+K.%3B+Murtagh%2C+F.&amp;rft.date=2017&amp;rft.pages=133%E2%80%93140&amp;rft_id=info:doi\/10.5220%2F0006393801330140&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MurtaghData17-8-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MurtaghData17-8_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Murtagh, F.&#32;(2017).&#32;\"Chapter 8: Geometry and Topology of Matte Blanco's Bi-Logic in Psychoanalytics\".&#32;<i>Data Science Foundations: Geometry and Topology of Complex Hierarchic Systems and Big Data Analytics<\/i>.&#32;CRC 
Press.&#32;pp.&#160;147\u201362.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9781498763936.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Chapter+8%3A+Geometry+and+Topology+of+Matte+Blanco%27s+Bi-Logic+in+Psychoanalytics&amp;rft.atitle=Data+Science+Foundations%3A+Geometry+and+Topology+of+Complex+Hierarchic+Systems+and+Big+Data+Analytics&amp;rft.aulast=Murtagh%2C+F.&amp;rft.au=Murtagh%2C+F.&amp;rft.date=2017&amp;rft.pages=pp.%26nbsp%3B147%E2%80%9362&amp;rft.pub=CRC+Press&amp;rft.isbn=9781498763936&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CoombsATheory64-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CoombsATheory64_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Coombs, C.H.&#32;(1964).&#32;<i>A Theory of Data<\/i>.&#32;Wiley.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=A+Theory+of+Data&amp;rft.aulast=Coombs%2C+C.H.&amp;rft.au=Coombs%2C+C.H.&amp;rft.date=1964&amp;rft.pub=Wiley&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-JapecAAPOR15-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-JapecAAPOR15_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Japec, L.; Kreuter, 
F.; Berg, M. et al.&#32;(12 February 2015).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.aapor.org\/Education-Resources\/Reports\/Big-Data.aspx\" target=\"_blank\">\"AAPOR Report: Big Data\"<\/a>.&#32;AAPOR<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.aapor.org\/Education-Resources\/Reports\/Big-Data.aspx\" target=\"_blank\">https:\/\/www.aapor.org\/Education-Resources\/Reports\/Big-Data.aspx<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 18 June 2018<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=AAPOR+Report%3A+Big+Data&amp;rft.atitle=&amp;rft.aulast=Japec%2C+L.%3B+Kreuter%2C+F.%3B+Berg%2C+M.+et+al.&amp;rft.au=Japec%2C+L.%3B+Kreuter%2C+F.%3B+Berg%2C+M.+et+al.&amp;rft.date=12+February+2015&amp;rft.pub=AAPOR&amp;rft_id=https%3A%2F%2Fwww.aapor.org%2FEducation-Resources%2FReports%2FBig-Data.aspx&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AbbanyAPublic17-13\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-AbbanyAPublic17_13-0\" rel=\"external_link\">13.0<\/a><\/sup> <sup><a href=\"#cite_ref-AbbanyAPublic17_13-1\" rel=\"external_link\">13.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation web\">Abbany, Z.&#32;(27 November 2017).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.dw.com\/en\/a-public-transport-model-built-on-open-data\/a-41546053\" target=\"_blank\">\"A public transport model built on open data\"<\/a>.&#32;<i>DW<\/i>.&#32;Deutsche Welle<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" 
href=\"https:\/\/www.dw.com\/en\/a-public-transport-model-built-on-open-data\/a-41546053\" target=\"_blank\">https:\/\/www.dw.com\/en\/a-public-transport-model-built-on-open-data\/a-41546053<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 27 November 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=A+public+transport+model+built+on+open+data&amp;rft.atitle=DW&amp;rft.aulast=Abbany%2C+Z.&amp;rft.au=Abbany%2C+Z.&amp;rft.date=27+November+2017&amp;rft.pub=Deutsche+Welle&amp;rft_id=https%3A%2F%2Fwww.dw.com%2Fen%2Fa-public-transport-model-built-on-open-data%2Fa-41546053&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DarabiTheUK17-14\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-DarabiTheUK17_14-0\" rel=\"external_link\">14.0<\/a><\/sup> <sup><a href=\"#cite_ref-DarabiTheUK17_14-1\" rel=\"external_link\">14.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation web\">Darabi, A.&#32;(05 December 2017).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/apolitical.co\/solution_article\/uks-next-census-will-last-heres\/\" target=\"_blank\">\"The UK\u2019s next census will be its last\u2014here\u2019s why\"<\/a>.&#32;<i>Apolitical<\/i>.&#32;Apolitical Group Limited<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/apolitical.co\/solution_article\/uks-next-census-will-last-heres\/\" target=\"_blank\">https:\/\/apolitical.co\/solution_article\/uks-next-census-will-last-heres\/<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=The+UK%E2%80%99s+next+census+will+be+its+last%E2%80%94here%E2%80%99s+why&amp;rft.atitle=Apolitical&amp;rft.aulast=Darabi%2C+A.&amp;rft.au=Darabi%2C+A.&amp;rft.date=05+December+2017&amp;rft.pub=Apolitical+Group+Limited&amp;rft_id=https%3A%2F%2Fapolitical.co%2Fsolution_article%2Fuks-next-census-will-last-heres%2F&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MurtaghQual18-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MurtaghQual18_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Murtagh, F.; Orlov, M.; Mirkin, B.&#32;(2018).&#32;\"Qualitative Judgement of Research Impact: Domain Taxonomy as a Fundamental Framework for Judgement of the Quality of Research\".&#32;<i>Journal of Classification<\/i>&#32;<b>35<\/b>&#32;(1): 5\u201328.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs00357-018-9247-0\" target=\"_blank\">10.1007\/s00357-018-9247-0<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Qualitative+Judgement+of+Research+Impact%3A+Domain+Taxonomy+as+a+Fundamental+Framework+for+Judgement+of+the+Quality+of+Research&amp;rft.jtitle=Journal+of+Classification&amp;rft.aulast=Murtagh%2C+F.%3B+Orlov%2C+M.%3B+Mirkin%2C+B.&amp;rft.au=Murtagh%2C+F.%3B+Orlov%2C+M.%3B+Mirkin%2C+B.&amp;rft.date=2018&amp;rft.volume=35&amp;rft.issue=1&amp;rft.pages=5%E2%80%9328&amp;rft_id=info:doi\/10.1007%2Fs00357-018-9247-0&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HandStat18-16\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-HandStat18_16-0\" rel=\"external_link\">16.0<\/a><\/sup> <sup><a href=\"#cite_ref-HandStat18_16-1\" rel=\"external_link\">16.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Hand, D.J.&#32;(2018).&#32;\"Statistical challenges of administrative and transaction data\".&#32;<i>Statistics in Society Series A<\/i>&#32;<b>181<\/b>&#32;(3): 555\u2013605.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1111%2Frssa.12315\" target=\"_blank\">10.1111\/rssa.12315<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Statistical+challenges+of+administrative+and+transaction+data&amp;rft.jtitle=Statistics+in+Society+Series+A&amp;rft.aulast=Hand%2C+D.J.&amp;rft.au=Hand%2C+D.J.&amp;rft.date=2018&amp;rft.volume=181&amp;rft.issue=3&amp;rft.pages=555%E2%80%93605&amp;rft_id=info:doi\/10.1111%2Frssa.12315&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AndersonTheEnd08-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AndersonTheEnd08_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Anderson, C.&#32;(23 June 2008).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.wired.com\/2008\/06\/pb-theory\/\" target=\"_blank\">\"The End of Theory: The Data Deluge Makes The Scientific Method Obsolete\"<\/a>.&#32;<i>Wired<\/i>.&#32;Cond\u00e9 Nast<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.wired.com\/2008\/06\/pb-theory\/\" target=\"_blank\">https:\/\/www.wired.com\/2008\/06\/pb-theory\/<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=The+End+of+Theory%3A+The+Data+Deluge+Makes+The+Scientific+Method+Obsolete&amp;rft.atitle=Wired&amp;rft.aulast=Anderson%2C+C.&amp;rft.au=Anderson%2C+C.&amp;rft.date=23+June+2008&amp;rft.pub=Cond%C3%A9+Nast&amp;rft_id=https%3A%2F%2Fwww.wired.com%2F2008%2F06%2Fpb-theory%2F&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: 
none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MurtaghOrigins08-18\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-MurtaghOrigins08_18-0\" rel=\"external_link\">18.0<\/a><\/sup> <sup><a href=\"#cite_ref-MurtaghOrigins08_18-1\" rel=\"external_link\">18.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Murtagh, F.&#32;(2008).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/arxiv.org\/abs\/0811.2519\" target=\"_blank\">\"Origins of Modern Data Analysis Linked to the Beginnings and Early Development of Computer Science and Information Engineering\"<\/a>.&#32;<i>Electronic Journal for History of Probability and Statistics<\/i>&#32;<b>4<\/b>&#32;(2): 1\u201326<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/arxiv.org\/abs\/0811.2519\" target=\"_blank\">https:\/\/arxiv.org\/abs\/0811.2519<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Origins+of+Modern+Data+Analysis+Linked+to+the+Beginnings+and+Early+Development+of+Computer+Science+and+Information+Engineering&amp;rft.jtitle=Electronic+Journal+for+History+of+Probability+and+Statistics&amp;rft.aulast=Murtagh%2C+F.&amp;rft.au=Murtagh%2C+F.&amp;rft.date=2008&amp;rft.volume=4&amp;rft.issue=2&amp;rft.pages=1%E2%80%9326&amp;rft_id=https%3A%2F%2Farxiv.org%2Fabs%2F0811.2519&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-EnglmeierEdit17-19\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-EnglmeierEdit17_19-0\" rel=\"external_link\">19.0<\/a><\/sup> <sup><a href=\"#cite_ref-EnglmeierEdit17_19-1\" 
rel=\"external_link\">19.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Englmeier, K.; Murtagh, F.&#32;(2017).&#32;\"Editorial: What Can We Expect from Data Scientists?\".&#32;<i>Journal of Theoretical and Applied Electronic Commerce Research<\/i>&#32;<b>12<\/b>&#32;(1): 1\u20135.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4067%2FS0718-18762017000100001\" target=\"_blank\">10.4067\/S0718-18762017000100001<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Editorial%3A+What+Can+We+Expect+from+Data+Scientists%3F&amp;rft.jtitle=Journal+of+Theoretical+and+Applied+Electronic+Commerce+Research&amp;rft.aulast=Englmeier%2C+K.%3B+Murtagh%2C+F.&amp;rft.au=Englmeier%2C+K.%3B+Murtagh%2C+F.&amp;rft.date=2017&amp;rft.volume=12&amp;rft.issue=1&amp;rft.pages=1%E2%80%935&amp;rft_id=info:doi\/10.4067%2FS0718-18762017000100001&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MurtaghContext17-20\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-MurtaghContext17_20-0\" rel=\"external_link\">20.0<\/a><\/sup> <sup><a href=\"#cite_ref-MurtaghContext17_20-1\" rel=\"external_link\">20.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Murtagh, F.; Farid, M.&#32;(2017).&#32;\"Contextualizing Geometric Data Analysis and Related Data Analytics: A Virtual Microscope for Big Data Analytics\".&#32;<i>Journal of Interdisciplinary Methodologies and Issues in Sciences<\/i>&#32;<b>3<\/b>&#32;(Digital 
Contextualization): 1\u201319.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.18713%2FJIMIS-010917-3-1\" target=\"_blank\">10.18713\/JIMIS-010917-3-1<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Contextualizing+Geometric+Data+Analysis+and+Related+Data+Analytics%3A+A+Virtual+Microscope+for+Big+Data+Analytics&amp;rft.jtitle=Journal+of+Interdisciplinary+Methodologies+and+Issues+in+Sciences&amp;rft.aulast=Murtagh%2C+F.%3B+Farid%2C+M.&amp;rft.au=Murtagh%2C+F.%3B+Farid%2C+M.&amp;rft.date=2017&amp;rft.volume=3&amp;rft.issue=Digital+Contextualization&amp;rft.pages=1%E2%80%9319&amp;rft_id=info:doi\/10.18713%2FJIMIS-010917-3-1&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AllinNew16-21\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AllinNew16_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Allin, P.; Hand, D.J.&#32;(2016).&#32;\"New statistics for old?\u2014Measuring the wellbeing of the UK\".&#32;<i>Statistics in Society Series A<\/i>&#32;<b>180<\/b>&#32;(1): 3\u201343.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1111%2Frssa.12188\" target=\"_blank\">10.1111\/rssa.12188<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=New+statistics+for+old%3F%E2%80%94Measuring+the+wellbeing+of+the+UK&amp;rft.jtitle=Statistics+in+Society+Series+A&amp;rft.aulast=Allin%2C+P.%3B+Hand%2C+D.J.&amp;rft.au=Allin%2C+P.%3B+Hand%2C+D.J.&amp;rft.date=2016&amp;rft.volume=180&amp;rft.issue=1&amp;rft.pages=3%E2%80%9343&amp;rft_id=info:doi\/10.1111%2Frssa.12188&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WesselYou16-22\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WesselYou16_22-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Wessel, M.&#32;(03 November 2016).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/hbr.org\/2016\/11\/you-dont-need-big-data-you-need-the-right-data\" target=\"_blank\">\"You Don't Need Big Data - You Need the Right Data\"<\/a>.&#32;<i>Harvard Business Review<\/i><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/hbr.org\/2016\/11\/you-dont-need-big-data-you-need-the-right-data\" target=\"_blank\">https:\/\/hbr.org\/2016\/11\/you-dont-need-big-data-you-need-the-right-data<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 18 June 2018<\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=You+Don%27t+Need+Big+Data+-+You+Need+the+Right+Data&amp;rft.atitle=Harvard+Business+Review&amp;rft.aulast=Wessel%2C+M.&amp;rft.au=Wessel%2C+M.&amp;rft.date=03+November+2016&amp;rft_id=https%3A%2F%2Fhbr.org%2F2016%2F11%2Fyou-dont-need-big-data-you-need-the-right-data&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CCJobs17-23\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CCJobs17_23-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.careercast.com\/jobs-rated\/2017-jobs-rated-report\" target=\"_blank\">\"Jobs Rated Report 2017: Ranking 200 Jobs\"<\/a>.&#32;<i>CareerCast.com<\/i>.&#32;2017<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.careercast.com\/jobs-rated\/2017-jobs-rated-report\" target=\"_blank\">https:\/\/www.careercast.com\/jobs-rated\/2017-jobs-rated-report<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 18 June 2018<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Jobs+Rated+Report+2017%3A+Ranking+200+Jobs&amp;rft.atitle=CareerCast.com&amp;rft.date=2017&amp;rft_id=https%3A%2F%2Fwww.careercast.com%2Fjobs-rated%2F2017-jobs-rated-report&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-DanielReimaging18-24\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DanielReimaging18_24-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Daniel, B.K.&#32;(2018).&#32;\"Reimagining Research Methodology as Data Science\".&#32;<i>Big Data and Cognitive Computing<\/i>&#32;<b>2<\/b>&#32;(1): 4.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3390%2Fbdcc2010004\" target=\"_blank\">10.3390\/bdcc2010004<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Reimagining+Research+Methodology+as+Data+Science&amp;rft.jtitle=Big+Data+and+Cognitive+Computing&amp;rft.aulast=Daniel%2C+B.K.&amp;rft.au=Daniel%2C+B.K.&amp;rft.date=2018&amp;rft.volume=2&amp;rft.issue=1&amp;rft.pages=4&amp;rft_id=info:doi\/10.3390%2Fbdcc2010004&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PRessGrad18-25\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PRessGrad18_25-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Press, G.&#32;(28 February 2018).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/whatsthebigdata.com\/2012\/08\/09\/graduate-programs-in-big-data-and-data-science\/\" target=\"_blank\">\"Graduate Programs in Data Science and Big Data Analytics\"<\/a>.&#32;<i>What's the Big Data?<\/i><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" 
href=\"https:\/\/whatsthebigdata.com\/2012\/08\/09\/graduate-programs-in-big-data-and-data-science\/\" target=\"_blank\">https:\/\/whatsthebigdata.com\/2012\/08\/09\/graduate-programs-in-big-data-and-data-science\/<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 18 June 2018<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Graduate+Programs+in+Data+Science+and+Big+Data+Analytics&amp;rft.atitle=What%27s+the+Big+Data%3F&amp;rft.aulast=Press%2C+G.&amp;rft.au=Press%2C+G.&amp;rft.date=28+February+2018&amp;rft_id=https%3A%2F%2Fwhatsthebigdata.com%2F2012%2F08%2F09%2Fgraduate-programs-in-big-data-and-data-science%2F&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NVPBigData17-26\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NVPBigData17_26-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/newvantage.com\/wp-content\/uploads\/2017\/01\/Big-Data-Executive-Survey-2017-Executive-Summary.pdf\" target=\"_blank\">\"Big Data Executive Survey 2017\"<\/a>&#32;(PDF).&#32;NewVantage Partners LLC.&#32;January 2017<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/newvantage.com\/wp-content\/uploads\/2017\/01\/Big-Data-Executive-Survey-2017-Executive-Summary.pdf\" target=\"_blank\">http:\/\/newvantage.com\/wp-content\/uploads\/2017\/01\/Big-Data-Executive-Survey-2017-Executive-Summary.pdf<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 18 June 2018<\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Big+Data+Executive+Survey+2017&amp;rft.atitle=&amp;rft.date=January+2017&amp;rft.pub=NewVantage+Partners+LLC&amp;rft_id=http%3A%2F%2Fnewvantage.com%2Fwp-content%2Fuploads%2F2017%2F01%2FBig-Data-Executive-Survey-2017-Executive-Summary.pdf&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HayesEmperically16-27\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HayesEmperically16_27-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Hayes, B.&#32;(18 January 2016).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/businessoverbroadway.com\/empirically-based-approach-to-understanding-the-structure-of-data-science\" target=\"_blank\">\"Empirically-Based Approach to Understanding the Structure of Data Science\"<\/a>.&#32;<i>Business Over Broadway<\/i><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/businessoverbroadway.com\/empirically-based-approach-to-understanding-the-structure-of-data-science\" target=\"_blank\">http:\/\/businessoverbroadway.com\/empirically-based-approach-to-understanding-the-structure-of-data-science<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 18 June 2018<\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Empirically-Based+Approach+to+Understanding+the+Structure+of+Data+Science&amp;rft.atitle=Business+Over+Broadway&amp;rft.aulast=Hayes%2C+B.&amp;rft.au=Hayes%2C+B.&amp;rft.date=18+January+2016&amp;rft_id=http%3A%2F%2Fbusinessoverbroadway.com%2Fempirically-based-approach-to-understanding-the-structure-of-data-science&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MurtaghSec-28\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MurtaghSec_28-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Murtagh, F.&#32;(2018).&#32;\"Security and ethics in Big Data: Analytical foundations for surveys\".&#32;<i>Archives of Data Science<\/i>&#32;<b>Submitted<\/b>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Security+and+ethics+in+Big+Data%3A+Analytical+foundations+for+surveys&amp;rft.jtitle=Archives+of+Data+Science&amp;rft.aulast=Murtagh%2C+F.&amp;rft.au=Murtagh%2C+F.&amp;rft.date=2018&amp;rft.volume=Submitted&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HandTheDangers17-29\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HandTheDangers17_29-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Hand, D.&#32;(06 April 2017).&#32;<a rel=\"external_link\" class=\"external text\" 
href=\"https:\/\/www.ucc.ie\/en\/matsci\/news\/irish-statistical-association-isa-gossett-lecture-2017.html\" target=\"_blank\">\"The dangers of not seeing what isn\u2019t there: Selection bias in statistical modelling\"<\/a>.&#32;<i>Irish Statistical Association (ISA) Gossett Lecture 2017<\/i><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.ucc.ie\/en\/matsci\/news\/irish-statistical-association-isa-gossett-lecture-2017.html\" target=\"_blank\">https:\/\/www.ucc.ie\/en\/matsci\/news\/irish-statistical-association-isa-gossett-lecture-2017.html<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=The+dangers+of+not+seeing+what+isn%E2%80%99t+there%3A+Selection+bias+in+statistical+modelling&amp;rft.atitle=Irish+Statistical+Association+%28ISA%29+Gossett+Lecture+2017&amp;rft.aulast=Hand%2C+D.&amp;rft.au=Hand%2C+D.&amp;rft.date=06+April+2017&amp;rft_id=https%3A%2F%2Fwww.ucc.ie%2Fen%2Fmatsci%2Fnews%2Firish-statistical-association-isa-gossett-lecture-2017.html&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KeidingPerils16-30\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-KeidingPerils16_30-0\" rel=\"external_link\">30.0<\/a><\/sup> <sup><a href=\"#cite_ref-KeidingPerils16_30-1\" rel=\"external_link\">30.1<\/a><\/sup> <sup><a href=\"#cite_ref-KeidingPerils16_30-2\" rel=\"external_link\">30.2<\/a><\/sup> <sup><a href=\"#cite_ref-KeidingPerils16_30-3\" rel=\"external_link\">30.3<\/a><\/sup> <sup><a href=\"#cite_ref-KeidingPerils16_30-4\" rel=\"external_link\">30.4<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Keiding, N.; Louis, 
T.A.&#32;(2016).&#32;\"Perils and potentials of self\u2010selected entry to epidemiological studies and surveys\".&#32;<i>Statistics in Society Series A<\/i>&#32;<b>179<\/b>&#32;(2): 319\u201376.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1111%2Frssa.12136\" target=\"_blank\">10.1111\/rssa.12136<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Perils+and+potentials+of+self%E2%80%90selected+entry+to+epidemiological+studies+and+surveys&amp;rft.jtitle=Statistics+in+Society+Series+A&amp;rft.aulast=Keiding%2C+N.%3B+Louis%2C+T.A.&amp;rft.au=Keiding%2C+N.%3B+Louis%2C+T.A.&amp;rft.date=2016&amp;rft.volume=179&amp;rft.issue=2&amp;rft.pages=319%E2%80%9376&amp;rft_id=info:doi\/10.1111%2Frssa.12136&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-O.27NeilWeapons16-31\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-O.27NeilWeapons16_31-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">O'Neil, C.&#32;(2016).&#32;<i>Weapons of Math Destruction<\/i>.&#32;Crown.&#32;pp.&#160;272.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9780553418811.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Weapons+of+Math+Destruction&amp;rft.aulast=O%27Neil%2C+C.&amp;rft.au=O%27Neil%2C+C.&amp;rft.date=2016&amp;rft.pages=pp.%26nbsp%3B272&amp;rft.pub=Crown&amp;rft.isbn=9780553418811&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LeRouxLeMeth15-32\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LeRouxLeMeth15_32-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">\"Chapitre 1. Id\u00e9es\u2013clefs de l\u2019analyse g\u00e9om\u00e9trique des donn\u00e9es\".&#32;<i>La M\u00e9thodologie de Pierre Bourdieu en Action: Espace Culturel, Espace Social et Analyse des Donn\u00e9es<\/i>.&#32;Dunod.&#32;2015.&#32;pp.&#160;3\u201320.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3917%2Fdunod.lebar.2015.01.0003\" target=\"_blank\">10.3917\/dunod.lebar.2015.01.0003<\/a>.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9782100703845.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Chapitre+1.+Id%C3%A9es%E2%80%93clefs+de+l%E2%80%99analyse+g%C3%A9om%C3%A9trique+des+donn%C3%A9es&amp;rft.atitle=La+M%C3%A9thodologie+de+Pierre+Bourdieu+en+Action%3A+Espace+Culturel%2C+Espace+Social+et+Analyse+des+Donn%C3%A9es&amp;rft.date=2015&amp;rft.pages=pp.%26nbsp%3B3%E2%80%9320&amp;rft.pub=Dunod&amp;rft_id=info:doi\/10.3917%2Fdunod.lebar.2015.01.0003&amp;rft.isbn=9782100703845&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MurtaghTracking16-33\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-MurtaghTracking16_33-0\" rel=\"external_link\">33.0<\/a><\/sup> <sup><a href=\"#cite_ref-MurtaghTracking16_33-1\" rel=\"external_link\">33.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Murtagh, F.; Pianosi, M.; Bull, R.&#32;(2016).&#32;\"Semantic mapping of discourse and activity, using Habermas\u2019s theory of communicative action to analyze process\".&#32;<i>Quality &amp; Quantity<\/i>&#32;<b>50<\/b>&#32;(4): 1675\u20131694.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs11135-015-0228-7\" target=\"_blank\">10.1007\/s11135-015-0228-7<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Semantic+mapping+of+discourse+and+activity%2C+using+Habermas%E2%80%99s+theory+of+communicative+action+to+analyze+process&amp;rft.jtitle=Quality+%26+Quantity&amp;rft.aulast=Murtagh%2C+F.%3B+Pianosi%2C+M.%3B+Bull%2C+R.&amp;rft.au=Murtagh%2C+F.%3B+Pianosi%2C+M.%3B+Bull%2C+R.&amp;rft.date=2016&amp;rft.volume=50&amp;rft.issue=4&amp;rft.pages=1675%E2%80%931694&amp;rft_id=info:doi\/10.1007%2Fs11135-015-0228-7&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ECCommission14-34\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ECCommission14_34-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/europa.eu\/rapid\/press-release_IP-14-769_en.htm\" target=\"_blank\">\"Commission urges governments to embrace potential of Big Data\"<\/a>.&#32;<i>Press Release Database<\/i>.&#32;European Commission.&#32;02 July 2014<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/europa.eu\/rapid\/press-release_IP-14-769_en.htm\" target=\"_blank\">http:\/\/europa.eu\/rapid\/press-release_IP-14-769_en.htm<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Commission+urges+governments+to+embrace+potential+of+Big+Data&amp;rft.atitle=Press+Release+Database&amp;rft.date=02+July+2014&amp;rft.pub=European+Commission&amp;rft_id=http%3A%2F%2Feuropa.eu%2Frapid%2Fpress-release_IP-14-769_en.htm&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CPUOUOpen16-35\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CPUOUOpen16_35-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Committee on the Peaceful Uses of Outer Space&#32;(14 June 2016).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.unoosa.org\/res\/oosadoc\/data\/documents\/2016\/aac_1052016crp\/aac_1052016crp_6_0_html\/AC105_2016_CRP06E.pdf\" target=\"_blank\">\"\u201cOpen Universe\u201d proposal, an initiative under the auspices of the Committee on the Peaceful Uses of Outer Space for expanding availability of and accessibility to open source space science data\"<\/a>&#32;(PDF).&#32;United Nations Office for Outer Space Affairs<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.unoosa.org\/res\/oosadoc\/data\/documents\/2016\/aac_1052016crp\/aac_1052016crp_6_0_html\/AC105_2016_CRP06E.pdf\" target=\"_blank\">http:\/\/www.unoosa.org\/res\/oosadoc\/data\/documents\/2016\/aac_1052016crp\/aac_1052016crp_6_0_html\/AC105_2016_CRP06E.pdf<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 18 June 2018<\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=%E2%80%9COpen+Universe%E2%80%9D+proposal%2C+an+initiative+under+the+auspices+of+the+Committee+on+the+Peaceful+Uses+of+Outer+Space+for+expanding+availability+of+and+accessibility+to+open+source+space+science+data&amp;rft.atitle=&amp;rft.aulast=Committee+on+the+Peaceful+Uses+of+Outer+Space&amp;rft.au=Committee+on+the+Peaceful+Uses+of+Outer+Space&amp;rft.date=14+June+2016&amp;rft.pub=United+Nations+Office+for+Outer+Space+Affairs&amp;rft_id=http%3A%2F%2Fwww.unoosa.org%2Fres%2Foosadoc%2Fdata%2Fdocuments%2F2016%2Faac_1052016crp%2Faac_1052016crp_6_0_html%2FAC105_2016_CRP06E.pdf&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WilkinsonTheFAIR16-36\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WilkinsonTheFAIR16_36-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J. 
et al.&#32;(2016).&#32;\"The FAIR Guiding Principles for scientific data management and stewardship\".&#32;<i>Scientific Data<\/i>&#32;<b>3<\/b>: 160018.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fsdata.2016.18\" target=\"_blank\">10.1038\/sdata.2016.18<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+FAIR+Guiding+Principles+for+scientific+data+management+and+stewardship&amp;rft.jtitle=Scientific+Data&amp;rft.aulast=Wilkinson%2C+M.D.%3B+Dumontier%2C+M.%3B+Aalbersberg%2C+I.J.+et+al.&amp;rft.au=Wilkinson%2C+M.D.%3B+Dumontier%2C+M.%3B+Aalbersberg%2C+I.J.+et+al.&amp;rft.date=2016&amp;rft.volume=3&amp;rft.pages=160018&amp;rft_id=info:doi\/10.1038%2Fsdata.2016.18&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DevlinLogic91-37\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-DevlinLogic91_37-0\" rel=\"external_link\">37.0<\/a><\/sup> <sup><a href=\"#cite_ref-DevlinLogic91_37-1\" rel=\"external_link\">37.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation book\">Devlin, K.&#32;(1991).&#32;<i>Logic and Information<\/i>.&#32;Cambridge University Press.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;0521499712.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Logic+and+Information&amp;rft.aulast=Devlin%2C+K.&amp;rft.au=Devlin%2C+K.&amp;rft.date=1991&amp;rft.pub=Cambridge+University+Press&amp;rft.isbn=0521499712&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DevlinAUniform11-38\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DevlinAUniform11_38-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Devlin, K.&#32;(July 2011).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/web.stanford.edu\/~kdevlin\/Papers\/Army_report_0711.pdf\" target=\"_blank\">\"A uniform framework for describing and analyzing the modern battlefield\"<\/a>&#32;(PDF).&#32;Stanford University<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/web.stanford.edu\/~kdevlin\/Papers\/Army_report_0711.pdf\" target=\"_blank\">http:\/\/web.stanford.edu\/~kdevlin\/Papers\/Army_report_0711.pdf<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 18 June 2018<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=A+uniform+framework+for+describing+and+analyzing+the+modern+battlefield&amp;rft.atitle=&amp;rft.aulast=Devlin%2C+K.&amp;rft.au=Devlin%2C+K.&amp;rft.date=July+2011&amp;rft.pub=Stanford+University&amp;rft_id=http%3A%2F%2Fweb.stanford.edu%2F%7Ekdevlin%2FPapers%2FArmy_report_0711.pdf&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: 
none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DevlinAnal96-39\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DevlinAnal96_39-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Devlin, K.; Rosenberg, D.&#32;(1996).&#32;<i>Language at Work: Analyzing Communication Breakdown in the Workplace to Inform Systems Design<\/i>.&#32;Center for the Study of Language and Information.&#32;pp.&#160;212.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9781575860510.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Language+at+Work%3A+Analyzing+Communication+Breakdown+in+the+Workplace+to+Inform+Systems+Design&amp;rft.aulast=Devlin%2C+K.%3B+Rosenberg%2C+D.&amp;rft.au=Devlin%2C+K.%3B+Rosenberg%2C+D.&amp;rft.date=1996&amp;rft.pages=pp.%26nbsp%3B212&amp;rft.pub=Center+for+the+Study+of+Language+and+Information&amp;rft.isbn=9781575860510&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DevlinInfo08-40\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-DevlinInfo08_40-0\" rel=\"external_link\">40.0<\/a><\/sup> <sup><a href=\"#cite_ref-DevlinInfo08_40-1\" rel=\"external_link\">40.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation book\">Devlin, K.; Rosenberg, D.&#32;(2008).&#32;\"Information in the Study of Human Interaction\".&#32;In&#32;Adriaana, P.; van Benthem, J.; Gabbay, D. 
et al..&#32;<i>Philosophy of Information<\/i>.&#32;Elsevier.&#32;pp.&#160;685\u2013709.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2FB978-0-444-51726-5.50021-2\" target=\"_blank\">10.1016\/B978-0-444-51726-5.50021-2<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Information+in+the+Study+of+Human+Interaction&amp;rft.atitle=Philosophy+of+Information&amp;rft.aulast=Devlin%2C+K.%3B+Rosenberg%2C+D.&amp;rft.au=Devlin%2C+K.%3B+Rosenberg%2C+D.&amp;rft.date=2008&amp;rft.pages=pp.%26nbsp%3B685%E2%80%93709&amp;rft.pub=Elsevier&amp;rft_id=info:doi\/10.1016%2FB978-0-444-51726-5.50021-2&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BarwiseSituations99-41\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BarwiseSituations99_41-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Barwise, J.; Perry, J.&#32;(1999).&#32;<i>Situations and Attitudes<\/i>.&#32;Center for the Study of Language and Information.&#32;pp.&#160;376.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9781575861937.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Situations+and+Attitudes&amp;rft.aulast=Barwise%2C+J.%3B+Perry%2C+J.&amp;rft.au=Barwise%2C+J.%3B+Perry%2C+J.&amp;rft.date=1999&amp;rft.pages=pp.%26nbsp%3B376&amp;rft.pub=Center+for+the+Study+of+Language+and+Information&amp;rft.isbn=9781575861937&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CORDIS-SANE00-42\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CORDIS-SANE00_42-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/cordis.europa.eu\/project\/rcn\/58059_en.html\" target=\"_blank\">\"Sustainable Accommodation in the New Economy\"<\/a>.&#32;<i>CORDIS<\/i>.&#32;European Commission.&#32;15 May 2008<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/cordis.europa.eu\/project\/rcn\/58059_en.html\" target=\"_blank\">https:\/\/cordis.europa.eu\/project\/rcn\/58059_en.html<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 18 June 2018<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Sustainable+Accommodation+in+the+New+Economy&amp;rft.atitle=CORDIS&amp;rft.date=15+May+2008&amp;rft.pub=European+Commission&amp;rft_id=https%3A%2F%2Fcordis.europa.eu%2Fproject%2Frcn%2F58059_en.html&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-RosenbergInteraction05-43\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-RosenbergInteraction05_43-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Rosenberg, D.; Foley, S.; Lievonen, M. et al.&#32;(2005).&#32;\"Interaction spaces in computer-mediated communication\".&#32;<i>AI &amp; Society<\/i>&#32;<b>19<\/b>&#32;(1): 22\u201333.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs00146-004-0299-9\" target=\"_blank\">10.1007\/s00146-004-0299-9<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Interaction+spaces+in+computer-mediated+communication&amp;rft.jtitle=AI+%26+Society&amp;rft.aulast=Rosenberg%2C+D.%3B+Foley%2C+S.%3B+Lievonen%2C+M.+et+al.&amp;rft.au=Rosenberg%2C+D.%3B+Foley%2C+S.%3B+Lievonen%2C+M.+et+al.&amp;rft.date=2005&amp;rft.volume=19&amp;rft.issue=1&amp;rft.pages=22%E2%80%9333&amp;rft_id=info:doi\/10.1007%2Fs00146-004-0299-9&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WalkowskiUsing11-44\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WalkowskiUsing11_44-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Walkowski, S.; D\u00f6rner, R.; Lievonen, M. 
et al.&#32;(2011).&#32;\"Using a game controller for relaying deictic gestures in computer-mediated communication\".&#32;<i>International Journal of Human-Computer Studies<\/i>&#32;<b>69<\/b>: 362\u201374.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.ijhcs.2011.01.002\" target=\"_blank\">10.1016\/j.ijhcs.2011.01.002<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Using+a+game+controller+for+relaying+deictic+gestures+in+computer-mediated+communication&amp;rft.jtitle=International+Journal+of+Human-Computer+Studies&amp;rft.aulast=Walkoski%2C+S.%3B+D%C3%B6rner%2C+R.%3B+Lievonen%2C+M.+et+al.&amp;rft.au=Walkoski%2C+S.%3B+D%C3%B6rner%2C+R.%3B+Lievonen%2C+M.+et+al.&amp;rft.date=2011&amp;rft.volume=69&amp;rft.pages=362%E2%80%9374&amp;rft_id=info:doi\/10.1016%2Fj.ijhcs.2011.01.002&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to grammar, spelling, and presentation, including the addition of PMCID and DOI when they were missing from the original reference. The original inline citation method was unorthodox; these inline citations have been made clearer with the addition of the author of the citation. This often required sentences containing inline citations to be reconstructed. Several URL mentions in the text were turned into full citations. 
Several vanity statements and irrelevant comments were removed for improved readability.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development\">https:\/\/www.limswiki.org\/index.php\/Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","795feead44bb9c43869be23a90bf9d75_images":["https:\/\/www.limswiki.org\/images\/3\/34\/Fig1_Murtagh_BigDataCogComp2018_2-2.jpg","https:\/\/www.limswiki.org\/images\/d\/df\/Fig2_Murtagh_BigDataCogComp2018_2-2.jpg","https:\/\/www.limswiki.org\/images\/7\/77\/Fig3_Murtagh_BigDataCogComp2018_2-2.jpg"],"795feead44bb9c43869be23a90bf9d75_timestamp":1544815904,"3d10ab796a58a8bc8aa92318f0b8bfdb_type":"article","3d10ab796a58a8bc8aa92318f0b8bfdb_title":"Data science as an innovation challenge: From big data to value proposition (Kayser et al. 2018)","3d10ab796a58a8bc8aa92318f0b8bfdb_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition","3d10ab796a58a8bc8aa92318f0b8bfdb_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:Data science as an innovation challenge: From big data to value proposition\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nData science as an innovation challenge: From big data to value propositionJournal\n \nTechnology Innovation Management ReviewAuthor(s)\n \nKayser, Victoria; Nehrke, Bastian; Zubovic, DamirAuthor affiliation(s)\n \nErnst &amp; YoungYear published\n \n2018Volume and issue\n \n8(3)Page(s)\n \n16\u201325DOI\n \n10.22215\/timreview\/1143ISSN\n \n1927-0321Distribution license\n \nCreative Commons Attribution 3.0 UnportedWebsite\n \nhttps:\/\/timreview.ca\/article\/1143Download\n \nhttps:\/\/timreview.ca\/sites\/default\/files\/article_PDF\/Kayser_et_al_TIMReview_March2018.pdf (PDF)\n\nContents\n\n1 Abstract \n2 Introduction \n3 Big data and analytics \n\n3.1 Data \n3.2 Analytics \n3.3 IT infrastructure \n\n\n4 From data to value: Turning ideas into applications \n\n4.1 What is the starting point? 
\n4.2 The analytics process \n4.3 Phase 1: Idea generation \n4.4 Phase 2: Proof of concept \n4.5 Phase 3 and 4: Operationalization \n\n\n5 Discussion and conclusion \n6 Acknowledgements \n7 References \n8 Notes \n\n\n\nAbstract \nAnalyzing \u201cbig data\u201d holds huge potential for generating business value. The ongoing advancement of tools and technology over recent years has created a new ecosystem full of opportunities for data-driven innovation. However, as the amount of available data rises to new heights, so too does complexity. Organizations are challenged to create the right contexts, by shaping interfaces and processes, and by asking the right questions to guide the data analysis. Lifting the innovation potential requires teaming and focus to efficiently assign available resources to the most promising initiatives. With reference to the innovation process, this article will concentrate on establishing a process for analytics projects from first ideas to realization (in most cases, a running application). The question we tackle is: what can the practical discourse on big data and analytics learn from innovation management? The insights presented in this article are built on our practical experiences in working with various clients. We will classify analytics projects as well as discuss common innovation barriers along this process.\nKeywords: analytics, big data, digital innovation, idea generation, innovation process\n\nIntroduction \n\"Listening to the data is important\u2026 but so is experience and intuition. 
After all, what is intuition at its best but large amounts of data of all kinds filtered through a human brain rather than a math model?\" - Steve Lohr, Technology and economics journalist\nUnderstandably, much effort is being expended on analyzing \u201cbig data\u201d to unleash its potentially enormous business value.[1][2] New data sources evolve, and new techniques for storing and analyzing large data sets are enabling many new applications, but the exact business value of any one big data application is often unclear. From a practical viewpoint, organizations still struggle to use data meaningfully or they lack the right competencies. Different types of analytics problems arise in an organizational context, depending on whether the starting point is a precise request from a department that only lacks the required skills or capabilities (e.g., machine learning) or whether it stems from a principal interest in working with big data (e.g., no own infrastructure, no methodical experience). So far, clear strategies and processes for value generation from data are often missing.\nMuch literature addresses the technical and methodical implementation, the transformative strength of big data[3], the enhancement of firm performance by building analytics capability[4], or other managerial issues.[5][1] Little work covers the transformation process from first ideas to ready analytics applications or the building of analytics competence. This article seeks to address this gap.\nAnalytics initiatives have several unique features. First, they require an exploratory approach\u2014the analysis does not start with specific requirements as in other projects but rather with an idea or data set. To assess the contribution, ideation techniques and rapid prototyping are applied. This exploration plays a key role in developing a shared understanding and giving a big data initiative a strategic direction. 
Second, analytics projects in their early phase are bound to a complex interplay between different stakeholder interests, competencies, and viewpoints. Learning is an integral part of these projects to build experience and competence with analytics. Third, analytics projects run in parallel to the existing information technology (IT) infrastructure and deliver short scripts or strategic insights, which are then installed in larger IT projects. Due to a missing end-to-end target, data not only has to be extracted, transformed, and loaded, but also needs to be identified, classified, and partly structured. So, a general process for value generation needs to be established to guide analytics projects and address these issues.\nHere, we propose an exact configuration and series of steps to guide a big data analytics project. The lack of specified requirements and defined project goals in a big data analytics project (compared to a classic analytics project) makes it challenging to structure the analytics process. Therefore, the linear innovation process serves as reference and orientation.[6] As Braganza and colleagues[7] describe, for big data to be successfully integrated and implemented in an organization, clear and repeatable processes are required. Nevertheless, each analytics initiative is different and the process needs to be flexible. Unfortunately, the literature rarely combines challenges in the analytics process with concepts from innovation management. 
Nevertheless, an integration of the concepts from innovation management could guide the analytics work of formulating digital strategies, organizational anchoring of the analytics units and their functions, and designing the analytics portfolio, as well as the underlying working principles (e.g., rapid prototyping, ideation techniques).\nThus, in this article, we will concentrate on the question of what the practical discourse and work on analytics and on implementing big data in organizations can learn from innovation management. A process for analytics innovation is introduced to guide the process from ideation to value generation. Emphasis is put on challenges during this process as well as different entry points. In doing so, we build on experience and insights from a number of analytics projects for different sectors and domains to derive recommendations for successfully implementing analytics solutions.\nWe begin with a definition of big data and analytics. Next, we propose a process for a structured approach to retrieving value from data. Finally, we discuss the results and outline directions for future research.\n\nBig data and analytics \nIn this section, we address the elementary angles from which the analytics value chain should be looked at (Figure 1): data, infrastructure, and analytics\u2013and the business need as the driver. According to our understanding, value is generated by analyzing data within a certain context, with a problem statement related to a business requirement driving the need for innovation. Besides expertise in conducting data and analytics projects, this process requires a working infrastructure, especially when volume, velocity, or variety of data to be analyzed exceeds certain limits. Below, we describe the three technical angles in more detail.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 1. 
Framework of data, infrastructure, analytics and business need\n\n\n\nData \nBig data is often defined in terms of volume (how much data), velocity (speed of data generation), and variety (diversity of data types).[8][9] Big data describes data collections of a size difficult to process with traditional data management techniques. While many definitions of big data concentrate on the aspect of volume referring to the scale of data available, big data in particular brings heterogeneous formats and a broad spectrum of possible data sources. Examples are structured numeric data or unstructured data such as text, images, or videos. This variety and broad landscape of data sources offers many opportunities for generating insights. Moreover, the speed of data creation enables rapid insights into ongoing developments.\nRecent technical improvements (e.g., cloud computing, big data architectures) enable data to be analyzed and stored on a large scale. For many (new) types of data, their exact business value is unclear so far and requires systematic exploration. Available data is often messy, and even when cleaned up can be overwhelming and too complex to be easily understood, even by professional data scientists. The contribution of data is, of course, context specific and varies among business cases and applications. One key challenge is to identify data that best meets the business requirement.\n\nAnalytics \nData science is concerned with knowledge generation from data. Analytics or data science addresses the exploration of data sets with different quantitative methods drawn from statistical modelling[10] or machine learning.[11] Methods from different disciplines such as statistics, economics, or computer science are applied to identify patterns, influencing factors, or dependencies. In contrast to business intelligence, analytics reaches further than descriptive analytics (based on SQL) and often has a predictive component. 
Which method to apply depends on the exact business case. Analyzing data is restricted, for example, by a company\u2019s internal policies as well as legal restrictions and guidelines that vary among countries. Data quality and reliability are further issues. Data understanding and domain knowledge are key prerequisites in the analysis process (e.g., Waller &amp; Fawcett[12]), especially when model assumptions are made.\nConcerning data analysis, there are primarily the following opportunities for organizations:\n\n Improved analysis of internal data: One example is forecasting methods that enhance expert-based planning approaches with additional figures. These methods build on existing databases such as business intelligence systems, and they contribute new or further insights to internal firm processes.\n Putting data together in new ways: New combinations of data sets offer new insights, for example, through the combination of sensor data and user profiles.\n Opening up to new or (so far) unused data sources (e.g., websites, open data) to identify potential for generating new insights: However, a context or application is necessary to use the data. One example is social media data used for market observation.\nHowever, the core problem of analytics is to work out the guiding question and achieve a match between business need, data source, and analysis as discussed later in the article.\n\nIT infrastructure \nRelevant for the successful implementation of analytics is the adaptation of the IT infrastructure to embed analytics solutions and integrate different data sources. The core layers of an IT infrastructure are the:\n\n Data ingestion layer: This layer covers the data transfer from a source system to an analytics environment. For this, a toolset and a corresponding process need to be defined. 
Traditional extract, transform, load (ETL) tools and relational databases are combined with Hadoop\/big data setups covering, in particular, scenarios caused by less structured, high-volume, or streamed data. Analytics use cases build on data ranging from data warehouses to fully unstructured data. This breadth challenges classic architectures and requires adaptable schemes. Which data sources to integrate depends on the specific application.\n Data value exploration layer: Based on the business need and corresponding use case, data is investigated, tested, and sampled in this layer. Depending on the complexity and business question, an appropriate analytics scheme is developed. Business and explorative analysis based on online analytical processing (OLAP) models in in-memory technologies are supplemented or expanded by using advanced analytics methods and integrating, for example, R or Python plugins.\n Data consumption layer: Here, the results are used for visualization, for example. The end user can consume the data or service without deep technical understanding (e.g., for self-service business intelligence).\nModern approaches require structures that are adaptable and scalable to different requirements and data sources. Factors such as system performance, cost efficiency, and overall enterprise infrastructure strategy must be taken into consideration.\n\nFrom data to value: Turning ideas into applications \nOrganizations still struggle to use data meaningfully or lack the right competencies. One of the key challenges in analytics projects is identifying the business need and the guiding questions. Principally, different types of analytics problems arise in an organizational context ranging from precise requests that only lack specific capabilities to a principal interest in working with big data (e.g., no own infrastructure, expert-based approaches). 
This approach implies different starting points for the analytics process and different innovation pathways, both of which are described later in this article.\n\nWhat is the starting point? \nThe starting point for each analytics initiative varies. According to the four points mentioned above, the \u201cstate of the art\u201d for each one needs to be assessed individually to estimate the analytics maturity:\n\n Business need: From case to case, the precision of the problem description and scope varies. For some cases, the leading question and scope guiding the analysis phase are formulated very precisely, and for other cases they need to be worked out and refined during the process. \n Data: The data to be used in the project may already be defined, or an appropriate source may not yet be clear. The size and quality of the data essentially determine the progress of the further process. Parameters are, for example, structure (i.e., pre-processing effort) or the size of the data set (e.g., one CSV file or a large database).\n Analytics: Which methods to apply differs from case to case and must be tested and explored.\n Infrastructure: The current (technical) state of the business unit (e.g., own data warehouse, reporting system) or own (human) resources and competencies is a further important aspect in classifying requests.\nThese four angles can be rated differently with reference to the maturity level of the analytics request. Based on our experience, three scenarios, representing different maturity levels, can be distinguished from these four angles (illustrated in Figure 2).\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 2. Classifying analytics requests: Three maturity levels\n\n\n\nIn scenario 1, the data analysis is motivated by a defined requirement such as market observation during the rollout of a new product. The appropriate data source needs to be identified. 
Because the data is still missing, the precise analysis cannot yet be defined, and there is no existing infrastructure. Ideas need to be developed as to which data sources could be relevant and which issues can be resolved on this basis. Then, different methods from data analysis are applied to generate new insights.\nIn scenario 2, the data source and infrastructure are clearly defined, and the specific questions need to be identified. One application is assessing the contribution of a specific data source that has not been professionally analyzed so far, for example, by means of machine learning. For instance, the business unit has an internal database, considers new methods, and wants to further develop a business intelligence system by adding a forecasting component. In this case, the scope is clearer than in the first scenario, and an exploratory data analysis can be started straight away.\nIn scenario 3, there is a precise analytical problem that needs to be professionalized. A first draft shows promising results, and the solution can, as a next step, be upscaled. Guidance in making architectural decisions is needed.\nThese three scenarios are exemplary starting points for analytics projects. The following section describes the implications for the innovation process and outlines different challenges and barriers.\n\nThe analytics process \nTo succeed with analytics, the process from data to value must be structured so that it can be integrated into the existing organization. For example, Braganza and colleagues[7] examine the management of organizational resources in big data initiatives. 
They stress the importance of systematic approaches and processes to operationalize big data.\nRelated work on analytics processes has a focus on service design[13] or concentrates on the methodical part of analyzing data.[14] The process, as introduced by Braganza and colleagues[7], is too linear and does not address the systemic complexity of data analysis and necessary stakeholder discourse. To cover these issues, structuring the analytics process can be linked to the classic linear innovation process.[6][15]\nIn our work, to guide the analytics process from ideation, scoping, and identifying a data set to value generation, a process with four phases is introduced. Taking the classic innovation funnel as a starting point, this concept is transferred to the context of analytics. The process is divided into four parts: i) the generation of ideas, ii) the development of proofs of concept (PoCs) to test these ideas, iii) the implementation and testing of successful PoCs, and, finally, iv) making them available as a product or service. The process is initiated by a first idea or requirement, and the number of ideas or projects is reduced within each phase. Each phase has tasks, as well as barriers or filters, that need to be successfully addressed to continue in the process chain.\nThe three scenarios described above are assessed differently concerning their maturity, as illustrated in the process in Figure 3. Scenario 1 is in a very early stage of idea generation and many open questions need to be addressed. Scenario 2 is more concrete and many more issues are resolved than in scenario 1. However, initiating questions need to be developed before a PoC can be conducted. Scenario 3 builds on a running system, so it is located in the phase of testing and operationalization (phase three).\n\nFigure 3. Phases of the analytics process\n\nFor each phase, different challenges arise. 
While related work emphasizes data-related challenges such as data acquisition, cleansing, or aggregation[16], this work focuses on process challenges.\n\nPhase 1: Idea generation \nOrientating analytics projects begins with an ideation phase. Here, the key challenge is to gather ideas and discuss relevant business problems (see also Provost & Fawcett[17]). Idea generation plays a key role in developing a shared understanding, challenging existing assumptions, orientating big data initiatives, and identifying aspects that can be solved with analytics. For example, design thinking is applied as a systematic approach to problem solving[18] and supports a structured ideation process. Problems of the business unit are collected and matched with the scope of analytics (e.g., technical feasibility, input parameters, and methodical requirements). The ideation phase is iterative. Initially, the general project objectives guide the first ideation round, which aims at getting an overview of present challenges and needs of the business unit. This goes hand in hand with identifying appropriate data sets. The feasibility of the ideas is then checked by experts, and the most promising ideas are selected for prototyping.\nFrom an organizational perspective, involvement of decision makers from all hierarchy levels is a must. Top management is required to resolve conflicts of interest and to create a sense of urgency, middle management is required to free experts from daily work and onboard stakeholders into their particular roles, whereas the expert knowledge of operative specialists is key to detailing the guiding question and checking the feasibility.\nA portfolio is drawn up to select the ideas that are considered in the PoC phase. Innovation portfolios provide a coherent basis for judging the possible impact of ideas (Tidd & Bessant, 2013). They separate ideas into areas and indicate which ideas to prioritize. 
For the exemplary case as illustrated in Figure 4, the ideas are rated and assessed according to three categories: feasibility (x-axis), value creation (y-axis), and overall relevance (size of the node). Feasibility contains aspects such as data availability, time to access data, or the expected complexity of the task. Value creation addresses the expected business value and underlines ideas with a high expected contribution. The overall relevance is used to emphasize which ideas are expected to have greater impact on the problem at hand. So, for example, idea 3 has a high expected feasibility, but the created value is expected to be low. By contrast, idea 4 and idea 8 are bound to a higher expectation concerning value creation and should therefore be prioritized in the next phase.\n\nFigure 4. Portfolio for selecting ideas\n\nBesides the portfolio-based selection process, ideas are filtered during the first phase, for example, because there is no data available to address the problem, the data must be collected first (e.g., implementation of additional sensors), or access is denied (e.g., internal policies, legal restrictions). So, appropriate data sources need to be identified and access needs to be granted for a reliable yet efficient assessment of business needs and data applicability.\nAs an organizational barrier, the right experts need to be identified and freed of their daily work such that they are available for analytics projects. During the ideation process, the right balance between creativity and focus is important, as is bridging the gaps between diverse knowledge areas to ask the right questions.\nThe outcome of this first phase is the set of selected ideas plus the data sources on the basis of which the problems can be examined; a mapping of problems or ideas to data sources is required. In the first phase, strong facilitators are needed to guide participants through the process. 
In addition, methodical expertise to check the technical feasibility of the ideas considered, as well as business understanding, is important. At this stage, the ideas and data are only discussed; the actual examination takes place in the next phase.\nOther issues that need to be clarified in this early phase are data security and data protection. Each country has individual regulations that limit the analysis.\n\nPhase 2: Proof of concept \nTo test the ideas, prototypes are built and PoCs are conducted. PoCs are a first examination of the data set to see whether a raised question can be answered based on the available data.\nThis phase is described in Figure 5. Based on the defined scope from the previous phase, access to the data must be granted, the data is explored and analyzed, and finally the results are communicated.\n\nFigure 5. The analytics process\n\nAs described previously, this phase begins with a project goal or problem description (business need). Whereas classic IT development starts with requirements, analytics often starts in an exploratory way with a dataset and a hypothesis. Specific requirements are generated during the analysis process. So, the PoC phase can only start once data is available. Getting the data or retrieving it from existing systems is among the first steps in a PoC. Here, access barriers such as legal issues or organizational constraints need to be checked. For example, depending on the type of data (e.g., personal data, machine data, market figures), the analysis should be in line with these restrictions.\nNext, the data is explored for a deeper understanding. Here, the data is transformed to a suitable format for further analysis. This step contains data preparation and cleaning, and the first descriptive analysis is conducted.\nThe data is then analyzed for patterns and dependencies during the modelling phase to answer the questions raised. 
Different methods and algorithms are tested, and the results are validated in an iterative process of variable selection, model selection, model adaption, and validation.\nFinally, the results are communicated. A PoC gives a first orientation on the potential in the data with an emphasis on strengths and weaknesses. Possible results are that different modelling techniques do not deliver a valid result, the data quality does not allow modelling, or there is not enough data for a significant statement. This is finally the basis for planning and communicating next steps and coordinating further actions.\nConcerning the presentation of the results, different visualization techniques can be applied working with tools such as Tableau, QlikView, or different open-source platforms. Especially to develop an understanding of the data, descriptive data analysis is helpful. Nevertheless, many models and techniques from advanced analytics deliver figures that cannot be captured by intuitive visualizations.\nPoCs are short, typically lasting around six to eight weeks. Besides getting access to the necessary data or extracting data from relevant sources, among the key challenges in this phase are data quality, data ownership, and data understanding. Further barriers are cleansing and munging the data into a format that can be processed and applying the right models. Furthermore, business understanding is key to retrieving valuable insights from the data and achieving outcomes that are not only plausible but relevant for the business. Another issue is the lack of experience with analytics and the required agility in implementing the results.\n\nPhase 3 and 4: Operationalization \nThen, the PoC results are integrated into a professional IT infrastructure. Prototyped results need to be prepared for operationalization and transformed to an application. The main question to answer is whether the model is scalable and whether the results achieved so far can be applied to a larger data set. 
Adjustments have to be made so that a resulting application can be maintained by an IT service organization without continued support from data scientists. Event or time-based data flows have to be established and, together with the final application, need to be aligned with compliance, security, and data privacy requirements. Test management and service-level agreements for incident handling and application changes need to be agreed on as well as product and portfolio management functions in case the tool or application is meant to assume a strategic, long-term role.\nBarriers include, for example, the required budget, overly complex tests, standards, and compliance. Together, the integration in IT management and allocation of tasks to the IT department represent another issue. This relates to switching from an agile, iterative working model to stable operations and scaling the analytical model and transferring it to maintainable code.\nGenerally, great effort is required in transforming the PoC prototype into a professional infrastructure. Further barriers during operationalization are, for example, establishing support and service management functions; achieving acceptance among the user base; developing adequate training concepts; and transferring knowledge required to maintain, test, and develop the application.\n\nDiscussion and conclusion \nGenerally, the challenge for organizations lies in defining strategies for value generation from the large amount of available data sets. In this article, we discussed how to retrieve value from data and introduced a systematic process that analytics projects follow. First, we described the fundamental building blocks for value creation: business need, data, infrastructure, and analytics. Then, we described the process from ideation to market ready applications. According to the maturity state of the project, the process can be entered at different stages. 
The four phases of this process were described with emphasis on the specific barriers. This model is oriented towards a stage-gate model[6] for analytics processes and aims to structure and systematize exploratory analytics approaches.\nAnalytics and big data are not only a technical challenge; they impact the whole organization and its processes. To be successful with analytics, less effort should be spent on building complex and sophisticated models and more on integrating the results into the existing (technical) infrastructure and processes. For the prototype to be professionalized, the results must be accepted and understood, and the business unit should be continuously involved in the process. Moreover, the right set of people and skills is necessary. Not only are data scientists with competencies in machine learning and statistical modelling required[19], but IT specialists and general business understanding are also needed. In addition, value is only generated from data if the analysis is integrated into an overall framework of skills and competencies and the analytics initiative is embedded in a business application.\nThe results of this article can be transferred to organizations of different sizes and levels of experience when building analytics capabilities. The process as described in this work guides personnel through analytics projects and illustrates the differences to known IT management approaches. By principally discussing the meaning of innovation for analytics, this work contributes to the evolving literature on digital innovation management.[20] In our work, we have outlined an approach for data-driven innovation.\nFuture work should examine the decisions in organizing analytics. This covers aspects such as roles and responsibilities, team structures, leading analytics teams, or the organizational embedding of analytics units in the organization. 
The results of this work should be linked to the extensive research on analytics capability, which is often classified along the dimensions of management, technology, and human capability.[4][19] Throughout the process, as introduced in this work, the understanding of analytics becomes clearer. As such, its contribution to organizational learning, skill development, developing a shared understanding, and building analytics capability should be examined. For example, according to Davenport and Harris[5], this analytics learning process needs around 18 to 36 months. From a technical point of view, the integration of analytics solutions into the overall IT landscape, the professionalization of prototypes, and the change of established processes in particular remain challenging.\n\nAcknowledgements \nThis article was developed from a paper presented at the ISPIM Innovation Conference in Vienna, Austria, June 18\u201321, 2017. ISPIM\u2013the International Society for Professional Innovation Management\u2013is a network of researchers, industrialists, consultants, and public bodies who share an interest in innovation management.\n\nReferences \n\n\n1. McAfee, A.; Brynjolfsson, E. (2012). \"Big data: The management revolution\". Harvard Business Review 90 (10): 60\u20138. PMID 23074865.\n\n2. Wamba, S.F.; Gunasekaran, A.; Akter, S. et al. (2017). \"Big data analytics and firm performance: Effects of dynamic capabilities\". Journal of Business Research 70: 356\u201365. doi:10.1016\/j.jbusres.2016.08.009.\n\n3. Wamba, S.F.; Akter, S.; Edwards, A. et al. (2015). \"How \u2018big data\u2019 can make big impact: Findings from a systematic review and a longitudinal case study\". International Journal of Production Economics 165: 234\u201346. doi:10.1016\/j.ijpe.2014.12.031.\n\n4. Akter, S.; Wamba, S.F.; Gunasekaran, A. 
et al. (2016). \"How to improve firm performance using big data analytics capability and business strategy alignment?\". International Journal of Production Economics 182: 113\u201331. doi:10.1016\/j.ijpe.2016.08.018.\n\n5. Davenport, T.H.; Harris, J.G. (2007). Competing on Analytics: The New Science of Winning. Harvard Business School Press. pp. 240. ISBN 9781422103326.\n\n6. Cooper, R.G. (1990). \"Stage-gate systems: A new tool for managing new products\". Business Horizons 33 (3): 44\u201354. doi:10.1016\/0007-6813(90)90040-I.\n\n7. Braganza, A.; Brooks, L.; Nepelski, D. et al. (2017). \"Resource management in big data initiatives: Processes and dynamic capabilities\". Journal of Business Research 70: 328\u201337. doi:10.1016\/j.jbusres.2016.08.006.\n\n8. Philip Chen, C.L.; Zhang, C.Y. (2014). \"Data-intensive applications, challenges, techniques and technologies: A survey on Big Data\". Information Sciences 275: 314\u201347. doi:10.1016\/j.ins.2014.01.015.\n\n9. Gandomi, A.; Haider, M. (2015). \"Beyond the hype: Big data concepts, methods, and analytics\". International Journal of Information Management 35 (2): 137\u201344. doi:10.1016\/j.ijinfomgt.2014.10.007.\n\n10. James, G.; Witten, D.; Hastie, T.; Tibshirani, R. (2015). An Introduction to Statistical Learning with Applications in R (6th ed.). Springer. ISBN 9781461471387.\n\n11. Mitchell, T.M. (1997). Machine Learning (1st ed.). McGraw-Hill Education. ISBN 9780070428072.
\n\n12. Waller, M.A.; Fawcett, S.E. (2013). \"Data Science, Predictive Analytics, and Big Data: A Revolution That Will Transform Supply Chain Design and Management\". Journal of Business Logistics 34 (2): 77\u201384. doi:10.1111\/jbl.12010.\n\n13. Meierhofer, J.; Meier, K. (2017). \"From Data Science to Value Creation\". Proceedings from IESS 2017: Exploring Services Science: 173\u201381. doi:10.1007\/978-3-319-56925-3_14.\n\n14. Cielen, D.; Meysman, A.; Ali, M. (2016). Introducing Data Science: Big Data, Machine Learning, and more, using Python tools. Manning Publications. pp. 320. ISBN 9781633430037.\n\n15. Salerno, M.S.; de Vasconcelos Gomes, L.A.; da Silva, D.E. et al. (2015). \"Innovation processes: Which process for which project?\". Technovation 35: 59\u201370. doi:10.1016\/j.technovation.2014.07.012.\n\n16. Sivarajah, U.; Kamal, M.M.; Irani, Z. et al. (2017). \"Critical analysis of Big Data challenges and analytical method\". Journal of Business Research 70: 263\u201386. doi:10.1016\/j.jbusres.2016.08.001.\n\n17. Provost, F.; Fawcett, T. (2013). Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking. O'Reilly Media. pp. 414. ISBN 9781449361327.\n\n18. Liedtka, J.; Ogilvie, T. (2011). Designing for Growth: A Design Thinking Tool Kit for Managers. Columbia Business School Publishing. pp. 248. ISBN 9780231158381.\n\n19. Mikalef, P.; Pappas, I.O.; Krogstie, J.; Giannakos, M. et al. (2017). \"Big data analytics capabilities: A systematic literature review and research agenda\". Information Systems and e-Business Management: 1\u201332. doi:10.1007\/s10257-017-0362-y. 
\n\n20. Nambisan, S.; Lyytinen, K.; Majchrzak, A.; Song, M. (2017). \"Digital innovation management: Reinventing innovation management research in a digital world\". MIS Quarterly 41 (1): 223\u201338. doi:10.25300\/MISQ\/2017\/41:1.03.\n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation and grammar. In some cases important information was missing from the references, and that information was added. The original article lists references alphabetically, but this version \u2014 by design \u2014 lists them in order of appearance.\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\">https:\/\/www.limswiki.org\/index.php\/Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition<\/a>\nCategories: LIMSwiki journal articles (added in 2018)LIMSwiki journal articles (all)LIMSwiki journal articles on big data\n
This page was last modified on 24 July 2018, at 20:24.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n","3d10ab796a58a8bc8aa92318f0b8bfdb_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Data_science_as_an_innovation_challenge_From_big_data_to_value_proposition skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Data science as an innovation challenge: From big data to value proposition<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" 
id=\"Abstract\">Abstract<\/span><\/h2>\n<p>Analyzing \u201cbig data\u201d holds huge potential for generating business value. The ongoing advancement of tools and technology over recent years has created a new ecosystem full of opportunities for data-driven innovation. However, as the amount of available data rises to new heights, so too does complexity. Organizations are challenged to create the right contexts, by shaping interfaces and processes, and by asking the right questions to guide the <a href=\"https:\/\/www.limswiki.org\/index.php\/Data_analysis\" title=\"Data analysis\" target=\"_blank\" class=\"wiki-link\" data-key=\"545c95e40ca67c9e63cd0a16042a5bd1\">data analysis<\/a>. Lifting the innovation potential requires teaming and focus to efficiently assign available resources to the most promising initiatives. With reference to the innovation process, this article will concentrate on establishing a process for analytics projects from first ideas to realization (in most cases, a running application). The question we tackle is: what can the practical discourse on big data and analytics learn from innovation management? The insights presented in this article are built on our practical experiences in working with various clients. We will classify analytics projects as well as discuss common innovation barriers along this process.\n<\/p><p><b>Keywords<\/b>: analytics, big data, digital innovation, idea generation, innovation process\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<blockquote>\"Listening to the data is important\u2026 but so is experience and intuition. 
After all, what is intuition at its best but large amounts of data of all kinds filtered through a human brain rather than a math model?\" - Steve Lohr, Technology and economics journalist<\/blockquote>\n<p>Understandably, much effort is being expended into analyzing \u201cbig data\u201d to unleash its potentially enormous business value.<sup id=\"rdp-ebb-cite_ref-McAfeeBig12_1-0\" class=\"reference\"><a href=\"#cite_note-McAfeeBig12-1\" rel=\"external_link\">[1]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-WambaBig17_2-0\" class=\"reference\"><a href=\"#cite_note-WambaBig17-2\" rel=\"external_link\">[2]<\/a><\/sup> New data sources evolve, and new techniques for storing and analyzing large data sets are enabling many new applications, but the exact business value of any one big data application is often unclear. From a practical viewpoint, organizations still struggle to use data meaningfully or they lack the right competencies. Different types of analytics problems arise in an organizational context, depending on whether the starting point is a precise request from a department that only lacks required skills or capabilities (e.g., machine learning) or rather it stems from a principal interest in working with big data (e.g., no own infrastructure, no methodical experience). 
So far, clear strategies and processes for value generation from data are often missing.\n<\/p><p>Much literature addresses the technical and methodical implementation, the transformative strength of big data<sup id=\"rdp-ebb-cite_ref-WambaHow15_3-0\" class=\"reference\"><a href=\"#cite_note-WambaHow15-3\" rel=\"external_link\">[3]<\/a><\/sup>, the enhancement of firm performance by building analytics capability<sup id=\"rdp-ebb-cite_ref-AkterHow16_4-0\" class=\"reference\"><a href=\"#cite_note-AkterHow16-4\" rel=\"external_link\">[4]<\/a><\/sup>, or other managerial issues<sup id=\"rdp-ebb-cite_ref-DavenportCompeting07_5-0\" class=\"reference\"><a href=\"#cite_note-DavenportCompeting07-5\" rel=\"external_link\">[5]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-McAfeeBig12_1-1\" class=\"reference\"><a href=\"#cite_note-McAfeeBig12-1\" rel=\"external_link\">[1]<\/a><\/sup>. Little work covers the transformation process from first ideas to ready analytics applications or the building of analytics competence. This article seeks to address this gap.\n<\/p><p>Analytics initiatives have several unique features. First, they require an exploratory approach\u2014the analysis does not start with specific requirements as in other projects but rather with an idea or data set. To assess the contribution, ideation techniques and rapid prototyping are applied. This exploration plays a key role in developing a shared understanding and giving a big data initiative a strategic direction. Second, analytics projects in their early phase are bound to a complex interplay between different stakeholder interests, competencies, and viewpoints. Learning is an integral part of these projects to build experience and competence with analytics. Third, analytics projects run in parallel to the existing information technology (IT) infrastructure and deliver short scripts or strategic insights, which are then installed in larger IT projects. 
Due to a missing end-to-end target, data is not only to be extracted, transformed, and loaded, but also needs to be identified, classified, and partly structured. So, a general process for value generation needs to be established to guide analytics projects and address these issues.\n<\/p><p>Here, we propose an exact configuration and series of steps to guide a big data analytics project. The lack of specified requirements and defined project goals in a big data analytics project (compared to a classic analytics project) makes it challenging to structure the analytics process. Therefore, the linear innovation process serves as reference and orientation.<sup id=\"rdp-ebb-cite_ref-CooperStage90_6-0\" class=\"reference\"><a href=\"#cite_note-CooperStage90-6\" rel=\"external_link\">[6]<\/a><\/sup> As Braganza and colleagues<sup id=\"rdp-ebb-cite_ref-BraganzaResource17_7-0\" class=\"reference\"><a href=\"#cite_note-BraganzaResource17-7\" rel=\"external_link\">[7]<\/a><\/sup> describe, for big data to be successfully integrated and implemented in an organization, clear and repeatable processes are required. Nevertheless, each analytics initiative is different and the process needs to be flexible. Unfortunately, the literature rarely combines challenges in the analytics process with concepts from innovation management. Yet integrating concepts from innovation management could guide the analytics work of formulating digital strategies, organizational anchoring of the analytics units and their functions, and designing the analytics portfolio, as well as the underlying working principles (e.g., rapid prototyping, ideation techniques).\n<\/p><p>Thus, in this article, we will concentrate on the question of what the practical discourse and work on analytics and on implementing big data in organizations can learn from innovation management. A process for analytics innovation is introduced to guide projects from ideation to value generation. 
Emphasis is put on challenges during this process as well as different entry points. Thereby, we build on experience and insights from a number of analytics projects for different sectors and domains to derive recommendations for successfully implementing analytics solutions.\n<\/p><p>We begin with a definition of big data and analytics. Next, we propose a process for a structured approach to retrieving value from data. Finally, we discuss the results and outline directions for future research.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Big_data_and_analytics\">Big data and analytics<\/span><\/h2>\n<p>In this section, we address the elementary angles from which the analytics value chain should be looked at (Figure 1): data, infrastructure, and analytics\u2013and the business need as the driver. According to our understanding, value is generated by analyzing data within a certain context, with a problem statement related to a business requirement driving the need for innovation. Besides expertise in conducting data and analytics projects, this process requires a working infrastructure, especially when volume, velocity, or variety of data to be analyzed exceeds certain limits. 
Below, we describe the three technical angles in more detail.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Kayser_TechInnoManRev2018_8-3.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"144c5216ae35243091e59d65b943318e\"><img alt=\"Fig1 Kayser TechInnoManRev2018 8-3.png\" src=\"https:\/\/www.limswiki.org\/images\/e\/e9\/Fig1_Kayser_TechInnoManRev2018_8-3.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1.<\/b> Framework of data, infrastructure, analytics and business need<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Data\">Data<\/span><\/h3>\n<p>Big data is often defined with volume (how much data), velocity (speed of data generation), and variety as the diversity of data types.<sup id=\"rdp-ebb-cite_ref-PhilipData14_8-0\" class=\"reference\"><a href=\"#cite_note-PhilipData14-8\" rel=\"external_link\">[8]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-GandomiBeyond15_9-0\" class=\"reference\"><a href=\"#cite_note-GandomiBeyond15-9\" rel=\"external_link\">[9]<\/a><\/sup> Big data describes data collections of a size difficult to process with traditional data management techniques. While many definitions of big data concentrate on the aspect of volume referring to the scale of data available, big data brings in particular heterogeneous formats and a broad spectrum of possible data sources. Examples are structured numeric data or unstructured data such as text, images, or videos. This variety and broad landscape of data sources offers many opportunities for generating insights. 
Moreover, the speed of data creation enables rapid insights in ongoing developments.\n<\/p><p>Recent technical improvements (e.g., <a href=\"https:\/\/www.limswiki.org\/index.php\/Cloud_computing\" title=\"Cloud computing\" target=\"_blank\" class=\"wiki-link\" data-key=\"fcfe5882eaa018d920cedb88398b604f\">cloud computing<\/a>, big data architectures) enable data to be analyzed and stored on a large scale. For many (new) types of data, their exact business value is unclear so far and requires systematic exploration. Available data is often messy, and even when cleaned up can be overwhelming and too complex to be easily understood, even by professional data scientists. The contribution of data is, of course, context specific and varies among business cases and applications. One key challenge is to identify data that best meets the business requirement.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Analytics\">Analytics<\/span><\/h3>\n<p>Data science is concerned with knowledge generation from data. Analytics or data science addresses the exploration of data sets with different quantitative methods motivated from statistical modelling<sup id=\"rdp-ebb-cite_ref-JamesAnInto15_10-0\" class=\"reference\"><a href=\"#cite_note-JamesAnInto15-10\" rel=\"external_link\">[10]<\/a><\/sup> or machine learning.<sup id=\"rdp-ebb-cite_ref-MitchellMachine97_11-0\" class=\"reference\"><a href=\"#cite_note-MitchellMachine97-11\" rel=\"external_link\">[11]<\/a><\/sup> Methods from different disciplines such as statistics, economics, or computer science find application to identify patterns, influence factors, or dependencies. In contrast to business intelligence, analytics reaches further than descriptive analytics (based on SQL) and often has a predictive component. Which method to apply depends on the exact business case. Analyzing data is restricted, for example, by a company\u2019s internal policies as well as legal restrictions and guidelines that vary among countries. 
Data quality and reliability are further issues. Data understanding and domain knowledge are key prerequisites in the analysis process (e.g., Waller &amp; Fawcett<sup id=\"rdp-ebb-cite_ref-WallerData13_12-0\" class=\"reference\"><a href=\"#cite_note-WallerData13-12\" rel=\"external_link\">[12]<\/a><\/sup>), especially when model assumptions are made.\n<\/p><p>Concerning data analysis, there are primarily the following opportunities for organizations:\n<\/p>\n<ul><li> Improved analysis of internal data: One example is forecasting methods that enhance expert-based planning approaches with additional figures. These methods build on existing databases such as business intelligence systems, and they contribute new or further insights to internal firm processes.<\/li>\n<li> Putting data together in new ways: New combinations of data sets offer new insights, for example, through the combination of sensor data and user profiles.<\/li>\n<li> Opening up to new or (so far) unused data sources (e.g., websites, open data) to identify potential for generating new insights: However, a context or application is necessary to use the data. One example is social media data used for market observation.<\/li><\/ul>\n<p>However, the core problem of analytics is to work out the guiding question and achieve a match between business need, data source, and analysis as discussed later in the article.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"IT_infrastructure\">IT infrastructure<\/span><\/h3>\n<p>Relevant for the successful implementation of analytics is the adaptation of the IT infrastructure to embed analytics solutions and integrate different data sources. The core layers of an IT infrastructure are:\n<\/p>\n<ol><li> <b>Data ingestion layer<\/b>: This layer covers the data transfer from a source system to an analytics environment. For this purpose, a toolset and a corresponding process need to be defined. 
Traditional extract, transform, load (ETL) tools and relational databases are combined with Hadoop\/big data setups covering, in particular, scenarios caused by less structured, high-volume, or streamed data. Analytics use cases build on data ranging from data warehouses to fully unstructured data. This breadth challenges classic architectures and requires adaptable schemes. Which data sources to integrate depends on the specific application.<\/li>\n<li> <b>Data value exploration layer<\/b>: Based on the business need and corresponding use case, data is investigated, tested, and sampled in this layer. Depending on the complexity and business question, an appropriate analytics scheme is developed. Business and explorative analysis based on online analytical processing (OLAP) models in memory technologies are supplemented or expanded by using advanced analytics methods and integrating, for example, R or Python plugins.<\/li>\n<li> <b>Data consumption layer<\/b>: Here, the results are used, for example, for visualization. The end user can consume the data or service without deep technical understanding (e.g., for self-service business intelligence).<\/li><\/ol>\n<p>Modern approaches require structures that are adaptable and scalable to different requirements and data sources. Factors such as system performance, cost efficiency, and overall enterprise infrastructure strategy must be taken into consideration.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"From_data_to_value:_Turning_ideas_into_applications\">From data to value: Turning ideas into applications<\/span><\/h2>\n<p>Organizations still struggle to use data meaningfully or lack the right competencies. One of the key challenges in analytics projects is identifying the business need and the guiding questions. 
Principally, different types of analytics problems arise in an organizational context ranging from precise requests that only lack specific capabilities to a principal interest in working with big data (e.g., no own infrastructure, expert-based approaches). This approach implies different starting points for the analytics process and different innovation pathways, both of which are described later in this article.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"What_is_the_starting_point.3F\">What is the starting point?<\/span><\/h3>\n<p>The starting point for each analytics initiative varies. According to the four points mentioned above, the \u201cstate of the art\u201d for each one needs to be assessed individually to estimate the analytics maturity:\n<\/p>\n<ol><li> <b>Business need<\/b>: From case to case, the precision of the problem description and scope varies. For some cases, the leading question and scope guiding the analysis phase are formulated very precisely, and for other cases it needs to be worked out and refined during the process. <\/li>\n<li> <b>Data<\/b>: The data to be used in the project can be defined or an appropriate source is not yet clear. The size and quality of the data essentially determine the progress of the further process. Parameters are, for example, structure (i.e., pre-processing effort) or the size of the data set (e.g., one CSV file or a large database).<\/li>\n<li> <b>Analytics<\/b>: Which methods to apply differs from case to case and must be tested and explored.<\/li>\n<li> <b>Infrastructure<\/b>: The current (technical) state of the business unit (e.g., own data warehouse, reporting system) or own (human) resources and competencies is a further important aspect in classifying requests.<\/li><\/ol>\n<p>These four angles can be rated differently with reference to the maturity level of the analytics request. 
Based on our experience, three scenarios, representing different maturity levels, can be distinguished from these four angles (illustrated in Figure 2).\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig2_Kayser_TechInnoManRev2018_8-3.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"b9bd5fc073c48f83d0b4f29a1cc0f55f\"><img alt=\"Fig2 Kayser TechInnoManRev2018 8-3.png\" src=\"https:\/\/www.limswiki.org\/images\/4\/49\/Fig2_Kayser_TechInnoManRev2018_8-3.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 2.<\/b> Classifying analytics requests: Three maturity levels<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>In scenario 1, the data analysis is motivated by a defined requirement such as market observation during the rollout of a new product. The appropriate data source needs to be identified. The data missing so far implies that the precise analysis cannot be defined and also that there is no existing infrastructure. Ideas need to be developed as to which data sources could be relevant and which issues can be resolved on this basis. Then, different methods from data analysis are applied to generate new insights.\n<\/p><p>In scenario 2, the data source and infrastructure are clearly defined, and the specific questions need to be identified. One application is assessing the contribution of a specific data source that has not been professionally analyzed so far, for example, by means of machine learning. For instance, the business unit has an internal database, considers new methods, and wants to further develop a business intelligence system by adding a forecasting component. 
In this case, the scope is clearer than in the first scenario, and straight away an exploratory data analysis can be started.\n<\/p><p>In scenario 3, there is a precise analytical problem that needs to be professionalized. A first draft shows promising results, and the solution can, as a next step, be upscaled. Guidance in making architectural decisions is needed.\n<\/p><p>These three scenarios are exemplary starting points for analytics projects. The following section describes the implications for the innovation process and outlines different challenges and barriers.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"The_analytics_process\">The analytics process<\/span><\/h3>\n<p>To succeed with analytics, the process from data to value must be structured to be integrated in the existing organization. For example, Braganza and colleagues<sup id=\"rdp-ebb-cite_ref-BraganzaResource17_7-1\" class=\"reference\"><a href=\"#cite_note-BraganzaResource17-7\" rel=\"external_link\">[7]<\/a><\/sup> examine the management of organizational resources in big data initiatives. They stress the importance of systematic approaches and processes to operationalize big data.\n<\/p><p>Related work on analytics processes has a focus on service design<sup id=\"rdp-ebb-cite_ref-MeierhoferFromData17_13-0\" class=\"reference\"><a href=\"#cite_note-MeierhoferFromData17-13\" rel=\"external_link\">[13]<\/a><\/sup> or concentrates on the methodical part of analyzing data.<sup id=\"rdp-ebb-cite_ref-CielenInto16_14-0\" class=\"reference\"><a href=\"#cite_note-CielenInto16-14\" rel=\"external_link\">[14]<\/a><\/sup> The process, as introduced by Braganza and colleagues<sup id=\"rdp-ebb-cite_ref-BraganzaResource17_7-2\" class=\"reference\"><a href=\"#cite_note-BraganzaResource17-7\" rel=\"external_link\">[7]<\/a><\/sup>, is too linear and does not address the systemic complexity of data analysis and necessary stakeholder discourse. 
To cover these issues, structuring the analytics process can be linked to the classic linear innovation process.<sup id=\"rdp-ebb-cite_ref-CooperStage90_6-1\" class=\"reference\"><a href=\"#cite_note-CooperStage90-6\" rel=\"external_link\">[6]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-SalernoInnov15_15-0\" class=\"reference\"><a href=\"#cite_note-SalernoInnov15-15\" rel=\"external_link\">[15]<\/a><\/sup>\n<\/p><p>In our work, to guide the analytics process from ideation, scoping, and identifying a data set to value generation, a process with four phases is introduced. Taking the classic innovation funnel as a starting point, this concept is transferred to the context of analytics. The process is divided into four parts: i) the generation of ideas, ii) the development of proofs of concept (PoCs) to test these ideas, iii) the implementation and testing of successful PoCs, and, finally, iv) making them available as a product or service. Based on a first idea or requirement, the process is initiated, while the number of ideas or projects is reduced within each phase. Each phase has tasks, as well as barriers or filters, that need to be successfully addressed to continue in the process chain.\n<\/p><p>The three scenarios described above are assessed differently concerning their maturity, as illustrated in the process in Figure 3. Scenario 1 is in a very early stage of idea generation, and many open questions need to be addressed. Scenario 2 is more concrete, and many more issues are resolved than in scenario 1. However, initiating questions need to be developed before a PoC can be conducted. 
Scenario 3 builds on a running system, so it is located in the phase of testing and operationalization (phase three).\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig3_Kayser_TechInnoManRev2018_8-3.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"a38076f7c2d7096176373b941c6bea91\"><img alt=\"Fig3 Kayser TechInnoManRev2018 8-3.png\" src=\"https:\/\/www.limswiki.org\/images\/a\/aa\/Fig3_Kayser_TechInnoManRev2018_8-3.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 3.<\/b> Phases of the analytics process<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>For each phase, different challenges arise. While related work emphasizes data-related challenges such as data acquisition, cleansing, or aggregation<sup id=\"rdp-ebb-cite_ref-SivarajahCritical17_16-0\" class=\"reference\"><a href=\"#cite_note-SivarajahCritical17-16\" rel=\"external_link\">[16]<\/a><\/sup>, this work focuses on process challenges.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Phase_1:_Idea_generation\">Phase 1: Idea generation<\/span><\/h3>\n<p>Orientating analytics projects begins with an ideation phase. Here, the key challenge is to gather ideas and discuss relevant business problems (see also Provost &amp; Fawcett<sup id=\"rdp-ebb-cite_ref-ProvostData13_17-0\" class=\"reference\"><a href=\"#cite_note-ProvostData13-17\" rel=\"external_link\">[17]<\/a><\/sup>). Idea generation plays a key role in developing a shared understanding, challenging existing assumptions, orientating big data initiatives, and identifying aspects that can be solved with analytics. 
For example, design thinking is applied as a systematic approach to problem solving<sup id=\"rdp-ebb-cite_ref-LiedtkaDesign11_18-0\" class=\"reference\"><a href=\"#cite_note-LiedtkaDesign11-18\" rel=\"external_link\">[18]<\/a><\/sup> and supports a structured ideation process. Problems of the business unit are collected and matched with the scope of analytics (e.g., technical feasibility, input parameters, and methodical requirements). The ideation phase is iterative. Initially, the general project objectives guide the first ideation round, which aims at getting an overview of present challenges and needs of the business unit. This goes hand in hand with identifying appropriate data sets. Then, the feasibility of the ideas must be checked by experts before ideas are selected for prototyping.\n<\/p><p>From an organizational perspective, involvement of decision makers from all hierarchy levels is a must. Top management is required to resolve conflicts of interest and to create a sense of urgency; middle management is required to free experts from daily work and onboard stakeholders into their particular roles; and the expert knowledge of operative specialists is key to detailing the guiding question and checking the feasibility.\n<\/p><p>A portfolio is drawn up to select the ideas that are considered in the PoC phase. Innovation portfolios provide a coherent basis for judging the possible impact of ideas (Tidd &amp; Bessant, 2013). They separate ideas into areas and indicate which ideas to prioritize. For the example illustrated in Figure 4, the ideas are rated and assessed according to three categories: feasibility (x-axis), value creation (y-axis), and overall relevance (size of the node). Feasibility covers aspects such as data availability, time to access data, or the expected complexity of the task. Value creation addresses the expected business value and highlights ideas with a high expected contribution. 
The overall relevance is used to emphasize which ideas are expected to have greater impact on the problem at hand. So, for example, idea 3 has a high expected feasibility, but the created value is expected to be low. By contrast, idea 4 and idea 8 are bound to a higher expectation concerning value creation and should therefore be prioritized in the next phase.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig4_Kayser_TechInnoManRev2018_8-3.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"08816e2114964a387d985bb931d0ea9c\"><img alt=\"Fig4 Kayser TechInnoManRev2018 8-3.png\" src=\"https:\/\/www.limswiki.org\/images\/9\/91\/Fig4_Kayser_TechInnoManRev2018_8-3.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 4.<\/b> Portfolio for selecting ideas<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Besides the portfolio-based selection process, ideas are filtered during the first phase, for example, because there is no data available to address the problem, the data must be raised first (e.g., implementation of additional sensors), or access is denied (e.g., internal policies, legal restrictions). So, appropriate data sources need to be identified and access needs to be granted for a reliable yet efficient assessment of business needs and data applicability.\n<\/p><p>As an organizational barrier, the right experts need to be identified and freed of their daily work such that they are available for analytics projects. 
During the ideation process, the right balance between creativity and focus is important, as is bridging the gaps between diverse knowledge areas to ask the right questions.\n<\/p><p>The outcome of this first phase is the set of ideas plus the data sources on the basis of which the problems can be examined; a mapping of problems or ideas and data sources is required. In the first phase, strong facilitators are needed to guide participants through the process. In addition, someone with methodical expertise to check the technical feasibility of the ideas considered, as well as business understanding, is important. The ideas and data are only discussed at this stage; the examination itself takes place in the next phase.\n<\/p><p>Further issues that need to be clarified in this early phase are <a href=\"https:\/\/www.limswiki.org\/index.php\/Information_security\" title=\"Information security\" target=\"_blank\" class=\"wiki-link\" data-key=\"9eff362d944224ff1d4ffe3a149d7cff\">data security<\/a> and data protection. Each country has individual regulations that limit the analysis.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Phase_2:_Proof_of_concept\">Phase 2: Proof of concept<\/span><\/h3>\n<p>To test the ideas, prototypes are built and PoCs are conducted. A PoC is a first examination of the data set to see whether a raised question can be answered based on the available data.\n<\/p><p>This phase is described in Figure 5. 
Based on the defined scope from the previous phase, access to the data must be granted, the data is explored and analyzed, and finally the results are communicated.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig5_Kayser_TechInnoManRev2018_8-3.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"0619457461466a8a2b19a86400a04676\"><img alt=\"Fig5 Kayser TechInnoManRev2018 8-3.png\" src=\"https:\/\/www.limswiki.org\/images\/5\/5b\/Fig5_Kayser_TechInnoManRev2018_8-3.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 5.<\/b> The analytics process<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>As described previously, this phase begins with a project goal or problem description (business need). Whereas classic IT development starts with requirements, analytics often starts in an exploratory way with a dataset and a hypothesis. Specific requirements are generated during the analysis process. So, the PoC phase can only start with data or when data is available. Getting the data or retrieving it from existing systems is among the first steps in a PoC. Here, access barriers such as legal issues or organizational constraints need to be checked. For example, depending on the type of data (e.g., personal data, machine data, market figures), the analysis should be in line with these restrictions.\n<\/p><p>Next, the data is explored for a deeper understanding. Here, the data is transformed to a suitable format for further analysis. 
This step includes data preparation and cleaning, and a first descriptive analysis is conducted.\n<\/p><p>The data is then analyzed for patterns and dependencies during the modelling phase to answer the questions raised. Different methods and algorithms are tested, and the results are validated in an iterative process of variable selection, model selection, model adaption, and validation.\n<\/p><p>Finally, the results are communicated. A PoC gives a first orientation on the potential in the data with an emphasis on strengths and weaknesses. Possible results are that different modelling techniques do not deliver a valid result, the data quality does not allow modelling, or there is not enough data for a significant statement. This is ultimately the basis for planning and communicating next steps and coordinating further actions.\n<\/p><p>Concerning the presentation of the results, different visualization techniques can be applied working with tools such as Tableau, QlikView, or different open-source platforms. Descriptive data analysis is especially helpful for developing an understanding of the data. Nevertheless, many models and techniques from advanced analytics deliver figures that cannot be captured by intuitive visualizations.\n<\/p><p>PoCs have a short duration of roughly six to eight weeks. Besides getting access to the necessary data or extracting data from relevant sources, among the key challenges in this phase are data quality, data ownership, and data understanding. Further barriers are cleansing and munging the data into a format that can be processed and applying the right models. Furthermore, business understanding is key to retrieving valuable insights from the data and achieving outcomes that are not only plausible but relevant for the business. 
Another issue is the lack of experience with analytics and the required agility in implementing the results.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Phase_3_and_4:_Operationalization\">Phase 3 and 4: Operationalization<\/span><\/h3>\n<p>Next, the PoC results are integrated into a professional IT infrastructure. Prototyped results need to be prepared for operationalization and transformed into an application. The main question to answer is whether the model is scalable and whether results achieved so far can be applied to a larger data set. Adjustments have to be made so that a resulting application can be maintained by an IT service organization without continued support from data scientists. Event- or time-based data flows have to be established and, together with the final application, need to be aligned with compliance, security, and data privacy requirements. Test management and service-level agreements for incident handling and application changes need to be agreed on, as do product and portfolio management functions in case the tool or application is meant to assume a strategic, long-term role.\n<\/p><p>Barriers include, for example, the required budget, overly complex tests, standards, and compliance. The integration into IT management and the allocation of tasks to the IT department together represent another issue. This relates to switching from an agile, iterative working model to stable operations, scaling the analytical model, and transferring it to maintainable code.\n<\/p><p>Generally, great effort is required in transforming the PoC prototype into a professional infrastructure. 
Further barriers during operationalization are, for example, establishing support and service management functions; achieving acceptance among the user base; developing adequate training concepts; and transferring knowledge required to maintain, test, and develop the application.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Discussion_and_conclusion\">Discussion and conclusion<\/span><\/h2>\n<p>Generally, the challenge for organizations lies in defining strategies for value generation from the large number of available data sets. In this article, we discussed how to retrieve value from data and introduced a systematic process that analytics projects follow. First, we described the fundamental building blocks for value creation: business need, data, infrastructure, and analytics. Then, we described the process from ideation to market-ready applications. According to the maturity state of the project, the process can be entered at different stages. The four phases of this process were described with emphasis on the specific barriers. This model is oriented towards a stage-gate model<sup id=\"rdp-ebb-cite_ref-CooperStage90_6-2\" class=\"reference\"><a href=\"#cite_note-CooperStage90-6\" rel=\"external_link\">[6]<\/a><\/sup> for analytics processes and aims to structure and systematize exploratory analytics approaches.\n<\/p><p>Analytics and big data are not only a technical challenge; they impact the whole organization and its processes. To be successful with analytics, less effort should be spent on building complex and sophisticated models and more on integrating the results into the existing (technical) infrastructure and processes. For the prototype to be professionalized, the results must be accepted and understood, and the business unit should be continuously involved in the process. Moreover, the right set of people and skills is necessary. 
Not only are data scientists with competencies in machine learning and statistical modelling required<sup id=\"rdp-ebb-cite_ref-MikalefBigData17_19-0\" class=\"reference\"><a href=\"#cite_note-MikalefBigData17-19\" rel=\"external_link\">[19]<\/a><\/sup>, but IT specialists and general business understanding are also needed. In addition, value is only generated from data if the analysis is integrated into an overall framework of skills and competencies and the analytics initiative is embedded in a business application.\n<\/p><p>The results of this article can be transferred to organizations of different sizes and levels of experience when building analytics capabilities. The process as described in this work guides personnel through analytics projects and illustrates the differences from known IT management approaches. By discussing the general meaning of innovation for analytics, this work contributes to the evolving literature on digital innovation management.<sup id=\"rdp-ebb-cite_ref-NambisanDigital17_20-0\" class=\"reference\"><a href=\"#cite_note-NambisanDigital17-20\" rel=\"external_link\">[20]<\/a><\/sup> In our work, we have outlined an approach for data-driven innovation.\n<\/p><p>Future work should examine the decisions in organizing analytics. This covers aspects such as roles and responsibilities, team structures, leading analytics teams, or the embedding of analytics units in the organization. 
The results of this work should be linked to the extensive research on analytics capability, which is often classified along the dimensions of management, technology, and human capability.<sup id=\"rdp-ebb-cite_ref-AkterHow16_4-1\" class=\"reference\"><a href=\"#cite_note-AkterHow16-4\" rel=\"external_link\">[4]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-MikalefBigData17_19-1\" class=\"reference\"><a href=\"#cite_note-MikalefBigData17-19\" rel=\"external_link\">[19]<\/a><\/sup> Throughout the process, as introduced in this work, the understanding of analytics becomes clearer. As such, its contribution to organizational learning, skill development, developing a shared understanding, and building analytics capability should be examined. For example, according to Davenport and Harris<sup id=\"rdp-ebb-cite_ref-DavenportCompeting07_5-1\" class=\"reference\"><a href=\"#cite_note-DavenportCompeting07-5\" rel=\"external_link\">[5]<\/a><\/sup>, this analytics learning process takes around 18 to 36 months. From a technical point of view, the integration of analytics solutions into the overall IT landscape, the professionalization of prototypes, and the change of established processes in particular remain challenging.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<p>This article was developed from a paper presented at the ISPIM Innovation Conference in Vienna, Austria, June 18\u201321, 2017. 
ISPIM\u2013the International Society for Professional Innovation Management\u2013is a network of researchers, industrialists, consultants, and public bodies who share an interest in innovation management.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-McAfeeBig12-1\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-McAfeeBig12_1-0\" rel=\"external_link\">1.0<\/a><\/sup> <sup><a href=\"#cite_ref-McAfeeBig12_1-1\" rel=\"external_link\">1.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">McAfee, A.; Byrnjolfsson, E.&#32;(2012).&#32;\"Big data: The management revolution\".&#32;<i>Harvard Business Review<\/i>&#32;<b>90<\/b>&#32;(10): 60\u20138.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a>&#160;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23074865\" target=\"_blank\">23074865<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Big+data%3A+The+management+revolution&amp;rft.jtitle=Harvard+Business+Review&amp;rft.aulast=McAfee%2C+A.%3B+Byrnjolfsson%2C+E.&amp;rft.au=McAfee%2C+A.%3B+Byrnjolfsson%2C+E.&amp;rft.date=2012&amp;rft.volume=90&amp;rft.issue=10&amp;rft.pages=60%E2%80%938&amp;rft_id=info:pmid\/23074865&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WambaBig17-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WambaBig17_2-0\" rel=\"external_link\">\u2191<\/a><\/span> 
<span class=\"reference-text\"><span class=\"citation Journal\">Wamba, S.F.; Gunasekaran, A.; Akter, S. et al.&#32;(2017).&#32;\"Big data analytics and firm performance: Effects of dynamic capabilities\".&#32;<i>Journal of Business Research<\/i>&#32;<b>70<\/b>: 356\u201365.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.jbusres.2016.08.009\" target=\"_blank\">10.1016\/j.jbusres.2016.08.009<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Big+data+analytics+and+firm+performance%3A+Effects+of+dynamic+capabilities&amp;rft.jtitle=Journal+of+Business+Research&amp;rft.aulast=Wamba%2C+S.F.%3B+Gunasekaran%2C+A.%3B+Akter%2C+S.+et+al.&amp;rft.au=Wamba%2C+S.F.%3B+Gunasekaran%2C+A.%3B+Akter%2C+S.+et+al.&amp;rft.date=2017&amp;rft.volume=70&amp;rft.pages=356%E2%80%9365&amp;rft_id=info:doi\/10.1016%2Fj.jbusres.2016.08.009&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WambaHow15-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WambaHow15_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wamba, S.F.; Akter, S.; Edwards, A. 
et al.&#32;(2015).&#32;\"How \u2018big data\u2019 can make big impact: Findings from a systematic review and a longitudinal case study\".&#32;<i>International Journal of Production Economics<\/i>&#32;<b>165<\/b>: 234-46.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.ijpe.2014.12.031\" target=\"_blank\">10.1016\/j.ijpe.2014.12.031<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=How+%E2%80%98big+data%E2%80%99+can+make+big+impact%3A+Findings+from+a+systematic+review+and+a+longitudinal+case+study&amp;rft.jtitle=International+Journal+of+Production+Economics&amp;rft.aulast=Wamba%2C+S.F.%3B+Akter%2C+S.%3B+Edwards%2C+A.+et+al.&amp;rft.au=Wamba%2C+S.F.%3B+Akter%2C+S.%3B+Edwards%2C+A.+et+al.&amp;rft.date=2015&amp;rft.volume=165&amp;rft.pages=234-46&amp;rft_id=info:doi\/10.1016%2Fj.ijpe.2014.12.031&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AkterHow16-4\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-AkterHow16_4-0\" rel=\"external_link\">4.0<\/a><\/sup> <sup><a href=\"#cite_ref-AkterHow16_4-1\" rel=\"external_link\">4.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Akter, S.; Wamba, S.F.; Gunasekaran, A. 
et al.&#32;(2016).&#32;\"How to improve firm performance using big data analytics capability and business strategy alignment?\".&#32;<i>International Journal of Production Economics<\/i>&#32;<b>182<\/b>: 113\u201331.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.ijpe.2016.08.018\" target=\"_blank\">10.1016\/j.ijpe.2016.08.018<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=How+to+improve+firm+performance+using+big+data+analytics+capability+and+business+strategy+alignment%3F&amp;rft.jtitle=International+Journal+of+Production+Economics&amp;rft.aulast=Akter%2C+S.%3B+Wamba%2C+S.F.%3B+Gunasekaran%2C+A.+et+al.&amp;rft.au=Akter%2C+S.%3B+Wamba%2C+S.F.%3B+Gunasekaran%2C+A.+et+al.&amp;rft.date=2016&amp;rft.volume=182&amp;rft.pages=113%E2%80%9331&amp;rft_id=info:doi\/10.1016%2Fj.ijpe.2016.08.018&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DavenportCompeting07-5\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-DavenportCompeting07_5-0\" rel=\"external_link\">5.0<\/a><\/sup> <sup><a href=\"#cite_ref-DavenportCompeting07_5-1\" rel=\"external_link\">5.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation book\">Davenport, T.H.; Harris, J.G.&#32;(2007).&#32;<i>Competing on Analytics: The New Science of Winning<\/i>.&#32;Harvard Business School Press.&#32;pp.&#160;240.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9781422103326.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Competing+on+Analytics%3A+The+New+Science+of+Winning&amp;rft.aulast=Davenport%2C+T.H.%3B+Harris%2C+J.G.&amp;rft.au=Davenport%2C+T.H.%3B+Harris%2C+J.G.&amp;rft.date=2007&amp;rft.pages=pp.%26nbsp%3B240&amp;rft.pub=Harvard+Business+School+Press&amp;rft.isbn=9781422103326&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CooperStage90-6\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-CooperStage90_6-0\" rel=\"external_link\">6.0<\/a><\/sup> <sup><a href=\"#cite_ref-CooperStage90_6-1\" rel=\"external_link\">6.1<\/a><\/sup> <sup><a href=\"#cite_ref-CooperStage90_6-2\" rel=\"external_link\">6.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Cooper, R.G.&#32;(1990).&#32;\"Stage-gate systems: A new tool for managing new products\".&#32;<i>Business Horizons<\/i>&#32;<b>33<\/b>&#32;(3): 44\u201354.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2F0007-6813%2890%2990040-I\" target=\"_blank\">10.1016\/0007-6813(90)90040-I<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Stage-gate+systems%3A+A+new+tool+for+managing+new+products&amp;rft.jtitle=Business+Horizons&amp;rft.aulast=Cooper%2C+R.G.&amp;rft.au=Cooper%2C+R.G.&amp;rft.date=1990&amp;rft.volume=33&amp;rft.issue=3&amp;rft.pages=44%E2%80%9354&amp;rft_id=info:doi\/10.1016%2F0007-6813%2890%2990040-I&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BraganzaResource17-7\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-BraganzaResource17_7-0\" rel=\"external_link\">7.0<\/a><\/sup> <sup><a href=\"#cite_ref-BraganzaResource17_7-1\" rel=\"external_link\">7.1<\/a><\/sup> <sup><a href=\"#cite_ref-BraganzaResource17_7-2\" rel=\"external_link\">7.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Braganza, A.; Brooks, L.; Nepelski, D. 
et al.&#32;(2017).&#32;\"Resource management in big data initiatives: Processes and dynamic capabilities\".&#32;<i>Journal of Business Research<\/i>&#32;<b>70<\/b>: 328\u201337.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.jbusres.2016.08.006\" target=\"_blank\">10.1016\/j.jbusres.2016.08.006<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Resource+management+in+big+data+initiatives%3A+Processes+and+dynamic+capabilities&amp;rft.jtitle=Journal+of+Business+Research&amp;rft.aulast=Braganza%2C+A.%3B+Brooks%2C+L.%3B+Nepelski%2C+D.+et+al.&amp;rft.au=Braganza%2C+A.%3B+Brooks%2C+L.%3B+Nepelski%2C+D.+et+al.&amp;rft.date=2017&amp;rft.volume=70&amp;rft.pages=328%E2%80%9337&amp;rft_id=info:doi\/10.1016%2Fj.jbusres.2016.08.006&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PhilipData14-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PhilipData14_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Philip Chen, C.L.; Zhang, C.Y.&#32;(2014).&#32;\"Data-intensive applications, challenges, techniques and technologies: A survey on Big Data\".&#32;<i>Information Sciences<\/i>&#32;<b>275<\/b>: 314-347.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.ins.2014.01.015\" target=\"_blank\">10.1016\/j.ins.2014.01.015<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Data-intensive+applications%2C+challenges%2C+techniques+and+technologies%3A+A+survey+on+Big+Data&amp;rft.jtitle=Information+Sciences&amp;rft.aulast=Philip+Chen%2C+C.L.%3B+Zhang%2C+C.Y.&amp;rft.au=Philip+Chen%2C+C.L.%3B+Zhang%2C+C.Y.&amp;rft.date=2014&amp;rft.volume=275&amp;rft.pages=314-347&amp;rft_id=info:doi\/10.1016%2Fj.ins.2014.01.015&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GandomiBeyond15-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GandomiBeyond15_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Gandomi, A.; Haider, M.&#32;(2015).&#32;\"Beyond the hype: Big data concepts, methods, and analytics\".&#32;<i>International Journal of Information Management<\/i>&#32;<b>35<\/b>&#32;(2): 137\u201344.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.ijinfomgt.2014.10.007\" target=\"_blank\">10.1016\/j.ijinfomgt.2014.10.007<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Beyond+the+hype%3A+Big+data+concepts%2C+methods%2C+and+analytics&amp;rft.jtitle=International+Journal+of+Information+Management&amp;rft.aulast=Gandomi%2C+A.%3B+Haider%2C+M.&amp;rft.au=Gandomi%2C+A.%3B+Haider%2C+M.&amp;rft.date=2015&amp;rft.volume=35&amp;rft.issue=2&amp;rft.pages=137%E2%80%9344&amp;rft_id=info:doi\/10.1016%2Fj.ijinfomgt.2014.10.007&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-JamesAnInto15-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-JamesAnInto15_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">James, G.; Witten, D.; Hastie, T.; Tibshirani, R.&#32;(2015).&#32;<i>An Introduction to Statistical Learning with Applications in R<\/i>&#32;(6th ed.).&#32;Springer.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/9781461471387\" target=\"_blank\">9781461471387<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=An+Introduction+to+Statistical+Learning+with+Applications+in+R&amp;rft.aulast=James%2C+G.%3B+Witten%2C+D.%3B+Hastie%2C+T.%3B+Tibshirani%2C+R.&amp;rft.au=James%2C+G.%3B+Witten%2C+D.%3B+Hastie%2C+T.%3B+Tibshirani%2C+R.&amp;rft.date=2015&amp;rft.edition=6th&amp;rft.pub=Springer&amp;rft_id=info:doi\/9781461471387&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-MitchellMachine97-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MitchellMachine97_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Mitchell, T.M.&#32;(1997).&#32;<i>Machine Learning<\/i>&#32;(1st ed.).&#32;McGraw-Hill Education.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/9780070428072\" target=\"_blank\">9780070428072<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Machine+Learning&amp;rft.aulast=Mitchell%2C+T.M.&amp;rft.au=Mitchell%2C+T.M.&amp;rft.date=1997&amp;rft.edition=1st&amp;rft.pub=McGraw-Hill+Education&amp;rft_id=info:doi\/9780070428072&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WallerData13-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WallerData13_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Waller, M.A.; Fawcett, S.E.&#32;(2013).&#32;\"Data Science, Predictive Analytics, and Big Data: A Revolution That Will Transform Supply Chain Design and Management\".&#32;<i>Journal of Business Logistics<\/i>&#32;<b>34<\/b>&#32;(2): 77\u201384.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1111%2Fjbl.12010\" target=\"_blank\">10.1111\/jbl.12010<\/a>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Data+Science%2C+Predictive+Analytics%2C+and+Big+Data%3A+A+Revolution+That+Will+Transform+Supply+Chain+Design+and+Management&amp;rft.jtitle=Journal+of+Business+Logistics&amp;rft.aulast=Waller%2C+M.A.%3B+Fawcett%2C+S.E.&amp;rft.au=Waller%2C+M.A.%3B+Fawcett%2C+S.E.&amp;rft.date=2013&amp;rft.volume=34&amp;rft.issue=2&amp;rft.pages=77%E2%80%9384&amp;rft_id=info:doi\/10.1111%2Fjbl.12010&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MeierhoferFromData17-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MeierhoferFromData17_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Meierhofer, J.; Meier, K.&#32;(2017).&#32;\"From Data Science to Value Creation\".&#32;<i>Proceedings from IESS 2017: Exploring Services Science<\/i>: 173\u201381.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2F978-3-319-56925-3_14\" target=\"_blank\">10.1007\/978-3-319-56925-3_14<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=From+Data+Science+to+Value+Creation&amp;rft.jtitle=Proceedings+from+IESS+2017%3A+Exploring+Services+Science&amp;rft.aulast=Meierhofer%2C+J.%3B+Meier%2C+K.&amp;rft.au=Meierhofer%2C+J.%3B+Meier%2C+K.&amp;rft.date=2017&amp;rft.pages=173%E2%80%9381&amp;rft_id=info:doi\/10.1007%2F978-3-319-56925-3_14&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: 
none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CielenInto16-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CielenInto16_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Cielen, D.; Meysman, A.; Ali, M.&#32;(2016).&#32;<i>Introducing Data Science: Big Data, Machine Learning, and more, using Python tools<\/i>.&#32;Manning Publications.&#32;pp.&#160;320.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9781633430037.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Introducing+Data+Science%3A+Big+Data%2C+Machine+Learning%2C+and+more%2C+using+Python+tools&amp;rft.aulast=Cielen%2C+D.%3B+Meysman%2C+A.%3B+Ali%2C+M.&amp;rft.au=Cielen%2C+D.%3B+Meysman%2C+A.%3B+Ali%2C+M.&amp;rft.date=2016&amp;rft.pages=pp.%26nbsp%3B320&amp;rft.pub=Manning+Publications&amp;rft.isbn=9781633430037&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SalernoInnov15-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SalernoInnov15_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Salerno, M.S.; de Vasconcelos Gomes, L.A.; da Silva, D.E. 
et al.&#32;(2015).&#32;\"Innovation processes: Which process for which project?\".&#32;<i>Technovation<\/i>&#32;<b>35<\/b>: 59\u201370.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.technovation.2014.07.012\" target=\"_blank\">10.1016\/j.technovation.2014.07.012<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Innovation+processes%3A+Which+process+for+which+project%3F&amp;rft.jtitle=Technovation&amp;rft.aulast=Salerno%2C+M.S.%3B+de+Vasconcelos+Gomes%2C+L.A.%3B+da+Silva%2C+D.E.+et+al.&amp;rft.au=Salerno%2C+M.S.%3B+de+Vasconcelos+Gomes%2C+L.A.%3B+da+Silva%2C+D.E.+et+al.&amp;rft.date=2015&amp;rft.volume=35&amp;rft.pages=59%E2%80%9370&amp;rft_id=info:doi\/10.1016%2Fj.technovation.2014.07.012&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SivarajahCritical17-16\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SivarajahCritical17_16-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Sivarajah, U.; Kamal, M.M.; Irani, Z. 
et al.&#32;(2017).&#32;\"Critical analysis of Big Data challenges and analytical method\".&#32;<i>Journal of Business Research<\/i>&#32;<b>70<\/b>: 263\u201386.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.jbusres.2016.08.001\" target=\"_blank\">10.1016\/j.jbusres.2016.08.001<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Critical+analysis+of+Big+Data+challenges+and+analytical+method&amp;rft.jtitle=Journal+of+Business+Research&amp;rft.aulast=Sivarajah%2C+U.%3B+Kamal%2C+M.M.%3B+Irani%2C+Z.+et+al.&amp;rft.au=Sivarajah%2C+U.%3B+Kamal%2C+M.M.%3B+Irani%2C+Z.+et+al.&amp;rft.date=2017&amp;rft.volume=70&amp;rft.pages=263%E2%80%9386&amp;rft_id=info:doi\/10.1016%2Fj.jbusres.2016.08.001&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ProvostData13-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ProvostData13_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Provost, F.; Fawcett, T.&#32;(2013).&#32;<i>Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking<\/i>.&#32;O'Reilly Media.&#32;pp.&#160;414.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9781449361327.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Data+Science+for+Business%3A+What+You+Need+to+Know+about+Data+Mining+and+Data-Analytic+Thinking&amp;rft.aulast=Provost%2C+F.%3B+Fawcett%2C+T.&amp;rft.au=Provost%2C+F.%3B+Fawcett%2C+T.&amp;rft.date=2013&amp;rft.pages=pp.%26nbsp%3B414&amp;rft.pub=O%27Reilly+Media&amp;rft.isbn=9781449361327&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LiedtkaDesign11-18\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LiedtkaDesign11_18-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Liedtka, J.; Ogilvie, T.&#32;(2011).&#32;<i>Designing for Growth: A Design Thinking Tool Kit for Managers<\/i>.&#32;Columbia Business School Publishing.&#32;pp.&#160;248.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9780231158381.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Designing+for+Growth%3A+A+Design+Thinking+Tool+Kit+for+Managers&amp;rft.aulast=Liedtka%2C+J.%3B+Ogilvie%2C+T.&amp;rft.au=Liedtka%2C+J.%3B+Ogilvie%2C+T.&amp;rft.date=2011&amp;rft.pages=pp.%26nbsp%3B248&amp;rft.pub=Columbia+Business+School+Publishing&amp;rft.isbn=9780231158381&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MikalefBigData17-19\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-MikalefBigData17_19-0\" rel=\"external_link\">19.0<\/a><\/sup> <sup><a href=\"#cite_ref-MikalefBigData17_19-1\" 
rel=\"external_link\">19.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Mikalef, P.; Pappas, I.O.; Krogstie, J.; Giannakos, M. et al.&#32;(2017).&#32;\"Big data analytics capabilities: A systematic literature review and research agenda\".&#32;<i>Information Systems and e-Business Management<\/i>: 1\u201332.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs10257-017-0362-y\" target=\"_blank\">10.1007\/s10257-017-0362-y<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Big+data+analytics+capabilities%3A+A+systematic+literature+review+and+research+agenda&amp;rft.jtitle=Information+Systems+and+e-Business+Management&amp;rft.aulast=Mikalef%2C+P.%3B+Pappas%2C+I.O.%3B+Krogstie%2C+J.%3B+Giannakos%2C+M.+et+al.&amp;rft.au=Mikalef%2C+P.%3B+Pappas%2C+I.O.%3B+Krogstie%2C+J.%3B+Giannakos%2C+M.+et+al.&amp;rft.date=2017&amp;rft.pages=1%E2%80%9332&amp;rft_id=info:doi\/10.1007%2Fs10257-017-0362-y&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NambisanDigital17-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NambisanDigital17_20-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Nambisan, S.; Lyytinen, K.; Majchrzak, A.; Song, M.&#32;(2017).&#32;\"Digital innovation management: Reinventing innovation management research in a digital world\".&#32;<i>MIS Quarterly<\/i>&#32;<b>41<\/b>&#32;(1): 223\u201338.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" 
target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.25300%2FMISQ%2F2017%2F41%3A1.03\" target=\"_blank\">10.25300\/MISQ\/2017\/41:1.03<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Digital+innovation+management%3A+Reinventing+innovation+management+research+in+a+digital+world&amp;rft.jtitle=MIS+Quarterly&amp;rft.aulast=Nambisan%2C+S.%3B+Lyytinen%2C+K.%3B+Majchrzak%2C+A.%3B+Song%2C+M.&amp;rft.au=Nambisan%2C+S.%3B+Lyytinen%2C+K.%3B+Majchrzak%2C+A.%3B+Song%2C+M.&amp;rft.date=2017&amp;rft.volume=41&amp;rft.issue=1&amp;rft.pages=223%E2%80%9338&amp;rft_id=info:doi\/10.25300%2FMISQ%2F2017%2F41%3A1.03&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation and grammar. In some cases important information was missing from the references, and that information was added. 
The original article lists references alphabetically, but this version \u2014 by design \u2014 lists them in order of appearance.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition\">https:\/\/www.limswiki.org\/index.php\/Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","3d10ab796a58a8bc8aa92318f0b8bfdb_images":["https:\/\/www.limswiki.org\/images\/e\/e9\/Fig1_Kayser_TechInnoManRev2018_8-3.png","https:\/\/www.limswiki.org\/images\/4\/49\/Fig2_Kayser_TechInnoManRev2018_8-3.png","https:\/\/www.limswiki.org\/images\/a\/aa\/Fig3_Kayser_TechInnoManRev2018_8-3.png","https:\/\/www.limswiki.org\/images\/9\/91\/Fig4_Kayser_TechInnoManRev2018_8-3.png","https:\/\/www.limswiki.org\/images\/5\/5b\/Fig5_Kayser_TechInnoManRev2018_8-3.png"],"3d10ab796a58a8bc8aa92318f0b8bfdb_timestamp":1544815903,"b1e9f2666792cce972a4a66979d1d937_type":"article","b1e9f2666792cce972a4a66979d1d937_title":"A data quality strategy to enable FAIR, programmatic access across large, diverse data collections for high performance data analysis (Evans et al. 2017)","b1e9f2666792cce972a4a66979d1d937_url":"https:\/\/www.limswiki.org\/index.php\/Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis","b1e9f2666792cce972a4a66979d1d937_plaintext":"\n\nJournal:A data quality strategy to enable FAIR, programmatic access across large, diverse data collections for high performance data analysis\n\nFrom LIMSWiki\n\nFull article title: A data quality strategy to enable FAIR, programmatic access across large, diverse data collections for high performance data analysis\nJournal: Informatics\nAuthor(s): Evans, Ben; Druken, Kelsey; Wang, Jingbo; Yang, Rui; Richards, Clare; Wyborn, Lesley\nAuthor affiliation(s): Australian National University\nPrimary contact: Email: Jingbo dot Wang at anu dot edu dot au\nEditors: Ge, Mouzhi; Dohnal, Vlastislav\nYear published: 2017\nVolume and issue: 4(4)\nPage(s): 45\nDOI: 10.3390\/informatics4040045\nISSN: 2227-9709\nDistribution license: Creative Commons Attribution 4.0 International\nWebsite: http:\/\/www.mdpi.com\/2227-9709\/4\/4\/45\/htm\nDownload: http:\/\/www.mdpi.com\/2227-9709\/4\/4\/45\/pdf (PDF)\n\nContents\n\n1 Abstract \n2 Introduction \n3 NCI's data quality strategy (DQS) \n\n3.1 Data structure \n\n3.1.1 Organization of data within the data structure \n\n\n3.2 Data QC \n\n3.2.1 Climate and forecast (CF) convention \n3.2.2 Attribute Convention for Data Discovery (ACDD) \n\n\n3.3 Benchmarking methodology \n3.4 Data QA \n\n\n4 Examples of tests and reports undertaken on NCI datasets prior to publication \n\n4.1 Metadata QC checker reports \n4.2 Functionality test QA reports \n4.3 Benchmarking use cases \n4.4 Results sharing \n\n\n5 Discussion \n6 Conclusions \n7 Acknowledgements \n\n7.1 Author contributions \n7.2 Conflicts of interest \n\n\n8 Appendix \n\n8.1 Appendix A \n8.2 Appendix B \n\n\n9 References \n10 Notes \n\n\n\nAbstract \nTo ensure seamless, programmatic access to data for high-performance computing (HPC) and analysis across multiple research domains, it is vital to have a methodology for standardization of both data and services. 
At the Australian National Computational Infrastructure (NCI) we have developed a data quality strategy (DQS) that currently provides processes for: (1) consistency of data structures needed for a high-performance data (HPD) platform; (2) quality control (QC) through compliance with recognized community standards; (3) benchmarking cases of operational performance tests; and (4) quality assurance (QA) of data through demonstrated functionality and performance across common platforms, tools, and services. 
By implementing the NCI DQS, we have seen progressive improvement in the quality and usefulness of the datasets across different subject domains, and demonstrated the ease with which modern programmatic methods can be used to access the data, either in situ or via web services, and for uses ranging from traditional analysis methods through to emerging machine learning techniques. To help increase data re-usability by broader communities, particularly in high-performance environments, the DQS is also used to identify the need for any extensions to the relevant international standards for interoperability and\/or programmatic access.\nKeywords: data quality, quality control, quality assurance, benchmarks, performance, data management policy, netCDF, high-performance computing, HPC, FAIR data\n\nIntroduction \nThe National Computational Infrastructure (NCI) manages one of Australia\u2019s largest and most diverse repositories (10+ petabytes) of research data collections spanning datasets from climate, coasts, oceans, and geophysics through to astronomy, bioinformatics, and the social sciences.[1] Within these domains, data can be of different types such as gridded, ungridded (e.g., line surveys, point clouds), and raster image types, as well as having diverse coordinate reference projections and resolutions. 
NCI has been following the FORCE11 FAIR data principles to make data findable, accessible, interoperable, and reusable.[2] These principles provide guidelines for a research data repository to enable data-intensive science, and enable researchers to answer questions such as how to trust the scientific quality of data and how to determine if the data is usable by their software platform and tools.\nTo ensure broader reuse of the data and enable transdisciplinary integration across multiple domains, as well as enabling programmatic access, a dataset must be usable and of value to a broad range of users from different communities.[3] Therefore, a set of standards and \"best practices\" for ensuring the quality of scientific data products is a critical component in the life cycle of data management. We undertake both QC through compliance with recognized community standards (e.g., checking file headers to make sure they comply with the community convention standards) and QA of data through demonstrated functionality and performance across common platforms, tools, and services (e.g., verifying that the data functions with designated software and libraries).\nThe Earth Science Information Partners (ESIP) Information Quality Cluster (IQC) has been established for collecting such standards and best practices and then assisting data producers in implementing them, and users in taking advantage of them.[4] ESIP considers four different aspects of information quality in close relation to different stages of data products in their four-stage life cycle[4]: (1) define, develop, and validate; (2) produce, access, and deliver; (3) maintain, preserve, and disseminate; and (4) enable use, provide support, and service.\nScience teams or data producers are responsible for managing data quality during the first two stages, while data publishers are responsible for the latter two stages. 
As NCI is both a digital repository, which manages the storage and distribution of reference data for a range of users, and the provider of high-end compute and data analysis platforms, the data quality processes are focused on the latter two stages. A check on the scientific correctness is considered to be part of the first two stages and is not included in the definition of \"data quality\" that is described in this paper.\n\nNCI's data quality strategy (DQS) \nNCI developed a DQS to establish a level of assurance, and hence confidence, for our user community and key stakeholders as an integral part of service provision.[5] It is also a step on the pathway to meet the technical requirements of a trusted digital repository, such as the CoreTrustSeal certification.[6] As meeting these requirements involves the systematic application of agreed policies and procedures, our DQS provides a suite of guidelines, recommendations, and processes for: (1) consistency of data structures suitable for the underlying high-performance data (HPD) platform; (2) QC through compliance with recognized community standards; (3) benchmarking performance using operational test cases; and (4) QA through demonstrated functionality and benchmarking across common platforms, tools, and services.\nNCI\u2019s DQS was developed iteratively, first through a review of other approaches for management of data QC and data QA (e.g., Ramapriyan et al.[4] and Stall[7]) to establish the DQS methodology, and second by applying this methodology to selected use cases at NCI that captured existing and emerging requirements, particularly the use cases that relate to HPC.\nOur approach is consistent with the American Geophysical Union (AGU) Data Management Maturity (DMM)SM model[7][8], which was developed in partnership with the Capability Maturity Model Integration (CMMI) Institute and adapted from their DMMSM[9] model for applications in the Earth and space sciences. 
The AGU DMMSM model aims to provide guidance on how to improve data quality and consistency and facilitate reuse in the data life cycle. It enables both producers of data and repositories that store data to ensure that datasets are \"fit-for-purpose,\" repeatable, and trustworthy. The Data Quality Process Areas in the AGU DMMSM model define a collaborative approach for receiving, assessing, cleansing, and curating data to ensure \"fitness\" for intended use in the scientific community.\nAfter several iterations, the NCI DQS was established as part of the formal data publishing process and is applied throughout the cycle from submission of data to the NCI repository through to its final publication. The approach is also being adopted by the data producers who now engage with the process from the preparation stage, prior to ingestion onto the NCI data platform. Early consultation and feedback have greatly improved both the quality of the data and the timeliness of publication. To improve the efficiency further, one of our major data suppliers is including our DQS requirements in their data generation processes to ensure data quality is considered earlier in data production.\nThe technical requirements and implementation of our DQS will be described as four major but related data components: structure, QC, benchmarking, and QA.\n\nData structure \nNCI's research data collections are particularly focused on enabling programmatic access, required by: (1) NCI core services such as the NCI supercomputer and NCI cloud-based capabilities; (2) community virtual laboratories and virtual research environments; (3) those that require remote access through established scientific standards-based protocols that use data services; and, (4) increasingly, by international data federations. 
To enable these different types of programmatic access, datasets must be registered in the central NCI catalogue[10], which records their location for access both on the filesystems and via data services.\nThis requires the data to be well-organized and compliant with uniform, professionally managed standards and consistent community conventions wherever possible. For example, the climate community Coupled Model Intercomparison Project (CMIP) experiments use the Data Reference Syntax (DRS)[11], whilst the National Aeronautics and Space Administration (NASA) recommends a specific naming convention for Landsat satellite image products.[12] The NCI data collection catalogue manages the details of each dataset through a uniform application of ISO 19115:2003[13], an international schema used for describing geographic information and services. Essentially, each catalogue entry points to the location of the data within the NCI data infrastructure. The catalogue entries also point to the service endpoints such as a standard data download point and a data subsetting interface, as well as Open Geospatial Consortium (OGC) Web Map Service (WMS) and Web Coverage Service (WCS). NCI can publish data through several different servers, and as such the specific endpoint for each of these service capabilities is listed.\nNCI has developed a catalogue and directory policy, which provides guidelines for the organization of datasets within the concepts of data collections and data sub-collections and includes a comprehensive definition for each hierarchical layer. The definitions are:\n\n A data collection is the highest in the hierarchy of data groupings at NCI. 
It comprises either an exclusive grouping of data subcollections or a tiered structure with an exclusive grouping of lower-tier data collections, where the lowest-tier data collection will only contain data subcollections.\n A data subcollection is an exclusive grouping of datasets (i.e., belonging to only one subcollection) where the constituent datasets are tightly managed. It must sit within one organization that has responsibility for the underlying management of its constituent datasets. A data subcollection constitutes a strong connection between the component datasets, and is organized coherently around a single scientific element (e.g., model, instrument). A subcollection must have compatible licenses such that constituent datasets do not need different access arrangements.\n A dataset is a compilation of data that constitutes a programmable data unit that has been collected and organized using a self-contained process. For this purpose it must have a named data owner, a single license, one set of semantics, ontologies, and vocabularies, and a single data format and internal data convention. A dataset must include its version.\n A dataset granule is used for some scientific domains that require a finer level of granularity (e.g., in satellite Earth Observation datasets). A granule refers to the smallest aggregation of data that can be independently described, inventoried, and retrieved as defined by NASA.[14] Dataset granules have their own metadata and support values associated with the additional attributes defined by parent datasets.\nIn addition, we use the term \"data category\" to identify common contents\/themes across all levels of the hierarchy.\n\n A data category allows a broad spectrum of options to encode relationships between data. 
A data category can be anything that weakly relates datasets, with the primary way of discovering the groupings within the data being by key terms (e.g., keywords, attributes, vocabularies, ontologies). Datasets are not exclusive to a single category.\nOrganization of data within the data structure \nNCI has organized data collections according to this hierarchical structure both on the filesystem and within our catalogue system. Figure 1 shows how these datasets are organized. Figure 2 provides an example of how the CMIP 5 data collection demonstrates the hierarchical directory structure.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 1. Illustration of the different levels of metadata and community standards used for each\n\n\n\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 2. Example schematic of the National Computational Infrastructure (NCI)\u2019s data organizational structure using the Coupled Model Intercomparison Project (CMIP) 5 collection. The CMIP 5 collection housed at NCI includes three sub-collections from The Commonwealth Scientific and Industrial Research Organisation (CSIRO) and Australian Bureau of Meteorology (BOM): (1) the ACCESS-1.0 model, (2) ACCESS-1.3 model, and (3) Mk 3.6.0 model. Each sub-collection then contains a number of datasets, such as \u201cpiControl\u201d (pre-industrial control experiment), which then contains numerous granules (e.g., precipitation, \u201cpr\u201d). A complete description of the range of CMIP5 contents can be found at: https:\/\/pcmdi.llnl.gov\/mips\/cmip5\/experiment_design.html.\n\n\n\nData QC \nData QC measures are intended to ensure that all datasets hosted at NCI adhere, wherever possible, to existing community standards for metadata and data. For Network Common Data Form (netCDF) (and Hierarchical Data Format v5 (HDF5)-based) file formats, these include the Climate and Forecast (CF) Convention[15] and the Attribute Convention for Data Discovery[16] (see Table 1).\n\n\n\n\n\n\n\nTable 1. The NCI Quality Control (QC) mandatory requirements. 
A full list of the Attribute Convention for Data Discovery (ACDD) metadata requirements used by NCI is provided in Appendix A.\n\n\n\nConvention\/Standard\n\nNCI Requirements\n\nFurther Information\n\n\nCF\n\nMandatory CF criteria, e.g., no \u201cerrors\u201d result from any of the recommended compliance checkers\n\nhttp:\/\/cfconventions.org\n\n\nACDD (Modified version)\n\nRequired attributes are included within each file: 1. title, 2. summary, 3. source, 4. date_created\n\nhttp:\/\/wiki.esipfed.org\/index.php\/Attribute_Convention_for_Data_Discovery_1-3\n\n\n\nClimate and forecast (CF) convention \nNCI requires that all geospatial datasets meet the minimum mandatory CF convention metadata criteria at the time of publication, and, where scientifically applicable, we require they meet the relevant recommended CF criteria. These requirements are detailed in the latest CF convention document provided on their website.[15]\nThe CF convention is the primary community standard for netCDF data, which was originally developed by the climate community and is now being adapted for other domains, e.g., marine and geosciences. It defines metadata requirements for information on each variable contained within the file as well as spatial and temporal properties of the data, so that contents are fully \u201cself-described.\u201d For example, no additional companion files or external sources are required to describe any information about how to read or utilize the data contents within the file. The metadata requirements also provide important guidelines on how to structure spatial data. This includes recommendations on the order of dimensions, the handling of gridded and non-gridded (time series, point and trajectory) data, coordinate reference system descriptions, standardized units, and cell measures (i.e., information relating to the size, shape, or location of grid cells). 
CF requires that all metadata information be equally readable and understandable by humans and software, which has the benefit of allowing software tools to easily display and dynamically perform associated operations.\n\nAttribute Convention for Data Discovery (ACDD) \nThe ACDD is another common standard for netCDF data that complements the CF convention requirements.[16] The ACDD primarily governs metadata information written at the file-level (i.e., netCDF global attributes), while the CF convention pertains mainly to variable-level metadata and structure information. Therefore, when combined, these two standards help to fully describe both the higher-level metadata relevant to the entire file (e.g., dataset title, custodian, date created, etc.) and the lower-level information about each individual variable or dimension (e.g., name, units, bounds, fill values, etc.). ACDD also provides the ability to link to even higher levels such as the dataset parent and grandparent ISO 19115 metadata entries.\nNCI has applied this convention, along with CF, as summarized in Table 1 as part of our data QC. As the ACDD has no \u201crequired\u201d fields in its current specification, NCI has applied a modified version that requires all published datasets to meet a minimum of four required ACDD catalogue metadata fields at the time of publication. These are \u201ctitle,\u201d \u201csummary,\u201d \u201csource,\u201d and \u201cdate_created\u201d and have been ranked as \u201crequired\u201d to aid with NCI\u2019s data services and data discovery. A complete list of ACDD metadata attributes and NCI requirements is available in Appendix A.\n\nBenchmarking methodology \nAny reference datasets made available on NCI must be well organized and accessible in a form suitable for the known class of users. 
Datasets also need to be more broadly available to other users from different domains, with the expectation that the collection will continue to have long-term and enduring value not just to the research community but also to others (e.g., government, general public, industry). To ensure that these expectations are clearly understood across the range of use-cases and environments, NCI has adopted a benchmarking methodology as part of their DQS process. Benchmarks register their functionality and performance, which helps to define expectations around data accessibility and provide an effective, defined measure of usability.\nTo substantiate this, NCI works with both the data producers and the users to establish benchmarks for specific areas, which are then included as part of the registry of data QA measures. These tests are then verified by both NCI and by wider community representatives to ensure that the benchmark is appropriate for the requested access. The benchmark methodology also provides a way to systematically consider how current users will be affected when considering any future developments or evolution in technology, standards, or reorganization of data. The benchmark cases then substantiate the original intention, and they can be reviewed against any subsequent changes. For example, benchmark cases that were previously specified to use data in a particular format may have been updated to use an alternative, more acceptable format that is better for use in high-performance environments or improves accessibility across multiple domains. The original benchmark cases can then be re-evaluated against both the functionality and performance required to assess how to make such a transformation. 
Further, if there are any upgrades or changes to the production services, the benchmark cases are used to perform prerelease tests on the data servers before implementing the changes into production.\nThe benchmarks consist of explicit current examples using tools, libraries, services, packages, software, and processes that are executed at NCI. These benchmarks explore the required access and identify supporting standards that are critical to the utility of the service, whether access be through the filesystem or by API protocols provided by NCI data services. Where benchmarks are shown to be beyond the capability of the current data service, the benchmark case will be recorded for future application.\nFurthermore, the results of the testing of each benchmark are reviewed with the data producer in light of any issues raised. This may require action by the user to revise the access pattern and\/or by the data producer to modify the data to ensure that the reliability of NCI\u2019s production service is not compromised. Alternatively, NCI may be able to provide a temporary separate service to accommodate some aspects of the usage pattern. For example, the data might be released via a modified server that can address shortcomings of a specific benchmark case but would not be applicable generally. This may be a short-term measure until a better server solution is found, or it may address current local issues on either the data or client application side.\n\nData QA \nTo ensure that the data is usable across a range of use-cases and environments, the QA approach uses benchmarks for testing data located on the local filesystem, as well as remotely via the data service endpoints. 
The QA process is designed to verify that the data functions properly with the software, libraries, and tools most commonly used in the community.\nThe following is a list of data services that are available under NCI\u2019s Unidata Thematic Real-time Environmental Distributed Data Services (THREDDS):\n\n Open-source Project for a Network Data Access Protocol (OPeNDAP): a protocol enabling data access and subsetting through the web;\n NetCDF Subset Service (NCSS): web service for subsetting files that can be read by the netCDF-Java library;\n WMS: OGC web service for requesting raster images of data;\n WCS: OGC web service for requesting data in some output format;\n Godiva2 Data Viewer: tool for simple visualization of data; and\n HTTP File Download: for direct downloading of data.\nThe data is tested through each of the required services as part of the QA process, with the basic usability functionality tests applied to each service as shown in Table 2. Should an issue be discovered during these functionality tests, the issue is investigated further. This may lead to additional modifications of the data so as to pass the functionality or performance requirements, and in doing so requires further communication with the data producer to ensure that such changes are acceptable and can be corrected in any future data production process. More detailed functionality can also be recorded for scientific use around the data. Such tests tend to be specific for the data use-case but follow the same methodology as that described here.\n\r\n\n\n\n\n\n\n\n\nTable 2. 
Description of basic accessibility and functionality tests that are applied for commonly used tools as part of NCI\u2019s QA tests\n\n\n\nTest\n\nMeasures of Success\n\n\nnetCDF C-Library\n\nUsing the ncdump -h &lt;file&gt; function from the command line, the file is readable and displays the file header information about the file dimensions, variables, and metadata.\n\n\nGDAL\n\nUsing the gdalinfo &lt;file&gt; function from the command line, the file is readable and displays the file header information about the file dimensions, variables, and metadata.\r\nUsing the gdalinfo NETCDF:&lt;file&gt;:&lt;subdataset&gt; function from the command line, the subdatasets are readable and corresponding metadata for each subdataset is displayed.\r\nThe Open and GetMetadata functions return non-empty values that correspond to the netCDF file contents.\r\nThe GetProjection function (of the appropriate file or subdataset) returns a non-empty result corresponding to the data coordinate reference system information.\n\n\nNCO (NetCDF Operators)\n\nUsing the ncks -m &lt;file&gt; function from the command line, the file is readable and displays file metadata.\n\n\nCDO (Climate Data Operators)\n\nUsing the cdo sinfon &lt;file&gt; function from the command line, the file is readable and displays information on the included variables, grids, and coordinates.\n\n\nFerret\n\nUsing SET DATA \u201c&lt;file&gt;\u201d followed by SHOW DATA displays information on file contents.\r\n Using SET DATA \u201c&lt;file&gt;\u201d followed by SHADE &lt;variable&gt; (or another plotting command) produces a plot of the requested data.\n\n\nTHREDDS Data Server\n\nDataset index catalog page loads without timeout and within reasonable time expectations (&lt;10 s).\n\n\nTHREDDS Data Service Endpoints\n\nHTTP Download: File download commences when selecting the HTTPServer option from the THREDDS catalog page for the file.\r\nOPeNDAP: When selecting OPeNDAP from the THREDDS catalog page for the file, the OPeNDAP Dataset Access Form 
page loads without error. From the OPeNDAP Dataset Access Form page, a data subset is returned in ASCII format after selecting data and clicking the Get ASCII option at the top of the page.\r\nGodiva2: When selecting the Godiva2 viewer option from the THREDDS catalog page for the file, the viewer displays the file contents.\r\nWMS: When selecting the WMS option from the THREDDS catalog page for the file, the web browser displays the GetCapabilities information in XML format. After constructing a GetMap request, the web browser displays the corresponding map.\r\nWCS: When selecting the WCS option from the THREDDS catalog page for the file, the web browser displays the GetCapabilities information in XML format. After constructing a GetCoverage request, file download of coverage commences.\n\n\nPanoply\n\nFrom the File \u2192 Open menu, the file can be opened. File contents and metadata are displayed.\r\nUsing Create Plot for a selected variable, data is displayed correctly in a new plot window.\n\n\nQGIS\n\nUsing the Add WMS\/WMTS menu option, QGIS can request GetCapabilities and\/or GetMap operations, and the layer is visible.\r\nThe ncWMS GetCapabilities URL accepts and adds the NCI THREDDS Server, the request displays the available layers to select from, and a selected layer displays according to user expectations.\n\n\nNASA Web WorldWind\n\nThe ncWMS GetCapabilities URL accepts and adds the NCI THREDDS Server, the request displays the available layers to select from, and a selected layer displays according to user expectations.\n\n\nPYTHON cdms2\n\nThe file can be opened by the Open function.\r\nFile metadata is displayed using Attributes function.\r\nFile data contents are displayed when using Variables function.\n\n\nPYTHON netCDF4\n\nThe file can be opened by the Dataset function.\r\nFile metadata is displayed using ncattrs object.\r\nFile data contents are displayed using variables (and\/or groups) objects.\n\n\nPYTHON h5py\n\nThe netCDF file can be opened by the 
File function.\r\nThe metadata and variables are displayed by the keys and attrs objects.\n\n\nParaView\n\nFrom the File \u2192 Open menu, the file can be opened and displayed as a layer in the Pipeline Browser. Enabling layer visibility results in data displaying in the Layout window.\n\n\n\nExamples of tests and reports undertaken on NCI datasets prior to publication \nMetadata QC checker reports \nTo assess the CF and ACDD compliance, NCI runs a QC checker prior to data publication and works with the data producer to rectify problems. The NCI checker is based on the U.S. Integrated Ocean Observing System (IOOS) Compliance Checker[17] but has been modified to include additional checks relevant to NCI\u2019s data services as well as the modified ACDD convention. Appendix B shows an example QC checker report (Figure A1) with metadata that is 100% compliant with NCI\u2019s requirements. In practice, the process usually needs to be run several times as the datasets are checked, feedback is given, and then re-run against the timestamp for each version to keep a record of metadata update provenance. The reports are shared with the data producers with comments and additional feedback provided in the \u201chigh\/medium\/low-priority suggestions\u201d section at the end of the report, depending on the potential impact of non-compliance.\nDue to the large number of data files that can be involved, NCI\u2019s QC checker has been modified to enable parallelization so that multiple processes can be run simultaneously, thus increasing performance of the checking process. For instance, it takes less than a minute to check hundreds of files, and about 10 minutes for tens of thousands. 
For the largest datasets, the QC checker can typically run on more than one million files at a time.\nThe QC checker also helps to find corrupted or temporary files, which can be easily overlooked or not detected by the data producers, especially during a batch production process.\n\nFunctionality test QA reports \nAppendix B provides an example report (Figure A2) of the QA results from checking three data files when accessed directly on the filesystem and through their service endpoints via THREDDS. The functionality test shows that the variable structure within two of the files (2 GB and 4 GB) is too large for the files to load in several commonly used data viewers, such as ncview (v2.1.1) and Panoply (v4.5.1), and similar issues occur when opening the files through the service endpoints. In this case, our advice for mitigation is to reduce the requested size of the image by using a lower resolution or to work in situ with this particular data file, as recorded in the comments of Figure A2, sections b and c.\n\nBenchmarking use cases \nIn the benchmark tests several popular tools and APIs are run to evaluate their elapsed time when accessing data either residing on the local filesystem or being accessed via data services. The test files in the example NCI functionality QA test report (Figure A2) are used in the benchmark tests, and their data structures are listed in Table 3. We access the 2D variable in each file, which is recorded at (lat, lon), chunked at (128,128) and deflated at level 2.\n\n\n\n\n\n\n\nTable 3. 
Data structure of the sample files used in the benchmark tests\n\n\n\nAttributes\n\nFile 1\n\nFile 2\n\nFile 3\n\n\nlon (double)\n\nSize\n\n5717\n\n59501\n\n40954\n\n\nChunksize\n\n128\n\n128\n\n128\n\n\nlat (double)\n\nSize\n\n4182\n\n41882\n\n34761\n\n\nChunksize\n\n128\n\n128\n\n128\n\n\nVariable(float)\n\nName\n\ngrav_ir_anomaly\n\nmag_tmi_rtp_anomaly\n\nrad_air_dose_rate\n\n\nSize\n\n(4182,5717)\n\n(41882,59501)\n\n(34761,40954)\n\n\nChunksize\n\n(128,128)\n\n(128,128)\n\n(128,128)\n\n\nDeflate Level\n\n\n\n2\n\n2\n\n2\n\n\nFormat\n\n\n\nnetCDF-4 classic model\n\nnetCDF-4 classic model\n\nnetCDF-4 classic model\n\n\n\nThe elapsed times of the benchmark tests are listed in Table 4. Utilities such as ncdump or h5dump can dump the contents of netCDF files into an ASCII representation. They are frequently used in the functionality test of the QA report to fetch the metadata of the netCDF files. In the performance benchmarking tests, we measure the elapsed time to dump the whole variable as human-readable ASCII text. This performance relies on the internal data organization, such as contiguous or chunked storage, deflation, and shuffling, and involves numerous type conversion operations. Such conversions may also incur a heavy overhead during the dump process, and it could take a very long time to complete the access of a large file.\n\n\n\n\n\n\n\nTable 4. 
Benchmark results (in sec.)\n\n\n\nProgram\/Service\n\nTest\n\nFile 1\n\nFile 2\n\nFile 3\n\n\nNetCDF Utilities\n\nncdump\n\n8.630\n\n5584.414\n\n3246.879\n\n\nh5dump\n\n40.547\n\n3546.999\n\n2373.483\n\n\nPython (2.7.x) netCDF APIs\n\nnetCDF4-python (1.2.7)\n\n0.445\n\n48.603\n\n29.160\n\n\nGDAL-python (1.11.1)\n\n0.421\n\n42.654\n\n25.538\n\n\nh5py (v2.6.0)\n\n0.356\n\n40.105\n\n23.826\n\n\nTHREDDS Data Server (TDS)\n\nnetCDF4-python (1.2.7)\n\n3.087\n\n282.797\n\n185.358\n\n\nOPeNDAP (TDS v4.6.6)\n\n3.038\n\n277.21\n\n194.85\n\n\nnetCDF Subset Service (TDS v4.6.6)\n\n2.833\n\n248.194\n\n158.236\n\n\n\nIn Table 4 we show an extreme case where a file provided complies with standard QC checks and is well formatted. However, when we evaluate the file using the standard suite of tools, we see that both ncdump and h5dump can take hours to dump a variable for a file size of 2 GB or 4 GB. To evaluate performance of programmatic methods on netCDF files, we use netCDF4-python, Geospatial Data Abstraction Library (GDAL)-python, and h5py to access the target files from the Lustre filesystems. In this case our tests show that all APIs take much less time to fetch the whole variable than the netCDF dump tools, due to the removal of overheads on data conversion and transport. Our tests also show that h5py presents the best performance. Since netCDF-4 is essentially a profile of the HDF5 format, both netCDF4-python and GDAL-python eventually invoke the HDF5 library to access the data. NetCDF4-python can also access data from the THREDDS server (which is tested for performance on our high speed internal network), but it takes nearly six times longer to access the data via the data service when compared with accessing the same volume of data on our Lustre filesystem. All three tools take a similar time to access data from our THREDDS server. 
By default, netCDF4-python and THREDDS have a request size limit of 500 MB, so it is necessary to divide the fetching process into several individual requests if the target dataset is larger than 500 MB. NCSS, on the other hand, has a much larger file limit per request, so fewer requests are needed in NCSS than in either netCDF4-python or THREDDS.\n\nResults sharing \nAll QC\/QA reports and benchmarks are shared with the data producers. In the future we plan to make these reports available to the wider community, as the information provides consumers with evidence on how the data is functioning and how it has performed with different software and libraries. It also provides guidance on how to best use the data and enables the consumer to determine if they are using data, or a tool to access the data, that has not been tested before. This information is also used in data training to demonstrate the application of data standards in both data organization and data preparation, and how to use the data with a range of software.\n\nDiscussion \nThe NCI DQS has been applied to climate and weather, earth observation, geoscience, and astronomy data, with the QC and QA tests adapted to meet the relevant community standards and protocols for each domain. The examples provided in this paper have shown how the knowledge and experience of data standards for netCDF files and conventions\u2014such as CF and ACDD, initially developed within the climate community\u2014are applicable to other scientific domains. For example, in the geophysics domain, there is a growing need to enable access to much larger data volumes, over larger spatial areas, and\/or to enable aggregation of data from multiple individual geophysical surveys. 
To do this, in consultation with the geophysics and HDF communities, the principles of the CF convention from the climate community and the ACDD from the Earth science community were translated into a proposed new geophysics convention that improves programmatic access and interoperability across different geophysical data types, such as seismic, gravity, magnetotelluric, and radiometric.[18] We also applied our benchmarking strategy to the geophysics domain, initially using the domain-popular ObsPy library[19] and SPECFEM3D code[20], to demonstrate how different organizations of the data (in terms of chunking size and compression) affect performance, by comparing new data formats, such as PH5[21] and ASDF[22], to traditional formats such as the Society of Exploration Geophysicists-Y Data Exchange Format (SEG-Y), the Standard for the Exchange of Earthquake Data Format (SEED), Seismic Analysis Code (SAC), etc.\n\nConclusions \nWe have developed a DQS as a key component of our vision to provide a trustworthy, transdisciplinary, high-performance data platform which enables researchers to share, use, reuse, and repurpose the data collections in high-end computational and data-intensive environments. The implementation of the DQS assures users that the data has been properly quality checked and complies with the relevant community standards. The functionality check in the QA process lists suitable software and libraries so that users can check whether the data is usable within their platform. 
Applying the DQS provides a standard way to (1) assess completeness and consistency of data across multiple datasets and collections; (2) evaluate the suitability of the data for transdisciplinary use; (3) enable standardized programmatic access; and (4) avoid the negative impacts of poor data and a dissatisfying user experience.\nThe NCI DQS identifies issues with the data and metadata at the time of data ingestion onto the NCI data platform, thus allowing corrections to be undertaken prior to publication. Applying the DQS means that scientists spend less time reformatting and wrangling the data to make it suitable for use by their applications and workflows\u2014especially if their applications can read standardized interfaces. Future work will focus on broader adoption of data from additional domains and data types, as well as improving the use of controlled vocabularies for individual data attributes as a means of more efficiently indexing the data.\n\nAcknowledgements \nThe authors wish to acknowledge funding from the Australian Government Department of Education, through the National Collaborative Research Infrastructure Strategy (NCRIS), and the Education Investment Fund (EIF) Super Science Initiatives through the NCI and Research Data Services (RDS) projects. We also wish to acknowledge the organizational partners and data managers involved in data management at NCI, particularly Geoscience Australia, the Bureau of Meteorology, CSIRO, and the Australian National University.\n\nAuthor contributions \nB.E. and K.D. conceived and designed the NCI DQS. K.D. developed the code for the QC\/QA checker. K.D. and J.W. ran the QC and QA tests and generated the reports. R.Y. ran the benchmark tests. J.W., K.D., R.Y. and L.W. wrote the initial paper. B.E., C.R. and L.W. reviewed and improved key sections of the paper, particularly for the broader activities of QA and its application.\n\nConflicts of interest \nThe authors declare no conflict of interest. 
The founding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; and in the decision to publish the results.\n\nAppendix \nAppendix A \nNCI NetCDF Metadata Guide based on the Attribute Convention for Data Discovery (ACDD v1.3)\n\n\n\n\n\n\n\nTable A1. The following table contains a subgroup of attributes from the ACDD metadata specification[16] where the priority-level for the attributes is categorized as \u201cRequired,\u201d \u201cRecommended,\u201d or \u201cSuggested,\u201d and which shows attributes where the priority-level has been modified to better align with NCI\u2019s data hosting services (e.g., NCI classifies \u201csource\u201d as \u201cRequired\u201d while it is only \u201cRecommended\u201d by the ACDD guidelines).\n\n\nREQUIRED\n\n\nGlobal Attribute\n\nDescription\n\n\ntitle\n\nA short phrase or sentence describing the dataset. In many discovery systems, the title will be displayed in the results list from a search, and therefore it should be human-readable and reasonable to display in a list of such names. This attribute is also recommended by the NetCDF Users Guide and the CF conventions.\n\n\nsummary\n\nA paragraph describing the dataset, analogous to an abstract for a paper.\n\n\nsource\n\nThe method of production of the original data. If it was model-generated, source should name the model and its version. If it is observational, source should characterize it. This attribute is defined in the CF Conventions. Examples: \"temperature from CTD #1234\"; \"world model v.0.1\".\n\n\ndate_created\n\nThe date on which this version of the data was created. (Modification of values implies a new version, hence this would be assigned the date of the most recent values modification.) Metadata changes are not considered when assigning the date_created. 
The ISO 8601:2004 extended date format is recommended, as described in the Attribute Content Guidance section.\n\n\nRECOMMENDED\n\n\nGlobal Attribute\n\nDescription\n\n\nConventions\n\nA comma-separated list of the conventions that are followed by the dataset. For files that follow this version of ACDD, include the string \u2018ACDD-1.3\u2019. (This attribute is described in the netCDF Users Guide.)\n\n\nmetadata_link\n\nA URL that gives the location of more complete metadata. A persistent URL is recommended for this attribute.\n\n\nhistory\n\nProvides an audit trail for modifications to the original data. This attribute is also in the netCDF Users Guide: \"This is a character array with a line for each invocation of a program that has modified the dataset. Well-behaved generic netCDF applications should append a line containing: date, time of day, user name, program name and command arguments.\" To include a more complete description you can append a reference to an ISO Lineage entity; see NOAA EDM ISO Lineage guidance.\n\n\nlicense\n\nProvide the URL to a standard or specific license, enter \u201cFreely Distributed\u201d or \u201cNone\u201d, or describe any restrictions to data access and distribution in free text.\n\n\ndoi\n\nTo be used if a DOI exists.\n\n\nproduct_version\n\nVersion identifier of the data file or product as assigned by the data creator. For example, a new algorithm or methodology could result in a new product_version.\n\n\nprocessing_level\n\nA textual description of the processing (or QC) level of the data.\n\n\ninstitution\n\nThe name of the institution principally responsible for originating this data. This attribute is recommended by the CF convention.\n\n\nproject\n\nThe name of the project(s) principally responsible for originating this data. Multiple projects can be separated by commas, as described under Attribute Content Guidelines. 
Examples: \"PATMOS-X\" and \"Extended Continental Shelf Project\".\n\n\ninstrument\n\nName of the contributing instrument(s) or sensor(s) used to create this data set or product. Indicate controlled vocabulary used in instrument_vocabulary.\n\n\nplatform\n\nName of the platform(s) that supported the sensor data used to create this data set or product. Platforms can be of any type, including satellite, ship, station, aircraft or other. Indicate controlled vocabulary used in platform_vocabulary.\n\n\nSUGGESTED\n\n\nGlobal Attribute\n\nDescription\n\n\nid\n\nAn identifier for the data set, provided by and unique within its naming authority. The combination of the \u201cnaming authority\u201d and the \u201cid\u201d should be globally unique, but the id can be globally unique by itself also. IDs can be URLs, URNs, DOIs, meaningful text strings, a local key, or any other unique string of characters. The id should not include white space characters.\n\n\ndate_modified\n\nThe date on which the data was last modified. Note that this applies just to the data, not the metadata. The ISO 8601:2004 extended date format is recommended, as described in the Attributes Content Guidance section.\n\n\ndate_created\n\nThe date on which this version of the data was created. (Modification of values implies a new version, hence this would be assigned the date of the most recent values modification.) Metadata changes are not considered when assigning the date_created. The ISO 8601:2004 extended date format is recommended, as described in the Attribute Content Guidance section.\n\n\ndate_issued\n\nThe date on which this data (including all modifications) was formally issued (i.e., made available to a wider audience). Note that these apply just to the data, not the metadata. The ISO 8601:2004 extended date format is recommended, as described in the Attributes Content Guidance section.\n\n\nreferences\n\nPublished or web-based references that describe the data or methods used to produce it. 
Recommend URIs (such as a URL or DOI) for papers or other references. This attribute is defined in the CF conventions.\n\n\nkeywords\n\nA comma-separated list of key words and\/or phrases. Keywords may be common words or phrases, terms from a controlled vocabulary (GCMD is often used), or URIs for terms from a controlled vocabulary (see also \u201ckeywords_vocabulary\u201d attribute).\n\n\nstandard_name_vocabulary\n\nThe name and version of the controlled vocabulary from which variable standard names are taken. (Values for any standard_name attribute must come from the CF Standard Names vocabulary for the data file or product to comply with CF.) Example: \"CF Standard Name Table v27\".\n\n\ngeospatial_lat_min\n\nDescribes a simple lower latitude limit; may be part of a 2- or 3-dimensional bounding region. Geospatial_lat_min specifies the southernmost latitude covered by the dataset.\n\n\ngeospatial_lat_max\n\nDescribes a simple upper latitude limit; may be part of a 2- or 3-dimensional bounding region. Geospatial_lat_max specifies the northernmost latitude covered by the dataset.\n\n\ngeospatial_lon_min\n\nDescribes a simple longitude limit; may be part of a 2- or 3-dimensional bounding region. geospatial_lon_min specifies the westernmost longitude covered by the dataset. See also geospatial_lon_max.\n\n\ngeospatial_lon_max\n\nDescribes a simple longitude limit; may be part of a 2- or 3-dimensional bounding region. geospatial_lon_max specifies the easternmost longitude covered by the dataset. 
Cases where geospatial_lon_min is greater than geospatial_lon_max indicate the bounding box extends from geospatial_lon_max, through the longitude range discontinuity meridian (either the antimeridian for \u2212180:180 values, or Prime Meridian for 0:360 values), to geospatial_lon_min; for example, geospatial_lon_min = 170 and geospatial_lon_max = \u2212175 incorporates 15 degrees of longitude (ranges 170 to 180 and \u2212180 to \u2212175).\n\n\ngeospatial_vertical_min\n\nDescribes the numerically smaller vertical limit; may be part of a 2- or 3-dimensional bounding region. See geospatial_vertical_positive and geospatial_vertical_units.\n\n\ngeospatial_vertical_max\n\nDescribes the numerically larger vertical limit; may be part of a 2- or 3-dimensional bounding region. See geospatial_vertical_positive and geospatial_vertical_units.\n\n\ngeospatial_vertical_positive\n\nOne of \"up\" or \"down.\" If up, vertical values are interpreted as \"altitude,\" with negative values corresponding to below the reference datum (e.g., under water). If down, vertical values are interpreted as \"depth,\" positive values correspond to below the reference datum. Note that if geospatial_vertical_positive is down (\"depth\" orientation), the geospatial_vertical_min attribute specifies the data\u2019s vertical location furthest from the earth\u2019s center, and the geospatial_vertical_max attribute specifies the location closest to the earth\u2019s center.\n\n\ngeospatial_bounds\n\nDescribes the data\u2019s 2D or 3D geospatial extent in OGC\u2019s Well-Known Text (WKT) Geometry format (reference the OGC Simple Feature Access (SFA) specification). The meaning and order of values for each point\u2019s coordinates depends on the coordinate reference system (CRS). The ACDD default is 2D geometry in the EPSG:4326 coordinate reference system. The default may be overridden with geospatial_bounds_crs and geospatial_bounds_vertical_crs (see those attributes). 
EPSG:4326 coordinate values are latitude (decimal degrees_north) and longitude (decimal degrees_east), in that order. Longitude values in the default case are limited to the [\u2212180, 180) range. Example: \"POLYGON ((40.26 -111.29, 41.26 -111.29, 41.26 -110.29, 40.26 -110.29, 40.26 -111.29))\".\n\n\ntime_coverage_start\n\nDescribes the time of the first data point in the data set. Use the ISO 8601:2004 date format, preferably the extended format as recommended in the Attribute Content Guidance section.\n\n\ntime_coverage_end\n\nDescribes the time of the last data point in the data set. Use ISO 8601:2004 date format, preferably the extended format as recommended in the Attribute Content Guidance section.\n\n\ntime_coverage_duration\n\nDescribes the duration of the data set. Use ISO 8601:2004 duration format, preferably the extended format as recommended in the Attribute Content Guidance section.\n\n\ntime_coverage_resolution\n\nDescribes the targeted time period between each value in the data set. Use ISO 8601:2004 duration format, preferably the extended format as recommended in the Attribute Content Guidance section.\n\n\n\nAppendix B \nExamples of NCI\u2019s Quality Control (QC) and Quality Assurance (QA) reporting\n\nFigure A1. An example of NCI\u2019s QC compliance report, which is shared with data producers and used to ensure that the dataset metadata meets the minimum requirements for a netCDF collection. In this particular example collection, 30 files were successfully scanned (zero skipped) and all elements of the QC process passed. In cases where elements are not fully compliant, the high\/medium\/low priority suggestions section at the end of the report is used to explain the nature of the errors found and list possible means for modification.\n\nFigure A2. An example of an NCI functionality QA test report. 
(a) The first section of the report provides a short summary of results and whether the data is considered functional with all the tested tools, and lists the details of the files that were used for the assessment, including the properties of the files, such as size, variable shape, chunk size, and compression (deflate) level. (b) The second section provides the results for the functionality tests performed on the data, directly on the filesystem. (c) The third section provides the results of the functionality tests using the data served through NCI\u2019s THREDDS services.\n\n\n\nReferences \n\n1. Wang, J.; Evans, B.J.K.; Bastrakova, I. et al. (2014). \"Large-Scale Data Collection Metadata Management at the National Computation Infrastructure\". Proceedings from the American Geophysical Union, Fall Meeting 2014: IN14B-07.\n\n2. \"The FAIR Data Principles\". Force11. https:\/\/www.force11.org\/group\/fairgroup\/fairprinciples. Retrieved 23 August 2017.\n\n3. Evans, B.J.K.; Wyborn, L.A.; Druken, K.A. et al. (2016). \"Extending the Common Framework for Earth Observation Data to other Disciplinary Data and Programmatic Access\". Proceedings from the American Geophysical Union, Fall General Assembly 2016: IN22A-05.\n\n4. Ramapriyan, H.; Peng, G.; Moroni, D.; Shie, C.-L. (2017). \"Ensuring and Improving Information Quality for Earth Science Data and Products\". D-Lib Magazine 23 (7\/8). doi:10.1045\/july2017-ramapriyan.\n\n5. Atkin, B.; Brooks, A. \"Chapter 8: Service Specifications, Service Level Agreements and Performance\". Total Facilities Management. Wiley. ISBN 9781405127905.\n\n6. \"Data Repositories Requirements\". CoreTrustSeal. https:\/\/www.coretrustseal.org\/why-certification\/requirements\/. Retrieved 24 October 2017.\n\n7. Stall, S.; Downs, R.R.; Kempler, S.J. (2016). \"AGU's Data Management Maturity Model\". Auditing of Trustworthy Data Repositories. SciDataCon 2016. https:\/\/www.scidatacon.org\/2016\/sessions\/100\/.\n\n8. Stall, S.; Hanson, B.; Wyborn, L. (2016). \"The American Geophysical Union Data Management Maturity Program\". Proceedings from the eResearch Australasia Conference 2016: 72. https:\/\/eresearchau.files.wordpress.com\/2016\/03\/eresau2016_paper_72.pdf.\n\n9. \"Data Management Maturity (DMM)\". CMMI Institute LLC. https:\/\/cmmiinstitute.com\/store\/data-management-maturity-(dmm).\n\n10. \"NCI Data Portal\". National Computational Infrastructure. https:\/\/geonetwork.nci.org.au\/geonetwork\/srv\/eng\/catalog.search#\/home.\n\n11. Taylor, K.E.; Balaji, V.; Hankin, S. et al. (13 June 2012). \"CMIP5 Data Reference Syntax (DRS) and Controlled Vocabularies\" (PDF). Program for Climate Model Diagnosis & Intercomparison. https:\/\/pcmdi.llnl.gov\/mips\/cmip5\/docs\/cmip5_data_reference_syntax.pdf.\n\n12. \"What are the naming conventions for Landsat scene identifiers?\". U.S. Geological Survey. https:\/\/landsat.usgs.gov\/what-are-naming-conventions-landsat-scene-identifiers. Retrieved 23 August 2017.\n\n13. \"ISO 19115-1:2014 Geographic information -- Metadata -- Part 1: Fundamentals\". International Organization for Standardization. April 2014. https:\/\/www.iso.org\/standard\/53798.html. Retrieved 25 May 2016.\n\n14. \"Granule\". EarthData Glossary. https:\/\/earthdata.nasa.gov\/user-resources\/glossary#ed-glossary-g. Retrieved 23 August 2017.\n\n15. \"CF Conventions and Metadata\". Lawrence Livermore National Laboratory. http:\/\/cfconventions.org\/. Retrieved 23 August 2017.\n\n16. \"Attribute Convention for Data Discovery 1-3\". Federation of Earth Science Information Partners. http:\/\/wiki.esipfed.org\/index.php\/Attribute_Convention_for_Data_Discovery_(ACDD). Retrieved 23 August 2017.\n\n17. \"ioos\/compliance-checker\". GitHub. https:\/\/github.com\/ioos\/compliance-checker. Retrieved 22 November 2017.\n\n18. Wang, J.; Yang, R.; Evans, B.J.K. (2017). \"Improving Seismic Data Accessibility and Performance Using HDF Containers\". Proceedings from the AGU 2017 Fall Meeting: IN42B-04.\n\n19. Megies, T. \"obspy\/obspy\". GitHub. https:\/\/github.com\/obspy\/obspy\/wiki. Retrieved 06 November 2017.\n\n20. Computational Infrastructure for Geodynamics. \"SPECFEM3D Cartesian\". University of California Davis. https:\/\/geodynamics.org\/cig\/software\/specfem3d\/. Retrieved 06 November 2017.\n\n21. \"PH5: What is it?\". IRIS PASSCAL Instrument Center. https:\/\/www.passcal.nmt.edu\/content\/ph5-what-it. Retrieved 18 October 2017.\n\n22. Krischer, L.; Smith, J.; Lei, W. et al. (2016). \"An Adaptable Seismic Data Format\". Geophysical Journal International 207 (2): 1003\u201311. doi:10.1093\/gji\/ggw319.\n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. 
Several URLs from the original were dead, and more current URLs were substituted.\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\">https:\/\/www.limswiki.org\/index.php\/Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis<\/a>\nCategories: LIMSwiki journal articles (added in 2018); LIMSwiki journal articles (all); LIMSwiki journal articles on data quality; LIMSwiki journal articles on informatics\n\nThis page was last modified on 20 August 2018, at 18:57.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n\n","b1e9f2666792cce972a4a66979d1d937_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_A_data_quality_strategy_to_enable_FAIR_programmatic_access_across_large_diverse_data_collections_for_high_performance_data_analysis skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:A data quality strategy to enable FAIR, programmatic access across large, diverse data collections for high performance data analysis<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p>To ensure seamless, programmatic access to data for high-performance computing (HPC) and <a href=\"https:\/\/www.limswiki.org\/index.php\/Data_analysis\" title=\"Data analysis\" target=\"_blank\" 
class=\"wiki-link\" data-key=\"545c95e40ca67c9e63cd0a16042a5bd1\">analysis<\/a> across multiple research domains, it is vital to have a methodology for standardization of both data and services. At the Australian National Computational Infrastructure (NCI) we have developed a data quality strategy (DQS) that currently provides processes for: (1) consistency of data structures needed for a high-performance data (HPD) platform; (2) <a href=\"https:\/\/www.limswiki.org\/index.php\/Quality_control\" title=\"Quality control\" target=\"_blank\" class=\"wiki-link\" data-key=\"1e0e0c2eb3e45aff02f5d61799821f0f\">quality control<\/a> (QC) through compliance with recognized community standards; (3) benchmarking cases of operational performance tests; and (4) <a href=\"https:\/\/www.limswiki.org\/index.php\/Quality_assurance\" title=\"Quality assurance\" target=\"_blank\" class=\"wiki-link\" data-key=\"2ede4490f0ea707b14456f44439c0984\">quality assurance<\/a> (QA) of data through demonstrated functionality and performance across common platforms, tools, and services. By implementing the NCI DQS, we have seen progressive improvement in the quality and usefulness of the datasets across different subject domains, and demonstrated the ease by which modern programmatic methods can be used to access the data, either <i>in situ<\/i> or via web services, and for uses ranging from traditional analysis methods through to emerging machine learning techniques. 
To help increase data re-usability by broader communities, particularly in high-performance environments, the DQS is also used to identify the need for any extensions to the relevant international standards for interoperability and\/or programmatic access.\n<\/p><p><b>Keywords<\/b>: data quality, quality control, quality assurance, benchmarks, performance, data management policy, netCDF, high-performance computing, HPC, FAIR data\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>The National Computational Infrastructure (NCI) manages one of Australia\u2019s largest and most diverse repositories (10+ petabytes) of research data collections spanning datasets from climate, coasts, oceans, and geophysics through to astronomy, <a href=\"https:\/\/www.limswiki.org\/index.php\/Bioinformatics\" title=\"Bioinformatics\" target=\"_blank\" class=\"wiki-link\" data-key=\"8f506695fdbb26e3f314da308f8c053b\">bioinformatics<\/a>, and the social sciences.<sup id=\"rdp-ebb-cite_ref-WangLarge14_1-0\" class=\"reference\"><a href=\"#cite_note-WangLarge14-1\" rel=\"external_link\">[1]<\/a><\/sup> Within these domains, data can be of different types such as gridded, ungridded (e.g., line surveys, point clouds), and raster image types, as well as having diverse coordinate reference projections and resolutions. 
NCI has been following the Force 11 FAIR data principles to make data findable, accessible, interoperable, and reusable.<sup id=\"rdp-ebb-cite_ref-F11FAIR_2-0\" class=\"reference\"><a href=\"#cite_note-F11FAIR-2\" rel=\"external_link\">[2]<\/a><\/sup> These principles provide guidelines for a research data repository to enable data-intensive science, and enable researchers to answer problems such as how to trust the scientific quality of data and determine if the data is usable by their software platform and tools.\n<\/p><p>To ensure broader reuse of the data and enable transdisciplinary integration across multiple domains, as well as enabling programmatic access, a dataset must be usable and of value to a broad range of users from different communities.<sup id=\"rdp-ebb-cite_ref-EvansExtend16_3-0\" class=\"reference\"><a href=\"#cite_note-EvansExtend16-3\" rel=\"external_link\">[3]<\/a><\/sup> Therefore, a set of standards and \"best practices\" for ensuring the quality of scientific data products is a critical component in the life cycle of data management. 
We undertake both QC through compliance with recognized community standards (e.g., checking the header of the files to make sure it is compliant with community convention standard) and QA of data through demonstrated functionality and performance across common platforms, tools, and services (e.g., verifying the data to be functioning with designated software and libraries).\n<\/p><p>The Earth Science Information Partners (ESIP) Information Quality Cluster (IQC) has been established for collecting such standards and best practices and then assisting data producers in their implementation, and users in their taking advantage of them.<sup id=\"rdp-ebb-cite_ref-RamapriyanEnsuring17_4-0\" class=\"reference\"><a href=\"#cite_note-RamapriyanEnsuring17-4\" rel=\"external_link\">[4]<\/a><\/sup> ESIP considers four different aspects of <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a> quality in close relation to different stages of data products in their four-stage life cycle<sup id=\"rdp-ebb-cite_ref-RamapriyanEnsuring17_4-1\" class=\"reference\"><a href=\"#cite_note-RamapriyanEnsuring17-4\" rel=\"external_link\">[4]<\/a><\/sup>: (1) define, develop, and validate; (2) produce, access, and deliver; (3) maintain, preserve, and disseminate; and (4) enable use, provide support, and service.\n<\/p><p>Science teams or data producers are responsible for managing data quality during the first two stages, while data publishers are responsible for the latter two stages. As NCI is both a digital repository, which manages the storage and distribution of reference data for a range of users, as well as the provider of high-end compute and data analysis platforms, the data quality processes are focused on the latter two stages. 
A check on the scientific correctness is considered to be part of the first two stages and is not included in the definition of \"data quality\" that is described in this paper.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"NCI.27s_data_quality_strategy_.28DQS.29\">NCI's data quality strategy (DQS)<\/span><\/h2>\n<p>NCI developed a DQS to establish a level of assurance, and hence confidence, for our user community and key stakeholders as an integral part of service provision.<sup id=\"rdp-ebb-cite_ref-AtkinTotal05_5-0\" class=\"reference\"><a href=\"#cite_note-AtkinTotal05-5\" rel=\"external_link\">[5]<\/a><\/sup> It is also a step on the pathway to meet the technical requirements of a trusted digital repository, such as the CoreTrustSeal certification.<sup id=\"rdp-ebb-cite_ref-CTSData_6-0\" class=\"reference\"><a href=\"#cite_note-CTSData-6\" rel=\"external_link\">[6]<\/a><\/sup> As meeting these requirements involves the systematic application of agreed policies and procedures, our DQS provides a suite of guidelines, recommendations, and processes for: (1) consistency of data structures suitable for the underlying high-performance data (HPD) platform; (2) QC through compliance with recognized community standards; (3) benchmarking performance using operational test cases; and (4) QA through demonstrated functionality and benchmarking across common platforms, tools, and services.\n<\/p><p>NCI\u2019s DQS was developed iteratively through firstly a review of other approaches for management of data QC and data QA (e.g., Ramapriyan <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-RamapriyanEnsuring17_4-2\" class=\"reference\"><a href=\"#cite_note-RamapriyanEnsuring17-4\" rel=\"external_link\">[4]<\/a><\/sup> and Stall<sup id=\"rdp-ebb-cite_ref-StallAGU16_7-0\" class=\"reference\"><a href=\"#cite_note-StallAGU16-7\" rel=\"external_link\">[7]<\/a><\/sup>) to establish the DQS methodology and secondly applying this to selected use cases at NCI which captured existing and emerging 
requirements, particularly the use cases that relate to HPC.\n<\/p><p>Our approach is consistent with the American Geophysical Union (AGU) Data Management Maturity (DMM)SM model<sup id=\"rdp-ebb-cite_ref-StallAGU16_7-1\" class=\"reference\"><a href=\"#cite_note-StallAGU16-7\" rel=\"external_link\">[7]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-StallTheAmerican16_8-0\" class=\"reference\"><a href=\"#cite_note-StallTheAmerican16-8\" rel=\"external_link\">[8]<\/a><\/sup>, which was developed in partnership with the Capability Maturity Model Integration (CMMI) Institute and adapted for their DMMSM<sup id=\"rdp-ebb-cite_ref-CMMIDataMan_9-0\" class=\"reference\"><a href=\"#cite_note-CMMIDataMan-9\" rel=\"external_link\">[9]<\/a><\/sup> model for applications in the Earth and space sciences. The AGU DMMSM model aims to provide guidance on how to improve data quality and consistency and facilitate reuse in the data life cycle. It enables both producers of data and repositories that store data to ensure that datasets are \"fit-for-purpose,\" repeatable, and trustworthy. The Data Quality Process Areas in the AGU DMMSM model define a collaborative approach for receiving, assessing, cleansing, and curating data to ensure \"fitness\" for intended use in the scientific community.\n<\/p><p>After several iterations, the NCI DQS was established as part of the formal data publishing process and is applied throughout the cycle from submission of data to the NCI repository through to its final publication. The approach is also being adopted by the data producers who now engage with the process from the preparation stage, prior to ingestion onto the NCI data platform. Early consultation and feedback have greatly improved both the quality of the data and the timeliness of publication. 
To improve the efficiency further, one of our major data suppliers is including our DQS requirements in their data generation processes to ensure data quality is considered earlier in data production.\n<\/p><p>The technical requirements and implementation of our DQS will be described as four major but related data components: structure, QC, benchmarking, and QA.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Data_structure\">Data structure<\/span><\/h3>\n<p>NCI's research data collections are particularly focused on enabling programmatic access, required by: (1) NCI core services such as the NCI supercomputer and NCI cloud-based capabilities; (2) community virtual <a href=\"https:\/\/www.limswiki.org\/index.php\/Laboratory\" title=\"Laboratory\" target=\"_blank\" class=\"wiki-link\" data-key=\"c57fc5aac9e4abf31dccae81df664c33\">laboratories<\/a> and virtual research environments; (3) those that require remote access through established scientific standards-based protocols that use data services; and, (4) increasingly, by international data federations. To enable these different types of programmatic access, datasets must be registered in the central NCI catalogue<sup id=\"rdp-ebb-cite_ref-NCIDataPortal_10-0\" class=\"reference\"><a href=\"#cite_note-NCIDataPortal-10\" rel=\"external_link\">[10]<\/a><\/sup>, which records their location for access both on the filesystems and via data services.\n<\/p><p>This requires the data to be well-organized and compliant with uniform, professionally managed standards and consistent community conventions wherever possible. 
For example, the climate community Coupled Model Intercomparison Project (CMIP) experiments use the Data Reference Syntax (DRS)<sup id=\"rdp-ebb-cite_ref-TaylorCMIP12_11-0\" class=\"reference\"><a href=\"#cite_note-TaylorCMIP12-11\" rel=\"external_link\">[11]<\/a><\/sup>, whilst the National Aeronautics and Space Administration (NASA) recommends a specific naming convention for Landsat satellite image products.<sup id=\"rdp-ebb-cite_ref-USGSLandsat_12-0\" class=\"reference\"><a href=\"#cite_note-USGSLandsat-12\" rel=\"external_link\">[12]<\/a><\/sup> The NCI data collection catalogue manages the details of each dataset through a uniform application of ISO 19115:2003<sup id=\"rdp-ebb-cite_ref-ISO19115_13-0\" class=\"reference\"><a href=\"#cite_note-ISO19115-13\" rel=\"external_link\">[13]<\/a><\/sup>, an international schema used for describing geographic information and services. Essentially, each catalogue entry points to the location of the data within the NCI data infrastructure. The catalogue entries also point to the service endpoints, such as a standard data download point, a data subsetting interface, and Open Geospatial Consortium (OGC) Web Map Service (WMS) and Web Coverage Service (WCS) endpoints. NCI can publish data through several different servers, and as such the specific endpoint for each of these service capabilities is listed.\n<\/p><p>NCI has developed a catalogue and directory policy, which provides guidelines for the organization of datasets within the concepts of data collections and data sub-collections and includes a comprehensive definition for each hierarchical layer. The definitions are:\n<\/p>\n<ul><li> A <i>data collection<\/i> is the highest in the hierarchy of data groupings at NCI. 
It comprises either an exclusive grouping of data subcollections or a tiered structure with an exclusive grouping of lower-tiered data collections, where the lowest-tier data collection contains only data subcollections.<\/li><\/ul>\n<ul><li> A <i>data subcollection<\/i> is an exclusive grouping of datasets (i.e., belonging to only one subcollection) where the constituent datasets are tightly managed. Responsibility for the underlying management of its constituent datasets must rest within one organization. A data subcollection constitutes a strong connection between the component datasets, and is organized coherently around a single scientific element (e.g., model, instrument). A subcollection must have compatible licenses such that constituent datasets do not need different access arrangements.<\/li><\/ul>\n<ul><li> A <i>dataset<\/i> is a compilation of data that constitutes a programmable data unit that has been collected and organized using a self-contained process. For this purpose, it must have a named data owner, a single license, one set of semantics, ontologies, and vocabularies, and a single data format and internal data convention. A dataset must include its version.<\/li><\/ul>\n<ul><li> A <i>dataset granule<\/i> is used for some scientific domains that require a finer level of granularity (e.g., in satellite Earth Observation datasets). 
A granule refers to the smallest aggregation of data that can be independently described, inventoried, and retrieved as defined by NASA.<sup id=\"rdp-ebb-cite_ref-NASAGlossary_14-0\" class=\"reference\"><a href=\"#cite_note-NASAGlossary-14\" rel=\"external_link\">[14]<\/a><\/sup> Dataset granules have their own metadata and support values associated with the additional attributes defined by parent datasets.<\/li><\/ul>\n<p>In addition, we use the term \"data category\" to identify common contents\/themes across all levels of the hierarchy.\n<\/p>\n<ul><li> A <i>data category<\/i> allows a broad spectrum of options to encode relationships between data. A data category can be anything that weakly relates datasets, with key terms (e.g., keywords, attributes, vocabularies, ontologies) being the primary way of discovering the groupings within the data. Datasets are not exclusive to a single category.<\/li><\/ul>\n<h4><span class=\"mw-headline\" id=\"Organization_of_data_within_the_data_structure\">Organization of data within the data structure<\/span><\/h4>\n<p>NCI has organized data collections according to this hierarchical structure both on the filesystem and within our catalogue system. Figure 1 shows how these datasets are organized. 
Figure 2 provides an example of how the CMIP 5 data collection demonstrates the hierarchical directory structure.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Evans_Informatics2017_4-4.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"ecf96dbbc035d38f216843ff76a97945\"><img alt=\"Fig1 Evans Informatics2017 4-4.png\" src=\"https:\/\/www.limswiki.org\/images\/f\/f3\/Fig1_Evans_Informatics2017_4-4.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1.<\/b> Illustration of the different levels of metadata and community standards used for each<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig2_Evans_Informatics2017_4-4.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"bd9aacddb55de987335483b6c0dec0ad\"><img alt=\"Fig2 Evans Informatics2017 4-4.jpg\" src=\"https:\/\/www.limswiki.org\/images\/7\/7d\/Fig2_Evans_Informatics2017_4-4.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 2.<\/b> Example schematic of the National Computational Infrastructure (NCI)\u2019s data organizational structure using the Coupled Model Intercomparison Project (CMIP) 5 collection. 
The CMIP 5 collection housed at NCI includes three sub-collections from the Commonwealth Scientific and Industrial Research Organisation (CSIRO) and Australian Bureau of Meteorology (BOM): (1) the ACCESS-1.0 model, (2) ACCESS-1.3 model, and (3) Mk 3.6.0 model. Each sub-collection then contains a number of datasets, such as \u201cpiControl\u201d (pre-industrial control experiment), which then contains numerous granules (e.g., precipitation, \u201cpr\u201d). A complete description of the range of CMIP5 contents can be found at: <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/pcmdi.llnl.gov\/mips\/cmip5\/experiment_design.html\" target=\"_blank\">https:\/\/pcmdi.llnl.gov\/mips\/cmip5\/experiment_design.html<\/a>.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Data_QC\">Data QC<\/span><\/h3>\n<p>Data QC measures are intended to ensure that all datasets hosted at NCI adhere, wherever possible, to existing community standards for metadata and data. For Network Common Data Form (netCDF) (and Hierarchical Data Format v5 (HDF5)-based) file formats, these include the Climate and Forecast (CF) Convention<sup id=\"rdp-ebb-cite_ref-LLNLCFConv_15-0\" class=\"reference\"><a href=\"#cite_note-LLNLCFConv-15\" rel=\"external_link\">[15]<\/a><\/sup> and the Attribute Convention for Data Discovery<sup id=\"rdp-ebb-cite_ref-ESIPAttri_16-0\" class=\"reference\"><a href=\"#cite_note-ESIPAttri-16\" rel=\"external_link\">[16]<\/a><\/sup> (see Table 1).\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"3\"><b>Table 1.<\/b> The NCI Quality Control (QC) mandatory requirements. 
A full list of the Attribute Convention for Data Discovery (ACDD) metadata requirements used by NCI is provided in Appendix A.\n<\/td><\/tr>\n\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Convention\/Standard\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">NCI Requirements\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Further Information\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">CF\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Mandatory CF criteria, e.g., no \u201cerrors\u201d result from any of the recommended compliance checkers\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><a rel=\"external_link\" class=\"external free\" href=\"http:\/\/cfconventions.org\" target=\"_blank\">http:\/\/cfconventions.org<\/a>\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ACDD (Modified version)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Required attributes are included within each file: 1. title, 2. summary, 3. source, 4. 
date_created\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><a rel=\"external_link\" class=\"external free\" href=\"http:\/\/wiki.esipfed.org\/index.php\/Attribute_Convention_for_Data_Discovery_1-3\" target=\"_blank\">http:\/\/wiki.esipfed.org\/index.php\/Attribute_Convention_for_Data_Discovery_1-3<\/a>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h4><span class=\"mw-headline\" id=\"Climate_and_forecast_.28CF.29_convention\">Climate and forecast (CF) convention<\/span><\/h4>\n<p>NCI requires that all geospatial datasets meet the minimum mandatory CF convention metadata criteria at the time of publication, and, where scientifically applicable, we require they meet the relevant recommended CF criteria. These requirements are detailed in the latest CF convention document provided on their website.<sup id=\"rdp-ebb-cite_ref-LLNLCFConv_15-1\" class=\"reference\"><a href=\"#cite_note-LLNLCFConv-15\" rel=\"external_link\">[15]<\/a><\/sup>\n<\/p><p>The CF convention is the primary community standard for netCDF data, which was originally developed by the climate community and is now being adapted for other domains, e.g., marine and geosciences. It defines metadata requirements for information on each variable contained within the file as well as spatial and temporal properties of the data, so that contents are fully \u201cself-described.\u201d For example, no additional companion files or external sources are required to describe any information about how to read or utilize the data contents within the file. The metadata requirements also provide important guidelines on how to structure spatial data. This includes recommendations on the order of dimensions, the handling of gridded and non-gridded (time series, point and trajectory) data, coordinate reference system descriptions, standardized units, and cell measures (i.e., information relating to the size, shape, or location of grid cells). 
CF requires that all metadata information be equally readable and understandable by humans and software, which has the benefit of allowing software tools to easily display and dynamically perform associated operations.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Attribute_Convention_for_Data_Discovery_.28ACDD.29\">Attribute Convention for Data Discovery (ACDD)<\/span><\/h4>\n<p>The ACDD is another common standard for netCDF data that complements the CF convention requirements.<sup id=\"rdp-ebb-cite_ref-ESIPAttri_16-1\" class=\"reference\"><a href=\"#cite_note-ESIPAttri-16\" rel=\"external_link\">[16]<\/a><\/sup> The ACDD primarily governs metadata information written at the file level (i.e., netCDF global attributes), while the CF convention pertains mainly to variable-level metadata and structure information. Therefore, when combined, these two standards help to fully describe both the higher-level metadata relevant to the entire file (e.g., dataset title, custodian, date created, etc.) and the lower-level information about each individual variable or dimension (e.g., name, units, bounds, fill values, etc.). ACDD also provides the ability to link to even higher levels, such as the dataset parent and grandparent ISO 19115 metadata entries.\n<\/p><p>NCI has applied this convention, along with CF, as summarized in Table 1, as part of our data QC. As the ACDD has no \u201crequired\u201d fields in its current specification, NCI has applied a modified version that requires all published datasets to include a minimum of four ACDD catalogue metadata fields at the time of publication. These are \u201ctitle,\u201d \u201csummary,\u201d \u201csource,\u201d and \u201cdate_created\u201d and have been ranked as \u201crequired\u201d to aid with NCI\u2019s data services and data discovery. 
A complete list of ACDD metadata attributes and NCI requirements is available in Appendix A.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Benchmarking_methodology\">Benchmarking methodology<\/span><\/h3>\n<p>Any reference datasets made available on NCI must be well organized and accessible in a form suitable for the known class of users. Datasets also need to be more broadly available to other users from different domains, with the expectation that the collection will continue to have long-term and enduring value not just to the research community but also to others (e.g., government, general public, industry). To ensure that these expectations are clearly understood across the range of use-cases and environments, NCI has adopted a benchmarking methodology as part of its DQS process. Benchmarks register their functionality and performance, which helps to define expectations around data accessibility and provide an effective, defined measure of usability.\n<\/p><p>To substantiate this, NCI works with both the data producers and the users to establish benchmarks for specific areas, which are then included as part of the registry of data QA measures. These tests are then verified by both NCI and by wider community representatives to ensure that the benchmark is appropriate for the requested access. The benchmark methodology also provides a way to systematically consider how current users will be affected when considering any future developments or evolution in technology, standards, or reorganization of data. The benchmark cases then substantiate the original intention, and they can be reviewed against any subsequent changes. For example, benchmark cases that were previously specified to use data in a particular format may have been updated to use an alternative, more acceptable format that is better for use in high-performance environments or improves accessibility across multiple domains. 
The original benchmark cases can then be re-evaluated against both the functionality and performance required to assess how to make such a transformation. Further, if there are any upgrades or changes to the production services, the benchmark cases are used to perform prerelease tests on the data servers before implementing the changes into production.\n<\/p><p>The benchmarks consist of explicit current examples using tools, libraries, services, packages, software, and processes that are executed at NCI. These benchmarks explore the required access and identify supporting standards that are critical to the utility of the service, whether access be through the filesystem or by API protocols provided by NCI data services. Where benchmarks are shown to be beyond the capability of the current data service, the benchmark case will be recorded for future application.\n<\/p><p>Furthermore, the results of the testing of each benchmark are reviewed with the data producer in light of any issues raised. This may require action by the user to revise the access pattern and\/or by the data producer to modify the data to ensure that the reliability of NCI\u2019s production service is not compromised. Alternatively, NCI may be able to provide a temporary separate service to accommodate some aspects of the usage pattern. For example, the data might be released via a modified server that can address shortcomings of a specific benchmark case but would not be applicable generally. This may be a short-term measure until a better server solution is found, or it may address current local issues on either the data or client application side.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Data_QA\">Data QA<\/span><\/h3>\n<p>To ensure that the data is usable across a range of use-cases and environments, the QA approach uses benchmarks for testing data located on the local filesystem, as well as remotely via the data service endpoints. 
The QA process is designed to verify that software and libraries used are functioning properly with the most commonly used tools in the community.\n<\/p><p>The following is a list of data services that are available under NCI\u2019s Unidata Thematic Real-time Environmental Distributed Data Services (THREDDS):\n<\/p>\n<ul><li> Open-source Project for a Network Data Access Protocol (OPeNDAP): a protocol enabling data access and subsetting through the web;<\/li>\n<li> NetCDF Subset Service (NCSS): web service for subsetting files that can be read by the netCDF Java library;<\/li>\n<li> WMS: OGC web service for requesting raster images of data;<\/li>\n<li> WCS: OGC web service for requesting data in some output format;<\/li>\n<li> Godiva2 Data Viewer: tool for simple visualization of data; and<\/li>\n<li> HTTP File Download: for direct downloading of data.<\/li><\/ul>\n<p>The data is tested through each of the required services as part of the QA process, with the basic usability functionality tests applied to each service as shown in Table 2. Should an issue be discovered during these functionality tests, the issue is investigated further. This may lead to additional modifications of the data so as to pass the functionality or performance requirements, and in doing so requires further communication with the data producer to ensure that such changes are acceptable and can be corrected in any future data production process. More detailed functionality can also be recorded for scientific use around the data. 
Such tests tend to be specific to the data use-case but follow the same methodology as that described here.\n<\/p><p><br \/>\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"2\"><b>Table 2.<\/b> Description of basic accessibility and functionality tests that are applied for commonly used tools as part of NCI\u2019s QA tests\n<\/td><\/tr>\n\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Test\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Measures of Success\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">netCDF C-Library\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Using the <tt>ncdump -h &lt;file&gt;<\/tt> function from command line, the file is readable and displays the file header information about the file dimensions, variables, and metadata.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">GDAL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Using the <tt>gdalinfo &lt;file&gt;<\/tt> function from command line, the file is readable and displays the file header information about the file dimensions, variables, and metadata.<br \/>Using the <tt>gdalinfo NETCDF:&lt;file&gt;:&lt;subdataset&gt;<\/tt> function from command line, the subdatasets are readable and corresponding metadata for each subdataset is displayed.<br \/>The <tt>Open<\/tt> and <tt>GetMetadata<\/tt> functions return non-empty values that correspond to the netCDF file contents.<br \/>The <tt>GetProjection<\/tt> function (of the appropriate file or subdataset) returns a non-empty result corresponding to the data coordinate reference system 
information.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NCO (NetCDF Operators)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Using the <tt>ncks -m &lt;file&gt;<\/tt> function from command line, the file is readable and displays file metadata.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">CDO (Climate Data Operators)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Using the <tt>cdo sinfon &lt;file&gt;<\/tt> function from command line, the file is readable and displays information on the included variables, grids, and coordinates.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Ferret\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Using <tt>SET DATA \u201c&lt;file&gt;\u201d<\/tt> followed by <tt>SHOW DATA<\/tt> displays information on file contents.<br \/> Using <tt>SET DATA \u201c&lt;file&gt;\u201d<\/tt> followed by <tt>SHADE &lt;variable&gt;<\/tt> (or another plotting command) produces a plot of the requested data.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">THREDDS Data Server\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Dataset index catalog page loads without timeout and within reasonable time expectations (&lt;10 s).\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">THREDDS Data Service Endpoints\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><b>HTTP Download<\/b>: File download commences when the HTTPServer option is selected from the THREDDS catalog page for the file.<br \/><b>OPeNDAP<\/b>: When selecting OPeNDAP from the THREDDS catalog page for the file, the OPeNDAP Dataset Access Form page loads without error. 
From the OPeNDAP Dataset Access Form page, a data subset is returned in ASCII format after selecting data and clicking the Get ASCII option at the top of the page.<br \/><b>Godiva2<\/b>: When selecting the Godiva2 viewer option from the THREDDS catalog page for the file, the viewer displays the file contents.<br \/><b>WMS<\/b>: When selecting the WMS option from the THREDDS catalog page for the file, the web browser displays the GetCapabilities information in XML format. After constructing a GetMap request, the web browser displays the corresponding map.<br \/><b>WCS<\/b>: When selecting the WCS option from the THREDDS catalog page for the file, the web browser displays the GetCapabilities information in XML format. After constructing a GetCoverage request, file download of coverage commences.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Panoply\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">From the File \u2192 Open menu, the file can be opened. 
File contents and metadata are displayed.<br \/>Using Create Plot for a selected variable, data is displayed correctly in a new plot window.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">QGIS\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Using the Add WMS\/WMTS menu option, QGIS can request GetCapabilities and\/or GetMap operations, and the layer is visible.<br \/>The ncWMS GetCapabilities URL accepts and adds the NCI THREDDS Server, the request displays the available layers to select from, and a selected layer displays according to user expectations.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NASA Web WorldWind\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">The ncWMS GetCapabilities URL accepts and adds the NCI THREDDS Server, the request displays the available layers to select from, and a selected layer displays according to user expectations.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">PYTHON cdms2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">The file can be opened by the <tt>Open<\/tt> function.<br \/>File metadata is displayed using <tt>Attributes<\/tt> function.<br \/>File data contents are displayed when using <tt>Variables<\/tt> function.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">PYTHON netCDF4\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">The file can be opened by the <tt>Dataset<\/tt> function.<br \/>File metadata is displayed using <tt>ncattrs<\/tt> object.<br \/>File data contents are displayed using <tt>variables<\/tt> (and\/or <tt>groups<\/tt>) objects.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">PYTHON h5py\n<\/td>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">The netCDF file can be opened by the <tt>File<\/tt> function.<br \/>The metadata and variables are displayed by the <tt>keys<\/tt> and <tt>attrs<\/tt> objects.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ParaView\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">From the File \u2192 Open menu, the file can be opened and displayed as a layer in the Pipeline Browser. Enabling layer visibility results in data displaying in the Layout window.\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h2><span class=\"mw-headline\" id=\"Examples_of_tests_and_reports_undertaken_on_NCI_datasets_prior_to_publication\">Examples of tests and reports undertaken on NCI datasets prior to publication<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Metadata_QC_checker_reports\">Metadata QC checker reports<\/span><\/h3>\n<p>To assess the CF and ACDD compliance, NCI runs a QC checker prior to data publication and works with the data producer to rectify problems. The NCI checker is based on the U.S. Integrated Ocean Observing System (IOOS) Compliance Checker<sup id=\"rdp-ebb-cite_ref-IOOSCompliance_17-0\" class=\"reference\"><a href=\"#cite_note-IOOSCompliance-17\" rel=\"external_link\">[17]<\/a><\/sup> but has been modified to include additional checks relevant to NCI\u2019s data services as well as the modified ACDD convention. Appendix B shows an example QC checker report (Figure A1) with metadata that is 100% compliant with NCI\u2019s requirements. In practice, the process usually needs to be run several times: the datasets are checked, feedback is given, and the check is then re-run against the timestamp for each version to keep a record of metadata update provenance. 
The reports are shared with the data producers with comments and additional feedback provided in the \u201chigh\/medium\/low-priority suggestions\u201d section at the end of the report, depending on the potential impact of non-compliance.\n<\/p><p>Due to the large number of data files that can be involved, NCI\u2019s QC checker has been modified to enable parallelization so that multiple processes can be run simultaneously, thus increasing performance of the checking process. For instance, it takes less than a minute to check hundreds of files, and about 10 minutes for tens of thousands. For the largest datasets, the QC checker can typically run on more than one million files at a time.\n<\/p><p>The QC checker also helps to find corrupted or temporary files, which can be easily overlooked or not detected by the data producers, especially during a batch production process.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Functionality_test_QA_reports\">Functionality test QA reports<\/span><\/h3>\n<p>Appendix B provides an example report (Figure A2) of the QA results from checking three data files when accessed directly on the filesystem and their service endpoints for access via THREDDS. The functionality test shows that the variable structure within the data of two files (2 GB and 4 GB) is too large for the files to load in several commonly used data viewers, such as ncview (v2.1.1) and Panoply (v4.5.1), and that similar issues occur when opening the files through the service endpoints. 
In this case, our advice for mitigation is to reduce the requested size of the image by using a lower resolution or to work <i>in situ<\/i> with this particular data file, as recorded in the comments of Figure A2, sections b and c.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Benchmarking_use_cases\">Benchmarking use cases<\/span><\/h3>\n<p>In the benchmark tests several popular tools and APIs are run to evaluate their elapsed time on accessing data either residing on the local filesystem or being accessed via data services. The test files in the example NCI functionality QA test report (Figure A2) are used in the benchmark tests, and their data structures are listed in Table 3. We access the 2D variable in each file, which is recorded at (lat, lon), chunked at (128,128) and deflated at level 2.\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"5\"><b>Table 3.<\/b> Data structure of the sample files used in the benchmark tests\n<\/td><\/tr>\n\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" colspan=\"2\">Attributes\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">File 1\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">File 2\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">File 3\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" rowspan=\"2\">lon (double)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Size\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">5717\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">59501\n<\/td>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">40954\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Chunksize\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">128\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">128\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">128\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" rowspan=\"2\">lat (double)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Size\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">4182\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">41882\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">34761\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Chunksize\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">128\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">128\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">128\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" rowspan=\"3\">Variable(float)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Name\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">grav_ir_anomaly\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">mag_tmi_rtp_anomaly\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">rad_air_dose_rate\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Size\n<\/td>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">(4182,5717)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">(41882,59501)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">(34761,40954)\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Chunksize\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">(128,128)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">(128,128)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">(128,128)\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Deflate Level\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Format\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">netCDF-4 classic model\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">netCDF-4 classic model\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">netCDF-4 classic model\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>The elapsed times of the benchmark tests are listed in Table 4. The netCDF utilities, such as ncdump and h5dump, can dump the contents of netCDF files into an ASCII representation. 
They are frequently used in the functionality test of the QA report to fetch the metadata of the netCDF files. In the performance benchmarking tests, we measure the elapsed time to dump the whole variable as human-readable ASCII text. This performance depends on the internal data organization (such as contiguous or chunked storage, deflation, and shuffling) and involves numerous type conversion operations. Such conversions may incur a heavy overhead during the dump process, so dumping a large file can take a very long time to complete.\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"5\"><b>Table 4.<\/b> Benchmark results (elapsed time in seconds)\n<\/td><\/tr>\n\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Program\/Service\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Test\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">File 1\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">File 2\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">File 3\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" rowspan=\"2\">NetCDF Utilities\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ncdump\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">8.630\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">5584.414\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">3246.879\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">h5dump\n<\/td>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">40.547\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">3546.999\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2373.483\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" rowspan=\"3\">Python (2.7.x) netCDF APIs\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">netCDF4-python (1.2.7)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.445\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">48.603\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">29.160\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">GDAL-python (1.11.1)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.421\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">42.654\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">25.538\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">h5py (v2.6.0)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0.356\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">40.105\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">23.826\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" rowspan=\"3\">THREDDS Data Server (TDS)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">netCDF4-python (1.2.7)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">3.087\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">282.797\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">185.358\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">OPeNDAP (TDS v4.6.6)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">3.038\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">277.21\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">194.85\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">netCDF Subset Service (TDS v4.6.6)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2.833\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">248.194\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">158.236\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>In Table 4 we show an extreme case in which the files provided comply with standard QC checks and are well formatted. However, when we evaluate the files using the standard suite of tools, we see that dumping a variable from the 2 GB and 4 GB files with ncdump or h5dump takes between roughly 40 and 93 minutes. To evaluate the performance of programmatic methods on netCDF files, we use netCDF4-python, Geospatial Data Abstraction Library (GDAL)-python, and h5py to access the target files from the Lustre filesystems. In this case our tests show that all three APIs take much less time to fetch the whole variable than the netCDF dump tools, because they avoid the overhead of data conversion and transport. Our tests also show that h5py performs best. Since netCDF-4 is essentially a profile of the HDF5 format, both netCDF4-python and GDAL-python eventually invoke the HDF5 library to access the data. 
NetCDF4-python can also access data from the THREDDS server (which is tested for performance on our high-speed internal network), but it takes roughly six to seven times longer to access the data via the data service than to access the same volume of data on our Lustre filesystem. All three tools take a similar time to access data from our THREDDS server. By default, netCDF4-python and THREDDS have a request size limit of 500 MB, so it is necessary to divide the fetching process into several individual requests if the target dataset is larger than 500 MB. NCSS, on the other hand, has a much larger size limit per request, so fewer requests are needed with NCSS than with either netCDF4-python or THREDDS.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Results_sharing\">Results sharing<\/span><\/h3>\n<p>All QC\/QA reports and benchmarks are shared with the data producers. In the future we plan to make these reports available to the wider community, as the information provides consumers with evidence on how the data is functioning and how it has performed with different software and libraries. It also provides guidance on how to best use the data and enables the consumer to determine if they are using data, or a tool to access the data, that has not been tested before. This information is also used in data training to demonstrate the application of data standards in both data organization and data preparation, and how to use the data with a range of software.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Discussion\">Discussion<\/span><\/h2>\n<p>The NCI DQS has been applied to climate and weather, earth observation, geoscience, and astronomy data, with the QC and QA tests adapted to meet the relevant community standards and protocols for each domain. 
The examples provided in this paper have shown how the knowledge and experience on data standards for netCDF files and conventions\u2014such as CF and ACDD, initially developed within the climate community\u2014are applicable to other scientific domains. For example, in the geophysics domain, there is a growing need to enable access to much larger data volumes, over larger spatial areas and\/or enable aggregation of data from multiple individual geophysical surveys. To do this, in consultation with the geophysics and HDF communities, the principles of the CF convention from the climate community and the ACDD from the Earth science community were translated into a proposed new geophysics convention that improves programmatic access and interoperability across different geophysical data types, such as seismic, gravity, magnetotelluric, and radiometric.<sup id=\"rdp-ebb-cite_ref-WangImprov17_18-0\" class=\"reference\"><a href=\"#cite_note-WangImprov17-18\" rel=\"external_link\">[18]<\/a><\/sup> We also applied our benchmarking strategy to the geophysics domain, initially using the domain-popular ObsPy library<sup id=\"rdp-ebb-cite_ref-19\" class=\"reference\"><a href=\"#cite_note-19\" rel=\"external_link\">[19]<\/a><\/sup> and SPECFEM3D code<sup id=\"rdp-ebb-cite_ref-CIG_SPEC_20-0\" class=\"reference\"><a href=\"#cite_note-CIG_SPEC-20\" rel=\"external_link\">[20]<\/a><\/sup>, to demonstrate how different organizations of the data (in terms of chunking size and compression) impact on the performance by comparing new data formats, such as PH5<sup id=\"rdp-ebb-cite_ref-IRIS_PH5_21-0\" class=\"reference\"><a href=\"#cite_note-IRIS_PH5-21\" rel=\"external_link\">[21]<\/a><\/sup> and ASDF<sup id=\"rdp-ebb-cite_ref-KrischerAnAdapt16_22-0\" class=\"reference\"><a href=\"#cite_note-KrischerAnAdapt16-22\" rel=\"external_link\">[22]<\/a><\/sup> to traditional formats such as the Society of Exploration Geophysicists-Y Data Exchange Format (SEG-Y), the Standard for the Exchange of 
Earthquake Data Format (SEED), Seismic Analysis Code (SAC), etc.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusions\">Conclusions<\/span><\/h2>\n<p>We have developed a DQS as a key component of our vision to provide a trustworthy, transdisciplinary, high-performance data platform which enables researchers to share, use, reuse, and repurpose the data collections in high-end computational and data-intensive environments. The implementation of the DQS provides assurance to users that the data is properly quality checked and compliant with the relevant community standards. The functionality check in the QA process lists suitable software and libraries so that users can check whether the data is usable within their platform. Applying the DQS provides a standard way to (1) assess completeness and consistency of data across multiple datasets and collections; (2) evaluate the suitability of the data for transdisciplinary use; (3) enable standardized programmatic access; and (4) avoid the negative impacts of poor data and a dissatisfying user experience.\n<\/p><p>The NCI DQS identifies issues with the data and metadata at the time of data ingestion onto the NCI data platform, thus allowing corrections to be undertaken prior to publication. Applying the DQS means that scientists spend less time reformatting and wrangling the data to make it suitable for use by their applications and workflows, especially if their applications can read standardized interfaces. 
Future work will focus on broader adoption of data from additional domains and data types, as well as improving the use of controlled vocabularies for individual data attributes as a means of more efficiently indexing the data.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<p>The authors wish to acknowledge funding from the Australian Government Department of Education, through the National Collaborative Research Infrastructure Strategy (NCRIS), and the Education Investment Fund (EIF) Super Science Initiatives through the NCI and Research Data Services (RDS) projects. We also wish to acknowledge the organizational partners and data managers involved in data management at NCI, particularly Geoscience Australia, the Bureau of Meteorology, CSIRO, and the Australian National University.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Author_contributions\">Author contributions<\/span><\/h3>\n<p>B.E. and K.D. conceived and designed the NCI DQS. K.D. developed the code for the QC\/QA checker. K.D. and J.W. ran the QC and QA tests and generated the reports. R.Y. ran the benchmark tests. J.W., K.D., R.Y., and L.W. wrote the initial paper. B.E., C.R., and L.W. reviewed and improved key sections of the paper, particularly for the broader activities of QA and its application.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Conflicts_of_interest\">Conflicts of interest<\/span><\/h3>\n<p>The authors declare no conflict of interest. 
The founding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Appendix\">Appendix<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Appendix_A\">Appendix A<\/span><\/h3>\n<p>NCI NetCDF Metadata Guide based on the Attribute Convention for Dataset Discovery (ACDD v1.3)\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"2\"><b>Table A1.<\/b> A subset of attributes from the ACDD metadata specification<sup id=\"rdp-ebb-cite_ref-ESIPAttri_16-2\" class=\"reference\"><a href=\"#cite_note-ESIPAttri-16\" rel=\"external_link\">[16]<\/a><\/sup>, with the priority level of each attribute categorized as \u201cRequired,\u201d \u201cRecommended,\u201d or \u201cSuggested.\u201d Some priority levels have been modified to better align with NCI\u2019s data hosting services (e.g., NCI classifies \u201csource\u201d as \u201cRequired,\u201d whereas the ACDD guidelines list it only as \u201cRecommended\u201d).\n<\/td><\/tr>\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" colspan=\"2\">REQUIRED\n<\/th><\/tr>\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Global Attribute\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Description\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">title\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">A short phrase or sentence describing the dataset. 
In many discovery systems, the title will be displayed in the results list from a search, and therefore it should be human-readable and reasonable to display in a list of such names. This attribute is also recommended by the NetCDF Users Guide and the CF conventions.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">summary\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">A paragraph describing the dataset, analogous to an abstract for a paper.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">source\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">The method of production of the original data. If it was model-generated, source should name the model and its version. If it is observational, source should characterize it. This attribute is defined in the CF Conventions. Examples: \"temperature from CTD #1234\"; \"world model v.0.1\".\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">date_created\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">The date on which this version of the data was created. (Modification of values implies a new version, hence this would be assigned the date of the most recent values modification.) Metadata changes are not considered when assigning the date_created. 
The ISO 8601:2004 extended date format is recommended, as described in the Attribute Content Guidance section.\n<\/td><\/tr>\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" colspan=\"2\">RECOMMENDED\n<\/th><\/tr>\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Global Attribute\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Description\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Conventions\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">A comma-separated list of the conventions that are followed by the dataset. For files that follow this version of ACDD, include the string \u2018ACDD-1.3\u2019. (This attribute is described in the netCDF Users Guide.)\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">metadata_link\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">A URL that gives the location of more complete metadata. A persistent URL is recommended for this attribute.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">history\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Provides an <a href=\"https:\/\/www.limswiki.org\/index.php\/Audit_trail\" title=\"Audit trail\" target=\"_blank\" class=\"wiki-link\" data-key=\"96a617b543c5b2f26617288ba923c0f0\">audit trail<\/a> for modifications to the original data. This attribute is also in the netCDF Users Guide: \"This is a character array with a line for each invocation of a program that has modified the dataset. 
Well-behaved generic netCDF applications should append a line containing: date, time of day, user name, program name and command arguments.\" To include a more complete description you can append a reference to an ISO Lineage entity; see NOAA EDM ISO Lineage guidance.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">license\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Provide the URL to a standard or specific license, enter \u201cFreely Distributed\u201d or \u201cNone\u201d, or describe any restrictions to data access and distribution in free text.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">doi\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">To be used if a DOI exists.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">product_version\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Version identifier of the data file or product as assigned by the data creator. For example, a new algorithm or methodology could result in a new product_version.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">processing_level\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">A textual description of the processing (or QC) level of the data.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">institution\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">The name of the institution principally responsible for originating this data. 
This attribute is recommended by the CF convention.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">project\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">The name of the project(s) principally responsible for originating this data. Multiple projects can be separated by commas, as described under Attribute Content Guidelines. Examples: \"PATMOS-X\" and \"Extended Continental Shelf Project\".\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">instrument\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Name of the contributing instrument(s) or sensor(s) used to create this data set or product. Indicate controlled vocabulary used in instrument_vocabulary.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">platform\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Name of the platform(s) that supported the sensor data used to create this data set or product. Platforms can be of any type, including satellite, ship, station, aircraft or other. Indicate controlled vocabulary used in platform_vocabulary.\n<\/td><\/tr>\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\" colspan=\"2\">SUGGESTED\n<\/th><\/tr>\n<tr>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Global Attribute\n<\/th>\n<th style=\"background-color:#e2e2e2; padding-left:10px; padding-right:10px;\">Description\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">id\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">An identifier for the data set, provided by and unique within its naming authority. 
The combination of the \u201cnaming authority\u201d and the \u201cid\u201d should be globally unique, but the id can be globally unique by itself also. IDs can be URLs, URNs, DOIs, meaningful text strings, a local key, or any other unique string of characters. The id should not include white space characters.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">date_modified\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">The date on which the data was last modified. Note that this applies just to the data, not the metadata. The ISO 8601:2004 extended date format is recommended, as described in the Attributes Content Guidance section.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">date_created\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">The date on which this version of the data was created. (Modification of values implies a new version, hence this would be assigned the date of the most recent values modification.) Metadata changes are not considered when assigning the date_created. The ISO 8601:2004 extended date format is recommended, as described in the Attribute Content Guidance section.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">date_issued\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">The date on which this data (including all modifications) was formally issued (i.e., made available to a wider audience). Note that these apply just to the data, not the metadata. 
The ISO 8601:2004 extended date format is recommended, as described in the Attributes Content Guidance section.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">references\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Published or web-based references that describe the data or methods used to produce it. Recommend URIs (such as a URL or DOI) for papers or other references. This attribute is defined in the CF conventions.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">keywords\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">A comma-separated list of key words and\/or phrases. Keywords may be common words or phrases, terms from a controlled vocabulary (GCMD is often used), or URIs for terms from a controlled vocabulary (see also \u201ckeywords_vocabulary\u201d attribute).\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">standard_name_vocabulary\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">The name and version of the controlled vocabulary from which variable standard names are taken. (Values for any standard_name attribute must come from the CF Standard Names vocabulary for the data file or product to comply with CF.) Example: \"CF Standard Name Table v27\".\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">geospatial_lat_min\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Describes a simple lower latitude limit; may be part of a 2- or 3-dimensional bounding region. 
geospatial_lat_min specifies the southernmost latitude covered by the dataset.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">geospatial_lat_max\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Describes a simple upper latitude limit; may be part of a 2- or 3-dimensional bounding region. geospatial_lat_max specifies the northernmost latitude covered by the dataset.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">geospatial_lon_min\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Describes a simple longitude limit; may be part of a 2- or 3-dimensional bounding region. geospatial_lon_min specifies the westernmost longitude covered by the dataset. See also geospatial_lon_max.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">geospatial_lon_max\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Describes a simple longitude limit; may be part of a 2- or 3-dimensional bounding region. geospatial_lon_max specifies the easternmost longitude covered by the dataset. Cases where geospatial_lon_min is greater than geospatial_lon_max indicate the bounding box extends from geospatial_lon_max, through the longitude range discontinuity meridian (either the antimeridian for \u2212180:180 values, or Prime Meridian for 0:360 values), to geospatial_lon_min; for example, geospatial_lon_min = 170 and geospatial_lon_max = \u2212175 incorporates 15 degrees of longitude (ranges 170 to 180 and \u2212180 to \u2212175).\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">geospatial_vertical_min\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Describes the numerically smaller vertical limit; may be part of a 2- or 3-dimensional bounding region. 
See geospatial_vertical_positive and geospatial_vertical_units.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">geospatial_vertical_max\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Describes the numerically larger vertical limit; may be part of a 2- or 3-dimensional bounding region. See geospatial_vertical_positive and geospatial_vertical_units.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">geospatial_vertical_positive\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">One of \"up\" or \"down.\" If up, vertical values are interpreted as \"altitude,\" with negative values corresponding to below the reference datum (e.g., under water). If down, vertical values are interpreted as \"depth,\" positive values correspond to below the reference datum. Note that if geospatial_vertical_positive is down (\"depth\" orientation), the geospatial_vertical_min attribute specifies the data\u2019s vertical location furthest from the earth\u2019s center, and the geospatial_vertical_max attribute specifies the location closest to the earth\u2019s center.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">geospatial_bounds\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Describes the data\u2019s 2D or 3D geospatial extent in OGC\u2019s Well-Known Text (WKT) Geometry format (reference the OGC Simple Feature Access (SFA) specification). The meaning and order of values for each point\u2019s coordinates depends on the coordinate reference system (CRS). The ACDD default is 2D geometry in the EPSG:4326 coordinate reference system. The default may be overridden with geospatial_bounds_crs and geospatial_bounds_vertical_crs (see those attributes). 
EPSG:4326 coordinate values are latitude (decimal degrees_north) and longitude (decimal degrees_east), in that order. Longitude values in the default case are limited to the [\u2212180, 180) range. Example: \"POLYGON ((40.26 -111.29, 41.26 -111.29, 41.26 -110.29, 40.26 -110.29, 40.26 -111.29))\".\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">time_coverage_start\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Describes the time of the first data point in the data set. Use the ISO 8601:2004 date format, preferably the extended format as recommended in the Attribute Content Guidance section.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">time_coverage_end\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Describes the time of the last data point in the data set. Use ISO 8601:2004 date format, preferably the extended format as recommended in the Attribute Content Guidance section.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">time_coverage_duration\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Describes the duration of the data set. Use ISO 8601:2004 duration format, preferably the extended format as recommended in the Attribute Content Guidance section.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">time_coverage_resolution\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Describes the targeted time period between each value in the data set. 
Use ISO 8601:2004 duration format, preferably the extended format as recommended in the Attribute Content Guidance section.\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Appendix_B\">Appendix B<\/span><\/h3>\n<p>Examples of NCI\u2019s Quality Control (QC) and Quality Assurance (QA) reporting\n<\/p><p><a href=\"https:\/\/www.limswiki.org\/index.php\/File:FigA1_Evans_Informatics2017_4-4.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"87a79c0417d8cccbd9b471cd901ba4eb\"><img alt=\"FigA1 Evans Informatics2017 4-4.png\" src=\"https:\/\/www.limswiki.org\/images\/0\/09\/FigA1_Evans_Informatics2017_4-4.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure A1.<\/b> An example of NCI\u2019s QC compliance report, which is shared with data producers and used to ensure that the dataset metadata meets the minimum requirements for a netCDF collection. In this particular example collection, 30 files were successfully scanned (zero skipped) and all elements of the QC process passed. 
In cases where elements are not fully compliant, the high\/medium\/low priority suggestions section at the end of the report is used to explain the nature of the errors found and list possible means for modification.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p><a href=\"https:\/\/www.limswiki.org\/index.php\/File:FigA2_Evans_Informatics2017_4-4.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"42f2c84fe0a0b96602b7c94ea19723b5\"><img alt=\"FigA2 Evans Informatics2017 4-4.png\" src=\"https:\/\/www.limswiki.org\/images\/1\/17\/FigA2_Evans_Informatics2017_4-4.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<p><a href=\"https:\/\/www.limswiki.org\/index.php\/File:FigA2b_Evans_Informatics2017_4-4.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"b4860fa6c86b40064f9a477cb05c822b\"><img alt=\"FigA2b Evans Informatics2017 4-4.png\" src=\"https:\/\/www.limswiki.org\/images\/2\/2d\/FigA2b_Evans_Informatics2017_4-4.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<p><a href=\"https:\/\/www.limswiki.org\/index.php\/File:FigA2c_Evans_Informatics2017_4-4.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"82c4045c5e6640e41c23969d11797c02\"><img alt=\"FigA2c Evans Informatics2017 4-4.png\" src=\"https:\/\/www.limswiki.org\/images\/2\/21\/FigA2c_Evans_Informatics2017_4-4.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure A2.<\/b> An example of an NCI functionality QA test report. 
(a) The first section of the report provides a short summary of results and whether the data is considered functional with all the tested tools, and lists the details of the files that were used for the assessment, including the properties of the files, such as size, variable shape, chunk size, and compression (deflate) level. (b) The second section provides the results for the functionality tests performed on the data, directly on the filesystem. (c) The third section provides the results of the functionality tests using the data served through NCI\u2019s THREDDS services.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-WangLarge14-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WangLarge14_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wang, J.; Evans, B.J.K.; Bastrakova, I. 
et al.&#32;(2014).&#32;\"Large-Scale Data Collection Metadata Management at the National Computation Infrastructure\".&#32;<i>Proceedings from the American Geophysical Union, Fall Meeting 2014<\/i>: IN14B-07.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Large-Scale+Data+Collection+Metadata+Management+at+the+National+Computation+Infrastructure&amp;rft.jtitle=Proceedings+from+the+American+Geophysical+Union%2C+Fall+Meeting+2014&amp;rft.aulast=Wang%2C+J.%3B+Evans%2C+B.J.K.%3B+Bastrakova%2C+I.+et+al.&amp;rft.au=Wang%2C+J.%3B+Evans%2C+B.J.K.%3B+Bastrakova%2C+I.+et+al.&amp;rft.date=2014&amp;rft.pages=IN14B-07&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-F11FAIR-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-F11FAIR_2-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.force11.org\/group\/fairgroup\/fairprinciples\" target=\"_blank\">\"The FAIR Data Principles\"<\/a>.&#32;Force11<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.force11.org\/group\/fairgroup\/fairprinciples\" target=\"_blank\">https:\/\/www.force11.org\/group\/fairgroup\/fairprinciples<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 23 August 2017<\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=The+FAIR+Data+Principles&amp;rft.atitle=&amp;rft.pub=Force11&amp;rft_id=https%3A%2F%2Fwww.force11.org%2Fgroup%2Ffairgroup%2Ffairprinciples&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-EvansExtend16-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-EvansExtend16_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Evans, B.J.K.; Wyborn, L.A.; Druken, K.A. et al.&#32;(2016).&#32;\"Extending the Common Framework for Earth Observation Data to other Disciplinary Data and Programmatic Access\".&#32;<i>Proceedings from the American Geophysical Union, Fall General Assembly 2016<\/i>: IN22A-05.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Extending+the+Common+Framework+for+Earth+Observation+Data+to+other+Disciplinary+Data+and+Programmatic+Access&amp;rft.jtitle=Proceedings+from+the+American+Geophysical+Union%2C+Fall+General+Assembly+2016&amp;rft.aulast=Evans%2C+B.J.K.%3B+Wyborn%2C+L.A.%3B+Druken%2C+K.A.+et+al.&amp;rft.au=Evans%2C+B.J.K.%3B+Wyborn%2C+L.A.%3B+Druken%2C+K.A.+et+al.&amp;rft.date=2016&amp;rft.pages=IN22A-05&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-RamapriyanEnsuring17-4\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-RamapriyanEnsuring17_4-0\" rel=\"external_link\">4.0<\/a><\/sup> <sup><a 
href=\"#cite_ref-RamapriyanEnsuring17_4-1\" rel=\"external_link\">4.1<\/a><\/sup> <sup><a href=\"#cite_ref-RamapriyanEnsuring17_4-2\" rel=\"external_link\">4.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ramapriyan, H.; Peng, G.; Moroni, D.; Shie, C.-L.&#32;(2017).&#32;\"Ensuring and Improving Information Quality for Earth Science Data and Products\".&#32;<i>D-Lib Magazine<\/i>&#32;<b>23<\/b>&#32;(7\/8).&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1045%2Fjuly2017-ramapriyan\" target=\"_blank\">10.1045\/july2017-ramapriyan<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Ensuring+and+Improving+Information+Quality+for+Earth+Science+Data+and+Products&amp;rft.jtitle=D-Lib+Magazine&amp;rft.aulast=Ramapriyan%2C+H.%3B+Peng%2C+G.%3B+Moroni%2C+D.%3B+Shie%2C+C.-L.&amp;rft.au=Ramapriyan%2C+H.%3B+Peng%2C+G.%3B+Moroni%2C+D.%3B+Shie%2C+C.-L.&amp;rft.date=2017&amp;rft.volume=23&amp;rft.issue=7%2F8&amp;rft_id=info:doi\/10.1045%2Fjuly2017-ramapriyan&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AtkinTotal05-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AtkinTotal05_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Atkin, B.; Brooks, A.&#32;\"Chapter 8: Service Specifications, Service Level Agreements and Performance\".&#32;<i>Total Facilities Management<\/i>.&#32;Wiley.&#32;<a rel=\"external_link\" class=\"external text\" 
href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a>&#160;9781405127905.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Chapter+8%3A+Service+Specifications%2C+Service+Level+Agreements+and+Performance&amp;rft.atitle=Total+Facilities+Management&amp;rft.aulast=Atkin%2C+B.%3B+Brooks%2C+A.&amp;rft.au=Atkin%2C+B.%3B+Brooks%2C+A.&amp;rft.pub=Wiley&amp;rft.isbn=9781405127905&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CTSData-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CTSData_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.coretrustseal.org\/why-certification\/requirements\/\" target=\"_blank\">\"Data Repositories Requirements\"<\/a>.&#32;CoreTrustSeal<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.coretrustseal.org\/why-certification\/requirements\/\" target=\"_blank\">https:\/\/www.coretrustseal.org\/why-certification\/requirements\/<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 24 October 2017<\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Data+Repositories+Requirements&amp;rft.atitle=&amp;rft.pub=CoreTrustSeal&amp;rft_id=https%3A%2F%2Fwww.coretrustseal.org%2Fwhy-certification%2Frequirements%2F&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-StallAGU16-7\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-StallAGU16_7-0\" rel=\"external_link\">7.0<\/a><\/sup> <sup><a href=\"#cite_ref-StallAGU16_7-1\" rel=\"external_link\">7.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation web\">Stall, S.; Downs, R.R.; Kempler, S.J.&#32;(2016).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.scidatacon.org\/2016\/sessions\/100\/\" target=\"_blank\">\"AGU's Data Management Maturity Model\"<\/a>.&#32;<i>Auditing of Trustworthy Data Repositories<\/i>.&#32;SciDataCon 2016<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.scidatacon.org\/2016\/sessions\/100\/\" target=\"_blank\">https:\/\/www.scidatacon.org\/2016\/sessions\/100\/<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=AGU%27s+Data+Management+Maturity+Model&amp;rft.atitle=Auditing+of+Trustworthy+Data+Repositories&amp;rft.aulast=Stall%2C+S.%3B+Downs%2C+R.R.%3B+Kempler%2C+S.J.&amp;rft.au=Stall%2C+S.%3B+Downs%2C+R.R.%3B+Kempler%2C+S.J.&amp;rft.date=2016&amp;rft.pub=SciDataCon+2016&amp;rft_id=https%3A%2F%2Fwww.scidatacon.org%2F2016%2Fsessions%2F100%2F&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-StallTheAmerican16-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-StallTheAmerican16_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Stall, S.; Hanson, B.; Wyborn, L.&#32;(2016).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/eresearchau.files.wordpress.com\/2016\/03\/eresau2016_paper_72.pdf\" target=\"_blank\">\"The American Geophysical Union Data Management Maturity Program\"<\/a>.&#32;<i>Proceedings from the eResearch Australasia Conference 2016<\/i>: 72<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/eresearchau.files.wordpress.com\/2016\/03\/eresau2016_paper_72.pdf\" target=\"_blank\">https:\/\/eresearchau.files.wordpress.com\/2016\/03\/eresau2016_paper_72.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=The+American+Geophysical+Union+Data+Management+Maturity+Program&amp;rft.jtitle=Proceedings+from+the+eResearch+Australasia+Conference+2016&amp;rft.aulast=Stall%2C+S.%3B+Hanson%2C+B.%3B+Wyborn%2C+L.&amp;rft.au=Stall%2C+S.%3B+Hanson%2C+B.%3B+Wyborn%2C+L.&amp;rft.date=2016&amp;rft.pages=72&amp;rft_id=https%3A%2F%2Feresearchau.files.wordpress.com%2F2016%2F03%2Feresau2016_paper_72.pdf&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CMMIDataMan-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CMMIDataMan_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/cmmiinstitute.com\/store\/data-management-maturity-(dmm)\" target=\"_blank\">\"Data Management Maturity (DMM)\"<\/a>.&#32;CMMI Institute LLC<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/cmmiinstitute.com\/store\/data-management-maturity-(dmm)\" target=\"_blank\">https:\/\/cmmiinstitute.com\/store\/data-management-maturity-(dmm)<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Data+Management+Maturity+%28DMM%29&amp;rft.atitle=&amp;rft.pub=CMMI+Institute+LLC&amp;rft_id=https%3A%2F%2Fcmmiinstitute.com%2Fstore%2Fdata-management-maturity-%28dmm%29&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-NCIDataPortal-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NCIDataPortal_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/geonetwork.nci.org.au\/geonetwork\/srv\/eng\/catalog.search#\/home\">\"NCI Data Portal\"<\/a>.&#32;National Computational Infrastructure<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/geonetwork.nci.org.au\/geonetwork\/srv\/eng\/catalog.search#\/home\">https:\/\/geonetwork.nci.org.au\/geonetwork\/srv\/eng\/catalog.search#\/home<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=NCI+Data+Portal&amp;rft.atitle=&amp;rft.pub=National+Computational+Infrastructure&amp;rft_id=https%3A%2F%2Fgeonetwork.nci.org.au%2Fgeonetwork%2Fsrv%2Feng%2Fcatalog.search%23%2Fhome&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TaylorCMIP12-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-TaylorCMIP12_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Taylor, K.E.; Balaji, V.; Hankin, S. 
et al.&#32;(13 June 2012).&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/pcmdi.llnl.gov\/mips\/cmip5\/docs\/cmip5_data_reference_syntax.pdf\" target=\"_blank\">\"CMIP5 Data Reference Syntax (DRS) and Controlled Vocabularies\"<\/a>&#32;(PDF).&#32;Program for Climate Model Diagnosis &amp; Intercomparison<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/pcmdi.llnl.gov\/mips\/cmip5\/docs\/cmip5_data_reference_syntax.pdf\" target=\"_blank\">https:\/\/pcmdi.llnl.gov\/mips\/cmip5\/docs\/cmip5_data_reference_syntax.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=CMIP5+Data+Reference+Syntax+%28DRS%29+and+Controlled+Vocabularies&amp;rft.atitle=&amp;rft.aulast=Taylor%2C+K.E.%3B+Balaji%2C+V.%3B+Hankin%2C+S.+et+al.&amp;rft.au=Taylor%2C+K.E.%3B+Balaji%2C+V.%3B+Hankin%2C+S.+et+al.&amp;rft.date=13+June+2012&amp;rft.pub=Program+for+Climate+Model+Diagnosis+%26+Intercomparison&amp;rft_id=https%3A%2F%2Fpcmdi.llnl.gov%2Fmips%2Fcmip5%2Fdocs%2Fcmip5_data_reference_syntax.pdf&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-USGSLandsat-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-USGSLandsat_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/landsat.usgs.gov\/what-are-naming-conventions-landsat-scene-identifiers\" target=\"_blank\">\"What are the naming conventions for Landsat scene identifiers?\"<\/a>.&#32;U.S. 
Geological Survey<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/landsat.usgs.gov\/what-are-naming-conventions-landsat-scene-identifiers\" target=\"_blank\">https:\/\/landsat.usgs.gov\/what-are-naming-conventions-landsat-scene-identifiers<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 23 August 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=What+are+the+naming+conventions+for+Landsat+scene+identifiers%3F&amp;rft.atitle=&amp;rft.pub=U.S.+Geological+Survey&amp;rft_id=https%3A%2F%2Flandsat.usgs.gov%2Fwhat-are-naming-conventions-landsat-scene-identifiers&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ISO19115-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ISO19115_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.iso.org\/standard\/53798.html\" target=\"_blank\">\"ISO 19115-1:2014 Geographic information -- Metadata -- Part 1: Fundamentals\"<\/a>.&#32;International Organization for Standardization.&#32;April 2014<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.iso.org\/standard\/53798.html\" target=\"_blank\">https:\/\/www.iso.org\/standard\/53798.html<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 25 May 2016<\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=ISO+19115-1%3A2014+Geographic+information+--+Metadata+--+Part+1%3A+Fundamentals&amp;rft.atitle=&amp;rft.date=April+2014&amp;rft.pub=International+Organization+for+Standardization&amp;rft_id=https%3A%2F%2Fwww.iso.org%2Fstandard%2F53798.html&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NASAGlossary-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NASAGlossary_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/earthdata.nasa.gov\/user-resources\/glossary#ed-glossary-g\">\"Granule\"<\/a>.&#32;<i>EarthData Glossary<\/i><span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/earthdata.nasa.gov\/user-resources\/glossary#ed-glossary-g\">https:\/\/earthdata.nasa.gov\/user-resources\/glossary#ed-glossary-g<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 23 August 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Granule&amp;rft.atitle=EarthData+Glossary&amp;rft_id=https%3A%2F%2Fearthdata.nasa.gov%2Fuser-resources%2Fglossary%23ed-glossary-g&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LLNLCFConv-15\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-LLNLCFConv_15-0\" rel=\"external_link\">15.0<\/a><\/sup> <sup><a href=\"#cite_ref-LLNLCFConv_15-1\" rel=\"external_link\">15.1<\/a><\/sup><\/span> <span 
class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/cfconventions.org\/\" target=\"_blank\">\"CF Conventions and Metadata\"<\/a>.&#32;Lawrence Livermore National Laboratory<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/cfconventions.org\/\" target=\"_blank\">http:\/\/cfconventions.org\/<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 23 August 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=CF+Conventions+and+Metadata&amp;rft.atitle=&amp;rft.pub=Lawrence+Livermore+National+Laboratory&amp;rft_id=http%3A%2F%2Fcfconventions.org%2F&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ESIPAttri-16\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-ESIPAttri_16-0\" rel=\"external_link\">16.0<\/a><\/sup> <sup><a href=\"#cite_ref-ESIPAttri_16-1\" rel=\"external_link\">16.1<\/a><\/sup> <sup><a href=\"#cite_ref-ESIPAttri_16-2\" rel=\"external_link\">16.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/wiki.esipfed.org\/index.php\/Attribute_Convention_for_Data_Discovery_(ACDD)\" target=\"_blank\">\"Attribute Convention for Data Discovery 1-3\"<\/a>.&#32;Federation of Earth Science Information Partners<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/wiki.esipfed.org\/index.php\/Attribute_Convention_for_Data_Discovery_(ACDD)\" target=\"_blank\">http:\/\/wiki.esipfed.org\/index.php\/Attribute_Convention_for_Data_Discovery_(ACDD)<\/a><\/span><span 
class=\"reference-accessdate\">.&#32;Retrieved 23 August 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=Attribute+Convention+for+Data+Discovery+1-3&amp;rft.atitle=&amp;rft.pub=Federation+of+Earth+Science+Information+Partners&amp;rft_id=http%3A%2F%2Fwiki.esipfed.org%2Findex.php%2FAttribute_Convention_for_Data_Discovery_%28ACDD%29&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-IOOSCompliance-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-IOOSCompliance_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/github.com\/ioos\/compliance-checker\" target=\"_blank\">\"ioos\/compliance-checker\"<\/a>.&#32;GitHub<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/github.com\/ioos\/compliance-checker\" target=\"_blank\">https:\/\/github.com\/ioos\/compliance-checker<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 22 November 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=ioos%2Fcompliance-checker&amp;rft.atitle=&amp;rft.pub=GitHub&amp;rft_id=https%3A%2F%2Fgithub.com%2Fioos%2Fcompliance-checker&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WangImprov17-18\"><span class=\"mw-cite-backlink\"><a 
href=\"#cite_ref-WangImprov17_18-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wang, J.; Yang, R.; Evans, B.J.K.&#32;(2017).&#32;\"Improving Seismic Data Accessibility and Performance Using HDF Containers\".&#32;<i>Proceedings from the AGU 2017 Fall Meeting<\/i>: IN42B-04.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Improving+Seismic+Data+Accessibility+and+Performance+Using+HDF+Containers&amp;rft.jtitle=Proceedings+from+the+AGU+2017+Fall+Meeting&amp;rft.aulast=Wang%2C+J.%3B+Yang%2C+R.%3B+Evans%2C+B.J.K.&amp;rft.au=Wang%2C+J.%3B+Yang%2C+R.%3B+Evans%2C+B.J.K.&amp;rft.date=2017&amp;rft.pages=IN42B-04&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-19\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Megies, T.&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/github.com\/obspy\/obspy\/wiki\" target=\"_blank\">\"obspy\/obspy\"<\/a>.&#32;GitHub<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/github.com\/obspy\/obspy\/wiki\" target=\"_blank\">https:\/\/github.com\/obspy\/obspy\/wiki<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 06 November 2017<\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=obspy%2Fobspy&amp;rft.atitle=&amp;rft.aulast=Megies%2C+T.&amp;rft.au=Megies%2C+T.&amp;rft.pub=GitHub&amp;rft_id=https%3A%2F%2Fgithub.com%2Fobspy%2Fobspy%2Fwiki&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CIG_SPEC-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CIG_SPEC_20-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Computational Infrastructure for Geodynamics.&#32;<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/geodynamics.org\/cig\/software\/specfem3d\/\" target=\"_blank\">\"SPECFEM3D Cartesian\"<\/a>.&#32;University of California Davis<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/geodynamics.org\/cig\/software\/specfem3d\/\" target=\"_blank\">https:\/\/geodynamics.org\/cig\/software\/specfem3d\/<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 06 November 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=SPECFEM3D+Cartesian&amp;rft.atitle=&amp;rft.aulast=Computational+Infrastructure+for+Geodynamics&amp;rft.au=Computational+Infrastructure+for+Geodynamics&amp;rft.pub=University+of+California+Davis&amp;rft_id=https%3A%2F%2Fgeodynamics.org%2Fcig%2Fsoftware%2Fspecfem3d%2F&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-IRIS_PH5-21\"><span 
class=\"mw-cite-backlink\"><a href=\"#cite_ref-IRIS_PH5_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.passcal.nmt.edu\/content\/ph5-what-it\" target=\"_blank\">\"PH5: What is it?\"<\/a>.&#32;IRIS PASSCAL Instrument Center<span class=\"printonly\">.&#32;<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.passcal.nmt.edu\/content\/ph5-what-it\" target=\"_blank\">https:\/\/www.passcal.nmt.edu\/content\/ph5-what-it<\/a><\/span><span class=\"reference-accessdate\">.&#32;Retrieved 18 October 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=bookitem&amp;rft.btitle=PH5%3A+What+is+it%3F&amp;rft.atitle=&amp;rft.pub=IRIS+PASSCAL+Instrument+Center&amp;rft_id=https%3A%2F%2Fwww.passcal.nmt.edu%2Fcontent%2Fph5-what-it&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KrischerAnAdapt16-22\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KrischerAnAdapt16_22-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Krischer, L.; Smith, J.; Lei, W. 
et al.&#32;(2016).&#32;\"An Adaptable Seismic Data Format\".&#32;<i>Geophysical Journal International<\/i>&#32;<b>207<\/b>&#32;(2): 1003\u201311.&#32;<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fgji%2Fggw319\" target=\"_blank\">10.1093\/gji\/ggw319<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=An+Adaptable+Seismic+Data+Format&amp;rft.jtitle=Geophysical+Journal+International&amp;rft.aulast=Krischer%2C+L.%3B+Smith%2C+J.%3B+Lei%2C+W.+et+al.&amp;rft.au=Krischer%2C+L.%3B+Smith%2C+J.%3B+Lei%2C+W.+et+al.&amp;rft.date=2016&amp;rft.volume=207&amp;rft.issue=2&amp;rft.pages=1003%E2%80%9311&amp;rft_id=info:doi\/10.1093%2Fgji%2Fggw319&amp;rfr_id=info:sid\/en.wikipedia.org:Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\"><span style=\"display: none;\">&#160;<\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. 
Several URLs from the original were dead, and more current URLs were substituted.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis\">https:\/\/www.limswiki.org\/index.php\/Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","b1e9f2666792cce972a4a66979d1d937_images":["https:\/\/www.limswiki.org\/images\/f\/f3\/Fig1_Evans_Informatics2017_4-4.png","https:\/\/www.limswiki.org\/images\/7\/7d\/Fig2_Evans_Informatics2017_4-4.jpg","https:\/\/www.limswiki.org\/images\/0\/09\/FigA1_Evans_Informatics2017_4-4.png","https:\/\/www.limswiki.org\/images\/1\/17\/FigA2_Evans_Informatics2017_4-4.png","https:\/\/www.limswiki.org\/images\/2\/2d\/FigA2b_Evans_Informatics2017_4-4.png","https:\/\/www.limswiki.org\/images\/2\/21\/FigA2c_Evans_Informatics2017_4-4.png"],"b1e9f2666792cce972a4a66979d1d937_timestamp":1544815903,"543e9e1292050a1bf977a4aa6ec97efe":{"type":"chapter","title":"1. Data science and big data","key":"543e9e1292050a1bf977a4aa6ec97efe"}},"link":"https:\/\/www.limswiki.org\/index.php\/Book:LIMSjournal_-_Fall_2018","price_currency":"","price_amount":"","book_size":"","download_url":"https:\/\/www.limsforum.com?ebb_action=book_download&book_id=78063","language":"","cta_button_content":"","toc":[{"type":"chapter","name":"1. Data science and big data","id":"543e9e1292050a1bf977a4aa6ec97efe","children":[{"type":"article","name":"A data quality strategy to enable FAIR, programmatic access across large, diverse data collections for high performance data analysis (Evans et al. 2017)","id":"b1e9f2666792cce972a4a66979d1d937","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:A_data_quality_strategy_to_enable_FAIR,_programmatic_access_across_large,_diverse_data_collections_for_high_performance_data_analysis"},{"type":"article","name":"Data science as an innovation challenge: From big data to value proposition (Kayser et al. 
2018)","id":"3d10ab796a58a8bc8aa92318f0b8bfdb","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Data_science_as_an_innovation_challenge:_From_big_data_to_value_proposition"},{"type":"article","name":"The development of data science: Implications for education, employment, research, and the data revolution for sustainable development (Murtagh and Devlin 2018)","id":"795feead44bb9c43869be23a90bf9d75","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:The_development_of_data_science:_Implications_for_education,_employment,_research,_and_the_data_revolution_for_sustainable_development"},{"type":"article","name":"How big data, comparative effectiveness research, and rapid-learning health care systems can transform patient care in radiation oncology (Sanders and Showalter 2018)","id":"2c1bea416fe89e4530ea8d302ad49dbc","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:How_big_data,_comparative_effectiveness_research,_and_rapid-learning_health_care_systems_can_transform_patient_care_in_radiation_oncology"}]},{"type":"chapter","name":"2. Data sharing and open data","id":"498510331e1a6b3d31612804abb0edb3","children":[{"type":"article","name":"Technology transfer and true transformation: Implications for open data (Bezuidenhout 2017)","id":"8468ac745333952ccc234d2243224725","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Technology_transfer_and_true_transformation:_Implications_for_open_data"},{"type":"article","name":"Support Your Data: A research data management guide for researchers (Borghi et al. 
2018)","id":"5084f989065d7c37f4ccf170c3f09ee7","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Support_Your_Data:_A_research_data_management_guide_for_researchers"},{"type":"article","name":"Promoting data sharing among Indonesian scientists: A proposal of a generic university-level research data management plan (RDMP) (Irawan and Rachmi 2018)","id":"bbf9b02ac710d05d03c94083fa4e01e0","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Promoting_data_sharing_among_Indonesian_scientists:_A_proposal_of_a_generic_university-level_research_data_management_plan_(RDMP)"}]},{"type":"chapter","name":"3. Information management tools and techniques","id":"ffae7f14a95b369e6cf6fadf97b0e00e","children":[{"type":"article","name":"systemPipeR: NGS workflow and report generation environment (Backman and Girke 2016)","id":"d6135e8d32b77d11c05c7b261fe72044","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:SystemPipeR:_NGS_workflow_and_report_generation_environment"},{"type":"article","name":"Wireless positioning in IoT: A look at current and future trends (e Silva et al. 2018)","id":"69cd9560f847d37e95c0bf5ffc36d532","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Wireless_positioning_in_IoT:_A_look_at_current_and_future_trends"},{"type":"article","name":"GeoFIS: An open-source decision support tool for precision agriculture data (Leroux et al. 2018)","id":"c443b688b80703848e965b29dc3cba01","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:GeoFIS:_An_open-source_decision_support_tool_for_precision_agriculture_data"}]},{"type":"chapter","name":"4. Information security and privacy","id":"305ec16624a81aa815b76c78f6cbba1a","children":[{"type":"article","name":"Big data in the era of health information exchanges: Challenges and opportunities for public health (Baseman et al. 
2017)","id":"e29139b9d43cc4915ffca40cbc15f91c","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_in_the_era_of_health_information_exchanges:_Challenges_and_opportunities_for_public_health"},{"type":"article","name":"How could the ethical management of health data in the medical field inform police use of DNA? (Krikorian and Vailly 2018)","id":"9872ac73fcb8d8cb5b8b6de9ced82c60","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:How_could_the_ethical_management_of_health_data_in_the_medical_field_inform_police_use_of_DNA%3F"},{"type":"article","name":"Password compliance for PACS work stations: Implications for emergency-driven medical environments (Mahlaola and van Dyk 2017)","id":"51ebf6ac1bdd905b9c8c8d23fe8b8a29","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Password_compliance_for_PACS_work_stations:_Implications_for_emergency-driven_medical_environments"}]}],"settings":{"show_cover":"1","show_title":"1","show_subtitle":"0","show_full_title":"1","show_editor":"1","show_editor_pic":"1","show_publisher":"1","show_language":"1","show_size":"1","show_toc":"1","show_content_beneath_cover":"1","cta_button":"1","content_location":"1","toc_links":"disabled","log_in_msg":"<span><\/span> Please log in to read online.","cover_size":"medium"},"title_image":"https:\/\/s3.limsforum.com\/www.limsforum.com\/wp-content\/uploads\/Fig2_Evans_Informatics2017_4-4.jpg"}}
    LIMSjournal - Fall 2018
    Volume 4, Issue 3
    Editor: Shawn Douglas
    Publisher: LabLynx Press
    Copyright LabLynx Inc. All rights reserved.