The GnosisLIMS Project: Creating an Open Source Next-Generation LIMS


 

By: The GnosisLIMS Project Group

Website: http://www.gnosislims.org

 

The GnosisLIMS project was started by a small cadre of experienced LIMS designers and implementers after years of frustrating temporary successes and near misses.  Their vision for GnosisLIMS was an Open Source LIMS framework built on web-enabled software tools that are relatively easy to program and administer, and that give a LIMS the flexibility to evolve gracefully and to interact with other LIMS implementations.  Under the Open Source philosophy, improvements to the framework would be shared throughout the community.

 

Historically, LIMS designs and implementations have been complicated because each laboratory has its own unique work-flow and data management requirements.  Modeling the data, especially as it pertains to quality control, quality assurance and privacy, increases the design complexity.  Finally, LIMS user and machine interfaces require tremendous amounts of attention.

 

When the GnosisLIMS project was launched in August of 2000, its goals were ambitious.  The project aimed at creating a cross-platform, extensible and customizable LIMS system that could be modified to work with most if not all laboratories regardless of industry.  In addition, laboratories would be able to link themselves together into secure laboratory networks which would allow labs to share samples, data and results. 

 

Originally, the project was based on the J2EE platform.  Specific database engines would be usable via database plug-ins, and the client and server would communicate via XML.  The first database chosen was a Java-based object-oriented database system called Ozone.

 

After approximately a year and a half of development, it became clear that either our requirements for GnosisLIMS would have to change, or the project would need to find alternatives to the J2EE platform.

 

Specifically, as prototype data entry forms were created, it became clear that the idea of networking laboratories (or departments within a single laboratory) posed specific problems.  For example, if a laboratory within a network had a customization that was incompatible with other implementations, how could a system handle this incompatibility?  In other words, if Laboratory A is sharing data with Laboratory B, and there is a field that A has customized to be a string but B has kept as the default integer, how would the system resolve that data collision?  Worse, if B had an additional required field, how would a user at Laboratory A be able to account for that required field?

 

During this time, one member of the GnosisLIMS Project Group was working on a project called “Cathbad”[1].  Cathbad is an XML language that allows for the description of “Wizards.”  The idea is that an application could easily extend its user interface by allowing users to create simple wizards to complete complex tasks.  We realized that with some minor modifications, Cathbad could be extended into a generalized user interface description language, and that Cathbad files could be accessed through a web service.  Our development of this concept showed that it would resolve the “required field” problem.  Specifically, a user at Laboratory A can simply download the user interface from Laboratory B and use that interface to enter data in Laboratory B directly.

 

Cathbad is based on the Python programming language, and several of its features would be difficult to implement in a statically typed and compiled language such as Java.  Therefore, we decided that the client libraries would be written in Python, and the GnosisLIMS Server would be written in Java.  All that remained was resolving the data collision problem.

 

Around this time, the Open Source Applications Foundation (OSAF) started work on calendar-sharing functionality.  OSAF adopted a protocol called Web Distributed Authoring and Versioning, or WebDAV.  This protocol, currently used by many webmasters, provides capabilities similar to those of a database, and it has extensions for searching, security, and version control.  WebDAV's capabilities are particularly useful to scientific organizations in general and LIMS laboratories in particular[2].

 

WebDAV provides the ability to associate properties with objects at any time, and it deals with the collision problem by assigning each property an XML namespace.  For example, if Laboratory A has customized its LIMS, the customized property will exist in Laboratory A’s namespace, and Laboratory B can continue to use the standard version of the same property, which will exist in the standard namespace.
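The namespace mechanism can be illustrated with a minimal sketch in Python.  It builds the XML body of a WebDAV PROPPATCH request in which the same property name appears once in a standard namespace and once in Laboratory A's namespace, then reads both values back.  The namespace URIs, the property name sample_id and the helper functions are assumptions invented for this illustration; they are not part of GnosisLIMS.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"  # namespace of the WebDAV protocol elements
# Hypothetical namespace URIs, invented for this illustration.
STANDARD_NS = "http://example.org/lims/standard"
LAB_A_NS = "http://example.org/lims/lab-a"


def build_proppatch(properties):
    """Build a PROPPATCH body from a {(namespace, name): value} mapping."""
    root = ET.Element(f"{{{DAV_NS}}}propertyupdate")
    set_el = ET.SubElement(root, f"{{{DAV_NS}}}set")
    prop_el = ET.SubElement(set_el, f"{{{DAV_NS}}}prop")
    for (ns, name), value in properties.items():
        child = ET.SubElement(prop_el, f"{{{ns}}}{name}")
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def read_properties(body):
    """Parse the body back into a {(namespace, name): text} mapping."""
    result = {}
    for prop in ET.fromstring(body).iter(f"{{{DAV_NS}}}prop"):
        for child in prop:
            # ElementTree tags look like "{namespace}localname".
            ns, _, name = child.tag[1:].partition("}")
            result[(ns, name)] = child.text
    return result


# Same local name, two namespaces: the integer default and
# Laboratory A's string customization coexist without colliding.
body = build_proppatch({
    (STANDARD_NS, "sample_id"): 1042,
    (LAB_A_NS, "sample_id"): "A-1042-xyz",
})
props = read_properties(body)
```

Because each property is keyed by its namespace as well as its name, Laboratory B can keep reading the standard sample_id while Laboratory A reads its own variant from the same resource.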

 

WebDAV not only solves the data-collision problem, but also provides a sound architecture for distributing loads across multiple databases, co-location capabilities and other aspects of enterprise-level architecture that can be major issues for large installations.  Because WebDAV is an extension of the HTTP protocol, portable and mobile devices can easily read and display data from Gnosis repositories.  WebDAV’s version control extensions provide the audit trail and “undo” capabilities that many laboratories desire.

 

Currently available implementations of WebDAV lack some features that are required by larger labs.  Specifically, current implementations of WebDAV fail the “ACID” test for relational databases (Atomicity-Consistency-Isolation-Durability) and do not support SQL expressions.  Pending the availability of these features in WebDAV, development of the basic logic and user interfaces for GnosisLIMS can proceed using available implementations such as Apache’s mod_dav module.

 

To address the remaining requirements in the protocol, the GnosisLIMS Project will be creating its own implementation of a WebDAV server.  This “Logos Repository Server” will add support for ACID, and read-only SQL access for reporting tools.  It will provide a bridge between the WebDAV clients and a relational database, much like ODBC or JDBC bridges a database client and server.

 

Having selected the main technologies, we then addressed the data-access layer on the client.  Since WebDAV is essentially a light-weight object-oriented database, we decided to use this approach on the client.  The Gnosis Framework provides an easy-to-use and powerful access library for WebDAV repositories.  When a resource is retrieved, its WebDAV properties become Python properties; if the object contains a list of other objects, this is converted into a simple Python list.  The framework also adds typing support to WebDAV, allowing the LIMS developer to develop classes and apply them to resources in the WebDAV repository.  To ensure that data is transmitted quickly, extensive use is made of compression and lazy-loading techniques, which prevent a client from having to receive more information than it needs.
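The idea of exposing WebDAV properties as Python attributes with lazy loading can be sketched as follows.  This is only an illustration under stated assumptions: the class LazyResource, the fake_fetch function and the example URL are hypothetical and do not reflect the actual Gnosis Framework API.  A real client would issue a WebDAV PROPFIND request where fake_fetch returns canned data.

```python
class LazyResource:
    """Hypothetical wrapper: properties are fetched on first access only."""

    def __init__(self, url, fetch_properties):
        self._url = url
        self._fetch = fetch_properties  # callable: url -> dict of properties
        self._props = None              # nothing is loaded yet

    def __getattr__(self, name):
        # __getattr__ runs only when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        if self._props is None:
            # First property access triggers the single fetch.
            self._props = self._fetch(self._url)
        try:
            return self._props[name]
        except KeyError:
            raise AttributeError(name) from None


calls = []  # records each simulated round trip to the repository


def fake_fetch(url):
    """Stand-in for a PROPFIND request; returns canned, invented data."""
    calls.append(url)
    return {"sample_id": "1042", "analytes": ["Pb", "Cd"]}


sample = LazyResource("http://lab-b.example.org/samples/1042", fake_fetch)
calls_before_access = len(calls)   # no request has been made yet
first_analyte = sample.analytes[0]  # triggers the one and only fetch
sample_id = sample.sample_id        # served from the cached properties
```

Creating the object costs no network traffic, and a contained list of objects arrives as an ordinary Python list once the properties are loaded.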

 

We are planning to submit our usage of WebDAV and Cathbad to the IETF[3] as an RFC (Request For Comments), the basic building block of Internet standards. This will enable closed and open source LIMS implementations to take advantage of the technology developed by the GnosisLIMS project.

 

The Logos Repository Server is currently being designed, and programming is expected to start during the summer months.  Our current road map calls for the release of a functional LIMS system within the next year.  However, this road map was developed on the assumption that the three GnosisLIMS Project Group developers would be able to work on the project full time.  So far this has not been possible, but we are seeking sponsorships for all three.

 

This change in the basic approach to LIMS will enable the creation of a LIMS system whose work-flow, user interface, data model and other aspects can easily be modified.  Additionally, the specific technologies selected will allow most laboratories or vendors to customize their LIMS by simply modifying a few XML files and delivering them to a WebDAV repository.  Scalability and stability issues are reduced to following well-known techniques for stable and scalable web server farms.

 

 

[1] The original cathbad presentation “Wizards! And Druids! And Users, Oh My!” can be found at http://www.myddrin.com/writing/formal/Cathbad_present.html

 

[2] For example, see http://aspen.ucs.indiana.edu/gce/C535EccePNL/c535eccepnl.pdf

 

[3] The IETF is the organization that oversees the core protocols of the Internet.  Internet-based tools such as your web browser, FTP client and email client follow the standards put forth by the IETF.  Its website is http://www.ietf.org/ .