Importing Spreadsheets and CSV Files into LabVantage LIMS

Importing Spreadsheets and CSV Files into LabVantage LIMSTwo of the most popular file formats used in science settings are Excel spreadsheets and CSV (Comma Separated Value) files. So, it comes as no surprise to hear that they are heavily relied on in the laboratory informatics market as well. Teams using a LIMS frequently collaborate on analyzing test result data, which often involves columns and tables. Formulas to analyze the data can be added via most spreadsheet editing programs, and these calculations may be crucial to seeing patterns in the data.

Laboratory and clinical results, whatever their source, are often converted to spreadsheet format. Excel spreadsheets or tabular CSV files provide a way for a scientist or technician to interact with and share data in a well-established format.

If you’ve decided to migrate your data into your LabVantage LIMS solution, data import systems can be created for Excel and CSV file types, even when they are complicated in structure. However, it’s important to have trained LabVantage administrators create the data import systems to be sure they are well written. When done properly, they can reduce time and errors exponentially. Properly structured data opens up powerful viewing, querying, and reporting possibilities, which can greatly benefit users.

In this blog, we’ll discuss some of the options for importing Excel and CSV files into LabVantage, with special emphasis on Data File Definitions, one of the most powerful and flexible tools included with the application.

Spreadsheet vs. Database—and Why You Should Care

Many scientists have been taught that a spreadsheet is a database, while many data experts would disagree, they’re both right at some level. The main difference, in practice, between a database and a spreadsheet is that a database tends to follow rules about data organization that are not required by a spreadsheet.

The keys to success when creating an Excel or CSV file import system are data consistency, organization, and analysis. In a properly designed database, a given piece of data should only appear in one place. For instance, the price of a widget should only exist in one table, no matter how many views it will be displayed in.

This is important because when importing data into LabVantage, organization matters a lot. If input data has duplications, then a decision has to be made about removing it, or establishing which location is definitive.

Data integrity and organization considerations include the following:

  • Coordination between the creators of the input data and LabVantage administrators should happen early.
  • The most important thing is that the data to import has the same format every time. Only the data in spreadsheets should change, never the layout.
  • Data needs to be analyzed carefully to understand the best way to map into LabVantage.
  • Data appears once in the LabVantage database, no matter how many Views are created.
  • If done properly, no one will ever have to ask “which data source is correct?”
White Paper: "An Introduction to LabVantage LIMS Master Data"

LabVantage Tools for Data Migration

If you make the decision to move or import your data from spreadsheets and delimited files into LabVantage LIMS, there are two primary tools you can use for data migration: Data File Definitions (DFDs) and using the Application Programming Interface (API). As using the API is an advanced option requiring programming of multiple systems, this blog will focus on DFDs. DFDs can be understood and used without the programming skills required for using the API. DFDs are exportable and can be copied from one system to another.

Data File Definitions

Highlights of working with Data File Definitions in LabVantage include the following:

  • LabVantage provides a Graphical User Interface, which allows administrators to preview a spreadsheet and select where the data will be (see Figure 1).

Figure 1. Upon importing a test version of a spreadsheet, the LabVantage user interface allows administrators to map the “shape” of the data correctly

  • Validation rules can be defined for each field in the imported data. In the example shown in Figure 1, the Type must be one of the allowable values defined in a Reference Type (akin to a drop-down list). Other types of validation might ensure that a measurement is in a sensible range, or a needed value is present. This can often prevent mangled data from appearing in the system; validation rules ensure that bad data doesn’t make it past the import stage.
  • LabVantage also provides a visual editor (see Figure 2) which allows administrators a way to define the steps required during an import. Various tasks such as emailing or sending a bulletin to a user can be added. Data imported with Data File Definitions can be audited for record-keeping and troubleshooting. LabVantage out-of-the-box (OOTB) provides dozens of predefined steps covering many of the possible actions needed for a data import system. For more complicated business logic, custom code can be created and made available as an import step.

Figure 2. The visual editor allows administrators to design the logic of fine-grained tasks needed during data import. In this example, business logic would determine whether to import a row of data, or send a bulletin that a condition was not met.

  • When there is a requirement to connect LabVantage to another system, the question of what that actually means should be asked. Perhaps connect means to establish a pipeline that will deliver CSV files from another system to a LabVantage system. DFD imports can be scheduled or operated by other triggers. If scheduled import of data was all that was needed, then costly and complicated API programming could be avoided. Administrators without a programming background could operate and maintain a system which did not require any or much custom code.
  • The creation of an import system usually comes with the side benefit of understanding the structure of the data better. The Data File Definition must work for the data to be imported in the first place, and that should mean that there will be a normalized architecture for the loaded data. For example, when price appears on two different sheets of a spreadsheet, the DFD creation process accounts for this. When the data is stored inside LabVantage, there will only be one copy of a given piece of information in the database.
For a deeper dive into considerations beyond data migration when upgrading LabVantage LIMS, request our webinar
  • This normalization of data means that views and queries should be much easier to create. If the user needs to see a widget price in two different reports, reports and other views can find and join data to give users views that make sense to them. Of course, if users are accustomed to looking at their data in the same format as the import files, that can be replicated within LabVantage.
  • Scheduled or on-demand data import is available through any of the well-known LabVantage functionalities such as Event Plans, Workflows, or Scheduled Tasks. Data can be automatically processed and placed in the correct stage of its lifecycle, ready for users to begin working.
  • For security and auditing purposes, a record of data changes can be made via import logs, and any data item in LabVantage can be audited.

Importing complicated data will always be challenging. It can be very difficult to communicate to users that things that make intuitive sense to them are often hard for computers to understand. Incoming data must be consistent, but LabVantage DFDs and other techniques can be made sophisticated enough to allow for variations. If a new version of an import data comes along, it should be relatively easy for trained administrators to copy existing well-made DFDs to account for new types of import data. For initial set up of importing, it is wise to consult experts who understand both LabVantage and data science. A small allocation of time setting up your LabVantage import systems could mean exponentially faster processes and a much lower error and issue rate.

➞ Additional Reading: Migration of Biobank Data from Multiple Systems to LabVantage 8 LIMS

What was the most complicated issue you have encountered during data import? How did you resolve it?