Identifying Files by Digital Signature
By Jeff Vannest, Senior Consultant

Eventually, in the lifecycle of every project, someone will look at a piece of custom software and think, “Is this the right version, or did I put the new version on my other computer?” That seems silly for a company that likely spent thousands of dollars analyzing, developing and validating that particular piece of software, but it happens in even the best environments. Let’s look at some ways a company can establish the identity and integrity of custom code using digital signatures.

What this article calls a file’s digital signature is, more precisely, the output of a cryptographic hash function, often called a digest or checksum. To break it down, a hash function (or simply “hash”) is an algorithm that converts input of any size into a short, fixed-length output. A cryptographic hash adds two properties: it is one-way, so the original input cannot practically be recovered from the output, and it is collision-resistant, so it is impractical to find two different inputs that produce the same output. Unlike encryption, a hash needs no password or electronic key, and anyone with the same file and the same algorithm will calculate the same value. (Strictly speaking, a true digital signature goes one step further and combines a hash with a private key so that the signer’s identity can be verified; in this article the term refers to the hash itself.) Together with encryption, hash functions are a building block of electronic security on the internet, cell phones and ATMs.
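To make the fixed-length and no-key properties concrete, here is a minimal Python sketch using the standard hashlib module. The choice of SHA-256 and the helper name digest_hex are assumptions made for illustration; the article does not name a specific tool.

```python
import hashlib


def digest_hex(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hexadecimal digest of data (hypothetical helper name)."""
    return hashlib.new(algorithm, data).hexdigest()


if __name__ == "__main__":
    small = b"a"
    large = b"a" * 1_000_000

    # The digest length depends on the algorithm, not on the input size.
    print(len(digest_hex(small)), len(digest_hex(large)))

    # No key or password is involved: the same input always gives
    # the same digest, on any machine.
    print(digest_hex(small) == digest_hex(small))
```

Because the calculation needs nothing but the file and the algorithm name, the customer can repeat it independently of the vendor.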

When applied to a file, the hash function reads every byte of the file, so every byte influences the output. For example, take two files, one containing the text “Return 1” and the other containing “Return 2”. Using the MD5 algorithm the first file returns “6233dc16a56d0b4b02d94aa9e9f3fab6” and the second file returns “4df60fece3bbf3590d0bf1943d041f8e”. As you can see, the digital signatures of these files are very different even though the files differ by only one character. (MD5 remains adequate for detecting accidental changes, but it is no longer considered secure against deliberate tampering; where that matters, a newer algorithm such as SHA-256 is the safer choice.)
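The one-character experiment is easy to try. The sketch below hashes the two strings with MD5 and counts how many positions of the two digests differ. The function names are hypothetical, and the printed values may not match the ones quoted above, since the exact bytes of the original files (a trailing newline, for instance) are not known.

```python
import hashlib


def md5_hex(text: str) -> str:
    """MD5 digest of the UTF-8 encoding of text, as 32 hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def differing_positions(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


if __name__ == "__main__":
    first = md5_hex("Return 1")
    second = md5_hex("Return 2")
    print(first)
    print(second)
    print(differing_positions(first, second), "of", len(first),
          "hex characters differ")
```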

At J&R Consulting, Inc., we use digital signatures to characterize files for review and delivery to customers. Because our customers are usually pharmaceutical companies, great emphasis is placed on the veracity and reliability of developed, tested and delivered software. (By emphasis, I mean a certain agency of the U.S. government called the FDA, of course!) Let me briefly discuss how we track a file from initial development to final customer delivery, and the part played by the digital signature.

When a file is first created, a J&R designer checks that file into a source code control tool. Currently we use two tools, CVS and Subversion, both of which have been discussed in further detail by Rob, my fellow employee and friend. Once checked into the control tool, the file has a durable, recoverable history. Because the tool is organized by customer project, we can maintain the repository on behalf of the customer until final delivery, and even deliver the entire project repository to the customer on CD-ROM (or DVD-ROM, depending on size). That repository contains not only the final version of all software, but every revision going back to the first draft. As the file progresses through the software lifecycle, from initial development through peer code review, unit testing, integration testing, system acceptance testing, installation qualification, operational qualification, performance qualification and business unit acceptance testing (dizzying, I know), it can be tracked by version number, which is automatically maintained by the control tool.

When the customer is ready to accept the final software distribution (typically just before the system acceptance testing phase), digital signatures are calculated for all files and delivered with the distribution. These digital signatures can be calculated per file, per folder and per distribution. Once delivered, these signatures can be used in several ways.
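As a rough sketch of what the per-file part of such a delivery could look like, the following Python code walks a distribution folder and records one digest per file. The function names, the use of SHA-256 and the one-line-per-file layout are assumptions for illustration, not a description of the tool J&R actually uses.

```python
import hashlib
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for path in root.rglob("*"):
        if path.is_file():
            # read_bytes() keeps the sketch short; very large files
            # would be better hashed in chunks.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def format_manifest(manifest: dict[str, str]) -> str:
    """Render the manifest as text: one 'digest  path' line per file."""
    return "\n".join(f"{digest}  {name}"
                     for name, digest in sorted(manifest.items()))


if __name__ == "__main__":
    # Hashes every file under the current directory as a demonstration.
    print(format_manifest(build_manifest(Path("."))))
```

The resulting text file can travel with the distribution and be regenerated at any later phase for comparison.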

A distribution digital signature, which is a single hash of the entire distribution, can be used to indicate whether the distribution is complete and error-free. For example, let’s say a software distribution is delivered on CD-ROM. Without a digital signature for the entire CD, how can the integrity of the CD itself be confirmed? What if the CD has physical defects that were either present when it was burned or introduced during transfer to the customer? You’re in luck! A distribution digital signature can verify the integrity of the entire CD.
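One way to picture the distribution check: assume the whole distribution is available as a single image or archive file (an assumption; the article does not say how the distribution hash is produced). The sketch below hashes that file in chunks, so even a full disc image never has to fit in memory, and compares the result with the delivered value. All names are hypothetical.

```python
import hashlib
from pathlib import Path


def stream_digest(path: Path, algorithm: str = "sha256",
                  chunk_size: int = 1 << 20) -> str:
    """Hash a large file piece by piece and return the hex digest."""
    hasher = hashlib.new(algorithm)
    with path.open("rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:  # an empty read means end of file
                break
            hasher.update(chunk)
    return hasher.hexdigest()


def distribution_is_intact(image: Path, delivered_digest: str) -> bool:
    """True if the image's digest equals the digest delivered with it."""
    return stream_digest(image) == delivered_digest.strip().lower()
```

A single flipped bit anywhere in the image changes the digest, which is what makes one short value enough to vouch for the whole disc.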

A folder digital signature, which is a single hash of each folder in the distribution, can be used to indicate whether a single folder is complete and error-free. For example, let’s say a folder on the CD named “Reports” is transferred to a Terminal Services server during the system acceptance phase. When it comes time to execute the operational qualification, what guarantee is there that none of the files has become corrupted during a routine nightly backup, or been written to a block of the disk drive that has gone bad? You’re in luck! A digital signature calculated on the server folder can be compared against the signature delivered to the customer, confirming that the entire Reports folder is complete, unaltered and free of defects.
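A folder has no single byte stream to hash, so any folder signature depends on a convention for combining its files. The sketch below shows one possible convention, chosen for illustration only: files are visited in sorted order of their relative paths, and each path and its content are fed into one SHA-256 hash with length prefixes. The same convention must be used on both sides for the values to be comparable.

```python
import hashlib
from pathlib import Path


def folder_digest(folder: Path) -> str:
    """Single SHA-256 digest covering every file name and content."""
    hasher = hashlib.sha256()
    # Sort by relative path text so the result does not depend on the
    # order in which the file system happens to list the files.
    entries = sorted(
        (p.relative_to(folder).as_posix(), p)
        for p in folder.rglob("*") if p.is_file()
    )
    for relative, path in entries:
        name = relative.encode("utf-8")
        content = path.read_bytes()
        # Length prefixes keep the boundary between name and content
        # unambiguous, so moving bytes between them changes the digest.
        hasher.update(len(name).to_bytes(8, "big"))
        hasher.update(name)
        hasher.update(len(content).to_bytes(8, "big"))
        hasher.update(content)
    return hasher.hexdigest()
```

Under this convention a missing file, an extra file, a renamed file or a single changed byte all produce a different folder signature.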

A file digital signature, which is a hash of a single file in the distribution, can be used to indicate that a file has not changed. For example, let’s say that a lengthy configuration file is altered during installation qualification. When it comes time to execute acceptance testing by the business unit, how can testers be certain that the system configuration settings have been returned to the delivered state? You’re in luck! A digital signature re-calculated on that file can indicate whether every character in the file has been returned to its delivered value.
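The configuration-file scenario can be acted out in a few lines. This is a minimal sketch under assumed names: file_matches and the sample file app.cfg are hypothetical, and SHA-256 stands in for whatever algorithm the delivered signatures use.

```python
import hashlib
import tempfile
from pathlib import Path


def file_matches(path: Path, delivered_digest: str) -> bool:
    """True if the file's current digest equals the delivered one."""
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return actual == delivered_digest.strip().lower()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config = Path(tmp) / "app.cfg"  # hypothetical configuration file
        delivered_text = b"timeout=30\nmode=production\n"
        config.write_bytes(delivered_text)
        delivered = hashlib.sha256(delivered_text).hexdigest()

        # Altered during installation qualification.
        config.write_bytes(b"timeout=5\nmode=test\n")
        print("after change: ", file_matches(config, delivered))

        # Returned to the delivered state before acceptance testing.
        config.write_bytes(delivered_text)
        print("after restore:", file_matches(config, delivered))
```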

But that’s not all. Long after the system is delivered, anyone with access to a hashing tool can calculate a digital signature for a file on their system and compare it against the signature of the most recent version in the source code control tool to confirm that the two are the same. This is essential when reviewing a system change control that seems to have gone wrong. Sure, you could try to rely on timestamp or file size, but neither is foolproof: a file can change without changing size, and timestamps are easily reset. A matching digital signature, by contrast, means the contents are identical for all practical purposes.
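To see why size and timestamp fall short, the sketch below creates two files with identical size and identical modification time but different content, then compares all three attributes. The function names are hypothetical and the fixed timestamp is arbitrary.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def describe(path: Path) -> tuple[int, int, str]:
    """Return (size in bytes, modification time, SHA-256 digest)."""
    info = path.stat()
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return info.st_size, int(info.st_mtime), digest


def demo() -> tuple[tuple[int, int, str], tuple[int, int, str]]:
    with tempfile.TemporaryDirectory() as tmp:
        first = Path(tmp) / "first.txt"
        second = Path(tmp) / "second.txt"
        first.write_bytes(b"Return 1")
        second.write_bytes(b"Return 2")
        # Force both files to the same access and modification time.
        same_time = (1_000_000_000, 1_000_000_000)
        os.utime(first, same_time)
        os.utime(second, same_time)
        return describe(first), describe(second)


if __name__ == "__main__":
    a, b = demo()
    print("same size:     ", a[0] == b[0])
    print("same timestamp:", a[1] == b[1])
    print("same digest:   ", a[2] == b[2])
```

Only the digest comparison notices that the two files are different.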

On a final note, you might be asking yourself, “Where can I get a tool to calculate digital signatures on my files?” Contact us. We can help you choose the right tool or build one specific to your needs. We can even provide the detailed design and test scripts necessary to pass the testing and validation rigors of a Part 11, FDA-controlled environment. However you choose to address the problem, don’t overlook the great degree of control digital signatures can provide your environment.

About the author: Jeff Vannest is a Senior Consultant at J&R Consulting, Inc., a consulting firm specializing in LIMS. He excels in technical design, development and implementation. Weekly articles on LIMS software and contact information can be found at http://www.jandrconsult.com