Electronic lab notebooks: Paving the way of the future in R&D

Lysakowski, Rich

Managing intellectual property records electronically can be a daunting task. For organizations that are regulated by the Food and Drug Administration (FDA), Environmental Protection Agency (EPA) and others, moving into the nebulous world of electronic records feels risky. However, electronic management of research and development (R&D) records is the way of the future.

Researchers at TeamScience, a science-oriented software research company, initiated a study that detailed the vision, the business requirements, and the software functionality requirements for companies to be effective using the new class of systems called R&D Team Computing Systems, which include the functions of electronic notebook systems, including reliable recordkeeping, sophisticated project data management and distributed team collaboration functions.

This article addresses the following issues: the unique computing requirements of R&D teams; current problems in the R&D software marketplace; why the web is not yet a serious place for R&D recordkeeping; why paper notebooks are inadequate for the collection and management of intellectual property records; and a summary of the R&D team computing study results. The article also addresses the cost/benefit analysis of collaborative electronic notebooks from the perspective of notebook administrators, records managers, scientists, micrographics professionals, proprietary information professionals, legal and patent departments and others.

Managing intellectual property records electronically can be a daunting task. For organizations that are regulated by the Food and Drug Administration (FDA), Environmental Protection Agency (EPA) and others, moving into the nebulous world of electronic records feels risky. However, electronic management of research and development (R&D) records is the way of the future. Most U.S. federal agencies have issued regulations or guidelines regarding the capture and usage of fully electronic records. For example, on March 20, 1997, the FDA issued a bona fide regulation that allows electronic records and signatures to be submitted in lieu of paper records and handwritten signatures. But representatives of the FDA, Patent and Trademark Office (PTO), and EPA have said they will not mandate particular technologies. In the wake of these announcements, pharmaceutical, chemical, biotech, and other regulated industries have been left clamoring for information on how to move forward confidently and benefit from the use of electronic laboratory notebooks and related recordkeeping systems.

Recently, researchers at TeamScience, a science-oriented software research company located in Woburn, Massachusetts, initiated a research study called the “R&D Team Computing Study.” The study was commissioned by nine multinational corporations in the pharmaceutical, chemical, biotechnology, and related industries. The study detailed the vision, the business requirements, and the software functional requirements for companies to be effective using the new class of systems called R&D Team Computing Systems, which include the functions of electronic notebook systems but go beyond them to include sophisticated project data management and distributed team collaboration functions.

This article addresses the following issues: the unique computing requirements of R&D teams; current problems in the R&D software marketplace; why the web is not yet a serious place for R&D recordkeeping; why paper laboratory notebooks are obsolete; benefits that scientists will gain from collaborative electronic notebooks; the cost/benefit analysis of collaborative electronic notebooks; and a summary of the R&D team computing study results.


A primary goal of the R&D Team Computing Study was to compare and evaluate commercial candidates for integrated electronic laboratory notebooks, records management, document management, groupware, and collaborative computing systems that support both individuals and teams of scientists in R&D and testing laboratories. In addition to researching the systems themselves, we also compared and evaluated their suppliers’ ability to address the business needs of the companies in the targeted industries. Only commercially available systems were evaluated because people wanted technologies that they could use to implement systems immediately. Although this study evaluated individual products, we wanted to use this opportunity to construct a framework for the evaluation of all R&D team-computing systems. A related goal was to construct a detailed blueprint for systems based on the requirements identified.

A primary study conclusion is that no existing vendors or products on the market meet the full range of needs required for electronic lab notebooks or R&D team-computing systems. A few generic systems are moving in the right direction, but still require an inordinate amount of work to sufficiently focus them on the needs of R&D and testing organizations. Commercial “out of the box” systems will likely require 3 to 5 more years to implement the majority of specialized functions required by R&D professionals working on global projects.


Teams ranging from 3 to 25 people working in large R&D and testing organizations of 5,000-10,000 people were the targets for the study. The teams worked in many different areas: basic research, applied research, product development, quality assurance/quality control laboratories, and manufacturing labs. Eighty-six scientists (end-users), information technology (IT) managers, and R&D managers completed extensive questionnaires. The topics included: 1) problems with information systems and laboratory automation systems, 2) priorities for new software functions and features for electronic lab notebooks, records management, and R&D team project data management and collaboration systems, 3) current and near future computing platforms and standards, 4) social aspects of collaborative computing systems, 5) current methods for using paper notebooks, 6) social aspects of team collaboration, and 7) training requirements and investments by their organizations.

R&D Teams Have Unique Computing Requirements

Extensive research was done on electronic recordkeeping and records management as part of the R&D Team Computing Study. The research results consistently show that R&D records possess complex and unique characteristics. First, R&D activities are always “project-oriented”; they comprise a closed set of activities within a fixed duration that result in specific project deliverables. Often, those deliverables are materials, equipment, or reports, although they may be simply conclusions or presentations of results. While the project orientation of R&D may not be particularly unusual (many businesses are run using projects), R&D project structures are extremely variable and ad-hoc. They can change significantly and quickly.

Second, R&D projects are frequently interrelated. Project data may be similar in one set of dimensions yet highly distinct in another. This implicitly links projects across several dimensions. In computer terminology such project data are “hyperlinked.”

Third, excellent data management systems are essential. Because projects are rapidly spawned from one another, it is easy for a project portfolio to become unmanageable if the data are not carefully tracked and recorded. For example, in a medium-sized chemical company, there may be hundreds to thousands of projects open at once. Sometimes thousands of people scattered across the globe must participate in the generation, organization, review, summary, and release of this project information. Companies must have sophisticated ways of simplifying the inherent complexity of their project records management tasks.

Fourth, R&D laboratory record types are among the most varied anywhere. Typical data types encountered include text, images, graphics, structures (molecular, protein, DNA), spectral data, protocols, audio, video, and others, each with dozens of variations of data formats which change with successive releases of software. This diversity is driven by the reality that scientific instrument vendors still operate largely like guilds of craftsmen, each creating unique hardware devices, each with some unique software interfaces. Hundreds to thousands of devices are left to the automation specialist to interface. Automation of lab data management is challenging because there is so little consistency or standardization.

Finally, R&D records are held as intellectual property for decades. Research records as found in paper notebooks are the primary source of substantive records to back up patent litigation. They are used extensively in patent interference proceedings. Patents are granted and companies maintain their competitive advantages in the marketplace based on the reliability, trustworthiness and accuracy of these records. Hence, these records form a key part of the basis of free market economics around the world.

The characteristics of R&D records mandate special design requirements for electronic recordkeeping systems:

Audit trailing and journaling requirements.

Cross-linking of project data.

Non-repudiation technologies (electronic signatures, witnessing, etc.).

Controls to ensure privacy and security.

Formats for import, export, and archiving records.

Systems must satisfy these requirements before the patent community and regulatory agencies (e.g., FDA, EPA and PTO) will feel comfortable conducting their transactions totally electronically, and before users will feel comfortable routinely using computing systems for R&D projects. The unique design requirements for electronic recordkeeping are best met with an integrated set of functions already listed, plus others too numerous to list here.
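To make the audit trailing and non-repudiation requirements more concrete, the following is a minimal sketch, not a description of any system evaluated in the study. It assumes a simple in-memory trail in which each entry stores a SHA-256 hash of its own content plus the hash of the entry before it, so that a later alteration or reordering becomes detectable. The function names (`append_entry`, `verify_trail`), the field names, and the use of hash chaining are all illustrative assumptions. A hash chain provides tamper evidence only; it is not a substitute for a true electronic signature or a trusted third-party notarization service.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry (assumption)


def _digest(body: Dict[str, str]) -> str:
    """Hash the entry body with keys in a stable order."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_entry(trail: List[Dict[str, str]], who: str, what: str,
                 when: str, record_id: str) -> Dict[str, str]:
    """Append an entry whose hash covers its content and the previous hash."""
    body = {
        "who": who,              # author or witness identity
        "what": what,            # action performed on the record
        "when": when,            # timestamp supplied by the caller
        "record_id": record_id,  # hypothetical notebook/page identifier
        "prev_hash": trail[-1]["hash"] if trail else GENESIS,
    }
    entry = dict(body, hash=_digest(body))
    trail.append(entry)
    return entry


def verify_trail(trail: List[Dict[str, str]]) -> bool:
    """Return True only if every entry matches its hash and links to its predecessor."""
    prev = GENESIS
    for entry in trail:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev or _digest(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

In this sketch, witnessing is modeled simply as a second entry made by a different person against the same record. Editing any earlier entry after the fact changes its hash and causes `verify_trail` to fail, which is the electronic counterpart of noticing an altered page in a bound notebook.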

Initially, industry seemed to be waiting for government agencies to mandate the acceptability of electronic records and technologies. Now agencies and corporations are realizing that it is their records management procedures, and not simply technologies, that determine the veracity of records, whether in electronic or paper form. The procedures for electronic records management differ little from those for paper records. Representatives of the FDA and PTO have said they will not mandate particular technologies. Once industry provides solutions in each of these areas, these agencies will perform their assigned duties to inspect and regulate implementations of them. This should come as no big surprise to records managers. However, the adage, “the devil is in the details,” is nowhere else truer than in trying to integrate legal, business and technical systems.

Current Problems in the R&D Software Marketplace

Our research led us to conclude that the R&D software marketplace is served by a collection of niche vendors. Collaborative teams in R&D require selected subsets of functions from niche product classes, including Laboratory Information Management Systems (LIMS), Data Acquisition and Instrument Control Systems, Groupware, Shared WhiteBoards and Databases, Workflow and Document Management, Chemical Information Databases, Electronic Notebook Systems, Electronic Records Management Systems and Electronic Archiving Systems.

Vendors often try to sell their products as “full solutions” when they are really just “point solutions.” This is even true of a seemingly broad class of software such as groupware and document management systems. The range of required functionality is too broad and the set of available technologies too narrowly and poorly integrated across these classes of automation functions. R&D teams need the ability to capture all work products as records, schedules, project design processes and the actual project flow. Document management systems are good at capturing final work products (documents, drawings, and similar data) and in-process work products that generate them. However, these systems were not designed, and thus do not serve well, for calendaring and scheduling applications. Capturing collaborative decision-making sessions continues to be difficult, often requiring a few hours to prepare transcripts to put them into the electronic capture and retrieval system. More importantly to R&D records managers, these systems were not designed to do recordkeeping or records management, and thus are missing key functions outlined in the previous section.

Finally, none of these classes of systems captures projects, data, and their interrelationships well, if at all. Some projects and data relationships are implicit and should be captured automatically by the infrastructure. Other relationships are explicitly established by scientists as they work and are best captured manually. Capturing data and project relationships makes it possible for scientists to drill back to earlier projects and data and make contextual inquiries, not just content searches. To capture such relationships effectively, tools must be easy to use, and preferably transparent to the user.
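As an illustration of what capturing relationships and "drilling back" could look like, here is a small sketch under stated assumptions. The `LinkStore` class, its method names, and the relation labels are hypothetical and do not come from any product in the study. Links are stored as directed references from a newer item to the earlier project or data set it depends on, and a breadth-first walk returns everything reachable from a starting point.

```python
from collections import defaultdict, deque
from typing import Dict, List, Set, Tuple


class LinkStore:
    """Keeps directed links from an item to the earlier items it builds on."""

    def __init__(self) -> None:
        self._back: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def link(self, source: str, target: str, relation: str) -> None:
        """Record that `source` derives from or refers to the earlier `target`."""
        self._back[source].append((target, relation))

    def drill_back(self, start: str) -> List[str]:
        """List every earlier item reachable from `start`, nearest first."""
        seen: Set[str] = {start}
        order: List[str] = []
        queue = deque([start])
        while queue:
            node = queue.popleft()
            # .get avoids creating empty entries for items with no links
            for target, _relation in self._back.get(node, []):
                if target not in seen:
                    seen.add(target)
                    order.append(target)
                    queue.append(target)
        return order
```

A link recorded automatically when one project is spawned from another and a link added by hand when a scientist cites an earlier spectrum would both go through the same `link` call; only the caller differs. That matches the distinction drawn above between implicit and explicit relationships.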

The World Wide Web is Not Yet a Serious Place for R&D Recordkeeping

Many people have asked, “Why not use the World Wide Web as the basis for collaborative electronic lab notebooks or R&D team computing systems?” The technology and Internet infrastructure still possess serious deficiencies in at least seven functional areas:

Audit trailing features that log who, what, when, where, and with what computer applications and versions work was done.

Authentication of users and time certification of documents.

Data and hyperlink integrity, security, and reliability.

Records export and archiving standards for digital information.

Highly interactive authoring tools that permit average computer users to author documents easily.

Interoperability with commercial object standards such as Object Management Group’s Common Object Request Broker Architecture (CORBA), Object Linking & Embedding (OLE), Distributed Component Object Model (DCOM), JavaBeans, and others.

Formal document management and workflow subsystems for handling complex compound documents.

These deficiencies must be corrected before the web can be widely deployed for capturing and managing R&D records.

Functions of Lab Notebooks

Laboratory notebooks are one of the most familiar and common applications of R&D team computing systems. Currently, most laboratory records exist in paper recordbooks. The recordbook is the place where scientists record primary lab data and experimental observations; it is also used to record references to external data and literature sources.

From a legal and corporate perspective, laboratory recordbooks are the primary source repository for R&D intellectual property. They are used to substantiate claims within patent applications and to defend patent licenses in the event of interferences. They are also a key part of records management programs in R&D organizations. Paper notebooks have the important advantages of convenience, ease of use, portability, and acceptance by legal and regulatory authorities. In order to replace paper notebooks, electronic notebooks must not only offer similar ease of use but significant additional advantages for records and information management and retrieval.

Why Paper Notebooks Are Obsolete

Paper recordbooks are obsolete for the following reasons:

1. Information put on paper is lost. Unless data stored on paper are carefully indexed into electronic databases, they are lost to everyone but the original author(s).

2. Information stored on paper records cannot be shared or distributed easily. Photocopying is the only practical way. For notebooks this is not a legally accepted practice because it has the potential to create multiple operational copies of records.

3. Data capture is archaic. Manual cutting and pasting of data and instrument printouts into notebooks using scissors and cellophane tape is still a very common practice in modern laboratories. This is incredibly time-consuming.

4. Paper notebooks have limited capacity. Techniques such as High-Throughput Screening (HTS), GC/MS, 2D or 3D NMR, and real-time video imaging are commonplace in modern science; used routinely, they can easily generate terabytes of data per year. Notebooks simply cannot handle the volume of raw data generated by such techniques.

5. Many activities involving notebooks are repetitive and/or tedious and can be done more easily electronically. Authenticating (signing), date-time stamping, and witnessing are done many times during a notebook’s life. Experimental protocols are often captured in notebooks and copied into instrument systems or databases many times. Keyword lists in databases must be created for each experiment to facilitate and ensure proper indexing and retrieval operations later. Building the Table of Contents is an important step in closing out a recordbook, but it is often ignored or poorly done because it is inordinately time-consuming.

6. Bound notebooks are still used in many organizations because they are thought to be the required format for managing laboratory records. In fact, the legal requirements for record traceability, reliability, and trustworthiness have historically dictated the common research notebook format that uses bound, numbered pages. Having bound, numbered pages makes it easier to prove if a record was entered out of sequence, altered, or torn out.

Scientists Spend Significant Amounts of Time on Data Transcription

The R&D Team Computing Study included a comprehensive Business Needs Assessment that produced 85 variously detailed responses. Sixty scientists responded to the question, “How many minutes per week do you spend transcribing data from all sources into paper notebooks?” The findings were that, on average, 14.3% of a highly trained scientist’s time is taken up by transcription. Improved automation tools could enable scientists to focus on their primary missions and goals, increasing their productivity by eliminating countless hours of recordkeeping tasks.

Scientists Frequently Refer to Old Notebooks

Scientists frequently refer to notebooks after they have been filled with data. A total of 59 scientists responded to the question, “How often do you need to locate data in an old notebook?” Scientists must locate data often enough to merit having notebooks online and searchable for the individual authors as well as their colleagues.

Benefits Expected by Teams of Scientists

When we asked scientists working within a team context to explain what benefits they would expect, certain items were consistently voiced. Scientists commented that they would expect better search functionality, a more efficient witnessing process, and the elimination of having to repeat satisfactory experiments (i.e., ability to find experiments already done to satisfaction).

Scientists working on teams also have a number of more individual expectations regarding automated systems. The more computer literate scientists are, the more often they ask for sophisticated group computing capabilities like integrated videoconferencing, shared windows, and project scheduling and resource management tools. In fact, for more advanced users, the electronic notebook software becomes much more than just a recordbook; it becomes a collaborative, intelligent working environment for project support.

Cost/Benefit Analysis of Collaborative Electronic Notebooks

A follow-on study of the costs and benefits of paper versus electronic notebooks was recently undertaken. Some preliminary results of time and money improvements are given later in this article. Scientists spend an average of 5% to 7% of their time on manual paper notebook processes, including formatting, cutting and pasting data, transcribing data, indexing, and otherwise assembling data into notebooks. This is viewed as a conservative average across many types of scientists and laboratories. For regulated laboratories, the numbers can be significantly higher (up to 40-50% of the total time). With integrated electronic data capture tools such as clipboard cut and paste, object drag-and-drop, or direct pipes for data into the notebook systems, this 5% to 25% of a scientist’s time can be significantly reduced to under 2%. When multiplied by the total number of people using notebooks, this adds up to a substantial amount of time redirected toward greater efficiency and better utilization for higher level work.
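The multiplication described above can be written out as a short calculation. This is a sketch for illustration only: the function name, the group size, and the figure of 2,000 working hours per person per year are assumptions, not numbers from the study. Only the shares of time (for example 5% before and 2% after) are taken from the text.

```python
def hours_redirected(scientists: int, share_before: float, share_after: float,
                     hours_per_year: float = 2000.0) -> float:
    """Hours per year freed across a group when the notebook share of time drops.

    `hours_per_year` is an assumed working year, not a figure from the study.
    """
    if not 0.0 <= share_after <= share_before <= 1.0:
        raise ValueError("expected 0 <= share_after <= share_before <= 1")
    return scientists * hours_per_year * (share_before - share_after)
```

For a hypothetical group of 100 scientists moving from 5% to 2% of their time, `hours_redirected(100, 0.05, 0.02)` comes to roughly 6,000 hours per year, or about three full-time positions under the assumed working year. Substituting the higher shares reported for regulated laboratories raises the result proportionally.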

Other benefits that accrue to individuals using electronic notebooks include the ability to use collaborative tools such as annotations and hyperlinks to annotate significant parts of experiments requiring additional work. Annotated parts of an experiment can be recognized and picked up immediately, thus saving communication time.

Notebook Administrators spend significant amounts of time administering, tracking possession, and managing the microfilming of active notebooks. One notebook administrator estimated that she spends up to three months per year (most of one day per week) trying to find active notebooks. This same administrator estimated that this three months per year could be cut down to one week per year if the notebooks were electronically traceable.

Proprietary Information Professionals spend several weeks per year auditing notebooks for the presence of proper author signatures, witness signatures and witnessing statements, and proper dates. If an electronic notebook system is configured properly, it can remind (or even force) authors to authenticate, corroborate, and date-time stamp notebooks within an appropriate time window. The practical time window for securing a patent in today’s fast moving world is one to six months for most companies, so it is common practice to require researchers to authenticate records less than a month from their date of origination.
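The reminder function described above might be sketched as follows. The `NotebookEntry` fields, the `overdue_entries` function, and the 30-day default window are assumptions chosen to mirror the "less than a month" practice mentioned in the text; an actual system would set the window by company policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class NotebookEntry:
    entry_id: str                        # hypothetical page or experiment identifier
    created_on: date                     # date of origination
    signed_on: Optional[date] = None     # author authentication, if done
    witnessed_on: Optional[date] = None  # witness corroboration, if done


def overdue_entries(entries: List[NotebookEntry], today: date,
                    window_days: int = 30) -> List[str]:
    """Return IDs of entries still unsigned or unwitnessed after the window has elapsed."""
    limit = timedelta(days=window_days)
    overdue = []
    for entry in entries:
        complete = entry.signed_on is not None and entry.witnessed_on is not None
        if not complete and today - entry.created_on > limit:
            overdue.append(entry.entry_id)
    return overdue
```

Run daily, a check like this would replace much of the manual audit for missing signatures: the list it returns could drive reminders to authors, or block further entries until the outstanding pages are authenticated.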

Micrographics professionals are those people who produce microfilm and microfiche archives from paper records. These archive media are more convenient, easier to use, and less expensive than paper. Courts and regulatory agencies legally recognize and accept these media as exact facsimiles of paper originals. However, microfilming notebooks is expensive when compared with newer technology alternatives. Optical imaging of records is being used increasingly.

Quality Assurance Departments in pharmaceutical and other companies have responsibility for ensuring that data in recordbooks are accurate and complete (purpose, methods and materials, experimental data and observations, calculations, conclusions, etc.). These departments also check that all required originator signatures, witness signatures, and date-time stamps are present and done properly for regulatory reasons. While the recordbooks are being quality assured, they are not available for use by scientists. Due to the frequent backlog in many QA Departments, it can take as long as a month for recordbooks to be returned. During this time, scientists are not supposed to work from copies of the recordbooks, because the copies may not be the latest version. Working from copies also implies two versions of the recordbook, an official and an unofficial version.

The problem of data in recordbooks not being readily accessible during a QA audit can be solved by shared electronic notebooks. Researchers can share their electronic recordbooks with QA for audit purposes while continuing to fill in later parts. Auditors can open a read-only copy of the recordbook, make annotations on an annotation layer, and forward the feedback and final audit results to the researcher for incorporation. Auditors and authors in different locations can also open a recordbook simultaneously using a shared window program to resolve issues that may exist in a particular recordbook.

Records Management and Archives Management Departments spend the bulk of their time managing paper records today. Over 70% of all records are stored and managed on paper, even though lab data originate in electronic format more than 90% of the time. Electronic records are easier to manage, take substantially less space, are faster to retrieve, and cost less to manage and maintain than paper and microfilm records. Records managers and legal departments are making the paradigm shift only slowly, however, and electronic notebooks are providing a focal point for many businesses to make the shift to electronic storage of intellectual property.

Legal and Patent Departments are relying increasingly on electronic records for litigation support. Electronic records present faster and cheaper support for litigation in all types of legal proceedings, including discovery, regulatory, and torts. Because electronic records are relied upon in the courtroom and are being pursued aggressively by attorneys in many industries, electronic lab recordbooks that are well managed can expedite defense claims because information can be archived, backed up, and recalled quickly and easily.

The Quantitative Return on Investments for R&D Team Computing

The companies that have successfully deployed R&D team computing and collaborative electronic notebook systems have demonstrated significant benefits. Additional support for such systems is increasing as a number of other companies are doing collaborative electronic notebook pilot projects.

Time Savings from R&D Team Computing Systems

ResearchStation showed significant time savings. In a polymer formulations laboratory at Air Products and Chemicals, an average of approximately 1.5 days per week per person was saved in a group of 7-10 people over a period of 18 months. In this pilot project, the time savings were largely due to reduced manual cut-and-paste processes. Other time savings came from having data “all in one place,” multiuser project support, and support for collaborative processes.

Lotus Notes was used in a large multinational drug company to help regulatory affairs groups cut the total time required for a computer-aided new drug application (CANDA) assembly process. The assembly process was cut from six to eight months to thirty days. At over one million dollars cost per day of drug development, this project was estimated to save over 250 million dollars over a two-year period (the system has been used for at least 15 CANDAs). Of all the documented projects, this one demonstrated the shortest overall payback period: only 10-16 weeks for all costs, including technology, training and labor.

Although it was claimed that after re-engineering, Digital’s LinkWorks reduced R&D project cycle time by about 30% at VW Gedas in Germany, no evidence was provided to back up the claim. Since the customer was not contacted directly, the numbers may be suspect. It is not clear how much of this cycle time reduction is due to process re-engineering, and how much can be attributed to direct benefits from the software technology itself. In this case, no actual time savings can be calculated. However, such a project cycle time reduction would be remarkable since it applied to large projects.

Issues with Existing Software Systems

Based on the systems evaluated thus far, a number of general conclusions can be drawn. System integration is still very expensive. The systems all require a large amount of programming to integrate existing applications beyond any but the most superficial level. They also require commitments of resources for installation, application development, rollout, and training. Human resource costs are usually several times that of the software.

None of the systems on the market today supports the legal or regulatory requirements for recordkeeping and records management well. They do not include trusted third-party electronic signatures, witnessing, or record notarization functions. Electronic signatures are created in an overly simplistic manner using a user’s log-on password. The systems all have poor to nonexistent support for export and import of objects and electronic records. They do not preserve, export, or import all of the information necessary to describe compound documents and folders in an industry standard format. In general, these systems were not designed with a focus on the records management lifecycle; they neither support rigorous version control nor possess a strong audit trailing subsystem required for regulatory or patent interference auditing. The systems were designed to track electronic documents only, and thus they do not come with features to track non-electronic records. However, with some systems, a knowledgeable system administrator or developer could add such features rather easily. None of these systems comes with subsystems for permanent archiving and retrieval, as is needed for managing R&D records. Of the systems included in the first batch of the product/vendor profiles in the R&D Team Computing Study, none has strong concurrent access control or configuration management features (with the notable exception of Documentum’s Electronic Document Management System). Documentum’s system is the only formal document management system of those studied which was designed for managing complex product data such as CAD/CAM drawing sets and Computer-Aided New Drug Applications (CANDAs).

Proprietary, All-in-One Approaches

The evaluated systems all take the approach of proprietary, all-in-one “solutions” coming from a single vendor rather than a modular, component-based approach. Each system is designed to be the center of the data universe. Out of this group of five vendors, two vendors, Lotus Development and Digital Equipment, appear to be driving toward a component architecture the fastest, whereby software components can be developed on top of or underneath their systems easily. Adding software components on top is important for developing “add-ons” or desktop client applications of the systems. Adding software components underneath is important for integrating the system with existing databases and database applications, document management systems, workflow, or groupware systems.

Summary of Results

The available systems studied thus far are readily usable, and effective applications can be developed with them. Companies are showing significant business and technical benefits from using them, despite current limitations, which frequently can be worked around. Personnel requirements to pilot and roll out solutions based on the available systems are large because these new client/server technologies take considerable time to learn, as do their applications to develop and deploy. Once the applications are in place, the amount of time the organization needs in order to learn to work collaboratively is also significant. In every case studied, the social lessons have been as valuable as the technological ones.

Additionally, vendors have yet to address sufficiently several key points. Recordkeeping and records management requirements are still poorly understood by groupware and document management system developers. True distributed object management systems based on open standards are not yet available. Support is poor for open (non-proprietary) distributed object standards (CORBA, JavaBeans, etc.), and system integration tools are still weak and require extensive custom programming to use. Systems integration is progressing slowly due to the poor support of open standards by scientific software vendors.

Greater focus must be put on the integration of existing systems and support of component architectures in order to fix these problems. This includes the support and integration of:

application integration frameworks,

compound object standards, such as Standard Generalized Markup Language (SGML),

portable client/server components and objects (CORBA and Java),

middleware and application integration tools, and

modularized and “componentized” scientific applications and databases.

There are still no true electronic notebook systems available that meet the legal, regulatory, technical, and social requirements. The systems examined thus far may serve as starting points; however, they all require significant investment to add key missing features, which may take many months of development.


The R&D Team Computing Study clarified many aspects of information automation systems needed for R&D team project data management and collaboration and the requirements for electronic recordkeeping systems. A few conclusions can be drawn from the study:

The legal and regulatory requirements are clear and available in documented form.

Base platforms are functional but still lack key pieces for electronic recordkeeping, records management, systems integration, distributed object management, compound documents, and portable components.

Systems must be enhanced significantly to support scientific applications easily and inexpensively.

Large learning curves exist with all products and vendors, and buyers must plan to invest heavily in training end users, managers, and IT resources.

Several roads can lead to success now, not just one or two. The remaining issues are which system(s) are the best fit for a particular organization’s stage of infrastructure and cultural readiness and maturity, its budgets, and installed systems that will require integration with the R&D team computing systems.

AUTHOR: Rich Lysakowski, Ph.D., is the Executive Director for the Collaborative Electronic Notebook Systems Association, which hosts the Collaborative Electronic Notebook Systems Consortium, an influential group of pharmaceutical, chemical, biotech and software companies that are collaboratively specifying and funding the development of collaborative electronic notebooks and related systems for scientific teams. Dr. Lysakowski has over 17 years’ experience in lab and R&D automation. He has worked extensively in the area of standards development, in the following roles: ADISS Project Leader, ASTM E49.52 Committee Chairman, and Lab Automation Standards Foundation Founder. He has a Ph.D. in Physical and Analytical Chemistry.

AUTHOR: Leslie Doyle is the Marketing & Communications Manager for the Collaborative Electronic Notebook Systems Association. She has three years of experience in records and information management and was formerly the Newsletter Editor for Documents, a quarterly publication providing educational RIM resources to clients of First American Records Management, a San Jose based records and information management firm. In 1996, she won the “Rookie of the Year Award” from the Silicon Valley Chapter of ARMA International. Ms. Doyle has a B.A. in English.

Copyright Association of Records Managers and Administrators Inc. Apr 1998
