Food for Thought ... Alternative Approaches for Medical Countermeasures to Biological and Chemical Terrorism and Warfare



Summary
The desire to develop and evaluate drugs as potential countermeasures for biological and chemical threats requires test systems that can also substitute for the clinical trials normally crucial for drug development. Animal models have limited predictivity for drug efficacy, as is well known from many disappointments in clinical trials. Traditional in vitro and in silico approaches are not really game changers here, but the substantial investment into novel tools now underway might bring about a second generation of alternative approaches. The avenue pursued focuses primarily on the development of a Human on a Chip, i.e., the combination of different three-dimensional (stem) cell-based organ equivalents with microfluidics. The prospects of such approaches, their impact on the field of alternative approaches, and necessary complementary activities are discussed. The need to adapt quality assurance measures and experiences from validation is stressed.
Keywords: human on a chip, biological warfare, chemical warfare, countermeasures, alternative methods

Introduction
Countermeasures evaluation is the new kid on the block of alternative approaches. And it is a big kid that appeared suddenly: Over the last six months, three funding opportunities in the US totaled more than $200 million, all with the goal of advancing the Human on a Chip concept. An alliance of US agencies is tackling the problem of evaluating drugs for which there are no patients and, hopefully, never will be. Three different calls from NIH, FDA, and the Department of Defense (DoD) agencies DTRA (Defense Threat Reduction Agency) and DARPA (Defense Advanced Research Projects Agency) all aim to produce a number of three-dimensional human organ equivalents based on stem cell technology and to combine them with microfluidics on a chip. The design criteria for the two consortia sponsored by DARPA/FDA 1, with about $35 million each, are very demanding: 10 organs to be maintained for 4 weeks. The projects supported by NIH/FDA 2 will develop individual 3D organ equivalents to opt into the platforms established in the former call. It is the first time that DARPA and NIH have teamed up, with each agency committing $70 million. The NIH contribution will come from its director's discretionary "Common Fund" but will be administered through the new National Center for Advancing Translational Sciences (NCATS) 3. The FDA will advise the agencies on how to meet its requirements for safety and effectiveness as part of its medical countermeasure initiative 4. DTRA independently seeks 5 "a platform comprised of in vitro human organ constructs in communication with each other [(i.e., liver, lung, heart, kidney, vasculature, and Blood Brain Barrier (BBB) with neuronal component] that will accurately assess efficacy, toxicity, and pharmacokinetics of drugs in a way that is relevant to humans." The activities are prompted by the perceived need to have medical countermeasures (MCM) at hand in case something happens. (Note, however, that the EU does not follow the same path.)
The obvious problem is the lack of patients for clinical development, which makes a traditional product registration with FDA impossible. The original response was the Animal Rule, i.e., the suggestion to use appropriate animal models instead. In May 2002, FDA issued the final rule New Drug and Biological Drug Products; Evidence Needed to Demonstrate Effectiveness of New Drugs When Human Efficacy Studies Are Not Ethical or Feasible 6 (Kwik et al., 2007). FDA's own summary of the Animal Rule is reproduced in Box 1. In a nutshell, FDA allows substituting animal studies for evidence of efficacy (not safety!) in humans if a "reasonably well understood pathophysiological mechanism for the toxicity … and its amelioration or prevention by the product" is given, "effect is demonstrated in more than one animal species" or "a single animal species … predicting the response in humans," "study endpoint is … generally the enhancement of survival or prevention of major morbidity," and "pharmacokinetics and pharmacodynamics … in animals and humans is sufficiently well understood." It appears that the obvious lack of fitness for purpose of animal models, paired with the need to regulate these new products, opens doors for new approaches. This is very reminiscent of the introduction of the Limulus assay (LAL) in 1986 by FDA, when the duration of the rabbit assay simply did not allow testing of short-lived radiopharmaceuticals. Once introduced, the LAL became broadly applied, making it probably the most successful alternative method to date. Similarly, FDA's new interest in predictive in vitro tools for MCM might be a door-opener for the evaluation of drugs in general.

Consideration 1: There is no such thing as a sufficiently predictive animal model for countermeasures
The US Department of Defense sponsored a National Academy of Sciences report, Animal Models for Assessing Countermeasures to Bioterrorism Agents, published in December 2011 8. One author (TH) had the privilege of being part of the committee.
The key findings of the report are reproduced in Box 2. In a nutshell, neither animal nor alternative methods are available for this purpose, but the committee discouraged the development of further animal models while proposing the exploitation of new alternative approaches.

Box 1: FDA Animal Rule Summary 7
In assessing the sufficiency of animal data, the agency may take into account other data, including human data, available to the agency. Under this rule, FDA can rely on the evidence from animal studies to provide substantial evidence of the effectiveness of these products when:
1. There is a reasonably well understood pathophysiological mechanism for the toxicity of the chemical, biological, radiological, or nuclear substance and its amelioration or prevention by the product;
2. The effect is demonstrated in more than one animal species expected to react with a response predictive for humans, unless the effect is demonstrated in a single animal species that represents a sufficiently well characterized animal model (meaning the model has been adequately evaluated for its responsiveness) for predicting the response in humans;
3. The animal study endpoint is clearly related to the desired benefit in humans, which is generally the enhancement of survival or prevention of major morbidity; and
4. The data or information on the pharmacokinetics and pharmacodynamics of the product or other relevant data or information in animals and humans is sufficiently well understood to allow selection of an effective dose in humans, and it is therefore reasonable to expect the effectiveness of the product in animals to be a reliable indicator of its effectiveness in humans.
All studies subject to this rule must be conducted in accordance with preexisting requirements under the good laboratory practices (21 CFR part 58) regulations and the Animal Welfare Act (7 U.S.C. 2131 et seq.). Safety evaluation of products is not addressed in this rule. Products evaluated for effectiveness under subpart I of part 314 and subpart H of part 601 will be evaluated for safety under preexisting requirements for establishing the safety of new drug and biological products.
The agency believes that the safety of most of these products can be studied in human volunteers similar to the people who would be exposed to the product. FDA recognizes that some safety data, such as data on possible adverse interactions between the toxic substance itself and the new product, may not be available. This is not expected to keep the agency from making an adequate safety evaluation. FDA's procedures and standards for evaluating the safety of new drug and biological products are sufficiently flexible to provide for the safety evaluation of products evaluated for efficacy under 21 CFR subpart I of part 314 and subpart H of part 601. This rule will not apply if product approval can be based on standards described elsewhere in our regulations (e.g., accelerated approval based on human surrogate markers or clinical endpoints other than survival or irreversible morbidity).
This judgment is remarkable, coming unanimously from such an esteemed institution as the National Academies. It extends the earlier report, Toxicity Testing for the 21st Century: A Vision and a Strategy 9, to drug development, in the sense that, after taking stock of the state of the art of animal-based evaluations, a call is made for novel approaches based on today's biotechnology. Though the committee had no mandate to discuss the Animal Rule, it effectively called for the development of a new approach. Similarly, we might apply this thinking to challenge current animal-based drug development more generally. This is quite in line with the ongoing devaluation of preclinical drug evaluation, giving rise to explorative human trials (Robinson, 2007; Coleman, 2011), orphan drug development (Bashaw and Fang, 2012), microdosing (Garner, 2005; Boyd and Lalonde, 2007; Lappin and Garner, 2008), etc.
Box 2: Key findings of the report Animal Models for Assessing Countermeasures to Bioterrorism Agents
• Currently available animal models are imperfect representations of the human-pathogen interaction with several important limitations, such as methodological differences and a lack of sufficient human data and knowledge of the natural history of diseases of interest. However, at this time animal models remain central to the development of countermeasures against biothreats when testing the efficacy of therapeutics or vaccines would otherwise involve exposing human volunteers or warfighters to a potentially lethal or permanently disabling toxic substance or microorganism.
• Because animal models may be imperfect for a specific need and are expensive to employ (they require large numbers of animals and must be used in secure biocontainment facilities), current models should be reevaluated for their limitations as well as their presumed advantages. For example, methodological differences among similar animal models may result in differences in how well those models accurately mirror the human response to infection or treatment. Consequently, expanded collection and analysis of human clinical data from natural infections could help verify and augment the strengths of available models.
• Developing new animal models for biodefense research cannot adequately resolve in a reasonable time frame the limitations of the currently available ones. It would be more useful for Transformational Medical Technologies to support a more thorough qualification of currently available animal models to advance the predictive capacity of animal-derived data than to create new models.
• In vitro and in silico methods are not yet advanced enough (in part due to the absence of human data) to reliably replace animals in biodefense research on a large scale.
• The Committee suggests that Transformational Medical Technologies undertake an analysis of the discovery, development, and approval process for medical countermeasures to identify:
  - Scientific gaps in terms of utilizing alternative methods to animal models and how to address them
  - Specific areas in which use of in vitro and in silico methods could be sufficient, or an adjunct, to the use of animals
  - Criteria for choosing and utilizing the most suitable technologies to replace animal use in biodefense research in the near future
• Changing the standard practice of animal experimentation where feasible to approximate the clinical course of treatment for humans could provide a more reasonable prediction of the usefulness of countermeasures during the development process.
• Potential advances in knowledge regarding biothreats and medical countermeasures should be weighed against the duration and severity of animal pain and distress.
• A comprehensive strategy to improve the gathering and sharing of data from animal models (and their alternatives) would significantly increase the efficiency and productivity of research into bioterrorism countermeasures as well as improve laboratory animal welfare, if it includes:
  - Compartmentalization: experiments designed to yield information from components of the animals (organs, cells, and systems) rather than data derived from the whole organism;
  - The use of systems biology and in vitro or in silico methods;
  - Systemic collection of, and access to, experimental data;
  - Publication of negative results;
  - Enhanced collection and analysis of human data;
  - Added clinical veterinary care.
• Where possible, Transformational Medical Technologies should encourage efforts to replace nonhuman primates as the animal of choice in biodefense research. Such efforts, coupled with unhindered access to data and publishing of all results (even negative ones), help ensure that these data are beneficial, animals are used judiciously, and unnecessary duplication of work is avoided.

A critical point of comparison for the evaluation of MCM is the regular drug development process, which shows the probabilities of finally arriving at a safe and effective drug. The complete process, from drug discovery to FDA approval, takes an average of 10 to 15 years and costs more than $1 billion (Mundae and Ostör, 2010; Tamimi and Ellis, 2009; Gilbert et al., 2003). In some estimates, when the costs of failed prospective drugs are factored in, the cost of a single drug development […] (DiMasi and Grabowski, 2007).
Typically, MCM can be developed until phase I clinical trials, so the point of comparison needs to be the clinical development phase and its success/attrition rate. Approximately 8% of drugs that now enter phase I studies eventually become FDA approved products, compared to 14% in the eighties 10. The success rates from the first study in humans to launch are now <10% (Peck, 2007, citing the C.M.R. International 2006/7 Pharmaceutical R&D Factbook). The attrition rate in phase II is now more than 70% and rising, and even in phase III one-third to half the molecules fail (Kola and Landis, 2004; DiMasi, 2001a) 11. Obviously, recent biomedical research breakthroughs have not improved our ability to identify successful candidates.
The main causes of failure in the clinic include safety problems (about 20%) and lack of effectiveness (about 40%), although both safety and effectiveness had been predicted by a series of animal models before the drug entered the most costly part of drug development. The inability to predict these failures before human testing or early in clinical trials dramatically escalates costs. In the infectious disease area, data from the ten biggest drug companies during 1991-2000 showed a success rate of about 15%, while the average of all indications was 11% (DiMasi and Grabowski, 2007). Similarly, DiMasi et al. (2010) showed a success rate for systemic infectious disease of 15.6% during 1994 and 2003. Notably, from 1981 to 1992 the success rate of anti-infective drugs was 28.1% (DiMasi, 2001b). Overall, biopharmaceuticals appear to have a higher success rate (all indications) of 30.2% (Gilbert et al., 2003).
A key question, then, is whether countermeasures to bioterrorism have a higher likelihood of success in a (theoretical) human trial. A number of aspects actually argue against this, as summarized by one author (TH) for the NAS report:
- The diseases are peracute systemic infections, most closely related to sepsis, a disease entity most notorious for failing in clinical trials (see below).
- The pathogenesis of these rare or even unknown diseases is too little known to guide the development process.
- The pathogens are likely to be optimized to withstand interventions, e.g., by introduced antibiotic resistance.
- The clinical setting is likely one of mass infection, possibly combined with other threats, hardly comparable to the randomized clinical trials of hospitalized patients.
- The biosafety levels and the strong reliance on non-human primates limit the number of animal studies.
- Most development took place with less than average development expenditure by entities not experienced with full clinical development of drugs.
It must be concluded that the success rate of normal drug development can only be used as a best-case scenario. Note that this predictivity is not that of a single animal model but rather of the combination of all efforts of preclinical development.
Sepsis, as an uncontrolled systemic infection with high mortality, is the clinical condition that most closely reflects the clinical picture caused by a bioterrorism agent. Buras et al. summarized the value of animal models of sepsis as follows: "Animal models have been developed in an effort to create reproducible systems for studying sepsis pathogenesis and preliminary testing of potential therapeutic agents. However, demonstrated benefit from a therapeutic agent in animal models has rarely been translated into success in human clinical trials." (Buras et al., 2005). Similarly, Opal and Cross (1999) summarize: "It has become painfully evident that animal models provide misleading and overly optimistic estimates of the survival benefit of specific antisepsis drugs when compared to clinical efficacy in actual human sepsis." The clinical studies in sepsis following the successful completion of preclinical animal models were summarized by Opal and Cross (1999) […] "… spectacular failures and unexpected toxicities." Opal updated these data for 2009 in his presentation to the committee, reporting that 42 clinical studies led to 39 cases of lack of effect, one small effect, and two cases where the situation of the patients worsened (Christaki et al., 2011).
In conclusion, it must be assumed that the likelihood of success in humans of countermeasures to bioterrorism would be considerably lower than the average 11% success rate of drugs at a similar stage of development. There is no evidence that any additional animal model would improve such a success rate. This applies especially if the animal model is a repetition of a model used during the development process: It would be rather unlikely that the most promising animal model had not been used during development and was instead left for the final phase, the equivalent of the clinical trial.
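The attrition figures quoted in this section compound multiplicatively across the clinical phases. The following minimal sketch illustrates that arithmetic using only the approximate rates cited above (phase II attrition of more than 70%, one-third to half of molecules failing in phase III, and roughly 8% overall success from phase I). It assumes that phase outcomes can be treated as independent probabilities; the function names and this independence assumption are illustrative and are not taken from the cited sources.

```python
def cumulative_success(phase_rates):
    """Multiply per-phase success probabilities (assumes independence)."""
    total = 1.0
    for rate in phase_rates:
        total *= rate
    return total


def implied_remaining_rate(overall, known_rates):
    """Joint success rate the unlisted steps would need to reach `overall`."""
    return overall / cumulative_success(known_rates)


# Approximate figures quoted in the text (illustrative inputs only).
PHASE_II_SUCCESS = 0.30              # attrition "more than 70%"
PHASE_III_SUCCESS = (0.50, 2 / 3)    # "one-third to half the molecules fail"
OVERALL_FROM_PHASE_I = 0.08          # "approximately 8%"

for p3 in PHASE_III_SUCCESS:
    known = [PHASE_II_SUCCESS, p3]
    combined = cumulative_success(known)
    remaining = implied_remaining_rate(OVERALL_FROM_PHASE_I, known)
    print(f"phase III success {p3:.2f}: phases II+III {combined:.2f}, "
          f"implied phase I and registration {remaining:.2f}")
```

Under these assumptions, phases II and III alone already remove most candidates, which is why the overall rate can serve only as a best-case benchmark for MCM.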

Consideration 2: Complexity versus simplicity in modeling complex biology
The Human on a Chip concept aims to combine human stem cell-based 3D organ systems in a microfluidic platform. This brings together a number of desirable features:
1) Organotypic coculture of cells: The concept of mirroring the complexity of the organism in the test models is appealing. The coculture of cells allows for cellular interactions and mutual influences on cell development and differentiation. However, achieving natural proportions of cell types and tissue architecture is a major challenge.
2) 3D systems: The third dimension adds another physiologic component. While traditional cell cultures have only 1% of the cell density of tissue and less than 10% of normal cell-cell contacts (Hartung, 2007a), 3D models reflect the tissue situation but pose a problem of nutrient and oxygen supply in the absence of a circulatory system and hemoglobin.
3) Perfusion: While traditional cell culture does not achieve homeostasis but is characterized by repeated exchange of culture media with an interim decrease in nutrients and accumulation of waste products, cell perfusion culture can maintain stable conditions. However, stable conditions are not yet physiological conditions and, typically, large culture media volumes are needed. Recirculating the medium reduces the volumes needed, but the media composition becomes less stable unless regulated nutrient supply and excretion systems are built in. One author (TH) had early experiences in two EU-funded consortia for perfused cultures of kidney and liver (Jennings et al., 2004). These taught us the advantages of such systems but also the difficulties in standardization, long-term maintenance, throughput, etc.
4) Human stem cell-based systems: It is a common assumption that human cells reflect human reactivity better than animal cells. Though experimental evidence is rather limited, it is certainly wise to work with cells from one species, preferably humans. Studies in human versus mouse bone marrow (Pessina et al., 2003) suggest that human reactivity is indeed reflected on an in vitro level to an extent that species differences in vivo can be estimated. In the same vein, the MEIC study, which tested 50 reference compounds in 61 in vitro assays, showed the best predictivity of human lethal plasma concentrations with human cells (Ekwall, 1999). Stem cells promise to be a resource of human quasi-primary cells, but we should keep in mind that we do not currently have protocols for final differentiation of most cell types (with the exception of cardiomyocytes), and the generation of pure cell types is still beyond reach.
5) Chip-based systems: They combine both miniaturization (few cells and little test agent) and opportunities for continuous measurements (Khetani and Bhatia, 2006; Ni et al., 2009; Zhang et al., 2009; Gupta et al., 2010; Shintu et al., 2012). Promising examples exist (van Vliet et al., 2007; Huh et al., 2010; Robinette et al., 2011), some of which use functional endpoints to predict hazards. At this juncture, however, it is mainly electrophysiology that allows continuous measurement, which limits us to neurons and cardiomyocytes. Unspecific measurements such as impedance are on the way, but continuous functional markers are rare.
As appealing as a combination of so many good things is ("Too much of a good thing is wonderful," Mae West), the question emerges whether the enormous efforts to create such systems are necessary to predict human reactions. Challenges include:
- Final differentiation of stem cells to the different cell types of the organ, including drug metabolism capacity.
- Finding compromise cell culture conditions to maintain all the different cells and organ equivalents.
- Balancing organ equivalent sizes and the perfusion liquid compartment to allow close to physiological kinetics.
- Continuous supply with nutrients and oxygen as well as extraction of waste products.
- Biocompatible materials for all cell types supporting but not interfering with differentiation.
- Sufficiently noninvasive measurements for the small amounts of cells on a chip and the kinetics of the test substance.
This is not just an engineering challenge but also a standardization and reproducibility challenge. A key lesson from the validation of alternative methods (Hartung, 2010a) is the pivotal role of standardization and reproducibility, and achieving them usually requires simplicity. The more variables in a system, the more standardization is required and the more opportunities there are for introducing variability.

Consideration 3: Three things count for the new tools for MCM: quality, quality, and quality
Cell cultures are prone to artifacts (Hartung, 2007a): Far too many artificially chosen and difficult-to-control conditions influence our experiments. Quality assurance is the gift from alternative methods to the life sciences. This blunt statement might be challenged by those involved with Good Laboratory Practice (GLP) or ISO quality assurance. However, GLP (at least originally) addressed only regulatory in vivo studies, ISO guidance is not really specific for life science tools, and neither addresses the key issue, i.e., the relevance of a test. This is the truly unique contribution of validation, which is far too rarely applied in other settings.
The limited applicability of GLP to in vitro studies was first addressed in an ECVAM workshop in 1998 (Cooper-Hannan et al., 1999). Parallel initiatives involving one author (TH) (1996 in Germany and 1999 in Bologna at the Third World Congress on Alternatives and Animal Use in the Life Sciences) led to a declaration toward Good Cell Culture Practice (GCCP) (Gstraunthaler and Hartung, 1999): "The participants … call on the scientific community to develop guidelines defining minimum standards in cell and tissue culture, to be called Good Cell Culture Practice … should facilitate the interlaboratory comparability of in vitro results … encourage journals in the life sciences to adopt these guidelines..." A GCCP task force was then established, which produced two reports (Hartung et al., 2002; Coecke et al., 2005). The maintenance of high standards is fundamental to all good scientific practice, and it is essential for ensuring the reproducibility, reliability, credibility, acceptance, and proper application of any results produced. The aim of GCCP is to reduce uncertainty in the development and application of in vitro procedures by encouraging the establishment of principles for the greater international harmonization, standardization, and rational implementation of laboratory practices, nomenclature, quality control systems, safety procedures, and reporting, linked, where appropriate, to the application of the principles of Good Laboratory Practice (GLP).
GCCP addresses issues related to:
- Characterization & maintenance of essential characteristics
- Quality assurance
- Recording
- Reporting
- Safety
- Education and training
- Ethics
The GCCP documents formed a major basis for a GLP advisory document by OECD for in vitro studies (OECD, 2004), which addresses:
- Test Facility Organization and Personnel
- Quality Assurance Program
- Facilities
- Apparatus, Materials, and Reagents
- Test Systems
- Test and Reference Items
- Standard Operating Procedures
- Performance of the Study
- Reporting of Study Results
- Storage and Retention of Records and Materials
Thus, both guidance documents have a lot in common: Inherent variation of in vitro test systems calls for standardization, and both the GLP advisory document and the GCCP guidance are intended to support best practice in all aspects of the use of in vitro systems, including the use of cells and tissues. When comparing GLP and GCCP, there are also some major differences: GLP still gives only limited guidance for in vitro work. GLP cannot normally be implemented in academia on the grounds of costs and lack of flexibility. GCCP, on the other hand, also aims to give guidance to journals and funding bodies. Note that guidance has also been developed for the publication of in vitro journal articles (Leist et al., 2010). A CAAT workshop was held in March 2012 in San Francisco, and a taskforce was formed to further this work.
All quality assurance of an in vitro system starts with its definition and standardization, which include:
- A definition of the scientific purpose of the method
- A description of its mechanistic basis
- The case for its relevance
- The availability of an optimized protocol, including:
  - standard operation procedures
  - specification of endpoints and endpoint measurements
  - derivation, expression, and interpretation of results (preliminary prediction model)
  - the inclusion of adequate controls
- An indication of limitations (preliminary applicability domain)
- Quality assurance measures
The novel types of Human on a Chip tests will represent additional challenges as to standardization of design and generation of cultures and devices. The systems are considerably more complex than traditional in vitro approaches, involving various cell types and engineering.
This standardization forms the basis for formal validation, as developed by ECVAM, adapted and expanded by ICCVAM and other validation bodies, and, finally, internationally harmonized by OECD (OECD, 2005). Validation is the independent assessment of the scientific basis, the reproducibility, and the predictive capacity of a test. It was redefined in 2004 in the Modular Approach (Hartung et al., 2004) but needs to be seen as a continuous adaptation of the process to practical needs and a case-by-case assessment of what is feasible (Hartung, 2007b; Leist et al., 2012). The most important changes in the Modular Approach were: the introduction of an applicability domain (borrowing the concept from QSAR); the use of existing data (retrospective validation); the independence of reproducibility and relevance assessment, allowing leaner study designs; and performance standards that allow similar tests to be considered equivalent to a validated one.
Applicability domain describes the range of test materials to which the test can be applied and for which reliable predictions can be obtained (e.g., which chemical class(es), types of products). A later change in, or extension of, the applicability domain might require additional validation work and a new peer review. While earlier validation studies had not taken into consideration any existing data on a test, the introduction of retrospective validation allowed for their use as sole source or in combination with prospective data generation. Traditional validation studies need a certain number of test items in at least three labs to assess both reproducibility and relevance. Often, however, fewer test items are required to establish reproducibility, allowing the testing of further items in one laboratory only once reproducibility has been confirmed, which can eliminate half the testing without major decrease in statistical power (Hoffmann and Hartung, 2006). Last, the introduction of performance standards, i.e., minimum criteria to be fulfilled for any later test development to prove equivalence to the validated test, represents a key tool to open the market for competing developments while also accommodating changes to the validated protocol without embarking on a full validation study.
It became evident, however, that it is difficult to adapt these schemes to such complex technologies as toxicogenomics (Corvi et al., 2006). Therefore, the concept of applying some tools of evidence-based medicine for validation purposes was put forward (Hartung, 2010b). A key problem for the new technologies, and also for new products like MCM and endpoints, is the absence of a point of reference, i.e., a "traditional test" or "gold standard." For MCM we have neither human data, other drugs for the same purpose, nor established animal tests fit for purpose. A 2008 workshop (Hoffmann et al., 2008) discussed similar issues for toxicology, identifying three types of reference points: reference method/results where available; expert consensus to establish a putative point of reference where data are ambiguous and incomplete; and, where there is no point of reference, methods such as latent class analysis. As laid out earlier (Hartung, 2010b), in the absence of reference data, the scientific validation needs to be stressed. Figure 1 shows the classical validation scheme and its adaptation to such situations. The process of defining the point of reference for MCM evaluation will be very challenging. It should be started early enough, as it will guide test development, but it […]

Fig. 1: The traditional validation scheme and its adaptation to situations without reference test

[…] integrated testing strategies (ITS) or pathway-based approaches, as suggested in the roadmap for systemic toxicity testing (Hartung and McBride, 2011; Basketter et al., 2012) to produce more predictive systems. The last article in this Food for Thought … series (Hartung et al., 2012) laid out the vision of a systems toxicology. Similarly, a systems pharmacology (van der Greef and McBurney, 2005; Berger and Iyengar, 2011; Hansen et al., 2012) can be envisioned and is currently emerging to model drug interventions in the organism. In both cases it will be necessary to convince the regulatory community to base decisions on this novel type of data, which will be best achieved by scientific rigor and the continuous exposure to new evidence from opinion leaders and market forces.
A very interesting opportunity lies in making use of the new organotypic models for pathway identification. If the improved culture conditions boost relevance of the models, the pathway identification in these models should be even more relevant.

Conclusions
the new interest in predictive in vitro systems in the US will revamp the field of alternative approaches. The substantial funding opportunities will bring researchers and engineers into the field. It will be important to acquaint them with the lessons learned over the last two decades of developing predictive models and their quality assurance. toxicology has served as a pilot, but all areas of life sciences have similar needs, and drug development can benefit in similar ways.
While the Human on a Chip approach is not the only way to construct novel predictive test strategies, it complements approaches based on integrated testing strategies or pathwaybased approaches as they are mainly pursued in toxicology (tox-21c).
The validation process as defined originally by ECVAM has been proven to work. the eCVAM principles on validation were taken up by ICCVAM (USA) and internationally by the OeCD in GD 34 on the validation and international acceptance of new or updated test methods for hazard assessment. they can be reasonably translated to drug development purposes. the validation process is in constant evolution, and MCM will need such adaptations. Validation of tests is an essential quality assurance process, similar to eBM. the evidence-based toxicology Collaboration (eBtC) promises to be a tool for validation of 21 st century methods.
is important not to use up all reference information during the design phase of an assay. the framework of evidence-based medicine is increasingly being translated to toxicology (Hoffmann and Hartung, 2006), and it recently led to the creation of the evidence-based toxicology Collaboration (eBtC) 12 (Zurlo, 2011).

Consideration 4:
The investment in superior in vitro models will promote toxicology and pharmacology for the 21 st century, even if it does not result in a routine predictive tool the convergence of new technologies, the needs of the pharmaceutical industry, and the goals of regulatory agencies further inform the dialogue that began with the publication of Toxicity Testing in the 21 st Century: A Vision and a Strategy (NRC, 2007). the need to understand pathways of toxicity in order to better predict the toxicity of environmental chemicals has now spilled over into drug development and safety assessment of pharmaceuticals and biologicals, as well as the assessment of the efficacy of countermeasures to biological and chemical terrorism agents. the need to develop approved pharmaceutical and biological interventions without clinical trials poses a unique challenge. The added benefit in pursing this challenge is that the in vitro, human-based, mechanistically-oriented organ-on-a-chip or combined to Human on a Chip and other test systems developed for that purpose will also serve to meet the needs of the other disciplines that currently rely on animal tests to predict human responses.
Our discussions for the last few years have centered very much on toxicology for the 21 st century (tox-21c). However, neither is toxicology the key problem of MCM (safety assessments including phase I trials can actually be done in a very conventional way, with the possible caveat that biologicals such as vaccines and immune response modifiers are not really testable for side effects or excess-pharmacology in animals), nor is pharmacology short of novel tools. Whatever gives a cutting edge to drug development is typically applied. However, so far the new tools have been used to convince the management to make certain business decisions, rather than to make the regulators admit a product to the market. For this reason, the very same tools require some independent endorsement so that regulators can feel comfortable in basing decisions on them. this will hold true whether we talk about animal models or alternative approaches. For this reason, MCM open up opportunities for Pharma-21c, i.e., embracing novel technologies for regulatory decisions on the likely efficacy of drug candidates.
Creating organotypic cell models is not necessarily the most convincing approach to produce credible data given the artificial nature of the constructs and the degree of innovation required to design them. We could as well imagine approaches of inte-