International Regulatory Needs for Development of an IATA for Non-Genotoxic Carcinogenic Chemical Substances

Received January 20, 2016; Accepted April 14, 2016; Epub April 27, 2016; http://dx.doi.org/10.14573/altex.1601201

Relevant in vitro assays are briefly described such that, with respect to the 3Rs test methods and assessment strategies, relevant in vitro Organisation for Economic Cooperation and Development (OECD) Test Guideline (TG) development for chemical hazard assessment will be stimulated.

Introduction
The current thinking is that there are three main stages of carcinogenicity: initiation, promotion and progression, and that the mechanisms by which chemicals cause carcinogenicity can fall into two broad categories: genotoxic (a mechanism causing initiation) and non-genotoxic (including both initiation and propagation phases). Both categories are comprised of several more specific modes of action (MoA). Over the last two decades there has been much debate about the “Hallmarks of Cancer”, highlighting the need to better understand the biological mechanisms underlying cancer onset and progression (reviewed in Hanahan and Weinberg, 2000, 2011; Smith et al., 2015). Initially six hallmarks were described in 2000: sustained proliferative signaling; evasion of growth suppression; resisting cell death; replicative immortality; induced angiogenesis; and activated invasive metastasis (Hanahan and Weinberg, 2000). To this set, further hallmarks have been added, including dysregulation of cell metabolism and avoidance of immune destruction. Genetic instability and chronic inflammation underlie these hallmarks, as do epigenetic perturbation mechanisms, particularly changes in DNA methylation (Moggs et al., 2004; Thomson et al., 2012, 2014; Miousse et al., 2015), see Figure 1. Whilst this evolving somatic mutation theory of cancer is widely but not universally accepted (Sonnenschein and Soto, 2013), it provides a useful grouping format to assist with the clustering of relevant assays for the biological processes associated with cancer initiation, promotion/progression, and tumor formation for regulatory test method development and integrated approaches to testing and assessment (IATA) purposes. It also accommodates the growing body of information about the biological processes underlying each of the hallmarks; clearly there are a number of different pathways underlying each.
Here we examine what non-genotoxic carcinogens (NGTxC) are, the current European and also international chemical regulatory requirements and difficulties with respect to NGTxC, and how an IATA could be explored and a framework ultimately be developed to assist regulators in their assessments of NGTxC.

What are non-genotoxic carcinogens?
Carcinogens can be categorized as either predominantly genotoxic or non-genotoxic according to their specific pathogenic mechanism. Genotoxic carcinogens, also known as DNA-reactive carcinogens, interact directly with DNA through the formation of covalent bonds, resulting in DNA-carcinogen complexes.
NGTxC are substances that act through secondary mechanisms that do not involve direct interaction with DNA; they do not induce mutation in (short term) eukaryotic and prokaryotic mutation assays nor do they induce direct DNA damage in the target organ. They also include "epigenetic" carcinogens, where the term "epigenetic" is sometimes used to encompass the full spectrum of transcriptional regulatory processes that appear to mediate environmental influences and change a cellular state to reflect past and current (chemical) exposures (Greally and Jacobs, 2013). Here we consider that epigenetic changes comprise changes in chromatin (DNA methylation and/or histone modifications) and non-coding RNAs, including microRNAs.
Fig. 1: Hallmarks of cancer with associated modes of action (MoA) and existing assays that address particular aspects of certain hallmarks (adapted and modified from Goodson et al., 2015)
Both non-genotoxic and genotoxic substances can trigger oxidative stress (Ellinger-Ziegelbauer et al., 2005), see also Table 1, which can also lead to adverse outcomes other than cancer. One of the main pathways related to oxidative stress is orchestrated by p53. The modulation of p53-downstream genes controls cell cycle and apoptosis. These two endpoints may or may not be related to cancer onset and progression. However, a string of p53-signaling pathway-dependent genes are directly related to angiogenesis and metastasis, including the master gene TSP1, which plays an important role in bladder cancer (Mitra et al., 2006). The modulation of different genes in the p53-signaling pathway can also lead to different fates of cells affected by oxidative stress damage, which may proceed through apoptosis or to steps leading to DNA damage (Filomeni et al., 2015). Oxidative DNA damage may be critical in carcinogenesis if it results in extensive enough changes to induce gene mutations. However, the induction of oxidative stress may also regulate gene expression either directly through activation of gene transcriptional pathways or indirectly through hypomethylation (Nishida and Kudo, 2013; Casey et al., 2015; Vaccari et al., 2015; Wu and Ni, 2015). Epigenetic mechanisms in relation to regulatory needs are reviewed by Greally and Jacobs (2013) and Marczylo et al. (2016), and, in relation to International Agency for Research on Cancer (IARC) cancer classification, by Herceg et al. (2013). Taken together, these reviews demonstrate the major roles that such epigenetic mechanisms can play in tumor formation, often in combination with other mechanisms, and how this understanding can start to be applied for OECD and IARC purposes. A mechanism-based approach provides valuable insights with respect to the test method combinations that could be developed, also employing current OECD TGs (as Tab. 1 suggests), in order to obtain an IATA for NGTxC. Table 2 provides examples of molecular regulators of the vertebrate epigenome and Table 3 provides molecular epigenomic choices (Greally and Jacobs, 2013).
Of these, the three principal mechanisms, DNA methylation, histone modification and miRNA, are particularly promising with respect to assay development and augmentation for detecting NGTxC, as they are often key elements of many of the cancer hallmarks (as indicated on the right side of Fig. 2). Relevant information about DNA methylation, RNA and miRNA expression studies and chromatin structure and modification data can already start to be derived from the literature, together with analyses to identify markers for detection of NGTxC with epigenetic activity. This information can be used in combination with the key event assay blocks identified in Table 1, specific examples being endocrine receptor (Zhang and Ho, 2011, reviewed in Greally and Jacobs, 2013) and xenobiotic receptor activation (Lempiainen et al., 2011). One of the major challenges when studying epigenetic changes induced by NGTxC is to link these epigenetic changes to adverse outcomes. Preliminary review suggests that association of NGTxC exposure with miRNAs may contribute to the weight of evidence (WoE) for downstream adverse outcome pathway (AOP) linkages, but specific epigenetic machinery may differ between adverse outcomes. Taking the male germline as an example, we can relate a molecular initiating event (MIE) such as androgen receptor (AR) activation with a negative correlation to
NGTxC can act via epigenetic mechanisms, as reviewed by Thomson et al. (2014). Additionally, NGTxC can also act via other well-known mechanisms, many of which underlie the hallmarks shown in Figure 1, such as: peroxisome proliferation; immune suppression; receptor-mediated endocrine modulation; inhibition of intercellular communication; induction of tissue-specific toxicity; inflammatory responses; disruption of cellular signaling or structures by changing the rate of either cell proliferation or of processes that increase the risk of genetic error; disruption of certain negative cell feedback signals that can enhance proliferative signaling (e.g., the oncogenic effects of mutations in the Ras gene stem from disruption of its negative feedback system) (Hanahan and Weinberg, 2011); and mutations in tumor suppressor genes that allow cells to evade growth suppression and contact inhibition.
Thus, NGTxC act through a large and diverse variety of different and in some cases specific mechanisms. An overview of the major mechanisms is provided in Table 1 and Figure 1. This is a living list of NGTxC mechanism examples. With the recent rapid elucidation of epigenetic mechanisms in the NGTxC process, specific epigenetic mechanisms of relevance are increasingly being identified. While genetic modifications in tumor cells may initiate and drive malignancy, exposure to NGTxC may affect one or more of the biological traits underlying the hallmarks of cancer. NGTxC in this paper therefore refer to carcinogens that are negative (or produce ambiguous results) in genotoxicity assays performed to measure biological endpoints, including gene mutations and chromosomal damage (chromosomal aberrations, micronuclei formation).
There are a large number of pathways and intervention points for NGTxC to contribute to cancer initiation and cancer progression. In addition, there is evidence to support the theory that NGTxC can cause alterations that increase the likelihood of genetic instability, which would promote the acquisition of a fully malignant and invasive phenotype (Ellinger-Ziegelbauer et al., 2009). NGTxC exhibit thresholds and may be reversible at some early stages; they are strain, species and tissue specific and function often at the tumor promotion stage, while they are also able to be tumorigenic alone (such as asbestos).
From the overview of NGTxC mechanisms presented in Table 1, it should be noted that assays with endpoints capturing early key event mechanisms may individually provide only a relatively poor contribution to the Weight of Evidence (WoE), in which multiple sources of information are assessed in relation to each other. For example, measurement of proliferation alone does not necessarily mean that there will be further downstream pathway perturbation leading to angiogenesis.
In other cases, the mechanism may not be purely non-genotoxic. A relevant example of a non-genotoxic mechanism is oxidative stress as it is considered to be one of the best documented mechanisms for carcinogenesis together with impacts upon cell cycle regression. Oxidative stress is one of the mechanisms through which environmental pollutants may induce cell damage, triggering an inflammatory response which may evolve into chronic inflammation as a consequence of enduring exposure.
Gap junctional intercellular communication (Cruciani et al., 1997, 1999); c-myc (Maire et al., 2007); Ornithine decarboxylase induction (Dhalluin et al., 1997, 1998)
Library preparation becoming cheaper and easier.
RNA: RNA-seq (Nagalakshmi et al., 2008). Quantitative, can also generate qualitative data about transcription such as alternative splicing. Data analysis approaches still being optimized.

Secondary (alternative)
Chromatin post-translational modifications, chromatin constituents: ChIP-seq (Mikkelsen et al., 2007). Tests entire sequenced genome. Resolution limited, not shown to be quantitative.
Chromatin structure: DNase-seq (Song and Crawford, 2010). Identifies important regulatory regions not located at annotated promoters. Not shown to be quantitative.
miRNA-34c (Ostling et al., 2011), altered regulation is noted in infertile men (Wang et al., 2011), and it is known that miRNA plays a critical role in cleavage in mammalian spermatogenesis (Liu et al., 2012), but neither have been reported so far to be associated with adverse clinical outcomes in prostate cancer. However, there is an adverse association with the downregulation of miRNA-4723 and miRNA 338-3p in prostate cancer (Arora et al., 2013; Bakkar et al., 2016).
A recent review of IARC monograph Volume 100 found that this volume identified specific epigenetic substances with non-genotoxic carcinogenic potential and a recommendation is made to identify a priority set of potential epigenetic, non-genotoxic carcinogens (Herceg et al., 2013; see also Tab. 4). These examples illustrate the complexity of the processes leading to carcinogenicity, such that chemicals acting through one mechanism can trigger another, as well as the complex interaction of cancer pathways with other toxicity or AOPs. It will therefore be important to consider the paradigm of pathway-based responses in the discrimination between genotoxic and non-genotoxic chemicals, together with understanding the causality of exposures in inducing cancer versus other possible outcomes.
The right hand side of the figure depicts the suggested levels of organization for the NGTxC IATA, with increasing levels of complexity that the hallmarks of NGTxC, shown in Figure 1, sit within, underpinned in many cases by epigenetic machinery, as indicated in the epicenter of the NGTxC hallmark wheel. The underlying blue lines with connecting nodes are symbolic of the NGTxC AOP causality network, where key events (KE) and key event relationships (KER) can be identified on the basis of the mechanisms and MoA identified in Table 1. As work on the IATA progresses, these nodes will be completed as KE, KER and relevant in vitro assays for both the IATA and as potential OECD TGs are identified and assessed for relevance and readiness.

Current regulatory requirements and difficulties with respect to non-genotoxic carcinogens
Regulatory requirements for carcinogenicity testing of chemicals vary from legislation to legislation and region to region; however, a standard approach starts with a battery of genotoxicity tests.
Positive in vitro genotoxicity assays trigger in vivo genotoxicity assays, and if these are also positive, they can trigger further more involved mammalian testing. Positive in vitro genotoxicity assays can also trigger Globally Harmonized System (GHS) classification for mutagenicity. If any of the in vivo genotoxicity tests are positive, then under several regulatory schemes a lifetime rodent cancer bioassay (RCB) may (or may not) follow. The RCB is the standard comprehensive in vivo test for the detection of carcinogens, including detection of NGTxC, but the bioassay will seldom be performed for chemicals.
Whilst it has been estimated that 10-20% of recognized human carcinogens classified as Group 1 by IARC act by NGTx mechanisms (Hernandez et al., 2009), there are no specific requests to obtain information on NGTx mechanisms of carcinogenicity in several chemical regulatory frameworks in OECD member countries. As already noted, the current approach to classification of carcinogens is based on the results of the in vivo RCB. For example, with respect to chemicals entering the European Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) registration process, there is a general requirement that all available information on the registered substance needs to be submitted in the REACH registration dossier. Moreover, the legislation does indicate that when hyperplasia and/or pre-neoplastic lesions are observed in repeated dose studies and the substance has a widespread use or there is frequent or long-term human exposure, further carcinogenicity studies could be considered. However, the requirement to perform a carcinogenicity study is conditional (see Annex X of REACH Regulation for details). In practice, new carcinogenicity bioassays are seldom proposed by the registrants, and ECHA rarely requests such data. ECHA reports that only two proposals to conduct the RCB have been submitted to ECHA (ECHA, 2014). At the REACH registration deadlines of 2010 and 2013, substances that are manufactured at more than 100 tons per annum (tpa) have been registered, and only two carcinogenicity studies were actually performed, as from the 2010 deadline. As they are rarely performed, few NGTxC might be identified via the assessment of these in vivo studies.
Published final decisions of ECHA can be found at http://echa.europa.eu/information-on-chemicals/dossier-evaluation-decisions. The REACH regulation permits registrants to adapt the standard testing regime to fill data gaps using existing data, weight of evidence, (Quantitative) Structure Activity Relationships ((Q)SAR), in vitro methods, and grouping of substances and read-across (ECHA, 2008, 2014). The issues regarding existing models and approaches with respect to the acceptance of in silico data for NGTxC chemical regulatory purposes are discussed in more detail in Section 5.4.
As there are no (OECD approved) in vitro screening methods for NGTxC, it appears likely that many NGTxC remain unidentified. Consequently the risks they may pose to human health will not be managed.
The situation appears to be similar in other regulatory frameworks in OECD member countries, i.e., that there is no specific chemicals legislation in place to address the NGTxC mode of action directly. As part of the regulatory evaluation process, further carcinogenicity studies can be requested if deemed necessary. In the United States, guidance is given on how to use MoA data (US EPA, 2005; Corton et al., 2014), as available, to determine if the substance is genotoxic or not, and inform the dose response relationship characterization. Similarly in the US, Canada and Japan, for cancer risk assessment, examination of the various indicators that can be seen within an OECD TG study that may alert to a potential NGTxC (e.g., estrogen agonists, overt cytotoxicity or cellular proliferation, etc.) is conducted, but there is no specific structured guidance for NGTxC
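The tiered triggering of tests described in this section can be sketched in a few lines of Python. This is only an illustration of the logic as stated in the text (in vitro genotoxicity first, in vivo genotoxicity if positive, and possibly the RCB after that); the class, function and message names are hypothetical and do not correspond to any regulatory tool or guideline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenotoxResults:
    """Hypothetical container for the outcome of the genotoxicity battery."""
    in_vitro_positive: bool
    in_vivo_positive: Optional[bool] = None  # None means "not yet tested"


def next_step(results: GenotoxResults) -> str:
    """Return the next testing step suggested by the tiered approach."""
    if not results.in_vitro_positive:
        # A non-genotoxic carcinogen is negative here by definition,
        # so nothing in this cascade triggers further cancer testing.
        return "no further testing triggered"
    if results.in_vivo_positive is None:
        return "in vivo genotoxicity assay"
    if results.in_vivo_positive:
        return "rodent cancer bioassay may (or may not) follow"
    return "no further testing triggered"


# A substance that is negative in vitro leaves the cascade at the first tier.
print(next_step(GenotoxResults(in_vitro_positive=False)))
```

The sketch makes the regulatory gap visible: a substance that is negative in the genotoxicity battery never reaches the bioassay through this route, whatever its non-genotoxic carcinogenic potential.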

Examples of the role of NGTxC events in the tumor development process
In rodents the liver is the target organ that is most commonly affected by carcinogens, thus one of the first experimental cancer models to highlight the role of non-genotoxic events in the tumor multistep process was hepatocarcinogenesis. Hepatocarcinogenesis is a progressive process, starting with focal lesions of inflamed liver. The neoplastic transformation of hepatocytes is a consequence of the accumulation of genetic damage in hyperproliferative cells such that inflammation and cell proliferation are then considered among the initial steps in the liver tumor process. The sequence of these events is considered to be mimicked in experimental carcinogenesis models. However, tumor formation in treated animals depends on dose, which should be sufficiently high, and time span, which should be sufficiently long. Moreover, the induction of cell proliferation is often required. This can be achieved by performing hepatectomy, or by repeatedly administering a "tumor-promoting" agent, such as phenobarbital, which serves as an inducer of both chemical bioactivation, by stimulating Phase-1 enzymes, and clonal expansion of pre-neoplastic cells. A single administration of a genotoxic chemical, such as N-nitrosodiethylamine (DEN), often requires a very long time to induce the tumor. Tumor formation is also strain, sex and age-dependent and it is poorly reproducible. For this reason, several experimental models have been developed to improve the sensitivity and reproducibility of the in vivo assay to study hepatocarcinogenesis in order to highlight the key steps leading to the adverse outcome. The experimental models, ranging from two-stage studies to xenograft models, however, show several limitations. Even if the sequence of events in experimental models can be considered similar to the tumor process in humans, the molecular changes are often of no or low significance in humans. As an example, H-Ras mutations, induced by treatment with DEN as a key event leading to tumor formation, are a late event in humans, related to the formation of metastasis and poor prognosis (Heindryckx et al., 2009). Based on the accumulated evidence from data reported in the literature over the last 40 years, NGTxC may trigger hepatocarcinogenesis through different mechanisms, including the activation of Phase-1 enzymes via binding the Constitutive Androstane Receptor (CAR) (e.g., phenobarbital and some polychlorinated biphenyl congeners), binding the Aryl-hydrocarbon Receptor (AhR) (dioxin and dioxin-like compounds), or by inducing peroxisome proliferation via the Peroxisome Proliferator-Activated Receptor (PPARα) (e.g., hexachlorobenzene) (see Fig. 3). Peroxisome proliferation, however, is another example of a species-specific event, whose role in human carcinogenesis is debated and generally considered not relevant to humans in regulatory evaluations.
On the basis of these mechanisms together with results from in vitro assays and two-stage in vivo studies, IARC provided a first list of non-genotoxic hepatocarcinogens in 1992, which was expanded in the IARC evaluations reported in 2013. The lists are included in Tables 5.1-5.4. These substances are known to modulate cell growth and cell death and exhibit dose response relationships between the initial stages of exposure and the later stages of tumor formation. While the exact MoAs of these substances on the process of neoplastic cell formation have not been established, changes in gene expression and cell growth parameters appear to be paramount.
identification. Consequently, NGTxC is not well covered/addressed under the different OECD regulatory jurisdictions.
For GHS purposes, information regarding the type of lesion and species affected is an important element in forming the conclusions regarding classification for sufficient or limited evidence for carcinogenicity. The GHS classification is mainly based on the evaluation of complex in vivo long term and carcinogenicity data, and the GHS criteria indicate that information in two species is often needed to conclude as to whether a substance is a category 2 or category 1B carcinogen. In this regard, it is important to note that several retrospective analyses of carcinogenicity data generated in rats and mice have raised doubts about the need for two-species testing and revealed that mouse studies have not been of substantial added value in regulatory decision-making (Billington et al., 2010; Annys et al., 2014). Therefore systematic analysis of the GHS category 1 criteria for multi-species carcinogenicity will be needed to adequately identify NGTxC (Gray et al., 1995), whilst also some uncertainties of this test method in terms of reliability and relevance are recognized (Gottmann et al., 2001; Alden et al., 2011). Additionally, critical MoA information providing insights into the human relevance may be missing (Meek et al., 2014; WHO, 2007), and this is discussed further below.
As with REACH, there are no formal NGTxC regulation procedures to follow in several other chemical regulatory frameworks, whilst information on the effect level and/or dose response curve and information on the Classification and Labelling form an essential step in the risk management of chemicals. With a strong international drive to reduce animal testing and costs, it is essential that proper and robust methods for addressing NGTxC modes of action are developed and put in place. This is particularly stimulated by recent discussions between OECD member countries on difficulties as to how to meaningfully apply individual in vitro tests, such as the Cell Transformation Assay(s) (CTA), for NGTxC assessment.
At a meeting of the Working Group of the National Coordinators of the OECD Test Guidelines Programme (WNT) in April 2014, the CTA using Syrian Hamster Embryo (SHE) cells was not accepted as a Test Guideline (TG) but was proposed and then accepted as a Guidance Document (GD) (OECD, 2015a). Therefore, it does not fall under the Mutual Acceptance of Data (MAD) Decision, under which data generated in the testing of chemicals in an OECD member country, or a partner country having adhered to the Decision, in accordance with TGs and Principles of Good Laboratory Practice (GLP), can be accepted in other OECD countries (and partner countries having adhered to the Decision), for the purposes of assessment and the protection of human health and the environment. This greatly reduces the testing that needs to be done for different regulatory jurisdictions. A GD only provides guidance as to how to use a test.
At the WNT April 2014 meeting, it was also again recognized that no single test can currently demonstrate and predict NGTxC (OECD, 2012b). So, appropriate integrated approaches to detect and manage these NGTxC are not yet in place. The construction of an IATA could be a primary way to address this crucial gap.

IATA for NGTxC
The OECD working definition of an IATA is as follows: "a structured approach used for hazard identification (potential), hazard characterization (potency) and/or safety assessment (potential/potency and exposure) of a chemical or group of chemicals, which strategically integrates and weights all relevant data to inform regulatory decision regarding potential hazard and/or risk and/or the need for further targeted testing and therefore optimising and potentially reducing the number of tests that need to be conducted" (OECD, 2015b). When applying IATA, the hazard information together with the exposure information would be used to determine which data gaps exist, and what testing if any would be most appropriate to undertake in order to elucidate the hazard profile of that substance for a given use context. Thus the extent to which testing approaches are needed depends on the problem formulation, which in turn is defined by the end purpose under consideration and the scientific confidence needed (OECD, 2015b). The IATA should be as simple as possible but as complex as necessary, and it is necessary to provide regulators with the understanding of the assumptions on which the IATA is based (OECD, 2015b).

Tab. 5: Examples of groups of chemicals that may trigger mechanisms involved in non-genotoxic carcinogenesis
The current approach to classification of carcinogens is based on the results of the rodent bioassay. As already noted, most of the chemicals target the liver in these models. An IATA therefore represents an approach to integrate and weigh all available data for hazard and risk purposes. This includes a wide variety of regulatory needs that range from simple hazard identification for priority setting or support for category formation and read-across to complex quantitative-based risk/safety assessments. An IATA is usually comprised of several elements, and the selection of these can be based on an AOP, where an AOP describes existing knowledge on the toxicity mechanisms at different levels of biological organization that lead to an adverse human and/or environmental health effect (Ankley et al., 2010; OECD, 2013; Tollefsen et al., 2014; Villeneuve et al., 2014). It depicts biological changes, defined as key events (KEs), at cellular, tissue,
Much of the available information about AOPs, MIEs, KEs, AOs and chemical initiators is being collected and will be linked in the AOP Knowledge Base (AOP-KB, https://aopkb.org), an ongoing OECD initiative. The number of AOPs being developed by the scientific community internationally, both for human health and ecological effects, is steadily increasing. As more AOPs are developed and successfully peer reviewed, networks of AOPs will arise that are interlinked by sharing one or more KEs and/or KERs. In this way the AOP concept provides a framework to organize and communicate scientific knowledge on toxicological mechanisms that may be highly informative for regulatory decision-making and IATA construction. An IATA can also be developed in an empirical manner, based upon predictivity and reproducibility, as expected from a TG, and so can also contain elements that are not informed by an AOP, particularly intended use and exposure, toxicokinetics and -dynamics, etc.
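The notion of AOP networks that are interlinked by shared KEs can be illustrated with a small data structure. The following Python sketch is an assumption-laden toy: the three AOPs, their event names and the `AOP` class are hypothetical and are not taken from the AOP-KB; the code only shows how shared KEs could be found once AOPs are stored as ordered chains of events.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class AOP:
    """Hypothetical AOP: one MIE, an ordered chain of KEs, one adverse outcome."""
    mie: str
    key_events: tuple
    adverse_outcome: str

    def key_event_relationships(self):
        """Return the KERs as consecutive (upstream, downstream) pairs."""
        chain = (self.mie, *self.key_events, self.adverse_outcome)
        return list(zip(chain, chain[1:]))


# Illustrative entries only; names are placeholders, not curated AOPs.
AOPS = {
    "AOP_A": AOP("receptor activation",
                 ("oxidative stress", "cell proliferation"), "liver tumor"),
    "AOP_B": AOP("chronic inflammation",
                 ("oxidative stress", "cell proliferation"), "colon tumor"),
    "AOP_C": AOP("immune suppression",
                 ("reduced tumor surveillance",), "lung tumor"),
}


def shared_key_events(aops):
    """Map each pair of AOPs to the set of KEs they have in common."""
    links = {}
    for (name_a, a), (name_b, b) in combinations(aops.items(), 2):
        common = set(a.key_events) & set(b.key_events)
        if common:
            links[(name_a, name_b)] = common
    return links


network = shared_key_events(AOPS)
```

In this toy network only `AOP_A` and `AOP_B` are linked, through the two KEs they share; an assay addressing a shared KE would then inform both pathways.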
Depending on the purpose of the risk assessment, the exposure assessment may need to emphasize certain areas in addition to quantification of exposure and dose, for example, the number of people exposed and the duration and frequency of exposure(s). However, the methodology and guidance for exposure assessment considerations are not being included at this
cause and effect linkages between KEs or a KE and the AO. These pathways are often based on a few stressors tested in a limited number of assays with a low level of confidence in the AOP.
- Qualitative: where AOPs / MoAs have qualitative understanding of critical components of the AOP / MoA. Pathways are often based on one or a few well-studied stressors where there is experimental evidence for the most critical KEs and the AO. The level of confidence in the AOP is moderate.
- Semi-Quantitative: where AOPs / MoAs have, in addition to qualitative understanding of the entire AOP / MoA, semi-quantitative understanding of some of the KEs. Pathways are based on multiple compounds and/or stressors evaluated at several KEs and the AO. The level of confidence in the AOP is moderate to high.
- Quantitative: where AOPs / MoAs have, in addition to quantitative understanding of critical components of the AOP, empirical data across the spectrum of KEs and AO. These pathways are based on many compounds evaluated for all KEs and the AO, so in vitro effects can be scaled to in vivo effects for risk assessment. The level of confidence in the AOP is high (OECD, 2015b).
the use of safety assessment factors varies according to different regulatory frameworks. Assessment factors may be better formulated scientifically as probability distributions (mean ± SD) (e.g., Jaworska et al., 2011; WHO, 2014) rather than deterministic (e.g., factor 100), in which case on a human population level no deterministic threshold for whatever type of effect can be defined, be it GTx or NGTx (see e.g., Vermeire et al., 1999). The current common regulatory practice is to use large assessment factors or probabilistic thresholds for GTx carcinogens (e.g., residual tumor probability of 10⁻⁶), but standard assessment factors and deterministic thresholds for NGTx effects. This perhaps reflects the differing confidence in the reliability and relevance of the GTx and NGTx testing methods rather than a biological concept, and as such could be a useful consideration in the WoE analysis within an NGTxC IATA.
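The contrast between a deterministic assessment factor and one expressed as a probability distribution can be made concrete with a short calculation. The Python sketch below assumes, purely for illustration, that each assessment factor is lognormally distributed and independent of the others, and it uses made-up geometric means and geometric standard deviations; neither the distributional form nor the numbers are taken from the cited sources.

```python
import math
from statistics import NormalDist


def combined_factor_percentile(factors, p):
    """Percentile p of the product of independent lognormal factors.

    factors: list of (geometric_mean, geometric_sd) pairs (hypothetical values).
    The product of independent lognormals is lognormal, so the log-scale
    means add and the log-scale variances add.
    """
    mu = sum(math.log(gm) for gm, _ in factors)
    sigma = math.sqrt(sum(math.log(gsd) ** 2 for _, gsd in factors))
    z = NormalDist().inv_cdf(p)
    return math.exp(mu + z * sigma)


# Illustrative assumptions: an interspecies and an intraspecies factor.
FACTORS = [(4.0, 2.5), (3.0, 2.0)]
DETERMINISTIC_FACTOR = 10 * 10  # the conventional "factor 100"

median_factor = combined_factor_percentile(FACTORS, 0.50)
p95_factor = combined_factor_percentile(FACTORS, 0.95)
```

Under these assumed inputs, the combined factor is a distribution rather than a single number, so the protection level has to be chosen explicitly (for example, the 95th percentile) instead of being implied by a fixed factor such as `DETERMINISTIC_FACTOR`.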

Requirement of consistent evidence of adversity
When we consider the possible mechanisms through which NGTxC affect the carcinogenesis process, it is important to appreciate that not all of these mechanisms are always related to adversity. For example, oxidative stress, cell death and immune system evasion are three strictly interconnected cancer hallmarks (they share several molecular targets and pathways). However, cell death is also a mechanism of cell defense, and the immune system has a counterbalancing role to play that recently has been described as a yin/yang response (Khatami, 2008; Biswas and Mantovani, 2010), depending upon the affected pathway, as the body attempts to recover from a toxic insult and tries to restore balance. Among these three mechanisms, oxidative stress is directly linked to adversity, but for the other mechanisms, disturbance of the immune system could also lead to serious adverse effects, depending upon whether the balance is disturbed and equilibrium is not quickly restored. Since it is possible to measure only the endpoint (molecular target, biomarker) that is related to the mechanisms, but not necessarily to adversity, a substance cannot be considered to be a positive NGTxC simply on the basis of a single mechanistic in vitro test, or only one mechanism, particularly if this mechanism cannot be unequivocally related to adversity.

One IATA NGTxC cancer model or many?
When it comes to building an IATA for NGTxC, it is very likely that while there may be mechanistic blocks or common KE elements and KERs that occur in many different types of cancers, there can also often be specific influences and signaling pathways as first indicated in Table 1 that are more relevant to a particular tissue and organ. It is therefore unlikely that a global "one size fits all" model will be sufficient. Here we suggest three initial examples of the natural history of tumor progression scheme: liver (Fig. 3), colon (Fig. 4) and lung (Fig. 5), utilizing the AOP format that might provide the basis for IATA approaches for NGTxC to prioritize testing regimes. These first examples are developed to highlight the role of inflammation and related oxidative stress as steps that trigger NGTxC in the multistep carcinogenesis process. Inflammation is a major
early stage, since these are outside the scope of the intrinsic hazard profile. The reader is referred to more authoritative sources for the conduct of an exposure assessment, e.g., Embry et al. (2014). The governance of IATA activities is in development under the auspices of the OECD. The intention is that at least core elements of IATAs can be developed that will also fall under the Mutual Acceptance of Data (MAD) principle that underpins the OECD Test Guideline Programme (Section 3), such that they are mutually accepted in member countries.
Here, we explore the basis for the development of an IATA for NGTxC, aiming first to address hazard identification and how this can start to be approached internationally in a harmonized manner (and under the auspices of the OECD).There might be a need for several "purpose" levels within an IATA for NGTxC, or there might be a need to develop different and separate purpose driven IATAs for NGTxC.These ideas and concepts are in very early development, and here we are not able to make concrete and definitive suggestions as to a final IATA for NGTxC, but we do examine how to start synthesizing the approaches that would be acceptable at the OECD level.Figure 2 provides a conceptual bird's eye view and provisional structure in which to synthesize the elements that can inform an IATA for NGTxC, as described within this paper.

Thresholds
NGTxC chemical substances exhibit temporal and threshold characteristics, frequently requiring repeated treatment to produce carcinogenicity. While NGTxC are carcinogenic alone, they are generally considered to impact upon the promotion stage of the cancer process. As such, they synergize with genotoxic agents and/or DNA damaging events, triggering the multistep carcinogenic process. Co-exposure and interactions occur constantly in real life, and promoting effects might be worth considering in the regulation of chemicals for cancer prevention. The AOP construct can be very useful to discriminate between the adaptive response and the adverse responses, so identifying the reference dose that is the point of departure (i.e., benchmark dose (BMD)) for the calculation of the acceptable exposure levels (Chepelev et al., 2015).
Within current regulatory frameworks up to now the existence of thresholds for NGTxC has been generally accepted.However, due to the possible interactions with GTxC, and in the context of probabilistic, population-based limit value derivation (WHO, 2014), the concept may need refinement for the purpose of planning, design and interpretation of an NGTxC IATA, and therefore a brief consideration on the threshold issues is presented here.
The variability of data points within each experimental data set could be used to derive a no adverse effect level including its confidence interval, i.e., a BMD approach. In other words, there is a probability that at concentrations much below the BMD the effect size should still be considered as adverse. In addition to "within experimental uncertainty", there is extrapolation uncertainty, from animals to humans and between humans. Differences between animals and humans, and furthermore differences between humans vary for each individual chemical, and colorectal cancer (Liu et al., 2016). It can also occur as a sporadic form where APC is hypermethylated as a consequence of environmental exposures (e.g., diet, alcohol consumption) and also as a consequence of inflammation in chronic diseases or due to pathogenic bacteria (Viljoen et al., 2015). Inflammation also plays the main role in the early steps of sporadic colon tumors.
The third example is lung cancer. For lung cancer, inflammation is the first adverse effect following the exposure to environmental oxidative stressors (such as smoking and airborne environmental pollutants). Inflammation is supported by the immediate immune response, through the production of chemokines and activation of alveolar macrophages. Figure 5 maps the key events in the natural history of tumor development and progression leading to the adverse outcome of lung cancer.

Acceptance of in silico data for regulatory purposes
Generally, in silico approaches have been utilized for many years by regulatory organizations such as the US EPA, but the routine use of such tools has not been consistent (Lo Piparo et al., 2011), often due to a lack of in-house expertise but also, with respect to carcinogenicity, due to a lack of appropriate models. However, the implementation of REACH, which aims to fill information gaps for a large number of chemicals and strongly encourages the minimization of animal testing, has provided an pathological condition that provides suitable conditions for further evolution of the multistep process.
Whilst there are for NGTxC many mechanisms and modes of action leading to an adverse outcome, one can start to map the AOP utilizing current understanding of tumor natural histories, and therefore elucidate the key assay blocks that would be required for an organ specific IATA for example.
These may therefore form a basis for the IATA specific compartments from which an IATA framework for NGTxC will be able to draw out mechanistic KE commonalities for assay selection and development, and the KE differences that would need specific attention.
The first example is that for hepatocarcinogenesis. A second example is that for colon cancer. Figure 4 shows such a scheme for colon cancer. It is known that colon cancer can be initiated by inherited damage: for example, mutation of adenomatous polyposis coli (APC), a gene encoding a large multidomain protein that antagonizes the Wnt signaling pathway, is the main cause of familial adenomatous polyposis (FAP). This is an autosomal dominant disorder characterized by the development of many hundreds to thousands of colonic adenomas, and thus an increased risk of properties of substances. SAR models only use (sub)structure information and can therefore be seen as a more formal way of performing read-across with a given reference set of data, as the property of one or more substances is directly used to predict the property of the substance of interest. It is noted that conceptually (Q)SARs should be developed utilizing a large, good quality database using robust scientific and statistical concepts and as such should represent the most formalized non-testing approach. In contrast, read-across has the disadvantage of representing a much less formalized and therefore more subjective non-testing approach, but it may provide more specific information. In the future a combination of (Q)SARs with read-across, i.e., a local validity analysis of the models, may be of greater reliability for decision making. VEGA 2 , for example, explicitly supports this. Such an approach may improve possibilities to yield robust information on the type of effect (tumor) and the dose levels at which effects occur. This information is crucial impetus to employ in silico models for the safety assessment of chemicals (EC, 2008). Non-testing methods to assess genotoxic or carcinogenic hazard to humans encompass (Q)SARs as well as chemical grouping for read-across approaches. The principle of the latter is that endpoint or test information for one or more chemicals is used to predict 
the same endpoint or test for another chemical, which is considered to be similar by robust scientific justification (Wu et al., 2010; OECD, 2014). Crucial for this approach are the quality of the existing data and definitions of the similarity, and with a drive to improve the transparency and consistency in the utilization and evaluation of read-across, ECHA have recently developed a Read-Across Assessment Framework (ECHA, 2015). Read-across approaches are more frequently applied for cancer hazard assessment than (Q)SAR methods (ECHA, 2014). (Q)SAR models predict the biological activity of a given chemical using quantitative parameters describing structure but also physico-chemical and/or reactivity are considered more important than structural alerts (Silva Lima and Van der Laan, 2000), but these can be combined. A few models describing a number of structural alerts and/or characteristics of several types of NGTxC, such as PPARα activators and inducers of oxidative stress, have been developed (Woo and Lai, 2003; Benigni et al., 2013).
Thus, for structural alerts for receptor mediated interactions (e.g., ER, AR, PXR, CAR, PPAR, GR and AhR, where the receptors are considered MIEs that may induce cancer indirectly via hormonal imbalance) (Jacobs et al., 2003; Jacobs, 2004; EFSA, 2013, 2014, and references therein; OECD, 2015c and references therein), oxidative stress and DNA methylation (Benigni, 2012; OECD, 2015c), application of in silico methods in sequential/step-wise approaches (combining relevant and reliable expert systems or (Q)SAR models) can contribute to the WoE. Overall, for MIE endpoints, such as receptor binding and activation, the quality and reliability of the tools are relatively high, but reliability for more complex endpoints has been noted to be far less certain (Benigni, 2014; EFSA, 2013, 2014; OECD, 2015c). With respect to models built upon the CTA for example, as already discussed, regulatory confidence needs to be improved, but this is possible when used in a tiered testing strategy with the inclusion of relevant new structural alerts for NGTxC.
The OECD has recently published new guidance principles for (Q)SAR analysis of chemical carcinogens with mechanistic considerations (OECD, 2015c), which include some NGTxC mechanisms, though these will need further assessment, whilst the US EPA have developed and published a new pathway-based approach with performance based metrics, using the ER signaling pathway as the first example. The approach consists of an integrated model of chemical perturbations of a biological pathway using 18 in vitro high throughput screening assays for the ER, with data generated from the ToxCast/Tox21 program (Browne et al., 2015). Kleinstreuer et al. (2013) developed a model trained on 232 pesticides from the phase 1 study of the ToxCast project for the prediction of rodent carcinogenicity including some of the cancer hallmark processes identified in Figure 1. However, there were several false negatives of concern, and the data used to build this model are noted to be limited, but the phase 2 study may overcome some of these data quality problems.
For both predictions (positive and negative), the similarity of the substance of interest to the substances used in the training datasets of the models (the so-called applicability domain of the models) is crucial. Although a (Q)SAR model might be able to give a prediction for any (organic) substance, if this prediction is far outside of the applicability domain, the reliability of the prediction will be low. Amongst the model builders there are also different interpretations of the definition of the term in risk assessment and will not be delivered by the use of, e.g., alert models such as DEREK-Nexus3 , or ToxTree4 (Patlewicz et al., 2008). Application of such alert models will be more relevant in screening large numbers of substances, and subsequent priority setting. A good example of such an approach is the ICH M7 guideline for assessment of potentially genotoxic impurities in pharmaceuticals (ICH, 2014). According to this guideline, a negative result obtained from an in vitro bacterial mutagenicity assay (e.g., Ames test) is sufficient to assume lack of genotoxic potential of the impurities under study, and no further testing is conducted.
Indeed the examples for in silico genotoxicity tools are far stronger than those for NGTxC mechanisms and MoA. Multiple computational models have been established for identifying a chemical's genotoxic potential, triggered over 25 years ago by the publication of the Ashby-Tennant alerts in 1991 (Ashby and Tennant, 1991), followed by software such as VEGA (Fjodorova et al., 2010), ToxTree (Patlewicz et al., 2008), the OECD QSAR Toolbox (OECD, 2015c) and LAZAR (Maunz et al., 2013; Lo Piparo et al., 2014), which are all available free of charge. Other, (semi)commercial, models include MultiCASE5 , TOPKAT6 , HazardExpert7 , LeadScope8 and DEREK-Nexus 3 . In general, it can be stated that these models, in contrast to those developed for predicting NGTxC, are more developed, and cover broader chemical space. The knowledge embedded in the various free-of-charge models, as well as in the commercial models, often largely overlaps, as most models derive their knowledge from the same set of experimental data, although there are differences in the details. Caution is warranted when using multiple theoretical models in a WoE approach, as the same outcome from different models does not necessarily increase the confidence in that prediction. A further distinction should be made between models that are "only" alert models, such as the DEREK-Nexus expert system, ToxTree, or the profilers in the OECD QSAR Toolbox, versus models that also try to correlate the absence of genotoxicity or carcinogenicity to specific chemical and/or physicochemical properties. The first are considered to give only a valid positive prediction, i.e., the absence of an alert is not a prediction of the absence of genotoxicity and/or carcinogenicity. In contrast, models such as VEGA, TOPKAT and MultiCASE give valid predictions of the presence and absence of genotoxicity or carcinogenicity, the latter being based on the similarity of a substance with the negative (non-carcinogenic) substances in their training datasets.
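The distinction between alert-only models and models that also predict absence of activity, together with the caution about overlapping training data, can be illustrated with a minimal Python sketch. The model names, the `training_source` labels and both helper functions are hypothetical and are not taken from any of the tools cited; the sketch only assumes that each model reports whether it flagged the substance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Prediction:
    model: str            # hypothetical model name
    alert_only: bool      # True for structural-alert (expert system) models
    positive: bool        # True if the model flagged the substance
    training_source: str  # hypothetical label for the underlying data set


def interpret(p: Prediction) -> Optional[str]:
    """Translate a raw model output into a usable call.

    An alert-only model that raises no alert gives no prediction at all,
    whereas a statistical model may give a valid negative call.
    """
    if p.positive:
        return "positive"
    return None if p.alert_only else "negative"


def independent_calls(preds: list[Prediction]) -> dict[str, set[str]]:
    """Group usable calls by training data source.

    Concordant calls from models built on the same data are collapsed,
    so they count as one line of evidence rather than several.
    """
    calls: dict[str, set[str]] = {}
    for p in preds:
        call = interpret(p)
        if call is not None:
            calls.setdefault(p.training_source, set()).add(call)
    return calls


example = [
    Prediction("alert_model_A", True, False, "dataset_1"),
    Prediction("stat_model_B", False, True, "dataset_1"),
    Prediction("stat_model_C", False, True, "dataset_1"),
    Prediction("stat_model_D", False, False, "dataset_2"),
]
print(independent_calls(example))
```

In this illustration the two concordant statistical models that share a data source contribute a single line of evidence, and the silent alert model contributes none.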
With the large diversity of chemicals that might interfere with the NGTxC mechanisms and the many potential molecular targets, the establishment of a single (Q)SAR for NGTxC is an unrealistic goal and the diversity makes development of (Q)SAR models for identifying NGTxC a challenging process. Furthermore, NGTxC typical characteristics, such as receptor binding, practice. Will they indicate their substance as being an NGTxC on the basis of an IATA primarily based on in silico and in vitro data? Or will it still be necessary for regulators to ask for more (in vivo) information to come to definite conclusions that are essential for risk management measures? In the spirit of Toxicity Testing in the 21st Century (NAS, 2007), an additional ultimate goal will be for the IATA to eventually move away from animal testing altogether, as the KE, KERs and test methods that could address those KEs are increasingly identified, confirmed and validated.
It will be essential that the IATA is developed in such a way that in most cases regulatory decisions can be taken without the need of additional animal testing. Otherwise regulatory acceptance will be limited due to concern for false negatives on one hand or concern for false positives on the other hand, such that gains in the 3Rs as well as gains in regulatory testing and assessment throughput will be minimal. Expectations for the time frame necessary to develop such an IATA vary between experts and in the end depend upon resources that are made available. The time frame will also be influenced by how far the IATA can accommodate defined testing approaches for specified chemical spaces (the "chemical applicability domains" referred to in the previous section), for more robustly characterized mechanisms of action compared to less well characterized mechanisms, and for categorization and read-across based on a larger database including other animal data or for specific regulatory areas, e.g., those without legal animal data requirements. In any case, scrutinizing the reliability and the relevance and potential added value of the in vivo testing standards (Gottmann et al., 2001; Alden et al., 2011; Basketter et al., 2012; Marone et al., 2014) should contribute to the development of robust testing strategies (Paparella et al., 2013, 2016).

Levels of test information as a preliminary step towards the IATA
A structured approach to building an IATA for NGTxC could be usefully organized into different levels of information, and the AOP concept provides a suitable starting point for creating such an information level framework. Table 6 provides a summary overview of such a structure, as described below, and Figure 2 illustrates how such levels could fit conceptually into an IATA for NGTxC.
Level 0 would cover pre-screening of existing information, category formation, read-across and (Q)SARs, also indicated in Figure 2 as part of the AOP information. Subsequent levels would continue to incorporate AOP level concepts, such that: Level 1 would be at a subcellular level of very early KE and include for example in vitro assays such as receptor binding assays that indicate the MIE; Level 2 would be at the cellular level, also in vitro, and include both MIEs such as receptor binding and initial KE such as DNA activation, enzyme activation and other further downstream KE (such as disturbance of metabolism and key event relationships); Level 3 would be at the organ level, and so include ex vivo assays and in vivo screening assays, developing KER further, whilst Level 4 would include the whole organism level, but be kept to a minimum, in keeping "applicability domain" (VEGA, TOPKAT and MultiCASE all have their own definitions), which is confusing for the inexperienced user. However, in principle, deciding on the applicability domain for a (Q)SAR model represents the same challenge as deciding on sufficient similarity of substances within read-across approaches. As outlined below, further improvements are expected in future with the integration of such non-testing approaches with in vitro approaches.
A thorough analysis of the use of in silico information in a regulatory setting shows that the number of REACH dossiers for which read-across and/or (Q)SAR was used to replace experimental evidence hovers around 30% (ECHA, 2014). Furthermore, read-across/(Q)SAR information was more frequently used for substances with lower information requirements in comparison to substances with higher information requirements (higher production volume) (ECHA, 2014). This could, in part, be explained by the fact that more (Q)SARs are available for the mechanistically simpler endpoints (irritation, sensitization, mutagenicity), which are required to be assessed at lower tonnage levels. A similar trend in the use of (Q)SAR predictions can be noted for pharmaceuticals, as the ICH has adopted the M7 guideline for assessment of potentially DNA-reactive/mutagenic impurities in pharmaceuticals (ICH, 2014). The use of WoE approaches should further enhance both the application and acceptance of in silico information in chemical safety assessment.

Acceptance of in vitro data for regulatory purposes
Current GHS criteria will not consider in vitro data on their own as adequate to classify a substance as a carcinogen, as the definition indicates it needs to be established in an intact organism, as is also the case for many regulatory frameworks with respect to endocrine disruptors, but there are signs of improvements in flexibility with respect to the term "intact organism." This is also valid for the conclusion as to whether a substance is not an NGTxC. Thus NGTxC determination for a chemical can as yet not be concluded solely on the basis of a single mechanistic in vitro test.
In the development of an IATA for NGTxC, it therefore will be important to integrate more than one "adversity" endpoint, so that there is more consistent evidence of adversity. For example, if a chemical triggers senescence by modulating telomerase, it should be tested for genetic instability. In this context, the CTA also could be considered to provide the endpoint related to morphological transformation. As another example, if a chemical blocks the gap junctions or induces oxidative stress, is this mechanism related to morphological transformation? Or does it occur independently? If both morphological transformation and oxidative stress are observed, the concentrations that are effective for each endpoint may inform as to a possible link, or not.
The development of test case study examples may help in assisting with such evaluations for practical purposes and application to real life risk assessment scenarios. Particular attention will need to be paid to how this will be used by stakeholders in results already obtained. This involves an integrative assessment before each level is considered completed. Probabilistic approaches are increasingly considered to reflect more realistic data interpretation (Jaworska and Hoffmann, 2010; Jaworska et al., 2011; Paparella et al., 2013; Rovida et al., 2015).
To further ensure that the key MoA(s) are identified, and not missed, all MoAs would need to be tested when one moves from one block of MoA tests to the next, if negative in the first block tested.That is, testing one MoA will not exclude all other MoAs.
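The rule that a negative result in one MoA block does not exclude the other MoAs can be written down as a short Python sketch. The block names and the three-valued result convention (`True`, `False`, `None` for untested) are assumptions made for illustration only; they are not decision rules proposed in this paper.

```python
from typing import Optional


def battery_conclusion(block_results: dict[str, Optional[bool]]) -> str:
    """Summarize a battery of MoA test blocks.

    True  = positive signal in that block
    False = negative in that block
    None  = block not yet tested
    """
    results = list(block_results.values())
    if any(r is True for r in results):
        # A single mechanistic positive is a signal for follow-up,
        # not by itself a conclusion that the substance is an NGTxC.
        return "positive signal: follow up at the next level"
    if any(r is None for r in results):
        # A negative in one block says nothing about untested MoAs.
        return "inconclusive: untested MoA blocks remain"
    return "negative in all tested MoA blocks"


# Hypothetical block names, for illustration only.
partial = {
    "receptor-mediated": False,
    "oxidative stress": None,
    "gap junction inhibition": None,
}
print(battery_conclusion(partial))
```

A substance is only reported as negative once every block in the battery has been tested and found negative.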

Chemical examples
From the list of chemicals that are considered as NGTx hepatocarcinogens by IARC (Tab. 5), with the Table 1 overview of key NGTxC mechanisms and the liver tumor model shown in Figure 3, it may be helpful to consider actual chemical examples that demonstrate how the results from in vitro tests at different levels may contribute to overcome the uncertainties related to in vivo results.
These examples all have their limitations and are given here not to draw any conclusions but to emphasize the limitations of the current approaches in the evaluation of the carcinogenic potential of NGTxC (including the role of peroxisome proliferation), and to provide a starting point for discussion and evolution of the concept. A good first chemical example, DEHP, can be given to start to suggest minimum information requirements for each level. This example is also interesting in that it highlights the fact that carcinogenic information generated in vivo can in some cases be considered equivocal, species specific and may be better detected with key event based testing. This is because the earlier levels, before Level 4, provide the key MoA information, whilst the key MoA information often is obscured in Level 4 in vivo data. DEHP is negative in GTx assays, but is an initiator in initiation/promotion studies and is considered to be a hepatocarcinogen in rodents (IARC, 2013). However, the carcinogenic effect in rodents is recognized to be species specific and strictly with Toxicity Testing in the 21st Century (NAS, 2007) and only be performed if the earlier levels did not provide sufficiently strong weight of evidence as required by a regulator/regulatory jurisdiction.
In some cases an assay or diagnostic tool might straddle two levels, for example docking studies with chemicals and receptors/P450 can provide very specific molecular mechanisms for an MIE and belong in Level 0 ((Q)SAR information) and Level 1. The CTAs also might belong to two levels: Level 2 and Level 3. This is because while the cytoskeleton changes are strictly related to the acquisition of the malignant phenotype in the CTA, cloning of embryonic Syrian hamster cells leads to colonies that may acquire a transformed phenotype when exposed to chemical carcinogens. The morphologically transformed colonies are characterized by disorganized growth patterns which mimic early stages in the carcinogenic process (OECD, 2015a). Thus, in combination with toxicogenomics profiling highlighting the CTA mechanisms, the assay could in the future be considered a Level 3 assay, as the change in cytoskeleton may be used as a hallmark of the cancer microenvironment. This would be in keeping with the OECD recommendation (OECD, 2015a) that "When SHE CTA results are used as part of a testing strategy (not as results from a stand-alone assay) and/or in a weight of evidence approach, they may contribute to the assessment of carcinogenic potential of test chemicals (Creton et al., 2012)." The resolution of the different levels (molecular/subcellular, cellular, tissue/organ, organism) may need to be higher, depending upon the purpose of the IATA, and this will need more discussion.
Such a structured approach for the scoping of the IATA is necessary, whether it is being developed for prioritization purposes for further testing or the purpose is for hazard identification/characterization to indicate whether a substance is an NGTxC or not, for subsequent quantitative risk assessment purposes, or both.
The structure can then also assist in the grouping of different types of tests and assays that can be used in identification of NGTxC. At each level the decision on the next required test to be done would depend on the available information and test Whole organism level only to be performed if the earlier levels did not provide a sufficiently strong weight of evidence required by a regulator/regulatory jurisdiction however, this mechanism was considered not relevant to humans, and the carcinogenic activity of atrazine was questioned on the basis of biological plausibility. More recent reports highlight the transgenerational effect of atrazine exposure (Hovey et al., 2011), resembling the next generational behavior of endocrine disruptors, such as diethylstilbestrol (DES), via a mechanism that could be of relevance to humans, but for which there is no evidence in the standard cancer bioassay. Atrazine and atrazine metabolites have MIEs acting via early KEs that include steroidogenesis/aromatase, CYP 1A2 induction and binding to G protein-coupled estrogen receptor 1 (GPER). Level 2 KEs include ERK (extracellular signal-regulated kinases) activation, at Level 3 the effect observed is delayed mammary gland development (initiated in utero), and at Level 4 in rats, mammary gland tumors are reported (Tab. 7). The MoA for mammary gland tumors in the rat is species and strain specific (Simpkins et al., 2011) due to differences in the modulation of gonadotrophin releasing hormone (GnRH) pulse and impact upon the release of luteinizing hormone with alteration of ovulatory cycles, which are markedly different and not of human relevance, as recently concluded by the Risk Assessment Committee at ECHA (ECHA, 2015).
The third example, captafol, a pesticide, is not considered genotoxic, but displays both hepatic and renal carcinogenicity (Rakitsky et al., 2000; IARC, 1991). However, most NGTxC that are renal carcinogens are negative in a standard cancer bioassay, and thus captafol is an example of an NGTx carcinogen related to the peroxisome proliferation in the liver of treated animals (Melnick, 2001), and this is not considered relevant to humans. Recently, it has been reported to induce liver tumors in PPARα-null mice and a different mechanism has been hypothesized, which includes the activation of the human CAR, a constitutively expressed xenobiotic receptor that plays a role in liver cancer induced by phenobarbital, in structure activity relationships (Zhang et al., 2015) and in vivo (Lv et al., 2015). The potential for ER dependent and ER independent modulations also has been shown recently in two cell lines (Tanay Das et al., 2014).
DEHP can be shown to have the following succession of key events: Level 1, MIE/KEs: receptor binding with PPARα and CAR (see Tab. 1), thus two MIEs, followed by KEs also at the subcellular level: induction of CYP 4A and CYP 2B, respectively (see also Fig. 3); Level 2, KEs: inhibition of apoptosis, peroxisome proliferation, inhibition of gap junctional intercellular communication; Level 3, KEs: enlarged liver due to peroxisome proliferation; and Level 4, adverse outcome (AO): hepatocarcinoma (see Tab. 7).
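The succession of DEHP key events listed above can be held in a simple level-indexed structure, which is one way such information might be organized for an IATA. The following Python sketch is illustrative only: the class and function names are hypothetical, and the entries simply restate the events given in the text.

```python
from dataclasses import dataclass, field


@dataclass
class LevelEvidence:
    level: int    # IATA information level (1 to 4 here)
    scope: str    # biological scale covered by the level
    key_events: list[str] = field(default_factory=list)


# Key events for DEHP as listed in the text (Levels 1 to 4).
DEHP = [
    LevelEvidence(1, "subcellular (MIE/KE)", [
        "PPARα receptor binding",
        "CAR receptor binding",
        "CYP 4A induction",
        "CYP 2B induction",
    ]),
    LevelEvidence(2, "cellular (KE)", [
        "inhibition of apoptosis",
        "peroxisome proliferation",
        "inhibition of gap junctional intercellular communication",
    ]),
    LevelEvidence(3, "organ (KE)", [
        "enlarged liver due to peroxisome proliferation",
    ]),
    LevelEvidence(4, "whole organism (AO)", [
        "hepatocarcinoma",
    ]),
]


def events_per_level(profile: list[LevelEvidence]) -> dict[int, int]:
    """Count the recorded key events at each level."""
    return {entry.level: len(entry.key_events) for entry in profile}


def levels_below_in_vivo(profile: list[LevelEvidence]) -> list[int]:
    """Levels with evidence that do not require whole-organism testing."""
    return [e.level for e in profile if e.level < 4 and e.key_events]


print(events_per_level(DEHP))
print(levels_below_in_vivo(DEHP))
```

Organizing the evidence this way makes it easy to see how much MoA information is already available before Level 4.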
Another example, atrazine, is the prototype of a class of chemicals (triazines) sharing the same MoA supporting NGTxC. Triazines represent the first example for which combinations of structure-activity considerations and relevant biological and molecular events were used to support a MoA regulatory approach to assess adverse effects (US EPA, 2002). In the following years,
*Note that with the building of the IATA it may be possible that Level 4 in vivo assay information will no longer need to be generated.

Level 2/3
Inhibition of apoptosis in cultured human hepatocytes (Goll et al., 1999) and SHE cells (Maire et al., 2005; Pant et al., 2010)
ERK activation as the consequence of atrazine binding to GPER (Albanito et al., 2015), aromatase induction (Albanito et al., 2015; Quignot et al., 2012; Sanderson et al., 2000)
In vitro cell transformation assays (Perocco et al., 1995)
Level 1
Induction of CYP 2B10 associated with CAR activation (Ren et al., 2010)
In vitro liver microsomal CYP 1A2 induction (Lang et al., 1996, 1997)
Assessment of effects on CYP 1A1 and CYP 2B (Rahden-Staron et al., 2001)
Level 0
(Q)SAR and structure activity information (Benigni, 2012; Serafimova et al., 2007; Zhang et al., 2015)
SAR and grouping based upon mammary gland tumor induction by various s-triazine compounds in rats (US EPA, 2002)
Existing literature data (Rakitsky et al., 2000; NTP, 2008, 2011; IARC, 1991)

In this context it will be essential to characterize qualitatively and, as far as possible, quantitatively the reliability and relevance of the established animal testing methods as well as the in vitro/in silico test methods. We need to take stock of the performance of the actual methods in order to define a benchmark that new approaches should overcome. Furthermore the animal test data often serve as reference for the validation of new methods. The weight these animal test data should have within a validation compared to human data and mechanistic information shall depend on the reliability and relevance of all these sources of information for the target of evaluation, i.e., human health. It will be important to conduct thorough and step by step transparent uncertainty analyses as part of the WoE approach within the IATA, particularly for potentially sensitive conclusions on positively and negatively identified substances. In this way it is possible to reduce uncertainty and remove controversy, so that a scientifically robust decision that is acceptable to regulatory authorities can be made.

In any case, Level 1, 2 and 3 assays (see below) will require validation to the extent that definitive decisions including the derivation of acceptable exposure levels can be made.

Level 0: Existing information and in silico approaches
Literature and in silico MoA review information (using for example chemical structure information and prior published information) would guide the selection of the most relevant test block in which to first initiate testing the substance. Whilst it is too early to devise any precise decision rules at this stage, it is recognized that in a final IATA, decision rules or workflows would be necessary, together with the considerations described in the earlier Section 5.4. In summary, when selecting and interrogating such computational tools, a high level of attention needs to be paid to chemistry and biological endpoint data quality and data cleaning considerations, appropriate selection and use of descriptors and statistics. The use of (Q)SAR models, expert systems, category formation tools, as well as the interpretation of the results, requires expert knowledge, because each of these tools has its own level of reliability and chemical applicability domain limitations.

Level 1: Subcellular level and Level 2: Cellular level
A number of mechanisms have been linked to NGTxC. For some of these mechanisms there is an assay that can in principle be taken to the OECD Test Guideline Programme and be developed into an internationally accepted test method. So far this has only taken place for endocrine MIEs, such as estrogen receptor binding and transactivation, and steroidogenesis. In addition, the ToxCast program offers a plethora of high-throughput tests that may be very useful as Level 1/2 tests (Judson et al., 2014), although the readiness of these assays to be developed into an internationally accepted test method needs assessment.
As already noted, many of these mechanisms may be initial steps in the NGTxC processes, but their initiation does not mean that can potentially be classified as a false negative under current regulatory testing paradigms. Furthermore, it is clearly carcinogenic only to mouse, inducing liver hepatocarcinomas in B6C3F1 mice, hemangiosarcomas in male CD-1 mice, lymphosarcoma in both sexes, and Harderian gland adenoma in males (NTP, 2011, 2014). In rats, captafol also induces renal adenomas and carcinomas, but the incidence is statistically significant in males only when the adenomas and carcinomas are combined and when considering the positive trend at the highest assayed dose after applying the Cochran-Armitage test (or equivalent) for the trends to the combination. In this example, the carcinogenic endpoint effect therefore exists, but the standard RCB method is not sensitive enough, on its own, to detect it, probably due to the NGTxC mechanism. Captafol also induces cell transformation in vitro. Most recent reports consider captafol as both GTx and NGTxC (NTP, 2008, 2011). The NGTxC activity was revealed in initiation/promotion studies, but the possible NGTxC mechanisms have not been fully explained.
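For readers unfamiliar with the Cochran-Armitage trend test mentioned above, the following Python sketch computes one common, uncorrected form of the statistic for tumor incidence across ordered dose groups. The dose scores and tumor counts are invented for illustration and are not captafol data; regulatory analyses typically add refinements (e.g., continuity or survival adjustments) that are not shown here.

```python
import math


def cochran_armitage_trend(scores, n_animals, n_tumors):
    """Uncorrected Cochran-Armitage test for a linear trend in proportions.

    scores    : dose scores for each group (e.g., 0, 1, 2, 3)
    n_animals : number of animals per group
    n_tumors  : number of tumor-bearing animals per group
    Returns (z, one_sided_p) for an increasing trend.
    """
    total_n = sum(n_animals)
    total_r = sum(n_tumors)
    p_bar = total_r / total_n  # pooled tumor incidence

    # Trend statistic: score-weighted deviation from pooled incidence.
    t = sum(d * (r - n * p_bar)
            for d, n, r in zip(scores, n_animals, n_tumors))

    # Variance of the statistic under the null hypothesis of no trend.
    sum_nd = sum(n * d for d, n in zip(scores, n_animals))
    sum_nd2 = sum(n * d * d for d, n in zip(scores, n_animals))
    var = p_bar * (1 - p_bar) * (sum_nd2 - sum_nd ** 2 / total_n)
    if var <= 0:
        raise ValueError("variance is zero: no tumors, all tumors, "
                         "or identical dose scores")

    z = t / math.sqrt(var)
    # Upper-tail normal probability, P(Z > z).
    one_sided_p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, one_sided_p


# Hypothetical counts: adenomas and carcinomas combined, 50 animals/group.
z, p = cochran_armitage_trend([0, 1, 2, 3], [50, 50, 50, 50], [0, 1, 3, 6])
print(f"z = {z:.3f}, one-sided p = {p:.4f}")
```

Because the test pools information across all dose groups, it can detect a dose-related increase even when no single group differs significantly from the control in a pairwise comparison.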
7 How could an IATA for NGTxC start to take shape?
Both quantitative and qualitative AOP/MoA IATA elements will be required and can be based upon the essential conceptual steps of the carcinogenic process, which include initiation (such as the early KE in Tab. 1), promotion (differential stimulation, inhibition or toxicity, as for example with cell proliferation and gap junction intercellular communication), transformation (from benign to malignant), neoangiogenesis, and progression with pathogenic angiogenesis and neoangiogenesis, as shown in the steps in Figures 3, 4 and 5 and presented conceptually in Figure 2.
As already briefly indicated in Section 5.6, to further ensure that the key MoA is identified and not missed, all MoAs would need to be tested for, moving from one block of MoA tests to the next if the substance is negative in the first block tested. When we consider the possible mechanisms through which NGTxC affect the carcinogenesis process, it is important to appreciate that not all of these mechanisms are always related to adversity. For example, receptor binding and transactivation may not lead to consequential adverse effects at all. As these early MIE/KE often do not result in downstream adverse outcomes, an IATA will need to examine the specific initiating events for each receptor, and then look at the downstream KE that is/are related to all receptors with a pivotal role in the adverse tissue/target organ outcome.
When we can see that several mechanisms are interconnected and are being affected adversely, then we will start to be in a position to make IATA-based decisions. For example, oxidative stress, cell death and immune system evasion are three cancer hallmarks that are closely interconnected (they share several molecular targets and pathways). Three examples of chemical case studies specific to tumor models are given in Table 7.
In this way, the intention would be to overcome current issues of insufficient applicability, scope and downstream relevance. It will be important to have a conservative approach to such tiered test information, and while keeping false positives to a minimum
it may still need to be tested in others before that substance can move to Level 3 and then 4, i.e., definitive in vivo testing. This is because NGTxC may have multiple MoAs, as indicated in Table 1 and references therein. With multiple MoAs, testing only one MoA group is not sufficient, not least because it may not necessarily be the most sensitive one, and so a battery of in silico/in vitro assays or in vitro assays covering the entire spectrum would be preferable. A high-throughput (HTP) setting could potentially expedite this. Moreover, testing a wide range of MoAs will be preferable for making the decision as to whether or not additional testing is required. An additional advantage is that it will yield knowledge on adverse outcome and toxicological pathways in general, not only on NGTx carcinogenesis. It is important to note that quantitative information, such as dose response relationships and points of departure (POD), will be required in order to know whether a particular KE will trigger the next KE; single dose data points cannot contribute at the qualitative level of a more comprehensive IATA format.

Level 3: Multicellular tissue and organ level
This is the pivotal level at which the cellular changes are sufficient to trigger cytoskeletal, tissue and organ changes, and angiogenesis. This level includes in vitro tests such as the CTAs, which may also be at Level 2: the SHE assay can be utilized for the first steps of the multistep carcinogenic process, whilst the Balb 3T3 and C3H10T1/2 CTAs are designed to address the later steps of a carcinogenic process. At this level, 3D liver cell models (Hengstler et al., 2015; Prestigiacomo and Suter-Dick, 2015; Ramachandran et al., 2015; van Grunsven, 2015) and isolated organ studies, such as tissue perfusion and histopathology from in vivo repeated dose studies, would also be relevant.
Pathogenic angiogenesis and neoangiogenesis are the later multistep cancer processes following on from endothelial cell activation in response to angiogenic factors. This leads to the re-organization of endothelial cells to form tubules, which interconnect to form a network by way of: 1. Degradation of the capillary wall by extracellular proteinases; 2. Migratory signals; 3. Interconnection of the new tubules to form a network (anastomosis). Some assays at Level 2 will be highly supportive in addressing these three network aspects, and HTP assays for primary angiogenic pathway molecular targets are available (Tab. 1).
Organ-on-chip technologies are also being explored for applicability in this respect (Marx et al., 2012), but are currently being developed more for drug discovery applications (Bhatia and Ingber, 2014), and the translation from pharmaceutical research and safety assessment to applied chemical hazard assessment has often taken longer and been more problematic than initially envisaged.

Potential regulatory use of Level 2/3 assays
Positive histopathology findings such as hyperplasia or neoplasia, and neoangiogenesis studies do not have a clear mechanistic basis. While they provide useful descriptive data, they are not
that there are automatically relevant downstream carcinogenic events, unless this has been clearly demonstrated in the literature. Moreover, not all the mechanisms may be related unequivocally (or to the same extent) to adversity (as described above in Section 5.2).
Furthermore, with the exception of "inhibition of gap junction intercellular communication" and "inhibition of senescence through activation of telomerase," a number of the mechanisms/endpoints listed are not specific to NGTxC: oxidative stress, increased mitogenesis, interference with tubulin polymerization and so on are also mechanisms for genotoxic carcinogens, and where negative results are recorded in the mutagenicity/genotoxicity assays, these are acceptable for mutagenesis/genotoxicity regulatory purposes. For NGTxC mechanisms, the IATA will need to both address and be negative for each principal mechanism, should the IATA be required to indicate that a substance is not a NGTxC. However, if the IATA is being used solely for prioritization for in vivo testing, it will not need to be so rigorous in this regard.
A more holistic approach to creating relevant testing sequences or batteries would mean reducing the focus on the MIEs in Levels 1 and 2, and concentrating instead on the cellular and tissue events that are pivotal for NGTx cancer outcomes. There are several examples of such test developments in medical research that could be considered for adaptation. Key cellular properties that tumor cells successively accumulate, and that should be included in such test developments, are as follows:
- Disturbed regulation of growth (balance between cell replication and cell death)
- Genetic instability (disturbed DNA repair, quick establishment of mutations = "mutator phenotype")
- Control of micro-environment (establishment of tumor-tissue, angiogenesis)
- Cellular senescence (immortality of tumor cells, expression of telomerase)
- Metastases (migration, intra- and extravasation, survival outside of original tissue)
This approach could be complemented by toxicogenomic approaches using in vitro test systems to recognize and group chemicals according to specific MoA, as demonstrated by Schaap et al. (2012, 2015).

Potential regulatory use of Level 1 and 2 assays
From today's perspective, once relevant methods are developed for NGTxC, the mechanistically based in vitro assays may be used for detecting positives. At Level 1 (subcellular), it has been proposed to use sequential testing on the basis of mechanistic and toxicokinetic understanding for the chemical structure for initial hazard identification (OECD, 2015a) (note that this does not refer to formal REACH Classification and Labelling), and sequential testing is clearly preferred to battery testing by industry for pragmatic reasons such as time and cost.
Assays can be grouped together as key stages of the particular NGTx MoA. If a substance is positive for one MoA group, then
type of information, if it is not already in a regulatory decision. On the basis of this new information a member state could propose classification and labelling for the substance and/or could start other processes but, as already indicated, such requests are rare. It is therefore anticipated that use of this type of IATA will not increase the number of in vivo studies. In any case, e.g., in the REACH Substance Evaluation, any additional standard and non-standard methods may be required if they appear to be critical for a decision.
Human epidemiological evidence should be utilized where available for existing chemicals/substances, but of course this is unlikely to be the case for new chemicals, except when it may be possible to extrapolate by read-across and/or grouping, and is also supported by Level 0/1, 2 and 3 data.
The challenge will be how to integrate the differing pieces of information and how to ascertain the minimum information requirements. For negatives, to ensure that they truly are negative for agreed key event NGTxC mechanisms, one will need to reach a conclusion only after screening has been conducted in all the Level 2 tests in the IATA, using each test result to make an informed decision as to which test to conduct next, such that it will contribute the most pertinent information. Thus the list of Level 2 tests will need to be 1) sufficiently comprehensive in coverage of the hallmarks of cancer and 2) predictive of the test endpoints, and 3) the information will need to be integrated to guarantee true across-the-board negative results for NGTxC. The conclusion might also be determined in conjunction with a Level 3 test that includes initiation and promotion, such that the protocol takes into consideration effects at the promotion stage and KERs leading to adversity. When integrating a probabilistic Bayesian approach into the decision tree of the IATA, decisions as to where the confidence is sufficient to make a regulatory decision will require a consensus approach.
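As a purely illustrative sketch of how a probabilistic Bayesian element might sit within such a decision tree, the Python fragment below updates the probability that a substance is an NGTxC after each binary test result. Everything numerical here is an assumption: the prior, the sensitivity and specificity of each test and the confidence thresholds are placeholders, the function names are hypothetical, and the tests are treated as conditionally independent, which real assays sharing a MoA may not be. The thresholds in particular stand in for values that, as noted above, would have to be agreed by consensus.

```python
def update_probability(prior, positive, sensitivity, specificity):
    """Apply Bayes' rule to one binary test result."""
    if positive:
        like_ngtxc, like_not = sensitivity, 1.0 - specificity
    else:
        like_ngtxc, like_not = 1.0 - sensitivity, specificity
    numerator = like_ngtxc * prior
    return numerator / (numerator + like_not * (1.0 - prior))


def posterior_after_tests(prior, results):
    """Chain the update over (positive, sensitivity, specificity) tuples.

    Assumes the tests are conditionally independent given NGTxC status.
    """
    probability = prior
    for positive, sensitivity, specificity in results:
        probability = update_probability(probability, positive,
                                         sensitivity, specificity)
    return probability


def decision(probability, lower=0.05, upper=0.95):
    """Map a posterior to an action; thresholds are placeholders only."""
    if probability >= upper:
        return "sufficient confidence: positive"
    if probability <= lower:
        return "sufficient confidence: negative"
    return "continue testing"


# Hypothetical sequence: one positive result, then one negative result
example_results = [(True, 0.8, 0.9), (False, 0.7, 0.8)]
example_posterior = posterior_after_tests(0.5, example_results)
example_decision = decision(example_posterior)
```

In this invented example the two conflicting results leave the posterior between the placeholder thresholds, so the sketch returns "continue testing"; this mirrors the point that a negative conclusion needs coverage across the Level 2 tests rather than a single result.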

Future long-term goals
Following through with the vision expressed in Toxicity Testing in the 21st Century (NAS, 2007), in the future information derived from test methods and assays performed in silico and in vitro (i.e., Level 0/1/2/3) could be sufficiently robust for decision making. One long term vision may be to define adversity on a cellular level, translate the respective in vitro BMD to a corresponding in vivo dose by kinetic modelling, and classify according to potency. Classification essentially could be based on adverse cellular effects and respective in vivo potency estimates; indeed this may lead to proposals for major rearrangements of GHS classification. An important starting point for developing such an approach would be an assessment of the validation basis for current standard animal tests and the uncertainty inherent in defining adverse effects on standard "apical" animal testing endpoints compared with the uncertainties that could be considered acceptable when defining adversity on a cellular level. On such a basis, resources could then be targeted towards the TG development of the selected in vitro tests.
sufficient in the evaluation of the carcinogenicity of a substance. However, taken together with other, more mechanistically based data and in a WoE approach, this information may be useful and increase the relevance in the evaluation of carcinogenicity. For data rich pharmaceuticals, the absence of histopathological data combined with information on mutagenicity and hormonal activity has been shown to correlate well with non-carcinogens (Sistare et al., 2011), and this could also be the case for chemicals.
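The long term vision of translating an in vitro BMD to a corresponding in vivo dose by kinetic modelling can be sketched, under strong simplifying assumptions, as a steady-state reverse dosimetry calculation. The Python fragment below assumes linear kinetics and a one-compartment model at steady state; the function names, the clearance, molecular weight and BMD values are hypothetical placeholders and do not come from the source or from any real substance.

```python
def steady_state_conc_uM(dose_mg_per_kg_day, clearance_L_per_kg_day,
                         mol_weight_g_per_mol, fraction_absorbed=1.0):
    """Steady-state plasma concentration (micromolar) for a constant daily dose.

    Assumes a one-compartment model with linear (dose-proportional) kinetics.
    """
    conc_mg_per_L = dose_mg_per_kg_day * fraction_absorbed / clearance_L_per_kg_day
    # mg/L divided by g/mol gives mmol/L; multiply by 1000 for micromol/L
    return conc_mg_per_L * 1000.0 / mol_weight_g_per_mol


def oral_equivalent_dose(bmd_in_vitro_uM, css_uM_at_unit_dose):
    """Daily dose (mg/kg/day) whose steady-state concentration equals the BMD.

    css_uM_at_unit_dose is the steady-state concentration at 1 mg/kg/day;
    linear kinetics are assumed so that concentration scales with dose.
    """
    return bmd_in_vitro_uM / css_uM_at_unit_dose


# Hypothetical substance: all values invented for illustration
css_unit = steady_state_conc_uM(dose_mg_per_kg_day=1.0,
                                clearance_L_per_kg_day=5.0,
                                mol_weight_g_per_mol=250.0)
in_vivo_dose = oral_equivalent_dose(bmd_in_vitro_uM=4.0,
                                    css_uM_at_unit_dose=css_unit)
```

For these invented inputs the steady-state concentration at 1 mg/kg/day is 0.8 micromolar, so an in vitro BMD of 4 micromolar corresponds to an estimated in vivo dose of 5 mg/kg/day. Any potency-based classification built on such estimates would additionally need agreed cut-off values, which are not proposed here.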
Under the REACH Regulation, the sub-chronic toxicity study is a standard information requirement above 100 tonnes per annum (tpa) and therefore relevant histopathological findings can be observed, when present, and used in the evaluation of carcinogenicity. In the development of the individual test methods and IATA, the aim is to cover the hazard characterization with Level 1, 2 and in vitro Level 3 tests as often as possible. Note that the concept of this testing does not imply that only in vivo Level 3/4 tests can give definitive or conclusive test results for regulatory purposes. At present, Level 2 and in vitro Level 3 assays, as shown in Table 1, may be able to meet regulatory information requirements from the information available from the MIE together with additional WoE information, e.g., also in the context of category formation and read-across according to REACH Annex XI, whereas in vivo Level 3 studies would be considered sufficient for Classification and Labelling purposes. Thus, in vivo Level 3 information may not necessarily be needed to complete the decision making process.

Level 4: Organism level
Examples of Level 4 assays from which information relevant for an IATA for NGTxC may be derived include:
- assays with transgenic rodents
- the rodent carcinogenicity bioassay
- chronic toxicity studies (provided that the histopathology findings give sufficient evidence of carcinogenicity of human relevance)

Actual regulatory use of Level 4 assays
Currently, information needs to be obtained from whole organisms if insight is required regarding the adverse effects level, dose response curve and the type of tumors in the tested species (which is essential for the GHS/Classification, Labelling and Packaging (CLP) regulatory requirements for chemicals).
Positive evidence from a definitive in vivo study, with no controversy about human relevance, can normally be used for risk assessment and for CLP, depending upon the regulatory context and process. For instance, for REACH it is essential that a substance is classified before certain risk management procedures come into effect. However, Level 4 data will often not be available for industrial chemicals. Therefore, the IATA should advise the registrants and authorities on making an informed decision as to when Level 4 studies should be conducted. Some non-conclusive evidence ranging from mechanistic/MoA in vitro tests (Tab. 1) to repeated dose studies might lead to testing the substance at Level 4. The REACH Substance Evaluation, for example, enables the request for this
place, and it is the construction of an IATA, under the auspices of the OECD, that will be a primary way to address this crucial gap internationally.

Conclusions and future perspectives
NGTxC contribute to cancer risk by a variety of mechanisms, which are not yet included in international chemical regulatory approaches. Whilst there have been valuable efforts at the international level to validate in vitro assays proposed to start to address this important discrepancy, these have not been wholly successful for a number of reasons, including a lack of mechanistic understanding and difficulties in defining how to meaningfully apply individual in vitro tests, such as the Cell Transformation Assays, for NGTxC assessment.
To address this need, an integrated approach to the testing and assessment of NGTxC is beginning to be developed internationally at the OECD, and here we have explored considerations as to what could be the basis for the development of an IATA for NGTxC, with a preliminary organization of the major mechanisms of NGTxC into blocks, identifying related mechanisms, and identifying relevant in vitro assays and markers. The follow-up work will present conceptual ways forward, looking in much more detail at how this approach could apply to a new chemical, such that (long-term) in vivo toxicity studies are no longer needed. It is acknowledged that concluding without animal studies may not be possible quite yet. However, if all earlier level assays give a red flag, then does there need to be a "societal debate" on finally conducting an animal study? Or is it enough to make the decision based on the red flags already provided? This is a major gap that needs exploring, and it is likely that further examples will also be drawn from the experience gathered from pharmaceutical safety evaluations.
In March 2016, an OECD nominated government and stakeholder expert group was established to address the development of an IATA for NGTxC, for which this paper provides the thought starter to initiate, frame and build the IATA(s). With expertise in the appropriate assay-specific knowledge, this group will fully scope, review and prioritize the assays that could be part of an IATA for NGTxC testing in a consensus fashion, together with designing a framework in which these assays would sit. Where appropriate, systematic review approaches, expert knowledge, WoE, identification of data gaps and uncertainty analyses will be utilized to facilitate consensus, to provide a solid basis for the prioritization of assays for test method development within each KE block in the agreed IATA(s) structure. Broader OECD consultation is then foreseen, with subsequent synthesis and the production of a comprehensive OECD guidance document.
Information regarding the effect level and/or dose response curve is fundamental in risk assessment. Therefore, the alternative methods ought to provide this type of information in order to be used in current human hazard and risk chemical safety assessment for NGTxC. In several regulatory frameworks, such as REACH, information on the effect level and/or dose response curve and information on the Classification and Labelling form an essential step in the risk management of chemicals. With a strong international drive to reduce animal testing and costs, it is essential that proper and robust methods for addressing NGTxC MoA are developed and put in

Fig. 2: Preliminary conceptual overview of an IATA for NGTxC
The upper left side of this figure shows the chemical independent AOP and existing information source factors (in blue) that can feed the NGTxC IATA, whilst the lower left side shows the chemical dependent factors, which include (Q)SARs, read-across and chemical categorization, exposure considerations, Absorption, Distribution, Metabolism and Excretion (ADME) factors, TG generated data and other possible factors (in purple). The right hand side of the figure depicts the suggested levels of organization for the NGTxC IATA, with increasing levels of complexity that the hallmarks of NGTxC, shown in Figure 1, sit within, underpinned in many cases by epigenetic machinery, as indicated in the epicenter of the NGTxC hallmark wheel. The underlying blue lines with connecting nodes are symbolic of the NGTxC AOP causality network, where key events (KE) and key event relationships (KER) can be identified on the basis of the mechanisms and MoA identified in Table 1. As work on the IATA progresses, these nodes will be completed as KE, KER and relevant in vitro assays for both the IATA and as potential OECD TGs are identified and assessed for relevance and readiness.

Fig. 3: Liver multistep carcinogenesis with liver nuclear receptor - P450 transcription factors (with modification from Jacobs et al., 2003)
Hepatocellular carcinoma is strongly associated with chronic liver diseases, including chronic hepatitis and cirrhosis, which are characterized by a prolonged inflammatory condition. The P450 transcription factors can be MIEs that may ultimately contribute to liver tumor development in vivo, as shown in Figure 3.
organ and individual organism levels that occur in response to a molecular initiating event (MIE; a direct interaction of a chemical with its molecular target) leading to an adverse outcome (AO). Linkages between adjacent MIEs, KEs and AOs are described in key event relationships (KERs). For further information on OECD activities in relation to AOPs please see http://www.oecd.org/env/ehs/testing/adverse-outcome-pathways-molecular-screening-andtoxicogenomics.htm (accessed 5 April 2016). With respect to AOP information quality, an IATA can be graded as follows:
- Correlative: a simple format, where the AOPs / MoAs have only qualitative or limited quantitative understanding of one or two
Figure 3 shows major nuclear receptor - P450 transcription factors involved in xenobiotic metabolism that can be MIE, which may ultimately lead to liver tumor development in vivo. This figure depicts the natural history of tumor development and progression leading to the adverse outcome of liver cancer.

Fig. 4: Colon multistep carcinogenesis
Colon cancer can be initiated by inherited damage or mutated APC in FAP families, as a sporadic form, where APC is hypermethylated as a consequence of environmental exposures (diet, alcohol consumption) and as a consequence of inflammation in chronic diseases or inflammation stimulated by pathogenic bacteria. Inflammation also plays the main role in the early steps of sporadic tumors.

Fig. 5: Lung multistep carcinogenesis
Inflammation is the first adverse effect following exposure to environmental oxidative stressors (smoking, environmental pollutants). Inflammation is supported by the immediate immune response through the production of chemokines and activation of alveolar macrophages.
non-test information: pre-screening of existing information, category formation, read across and (Q)SARs
1 Subcellular level: in vitro very early KEs including for example in vitro assays such as receptor binding assays that indicate the MIE
2 Cellular level: in vitro includes both MIEs such as receptor binding and initial KEs such as DNA activation; enzyme activation and other further downstream key events (e.g., disturbance of metabolism, and KERs)
3 Multicellular tissue and organ level: includes in vitro, ex vivo assays and in vivo screening assays, developing KERs further
4
information currently available per level from existing data: Three chemical examples
is important, also low probability of false negatives is critical.

Tab. 1: Major mechanisms of non-genotoxic carcinogenicity, and suggested organization for IATA development
Level 1: basic molecular information: Molecular initiating event (MIE) and of lower priority
Level 2: MIE + additional level of complexity: higher priority (see the chapter on "Levels of information" below)