SpheraCosmolife: A New Tool for the Risk Assessment of Cosmetic Products

A new, freely available software for cosmetic products has been designed that considers the regulatory framework for cosmetics. The software allows an overall toxicological evaluation of cosmetic ingredients without the need for additional testing and, depending on the product type, it applies defined exposure scenarios to derive risk for consumers. It takes regulatory thresholds into account and uses either experimental values, if available, or predictions. Based on the experimental or predicted no observed adverse effect level (NOAEL), the software can define a point of departure (POD), which is used to calculate the margin of safety (MoS) of the query chemicals. The software also provides other toxicological properties, such as mutagenicity, skin sensitization, and the threshold of toxicological concern (TTC) to provide an overall evaluation of the potential chemical hazard. Predictions are calculated using in silico models implemented within the VEGA software. The full list of ingredients of a cosmetic product can be processed at the same time, at the effective concentrations in the product as given by the user. SpheraCosmolife is designed as a support tool for safety assessors of cosmetic products and can be used to prioritize the cosmetic ingredients or formulations according to their potential risk to consumers. The major novelty of the tool is that it wraps a series of models (some of them new) into a single, user-friendly software system.


Introduction
The European regulation on cosmetics represented a paradigm shift in Europe for the safety assessment of cosmetics, which transitioned from the classical toxicological approach based on animal testing towards a completely novel strategy, where the use of animals for toxicity testing is banned (EC, 2009). The European strategy has been followed by an increasing number of countries in the world¹. However, the regulation does not provide details on which alternative methods are to be used. Several ambitious projects have addressed sophisticated alternative testing strategies, such as the European initiatives SEURAT-1² (Berggren et al., 2017) and EU-ToxRisk³. [...] endpoints that must be evaluated. As a result, we built the software system SpheraCosmolife presented here.

Methods
The novel software system combines an expert system approach, which refers to the sequence of steps followed by the assessors, with machine-learning and statistical models to provide predictions. Thus, the novel system has a sound theoretical basis derived from the procedure defined by the regulators, which is currently performed manually in most cases. To demonstrate the safety of a cosmetic product prior to placing it on the market, the responsible person must ensure that it has undergone a safety assessment. Annex I of Regulation (EC) 1223/2009 defines the aspects to be evaluated within the cosmetic product safety report (EC, 2009). However, apart from hazard identification, risk assessment of cosmetic ingredients requires both hazard characterization and estimation of exposure. In fact, the assessment of the risk posed by one or more components of the cosmetic product depends on the concentration of the ingredients, and the exposure must be examined, starting from skin permeation by the substance of interest. Furthermore, the assessor needs to use different software systems to obtain the various values for the different toxicological endpoints and exposure values.
Within the EC-funded LIFE project VERMEER⁴, a strategy was drawn up to integrate several tools into a single computer software system to support safety assessors of cosmetic products. To comply with the cosmetics regulation, we planned to reproduce the procedure employed for cosmetics as closely as possible, with specific reference to the lists of already regulated ingredients, the thresholds that need to be respected, and the specific [...]

Application of the official equations and implementation of predefined exposure scenarios within the software. Introduction of a refined approach based on new models for skin permeation to estimate the internal exposure.
Automatic calculation of the POD (NOAEL) using a QSAR model. Experimental values are also in the internal database.
MoS is automatically calculated using the previous results. A decision tree is implemented to provide a TTC assessment.

Comparison consideration
Novelty of computational approaches to refine the assessment. Other toxicological endpoints will be added in future versions.
Novelty of computational approaches to refine the assessment. Automatic creation of exposure scenarios.
Novelty of computational approaches to refine the assessment. New models for NOAEL and LOAEL will be implemented in future versions. Organ-specific toxicity will be indicated.

The procedure for the safety evaluation of a cosmetic product is defined in detail by "The SCCS Notes of Guidance for the testing of cosmetic ingredients and their safety evaluation 10th revision" (SCCS, 2018). Table 1 shows the workflow that a risk assessor follows and explains how these steps have been translated into SpheraCosmolife.
The risk assessment procedure followed by the regulators, including the four main steps (hazard identification, exposure assessment, dose-response assessment, and risk characterization), is largely replicated by the software. The in silico structure of the tool is a forward-looking approach that saves the assessor considerable time and supplies a large amount of data and information to facilitate decision-making. The assessor must still evaluate the provided values and their uncertainty (also considering the remarks indicated by the software), so the software does not substitute the work of the assessor.
The software will continue to be improved by including more data and information in the future. New technical and scientific issues, for example new toxicological endpoints and evaluation approaches (e.g., read-across), will be implemented in future versions in order to better cover the evaluation procedure and offer a more complete assessment. The following steps describe the SpheraCosmolife software system structure.

Identification of the ingredients, products and exposure scenarios
The ingredients of a product are identified through the International Nomenclature of Cosmetic Ingredients (INCI) name, the Simplified Molecular Input Line Entry System (SMILES) format, or the Chemical Abstracts Service (CAS) number. The identification of the product types derives from Tables 2A, 2B and 3 of the 10th revision of the SCCS Notes of Guidance (NoG) (SCCS, 2018). The exposure scenarios are those defined in this document for each product type.

The database
A key component of the software system is its database. It was populated with data retrieved from the COSMOS Cosmetic Inventory⁵ (Worth et al., 2012) and the information in the Annexes to Regulation (EC) 1223/2009 (EC, 2009), in particular:
- Annex II: List of substances prohibited in cosmetic products
- Annex III: List of substances which cosmetic products must not contain except subject to the restrictions laid down
- Annex IV: List of colorants allowed in cosmetic products
- Annex V: List of preservatives allowed in cosmetic products
- Annex VI: List of UV filters allowed in cosmetic products
Further sources of data refer to these repositories:
- CLP harmonized classification⁶
- Safer Chemical Ingredients List (SCIL)⁷
These data sources refer to official repositories from the EU and the US Environmental Protection Agency (EPA). Overall, the database of SpheraCosmolife currently contains data related to about 5000 substances, and it is regularly updated with new and upgraded information and data. Most of these substances are cosmetic ingredients, and the following categories are particularly well represented: skin conditioning, skin protecting, surfactants, emulsifying, and perfuming. From a chemical point of view, alcohols, amines, ketones, and substances with an aliphatic chain of at least 8 carbons are well represented.
The information in the Annexes includes the maximum concentration, the product type in which the ingredient is allowed, and the wording of conditions of use and warnings. With regard to inorganic and organometallic compounds, polymers, and data related to mixtures of chemicals, SpheraCosmolife recognizes them if they are present in the database; otherwise, the software is not able to process these kinds of substances. The original structure of salts is stored in the database, so the user can use a salt as an input. However, the final assessment is done using the neutralized structure, without the cation or anion. A semi-automated data curation and quality checking workflow on the KNIME platform (Gadaleta et al., 2018) and expert-based knowledge were used to retrieve neutralized structures in the database. However, for substances not in the database, the user should insert the structure of the neutral substance, without ions.

Risk prediction models
The software evaluates a number of properties as described below. For these properties, the source of the data is the VEGA platform⁸, and information on a specific property can be found in the description of the in silico models for the individual property, available from VEGA. New in silico models have also been developed (see below).
The SpheraCosmolife system uses several equations and in silico models. Some models estimate exposure, others predict hazard. By combining the results of these models, the SpheraCosmolife system can assess the risk associated with the ingredients of a product. If available in the system database, experimental values are used; otherwise the system provides predictions. The software can deal with multiple ingredients of a given cosmetic product, for different categories of products. Although the system treats multiple ingredients, interactions between the ingredients are not considered, since each cosmetic product is considered an individual combination of cosmetic substances, as described in the 9th revision of SCCS NoG (SCCS, 2015).

Exposure evaluation
The list of product types
The user is asked to indicate the product type, as defined in Tables 2A, 2B and 3 of the 10th revision of SCCS NoG (SCCS, 2018), to identify an exposure scenario.

Once $K_p$ is obtained, the worst-case scenario, i.e., the most conservative value, is employed to proceed with the workflow, and the software calculates the maximum flux of the substance, $J_{max}$, according to Equation 5:

$$J_{max}\ (\mathrm{mg/cm^2/h}) = K_p \times C_{water,sat} \qquad \text{(Eq. 5)}$$

where $C_{water,sat}$ is the saturated water solubility, in mg/cm³, obtained using the model implemented in the VEGA platform.
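As a minimal sketch of Equation 5, the snippet below multiplies a permeation constant by a saturated water solubility. The function names and the numeric inputs are hypothetical, and reading "most conservative" as the largest $K_p$ is an assumption made here for illustration, not a statement about the software's internal code.

```python
def worst_case_kp(*kp_values_cm_per_h: float) -> float:
    """Return the largest K_p (cm/h).

    Assumption: the 'most conservative' value is the one giving the
    highest permeation, i.e. the maximum of the available estimates.
    """
    if not kp_values_cm_per_h:
        raise ValueError("at least one K_p value is required")
    return max(kp_values_cm_per_h)


def max_flux(kp_cm_per_h: float, c_water_sat_mg_per_cm3: float) -> float:
    """Eq. 5: J_max (mg/cm^2/h) = K_p (cm/h) * C_water,sat (mg/cm^3)."""
    if kp_cm_per_h < 0 or c_water_sat_mg_per_cm3 < 0:
        raise ValueError("inputs must be non-negative")
    return kp_cm_per_h * c_water_sat_mg_per_cm3


# Hypothetical inputs: two K_p estimates and a solubility of 2.0 mg/cm^3.
kp = worst_case_kp(0.001, 0.004)
j_max = max_flux(kp, 2.0)
print(f"J_max = {j_max:.4f} mg/cm^2/h")
```

The units cancel as expected: cm/h times mg/cm³ gives mg/cm²/h.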
Once $J_{max}$ is obtained, the software applies the Kroes approach (Kroes et al., 2007) according to Equation 6, which provides the percentages of absorption (%A). The SpheraCosmolife software provides the percentages of absorption for the three cases (10, 40 or 80%), which represent low, medium and high dermal absorption, as defined by Kroes (Kroes et al., 2007; Shen et al., 2014).
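The mapping from $J_{max}$ to a default absorption percentage can be sketched as a simple threshold function. Equation 6 itself is not reproduced in this text, so the cut-offs below (0.1 and 10 µg/cm²/h) are an assumption based on the values commonly cited for the Kroes et al. (2007) approach; they are exposed as parameters so they can be replaced. The function name is hypothetical.

```python
def kroes_absorption_percent(
    j_max_mg_cm2_h: float,
    low_cut_ug_cm2_h: float = 0.1,   # assumed cut-off, not taken from this text
    high_cut_ug_cm2_h: float = 10.0,  # assumed cut-off, not taken from this text
) -> int:
    """Map J_max to a default dermal absorption of 10, 40 or 80 %."""
    if j_max_mg_cm2_h < 0:
        raise ValueError("J_max must be non-negative")
    j_max_ug = j_max_mg_cm2_h * 1000.0  # convert mg/cm^2/h to ug/cm^2/h
    if j_max_ug <= low_cut_ug_cm2_h:
        return 10   # low dermal absorption
    if j_max_ug <= high_cut_ug_cm2_h:
        return 40   # medium dermal absorption
    return 80       # high dermal absorption


# Hypothetical J_max values in mg/cm^2/h.
for j in (0.00005, 0.001, 0.05):
    print(j, kroes_absorption_percent(j))
```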

Hazard evaluation
The hazard assessment of chemicals is performed with several in silico models and profilers available on the VEGA platform.A brief description of each model is presented below.

Mutagenicity, bacterial reverse mutation test -Ames test
The consensus model available in VEGA is used for Ames mutagenicity prediction. It combines the results of four models, taking into account the reliability of each prediction for the target substance (Manganelli et al., 2018). It integrates models using both expert- and statistical-based tools, following the recommendation of the International Council for Harmonization of Technical Requirements for Pharmaceuticals for Human Use (ICH) (ICH, 2017).

Chromosomal aberration
The integrated model available in VEGA is used for chromosomal aberration. It was built with the CORAL software using SMILES-based attributes. The classification model is based on a dataset of 477 organic compounds (223 active and 254 inactive in chromosomal aberration tests). The data were collected from the Genotoxicity OASIS Database and from the Toxicity Japan MHLW, which include experimental data for chromosomal aberrations determined by in vitro testing using Chinese hamster lung (CHL) and ovary (CHO) cells, with and without S9 metabolic activation (Toropov et al., 2019).

In vitro micronucleus genotoxicity test
The model developed with the SARpy software, based on structural alerts, is used for the in vitro micronucleus genotoxicity test.

In the exposure calculation, $E_{product}$ is the calculated relative daily exposure (mg/kg bw/day), which is tabulated in the tables of the SCCS NoG (SCCS, 2018), and $C$ is the concentration of the ingredient (%).

Systemic exposure dose (SED)
SED is obtained as indicated by the SCCS NoG (SCCS, 2018), from Equation 2:

$$SED = E_{product} \times \frac{C}{100} \times \frac{DA}{100} \qquad \text{(Eq. 2)}$$

This equation takes into account the amount of the finished cosmetic product applied per day ($E_{product}$), the concentration ($C$) of the substance under study in the product category, expressed as a percentage, and the dermal absorption ($DA$), expressed as a percentage.
A tiered approach is followed to calculate the SED. SpheraCosmolife provides the SED value for three scenarios, where absorption is taken as (i) 100% (for oral and inhalation exposure), (ii) 50% (for dermal exposure, a default value defined by the SCCS NoG (SCCS, 2018)), or (iii) 10, 40 or 80% (for dermal exposure, a more accurate estimate calculated according to the Kroes approach (Kroes et al., 2007), which requires information on skin permeation, see below).
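The tiered SED calculation described above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the relative daily exposure of 100 mg/kg bw/day is a made-up input rather than a value from the SCCS tables, and the formula is the product of exposure, concentration fraction and absorption fraction as described in the text.

```python
def systemic_exposure_dose(
    e_product_mg_kg_bw_day: float, c_percent: float, da_percent: float
) -> float:
    """SED (mg/kg bw/day) = E_product * C/100 * DA/100."""
    for name, value in (("C", c_percent), ("DA", da_percent)):
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{name} must be a percentage between 0 and 100")
    return e_product_mg_kg_bw_day * (c_percent / 100.0) * (da_percent / 100.0)


def sed_tiers(
    e_product_mg_kg_bw_day: float, c_percent: float, kroes_da_percent: float
) -> dict:
    """Return the SED for the three absorption scenarios named in the text."""
    return {
        "100% (oral/inhalation)": systemic_exposure_dose(
            e_product_mg_kg_bw_day, c_percent, 100.0),
        "50% (dermal default)": systemic_exposure_dose(
            e_product_mg_kg_bw_day, c_percent, 50.0),
        "Kroes (dermal, refined)": systemic_exposure_dose(
            e_product_mg_kg_bw_day, c_percent, kroes_da_percent),
    }


# Hypothetical ingredient at 1% in a product with E_product = 100 mg/kg bw/day
# and a Kroes absorption class of 10%.
for scenario, sed in sed_tiers(100.0, 1.0, 10.0).items():
    print(f"{scenario}: SED = {sed:.3f} mg/kg bw/day")
```

Keeping the three scenarios in one dictionary mirrors the way the report presents them side by side, so the assessor can see how much the refined absorption estimate changes the dose.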

Skin permeation
Two models are used, implementing the equations described by Potts and Guy (1992) and ten Berge (ten Berge, 2009; Vecchia and Bunge, 2002), according to Equations 3 and 4, respectively. They provide the constant of permeation, $K_p$.

$$\log K_p = 0.71 \log K_{ow} - 0.0061\, MW - 2.7 \quad (\mathrm{cm/h}) \qquad \text{(Eq. 3)}$$
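Equation 3 can be evaluated directly. The sketch below covers only the Potts and Guy relation given above; the second model (Equation 4) is not reproduced in this text and is therefore not implemented. The function names and the example inputs (log Kow = 2.0, MW = 100 g/mol) are hypothetical.

```python
def log_kp_potts_guy(log_kow: float, mw_g_per_mol: float) -> float:
    """Eq. 3: log K_p = 0.71 * log K_ow - 0.0061 * MW - 2.7, with K_p in cm/h."""
    return 0.71 * log_kow - 0.0061 * mw_g_per_mol - 2.7


def kp_potts_guy(log_kow: float, mw_g_per_mol: float) -> float:
    """Permeation constant K_p in cm/h, obtained by undoing the base-10 log."""
    return 10.0 ** log_kp_potts_guy(log_kow, mw_g_per_mol)


# Hypothetical substance: log Kow = 2.0, molecular weight = 100 g/mol.
print(f"log Kp = {log_kp_potts_guy(2.0, 100.0):.2f}")
print(f"Kp = {kp_potts_guy(2.0, 100.0):.4f} cm/h")
```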

No observed adverse effect level (NOAEL)
The NOAEL model available in VEGA, which was built with the CORAL software using SMILES attributes, was used. Repeated dose 90-day oral toxicity study data in rodents were considered to build the model. Studies with a treatment duration of 28 days were also considered; these data were divided by a factor of 3 to approximate the 90-day value, as specified in the SCCS NoG (SCCS, 2018). The dataset consists of 140 organic compounds with experimental values collected from the US EPA's Integrated Risk Information System (IRIS) database, the Hazard Evaluation Support System (HESS) and Munro databases. The regression model is based on optimal descriptors calculated by the Monte Carlo method with SMILES attributes and the graph of atomic orbitals (Toropov et al., 2015).
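The duration adjustment mentioned above (28-day values divided by 3 to approximate a 90-day value) can be sketched as a small helper. The function name and the example NOAEL are hypothetical, and only the two study durations named in the text are handled.

```python
def noael_90day_equivalent(noael_mg_kg_bw_day: float, study_days: int) -> float:
    """Return a 90-day-equivalent NOAEL (mg/kg bw/day).

    90-day values are used as they are; 28-day values are divided by 3,
    as described in the text. Other durations are rejected because the
    text gives no rule for them.
    """
    if noael_mg_kg_bw_day <= 0:
        raise ValueError("NOAEL must be positive")
    if study_days == 90:
        return noael_mg_kg_bw_day
    if study_days == 28:
        return noael_mg_kg_bw_day / 3.0
    raise ValueError("only 28-day and 90-day studies are covered by this sketch")


# Hypothetical 28-day NOAEL of 300 mg/kg bw/day.
print(noael_90day_equivalent(300.0, 28))
```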

Risk characterization
As the last step, the SpheraCosmolife software carries out the risk characterization of chemicals or formulations following the process defined within the SCCS NoG (SCCS, 2018). The margin of safety (MoS) and the threshold of toxicological concern (TTC) are calculated.

Calculation of the margin of safety
The MoS is the ratio of the POD and SED. It is commonly used in human health risk assessment and, in particular, in cosmetics risk assessment, and it is a key value within the product information file (PIF), which is essential to be able to put the product on the European market. The software uses the NOAEL as a POD. The MoS is calculated according to Equation 7:

$$MoS = \frac{POD\ \text{(point of departure)}}{SED\ \text{(systemic exposure dose)}} \qquad \text{(Eq. 7)}$$

Within SpheraCosmolife, the POD can be an experimental NOAEL, if available, or a predicted NOAEL. To consider a substance safe, the MoS must be higher than 100, as indicated in the SCCS NoG (SCCS, 2018).
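Equation 7 and the acceptance criterion can be illustrated with a short sketch. The function names and the numeric inputs are hypothetical; the strict "higher than 100" comparison follows the wording of the text above.

```python
def margin_of_safety(pod_mg_kg_bw_day: float, sed_mg_kg_bw_day: float) -> float:
    """Eq. 7: MoS = POD / SED (both in mg/kg bw/day, so MoS is unitless)."""
    if sed_mg_kg_bw_day <= 0:
        raise ValueError("SED must be positive to compute a margin of safety")
    return pod_mg_kg_bw_day / sed_mg_kg_bw_day


def mos_is_acceptable(mos: float, threshold: float = 100.0) -> bool:
    """True when the MoS is higher than the threshold (100 in the text)."""
    return mos > threshold


# Hypothetical NOAEL of 50 mg/kg bw/day and SED of 0.25 mg/kg bw/day.
mos = margin_of_safety(50.0, 0.25)
print(f"MoS = {mos:.0f}, acceptable: {mos_is_acceptable(mos)}")
```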

Threshold of toxicological concern (TTC)
The software includes an assessment that considers the TTC approach, which refers to the establishment of a level of exposure below which there is no appreciable risk to human health (Kroes et al., 2004). It uses the Cramer decision tree, implemented in VEGA.

The in vitro micronucleus model provides a qualitative prediction of genotoxicity as induction of micronucleus in mammalian cells in vitro (MNvit). The model was built on a dataset containing 380 organic chemicals with genotoxicant and non-genotoxicant MNvit experimental data (153 inactive and 227 active chemicals). The experimental data were collected, according to the OECD 487 guideline, from the eChemPortal inventory, peer-reviewed literature, SCCS and European Food Safety Authority (EFSA) opinions, and European Centre for Validation of Alternative Methods (ECVAM) guidelines and review. The fragment-based model uses 138 structural alerts, including 82 active and 56 inactive fragments (Baderna et al., 2020).

Skin sensitization
Two models are used, which are available in VEGA. The first is the CAESAR model (Chaudhry et al., 2010). The second is a new model, which is described here. It is a decision tree (DT) based on a data set of 332 chemicals with data from the local lymph node assay (LLNA) (226 sensitizers and 106 non-sensitizers). Data were collected from the CAESAR model database (209 substances) and from Asturiol et al. (2016) (269 substances). The dataset was split into a training (80%) and a test (20%) set. To partition the chemicals into training and test sets while assuring high diversity and keeping the sensitizer/non-sensitizer ratio, the following procedure was used. The chemicals were initially separated into sensitizers and non-sensitizers, and the subsequent steps were carried out separately for each group. Each group was clustered based on chemical similarity defined by the chemicals' fingerprints (RDKit atom pairs) into as many clusters as the number of chemicals divided by 10. Subsequently, 80% of the clustered chemicals were assigned randomly to the training set, using the assigned cluster as stratification variable. The remaining 20% of chemicals were assigned to the test set. The chemicals were structurally diverse, and the distribution between sensitizers and non-sensitizers was preserved. Structural diversity refers to diversity in terms of chemical classes. Details are described by Asturiol et al. (2016). The RDKit atom pairs fingerprints⁹ were used to cluster similar chemicals with a kNN algorithm. The chemicals were randomly selected from each cluster in the same proportions of sensitizers/non-sensitizers as in the dataset.

The DT model was built using the Recursive PARTitioning (rpart) module included in R software¹⁰ (Therneau and Atkinson, 2015) to develop CART (Classification and Regression Trees) models. The model is based on 2D descriptors calculated using Dragon (Dragon v. 7.0.8, Kode srl¹¹) and a stepwise variable selection using linear discriminant analysis (LDA). A bootstrap technique (based on balanced resampling) was used for validation to select the best variables, with an in-house code implemented in R (Manganelli et al., 2019). Molecular descriptors used in the DT are:
- nDB, number of double bonds
- IC1, information content index

To facilitate the use of the results, a system has been developed to integrate the two models, following the approach for the integration of the results of the models for mutagenicity described above.
The results of the in silico model for skin sensitization were analyzed considering the true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN). From these counts, we calculated the statistical parameters sensitivity, specificity, accuracy, and Matthews correlation coefficient (MCC). The reported statistics refer to the performance on the training set. However, cross-validation (CV) was performed during model building using the recursive partitioning (rpart) module included in R software¹⁰ (Therneau and Atkinson, 2015).
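The four statistics named above follow from the confusion-matrix counts using their standard definitions. The sketch below is illustrative only: the function name and the counts are hypothetical and do not reproduce the results of the skin sensitization model.

```python
import math


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Sensitivity, specificity, accuracy and MCC from confusion-matrix counts."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / total
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal sum is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for a binary sensitizer / non-sensitizer classifier.
for name, value in classification_metrics(tp=40, tn=30, fp=10, fn=20).items():
    print(f"{name}: {value:.3f}")
```

MCC is reported alongside accuracy because it stays informative when the two classes are unbalanced, as is the case for a dataset with roughly twice as many sensitizers as non-sensitizers.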

The results of the other in silico models
SpheraCosmolife is a novel software, which includes a new integrated software system for skin sensitization and a new model for this endpoint, as described previously. Other in silico models have been recently developed (e.g., the in vitro micronucleus genotoxicity test model), and they are individually available in VEGA. They are organized into a unified workflow here. The detailed information on each VEGA model is available within the VEGA platform⁸. Tables 3 and 4 summarize the performance of each model, in classification and regression, respectively.

User inputs for SpheraCosmolife
In the first form, the user must insert the product type, the ingredients of the product, and their concentrations. Figure 1 shows where to insert the information. The product type needs to be selected from a list of possible product types (e.g., body lotion, face cream, hand cream, shampoo, shower gel, deodorant, etc.), according to the product types in Tables 2A, 2B and 3 of SCCS NoG (SCCS, 2018), to define an exposure scenario. The ingredients can be inserted using the SMILES, the INCI, or the CAS format. If INCI or CAS are provided as the only input, the molecule can be processed only if a match is found in the database; otherwise the SMILES of the ingredient is needed. The user can insert a single ingredient or a set of ingredients corresponding to a specific formulation. The concentration of each ingredient is required. Then, the software searches for the ingredients in the internal database.

The SpheraCosmolife software summary results
The software provides a summary table of the results for all the ingredients in the product in HTML format. The results depend on the product type and on the concentrations of the ingredients entered by the user.

The tool for TTC uses the algorithm implemented in Toxtree v. 3.1.0 (Patlewicz et al., 2008), with the original Cramer classes (Cramer et al., 1978; Munro et al., 1996; Patlewicz et al., 2008). The TTC assessment refers to the classes listed in Table S1¹².

Results
We developed SpheraCosmolife, a unified software system intended to support risk assessors of cosmetic products and to assist companies to ensure the safety of their products, avoiding ingredients that may be of concern at a given concentration. For this purpose, some new models were developed and implemented on the VEGA website⁸. The overall scheme follows the workflow of the assessor. The evaluation is easy to perform: First, the user is asked to provide information regarding each ingredient by inserting it using the SMILES, the INCI, or the CAS format, and adding its concentration and the product type. Then, the software checks whether the ingredient is listed in the Annexes of Regulation (EC) 1223/2009 (EC, 2009). Next, the software searches for values in its database that are useful for the chemical's risk characterization, as described in the Methods section. If there are no experimental values, the software predicts the properties of interest. The focus is on systemic toxicity, and the MoS is calculated. The user is assisted by the provision of a series of outcomes related to the product of interest. We describe the components of the new software and its outcomes below.

Performance of new skin sensitization model
The new decision tree for the prediction of skin sensitization potential as expressed by the LLNA had good results (Tab. 2). The results on the training set were quite balanced, while on the test set the model resulted in more false negatives than false positives. In contrast, the CAESAR model is over-conservative, so it is appropriate to look at the results of both models.

Formaldehyde, for example, is listed in Annex II, and the software provides a warning by showing a red cell if an Annex II substance is included as an ingredient. In other cases, the Annexes give a threshold above which the substances cannot be used. For instance, phenoxyethanol, which is in Annex V, may be used at a maximum concentration of 1%. The user must enter a concentration that is lower than 1% for the product to be safe and in compliance with the law. Moreover, in this section the software checks whether the ingredient is classified according to the Classification, Labelling and Packaging (CLP) regulation (EC, 2008) and whether it is contained in the Safer Chemical Ingredients List (SCIL). This is a list of chemicals, arranged by functional-use class, that the Safer Choice Program has determined are safer than traditional chemical ingredients¹³.
Figure S1¹² gives an example of the described information.
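The regulatory check described above can be sketched as a lookup against two small tables. The data structures, function name and status labels are hypothetical; only the two entries named in the text (formaldehyde and phenoxyethanol) are included, and treating the 1% limit as inclusive is an assumption made for this illustration.

```python
# Hypothetical excerpts of the regulatory lists, limited to the two
# examples named in the text; a real database would hold far more entries.
ANNEX_II_PROHIBITED = {"formaldehyde"}
ANNEX_MAX_PERCENT = {"phenoxyethanol": ("Annex V", 1.0)}


def regulatory_status(ingredient: str, concentration_percent: float) -> str:
    """Classify an ingredient against the (hypothetical) Annex excerpts."""
    name = ingredient.strip().lower()
    if name in ANNEX_II_PROHIBITED:
        return "prohibited (Annex II)"
    if name in ANNEX_MAX_PERCENT:
        annex, limit = ANNEX_MAX_PERCENT[name]
        # Assumption: a concentration equal to the limit is still allowed.
        if concentration_percent <= limit:
            return f"within limit ({annex}, max {limit}%)"
        return f"exceeds limit ({annex}, max {limit}%)"
    return "not listed"


for ingredient, conc in (("Formaldehyde", 0.1),
                         ("Phenoxyethanol", 0.8),
                         ("Phenoxyethanol", 1.5)):
    print(ingredient, conc, "->", regulatory_status(ingredient, conc))
```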

Information on exposure
SpheraCosmolife provides information regarding the exposure to each ingredient, based on the inputs of the user (product type and concentration). Figure S2¹² shows an example of the exposure information provided by the software. The exposure is calculated for different scenarios. External exposure is obtained using the parameters defined by the SCCS NoG for the product type, as described in the Methods section. The SED is obtained for three scenarios: (1) absorption of 100%, oral or inhalation exposure; (2) absorption of 50%, a default value for dermal exposure, as indicated by SCCS NoG (SCCS, 2018); or (3) the more realistic scenario for dermal absorption based on the models for skin permeation.

Figure 2 gives an example of the output table, with the summary of the hazard and exposure features of the ingredients indicated in the input file. It shows whether an ingredient is present in an Annex of the cosmetics regulation (EC, 2009), whether it is mutagenic (Ames test), whether it is a sensitizer, the dermal absorption according to the Kroes approach, the MoS, and the TTC. No assessment is provided for substances for which no SMILES is retrieved, e.g., for inorganic compounds and mixtures (unless present in the database). All the assessments refer to the neutralized structure of the compounds, as they are neutralized during the process.

The SpheraCosmolife software detailed results
Clicking on the "Details" box for each ingredient in the first column of the summary table reveals the detailed results for each ingredient as an HTML page. First, the information on the substance is presented, i.e., the SMILES, the CAS and INCI, the information on the product type selected, and the concentration of the ingredient, as a percentage (%) and mg/g. Then, SpheraCosmolife reports the values from the SCCS NoG (SCCS, 2018) related to the product type selected by the user, i.e., the relative daily exposure, the surface area involved, the type of exposure, and the exposure time. These values are specific for each product type and serve to calculate the exposure, as described in the next section.
The presence of a substance in one of the Annexes does not preclude the evaluation. However, the substances in Annex II cannot be used in cosmetic products.

Examples: comparison of the procedures and application of the software to a real case
As shown in the previous sections, SpheraCosmolife represents an innovative and robust system to guide the assessor during their evaluation analysis. The tool can be used to evaluate either a single ingredient or multiple ingredients typically used in a cosmetic product. It is important to note that the tool can process a great number of substances in a few seconds, saving time and money.
A first example is the evaluation of a hypothetical cosmetic product (body lotion) with a list of ingredients as described in Figure 2. Table 5 shows how the formulation is evaluated with or without the tool. Figure 4 reproduces the list of ingredients shown in Figure 2.
Another use of the tool is for the evaluation of impurities. Article 17 of Regulation (EC) 1223/2009 clearly explains that the non-intended presence of a small quantity of a prohibited substance (impurity) that is technically unavoidable in good manufacturing practice shall be permitted, provided that such presence is in conformity with safety requirements. As an indicative real-case example, we consider the impurities detected during analytical checks of a volatile organic compound (VOC) in a nail product. Figure 5 shows the analytical results for a nail polish. The VOC analysis detected methanol (CAS 67-56-1), styrene (CAS 100-42-5), toluene (CAS 108-88-3) and dichloromethane (CAS 75-09-2).
Looking at Figure 5, the regulatory context is immediately clear. Methanol and toluene are substances included in Annex III (list of restricted ingredients). Therefore, even if they are not intentionally added, concern related to their safety is limited, since all the substances included in this list have already been evaluated by SCCS. Toluene can be used in nail polish up to 25% (Fig. 6), while methanol is allowed as an impurity of ethanol or isopropyl alcohol up to 5% (Fig. 7).

In the third, more realistic SED scenario, dermal absorption is based on the models for skin permeation. SpheraCosmolife provides the output of two models for skin permeation, and then chooses the worst case, using the most conservative of these two values. As explained above, for these models and all the others, whenever the database contains an experimental value for the ingredient of interest, the software shows and uses it. Water solubility and the $J_{max}$ are also shown.
Since all these models are implemented in VEGA, the output of the predictions considers the reliability of the prediction, measured using the applicability domain index (ADI) calculated by VEGA. ADI is a value that evaluates the reliability of the prediction by considering several parameters, such as the predictions done on similar compounds, the agreement between the predicted value for the target compound and the experimental values of the most similar compounds, the presence of unusual fragments, how similar the related substances are, etc. Full details for each model are given on the VEGA website⁸, including the user guidance¹⁴. Thus, for each prediction, SpheraCosmolife reports the level of reliability as low, medium or good. This indication should be used as a warning, and the assessor should evaluate whether the value can be used or not based on the level of reliability. If the assessor needs more information to take the decision, they should use the original VEGA model of interest, which provides full details, while the SpheraCosmolife report only provides a summary.

Information on hazard values and TTC
The software reports values (experimental, if available, or predicted) related to mutagenicity (Ames integrated model), genotoxicity (in vitro micronucleus and chromosomal aberration), skin sensitization (CAESAR, DT model and the integrated model), and NOAEL. Figure S3¹² gives an example of the information on hazard assessment. All the values provided are highlighted with a color to assist the user. The cell with the prediction is red if the compound is predicted as mutagenic/genotoxic/sensitizer, or green if it is safe, i.e., if the prediction has good reliability or if an experimental value is available. The color is orange and yellow, respectively, if the reliability of the predictions is only moderate or low.
The software carries out a risk characterization, considering the systemic toxicity and the MoS.
The software also includes an assessment considering the TTC approach. In this case, the external and internal exposure values are compared with the TTC threshold for the specific ingredient in order to incorporate dermal bioavailability into the use of TTC for cosmetics. A decision tree is applied (Williams et al., 2016). The output shows a red or green cell, depending on whether external and internal exposure values are above or below the TTC threshold, using the Cramer decision tree implemented in VEGA. Figure 3 shows an example of the output for the TTC value for the three different SED scenarios. The TTC value is of interest for substances that are impurities. In general, the evaluation of the other endpoints, if reliable, should be considered more relevant.

Figure 5 shows that, according to the Cramer decision tree, styrene is in class I (low level of concern; TTC = 0.03 mg/kg bw/day), while dichloromethane is in class III (high level of concern; TTC = 0.0015 mg/kg bw/day). Comparing these TTC thresholds with the exposure calculated by SpheraCosmolife (see Fig. 8, 9) gives an indication of the risk associated with the presence of an unexpected substance, helping the safety assessor during the evaluation process.
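The comparison of an exposure value with the TTC threshold can be sketched as follows. Only the two Cramer classes whose thresholds are quoted above are included; the function name, the status labels and the exposure values are hypothetical, and treating an exposure equal to the threshold as "not below" is an assumption.

```python
# TTC thresholds in mg/kg bw/day, as quoted in the text for Cramer
# classes I and III; other classes are deliberately left out.
TTC_MG_PER_KG_BW_DAY = {"I": 0.03, "III": 0.0015}


def ttc_cell_color(exposure_mg_kg_bw_day: float, cramer_class: str) -> str:
    """Return 'green' if the exposure is below the TTC, otherwise 'red'."""
    if cramer_class not in TTC_MG_PER_KG_BW_DAY:
        raise KeyError(f"no TTC threshold available for class {cramer_class!r}")
    threshold = TTC_MG_PER_KG_BW_DAY[cramer_class]
    return "green" if exposure_mg_kg_bw_day < threshold else "red"


# Hypothetical exposure values for a class I and a class III impurity.
print(ttc_cell_color(0.001, "I"))
print(ttc_cell_color(0.002, "III"))
```

The same check can be run once per SED scenario, which is how the text describes the output for the three absorption assumptions.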
Styrene and dichloromethane are prohibited as cosmetic ingredients (included in Annex II), and are considered substances of concern. However, it is possible to check the specific exposure of the users and define their risk in use. In this case, concentrations are low enough (0.002% for styrene and 0.0003% for dichloromethane) to consider them as "traces", and thus an evaluation using the TTC approach is possible.

The assessment is based on in vitro tests and data retrieved from the literature; in silico inputs are usually not used, for lack of experience. Assessment is time-consuming. Reliability and robustness of the data are hard to assess. Regulatory information needs to be manually retrieved. This may result in an incomplete safety report, for lack of data.
The assessor must browse through the regulation to retrieve information.
Data retrieval is time-consuming. Data retrieved from the literature needs to be curated and correctly interpreted.

SpheraCosmolife
The user inserts this information. The product type is selected from a picklist. The software automatically retrieves the ingredients from the database. If not present in the database, input structure is based on SMILES strings.
The output is obtained within seconds. Valid and consolidated in silico data are available for all endpoints. Reliability of the results is presented using a specific algorithm. The user can visualize data for exposure, hazard, MoS and TTC in the same table. Complete and exhaustive regulatory data (presence in the Annexes, maximum threshold, etc.) are automatically retrieved from the internal database.
All detailed information on the cosmetic regulation is retrieved from the internal database.
The process is much faster. Experimental data have been curated and the predictions report reliability to facilitate interpretation of the results.

Comparison consideration
Same conceptual scheme. Same equations and rules to define exposure scenarios. Process is facilitated and much faster. Database is checked automatically, not manually.
The safety assessment process is facilitated. Additional information is available from in silico models, with its related uncertainty.
A complete overview of the regulatory conditions of the different ingredients is offered, and information from US EPA is also provided. Models for hazard and exposure are integrated within the same platform. Risk assessment is done immediately and includes information commonly not addressed (internal exposure, Kroes approach, etc.), which improves the safety assessment process. Some experience in the interpretation of the VEGA results is needed.

A last example of the use of the tool is the evaluation of a botanic derivative ingredient. Botanical extracts are commonly used in the cosmetic industry because they are relatively easy to obtain, they are generally considered safe (as they are of natural origin), and they are used commercially for marketing descriptions of cosmetic products. SpheraCosmolife allows a rapid evaluation of the botanic ingredient, as shown in the example described in Figure 10. The result is a safety profile of the botanic derivative ingredient that can be taken into consideration by the safety assessor to define the safety of the cosmetic product in which the botanic ingredient is used.

The novelty of the tool
In silico models offer a unique opportunity for the assessment of cosmetic products, as they can process a very large number of ingredients quickly, without the need for new, additional tests, saving money and time (Raitano et al., 2019; Gellatly and Sewell, 2019; Taylor and Rego Alvarez, 2020). The assessment of cosmetic ingredients can benefit from the hundreds of in silico models available that predict the properties of interest for many endpoints. Some of these models are freely available, others are commercially available, or require a fee. For instance, models to predict the Ames test are the most numerous, and they have been reviewed elsewhere (Cassano et al., 2014; Honma et al., 2019). Despite the number of models available, their application can be complicated, as the user must run separate models on different platforms, which may require different input formats, specific instructions, or provide outputs that may be difficult to integrate and compare. Furthermore, independent models for hazard and exposure assessment need to be run for risk assessment. All these difficulties represent a barrier to the use of in silico models for cosmetics.
Within the EC project VERMEER, we aimed to integrate models for hazard and exposure. The SpheraCosmolife software system presented here is one of the tools we are developing to increase the use of in silico tools for risk assessment, at the same time improving the robustness and the reliability of the results.
A major innovative aspect introduced by SpheraCosmolife is that the system does not provide the output for one single endpoint, as in the usual situation; instead, a battery of models that are specific for the application are wrapped into a single system. The user is not required to learn multiple programs, because the input is the same for all the models, which run automatically in the background.
A second innovative aspect is that SpheraCosmolife integrates models for hazard and exposure within the same platform. Historically, models for exposure have been developed separately, typically within commercial platforms, while platforms for hazard models do not contain tools for exposure. SpheraCosmolife offers a novel approach because it integrates both.
A third innovative aspect is that this system is dedicated to the specific sector of cosmetics. Thus, it contains legislative thresholds and specific references to the European regulation, proceeding in the direction of a practical application. Usually, the existing platforms are generic and intended to be used for multiple purposes.
A fourth innovative aspect of this system is that it can be used to assess products, not only ingredients, and thus it simultaneously addresses all ingredients contained in the cosmetic product. This also helps the user who is interested in the practical case of a product in which multiple substances are used.
The safety evaluation is performed easily: The user provides the structure of the chemicals of interest, the type of product in which the chemicals will be used as ingredients, and the concentrations in the final formulations. Then, SpheraCosmolife automatically executes several analyses and predictions, providing an overall evaluation of the formulation in a structured output report.
SpheraCosmolife is opening up new avenues to the use of in silico models, moving towards real case application, facing the practical problems of a sector urgently demanding solutions to assess products, and introducing novel topics to take into account the specific needs of a focused sector. The effort to apply models to practical cases will provide solutions, lowering the barriers to the use of in silico models, which tend to be viewed more as theoretical tools. Too often the developers oblige the user to learn programs that only partially assist them. Instead, a dialogue must be established between the users and the developers so that the in silico models can be designed to fulfill the users' requirements. This has been accomplished within the VERMEER project. Development of the overall architecture should start at the very initial phases. Here, we codified the different endpoints used by an assessor of cosmetic products and analyzed how the assessment is done without the in silico model. Then, we tried to replicate these steps using different software modules, of which some already existed and others were developed for the required purpose. This process is close to the development of an expert system software, and we inserted a number of tools that are statistical-based, not only expert-based rules.

The limitations of the tool
The novel system should be considered as support for the assessor, and not as a replacement of the human evaluation. The major novelty is in the organized series of in silico models. In case experimental values are available in the database, the evaluation is straightforward. Otherwise, the missing values are predicted. However, particularly in this case, the user should consider the reliability of the values, indicated by the system as the uncertainty, in reaching a final assessment. This demonstrates the objectivity of the assessment, which systematically shows the different levels of uncertainty of the experimental and calculated evidence. The assessor should use expert knowledge, particularly when the uncertainty is high. Since SpheraCosmolife is supported by the VEGA platform, the ADI tool to evaluate the reliability of the prediction can be used, and further information can be obtained using the individual VEGA models, which can be downloaded from the VEGA website. These additionally show similar compounds, which can be used for read-across, and provide more details on the ADI.
The ADI is a quantitative value based on three fundamental components at the basis of all in silico models: the chemical information, the toxicological/property information, and the algorithm. Each of these components is used within the ADI. VEGA evaluates similar chemicals and the uncommon features from a chemical point of view. The ADI also investigates the toxicological profiles and features of the related substances (thus not only based on the chemical similarity) using, for instance, toxicological alerts. Finally, the uncertainty of the algorithm is addressed too. Combining all these analyses regarding the applicability domain, VEGA provides an assessment of the reliability of the predictions, which is reported by SpheraCosmolife.
The user should be aware that the uncertainty is higher for some of the endpoints, and Tables 2 and 3 provide a first indication. An example of an endpoint with higher uncertainty is the predictive tool for NOAEL, which is a difficult endpoint because it is affected by natural variability and by the choice of the doses within the experimental test. The current version of the model is based on a limited training set. We are working to reduce uncertainty, and new models for NOAEL will be added in the future. Skin sensitization is another endpoint with uncertainty. Both models for this endpoint implemented in the system are quite conservative, and therefore result in some false positives. Also in this case, we are developing new models, which will be implemented in a new version. As stated above, in case of uncertainty the user can run the individual models within VEGA and look at the similar compounds, taking advantage of the read-across approach. In the future, read-across will be implemented into SpheraCosmolife, and further endpoints will be added.

Conclusions
We introduced for the first time a single software system, SpheraCosmolife, to facilitate and harmonize the safety assessment of cosmetic products. The tool is specific for cosmetic products and refers to the respective thresholds and requirements. The software system provides the MoS, based on the systemic exposure dose, and includes a number of models for both exposure and hazard prediction. Different scenarios are provided based on the use or not of the results of the skin permeation models. The software aims to be user-friendly, requiring a limited number of inputs from the user, and uses its internal database and models to provide an evaluation even when experimental values are lacking. The system was developed based on user requirements rather than aiming to offer a set of prebuilt tools which could be useful. The level of uncertainty of the results is given, and the assessor is provided with supporting information for the final evaluation. SpheraCosmolife has been designed with flexible capabilities for future extensions in mind, and more features will be added in future versions. New functionalities will be added to accommodate user requests. The frequency of the updates depends on the progress of the work necessary to create new models and evaluate the consistency of new toxicological endpoints, but a version 2.0 is planned for the end of 2021.

Tab. 1: Risk assessment steps defined by regulators and their translation into SpheraCosmolife
Regulatory procedure
HAZARD IDENTIFICATION: carried out to identify the intrinsic toxicological properties of the substance
EXPOSURE ASSESSMENT: based on the declared functions and uses of a substance as cosmetic ingredient, the amount present in the respective cosmetic product categories, and the frequency of use
DOSE-RESPONSE ASSESSMENT: based on calculation of the
retrieved from the internal database. In case experimental data are missing, in silico models (expert and statistical-based tools) to predict toxicological properties are provided.
MoS. A decision tree approach allows estimation of the TTC class and compares the TTC thresholds with the exposure values, incorporating dermal bioavailability into the use of TTC.
*ref: SCCS, 2018.
LOAEL, lowest observable adverse effect level; MoS, margin of safety; NOAEL, no observable adverse effect level; NoG, notes of guidance; POD, point of departure; QSAR, quantitative structure-activity relationship; SCCS, Scientific Committee on Consumer Safety; TTC, threshold of toxicological concern

Moreover, as indicated in Article 11 of Regulation 1223/2009, the responsible person shall keep a product information file (PIF) for it.

Fig. 2: An example of the output table, with the summary of the hazard and exposure features of hypothetical ingredients of a cosmetic product

Fig. 3: The output of the SpheraCosmolife software for the TTC for a given ingredient (Eugenol in a body lotion scenario)

Fig. 4: List of a hypothetical cosmetic product

: Summary of the statistical parameters of the classification models implemented in SpheraCosmolife
(Cassano et al., 2014) model that combines the predictions of four models (Cassano et al., 2014). The statistics of each individual model are reported in VEGA. This integrated model has been used to predict substances not present in the training set within three studies (see references), and the values are reported here.