Bisphenol A Binding Promiscuity: A Virtual Journey through the Universe of Proteins

Significant efforts are currently being made to move toxicity testing from animal experimentation to human relevant, mechanism-based approaches. In this context, the identification of molecular target(s) responsible for mechanisms of action is an essential step. Inspired by the recent concept of polypharmacology (the ability of drugs to interact with multiple targets) we argue that whole proteome virtual screening might become a breakthrough tool in toxicology reflecting the real complexity of chemical-biological interactions. Therefore, we investigated the value of performing ligand-protein binding prediction screening across the full proteome to identify new mechanisms of action for food chemicals. We applied the new approach to make a broader comparison between bisphenol A (BPA) (food-packaging chemical) and the endogenous estrogen, 17β-estradiol (EST). Applying a novel highthroughput ligand-protein binding prediction tool (BioGPS) by the Amazon Web Services (AWS) cloud (to speedup the calculation), we investigated the value of performing in silico screening across the full proteome (all human and rodent x-ray protein structures available in the Protein Data Bank). The strong correlation between in silico predictions and available in vitro data demonstrates the high predictive power of the method used. The most striking results obtained was that BPA was predicted to bind to many more proteins than the ones already known, most of which were common to EST. Our findings provide a new and unprecedented insight on the complexity of chemicalprotein interactions, highlighting the binding promiscuity of BPA and its broader similarity compared to the female sex hormone, EST.


Introduction
Our study exploits recent discoveries in supercomputer-based drug design to support the new paradigm under development in the field of toxicology.Indeed, exploring alternative approaches to animal experimentation is inducing a revolution, moving from black box high dose animal studies towards more mechanistic, human relevant approaches.It is illustrated by the concept of Adverse Outcome Pathway (AOP), consisting of a framework linking Molecular Initiating Events (MIEs) (based on chemical-target interactions), to a sequence of molecular, cellular anatomical and functional events leading to adverse effect (Burden et al., 2015).It is thought that AOP will transform risk assessment of chemicals, giving more emphasis to the integration of mechanistic information.In this context, when a MIE is known for a compound, introducing ligand-protein binding prediction can be helpful to assist toxicologists in the assessment of similar compounds (e.g.very important for risk 1 # contributed equally assessment of mixture).But, as reported for drugs, for which polypharmacology (the ability of drugs to interact with multiple targets) seems to be more a rule than an exception (Ellingson et al., 2014), xenobiotics may also interact with multiple targets, potentially triggering several MIEs and lead to adverse outcomes.
Following this hypothesis, we investigated the feasibility of applying a novel in silico workflow to toxicology, based on the virtual screening of all characterized proteins.This could represent a rapid method to identify MIEs and toxicological pathways potentially triggered by tested chemicals.
This idea fits with other strategies developed under large research initiatives such as U.S. Tox21 (Thomas et al., 2018) and ToxCast (Richard et al., 2016), based on in vitro screening using hundreds of bioassays, providing a large spectrum of potential biological effects.
To investigate the value of our approach in addressing toxicological questions, bisphenol A (BPA), in comparison with the endogenous EST, was selected as a proof-of-concept.It was chosen because it is well documented to interact with a number of protein targets, which could then serve as a basis to assess the performance of BioGPS (Siragusa et al., 2015(Siragusa et al., , 2016) ) to predict potential MIE.Moreover, the endocrine-disrupting effects of BPA has been widely discussed (Rubin, 2011) because of its structural similarity to the endogenous hormone estradiol (EST).This feature has inspired the search of its toxic effect uniquely through the Estrogen Receptor activation and consequently most BPA alternatives were designed based on the reduction of this activity.But a number of other diverse targets were also discovered, suggesting the involvement of other mechanisms and potentially MIEs (MacKay and Abizaid, 2018).Indeed as recommended by EFSA (EFSA, 2015) more research using mechanistic studies to determine the mode of actions of BPA are needed, and although our approach is not quantitative and only evaluates binding affinity, we believe it can be a good starting point to design more in vitro tests.
The main questions we addressed were whether BioGPS was suitable to identify the different known targets of BPA and how its broad in silico protein-binding profile compares to EST.Therefore we suggested an holistic view, enlarging our vision on the complete 3D proteome, including in the virtual screening analysis all of the human and rodent 3D structures available in the Protein Data Bank (PDB) (Berman et al., 2000), the central storehouse of biomolecular structures (30.153 structures, October 2014).
To be as exhaustive as possible we investigated not only active sites in the proteins but also secondary pockets (binding sites on the protein other than the active site).Indeed, it is increasingly evident that secondary pockets may play a role in biological responses (Liu and Nussinov, 2016;Lu et al., 2014).From a drug discovery viewpoint, the exploration of secondary sites contributes to the design of innovative therapeutic solutions (Tschammer, 2016;Livingston and Traynor, 2018;Citro et al., 2016;Kasbekar et al., 2016).Following the same reasoning, it appears that focusing attention only on 'main' active sites may be insufficient to understand the final toxicological effects.In this context, considering the interaction with 'secondary' sites may provide key elements to the toxic phenomena.
Finally, in silico -in vitro validation was performed using freely available information.Furthermore, to better compare BPA to EST, we generated the protein binding profile and we calculated the in silico binding promiscuity index, defined as the percentage of proteins binding versus the total number of proteins tested.The general hypothesis being that the more proteins a compound interacts with, the greater is the likelihood of observing adverse effect in toxicological studies.
To summarize, the approach proposed is based on the sequential application of: − BioGPS (Global Positioning System in Biological Space) software (Siragusa et al., 2015(Siragusa et al., , 2016)), implementing the virtual screening of a compound against all x-ray protein structures available in the Protein Data Bank (Berman et al., 2000).− Validation of the in silico ligand-binding prediction by in vitro data available at PubChemBioassay platform (Wang et al., 2017) and at Comparative Toxicogenomics Database (Davis et al., 2018).− Calculation of in silico binding promiscuity index.− Grouping the predicted ligand-protein interactions into different families, generating the protein-binding fingerprint profile.

Data collection
We collected from the Protein Data Bank (PDB) (Berman et al., 2000) protein structures belonging to four organisms: Homo Sapiens, Rattus Rattus, Rattus Norvegicus and Mus Musculus.The total number of protein structures was 30.153, belonging to 6.309 unique proteins."Protein structures" refers to structures downloaded from the PDB (Protein Data Bank), while "proteins" are the biological entity extracted, starting from the PDB code, from the Uniprot (Nightingale et al., 2017) database.

Cavity database generation
For each of the 30.153structures, the BioGPS protocol (Siragusa et al., 2015(Siragusa et al., , 2016) ) was applied for detecting and characterizing cavities.I) The fixpdb tool was used for preparing the input protein structure for subsequent analysis.Protein residues, solvent molecules, co-crystallized ligands, cofactors and ions contained in the PDB structure file were also properly processed.All nucleic acids, ligands and water molecules co-crystallized with the protein were removed, while cofactors were retained (i.e.NAD, FAD, GSH).Additionally, using a defined GRID energy threshold (Goodford, 1985), Cu+2, Fe+2, Zn+2, Mg+2 ions exhibiting interactions with the protein residues (excluding those ions that are artefacts) were retained.II) The flapsite tool was applied for detecting protein cavities.In total, 199.436 cavities were collected.III) For each detected cavity the Molecular Interaction Fields (MIFs) (Baroni et al., 2007) were calculated using the default GRID probes (H, DRY, O, and N1 are used to compute shape, hydrophobic interactions, H-bond acceptor interactions and H-bond donor interactions respectively).The resulting MIFs are then reduced in complexity to form a "Common Reference Framework".Those are stored in the BioGPS database to perform a pair-wise comparison with one or more template structures.By using 288 cpus (72 nodes, each of 4 cpus) on Amazon Elastic Compute Cloud (Amazon EC2), a web service that provides resizable compute capacity in the cloud, the cavities database creation took 20 days.

Virtual screening
Bisphenol-A (BPA) and 17β-Estradiol (EST) were used as template molecules for structure-based virtual screening against all the previously collected cavities.The Molecular Interaction Fields (MIFs) of Bisphenol-A and Estradiol were generated using the GRID probes H (shape), DRY (hydrophobic interactions), O (H-bonding donor interactions) and N1 (H-bonding acceptor interactions).The FLAP structure-based approach (Baroni et al., 2007;Cross et al., 2010;Brincat et al., 2011;Sirci et al., 2012;Siragusa et al., 2016) was applied for evaluating the MIF volume complementarity between the template molecule and a cavity.The MIFs superposition is reported as complementarity score: the higher the score, the better the ligand fits into the cavity.Score values span between 0 and 1, where 0 indicates no complementarity between ligand and cavity MIFs, and 1 indicates that the ligand fits the cavity with the maximum complementarity.Figure 1 reports an example where BPA fits well into a cavity (left side) and where the complementarity is not favorable (right side).The calculated score indicates the probability that a molecule fits into a cavity (in general the higher the interaction superposition, the higher the probability), but no information about the entity of the interaction (or the activity) is contained into the score.BPA and EST were screened against the entire database of MIF cavities (199.436cavities).By using 288 cpus (72 nodes, each of 4 cpus), the BPA and EST virtual screening took 3 days on Amazon Elastic Compute Cloud (Amazon EC2).The score values were considered for ranking the database cavities according to their complementarity with the two templates.An in-silico activity profile was obtained for both molecules.For each protein (identified through the Uniprot code) only the cavity with the maximum score was considered.
By setting a score threshold of 0.6 to discriminate between active and inactive targets (proteins bound and not-bound by the two templates), we identified 2.773 active proteins for BPA, 1.608 active proteins for EST.BioGPS software is distributed by Molecular Discovery Ltd2 .Use BioGPS for commercial or clinical purposes is strictly prohibited in the absence of a Commercial License Agreement from Moleculer Discovery Ltd.

PubChem bioassay
Biological assay results for small molecules are stored in the PubChem BioAssay (Wang et al., 2017) database.A PubChem BioAssay data entry includes an assay description, the corresponding target, the substances tested, the experimental values (Ki, IC50, EC50) and their activity (chemical probe, active, inactive, inconclusive and unspecified).We searched for BPA and EST biological activity data.For each bioassay, the corresponding target and activity are reported.The activity results include five categories: chemical probe, active, inactive, inconclusive and unspecified.We selected as positives those targets on which BPA and EST results 'active' and the activity outcome was derived from dose response curves following tests with multiple concentrations ('confirmatory bioassays').We then selected as negatives those targets on which BPA and EST results as 'inactive'.Among these proteins we considered only the ones having a deposited human, rat or mouse X-ray structures in the Protein Data Bank.

The Comparative Toxicogenomic Database (CTD)
The Comparative Toxicogenomic Database (Davis et al., 2018) supplies chemical-gene, chemical-protein interactions information and their relationships to medical condition.Data are curated by experts from published literature.We searched for BPA and EST bioactivity.A list of chemical-gene/protein interaction is available: any, abundance, activity, binding, cotreatment.Only binding type interactions have been selected and manually verified that they involve a direct link between one chemical and one protein.

Protein classification and protein binding fingerprint profile
We collected proteins resulting as positive from the virtual screening analysis both for BPA and EST (2.773 and 1.608 respectively).By using the Panther classification system (Thomas et al., 2003;Mi et al., 2010) we grouped these proteins into different classes.We first computed the total number of proteins in the screened database for each protein class.We then calculated the number of proteins resulting as positive for BPA for each class.To avoid bias introduced by the distribution of families among the entire dataset, we then computed the ratio between the number of positive and the total number of proteins for each class.We performed the same analysis for EST.We built a fingerprint graph, both for BPA and EST, reporting the number and the percentage of proteins in each class, as circle size and color gradient respectively.

Broadening the space of virtual screening
In this work we moved the magnifying glass from the well-known active sites to the unexplored rooms in protein structures, including in the virtual screening analysis all human and rodent 3D structures freely available at the Protein Data Bank (PDB) (Berman et al., 2000) (30.153 structures, October 2014).Using BioGPS (Siragusa et al., 2015(Siragusa et al., , 2016)), we identified all potential pockets on each of the 30.153 protein structures, finding 199.436 cavities, 15.857 of them containing a drug-like small molecule binding site (defined as 'active site').Therefore, we used all the cavities detected for each of the 30.1533D characterized structures for the in silico screening and our final BPA screening panel was thus composed of 199.436 cavities, belonging to 6.309 unique proteins and representing an exhaustive view for simulating a ligand binding interaction in the whole organism.To better clarify the meaning of such results we took, as an illustrating example, the interaction between bisphenol-A (BPA) and estradiol (EST) on the Estrogen Receptor α (ERα).As a first step we detected the location for all cavities (where cavity or pocket is defined as the region of receptor to which a ligand may energetically be hosted) in the Estrogen Receptor alpha structure co-crystallized with either BPA (PDBcode: 3uu7) or with EST (PDBcode: 1a52) (Delfosse et al., 2012).As reported in Figure 2a, ERα dimer cavities show a specular architecture: cavities 1 and 3 on one monomer (helix 1, in green) correspond to cavities 2 and 4 on the other monomer (helix 2, in cyan) respectively.Cavities 1 and 2 are the well know and extensively studied BPA binding sites, the other cavities represent the 'secondary' sites.BPA and EST were screened against all cavities detected in existing co-crystal structures of the Estrogen Receptor alpha (BPA in 1a52, EST in 3uu7), to verify that BioGPS correctly identifies the true binding location.Scores are reported in Figure 2b where EST and BPA scores are highest in the well-known binding sites (cavity 1 and 2).As a further validation of the method, we extracted the superposition of BPA and EST in Estrogen Receptor α binding site generated by the BioGPS algorithm.The best pose for each of the two molecules was selected.We have compared the two poses with the X-ray structure.The BioGPS algorithm was able to reproduce the X-ray position of the two ligands (Figure 2c), suggesting that the MIFs overlap is correct, and thus, it can be used as a valuable approach for detecting new putative targets.Interestingly, BioGPS retrieves the space between the two monomers (cavities 3 or 4), in agreement with studies reporting the likelihood of ER ligand binding at the interface of the dimer (Chakraborty et al., 2012).This fact confirms that unknown binding sites can be still unearthed, paving the way to unexplored territory.

Bioactivity profiling for BPA and EST
BPA and EST were screened against the panel of human, rat and mouse protein cavities collected from the Protein Data Bank (PDB) in order to identify putative protein targets.Structure-based approach consists in evaluating the degree of complementarity that occurs between a ligand (in our study BPA and EST) and a cavity (the region of receptor to which a ligand may energetically be hosted): the higher the similarity score, the better the molecule fits into the template cavity (see Methods).The complementarity score (see Methods) was used for evaluating the degree of complementarity between the two templates (BPA and EST) and the cavities in the panel.The distributions of scores for BPA and EST is reported in Figure 3, in blue and light green respectively.The distribution of complementarity score values of BPA was shifted to larger values (binding more proteins with higher scores) as compared with the distribution from EST, indicating a higher promiscuity as a ligand.This fact might be due either to a smaller shape of BPA compared to EST and/or its simplistic structure (easily adaptable to different cavities).What we learned from this first analysis is that BPA and EST binding performances differ in a modest way, with the BPA active profile more extended.In order to address this assumption, we selected from the panel those cavities accommodating the two molecules (BPA and EST) similarly to, or even better than, ERα.To do this, we first calculated the complementarity score obtained between BPA and EST among all Estrogen Receptor alpha binding sites (Fig. S1 3 ).For the two molecules, a value of 0.6 was obtained for the interaction with the most studied pocket.Being our benchmark to identify other potentially relevant interactors for BPA, we retrieved all the cavities presenting a score higher than 0.6, meaning, that they accommodate similarly to, or better than, ERα.17.991 and 7.624 cavities were detected as putative interactors for BPA and EST respectively, belonging to 2.773 and 1.608 proteins (Tab.1).Taking into account that for one protein many X-ray structures can be available (e.g for human ERα the number of proteins is 1, the number of structures available in the Protein Data Bank (October 2014) is 123, the number of cavities is 405), the number of binding structures were respectively 10.503 and 5.336.The whole list of screened proteins and BPA and EST hits is available in Tab.S1 3 .

Computational standpoint meets in vitro data
To get insights on how the putative BPA targets retrieved by the virtual screening compare to experimental data, we searched for proteins tested to interact with BPA (experimentally active and inactive targets).We used PubChem Bioassay platform (Wang et al., 2017) and the Comparative Toxicogenomics Database (CTD) (Davis et al., 2018) .Those proteins for which there is no 3D structure available were removed from the set.Finally, BPA was found to be experimentally active against 23 proteins and inactive against 82 proteins.Thus, among the computationally predicted proteins (about 6 thousand between active and inactive) only for a few of them experimental data are available.The same strategy was applied to EST, finding 25 experimentally active targets and 92 inactive targets (Tab.2).For evaluating the robustness of the BioGPS classificator we compared in vitro and in silico activity.Figure 4 shows the ROC enrichment curves resulting from BioGPS scoring (true positive rate versus false positive rate) and the final AUC (Area Under the Curve).AUC is a measure of the successful discrimination between known actives and inactives.The ROC curves clearly show good results of in silico predictions compared with in vitro data.In both cases, in the first 40% of the false positive, almost all true positives were correctly classified.
Table 2 reports all experimentally positive targets.Protein hits are divided into: "common" when they were experimentally active for both BPA and EST; and "selective BPA" or "selective EST" where they were documented active only for one ligand and inactive or not determined for the other one.
Among the BPA selective targets Nuclear factor erythroid 2 related factor 2 (Nrf2) came out as false negative for BPA (in vitro active target predicted as inactive).In reality searching deeply in literature, we found that the induction of Nrf2 through BPA is considered mediated by nitrosylation of Keap1 (Nakamura et al., 2018) and a direct binding to BPA is not reported.
In conclusion, the percentage of correctly predicted active targets is 95% (23 out of 24) for BPA and 88% (22 out of 25) for EST.32% and 54% of the inactive targets were correctly predicted, for BPA and EST respectively (see Tab. S1 3 for further details).Moreover we should highlight that binding is a condition necessary but not sufficient to trigger an activity and there is a distinction between binding (that we predict) and activity (experimental data we compare the prediction with) and in our approach we looked not only for the very well know binding site but for new potential ones that may or may not trigger the final activity.Interestingly, Graves et al. (2005) addressed the difference between true binders and non-binders for a target defining FP hits as "decoys" and distinguishing: i) geometric decoys, when an incorrect configuration of a ligand in a binding site is predicted and ii) hit list decoys, when a compound is predicted to bind but, on experimental testing, is found not to bind at relevant concentrations.Indeed "inactive" means not active a relevant concentration, therefore a binding can still exist.

In silico binding promiscuity and protein binding fingerprint profile
We calculated for BPA and EST the in silico protein-binding promiscuity index (isPI), defined as the percentage of receptors that bind versus the total number of receptors tested.It lies between 0 and 1, where 1 indicates that the ligand shows a complementarity with all proteins tested.BPA was found more promiscuous (isPI = 0.44) than EST (isPI = 0.25).The space of structural promiscuity is vast: thousands of potential binders were predicted.Interestingly, as already reported by other authors (MacKay and Abizaid, 2018), the vast binding profile does not involve solely Nuclear Receptors (Fig. 5).Indeed based on Panther (Thomas et al., 2003;Mi et al., 2010) protein classification, we grouped putative BPA and EST targets into different families.'Receptor' family was split into its subfamilies 'nuclear hormone receptors', 'G-protein coupled receptors' and 'other receptors' to discriminate nuclear and membrane receptors.Using the number of proteins for each retrieved family, we created what we called 'protein-binding fingerprint profile' for both BPA and EST (Fig. 5).The number of proteins binding is represented as circles of different sizes (the larger the circle, the more proteins); furthermore the percentage of protein hits was computed for each family (indicated as color gradient from green to red), to avoid bias introduced by the distribution of families among the entire dataset (see Methods).Indeed, looking at the distributions, we can deduce that a great number of BPA and EST protein hits belong to transferases class (big circle).Nevertheless, paying attention to percentage (blue/green circles), we can conclude that only a small part of available transferases in the screened database were actually retrieved as protein hits for BPA and EST (the 16.03% and the 10.46% respectively).What we learn from the analysis is that a broad array of enzymes (consisting of kinases, oxidoreductases, hydrolases, and many others), G-protein coupled receptors and transporters, contribute to the BPA in-silico protein-binding profiling.
Compared to BPA, EST exhibited a similar binding profile.Indeed, the relative dimensions of circles (Fig. 5) are very similar between the two 'protein-binding fingerprint profiles' (transferase is bigger than enzyme modulator, that is bigger than hydrolase, etc.).Nevertheless, what it is clear is that the BPA profile showed both higher percentages (more blue) and bigger absolute dimensions reflecting its higher promiscuity.This means that BPA tends to bind a higher percentage of proteins in the database, across different families.We then focused the analysis on proteins that are predicted either as selective targets for BPA, or EST, or for both ligands.1.254 proteins came out as selective hits for BPA, while 81 for EST.1.547 proteins were predicted to potentially bind both BPA and EST (Fig. 6a).We also classified common and selective targets according to Panther classification (Fig. 6b).
As expected, half of the protein hits were commonly targeted by BPA and EST: the high number of common hits is probably due to the similar structures they have, both constituted by a hydrophobic body and two terminal OH groups.Selective BPA putative interactors (43%) represent a high percentage of protein hits.From this structural standpoint, BPA has a wider binding profile, and potentially activity pattern, as compared to a physiological estrogen.This may be due to its simpler structure, easily adaptable to many cavities.In conclusion, from this in silico analysis we have learned that BPA has property as a promiscuous binder, as already discussed by other authors (MacKay and Abizaid, 2018), supporting the view that toxicological investigation should enlarge its panorama.

Discussion
We explored the opportunity to apply a new in silico workflow, based on whole proteome virtual screening, to support toxicological characterization of small molecules.The ultimate goal was to determine the nature and extent of chemical protein binding profile and to study the feasibility to identify potential multiple new relevant targets and mechanisms of action for food chemicals.Results provided a strong correlation between BioGPS predictions and freely available in vitro data, indicating the predictive value of the model.Unanticipated data, never reported earlier, were obtained.The virtual screening of all human and rodent 3D structures (30153 structures) revealed a high BPA protein binding promiscuity (expressed as the ratio between proteins predicted to bind and total proteins tested).With respect to protein binding, BPA was predicted to potentially interact with more than 2000 proteins.EST was predicted to bind proteins following a very similar profile, but with a lower promiscuity.Overall, these data throw a very new light on protein-ligand interactions, which may profoundly challenge how mode of action and AOP are currently viewed and understood.The data obtained in this study are in line with the complex and numerous experimentally documented biological and toxicological effects, as well as mechanistic information reported for BPA (Murata et al., 2018;EFSA 2015).This method may constitute a breakthrough tool reflecting the real complexity of chemical-biological interactions.It has to be underlined that this work constitutes a first step, and that the actual application of these data, e.g. for safety evaluation, it will require additional research (on going).This includes the assessment of many more chemicals expressing different types of toxicity through different mechanisms and the in vitro screening of additional proteins not yet investigated.It is important to note that the approach, as applied above, is qualitative and does not give any insight on how much of the substance is required to activate pathways and result in an effect.Indeed, it does not yet provide an estimate of potency.This constitutes a key research focus in order to exploit better BioGPS in a context of risk assessment.
In any case, in silico protein binding fingerprint can already give a new and broader overview on how to determine compound similarity, supporting and strengthening the application of read-across, one of the most used alternative method in toxicology (Hartung, 2016;Grace et al., 2017).Indeed read-across is based on filling toxicological data gaps with data available for similar chemical(s), called analogue(s), and the main challenge in its application is the selection of such similar compounds.In fact, molecules cannot be considered similar in absolute terms (only based on their chemical structure), but their similarity should be evaluated with respect to their biological activity.Applying our strategy, analogues definition, used in read-across, can be referred not only to chemical structure or a single mechanism of action (e.g.binding to ER in case of BPA), but on protein binding fingerprint covering the full spectra of potential MIEs.In this context we think that our approach can be considered as one step forward to improve and strengthen the use of in silico prediction in a context of chemical risk assessment.It can also support priority setting to direct further toxicological studies and to better design targeted in vitro and in vivo experiments.Indeed, 3D protein binding prediction is valuable for getting insight about possible mechanisms of action concealed behind in vivo effects.
Our approach could be considered complementary to those developed under large research initiatives such as U.S. Tox21 (Thomas et al., 2018) and ToxCast (Richard et al., 2016), based on in vitro screening using hundreds of bioassays (815 distinct assay endpoints for 1060 environmental-relevant chemicals for EPA ToxCast program, while Tox21 program has run more chemicals in a smaller number of assays).Similar to the method presented here, such broad approaches provide a large spectrum of potential biological effects.In silico approach is faster, cheaper and more comprehensive than the in vitro screening and it can be helpful for the interpretation and prioritization of in vitro assay (e.g.all information regarding protein binding profile for BPA and EST are reported 3 and they can trigger further in vitro screening).In silico screening can be directly applied only to pure substances, and mixtures as such are obviously not applicable.However, testing the known single constituents of a mixture with BioGPS could support grouping them according to binding profiles; and consequently, it could play a significant role to decide on the application of combined toxicology principles to assess the safety of the full mixture.Moreover, our method is feasible for testing compounds difficult to obtain, such as metabolites and impurities, and molecules onerous to isolate and synthetize.We strongly believe that together with ToxCast and Tox21 program, combining the use of hundreds of in vitro assays for many environmental-relevant chemicals, as well as a broad in silico screening of molecular targets could help in elucidating the potential Adverse Outcome Pathways (AOP) through which chemicals may cause adverse effect in vivo.
In conclusion, BioGPS appears as a promising tool, not only for polypharmacology, but also for mechanistic toxicological investigations.It is in line with current initiatives to develop alternative methods for toxicology, which rely less on high dose animal studies but more on human relevant and mechanistic information.

Fig. 1 :
Fig. 1: BPA fits into a cavity (left) with a high complementarity score Red and green pockets indicating respectively areas where hydrogen bonding acceptor and hydrophobic moieties are favorable hosted.Red dashed circles highlight that BPA -OH groups fit well in hydrogen bonding acceptor MIFs of the binding site.Green dashed circle indicates good complementarity of BPA hydrophobic moieties in hydrophobic MIFs of the binding site.BPA unfavorable fitting (right) with low complementarity score.

Fig. 2 :
Fig. 2: Binding of BPA to the estrogen receptor α (a) Cavities detected on Estrogen Receptor α (PDBcode: 3uu7).Monomer 1 and 2 are displayed in green and cyan respectively.Cavities 1 and 2 are the well know and extensively studied BPA binding sites, the other cavities represent the 'secondary' sites discovered by BioGPS.(b) BioGPS complementarity scores of BPA/EST against Estrogen Receptor α cavities (c) Superposition of BPA and EST poses generated by BioGPS, and X-ray structure of Estrogen Receptor α.X-ray ligands are displayed in cyan, BioGPS poses in magenta, the Estrogen Receptor α [left PDB code: 3UU7; right PDB code: 1A52] in green, and the binding site in grey.

Fig. 3 :
Fig. 3: Distributions of score values obtained by virtual screening of BPA (blue bars) and EST (green bars) against all cavities

Fig. 5 :
Fig. 5: Protein-binding fingerprint profiles for BPA and EST Distributions of BPA and EST hits among protein classes is represented as circles of different dimensions.Percentage, computedfor each family, of retrieved protein hits on total available proteins in screened database is reported as gradient color from green to red.

Fig. 6 :
Fig. 6: Hits and protein binding selectivity classification for BPA and EST (a) Pie chart reporting the fraction of hits detected only for BPA, only for EST and for both of them (selective and common hits).(b) Selective and common proteins classification.

Tab. 2: In silico binding prediction of the proteins known to be targeted by BPA and EST (only in vitro active are reported).
In green we reported proteins for which in silico predictions were confirmed by in vitro data, in red proteins for which in silico prediction was not confirmed by in vitro data.