Moving towards making (quantitative) structure-activity relationships ((Q)SARs) for toxicity-related endpoints findable, accessible, interoperable and reusable (FAIR)
Abstract
(Quantitative) structure-activity relationships ((Q)SARs) are widely used in chemical safety assessment to predict toxicological effects. Many thousands of (Q)SAR models have been developed and published; however, few are easily available to use. This investigation applied previously developed Findability, Accessibility, Interoperability, and Reusability (FAIR) principles for in silico models to six different published machine learning (ML) (Q)SARs for the same toxicity dataset (inhibition of growth of Tetrahymena pyriformis). The majority of the principles were met; however, gaps remain in making (Q)SARs FAIR. This study has provided insights into, and recommendations for, the FAIRification of (Q)SARs, including areas where more work and effort may be required. For instance, (Q)SARs still need to be associated with a unique identifier, and full data and metadata for toxicological activity or endpoints, molecular properties and descriptors, as well as the model description, need to be provided in a standardised manner. A number of solutions to these challenges were identified, such as building on the QSAR Model Reporting Format (QMRF) and applying the QSAR Assessment Framework (QAF). This study also demonstrated that resources such as the QSAR Databank (QsarDB, www.qsardb.org) are valuable for storing ML QSARs in a searchable database and also provide a Digital Object Identifier (DOI). Many activities related to FAIR are currently underway, and (Q)SAR modellers should be encouraged to utilise these to move towards easier access to, and use of, models. Enabling FAIR computational toxicology models will support overall progress towards animal-free chemical safety assessment.
Plain language summary
This study relates to the availability of computational (termed in silico) models to predict the harmful effects of substances from a knowledge of chemical structure alone. The specific models referred to are (quantitative) structure-activity relationships ((Q)SARs), which have been developed for many endpoints. Six machine learning models for toxicity were assessed against existing principles intended to make (Q)SARs findable, accessible, interoperable and reusable (the FAIR principles). Evaluation of existing models against the FAIR principles highlighted a number of areas where progress is required to ensure that (Q)SARs are available for use. Currently there is no standard means to store (Q)SARs or to provide them with a unique identifier, although the QSAR Databank (QsarDB) was illustrated as one possible solution. It is also crucial to record the metadata associated with a model, such that it may be reproduced. A standardised ontology is required to facilitate the effective and accurate storage of models and data.
This work is licensed under a Creative Commons Attribution 4.0 International License.
Articles are distributed under the terms of the Creative Commons Attribution 4.0 International license (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is appropriately cited (CC-BY). Copyright on any article in ALTEX is retained by the author(s).