Workshop Report 28

Group 2B

The group was asked to consider the following set of issues covering statistical aspects of SSD use in the order presented (order not according to importance):

Review current tools and key (statistical) methodology, including assumptions about distributions of sensitivity, use of hierarchical models, interspecies correlations. Identify where there are important differences and what the implications of these could be.
As sensitivity to chemical stress seems to be related to taxonomic closeness, how could this be used in the construction and interpretation of SSDs?
Do models based on prior knowledge provide advantages over other methods?
Are current modelling success criteria, such as those identified in the REACH TGD, sufficient, overly prescriptive or insufficient?
What are the research needs?

Although the group was guided by these issues, the discussion was not structured around the questions and, ultimately, the group decided to report back on 3 main themes. Those themes yielded over-arching notions that are valuable for answering the detailed questions and for the application of SSDs and criteria. Building on the presentations and discussions of the previous workshop day regarding ecological aspects and validity, for example, it was made clear that various ecological aspects are currently being addressed in SSD-related risk assessments. Further, SSDs have various uses beyond criteria setting.

Initial Reactions and Thoughts from the Syndicate

At the start of the discussions, each member of the group had an opportunity to highlight what they thought was the key issue surrounding the statistical models being discussed at the workshop. Here is the list of comments and questions that individuals thought were important for us to discuss:

What are the limitations of the statistical models? We need more experience with the models to understand the limitations and the differences. The influence of several data aspects on different models is uncertain. Factors include, but are not limited to, amount of data, confounding factors, and sensitivity to natural variability.
What are the possibilities for using the models for extrapolation outside the existing data? More experience is needed in understanding how chemicals and aquatic organisms differ in different climatic regimes and aquatic environments. It is uncertain how biological functions might differ between species and how structure might be different in various food webs.
What extra ecological knowledge needs to be included in the models to add further value? The connection between mathematical models and ecology is tenuous. While the goal is to mimic biological response mathematically, it is important to understand the influential biological and ecological factors that need to be measured and accounted for in deriving models tailored to the problem. There are pathways to address ecological information in SSDs, for example via hSSD, or by tailoring the model to a site, water body or system. Questions that then arise include when and how to do that, and how to handle spatial differentiation in modelling results (here: criteria). Would tiering apply to this (i.e. a generic model for ‘the’ ecosystem and tailored models for ‘this’ ecosystem)? Improving the connection between statistical models and ecological relevance will help to reduce uncertainty and elevate confidence in the protectiveness of threshold values derived from the models.
Is the use of SSD curves the best approach; are there other approaches? Tailoring and tiering also apply here. The regulatory community has prescribed the statistical methods behind SSD curves, and that appears to be one of the main reasons, next to versatility, why SSD curves are being used. It would be useful to understand whether sufficient effort has been invested in exploring other options that provide equal or better confidence in the threshold values than the SSD approach. Doing so would also add confidence to the current approach and perhaps point in new directions for improvements.
What biological, chemical, and ecological aspects are important and, perhaps, unaddressed in the current SSD approach? We should not throw away what is known in terms of mode of action, species differences, chemical interactions in mixtures, exposure scenarios, food webs etc. We do not understand well how these aspects translate into a statistical model, though hSSD is an example of incorporating novel data and concepts within the classical ‘flat’ SSD modelling which is based on ecotoxicological test data and model choice.
Are taxonomic similarities and differences important when extrapolating between species or adopting only available biota-response data (as opposed to a prescribed mix of different taxonomy or species)? There seems to be strong evidence that taxonomic closeness implies similar toxicological responses. It would be worth investigating similar relationships for mode of action or exposure scenarios. For mode of action, specific modes of action are expected to have a stronger numerical effect on SSD-modelling outcomes than narcotic action.
Has sufficient attention been given to whether SSD curves answer the important questions that underlie this work? There should be careful consideration of the modelling assumptions. In particular, regulators and scientists should ask themselves what biota or ecology they are trying to protect.
Does the use of SSD curves replace or complement risk assessment? Using current models seems to be tinkering on the edge of what we expect from risk assessment. Are we trying too hard to re-package risk assessment and its elements of exposure, dose-response, ecotoxicity and uncertainty into a different statistical method?

In conclusion, the statistical questions posed are relevant, but they gain perspective when considered in relation to wider scientific developments on, e.g., mode of action and ecological aspects linked to SSDs. The group took a step back from the detailed questions to consider the context for applying SSD curves, the information used to populate an SSD curve, and the interpretation of results generated by the model. It summarised the above discussions and perspectives into 3 main themes: (1) thoughtful decision processes, (2) inclusivity with regards to available data, and (3) interpretation of the uncertainty estimates from different models.

(1) Thoughtful decision processes

A decision tree (or other formal decision process) could be used to help structure the prior thinking and the subsequent steps in the hazard or risk assessment. Risk assessments do have very different problem definitions, and do imply different data availabilities, while asking for either generic or specific answers. This context suggests a formal decision process. SSDs could then be contained within such a defined process, and their role, which may be small, will be appropriate for the risk assessment problem in hand. Given the emergence of new chemicals every day, a thoughtful decision process can help to anticipate key chemical and ecological characteristics that may be more or less important relative to the evaluation of other chemicals in the same or similar aquatic systems.

It was noted, for example, that the REACH Technical Guidance Document (ECHA, 2011) is probably the most well documented example of how a decision process can guide the framework for conducting a risk assessment that yields proposals for criteria (PNECs). The same approach to a decision process might also be useful to guide the development of SSD curves and of similar tools for understanding biological/ecological responses to chemicals in the environment. In the context of SSD curves and similar models, the decision process could be extended further to incorporate recommended approaches for handling old data, new data and different types of knowledge. A decision process will help risk assessors to plan, implement and evaluate how to apply models and decide what data to use. In fact, decision processes can help us to understand the value of the model and data. A proper decision process yields approaches that are best tailored to the problem definitions that exist, and thus offers contextual flexibility (which question is answered, which approach is chosen) as well as transparent consistency (given a chosen method, there is a clear way to apply it in that context).

Also, a well-documented decision process will add transparency, which is needed to improve current practices. At present, the decision process used to select information for populating an SSD curve is known only to the extent that the developer has openly identified the assumptions used to judge the quality of available data and to select certain studies or aquatic species and not others.

As part of developing useful decision trees or data evaluation processes, regulators and scientists need to ask the following questions: What are the problem definitions for which the SSD model is applied? Where are current processes recorded? Do individual organisations have different assessment procedures when using SSD results? For example, RIVM refers to technical guidance and certain rules for what procedure(s) to follow depending on the circumstances of the risk assessment. However, the danger of over-prescription should be avoided. Guidance and recommendations are preferable to hard do-and-don't rules. There is a worry that guidance can rapidly evolve into rules, which can lead to fossilisation of statistical models and approaches, or to major communication problems when an SSD model is used in a different context (e.g. disaster management as opposed to the generic risk assessment for which the SSD was derived) while the rules set for criteria derivation are still assumed valid. Therefore, any guidance that is produced must also be future-proof.

Lastly, there is a concern that overly prescriptive models and data assessment methodologies might stop regulators and scientists from thinking about the assumptions of the models and their best use for individual risk assessments. Coupled with this is the concern that regulation might constrict the process because transparency and uncertainty are difficult aspects to accept. The key for any process (and associated statistical methodology) is that it must be fit-for-purpose. We do not wish to be regimented in our assessment approaches, given the context of various applications of risk assessment.

(2) Inclusivity with regards to available data

Undoubtedly, scientists and regulators wish to use as much data as possible in the risk assessment process, including the populating of SSD curves. A formal weighting process is one approach that can help to qualify all available data and prioritise the importance and value of data.

To achieve this, there is a need for a more formal data evaluation process. There are often several good reasons for excluding data from an analysis (e.g. chemical purity issues, exposure and biological issues, and poor reporting of the testing procedures). However, attention should be given to whether so-called discarded/rejected data might have some utility in the evaluation of model results. Exploring models with both high- and low-quality data might be a logical first step, before settling on a formal and final model and analysis. The early consideration of all available data offers the opportunity for insight on chemical and ecological attributes that might be missed when certain data sets are removed from the risk assessment. Care should be taken, though, that a less strict approach to data evaluation does not bias the SSD, as would be expected if, for example, a very volatile substance is tested in open vessels and the test concentration is not verified.

A principled weight-of-evidence approach is important for the final outcome. The practice of including and excluding data sets in a systematic manner, to explore the influence of different data sets used to develop an SSD curve and perform a risk assessment, could be particularly valuable in a weight-of-evidence scheme. Such a scheme should be included with the presentation of results of modelling and risk assessment.
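The systematic inclusion and exclusion of data described above can be sketched as a leave-one-out analysis. The example below is a minimal illustration only, assuming a log-normal SSD fitted by the sample mean and standard deviation of log10 toxicity values; the function names, the species labels and the toxicity values are hypothetical and are not taken from the workshop material.

```python
import math
from statistics import NormalDist, mean, stdev


def hc5_lognormal(toxicity_values):
    """HC5 of a log-normal SSD fitted by moments to log10 toxicity values."""
    logs = [math.log10(v) for v in toxicity_values]
    z05 = NormalDist().inv_cdf(0.05)  # 5th percentile of the standard normal
    return 10 ** (mean(logs) + z05 * stdev(logs))


def leave_one_out_hc5(records):
    """Return the full-data HC5 and, per record, the ratio HC5(without it) / HC5(all).

    records: list of (label, toxicity_value) pairs (hypothetical structure).
    """
    full = hc5_lognormal([value for _, value in records])
    ratios = {}
    for i, (label, _) in enumerate(records):
        rest = [value for j, (_, value) in enumerate(records) if j != i]
        ratios[label] = hc5_lognormal(rest) / full
    return full, ratios


# Hypothetical EC50 values (e.g. in ug/L) for eight unnamed species.
records = [
    ("species A", 1.2), ("species B", 3.5), ("species C", 8.0),
    ("species D", 15.0), ("species E", 40.0), ("species F", 95.0),
    ("species G", 210.0), ("species H", 600.0),
]

full_hc5, influence = leave_one_out_hc5(records)
print(f"HC5 with all data: {full_hc5:.3g}")
for label, ratio in influence.items():
    print(f"  without {label}: HC5 changes by a factor of {ratio:.2f}")
```

A ratio far from 1 flags a record whose inclusion or exclusion drives the result. The same loop could be run over groups of records (for example, all studies rejected on quality grounds) rather than single values.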

Data are not available for all possible risk assessment scenarios. It was felt that improvement is needed in both the quality and breadth (in terms of taxonomic diversity, modes of action etc.) of data used in assessments. For instance, Web-ICE is currently based upon a data repository that is fit for US risk assessments. It would be useful to know if Web-ICE and similar approaches could be fit for risk assessments in other regulatory arenas.

(3) Interpretation of uncertainty estimates

It was evident to the group that the uncertainties associated with the HC5 results reported using each of the 3 primary statistical models discussed in the plenary meeting (i.e. ETX/R, Web-ICE and hSSD) are not the same in kind, representing different aspects of underlying variability, though numerically (partly) overlapping or (often) in the same order of magnitude. The different approaches to choice of data and data interpretation generate different types of uncertainties. The attributes of the uncertainties must be communicated properly.

For example, the confidence intervals reported using the ETX method stem from uncertainty in the fitted parameters of the underlying statistical distribution. The confidence intervals from the Web-ICE method attempt to capture uncertainty caused only by cross-species extrapolation. The credible intervals reported in the hSSD method stem from the characterisation of uncertainty about the underlying biological-response data and taxonomic differences. Each of these models is highlighting different model limitations.
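To make the first of these uncertainty types concrete, the sketch below shows how uncertainty in the fitted distribution parameters propagates into an interval around the HC5. It is an illustration under stated assumptions, not the ETX algorithm: a log-normal SSD is assumed, and a simple percentile bootstrap is substituted for whatever interval method a given tool implements. The function names and the toxicity values are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def hc5_lognormal(toxicity_values):
    """HC5 of a log-normal SSD fitted by moments to log10 toxicity values."""
    logs = [math.log10(v) for v in toxicity_values]
    z05 = NormalDist().inv_cdf(0.05)  # 5th percentile of the standard normal
    return 10 ** (mean(logs) + z05 * stdev(logs))


def bootstrap_hc5_interval(toxicity_values, n_boot=2000, level=0.90, seed=1):
    """Percentile bootstrap interval for the HC5 (an assumed stand-in method).

    Each resample refits the mean and standard deviation, so the spread of the
    resulting HC5 values reflects parameter uncertainty only.
    """
    rng = random.Random(seed)
    n = len(toxicity_values)
    estimates = sorted(
        hc5_lognormal([rng.choice(toxicity_values) for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Hypothetical EC50 values (e.g. in ug/L) for eight species.
example_ec50s = [1.2, 3.5, 8.0, 15.0, 40.0, 95.0, 210.0, 600.0]

point = hc5_lognormal(example_ec50s)
low, high = bootstrap_hc5_interval(example_ec50s)
print(f"HC5 = {point:.3g}, 90% bootstrap interval = ({low:.3g}, {high:.3g})")
```

An interval of this kind says nothing about cross-species extrapolation error or taxonomic structure, which is why it cannot be compared like for like with the Web-ICE or hSSD intervals.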

This itself highlights the importance of data and model transparency when interpreting SSD curves and risk assessment models. There should be no blind application of statistical models (for instance, we should be concerned if the data underpinning the SSD show multi-modal behaviour and we are fitting a unimodal distribution out of habit or procedural prescription). Because uncertainty influences assessment factor specifications in some regulatory arenas, care must be taken in the interpretation of the uncertainty. Characterising and interpreting uncertainty correctly could influence the interpretation of SSD curves and risk assessments, and encourages maximum insight from available data.

In addition to the uncertainty associated with the statistical modelling approach, there is a more general concern about ecological relevance and the interpretation of the models and the associated uncertainties. SSD curves as they are produced may be a fundamental misfit to the ecology and exposure conditions of the exposed ecosystem(s) of concern, and neither knowledge of the ecosystems nor the SSD model itself may be flexible enough to capture that variability in nature. By their nature, SSDs are statistical models, which can be expected to incorporate ecological information only to a limited extent. They are, and will probably remain, lower-tier approaches in terms of addressing ecological relevance. This issue raises questions about the predictive accuracy of statistical models based entirely or predominantly on data extrapolation. We also need to be clear about whether the results of SSD curves are correctly reported in terms of the HC5 alone, and whether the upper and lower bounds of uncertainty should also be reported and considered in regulatory decision-making.

Research questions

Throughout the discussion, the group identified research questions and paths for new or additional work that could help to improve modelling and risk assessment.

What are the limitations of the models and are they fit for purpose?
What are viable methods for incorporating all relevant data?
Is it possible to treat mode of action in the statistical models in the same way as taxonomic distance is being used? (In particular, is this feasible for Web-ICE and hSSD?)
Can a formal decision tree approach that is inclusive of the available data and is transparent be defined?
What additional ecological knowledge needs to be included to add value for the risk assessors?