HC5 estimation in SSDs revisited
Tom Aldenberg
RIVM, the Netherlands
Species sensitivity distributions (SSDs), in their basic form defined as univariate continuous statistical distributions over a logarithmic species sensitivity concentration axis for a particular chemical substance, can be applied in environmental risk assessment to estimate a predicted no-effect concentration (PNEC) for that toxicant. In many cases this PNEC is implemented as a statistical estimate of the log HC5, the log concentration hazardous to 5% of species. This minimalist model, originally due to Kooijman (1987) and Van Straalen and Denneman (1989), needs extension to address a multitude of conceivable challenges, e.g. with regard to species selection, ecosystem representativeness and functioning, data quality, statistical model selection, and predictive evaluation of the SSD and its quantiles.
This presentation first reviewed how we handled the uncertainty of the log HC5 for the Logistic and Normal distributions from a Bayesian viewpoint. Second, we developed the estimation of the so-called predictive distribution (formally, the mean of the Bayesian spaghetti plot SSD) in order to pinpoint a single-curve SSD for a given statistical family. This leads to an improved estimate of the log HC5, or of any other quantile, that better reflects the uncertainty due to small sample size. At present, we consider the ubiquitous median estimate of the log HC5 to be unrealistically insensitive to small sample size, and hence at risk of being insufficiently conservative. This is compounded by the fact that the 5% and 95% confidence limits of the log HC5 are often not reported. The Bayesian predictive distribution method yields a new table of extrapolation constants, addressing both chronic and acute species sensitivity data, depending on the basic fraction affected. The sensitivity of these new extrapolation constants is evaluated in the light of the REACH-required sample sizes of 10, preferably 15.
A recurring concern is the effect of uncertainty in the log species toxicity data. Operationally, this may derive from having multiple data for the same species, from dose-response curve confidence limits, from QSAR-estimated toxicity data with associated confidence, and possibly from a host of other sources. Intuitively, one would expect data uncertainty to further lower both old and new log HC5 estimates, but methods of hierarchical modelling reveal that the reverse is the case: the more variation has to be attributed to the individual species points, the less variation remains for the SSD itself. Surprisingly, both theory and numerical experiments show that the effect of data uncertainty is quite modest, leading to the recommendation to take the mean of each log data point's uncertainty distribution and continue with the old, or updated, extrapolation methodology as if the data were certain.
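The contrast between the median and the predictive log HC5 can be illustrated with a small Monte Carlo sketch. It assumes a Normal SSD and the noninformative prior p(mu, sigma) proportional to 1/sigma; the abstract does not state the prior, so this is an assumption, and the function name and the ten log10 toxicity values are hypothetical. Each posterior draw of (mu, sigma) is one curve of the spaghetti plot. Drawing one new species sensitivity per curve samples from the mean curve, i.e. the predictive distribution.

```python
import random
import statistics


def posterior_hc5_summary(log_tox, n_draws=20000, fraction=0.05, seed=1):
    """Monte Carlo sketch: Normal SSD, assumed prior p(mu, sigma) ~ 1/sigma."""
    rng = random.Random(seed)
    n = len(log_tox)
    xbar = statistics.mean(log_tox)
    s2 = statistics.variance(log_tox)  # sample variance, n - 1 denominator
    z = statistics.NormalDist().inv_cdf(fraction)  # standard Normal quantile

    hc5_draws = []         # log HC5 of each posterior SSD curve
    predictive_draws = []  # one new species sensitivity per curve
    for _ in range(n_draws):
        # sigma^2 | data ~ (n - 1) s^2 / chi-square(n - 1)
        chi2 = rng.gammavariate((n - 1) / 2.0, 2.0)
        sigma = ((n - 1) * s2 / chi2) ** 0.5
        # mu | sigma, data ~ Normal(xbar, sigma^2 / n)
        mu = rng.gauss(xbar, sigma / n ** 0.5)
        hc5_draws.append(mu + z * sigma)
        predictive_draws.append(rng.gauss(mu, sigma))

    # quantiles(..., n=20) returns the 5%, 10%, ..., 95% cut points
    pct = statistics.quantiles(hc5_draws, n=20)
    predictive_hc5 = statistics.quantiles(predictive_draws, n=20)[0]
    return {
        "lower": pct[0],     # 5% limit of log HC5
        "median": pct[9],    # median log HC5
        "upper": pct[18],    # 95% limit of log HC5
        "predictive": predictive_hc5,
    }


# Hypothetical log10 toxicity data for 10 species (not from the presentation)
log_tox = [0.2, 0.5, 0.9, 1.1, 1.3, 1.4, 1.8, 2.0, 2.3, 2.7]
result = posterior_hc5_summary(log_tox)
print(result)
```

Under these assumptions the predictive log HC5 falls below the median estimate, and the gap narrows as the sample size grows, which is the sense in which the median is insensitive to small samples.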
Averaging multiple data per species was already recommended in REACH (EC, 2006). It follows that using such per-species averages, or employing point estimates (i.e. expected values) of model-estimated species sensitivities, leads only to slightly more conservative, i.e. lower, estimates of the PNEC values being pursued. The new insights into the predictive SSD and into the effect of data uncertainty would both help to alleviate the need for assessment factors addressing these particular issues.
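The direction and size of the data uncertainty effect can be sketched with a simple method-of-moments calculation. This is a stand-in for illustration, not the hierarchical modelling referred to above. It assumes each observed log toxicity value is a true species sensitivity plus independent Normal error with a known standard deviation, so the observed variance is approximately the SSD variance plus the mean error variance. The function name and all numbers are hypothetical.

```python
import statistics


def hc5_with_data_uncertainty(species_means, species_sds, fraction=0.05):
    """Plug-in log HC5 ignoring vs. subtracting per-species error variance."""
    n = len(species_means)
    xbar = statistics.mean(species_means)
    s2 = statistics.variance(species_means)  # observed between-species variance
    mean_tau2 = sum(sd ** 2 for sd in species_sds) / n  # mean error variance
    # Variance left for the SSD itself once data uncertainty is attributed
    sigma2 = max(s2 - mean_tau2, 0.0)
    z = statistics.NormalDist().inv_cdf(fraction)
    naive = xbar + z * s2 ** 0.5         # data treated as certain
    adjusted = xbar + z * sigma2 ** 0.5  # data uncertainty accounted for
    return naive, adjusted


# Hypothetical per-species mean log10 toxicities and their standard deviations
means = [0.2, 0.5, 0.9, 1.1, 1.3, 1.4, 1.8, 2.0, 2.3, 2.7]
sds = [0.2] * len(means)
naive, adjusted = hc5_with_data_uncertainty(means, sds)
print(naive, adjusted)
```

In this sketch the adjusted log HC5 is slightly higher than the naive one, so treating the per-species means as certain errs on the conservative side by a small margin.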