2022 CSEMP Biota Results

The file biota_results.csv summarises the assessment results for each timeseries of contaminants and biological effects in biota in the 2022 assessment of the UK's Clean Seas Environment Monitoring Programme (CSEMP). The results come from timeseries models fitted to the contaminant and biological effect data in biota_data.csv which in turn are based on data extracted from the MERMAN database on 11 January 2023 (supplemented by micronucleated cell and fish disease index data provided by the Marine Directorate of the Scottish Government and the Centre for Environment, Fisheries and Aquaculture Science). Details of the modelling procedures can be found in the method help files for contaminants, PAH metabolites, imposex and other biological effects.

The results form the basis for the biota component of the following Descriptor 8 Indicators in the UK's 2025 Marine Strategy Assessment:
* Status and Trends of Metals (including lead, cadmium and mercury) in Biota and Sediment
* Status and Trends in Polycyclic Aromatic Hydrocarbons (PAHs) in Biota and Sediment
* Status and Trends of Polychlorinated Biphenyls (PCBs) in Biota and Sediment
* Status and Trends of Polybrominated Diphenylethers (PBDEs) in Biota and Sediment
* Status and Trends in the Levels of Imposex in Marine Gastropods
* Biological effects (EROD enzyme activity) in fish
* Biological effects (micronucleus) of contaminants in fish
* External Fish Disease
* Status and Trends of Chemicals (excluding metals, PAHs, PCBs and PBDEs) in Water, Biota and Sediment
The Indicators only use results from coastal and offshore stations (estuarine stations are excluded using the variable waterbody_type) and from 'medium' and 'long' timeseries ('short' timeseries are excluded using the variable shape). There are two exceptions: the imposex Indicator also uses results from estuarine stations and the Indicator for 'other chemicals' also uses 'short' timeseries.
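
The selection rules above can be sketched in Python. This is a minimal illustration rather than the code used to produce the Indicators. It assumes each row of biota_results.csv is held as a dict, and that 'short' timeseries are those with the value small_open_circle in shape_env (the text above refers only to 'the variable shape'). The function name and the indicator labels are hypothetical.

```python
def used_in_indicator(row, indicator="contaminants", shape_column="shape_env"):
    """Return True if a results row is kept under the rules described above.

    `indicator` is a hypothetical label: "imposex" also keeps estuarine
    stations and "other chemicals" also keeps 'short' timeseries.
    """
    is_estuarine = row["waterbody_type"] == "Estuary"
    # Assumption: a 'short' timeseries is one plotted as a small open circle.
    is_short = row[shape_column] == "small_open_circle"

    if is_estuarine and indicator != "imposex":
        return False
    if is_short and indicator != "other chemicals":
        return False
    return True


# Three made-up rows, used only to show the behaviour.
rows = [
    {"series": "a", "waterbody_type": "Coast", "shape_env": "downward_triangle"},
    {"series": "b", "waterbody_type": "Estuary", "shape_env": "large_filled_circle"},
    {"series": "c", "waterbody_type": "Open Sea", "shape_env": "small_open_circle"},
]
kept = [r["series"] for r in rows if used_in_indicator(r)]
```

With these made-up rows only series "a" is kept by default; series "b" is also kept for the imposex Indicator and series "c" for 'other chemicals'.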

The file biota_results.csv is UTF-8-BOM encoded, so it can be read directly into Excel, or into R using the function read.csv with the argument fileEncoding = "UTF-8-BOM".
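
For readers working in Python rather than R or Excel, a rough equivalent is sketched below using only the standard library. The function name read_results is hypothetical; the codec name "utf-8-sig" is Python's name for UTF-8 with a byte order mark.

```python
import csv


def read_results(path):
    """Read a UTF-8-BOM encoded CSV file into a list of dicts, one per row."""
    # "utf-8-sig" strips the byte order mark at the start of the file;
    # plain "utf-8" would leave it attached to the first column name.
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return list(csv.DictReader(handle))
```

A call such as read_results("biota_results.csv") then returns one dict per timeseries, keyed by the variable names listed below. All values are returned as strings, so numeric variables need converting before use.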

The data mostly relate to chemical concentrations and the variable descriptions are written with that in mind. In particular, 'concentration' is used to describe the measurement in all time series including biological effects. However, there are some important differences for time series of imposex, ACHE, fish disease index, neutral red retention time, and stress on stress and these are described in the footnotes at the bottom of the document.

The variables in the file are:

series
Description: timeseries identifier
Unit:
Type: categorical
Levels: 5644
Note: links to the timeseries data in biota_data.csv

MSFD_subregion
Description: Marine Strategy Framework Directive subregion
Unit:
Type: categorical
Levels: Greater North Sea, Celtic Seas
Note: in the map of biogeographic regions the Greater North Sea comprises the Northern North Sea, Southern North Sea, and Eastern Channel, whilst the Celtic Seas comprises the Western Channel and Celtic Sea, Irish Sea, Minches and Western Scotland, Scottish Continental Shelf, and Atlantic North-West Approaches.

biogeographic_region
Description: biogeographic region
Unit:
Type: categorical
Levels: Northern North Sea, Southern North Sea, Eastern Channel, Western Channel & Celtic Sea, Irish Sea, Minches & Western Scotland, Scottish Continental Shelf
Notes:
* regional assessments are conducted at the biogeographic regional scale
* there are 8 biogeographic regions but only 7 with data

CSEMP_region
Description: CSEMP region
Unit:
Type: categorical
Levels: 18
Notes:
* there are 22 monitoring regions in the CSEMP but only 18 with data
* the CSEMP regions are aggregated into 8 biogeographic regions for regional assessments

CSEMP_stratum
Description: subdivision of the CSEMP region
Unit:
Type: categorical
Levels: 118
Note: there are 818 CSEMP strata, most of which are WFD water bodies, but only 118 with data

station_governance
Description: organisation responsible for the monitoring station
Unit:
Type: categorical
Levels: CEFAS, DAERA, EANat, MDSG, NRW, SEPA
Notes:
* Centre for Environment, Fisheries and Aquaculture Science (CEFAS)
* Department of Agriculture, Environment and Rural Affairs (DAERA)
* Environment Agency (EANat)
* Marine Directorate of the Scottish Government (MDSG)
* Natural Resources Wales (NRW)
* Scottish Environment Protection Agency (SEPA)

station_code
Description: monitoring station identifier
Unit:
Type: categorical
Levels: 197
Note:

station_name
Description: name associated with the monitoring station
Unit:
Type: categorical
Levels: 197
Note:

station_latitude
Description: station latitude
Unit: decimal degrees
Type: continuous
Range: 50.08, 60.78
Note: this is a nominal position: sampling occurs in a pre-defined area broadly centred on this position

station_longitude
Description: station longitude
Unit: decimal degrees
Type: continuous
Range: -7.11, 2.90
Note: this is a nominal position: sampling occurs in a pre-defined area broadly centred on this position

station_type
Description: type of monitoring station
Unit:
Type: categorical
Levels: B, RH, IH
Note: baseline (B), reference (RH) or impacted (IH)

waterbody_type
Description: station typology
Unit:
Type: categorical
Levels: Estuary, Coast, Open Sea
Note: only coastal and open sea stations are used in regional assessments, apart from the imposex regional assessment where estuarine stations are also used

determinand
Description: contaminant or biological effect
Unit:
Type: categorical
Levels: 98
Notes:
* see ICES reference codes for PARAM
* SBDE6 is the code used for the sum of BDE28, BDE47, BDE99, BD100, BD153 and BD154
* FDI is the code used for the fish disease index

determinand_group
Description: contaminant or biological effect group
Unit:
Type: categorical
Levels: Metals, Organotins, PAH parent compounds, PAH alkylated compounds, PAH metabolites, Polybrominated diphenyl ethers, Organobromines (other), Organofluorines, Polychlorinated biphenyls, Organochlorines (other), Imposex, Biological effects (other)
Note:

species
Description: species
Unit:
Type: categorical
Levels: 6
Notes:
* Limanda limanda: AphiaID = 127139
* Merlangius merlangus: AphiaID = 126438
* Mytilus edulis: AphiaID = 140480
* Nucella lapillus: AphiaID = 140403
* Platichthys flesus: AphiaID = 127141
* Pleuronectes platessa: AphiaID = 127143

matrix
Description: sample matrix (tissue)
Unit:
Type: categorical
Levels: BI, ER, HML, LI, LIS9, MU, SB, WO
Note: see ICES reference codes for MATRX

basis
Description: basis of the assessment
Unit:
Type: categorical
Levels: dry weight (D), lipid weight (L) or wet weight (W)
Note:

unit
Description: unit of measurement
Unit:
Type: categorical
Levels: %, d, idx, mins, nmol/min/mg protein, nr/1000 cells, pmol/min/mg protein, st, ug/kg, ug/ml
Note:

sex
Description: sex
Unit:
Type: categorical
Levels: F, M
Note: only provided for EROD where separate time series are assessed for females (F) and males (M)

method_analysis
Description: method of chemical analysis
Unit:
Type: categorical
Levels: FLM-SS
Notes:
* see ICES reference codes for METOA
* only provided for PAH metabolites where the assessment concentrations depend on the method of analysis

shape_env
Description: the symbol used to summarise the fitted trend when using environmental thresholds
Unit:
Type: categorical
Levels: upward_triangle, downward_triangle, large_filled_circle, small_filled_circle, small_open_circle
Notes:
* upward_triangle: significant (p < 0.05) increase in concentration in the last 20 years
* downward_triangle: significant (p < 0.05) decrease in concentration in the last 20 years
* large_filled_circle: no significant (p > 0.05) change in concentration in the last 20 years
* small_filled_circle: insufficient years of data to test for trends
* small_open_circle: only 1-2 years of data (or a time series dominated by less-than values for which no assessment criterion is available)
* the relevant significance level is given in p_recent_trend
* 'short' timeseries are those with a small_open_circle and are excluded from all Indicator assessments apart from that for 'other chemicals'
* 'medium' timeseries are those with a small_filled_circle
* 'long' timeseries are those where a trend can be fitted (upward_triangle, downward_triangle or large_filled_circle)
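
The correspondence between symbols and the 'short', 'medium' and 'long' labels can be written as a simple lookup. This is an illustrative sketch; the names SERIES_LENGTH and series_length are hypothetical and do not appear in the file.

```python
# Mapping taken from the notes above: symbol -> length of timeseries.
SERIES_LENGTH = {
    "small_open_circle": "short",
    "small_filled_circle": "medium",
    "upward_triangle": "long",
    "downward_triangle": "long",
    "large_filled_circle": "long",
}


def series_length(shape):
    """Return 'short', 'medium' or 'long', or None for a blank or unknown symbol."""
    return SERIES_LENGTH.get(shape)
```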

colour_env
Description: the colour used to summarise the status assessment when using environmental thresholds
Unit:
Type: categorical
Levels: blue, green, red, orange, black
Notes:
* blue: the mean concentration is significantly below the Background Assessment Concentration (BAC) or equivalent (p < 0.05)
* green: the mean concentration is significantly below the Environmental Assessment Criterion (EAC) or equivalent (p < 0.05)
* red: the mean concentration is not significantly below the EAC or equivalent (p > 0.05)
* orange: the mean concentration is not significantly below the BAC or equivalent (p > 0.05) and there is no EAC or equivalent
* black: no assessment criteria
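
One way to read the colour rules is as a small decision function. The sketch below is an interpretation of the notes above, not the assessment code itself. It assumes the outcome of each status test is already known, and that blue takes precedence over green when both tests are passed. The function name and argument names are hypothetical.

```python
def colour_env(below_bac, below_eac):
    """Colour for the environmental status assessment.

    Each argument is True (mean significantly below the criterion),
    False (not significantly below) or None (no such criterion exists).
    """
    if below_bac is None and below_eac is None:
        return "black"   # no assessment criteria
    if below_bac:
        return "blue"    # significantly below the BAC (assumed to take precedence)
    if below_eac:
        return "green"   # significantly below the EAC
    if below_eac is None:
        return "orange"  # not below the BAC and there is no EAC
    return "red"         # not significantly below the EAC
```

For most determinands, a test outcome of True corresponds to a negative value of BAC_diff or EAC_diff (or to a value of 'below' in BAC_below or EAC_below when no parametric model has been fitted).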

shape_health
Description: the symbol used to summarise the fitted trend when using human health thresholds
Unit:
Type: categorical
Levels: upward_triangle, downward_triangle, large_filled_circle, small_filled_circle, small_open_circle
Notes:
* upward_triangle: significant (p < 0.05) increase in concentration in the last 20 years
* downward_triangle: significant (p < 0.05) decrease in concentration in the last 20 years
* large_filled_circle: no significant (p > 0.05) change in concentration in the last 20 years
* small_filled_circle: insufficient years of data to test for trends
* small_open_circle: only 1-2 years of data (or a time series dominated by less-than values for which no assessment criterion is available)
* the relevant significance level is given in p_recent_trend
* 'short' timeseries are those with a small_open_circle and are excluded from all Indicator assessments apart from that for 'other chemicals'
* 'medium' timeseries are those with a small_filled_circle
* 'long' timeseries are those where a trend can be fitted (upward_triangle, downward_triangle or large_filled_circle)

colour_health
Description: the colour used to summarise the status assessment when using human health thresholds
Unit:
Type: categorical
Levels: green, red, black
Notes:
* green: the mean concentration is significantly below the human Health Assessment Criterion (HAC) (p < 0.05)
* red: the mean concentration is not significantly below the HAC (p > 0.05)
* black: no assessment criterion

n_year_all
Description: number of years with data
Unit:
Type: integer
Range: 1, 21
Note:

n_year_fit
Description: number of years included in the statistical analysis
Unit:
Type: integer
Range: 1, 21
Note: some early years might be excluded because they are separated from the bulk of the data by large gaps in time, or because they are dominated by 'less-than' values

n_year_positive
Description: number of years included in the analysis that have at least one concentration measurement above the limit of detection
Unit:
Type: integer
Range: 0, 21
Note:

first_year_all
Description: first year with data
Unit: y
Type: integer
Range: 1999, 2021
Note:

first_year_fit
Description: first year included in the statistical analysis
Unit: y
Type: integer
Range: 1999, 2021
Note: see n_year_fit for explanation

last_year
Description: last year of data
Unit: y
Type: discrete
Range: 2016, 2021
Notes:
* the last year is always included in the statistical analysis
* only timeseries with some data in the last six monitoring years are included in the assessment (i.e. 2016-2021 for the 2022 assessment)

p_nonlinear
Description: the significance of the nonlinear component of the trend
Unit:
Type: continuous
Range: 0, 0.051
Notes:
* this assesses whether log concentrations changed nonlinearly over the monitoring period
* it is based on a likelihood ratio test comparing the smooth model with a linear model and is only given if a smooth model is selected by AICc

p_linear
Description: the significance of the linear component of the trend
Unit:
Type: continuous
Range: 0, 1
Notes:
* this test only has a simple interpretation when the trend is linear (rather than smooth) in which case it assesses whether concentrations changed (log-linearly) over the monitoring period
* it is based on a likelihood ratio test comparing the linear model with the null model (in which only an intercept is fitted)
* for smooth models, the terms p_linear_trend and p_recent_trend are more relevant

p_overall
Description: the overall significance of the trend
Unit:
Type: continuous
Range: 0, 1
Notes:
* this assesses whether mean concentrations changed over the monitoring period
* it is based on a likelihood ratio test comparing the fitted model (smooth or linear) with the null model
* p_overall is identical to p_linear if the fitted model is linear

p_linear_trend
Description: a test of whether the mean concentrations at the start and end of the monitoring period are the same
Unit:
Type: continuous
Range: 0, 1
Notes:
* for linear models, p_linear_trend is identical to p_linear
* for smooth models, p_linear_trend is based on a Wald test that compares the fitted values at the start and end of the monitoring period
* p_linear_trend can be non-significant even if p_overall is highly significant; for example, if concentrations have increased and then decreased by the same amount

linear_trend
Description: an estimate of the change in mean concentration between the start and end of the monitoring period
Unit:
Type: continuous
Range: -44, 29.6
Notes:
* loosely, linear_trend can be interpreted as the percentage yearly change in concentration between first_year_fit and last_year assuming the trend in concentration is log-linear
* formally, linear_trend is 100 times the difference in log concentration between first_year_fit and last_year divided by the difference in years; if the trend in log concentration is linear, this is equivalent to 100 times the yearly change in log concentration
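
The formal definition, and its loose reading as a percentage yearly change, can be illustrated with a short calculation. This is a sketch with made-up numbers. The function names are hypothetical, and the concentrations passed in stand for the fitted mean concentrations in first_year_fit and last_year.

```python
import math


def linear_trend(conc_start, conc_end, first_year_fit, last_year):
    """100 times the difference in log concentration, divided by the difference in years."""
    return 100 * (math.log(conc_end) - math.log(conc_start)) / (last_year - first_year_fit)


def percent_change_per_year(trend):
    """Exact percentage yearly change implied by a trend on the 100 x log scale."""
    return 100 * (math.exp(trend / 100) - 1)


# Made-up example: the fitted mean halves between 2002 and 2021 (19 years).
trend = linear_trend(100.0, 50.0, 2002, 2021)   # about -3.65
percent = percent_change_per_year(trend)        # about -3.58
```

The two numbers are close because, for small yearly changes, 100 times the change in log concentration is approximately the percentage change. The approximation becomes poorer for values near the ends of the reported range.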

p_recent_trend
Description: a test of whether the mean concentration 20 years ago is the same as it is today
Unit:
Type: continuous
Range: 0, 1
Notes:
* p_recent_trend assesses whether the mean concentration in 2002 (or first_year_fit whichever is later) is the same as the mean concentration in 2021 (or last_year whichever is earlier)
* for linear models, p_recent_trend is identical to p_linear_trend
* for smooth models, p_recent_trend is based on a Wald test that compares the fitted values in 2002 (or first_year_fit) and 2021 (or last_year)
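
The two years compared by the recent trend test follow directly from the 'whichever is later' and 'whichever is earlier' rules. A minimal sketch, with a hypothetical function name and with the 2002 and 2021 limits of the 2022 assessment as defaults:

```python
def recent_trend_years(first_year_fit, last_year, window_start=2002, window_end=2021):
    """Return the (start, end) years compared by the recent trend test."""
    start = max(window_start, first_year_fit)  # 2002 or first_year_fit, whichever is later
    end = min(window_end, last_year)           # 2021 or last_year, whichever is earlier
    return start, end
```

For example, a timeseries fitted from 1999 to 2021 is compared between 2002 and 2021, whereas one fitted from 2010 to 2019 is compared between 2010 and 2019.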

recent_trend
Description: an estimate of the change in mean concentration in the last 20 years
Unit:
Type: continuous
Range: -44, 29.6
Notes:
* loosely, recent_trend can be interpreted as the percentage yearly change in concentration in the last 20 years assuming the trend in concentration is log-linear
* formally, recent_trend is 100 times the difference in log concentration between 2002 (or first_year_fit whichever is later) and 2021 (or last_year whichever is earlier) divided by the difference in years; if the trend in log concentration is linear, this is equivalent to 100 times the yearly change in log concentration
* for linear models, recent_trend is identical to linear_trend

detectable_trend
Description: a measure of the power of the time series to detect changes over time
Unit:
Type: continuous
Range: 1.7, 136
Notes:
* the yearly change in log concentration (multiplied by 100) that would be detected with 90% power based on a (two-sided) test at the 5% significance level given 10 years of annual monitoring and variability typical of the time series
* loosely, detectable_trend can be interpreted as the percentage yearly change in concentration detectable with 90% power in 10 years of annual monitoring

mean_last_year
Description: the fitted mean concentration in last_year
Unit: see unit
Type: continuous
Range: -0.46, 317511
Note: the single negative value is for FDI (fish disease index), which can be negative

climit_last_year
Description: the upper one-sided 95% confidence limit on the fitted mean concentration in last_year
Unit: see unit
Type: continuous
Range: 0.000000000000028, 423064
Note:

BAC_type
Description: the name of the Background Assessment Concentration (BAC) or equivalent
Unit:
Type: categorical
Levels: BAC
Note:

BAC_value
Description: the value of the BAC (or equivalent)
Unit: see unit
Type: continuous
Range: 0.0054, 63000
Note: more details can be found in the help files on assessment criteria for contaminants in biota and biological effects

BAC_diff
Description: the difference between climit_last_year and the BAC (or equivalent)
Unit: see unit
Type: continuous
Range: -2175, 360064
Note: a negative value means that the mean concentration in the final monitoring year is significantly (p < 0.05) below the BAC

BAC_achieved
Description: the first year (moving forward) in which the mean concentration is predicted to be below the BAC (or equivalent)
Unit: y
Type: integer
Range: 2016, 3000
Notes: there are four cases
* mean_last_year ≤ BAC: the mean concentration is already below the BAC and BAC_achieved is set to last_year
* mean_last_year > BAC and recent_trend < 0: concentrations are predicted to decrease and BAC_achieved is set to the first year that the predicted concentration is below the BAC, assuming the rate of decrease is given by recent_trend; the year is truncated at 3000 to prevent values getting silly
* mean_last_year > BAC and recent_trend ≥ 0: concentrations are predicted to increase and the mean concentration will never be below the BAC, so BAC_achieved is arbitrarily set to 3000
* mean_last_year > BAC and no trend is estimated: BAC_achieved is left blank
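
The four cases can be sketched as a single function. This is an illustration under stated assumptions, not the code used in the assessment. It assumes that recent_trend is 100 times the yearly change in log concentration (so it does not apply to determinands modelled on other scales) and that the predicted year is rounded up to a whole year. The function and argument names are hypothetical.

```python
import math


def year_achieved(mean_last_year, criterion, last_year, recent_trend=None, cap=3000):
    """First year in which the mean concentration is predicted to be below the criterion."""
    if mean_last_year <= criterion:
        return last_year            # already below the criterion
    if recent_trend is None:
        return None                 # no trend estimated: left blank
    if recent_trend >= 0:
        return cap                  # never predicted to fall below the criterion
    # Years needed for the log concentration to fall to the log criterion.
    years_needed = math.log(criterion / mean_last_year) / (recent_trend / 100)
    if years_needed >= cap - last_year:
        return cap                  # truncate distant years at the cap
    return last_year + math.ceil(years_needed)  # assumption: round up to a whole year


# Made-up example: mean of 20 in 2021, criterion of 10, recent_trend of -5.
example = year_achieved(20.0, 10.0, 2021, recent_trend=-5.0)  # 2035
```

The same logic applies to the EAC and the HAC by passing the corresponding criterion value.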

BAC_below
Description: the result of a non-parametric test of whether mean concentrations are below the BAC (or equivalent)
Unit:
Type: categorical
Levels: above, below
Notes:
* a one-sided sign test, based on the last five monitoring years, is used to test whether mean concentrations are below the BAC; this provides a non-parametric alternative to the parametric test of status based on climit_last_year, and is useful when the data are dominated by less-than measurements
* the status of the time series (colour) is determined by BAC_diff if a parametric model has been fitted, and by BAC_below otherwise
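
A one-sided sign test of this kind can be sketched as follows. The exact form of the test used in the assessment is not given here, so this is a generic version under assumptions: there is one summary value per monitoring year, each value is equally likely to fall above or below the criterion under the null hypothesis, and the 5% significance level is used. The function name is hypothetical.

```python
from math import comb


def sign_test_below(values, criterion, alpha=0.05):
    """One-sided sign test of whether values tend to lie below a criterion.

    Returns ('below' or 'above', p_value).
    """
    n = len(values)
    n_below = sum(v < criterion for v in values)
    # Probability of at least n_below values below the criterion when each
    # value is equally likely to be above or below it.
    p_value = sum(comb(n, k) for k in range(n_below, n + 1)) / 2 ** n
    return ("below" if p_value < alpha else "above"), p_value
```

With five years of data the smallest attainable p-value is 1/32 (about 0.031), so under these assumptions a result of 'below' requires all five values to lie below the criterion.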

EAC_type
Description: the name of the Environmental Assessment Criterion (EAC) or equivalent
Unit:
Type: categorical
Levels: EAC, FEQG, QSsp
Notes:
* EAC = Environmental Assessment Criterion
* FEQG = Federal Environmental Quality Guideline
* QSsp = Quality Standard secondary poisoning

EAC_value
Description: the value of the EAC (or equivalent)
Unit: see unit
Type: continuous
Range: 1.67, 3340
Note: more details can be found in the help files on assessment criteria for contaminants in biota and biological effects

EAC_diff
Description: the difference between climit_last_year and the EAC (or equivalent)
Unit: see unit
Type: continuous
Range: -3340, 22428
Note: a negative value means that the mean concentration in the final monitoring year is significantly (p < 0.05) below the EAC

EAC_achieved
Description: the first year (moving forward) in which the mean concentration is predicted to be below the EAC (or equivalent)
Unit: y
Type: integer
Range: 2016, 3000
Notes: there are four cases
* mean_last_year ≤ EAC: the mean concentration is already below the EAC and EAC_achieved is set to last_year
* mean_last_year > EAC and recent_trend < 0: concentrations are predicted to decrease and EAC_achieved is set to the first year that the predicted concentration is below the EAC, assuming the rate of decrease is given by recent_trend; the year is truncated at 3000 to prevent values getting silly
* mean_last_year > EAC and recent_trend ≥ 0: concentrations are predicted to increase and the mean concentration will never be below the EAC, so EAC_achieved is arbitrarily set to 3000
* mean_last_year > EAC and no trend is estimated: EAC_achieved is left blank

EAC_below
Description: the result of a non-parametric test of whether mean concentrations are below the EAC (or equivalent)
Unit:
Type: categorical
Levels: above, below
Notes:
* a one-sided sign test, based on the last five monitoring years, is used to test whether mean concentrations are below the EAC; this provides a non-parametric alternative to the parametric test of status based on climit_last_year, and is useful when the data are dominated by less-than measurements
* the status of the time series (colour) is determined by EAC_diff if a parametric model has been fitted, and by EAC_below otherwise

HAC_type
Description: the name of the (human) Health Assessment Criterion (HAC)
Unit:
Type: categorical
Levels: MPC, QShh
Notes:
* MPC = Maximum Permissible Concentration
* QShh = Quality Standard human health
* there are no HAC for biological effects (as one might expect)

HAC_value
Description: the value of the HAC
Unit: see unit
Type: continuous
Range: 0.052, 9146
Note: more details can be found in the help file on assessment criteria for contaminants in biota

HAC_diff
Description: the difference between climit_last_year and the HAC
Unit: see unit
Type: continuous
Range: -8771, 12300
Note: a negative value means that the mean concentration in the final monitoring year is significantly (p < 0.05) below the HAC

HAC_achieved
Description: the first year (moving forward) in which the mean concentration is predicted to be below the HAC
Unit: y
Type: integer
Range: 2016, 3000
Notes: there are four cases
* mean_last_year ≤ HAC: the mean concentration is already below the HAC and HAC_achieved is set to last_year
* mean_last_year > HAC and recent_trend < 0: concentrations are predicted to decrease and HAC_achieved is set to the first year that the predicted concentration is below the HAC, assuming the rate of decrease is given by recent_trend; the year is truncated at 3000 to prevent values getting silly
* mean_last_year > HAC and recent_trend ≥ 0: concentrations are predicted to increase and the mean concentration will never be below the HAC, so HAC_achieved is arbitrarily set to 3000
* mean_last_year > HAC and no trend is estimated: HAC_achieved is left blank

HAC_below
Description: the result of a non-parametric test of whether mean concentrations are below the HAC
Unit:
Type: categorical
Levels: above, below
Notes:
* a one-sided sign test, based on the last five monitoring years, is used to test whether mean concentrations are below the HAC; this provides a non-parametric alternative to the parametric test of status based on climit_last_year, and is useful when the data are dominated by less-than measurements
* the status of the time series (colour) is determined by HAC_diff if a parametric model has been fitted, and by HAC_below otherwise

imposex_class
Description: an alternative classification of imposex status
Unit:
Type: categorical
Levels: A, B, C
Note:

Differences for some biological effects

For imposex, linear_trend and recent_trend are the estimated log odds ratio of the VDS (or INTS) of an individual exceeding any given value in one year relative to the previous year.

For ACHE, neutral red retention time, and stress on stress, low values indicate unhealthy organisms. Consequently,
* colour_env is e.g. blue if the mean concentration is significantly above the BAC (p < 0.05)
* climit_last_year is the lower one-sided 95% confidence limit on the fitted mean concentration in last_year
* a positive value of BAC_diff means that the mean concentration in the final monitoring year is significantly above the BAC (p < 0.05)
and so on.
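
The reversed interpretation can be captured with a direction flag when reading BAC_diff, EAC_diff or HAC_diff. A minimal sketch follows. The function name and the low_is_unhealthy argument are hypothetical; which rows need the flag must be decided from the determinand, since the file has no such column.

```python
def passes_criterion(diff, low_is_unhealthy=False):
    """Return True if the confidence limit is on the healthy side of the criterion.

    `diff` is climit_last_year minus the criterion. For most determinands a
    negative value is favourable; for ACHE, neutral red retention time and
    stress on stress a positive value is favourable.
    """
    if low_is_unhealthy:
        return diff > 0
    return diff < 0
```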

FDI (fish disease index) is modelled on the untransformed scale (i.e. not log transformed) to deal with negative indices. Consequently,
* linear_trend is the difference in FDI between first_year_fit and last_year divided by the difference in years; if the trend in FDI is linear, this is equivalent to the yearly change in FDI
* recent_trend is the difference in FDI between 2002 (or first_year_fit whichever is later) and 2021 (or last_year whichever is earlier) divided by the difference in years; if the trend in FDI is linear, this is equivalent to the yearly change in FDI
