The file biota_data.csv contains the contaminant and biological effect data in biota used in the 2022 assessment of the UK's Clean Seas Environment Monitoring Programme (CSEMP). The data are a cleaned and processed subset of the data extracted from the MERMAN database on 11 January 2023 (supplemented by micronucleated cell and fish disease index data provided by the Marine Directorate of the Scottish Government and the Centre for Environment, Fisheries and Aquaculture Science).
The data form the basis for the biota component of the following Descriptor 8 Indicators in the UK's 2025 Marine Strategy Assessment:
* Status and Trends of Metals (including lead, cadmium and mercury) in Biota and Sediment
* Status and Trends in Polycyclic Aromatic Hydrocarbons (PAHs) in Biota and Sediment
* Status and Trends of Polychlorinated Biphenyls (PCBs) in Biota and Sediment
* Status and Trends of Polybrominated Diphenylethers (PBDEs) in Biota and Sediment
* Status and Trends in the Levels of Imposex in Marine Gastropods
* Biological effects (EROD enzyme activity) in fish
* Biological effects (micronucleus) of contaminants in fish
* External Fish Disease
* Status and Trends of Chemicals (excluding metals, PAHs, PCBs and PBDEs) in Water, Biota and Sediment
The file is UTF-8-BOM encoded, so it can be opened directly in Excel or read into R using the function read.csv with the argument fileEncoding = "UTF-8-BOM".
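For readers working in Python rather than R, the same read might be sketched as follows. This is a minimal illustration using only the standard library; the helper name read_biota is hypothetical, and "utf-8-sig" is Python's codec name for UTF-8 with a byte order mark.

```python
import csv

def read_biota(path="biota_data.csv"):
    """Read the file into a list of dicts, one per row (all values as strings)."""
    # "utf-8-sig" strips the byte order mark, so the first column
    # name is read as "series" rather than "\ufeffseries"
    with open(path, encoding="utf-8-sig", newline="") as handle:
        return list(csv.DictReader(handle))
```

All values are returned as text, so numeric columns such as concentration need converting before use.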
The variables in the file are:
series
Description: timeseries identifier
Unit:
Type: categorical
Levels: 5644
Note: identifies the data (typically a station / contaminant / species / matrix combination) that were grouped together into a timeseries and modelled to assess status and trends
MSFD_subregion
Description: Marine Strategy Framework Directive subregion
Unit:
Type: categorical
Levels: Greater North Sea, Celtic Seas
Note: in the map of biogeographic regions the Greater North Sea comprises the Northern North Sea, Southern North Sea, and Eastern Channel, whilst the Celtic Seas comprises the Western Channel and Celtic Sea, Irish Sea, Minches and Western Scotland, Scottish Continental Shelf, and Atlantic North-West Approaches.
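The grouping described in this note can be written as a lookup table. The sketch below is illustrative only: the names MSFD_SUBREGIONS and subregion_of are hypothetical, and the region spellings (with "&") are assumed to follow the levels of the biogeographic_region variable.

```python
# Biogeographic regions grouped by MSFD subregion, as listed in the note
MSFD_SUBREGIONS = {
    "Greater North Sea": [
        "Northern North Sea",
        "Southern North Sea",
        "Eastern Channel",
    ],
    "Celtic Seas": [
        "Western Channel & Celtic Sea",
        "Irish Sea",
        "Minches & Western Scotland",
        "Scottish Continental Shelf",
        "Atlantic North-West Approaches",  # no data in this file
    ],
}

# Invert the table so each region points to its subregion
REGION_TO_SUBREGION = {
    region: subregion
    for subregion, regions in MSFD_SUBREGIONS.items()
    for region in regions
}

def subregion_of(biogeographic_region):
    """Return the MSFD subregion containing the given biogeographic region."""
    return REGION_TO_SUBREGION[biogeographic_region]
```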
biogeographic_region
Description: biogeographic region
Unit:
Type: categorical
Levels: Northern North Sea, Southern North Sea, Eastern Channel, Western Channel & Celtic Sea, Irish Sea, Minches & Western Scotland, Scottish Continental Shelf
Notes:
* regional assessments are conducted at the biogeographic regional scale
* there are 8 biogeographic regions but only 7 with data
CSEMP_region
Description: CSEMP monitoring region
Unit:
Type: categorical
Levels: 18
Notes:
* there are 22 monitoring regions in the CSEMP but only 18 with data
* the CSEMP regions are aggregated into 8 biogeographic regions for regional assessments
CSEMP_stratum
Description: subdivision of the CSEMP monitoring region
Unit:
Type: categorical
Levels: 118
Note: there are 818 CSEMP strata, most of which are WFD water bodies, but only 118 with data
station_governance
Description: organisation responsible for the monitoring station
Unit:
Type: categorical
Levels: CEFAS, DAERA, EANat, MDSG, NRW, SEPA
Notes:
* Centre for Environment, Fisheries and Aquaculture Science (CEFAS)
* Department of Agriculture, Environment and Rural Affairs (DAERA)
* Environment Agency (EANat)
* Marine Directorate of the Scottish Government (MDSG)
* Natural Resources Wales (NRW)
* Scottish Environment Protection Agency (SEPA)
station_code
Description: monitoring station identifier
Unit:
Type: categorical
Levels: 197
Note:
station_name
Description: name associated with the monitoring station
Unit:
Type: categorical
Levels: 197
Note:
station_latitude
Description: station latitude
Unit: decimal degrees
Type: continuous
Range: 50.08, 60.78
Note: this is a nominal position: sampling occurs in a pre-defined area broadly centred on this position
station_longitude
Description: station longitude
Unit: decimal degrees
Type: continuous
Range: -7.11, 2.90
Note: this is a nominal position: sampling occurs in a pre-defined area broadly centred on this position
station_type
Description: type of monitoring station
Unit:
Type: categorical
Levels: B, RH, IH
Note: baseline (B), reference (RH) or impacted (IH)
waterbody_type
Description: station typology
Unit:
Type: categorical
Levels: Estuary, Coast, Open Sea
Note: only coastal and open sea stations are used in regional assessments, apart from the imposex regional assessment, where estuarine stations are also used
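The water body rule in this note could be applied as a filter along the following lines. This is a sketch under assumptions: the function name is hypothetical, imposex records are assumed to be those with determinand_group equal to "Imposex", and any other criteria for inclusion in a regional assessment are not covered.

```python
def passes_waterbody_rule(waterbody_type, determinand_group):
    """Apply only the water body rule for regional assessments."""
    if determinand_group == "Imposex":
        # Assumption: the imposex assessment is identified by this group label;
        # estuarine stations are used as well as coastal and open sea stations
        return waterbody_type in {"Estuary", "Coast", "Open Sea"}
    # All other regional assessments use coastal and open sea stations only
    return waterbody_type in {"Coast", "Open Sea"}
```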
monitoring_year
Description: monitoring year
Unit:
Type: discrete
Range: 1999, 2021
Note: some organisations monitor in winter, so samples collected in, e.g., December 2020 and January 2021 are treated as belonging to the same monitoring year
sample_id
Description: sample identifier
Unit:
Type: categorical
Levels: 23562
Note:
sample_date
Description: sampling date
Unit:
Type: discrete
Range: 1999-01-05, 2021-09-15
Note:
sample_time
Description: sampling time
Unit:
Type: continuous
Range: 00:00:00, 23:29:00
Note:
sample_latitude
Description: sampling latitude
Unit: decimal degrees
Type: continuous
Range: 50.08, 60.79
Note:
sample_longitude
Description: sampling longitude
Unit: decimal degrees
Type: continuous
Range: -7.11, 2.91
Note:
species
Description: species
Unit:
Type: categorical
Levels: 6
Notes:
* Limanda limanda: AphiaID = 127139
* Merlangius merlangus: AphiaID = 126438
* Mytilus edulis: AphiaID = 140480
* Nucella lapillus: AphiaID = 140403
* Platichthys flesus: AphiaID = 127141
* Pleuronectes platessa: AphiaID = 127143
sex
Description: sex
Unit:
Type: categorical
Levels: F, I, M, X
Note: see ICES reference codes for SEXCO
matrix
Description: sample matrix
Unit:
Type: categorical
Levels: BI, ER, HML, LI, LIS9, MU, SB, WO
Note: see ICES reference codes for MATRX
determinand_group
Description: contaminant or biological effect group
Unit:
Type: categorical
Levels: Metals, Organotins, PAH parent compounds, PAH alkylated compounds, PAH metabolites, Polybrominated diphenyl ethers, Organobromines (other), Organofluorines, Polychlorinated biphenyls, Organochlorines (other), Imposex, Biological effects (other)
Note:
determinand
Description: contaminant or biological effect
Unit:
Type: categorical
Levels: 98
Notes:
* see ICES reference codes for PARAM
* data submitted as CHRTR and VDSI have been relabelled as CHR and VDS respectively
* SBDE6 is the code used for the sum of BDE28, BDE47, BDE99, BD100, BD153 and BD154
* FDI is the code used for the fish disease index
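As an illustration of the SBDE6 definition above, the sum over the six congeners might be sketched as below. This shows the arithmetic only: how censored or missing congener values were treated in the assessment is not stated here, so the hypothetical function sum_bde6 simply refuses to sum an incomplete set.

```python
# The six congeners that make up SBDE6, using the codes given above
SBDE6_CONGENERS = ("BDE28", "BDE47", "BDE99", "BD100", "BD153", "BD154")

def sum_bde6(concentrations):
    """Sum the six congener concentrations for one sample.

    `concentrations` maps determinand code to concentration; all values
    are assumed to share the same unit and basis.
    """
    missing = [code for code in SBDE6_CONGENERS if code not in concentrations]
    if missing:
        # Assumption: an incomplete set is rejected rather than partially summed
        raise KeyError("missing congeners: " + ", ".join(missing))
    return sum(concentrations[code] for code in SBDE6_CONGENERS)
```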
method_analysis
Description: method of chemical analysis
Unit:
Type: categorical
Levels: 24
Note: see ICES reference codes for METOA
basis
Description: basis on which the measurement is expressed
Unit:
Type: categorical
Levels: D, L, W
Note: dry weight (D), lipid weight (L) or wet weight (W)
unit
Description: unit of the concentration measurement and its uncertainty
Unit:
Type: categorical
Levels: %, d, idx, mins, nmol/min/mg protein, nr/1000 cells, pmol/min/mg protein, st, ug/kg, ug/ml
Note:
concentration
Description: concentration of contaminant or equivalent for biological effects
Unit: see unit
Type: continuous
Range: -0.57, 5600000
Note:
censoring
Description: less-than qualifier for the concentration measurement
Unit:
Type: categorical
Levels: "", D, Q, <
Notes:
* "" (a blank or missing value) indicates a non-censored measurement
* D indicates the measurement is left-censored at the limit of detection; i.e. the measurement is below the limit of detection, but it is not known by how much; the limit of detection is given in the concentration column
* Q indicates the measurement is left-censored at the limit of quantification; i.e. the measurement is below the limit of quantification, but it is not known by how much; the limit of quantification is given in the concentration column
* < indicates the measurement is left-censored by an unspecified censoring criterion (which could be the limit of detection or quantification); the value of the censoring criterion is given in the concentration column
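The censoring codes above can be interpreted programmatically along these lines. This is a minimal sketch; the names LIMIT_TYPE and interpret_measurement are hypothetical, and a missing value is treated the same as a blank.

```python
# Meaning of each non-blank censoring code, following the notes above
LIMIT_TYPE = {
    "D": "limit of detection",
    "Q": "limit of quantification",
    "<": "unspecified censoring criterion",
}

def interpret_measurement(concentration, censoring):
    """Describe one measurement given its concentration and censoring code."""
    flag = (censoring or "").strip()
    if flag == "":
        # Blank or missing: the concentration is the measured value
        return {"censored": False, "concentration": concentration, "limit_type": None}
    if flag not in LIMIT_TYPE:
        raise ValueError("unknown censoring code: " + repr(censoring))
    # Left-censored: the true value lies below the reported concentration,
    # which holds the limit rather than a measured value
    return {"censored": True, "concentration": concentration, "limit_type": LIMIT_TYPE[flag]}
```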
uncertainty
Description: uncertainty in the concentration measurement
Unit: see unit
Type: continuous
Range: 0.0001, 920099
Note: analytical uncertainty expressed as the standard deviation; not applicable to some biological effects measurements
LNMEA
Description: mean length
Unit: cm
Type: continuous
Range: 0.1, 103
Note: length of the monitored organism, or mean length if several individuals were pooled; these data contain unit errors and should be used with caution
DRYWT
Description: dry weight of the sample
Unit: %
Type: continuous
Range: 2.46, 91.6
Note: all values are above the limit of detection
DRYWT_uncertainty
Description: uncertainty in the dry weight measurement
Unit: %
Type: continuous
Range: 0.029, 12.4
Note: analytical uncertainty expressed as the standard deviation
LIPIDWT
Description: lipid weight of the sample
Unit: %
Type: continuous
Range: 0.12, 81.4
Note:
LIPIDWT_censoring
Description: less-than qualifier for the lipid weight measurement
Unit:
Type: categorical
Levels: "", D
Note: see censoring
LIPIDWT_uncertainty
Description: uncertainty in the lipid weight measurement
Unit: %
Type: continuous
Range: 0.03, 10.1
Note: analytical uncertainty expressed as the standard deviation
n_individual
Description: number of individuals pooled in the sample
Unit: nr
Type: discrete
Range: 1, 318
Note:
FEMALEPOP
Description: percentage of individuals in the sample that are female
Unit: %
Type: continuous
Range: 17, 100
Note: used to model imposex data when submitted as a pooled sample
CMTQCNR
Description: Comet assay cells screened
Unit: nr
Type: discrete
Range: 41, 168
Note: used to model Comet assay (%DNATAIL) data
MNCQCNR
Description: Micronucleus assay cells screened
Unit: nr
Type: discrete
Range: 1000, 5000
Note: used to model Micronucleus assay (MNC) data