How the data are structured

Data in the archive are organised within a hierarchy and identified via strict naming conventions.

The structure of data in the archive

The data, representing observed environmental information, are referenced by where, what, and how often, i.e.

  • where were they observed - their position in the world,
  • what was measured - the conditions they describe, and
  • how often were observations made - this may also describe whether observations are regular or irregular.

Accordingly, each time series of individual data measurements may be identified by either of two systems:

  1. a unique identifier, the time series ID (ts_id), or
  2. a logical hierarchy of text identifiers that specify:
    • the location,
      • the spatial identifier
    • the environmental parameter,
      • the categorical identifier
    • how often individual data-points are available
      • this represents the temporal resolution of the data.

Either identification system can be used in an API request to specify which data to return.

| | Spatial | Spatial | Categorical | Temporal |
|---|---|---|---|---|
| | Where? | Where? | What? | How often? |
| Hierarchy | Site | Station | Parameter | Time series |
| Unique integer identifier | site_id | station_id | parameter_id | ts_id |
| Free text label | site_name | station_name | stationparameter_name | ts_name |
| Unique text label | site_no | station_no | stationparameter_no | ts_shortname |

ts_path: site_no/station_no/stationparameter_no/ts_shortname

The key hierarchical identifiers for data are:

  • the Site
  • the Station
  • the Parameter and
  • the Time series (type)

The time series type indicates which sample or statistic of the parameter is represented by the values in the time series, e.g. instantaneous value, daily mean, annual maximum, etc.

Together, these key identifiers make up a path to the data - the ts_path. Analogous to the path to a file in a computer filesystem, the time series path can be used in certain API query functions to specify which time series data to return. When used in a data request, the ts_path can include wildcards to access data from a whole class of time series.
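
As a rough sketch of how either identification system might appear in a request, the Python below assembles a query string from a ts_id or from a ts_path. The base URL, the request name `getTimeseriesValues`, the parameter names `request`, `ts_id` and `ts_path`, and the use of `*` as the wildcard character are all assumptions made for illustration; the endpoint examples remain the authority on the actual query functions.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: substitute the real API base URL.
BASE_URL = "https://example.invalid/api"


def make_ts_path(site_no, station_no, stationparameter_no, ts_shortname):
    """Join the four hierarchical identifiers into a ts_path."""
    parts = (site_no, station_no, stationparameter_no, ts_shortname)
    return "/".join(str(p) for p in parts)


def build_query(request, **params):
    """Return a request URL. The parameter names are assumed, not confirmed."""
    query = {"request": request, **params}
    # Keep "/" and "*" unescaped so the path stays readable.
    return f"{BASE_URL}?{urlencode(query, safe='/*')}"


# Identify one time series by its unique integer identifier ...
by_id = build_query("getTimeseriesValues", ts_id=70022010)

# ... or by its path through the hierarchy.
by_path = build_query(
    "getTimeseriesValues", ts_path=make_ts_path(1, 14910, "SG", "15m.Cmd")
)

# A wildcard in place of station_no (assumed syntax) would address the
# daily mean level series at every station.
wild = build_query(
    "getTimeseriesValues", ts_path=make_ts_path(1, "*", "SG", "HDay.Mean")
)
```

The identifiers used here (station_no 14910, ts_id 70022010) are taken from the Loch Katrine rows of the example data.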

| station_name | site_no | station_no | stationparameter_no | ts_shortname | ts_id | tstamp | value |
|---|---|---|---|---|---|---|---|
| Boreland | 1 | 115518 | RE | HDay.Total | 60199010 | 2022-01-01 09:00:00 | 3.4 |
| Boreland | 1 | 115518 | RE | HDay.Total | 60199010 | 2022-01-02 09:00:00 | 1.2 |
| Boreland | 1 | 115518 | RE | HDay.Total | 60199010 | 2022-01-03 09:00:00 | 5.2 |
| Boreland | 1 | 115518 | RE | HDay.Total | 60199010 | 2022-01-04 09:00:00 | 0.2 |
| Loch Katrine | 1 | 14910 | SG | 15m.Cmd | 70022010 | 2022-01-01 00:00:00 | 0.322 |
| Loch Katrine | 1 | 14910 | SG | 15m.Cmd | 70022010 | 2022-01-01 00:15:00 | 0.313 |
| Loch Katrine | 1 | 14910 | SG | 15m.Cmd | 70022010 | 2022-01-01 00:30:00 | 0.315 |
| Loch Katrine | 1 | 14910 | SG | 15m.Cmd | 70022010 | 2022-01-01 00:45:00 | 0.325 |
| Shenachie | 1 | 234306 | Q | HMonth.Max | 80287010 | 2022-01-01 09:00:00 | 121.076 |
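
To illustrate how a wildcard in a ts_path selects a whole class of time series, the sketch below matches patterns locally against the three series in the example data. It uses Python's `fnmatch` module as a stand-in; the wildcard syntax the API actually accepts is not specified here, and `fnmatch`'s `*` also matches the `/` separator, which an API may treat differently.

```python
from fnmatch import fnmatchcase

# One entry per series in the example data:
# (station_name, site_no, station_no, stationparameter_no, ts_shortname, ts_id)
SERIES = [
    ("Boreland", "1", "115518", "RE", "HDay.Total", 60199010),
    ("Loch Katrine", "1", "14910", "SG", "15m.Cmd", 70022010),
    ("Shenachie", "1", "234306", "Q", "HMonth.Max", 80287010),
]


def ts_path(row):
    """Build site_no/station_no/stationparameter_no/ts_shortname."""
    return "/".join(row[1:5])


def select(pattern):
    """Return the ts_id of every series whose ts_path matches the pattern."""
    return [row[5] for row in SERIES if fnmatchcase(ts_path(row), pattern)]


exact = select("1/14910/SG/15m.Cmd")   # a single, fully specified series
all_level = select("1/*/SG/*")         # every Level series at any station
h_series = select("1/*/*/H*")          # shortnames beginning with "H"
```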

Hierarchy

Site

'Site' is a broad category of location containing many separate measurement stations. Unlike in SEPA's primary archive, all stations in the API duplicate are assigned to a single site, whose site_no is 1 and whose site_name is Stations. This renders the getSitelist query function largely redundant.

Station

Measurements are identified by the station at which they are observed. The station's name is usually based on the name of the settlement nearest to the station; it may also include the type of gauge at the station, e.g. Hawick, Capenoch Raingauge, and Kingston Tide Gauge.

API requests for station information, including identifiers and metadata, can be explored amongst the endpoint examples.

Parameter

Each measurable environmental characteristic is identified as a parameter, for instance level, flow, or temperature. Each parameter is ascribed a plain English long name, stationparameter_name, but is identified uniquely by its abbreviated short name, stationparameter_no, or by its parameter_id.

Parameter information can be accessed either on its own or as metadata associated with station or time series data.

A list of available stationparameter_name and stationparameter_no values is shown below in the table of time series types.

| stationparameter_name | stationparameter_no | description |
|---|---|---|
| Rain | RE or RS | RE is data from automatic gauges and RS is data from manually read gauges. |
| Groundwater Level | GWL | Readings of the level of water under the ground, given in metres above Ordnance Datum. |
| Level | SG | The stationparameter_no derives from 'Staff Gauge'. This observation of water level in (fresh) surface water is usually measured relative to a local station datum. The reduced level of the local datum relative to Ordnance Datum is available as a station additional attribute. |
| Flow | Q | Discharge of water from a river, usually in cubic metres per second. Flow measurements, or gaugings, are also stored in the Flow parameter. |
| Tidal Level | TL | |

Each parameter is stored in consistent units: rainfall in mm, level in m, flow in cumecs (m³·s⁻¹), and temperature in degrees Celsius.

Measurements of parameter values are stored in one or more time series.

Time series

The time series are defined by the time interval between the data in the series and the type of observation each measurement represents: either an instantaneous value or an aggregation over the period of the time interval. Time series are ascribed both a descriptive label, ts_name, and a unique text identifier, ts_shortname, principally for use in coding.

As an example, instantaneous values measured at a 15-minute interval would be in a time series with ts_name 15minute and ts_shortname 15m.Cmd (in this case, Cmd is an abbreviation for 'continuously measured data'). For rainfall, where the value represents a total over a given interval, the highest resolution data have ts_name 15minute.Total with corresponding ts_shortname 15m.Total.

Each parameter has its own set of time series at various temporal resolutions, and these sets are generally similar between parameters. For a given parameter, the time series resolutions and aggregations are broadly the same at every station, so there should be little difference in the available time series between stations. A catalogue of the time series types available for each parameter may be found below.

The exceptions to this rule are:

  • For Flow parameters, the time series summarising some high flow data are not available where there is insufficient confidence in the quality of the data.
  • For daily observed rainfall data (stationparameter_no RS), time series for 15-minute and hourly data are unavailable, as these data are not measured.

Hydrological and Calendar time intervals

Hydrological science uses the concept of a hydrological day, hydrological month, and hydrological year.

  • The hydrological day runs from 09:00 on the given day to 09:00 on the following day.
  • The hydrological month runs from 09:00 on the 1st of the month to 09:00 on the 1st of the following month.
  • The hydrological year runs from 09:00 on the 1st of October to 09:00 on the 1st of October the following year.
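
The three definitions above can be sketched in Python by shifting a timestamp back nine hours before reading off its date. The function names are illustrative only. Two assumptions are made: each period is treated as half-open, so a timestamp of exactly 09:00 belongs to the period that starts then, and timestamps are naive datetimes in whichever time zone the archive uses.

```python
from datetime import datetime, timedelta

NINE_HOURS = timedelta(hours=9)


def hydrological_day_start(ts):
    """Return 09:00 on the hydrological day that contains ts."""
    shifted = ts - NINE_HOURS
    return datetime(shifted.year, shifted.month, shifted.day, 9)


def hydrological_month_start(ts):
    """Return 09:00 on the 1st of the hydrological month that contains ts."""
    shifted = ts - NINE_HOURS
    return datetime(shifted.year, shifted.month, 1, 9)


def hydrological_year_start(ts):
    """Return 09:00 on 1 October of the hydrological year that contains ts."""
    shifted = ts - NINE_HOURS
    year = shifted.year if shifted.month >= 10 else shifted.year - 1
    return datetime(year, 10, 1, 9)
```

Under these assumptions, a reading stamped 08:59 on 1 October 2022 falls in the hydrological year that began at 09:00 on 1 October 2021, whereas a reading stamped one minute later falls in the year beginning 1 October 2022.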

With reference to time series naming:

  • HydrologicalYear is included at the start of the time series name for annual values calculated on the hydrological year.
    • The ts_shortname for hydrological time series starts H
  • CalendarYear is included at the start of the time series name for annual values calculated on the calendar year.
    • The ts_shortname for time series calculated on the calendar year starts with C, for example CYear.Total
  • Hydrological or Calendar is not used in the time series names for monthly or daily time series, but 'H' is used in the ts_shortname.
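
As a small illustration of these naming rules, the hypothetical helper below reads the time basis from a ts_shortname. It looks for the components HDay, HMonth, HYear, CDay, CMonth and CYear that appear in the catalogue, because testing the first letter alone would misclassify names such as Hour.Total and Cmd.HW.

```python
def time_basis(ts_shortname):
    """Return 'hydrological', 'calendar', or None if no basis is encoded."""
    for part in ts_shortname.split("."):
        for unit in ("Day", "Month", "Year"):
            if part == "H" + unit:
                return "hydrological"
            if part == "C" + unit:
                return "calendar"
    return None


examples = {
    name: time_basis(name)
    for name in ("HDay.Total", "CYear.Total", "LTV.HMonth.Max",
                 "Hour.Total", "15m.Cmd", "Cmd.HW")
}
```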

These concepts have particular significance for aggregated derived time series.

The daily mean flow for a given date is the mean discharge in a river between 09:00 on the given date and 09:00 on the following day. Similarly, the daily total rainfall is the total amount of rain that fell between 09:00 on the given day and 09:00 on the following day.
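
A minimal sketch of this aggregation for rainfall is given below: it sums sub-daily totals into daily totals on the 09:00 to 09:00 basis. It is not the archive's own calculation. It assumes that each reading's timestamp marks the start of its interval, so a reading stamped exactly 09:00 counts towards the day beginning then, and the readings shown are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def daily_totals(readings):
    """Sum (tstamp, value) pairs by hydrological day, keyed by the date
    on which that day starts at 09:00."""
    totals = defaultdict(float)
    for tstamp, value in readings:
        # Shifting back nine hours maps 09:00 -> 00:00 of the same date.
        day = (tstamp - timedelta(hours=9)).date()
        totals[day] += value
    return dict(totals)


# Invented readings, in mm.
readings = [
    (datetime(2022, 1, 1, 8, 45), 0.5),    # still part of 31 December
    (datetime(2022, 1, 1, 9, 0), 0.25),
    (datetime(2022, 1, 1, 23, 45), 1.0),
    (datetime(2022, 1, 2, 8, 45), 0.5),    # still part of 1 January
    (datetime(2022, 1, 2, 9, 0), 2.0),
]
totals = daily_totals(readings)
```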

Long-term Values

Long-term values are summary statistics available for certain parameters. These time series contain average, maximum, and minimum values for the period of record for each station. The time series have a general date that identifies the broad period that the statistic represents, and where applicable, an occurrence timestamp (sic), which identifies the specific occurrence time within the aggregation period of the statistic.

As an example, the river level Long-term Monthly Maximum represents the highest value recorded in each month of the year across the whole record and, hence, returns 12 values, with the dates on which they occurred.

Returned values:

[Figure: Claggan long-term value example]
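
The idea behind a Long-term Monthly Maximum can be sketched as follows: for each month of the year, keep the highest value found anywhere in the record, along with the timestamp at which it occurred. This is an illustration only, not the archive's calculation. It groups by the calendar month of each timestamp, ignoring the 09:00 boundary, and the readings are invented.

```python
from datetime import datetime


def long_term_monthly_max(readings):
    """Return {month: (value, occurrence_timestamp)} over the whole record."""
    best = {}
    for tstamp, value in readings:
        month = tstamp.month
        if month not in best or value > best[month][0]:
            best[month] = (value, tstamp)
    return dict(sorted(best.items()))


# Invented river level readings, in m, spanning several years.
readings = [
    (datetime(2019, 1, 14, 6, 0), 1.82),
    (datetime(2020, 1, 3, 17, 30), 2.41),
    (datetime(2021, 1, 20, 2, 15), 2.05),
    (datetime(2019, 2, 9, 11, 0), 1.10),
    (datetime(2021, 2, 23, 20, 45), 1.67),
]
ltv = long_term_monthly_max(readings)
```

With a record covering every month, the result holds 12 entries, one per month of the year.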

Catalogue of time series names

The following tables describe the time series names associated with the available parameters.

stationparameter_name : Rain (stationparameter_no : RE or RS)

| Time Interval of data | Data type | ts_name | ts_shortname |
|---|---|---|---|
| 15 minute * | Total | 15minute.Total | 15m.Total |
| Hourly * | Total | Hour.Total | Hour.Total |
| Daily (Hyd) | Total | Day.Total | HDay.Total |
| Monthly (Hyd) | Total | Month.Total | HMonth.Total |
| Yearly (Cal) | Total | CalendarYear.Total | CYear.Total |
| Yearly (Hyd) | Total | HydrologicalYear.Total | HYear.Total |
| Monthly (Hyd) | Long Term Value Maximum | LongTermValue.Month.Max | LTV.HMonth.Max |
| Monthly (Hyd) | Long Term Value Minimum | LongTermValue.Month.Min | LTV.HMonth.Min |
| Monthly (Hyd) | Long Term Value Mean | LongTermValue.Month.Mean | LTV.HMonth.Mean |

stationparameter_name : GroundwaterLevel (stationparameter_no : GWL)

| Time Interval of data | Data type | ts_name | ts_shortname |
|---|---|---|---|
| Hourly | instantaneous | Hour | Hour.Cmd |
| Daily (Cal) | Maximum | Day.Max | CDay.Max |
| Daily | Minimum | Day.Min | CDay.Min |
| Daily | Mean | Day.Mean | CDay.Mean |
| Monthly (Cal) | Maximum | Month.Max | CMonth.Max |
| Monthly | Minimum | Month.Min | CMonth.Min |
| Monthly | Mean | Month.Mean | CMonth.Mean |
| Yearly (Hyd) | Maximum | HydrologicalYear.Max | HYear.Max |
| Yearly (Hyd) | Minimum | HydrologicalYear.Min | HYear.Min |
| Yearly (Hyd) | Mean | HydrologicalYear.Mean | HYear.Mean |
| Yearly (Cal) | Maximum | CalendarYear.Max | CYear.Max |
| Yearly (Cal) | Minimum | CalendarYear.Min | CYear.Min |
| Yearly (Cal) | Mean | CalendarYear.Mean | CYear.Mean |
| Monthly (Cal) | Long Term Value Maximum | LongTermValue.Month.Max | LTV.HMonth.Max |
| Monthly | Long Term Value Minimum | LongTermValue.Month.Min | LTV.HMonth.Min |
| Monthly | Long Term Value Mean | LongTermValue.Month.Mean | LTV.HMonth.Mean |

stationparameter_name : Level (stationparameter_no : SG)

| Time Interval of data | Data type | ts_name | ts_shortname | ts_path |
|---|---|---|---|---|
| 15 minutes | instantaneous | 15minute | 15m.Cmd | 1/station_no/SG/15m.Cmd |
| Daily (Hyd) | Maximum | Day.Max | HDay.Max | 1/station_no/SG/HDay.Max |
| Daily (Hyd) | Minimum | Day.Min | HDay.Min | 1/station_no/SG/HDay.Min |
| Daily (Hyd) | Mean | Day.Mean | HDay.Mean | 1/station_no/SG/HDay.Mean |
| Monthly (Hyd) | Maximum | Month.Max | HMonth.Max | 1/station_no/SG/HMonth.Max |
| Monthly (Hyd) | Minimum | Month.Min | HMonth.Min | 1/station_no/SG/HMonth.Min |
| Monthly (Hyd) | Mean | Month.Mean | HMonth.Mean | 1/station_no/SG/HMonth.Mean |
| Yearly (Hyd) | Maximum | HydrologicalYear.Max | HYear.Max | 1/station_no/SG/HYear.Max |
| Yearly (Hyd) | Minimum | HydrologicalYear.Min | HYear.Min | 1/station_no/SG/HYear.Min |
| Yearly (Hyd) | Mean | HydrologicalYear.Mean | HYear.Mean | 1/station_no/SG/HYear.Mean |
| Yearly (Cal) | Maximum | CalendarYear.Max | CYear.Max | 1/station_no/SG/CYear.Max |
| Yearly (Cal) | Minimum | CalendarYear.Min | CYear.Min | 1/station_no/SG/CYear.Min |
| Yearly (Cal) | Mean | CalendarYear.Mean | CYear.Mean | 1/station_no/SG/CYear.Mean |
| Monthly (Hyd) | LongTermValue | LongTermValue.Month.Max | LTV.HMonth.Max | 1/station_no/SG/LTV.HMonth.Max |
| Monthly (Hyd) | LongTermValue | LongTermValue.Month.Min | LTV.HMonth.Min | 1/station_no/SG/LTV.HMonth.Min |
| Monthly (Hyd) | LongTermValue | LongTermValue.Month.Mean | LTV.HMonth.Mean | 1/station_no/SG/LTV.HMonth.Mean |
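
Because the Level time series follow a regular naming pattern, the ts_path values in the table above can be generated for any station. The sketch below does so in Python; the function names are illustrative, and the station_no 14910 is the Loch Katrine station from the example data.

```python
STATS = ("Max", "Min", "Mean")
INTERVALS = ("HDay", "HMonth", "HYear", "CYear")


def level_shortnames():
    """List the ts_shortname values catalogued for the Level parameter."""
    names = ["15m.Cmd"]
    names += [f"{interval}.{stat}" for interval in INTERVALS for stat in STATS]
    names += [f"LTV.HMonth.{stat}" for stat in STATS]
    return names


def level_ts_paths(station_no, site_no=1):
    """Build site_no/station_no/SG/ts_shortname for every Level series."""
    return [f"{site_no}/{station_no}/SG/{name}" for name in level_shortnames()]


paths = level_ts_paths(14910)
```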

stationparameter_name : Flow (stationparameter_no : Q)

| Time Interval of data | Data type | ts_name | ts_shortname |
|---|---|---|---|
| 15 Minute | instantaneous | 15minute | 15m.Cmd |
| Daily (Hyd) | Mean | Day.Mean | HDay.Mean |
| Gaugings (Flow and Level **) | Instantaneous value | Gaugings | Cmd.Gaugings |
| Instantaneous (*) | Instantaneous peak value | PeaksOverThreshold | POT |
| Yearly (Hyd) (*) | Maximum | HydrologicalYear.Max | HYear.Max |

(* Only available at selected stations)
(** Flow is the primary value; level is available as the return field Stagesource Value)

stationparameter_name : TidalLevel (stationparameter_no : TL)

As for Level, above, plus the following.

| Time Interval of data | Data type | ts_name | ts_shortname |
|---|---|---|---|
| Irregular (tide cycle) | Maximum | HighWater | Cmd.HW |
| Irregular (tide cycle) | Minimum | LowWater | Cmd.LW |
| Irregular (tide cycle) | Mean | MeanWater | Cmd.MSL |