Thomas Goossens

Geographer/Data-analyst/Coder

A review of weather spatialization projects
May 24, 2018

Introduction and context

The aim of the Agromet project is to provide near real-time hourly gridded datasets of weather parameters, each characterized by a quality indicator, at a resolution of 1 km² for the whole region of Wallonia.

The Agromet project is largely inspired by what the ZEPP has developed in the context of their late blight warning services (see academic paper).

Before starting the development of our own service, we decided to submit a survey to our end users and to perform a preliminary benchmark of weather data interpolation facilities developed by other institutions.

This document compiles the useful information we gathered during our benchmarking and synthesizes the main ideas to keep in mind while building our platform.

Literature review

An extensive literature review of weather spatialization techniques has been performed.

European experts in weather data spatialization

Here is a list of European experts in weather spatialization worth following.

Country | Author | Institution | Publication
Germany | T. Zeuner | ZEPP German Crop Protection Services | Use of geographic information systems in warning services for late blight
Serbia | Milan Kilibarda | University of Belgrade | Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution
Netherlands | Raymond Sluiter | KNMI | Interpolation methods for climate data - literature review
Netherlands | Tomislav Hengl | ISRIC World Soil Information Institute | R-package Spatial Analyst
Norway | Jean-Marie Lepioufle | Norwegian Meteorological Institute | Recent developments in spatial interpolation of precipitation and temperature at MET Norway
Greece | Kostas Philippopoulos | University of Reading | Artificial Neural Network Modeling of Relative Humidity and Air Temperature Spatial and Temporal Distributions Over Complex Terrains
Portugal | Alvaro Silva | Instituto Português do Mar e da Atmosfera | Neural Networks application to spatial interpolation of climate variables
Slovenia | Luka Honzak | Bo Mo LTD | WEATHER SCENARIO APP
France | Mehdi Sine | Vigicultures par Arvalis - Institut du Végétal | VIGICULTURES – An early warning system for crop pest management
Belgium | Aurore Degré | Faculté Gembloux | Different methods for spatial interpolation of rainfall data for operational hydrology and hydrological modeling at watershed scale: a review
Poland | Maciej Kryza | University of Wrocław |

Key learnings from the review

The literature reveals that many spatial interpolation methods have been developed over the last decades. These techniques were borrowed from other fields (e.g. oil prospecting) and transposed to meteorology, where understanding and modelling the processes is much more difficult because of the complexity and spatial heterogeneity of weather events. As such, there is no out-of-the-box recipe that applies to every weather parameter.

The choice of the right interpolation method depends on many factors such as the spatial distribution of the weather station network, the topography, the number of stations, local gradients such as global circulation effects, etc. Moreover, more attention has been paid to the spatialization of climate data than to hourly meteorological data, which is our concern. Therefore, an important phase of testing, benchmarking and tweaking of the processes described above is required in order to efficiently produce useful and sensible gridded outputs that agronomical models can use profitably.

This phase will require a deep knowledge of the principles of these geostatistical spatialization methods, combined with the programming skills needed to explore the data and conduct practical analyses. The exploratory work needed to develop a data analysis technique that can deal with data scarcity should be organized so that the simplest solutions are evaluated first. Depending on the evaluation results of each investigated technique, we will decide whether further investigations are required. If a more complex process adds no significant value, it will not be retained.

Data and Automatic Weather Stations (AWS) networks knowledge

Specific attention will be paid to the quality of the data produced by each of our stations. We will need to carry out an analysis to detect potential structural or local effects such as overheating in temperature shelters.

It is important to gain a deep insight into, and a comprehensive overview of, our weather station network before interpolating its data, in order to avoid integrating undesired local or structural effects into the interpolation process. Local temperature effects can be detected by pointing out abnormally high or low values that appear in a long-term analysis of each station of our network.

Again, a good knowledge of the station network (e.g. the situation and direct environment of each station) is required. To remove local effects from the interpolation process, each station could first be weighted according to a quality parameter that reflects its local situation. Time series analysis (example map) will help us for this purpose.
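As an illustration of the long-term analysis described above, here is a minimal sketch that flags stations reading systematically warmer or colder than the rest of the network. It is written in Python for illustration only (our actual work relies on R). The function names, the 1 °C threshold and the toy values are all hypothetical, and comparing each station to the hourly network median is just one simple way to reveal a local effect; it ignores legitimate differences due to elevation, for instance.

```python
from statistics import mean, median

def mean_deviation_from_network(series):
    """Mean deviation of each station from the hourly network median.

    series: dict mapping a station id to a list of hourly temperatures.
    All lists are assumed to be aligned on the same hours (assumption).
    """
    n_hours = len(next(iter(series.values())))
    hourly_median = [
        median(values[h] for values in series.values()) for h in range(n_hours)
    ]
    return {
        station: mean(v - m for v, m in zip(values, hourly_median))
        for station, values in series.items()
    }

def flag_suspect_stations(series, threshold=1.0):
    """Return the stations whose mean deviation exceeds the threshold (in °C)."""
    deviations = mean_deviation_from_network(series)
    return sorted(s for s, d in deviations.items() if abs(d) > threshold)

# Hypothetical toy data: station "D" reads warm at every hour.
toy = {
    "A": [10.0, 12.0, 15.0, 13.0],
    "B": [10.2, 12.1, 15.3, 13.1],
    "C": [9.9, 11.8, 14.9, 12.8],
    "D": [11.5, 14.0, 17.5, 14.6],
}
deviations = mean_deviation_from_network(toy)
suspects = flag_suspect_stations(toy, threshold=1.0)
print(suspects)
```

The mean deviation computed here could also serve as a starting point for the quality parameter used to weight each station, although that choice would need to be validated against what we know about each site.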

The Agromet project will spatialize weather data gathered both by the Pameseb network owned by the CRA-W and by stations owned by the national weather office RMI. Before integrating two different networks in the spatialization process, we need to assess their intercompatibility. To address this, both our team and the RMI are working on an intercomparison of the networks by means of a location (Humain - Belgium) equipped with 2 stations, one from each network. The first results of this comparative analysis are available on this repository.

The first results suggest that for temperature, we will need to apply a correction model to the Pameseb measurements recorded around the hours of daily maximum temperature.
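To make the idea of a correction model more concrete, here is a minimal sketch in Python (for illustration only, our analyses are done in R). It assumes the simplest possible form of correction: an additive bias per hour of the day, estimated from paired measurements of two co-located stations. This is an assumption made for the example, not the model we will actually retain, and the function names and toy values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def hourly_bias(paired):
    """Mean difference (candidate - reference) for each hour of the day.

    paired: iterable of (hour_of_day, t_candidate, t_reference) tuples
    recorded at the same place and time by the two stations.
    """
    diffs = defaultdict(list)
    for hour, t_candidate, t_reference in paired:
        diffs[hour].append(t_candidate - t_reference)
    return {hour: mean(values) for hour, values in diffs.items()}

def correct(hour, t_candidate, bias):
    """Subtract the estimated bias; hours without an estimate stay unchanged."""
    return t_candidate - bias.get(hour, 0.0)

# Hypothetical toy data: no difference at 6 h, a warm bias at 14 h.
toy = [
    (6, 8.0, 8.0),
    (6, 9.0, 9.0),
    (14, 25.0, 24.0),
    (14, 27.0, 26.5),
]
bias = hourly_bias(toy)
corrected = correct(14, 26.0, bias)
print(bias, corrected)
```

A real correction would most likely depend on other variables as well (solar irradiance or wind speed, for example), which is why the intercomparison with the RMI matters.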

Understanding our end-users needs

A web survey was submitted to our potential end-users. Its purpose was to ensure that the platform meets the real needs of its future end-users (Walloon crop warning system managers and academic researchers). The results of this survey also serve as a development priority list. The results are available in this report.

Audit of an external spatialized weather data provider

The Weather Company (owned by IBM) provides hourly gridded datasets at a resolution of 1 km². Using their solution would allow us to rapidly provide a functional platform. However, the inherent costs of relying on a third-party provider and the lack of transparency regarding how the spatialization process works and performs do not allow us to choose this solution.

As a research institution, it is also our role to develop expertise in various fields like weather data spatialization and to make this expertise valuable to our clients (the Walloon farmers). It is also worth keeping in mind that developing our own platform is an excellent way to add value to the Pameseb AWS network.

For the complete solution proposed by IBM, please refer to the IBM supplementary material.

Exchanges with our partners

Here we present the key learnings from the feedback of the various institutions we met during our benchmarking campaign.

KNMI - Netherlands

The KNMI (Koninklijk Nederlands Meteorologisch Instituut) has developed what they call An operational R-based interpolation facility for climate and meteo data. In October 2017 we organized a first knowledge exchange workshop with this partner.

They have found R-software to be the most appropriate tool for weather data spatialization. This opinion is also shared by Meteo Switzerland (Christopher Frei), Meteo Norway (Ole Einar Tveito) and the RMI (Michel Journée).

Raymond Sluiter has published the review paper Interpolation methods for climate data, in which he details the various deterministic and stochastic spatialization methods available. This review is an excellent starting point for anyone who wants to get started in the field of weather data spatialization.

Their developments were conducted in the context of the creation of a new climate atlas rather than for agronomical purposes. According to their feedback, there is no out-of-the-box solution. We must find the solution best suited to our purpose by starting from the simplest solution and progressively adding complexity, while assessing the gain in accuracy brought by this additional complexity. A good balance must be found between complexity and operability since we aim to build an operational suite.

Their presentations are available in the KNMI supplementary materials.

Arvalis - France

Arvalis (Institut du Végétal) has also conducted weather data spatialization research in an agricultural context (crop warning systems). We organized a knowledge exchange workshop in January 2018. Like the KNMI, they have tested various methods with an increasing level of complexity. Our contact Olivier Deudon also uses R to conduct his research.

The key points of their research are detailed in the Arvalis supplementary materials. Here we present a brief summary of their methodology and main findings. The aim of their work was to test various methods of weather data spatial interpolation and find the most accurate ones for various parameters (temperature, relative humidity, rainfall) in the context of their specific AWS network (> 400 stations in France).

Regarding temperature :

  • tested methods : Inverse distance, multiple regressions, various kriging methods
  • validation method : splitting the dataset in training set (355 stations) and test set (100 stations)
  • model evaluation criterion : RMSE
  • method with the lowest RMSE for T°: universal kriging
  • used covariates : elevation, surface solar irradiance
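The validation scheme summarized above (fit on a training set, score on a held-out test set with the RMSE) can be sketched as follows. The example is in Python for illustration only and uses inverse distance weighting, the simplest of the methods listed; it is not the Arvalis implementation. The station coordinates, the values and the power parameter are hypothetical.

```python
import math

def idw_predict(x, y, stations, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    stations: list of (x, y, value) tuples used as the training set.
    """
    numerator = denominator = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # exactly on a station: return its measurement
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

def rmse(predicted, observed):
    """Root mean square error between two sequences of equal length."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical toy network: coordinates in km, temperatures in °C.
train = [(0.0, 0.0, 10.0), (10.0, 0.0, 12.0), (0.0, 10.0, 14.0), (10.0, 10.0, 16.0)]
test = [(5.0, 5.0, 13.5), (5.0, 0.0, 11.0)]

predictions = [idw_predict(x, y, train) for x, y, _ in test]
score = rmse(predictions, [value for _, _, value in test])
print(predictions, score)
```

Computing the same score for each candidate method on the same test stations is what makes the comparison between inverse distance, regression and kriging meaningful.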

ZEPP - Germany

As mentioned above, our project is mainly inspired by the work of the ZEPP (ZENTRALSTELLE DER LÄNDER FÜR EDV-GESTÜTZTE ENTSCHEIDUNGSHILFEN UND PROGRAMME IM PFLANZENSCHUTZ - Central Institute for Decision Support Systems in Crop Protection). Here we present the key points of our November 2017 workshop.

It is essential to keep in mind the agricultural scope of the platform. The objective is to make the best predictions in cultivated areas. It is not a problem if the quality of the prediction is lower in areas where no crops are grown (e.g. Hautes-Fagnes).

What matters most is the quality of the outputs of the decision support tools that rely on our weather data, rather than the weather data itself. Their comparison of various spatialization techniques revealed that, for their needs, the most efficient technique is multiple regression based on elevation, latitude and longitude. This comparison is extensively discussed in the Zeuner PhD thesis available in the ZEPP supplementary material.

Here we present a brief summary of their method and main findings. The aim of their work was to provide an operational platform able to supply crop alert system models with the most accurate hourly gridded datasets of temperature and relative humidity across Germany.

Regarding temperature :

  • tested methods : kriging, IDW, spline, multiple regression
  • validation method : 570 stations
  • model evaluation criterion: difference hourly interpolated - measured at the location of the stations (+ boxplots)
  • used covariates : elevation
  • chosen method : multiple regression
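To show what a multiple regression on elevation, latitude and longitude looks like in practice, here is a minimal sketch in Python (for illustration only; it is not the ZEPP code, and our own work is done in R). It fits an ordinary least squares model through the normal equations and then predicts the temperature of a grid cell from its covariates. The station values are synthetic, generated from known coefficients so that the fit can be checked, and latitude and longitude are expressed as offsets in degrees from an arbitrary origin.

```python
def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: list of covariate lists; an intercept column is added here.
    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    X = [[1.0] + list(row) for row in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * t for x, t in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(coefficients, covariates):
    """Apply the fitted model to one set of covariates."""
    return coefficients[0] + sum(c * v for c, v in zip(coefficients[1:], covariates))

# Synthetic stations: (elevation in m, latitude offset, longitude offset).
stations = [
    (100.0, 0.0, 0.0),
    (300.0, 0.5, 1.0),
    (500.0, 0.2, 0.3),
    (200.0, 0.8, 0.1),
    (600.0, 0.1, 0.9),
]
true_coefficients = [15.0, -0.006, -0.5, 0.1]  # hypothetical values
temperatures = [predict(true_coefficients, s) for s in stations]

coefficients = fit_linear(stations, temperatures)
grid_cell = (400.0, 0.4, 0.5)  # covariates of one hypothetical 1 km² cell
estimate = predict(coefficients, grid_cell)
print(coefficients, estimate)
```

In an operational setting the model would be refitted every hour on the station measurements and then applied to the covariates of every cell of the grid.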

RMI - Royal Meteorological Institute - Belgium

The RMI is our primary partner for weather data spatialization, and we work with them in close collaboration. Like the KNMI, they have advanced expertise in the spatialization of weather data using R.

RMI supplementary material

Choosing the right software

Among all the available programming languages, we chose R :

  • fully open-source and free (as in beer and as in freedom)
  • large and growing user base
  • developed by statisticians for statisticians
  • already used by other institutions involved in weather data spatialization, and internally at CRA-W
  • many libraries (packages) cover our needs

Package | Purpose
gstat | Spatial and Spatio-Temporal Geostatistical Modelling, Prediction and Simulation
meteo | Spatio-Temporal Analysis and Mapping of Meteorological Observations
sf | Simple Features for R
raster | Geographic Data Analysis and Modeling
automap | Automatic interpolation package
dplyr | A Grammar of Data Manipulation
rgdal | Bindings for the Geospatial Data Abstraction Library
ggplot2 | Create Elegant Data Visualisations Using the Grammar of Graphics
shiny | Web Application Framework for R
geoR | Analysis of Geostatistical Data
validate | Data Validation Infrastructure
mlr | Machine Learning in R

Here is an infographic that compares R to Python.

Extra investigations

If time remains, we could investigate incorporating crowdsourced datasets. At present, both the KNMI and the RMI are working on such a process in the context of the WOW experiment initiated by the UK Met Office.

Data dissemination policy

Particular attention will be given to making our data INSPIRE compliant, and their origin will be described using the W3C recommendations.

As developers, we push for the adoption of an open-data policy such as the Community Data License Agreement. However, the final decision regarding the choice of the policy will be taken at higher levels.

Here is a selection of publications in agreement with an open-data approach :
