
Data Quality

Data quality matters both to weather station owners and to the weather data services that they publish to. The weather data that you generate and share could be used for a number of different purposes, including research and planning. For that reason, and for the sake of the historical record, it is important that the weather data coming from your data source be as accurate as possible.

There are many different factors that can affect a weather station's data quality. Here are some of the main ones:

  1. Erratic or malfunctioning sensors, or wireless sensors that have weak batteries.
  2. Sensors that are improperly placed or mounted (e.g. a temperature sensor is located in direct sunlight, or a wind gauge is partially blocked by a wall or other obstruction).
  3. Intermittent communication between a wireless sensor and its base station due to poor placement, frayed wiring or battery issues.
  4. Intermittently garbled or junk data coming from the station to the Mac due to poor cabling, older station firmware, or poorly designed hardware.

In some of these instances, proper planning can have a direct impact on the quality of the data. In other cases, the problem lies with the data source itself, and no amount of correction can prevent the occasional data error.

The quality of collected weather data is more important than the quantity. While more data translates into higher data resolution, a large set of invalid data is worth far less than a smaller set of good data.

Determining Good vs Bad Data

When it comes to determining what is good data and what is bad data, there are three determining factors: range, age and deviation of the data.

Range

Every type of weather property that WeatherSnoop supports (temperature, humidity, linear measurement, etc.) has a valid range limit that a value must fit within in order to be considered of good quality. For example, it makes no sense to have a negative hourly rainfall or wind speed. Likewise, extreme temperatures such as 500F or 250C represent impractical weather scenarios and could indicate a station malfunction.
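As a rough illustration of a range check, here is a minimal Python sketch. The property names and the limits in VALID_RANGES are assumptions chosen for the example; they are not WeatherSnoop's actual limits, and in_valid_range is a hypothetical helper, not part of any WeatherSnoop interface.

```python
# Hypothetical limits for illustration only; not WeatherSnoop's actual values.
VALID_RANGES = {
    "temperature_f": (-130.0, 140.0),
    "humidity_pct": (0.0, 100.0),
    "wind_speed_mph": (0.0, 250.0),
    "hourly_rain_in": (0.0, 20.0),
}


def in_valid_range(kind: str, value: float) -> bool:
    """Return True if value lies within the assumed limits for this property type."""
    low, high = VALID_RANGES[kind]
    # A NaN reading fails both comparisons, so it is rejected as well.
    return low <= value <= high


if __name__ == "__main__":
    print(in_valid_range("temperature_f", 72.5))    # an ordinary reading
    print(in_valid_range("temperature_f", 500.0))   # an impractical temperature
    print(in_valid_range("hourly_rain_in", -0.1))   # negative rainfall
```

A reading that fails this kind of check would be treated as suspect rather than published.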

Age

Much of our weather data tends to vary over time. Temperature, humidity and pressure are constantly changing, even if slightly. On the other hand, it may not rain for days or weeks at a time, so rainfall values are still valid even though they have not recently changed.

Every weather property that WeatherSnoop supports has a timestamp associated with it. This timestamp can be seen in the Weather Property view, and one can quickly determine the “age” of a particular weather property. This age helps determine whether a value is still valid for use in computing the values of other weather properties.
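An age check could be sketched in Python as follows. This assumes the timestamp records when the value last changed; the ten-minute limits, the property names and the is_fresh helper are all hypothetical and only serve the example. Rain has no limit here, since it can legitimately go unchanged for days or weeks.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical maximum ages for illustration; None means "no age limit".
MAX_AGE = {
    "temperature": timedelta(minutes=10),
    "humidity": timedelta(minutes=10),
    "pressure": timedelta(minutes=10),
    "rain": None,  # may go unchanged for days or weeks and still be valid
}


def is_fresh(kind: str, timestamp: datetime, now: Optional[datetime] = None) -> bool:
    """Return True if a value with this timestamp is recent enough to use."""
    if now is None:
        now = datetime.now(timezone.utc)
    limit = MAX_AGE[kind]
    if limit is None:
        return True
    return now - timestamp <= limit


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(is_fresh("temperature", now - timedelta(minutes=5), now))
    print(is_fresh("temperature", now - timedelta(hours=1), now))
    print(is_fresh("rain", now - timedelta(days=14), now))
```

A computation that depends on several properties could skip its update whenever one of its inputs fails this check.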

Deviation

In most cases, a weather property's value will not be significantly different from its previous value. For example, it is unlikely that a temperature will be 20F one minute, then 97F the next. On the other hand, wind speeds can change quickly and drastically.
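A deviation check can be sketched in Python by comparing each new reading against the previous one. The per-property step limits below are assumptions made for the example, and plausible_step is a hypothetical helper. Wind speed is given no limit because it can change quickly.

```python
from typing import Optional

# Hypothetical maximum change between consecutive readings; None means "no limit".
MAX_STEP = {
    "temperature_f": 10.0,
    "humidity_pct": 15.0,
    "wind_speed_mph": None,  # wind can change quickly and drastically
}


def plausible_step(kind: str, previous: Optional[float], current: float) -> bool:
    """Return True if the change from the previous reading is within the assumed limit."""
    limit = MAX_STEP[kind]
    if previous is None or limit is None:
        # No earlier reading to compare against, or no limit for this property.
        return True
    return abs(current - previous) <= limit


if __name__ == "__main__":
    print(plausible_step("temperature_f", 20.0, 97.0))   # the jump described above
    print(plausible_step("temperature_f", 20.0, 21.5))   # a small change
    print(plausible_step("wind_speed_mph", 2.0, 45.0))   # a sudden gust
```

Used together with range and age checks, this gives three independent tests that a reading must pass before it is considered good data.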

data_quality.txt · Last modified: 2015/08/01 11:06 (external edit)