U-Air: When Urban Air Quality Inference Meets Big Data
Yu Zheng, Furui Liu, Hsun-Ping Hsieh
Microsoft Research Asia, Beijing China
{yuzheng, v-ful, v-hshsie}@
ABSTRACT
Information about urban air quality, e.g., the concentration of
PM2.5, is of great importance for protecting human health and
controlling air pollution. A city has only a limited number of
air-quality monitor stations, yet air quality varies non-linearly
across urban spaces and depends on multiple factors, such as
meteorology, traffic volume,
and land uses. In this paper, we infer the real-time and fine-
grained air quality information throughout a city, based on the
(historical and real-time) air quality data reported by existing
monitor stations and a variety of data sources we observed in the
city, such as meteorology, traffic flow, human mobility, structure
of road networks, and points of interest (POIs). We propose a
semi-supervised learning approach based on a co-training
framework that consists of two separate classifiers. One is a
spatial classifier based on an artificial neural network (ANN),
which takes spatially-related features (e.g., the density of POIs
and length of highways) as input to model the spatial correlation
between the air quality of different locations. The other is a
temporal classifier based on a linear-chain conditional random
field (CRF), involving temporally-related features (e.g., traffic
and meteorology) to model the temporal dependency of air quality
in a location. We evaluated our approach with extensive
experiments based on five real data sources obtained in Beijing
and Shanghai. The results show the advantages of our method
over four categories of baselines: linear/Gaussian interpolation,
classical dispersion models, well-known classification models such
as decision trees and CRFs, and ANNs.
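
The co-training idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: a simple nearest-centroid classifier stands in for both the ANN (spatial view) and the linear-chain CRF (temporal view), the feature values and labels are toy data, and the names `CentroidClassifier`, `co_train`, and `infer` are hypothetical. Combining the two views by multiplying their class probabilities is likewise an assumption made for illustration.

```python
import math


class CentroidClassifier:
    """Nearest-centroid classifier; a stand-in for the paper's ANN or CRF."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict_proba(self, x):
        # Normalized exp(-distance) to each class centroid.
        scores = {c: math.exp(-math.dist(x, mu)) for c, mu in self.centroids.items()}
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}


def co_train(spatial, temporal, labels, unlabeled_s, unlabeled_t, rounds=5):
    """Each round, every view labels its most confident unlabeled sample
    and adds it to the training set shared by both classifiers."""
    Ls, Lt, y = list(spatial), list(temporal), list(labels)
    pool = list(zip(unlabeled_s, unlabeled_t))
    sc, tc = CentroidClassifier(), CentroidClassifier()
    for _ in range(rounds):
        sc.fit(Ls, y)
        tc.fit(Lt, y)
        if not pool:
            break
        for clf, view in ((sc, 0), (tc, 1)):
            if not pool:
                break
            scored = []
            for idx, sample in enumerate(pool):
                proba = clf.predict_proba(sample[view])
                label = max(proba, key=proba.get)
                scored.append((proba[label], idx, label))
            _, idx, label = max(scored)
            xs, xt = pool.pop(idx)
            Ls.append(xs)
            Lt.append(xt)
            y.append(label)
    sc.fit(Ls, y)
    tc.fit(Lt, y)
    return sc, tc


def infer(sc, tc, xs, xt):
    # Assumed combination rule: product of the two views' class probabilities.
    ps, pt = sc.predict_proba(xs), tc.predict_proba(xt)
    return max(ps, key=lambda c: ps[c] * pt[c])


# Toy data: spatial = [POI density, highway length], temporal = [traffic speed, humidity].
sc, tc = co_train(
    spatial=[[0.1, 0.2], [0.9, 0.8]],
    temporal=[[0.9, 0.2], [0.2, 0.9]],
    labels=["good", "unhealthy"],
    unlabeled_s=[[0.2, 0.1], [0.8, 0.9]],
    unlabeled_t=[[0.8, 0.3], [0.3, 0.8]],
)
print(infer(sc, tc, [0.15, 0.15], [0.85, 0.25]))
```

The two classifiers never see each other's features; they interact only through the samples each one labels confidently, which is what lets a small set of monitor-station labels be extended with unlabeled locations.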
Categories and Subject Descriptors
H.2.8 [Database Management]: Database Applications - data
mining, Spatial databases and GIS;
General Terms
Algorithms, M