Air quality is one of the most pressing problems in major industrialized cities around the world, and predicting future air quality is highly relevant to public health. In some large cities, multiple air quality measurement stations are deployed at different locations to monitor air pollutants, such as NO2, CO, PM 2.5 and PM 10, over time. At every monitoring time stamp t, we observe one station×feature matrix x_t of the pollutant data, so the sequence of matrices forms a spatio-temporal process. Traditional methods for predicting air quality typically use data from a single station or can only predict a single pollutant (such as PM 2.5) at a time, which ignores the spatial correlation among different stations. Moreover, air pollution data are typically highly nonstationary. We propose a de-trending graph convolutional LSTM (long short-term memory) to continuously predict the whole station×feature matrix over the next 1 to 48 hours. The model not only captures the spatial dependency among multiple stations by replacing an inner product with convolution, but also incorporates de-trending signals (obtained by differencing the data, which transforms a nonstationary process into a stationary one). Experiments on air quality data from Chengdu and multiple major cities in China demonstrate the feasibility of our method and show promising results.
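As a small illustration of the de-trending step only, the sketch below applies first-order differencing to a sequence of station×feature matrices and then inverts it. It is a minimal example under assumptions, not the model itself: the function names `difference` and `undifference`, the choice of first-order differencing, and the toy values are all hypothetical.

```python
def difference(series):
    """First-order differencing: d_t = x_t - x_{t-1}, element-wise.

    `series` is a list of T matrices; each matrix is a list of rows
    (one row per station) holding one value per feature.
    Returns the T-1 difference matrices.
    """
    return [
        [[cur - prev for cur, prev in zip(cur_row, prev_row)]
         for cur_row, prev_row in zip(x_t, x_prev)]
        for x_prev, x_t in zip(series, series[1:])
    ]


def undifference(first, diffs):
    """Invert differencing by cumulatively adding each difference matrix."""
    out = [first]
    for d in diffs:
        last = out[-1]
        out.append([[a + b for a, b in zip(row, d_row)]
                    for row, d_row in zip(last, d)])
    return out


# Toy data (made up): 3 time stamps, 2 stations, 2 features.
# The first feature trends upward; the second is constant.
series = [
    [[10, 1], [20, 2]],
    [[12, 1], [23, 2]],
    [[14, 1], [26, 2]],
]

diffs = difference(series)                  # trend removed: constant steps
restored = undifference(series[0], diffs)   # recovers the original series
```

In this toy case the linear trend in the first feature becomes a constant after differencing, and a forecast made on the differenced scale can be mapped back to pollutant levels by the cumulative sum in `undifference`.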