End-to-end data-driven prediction of urban airflow and pollutant dispersion

Materials & Engineering, Earth & Environment

Key takeaway

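Researchers developed a data-driven framework that predicts airflow and pollutant dispersion in an idealized urban street canyon without solving the fluid dynamics equations at prediction time. Tools of this kind could eventually help urban planners improve air quality and reduce public health risks.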

Quick Explainer

The framework chains four components to model and predict urban airflow and pollutant dispersion. Spectral Proper Orthogonal Decomposition (SPOD) extracts the dominant flow structures, an autoencoder compresses them nonlinearly, a Long Short-Term Memory (LSTM) network forecasts the compressed state in time, and a Convolutional Neural Network (CNN) maps velocity to pollutant concentration fields. This modular design captures the complex, multiscale dynamics of street canyon flow while keeping computational cost low. The key novelty is the end-to-end data-driven approach: once trained on simulation data, the framework makes predictions without explicitly solving the governing fluid dynamics equations.

Deep Dive

Technical Deep Dive: End-to-end data-driven prediction of urban airflow and pollutant dispersion

Overview

This paper proposes an end-to-end data-driven framework for efficiently modeling and predicting the airflow and pollutant dispersion in an urban street canyon. The framework integrates several key components:

Spectral Proper Orthogonal Decomposition (SPOD) to extract coherent flow structures
Autoencoder for nonlinear dimensionality reduction of the SPOD coefficients
Long Short-Term Memory (LSTM) network for temporal forecasting of the latent space
Convolutional Neural Network (CNN) for mapping the velocity field to the pollutant concentration field

The modular design allows the framework to capture the multiscale, chaotic dynamics of the street canyon flow while maintaining computational efficiency.

Methodology

SPOD: The high-fidelity Large Eddy Simulation (LES) data is decomposed using SPOD to extract the dominant spatial and temporal flow structures. Dimensionality reduction is achieved by:
- Retaining only the 238 leading frequencies that capture 99% of the total turbulent kinetic energy (TKE)
- Removing redundant SPOD modes by enforcing a spatial similarity threshold of 0.2, reducing the final basis to 2003 modes
Autoencoder: The time-domain SPOD coefficients are compressed into a 30-dimensional latent space using a dense autoencoder. This nonlinear compression retains the essential manifold of the dynamics while removing high-frequency content.
LSTM: A Long Short-Term Memory (LSTM) network is trained to propagate the latent space variables forward in time, enabling autonomous forecasting of the flow field evolution.
CNN: A Convolutional Neural Network (CNN) is employed to learn the nonlinear mapping from the reconstructed velocity field to the corresponding pollutant concentration field.
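
The SPOD step and its two reduction rules can be illustrated with a small NumPy sketch. This is not the authors' code. It assumes a Welch-type SPOD with unit spatial weights and a Hann window, ranks frequencies by their summed SPOD energy, and reads "spatial similarity" as the absolute cosine similarity between modes, dropping a mode when it exceeds the threshold against any mode already kept. The function names (`spod`, `leading_frequencies`, `prune_similar`) and the synthetic travelling-wave data are hypothetical.

```python
import numpy as np

def spod(snapshots, n_fft, overlap=0.5):
    """Welch-type SPOD of fluctuations with shape (n_time, n_space).
    Returns energies (n_freq, n_blocks) and modes (n_freq, n_space, n_blocks)."""
    n_time, n_space = snapshots.shape
    step = max(1, int(n_fft * (1.0 - overlap)))
    window = np.hanning(n_fft)
    # Fourier-transform each windowed block along the time axis.
    blocks = np.stack([
        np.fft.rfft(snapshots[s:s + n_fft] * window[:, None], axis=0)
        for s in range(0, n_time - n_fft + 1, step)
    ])  # shape (n_blocks, n_freq, n_space)
    n_blocks, n_freq, _ = blocks.shape
    energies = np.empty((n_freq, n_blocks))
    modes = np.empty((n_freq, n_space, n_blocks), dtype=complex)
    for k in range(n_freq):
        Q = blocks[:, k, :].T / np.sqrt(n_blocks)  # (n_space, n_blocks)
        # Method of snapshots: eigen-decompose the small n_blocks x n_blocks matrix.
        lam, theta = np.linalg.eigh(Q.conj().T @ Q)
        order = np.argsort(lam)[::-1]
        lam, theta = np.clip(lam[order], 0.0, None), theta[:, order]
        energies[k] = lam
        # Modes with near-zero energy are numerically meaningless.
        modes[k] = (Q @ theta) / np.sqrt(np.maximum(lam, 1e-30))
    return energies, modes

def leading_frequencies(energies, fraction=0.99):
    """Indices of the fewest frequencies whose summed energy reaches `fraction`."""
    per_freq = energies.sum(axis=1)
    order = np.argsort(per_freq)[::-1]
    cumulative = np.cumsum(per_freq[order]) / per_freq.sum()
    n_keep = min(len(order), int(np.searchsorted(cumulative, fraction)) + 1)
    return np.sort(order[:n_keep])

def prune_similar(mode_list, threshold):
    """Greedy filter (assumed rule): keep a mode only if its absolute cosine
    similarity with every mode already kept is at most `threshold`."""
    kept = []
    for phi in mode_list:
        unit = phi / np.linalg.norm(phi)
        if all(abs(np.vdot(u, unit)) <= threshold for u in kept):
            kept.append(unit)
    return np.array(kept)

# Synthetic stand-in for LES data: a 5 Hz travelling wave plus weak noise.
rng = np.random.default_rng(0)
fs, n_time, n_space, n_fft = 100.0, 2048, 64, 256
t = np.arange(n_time) / fs
x = np.linspace(0.0, 2.0 * np.pi, n_space)
data = np.sin(x[None, :] - 2.0 * np.pi * 5.0 * t[:, None])
data += 0.05 * rng.standard_normal((n_time, n_space))
data -= data.mean(axis=0)  # SPOD acts on fluctuations about the mean

energies, modes = spod(data, n_fft)
freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
peak_freq = freqs[np.argmax(energies[:, 0])]
kept_freqs = leading_frequencies(energies, 0.99)
pruned = prune_similar([modes[k, :, 0] for k in kept_freqs], threshold=0.2)
print(peak_freq, len(kept_freqs), len(pruned))
```

On this toy signal the leading-mode energy peaks at the frequency bin nearest 5 Hz, and the pruning step discards neighbouring-frequency modes that share the same spatial shape. The paper's actual weighting, windowing, and similarity measure may differ.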

Data & Experimental Setup

The dataset used in this study consists of time-resolved snapshots obtained from a Large Eddy Simulation (LES) of a 3D street canyon flow. The simulation domain represents an idealized urban configuration with two rows of rectangular blocks, forming a canyon with an aspect ratio (height/width) of 1.

The simulation was run for a sufficiently long duration to ensure the boundary layer had developed and the flow reached a quasi-steady state. Velocity and scalar concentration data were sampled at a high frequency (1000 Hz) over a period of 80 flow-through times, resulting in 80,000 snapshots.

Results

SPOD Analysis: The SPOD analysis revealed that the essential flow dynamics can be captured by retaining only the 238 leading frequencies, an 88% reduction of the original feature space. Further pruning based on spatial mode similarity reduced the basis to 2003 modes, a 97% reduction.
Autoencoder Performance: The 30-dimensional latent space produced by the autoencoder accurately reconstructed the temporal evolution of the dominant SPOD coefficients, with a normalized mean-squared error (NMSE) of 0.089 for the velocity field.
LSTM Forecasting: The LSTM network successfully learned the nonlinear dynamics of the latent space, accurately predicting the temporal evolution of the SPOD coefficients over a long horizon. The predicted velocity fields maintained bounded errors compared to the SPOD reconstruction.
Pollutant Dispersion: The CNN model effectively mapped the predicted velocity fields to the corresponding pollutant concentration fields. The framework captured the key features of the time-averaged concentration distribution and the vertical mass flux across the canyon, despite some underestimation of small-scale gradients near the source.
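
For readers unfamiliar with the NMSE reported for the autoencoder, the sketch below shows one common definition: the summed squared error divided by the summed squared reference. The summary does not state which normalization the paper uses (variance-based normalizations are also common), so this definition and the function name `nmse` are assumptions.

```python
import numpy as np

def nmse(reference, prediction):
    """Squared error summed over all entries, divided by the energy of the reference."""
    reference = np.asarray(reference, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    return np.sum((reference - prediction) ** 2) / np.sum(reference ** 2)

# Tiny worked case: one entry is off by 1, and the reference energy is 1 + 4 + 9 + 16 = 30.
truth = np.array([[1.0, 2.0], [3.0, 4.0]])
approx = np.array([[1.0, 2.0], [3.0, 5.0]])
score = nmse(truth, approx)
print(score)  # 1/30, about 0.033
```

Under this definition, a value of 0.089 would mean the reconstruction error carries roughly 9% of the reference field's energy.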

Interpretation

The proposed data-driven reduced-order modeling framework demonstrates the capability to efficiently predict both the airflow and pollutant dispersion in an urban street canyon. The modular architecture leverages the strengths of different neural network components to address the challenges of multiscale, chaotic turbulent flows.

The SPOD-based dimensionality reduction preserves the energetically dominant flow structures, while the autoencoder provides a compact latent representation suitable for temporal forecasting. The LSTM model successfully learned the underlying dynamics of this latent space, enabling long-term autonomous predictions. Finally, the CNN mapping from velocity to concentration fields allowed the framework to estimate the dispersion of a passive scalar without explicitly solving the transport equation.
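
The autonomous forecasting described above amounts to a closed-loop rollout: the network is warmed up on known latent states, then each prediction is fed back as the next input. The sketch below shows only these mechanics, using a standard LSTM cell written in NumPy with random, untrained weights. The 30-dimensional latent size comes from the summary. The hidden size, the linear read-out `W_out`, and the function names are hypothetical and not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One standard LSTM cell update. The stacked weights hold the
    input, forget, candidate, and output gates in that order."""
    z = W @ x + U @ h + b
    i, f, g, o = np.split(z, 4)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

def rollout(warmup, n_steps, params):
    """Warm up on known latent states, then feed each prediction back as input."""
    W, U, b, W_out = params
    h = np.zeros(U.shape[1])
    c = np.zeros(U.shape[1])
    for x in warmup:  # teacher-forced phase on known data
        h, c = lstm_step(x, h, c, W, U, b)
    predictions = []
    x = W_out @ h  # first autonomous prediction
    for _ in range(n_steps):
        predictions.append(x)
        h, c = lstm_step(x, h, c, W, U, b)  # the prediction becomes the next input
        x = W_out @ h
    return np.array(predictions)

# Random weights only: this demonstrates the data flow, not a trained model.
rng = np.random.default_rng(0)
latent_dim, hidden_dim = 30, 64  # hidden_dim is an illustrative choice
params = (
    0.1 * rng.standard_normal((4 * hidden_dim, latent_dim)),
    0.1 * rng.standard_normal((4 * hidden_dim, hidden_dim)),
    np.zeros(4 * hidden_dim),
    0.1 * rng.standard_normal((latent_dim, hidden_dim)),
)
warmup = rng.standard_normal((20, latent_dim))
forecast = rollout(warmup, n_steps=200, params=params)
print(forecast.shape)  # (200, 30)
```

Because every output re-enters the network, errors can compound over the horizon, which is why bounded error over long rollouts is a meaningful result for a chaotic flow.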

Limitations & Uncertainties

The success of the framework relies on the ability of the SPOD modes to span the possible range of parameters encountered in real-world urban environments. Expanding the training database to include more geometric configurations and meteorological conditions would be necessary to ensure robustness.
The computational cost of training the neural network components is the main bottleneck of the approach. However, once trained, the models can be deployed efficiently for real-time inference.
The CNN model exhibits some limitations in capturing small-scale concentration gradients near the source, likely due to the use of a mean-squared error loss function. Incorporating alternative loss formulations or architectures may improve the fidelity of the scalar field predictions.
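
As one illustration of what an alternative loss formulation could look like, the sketch below adds a penalty on finite-difference spatial gradients to the usual mean-squared error, so that smoothing out a sharp plume costs more than it would under MSE alone. This is a generic idea rather than something proposed in the paper. The function name, the `weight` parameter, and the Gaussian test fields are hypothetical.

```python
import numpy as np

def gradient_aware_loss(target, prediction, weight=1.0):
    """MSE on field values plus a weighted MSE on spatial gradients.
    Expects 2D fields; gradients use np.gradient with unit grid spacing."""
    value_term = np.mean((target - prediction) ** 2)
    grad_term = 0.0
    for g_t, g_p in zip(np.gradient(target), np.gradient(prediction)):
        grad_term += np.mean((g_t - g_p) ** 2)
    return value_term + weight * grad_term

# A sharp plume near a source, and an over-smoothed prediction of it.
y, x = np.mgrid[-1:1:32j, -1:1:32j]
sharp = np.exp(-(x ** 2 + y ** 2) / 0.02)
blurred = 0.25 * np.exp(-(x ** 2 + y ** 2) / 0.08)

plain_mse = np.mean((sharp - blurred) ** 2)
combined = gradient_aware_loss(sharp, blurred, weight=1.0)
offset_only = gradient_aware_loss(sharp, sharp + 0.1)  # uniform bias, gradients unchanged
print(plain_mse, combined, offset_only)
```

A uniform offset leaves the gradient term at zero, while the blurred field is penalized on both terms. The weight would need tuning against the data.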

What Comes Next

Future work could explore the following avenues:

Incorporating data assimilation techniques to update the model outputs with new observations, improving long-term predictive capabilities.
Investigating transfer learning approaches to efficiently retrain the model for new urban configurations, reducing the need for costly retraining from scratch.
Exploring more advanced neural network architectures, such as physics-informed neural networks, to better encode the underlying fluid dynamics within the modeling framework.

Source

End-to-end data-driven prediction of urban airflow and pollutant dispersion
Preprint, arXiv cs.LG, 3/19/2026