Story
End-to-end data-driven prediction of urban airflow and pollutant dispersion
Key takeaway
Researchers developed a data-driven model that predicts urban airflow and pollutant concentrations at a fraction of the cost of conventional simulation, which could help cities improve air quality and protect public health.
Quick Explainer
The key idea is a data-driven reduced-order modeling (ROM) framework that efficiently predicts both urban airflow and pollutant dispersion. Spectral proper orthogonal decomposition (SPOD) first extracts the most energetic coherent flow structures from high-fidelity computational fluid dynamics (CFD) simulations. An autoencoder then compresses the SPOD coefficients into a compact latent representation, and a long short-term memory (LSTM) network forecasts how that latent state evolves in time. Finally, a convolutional neural network (CNN) maps the predicted velocity field to the corresponding pollutant concentration field. This modular architecture is a computationally efficient alternative to expensive CFD simulations, which makes real-time forecasting and urban planning applications feasible.
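The chain of stages in this explainer can be written compactly as a sequence of maps. The notation below is an assumption introduced here for illustration and is not taken from the paper: u_n is the velocity snapshot at time step n, a_n the SPOD coefficients, z_n the latent state, c_n the concentration field, and k a hypothetical lookback window for the LSTM.

```latex
\begin{aligned}
\mathbf{a}_n &= \mathcal{P}(\mathbf{u}_n)
  && \text{SPOD: project the velocity field onto the retained modes} \\
\mathbf{z}_n &= \mathcal{E}(\mathbf{a}_n)
  && \text{autoencoder: encode to the latent space} \\
\hat{\mathbf{z}}_{n+1} &= \mathcal{F}(\mathbf{z}_{n-k+1}, \dots, \mathbf{z}_n)
  && \text{LSTM: advance the latent state} \\
\hat{\mathbf{u}}_{n+1} &= \mathcal{R}\big(\mathcal{D}(\hat{\mathbf{z}}_{n+1})\big)
  && \text{decode, then reconstruct the velocity field from the modes} \\
\hat{c}_{n+1} &= \mathcal{G}(\hat{\mathbf{u}}_{n+1})
  && \text{CNN: map velocity to concentration}
\end{aligned}
```

Written this way, the modularity is visible: each map can be retrained or swapped without touching the others, as long as its input and output spaces stay the same.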
Deep Dive
Technical Deep Dive: End-to-end Data-Driven Prediction of Urban Airflow and Pollutant Dispersion
Problem & Context
- Climate change and urban population growth are intensifying environmental stresses within cities, making urban atmospheric flows critical for public health, energy use, and livability
- Computational fluid dynamics (CFD) models are commonly used to simulate urban airflow, but their high computational cost makes them unsuitable for real-time forecasting or parametric studies
- This study aims to develop a data-driven reduced-order modeling (ROM) framework to efficiently predict urban pollutant dispersion, with a focus on the skimming flow regime in street canyons
Methodology
Data & Experimental Setup
- Large eddy simulation (LES) dataset of a 3D street canyon flow with a continuous pollutant source at the canyon base
- Snapshot data includes time-resolved streamwise and vertical velocity components, as well as scalar concentration
- Flow conforms to the skimming flow regime, characterized by a principal recirculation cell and smaller corner vortices
Reduced-Order Modeling
- Spectral proper orthogonal decomposition (SPOD):
  - Extracts coherent spatial structures and their temporal evolution in the frequency domain
  - Dimensionality reduction via energy-based truncation and mode similarity criteria
- Autoencoder (AE):
  - Learns a compact, nonlinear latent representation of the SPOD coefficients
- Long short-term memory (LSTM):
  - Models the temporal evolution of the latent space variables
- Convolutional neural network (CNN):
  - Maps the predicted velocity field to the corresponding pollutant concentration field
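To make the SPOD step in the list above concrete, here is a minimal NumPy sketch of a Welch-type SPOD. It is an illustration under stated assumptions, not the authors' implementation: the function name `spod`, the Hann window, the 50% overlap, and the uniform spatial weighting are all choices made here, and the source does not report the settings actually used.

```python
import numpy as np

def spod(snapshots, n_fft=64, overlap=0.5, dt=1.0):
    """Welch-type SPOD of snapshots with shape (n_time, n_space).

    Assumes uniform spatial weights. Energies are returned up to a
    constant normalisation factor (window and time-step scaling omitted).
    """
    q = snapshots - snapshots.mean(axis=0)            # remove the temporal mean
    n_time, n_space = q.shape
    step = max(1, int(n_fft * (1.0 - overlap)))
    starts = range(0, n_time - n_fft + 1, step)
    window = np.hanning(n_fft)[:, None]

    # FFT of each windowed block in time: (n_blocks, n_freq, n_space)
    q_hat = np.stack([np.fft.rfft(window * q[s:s + n_fft], axis=0) for s in starts])
    n_blocks = q_hat.shape[0]
    freqs = np.fft.rfftfreq(n_fft, d=dt)

    energies = np.empty((freqs.size, n_blocks))
    modes = np.empty((freqs.size, n_space, n_blocks), dtype=complex)
    for k in range(freqs.size):
        x = q_hat[:, k, :].T / np.sqrt(n_blocks)      # (n_space, n_blocks)
        # Method of snapshots: eigen-decompose the small block-by-block matrix
        lam, theta = np.linalg.eigh(x.conj().T @ x)
        order = np.argsort(lam)[::-1]                 # most energetic first
        lam = np.maximum(lam[order], 0.0)             # clip round-off negatives
        theta = theta[:, order]
        energies[k] = lam
        modes[k] = x @ theta / np.sqrt(np.maximum(lam, 1e-30))
    return freqs, energies, modes
```

For each frequency, `energies[k]` holds the modal energies in descending order and `modes[k][:, j]` is the j-th spatial mode, normalised to unit length whenever its energy is non-zero. A peak in `energies[:, 0]` over `freqs` marks a frequency at which one coherent structure dominates.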
Results
SPOD Analysis
- SPOD effectively isolates key flow structures, such as the Kelvin-Helmholtz instability in the shear layer
- Dimensionality reduction retains 97% of the total turbulent kinetic energy using 2,003 SPOD modes
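The energy-based part of the truncation can be sketched as follows. This is a hedged illustration: the helper name `modes_for_energy` is hypothetical, the 0.97 default simply mirrors the figure quoted above, and the mode similarity criterion mentioned in the methodology is not reproduced because the source does not specify it.

```python
import numpy as np

def modes_for_energy(eigenvalues, target=0.97):
    """Smallest number of modes whose cumulative energy reaches `target`.

    `eigenvalues` may have any shape (for example frequency x mode);
    all entries are pooled and ranked together.
    """
    lam = np.sort(np.ravel(eigenvalues))[::-1]        # largest energy first
    cumulative = np.cumsum(lam) / lam.sum()           # fraction of total energy
    count = int(np.searchsorted(cumulative, target)) + 1
    return min(count, lam.size)                       # guard against round-off at target = 1
```

Applied to the `energies` array of an SPOD, the returned count is the size of the retained basis under the energy criterion alone.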
Latent Space Compression
- AE with 30-dimensional latent space captures the essential flow dynamics
- Reconstructed velocity fields exhibit good agreement with the reference LES data
Velocity Field Prediction
- LSTM accurately forecasts the temporal evolution of the latent variables, reproducing the phase space topology
- Velocity field reconstruction remains bounded and stable over long prediction horizons
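Long-horizon stability matters because, in a closed-loop forecast, each prediction is fed back as input and errors can compound. The sketch below shows that feedback loop, assuming the forecast is run autoregressively (the source does not spell this out). The `step_fn` argument is a hypothetical stand-in for the trained LSTM, not an API from the paper.

```python
import numpy as np

def rollout(step_fn, history, n_steps):
    """Closed-loop forecast of latent states.

    step_fn maps a (lookback, latent_dim) window to the next latent vector;
    it stands in for a trained LSTM (hypothetical interface).
    """
    window = np.array(history, dtype=float)
    predictions = []
    for _ in range(n_steps):
        z_next = np.asarray(step_fn(window), dtype=float)
        predictions.append(z_next)
        # Slide the window: drop the oldest state, append the new prediction
        window = np.vstack([window[1:], z_next])
    return np.array(predictions)
```

Because the window is built only from earlier predictions after the first few steps, a bounded trajectory over many steps is a meaningful check that the learned dynamics do not drift or blow up.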
Pollutant Dispersion Prediction
- CNN-based mapping from velocity to concentration field successfully captures the large-scale plume topology and ventilation characteristics
- Fine-scale structures are somewhat smoothed, but overall agreement with the LES reference is good
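The velocity-to-concentration mapping can be pictured as an image-to-image network: two input channels (streamwise and vertical velocity) and one output channel (concentration). The toy forward pass below is a sketch only. The two-layer layout, the 3x3 kernels, the ReLU activations, and the `params` dictionary are assumptions made here; the source does not describe the actual architecture.

```python
import numpy as np

def conv2d_same(x, kernels, bias):
    """Zero-padded 2D convolution (cross-correlation, as in deep learning).

    x: (c_in, H, W); kernels: (c_out, c_in, k, k) with odd k; bias: (c_out,).
    """
    c_out, c_in, k, _ = kernels.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))   # zero padding keeps H and W
    h, w = x.shape[1:]
    out = np.zeros((c_out, h, w))
    for i in range(k):
        for j in range(k):
            patch = xp[:, i:i + h, j:j + w]            # shifted view, (c_in, H, W)
            out += np.einsum("oc,chw->ohw", kernels[:, :, i, j], patch)
    return out + bias[:, None, None]

def velocity_to_concentration(velocity, params):
    """Map a (2, H, W) velocity field to an (H, W) concentration field."""
    hidden = np.maximum(conv2d_same(velocity, params["k1"], params["b1"]), 0.0)
    out = conv2d_same(hidden, params["k2"], params["b2"])
    # Final ReLU keeps the concentration non-negative (an assumption of this sketch)
    return np.maximum(out[0], 0.0)
```

Each output pixel depends only on a local neighbourhood of the velocity field, which is the locality that convolutional layers build in; in a trained model the kernels would be fitted to paired velocity and concentration snapshots.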
Interpretation
- The proposed end-to-end data-driven ROM framework demonstrates the capability to efficiently predict both the velocity and pollutant concentration fields in an urban street canyon
- The modular architecture allows for flexible customization and future extensions, such as incorporating data assimilation or transfer learning
- The framework offers a computationally efficient alternative to high-fidelity CFD simulations, enabling real-time forecasting and extensive parametric studies for urban planning and air quality management
Limitations & Uncertainties
- The framework succeeds only insofar as the SPOD modes span the range of flow parameters of interest, so more configurations should be incorporated in the training dataset
- Training the neural networks is computationally expensive, although techniques such as transfer learning could potentially reduce this cost
- Validation is limited to a single street canyon configuration; further testing on diverse urban geometries is required to assess the generalization capabilities
What Comes Next
- Explore methods to incorporate additional physical constraints and domain knowledge into the neural network architectures
- Investigate strategies for online model adaptation and data assimilation to improve long-term prediction accuracy
- Extend the framework to handle time-varying boundary conditions and unsteady scenarios
