Global River Forecasting with a Topology-Informed AI Foundation Model

Climate, Computing

Key takeaway

An AI-powered model can forecast river levels across entire networks, even where observational data are scarce, helping communities better prepare for floods and droughts.

Quick Explainer

The key insight behind GraphRiverCast (GRC) is that river systems possess inherent structural constraints, particularly in their network topology, which an AI model can leverage to predict flow accurately even without historical river states. GRC fuses three complementary encoders (for static features, temporal dynamics, and river network topology) to capture this non-Euclidean structure. Unlike previous data-driven approaches, GRC's topology-informed neural operator framework lets it operate in a "ColdStart" mode, generating real-time predictions without relying on past river states. This architecture allows GRC to maintain robust predictive skill across global river systems while still allowing the model to be fine-tuned with sparse local observations.

Deep Dive

Technical Deep Dive: Global River Forecasting with a Topology-Informed AI Foundation Model

Overview

This paper presents GraphRiverCast (GRC), a topology-informed AI framework for global-scale multivariate river simulation. GRC integrates static geomorphic features, temporal dynamics, and river network topology to capture the non-Euclidean structure of river systems. Distinct from previous studies, the GRC model is designed to operate in a "ColdStart" mode, generating predictions without relying on historical river states.

Problem & Context

River systems are vital links in the global hydrological cycle, and water-related disasters have surged due to climate change.
Existing physics-based river models face challenges due to high computational costs, while standard deep learning models struggle in ungauged basins due to data scarcity.
The key insight is that river systems are inherently dissipative: the influence of initial conditions fades over time. An AI model that internalizes the system's structural constraints, particularly topology, should therefore be able to reconstruct flow dynamics from forcings alone.
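The dissipative argument can be illustrated with a toy example. The sketch below is not part of GRC or CaMa-Flood; it uses a single linear reservoir (a hypothetical stand-in for a river reach, with an assumed drainage coefficient `k` and made-up forcing values) to show that two runs with very different initial storage converge when driven by the same forcings.

```python
def simulate_storage(initial_storage, forcings, k=0.2):
    """Step a linear reservoir: outflow is a fixed fraction k of storage.

    All names and values here are illustrative assumptions, not GRC internals.
    """
    storage = initial_storage
    trajectory = []
    for inflow in forcings:
        outflow = k * storage  # dissipation: storage drains in proportion to itself
        storage = storage + inflow - outflow
        trajectory.append(storage)
    return trajectory


# Same forcing series, two very different (hypothetical) initial states.
forcings = [5.0, 3.0, 8.0, 2.0, 6.0] * 12  # 60 time steps
wet_start = simulate_storage(500.0, forcings)
dry_start = simulate_storage(0.0, forcings)

# The gap between the runs shrinks by a factor (1 - k) at every step.
gaps = [abs(w - d) for w, d in zip(wet_start, dry_start)]
print(f"gap after 1 step:   {gaps[0]:.3f}")
print(f"gap after 60 steps: {gaps[-1]:.6f}")
```

Because the gap decays geometrically, the initial state stops mattering after enough steps. That is the property a forcing-only ("ColdStart") prediction relies on.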

Methodology

Model Architecture

GRC fuses three complementary encoders: a feature encoder for static attributes, a temporal encoder for dynamics, and a graph encoder for topology.
This topology-informed neural operator framework enables adaptive spatiotemporal learning with real-time computational efficiency.
GRC operates in two modes: GRC-HotStart (with historical state initialization) and GRC-ColdStart (without historical states).
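To make concrete what "topology" means as an input, here is a minimal sketch of a river network stored as a directed graph. It is not the GRC graph encoder. It only computes a steady-state flow accumulation on a hypothetical five-reach network with made-up runoff values, to show the upstream-to-downstream structure that a graph encoder can exploit. The reach names and numbers are assumptions for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical toy network: each reach maps to the reaches directly upstream of it.
upstream = {
    "headwater_a": [],
    "headwater_b": [],
    "confluence": ["headwater_a", "headwater_b"],
    "tributary_c": [],
    "outlet": ["confluence", "tributary_c"],
}

# Hypothetical local runoff generated within each reach (arbitrary units).
local_runoff = {
    "headwater_a": 2.0,
    "headwater_b": 3.0,
    "confluence": 1.0,
    "tributary_c": 4.0,
    "outlet": 0.5,
}


def accumulate_discharge(upstream, local_runoff):
    """Steady-state discharge: local runoff plus everything arriving from upstream."""
    discharge = {}
    # static_order() yields each reach only after all of its upstream reaches.
    for reach in TopologicalSorter(upstream).static_order():
        inflow = sum(discharge[u] for u in upstream[reach])
        discharge[reach] = local_runoff[reach] + inflow
    return discharge


discharge = accumulate_discharge(upstream, local_runoff)
for reach, q in discharge.items():
    print(f"{reach:12s} {q:.1f}")
```

In this toy case the outlet value depends on every reach above it, and that dependence is fixed entirely by the graph. A model without the graph would have to infer those connections from data.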

Training & Evaluation

GRC is pre-trained on physics-based CaMa-Flood simulations to capture global-scale hydrodynamic patterns.
Evaluation is performed through global pseudo-hindcasts, with metrics including Nash-Sutcliffe Efficiency (NSE) and High-Flow Volume Bias (FHV).
Ablation studies quantify the contributions of static features, temporal information, and topological structure.
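The two metrics named above can be sketched as follows. NSE uses its standard definition. FHV follows the common convention of measuring percent bias over the highest-flow segment of the flow duration curve. The 2% cutoff used here is an assumed default, since this summary does not state the paper's exact setting. The function names are illustrative.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def fhv(observed, simulated, top_fraction=0.02):
    """Percent bias in high-flow volume.

    Each series is sorted independently (as in a flow duration curve) and the
    top `top_fraction` of values is compared. The 0.02 default is an assumption.
    """
    n_high = max(1, round(top_fraction * len(observed)))
    obs_high = sorted(observed, reverse=True)[:n_high]
    sim_high = sorted(simulated, reverse=True)[:n_high]
    return 100.0 * (sum(sim_high) - sum(obs_high)) / sum(obs_high)


# Synthetic check data (not from the paper): simulation overestimates by 10%.
observed = [float(q) for q in range(1, 101)]
simulated = [1.1 * q for q in observed]
print(f"NSE: {nse(observed, simulated):.3f}")
print(f"FHV: {fhv(observed, simulated):.1f}%")
```

A positive FHV means the simulation overestimates peak-flow volume and a negative value means it underestimates. This is why FHV is often reported alongside NSE for flood-relevant evaluation.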

Results

ColdStart Performance

In the absence of historical states, GRC-ColdStart maintains robust predictive skill, with global mean NSE of ~0.82 for discharge, depth, and storage.
Topology is identified as the essential structural information: its unique contribution to NSE exceeds that of temporal information by an order of magnitude.

HotStart Performance

With historical state initialization, GRC-HotStart achieves high accuracy (NSE ~0.93), but is constrained by its dependence on observational data.
The presence of historical data reduces the model's reliance on explicit spatial routing, as temporal inertia becomes the dominant factor.

Fine-tuning for Local Improvement

GRC's pre-training and fine-tuning framework bridges global hydrodynamic knowledge with sparse local observations.
In the Amazon Basin, the fine-tuned GRC outperforms both the physics-based baseline and data-driven models trained from scratch, especially at ungauged reaches.
The Upper Danube experiment demonstrates GRC's cross-scale transferability, leveraging scale-invariant topological priors.

Limitations & Uncertainties

The pre-training uses outputs from the CaMa-Flood model, which faces its own challenges related to forcing uncertainties and limited calibration.
The current model does not explicitly represent anthropogenic water management structures like dams and reservoirs, limiting its fidelity in regulated basins.

What Comes Next

Integrating knowledge from multiple physics-based models and incorporating explicit water management modules are identified as future directions.
The "pre-train–fine-tune" paradigm establishes a collaborative framework for global river AI, enabling researchers to leverage a general foundation model and contribute local refinements.

Source

Global River Forecasting with a Topology-Informed AI Foundation Model
Preprint, arXiv cs.LG, 2/27/2026