AGFlow

Accepted at CVPR 2026 MORSE Workshop

Asynchronous Remote Sensing Time-Series Fusion for Cloud Removal and Anytime Reconstruction

AGFlow is a timestamp-conditioned spatiotemporal flow-matching framework that fuses asynchronous Sentinel-1 SAR and Sentinel-2 optical time series for cloud removal, full-frame gap filling, and anytime reconstruction.

Forouzan Fallah¹    Chia-Yu Hsu²    Wenwen Li²    Anna Liljedahl³    Yezhou Yang¹

¹ Arizona State University, School of Computing and Augmented Intelligence
² Arizona State University, School of Geographical Sciences and Urban Planning
³ Woodwell Climate Research Center

Abstract

Frequent cloud cover makes Sentinel-2 optical time series incomplete and irregular. AGFlow addresses this by treating acquisition time as a first-class signal, internally aligning asynchronous Sentinel-1 and Sentinel-2 observations, modeling spatial structure together with temporal dynamics, and generating cloud-free Sentinel-2 frames at both observed and user-specified dates.

What AGFlow adds

One model for cloud removal, gap filling, and anytime generation

AGFlow is designed for real satellite time series where optical and SAR acquisitions are irregular and asynchronous. Instead of pairing dates outside the model, it learns alignment inside the network and keeps the formulation unified across tasks.

Internal time alignment

Acquisition dates guide temporal attention and cross-sensor matching, so the model can fuse Sentinel-1 and Sentinel-2 without nearest-date preprocessing.

Spatiotemporal generation

A Sequential Denoising Transformer works on spatiotemporal patch tokens, preserving image structure while modeling temporal change.
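As a rough illustration of what spatiotemporal patch tokens can look like, the sketch below splits a (T, C, H, W) image sequence into non-overlapping spatial patches for each date. The function name patchify_sequence, the tensor layout, and the patch size are assumptions made for this example; the paper's tokenizer may differ, for instance by using a learned projection or patches that also span time.

```python
import numpy as np


def patchify_sequence(frames: np.ndarray, patch: int) -> np.ndarray:
    """Split a (T, C, H, W) sequence into (T, N, C * patch * patch) tokens.

    N is the number of non-overlapping patches per frame. This is an
    illustrative layout, not the tokenizer used in the paper.
    """
    t, c, h, w = frames.shape
    if h % patch or w % patch:
        raise ValueError("H and W must be divisible by the patch size")
    gh, gw = h // patch, w // patch
    # Expose the patch grid: (T, C, gh, patch, gw, patch).
    x = frames.reshape(t, c, gh, patch, gw, patch)
    # Group each patch's pixels together: (T, gh, gw, C, patch, patch).
    x = x.transpose(0, 2, 4, 1, 3, 5)
    return x.reshape(t, gh * gw, c * patch * patch)
```

Each token keeps its date index (first axis) and its spatial position (second axis), so a transformer can attend across both space and time.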

Anytime querying

The same masked generation setup supports cloud removal, full-frame reconstruction, and synthesis at user-specified dates inside the monitoring window.
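One way to picture how a single masked setup can cover all three tasks: cloud removal uses a partial mask, full-frame reconstruction uses an all-missing mask for an existing date, and anytime synthesis adds a fully masked placeholder frame at the requested date. The sketch below shows only the last case. The helper add_query_frame, the array layouts, and the day-based date encoding are hypothetical and are not taken from the paper.

```python
import numpy as np


def add_query_frame(frames, masks, dates, query_date):
    """Insert a fully masked placeholder frame at query_date.

    frames: (T, C, H, W) optical sequence
    masks:  (T, H, W) bool, True marks pixels to be generated
    dates:  (T,) sorted acquisition times as float days
    Returns the extended arrays and the index of the new frame.
    """
    idx = int(np.searchsorted(dates, query_date))
    blank = np.zeros((1,) + frames.shape[1:], dtype=frames.dtype)
    missing = np.ones((1,) + masks.shape[1:], dtype=bool)
    frames = np.concatenate([frames[:idx], blank, frames[idx:]])
    masks = np.concatenate([masks[:idx], missing, masks[idx:]])
    dates = np.insert(dates, idx, query_date)
    return frames, masks, dates, idx
```

With this framing, the generator never needs a task switch: it only sees which pixels are observed, which are missing, and the date attached to each frame.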

Method overview

AGFlow tokenizes optical and SAR sequences into spatiotemporal patches, injects acquisition-date embeddings, and uses time-aligned cross-attention to fuse asynchronous observations before masked flow matching reconstructs the missing regions.

Figure: Overview of AGFlow. Given asynchronous Sentinel-2 and Sentinel-1 time series, the model uses date-aware spatiotemporal tokenization, time-aligned cross-attention, and masked flow matching to reconstruct cloud-free Sentinel-2 sequences.

Masked flow matching

Observed pixels stay clamped while the model updates only masked regions. This makes the same formulation work for local cloud masks and fully missing frames.
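A minimal sketch of mask-clamped sampling, assuming a plain Euler integrator over a learned velocity field. Here velocity_fn stands in for the trained network and is hypothetical, as are the step count and the Gaussian initialization of the masked pixels; the sampler used in the paper may differ.

```python
import numpy as np


def masked_euler_sample(velocity_fn, observed, mask, steps=10, seed=0):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 on masked pixels.

    observed: array of pixel values (only trusted where mask is False)
    mask:     bool array of the same shape, True marks pixels to generate
    Observed pixels are re-imposed after every step, so they never change.
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(observed.shape)
    # Start from noise in the masked region and from data elsewhere.
    x = np.where(mask, noise, observed)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = velocity_fn(x, t)
        # Update only masked pixels; clamp the rest to the observations.
        x = np.where(mask, x + dt * v, observed)
    return x
```

Because the mask is just a boolean array, a local cloud mask and a fully missing frame (a mask that is True everywhere in that frame) go through exactly the same loop.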

Time-aligned SAR fusion

Spatial cross-attention matches local structure, while temporal cross-attention selects the most relevant SAR acquisition times for each optical query time.
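To make the temporal selection concrete, here is a small sketch of cross-attention in which optical queries attend to SAR keys with an additive penalty on the acquisition-time gap. The linear penalty and the scale tau are assumptions chosen for illustration; the page does not specify the exact form of the bias used in AGFlow.

```python
import numpy as np


def time_biased_cross_attention(q, k, v, t_query, t_key, tau=10.0):
    """Cross-attention with an additive penalty on |t_query - t_key|.

    q: (Nq, D) optical query features, k: (Nk, D) SAR keys, v: (Nk, Dv) SAR values
    t_query: (Nq,) and t_key: (Nk,) acquisition times in days
    tau: hypothetical scale (in days) controlling how fast attention decays
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    gap = np.abs(t_query[:, None] - t_key[None, :])
    scores = scores - gap / tau
    # Numerically stable softmax over the SAR time axis.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v, weights
```

When the content scores are uninformative, the weights fall back to favoring the SAR acquisition closest in time, which is the behavior nearest-date pairing would hard-code outside the model.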

Real-date temporal encoding

Relative time bias and rotary temporal encoding let the network reason over real acquisition gaps instead of assuming evenly spaced time steps.
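The sketch below illustrates the rotary idea with real timestamps: feature pairs are rotated by angles proportional to the acquisition day, so the dot product between two encoded vectors depends only on the gap between their dates. The frequency schedule (base 10000) and the use of days as the time unit are assumptions borrowed from common rotary-embedding practice, not details confirmed for AGFlow.

```python
import numpy as np


def rotary_encode(x, t_days, base=10000.0):
    """Rotate consecutive feature pairs of x by angles tied to real dates.

    x: (N, D) features with even D; t_days: (N,) acquisition times in days.
    """
    n, d = x.shape
    if d % 2:
        raise ValueError("feature dimension must be even")
    freqs = base ** (-np.arange(0, d, 2) / d)  # (D/2,) rotation frequencies
    angles = np.asarray(t_days, dtype=float)[:, None] * freqs[None, :]
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x, dtype=float)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out
```

Shifting both dates by the same offset leaves the query-key dot product unchanged, so irregular gaps such as 3 days versus 17 days are represented directly rather than being flattened into consecutive step indices.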

Quantitative results

Stronger performance on missing frames and cloud-corrupted pixels

The paper reports consistent gains over RESTORE-DiT in both the harder fully-missing-frame setting and standard cloud removal on the France test set.

Missing-frame reconstruction

Full-frame gap filling

One Sentinel-2 frame is fully removed and reconstructed from the remaining temporal context and Sentinel-1 observations.

Model         MAE ↓    RMSE ↓   SAM ↓    PSNR ↑    SSIM ↑
RESTORE-DiT   0.0214   0.0322   2.9514   32.1755   0.9139
AGFlow        0.0179   0.0261   2.7761   32.8671   0.9420

AGFlow reduces MAE by 16.4% and RMSE by 18.9% in the fully missing-frame setting.
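The relative reductions quoted above follow directly from the table values. A short check, using a helper named relative_reduction that is introduced here purely for illustration:

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Percentage by which `new` is lower than `baseline`."""
    return 100.0 * (baseline - new) / baseline


# Missing-frame table: RESTORE-DiT versus AGFlow.
mae_gain = relative_reduction(0.0214, 0.0179)
rmse_gain = relative_reduction(0.0322, 0.0261)
print(f"MAE reduction:  {mae_gain:.1f}%")
print(f"RMSE reduction: {rmse_gain:.1f}%")
```

The same helper applies to any lower-is-better metric, such as SAM; for PSNR and SSIM, where higher is better, the sign of the difference would be flipped.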

Cloud removal

France test set

Metrics are computed over cloud-corrupted pixels across all ten Sentinel-2 bands.

Method         MAE ↓    RMSE ↓   SAM ↓   PSNR ↑   SSIM ↑
Linear         0.0257   0.0401   4.35    28.40    0.929
U-TILISE       0.0202   0.0314   3.76    30.38    0.936
U-TILISE-SAR   0.0193   0.0298   3.66    30.77    0.937
RESTORE-DiT    0.0140   0.0224   2.64    33.32    0.959
AGFlow         0.0133   0.0217   2.45    33.65    0.964

Compared with RESTORE-DiT, AGFlow reduces MAE by 5.0%, RMSE by 3.1%, and SAM by 7.2%, while raising PSNR from 33.32 to 33.65 and SSIM from 0.959 to 0.964.
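A hedged sketch of how pixel metrics restricted to cloud-corrupted pixels might be computed for one frame. It assumes reflectance scaled to [0, 1], a PSNR peak of 1.0, SAM reported in degrees, and one boolean cloud mask shared by all bands; none of these conventions is confirmed on this page. SSIM is omitted because it needs a windowed implementation.

```python
import numpy as np


def masked_metrics(pred, target, cloud_mask, peak=1.0):
    """MAE, RMSE, PSNR, and SAM over cloud-corrupted pixels only.

    pred, target: (B, H, W) arrays with B spectral bands
    cloud_mask:   (H, W) bool, True where the input was cloud-corrupted
    Note: PSNR is undefined (division by zero) if the error is exactly zero.
    """
    p = pred[:, cloud_mask]    # (B, N) spectra at the masked pixels
    g = target[:, cloud_mask]
    err = p - g
    mse = float(np.mean(err ** 2))
    # Spectral angle between predicted and reference spectra, per pixel.
    norms = np.linalg.norm(p, axis=0) * np.linalg.norm(g, axis=0) + 1e-12
    cos = np.clip((p * g).sum(axis=0) / norms, -1.0, 1.0)
    return {
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(mse)),
        "psnr": float(10.0 * np.log10(peak ** 2 / mse)),
        "sam_deg": float(np.degrees(np.arccos(cos)).mean()),
    }
```

Restricting the statistics to masked pixels keeps the untouched clear-sky pixels, which are trivially correct, from inflating the scores.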

Qualitative reconstruction results

AGFlow produces cleaner reconstructions under both partial masking and fully missing-frame conditions, with fewer visible artifacts and sharper spatial structure than RESTORE-DiT.

Figure: Missing-frame reconstruction and cloud removal, comparing the input, RESTORE-DiT, AGFlow, and ground truth across multiple dates. The highlighted AGFlow row stays closer to the ground truth while reducing cloud residuals and preserving field boundaries.

Sharper structure under long gaps

The paper shows that AGFlow keeps boundaries and field patterns more stable when an entire optical frame is missing, which is one of the hardest cases in the benchmark.

Cleaner cloud-affected regions

In partially masked scenes, AGFlow leaves fewer residual cloud artifacts and blends reconstructed areas more naturally into the surrounding context.

Anytime evaluation with NDVI trend agreement

For user-specified query dates that have no aligned Sentinel-2 ground truth, the paper compares NDVI computed from the generated outputs against an auxiliary RapidAI4EO cloud-free reference, focusing on regional seasonal dynamics rather than strict pixel-wise matching.
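A small sketch of a region-level NDVI comparison. NDVI is the standard (NIR - Red) / (NIR + Red) index, which for Sentinel-2 uses bands B8 and B4. The use of a regional mean per date and of Pearson correlation as the agreement score are assumptions made for illustration, as is the premise that both series have already been resampled to the same dates; the paper's exact protocol is not described on this page.

```python
import numpy as np


def ndvi(nir, red, eps=1e-8):
    """Normalized Difference Vegetation Index; eps avoids division by zero."""
    return (nir - red) / (nir + red + eps)


def regional_ndvi_series(nir_seq, red_seq):
    """Mean NDVI per date for (T, H, W) NIR and Red sequences, returned as (T,)."""
    return ndvi(nir_seq, red_seq).mean(axis=(1, 2))


def trend_agreement(series_a, series_b):
    """Pearson correlation between two NDVI time series on the same dates."""
    return float(np.corrcoef(series_a, series_b)[0, 1])
```

Correlation is insensitive to a constant offset or scale between sensors, which fits the stated goal of checking seasonal dynamics rather than pixel-wise equality.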

Figure: NDVI-based anytime evaluation, with NDVI curves for AGFlow and RapidAI4EO over time at multiple locations. AGFlow tracks seasonal vegetation dynamics closely and stays consistent with the auxiliary reference at the region level despite timing and sensor mismatch.

Why this matters. Being able to query cloud-free outputs at arbitrary dates makes the model useful for dense vegetation monitoring and other downstream workflows that need a temporally consistent optical signal even when direct observations are missing.

Resources

Paper and citation

BibTeX

@inproceedings{fallah2026asynchronous,
  title={Asynchronous Remote Sensing Time-Series Fusion for Cloud Removal and Anytime Reconstruction},
  author={Fallah, Forouzan and Hsu, Chia-Yu and Li, Wenwen and Liljedahl, Anna and Yang, Yezhou},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  pages={7772--7780},
  year={2026}
}