Bosong Zhang

Model Development and Evaluation

Climate Emulators

Cross-architecture emulator evaluation shows that mean-state agreement alone is insufficient: process-level and distributional fidelity determine predictive reliability under warming.

Generative + Autoregressive + Hybrid | Uniform SST Warming | Out-of-Sample Diagnostics

3 Model Families

cBottle, ACE2-ERA5, and NeuralGCM

10-Year Runs

Control and warming trajectories for emulator stress tests

arXiv (2025)

Equilibrium response to uniform sea-surface-temperature warming

Related Publication

Zhang, B., and T. M. Merlis, 2025: The Equilibrium Response of Atmospheric Machine-Learning Models to Uniform Sea Surface Temperature Warming. arXiv, 2510.02415.

Scientific Logic

  • Question: How reliably do climate emulators reproduce baseline climate structure and warming responses relevant to impacts?
  • Method: Cross-model evaluation of generative (cBottle), autoregressive (ACE2-ERA5), and hybrid (NeuralGCM) emulators against reference model fields in 10-year control and uniform-SST-warming runs.
  • Mechanism: Architecture-dependent skill in preserving spatial covariance, variability, and extremes determines whether mean-state agreement translates to trustworthy anomaly responses.
  • Main Findings: Some emulator families match global means but diverge regionally and in distribution tails, motivating process-aware constraints before impact attribution.
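
The distinction drawn above between global-mean agreement, regional divergence, and tail behavior can be made concrete with a minimal sketch. It is illustrative only and is not the evaluation code behind the publication: the function names, the regular latitude-longitude grid, and the toy fields are all assumptions, and only the Python standard library is used.

```python
import math

def area_weights(lats_deg):
    """Cosine-latitude weights for a regular grid, normalized to sum to 1."""
    w = [math.cos(math.radians(lat)) for lat in lats_deg]
    total = sum(w)
    return [x / total for x in w]

def global_mean(field, lats_deg):
    """Area-weighted mean of a [lat][lon] field."""
    w = area_weights(lats_deg)
    return sum(wi * sum(row) / len(row) for wi, row in zip(w, field))

def weighted_rmse(field, reference, lats_deg):
    """Area-weighted root-mean-square difference between two fields."""
    w = area_weights(lats_deg)
    total = 0.0
    for wi, row, ref_row in zip(w, field, reference):
        total += wi * sum((a - b) ** 2 for a, b in zip(row, ref_row)) / len(row)
    return math.sqrt(total)

def quantile(samples, q):
    """Empirical quantile with linear interpolation, for 0 <= q <= 1."""
    s = sorted(samples)
    pos = q * (len(s) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

# Toy fields (made up): the emulator matches the reference global mean
# exactly but places the equatorial signal in the wrong longitude.
lats = [-45.0, 0.0, 45.0]
reference = [[1.0, 1.0], [3.0, 3.0], [1.0, 1.0]]
emulator = [[1.0, 1.0], [1.0, 5.0], [1.0, 1.0]]

mean_gap = global_mean(emulator, lats) - global_mean(reference, lats)
pattern_error = weighted_rmse(emulator, reference, lats)

# Tail comparison on flattened samples: same mean, heavier upper tail.
flat_ref = [v for row in reference for v in row]
flat_emu = [v for row in emulator for v in row]
tail_gap = quantile(flat_emu, 0.95) - quantile(flat_ref, 0.95)
```

In this toy case `mean_gap` is zero while `pattern_error` and `tail_gap` are not, which is the situation the Main Findings bullet describes.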

Scientific Focus

How reliably do next-generation climate emulators reproduce mean-state and warming-response behavior, particularly when evaluated outside their training distribution?

Model Set

  • cBottle: non-autoregressive diffusion framework generating hourly fields from SST forcing.
  • ACE2-ERA5: autoregressive Spherical Fourier Neural Operator (SFNO) architecture at 1° horizontal resolution with a 6-hour time step.
  • NeuralGCM: hybrid dynamical-core plus neural-physics model enabling stable decadal runs.
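
The structural difference between the families can be sketched in a few lines. This is a toy illustration and does not use the real cBottle, ACE2-ERA5, or NeuralGCM interfaces: `toy_step` and `toy_sample` are hypothetical stand-ins, and the scalar "state" replaces full atmospheric fields. An autoregressive model feeds each predicted state back in as the next input, while a non-autoregressive generator draws each output from the forcing alone. A hybrid model also steps forward in time, so it follows the rollout pattern.

```python
import random

def rollout_autoregressive(step_fn, initial_state, forcings):
    """Advance the state one step per forcing, feeding each output back in."""
    states = []
    state = initial_state
    for forcing in forcings:
        state = step_fn(state, forcing)
        states.append(state)
    return states

def sample_independent(sample_fn, forcings, rng):
    """Draw one output per forcing; no state is carried between draws."""
    return [sample_fn(forcing, rng) for forcing in forcings]

def toy_step(state, sst):
    # Hypothetical dynamics: relax halfway toward the SST each step.
    return 0.5 * state + 0.5 * sst

def toy_sample(sst, rng):
    # Hypothetical generator: SST plus small Gaussian noise.
    return sst + rng.gauss(0.0, 0.1)

forcings = [300.0] * 4  # constant SST forcing in kelvin (made-up value)
ar_states = rollout_autoregressive(toy_step, 290.0, forcings)
samples = sample_independent(toy_sample, forcings, random.Random(0))
```

In the rollout, errors can accumulate because each step depends on the previous output; in the independent sampler, temporal coherence has to come entirely from the conditioning. Both properties matter when the SST forcing is shifted outside the training range.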

Key Findings

  • Different emulator families show distinct strengths in reproducing spatial patterns, variability, and tail behavior.
  • Skill in the control climate does not guarantee a robust response under warming perturbations.
  • Process-aware diagnostics are required before using emulators for attribution-sensitive applications.
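
As a rough sketch of how a uniform SST warming experiment and its response diagnostic might be set up: the helper names, the ocean mask, the warming amplitude `delta_k`, and all numbers below are assumptions for illustration, not values or code from the publication. The example shows two hypothetical emulators with identical control-climate means whose responses per kelvin differ.

```python
def uniform_sst_warming(sst_k, ocean_mask, delta_k):
    """Add delta_k (kelvin) to SST at ocean points; leave other points unchanged."""
    return [
        [t + delta_k if is_ocean else t for t, is_ocean in zip(row, mask_row)]
        for row, mask_row in zip(sst_k, ocean_mask)
    ]

def mean(values):
    return sum(values) / len(values)

def response_per_kelvin(control, warmed, delta_k):
    """Change in the time mean per kelvin of imposed SST warming."""
    return (mean(warmed) - mean(control)) / delta_k

# Made-up 1 x 2 SST grid: one ocean point, one land point.
sst = [[300.0, 290.0]]
mask = [[True, False]]
delta_k = 2.0  # assumed amplitude for illustration
warmed_sst = uniform_sst_warming(sst, mask, delta_k)

# Made-up global-mean precipitation series (mm/day) for two emulators.
control_a, warmed_a = [3.0, 3.0], [3.1, 3.3]
control_b, warmed_b = [3.0, 3.0], [3.0, 3.0]

resp_a = response_per_kelvin(control_a, warmed_a, delta_k)
resp_b = response_per_kelvin(control_b, warmed_b, delta_k)
```

Both toy emulators share the same control mean, yet only one responds to the perturbation, so a control-only skill score would not separate them.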

Figure