The field of numerical weather prediction (NWP) is undergoing its most profound transformation in decades, as machine-learning models from tech giants increasingly rival the physics-based supercomputer simulations that have defined operational forecasting since the 1950s. In recent months, the U.S. National Oceanic and Atmospheric Administration (NOAA), the European Centre for Medium-Range Weather Forecasts (ECMWF), and private-sector players including Google DeepMind, Nvidia, and Huawei have all rolled out competing systems — raising urgent questions about how national weather services will integrate, validate, and govern AI-generated forecasts during a period of intensifying extreme weather.
A Paradigm Shift in Forecasting
For more than half a century, weather prediction has relied on solving the primitive equations of atmospheric motion on dense three-dimensional grids — a computationally brutal task requiring some of the world’s largest supercomputers. The ECMWF’s Integrated Forecasting System (IFS) has long been considered the global gold standard, edging out NOAA’s Global Forecast System (GFS) in most skill metrics. But that hierarchy is being scrambled. Google DeepMind’s GraphCast model, published in Science in late 2023, demonstrated that a graph neural network trained on four decades of reanalysis data could outperform the IFS on a majority of variables — and do so in under a minute on a single TPU, rather than hours on a supercomputer.
Since then, the pace has only accelerated. ECMWF has begun running its own Artificial Intelligence Forecasting System (AIFS) operationally alongside the IFS, releasing daily forecasts to users. The center’s leadership has been notably candid about the disruption: in public briefings, ECMWF Director-General Florence Rabier has acknowledged that data-driven models now match or exceed traditional physics-based output for many medium-range applications, while cautioning that hybrid approaches remain essential for extremes and for variables where training data are sparse.
Where the GFS Fits In
Against this backdrop, NOAA’s stewardship of the GFS has come under renewed scrutiny. The agency has outlined a multi-year plan to consolidate its sprawling suite of forecast models into a Unified Forecast System built around the FV3 dynamical core, while simultaneously developing AI-based components. NOAA’s National Weather Service has emphasized that machine-learning forecasts will not replace human forecasters or physics-based guidance in the near term, but will be folded in as additional ensemble members and post-processing tools. Critics, including some atmospheric scientists writing in commentaries in outlets such as Eos, contend that U.S. operational forecasting risks falling further behind Europe and the private sector unless investment in compute infrastructure and reanalysis datasets is sharply increased.
The stakes are not academic. Hurricane Helene’s catastrophic inland flooding in September 2024, the rapid intensification of Hurricane Milton, and a string of billion-dollar severe-weather events have underscored the public-safety value of every additional hour of forecast lead time. Independent verification studies during the 2024 Atlantic season suggested that GraphCast and Huawei’s Pangu-Weather model produced track forecasts competitive with — and occasionally superior to — the operational GFS and IFS, though both AI systems struggled with intensity prediction, a known weakness tied to their coarse training resolution.
Governance, Trust, and the Road Ahead
Beyond raw skill scores, the rise of AI weather models raises governance questions that the meteorological community is only beginning to confront. ECMWF reanalysis products such as ERA5 underpin nearly every major machine-learning model in production, meaning that public investment in observation networks and data assimilation continues to subsidize private-sector AI development. Researchers, both in the pages of Nature and at the World Meteorological Organization, have called for clearer frameworks around model transparency, reproducibility, and responsibility for issuing warnings when AI forecasts diverge from physics-based guidance.
Looking ahead, the next twelve to eighteen months are likely to be decisive. ECMWF is expected to expand AIFS resolution and ensemble configurations, NOAA is preparing GFS version 17 with deeper machine-learning integration, and academic groups are racing to build kilometer-scale generative models capable of resolving thunderstorms directly. Whether the result is a fully hybrid forecast enterprise — or a fragmented landscape in which private models outpace public services — will shape how societies prepare for weather and climate extremes for a generation.

