Description


The bias-corrected CMIP6 global dataset for dynamical downscaling of the Earth’s historical and future climate (1979–2100) is now available. This dataset provides high-quality large-scale forcing for dynamical downscaling simulations and will improve the reliability of future projections of the regional climate and environment.

Traditional dynamical downscaling inherits biases from the boundary conditions supplied by the driving model. To correct these biases, the General Circulation Model (GCM) and reanalysis data are decomposed into long-term trends and anomalies, and each component is then corrected in a series of steps.

The long-term trends are computed using the multimodel ensemble (MME) mean derived from 18 CMIP6 models over the historical and future time periods. To preserve the internal climate variability, a single CMIP6 model is used to compute the anomalies. The high-resolution version of the MPI-M Earth system model (MPI-ESM1-2-HR), which is configured with a horizontal grid spacing of 100 km in the atmosphere and 40 km in the ocean, is used to produce the weather and interannual variability of the six-hourly large-scale forcing data. The data contain the upper air temperature, zonal wind, meridional wind, relative humidity, geopotential height, surface pressure, sea-level pressure, and sea surface temperature. The reanalysis dataset from 1979 to 2014 is retrieved from ECMWF ERA5. Both datasets were re-gridded to a horizontal grid spacing of 1.25° × 1.25° using bilinear interpolation.

GCM variance bias corrections

The two datasets (GCM and ERA5) are each decomposed into a long-term non-linear trend and an interannual perturbation term for each six-hour period and day of the year. The non-linear trend is computed using the ensemble empirical mode decomposition (EEMD) method, so the perturbation term excludes the long-term non-linear trend. The GCM may be biased in the amplitude of its interannual variations; this variance bias is measured by the ratio of the GCM variance to the reanalysis variance. To correct it, the perturbation term is multiplied by a scaling factor, defined as the ratio of the standard deviation of the detrended reanalysis data to that of the detrended GCM data over the historical time period, on the assumption that the variance bias remains the same from the historical period to the future period. Because the standard deviations are computed from the detrended data, the variance of the interannual and interdecadal variations is adjusted while the non-linear trend remains unchanged. Note that, before the variance bias is corrected, the anomaly of the detrended GCM data at each six-hourly interval and day of the year is computed by subtracting the climatological mean of the detrended data from the detrended GCM data.

The standard deviation for each six-hourly interval and day of the year is calculated across the 36 years from 1979 to 2014, in two passes. The original standard deviation is first computed from all 36 years of data. The standard deviation is then recalculated after removing the years whose anomalies exceed three times the original standard deviation. This limits the influence of extreme events, which could otherwise produce unrealistic standard deviation ratios.
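The variance correction described above can be sketched for a single grid point and a single six-hourly slot of the year, where each input list holds one detrended value per historical year. This is a minimal illustration and not the FORTRAN code shipped with the dataset: the function names are hypothetical, the population standard deviation is an assumption (the text does not say which estimator is used), and the climatological mean is taken over whatever series is passed in.

```python
from statistics import mean, pstdev


def robust_std(values, n_sigma=3.0):
    """Two-pass standard deviation.

    Pass 1 uses every year. Pass 2 drops the years whose anomaly exceeds
    n_sigma times the pass-1 standard deviation and recomputes.
    Assumption: population standard deviation (pstdev).
    """
    centre = mean(values)
    first_pass = pstdev(values)
    kept = [v for v in values if abs(v - centre) <= n_sigma * first_pass]
    return pstdev(kept)


def variance_scaling_factor(reanalysis_detrended, gcm_detrended):
    """Ratio of the reanalysis to the GCM standard deviation (historical period)."""
    gcm_std = robust_std(gcm_detrended)
    if gcm_std == 0.0:
        raise ValueError("GCM standard deviation is zero; the ratio is undefined")
    return robust_std(reanalysis_detrended) / gcm_std


def scale_perturbation(gcm_detrended, factor):
    """Subtract the climatological mean of the detrended data, then rescale."""
    clim_mean = mean(gcm_detrended)
    return [(v - clim_mean) * factor for v in gcm_detrended]
```

In a full workflow the factor would be estimated once per grid point and time slot from the 1979 to 2014 data and then reused unchanged for the future period, which is where the assumption of a stationary variance bias enters.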

GCM mean bias correction

Next, the long-term non-linear trend derived from the single GCM is replaced by that derived from the MME. The GCM data then become the sum of two terms: the long-term non-linear trend derived from the MME with the EEMD over the historical-future time period, and the GCM perturbation term multiplied by the scaling factor. To correct the GCM mean bias, the mean bias of the long-term trend of the GCM data relative to that of the reanalysis dataset over the historical period is removed. The bias-corrected six-hourly GCM data over the future period therefore have a base climate provided by the reanalysis dataset over the historical period, a change in future climate relative to the historical climatology generated by the MME, and future bias-corrected weather and climate variability derived from a single GCM.
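As a rough sketch of how the pieces combine at one grid point, the function below adds the MME trend to the already scaled perturbation and removes the historical mean offset between the MME trend and the reanalysis trend. The function name and argument layout are hypothetical. It also assumes that the series start with the historical period and that the reanalysis trend covers exactly that period. The authors' own scripts are provided in the dataset.

```python
from statistics import mean


def bias_correct_series(mme_trend, scaled_perturbation, reanalysis_trend_hist):
    """Combine trend and variability, then remove the historical mean bias.

    mme_trend             : MME non-linear trend, historical + future
    scaled_perturbation   : GCM perturbation already multiplied by the
                            variance scaling factor, same length as mme_trend
    reanalysis_trend_hist : reanalysis non-linear trend, historical years only
    Assumption: the first len(reanalysis_trend_hist) entries of mme_trend
    are the historical period.
    """
    if len(mme_trend) != len(scaled_perturbation):
        raise ValueError("trend and perturbation must have the same length")
    n_hist = len(reanalysis_trend_hist)
    # Mean offset of the MME trend from the reanalysis trend over the
    # historical period.
    mean_bias = mean(mme_trend[:n_hist]) - mean(reanalysis_trend_hist)
    return [t - mean_bias + p for t, p in zip(mme_trend, scaled_perturbation)]
```

With this construction the historical mean of the corrected trend matches the reanalysis, while the future departure from that mean comes entirely from the MME trend.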

Note that, to save computing time when applying the EEMD method, the long-term non-linear trend is assumed to be the same for every six-hourly/daily value of a variable within the same month. As a result, the climatological mean of the detrended data is not zero, because the GCM outputs are six-hourly data whereas the long-term non-linear GCM trend is derived from monthly data.

For details, please refer to:
Xu, Z., Y. Han, C.-Y. Tam, Z.-L. Yang and C. Fu, 2021: A bias-corrected CMIP6 global dataset for dynamical downscaling of future climate, 1979–2100, Scientific Data, doi: 10.1038/s41597-021-01079-3.



Download Application


Please click here or the "Download Application" button above to fill in the application form to request the data.

Your request will be processed shortly once it is received. The processing time might vary, depending on the number of requests at the time.


Data Information


The whole dataset is about 2 TB in size.

Please read the README.txt file for detailed descriptions of each folder. The data files are provided in .nc4 format, along with some FORTRAN scripts.

Please note that two attributes, scale_factor and add_offset, should be used to unpack the variables of the compressed NetCDF data.
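The unpacking follows the usual NetCDF packing convention, in which the physical value equals the stored value multiplied by scale_factor, plus add_offset. The sketch below shows that arithmetic on a plain list. The function name and the sample numbers are hypothetical and are not taken from the dataset.

```python
def unpack_values(packed, scale_factor, add_offset):
    """Apply the NetCDF packing convention to stored values.

    physical = stored * scale_factor + add_offset
    Both attributes are read from the variable's metadata in the file.
    """
    return [value * scale_factor + add_offset for value in packed]


# Hypothetical packed temperatures stored as integers.
temperatures = unpack_values([0, 100, -100], scale_factor=0.01, add_offset=273.15)
```

Readers such as netCDF4-python and xarray normally apply these two attributes automatically when a variable is read with default settings, so manual unpacking is mainly needed with lower-level tools or custom FORTRAN or C readers.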

A FORTRAN program is provided to convert the compressed NetCDF files to WRF intermediate files. The scripts used to correct the CMIP6 data biases are also provided in the dataset.

For detailed usage of the files, please refer to the "Usage Notes" Section in the paper.


The dataset was prepared by the Institute of Atmospheric Physics, Chinese Academy of Sciences, in collaboration with the Chinese University of Hong Kong and the University of Texas at Austin.

The above open-access data are distributed under the terms of the Creative Commons Attribution 4.0 International License.