Geosci. Model Dev., 9, 3933–3959, 2016 www.geosci-model-dev.net/9/3933/2016/ doi:10.5194/gmd-9-3933-2016 © Author(s) 2016. CC Attribution 3.0 License.

Accounting for model error in air quality forecasts: an application of 4DEnVar to the assimilation of atmospheric composition using QG-Chem 1.0

Emanuele Emili, Selime Gürol, and Daniel Cariolle

CECI UMR5318 CNRS/CERFACS, Toulouse, France

Correspondence to: Emanuele Emili ([email protected])

Received: 5 May 2016 Published in Geosci. Model Dev. Discuss.: 24 May 2016 Revised: 15 September 2016 Accepted: 10 October 2016 Published: 8 November 2016

Abstract. Model errors play a significant role in air quality forecasts. Accounting for them in the data assimilation (DA) procedures is decisive for obtaining improved forecasts. We address this issue using a reduced-order coupled chemistry–meteorology model based on quasi-geostrophic dynamics and a detailed tropospheric chemistry mechanism, which we name QG-Chem. This model has been coupled to the Object Oriented Prediction System (OOPS), a software library for data assimilation, and used to assess the potential of the 4DEnVar algorithm for air quality analyses and forecasts. The assets of 4DEnVar include the possibility to deal with multivariate aspects of atmospheric chemistry and to account for model errors of a generic type. A simple diagnostic procedure for detecting model errors is proposed, based on the 4DEnVar analysis and one additional model forecast. A large number of idealized data assimilation experiments are shown for several chemical species of relevance for air quality forecasts (O3, NOx, CO and CO2) with very different atmospheric lifetimes and chemical couplings. Experiments are done both under a perfect model hypothesis and including model error through perturbation of surface chemical emissions. Some key elements of the 4DEnVar algorithm such as the ensemble size and localization are also discussed. A comparison with results of 3D-Var, widely used in operational centers, shows that, for some species, analysis and next-day forecast errors can be halved when model error is taken into account. This result was obtained using a small ensemble size, which remains affordable for most operational centers. We conclude that 4DEnVar has a promising potential for operational air quality models. We finally highlight areas that deserve further research for applying 4DEnVar to large-scale chemistry models, i.e., localization techniques, propagation of analysis covariance between DA cycles and treatment for chemical nonlinearities. QG-Chem can provide a useful tool in this regard.

1 Introduction

In recent years, data assimilation (DA) of atmospheric constituents has become a key tool for providing more accurate forecasts and reanalyses of the atmospheric composition. The increasing availability of chemical observations from both satellites and ground-based instruments has made it possible to reduce the uncertainty of atmospheric chemistry models in a large number of applications. DA has been used in the modeling of volcanic ash (Lu et al., 2016), in operational air quality forecasts at continental scale (Marécal et al., 2015) and in the reanalysis of the global atmospheric composition at decadal scale (van der A et al., 2010; Inness et al., 2013). Data assimilation can also be used to infer surface fluxes of long-lived chemical compounds (Thompson and Stohl, 2014; Chevallier et al., 2005). Reviews of the use of data assimilation for atmospheric composition can be found in Zhang et al. (2012) and Bocquet et al. (2015).

The main goal of DA is to reduce the uncertainties of a model through a timely combination of model results and observations. This is generally done by means of correcting the so-called control variables of the given model. The choice of the control variables should reflect the largest source of uncertainty of the considered model. In atmospheric chemistry, control variables are typically associated

Published by Copernicus Publications on behalf of the European Geosciences Union.

correlations between interacting chemical species in the 3D-Var background error covariance matrix is also particularly difficult, because chemical interactions depend on the local concentrations and on meteorological conditions. As a consequence, multivariate chemical DA with 3D-Var schemes has not yet been documented in the literature. In EnKF systems, the forecast model is used to propagate and estimate the background error covariance, but ad hoc adjustments are necessary to avoid the collapse of the ensemble variance and obtain realistic covariance matrices for 1 h forecasts (Gaubert et al., 2014; Constantinescu et al., 2007a). As a result, costly algorithms such as EnKF or 3D-Var hardly give better results than the simpler OI for chemical reanalyses (Rouïl and the MACC team, 2014). More importantly, very little improvement is obtained, regardless of the employed DA algorithm, for the next-day model forecast (Wu et al., 2008). Forecasts of reactive gases such as O3 or NO2, but also other pollutants such as aerosol mixtures (PM10 and PM2.5), depend weakly on the initial condition and are more sensitive to model settings such as surface emissions or physical parameterizations. Current operational systems can achieve accurate reanalyses of observed chemical species through DA, but parameter estimation and, more generally, model errors must be taken into account in DA to improve chemical forecasts of reactive gases and particles.

Some studies evaluated more advanced DA algorithms to jointly correct surface emissions of precursor species and the initial condition of observed species. For example, Elbern et al. (2007) employed a 4D-Var scheme in combination with assimilation windows of 24 h to assimilate O3, NOx and SO2 measurements. A similar study was also carried out in the context of a toy model experiment by Hamer et al. (2015), where only emissions of precursor species are adjusted to improve O3 forecasts. Results seem promising but still rely on the assumption that the model is almost perfect, i.e., that there are no additional sources of uncertainties in the model forecast other than the controlled variables (the initial state and the selected emissions). This can lead to the overcorrection of control variables when other non-negligible model errors exist, for example, due to the meteorological forcing, photochemistry coefficients, dry or wet deposition. Concerning EnKF implementations, some authors also tested joint optimization of the chemical state and precursor emissions (Miyazaki et al., 2012; Tang et al., 2011; Constantinescu et al., 2007b). EnKF naturally includes model uncertainties in its formulation, which can be added through stochastic perturbation of model parameters during the ensemble forecast (Evensen, 2003). However, EnKF corrects the model trajectories sequentially. In a typical air quality context, the emissions of O3 precursor species (e.g., NOx and VOCs) in the early morning or night can affect the concentration of observed species (e.g., O3) in the early afternoon, when the photochemistry takes place. When using EnKF, the information made available by afternoon measurements cannot be used to correct the model at previous hours.


with the model initial state (Elbern et al., 1997) or chemical emissions (Chevallier et al., 2005). However, inaccurate identification of the model uncertainty can lead to a wrong adjustment of the model through DA, even though the spread between model predictions and assimilated observations is reduced (Tang et al., 2016). The choice of the control variable, and the approximate knowledge of its uncertainty (also named background error covariance), is therefore critical for the design of an appropriate assimilation algorithm and to ensure correct results of DA.

Principal sources of uncertainty of atmospheric chemistry models include the model initial condition, model ancillary data or parameters, model physical parameterization, chemical mechanism, etc. (Beekmann and Derognat, 2003; Mallet and Sportisse, 2006). Since different chemical species can be sensitive to different physical and chemical processes, the main sources of uncertainties can also differ from species to species. For example, long-lived species like CO2 or CO are mostly sensitive to uncertainties in surface fluxes, which can be corrected using variational algorithms in combination with long assimilation windows of several weeks (Chevallier et al., 2005; Koohkan and Bocquet, 2012). This is possible since the chemical reactivity and the sensitivity to the initial condition are negligible for long integration of the model. Uncertainty in the transport processes is also generally neglected (Babenhauserheide et al., 2015). Data assimilation of short-lived gases like tropospheric O3 or NO2, which are encountered in air quality applications, is instead trickier. O3 and NO2 are involved in rapid chemical reactions and sensitive to several model parameters ranging from reaction rates, emissions of primary species, clouds and radiation, boundary layer mixing, etc. It is much more difficult in this case to identify a single and predominant source of uncertainty in the model predictions.

In most air quality operational models a pragmatic choice is currently made by setting the control variable to the initial state of the measured species and using short forecast/assimilation cycles (e.g., 1 h). Since ground-based measurements are generally available at hourly frequency, the most commonly used sequential DA algorithms, such as optimal interpolation (OI), 3D-Var or ensemble Kalman filters (EnKFs) (Marécal et al., 2015), all provide a strong constraint on model trajectories (Wu et al., 2008). This strategy gives robust results for operational analyses because chemical fields are corrected every hour. Since the control variable corresponds to the assimilated observations, this also makes it possible to estimate the background error covariance from previous validation of the model against observations (Hollingsworth and Loennberg, 1986) and avoids the difficult diagnosis of the true model uncertainties.

However, the model dynamics are neglected in OI or 3D-Var DA schemes. Attempts to use ensembles of model analyses to specify the background error covariance within 3D-Var dynamically did not show clear improvements over the static case (Jaumouillé et al., 2012). Specification of cross


Based on the above-mentioned facts, we conclude that the following capabilities are needed to further improve DA in air quality applications:

- permit simultaneous assimilation and optimization of multiple chemical species (multivariate DA), with possible chemical interactions and very different lifetimes;

- include model error and account for disparate sources of model uncertainty;

- allow long assimilation windows to make best use of the information content of frequent air quality observations.

This can be accomplished by using the so-called weak constraint 4D-Var (Trémolet, 2006), which is an extension of the 4D-Var algorithm that accounts for the model error. On top of the operators already needed to perform the strong constraint 4D-Var (e.g., the tangent linear and adjoint codes of the forecast model), the formulation of the weak constraint 4D-Var requires the definition of the model error covariances Q, which can be difficult to estimate in real applications (Trémolet, 2007). Recent studies have shown that the linear operators needed in the weak constraint 4D-Var formulation can be approximated through an ensemble of model forecasts. For example, the 4D-Var-EnKF (Mandel et al., 2016) uses an ensemble approach to mimic the tangent linear and adjoint model for the minimization of the weak constraint 4D-Var cost function. The iterative ensemble Kalman smoother (IEnKS, Bocquet and Sakov, 2014) is a nonlinear 4DEnVar formulated under perfect model assumptions, which can also be used to estimate erroneous model parameters through an augmented state formalism (Bocquet and Sakov, 2013). A major asset of IEnKS for chemistry applications is that it can also account for strong nonlinearities of the forecast model. The 4DEnVar method (Desroziers et al., 2014) uses an ensemble of nonlinear model trajectories to estimate both the error covariances for the initial condition and the model error, as well as to approximate the tangent linear and adjoint model. These approaches are generally referred to in the literature as ensemble variational (EnVar; Lorenc, 2013), as opposed to hybrid methods, which make use of ensembles only to specify error covariance matrices in variational algorithms (Belo Pereira and Berre, 2006). EnVar methods have a major advantage for atmospheric chemistry applications: they avoid the construction of tangent linear and adjoint codes of the forecast model, which are still lacking for most of the operational CTMs, or are becoming very difficult to maintain due to the rapid evolution of models and computer architectures.

The main advantage of the 4DEnVar method is that it makes it possible to account for a generic model error through the addition of stochastic perturbations during the model integration step (as in EnKF). Moreover, it focuses exclusively on the estimation of the model state, which is the only variable that is directly constrained by observations. This avoids the difficult specification of Q still needed in the 4D-Var-EnKS. As with all ensemble-based methods, 4DEnVar also naturally supports multivariate chemical DA, with the cross-covariance terms between chemical species being automatically obtained from the ensemble of the nonlinear model forecasts.
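To make the last point concrete, the following minimal sketch shows how a sample covariance built from ensemble perturbations contains cross-species blocks without any extra modeling. It is an illustration only: the synthetic ensemble, the toy linear link between the two species and all names (`ensemble_covariance`, `species_a`, `species_b`) are assumptions and are not taken from QG-Chem or OOPS.

```python
import numpy as np


def ensemble_covariance(ensemble):
    """Sample covariance of an ensemble of state vectors.

    ensemble: array of shape (n_members, n_state).
    Returns an (n_state, n_state) matrix.
    """
    n_members = ensemble.shape[0]
    # Perturbations are deviations of each member from the ensemble mean.
    perturbations = ensemble - ensemble.mean(axis=0)
    return perturbations.T @ perturbations / (n_members - 1)


rng = np.random.default_rng(0)
n_members, n_points = 50, 4

# Hypothetical toy ensemble: species "b" responds to species "a" at each grid point.
species_a = rng.normal(1.0, 0.2, size=(n_members, n_points))
species_b = 50.0 + 10.0 * species_a + rng.normal(0.0, 0.5, size=(n_members, n_points))

# Multivariate state vector: both species stacked side by side.
state = np.hstack([species_a, species_b])
cov = ensemble_covariance(state)

# Off-diagonal block: covariance between species "a" and species "b".
cross_block = cov[:n_points, n_points:]
```

With a finite ensemble, the entries of `cross_block` linking distant or unrelated points are affected by sampling noise, which is why the ensemble size and localization matter in practice.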

Variants of the 4DEnVar have already been tested in real numerical weather prediction (NWP) applications (Lorenc et al., 2015). The method has proven to be affordable for large-scale operational NWP models, even though the skill of the operational hybrid 4D-Var is not yet matched. To the knowledge of the authors, EnVar-type methods have not yet been implemented in air quality or atmospheric chemistry models and only one study has examined the potential of EnVar methods for chemical DA (Haussaire and Bocquet, 2016). Note that operational NWP systems are already based on mature 4D-Var DA systems, whereas very few atmospheric chemistry models are based on such systems. Therefore, there is more room for improvement from the EnVar type of algorithms in air quality models than in NWP. Hence, the main objectives of this study are

- to present a new atmospheric chemistry toy model built for assessing and comparing performances of several DA algorithms;

- to examine the potential and limits of the 4DEnVar algorithm for air quality analyses, compared to the generally used 3D-Var;

- to present a new procedure based on 4DEnVar to improve chemical forecasts on the next day.

The purpose is to examine state-of-the-art DA algorithms in the reactive gases/air quality context and, therefore, to guide future developments for the operational DA systems. Four gaseous species with very different lifetimes and chemical mechanisms, currently well observed either from satellites or from ground-based instruments, are considered for this study (CO, O3, NO2 and CO2). Using a simplified model in this context permits faster implementation of complex DA algorithms, cheaper numerical experiments and more straightforward interpretation of the DA results (Fairbairn et al., 2013). The latter is particularly true compared to DA experiments done using real observations, with generally unknown error statistics. Compared to already-mentioned simplified models (Hamer et al., 2015; Haussaire and Bocquet, 2016), which are, respectively, 0-D and 1-D, the newly proposed model is 3-D and uses the same tropospheric chemistry scheme as operational air quality models. This allows us to reproduce more features of real models, for example, the complex interactions of reactive chemistry and large-scale advection or the effect of boundary conditions. This also makes it possible to better examine typical issues of DA within large systems, like the emergence of sampling errors due to the finite size of the ensemble and the consequences of localization techniques. Additionally, the use of 3-D fields and operators eases the


2.1 Quasi-geostrophic meteorology

The two-layer QG model is a geophysical fluid model composed of two atmospheric layers of fixed depth and potential temperature. It is a simple model of the atmosphere at midlatitudes, whose main forcings are represented by the Coriolis force and the orography or surface heating. The governing equation is the conservation

$$\frac{Dq_i}{Dt} = 0 \qquad (1)$$

of the potential vorticity $q = (q_1, q_2)$ expressed in nondimensional variables (Fandry and Leslie, 1984):

$$q_1 = \nabla^2 \psi_1 - F_1(\psi_1 - \psi_2) + \beta y \qquad (2)$$

$$q_2 = \nabla^2 \psi_2 - F_2(\psi_2 - \psi_1) + \beta y + R_s, \qquad (3)$$

where the subscripts 1 and 2 stand for the top and bottom layer, respectively. $\nabla^2$ is the two-dimensional Laplacian, $R_s$ represents orography or heating, $\beta$ is the (nondimensionalized) northward variation of the Coriolis parameter at a fixed latitude, and $F_1$ and $F_2$ couple the layers together, being functions of the Coriolis force, layer depths, gravity and typical length scale. The stream function $\psi = (\psi_1, \psi_2)$, whose horizontal derivatives give the horizontal wind field $(u_i, v_i)$, can be considered as the model state vector.
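As an illustration of Eqs. (2) and (3), the sketch below evaluates the two-layer potential vorticity from a given stream function on a small grid that is periodic in the east–west direction. It is not the OOPS QG code: the five-point Laplacian, the choice to leave the Laplacian at zero on the northern and southern wall rows, and all numerical values are assumptions made for the example.

```python
import numpy as np


def laplacian(psi, dx, dy):
    """Five-point Laplacian; periodic along x (axis 1), walls along y (axis 0).

    The Laplacian is evaluated on interior rows only; the two wall rows
    are left at zero (an assumption of this sketch).
    """
    d2x = (np.roll(psi, -1, axis=1) - 2.0 * psi + np.roll(psi, 1, axis=1)) / dx**2
    lap = np.zeros_like(psi)
    lap[1:-1, :] = d2x[1:-1, :] + (psi[2:, :] - 2.0 * psi[1:-1, :] + psi[:-2, :]) / dy**2
    return lap


def potential_vorticity(psi1, psi2, f1, f2, beta, y, rs, dx, dy):
    """Nondimensional potential vorticity of the top (1) and bottom (2) layers."""
    q1 = laplacian(psi1, dx, dy) - f1 * (psi1 - psi2) + beta * y
    q2 = laplacian(psi2, dx, dy) - f2 * (psi2 - psi1) + beta * y + rs
    return q1, q2


# 16 x 8 grid points as in the text; all other values are hypothetical.
nx, ny = 16, 8
dx = dy = 1.0
y = (np.arange(ny) * dy)[:, None] * np.ones((1, nx))

# Uniform stream functions: the Laplacian vanishes and only the
# layer-coupling and beta terms remain.
psi1 = np.full((ny, nx), 2.0)
psi2 = np.full((ny, nx), 0.5)
q1, q2 = potential_vorticity(psi1, psi2, f1=1.0, f2=1.5, beta=0.1, y=y,
                             rs=np.zeros((ny, nx)), dx=dx, dy=dy)
```

In the actual model the relation is used in the opposite direction as well: after advecting q, the stream function has to be recovered by inverting Eqs. (2) and (3), which this sketch does not attempt.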

The code of the QG model that is distributed with the OOPS DA library has been used for this study. The depth of the two layers, the resolution of the horizontal grid and the integration time step Δt are the main model parameters that can be set at runtime. The dimensional scaling and model orography are fixed, as well as the extent of the domain, which is 12 000 km in the zonal direction and 6300 km in the meridional direction. The domain is cyclic in the east–west direction, i.e., the model fields are periodic in this direction. The stream function is set to climatological values at meridional walls (Dirichlet boundary conditions). For all the experiments presented in this study, a coarse resolution of approximately 750 km (16 × 8 grid points, respectively, for the east–west and north–south directions) has been used. We remind readers that the focus of this study is to test chemical DA algorithms in a toy model framework. Therefore, there is no stringent requirement on the realism of the meteorological fields and no need to reproduce a real atmospheric situation. The only desired property is to obtain wind fields that exhibit typical patterns of the complex atmospheric circulation. A summary of the QG model parameters used in this study is given in Table 1.

2.2 Tropospheric chemistry

The state vector of the QG model has been extended to include chemical species. The regional atmospheric chemical mechanism (RACM) (Stockwell et al., 1997), which describes 96 chemical species with about 300 reactions, has


estimation of numerical costs and possible bottlenecks of the DA algorithm in terms of operational implementation. We remind readers that the objective of this study is to demonstrate the applicability of a DA algorithm that could outperform currently implemented methods in operational centers, but with an acceptable compromise between computational costs and precision. Finally, the toy system has been implemented using the library for data assimilation OOPS (Yannick Trémolet, personal communication, 2015) to ease the exchange of assimilation algorithms or toy models between scientists.

The paper is outlined as follows. The developed atmospheric chemistry model is presented in Sect. 2. A summary of the data assimilation algorithms employed in this study is given in Sect. 3. The first section of the numerical results (Sect. 4.1) presents a number of DA experiments done under the hypothesis of the perfect model. A detailed comparison between 4DEnVar and 3D-Var is presented, as well as a sensitivity study on the principal parameters of the 4DEnVar algorithm, i.e., the ensemble size and the localization choices. In Sect. 4.2, the effects of a model error are investigated using the 4DEnVar algorithm. A statistical comparison of the 4DEnVar and 3D-Var performances on multiple cycles of analyses and forecasts is presented in Sect. 4.3. Finally, conclusions are given in Sect. 5.

2 Model description

A new atmospheric chemistry low-order model has been developed for this study and named QG-Chem. The objective was to reproduce typical features of chemical fields from large-scale chemical transport models (CTMs), while keeping the computational cost low enough to allow comfortable usage on a personal computer. The meteorological forcing is computed using a two-layer quasi-geostrophic (QG) model, representative of midlatitude mesoscale dynamics (Pedlosky, 1992). The QG wind field is used to advect the chemical species, which makes QG-Chem a coupled meteorological–chemistry model. This choice makes it possible to examine the behavior of DA in the presence of complex gradients of wind fields and vorticity. Since all DA algorithms make strong assumptions on the model dynamics (Sect. 3), it is important to test them in the presence of advection patterns that can be found in real applications. Nevertheless, the focus of this study remains on atmospheric chemistry. Therefore, a detailed tropospheric chemical mechanism has been considered. Aspects of DA concerning the coupling between meteorology and chemistry are also left for future work, with the present study using QG-Chem in a CTM-like mode. The details of the meteorological and chemical models are given next.


been implemented. This chemical scheme has been developed for air quality modeling and is currently used by a number of operational models in Europe (Marécal et al., 2015). Photochemistry and its diurnal cycle are included via look-up tables, assuming global clear-sky conditions. Surface fluxes of chemical species (emissions and dry deposition velocities) are assigned at runtime and are kept constant during the temporal integration of the model. Chemical species are advected by the QG wind field using the semi-Lagrangian scheme used for solving the QG governing equation (Eq. 1).

After the advection of the species, their concentrations are updated by addition of the chemical tendencies. They are computed by solving the stiff ODE system that describes the adopted chemical mechanism. The ODE system is of the nonlinear form:

$$\frac{\partial C}{\partial t} = f(C) = P(C) - L(C) \cdot C \qquad (4)$$

where $C$ represents the local species concentrations, and $P(C)$ and $L(C)$ the production and loss terms. The stiffness of the system comes from the wide range of values that the loss terms can take, leading to a large range of chemical lifetimes.

The above system is integrated using the adaptive semi-implicit scheme (ASIS). ASIS is a one-step semi-implicit scheme with prognostic time steps. To solve the system of linear equations associated with the semi-implicit scheme, ASIS uses the generalized minimal residual method (Saad and Schultz, 1986), which appears to be very competitive in terms of computation time with good convergence properties. The ASIS solver is mass conservative and adapts its sub-time step to the adopted tolerance errors as described by Verwer (1994). In our application, the ASIS solver uses a 7 s minimum sub-time step, an absolute error tolerance of 10⁴ molecules cm⁻³ and a relative error tolerance of 0.01.
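The practical consequence of stiffness can be seen on a toy version of Eq. (4). The sketch below is not ASIS: it freezes P and L over the step and treats only the loss term implicitly, which is the simplest semi-implicit update, and compares it with an explicit Euler step. The two species, their production and loss values, and the function names are hypothetical.

```python
import numpy as np


def semi_implicit_step(c, production, loss, dt):
    """One step of dC/dt = P - L*C with P and L frozen at their start-of-step
    values and the loss term taken at the new time level:
    C_new = (C + P*dt) / (1 + L*dt)."""
    return (c + production * dt) / (1.0 + loss * dt)


def explicit_step(c, production, loss, dt):
    """Forward Euler step of the same equation, for comparison."""
    return c + (production - loss * c) * dt


# Hypothetical species: one with a fast loss (lifetime 1 s) and one with a
# slow loss (lifetime 1e6 s), both with the same constant production.
production = np.array([1.0e4, 1.0e4])   # molecules cm^-3 s^-1
loss = np.array([1.0, 1.0e-6])          # s^-1
dt = 600.0                              # 10 min, the model time step in the text

c_implicit = np.zeros(2)
c_explicit = np.zeros(2)
for _ in range(20):
    c_implicit = semi_implicit_step(c_implicit, production, loss, dt)
    c_explicit = explicit_step(c_explicit, production, loss, dt)

# Analytical steady state P / L, reached quickly by the fast species only.
steady_state = production / loss
```

With a 600 s step the explicit update of the fast species oscillates with growing amplitude, whereas the semi-implicit update relaxes to P/L. A production solver additionally needs error control and sub-stepping, which this sketch leaves out.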

Therefore, a common integration time step (dt) for both the dynamical and chemical solvers is used, which is set to dt = 10 min to ensure reliable chemistry solutions. The dry deposition is computed for each species before the application of the ASIS solver, i.e., using the main model time step dt. Concentrations are updated according to

$$C_i(t + dt) = C_i(t) - \tau_i C_i(t)\,dt, \qquad (5)$$

where $i$ denotes the chemical species and $\tau_i$ is the corresponding deposition timescale in s⁻¹, proportional to the deposition velocity.
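The update of Eq. (5) is a single explicit step of a first-order loss, as the sketch below shows. The text only states that the deposition coefficient is proportional to the deposition velocity; dividing the velocity by the boundary layer thickness of Table 1 is an assumption of this example, and the deposition velocity and initial mixing ratio are hypothetical values.

```python
import numpy as np


def deposition_rate(deposition_velocity, layer_thickness):
    """Assumed conversion: loss rate (s^-1) = velocity (m s^-1) / thickness (m)."""
    return deposition_velocity / layer_thickness


def apply_dry_deposition(concentration, rate, dt):
    """Explicit update of Eq. (5): C(t + dt) = C(t) - rate * C(t) * dt."""
    return concentration - rate * concentration * dt


dt = 600.0                 # main model time step from the text (10 min), in s
layer_thickness = 1200.0   # boundary layer thickness from Table 1, in m
velocity = 0.005           # hypothetical deposition velocity (0.5 cm s^-1), in m s^-1

rate = deposition_rate(velocity, layer_thickness)
c0 = 60.0e-9               # hypothetical volume mixing ratio
c1 = apply_dry_deposition(c0, rate, dt)

# Exact solution of dC/dt = -rate * C over one step, for comparison.
exact = c0 * np.exp(-rate * dt)
```

As long as the product of the rate and dt stays well below 1, as it does here, the explicit step stays close to the exponential decay and cannot produce negative concentrations.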

The chemical field is set equal to climatological values on the N–S boundaries, which corresponds to Dirichlet-type conditions. The values of surface pressure and temperature used by the chemical mechanism are fixed globally and do not depend on the QG fields. These modeling choices let this study focus on the following main processes of air quality models: emissions, chemistry and transport. Therefore, only the bottom layer of the QG-Chem model will be analyzed throughout this study. A summary of the configuration of QG-Chem is given in Table 1.

Table 1. QG-Chem model parameters and nondimensional scaling factors. The parameters marked by * are fixed globally and only relevant for the chemical mechanism.

Characteristic               Description
Geographical domain          12 000 km (E–W) × 6300 km (N–S)
Zonal resolution             750 km (16 grid points)
Merid. resolution            790 km (8 grid points)
Top layer depth              6 km
Bottom layer depth           4 km
Typical horizontal scale     1000 km
Typical velocity             10 m s⁻¹
Coriolis parameter F         10⁻⁴
Merid. gradient of F (β)     1.5 × 10^[…]
Orography                    Gaussian hill (2 km alt.)
Chemical mechanism           RACM (Stockwell et al., 1997)
Surface pressure*            1000 hPa
Temperature*                 24.9 °C
Boundary layer thickness*    1.2 km

2.3 Description of the case study

A model run of 20 days is performed prior to DA experiments, starting from the initial condition given in Table 2, which corresponds to a homogeneous, relatively clean atmosphere and zonal circulation. Emissions (Table 3) are taken from the study of Crassier et al. (2000), FLUX case, representative of the urban environment of Paris. Spatial heterogeneity of model concentrations is obtained by scaling the reference emissions in Table 3 by 0.01, 1 and 0.25, respectively, on the western, central and eastern parts of the domain (Fig. 1). Deposition velocities from the French air quality model MOCAGE (Marécal et al., 2015), averaged over the Paris region during the month of July 2010, are used and set constant over the QG-Chem domain. Values are reported in Table 4. The chemical concentrations at the meridional boundaries are set to the same values as in Table 2. Hence, the presence of N–S boundaries can eventually counterbalance the growth of long-lived species by advection of clean air masses from outside the domain. This configuration relates to regional air pollution modeling mainly concerning the type and amplitude of chemical emissions, spatial heterogeneity of sources and presence of boundaries.
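The emission scaling described above amounts to multiplying each reference emission by a piecewise-constant map. The sketch below builds such a map on the 16 × 8 grid; the scaling factors and the NO reference emission come from the text and Table 3, whereas the column indices separating the western, central and eastern bands are hypothetical (the actual limits are only shown in Fig. 1).

```python
import numpy as np


def emission_scaling_map(nx, ny, west_end, east_start, factors=(0.01, 1.0, 0.25)):
    """Piecewise-constant scaling factors for western, central and eastern bands.

    west_end and east_start are column indices delimiting the bands; they
    are free parameters of this sketch, not values from the paper.
    """
    scaling = np.empty((ny, nx))
    scaling[:, :west_end] = factors[0]
    scaling[:, west_end:east_start] = factors[1]
    scaling[:, east_start:] = factors[2]
    return scaling


nx, ny = 16, 8                         # grid size from the text
scaling = emission_scaling_map(nx, ny, west_end=5, east_start=11)  # hypothetical band limits

no_reference = 121.29e9                # NO emission from Table 3, molec cm^-2 s^-1
no_emissions = scaling * no_reference  # emission field applied at the surface
```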

Results for four key species are considered throughout the study: nitrogen dioxide (NO2), ozone (O3), carbon monoxide (CO) and carbon dioxide (CO2). The first three are of concern for air quality, since they have an elevated toxicity and their concentration is strongly related to anthropogenic emissions. The chemical reactivity and typical tropospheric lifetime of NO2, O3 and CO are, however, very different. NO2 typically arises from the oxidation of nitric oxide (NO) in combustion processes. It is highly reactive, lasting in the atmosphere from a few hours in summer to several days in winter, and it is


Table 2. Initial conditions used to initialize the truth simulation. The initial fields are constant over the QG-Chem domain. The same values are also assigned to the meridional boundaries of QG-Chem during all simulations. Values are equal to zero for chemical species that are not listed below. Chemical concentrations are expressed in volume mixing ratio units (vmr).

Variable      Value

Meteorology (m s⁻¹)
(u1, v1)      (40, 0)
(u2, v2)      (10, 0)

Chemistry (vmr)
O2            0.2095
O3            30 × 10^[…]


Figure 1. QG-Chem horizontal domain: scaling factor for the chemical surface emissions (in colors), location of the synthetic observations used in the assimilation experiments (black circles) and locations for which time series of DA experiments are displayed (crossed boxes A and B). The numerical grid is displayed using white lines.

distribution of the sources. The maps also display the influence of meridional boundary conditions, which produce local minima in O3 and CO fields in correspondence with the advection of clean air masses from outside the domain. The average model trajectory during 24 h shows significant differences among all considered species. Since all species are advected and surface fluxes are constant in time, these features arise from the complex chemical interactions and photochemistry. Note, for example, the daylight increase of O3 as a consequence of NO2 production from NO emissions during nighttime and daytime photolysis. In contrast, CO shows an almost linear increase in time, due to a constant surface emission and longer lifetime. Most of the numerical experiments that are shown later in this study start on the above-discussed day.

3 Data assimilation algorithm

We considered two data assimilation algorithms in this study: 3D-Var and 4DEnVar. The first is the simplest type in the family of variational DA algorithms and currently the most used in operational chemical assimilation systems (Marécal et al., 2015). It is taken as a reference against which the benefits of more complex (and costly) algorithms can be assessed. 4DEnVar is a hybrid algorithm that combines benefits of variational and ensemble methods. It is already used in a number of NWP models (Buehner et al., 2010; Lorenc et al., 2015) and was tested in the framework of meteorological toy models (Desroziers et al., 2014; Fairbairn et al., 2013). A summary description of the two algorithms is given below, as well as some specific aspects relative to the atmospheric chemistry implementation presented in this study. In the third section, a method based on the postprocessing of 4DEnVar output is proposed to correct model biases.


CO2           310 × 10^[…]
OH            1.5 × 10^[…]
HO2           1 × 10^[…]
H2O2          1 × 10^[…]
N2O           310 × 10^[…]
NO            0.2 × 10^[…]
NO2           0.1 × 10^[…]
HNO3          0.5 × 10^[…]
HNO4          0.1 × 10^[…]
CH4           1.6 × 10^[…]
CO            150 × 10^[…]
Cl            1 × 10^[…]
HOCl          1 × 10^[…]
HCl           1 × 10^[…]
Br            1 × 10^[…]
BrO           1 × 10^[…]
HBr           1 × 10^[…]

an ozone precursor. O3 is a secondary gas, mainly formed by the reaction of nitrogen oxides and hydrocarbons under sunlight. In the troposphere, it has a typical atmospheric lifetime of 2–3 weeks. CO is produced by partial oxidation of carbon compounds, which can occur in industrial or natural combustion processes. It also participates in O3 chemistry and has an atmospheric lifetime of about 1–2 months. Finally, CO2 is of major concern for its effect on climate and has the longest lifetime among all the considered species (> 30 years). However, CO2 fluxes are not activated in the experiments, which makes this gas behave like a passive tracer in our study.

Averaged model fields (24 h) on day 20 are shown in Fig. 2. We note the presence of zonal gradients of chemical concentrations for species that are strongly related to surface emissions (e.g., NO2, O3 and CO), which are higher in the central part of the domain (Fig. 1). The developed cyclonic circulation also allows the accumulation of longer-living pollutants (O3, CO) in correspondence with the central low-pressure system, whereas patterns of short-living gases (NO2) maintain a stronger similarity with the geographical


[Figure 2 panels, top to bottom: Wind field (Avg: 4.31, Max: 7.12, Min: 0.64) × 10 m s-1; CO (Avg: 448, Max: 791, Min: 164); NO2 (Avg: 0.76, Max: 1.44, Min: 0.07); O3 (Avg: 58.1, Max: 98.0, Min: 28.8). Left column: time averages; right column: domain averages. Concentrations in ppbv.]

Figure 2. Meteorological and chemical fields of QG-Chem on day 20. Time-averaged fields for a 24 h period on the left, time series of domain-averaged values for the same 24 h period on the right. The wind field and the concentration of the four chemical species of interest (CO, NO2 and O3, with CO2 not shown since it is constant and equal to 310 ppmv) are shown from top to bottom.


Table 3. Surface chemical emissions used for all QG-Chem experiments. Values are equal to zero for chemical species that are not listed below. Geographical scaling is further applied to these values as shown in Fig. 1.

Species  Extended name                           Value (10^9 molec cm^-2 s^-1)
NO       Nitric oxide                            121.29
CO       Carbon monoxide                         2500
CH4      Methane                                 802
ETH      Ethane                                  6.25
HC3      Alkanes, alcohols, esters and alkynes   37.67
HC5      Alkanes, alcohols, esters and alkynes   44.43
HC8      Alkanes, alcohols, esters and alkynes   19.14
ETE      Ethene                                  22.33
OLT      Terminal alkenes                        39.67
OLI      Internal alkenes                        6.37
TOL      Toluene                                 9.02
HCHO     Formaldehyde                            5.77
ALD      Acetaldehyde and higher aldehydes       14.45
KET      Ketones                                 5.7
XYL      Xylene                                  14.55
CSL      Cresol and other aromatics              3.68

The Object Oriented Prediction System (OOPS), a generic software framework to develop data assimilation systems (Y. Trémolet, personal communication, 2015), was used to implement and run all the DA experiments described in this study.

3.1 3D-Var

The 3D-Var analysis can be computed after a model forecast time step by minimizing the quadratic cost function $J$ (Kalnay, 2003):

$$J(\delta x) = \frac{1}{2}\,\delta x^{T} B^{-1}\,\delta x + \frac{1}{2}\,(\delta y^{o} - H\,\delta x)^{T} R^{-1} (\delta y^{o} - H\,\delta x) \tag{6}$$

for the increment $\delta x$ used to correct the previous forecast $x_b$ (also named background), i.e., $x = x_b + \delta x$, where $x$ is the control variable (e.g., the 3-D chemical state). Here, $B$ and $R$ are the background and observation error covariance matrices, $H$ is the linearized observation operator that transforms an increment of the control variable into an increment in the observation space, and $\delta y^{o}$ is the difference between the observation vector $y^{o}$ and the previous forecast, $\delta y^{o} = y^{o} - \mathcal{H}(x_b)$, computed with the nonlinear observation operator $\mathcal{H}$. The minimization of $J$ can be achieved with standard techniques for the solution of large linear systems: the B-preconditioned conjugate gradient algorithm has been used in this study (Derber and Rosati, 1989). The result of the minimization is the analysis increment $\delta x_a$ (3-D). To advance in time, the analysis $x_b + \delta x_a$ is used as the new initial condition for the following forecast step and so forth. The relative simplicity, efficiency and robustness of the 3D-Var algorithm make it very suitable for operational models. Its practical implementation mainly requires the development of a covariance model for $B$ (Weaver and Courtier, 2001). Multivariate chemical assimilation can be performed with 3D-Var by extending the 3-D control variable $x$ to contain multiple 3-D model variables (chemical species).
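As an illustration only, the minimization of Eq. (6) can be sketched for a very small linear problem. The three-point state, the single observation and all numerical values below are assumptions made for the example and do not come from QG-Chem or OOPS. The closed-form solve stands in for the conjugate gradient iteration used in the study.

```python
import numpy as np

def three_dvar_increment(B, R, H, d):
    """Minimizer of J(dx) = 0.5 dx^T B^-1 dx + 0.5 (d - H dx)^T R^-1 (d - H dx).

    Uses the equivalent observation-space form dx = B H^T (H B H^T + R)^-1 d,
    which avoids inverting B explicitly.
    """
    S = H @ B @ H.T + R                      # innovation covariance
    return B @ H.T @ np.linalg.solve(S, d)

# Hypothetical toy setup: 3 grid points, Gaussian correlations, 1 observation.
pos = np.arange(3.0)
B = 4.0 * np.exp(-0.5 * (pos[:, None] - pos[None, :]) ** 2)  # variance 4
R = np.array([[1.0]])                                        # obs error variance
H = np.array([[0.0, 1.0, 0.0]])                              # observe middle point
xb = np.array([10.0, 10.0, 10.0])                            # background
yo = np.array([12.5])                                        # observation

d = yo - H @ xb                  # innovation (delta y^o)
dxa = three_dvar_increment(B, R, H, d)
xa = xb + dxa                    # analysis
```

In this toy case the unobserved neighbours are also corrected, through the off-diagonal terms of B, which is the mechanism that spreads surface observations to unobserved locations.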

In this study the control variable x is set to represent the complete model state, i.e., the stream function plus the 96 chemical species. The covariance matrix B is modeled through the sequential application of 1-D square root correlation operators and a diagonal matrix, representing the background error standard deviation:

$$B\,\delta x = B^{T/2} B^{1/2}\,\delta x \tag{7}$$

$$B^{1/2} = \Sigma^{1/2}\, C_z^{1/2}\, C_x^{1/2}\, C_y^{1/2}\, C_v^{1/2}, \tag{8}$$

where $C_z$, $C_x$, $C_y$ and $C_v$ are, respectively, the vertical, zonal, meridional and multivariate correlation operators and $\Sigma$ is the variance (diagonal matrix). $C_x$ and $C_y$ are isotropic homogeneous correlation operators providing Gaussian spatial structures. The other correlation operators are represented by symmetric positive-definite matrices. The following parameters are used to set $B$: one horizontal length scale that defines the decorrelation scale for the zonal and meridional coordinates, one value for the vertical correlation and one value for the chemical correlation between each pair of model variables. The variance is specified using one global value for each variable. The resulting $B$ is uniform and homogeneous on the horizontal plane. More complex $B$ models could be introduced to account, for example, for spatial variability and heterogeneity of the background error covariance. However, this is out of the scope of the present study, which is intended to reproduce typical operational chemical DA settings, where $B$ is usually specified using a single variance and correlation length for each chemical species.
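The sequential application of 1-D operators in Eq. (8) can be sketched as follows. This is a minimal sketch under stated assumptions: a single 2-D field, dense matrices, a uniform standard deviation, and no vertical or multivariate operators. The grid size, spacing and length scale are hypothetical values chosen for the example, and the function names are not taken from OOPS.

```python
import numpy as np

def gaussian_corr_sqrt(n, dx_km, length_km):
    """Symmetric square root of a 1-D Gaussian correlation matrix."""
    pos = dx_km * np.arange(n)
    C = np.exp(-0.5 * ((pos[:, None] - pos[None, :]) / length_km) ** 2)
    lam, V = np.linalg.eigh(C)
    lam = np.clip(lam, 0.0, None)          # guard against round-off negatives
    return (V * np.sqrt(lam)) @ V.T        # V diag(sqrt(lam)) V^T

def apply_b_half(w, sigma, cx_half, cy_half):
    """Apply sigma * Cx^(1/2) Cy^(1/2) to a 2-D field w of shape (ny, nx)."""
    return sigma * (cy_half @ w @ cx_half.T)

# Hypothetical grid: 16 x 8 points with 750 km spacing and 750 km length scale.
nx, ny, dx_km = 16, 8, 750.0
cx_half = gaussian_corr_sqrt(nx, dx_km, 750.0)
cy_half = gaussian_corr_sqrt(ny, dx_km, 750.0)

rng = np.random.default_rng(0)
w = rng.standard_normal((ny, nx))          # uncorrelated noise N(0, I)
perturbation = apply_b_half(w, sigma=1.0, cx_half=cx_half, cy_half=cy_half)
```

Applying the square-root operator to white noise in this way yields a spatially correlated random field whose covariance is the modeled B.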


Table 4. Surface dry deposition velocity used for all QG-Chem experiments. Values are equal to a fixed and low deposition velocity of 10^-5 m s^-1 for chemical species that are not listed below.

Species  Extended name                                      Value (10^-3 m s^-1)
HNO4     Perinitric acid                                    1.167
HO2      Hydroperoxy radical                                8.665
H2O2     Hydrogen peroxide                                  16.32
XYL      Xylene                                             0.035
HNO3     Nitric acid                                        15.31
MO2      Methyl peroxy radical                              0.931
N2O5     Dinitrogen pentoxide                               17.69
OP1      Methyl hydrogen peroxide                           2.152
NO2      Nitrogen dioxide                                   1.435
ALD      Acetaldehyde and higher aldehydes                  0.638
KET      Ketones                                            0.622
PAA      Peroxyacetic acid and higher analogs               1.317
TPAN     Unsaturated PANs                                   1.047
O3       Ozone                                              3.557
NO3      Nitrogen trioxide                                  1.316
SO2      Sulfur dioxide                                     4.463
UDD      Unsaturated dihydroxy dicarbonyl                   12.31
MACR     Methacrolein and other unsaturated monoaldehydes   0.421
TOL      Toluene                                            0.035
DCB      Unsaturated dicarbonyls                            2.735
GLY      Glyoxal                                            6.686
HONO     Nitrous acid                                       1.430
OP2      Higher organic peroxides                           1.348
HCHO     Formaldehyde                                       1.500
ONIT     Organic nitrate                                    0.106
PAN      Peroxyacetyl nitrate and higher saturated PANs     1.099
MGLY     Methylglyoxal and other alpha-carbonyl aldehydes   2.804
HKET     Hydroxy ketone                                     2.954
CSL      Cresol and other aromatics                         1.323

Only surface observations are considered in this study. Therefore, the observation operator H is represented by the bilinear interpolation of model values to the observation location. A diagonal observation error covariance R is used, as is the case in most real DA systems.
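A bilinear observation operator of this kind can be sketched as below. The regular grid, the coordinates in kilometres and the function name are assumptions made for illustration, not the OOPS implementation.

```python
import numpy as np

def bilinear_obs_operator(field, x_grid, y_grid, x_obs, y_obs):
    """Interpolate a 2-D field of shape (ny, nx) to one observation location."""
    i = np.searchsorted(x_grid, x_obs, side="right") - 1
    j = np.searchsorted(y_grid, y_obs, side="right") - 1
    i = int(np.clip(i, 0, len(x_grid) - 2))   # keep the cell inside the grid
    j = int(np.clip(j, 0, len(y_grid) - 2))
    tx = (x_obs - x_grid[i]) / (x_grid[i + 1] - x_grid[i])
    ty = (y_obs - y_grid[j]) / (y_grid[j + 1] - y_grid[j])
    return ((1 - tx) * (1 - ty) * field[j, i]
            + tx * (1 - ty) * field[j, i + 1]
            + (1 - tx) * ty * field[j + 1, i]
            + tx * ty * field[j + 1, i + 1])

# Hypothetical grid (km) and a field that varies linearly in x and y.
x_grid = np.array([0.0, 750.0, 1500.0])
y_grid = np.array([0.0, 750.0])
field = x_grid[None, :] + 2.0 * y_grid[:, None]
value = bilinear_obs_operator(field, x_grid, y_grid, 300.0, 500.0)
```

Because the operator is linear in the model values, the same weights serve as both the nonlinear and the linearized operator in this setting.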

One drawback of 3D-Var is that DA results rely strongly on the background error covariance B, which should depend on the previous assimilation cycles and forecast errors (flow dependence). In practical applications, B is usually set constant, estimated from verification of previous forecasts (climatological) and/or tuned to provide the best fit of the analyses against independent observations. In the case of multivariate chemical assimilation, the estimation and validity of climatological error covariances between chemical species has not yet been demonstrated. As a consequence, multivariate corrections are normally neglected (Cv = I). This simplification is also used in this study. Finally, 3D-Var provides a correction to the control variable each time the system is observed (every hour in air quality applications). This does not allow the dynamical information contained in observation time series to be exploited and prevents any estimation of model error terms.

3.2 4DEnVar

The 4DEnVar algorithm can solve the main drawbacks of 3D-Var by introducing the temporal dimension in the quadratic cost function (Eq. 6), similarly to what the classic 4D-Var algorithm does (Dimet and Talagrand, 1986). However, compared to the latter, it avoids the introduction of the tangent linear and adjoint codes of the forecast model. Following the notation in Desroziers et al. (2014), the 4DEnVar cost function can be written as

$$J(\underline{\delta x}) = \frac{1}{2}\,\underline{\delta x}^{T} B_e^{-1}\,\underline{\delta x} + \frac{1}{2}\,(\underline{\delta y}^{o} - \underline{H}\,\underline{\delta x})^{T}\,\underline{R}^{-1}\,(\underline{\delta y}^{o} - \underline{H}\,\underline{\delta x}), \tag{9}$$

where all underlined terms are now time dependent (4-D). The control variable $\underline{x}$ becomes the temporal trajectory of the model state ($\psi$ plus the 96 chemical species in this study). The cost function is computed for an assimilation window that can span several hours or days. The assimilation window is discretized with an arbitrary number of sub-windows, which defines the temporal dimension of the 4-D vectors and matrices in Eq. (9). The minimization of $J$ returns a 4-D vector $\underline{\delta x}_a$, which provides the analysis trajectory for the entire assimilation window.

The forecast error covariance Be is estimated from an ensemble of perturbed model trajectories:

$$B_e = \frac{1}{L-1}\,(x'_1,\ldots,x'_L)\,(x'_1,\ldots,x'_L)^{T}, \tag{10}$$

where $x'_1,\ldots,x'_L$ are the four-dimensional perturbations and $L$ denotes the size of the ensemble. As a result, $B_e$ describes spatial (3-D), multivariate and temporal covariances at once.
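A sample covariance of the form of Eq. (10) can be sketched as follows. The sketch assumes that the perturbations are deviations from the ensemble mean and that each 4-D trajectory is flattened into a single vector. Both are illustrative choices, and the random ensemble below is synthetic.

```python
import numpy as np

def ensemble_covariance(trajectories):
    """Sample covariance from L trajectories, each flattened to a 4-D vector.

    trajectories has shape (L, n), with n = (sub-windows) x (state size).
    """
    L = trajectories.shape[0]
    X = (trajectories - trajectories.mean(axis=0)).T   # n x L perturbations
    return X @ X.T / (L - 1)

rng = np.random.default_rng(1)
L, n_sub, n_state = 16, 4, 6                 # hypothetical sizes
ens = rng.standard_normal((L, n_sub * n_state))
Be = ensemble_covariance(ens)
rank = np.linalg.matrix_rank(Be)             # cannot exceed L - 1
```

With 16 members and a 24-element vector the matrix is rank deficient, which is the small-scale analogue of the sampling problem in large systems.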

An analogy with the 4D-Var weak constraint cost function shows that $B_e$ represents a numerical approximation of the linearized forecast model and model error covariance (Desroziers et al., 2014). As for the 3D-Var case, the minimization of $J$ is achieved using a B-preconditioned conjugate gradient algorithm (algorithm no. 3 in Desroziers et al., 2014). $\underline{H}$ and $\underline{R}$ are block diagonal matrices whose blocks are the operators $H$ and $R$ defined in Sect. 3.1 for each sub-window.

A useful property of the 4DEnVar algorithm is that the effect of the model error covariance, generally denoted by Q (Trémolet, 2006), can be introduced easily by adding physical perturbations during the computation of the ensemble of forecasts. This lets the model dynamics develop the complex covariances that derive from physical perturbations, without the need to specify and implement a covariance model for Q. Whenever only the initial condition of the ensemble is perturbed at the beginning of the assimilation window, 4DEnVar approximates the 4D-Var algorithm in the strong constraint formulation. The flexibility of the model error specification in 4DEnVar is a strong asset for atmospheric chemistry DA, where the sources of uncertainty can be highly variable and a function of the chemical species.

The main drawback of 4DEnVar in practical applications to large-scale models is related to the finite size of the ensemble. As a consequence, the 4-D error covariance Be is not full rank and covariance terms may contain statistical noise. The effect of a noisy covariance is the appearance of spurious analysis increments far away from assimilated observations, or between physically uncorrelated model variables (in the multivariate case). Since with 4DEnVar temporal covariances are also estimated through the ensemble, this effect also concerns the time dimension. This is a typical issue with all ensemble-based methods and large systems, and demands the introduction of a localization operator, which attenuates non-local increments. The localization is applied to Be by

$$B = B_e \circ \underline{C}, \tag{11}$$

where $\circ$ denotes the Schur product (entry-wise product) and $\underline{C}$ is a 4-D correlation operator that damps non-local covariances. The numerical implementation of Eq. (11) is made under the following approximation: the same 3-D (and multivariate) correlation operator $C$ is used for all 4DEnVar sub-windows. It follows that $\underline{C}$ is a block matrix with all blocks set equal to $C$. Hence, in order to specify $C$, we can use the covariance operator described in Eq. (7) with the variance terms set to one. The choice of parameters of the correlation operators is discussed in the results section. This simplification, also called static localization, significantly reduces the numerical cost of the algorithm (Desroziers et al., 2014) at the price of degraded precision when increasing the length of the DA windows (Bocquet, 2016). The development of localization procedures that are more consistent with the dynamics of the forecast model is an ongoing research topic (Bocquet, 2016; Desroziers et al., 2016) and possible applications to QG-Chem will be considered in a future study.
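Static localization as described for Eq. (11) can be sketched with dense matrices. The sketch assumes a 4-D vector ordered sub-window by sub-window, a 1-D state and a Gaussian correlation with a hypothetical length scale. None of the names below come from OOPS.

```python
import numpy as np

def static_localization(Be, C, n_sub):
    """Schur product of a 4-D covariance with a block matrix of identical C."""
    C4d = np.kron(np.ones((n_sub, n_sub)), C)   # every block equals C
    return Be * C4d                             # entry-wise (Schur) product

# Hypothetical toy problem: 5 grid points, 3 sub-windows, 4 members.
n_state, n_sub, L = 5, 3, 4
pos = np.arange(n_state, dtype=float)
C = np.exp(-0.5 * (pos[:, None] - pos[None, :]) ** 2)   # unit diagonal

rng = np.random.default_rng(2)
X = rng.standard_normal((n_state * n_sub, L))
X -= X.mean(axis=1, keepdims=True)              # perturbations about the mean
Be = X @ X.T / (L - 1)
B_loc = static_localization(Be, C, n_sub)
```

Variances and temporal covariances at the same grid point are left untouched, whereas covariances between distant points are damped in every time block. This is why the scheme is called static: the damping pattern does not follow the flow across sub-windows.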

3.3 Diagnosis of model error and forecast correction with 4DEnVar

One of the objectives of this study is to retrieve model error information from the 4DEnVar solution, which can potentially be used to improve the chemical forecasts for the next day. The 4DEnVar analysis increment $\underline{\delta x}_a$ accounts for the correction of both the initial condition and the model forecast (model error). However, the control variable only contains the model state. It follows that the correction of the initial condition is simply given by $\delta x_a^{(t=0)}$, i.e., the first element of the 4-D vector $\underline{\delta x}_a$. The correction of the model error contributes to the values of $\delta x_a^{(t>0)}$, but a diagnostic procedure has to be applied to retrieve it.

We can compute a forecast trajectory $x_f$ using the nonlinear model starting from the updated initial condition $x_b^{(t=0)} + \delta x_a^{(t=0)}$. Subtracting it from the 4DEnVar analysis at $t > 0$, we obtain

$$\Delta x^{(t>0)} = x_a^{(t>0)} - x_f^{(t>0)}. \tag{12}$$
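The diagnostic of Eq. (12) can be illustrated with a toy linear model. The sketch below makes strong assumptions that are not part of the study: a two-variable linear model, a constant additive model error in each sub-window, and an analysis that coincides with the truth. All numerical values are hypothetical.

```python
import numpy as np

def run_model(x0, M, n_steps, eta=None):
    """Integrate x(t) = M x(t-1) + eta over n_steps sub-windows."""
    eta = np.zeros_like(x0) if eta is None else eta
    traj = [x0]
    for _ in range(n_steps):
        traj.append(M @ traj[-1] + eta)
    return np.array(traj)                   # shape (n_steps + 1, n)

M = np.array([[0.9, 0.1],
              [0.0, 0.8]])                  # hypothetical linear model
eta = np.array([0.5, -0.2])                 # hypothetical stationary model error
x0 = np.array([10.0, 5.0])
n_steps = 24

xa = run_model(x0, M, n_steps, eta)         # "analysis", assumed equal to truth
xf = run_model(xa[0], M, n_steps)           # forecast from the analysed state
delta_x = xa - xf                           # Eq. (12); zero at t = 0

# Model error accumulated through the preceding sub-windows (linear model).
expected = np.zeros_like(xa)
for t in range(1, n_steps + 1):
    expected[t] = M @ expected[t - 1] + eta
```

Under these assumptions the difference equals the model error propagated and summed over the preceding sub-windows, rather than the error of a single step.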

The difference $\Delta x$ in Eq. (12) contains information about the contribution of the model error in the 4-D state, as explained below. Let us first define the model error $\eta$ and the analysis error $e_a$ as

$$\eta^{(t)} = x_\star^{(t)} - M_{(t-1)\to(t)}\big(x_\star^{(t-1)}\big), \tag{13}$$

$$e_a^{(t)} = x_a^{(t)} - x_\star^{(t)}, \tag{14}$$

where $x_\star^{(t)}$ is the truth at time $t$ and $M_{(t-1)\to(t)}$ is the integration of the model from time $(t-1)$ to time $t$, with the temporal discretization matching the length of the 4DEnVar sub-windows (Sect. 3.2). From Eq. (13), we have

$$\begin{aligned}
\eta^{(t+1)} &= x_\star^{(t+1)} - M_{(t)\to(t+1)}\big(x_\star^{(t)}\big) \\
&= x_\star^{(t+1)} - M_{(t)\to(t+1)}\Big(\eta^{(t)} + M_{(t-1)\to(t)}\big(x_\star^{(t-1)}\big)\Big) \\
&= x_\star^{(t+1)} - M_{(t)\to(t+1)}\Big(\eta^{(t)} + M_{(t-1)\to(t)}\big(\eta^{(t-1)} + M_{(t-2)\to(t-1)}(x_\star^{(t-2)})\big)\Big) \\
&= \ldots \\
&\simeq x_\star^{(t+1)} - M_{(0)\to(t+1)}\big(x_\star^{(0)}\big) - \sum_{j=1}^{t}\,\prod_{l=j+1}^{t+1} \mathbf{M}_l\,\eta^{(j)},
\end{aligned}$$

where we have used the Taylor expansion and assumed that the forecast model can be linearized inside each sub-window: $M_{(t-1)\to(t)} \simeq \mathbf{M}_t$. This equality can be rewritten as

$$x_\star^{(t+1)} - M_{(0)\to(t+1)}\big(x_\star^{(0)}\big) \simeq \eta^{(t+1)} + \sum_{j=1}^{t}\,\prod_{l=j+1}^{t+1} \mathbf{M}_l\,\eta^{(j)}. \tag{15}$$

Using the definition (Eq. 14) and the linear assumption on the model, Eq. (12) yields

$$\begin{aligned}
\Delta x^{(t)} &= x_a^{(t)} - M_{(0)\to(t)}\big(x_a^{(0)}\big) \\
&= x_a^{(t)} - x_\star^{(t)} + x_\star^{(t)} - M_{(0)\to(t)}\big(x_a^{(0)} - x_\star^{(0)} + x_\star^{(0)}\big) \\
&= e_a^{(t)} + x_\star^{(t)} - M_{(0)\to(t)}\big(e_a^{(0)} + x_\star^{(0)}\big) \\
&\simeq e_a^{(t)} + x_\star^{(t)} - M_{(0)\to(t)}\big(x_\star^{(0)}\big) - \prod_{l=0}^{t} \mathbf{M}_l\,e_a^{(0)}.
\end{aligned}$$

Substituting the equality (Eq. 15) into the latter, we get

$$\Delta x^{(t)} \simeq e_a^{(t)} - \prod_{l=0}^{t} \mathbf{M}_l\,e_a^{(0)} + \eta^{(t)} + \sum_{j=1}^{t-1}\,\prod_{l=j+1}^{t} \mathbf{M}_l\,\eta^{(j)}. \tag{16}$$

Therefore, if the error $\big(e_a^{(t)} - \prod_{l=0}^{t} \mathbf{M}_l\,e_a^{(0)}\big)$ is small, the vector $\Delta x^{(t)}$ represents the contribution of the model error on the 4-D state, accumulated through the preceding 4DEnVar sub-windows. Hereafter, we will name $\Delta x^{(t)}$ the effective model error, to be distinguished from the model error, which is generally associated with $\eta$.

In the first instance, $\Delta x$ can be used to diagnose the underlying presence of model error in the forecast. Furthermore, if the model error is stationary along multiple DA windows (bias), $\Delta x$ could be used to correct the forecast for the next window:

$$\tilde{x}_f^{\,i+1} = x_f^{\,i+1} + \Delta x^{\,i}, \tag{17}$$

where the superscript denotes the assimilation window index and $\tilde{x}_f$ is the corrected forecast. Validation of operational air quality models shows that biases contribute to a large part of the model uncertainties (Marécal et al., 2015; Zyryanov et al., 2012; Huijnen et al., 2010), which justifies the implementation of the proposed bias correction procedure. Note that this procedure is compatible with model biases that show an hourly variability but are stationary on successive days, as is found in most air quality models (Marécal et al., 2015; Gaubert et al., 2014). Note also that the computation of the effective model error is blind to the type of the underlying model error (e.g., chemical emissions or physical parameterizations). The two main requirements needed to make a useful estimation of $\Delta x$ are (i) analysis errors smaller than model errors and (ii) an approximate knowledge of the sources of model error, necessary to generate informative ensembles.

As an alternative to the proposed procedure, model error terms $\eta^{(t)}$ could be diagnosed for each sub-window and then applied to the next-day forecast on an hourly basis. However, this correction method is more intrusive, because it must be applied during the nonlinear model forecast, and was not considered for this study.

4 Results and discussion

Numerical experiments are described and discussed in this section. The objective is to assess the performance of the 4DEnVar algorithm for the assimilation of the four key species: NO2, O3, CO and CO2. In all experiments, a model simulation with unperturbed parameters (same as the one in Sect. 2.3) represents the truth. The experiments are performed during the meteorological–chemical situation described in Sect. 2.3. Synthetic observations are generated from the truth by applying H (Sect. 3.1) and by adding a normally distributed error (Table 5). Four observation locations are considered for the experiments (Fig. 1), where chemical species are observed hourly. The relatively low density of the observation network allows us to assess DA skills at unobserved locations. Model forecasts are produced by perturbing only the initial condition (perfect model), the surface emissions (model error) or both. DA is performed with either 3D-Var or 4DEnVar, and the obtained analyses are compared to the truth. The meteorology is never observed nor perturbed, which corresponds to using QG-Chem in a CTM-like mode.

Operational air quality centers collect hourly observations and perform DA typically once per day (Marécal et al., 2015). Therefore, a 24 h assimilation window is adopted when using the 4DEnVar algorithm, with 1 h sub-windows matching the observation frequency. For the same reason, 24 sequential cycles of 1 h are adopted with 3D-Var. The main processes affecting air quality forecasts (daily emissions, evolution of the mixing layer, photochemistry) have a period of approximately 24 h. Therefore, a 24 h window makes it possible to account for errors in the main model processes. The use of longer windows is theoretically possible with 4DEnVar, assuming that the linearization of the model perturbations remains valid. However, the numerical cost of the minimization increases with the window length when the duration of the sub-windows is kept fixed. The cost–benefit

ratio of longer windows deserves further investigation in real applications.

Section 4.1 presents results based on a perfect model hypothesis and compares 3D-Var and 4DEnVar performance for chemical reanalyses. Section 4.2 introduces model errors and focuses on the estimation of the effective model error using 4DEnVar. Section 4.3 summarizes the results over a larger number of DA windows: 4DEnVar results using the model bias correction procedure are compared to 3D-Var results for both reanalyses and forecasts.

Table 5. Background (B) and observation (O) error standard deviation used in DA experiments expressed in volume mixing ratio units and, within brackets, in percentage of the average field value in Fig. 2.

Species  B                      O
NO2      1.2 × 10^-10 (16 %)    0.9 × 10^-10 (11 %)
O3       12.2 × 10^-9 (21 %)    3.2 × 10^-9 (5 %)
CO       64.8 × 10^-9 (14 %)    8.1 × 10^-9 (2 %)
CO2      40.5 × 10^-6 (13 %)    20.2 × 10^-6 (6 %)

4.1 Perfect model experiments

Experiments considering a perfect model are presented in this section for one assimilation cycle of 24 h. Emphasis is placed on the reanalysis capabilities of DA. A 24 h long forecast is produced by perturbing only the initial condition. Initial perturbations are computed by applying $B^{1/2}$ (Eq. 8) to a Gaussian uncorrelated noise field $N(0, I)$ at $t = 0$. Since chemical concentrations can span different orders of magnitude in the atmosphere, the standard deviation values set in $B^{1/2}$ depend on the chemical species (Table 5). The background error standard deviations have been chosen to be about 10–20 % of the average field values (Fig. 2). The same horizontal correlation length has been set for all species (750 km). Multivariate correlations in $B^{1/2}$ are switched off so that perturbations of chemical species are not correlated at $t = 0$. Chemical correlations can, however, arise at $t > 0$ due to chemical couplings.

DA is applied to correct the 24 h long perturbed forecast. The same covariance matrix B that was used to produce the initial perturbation is applied at each cycle of the 3D-Var analysis (1 h). Hence, the background error covariance used in DA is perfectly known at $t = 0$, which represents the best possible case in DA. At $t > 0$ the true B depends on the observations assimilated at previous steps and on the model dynamics, which makes the use of a fixed B within 3D-Var a crude approximation. However, this is the typical setting of operational air quality models.

With 4DEnVar, an ensemble of 16 forecasts is generated by perturbing the initial condition with the same B as above. Therefore, the ensemble size is small compared to the dimensions of the system (16 × 8 × 97 = 12 416 variables). When using small ensembles, the localization of the sample covariance is necessary. In this study, horizontal localization is applied to the 4-D ensemble covariance using Eq. (11), by setting a horizontal length scale equal to double the value used for B (1500 km). Multivariate localization is performed using a correlation coefficient of 0.5, which was chosen empirically. A sensitivity analysis concerning the ensemble size and the localization choices is presented later in this section. Since we use QG-Chem in a CTM-like configuration and the two layers are not chemically coupled, the vertical terms of the covariance or localization matrix are always set to zero in this study, without any impact on the presented results.

4.1.1 Univariate assimilation

This section compares results of 3D-Var and 4DEnVar DA in univariate settings, i.e., one independent assimilation experiment is performed for each of the four chemical species. The controlled species corresponds to the species that is measured and that is perturbed at the initial time. With 3D-Var, this is obtained by setting all terms of B to zero except for the 3-D covariance of the selected species. With 4DEnVar the same is obtained by setting all multivariate localization coefficients to zero.

Figure 3 compares the temporal trajectories of the analysis of each species obtained from 3D-Var and 4DEnVar. Each plot provides the temporal trajectories at the two grid points shown in Fig. 1: one located in the polluted region and in correspondence with assimilated observations (grid point A) and one in the cleaner region and slightly displaced from the measurement locations (grid point B).

[Figure 3 legend: truth, free forecast, 4DEnVar analysis, 3D-Var analysis.]

Figure 3. DA results with a perfect model hypothesis. Temporal trajectories for the four assimilated species are shown from top to bottom (O3, CO, CO2 and NO2). Forecast, 3D-Var/4DEnVar analyses and truth are shown in each plot for the grid points A (left plots) and B (right plots) depicted in Fig. 1.

In addition to the comparison of the two DA algorithms, these experiments also make it possible to assess the impact of the initial perturbation on chemical forecasts. First of all, we remark that O3 forecasts are very close to the truth values after 24 h, which is a consequence of the fact that O3 is strongly controlled by precursor emissions and photochemistry. The memory of the initial condition is rapidly lost for O3, as was also demonstrated within regional air quality models (Jaumouillé et al., 2012; Wu et al., 2008). This is not the case for CO2 and CO, which have a longer lifetime. Therefore, the initial perturbation is advected by the wind field and the spread between the forecast and the truth lasts longer in time. NO2, which lasts a few hours in a summertime atmosphere, is practically insensitive to the perturbation of the initial condition (Fig. 3). Note also that the chemical concentrations are always lower at the clean location (grid point B) than at the polluted one (grid point A), except for CO2, which is neither emitted nor chemically produced. Again, this is a consequence of the geographical variability of the emission factors (Fig. 1).

We remark that 4DEnVar provides in general better analyses than 3D-Var for all species. First, it can be observed from Fig. 3 that the analysis time series obtained with 4DEnVar are smoother than those resulting from 3D-Var because daily trajectories are optimized at once with 4DEnVar. The sequential aspect of 3D-Var, instead, makes the analysis more sensitive to the random observation errors. This introduces the observed jumps in the analyses.

Figure 4 provides the root mean square error (RMSE) gain (in %) for every grid point $(i,j)$ of the model domain,

$$\mathrm{RMSE}_{\mathrm{gain}}(i,j) = \frac{1}{\overline{X}(i,j)}\,\big(\mathrm{RMSE}_{\mathrm{anl}}(i,j) - \mathrm{RMSE}_{\mathrm{fct}}(i,j)\big). \tag{18}$$

Here, $\overline{X}(i,j)$ is the average concentration of the truth values (Fig. 2), and $\mathrm{RMSE}_{\mathrm{fct}}$ and $\mathrm{RMSE}_{\mathrm{anl}}$ are the absolute values of RMSE (in chemical concentration units) for the forecast and analysis, respectively. The RMSE at the location $(i,j)$ is defined as

$$\mathrm{RMSE}(i,j) = \sqrt{\frac{1}{N}\,\big(x_\star(i,j) - x(i,j)\big)^{T}\big(x_\star(i,j) - x(i,j)\big)}, \tag{19}$$

where $N = 24$ is the total number of sub-windows, and $x_\star(i,j)$ and $x(i,j)$ denote the temporal trajectories of the truth and the analysis (or forecast) at the grid point $(i,j)$, respectively. In Fig. 4 the blue color means that RMSE values after DA have reduced (DA improved the forecast) and the red color means that RMSE values after DA have increased (DA degraded the forecast). Therefore, we can see from these figures that with 4DEnVar, improvements of the RMSE at unobserved locations are more pronounced than those of 3D-Var (more widespread blue regions and fewer red regions in RMSE gain). This is likely due to a better description of the background error covariance, which is flow dependent and closer to the true forecast error covariance within 4DEnVar. For example, the background error covariance used to assimilate NO2 with 3D-Var is highly overestimated for $t > 0$. This happens because, when the model is perfect, the NO2 field rapidly converges to the truth after a few hours (Fig. 3). In this specific case, the RMSE is degraded even at observed locations (Fig. 4, bottom left plot). This is an instructive example of the effects of an incorrectly specified B within 3D-Var. Finally, note that for other species 3D-Var is also capable of decreasing RMSE at unobserved locations. This effect is a result of the advection of the analysis increments, and, as expected, is more pronounced with long-lived species (CO, CO2) than with more reactive or emitted gases (O3, NO2).

Table 6 reports the minimum, maximum and average of the field values in Fig. 4, with the sign inverted to display positive values for positive gain of DA, and negative otherwise. It can be seen that with 4DEnVar, the maximum degradation of RMSE (i.e., the minimum gain in absolute values) is always smaller by a factor of 2 to 5 than the maximum gain, and the average RMSE gain is always positive (i.e., DA improves the forecast). The appearance of local but relatively small RMSE degradation can be tolerated in atmospheric chemistry, because the chemical system has a dissipative behavior and errors in the model state cannot grow during the forecast step. Since the error covariance of the initial condition is perfectly known in the experiment setup, the degradation of the 4DEnVar analysis is a consequence of the algorithm hypotheses (e.g., the linearization of the forecast model) or of the numerical implementation (e.g., the finite size of the ensemble and the localization approximations).

Figure 4. DA results with a perfect model hypothesis. From top to bottom: the RMSE gain (Eq. 18) is displayed, respectively, for O3, CO, CO2 and NO2 assimilation experiments. Blue color means that DA lowered the RMSE and red color means that DA increased the RMSE. Plots on the left are obtained using 3D-Var, on the right using 4DEnVar.

Table 6. Summary statistics of RMSE gain for perfect model DA experiments in Fig. 4. Signs have been inverted compared to figures to show positive values when DA reduces the RMSE.

         3D-Var                                          4DEnVar
Species  Max. gain (%)  Min. gain (%)  Avg. gain (%)    Max. gain (%)  Min. gain (%)  Avg. gain (%)
O3       17.11          7.67           0.92             27.03          6.68           2.55
CO       17.15          3.33           1.22             18.82          4.11           1.62
CO2      13.08          10.60          1.76             13.62          7.51           1.93
NO2      57.39          6.15           0.95             48.07          10.52          1.21

4.1.2 Multivariate assimilation

A second set of DA experiments has been performed, this time perturbing and assimilating the four chemical species at the same time. Therefore, one 24 h long forecast and the corresponding analysis have been computed for all species, instead of four independent analyses as before. In the case of 3D-Var, elements of B related to the four assimilated species are set using the same parameters as in Sect. 4.1.1. Elements related to unobserved species are kept at zero, as are cross-variable correlations. This leads to a multi-species assimilation, which is not yet multivariate. Effects on unobserved variables or between species are still permitted by chemical couplings in the forecast model.

The gain on RMSE obtained with 3D-Var for the four species (not shown) is very similar to that obtained in Sect. 4.1.1 with independent experiments.

With 4DEnVar, the corresponding multi-species assimilation has been tested by setting the cross-variable correlation coefficients to zero in the localization operator. In addition, a multivariate case has also been examined by setting the correlation coefficients to 0.5. In both cases results (not shown) were again found to be very similar to those in Fig. 4.

These results indicate that, when the initial condition is the only source of uncertainty, the chemical coupling between species does not influence DA much. This is confirmed by the ensemble standard deviation of the 4DEnVar experiments in Sect. 4.1.1. For each of the four assimilation experiments the average ensemble standard deviation has been computed for all species (perturbed and not perturbed at $t = 0$). The ensemble standard deviation of unperturbed species stays below 1 % of the local concentration (not shown), compared to typical values of about 10–15 % for the species that are perturbed initially. Therefore, the perturbation of the initial state alone does not significantly affect the chemical balance. This also justifies neglecting the cross correlations between chemical species in operational systems that assimilate hourly observations sequentially. This result was expected for weakly reacting species like CO2 or CO but was not evident for reacting gases such as O3 and NO2. A possible reason is that the amount of NO2 produced hourly by the oxidation of emitted NO is much larger than the applied initial perturbation. Therefore, the O3 photochemical production, which happens later during the day, is not much influenced by the perturbation of NO2 at midnight. It would be interesting to verify if similar results also hold when larger perturbations are applied during daytime. A wider exploration of different chemical regimes is left for a future study.

[Figure 5 panels: RMSE gain (%) versus ensemble size (8, 16, 32, 64, 128) for each species; legend: Avg RMSE (×10), Min RMSE, Max RMSE.]

Figure 5. Impact of ensemble size (Ne) on the RMSE gain of the 4DEnVar analyses. Results are shown for the four chemical species: O3, CO, CO2 and NO2. For each plot, values of maximum, minimum and average RMSE gain (Table 6) are shown in different colors, compared to reference results obtained with 3D-Var. Positive values mean better RMSE gain with 4DEnVar than with the reference 3D-Var. Average RMSE gain has been multiplied by 10 to better highlight the differences among the experiments.

4DEnVar was capable of providing similarly good results as in Sect. 4.1.1 when enabling the cross-variable covariances (and the respective localization). This means that the noise of chemical cross covariances due to the small ensemble size (16 members) did not degrade the results. This leaves hope for an effective multivariate chemical assimilation when the role of chemical couplings becomes larger (Sect. 4.2).

4.1.3 Ensemble size and localization

This section examines the impact of the principal parameters of the 4DEnVar algorithm, i.e., the ensemble size (16 members in previous experiments) and the localization length scale, on the analysis RMSE. These two aspects are closely linked with the dimension of the system and the eigenspectrum of the error covariance matrices (Furrer and Bengtsson, 2007). For instance, for a strongly decaying spectrum relatively few ensemble members are enough to provide a good approximation of the error covariance matrix. If this is not the case, a larger ensemble size allows in principle less severe localization. However, no theoretical formulation yet exists for 4DEnVar that links these parameters, e.g., the localization scale with the ensemble size. Moreover, the number of parameters of the localization operator (e.g., the horizontal length scale) might depend on the chemical species. A full exploration of the parameter space rapidly becomes impractical, even in a simplified model framework. Finally, the approximations made in the implementation of the


Figure 6. Same as Fig. 5 but for the horizontal length scale of the localization operator (noloc when no localization is applied). [Plot axes: RMSE gain (%) versus localization scale (noloc, 500 km, 1000 km, 1500 km, 2000 km, 3000 km). Legend: Avg RMSE (x10), Min RMSE, Max RMSE.]

localization operator (Sect. 3.2) can potentially have a larger effect than the choice of the parameters themselves. Therefore, an empirical approach has been used in this paper, and we postpone a detailed theoretical analysis to future work.

In the empirical approach, first the ensemble size is examined using fixed but reasonable values for the localization operator (Sect. 4.1). The ensemble size can be one of the main limiting factors in operational forecast centers, so the smallest ensemble providing better results than 3D-Var for all species was taken. Second, the impact of horizontal and chemical localization is examined, keeping the selected ensemble size fixed.

Results for the case study are shown in Fig. 5 (varying ensemble size), Fig. 6 (varying horizontal localization) and Fig. 7 (varying cross-variable localization). The minimum, maximum and mean RMSE gain (Table 6) of the analysis are considered to compare 4DEnVar and 3D-Var experiments. Hence, the plots display the RMSE gain of 4DEnVar minus that of 3D-Var, with the sign adjusted so that values are positive when 4DEnVar beats 3D-Var and negative otherwise. In general, the desired case is represented by all bars (mean, maximum and minimum gain) being positive. Nevertheless, a positive average gain with the sporadic occurrence of negative maximum/minimum gains can be tolerated, if the values for the minimum gain bar do not become too negative. Since the minimum RMSE gain is already negative (Table 6), negative values for the corresponding bars in Figs. 5, 6 and 7 mean that the degradation of the 4DEnVar analysis is larger than that of the 3D-Var analysis, which is not desired.
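The comparison described above can be illustrated with a minimal sketch. The exact definition of the RMSE gain (Fig. 4, Table 6) is not reproduced in this section, so the relative form used in `rmse_gain_percent` is an assumption, and the gain fields at the end are hypothetical numbers chosen only to show the sign convention (positive differences when 4DEnVar beats 3D-Var).

```python
import math

def rmse(errors):
    """Root mean square of a list of (estimate - truth) differences."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def rmse_gain_percent(forecast_err, analysis_err):
    """Relative RMSE reduction (%) at one grid point, positive when DA helps.

    Assumed definition, not taken from the paper:
    100 * (rmse_forecast - rmse_analysis) / rmse_forecast.
    """
    rf, ra = rmse(forecast_err), rmse(analysis_err)
    return 100.0 * (rf - ra) / rf

def summarize(gain_field):
    """Minimum, maximum and average gain over all grid points."""
    return {
        "min": min(gain_field),
        "max": max(gain_field),
        "avg": sum(gain_field) / len(gain_field),
    }

def compare(gain_4denvar, gain_3dvar):
    """Difference of the summary statistics (4DEnVar minus 3D-Var)."""
    s4, s3 = summarize(gain_4denvar), summarize(gain_3dvar)
    return {key: s4[key] - s3[key] for key in s4}

# Hypothetical gain fields (%) on four grid points, for illustration only.
gain_3dvar = [30.0, 10.0, -8.0, 20.0]
gain_4denvar = [40.0, 15.0, -4.0, 25.0]
diff = compare(gain_4denvar, gain_3dvar)
```

In this made-up case all three differences are positive: the minimum gain is negative for both methods (a local degradation), but the degradation is smaller with the hypothetical 4DEnVar field.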

Results are very similar using 64 or 128 members, for all species, suggesting that some convergence of the assimilation scores is achieved with more than 64 members (Fig. 5). As expected, the accuracy starts to decrease with fewer than 32 members, and most of the gain of 4DEnVar over 3D-Var is lost with only 8 members. In a few cases, the RMSE gain decreases when the size of the ensemble increases. This is due to statistical fluctuations when the ensemble sizes are small. To avoid misinterpretation of statistical noise, the comparison of 3D-Var and 4DEnVar is repeated in Sect. 4.3 for a larger number of DA windows. Finally, we remark that results with 16 members satisfy the requirements expressed above: better average RMSE gain than 3D-Var for all species and a limited number of cases of RMSE degradation. Hence, an ensemble size of 16 members is retained. We remind readers that the objective of this study is to demonstrate the applicability of a DA algorithm that could outperform currently implemented methods in operational centers,


Figure 7. Same as Fig. 5 but for the multivariate correlation coefficient of the localization operator (with 0 corresponding to univariate DA and 1 to a full multivariate unlocalized case). [Plot axes: RMSE gain (%) versus chemical localization coefficient (1, 0.5, 0.2, 0). Legend: Avg RMSE (x10), Min RMSE, Max RMSE.]

but with an acceptable compromise between computational costs and precision. Therefore, even if an ensemble size of 32 or 64 would have represented a more accurate option for this study, we found it more valuable to assess the potential of 4DEnVar when computational resources might be limited.
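The trade-off between ensemble size and sampling noise can be illustrated independently of QG-Chem. The sketch below is only an illustration, not the experiment of Fig. 5: it draws two uncorrelated Gaussian variables and measures the root mean square of their sample correlation, which for independent Gaussian samples is expected to be close to 1/sqrt(Ne - 1). The function names, the number of trials and the seed are arbitrary choices.

```python
import math
import random

def sample_correlation(x, y):
    """Pearson correlation of two equally long samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spurious_correlation_rms(n_members, n_trials=1000, seed=0):
    """RMS sample correlation between two truly uncorrelated variables."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_members)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n_members)]
        total += sample_correlation(x, y) ** 2
    return math.sqrt(total / n_trials)

# Ensemble sizes explored in Fig. 5.
noise = {ne: spurious_correlation_rms(ne) for ne in (8, 16, 32, 64, 128)}
```

With 16 members the spurious correlations are roughly 0.26 in RMS, which is one way to see why localization is needed at this ensemble size and why the scores stabilize only for larger ensembles.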

The choice of the horizontal localization scale is more delicate, because it is intimately linked to the model dynamics and depends on the ensemble size and on the assumptions made to construct and apply C (Sect. 3.2). First, Fig. 6 shows that horizontal localization is necessary to obtain meaningful results with 4DEnVar. Second, increasing the localization scale to values as high as 3000 km, compared to the 750 km of the initial perturbation scale, improves the maximum RMSE gain but also significantly degrades the minimum gain. This is not desired, as explained above. When the same homogeneous and global localization scale is used for all chemical species, the best results are found for a value of 1500 km.

The configuration of a multivariate localization, or chemical localization in our study, presents the same issues as the spatial one. An approach similar to that of the above paragraph has been taken. The numerical experiments described in Sect. 4.1.2, which considered multivariate cases, are used to compute error differences in Fig. 7. Again, we remark that without any localization the statistical noise of the ensemble covariance significantly degrades the results compared to 3D-Var. A localization coefficient of 0.5 efficiently reduces the effect of noisy correlations in the 4-D ensemble covariance, while leaving the possibility of describing multivariate effects. We remind readers that multivariate effects were found to be very small in the perfect model experiments presented until now, but can be much larger when a model error is introduced (Sect. 4.2).

We conclude that a simple localization scheme, based on global and empirically tuned parameters, already provides encouraging results for the application of 4DEnVar to large-scale chemistry models. The development of more sophisticated localization operators (Bocquet, 2016; Desroziers et al., 2016) and of more rigorous methods to estimate their parameters (Ménétrier et al., 2015) is a current subject of research and will be the topic of a future study.
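The two localization parameters discussed in this section (one horizontal length scale and one chemical localization coefficient) can be sketched as an element-wise (Schur) product applied to a sample covariance. This is a generic illustration under stated assumptions and not the operator C of Sect. 3.2, whose construction and approximations are not reproduced here: the Gaussian taper, the 1-D grid, the two-species state layout and all function names are hypothetical.

```python
import math
import random

def ensemble_covariance(members):
    """Sample covariance of a list of state vectors (one per member)."""
    n, dim = len(members), len(members[0])
    mean = [sum(m[k] for m in members) / n for k in range(dim)]
    pert = [[m[k] - mean[k] for k in range(dim)] for m in members]
    return [[sum(p[a] * p[b] for p in pert) / (n - 1) for b in range(dim)]
            for a in range(dim)]

def localization_matrix(positions_km, n_species, length_km, chem_coeff):
    """Distance taper (Gaussian shape assumed) times a cross-species damping."""
    n_pts = len(positions_km)
    dim = n_species * n_pts
    loc = [[0.0] * dim for _ in range(dim)]
    for a in range(dim):
        species_a, point_a = divmod(a, n_pts)  # state stored species by species
        for b in range(dim):
            species_b, point_b = divmod(b, n_pts)
            d = abs(positions_km[point_a] - positions_km[point_b])
            spatial = math.exp(-0.5 * (d / length_km) ** 2)
            loc[a][b] = spatial * (1.0 if species_a == species_b else chem_coeff)
    return loc

def schur_product(a, b):
    """Element-wise product of two matrices of identical shape."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

rng = random.Random(1)
positions_km = [0.0, 1000.0, 2000.0, 3000.0]  # hypothetical 1-D grid
n_species = 2                                  # e.g., O3 and NO2
dim = n_species * len(positions_km)
members = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(16)]

B_ens = ensemble_covariance(members)
C = localization_matrix(positions_km, n_species, length_km=1500.0, chem_coeff=0.5)
B_loc = schur_product(C, B_ens)
```

Setting `chem_coeff=0.0` removes all cross-species covariances (univariate DA), whereas `chem_coeff=1.0` leaves them damped only by distance, which mirrors the range of coefficients explored in Fig. 7.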


4.2 Model error experiments

In the previous section, we have shown how 4DEnVar is able to match or even outperform 3D-Var results when the model is perfect. However, the main interest of 4DEnVar for atmospheric chemistry arises when the model is not perfect, a case that is more easily addressed using ensemble-based methods. In this section, 4DEnVar experiments are conducted in the presence of a model error term. A typical source of uncertainty in CTMs is represented by anthropogenic or biogenic emissions (Kok, 2011; Zhao et al., 2011; Ma and van Aardenne, 2004). In the case of reactive species like NOx, erroneous emissions can strongly impact the formation of secondary species such as O3 (Lei and Wang, 2014; Sillman, 1999). Errors in surface emissions already produce complex and rich dynamics and will be used as a test bed to investigate the effects of model error in DA. Other sources of uncertainty, e.g., chemistry parameters or meteorology, will be addressed in a future study, considering that the same methodology used here can be applied.

Single-cycle DA experiments similar to those in Sect. 4.1 are conducted. The main difference with the previous section is that experiments are done here during the spin-up period of the model (days 2–4). This allows us to examine the model error estimation during the pollution build-up period, when the chemical system is in a transient phase and daily cycles of reactive gases are not yet stationary. This represents a more challenging and realistic situation for testing the bias correction procedure (Eq. 17), which is fully consistent only in stationary conditions. The true NO emissions (Table 3) are perturbed by a multiplicative factor, which is sampled from a log-normal distribution with mean and sigma equal to 0.5 and 0.8, respectively. Forecast emissions are increased by a multiplicative factor of 2.35, whereas the log-normal distribution has been used to generate emission perturbations for the ensemble of forecasts. The emissions perturbation is constant in time but not in space, due to the geographical variability of emission factors (Fig. 1).
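The emission perturbation described above can be sketched as follows. This is a minimal illustration rather than the QG-Chem implementation: it assumes that 0.5 and 0.8 are the mean and standard deviation of the underlying normal distribution (the text does not state the parameterization), and the emission values, function names and seed are hypothetical.

```python
import math
import random

N_MEMBERS = 16         # ensemble size retained in Sect. 4.1.3
MU, SIGMA = 0.5, 0.8   # assumed to describe the underlying normal distribution
FORECAST_FACTOR = 2.35 # multiplicative error applied to the forecast emissions

def emission_factors(n_members, mu=MU, sigma=SIGMA, seed=0):
    """One log-normally distributed scaling factor per ensemble member."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n_members)]

def perturb_emissions(true_emissions, factor):
    """Scale a spatial emission field by a single factor, constant in time."""
    return [factor * e for e in true_emissions]

# Hypothetical NO emissions on four grid points (arbitrary units).
true_no_emissions = [0.0, 1.2, 3.5, 0.8]

factors = emission_factors(N_MEMBERS)
ensemble_emissions = [perturb_emissions(true_no_emissions, f) for f in factors]
forecast_emissions = perturb_emissions(true_no_emissions, FORECAST_FACTOR)
```

Because one factor multiplies a spatially varying field, the resulting perturbation is constant in time but not in space. Under the assumed parameterization the median factor is exp(0.5), about 1.65, and the mean factor is exp(0.5 + 0.8²/2), about 2.27.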

The main objective of this section is to illustrate the application of the bias estimation procedure (Sect. 3.3) on chemical fields. Chemical interactions alone can already give rise to complex temporal dynamics, which can produce unexpected behavior within the typical hypotheses of most DA schemes (Tang et al., 2016). Therefore, a simplified model setup is used in this section by deactivating the advection of chemical species. This reduces QG-Chem to a collection of 0-D chemistry models and allows us to focus on the effects of model errors in chemistry. Aside from this, exactly the same configuration as before is used for 4DEnVar (initial condition error, ensemble size, localization). DA results in a more general case, when both chemistry and advection are activated, will follow in Sect. 4.3. Univariate O3 DA is presented in Sect. 4.2.1. Results of a multivariate DA experiment are presented in Sect. 4.2.2, to examine the combined

Figure 8. O3 effective model error estimation in the univariate case (only O3 assimilated). Temporal trajectories during day 2 for the forecast, the analysis, the truth, the true effective model error and the effective model error estimated using Eq. (12), for the pixel A located close to an observation. From top to bottom, the following experiments are shown, with the uncertainties being introduced (i) only in the O3 initial condition (as in Sect. 4.1), (ii) only in the forecast model (through surface emissions perturbation) and (iii) both in the O3 initial condition and in the forecast model. [Panels, top to bottom: IC error and perfect model; Model error and perfect IC; IC error and model error. Legend: Truth, Free forecast, 4DEnVar analysis, True model bias, Model bias estimation.]

effects of model error and chemical couplings. Finally, the impact of the model bias correction procedure is evaluated on 48 h forecasts of several species in Sect. 4.2.3.


4.2.1 Univariate assimilation

Three DA experiments are shown in Fig. 8: (i) activating only the initial perturbation, (ii) activating only the model error and (iii) with both perturbations activated. Only O3 observations have been assimilated and the chemical localization coefficients are set to zero to compute univariate analyses.

With the NO emissions increased by a factor of 2.35, the forecast produces higher concentrations of O3 (Fig. 8, middle plot). We also remark that the effects of model error on O3 appear later in the day, due to the photochemistry, whereas the perturbation of the initial condition has a larger effect in the first part of the day. In this specific case, it is also interesting to note that, when both perturbations are applied, the two errors compensate and cancel out the differences between the forecast and the truth later in the day. This is an example of compensating errors in atmospheric chemistry, which might be hard to detect when comparing model results to observations.

The 4DEnVar analyses agree well with the truth in all three cases, similarly to what was obtained in Sect. 4.1.1. This satisfies the main requirement for a meaningful computation of the effective model error (Sect. 3.3). The estimation of the effective model error with Eq. (12) is also displayed in Fig. 8. We remind readers that the effective model error is expressed in the same physical units as the model state, i.e., chemical concentration units (ppbv). The true effective model error is computed by subtracting the truth from a forecast initialized from the truth, and it is also added to the figures.

The effective model error is zero in the first experiment, which is correctly diagnosed by the procedure. In the second experiment, the estimated effective model error is approximately zero until 10:00 UTC and grows to positive values of about 8 ppbv later in the afternoon, which is consistent with the underlying perturbations and chemical mechanism.

The temporal average of the effective model error (Fig. 9) shows that the larger errors are diagnosed in the center of the domain, consistent with the characteristics of the emissions errors (global scaling of NO emissions). The geographical patterns and the differences from the true effective error (bottom right plot in Fig. 9) are a result of the observation locations and the localization scale. With the proposed procedure, the effective model error estimation relies on the localization scale that has been used for the 4DEnVar analysis. Therefore, the effective model error approaches zero far from assimilated observations, no matter what the spatial patterns of the true model error are. In the third experiment, the temporal behavior of the effective model error is well captured at the observed location. However, significant differences with the second experiment are visible in the spatial distribution of the error. Ideally, the estimation of the effective model error should provide exactly the same results in the second and third experiments. Differences arise because DA is not perfect due to the small ensemble size, linearization hypotheses, observation number and observation errors. Also, when adding degrees of freedom to the sources of uncertainty and keeping the observation network constant, DA becomes more challenging. For example, if observations of O3 were available only in the late afternoon, no discrimination between the two sources of error could have been achieved in the third experiment. However, thanks to the hourly frequency of O3 observations, the temporal and large-scale features of the model error have also been retrieved in the third case.

Figure 9. Temporal averages (24 h) of the estimated effective model error for the three experiments as in Fig. 8. From left to right and from top to bottom, experiments (i), (ii) and (iii) are shown. On the bottom right plot the true effective model error in the experiments (ii) and (iii) is shown (no model error was activated for the first experiment). [Panels: IC error and perfect model; Model error and perfect IC; IC error and model error; True effective model error. Color scale: O3 effective model error (ppbv). Horizontal axis ticks from 0 to 12 000.]

4.2.2 Multivariate assimilation

The effects of O3 assimilation on other chemical species are shown in Fig. 10. These were obtained by repeating the third experiment of Sect. 4.2.1 but setting the chemical localization coefficient to 0.5 instead of zero. Compared to the results discussed in Sect. 4.1.2, multivariate corrections are now very significant. For example, analyzed NO and NO2 concentrations are almost halved and the initial forecast errors greatly reduced. On the other hand, CO, which was not perturbed initially and is not strongly coupled to NO / NO2 / O3 concentrations, is not modified at all by the DA. This shows that multivariate aspects of chemical DA can be well captured by the 4DEnVar algorithm, even with a small ensemble.

The effective model error can be computed for all the variables of the state vector, and is plotted in Figs. 10 and 11. The error fields of NO2 and NO reproduce well the temporal and spatial features of the perturbation on NO emissions that was implemented in the forecast. Note that since NO is rapidly converted into NO2 during the night, the initial linear increase of the effective model error is only observable on NO2. On the other hand, during daytime, a strong presence of the effective model error is found for both NO and NO2. However, the NO2 and NO error trajectories are not linear during daytime, due to the complex photochemistry.

Strong nonlinearities of the chemical system cannot be taken into account by the current implementation of the 4DEnVar algorithm. When the NO / O3 relationship was strongly nonlinear, inaccurate analyses and, therefore, inaccurate estimations of the effective model error have been found (not shown). This difficulty could be overcome by introducing external loops within 4DEnVar, similarly to what was already done with the IEnKS algorithm (Haussaire and Bocquet, 2016). This represents the objective of a future study. Alternatively, the combined assimilation of O3 and NO2 or a more severe chemical localization can reduce the occurrence of analysis errors in nonlinear regimes.

Figure 10. Multivariate effective model error estimation. Temporal trajectories for day 2, as in Fig. 8 but for NO2, NO and CO species (not assimilated), obtained assimilating only O3. Uncertainties were both in the O3 initial condition and in the forecast model (NO emissions), corresponding to the third experiment in Fig. 8. The only difference from the previous experiment is that a non-zero multivariate localization coefficient is used here. [Legend: Truth, Free forecast, 4DEnVar analysis, True model bias, Model bias estimation.]

4.2.3 Impact on chemical forecasts

The analysis computed in the previous section has been used to initialize chemical forecasts for the following 48 h. This was done to evaluate the potential of the forecast bias correction procedure (Eq. 17). The corrected forecasts (CF) of O3, NO2 and NO are compared to the initial 3-day forecast (OF), the 48 h forecast initialized from the latest available analysis without any correction (AF) and the truth (Fig. 12).

First, the AF converges very rapidly (in about 12 h) to the OF for all species, confirming the limited effects of the state correction on the chemical forecasts for the next days (Wu et al., 2008). On the other hand, the CF is very close to the truth during the first 24 h, indicating that the hypothesis of a stationary effective model error in successive days (but hourly variable) seems appropriate in this case. During

Figure 11. Temporal averages (24 h) of the estimated effective model error (in ppbv) for the multivariate DA experiment in Fig. 10. Estimated effective model error for NO2 on top and NO on bottom. On the right, the true effective model error is displayed for both species. [Panels: NO2 effective model error; NO2 true effective model error; NO effective model error; NO true effective model error. Color scale in ppbv. Horizontal axis ticks from 0 to 12 000.]

Table 7. DA RMSEs for seven cycles of analyses/next-day forecasts where all considered species are perturbed and assimilated, and with model error enabled. εmin, εmax and εavg (in %) are the minimum, maximum and average values of the relative RMSE (Eq. 19 divided by the truth average) on the QG-Chem domain.

Species | Reanalysis, 3D-Var (εmin, εmax, εavg) | Reanalysis, 4DEnVar (εmin, εmax, εavg) | Forecast, 3D-Var (εmin, εmax, εavg) | Forecast, 4DEnVar (εmin, εmax, εavg)
O3      | 2.8, 25.7, 14.3 | 2.0, 26.4, 13.3 | 5.1, 23.9, 14.4   | 4.9, 21.7, 11.3
CO      | 1.3, 42.8, 17.1 | 1.1, 27.7, 13.1 | 6.7, 90.8, 37.3   | 7.2, 50.1, 22.1
NO2     | 7.7, 88.9, 41.2 | 4.1, 69.4, 23.3 | 23.6, 117.8, 74.0 | 11.3, 90.3, 37.2
CO2     | 3.1, 15.3, 8.6  | 1.4, 17.5, 8.6  | 1.7, 13.7, 5.7    | 1.9, 13.6, 6.9

the third day, a positive forecast correction is still achieved. However, chemical concentrations have evolved too much to be efficiently corrected using the bias estimated on the first day. Estimating and correcting the forecast tendencies, instead of the state, could provide a better result for the third day.

Accounting for model errors, either bias or tendencies, for the next-day forecast requires the model uncertainties to be stationary. A stationary error has been used in this study since surface emissions were taken constant in time. Most regional air quality models seem largely affected by stationary errors (Marécal et al., 2015) and assuming the persistence of the bias during 24 h looks reasonable. However, the main sources of uncertainty of real CTMs first need to be identified to allow a meaningful effective model error estimation. Therefore, the implementation of 4DEnVar in a real CTM is necessary to further demonstrate its potential for air quality forecasts.
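The role of stationarity in the forecast correction can be illustrated with a toy 0-D model. The sketch below is not QG-Chem and does not reproduce Eqs. (12) and (17): it assumes a single species with a hypothetical diurnal emission cycle and first-order loss, computes the true effective model error as a forecast initialized from the truth minus the truth, and assumes that the correction consists of subtracting this hourly error from the next forecast. All names and numerical values other than the factor 2.35 are illustrative.

```python
import math

K_LOSS = 0.1           # hypothetical first-order loss rate (1/h)
EMISSION_ERROR = 2.35  # multiplicative emission error of the forecast model

def emission(hour):
    """Hypothetical diurnal emission cycle, identical every day."""
    return 1.0 + 0.5 * math.sin(2.0 * math.pi * (hour % 24) / 24.0)

def run(x0, start_hour, n_hours, emission_factor):
    """Hourly explicit Euler steps of dx/dt = factor * E(t) - K_LOSS * x."""
    x, trajectory = x0, []
    for hour in range(start_hour, start_hour + n_hours):
        x = x + emission_factor * emission(hour) - K_LOSS * x
        trajectory.append(x)
    return trajectory

def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Day 1: truth and a forecast started from the true state with wrong emissions.
x_start = 2.0
truth_day1 = run(x_start, 0, 24, 1.0)
forecast_day1 = run(x_start, 0, 24, EMISSION_ERROR)
bias_hourly = [f - t for f, t in zip(forecast_day1, truth_day1)]

# Next 48 h, started from a perfect analysis at the end of day 1.
truth_next = run(truth_day1[-1], 24, 48, 1.0)
forecast_next = run(truth_day1[-1], 24, 48, EMISSION_ERROR)
corrected_next = [f - bias_hourly[i % 24] for i, f in enumerate(forecast_next)]

rmse_raw_day2 = rmse(forecast_next[:24], truth_next[:24])
rmse_cor_day2 = rmse(corrected_next[:24], truth_next[:24])
rmse_raw_day3 = rmse(forecast_next[24:], truth_next[24:])
rmse_cor_day3 = rmse(corrected_next[24:], truth_next[24:])
```

In this linear toy case the correction removes the error of the first 24 h, because the error started from a correct initial state repeats identically from one day to the next. During the following 24 h the forecast error keeps evolving from a non-zero value, so the hourly bias of the first day only partially corrects it, which is qualitatively the behavior described above for the third day.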

4.3 Statistical comparison between 4DEnVar and 3D-Var

Single-window DA experiments have been examined so far, to better analyze new aspects of 4DEnVar DA for atmospheric chemistry. In this section, DA experiments are conducted for multiple consecutive days, to provide a statistically robust comparison between 3D-Var and 4DEnVar. A general case including initial condition and model errors for all four species is considered, and both reanalyses and 24 h forecasts for the next day are evaluated. The same values as before (multivariate experiments in Sect. 4.1.2) are used for the initial perturbation and for the algorithm settings. Com-

Geosci. Model Dev., 9, 39333959, 2016 www.geosci-model-dev.net/9/3933/2016/

E. Emili et al.: QG-Chem 3955

TruthFree forecast 4DEnVar analysis Forecast

Bias-corrected forecast

Assimilation window Forecast

daily cycles are statistically independent. This is done to increase the experiment statistics, without having to deal with the propagation of the analysis covariance through consecutive cycles. This aspect will be investigated in a future study.

The relative RMSE between the experiments and the truth is computed at each grid point for the 7-day period as in Eq. (19). Results are summarized in Table 7: minimum, maximum and average values of the relative RMSEs computed over the QG-Chem domain are reported for 3D-Var and 4DEnVar experiments.

We conrm preceding ndings concerning the reanalyses capabilities of 4DEnVar, which provides superior results to 3D-Var for all species. Results are particularly good for species strongly related to emissions (CO and NO2), which show reduced average RMSE values compared to 3D-Var results. This is a consequence of precisely accounting for emission uncertainties within 4DEnVar. A similar result was obtained by Haussaire and Bocquet (2016). They showed that by using an ensemble forecast of the meteorology, thus partially accounting for model error, the root mean square error of IEnKS (a nonlinear 4DEnVar method) on the low-order online tracer model L95-T was improved by 25 to 50 %.Moreover, the 4DEnVar maximum RMSE is also similar or lower than in 3D-Var reanalyses, which suggests that the occurrence of degraded results compared to 3D-Var (negative bars in Fig. 5) is not systematic.

Similar conclusions can be drawn for the RMSE of the next-day forecast. We remind readers that the bias correction procedure has been used with 4DEnVar, whereas no correction is applied with 3D-Var. CO and NO2 forecasts, show signicantly lower RMSE with 4DEnVar, due to the forecast bias correction. O3 forecast improvements are also observed, even if smaller than those for the two precursors. With 4DEn-Var, forecasts of CO2 are slightly worse than with 3D-Var.

Since CO2 is not affected by the considered model error and it is not chemically coupled to other species, the relative bias correction term should be strictly equal to zero. However, the small ensemble size and use of localization introduce statistical noise in the effective model error estimation, which can impact the forecast correction. This issue can be mitigated by selectively setting to zero the chemical localization coefcient between species that are chemically uncoupled. However, results remained on par with 3D-Var in this study.

We can conclude that the forecast of species related to surface emissions, either directly or through chemical couplings, can be signicantly improved when the model error is considered within DA. The forecasts of CO2, which is a passive tracer in our study, seem instead dominated by the number of assimilated observation, no matter which DA algorithm is employed. However, this is not the case in real applications, where CO2 concentration is modulated by uncertain anthropogenic emissions and natural sinks.

We nally remark that EnKF, which also takes advantage of the information available from the ensembles, could have represented an alternative and possibly more accurate refer-

www.geosci-model-dev.net/9/3933/2016/ Geosci. Model Dev., 9, 39333959, 2016

NO2

Figure 12. Impact of model bias correction on 48 h forecasts. Temporal trajectory of free forecast (3 days), analysis (24 h), truth and 48 h long forecasts initialized from the latest available analysis, for the pixel A in Fig. 1. DA of O3 is performed only during the rst 24 h (as in Fig. 10). Curves are displayed in dark blue when no bias correction is applied to the nal forecast and are salmon colored when the effective model error estimation in Fig. 10 is used to correct the nal forecast (Eq. 17).

pared to Sect. 4.2, all emissions in Table 3 are now perturbed, using a log-normally distributed scaling factor.

Synthetic observations are generated from a 7-day truth simulation. The four species are assimilated for a period of 7 days, starting on day 20. However, at the end of each cycle of 24 h (forecast, analysis and next-day forecast), the initial condition is reinitialized with the truth concentrations. Initial condition perturbations are recomputed, advancing the seed of the pseudo-random generator. Therefore, the seven

rect precursor species (e.g., NO and NO2), based only on hourly observations of secondary species (O3).

Using DA windows that are long enough (24 h in this study), 4DEnVar allowed to account for the different timescales of the chemical mechanism and the corresponding effects in DA. For example, the delayed impact of NO emission errors in afternoon O3 concentrations was correctly accounted for by 4DEnVar. The contribution of model errors to the 4-D chemical state can be estimated at the cost of an additional forecast, providing quantitative information on possible forecast biases, and, in the case of stationary errors, a method to correct next-day forecasts. This has been tested with success in several independent DA windows, showing that 24 h forecasts of NO2 and CO can be twice as accurate with 4DEnVar as with 3D-Var. Better results have been also found for O3 forecasts.

We conclude that 4DEnVar is potentially of high interest for chemical DA. We determine the main benets to be the implicit specication of a ow-dependent and multivariate error covariance matrix, the possibility of accounting for model errors through stochastic perturbation and the opportunity for using DA windows long enough to catch typical features of model errors in air quality models. The computational cost of 4DEnVar is higher than 3D-Var, but a small ensemble of at least 15 to 20 members remain affordable within most operational centers.

The application to a real CTM remains necessary to evaluate if the advantages of 4DEnVar shown in this study hold in real applications, where model uncertainties are not perfectly known, as well as observational ones. Aspects related to the vertical propagation of the information, which have been neglected with QG-Chem, could also represent an additional challenge in real systems. Finally, research on algorithmic aspects of 4DEnVar is needed to implement more accurate localization operators, to account for nonlinear chemical regimes and to correctly propagate the analysis covariance through successive DA windows. QG-Chem could represent a useful tool for these type of studies.

6 Code and data availability

The QG-Chem code is copyright of the CERFACS laboratory. The sources and the data used in this study are available upon request to E. Emili ([email protected]) or D. Cariolle ([email protected]).

Acknowledgements. We acknowledge Y. Trmolet and M. Fisher for providing the OOPS DA library and the QG model sources. We thank Philippe Moinat for the ASIS model sources and technical help. We nally thank Grald Desroziers for the useful discussions on the 4DEnVar algorithm and the anonymous referees for their contribution in improving the manuscript. This work was supported by the HERMES project, funded by the French LEFE INSU program.

Geosci. Model Dev., 9, 39333959, 2016 www.geosci-model-dev.net/9/3933/2016/

3956 E. Emili et al.: QG-Chem

ence than 3D-Var for this study. However, it is signicantly more costly than 3D-Var, it introduces some difculties like the denition of an optimal ination procedure (Constantinescu et al., 2007a; Gaubert et al., 2014) and was not yet available in the OOPS DA library at the time of the study. A more comprehensive comparison of DA schemes of increasing complexity and cost within QG-Chem is left for a future study, the present being focused especially on the bias correction procedure using 4DEnVar.

5 Conclusions

The objectives of this study were (i) to develop a new toy model framework to test advanced DA algorithms for atmospheric chemistry and (ii) demonstrate the potential of the 4DEnVar method for air quality or more general tropospheric chemistry applications. In particular, we addressed the questions of how to jointly assimilate chemical species with very different lifetimes and possible chemical couplings, and how to account for model error to improve forecasts for the next day.

An atmospheric chemistry reduced-order model (QGChem) has been developed, based on quasi-geostrophic meteorology and a detailed tropospheric chemistry scheme. It has been used to simulate the complex spatiotemporal patterns of reactive gases (NOx, O3) and long-lived species (CO,

CO2) under the effects of emissions, chemistry and transport.

QG-Chem has been coupled to a generic library for data assimilation (OOPS) and has been proven to be well suited to perform a large number of DA experiments. Concerning temporal aspects and assimilated observations, the experiments have been designed based on the implementation of chemical DA in operational air quality forecast centers.

A number of DA experiments have been conducted to compare 4DEnVar analyses with 3D-Var analyses in a perfect model hypothesis, in a univariate and a multivariate setting.The sensitivity of 4DEnVar results to the ensemble size and localization method was also assessed. Results with 4DEn-Var are generally better for all chemical species even when using a small ensemble size of 16 members, provided that ensemble localization, even if basic, is applied. This suggests that considering the linearized model dynamics to derive a ow-dependent background error covariance can be benecial for chemical reanalyses. Thanks to 4DEnVar, this can be obtained without need of tangent linear and adjoint codes of the complex CTM. Multivariate effects were found to be not signicant when a perfect model is used, suggesting that multivariate DA goes together with model error for atmospheric chemistry applications.

Clear advantages of using an ensemble method, which is significantly more costly than 3D-Var, have been found when model errors were introduced. It has been shown that 4DEnVar is able to take into account heterogeneous errors in chemical emissions and complex chemical couplings to cross cor-


Edited by: S. Remy
Reviewed by: two anonymous referees




Details

Title: Accounting for model error in air quality forecasts: an application of 4DEnVar to the assimilation of atmospheric composition using QG-Chem 1.0
Author: Emili, Emanuele; Gürol, Selime; Cariolle, Daniel
Pages: 3933-3959
Publication year: 2016
Publication date: 2016
Publisher: Copernicus GmbH
ISSN: 1991962X
e-ISSN: 19919603
Source type: Scholarly Journal
Language of publication: English
DOI: https://doi.org/10.5194/gmd-9-3933-2016
ProQuest document ID: 1836999769
