## 1. Introduction

In the field of data assimilation the key to success is knowing the errors, as functions of time and spatial location, in either the data or the computed values. It is rare that one knows such spatial and temporal information about the errors in the data, but wavelet analysis can give us this information for the errors of the computed values, and hence a mechanism for determining an appropriate weighting between the data and the computed values.

We will illustrate the use of wavelet analysis in a new technique to assimilate data into a computational model. The new method is fast and efficient, and we believe that it surpasses other methods of comparable computational cost. The new method is called EEWADAi (Error Estimation using Wavelet Analysis for Data Assimilation). We should explain that the acronym EEWADAi for our new technique has meaning in the Japanese language. It means roughly, “a good topic of conversation,” which we hope our method will become.

The basic idea behind the new method is to use wavelet analysis to detect errors in the computational domain. Based on this fast real-time error analysis, we decide how to weight computed information against observed information. Simply stated, if the computational errors are large in a certain region of the computational domain at a certain time, then the observed data will receive a relatively large weight. On the other hand, if the computational errors are small, then the computed information will be considered relatively reliable in this region and the observed data will be assigned a relatively small weight.

The classical framework is the Kalman filter (KF), whose analysis step updates the model state according to

**x** = **x**^{−} + **K**(**y** − **E x**^{−}),

where **x** is the model state vector, **y** is the observations, **E** is the observation matrix, and **K** is the Kalman gain matrix. The principal limitations of the KF are 1) its heavy computational cost [solution of the error prediction (Riccati) equation is *O*(*n*^{2}), versus *O*(2*n*) for integration of the model stepping forward in time], 2) its linear assumption of the model dynamics and the Gaussian statistics, and 3) the lack of precise knowledge of the model error. These limitations have led modelers to invent a variety of new techniques for practical implementation: the extended Kalman filter (EKF), the ensemble Kalman filter, optimal interpolation (OI), and nudging (N). The EKF improves on the KF by applying linearization at each time step about the most recent solution. However, because the computational requirements are still high, it was suggested that the error prediction equation be solved using the Monte Carlo method, since only a few hundred model states may be sufficient for reasonable statistical convergence (Evensen 1994). A more drastic simplification can be achieved by OI, which allows observations to influence only nearby points (Ezer and Mellor 1994). Finally, the most heuristic and simple scheme, N, chooses an ad hoc weighting function that decays in time and space (Malanotte-Rizzoli and Holland 1986). In view of applying the assimilation scheme to an OGCM, where the size *n* of the state vector is *O*(10^{8}) and larger, a practical choice is naturally OI or N. The principal defect of these methods is that they do not reflect the evolution or the propagation of model error in space and time. This led us to think of a diagnostic way of error detection rather than the prognostic way of solving the Riccati equation. EEWADAi thus provides us with a tool to diagnostically detect the error at each time step and use the detected error to determine the weighting matrix **K**.

The new method is very straightforward: the numerical values of the matrix **K** are determined directly from the wavelet-detected errors.

This paper begins by introducing the fundamentals of wavelet analysis and how it is applied in the discrete world of a computer in section 2. Next, a review is given on how numerical methods are fundamentally created in section 3. It is the relationship between how wavelet methods are constructed and how numerical methods are constructed that is fundamental in obtaining an estimate of numerical errors. After our discussion of numerical errors obtained by wavelet analysis, we propose ways to obtain estimates of local error variance in section 4. In section 5 we show how our estimate of numerical error variance can be used to construct a Kalman gain matrix for use in Kalman filtering. Finally, in section 6, we test our new method on a Kuroshio model and compare the result to that obtained by a traditional method—optimal interpolation. A conclusion follows.

## 2. Wavelet analysis

Possibly the most instructive way to think of wavelets is in contrast to traditional analysis techniques such as Fourier analysis. With Fourier analysis we analyze discrete or continuous data using basis functions which are global, smooth, and periodic. This analysis yields a set of coefficients, say, *a*_{k}, which gives the amount of energy in the data at frequency *k.* Wavelet analysis, by contrast, analyzes data with basis functions which are local, slightly smooth, not periodic, and which vary with respect to scale and location. Wavelet analysis thereby produces a set of coefficients *b*_{j,k} which give the amount of energy in the data at scale *j* and location *k.* Wavelet analysis can serve as a good complement to Fourier analysis. In fact, data which are efficiently analyzed with Fourier analysis often are not efficiently analyzed with wavelet analysis and the opposite situation also holds.

For our purposes here we will confine our discussion to the so-called orthogonal wavelets and specifically the Daubechies family of wavelets. The orthogonality property leads to a clear indication when data deviates from a low-order polynomial, the importance of which will become clear when we discuss numerical methods.

### a. Theoretical background in the continuous world

*ϕ*(

*x*), the scaling function, and

*ψ*(

*x*), the wavelet. The scaling function is the solution of the dilation equation, which carries the name “dilation equation” since the the independent variable

*x*appears alone on the left-hand side but is multiplied by 2, or dilated, on the right-hand side. One also requires the scaling function

*ϕ*(

*x*) be normalized:

^{co}

_{−∞}

*ϕ*(

*x*)

*dx*= 1. The wavelet

*ψ*(

*x*) is defined in terms of the scaling function,

*ϕ*(

*x*) and

*ψ*(

*x*) by dilating and translating to get the following functions: where

*j, k*∈

*Z.*The dilation parameter is

*j,*and

*k*is the translation parameter.

*ϕ*

^{j}

_{k}

*x*) and

*ψ*

^{j}

_{k}

*x*) over the parameter

*k,*with

*j*fixed, be denoted by

*V*

_{j}and

*W*

_{j}respectively, The spaces

*V*

_{j}and

*W*

_{j}are related by, where the notation

*V*

_{0}=

*V*

_{1}⊕

*W*

_{1}indicates that the vectors in

*V*

_{1}are orthogonal to the vectors in

*W*

_{1}and the space

*V*

_{0}is simply decomposed into these two component subspaces.

The filter coefficients *H* = {*h*_{k}}^{L−1}_{k=0} and *G* = {*g*_{k}}^{L−1}_{k=0} are related by *g*_{k} = (−1)^{k} *h*_{L−k} for *k* = 0, ..., *L* − 1. All wavelet properties are specified through the parameters *H* and *G.* If one’s data are defined on a continuous domain such as *f*(*x*), where *x* ∈ *R* is a real number, then one uses *ϕ*^{j}_{k}(*x*) and *ψ*^{j}_{k}(*x*) to perform the wavelet analysis. If, on the other hand, one’s data are defined on a discrete domain such as *f*(*i*), where *i* ∈ *Z* is an integer, then the data are analyzed, or filtered, with the coefficients *H* and *G.* In either case, the scaling function *ϕ*(*x*) and its defining coefficients *H* detect localized low-frequency information, that is, they are low-pass filters, and the wavelet *ψ*(*x*) and its defining coefficients *G* detect localized high-frequency information, that is, they are high-pass filters. Specifically, *H* and *G* are chosen so that dilations and translations of the wavelet, *ψ*^{j}_{k}(*x*), form an orthonormal basis of *L*^{2}(*R*) and so that *ψ*(*x*) has *M* vanishing moments, which determines the accuracy. In other words, *ψ*^{j}_{k}(*x*) will satisfy

∫^{∞}_{−∞} *ψ*^{j}_{k}(*x*) *ψ*^{j}_{l}(*x*) *dx* = *δ*_{kl},

where *δ*_{kl} is the Kronecker delta function, and the accuracy is specified by requiring that *ψ*(*x*) = *ψ*^{0}_{0}(*x*) satisfy

∫^{∞}_{−∞} *x*^{m} *ψ*(*x*) *dx* = 0

for *m* = 0, ..., *M* − 1. This statement on accuracy can be explained in an alternative manner. Recall that within the scaling function subspaces *V*_{j} one can reconstruct low-order polynomials exactly up to a given order so that *x*^{0}, *x*^{1}, ..., *x*^{M−1} can be represented exactly by appropriately choosing scaling function coefficients. Accordingly, the subspace *W*_{j} (which is orthogonal to *V*_{j}) and consequently the basis elements *ψ*^{j}_{k}(*x*) of *W*_{j} will be orthogonal to all elements contained in *V*_{j}, such as these low-order polynomials. So, one can see that by specifying the number of vanishing moments of the wavelets, one has in effect specified the number of polynomials which can be represented exactly and hence the numerical accuracy of the method.

Since the wavelets form an orthonormal basis of *L*^{2}(*R*), one can see from the above expressions that for any function *f*(*x*) ∈ *L*^{2}(*R*) there exists a set {*d*_{jk}} such that

*f*(*x*) = Σ_{j,k} *d*_{jk} *ψ*^{j}_{k}(*x*), where *d*_{jk} = ∫^{∞}_{−∞} *f*(*x*) *ψ*^{j}_{k}(*x*) *dx.*

The two sets of coefficients *H* and *G* are known as quadrature mirror filters. For Daubechies wavelets the number of coefficients in *H* and *G,* or the length of the filters *H* and *G,* denoted by *L,* is related to the number of vanishing moments *M* by 2*M* = *L.* For example, the famous Haar wavelet is found by defining *H* as *h*_{0} = *h*_{1} = 1. For this filter, *H,* the solution to the dilation Eq. (2), *ϕ*(*x*), is the box function: *ϕ*(*x*) = 1 for *x* ∈ [0,1] and *ϕ*(*x*) = 0 otherwise. The Haar function is very useful as a learning tool, but because of its low order of approximation accuracy and lack of differentiability, it is of limited use as a basis set. The coefficients *H* needed to define compactly supported wavelets with a higher degree of regularity can be found in Daubechies (1988). As is expected, the regularity increases with the support of the wavelet. The usual notation to denote a Daubechies-based wavelet defined by coefficients *H* of length *L* is *D*_{L}.

### b. Practical implementation in the discrete world

Naturally, infinite sums and integrals are meaningless when one begins to implement a wavelet expansion on a computer. One must find appropriate ways to implement discrete counterparts to the continuous operations, which were outlined in the previous section. That is, nothing is continuous on a computer, and since the original wavelet analysis was derived in the continuous world of differential and integral mathematics it is necessary to discuss a discrete version of the above continuous theory. Generally, operations such as integration are easily approximated with an appropriate-order quadrature formula, but one would like to use as few quadratures as possible to limit the number of approximations that are made. We will see in this section how we can easily perform all the wavelet decompositions with relatively few approximations.

In practice one retains only a finite range of scales: scale *j* = 0 is arbitrarily chosen as the finest scale required, and scale *J* is the coarsest scale, at which a kind of local average, *ϕ*^{J}_{k}(*x*), provides sufficient large-scale information; that is, the first term in Eq. (14) provides the local mean around which the function oscillates.

One must also limit the range of the location parameter *k.* Assuming periodicity of *f*(*x*) implies periodicity in *k* of all wavelet coefficients *s*^{j}_{k} and *d*^{j}_{k}. For the nonperiodic case, since *k* is directly related to the location, a limit is imposed on the values of *k* when the location being addressed extends beyond the boundaries of the domain.

The scaling function coefficients *s*^{j}_{k} and wavelet coefficients *d*^{j}_{k} at a given scale are computed from the scaling function coefficients at the next finer scale by the filter relations

*s*^{j+1}_{k} = Σ^{L−1}_{n=0} *h*_{n} *s*^{j}_{n+2k},  *d*^{j+1}_{k} = Σ^{L−1}_{n=0} *g*_{n} *s*^{j}_{n+2k},

where *h*_{n} refers to the chosen filter while we have *g*_{n} = −(−1)^{n} *h*_{L−n}.
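These filter relations are easy to implement directly. The sketch below uses the *D*_{4} filter in the common orthonormal normalization (Σ *h*^{2}_{k} = 1) with the sign convention *g*_{k} = (−1)^{k} *h*_{L−1−k}, which differs slightly from the conventions quoted above; both choices give equivalent transforms.

```python
import numpy as np

# Daubechies D4 low-pass filter, orthonormal convention (sum h_k^2 = 1).
# Note: the paper's Haar example h0 = h1 = 1 uses a different normalization.
SQRT3 = np.sqrt(3.0)
H = np.array([1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3]) / (4.0 * np.sqrt(2.0))
# Quadrature mirror (high-pass) filter.
G = np.array([(-1) ** k * H[len(H) - 1 - k] for k in range(len(H))])

def decompose(s, h=H, g=G):
    """One pyramid step for a periodic signal:
    s^{j+1}_k = sum_n h_n s^j_{n+2k},  d^{j+1}_k = sum_n g_n s^j_{n+2k}."""
    n = len(s)
    s_next = np.zeros(n // 2)
    d_next = np.zeros(n // 2)
    for k in range(n // 2):
        for m in range(len(h)):
            idx = (2 * k + m) % n      # wrap around: periodic boundaries
            s_next[k] += h[m] * s[idx]
            d_next[k] += g[m] * s[idx]
    return s_next, d_next
```

With the convention (used below) that the raw data serve as the finest-scale coefficients, repeated calls to `decompose` produce **s**_{1}, **d**_{1}, **s**_{2}, **d**_{2}, and so on; the two vanishing moments of *D*_{4} make the detail coefficients vanish on locally linear data.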

These relations can be collected into a decomposition matrix **P**^{j,j+1}_{N}, where *N* is the length of the coefficient vector at scale *j.* The matrix **P**^{j,j+1}_{N} takes scaling function coefficients at scale *j* to scaling function and wavelet function coefficients at scale *j* + 1; that is, **P**^{j,j+1}_{N} projects **s**_{j} onto **s**_{j+1} and **d**_{j+1}:

[**s**_{j+1}; **d**_{j+1}] = **P**^{j,j+1}_{N} **s**_{j},

where **s**_{j} refers to the vector containing the coefficients at scale *j.* Note that the vectors at scale *j* + 1 are half as long as the vectors at scale *j.*

Suppose, for example, that one is using the *D*_{4} wavelet and that one wants to project from eight scaling function coefficients at scale *j* to four scaling function coefficients at scale *j* + 1 and four wavelet coefficients at scale *j* + 1. The decomposition matrix for the case of periodic boundary conditions, **P**^{j,j+1}_{8}, then contains the filter *H* in its upper four rows and the filter *G* in its lower four rows, each row shifted by two entries relative to the previous one and wrapped periodically. In practice it is the same matrix **P**^{j,j+1} that is applied directly to the data and then directly to each level of scaling function coefficients. To be perfectly correct one would first approximate the scaling function coefficients at the finest scale using the raw data; however, in practice it seems to make very little difference if one simply considers the raw data to be the scaling function coefficients. So, for our purposes here we will simply use the raw data as the scaling function coefficients on the finest scale. The repeated application of the matrix **P**^{j,j+1} yields the wavelet coefficients at the various scales, and it is these wavelet coefficients that provide a guide to the errors committed during the numerical calculation, as will be further illustrated in the next paragraph and explained precisely in the next section.

Let us assume that the raw data are given and taken to be the scaling function coefficients on the finest scale, **s**_{0}. One wavelet decomposition yields the scaling function coefficients and wavelet coefficients at scale *j* = 1, **s**_{1} and **d**_{1}. A second application of the wavelet decomposition matrix will yield the vectors **s**_{2} and **d**_{2}. It is the vectors **d**_{1}, **d**_{2}, ... that yield the critical information on the numerical errors. If, for example, one sees that the values of **d**_{1} are relatively large in the middle of the vector, then it is clear that within this one-dimensional vector the largest errors will be in the middle of the one-dimensional domain from which this vector was derived. What we care about most is the relative errors being committed, but we also have some interest in the absolute errors, the subject of the next section.
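As an illustration of how the vectors **d**_{1}, **d**_{2}, ... localize trouble spots, the following sketch applies two Haar decomposition steps (orthonormal convention *h*_{0} = *h*_{1} = 1/√2) to a smooth signal with a kink in the middle of its domain; the kink location and amplitude are illustrative choices.

```python
import numpy as np

def haar_level(s):
    """One Haar step: local averages s_{j+1} and local differences d_{j+1}
    (orthonormal convention h0 = h1 = 1/sqrt(2))."""
    s = np.asarray(s, dtype=float)
    return (s[0::2] + s[1::2]) / np.sqrt(2.0), (s[0::2] - s[1::2]) / np.sqrt(2.0)

# Smooth signal with an artificial kink in the middle of the domain.
x = np.linspace(0.0, 1.0, 64)
f = np.sin(2.0 * np.pi * x) + np.where(np.abs(x - 0.5) < 0.05, 0.5, 0.0)

s0 = f                       # raw data taken as finest-scale coefficients
s1, d1 = haar_level(s0)      # first decomposition: s_1, d_1
s2, d2 = haar_level(s1)      # second decomposition: s_2, d_2

# The largest |d_1| entries sit near the middle of the vector, flagging the
# kink: exactly where a numerical method would commit its largest errors.
```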

## 3. Estimating computational errors

As outlined in the previous section, scaling functions are designed to approximate low-order polynomials exactly up to a given order, and wavelets are orthogonal to these same low-order polynomials. Any deviation from low-order polynomial structure in a computational domain can then be detected by wavelet analysis (see Jameson 1998). This measure of deviation from low-order polynomial structure is exactly what is needed to measure computational error. The reason for this is that fundamentally all nonspectral numerical schemes are constructed from low-order algebraic polynomials, and such schemes are exact if the data fall exactly on a low-order polynomial. Actually, spectral methods also follow this same rule, but in the case of Fourier spectral methods, the polynomials are trigonometric and not algebraic and are global instead of local. To be precise, let us review the fundamentals of how numerical methods are constructed and how wavelet analysis can detect errors in a computational scheme.

Numerical schemes for the approximation of partial differential equations on a computer provide a mechanism for taking one set of *N* numbers to another set of *N* numbers. To make this transition from one set to the next set, we must take derivatives but the data are a set of points that are not connected. We must, therefore, choose some type of function with which we can connect these points so that we can take a derivative. There are many choices that can be made, but fundamentally we are always working with some type of polynomial be it algebraic, trigonometric, or other.

Simply said, one can generate differencing coefficients by first interpolating a polynomial through a set of data, then differentiating this polynomial and evaluating it at a grid point. As long as the number of grid points exceeds the order of the polynomial by one, the interpolation is unique and the differencing coefficients are likewise unique.

To be concrete, consider the Lagrange interpolation polynomials *A*_{j}(*x*), defined by the property *A*_{j}(*x*_{k}) = *δ*_{jk}. For given values *w*_{0}, *w*_{1}, ..., *w*_{n}, the polynomial

*p*_{n}(*x*) = Σ^{n}_{k=0} *w*_{k} *A*_{k}(*x*)

is in *P*_{n} (the space of polynomials of order *n*) and takes on these values at the points *x*_{i}: *p*_{n}(*x*_{k}) = *w*_{k} for *k* = 0, 1, ..., *n,* and it is the truncation error of this interpolation that determines the order of accuracy that one obtains when differentiating.

The truncation error of the interpolation is

*f*(*x*) − *p*_{n}(*x*) = [*f*^{(n+1)}(*ξ*)/(*n* + 1)!] Π^{n}_{i=0}(*x* − *x*_{i}),

where *ξ* lies between the smallest and the largest *x*_{i}. Note that when one builds differencing operators after interpolation, the remainder term involving *f*^{(n+1)}(*ξ*) and the grid points *x*_{i} goes from a term of the form

Π^{n}_{i=0}(*x* − *x*_{i})

to a term of the form

(*d*/*dx*) Π^{n}_{i=0}(*x* − *x*_{i}),

evaluated at a grid point. In other words, if the computational data are exactly a polynomial of order *n* or less, then the remainder term is zero and there is no error in the computation.

For example, fit a quadratic *p*_{2}(*x*) to a function *f*(*x*) at the grid points *x*_{0} < *x*_{1} < *x*_{2}: *p*_{2}(*x*_{i}) = *f*(*x*_{i}), *i* = 0, 1, 2. The remainder term for some *ξ,* *x*_{0} ≤ *ξ* ≤ *x*_{2}, is

*f*(*x*) − *p*_{2}(*x*) = [*f*′′′(*ξ*)/3!](*x* − *x*_{0})(*x* − *x*_{1})(*x* − *x*_{2}).

Now, differentiate and evaluate at *x* = *x*_{1} to get

*f*′(*x*_{1}) − *p*′_{2}(*x*_{1}) = [*f*′′′(*ξ*)/3!](*x*_{1} − *x*_{0})(*x*_{1} − *x*_{2}) = −*h*^{2} *f*′′′(*ξ*)/6,

where *h* = *x*_{1} − *x*_{0} = *x*_{2} − *x*_{1}. If the grid is evenly spaced, then the differences (*x*_{i} − *x*_{j}) are some integer multiple of the smallest difference, which one can denote by *h.* If one doubles the number of grid points, then each of the distances (*x*_{i} − *x*_{j}) becomes half as large and the order of accuracy for this quadratic example will be 2. To be specific, if the computational data are a quadratic polynomial or less, then the remainder is zero and the computation has no error.
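The quadratic example can be checked numerically: the central difference built from this interpolant inherits the −*h*^{2} *f*′′′(*ξ*)/6 remainder, so its error shrinks by roughly a factor of 4 when *h* is halved, and vanishes entirely on quadratic data. The function and evaluation point below are arbitrary illustrations.

```python
import numpy as np

def central_diff(f, x, h):
    """Second-order central difference obtained from the quadratic interpolant."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x0 = 0.7                                   # arbitrary evaluation point

# For f = sin, the error behaves like h^2 |f'''| / 6: halving h cuts it ~4x.
e1 = abs(central_diff(np.sin, x0, 0.10) - np.cos(x0))
e2 = abs(central_diff(np.sin, x0, 0.05) - np.cos(x0))

# A quadratic is differentiated exactly: the remainder term vanishes.
q = lambda x: 3.0 * x**2 - 2.0 * x + 1.0
```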

In practice, however, computational data will rarely be an exact low-order polynomial; therefore, there is always error. The size of this computational error will depend on the deviation from low-order polynomials and can be readily measured with wavelet analysis. If, on the other hand, the data are exactly a low-order polynomial and the numerical method is “exact,” then wavelets of the corresponding order of accuracy will be orthogonal to the data and all wavelet coefficients will be zero indicating, correctly, that the calculation has no error.

To see how wavelet analysis measures this deviation, perform the wavelet analysis of *f*(*x*); one can derive the following expression (see Strang and Nguyen 1996):

*d*^{j}_{k} = ∫^{∞}_{−∞} *f*(*x*) *ψ*^{j}_{k}(*x*) *dx,*  (27)

and, because the first *M* moments of *ψ* vanish, the magnitude of *d*^{j}_{k} is controlled by the size of the *M*th derivative of *f* over the support of *ψ*^{j}_{k}. From Eq. (27) we find that if *f*(*x*) behaves like a polynomial of an order less than *M* inside the small interval, then *d*^{j}_{k} vanishes.

Thus, by considering the magnitude of *d*^{j}_{k}, one obtains a local measure of the deviation from low-order polynomial behavior and hence an estimate of the local numerical error.

## 4. Error variance and model variation

Wavelet analysis can detect both the local error in a numerical calculation and the model variation. In a sense, the estimate of local error is a precise concept based on deviation from a local low-order polynomial, whereas model variation is a less precise measure of localized frequency content. Building the error variance from the error estimate is just an averaging process.

### a. Wavelet detection of error variance

We use the term *local* since the ability to estimate errors locally in both space and time is one of the key strong points of EEWADAi. The first point to note is that the average error will be zero. That is, the local inner product between the wavelet and the data, indicating the local error, will certainly sometimes be positive and sometimes negative and on average will be zero. The next step is to decide which errors to include when estimating the local variance. One can average the squared errors in time or in space. From a practical point of view, we will find a local average of the squared errors in space. Therefore, in order to estimate the error variance around, say, grid point (*x*_{I}, *y*_{J}, *z*_{K}), one sums in the following manner:

*σ*^{2}(*I,J,K*) = *c* Σ^{I+d}_{i=I−d} Σ^{J+d}_{j=J−d} Σ^{K+d}_{k=K−d} *w*(*i,j,k*)^{2},

where *w*(*i,j,k*) is the wavelet coefficient at grid point *i, j, k.* Here, *d* is a small number used to define a small averaging box around the point of interest, and *c* is a constant that is needed to scale the wavelet transform to the problem at hand.

In this manner, we can obtain a local and very computationally efficient estimate of local error variance.
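A minimal sketch of this averaging box for a two-dimensional field of wavelet coefficients (interior grid points assumed for brevity; the three-dimensional case simply adds a sum over *k*):

```python
import numpy as np

def local_error_variance(w, I, J, d=1, c=1.0):
    """Sum the squared wavelet coefficients over the (2d+1) x (2d+1) box
    centered on grid point (I, J), scaled by the constant c.
    Interior points are assumed; boundary handling is omitted."""
    box = w[I - d:I + d + 1, J - d:J + d + 1]
    return c * np.sum(box ** 2)
```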

### b. Wavelet detection of model variation

As explained in previous sections of the paper, wavelet analysis breaks up data into local frequency components. That is, at a given physical space location one can obtain an estimate of the various scales of information present in the vicinity of this physical space location. Using the previously defined notation, we have the “frequency” or “scale” boxes *W*_{1}, *W*_{2}, *W*_{3}, ... that contain this local frequency content information. Furthermore, “variation” in a model will appear as a localized oscillation at some scale and this information will appear as a large wavelet coefficient in one of the frequency boxes. That is, there is a one-to-one correspondence between model variation at a given scale and the wavelet energy at that same scale. In fact, variation is exactly what wavelet analysis detects.

### c. When variation and error are *not* the same

In a word, where there is *numerical error* there will be variation, but variation does not imply numerical error. Roughly speaking, one can make the argument based on scale information. For example, if one observes wavelet energy in the finest scale box, *W*_{1}, then this energy will indicate that the gridpoint density for the numerical method in a given region of the domain is not sufficient and that numerical errors are committed in this region. Certainly this same *W*_{1} box energy indicates local high-frequency information or local variation. On the other hand, if one observes energy in the coarser scale box *W*_{3}, then it will not necessarily indicate an insufficient gridpoint density but, again, it will indicate that variation occurs at scale 3. To be more precise, suppose that in the vicinity of grid point *x*_{k} energy is present in box *W*_{3} but not in box *W*_{1}; this will indicate variation but not numerical error. On the other hand, if energy is present in box *W*_{1}, then this will indicate both numerical error and variation.

In summary, we note that other techniques such as OI measure only model variation, and this measure of variation is far less precise than the model variation measure that is given by wavelet analysis. In addition, wavelet analysis gives a precise measure of the error committed in the numerical calculation. Both of these estimates, the error estimate and the variation estimate, make wavelet analysis a very powerful and useful tool.

## 5. Reduced Kalman gain

As in the introduction, the analysis equation is

**x** = **x**^{−} + **K**(**y** − **E x**^{−}),

where **x** is the analyzed model state vector, **x**^{−} is the first-guess model state vector, **K** is the Kalman gain matrix, **y** is the observation vector of length *N,* and **E** is the observation matrix. In the reduced state, the model state vector contains the *M* model values to be corrected together with the *N* first-guess values at the observation points; the resulting dimension *M* + *N* will be substantially smaller than the full model state dimension, and the actual reduction we used in this study for the TOPEX/Poseidon altimeter twin experiment will be presented later.

With this ordering, the observation matrix **E** is the *N* by (*M* + *N*) matrix

**E** = [**O** **I**],

where **O** is the *N* by *M* null matrix and **I** is the *N* by *N* identity matrix, satisfying

**y** = **E x**_{t} + *e*,

with *e* the observational error.

The gain **K** is chosen to minimize the analysis error covariance ⟨(**x** − **x**_{t})(**x** − **x**_{t})^{T}⟩, where **x**_{t} is the true state and superscript T denotes transposition. The following formula is readily derived:

**K** = **PE**^{T}(**EPE**^{T} + **R**)^{−1}.

Here **P** is the (*M* + *N*) by (*M* + *N*) model error covariance matrix with diagonal elements *σ*^{2}_{p}, and **R** is the *N* by *N* observation error covariance matrix. Typically, **R** is assumed diagonal: **R** = *σ*^{2}_{r} **I**, where **I** is the *N* by *N* identity matrix and *σ*^{2}_{r} is the observational error variance.

Partition **P** into blocks,

**P** = [**P**_{11} **P**_{12}; **P**_{21} **P**_{22}],

where **P**_{11} is an *M* by *M* matrix, **P**_{12} is an *M* by *N* matrix, **P**_{21} is an *N* by *M* matrix, and **P**_{22} is an *N* by *N* matrix. Here, **P**_{11} represents covariance among the model data to be corrected, and **P**_{22} represents covariance among the model data at the observation points. Also, **P**_{12} = **P**^{T}_{21}. The gain partitions correspondingly into **K** = [**K**_{11}; **K**_{21}], and with **E** = [**O** **I**] the gain formula reduces to

**K**_{11} = **P**_{12}(**P**_{22} + **R**)^{−1},  **K**_{21} = **P**_{22}(**P**_{22} + **R**)^{−1},  (36)

where **K**_{11} is an *M* by *N* matrix and **K**_{21} is an *N* by *N* matrix.
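The block formulas can be verified numerically against the full gain expression; the covariance below is a randomly generated positive-definite stand-in, not a model covariance.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 11, 5

# Random symmetric positive-definite (M+N) x (M+N) covariance stand-in.
A = rng.standard_normal((M + N, M + N))
P = A @ A.T + (M + N) * np.eye(M + N)
R = 0.5 * np.eye(N)                            # diagonal observation covariance

E = np.hstack([np.zeros((N, M)), np.eye(N)])   # E = [O  I]

K = P @ E.T @ np.linalg.inv(E @ P @ E.T + R)   # full gain formula

# Partition P and K; the blocks satisfy K11 = P12 (P22 + R)^{-1},
# K21 = P22 (P22 + R)^{-1}, because E P E^T = P22 and P E^T = [P12; P22].
P12, P22 = P[:M, M:], P[M:, M:]
K11, K21 = K[:M, :], K[M:, :]
```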

Using (36), we can derive the reduced Kalman gain used in a variety of data assimilation schemes, which we summarize below: the optimal interpolation scheme, the nudging scheme, and EEWADAi. The notable difference of EEWADAi from the other methods is the modeling of the model error covariance matrix, which we show below.

### a. Optimal interpolation (OI) scheme

In OI, the model error covariance **P** is modeled as

*p*_{ij} = *σ*^{2}_{m} exp[−(*k*_{x} *δx*_{ij})^{2} − (*k*_{y} *δy*_{ij})^{2} − (*ω*_{t} *δt*_{ij})^{2}],  (37)

where *σ*^{2}_{m} is the model anomaly variance; *δx*_{ij}, *δy*_{ij}, *δt*_{ij} are the distances and time interval between data points; and *k*_{x}, *k*_{y}, and *ω*_{t} represent the inverse spatial scales and timescale of the covariance. The Kalman gain can be derived from (36) by substituting (37):

*k*_{ij} = Σ_{k} *p*_{ik} [(*p* + *r*)^{−1}]_{kj}.

In this study the spatial scales corresponding to *k*_{x} and *k*_{y} were set to about 60 to 90 km and the timescale corresponding to *ω*_{t} to 5 days. In Figs. 1a,b we show an example of the mean SSH and the variance of the SSH as used in OI.
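A sketch of the OI construction for a single grid line at one observation time (*δt* = 0); the grid positions, 75-km scale, and variances are illustrative stand-ins for the values quoted above.

```python
import numpy as np

# Gaussian covariance model (37) restricted to one grid line at one time.
sigma_m2, sigma_r2 = 1.0, 0.04        # illustrative variances
kx = 1.0 / 75.0                       # inverse length scale (1/km), illustrative
M, N = 11, 5

xm = np.linspace(0.0, 150.0, M)       # model points to correct (km)
xo = np.linspace(20.0, 130.0, N)      # observation points (km)

def gauss_cov(xa, xb):
    dx = xa[:, None] - xb[None, :]
    return sigma_m2 * np.exp(-(kx * dx) ** 2)

P12 = gauss_cov(xm, xo)               # covariance, model points vs. obs points
P22 = gauss_cov(xo, xo)               # covariance among the obs points
R = sigma_r2 * np.eye(N)              # diagonal observation error covariance

K11 = P12 @ np.linalg.inv(P22 + R)    # OI gain from (36)
```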

### b. Nudging scheme (N)

When **P**_{22} is neglected in (36), and defining *α* = *σ*^{2}_{m}/*σ*^{2}_{r}, the gains **K**_{11} and **K**_{21} take the familiar form of nudging:

*k*_{ij} = *α* exp[−(*k*_{x} *δx*_{ij})^{2} − (*k*_{y} *δy*_{ij})^{2} − (*ω*_{t} *δt*_{ij})^{2}].

### c. EEWADAi

In EEWADAi, the diagonal elements *σ*^{2}_{p} of the model error covariance matrix are modeled directly from the wavelet-detected error,

*σ*^{2}_{p} = *C w*(*x,y*),

where *C* is a scaling constant to be found. The nondiagonal elements of the error covariance matrix will be provided assuming a Gaussian distribution similar to that used in (37).

In practice, we relate the wavelet coefficient *w*(*x,y*) to the model anomaly variance *σ*^{2}_{m} through a scaling function of *w*(*x, y*) with constants *c*_{1} = 5 and *c*_{2} = 1000, and we simply replace *σ*^{2}_{p} by the scaled wavelet-detected error. Neglecting **P**_{22} as in the case of nudging, the Kalman gain can then be expressed explicitly as

*k*_{ij} = *c*′ *w*(*x,y*) exp[−(*k*_{x} *δx*_{ij})^{2} − (*k*_{y} *δy*_{ij})^{2} − (*ω*_{t} *δt*_{ij})^{2}],  (45)

where *c*′ = *c*_{1}/*σ*^{2}_{r}, and *k*_{x}, *k*_{y}, and *ω*_{t} were chosen to be identical to the OI case.
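A sketch of the resulting gain for a reduced state of *M* = 11 model points and *N* = 5 track points, with illustrative wavelet magnitudes *w* along the track; attaching *w* to the track points, like all numerical values here, is an assumption made for the illustration.

```python
import numpy as np

# EEWADAi-style gain (45) on a single grid line at one time (dt = 0):
# k_ij = c' * w * exp[-(kx dx)^2], with c' = c1 / sigma_r^2.
c1, sigma_r2 = 5.0, 0.04
c_prime = c1 / sigma_r2
kx = 1.0 / 75.0                              # inverse length scale (1/km), assumed

xm = np.linspace(0.0, 150.0, 11)             # M = 11 model points (km)
xo = np.linspace(20.0, 130.0, 5)             # N = 5 satellite track points (km)
w = np.array([0.01, 0.02, 0.20, 0.02, 0.01]) # wavelet-detected error along track

dx = xm[:, None] - xo[None, :]
K = c_prime * w[None, :] * np.exp(-(kx * dx) ** 2)

# Where the wavelet analysis flags a large model error (w[2]), every model
# point weights that observation most heavily.
```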

Figure 3 shows the actual reduced state we used, similar to that used by Ezer and Mellor (1994). Here, the observation data are limited to *N* data points along the satellite track, indicated by crosses, and those are used to correct *M* model outputs, indicated by circles, at the neighboring grid points on the grid line intersecting the satellite track. This window is shifted along the track until all the satellite observation data are assimilated. A localized Kalman gain will be derived for this localized state. In this paper, we will compare the performance of OI and EEWADAi for the identical state reduction in the twin experiment. Here, *M* is chosen to be 11, and *N* is chosen to be 5.

Below we will give an example of how we actually implemented this method to assimilate the altimeter obtained SSH data into an OGCM. Much of the methodology follows the OI scheme presented by Ezer and Mellor (1994) except that the weighting function now is found by EEWADAi.

In the assimilation equations that follow, *η* is the SSH, *T* is the temperature, and *S* is the salinity; the subscript *M* represents the model result, *O* represents the observed data, and *A* denotes the assimilated result, also known as the analysis data. The matrix **K** is the Kalman gain found above, *F* is known as the correlation factor relating surface height anomalies to the subsurface fields, and *δT, δS* are the deviations from the average values of *T, S.*

Equations (47) and (48) are used to extend the surface information into the interior of the fluid using a statistical correlation, *F*_{T} and *F*_{S}. For the OI, *K*’s are found from the OI equation similar to that described by Ezer and Mellor (1994), and for the EEWADAi, *K*’s are determined by (45).

## 6. Twin experiment

The proposed data assimilation scheme was implemented in the regional Kuroshio circulation model developed by Mitsudera et al. (1997). The model is a sigma coordinate primitive ocean model forced by surface wind stress, heat flux, and salinity flux with a prescribed inflow/outflow transport at the open boundary. The model dimension is 206 × 209 × 32.

The fundamental idea behind the twin experiment is that we begin two numerical simulations with different initial conditions. One of the simulations will be considered the “control run” and the other simulation will be used to test data assimilation schemes. In our case we will test the new wavelet-based method proposed in this paper against the traditional OI method. In the twin experiment one attempts to “push” the simulation run to the control run. In doing so we can compare the performance of the optimal interpolation scheme to the wavelet based method.

### a. Details of experiment

The model was run for six years with annual mean forcing following 90 days of diagnostic run with Levitus (1982) temperature and salinity climatology. For the twin experiments, the output from day 1800 to day 1980 (5 to 5½ yr) was used as a control run and the output from day 1980 to 2160 (5½ to 6 yr) was used as a simulation run (model run).

Data mimicking the TOPEX/Poseidon altimeter observations were sampled from the SSH output of the control run at the model grid points near the satellite track and therefore have a lower spatial resolution (7∼15 km) than the real data (7 km). The temporal sampling rate is the exact TOPEX/Poseidon track sampling rate. For the region of interest, 32 tracks pass through with a 9.91-day repeat cycle (see Fig. 4).

The necessary statistical quantities such as the correlation factors were obtained from the model run from day 1800 to 2160 (fifth year). The method used to relate the surface data into the interior of the fluid is similar to Ezer and Mellor (1994) and is described in detail in their report.

The assimilation was conducted in a sequential manner, that is, initiated whenever the observation is available and continued for a short relaxation period following the observation. The relaxation was introduced in order to smooth the shock introduced by the assimilation. The relaxation time scale, *ω*_{t}, was chosen to be around 0.3 days, shorter than the track sampling interval.

### b. EEWADAi, weighting, and the twin experiment

First, let us understand the nature of the error detection method of EEWADAi. Here we take two snapshots from the Kuroshio model and apply wavelet analysis to each of the two snapshots. The magnitude of the wavelet coefficients will indicate the magnitude of the computational error committed in the region of the wavelet analysis. In the regions where wavelet analysis has detected a large computational error, our assimilation method EEWADAi will heavily weight the data from an external source while giving a very small weight to the computed result. In other words, if we desire to assimilate satellite data into our computational model, EEWADAi tells us where our computed result is reliable and where it cannot be trusted.

The two snapshots from our computational model are given in Figs. 5a,c, and the associated wavelet-detected computational error is given in Figs. 5b,d. Note from the two EEWADAi error plots that the majority of the domain is a relatively solid blue color and that in a small percentage of the domain one can observe red, which indicates that a large computational error is committed. In this case, we would choose to trust the computed result in the blue regions of the domain and we would trust, say, the satellite data in the red regions of the domain. Given this information on errors one can choose a variety of ways to execute an effective assimilation scheme, such as simply pushing the scheme or implementing a reduced Kalman gain technique.

In practice, we would be concerned not with the numerical errors in the entire domain at one time, but with the numerical errors committed on and in the vicinity of the source of external data. For example, if the source of external data is a satellite, then one would want to know the errors on or near the satellite path (see Fig. 4). In this diagram, we show the location and value of the largest absolute value of the wavelet coefficients along the satellite path. Such information would indicate that the location where the largest wavelet coefficient occurs is the region along the satellite path where the numerical data can be trusted the least and the satellite data the most, thus leading to a straightforward and computationally efficient data assimilation scheme. Now let us see how EEWADAi performs compared to the traditional OI technique.

### c. Results of the twin experiment

We present the *L*_{2} difference of two of the flow fields, sea surface height and temperature at a given level, between the control run and the test run at 5-day intervals during the experiment. We can see from the two plots presented (see Figs. 6 and 7) that the two solutions are not noticeably different in quality up to day 120. After about day 120, however, the wavelet-based method clearly gives a better solution than the OI method for both the temperature field and the sea surface height field. In fact, we consider the results from the later portion of the simulation to be more relevant than those from the early part, since we believe the methods have by then reached a kind of steady state.
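The discrepancy measure behind Figs. 6 and 7 is the discrete *L*_{2} difference between the control and test runs; a minimal sketch, here normalized as a root-mean-square (the exact normalization used in the figures may differ):

```python
import numpy as np

def l2_difference(control, test):
    """Root-mean-square (discrete L2) difference between the control
    run and the assimilating test run over the model domain."""
    return float(np.sqrt(np.mean((control - test) ** 2)))
```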

We should note that the wavelet method presented here has yet to be optimized and that, even without any optimization at all, it has outperformed an existing and established method. We attribute this early success to the fact that the wavelet-based method gives a direct measure of error as a function of space and time, whereas optimal interpolation, in our view, has no direct relationship with error. In future versions of the method we expect to improve performance by deriving, analytically or by simple trial and error, an optimal relationship between the wavelet coefficients and the weighting matrix. In the next section, we summarize our results.

## 7. Conclusions

In the field of data assimilation, the weakness of current methods can be summarized as insufficient knowledge of errors, either from the computational side or from the external source of data. Without knowledge of errors, one cannot hope to assimilate external data efficiently. In the simple example of incorporating real-time sea surface height data into a model run, one must know roughly which is more accurate: the height given by the numerical calculation or the height given by the external data source, say, a satellite. Generally, we will have some knowledge of the satellite errors, but not as a function of space and time. For example, we might know that the satellite data error is some kind of skewed Gaussian with a certain standard deviation, but we will generally have no knowledge of how this error changes with spatial location over the ocean or with time. On the other hand, if one has some knowledge of the errors in the computational scheme, then one has an estimate of the relative errors between the satellite data and the computed data.

In this paper we have proposed a method that takes a large step toward solving this lack of error knowledge. Our wavelet-based method gives information on the errors committed by the computational scheme, and this information, coupled with even slight knowledge of the satellite errors (even if they are assumed constant), can be used to decide effectively which information can be trusted more in which portion of the domain. The wavelet-based method works by detecting where the numerical scheme cannot give a correct answer, estimating the deviation from a local low-order polynomial. Furthermore, the numerical cost of the wavelet error estimator is negligible: one only needs to perform the wavelet analysis in the region of the domain where the new data are available, and the cost of this analysis is very cheap, only a small constant times the number of grid points analyzed.
This is in strong contrast to some existing methods, where the cost of assimilation can dominate the calculation. Furthermore, in our numerical example comparing the new wavelet-based method with the existing optimal interpolation method, we have shown that even at this early stage the wavelet-based method, without any optimization, has outperformed the more mature method.

To conclude, the wavelet-based method gives a direct measure of computational errors as a function of space and time, and this information is critical to have an effective assimilation scheme.

## Acknowledgments

We would like to express our thanks to Dr. Mitsudera and Dr. Yaremchuk of the IPRC for their valuable comments and encouragement. We also would like to express our thanks to Dr. Yoshikawa and Mr. Taguchi for providing us with the Kuroshio regional circulation model for the twin experiment. This research was supported by the Frontier Research System for Global Change.

## REFERENCES

Daubechies, I., 1988: Orthonormal bases of compactly supported wavelets. *Commun. Pure Appl. Math.,* **41,** 909–996.

Erlebacher, G., M. Y. Hussaini, and L. Jameson, Eds., 1996: *Wavelets: Theory and Applications.* Oxford University Press, 528 pp.

Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. *J. Geophys. Res.,* **99,** 10143–10162.

Ezer, T., and G. L. Mellor, 1994: Continuous assimilation of Geosat altimeter data into a three-dimensional primitive equation Gulf Stream model. *J. Phys. Oceanogr.,* **24,** 832–847.

Jameson, L., 1998: A wavelet-optimized, very high order adaptive grid and order numerical method. *SIAM J. Sci. Comput.,* **19,** 1980–2013.

Kalman, R. E., 1960: A new approach to linear filtering and prediction problems. *J. Basic Eng.,* **82D,** 35–45.

Levitus, S., 1982: *Climatological Atlas of the World Ocean.* National Oceanic and Atmospheric Administration Prof. Paper 13, 173 pp.

Malanotte-Rizzoli, P., and W. R. Holland, 1986: Data constraints applied to models of the ocean general circulation. Part I: The steady case. *J. Phys. Oceanogr.,* **16,** 1665–1682.

Mitsudera, H., Y. Yoshikawa, B. Taguchi, and H. Nakamura, 1997: High-resolution Kuroshio/Oyashio System Model: Preliminary results (in Japanese with English abstract). Japan Marine Science and Technology Center Rep. 36, 147–155.

Strang, G., and T. Nguyen, 1996: *Wavelets and Filter Banks.* Wellesley-Cambridge Press, 490 pp.

Wunsch, C., 1996: *The Ocean Circulation Inverse Problem.* Cambridge University Press, 442 pp.

* School of Ocean and Earth Science and Technology Contribution Number 4915 and International Pacific Research Center Contribution Number 23.

+ The International Pacific Research Center is partly sponsored by the Frontier Research System for Global Change.