Package 'FKF'

Title: Fast Kalman Filter
Description: This is a fast and flexible implementation of the Kalman filter and smoother, which can deal with NAs. It is entirely written in C and relies fully on linear algebra subroutines contained in BLAS and LAPACK. Due to the speed of the filter, the fitting of high-dimensional linear state space models to large datasets becomes possible. This package also contains a plot function for the visualization of the state vector and graphical diagnostics of the residuals.
Authors: David Luethi [aut], Philipp Erb [aut], Simon Otziger [aut], Daniel McDonald [aut], Paul Smith [aut, cre]
Maintainer: Paul Smith <[email protected]>
License: GPL (>= 2)
Version: 0.2.6
Built: 2024-11-01 06:25:36 UTC
Source: https://github.com/waternumbers/fkf

Help Index


Fast Kalman filter

Description

This function allows for fast and flexible Kalman filtering. Both, the measurement and transition equation may be multivariate and parameters are allowed to be time-varying. In addition “NA”-values in the observations are supported. fkf wraps the C-function FKF which fully relies on linear algebra subroutines contained in BLAS and LAPACK.

Usage

fkf(a0, P0, dt, ct, Tt, Zt, HHt, GGt, yt)

Arguments

a0

A vector giving the initial value/estimation of the state variable.

P0

A matrix giving the variance of a0.

dt

A matrix giving the intercept of the transition equation (see Details).

ct

A matrix giving the intercept of the measurement equation (see Details).

Tt

An array giving the factor of the transition equation (see Details).

Zt

An array giving the factor of the measurement equation (see Details).

HHt

An array giving the variance of the innovations of the transition equation (see Details).

GGt

An array giving the variance of the disturbances of the measurement equation (see Details).

yt

A matrix containing the observations. “NA”-values are allowed (see Details).

Details

State space form

The following notation is closest to the one of Koopman et al. The state space model is represented by the transition equation and the measurement equation. Let mm be the dimension of the state variable, dd be the dimension of the observations, and nn the number of observations. The transition equation and the measurement equation are given by

αt+1=dt+Ttαt+Htηt\alpha_{t + 1} = d_t + T_t \cdot \alpha_t + H_t \cdot \eta_t

yt=ct+Ztαt+Gtϵt,y_t = c_t + Z_t \cdot \alpha_t + G_t \cdot \epsilon_t,

where ηt\eta_t and ϵt\epsilon_t are iid N(0,Im)N(0, I_m) and iid N(0,Id)N(0, I_d), respectively, and αt\alpha_t denotes the state variable. The parameters admit the following dimensions:

αtRm\alpha_{t} \in R^{m} dtRmd_{t} \in R^m ηtRm\eta_{t} \in R^m
TtRm×mT_{t} \in R^{m \times m} HtRm×mH_{t} \in R^{m \times m}
ytRdy_{t} \in R^d ctRdc_t \in R^d ϵtRd\epsilon_{t} \in R^d
ZtRd×mZ_{t} \in R^{d \times m} GtRd×dG_{t} \in R^{d \times d}

Note that fkf takes as input HHt and GGt which corresponds to HtHtH_t H_t^\prime and GtGtG_t G_t^\prime.

Iteration:

The filter iterations are implemented using the expected values

at=E[αty1,,yt1]a_{t} = E[\alpha_t | y_1,\ldots,y_{t-1}]

att=E[αty1,,yt]a_{t|t} = E[\alpha_t | y_1,\ldots,y_{t}]

and variances

Pt=Var[αty1,,yt1]P_{t} = Var[\alpha_t | y_1,\ldots,y_{t-1}]

Ptt=Var[αty1,,yt]P_{t|t} = Var[\alpha_t | y_1,\ldots,y_{t}]

of the state αt\alpha_{t} in the following way (for the case of no NA's):

Initialisation: Set t=1t=1 with at=a0a_{t} = a0 and Pt=P0P_{t}=P0

Updating equations:

vt=ytctZtatv_t = y_t - c_t - Z_t a_t

Ft=ZtPtZt+GtGtF_t = Z_t P_t Z_t^{\prime} + G_t G_t^\prime

Kt=PtZtFt1K_t = P_t Z_t^{\prime} F_{t}^{-1}

att=at+Ktvta_{t|t} = a_t + K_t v_t

Ptt=PtPtZtKtP_{t|t} = P_t - P_t Z_t^\prime K_t^\prime

Prediction equations:

at+1=dt+Ttatta_{t+1} = d_{t} + T_{t} a_{t|t}

Pt+1=TtPttTt+HtHtP_{t+1} = T_{t} P_{t|t} T_{t}^{\prime} + H_t H_t^\prime

Next iteration: Set t=t+1t=t+1 and goto “Updating equations”.

NA-values:

NA-values in the observation matrix yt are supported. If particular observations yt[,i] contain NAs, the NA-values are removed and the measurement equation is adjusted accordingly. When the full vector yt[,i] is missing the Kalman filter reduces to a prediction step.

Parameters:

The parameters can either be constant or deterministic time-varying. Assume the number of observations is nn (i.e. y=(yt)t=1,,n,yt=(yt1,,ytd)y = (y_t)_{t = 1, \ldots, n}, y_t = (y_{t1}, \ldots, y_{td})). Then, the parameters admit the following classes and dimensions:

dt either a m×nm \times n (time-varying) or a m×1m \times 1 (constant) matrix.
Tt either a m×m×nm \times m \times n or a m×m×1m \times m \times 1 array.
HHt either a m×m×nm \times m \times n or a m×m×1m \times m \times 1 array.
ct either a d×nd \times n or a d×1d \times 1 matrix.
Zt either a d×m×nd \times m \times n or a d×m×1d \times m \times 1 array.
GGt either a d×d×nd \times d \times n or a d×d×1d \times d \times 1 array.
yt a d×nd \times n matrix.

BLAS and LAPACK routines used:

The R function fkf basically wraps the C-function FKF, which entirely relies on linear algebra subroutines provided by BLAS and LAPACK. The following functions are used:

BLAS: dcopy, dgemm, daxpy.
LAPACK: dpotri, dpotrf.

FKF is called through the .Call interface. Internally, FKF extracts the dimensions, allocates memory, and initializes the R-objects to be returned. FKF subsequently calls cfkf which performs the Kalman filtering.

The only critical part is to compute the inverse of FtF_t and the determinant of FtF_t. If the inverse can not be computed, the filter stops and returns the corresponding message in status (see Value). If the computation of the determinant fails, the filter will continue, but the log-likelihood (element logLik) will be “NA”.

The inverse is computed in two steps: First, the Cholesky factorization of FtF_t is calculated by dpotrf. Second, dpotri calculates the inverse based on the output of dpotrf.
The determinant of FtF_t is computed using again the Cholesky decomposition.

The first element of both at and Pt is filled with the function arguments a0 and P0, and the last, i.e. the (n + 1)-th, element of at and Pt contains the predictions for the next time step.

Value

An S3-object of class “fkf”, which is a list with the following elements:

att A m×nm \times n-matrix containing the filtered state variables, i.e. att[,t] = atta_{t|t}.
at A m×(n+1)m \times (n + 1)-matrix containing the predicted state variables, i.e. at[,t] = ata_t.
Ptt A m×m×nm \times m \times n-array containing the variance of att, i.e. Ptt[,,t] = PttP_{t|t}.
Pt A m×m×(n+1)m \times m \times (n + 1)-array containing the variances of at, i.e. Pt[,,t] = PtP_t.
vt A d×nd \times n-matrix of the prediction errors i.e. vt[,t] = vtv_t.
Ft A d×d×nd \times d \times n-array which contains the variances of vt, i.e. Ft[,,t] = FtF_t.
Kt A m×d×nm \times d \times n-array containing the “Kalman gain” i.e. Kt[,,t] = ktk_t.
logLik The log-likelihood.
status A vector which contains the status of LAPACK's dpotri and dpotrf. (0,0)(0, 0) means successful exit.
sys.time The time elapsed as an object of class “proc_time”.

References

Harvey, Andrew C. (1990). Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge University Press.

Hamilton, James D. (1994). Time Series Analysis. Princeton University Press.

Koopman, S. J., Shephard, N., Doornik, J. A. (1999). Statistical algorithms for models in state space using SsfPack 2.2. Econometrics Journal, Royal Economic Society, vol. 2(1), pages 107-160.

See Also

plot to visualize and analyze fkf-objects, KalmanRun from the stats package, function dlmFilter from package dlm.

Examples

## <--------------------------------------------------------------------------->
## Example: Local level model for the Nile's annual flow.
## <--------------------------------------------------------------------------->
## Transition equation:
## alpha[t+1] = alpha[t] + eta[t], eta[t] ~ N(0, HHt)          
## Measurement equation:
## y[t] = alpha[t] + eps[t], eps[t] ~  N(0, GGt)

y <- Nile
y[c(3, 10)] <- NA  # NA values can be handled

## Set constant parameters:
dt <- ct <- matrix(0) 
Zt <- Tt <- matrix(1)
a0 <- y[1]            # Estimation of the first year flow 
P0 <- matrix(100)     # Variance of 'a0'

## Estimate parameters:
fit.fkf <- optim(c(HHt = var(y, na.rm = TRUE) * .5,
                   GGt = var(y, na.rm = TRUE) * .5),
                 fn = function(par, ...)
                 -fkf(HHt = matrix(par[1]), GGt = matrix(par[2]), ...)$logLik,
                 yt = rbind(y), a0 = a0, P0 = P0, dt = dt, ct = ct,
                 Zt = Zt, Tt = Tt)

## Filter Nile data with estimated parameters:
fkf.obj <- fkf(a0, P0, dt, ct, Tt, Zt, HHt = matrix(fit.fkf$par[1]),
               GGt = matrix(fit.fkf$par[2]), yt = rbind(y))

## Compare with the stats' structural time series implementation:
fit.stats <- StructTS(y, type = "level")

fit.fkf$par
fit.stats$coef

## Plot the flow data together with fitted local levels:
plot(y, main = "Nile flow")
lines(fitted(fit.stats), col = "green")
lines(ts(fkf.obj$att[1, ], start = start(y), frequency = frequency(y)), col = "blue")
legend("top", c("Nile flow data", "Local level (StructTS)", "Local level (fkf)"),
       col = c("black", "green", "blue"), lty = 1)

Fast Kalman Smoother

Description

This function can be run after running fkf to produce "smoothed" estimates of the state variable αt\alpha_t. Unlike the output of the filter, these estimates are conditional on the entire set of nn data points rather than only the past, see details.

Usage

fks(FKFobj)

Arguments

FKFobj

An S3-object of class "fkf", returned by fkf.

Details

The following notation is taken from the fkf function descriptions and is close to the one of Koopman et al. The smoother estimates

atn=E[αty1,,yn]a_{t|n} = E[\alpha_{t} | y_1,\ldots,y_n]

Ptn=Var[αty1,,yn]P_{t|n} = Var[\alpha_{t} | y_1,\ldots,y_n]

based on the outputs of the forward filtering pass performed by fkf.

The formulation of Koopman and Durbin is used which evolves the two values rtRmr_{t} \in R^m and NtRm×mN_{t} \in R^{m \times m} to avoid inverting the covariance matrix.

Iteration:

If there are no missing values the iteration proceeds as follows:

Initialisation: Set t=nt=n, with rt=0r_t =0 and Nt=0N_t =0.

Evolution equations:

L=TtTtKtZtL = T_{t} - T_{t}K_{t}Z_{t}

rt1=ZtFt1vt+Lrtr_{t-1} = Z_{t}^\prime F_{t}^{-1} v_{t} + L^\prime r_{t}

Nt1=ZtFt1Zt+LNtLN_{t-1} = Z_{t}^\prime F_{t}^{-1} Z_{t} + L^\prime N_{t} L

Updating equations:

atn=att1+Ptt1rt1a_{t|n} = a_{t|t-1} + P_{t|t-1}r_{t-1}

Ptn=Ptt1Ptt1Nt1Ptt1P_{t|n} = P_{t|t-1} - P_{t|t-1}N_{t-1}P_{t|t-1}

Next iteration: Set t=t1t=t-1 and goto “Evolution equations”.

Value

An S3-object of class "fks" which is a list with the following elements:

ahatt A m×nm \times n-matrix containing the smoothed state variables, i.e. ahatt[,t] = atna_{t|n}
Vt A m×m×nm \times m \times n-array containing the variances of ahatt, i.e. Vt[,,t] = PtnP_{t|n}

References

Koopman, S. J. and Durbin, J. (2000). Fast filtering and smoothing for multivariate state space models Journal of Time Series Analysis Vol. 21, No. 3

Examples

## <--------------------------------------------------------------------------->
## Example: Local level model for the Nile's annual flow.
## <--------------------------------------------------------------------------->
## Transition equation:
## alpha[t+1] = alpha[t] + eta[t], eta[t] ~ N(0, HHt)
## Measurement equation:
## y[t] = alpha[t] + eps[t], eps[t] ~  N(0, GGt)

y <- Nile
y[c(3, 10)] <- NA  # NA values can be handled

## Set constant parameters:
dt <- ct <- matrix(0)
Zt <- Tt <- matrix(1)
a0 <- y[1]            # Estimation of the first year flow
P0 <- matrix(100)     # Variance of 'a0'

## Estimate parameters:
fit.fkf <- optim(c(HHt = var(y, na.rm = TRUE) * .5,
                   GGt = var(y, na.rm = TRUE) * .5),
                 fn = function(par, ...)
                   -fkf(HHt = matrix(par[1]), GGt = matrix(par[2]), ...)$logLik,
                 yt = rbind(y), a0 = a0, P0 = P0, dt = dt, ct = ct,
                 Zt = Zt, Tt = Tt)

## Filter Nile data with estimated parameters:
fkf.obj <- fkf(a0, P0, dt, ct, Tt, Zt, HHt = matrix(fit.fkf$par[1]),
               GGt = matrix(fit.fkf$par[2]), yt = rbind(y))

## Smooth the data based on the filter object
fks.obj <- fks(fkf.obj)

## Plot the flow data together with local levels:
plot(y, main = "Nile flow")
lines(ts(fkf.obj$att[1, ], start = start(y), frequency = frequency(y)), col = "blue")
lines(ts(fks.obj$ahatt[1,], start = start(y), frequency = frequency(y)), col = "red")
legend("top", c("Nile flow data", "Local level (fkf)","Local level (fks)"),
       col = c("black", "green", "blue", "red"), lty = 1)

Plotting fkf objects

Description

Plotting method for objects of class fkf. This function provides tools for graphical analysis of the Kalman filter output: Visualization of the state vector, QQ-plot of the individual residuals, QQ-plot of the Mahalanobis distance, auto- as well as crosscorrelation function of the residuals.

Usage

## S3 method for class 'fkf'
plot(
  x,
  type = c("state", "resid.qq", "qqchisq", "acf"),
  CI = 0.95,
  at.idx = 1:nrow(x$at),
  att.idx = 1:nrow(x$att),
  ...
)

Arguments

x

The output of fkf.

type

A string stating what shall be plotted (see Details).

CI

The confidence interval in case type == "state". Set CI to NA if no confidence interval shall be plotted.

at.idx

An vector giving the indexes of the predicted state variables which shall be plotted if type == "state".

att.idx

An vector giving the indexes of the filtered state variables which shall be plotted if type == "state".

...

Arguments passed to either plot, qqnorm, qqplot or acf.

Details

The argument type states what shall be plotted. type must partially match one of the following:

state

The state variables are plotted. By the arguments at.idx and att.idx, the user can specify which of the predicted (ata_{t}) and filtered (atta_{t|t}) state variables will be drawn.

resid.qq

Draws a QQ-plot for each residual-series invt.

qqchisq

A Chi-Squared QQ-plot will be drawn to graphically test for multivariate normality of the residuals based on the Mahalanobis distance.

acf

Creates a pairs plot with the autocorrelation function (acf) on the diagonal panels and the crosscorrelation function (ccf) of the residuals on the off-diagnoal panels.

Value

Invisibly returns an list with components:

distance The Mahalanobis distance of the residuals as a vector of length nn.
std.resid The standardized residuals as an d×nd \times n-matrix. It should hold that std.residij  iidNd(0,I)std.resid_{ij} \; iid \sim N_d(0, I),

where dd denotes the dimension of the data and nn the number of observations.

usage

plot(x, type = c("state", "resid.qq", "qqchisq", "acf"), CI = 0.95, at.idx = 1:nrow(x$at), att.idx = 1:nrow(x$att), ...)

See Also

fkf

Examples

## <--------------------------------------------------------------------------->
## Example: Local level model for the treering data
## <--------------------------------------------------------------------------->
## Transition equation:
## alpha[t+1] = alpha[t] + eta[t], eta[t] ~ N(0, HHt)          
## Measurement equation:
## y[t] = alpha[t] + eps[t], eps[t] ~  N(0, GGt)

y <- treering
y[c(3, 10)] <- NA  # NA values can be handled

## Set constant parameters:
dt <- ct <- matrix(0)
Zt <- Tt <- array(1,c(1,1,1))
a0 <- y[1]            # Estimation of the first width
P0 <- matrix(100)     # Variance of 'a0'

## Estimate parameters:
fit.fkf <- optim(c(HHt = var(y, na.rm = TRUE) * .5,
                   GGt = var(y, na.rm = TRUE) * .5),
                 fn = function(par, ...)
                 -fkf(HHt = array(par[1],c(1,1,1)), GGt = array(par[2],c(1,1,1)), ...)$logLik,
                 yt = rbind(y), a0 = a0, P0 = P0, dt = dt, ct = ct,
                 Zt = Zt, Tt = Tt)

## Filter tree ring data with estimated parameters:
fkf.obj <- fkf(a0, P0, dt, ct, Tt, Zt, HHt = array(fit.fkf$par[1],c(1,1,1)),
               GGt = array(fit.fkf$par[2],c(1,1,1)), yt = rbind(y))

## Plot the width together with fitted local levels:
plot(y, main = "Treering data")
lines(ts(fkf.obj$att[1, ], start = start(y), frequency = frequency(y)), col = "blue")
legend("top", c("Treering data", "Local level"), col = c("black", "blue"), lty = 1)

## Check the residuals for normality:
plot(fkf.obj, type = "resid.qq")

## Test for autocorrelation:
plot(fkf.obj, type = "acf", na.action = na.pass)

Plotting fks objects

Description

Plotting method for objects of class fks. This function provides tools visualisation of the state vector of the Kalman smoother output

Usage

## S3 method for class 'fks'
plot(x, CI = 0.95, ahatt.idx = 1:nrow(x$ahatt), ...)

Arguments

x

The output of fks.

CI

The confidence interval in case type == "state". Set CI to NA if no confidence interval shall be plotted.

ahatt.idx

An vector giving the indexes of the predicted state variables which shall be plotted if type == "state".

...

Arguments passed to either plot, qqnorm, qqplot or acf.

Details

The state variables are plotted. By the argument ahatt.idx, the user can specify which of the smoothed (atna_{t|n}) state variables will be drawn.

See Also

fks

Examples

## <--------------------------------------------------------------------------->
## Example 3: Local level model for the treering data
## <--------------------------------------------------------------------------->
## Transition equation:
## alpha[t+1] = alpha[t] + eta[t], eta[t] ~ N(0, HHt)          
## Measurement equation:
## y[t] = alpha[t] + eps[t], eps[t] ~  N(0, GGt)

y <- treering
y[c(3, 10)] <- NA  # NA values can be handled

## Set constant parameters:
dt <- ct <- matrix(0)
Zt <- Tt <- array(1,c(1,1,1))
a0 <- y[1]            # Estimation of the first width
P0 <- matrix(100)     # Variance of 'a0'

## Estimate parameters:
fit.fkf <- optim(c(HHt = var(y, na.rm = TRUE) * .5,
                   GGt = var(y, na.rm = TRUE) * .5),
                 fn = function(par, ...)
                 -fkf(HHt = array(par[1],c(1,1,1)), GGt = array(par[2],c(1,1,1)), ...)$logLik,
                 yt = rbind(y), a0 = a0, P0 = P0, dt = dt, ct = ct,
                 Zt = Zt, Tt = Tt)

## Filter tree ring data with estimated parameters:
fkf.obj <- fkf(a0, P0, dt, ct, Tt, Zt, HHt = array(fit.fkf$par[1],c(1,1,1)),
               GGt = array(fit.fkf$par[2],c(1,1,1)), yt = rbind(y))

fks.obj <- fks(fkf.obj)
plot(fks.obj)
##lines(as.numeric(y),col="blue")