Title: | Fast Kalman Filter |
---|---|
Description: | This is a fast and flexible implementation of the Kalman filter and smoother, which can deal with NAs. It is entirely written in C and relies fully on linear algebra subroutines contained in BLAS and LAPACK. Due to the speed of the filter, the fitting of high-dimensional linear state space models to large datasets becomes possible. This package also contains a plot function for the visualization of the state vector and graphical diagnostics of the residuals. |
Authors: | David Luethi [aut], Philipp Erb [aut], Simon Otziger [aut], Daniel McDonald [aut], Paul Smith [aut, cre] |
Maintainer: | Paul Smith <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.2.6 |
Built: | 2024-12-01 05:40:55 UTC |
Source: | https://github.com/waternumbers/fkf |
This function allows for fast and flexible Kalman filtering. Both, the
measurement and transition equation may be multivariate and parameters
are allowed to be time-varying. In addition “NA”-values in the
observations are supported. fkf
wraps the C
-function
FKF
which fully relies on linear algebra subroutines contained
in BLAS and LAPACK.
fkf(a0, P0, dt, ct, Tt, Zt, HHt, GGt, yt)
fkf(a0, P0, dt, ct, Tt, Zt, HHt, GGt, yt)
a0 |
A |
P0 |
A |
dt |
A |
ct |
A |
Tt |
An |
Zt |
An |
HHt |
An |
GGt |
An |
yt |
A |
State space form
The following notation is closest to the one of Koopman et al.
The state space model is represented by the transition equation and
the measurement equation. Let be the dimension of the state
variable,
be the dimension of the observations, and
the number of observations. The transition equation and the
measurement equation are given by
where and
are iid
and iid
,
respectively, and
denotes the state
variable. The parameters admit the following dimensions:
|
|
|
|
|
|
|
|
|
|
|
Note that fkf
takes as input HHt
and GGt
which
corresponds to and
.
Iteration:
The filter iterations are implemented using the expected values
and variances
of the state in the following way
(for the case of no NA's):
Initialisation: Set with
and
Updating equations:
Prediction equations:
Next iteration: Set and goto “Updating equations”.
NA-values:
NA-values in the observation matrix yt
are supported. If
particular observations yt[,i]
contain NAs, the NA-values are
removed and the measurement equation is adjusted accordingly. When
the full vector yt[,i]
is missing the Kalman filter reduces to
a prediction step.
Parameters:
The parameters can either be constant or deterministic
time-varying. Assume the number of observations is
(i.e.
). Then, the parameters admit the following
classes and dimensions:
dt |
either a (time-varying) or a (constant) matrix. |
Tt |
either a or a array. |
HHt |
either a or a array. |
ct |
either a or a matrix. |
Zt |
either a or a array. |
GGt |
either a or a array. |
yt |
a matrix.
|
BLAS and LAPACK routines used:
The R function fkf
basically wraps the C
-function
FKF
, which entirely relies on linear algebra subroutines
provided by BLAS and LAPACK. The following functions are used:
BLAS: | dcopy , dgemm , daxpy . |
LAPACK: | dpotri , dpotrf .
|
FKF
is called through the .Call
interface. Internally,
FKF
extracts the dimensions, allocates memory, and initializes
the R-objects to be returned. FKF
subsequently calls
cfkf
which performs the Kalman filtering.
The only critical part is to compute the inverse of
and the determinant of
. If the inverse can not be
computed, the filter stops and returns the corresponding message in
status
(see Value). If the computation of the
determinant fails, the filter will continue, but the log-likelihood
(element logLik
) will be “NA”.
The inverse is computed in two steps:
First, the Cholesky factorization of is
calculated by
dpotrf
. Second, dpotri
calculates the
inverse based on the output of dpotrf
.
The determinant of is computed using again the
Cholesky decomposition.
The first element of both at
and Pt
is filled with the
function arguments a0
and P0
, and the last, i.e. the (n +
1)-th, element of at
and Pt
contains the predictions for the next time step.
An S3-object of class “fkf”, which is a list with the following elements:
att |
A -matrix containing the filtered state variables, i.e. att[,t] = . |
at |
A -matrix containing the predicted state variables, i.e. at[,t] = . |
Ptt |
A -array containing the variance of att , i.e. Ptt[,,t] = . |
Pt |
A -array containing the variances of at , i.e. Pt[,,t] = . |
vt |
A -matrix of the prediction errors i.e. vt[,t] = . |
Ft |
A -array which contains the variances of vt , i.e. Ft[,,t] = . |
Kt |
A -array containing the “Kalman gain” i.e. Kt[,,t] = . |
logLik |
The log-likelihood. |
status |
A vector which contains the status of LAPACK's dpotri and dpotrf . means successful exit. |
sys.time |
The time elapsed as an object of class “proc_time”. |
Harvey, Andrew C. (1990). Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge University Press.
Hamilton, James D. (1994). Time Series Analysis. Princeton University Press.
Koopman, S. J., Shephard, N., Doornik, J. A. (1999). Statistical algorithms for models in state space using SsfPack 2.2. Econometrics Journal, Royal Economic Society, vol. 2(1), pages 107-160.
plot
to visualize and analyze fkf
-objects, KalmanRun
from the stats package, function dlmFilter
from package dlm
.
## <---------------------------------------------------------------------------> ## Example: Local level model for the Nile's annual flow. ## <---------------------------------------------------------------------------> ## Transition equation: ## alpha[t+1] = alpha[t] + eta[t], eta[t] ~ N(0, HHt) ## Measurement equation: ## y[t] = alpha[t] + eps[t], eps[t] ~ N(0, GGt) y <- Nile y[c(3, 10)] <- NA # NA values can be handled ## Set constant parameters: dt <- ct <- matrix(0) Zt <- Tt <- matrix(1) a0 <- y[1] # Estimation of the first year flow P0 <- matrix(100) # Variance of 'a0' ## Estimate parameters: fit.fkf <- optim(c(HHt = var(y, na.rm = TRUE) * .5, GGt = var(y, na.rm = TRUE) * .5), fn = function(par, ...) -fkf(HHt = matrix(par[1]), GGt = matrix(par[2]), ...)$logLik, yt = rbind(y), a0 = a0, P0 = P0, dt = dt, ct = ct, Zt = Zt, Tt = Tt) ## Filter Nile data with estimated parameters: fkf.obj <- fkf(a0, P0, dt, ct, Tt, Zt, HHt = matrix(fit.fkf$par[1]), GGt = matrix(fit.fkf$par[2]), yt = rbind(y)) ## Compare with the stats' structural time series implementation: fit.stats <- StructTS(y, type = "level") fit.fkf$par fit.stats$coef ## Plot the flow data together with fitted local levels: plot(y, main = "Nile flow") lines(fitted(fit.stats), col = "green") lines(ts(fkf.obj$att[1, ], start = start(y), frequency = frequency(y)), col = "blue") legend("top", c("Nile flow data", "Local level (StructTS)", "Local level (fkf)"), col = c("black", "green", "blue"), lty = 1)
## <---------------------------------------------------------------------------> ## Example: Local level model for the Nile's annual flow. ## <---------------------------------------------------------------------------> ## Transition equation: ## alpha[t+1] = alpha[t] + eta[t], eta[t] ~ N(0, HHt) ## Measurement equation: ## y[t] = alpha[t] + eps[t], eps[t] ~ N(0, GGt) y <- Nile y[c(3, 10)] <- NA # NA values can be handled ## Set constant parameters: dt <- ct <- matrix(0) Zt <- Tt <- matrix(1) a0 <- y[1] # Estimation of the first year flow P0 <- matrix(100) # Variance of 'a0' ## Estimate parameters: fit.fkf <- optim(c(HHt = var(y, na.rm = TRUE) * .5, GGt = var(y, na.rm = TRUE) * .5), fn = function(par, ...) -fkf(HHt = matrix(par[1]), GGt = matrix(par[2]), ...)$logLik, yt = rbind(y), a0 = a0, P0 = P0, dt = dt, ct = ct, Zt = Zt, Tt = Tt) ## Filter Nile data with estimated parameters: fkf.obj <- fkf(a0, P0, dt, ct, Tt, Zt, HHt = matrix(fit.fkf$par[1]), GGt = matrix(fit.fkf$par[2]), yt = rbind(y)) ## Compare with the stats' structural time series implementation: fit.stats <- StructTS(y, type = "level") fit.fkf$par fit.stats$coef ## Plot the flow data together with fitted local levels: plot(y, main = "Nile flow") lines(fitted(fit.stats), col = "green") lines(ts(fkf.obj$att[1, ], start = start(y), frequency = frequency(y)), col = "blue") legend("top", c("Nile flow data", "Local level (StructTS)", "Local level (fkf)"), col = c("black", "green", "blue"), lty = 1)
This function can be run after running fkf
to produce
"smoothed" estimates of the state variable .
Unlike the output of the filter, these estimates are conditional
on the entire set of
data points rather than only the past, see details.
fks(FKFobj)
fks(FKFobj)
FKFobj |
An S3-object of class "fkf", returned by |
The following notation is taken from the fkf
function descriptions
and is close to the one of Koopman et al. The smoother estimates
based on the outputs of the forward filtering pass performed by fkf
.
The formulation of Koopman and Durbin is used which evolves the two values
and
to avoid inverting the covariance matrix.
Iteration:
If there are no missing values the iteration proceeds as follows:
Initialisation: Set , with
and
.
Evolution equations:
Updating equations:
Next iteration: Set and goto “Evolution equations”.
An S3-object of class "fks" which is a list with the following elements:
ahatt
A -matrix containing the
smoothed state variables, i.e. ahatt[,t] =
Vt
A -array
containing the variances of
ahatt
, i.e. Vt[,,t] =
Koopman, S. J. and Durbin, J. (2000). Fast filtering and smoothing for multivariate state space models Journal of Time Series Analysis Vol. 21, No. 3
## <---------------------------------------------------------------------------> ## Example: Local level model for the Nile's annual flow. ## <---------------------------------------------------------------------------> ## Transition equation: ## alpha[t+1] = alpha[t] + eta[t], eta[t] ~ N(0, HHt) ## Measurement equation: ## y[t] = alpha[t] + eps[t], eps[t] ~ N(0, GGt) y <- Nile y[c(3, 10)] <- NA # NA values can be handled ## Set constant parameters: dt <- ct <- matrix(0) Zt <- Tt <- matrix(1) a0 <- y[1] # Estimation of the first year flow P0 <- matrix(100) # Variance of 'a0' ## Estimate parameters: fit.fkf <- optim(c(HHt = var(y, na.rm = TRUE) * .5, GGt = var(y, na.rm = TRUE) * .5), fn = function(par, ...) -fkf(HHt = matrix(par[1]), GGt = matrix(par[2]), ...)$logLik, yt = rbind(y), a0 = a0, P0 = P0, dt = dt, ct = ct, Zt = Zt, Tt = Tt) ## Filter Nile data with estimated parameters: fkf.obj <- fkf(a0, P0, dt, ct, Tt, Zt, HHt = matrix(fit.fkf$par[1]), GGt = matrix(fit.fkf$par[2]), yt = rbind(y)) ## Smooth the data based on the filter object fks.obj <- fks(fkf.obj) ## Plot the flow data together with local levels: plot(y, main = "Nile flow") lines(ts(fkf.obj$att[1, ], start = start(y), frequency = frequency(y)), col = "blue") lines(ts(fks.obj$ahatt[1,], start = start(y), frequency = frequency(y)), col = "red") legend("top", c("Nile flow data", "Local level (fkf)","Local level (fks)"), col = c("black", "green", "blue", "red"), lty = 1)
## <---------------------------------------------------------------------------> ## Example: Local level model for the Nile's annual flow. ## <---------------------------------------------------------------------------> ## Transition equation: ## alpha[t+1] = alpha[t] + eta[t], eta[t] ~ N(0, HHt) ## Measurement equation: ## y[t] = alpha[t] + eps[t], eps[t] ~ N(0, GGt) y <- Nile y[c(3, 10)] <- NA # NA values can be handled ## Set constant parameters: dt <- ct <- matrix(0) Zt <- Tt <- matrix(1) a0 <- y[1] # Estimation of the first year flow P0 <- matrix(100) # Variance of 'a0' ## Estimate parameters: fit.fkf <- optim(c(HHt = var(y, na.rm = TRUE) * .5, GGt = var(y, na.rm = TRUE) * .5), fn = function(par, ...) -fkf(HHt = matrix(par[1]), GGt = matrix(par[2]), ...)$logLik, yt = rbind(y), a0 = a0, P0 = P0, dt = dt, ct = ct, Zt = Zt, Tt = Tt) ## Filter Nile data with estimated parameters: fkf.obj <- fkf(a0, P0, dt, ct, Tt, Zt, HHt = matrix(fit.fkf$par[1]), GGt = matrix(fit.fkf$par[2]), yt = rbind(y)) ## Smooth the data based on the filter object fks.obj <- fks(fkf.obj) ## Plot the flow data together with local levels: plot(y, main = "Nile flow") lines(ts(fkf.obj$att[1, ], start = start(y), frequency = frequency(y)), col = "blue") lines(ts(fks.obj$ahatt[1,], start = start(y), frequency = frequency(y)), col = "red") legend("top", c("Nile flow data", "Local level (fkf)","Local level (fks)"), col = c("black", "green", "blue", "red"), lty = 1)
Plotting method for objects of class fkf
. This function
provides tools for graphical analysis of the Kalman filter output:
Visualization of the state vector, QQ-plot of the individual
residuals, QQ-plot of the Mahalanobis distance, auto- as well as
crosscorrelation function of the residuals.
## S3 method for class 'fkf' plot( x, type = c("state", "resid.qq", "qqchisq", "acf"), CI = 0.95, at.idx = 1:nrow(x$at), att.idx = 1:nrow(x$att), ... )
## S3 method for class 'fkf' plot( x, type = c("state", "resid.qq", "qqchisq", "acf"), CI = 0.95, at.idx = 1:nrow(x$at), att.idx = 1:nrow(x$att), ... )
x |
The output of |
type |
A string stating what shall be plotted (see Details). |
CI |
The confidence interval in case |
at.idx |
An vector giving the indexes of the predicted state variables
which shall be plotted if |
att.idx |
An vector giving the indexes of the filtered state variables
which shall be plotted if |
... |
The argument type
states what shall be plotted. type
must partially match one of the following:
state
The state variables are plotted. By the
arguments at.idx
and att.idx
, the user can specify
which of the predicted () and filtered
(
) state variables will be drawn.
resid.qq
Draws a QQ-plot for each residual-series invt
.
qqchisq
A Chi-Squared QQ-plot will be drawn to graphically test for multivariate normality of the residuals based on the Mahalanobis distance.
acf
Creates a pairs plot with the autocorrelation
function (acf
) on the diagonal panels and the
crosscorrelation function (ccf
) of the residuals on the
off-diagnoal panels.
Invisibly returns an list with components:
distance |
The Mahalanobis distance of the residuals as a
vector of length . |
std.resid |
The standardized residuals as an -matrix. It should hold that ,
|
where denotes the dimension of the data and
the number
of observations.
plot(x, type = c("state", "resid.qq", "qqchisq", "acf"),
CI = 0.95, at.idx = 1:nrow(x$at), att.idx = 1:nrow(x$att), ...)
## <---------------------------------------------------------------------------> ## Example: Local level model for the treering data ## <---------------------------------------------------------------------------> ## Transition equation: ## alpha[t+1] = alpha[t] + eta[t], eta[t] ~ N(0, HHt) ## Measurement equation: ## y[t] = alpha[t] + eps[t], eps[t] ~ N(0, GGt) y <- treering y[c(3, 10)] <- NA # NA values can be handled ## Set constant parameters: dt <- ct <- matrix(0) Zt <- Tt <- array(1,c(1,1,1)) a0 <- y[1] # Estimation of the first width P0 <- matrix(100) # Variance of 'a0' ## Estimate parameters: fit.fkf <- optim(c(HHt = var(y, na.rm = TRUE) * .5, GGt = var(y, na.rm = TRUE) * .5), fn = function(par, ...) -fkf(HHt = array(par[1],c(1,1,1)), GGt = array(par[2],c(1,1,1)), ...)$logLik, yt = rbind(y), a0 = a0, P0 = P0, dt = dt, ct = ct, Zt = Zt, Tt = Tt) ## Filter tree ring data with estimated parameters: fkf.obj <- fkf(a0, P0, dt, ct, Tt, Zt, HHt = array(fit.fkf$par[1],c(1,1,1)), GGt = array(fit.fkf$par[2],c(1,1,1)), yt = rbind(y)) ## Plot the width together with fitted local levels: plot(y, main = "Treering data") lines(ts(fkf.obj$att[1, ], start = start(y), frequency = frequency(y)), col = "blue") legend("top", c("Treering data", "Local level"), col = c("black", "blue"), lty = 1) ## Check the residuals for normality: plot(fkf.obj, type = "resid.qq") ## Test for autocorrelation: plot(fkf.obj, type = "acf", na.action = na.pass)
## <---------------------------------------------------------------------------> ## Example: Local level model for the treering data ## <---------------------------------------------------------------------------> ## Transition equation: ## alpha[t+1] = alpha[t] + eta[t], eta[t] ~ N(0, HHt) ## Measurement equation: ## y[t] = alpha[t] + eps[t], eps[t] ~ N(0, GGt) y <- treering y[c(3, 10)] <- NA # NA values can be handled ## Set constant parameters: dt <- ct <- matrix(0) Zt <- Tt <- array(1,c(1,1,1)) a0 <- y[1] # Estimation of the first width P0 <- matrix(100) # Variance of 'a0' ## Estimate parameters: fit.fkf <- optim(c(HHt = var(y, na.rm = TRUE) * .5, GGt = var(y, na.rm = TRUE) * .5), fn = function(par, ...) -fkf(HHt = array(par[1],c(1,1,1)), GGt = array(par[2],c(1,1,1)), ...)$logLik, yt = rbind(y), a0 = a0, P0 = P0, dt = dt, ct = ct, Zt = Zt, Tt = Tt) ## Filter tree ring data with estimated parameters: fkf.obj <- fkf(a0, P0, dt, ct, Tt, Zt, HHt = array(fit.fkf$par[1],c(1,1,1)), GGt = array(fit.fkf$par[2],c(1,1,1)), yt = rbind(y)) ## Plot the width together with fitted local levels: plot(y, main = "Treering data") lines(ts(fkf.obj$att[1, ], start = start(y), frequency = frequency(y)), col = "blue") legend("top", c("Treering data", "Local level"), col = c("black", "blue"), lty = 1) ## Check the residuals for normality: plot(fkf.obj, type = "resid.qq") ## Test for autocorrelation: plot(fkf.obj, type = "acf", na.action = na.pass)
Plotting method for objects of class fks
. This function
provides tools visualisation of the state vector of the Kalman smoother output
## S3 method for class 'fks' plot(x, CI = 0.95, ahatt.idx = 1:nrow(x$ahatt), ...)
## S3 method for class 'fks' plot(x, CI = 0.95, ahatt.idx = 1:nrow(x$ahatt), ...)
x |
The output of |
CI |
The confidence interval in case |
ahatt.idx |
An vector giving the indexes of the predicted state variables
which shall be plotted if |
... |
The state variables are plotted. By the argument ahatt.idx
, the user can specify
which of the smoothed () state variables will be drawn.
## <---------------------------------------------------------------------------> ## Example 3: Local level model for the treering data ## <---------------------------------------------------------------------------> ## Transition equation: ## alpha[t+1] = alpha[t] + eta[t], eta[t] ~ N(0, HHt) ## Measurement equation: ## y[t] = alpha[t] + eps[t], eps[t] ~ N(0, GGt) y <- treering y[c(3, 10)] <- NA # NA values can be handled ## Set constant parameters: dt <- ct <- matrix(0) Zt <- Tt <- array(1,c(1,1,1)) a0 <- y[1] # Estimation of the first width P0 <- matrix(100) # Variance of 'a0' ## Estimate parameters: fit.fkf <- optim(c(HHt = var(y, na.rm = TRUE) * .5, GGt = var(y, na.rm = TRUE) * .5), fn = function(par, ...) -fkf(HHt = array(par[1],c(1,1,1)), GGt = array(par[2],c(1,1,1)), ...)$logLik, yt = rbind(y), a0 = a0, P0 = P0, dt = dt, ct = ct, Zt = Zt, Tt = Tt) ## Filter tree ring data with estimated parameters: fkf.obj <- fkf(a0, P0, dt, ct, Tt, Zt, HHt = array(fit.fkf$par[1],c(1,1,1)), GGt = array(fit.fkf$par[2],c(1,1,1)), yt = rbind(y)) fks.obj <- fks(fkf.obj) plot(fks.obj) ##lines(as.numeric(y),col="blue")
## <---------------------------------------------------------------------------> ## Example 3: Local level model for the treering data ## <---------------------------------------------------------------------------> ## Transition equation: ## alpha[t+1] = alpha[t] + eta[t], eta[t] ~ N(0, HHt) ## Measurement equation: ## y[t] = alpha[t] + eps[t], eps[t] ~ N(0, GGt) y <- treering y[c(3, 10)] <- NA # NA values can be handled ## Set constant parameters: dt <- ct <- matrix(0) Zt <- Tt <- array(1,c(1,1,1)) a0 <- y[1] # Estimation of the first width P0 <- matrix(100) # Variance of 'a0' ## Estimate parameters: fit.fkf <- optim(c(HHt = var(y, na.rm = TRUE) * .5, GGt = var(y, na.rm = TRUE) * .5), fn = function(par, ...) -fkf(HHt = array(par[1],c(1,1,1)), GGt = array(par[2],c(1,1,1)), ...)$logLik, yt = rbind(y), a0 = a0, P0 = P0, dt = dt, ct = ct, Zt = Zt, Tt = Tt) ## Filter tree ring data with estimated parameters: fkf.obj <- fkf(a0, P0, dt, ct, Tt, Zt, HHt = array(fit.fkf$par[1],c(1,1,1)), GGt = array(fit.fkf$par[2],c(1,1,1)), yt = rbind(y)) fks.obj <- fks(fkf.obj) plot(fks.obj) ##lines(as.numeric(y),col="blue")