Title: | Anomaly Detection using the CAPA and PELT Algorithms |
---|---|
Description: | Implementations of the univariate CAPA <doi:10.1002/sam.11586> and PELT <doi:10.1080/01621459.2012.737745> algorithms along with various cost functions for different distributions and models. The modular design, using R6 classes, favours ease of extension (for example user-written cost functions) over the performance of other implementations (e.g. <doi:10.32614/CRAN.package.changepoint>, <doi:10.32614/CRAN.package.anomaly>). |
Authors: | Paul Smith [aut, cre] |
Maintainer: | Paul Smith <[email protected]> |
License: | GPL-3 |
Version: | 0.0.4.2 |
Built: | 2024-12-13 13:46:58 UTC |
Source: | https://github.com/waternumbers/anomalous |
An R implementation of the CAPA algorithm
capa(part, fCost, prune = TRUE, verbose = FALSE)
part |
the starting partition |
fCost |
the cost function |
prune |
logical, should pruning be used |
verbose |
logical, print out progress |
Basic R implementation of capa - not efficient
the optimal partition
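A minimal sketch of how the functions documented here might be combined, assuming the anomalous package is installed. The synthetic series, the injected anomalies and the penalty values (4 log n and 3 log n) are illustrative assumptions, not recommendations taken from the package.

```r
set.seed(1)
n <- 200
x <- rnorm(n)                 # synthetic data: standard normal noise
x[81:100] <- x[81:100] + 3    # injected collective anomaly (assumption)
x[150] <- 8                   # injected point anomaly (assumption)

# Only run the package calls if the package is available.
if (requireNamespace("anomalous", quietly = TRUE)) {
  library(anomalous)
  fCost <- gaussMean$new(x, m = 0, s = 1)      # anomalies as changes in mean
  p <- partition(4 * log(n), 3 * log(n), 2)    # beta, betaP, min_length
  res <- capa(p, fCost)
  print(summary(res))
}
```

Swapping capa for pelt in the call above should run the PELT search on the same partition and cost function, since both share the same arguments.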
Cost functions for the multinomial distribution
Collective anomalies are represented as changes to the expected proportions. Time-varying expected proportions are currently not handled.
length()
Get the length of time series
categoricalCost$length()
new()
Initialise the cost function
categoricalCost$new(x, m = rep(1/ncol(x), ncol(x)))
x
integer matrix of observations of 0,1
m
numeric vector of expected proportions (developer note: need to check that row sums are 1; possibly just use the multinomial cost)
baseCost()
Compute the non-anomalous cost of a segment
categoricalCost$baseCost(a, b, pen = 0)
a
start of period
b
end of period
pen
penalty cost
pointCost()
Compute the point anomaly cost of a time step
categoricalCost$pointCost(b, pen)
b
time step
pen
penalty cost
collectiveCost()
Compute the anomalous cost of a segment
categoricalCost$collectiveCost(a, b, pen, len)
a
start of period
b
end of period
pen
penalty cost
len
minimum number of observations
param()
Compute parameters of a segment if anomalous
categoricalCost$param(a, b)
a
start of period
b
end of period
clone()
The objects of this class are cloneable with this method.
categoricalCost$clone(deep = FALSE)
deep
Whether to make a deep clone.
set.seed(0)
m <- c(1:4)/sum(1:4)
X <- t(rmultinom(100, 1, m))
p <- categoricalCost$new(X,m)
p$baseCost(90,95) ## cost of non-anomalous distribution for x[90:95]
p$pointCost(90,0) ## point anomaly cost for x[90] with 0 penalty
## collective anomaly cost for x[90:95] with penalty of 57 and at least 3 observations
p$collectiveCost(90,95,57,3)
Generates coefficients for the anomalous periods in an object with class anomalous_partition
## S3 method for class 'anomalous_partition'
coef(object, ...)
object |
the anomalous_partition |
... |
optional parameters see details |
Generates coefficients for all anomalous periods. The required input cost is a cost function. The optional input t is the time of the partitioning solution to use.
A matrix of parameters whose rows correspond to the rows in the summary output
Extracts the collective anomaly summaries from the object
collective_anomalies(p, t)
p |
object, such as a partition, from which to extract the collective anomalies |
t |
the end time at which the partition is based |
An implementation of the CROPS algorithm in 1D
crops(betaMin, betaMax, fCost, alg = pelt, betaP = Inf, min_length = 2, prune = TRUE, verbose = FALSE, maxIter = 100)
betaMin |
lower bound of penalisation window |
betaMax |
upper bound of penalisation window |
fCost |
the cost function |
alg |
algorithm, either capa or pelt |
betaP |
penalty for adding a point anomaly - only for use with capa |
min_length |
minimum number of values in a collective anomaly |
prune |
logical, should pruning be used |
verbose |
logical, print out progress |
maxIter |
maximum number of algorithm evaluations to perform |
This only works for cost functions in which the penalty beta is additive.
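To illustrate what an additive penalty means here: if a segmentation with k anomalies has unpenalised cost Q, its penalised cost is assumed to be Q + k * beta, so two segmentations have equal penalised cost at exactly one value of beta. The sketch below uses made-up costs and hypothetical helper names to compute that crossing point; it is not the package's code.

```r
# Penalised cost under an additive penalty: unpenalised cost plus k * beta.
penalised <- function(Q, k, beta) Q + k * beta

# Value of beta at which two segmentations have equal penalised cost.
crossing_beta <- function(Q1, k1, Q2, k2) (Q1 - Q2) / (k2 - k1)

Q_few <- 120; k_few <- 1      # hypothetical: 1 anomaly, higher unpenalised cost
Q_many <- 90; k_many <- 4     # hypothetical: 4 anomalies, lower unpenalised cost

beta_star <- crossing_beta(Q_few, k_few, Q_many, k_many)
beta_star                               # 10
penalised(Q_few, k_few, beta_star)      # 130
penalised(Q_many, k_many, beta_star)    # 130
```

Below the crossing point the segmentation with more anomalies is cheaper, and above it the one with fewer anomalies is. A search over a penalty window can exploit this only when the penalty enters the cost additively.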
something...
Cost functions for differing types of univariate Gaussian anomalies
x |
numeric vector or matrix of observations (see details) |
m |
numeric vector of mean values |
s |
numeric vector of standard deviation values |
point_type |
representation of point anomalies as either a change in mean or variance |
a |
start of period |
b |
end of period |
pen |
penalty cost |
len |
minimum number of observations |
Collective anomalies are represented either as changes in mean (gaussMean), variance (gaussVar) or mean and variance (gaussMeanVar). See vignettes for details.
If x is a matrix then the values in each row are treated as IID replicate observations.
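As an illustration of the three representations, the sketch below evaluates twice the negative Gaussian log-likelihood of a segment under the baseline parameters, and then with the mean, the variance, or both re-estimated from the segment. It is plain base R with a hypothetical helper nll2 and a constant baseline (m = 0, s = 1); the exact costs and penalties used by gaussMean, gaussVar and gaussMeanVar may differ.

```r
# Twice the negative log-likelihood of x under N(m, s^2), summed over x.
nll2 <- function(x, m, s) sum(log(2 * pi * s^2) + ((x - m) / s)^2)

set.seed(0)
x <- rnorm(50)                  # baseline: mean 0, standard deviation 1
x[21:30] <- x[21:30] + 4        # injected change in mean (assumption)
seg <- x[21:30]

base_cost <- nll2(seg, 0, 1)    # segment treated as non-anomalous

mu_hat <- mean(seg)                       # change in mean only
mean_cost <- nll2(seg, mu_hat, 1)

s_var <- sqrt(mean(seg^2))                # change in variance only (mean stays 0)
var_cost <- nll2(seg, 0, s_var)

s_both <- sqrt(mean((seg - mu_hat)^2))    # change in mean and variance
both_cost <- nll2(seg, mu_hat, s_both)

c(base = base_cost, mean = mean_cost, var = var_cost, both = both_cost)
```

Each fitted cost is at most the baseline cost because the parameters are maximum likelihood estimates; a penalty is what stops every segment from being flagged as anomalous.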
length()
Get the length of time series
gaussCost$length()
new()
Initialise the cost function
gaussCost$new(x, m = 0, s = 1, point_type = c("var", "mean"))
x
numeric vector of observations
m
numeric vector of mean values
s
numeric vector of standard deviation values
point_type
representation of point anomalies as either a change in mean or variance
baseCost()
Compute the non-anomalous cost of a segment
gaussCost$baseCost(a, b, pen = 0)
a
start of period
b
end of period
pen
penalty cost
pointCost()
Compute the point anomaly cost of a time step
gaussCost$pointCost(b, pen)
b
time step
pen
penalty cost
clone()
The objects of this class are cloneable with this method.
gaussCost$clone(deep = FALSE)
deep
Whether to make a deep clone.
anomalous::gaussCost -> gaussMean
collectiveCost()
Compute the anomalous cost of a segment
gaussMean$collectiveCost(a, b, pen, len)
a
start of period
b
end of period
pen
penalty cost
len
minimum number of observations
param()
Compute parameters of a segment if anomalous
gaussMean$param(a, b)
a
start of period
b
end of period
clone()
The objects of this class are cloneable with this method.
gaussMean$clone(deep = FALSE)
deep
Whether to make a deep clone.
anomalous::gaussCost -> gaussVar
collectiveCost()
Compute the anomalous cost of a segment
gaussVar$collectiveCost(a, b, pen, len)
a
start of period
b
end of period
pen
penalty cost
len
minimum number of observations
param()
Compute parameters of a segment if anomalous
gaussVar$param(a, b)
a
start of period
b
end of period
clone()
The objects of this class are cloneable with this method.
gaussVar$clone(deep = FALSE)
deep
Whether to make a deep clone.
anomalous::gaussCost -> gaussMeanVar
collectiveCost()
Compute the anomalous cost of a segment
gaussMeanVar$collectiveCost(a, b, pen, len)
a
start of period
b
end of period
pen
penalty cost
len
minimum number of observations
param()
Compute parameters of a segment if anomalous
gaussMeanVar$param(a, b)
a
start of period
b
end of period
clone()
The objects of this class are cloneable with this method.
gaussMeanVar$clone(deep = FALSE)
deep
Whether to make a deep clone.
set.seed(0)
m <- runif(100)
s <- pmax(1e-4,runif(100))
x <- rnorm(100,m,s) ## example data
gM <- gaussMean$new(x,m,s) ## anomalies are changes in mean
gM$baseCost(90,95) ## cost of non-anomalous distribution for x[90:95]
gM$pointCost(90,0) ## point anomaly cost for x[90] with 0 penalty
## collective anomaly cost for x[90:95] with penalty of 57 and at least 3 observations
gM$collectiveCost(90,95,57,3)
gV <- gaussVar$new(x,m,s) ## anomalies are changes in variance
gV$baseCost(90,95) ## cost of non-anomalous distribution for x[90:95]
gV$pointCost(90,0) ## point anomaly cost for x[90] with 0 penalty
## collective anomaly cost for x[90:95] with penalty of 57 and at least 3 observations
gV$collectiveCost(90,95,57,3)
gMV <- gaussMeanVar$new(x,m,s) ## anomalies are changes in mean and variance
gMV$baseCost(90,95) ## cost of non-anomalous distribution for x[90:95]
gMV$pointCost(90,0) ## point anomaly cost for x[90] with 0 penalty
## collective anomaly cost for x[90:95] with penalty of 57 and at least 3 observations
gMV$collectiveCost(90,95,57,3)
Cost functions for differing types of univariate Gaussian anomalies
Collective anomalies are represented either as changes in parameters describing the mean (gaussRegMean), variance (gaussRegVar) or mean and variance (gaussRegMeanVar). Changes in variance are represented as a scaling parameter, not changes to the covariance. See vignettes for details.
Each element of the input x should be a list containing a vector of observations y and the corresponding design matrix X. Optionally it can also include a vector of parameters m and a covariance matrix S.
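The input structure described above can be sketched in base R as follows. Each list element holds the observations y and design matrix X for one time step; pooling the elements of a candidate segment and solving the least squares problem gives one plausible estimate of the mean parameters over that segment. The data, the helper name segment_fit and the pooling step are illustrative assumptions, not the package's internal method.

```r
set.seed(10)
theta <- c(5, 1)                                 # assumed regression parameters
x <- lapply(1:6, function(ii) {
  X <- cbind(1, runif(20, ii - 1, ii))           # intercept and one covariate
  list(y = rnorm(20, X %*% theta, 0.5), X = X)   # one list element per time step
})

# Hypothetical helper: least squares fit pooled over time steps a to b.
segment_fit <- function(x, a, b) {
  y <- unlist(lapply(x[a:b], `[[`, "y"))
  X <- do.call(rbind, lapply(x[a:b], `[[`, "X"))
  qr.solve(X, y)                                 # least squares solution
}

est <- segment_fit(x, 2, 5)
est    # should be close to theta
```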
length()
Get the length of time series
gaussRegCost$length()
new()
Initialise the cost function
gaussRegCost$new(x, non_neg = FALSE)
x
a list of regressions (see details)
non_neg
should only non-negative parameter solutions be considered
baseCost()
Compute the non-anomalous cost of a segment
gaussRegCost$baseCost(a, b, pen = 0)
a
start of period
b
end of period
pen
penalty cost
pointCost()
Compute the point anomaly cost of a time step
gaussRegCost$pointCost(b, pen)
b
time step
pen
penalty cost
clone()
The objects of this class are cloneable with this method.
gaussRegCost$clone(deep = FALSE)
deep
Whether to make a deep clone.
anomalous::gaussRegCost -> gaussRegMean
collectiveCost()
Compute the anomalous cost of a segment
gaussRegMean$collectiveCost(a, b, pen, len)
a
start of period
b
end of period
pen
penalty cost
len
minimum number of observations
param()
Compute parameters of a segment if anomalous
gaussRegMean$param(a, b)
a
start of period
b
end of period
clone()
The objects of this class are cloneable with this method.
gaussRegMean$clone(deep = FALSE)
deep
Whether to make a deep clone.
anomalous::gaussRegCost -> gaussRegVar
collectiveCost()
Compute the anomalous cost of a segment
gaussRegVar$collectiveCost(a, b, pen, len)
a
start of period
b
end of period
pen
penalty cost
len
minimum number of observations
param()
Compute parameters of a segment if anomalous
gaussRegVar$param(a, b)
a
start of period
b
end of period
clone()
The objects of this class are cloneable with this method.
gaussRegVar$clone(deep = FALSE)
deep
Whether to make a deep clone.
anomalous::gaussRegCost -> gaussRegMeanVar
collectiveCost()
Compute the anomalous cost of a segment
gaussRegMeanVar$collectiveCost(a, b, pen, len)
a
start of period
b
end of period
pen
penalty cost
len
minimum number of observations
param()
Compute parameters of a segment if anomalous
gaussRegMeanVar$param(a, b)
a
start of period
b
end of period
clone()
The objects of this class are cloneable with this method.
gaussRegMeanVar$clone(deep = FALSE)
deep
Whether to make a deep clone.
## simple test
set.seed(10)
x <- list()
n <- 120
for(ii in 1:48){
    if(ii < 10){ theta <- c(1,0); sigma <- 0.1 }
    if(ii >= 10 & ii < 12){ theta <- c(10,0); sigma <- 2 }
    if(ii >= 12 & ii < 44){ theta <- c(5,1); sigma <- 2 }
    if(ii >= 44){ theta <- c(1,0); sigma <- 0.1 }
    X <- cbind(rep(1,n),runif(n,ii-1,ii))
    y <- rnorm(n, X%*%theta, sigma)
    x[[ii]] <- list(y=y,X=X)
}
fCost <- gaussRegMeanVar$new(x)
p <- partition(4*log(sum(sapply(y,length))),NA,2)
res <- pelt(p,fCost)
Cost functions for the Least Absolute Deviation from a given quantile
This is a very naive and slow implementation.
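A minimal base R sketch of the idea, assuming the cost is the quantile check (pinball) loss of the residuals x - m. The helper name check_loss is hypothetical and the package's exact cost may differ. For an anomalous segment the sketch shifts the residuals by their empirical tau-quantile, which minimises the check loss.

```r
# Check (pinball) loss: tau * u for u >= 0 and (tau - 1) * u for u < 0.
check_loss <- function(u, tau) sum(u * (tau - (u < 0)))

set.seed(0)
m <- runif(100)                 # expected value of x
x <- rnorm(100, m)
x[40:60] <- x[40:60] + 3        # injected shift (assumption)

tau <- 0.5
r <- x[40:60] - m[40:60]        # residuals over the candidate segment

base_cost <- check_loss(r, tau)                      # no shift allowed
shift <- quantile(r, tau, type = 1, names = FALSE)   # empirical tau-quantile
anom_cost <- check_loss(r - shift, tau)              # shift fitted to the segment

c(base = base_cost, anomalous = anom_cost)
```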
length()
Get the length of time series
ladCost$length()
new()
Initialise the cost function
ladCost$new(x, m = 0, tau = 0.5)
x
numeric vector of observations
m
expected value of x
tau
the quantile
baseCost()
Compute the non-anomalous cost of a segment
ladCost$baseCost(a, b, pen = 0)
a
start of period
b
end of period
pen
penalty cost
pointCost()
Compute the point anomaly cost of a time step
ladCost$pointCost(a, pen)
a
time step
pen
penalty cost
collectiveCost()
Compute the anomalous cost of a segment
ladCost$collectiveCost(a, b, pen, len)
a
start of period
b
end of period
pen
penalty cost
len
minimum number of observations
param()
Compute parameters of a segment if anomalous
ladCost$param(a, b)
a
start of period
b
end of period
clone()
The objects of this class are cloneable with this method.
ladCost$clone(deep = FALSE)
deep
Whether to make a deep clone.
set.seed(0)
m <- runif(100)
x <- rnorm(100,m)
p <- ladCost$new(x,m,0.5)
p$baseCost(90,95) ## cost of non-anomalous distribution for x[90:95]
p$pointCost(90,0) ## point anomaly cost for x[90] with 0 penalty
## collective anomaly cost for x[90:95] with penalty of 57 and at least 3 observations
p$collectiveCost(90,95,57,3)
This dataset is taken from Lai W, Johnson MJ, Kucherlapati R, Park PJ, Bioinformatics, 2005. The paper states that the original source of the data is Bredel et al. (2005). The data is an excerpt of chromosome 7 in GBM29 from 40 to 65 Mb.
This version of the data is a copy of that in the changepoint package.
data(Lai2005fig4)
A matrix of dimensions 193 x 5. The columns are Spot, CH, POS.start, POS.end, GBM31.
http://compbio.med.harvard.edu/Supplements/Bioinformatics05b/Profiles/Chrom_7_from40_to65Mb_GBM29.xls
VERY experimental - do not use
length()
Get the length of time series
localRegCost$length()
new()
Initialise the cost function
localRegCost$new(x, family = c("gaussian", "symmetric"))
x
observations, in the same format as for gaussRegCost
family
the family used for fitting, either "gaussian" or "symmetric"
baseCost()
Compute the non-anomalous cost of a segment
localRegCost$baseCost(a, b, pen = 0)
a
start of period
b
end of period
pen
penalty cost
pointCost()
Compute the point anomaly cost of a time step
localRegCost$pointCost(b, pen)
b
time step
pen
penalty cost
collectiveCost()
Compute the anomalous cost of a segment
localRegCost$collectiveCost(a, b, pen, len)
a
start of period
b
end of period
pen
penalty cost
len
minimum number of observations
param()
Compute parameters of a segment if anomalous
localRegCost$param(a, b)
a
start of period
b
end of period
clone()
The objects of this class are cloneable with this method.
localRegCost$clone(deep = FALSE)
deep
Whether to make a deep clone.
Temperature sensor data of an internal component of a large, industrial machine. The data contains three known anomalies. The first anomaly is a planned shutdown of the machine. The second anomaly is difficult to detect and directly led to the third anomaly, a catastrophic failure of the machine. The data consists of 22695 observations of machine temperature recorded at 5 minute intervals along with the date and time of the measurement. The data was obtained from the Numenta Anomaly Benchmark, which can be found at https://github.com/numenta/NAB.
data(machinetemp)
A dataframe with 22695 rows and 2 columns. The first column contains the date and time of the temperature measurement. The second column contains the machine temperature.
Cost functions for the multinomial distribution
Collective anomalies are represented as changes to the expected proportions. Time-varying expected proportions are currently not handled.
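As an illustration, a change in expected proportions over a segment can be assessed by comparing the multinomial log-likelihood under the expected proportions m with that under proportions re-estimated from the segment. This base R sketch uses a hypothetical helper multinom_cost and drops the multinomial coefficient, which is the same in both cases; the package's cost may be defined or scaled differently.

```r
# Twice the negative multinomial log-likelihood of the count rows X under
# proportions p, without the multinomial coefficient (it does not depend on p).
multinom_cost <- function(X, p) -2 * sum(X %*% log(p))

set.seed(0)
m <- (1:4) / sum(1:4)                        # expected proportions
X <- t(rmultinom(100, 144, m))               # 100 time steps, 4 categories
X[90:95, ] <- t(rmultinom(6, 144, rev(m)))   # injected change (assumption)

seg <- X[90:95, , drop = FALSE]
p_hat <- colSums(seg) / sum(seg)             # maximum likelihood proportions

base_cost <- multinom_cost(seg, m)           # segment under expected proportions
anom_cost <- multinom_cost(seg, p_hat)       # segment under fitted proportions

c(base = base_cost, anomalous = anom_cost)
```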
length()
Get the length of time series
multinomialCost$length()
new()
Initialise the cost function
multinomialCost$new(x, m = rep(1/ncol(x), ncol(x)))
x
integer matrix of observations
m
numeric vector of expected proportions
baseCost()
Compute the non-anomalous cost of a segment
multinomialCost$baseCost(a, b, pen = 0)
a
start of period
b
end of period
pen
penalty cost
pointCost()
Compute the point anomaly cost of a time step
multinomialCost$pointCost(b, pen)
b
time step
pen
penalty cost
collectiveCost()
Compute the anomalous cost of a segment
multinomialCost$collectiveCost(a, b, pen, len)
a
start of period
b
end of period
pen
penalty cost
len
minimum number of observations
param()
Compute parameters of a segment if anomalous
multinomialCost$param(a, b)
a
start of period
b
end of period
clone()
The objects of this class are cloneable with this method.
multinomialCost$clone(deep = FALSE)
deep
Whether to make a deep clone.
set.seed(0)
m <- c(1:4)/sum(1:4)
X <- t(rmultinom(100, 144, m))
p <- multinomialCost$new(X,m)
p$baseCost(90,95) ## cost of non-anomalous distribution for x[90:95]
p$pointCost(90,0) ## point anomaly cost for x[90] with 0 penalty
## collective anomaly cost for x[90:95] with penalty of 57 and at least 3 observations
p$collectiveCost(90,95,57,3)
Get the parameters for a partitioning result
param(res, fCost)
res |
the result of a partitioning algorithm |
fCost |
the cost function |
Not yet implemented for all cost functions
list of parameters
A partition records the separation of the data generated by the pelt or capa methods
partition(beta, betaP, min_length)
beta |
penalty for a new collective anomaly |
betaP |
penalty for a new point anomaly |
min_length |
shortest length of a collective anomaly |
p <- partition(3,4,2)
An R implementation of the PELT algorithm
pelt(part, fCost, prune = TRUE, verbose = FALSE)
part |
the starting partition |
fCost |
the cost function |
prune |
logical, should pruning be used |
verbose |
logical, print out progress |
Basic R implementation of pelt - not efficient
the optimal partition
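For readers unfamiliar with the search itself, the following is a self-contained base R sketch of the classic PELT recursion for changes in mean, using the residual sum of squares as segment cost. It is an illustration under those assumptions, with a hypothetical function name pelt_mean; the pelt function documented here works with a partition object and a cost-function object instead.

```r
pelt_mean <- function(x, beta) {
  n <- length(x)
  S1 <- c(0, cumsum(x)); S2 <- c(0, cumsum(x^2))
  # Residual sum of squares about the mean for observations (s + 1):tt.
  cost <- function(s, tt) (S2[tt + 1] - S2[s + 1]) - (S1[tt + 1] - S1[s + 1])^2 / (tt - s)
  Fopt <- c(-beta, rep(NA_real_, n))   # Fopt[tt + 1]: optimal cost of x[1:tt]
  last <- integer(n)                   # last[tt]: final changepoint before tt
  R <- 0                               # candidate changepoints kept after pruning
  for (tt in 1:n) {
    cand <- Fopt[R + 1] + cost(R, tt)  # candidate costs without the new penalty
    Fopt[tt + 1] <- min(cand) + beta
    last[tt] <- R[which.min(cand)]
    R <- c(R[cand <= Fopt[tt + 1]], tt)  # pruning: drop candidates that cannot win
  }
  cps <- integer(0); tt <- n           # backtrack from the end of the series
  while (last[tt] > 0) { tt <- last[tt]; cps <- c(tt, cps) }
  cps
}

set.seed(1)
x <- c(rnorm(50, 0), rnorm(50, 5), rnorm(50, 0))   # synthetic data (assumption)
cps <- pelt_mean(x, beta = 3 * log(length(x)))
cps   # estimated last time steps of each segment before a change
```

The pruning line is what distinguishes PELT from plain optimal partitioning: candidates whose cost already exceeds the current optimum are never revisited. This is presumably what the prune argument switches on or off.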
Provides a summary plot of an object with class anomalous_partition
## S3 method for class 'anomalous_partition'
plot(x, ...)
x |
the anomalous_partition |
... |
optional parameters see details |
If a time t is provided, the results produced are as though the analysis had stopped at that time.
The optional inputs xx and yy allow data to be displayed under the shaded anomaly/changepoint areas.
If no data is provided, xx = 1:t and yy is the cost of the period that each time step is in.
Produces a plot
Extracts the point anomaly summaries from the object
point_anomalies(p, t)
p |
object, such as a partition, from which to extract the point anomalies |
t |
the end time at which the partition is based |
Cost functions for the univariate Poisson distribution
Collective anomalies are represented as multiplicative changes in rate
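To make the multiplicative change concrete: if the expected rates over a segment are r and the anomalous rates are assumed to be lambda * r, the maximum likelihood estimate of lambda is sum(x) / sum(r). The base R sketch below uses a hypothetical helper pois_cost built on dpois; the package's cost may differ by constants or scaling.

```r
# Twice the negative Poisson log-likelihood of counts x under rates r.
pois_cost <- function(x, r) -2 * sum(dpois(x, r, log = TRUE))

set.seed(0)
r <- 8 + runif(100) * 2                # expected rates
x <- rpois(100, lambda = r)
x[90:95] <- rpois(6, 2 * r[90:95])     # injected doubling of the rate (assumption)

seg <- 90:95
lambda_hat <- sum(x[seg]) / sum(r[seg])   # maximum likelihood rate multiplier

base_cost <- pois_cost(x[seg], r[seg])                # expected rates
anom_cost <- pois_cost(x[seg], lambda_hat * r[seg])   # rescaled rates

c(multiplier = lambda_hat, base = base_cost, anomalous = anom_cost)
```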
length()
Get the length of time series
poisCost$length()
new()
Initialise the cost function
poisCost$new(x, rate = 1)
x
numeric vector of observations
rate
numeric vector of rate parameters
baseCost()
Compute the non-anomalous cost of a segment
poisCost$baseCost(a, b, pen = 0)
a
start of period
b
end of period
pen
penalty cost
pointCost()
Compute the point anomaly cost of a time step
poisCost$pointCost(b, pen)
b
time step
pen
penalty cost
collectiveCost()
Compute the anomalous cost of a segment
poisCost$collectiveCost(a, b, pen, len)
a
start of period
b
end of period
pen
penalty cost
len
minimum number of observations
param()
Compute parameters of a segment if anomalous
poisCost$param(a, b)
a
start of period
b
end of period
clone()
The objects of this class are cloneable with this method.
poisCost$clone(deep = FALSE)
deep
Whether to make a deep clone.
set.seed(0)
r <- 8 + runif(100)*2
x <- rpois(100,lambda = r)
p <- poisCost$new(x,r)
p$baseCost(90,95) ## cost of non-anomalous distribution for x[90:95]
p$pointCost(90,0) ## point anomaly cost for x[90] with 0 penalty
## collective anomaly cost for x[90:95] with penalty of 57 and at least 3 observations
p$collectiveCost(90,95,57,3)
This is VERY developmental - do not use
length()
Get the length of time series
rankCost$length()
new()
Initialise the cost function
rankCost$new(x, m = 0)
x
numeric matrix of observations
m
numeric vector or matrix of mean values
baseCost()
Compute the non-anomalous cost of a segment
rankCost$baseCost(a, b, pen = 0)
a
start of period
b
end of period
pen
penalty cost
pointCost()
Compute the point anomaly cost of a time step
rankCost$pointCost(b, pen)
b
time step
pen
penalty cost
collectiveCost()
Compute the anomalous cost of a segment
rankCost$collectiveCost(a, b, pen, len)
a
start of period
b
end of period
pen
penalty cost
len
minimum number of observations
param()
Compute parameters of a segment if anomalous
rankCost$param(a, b)
a
start of period
b
end of period
clone()
The objects of this class are cloneable with this method.
rankCost$clone(deep = FALSE)
deep
Whether to make a deep clone.
set.seed(0)
m <- runif(100)
s <- pmax(1e-4,runif(100))
x <- rnorm(100,m,s) ## example data
A simulated data set for use in the examples and vignettes. The data consists of 500 observations on 20 variates drawn from the standard normal distribution. Within the data there are three multivariate anomalies of length 15 located at t=100, t=200, and t=300 for which the mean changes from 0 to 2. The anomalies affect variates 1 to 8, 1 to 12 and 1 to 16 respectively.
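Data with the structure described above could be generated along the following lines. This is a sketch, not the code used to create the packaged data set; it assumes each anomaly starts at the stated time and lasts 15 consecutive time steps.

```r
set.seed(0)
n <- 500; p <- 20
sim <- matrix(rnorm(n * p), nrow = n, ncol = p)   # standard normal background

starts <- c(100, 200, 300)     # anomaly start times
affected <- c(8, 12, 16)       # number of leading variates affected
for (i in seq_along(starts)) {
  rows <- starts[i] + 0:14     # 15 consecutive time steps (assumption)
  cols <- seq_len(affected[i])
  sim[rows, cols] <- sim[rows, cols] + 2   # mean changes from 0 to 2
}

dim(sim)                       # 500 20
mean(sim[100:114, 1:8])        # should be near 2
```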
data(simulated)
A matrix with 500 rows and 40 columns.
Provides a summary of an object with class anomalous_partition
## S3 method for class 'anomalous_partition'
summary(object, ...)
object |
the anomalous_partition |
... |
optional parameters see details |
If a time t is provided, the results produced are as though the analysis had stopped at that time.
A data.frame summarising the partitions
Daily average wind speeds for 1961-1978 at 12 synoptic meteorological stations in the Republic of Ireland (Haslett and Raftery 1989). Wind speeds are in knots (1 knot = 0.5144 m/s), at each of the stations in the order given in Fig. 4 of Haslett and Raftery (1989, see below).
This data is a copy of that contained within the gstat package
data(wind)
data.frame wind contains the following columns:
year, minus 1900
month (number) of the year
day
average wind speed in knots at station RPT
average wind speed in knots at station VAL
average wind speed in knots at station ROS
average wind speed in knots at station KIL
average wind speed in knots at station SHA
average wind speed in knots at station BIR
average wind speed in knots at station DUB
average wind speed in knots at station CLA
average wind speed in knots at station MUL
average wind speed in knots at station CLO
average wind speed in knots at station BEL
average wind speed in knots at station MAL
data.frame wind.loc contains the following columns:
Station name
Station code
Latitude, in DMS, see examples below
Longitude, in DMS, see examples below
mean wind for each station, metres per second
This data set comes with the following message: “Be aware that the dataset is 532494 bytes long (thats over half a Megabyte). Please be sure you want the data before you request it.” The data were obtained on Oct 12, 2008, from: http://www.stat.washington.edu/raftery/software.html The data are also available from statlib. Locations of 11 of the stations (ROS, Rosslare has been thrown out because it fits poorly the spatial correlations of the other stations) were obtained from: http://www.stat.washington.edu/research/reports/2005/tr475.pdf The Rosslare latitude and longitude were obtained from Google Maps, location Rosslare. The mean wind value for Rosslare comes from Fig. 1 in the original paper. Haslett and Raftery proposed to use a sqrt-transform to stabilize the variance.
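Two small steps mentioned for this data set, converting knots to metres per second and applying the square-root transform, can be sketched in base R. The wind speeds below are synthetic stand-ins rather than values from the data set, the helper name knots_to_ms is hypothetical, and the conversion uses the definition of the knot as one nautical mile (1852 m) per hour.

```r
# 1 knot = 1852 m per hour = 1852 / 3600 m/s.
knots_to_ms <- function(k) k * 1852 / 3600

set.seed(0)
speed_knots <- rgamma(365, shape = 4, scale = 3)   # synthetic daily averages

speed_ms <- knots_to_ms(speed_knots)   # metres per second
speed_sqrt <- sqrt(speed_knots)        # square-root transform

knots_to_ms(1)                         # about 0.514
```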
Adrian Raftery; imported to R by Edzer Pebesma
These data were analyzed in detail in the following article:
Haslett, J. and Raftery, A. E. (1989). Space-time Modelling with Long-memory Dependence: Assessing Ireland's Wind Power Resource (with Discussion). Applied Statistics 38, 1-50.
and in many later papers on space-time analysis, for example:
Tilmann Gneiting, Marc G. Genton, Peter Guttorp: Geostatistical Space-Time Models, Stationarity, Separability and Full symmetry. Ch. 4 in: B. Finkenstaedt, L. Held, V. Isham, Statistical Methods for Spatio-Temporal Systems.
data(wind)
summary(wind)