Title: | Calculation and Application of p-Variation |
---|---|
Description: | Calculation of the p-variation of finite sample data. This package is a realisation of the procedure described in Butkus, V. & Norvaisa, R. Lith Math J (2018). <doi: 10.1007/s10986-018-9414-3> The formal definitions and references to the literature are given in the vignette. |
Authors: | Vygantas Butkus |
Maintainer: | Vygantas Butkus <[email protected]> |
License: | GPL-2 |
Version: | 2.2.7 |
Built: | 2025-02-05 04:41:10 UTC |
Source: | https://github.com/cran/pvar |
This package deals with p-variation for a sample (i.e. a sequence of data values). Its main purpose is to calculate the p-variation of a sample. Nonetheless, it can also be used to calculate the p-variation of an arbitrary piecewise monotonic function. Moreover, the package includes one example of a practical application of p-variation.
Package: | pvar |
Type: | Package |
Version: | 2.2.5 |
Date: | 2016-05-17 |
License: | GPL-2 |
Institution: | Vilnius University Faculty of Mathematics and Informatics |
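This package is about p-variation. It deals with the p-variation of finite sample data. To be precise, let's start with the definitions. Originally, p-variation is defined for functions.
For a function $f:[0,1] \rightarrow R$ and $0 < p < \infty$, the p-variation is defined as
$$v_p(f) = \sup\left\{ \sum_{i=1}^m |f(t_i) - f(t_{i-1})|^p : 0=t_0<t_1<\dots<t_m=1,\ m \geq 1 \right\}.$$
Analogously, for a sequence of values $X_0, X_1, \dots, X_n$, the p-variation is defined as
$$v_p(X) = \max\left\{ \sum_{i=1}^k |X_{j_i} - X_{j_{i-1}}|^p : 0=j_0<j_1<\dots<j_k=n,\ k=1,2,\dots,n \right\}.$$
The points $0=t_0<t_1<\dots<t_m=1$ (or $0=j_0<j_1<\dots<j_k=n$) that achieve the maximum are called a supreme partition (or just a partition for short).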
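The p-variation of a sequence is the largest sum of p-th powers of absolute increments over all sub-sequences that keep both end points. For a very short sample this can be checked by enumerating every such sub-sequence. The following is a minimal sketch in base R, not the procedure the package implements; `pvar_brute` is a hypothetical helper introduced only for illustration, and its cost grows as 2^n.

```r
# Hypothetical helper (not part of pvar): brute-force p-variation of a short sample.
pvar_brute <- function(x, p) {
  n <- length(x)
  stopifnot(n >= 2, p > 0)
  inner <- if (n > 2) 2:(n - 1) else integer(0)
  best <- -Inf
  best_part <- NULL
  # Every subset of the inner indices, plus both end points, is one partition.
  for (mask in 0:(2^length(inner) - 1)) {
    keep <- inner[(mask %/% 2^(seq_along(inner) - 1)) %% 2 == 1]
    part <- c(1, keep, n)
    s <- sum(abs(diff(x[part]))^p)
    if (s > best) {
      best <- s
      best_part <- part
    }
  }
  list(value = best, partition = best_part)
}

res <- pvar_brute(c(0, 1, 0.5, 2, 1.5), p = 2)
res$value      # 4.25
res$partition  # 1 4 5
```

With p = 2 the best choice skips the small zig-zag at the second and third points, because one large increment contributes more than several small ones.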
There are two main functions in this package, namely pvar and PvarBreakTest.
The main function in this package is pvar. It calculates the p-variation and the partition.
The function PvarBreakTest is one example of an application of p-variation. It performs a structural break test on the vector x that examines whether there are multiple shifts in mean inside the vector x.
All other functions are provided only for supporting and illustrative purposes.
Author and Maintainer: Vygantas Butkus <[email protected]>.
Special thanks to Rimas Norvaisa, the supervisor of my studies.
[1] V. Butkus, R. Norvaisa. Lith Math J (2018). https://doi.org/10.1007/s10986-018-9414-3
[2] R. M. Dudley, R. Norvaisa. An Introduction to p-variation and Young Integrals, Cambridge, Mass., 1998.
[3] R. M. Dudley, R. Norvaisa. Differentiability of Six Operators on Nonsmooth Functions and p-Variation, Springer Berlin Heidelberg, Print ISBN 978-3-540-65975-4, Lecture Notes in Mathematics Vol. 1703, 1999.
[4] R. Norvaisa, A. Rackauskas. Convergence in law of partial sum processes in p-variation norm. Lith. Math. J., 2008, Vol. 48, No. 2, 212-227.
[5] J. Qian. The p-variation of Partial Sum Processes and the Empirical Process. The Annals of Probability, 1998, Vol. 26, No. 3, 1370-1383.
The main function is pvar: it finds the p-variation and the partition that maximizes the Sum_p function.
Another important function is PvarBreakTest: it performs a structural break test of the vector x by calculating the p-variation of BridgeT(x) (see BridgeT).
Concatenate Strings
x %.% y
x |
the left-hand value to concatenate. |
y |
the right-hand value to concatenate. |
The same result may be achieved with paste, but in some circumstances this function is more user friendly.
A character string of the concatenated values.
paste('I ', 'love ', 'R.', sep='')
'I ' %.% 'love ' %.% 'R.'

x = c(2,1,6,7,9)
paste('The length of vector (', paste(x , sep='', collapse =','), ') is ', length(x) , sep='')
'The length of vector (' %.% paste(x , sep='', collapse =',') %.% ') is ' %.% length(x)
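An infix operator of this kind can be sketched in base R as a thin wrapper around paste. The name `%cat%` is hypothetical and is used here only to avoid suggesting that this is the package's own definition of `%.%`.

```r
# Hypothetical infix operator; assumed to behave like paste(x, y, sep = "").
`%cat%` <- function(x, y) paste(x, y, sep = "")

greeting <- "I " %cat% "love " %cat% "R."
label <- "n = " %cat% 5   # non-character values are converted by paste
```

Because user-defined `%...%` operators are left-associative, a chain of them reads left to right like a single paste call.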
Merges two objects of p-variation and effectively recalculates the p-variation of joined sample.
AddPvar(PV1, PV2, AddIfPossible = TRUE)
PV1 |
an object of the class pvar. |
PV2 |
an object of the class pvar. |
AddIfPossible |
|
Note: a short form of AddPvar(PV1, PV2) is PV1 + PV2.
An object of the class pvar. See pvar.
### creating two pvar objects:
x = rwiener(1000)
PV1 = pvar(x[1:500], 2)
PV2 = pvar(x[500:1000], 2)

layout(matrix(c(1,3,2,3), 2, 2))
plot(PV1)
plot(PV2)
plot(AddPvar(PV1, PV2))
layout(1)

### AddPvar(PV1, PV2) is equivalent to PV1 + PV2
IsEqualPvar(AddPvar(PV1, PV2), PV1 + PV2)
Transforms data by Bridge transformation.
BridgeT(x, normalize = TRUE)
x |
a numeric vector of data values. |
normalize |
|
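Let n denote the length of x.
For each $m \in [1, n]$, the bridge transformation BridgeT is defined as
$$BridgeT(m, x) = \sum_{i=1}^m x_i - \frac{m}{n} \sum_{i=1}^n x_i.$$
Meanwhile, the transformation with normalization is
$$BridgeT(m, x) = \frac{1}{\sqrt{n \, var(x)}} \left\{ \sum_{i=1}^m x_i - \frac{m}{n} \sum_{i=1}^n x_i \right\}.$$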
A numeric vector.
x <- rnorm(1000)
Bx <- BridgeT(x, FALSE)
op <- par(mfrow=c(2,1),mar=c(4,4,2,1))
plot(cumsum(x), type="l")
plot(Bx, type="l")
par(op)
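A bridge transformation of this kind can be sketched in base R as the partial sums minus the straight line that joins zero to the total sum, so the transformed path returns to zero at the last point. `bridge_sketch` is a hypothetical name, and the normalization by `sqrt(n * var(x))` is an assumption made for illustration rather than a statement about the package's code.

```r
# Hypothetical re-implementation for illustration; not the package's BridgeT().
bridge_sketch <- function(x, normalize = TRUE) {
  n <- length(x)
  m <- seq_len(n)
  # Partial sums minus the proportional share of the total sum.
  b <- cumsum(x) - m / n * sum(x)
  # Assumption: normalization divides by sqrt(n * var(x)).
  if (normalize) b <- b / sqrt(n * var(x))
  b
}

set.seed(1)
x <- rnorm(200)
bx <- bridge_sketch(x, normalize = FALSE)
bx_shifted <- bridge_sketch(x + 5, normalize = FALSE)
```

Adding a constant to every observation leaves the result unchanged, which is why a test built on this transformation reacts to shifts in mean inside the sample and not to the overall level.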
Finds change points (i.e. corners) in the numeric vector.
ChangePoints(x)
x |
|
The end points of the vector will always be included in the results.
A vector of indices of the change points.
x <- rwiener(100)
cid <- ChangePoints(x)
plot(x, type="l")
points(time(x)[cid], x[cid], cex=0.5, col=2, pch=19)
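The idea of a corner can be sketched in base R: an interior point is a change point when the sign of the increment before it differs from the sign of the increment after it. `change_points_sketch` is a hypothetical helper, and it treats flat stretches (zero increments) naively, which may differ from what ChangePoints does.

```r
# Hypothetical helper for illustration; not the package's ChangePoints().
change_points_sketch <- function(x) {
  n <- length(x)
  if (n <= 2) return(seq_len(n))
  s <- sign(diff(x))
  # Interior point i is a corner when the slope sign changes at i.
  inner <- which(s[-1] != s[-length(s)]) + 1
  # Both end points are always included.
  c(1, inner, n)
}

cp <- change_points_sketch(c(0, 1, 2, 1, 0, 3))
cp  # 1 3 5 6
```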
The test PvarBreakTest uses quantiles from Monte-Carlo simulations. The results of the simulations are saved in these data sets.
PvarQuantileDF
MeanCoef
SdCoef
PvarQuantileDF is a data.frame with the fields prob and Quant. The field prob represents the probability and Quant gives the corresponding quantile.
MeanCoef and SdCoef are named vectors used in the functions getMean and getSd.
The distribution of the p-variation of BridgeT(x) is unknown, therefore it was approximated from a Monte-Carlo simulation based on 140 million iterations. The data frame PvarQuantileDF summarizes the distribution of the normalized statistic. Meanwhile, MeanCoef and SdCoef define the coefficients of the functional form of the mean and sd statistics of the PvarBreakTest statistic (see getMean).
Vygantas Butkus <[email protected]>
Monte-Carlo simulation
Two pvar objects are considered to be equal if they have the same x, p, value and the same value of x at the points of the partition (the indices of the partitions are not necessarily the same). All other attributes, like dname or TimeLabel, are not important.
IsEqualPvar(pv1, pv2)
pv1 |
an object of the class pvar. |
pv2 |
an object of the class pvar. |
x <- rwiener(100)
pv1 <- pvar(x, 2)
pv2 <- pvar(x[1:50], 2) + pvar(x[50:101], 2)
IsEqualPvar(pv1, pv2)
Calculates p-variation of the sample.
pvar(x, p, TimeLabel = as.vector(time(x)), LSI = 3)

## S3 method for class 'pvar'
summary(object, ...)

## S3 method for class 'pvar'
plot(x, main = "p-variation", ylab = x$dname,
  sub = "p=" %.% round(x$p, 5) %.% ", p-variation: " %.% formatC(x$value, 5, format = "f"),
  col.PP = 2, cex.PP = 0.5, ...)
x |
a (non-empty) numeric vector of data values or an object of the class pvar. |
p |
a positive number indicating the power |
TimeLabel |
numeric, a time index of x. |
LSI |
a length of small interval. It must be a positive odd number. This parameter does not affect the final result, but might influence the speed of calculation. |
object |
an object of the class pvar. |
... |
further arguments. |
main |
a |
ylab |
a |
sub |
a |
col.PP |
the color of partition points. |
cex.PP |
the cex of partition points. |
This function is the main function in this package. It calculates the p-variation of the sample. The formal definition is given in pvar-package.
An object of the class pvar. Namely, it is a list that contains
value |
a value of p-variation. |
x |
a vector of original data |
p |
the value of p. |
partition |
a vector of indexes that indicates the partition that achieves the maximum. |
dname |
a name of data vector (optional). |
TimeLabel |
a time label of x. |
Vygantas Butkus <[email protected]>
IsEqualPvar, AddPvar, PvarBreakTest.
### randomised data:
x = rbridge(1000)

### the main functions:
pv = pvar(x, 2)
print(pv)
summary(pv)
plot(pv)

### The value of p-variation is
pv; Sum_p(x[pv$partition], 2)

### The meaning of supreme partition points:
pv.PP = pvar(x[pv$partition], TimeLabel=time(x)[pv$partition], 2)
### the p-variation is unchanged when only the partition points are kept
all.equal(pv.PP$value, pv$value)
op <- par(mfrow = c(2, 1), mar=c(2, 4, 4, 1))
plot(pv, main='pvar with original data')
plot(pv.PP, main='The same pvar without redundant points')
par(op)
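For intuition about what is being computed, the maximum over partitions can be found with a simple quadratic-time dynamic programme: the best sum for partitions ending at point j is the best sum ending at some earlier point i plus |x[j] - x[i]|^p. The sketch below is in base R, uses the hypothetical name `pvar_dp`, and is not the procedure the package implements.

```r
# Hypothetical O(n^2) dynamic programme for illustration; not the package's pvar().
pvar_dp <- function(x, p) {
  n <- length(x)
  stopifnot(n >= 1, p > 0)
  best <- numeric(n)  # best[j]: largest sum over partitions ending at point j
  prev <- integer(n)  # prev[j]: previous partition point on that best path
  for (j in seq_len(n)[-1]) {
    cand <- best[1:(j - 1)] + abs(x[j] - x[1:(j - 1)])^p
    prev[j] <- which.max(cand)
    best[j] <- cand[prev[j]]
  }
  # Walk back from the last point to recover the partition.
  part <- n
  while (part[1] > 1) part <- c(prev[part[1]], part)
  list(value = best[n], partition = part)
}

x_small <- c(0, 1, 0.5, 2, 1.5)
pv_small <- pvar_dp(x_small, p = 2)
pv_small$value      # 4.25
pv_small$partition  # 1 4 5
```

The returned value equals the sum of p-th powers of the absolute increments taken over the returned partition points only.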
This function performs structural break test that is based on p-variation.
PvarBreakTest(x, TimeLabel = as.vector(time(x)), alpha = 0.05, FullInfo = TRUE)

## S3 method for class 'PvarBreakTest'
plot(x, main1 = "Data", main2 = "Bridge transformation",
  ylab1 = x$dname, ylab2 = "BridgeT(" %.% x$dname %.% ")", sub2 = NULL,
  col.PP = 3, cex.PP = 0.5, col.BP = 2, cex.BP = 1, cex.DP = 0.5, ...)

## S3 method for class 'PvarBreakTest'
summary(object, ...)
x |
a numeric vector of data values or an object of class |
TimeLabel |
numeric, a time index of x. |
alpha |
a small number greater than 0. It indicates the significance level of the test. |
FullInfo |
|
main1 |
the |
main2 |
the |
ylab1 |
the |
ylab2 |
the |
sub2 |
the |
col.PP |
the color of partition points. |
cex.PP |
the cex of partition points. |
col.BP |
the color of break points. |
cex.BP |
the cex of break points. |
cex.DP |
the cex of data points. |
... |
further arguments, passed to |
object |
the object of the class PvarBreakTest. |
Let x be the data that should be tested for structural breaks. Then the p-variation of BridgeT(x) with p=4 is the test statistic.
The quantiles of the H0 distribution are based on a Monte-Carlo simulation of 140 million iterations. The test is reliable when length(x) is between 100 and 10000. The test might work with other lengths too, but this has not been tested well. The test will not compute when length(x)<20.
If FullInfo=TRUE, the function returns an object of the class PvarBreakTest. It is a list that contains:
Stat |
a value of statistics (p-variation of transformed data). |
CriticalValue |
the critical value of the test according to the significance level. |
alpha |
the significance level. |
p.value |
approximate p-value. |
reject |
|
dname |
the name of data vector. |
p |
the power in p-variation calculus. The test is performed only with p = 4. |
x |
a vector of original data. |
y |
a vector of transformed data (BridgeT(x)). |
Timelabel |
time label of x. |
BreakPoints |
the indexes of break points suggestion. |
Partition |
a vector of indexes that indicates the partition of |
Vygantas Butkus <[email protected]>
The test was proposed by A. Rackauskas. The test is based on the results given in the following article:
[1] R. Norvaisa, A. Rackauskas. Convergence in law of partial sum processes in p-variation norm. Lith. Math. J., 2008, Vol. 48, No. 2, 212-227.
The test statistic is the pvar of the data BridgeT(x) (see BridgeT) with p=4. The critical value and the approximate p-value of the test might be found with the functions PvarQuantile and PvarPvalue.
set.seed(1)
MiuDiff <- 0.3
x <- rnorm(250*4, rep(c(0, MiuDiff, 0, MiuDiff), each=250))

plot(x, pch=19, cex=0.5, main='original data, with several shifts of mean')
k <- 50
moveAvg <- filter(x, rep(1/k, k))
lines(time(x), moveAvg, lwd=2, col=2)
legend('topleft', c('sample', 'moving average (k='%.%k%.%')'),
       lty=c(NA,1), lwd=c(NA, 2), col=1:2, pch=c(19,NA), pt.cex=c(0.7,1),
       inset = .03, bg='antiquewhite1')

xtest <- PvarBreakTest(x)
plot(xtest)
The distribution of the p-variation of BridgeT(x) depends on n=length(x). This fact is important for getting appropriate quantiles (or p-values). These functions help to deal with it.
PvarQuantile(n, prob = c(0.9, 0.95, 0.99), DF = PvarQuantileDF)

PvarPvalue(n, stat, DF = PvarQuantileDF)

getMean(n, bMean = MeanCoef)

getSd(n, bSd = SdCoef)

NormalisePvar(x, n, bMean = MeanCoef, bSd = SdCoef)
n |
a positive integer indicating the length of data vector. |
prob |
cumulative probabilities of p-variation distribution. |
DF |
a |
stat |
a vector of p-variation statistics. |
bMean |
a coefficient vector that defines a function of the mean of p-variation. |
bSd |
a coefficient vector that defines a function of the standard deviation of p-variation. |
x |
a numeric vector of data values. |
The distribution of the p-variation is taken from a Monte-Carlo simulation based on 140 million iterations. The data frame PvarQuantileDF saves the results of the Monte-Carlo simulation. Meanwhile, MeanCoef and SdCoef define the coefficients of the functional form (conditional on n) of the mean and sd statistics.
The functional form of the mean and sd statistics is the same. The coefficients are saved in the vectors MeanCoef and SdCoef. Those vectors were estimated with the nls function from the Monte-Carlo simulation.
The functions PvarQuantile and PvarPvalue return the corresponding quantile or probability. The functions getMean and getSd return the corresponding value of the mean and sd statistics. The function NormalisePvar returns normalized values.
The arguments n, stat and prob might be vectors, but they can't be vectors simultaneously (at least one of them must be a number).
PvarBreakTest, PvarQuantileDF, NormalisePvar, getMean, getSd
Generate a trajectory of random processes.
rwiener(frequency = 1000, end = 1)

rbridge(frequency = 1000, end = 1)

rcumbin(frequency = 1000, end = 1)
frequency |
a number specifying the size of trajectory vector. The trajectory will start at point 0 and will have frequency+1 points. |
end |
a number. The end point of the process in the 'time' scale. |
rwiener generates a Wiener process via a partial sums process and rbridge generates a Brownian bridge via rwiener. The original code of rwiener and rbridge was written in the package e1071. In this package these functions were modified to include a leading zero at the beginning of the sample.
rcumbin generates a partial sums process from random variables with values -1, 0, 1.
A time series containing a simulated realization of the random process. The length of the time series is frequency+1, since zero is always included at the beginning of the sample.
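The construction described here, a Wiener path built from partial sums with a leading zero and a bridge derived from it, can be sketched in base R as follows. `wiener_sketch` and `bridge_from_wiener` are hypothetical names, and the scaling of the increments is an assumption chosen so that the variance at time `end` equals `end`; the package's own functions may differ in detail.

```r
# Hypothetical helpers for illustration; not the package's rwiener()/rbridge().
wiener_sketch <- function(frequency = 1000, end = 1) {
  # Assumption: independent N(0, end / frequency) increments.
  steps <- rnorm(frequency, sd = sqrt(end / frequency))
  # Leading zero, so the series has frequency + 1 points on [0, end].
  ts(c(0, cumsum(steps)), start = 0, frequency = frequency / end)
}

bridge_from_wiener <- function(frequency = 1000, end = 1) {
  w <- wiener_sketch(frequency, end)
  # Subtract the line from 0 to the final value so the path ends at 0.
  w - time(w) / end * w[length(w)]
}

set.seed(42)
w <- wiener_sketch(1000)
b <- bridge_from_wiener(1000)
```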
It is the sum of absolute differences raised to the power of p.
Sum_p(x, p, lag = 1)
x |
a numeric vector of data values. |
p |
a number indicating the power in summing function. |
lag |
a number, indicating the lag of differences. |
This is the function that must be maximized by taking a proper subset of x, i.e. if prt is a p-variation partition of the sample x, then Sum_p(x[prt], p) == pvar(x, p)$value.
The number equal to sum((abs(diff(x, lag)))^p)
x = rbridge(1000)
pv = pvar(x, 2); pv
# Sum_p in supreme partition and the value from pvar must match
Sum_p(x[pv$partition], 2)
pv
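As a small worked example of the formula sum((abs(diff(x, lag)))^p), the sketch below evaluates it by hand on three points in base R. `sum_p_sketch` is a hypothetical stand-in that simply restates that expression.

```r
# Hypothetical stand-in that restates the documented expression.
sum_p_sketch <- function(x, p, lag = 1) sum(abs(diff(x, lag))^p)

s1 <- sum_p_sketch(c(0, 3, 1), p = 2)           # |3 - 0|^2 + |1 - 3|^2 = 13
s2 <- sum_p_sketch(c(0, 3, 1), p = 2, lag = 2)  # |1 - 0|^2 = 1
```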