| Title: | Kernel Density Estimation for Random Symmetric Positive Definite Matrices |
|---|---|
| Description: | Kernel smoothing for Wishart random matrices described in Daayeb, Khardani and Ouimet (2025) <doi:10.48550/arXiv.2506.08816>, Gaussian and log-Gaussian models using least square or likelihood cross validation criteria for optimal bandwidth selection. |
| Authors: | Leo Belzile [aut, cre] (ORCID: <https://orcid.org/0000-0002-9135-014X>), Frederic Ouimet [aut] (ORCID: <https://orcid.org/0000-0001-7933-5265>) |
| Maintainer: | Leo Belzile <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.1 |
| Built: | 2026-05-23 07:58:01 UTC |
| Source: | https://github.com/lbelzile/ksm |
Given a sample of positive definite matrices,
perform numerical maximization of the h-block least square (lscv) or leave-one-out likelihood (lcv) cross-validation criteria using a root search.
bandwidth_optim( x, criterion = c("lscv", "lcv"), kernel = c("Wishart", "smlnorm", "smnorm"), tol = 1e-04, h = 1L, bounds = c(1e-04, 10) )bandwidth_optim( x, criterion = c("lscv", "lcv"), kernel = c("Wishart", "smlnorm", "smnorm"), tol = 1e-04, h = 1L, bounds = c(1e-04, 10) )
x |
sample of symmetric matrix observations from which to build the kernel density kernel |
criterion |
optimization criterion, one of |
kernel |
string, one of |
tol |
double, tolerance of optimization (root search) |
h |
lag step for consideration of observations, for the case |
bounds |
vector of length 2 containing the bounds for the search |
double, the optimal bandwidth up to tol
x <- simu_rdens(n = 100, model = 3, d = 3) bandwidth_optim(x = x, criterion = "lscv", kernel = "Wishart", h = 2L)x <- simu_rdens(n = 100, model = 3, d = 3) bandwidth_optim(x = x, criterion = "lscv", kernel = "Wishart", h = 2L)
Density of inverse Wishart random matrix
dinvWishart(x, df, S, log = FALSE)dinvWishart(x, df, S, log = FALSE)
x |
array of dimension |
df |
degrees of freedom |
S |
symmetric positive definite matrix of dimension |
log |
logical; if |
a vector of length n containing the log-density of the inverse Wishart.
Given a random matrix x, compute the density
for arguments shape1 and shape2
dmbeta2(x, shape1, shape2, log = TRUE)dmbeta2(x, shape1, shape2, log = TRUE)
x |
cube of dimension |
shape1 |
positive shape parameter, strictly larger than |
shape2 |
positive shape parameter, strictly larger than |
log |
[logical] if |
a vector of length n
Density of the lognormal matrix-variate density, defined through the matrix logarithm, with the Jacobian resulting from the transformation
dsmlnorm(x, b, M, log = TRUE)dsmlnorm(x, b, M, log = TRUE)
x |
[cube] array of dimension |
b |
[numeric] scale parameter, strictly positive |
M |
[matrix] location matrix, positive definite |
log |
[logical] if |
a vector of length n
Symmetric matrix-variate normal density
dsmnorm(x, b, M, log = TRUE)dsmnorm(x, b, M, log = TRUE)
x |
[cube] array of dimension |
b |
[numeric] scale parameter, strictly positive |
M |
[matrix] location matrix, positive definite |
log |
[logical] if |
a vector of length n
Density of Wishart random matrix
dWishart(x, df, S, log = FALSE)dWishart(x, df, S, log = FALSE)
x |
array of dimension |
df |
degrees of freedom |
S |
symmetric positive definite matrix of dimension |
log |
logical; if |
a vector of length n containing the log-density of the Wishart.
Given a function f defined over the space of symmetric positive definite matrices, compute an integral via numerical integration using the routine cubintegrate.
integrate_spd( f, dim, tol = 0.001, lb = 1e-08, ub = Inf, neval = 1000000L, method = c("suave", "hcubature"), ... )integrate_spd( f, dim, tol = 0.001, lb = 1e-08, ub = Inf, neval = 1000000L, method = c("suave", "hcubature"), ... )
f |
function to evaluate that takes as arguments array of size |
dim |
dimension of integral, only two or three dimensions are supported |
tol |
double for tolerance of numerical integral |
lb |
lower bound for integration range of eigenvalues |
ub |
upper bound for integration range of eigenvalues |
neval |
maximum number of evaluations |
method |
string indicating the method from |
... |
additional arguments for the function |
list returned by the integration routine. See the documentation of cubintegrate for more details.
integrate_spd( dim = 2L, neval = 1e4L, f = function(x, S){ dWishart(x, df = 10, S = S, log = FALSE)}, S = diag(2))integrate_spd( dim = 2L, neval = 1e4L, f = function(x, S){ dWishart(x, df = 10, S = S, log = FALSE)}, S = diag(2))
Given a sample of m points xs from an original sample
and a set of n new sample matrices x at which to evaluate the symmetric matrix normal log kernel, return the density with bandwidth parameter b.
kdens_smlnorm(x, xs, b, log = TRUE)kdens_smlnorm(x, xs, b, log = TRUE)
x |
cube of size |
xs |
cube of size |
b |
positive double giving the bandwidth parameter |
log |
bool; if |
a vector of length n containing the (log) density of the sample x
Given a sample of m points xs from an original sample
and a set of n new sample matrices x at which to evaluate the symmetric matrix normal kernel, return the density with bandwidth parameter b. Note that this kernel suffers from boundary spillover.
kdens_smnorm(x, xs, b, log = TRUE)kdens_smnorm(x, xs, b, log = TRUE)
x |
cube of size |
xs |
cube of size |
b |
positive double giving the bandwidth parameter |
log |
bool; if |
a vector of length n containing the (log) density of the sample x
Given a sample of m points xs from an original sample
and a set of n new sample symmetric positive definite matrices x at which to evaluate the kernel, return the density with bandwidth parameter b.
kdens_symmat(x, xs, kernel = "Wishart", b = 1, log = TRUE)kdens_symmat(x, xs, kernel = "Wishart", b = 1, log = TRUE)
x |
cube of size |
xs |
cube of size |
kernel |
string, one of |
b |
positive double giving the bandwidth parameter |
log |
bool; if |
a vector of length n containing the (log) density of the sample x
Given a sample of m points xs from an original sample
and a set of n new sample matrices x at which to evaluate the Wishart kernel, return the density with bandwidth parameter b.
kdens_Wishart(x, xs, b, log = TRUE)kdens_Wishart(x, xs, b, log = TRUE)
x |
cube of size |
xs |
cube of size |
b |
positive double giving the bandwidth parameter |
log |
bool; if |
a vector of length n containing the (log) density of the sample x
Given a cube of sample observations (consisting of random symmetric positive definite matrices), and a vector of candidate bandwidth parameters b,
compute the leave-one-out likelihood cross-validation criterion and
return the bandwidth among the choices that maximizes the criterion.
lcv_kdens_symmat(x, b, h = 1L, kernel = "Wishart")lcv_kdens_symmat(x, b, h = 1L, kernel = "Wishart")
x |
array of dimension |
b |
vector of candidate bandwidth, strictly positive |
h |
integer for the lag vector for determining which observation to exclude, any data point in a radius of |
kernel |
string indicating the kernel, one of |
a list with arguments
lcv vector of likelihood cross validation criterion
b vector of candidate bandwidth
h lag for leave-one-out
bandwidth optimal bandwidth among candidates
kernel string indicating the choice of kernel function
Given a cube x and a bandwidth b, compute
the leave-one-out cross validation criterion by taking out a slice
and evaluating the kernel at the holdout value, excluding points that are at distance at least h-1 apart.
lcv_kern_smlnorm(x, b, h = 1L)lcv_kern_smlnorm(x, b, h = 1L)
x |
[cube] array of dimension |
b |
[numeric] scale parameter, strictly positive |
h |
integer lag for excluding observations |
the value of the log objective function
Given a cube x and a bandwidth b, compute
the leave-one-out cross validation criterion by taking out a slice
and evaluating the kernel at the holdout values, excluding points that are at distance at least h-1 apart.
lcv_kern_smnorm(x, b, h = 1L)lcv_kern_smnorm(x, b, h = 1L)
x |
[cube] array of dimension |
b |
[numeric] scale parameter, strictly positive |
h |
integer lag for excluding observations |
the value of the log objective function
Given a cube x and a bandwidth b, compute
the leave-one-out cross validation criterion by taking out a slice
and evaluating the kernel at the holdout value, excluding points that are at distance at least h-1 apart.
lcv_kern_Wishart(x, b, h = 1)lcv_kern_Wishart(x, b, h = 1)
x |
[cube] array of dimension |
b |
[numeric] scale parameter, strictly positive |
h |
integer lag for excluding observations |
the value of the log objective function
Given a cube of sample observations (consisting of random symmetric positive definite matrices), and a vector of candidate bandwidth parameters b,
compute the least square likelihood cross-validation criterion and
return the bandwidth among the choices that minimizes the criterion.
lscv_kdens_symmat(x, b, h = 1L, kernel = "Wishart")lscv_kdens_symmat(x, b, h = 1L, kernel = "Wishart")
x |
array of dimension |
b |
vector of candidate bandwidth, strictly positive |
h |
integer for the lag vector for determining which observation to exclude, any data point in a radius of |
kernel |
string indicating the kernel, one of |
a list with arguments
lscv vector of likelihood cross validation criterion
b vector of candidate bandwidth
h lag for leave-one-out
bandwidth optimal bandwidth among candidates
kernel string indicating the choice of kernel function
Finite sample h-block leave-one-out approximation to the least
square criterion, omitting constant term. Only pairs that are apart are considered.
lscv_kern_smlnorm(x, b, h = 1L)lscv_kern_smlnorm(x, b, h = 1L)
x |
[cube] array of dimension |
b |
[numeric] scale parameter, strictly positive |
h |
[int] integer indicating the separation lag |
a double containing the log of the least square cross validation criterion
Finite sample h-block leave-one-out approximation to the least square criterion, omitting constant term.
lscv_kern_Wishart(x, b, h = 1L)lscv_kern_Wishart(x, b, h = 1L)
x |
[cube] array of dimension |
b |
[numeric] scale parameter, strictly positive |
h |
separation vector; only pairs that are |
a double containing the a vector of length two containing the log of the summands
Given a vector of points x and an order p, compute the multivariate gamma function. The function is defined as
mgamma(x, p, log = FALSE)mgamma(x, p, log = FALSE)
x |
[vector] of points at which to evaluate the function |
p |
[int] dimension of the multivariate gamma function, strictly positive. |
log |
[logical] if |
a matrix with one column of the same length as x
Intraday realized covariances of the returns between the Amazon stock (rvarAMZN) and the SPDR S&P 500 ETF (rvarSPY) using five minutes data, for the period of September 13th, 2023 to September 12, 2024.
realvarrealvar
A 2 by 2 by 250 array
Anne MacKay
data(realvar, package = "ksm") bopt <- bandwidth_optim( x = realvar, criterion = "lscv", kernel = "Wishart", h = 4L )data(realvar, package = "ksm") bopt <- bandwidth_optim( x = realvar, criterion = "lscv", kernel = "Wishart", h = 4L )
Given two matrices M and S, solve Riccati equation by iterative updating to find the solution , where the latter satisfies
until convergence (i.e., when the Frobenius norm is less than tol, or the maximum number of iterations maxiter is reached.
Riccati(M, S, tol = 1e-08, maxiter = 10000L)Riccati(M, S, tol = 1e-08, maxiter = 10000L)
M |
matrix |
S |
matrix |
tol |
double for tolerance |
maxiter |
integer, the maximum number of iterations |
a list containing
solution matrix solution to Riccati's equation
error numerical error
niter number of iteration
convergence bool indicating convergence (TRUE) if niter < maxiter
Random matrix generation from the inverse Wishart distribution
rinvWishart(n, df, S)rinvWishart(n, df, S)
n |
[integer] sample size |
df |
[double] degrees of freedom, positive |
S |
[matrix] a |
an array of dimension d by d by n containing the samples
This function only supports the case of diagonal matrices
rmbeta2(n, d, shape1, shape2)rmbeta2(n, d, shape1, shape2)
n |
sample size |
d |
dimension of the matrix |
shape1 |
positive shape parameter, strictly larger than |
shape2 |
positive shape parameter, strictly larger than |
a cube of dimension d by d by n
Sampler derived using the eigendecomposition of the covariance
matrix vcov.
rmnorm(n, mean, vcov)rmnorm(n, mean, vcov)
n |
sample size |
mean |
mean vector of length |
vcov |
a square positive definite covariance matrix, of the same dimension as |
an n by d matrix of samples
rmnorm(n = 10, mean = c(0, 2), vcov = diag(2))rmnorm(n = 10, mean = c(0, 2), vcov = diag(2))
Given a matrix of coefficients M and a covariance matrix Sigma,
simulate n random matrices from a first-order autoregressive Wishart process
by simulating from cross-products of vector autoregressions
rWAR(n, M, Sigma, K = NULL, order = 1L, burnin = 25L)rWAR(n, M, Sigma, K = NULL, order = 1L, burnin = 25L)
n |
sample size |
M |
matrix of autoregressive coefficients |
Sigma |
covariance matrix |
K |
integer, degrees of freedom |
order |
order of autoregressive process, only |
burnin |
number of iterations discarded |
an array of size d by d by n containing the samples
C. Gourieroux, J. Jasiak, and R. Sufana (2009). The Wishart Autoregressive process of multivariate stochastic volatility, Journal of Econometrics, 150(2), 167-181, <doi:10.1016/j.jeconom.2008.12.016>.
M <- matrix(c(0.3, -0.3, -0.3, 0.3), nrow = 2) Sigma <- matrix(c(1, 0.5, 0.5, 1), nrow = 2) rWAR(n = 10, M = M, Sigma = Sigma, K = 5)M <- matrix(c(0.3, -0.3, -0.3, 0.3), nrow = 2) Sigma <- matrix(c(1, 0.5, 0.5, 1), nrow = 2) rWAR(n = 10, M = M, Sigma = Sigma, K = 5)
Random matrix generation from Wishart distribution
rWishart(n, df, S)rWishart(n, df, S)
n |
[integer] sample size |
df |
[double] degrees of freedom, positive |
S |
[matrix] a |
an array of dimension d by d by n containing the samples
Given an input matrix, symmetrize by taking average of lower and upper triangular components as .
symmetrize(A)symmetrize(A)
A |
square matrix |
symmetrized version of A