| Title: | N-Way Partial Least Squares Modelling of Multi-Way Data |
|---|---|
| Description: | Creation and selection of N-way Partial Least Squares (NPLS) models. Selection of the optimal number of components can be done using ncrossreg(). NPLS was originally described by Rasmus Bro, see <doi:10.1002/%28SICI%291099-128X%28199601%2910%3A1%3C47%3A%3AAID-CEM400%3E3.0.CO%3B2-C>. |
| Authors: | Geert Roelof van der Ploeg [aut, cre] (ORCID: <https://orcid.org/0009-0007-5204-3386>), Johan Westerhuis [ctb] (ORCID: <https://orcid.org/0000-0002-6747-9779>), Anna Heintz-Buschart [ctb] (ORCID: <https://orcid.org/0000-0002-9780-1933>), Age Smilde [ctb] (ORCID: <https://orcid.org/0000-0002-3052-4644>), University of Amsterdam [cph, fnd] |
| Maintainer: | Geert Roelof van der Ploeg <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.1.0.9000 |
| Built: | 2026-05-28 08:13:58 UTC |
| Source: | https://github.com/grvanderploeg/nplstoolbox |
The Cornejo2025 longitudinal dataset as three-dimensional arrays, with subjects in mode 1, features in mode 2 and time in mode 3.
Cornejo2025
A list object with seven elements:
List object of the tongue longitudinal microbiota data.
List object of the saliva longitudinal microbiota data.
List object of the longitudinal salivary cytokine data.
List object of the longitudinal salivary biochemistry data.
List object of the longitudinal circulatory hormone data.
List object of the longitudinal clinical outcome data.
Matrix with subject metadata.
TBD
The Jakobsen2025 longitudinal dataset as three-dimensional arrays, with subjects in mode 1, features in mode 2 and time in mode 3.
Jakobsen2025
A list object with seven elements:
List object of the longitudinal infant faecal microbiota data.
List object of the longitudinal HM microbiota data.
List object of the longitudinal salivary cytokine data.
TBD
This function runs NPLS with cross-validation. A deterministic K-fold partition
is used: the subjects are split in order into cvFolds groups. For each fold, the
training set consists of the other folds and the test set is the current fold.
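The partition described above can be sketched in base R. This is only an illustration under an assumption: "split in order" is taken to mean contiguous blocks of subjects whose sizes differ by at most one. How ncrossreg() assigns folds internally is not stated here, and the names `n_subjects`, `fold_id` and `splits` are hypothetical.

```r
# Assumed scheme: contiguous, in-order folds (not taken from the package source).
n_subjects <- 25
cvFolds <- 2

# Assign fold labels 1..cvFolds, then sort so each fold is a contiguous block.
fold_id <- sort(rep(seq_len(cvFolds), length.out = n_subjects))

# For each fold: the current fold is the test set, the other folds the training set.
splits <- lapply(seq_len(cvFolds), function(k) {
  list(test = which(fold_id == k), train = which(fold_id != k))
})

splits[[1]]$test   # subjects held out in the first fold
splits[[1]]$train  # subjects used for fitting in the first fold
```

Because no random numbers are involved, repeated calls give the same split, which is what makes the partition deterministic.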
ncrossreg(X, y, maxNumComponents = 5, maxIter = 120, cvFolds = dim(X)[1])
| X | Centered tensor of independent data |
|---|---|
| y | Centered dependent variable |
| maxNumComponents | Maximum number of components to investigate (default 5). |
| maxIter | Maximum number of iterations (default 120). |
| cvFolds | Number of folds to use in the cross-validation. For example, if |
A list with two elements:
- varExp: a tibble with the variance explained (for X and Y) per number of components.
- RMSE: a tibble with the RMSE (computed over the unified CV prediction vector) per number of components.
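To illustrate what an RMSE over a unified CV prediction vector could look like, the sketch below fills one prediction vector fold by fold and computes a single RMSE at the end. It assumes that "unified" means the held-out predictions of all folds are concatenated before the error is computed. A plain `lm()` fit stands in for the NPLS model, and all variable names are hypothetical.

```r
set.seed(123)
n <- 25
x <- rnorm(n)
y <- 2 * x + rnorm(n)

cvFolds <- 5
fold_id <- sort(rep(seq_len(cvFolds), length.out = n))  # assumed in-order folds

# One prediction slot per subject, filled while that subject is held out.
y_pred <- rep(NA_real_, n)
for (k in seq_len(cvFolds)) {
  test <- fold_id == k
  fit <- lm(y ~ x, data = data.frame(x = x[!test], y = y[!test]))  # stand-in model
  y_pred[test] <- predict(fit, newdata = data.frame(x = x[test]))
}

# A single RMSE over all held-out predictions, not an average of per-fold RMSEs.
rmse <- sqrt(mean((y - y_pred)^2))
rmse
```

Computing the error once over the whole vector weights every subject equally, whereas averaging per-fold RMSEs would give subjects in smaller folds more weight.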
set.seed(123)
X <- array(rnorm(25 * 5 * 4), dim = c(25, 5, 4))
y <- rnorm(25) # Random response variable
result <- ncrossreg(X, y, cvFolds = 2, maxNumComponents = 2)
Predict Y for new data by projecting the data onto the latent space defined by an NPLS model.
npred(model, newX)
| model | NPLS model |
|---|---|
| newX | New data organized in an array of (Inew x J x K) with Inew new subjects |
Ypred: vector of the predicted value(s) of Y for the new data
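Since newX is described as an (Inew x J x K) array, it may help to see how base R subsetting affects the shape. By default, selecting a single subject drops the first mode and returns a J x K matrix, while `drop = FALSE` keeps a 1 x J x K array. Whether npred() also accepts the dropped form is not stated here. The toy array and variable names below are hypothetical.

```r
# Toy array: 6 subjects, 3 features, 2 time points.
X <- array(seq_len(6 * 3 * 2), dim = c(6, 3, 2))

one_dropped <- X[1, , ]                # subject mode is dropped: J x K matrix
one_kept    <- X[1, , , drop = FALSE]  # subject mode is kept: 1 x J x K array
several     <- X[1:2, , ]              # two subjects: 2 x J x K array

dim(one_dropped)
dim(one_kept)
dim(several)
```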
Y <- as.numeric(as.factor(Cornejo2025$Tongue$mode1$GenderID))
Ycnt <- Y - mean(Y)
model <- triPLS1(Cornejo2025$Tongue$data, Ycnt, numComponents = 1)
npred(model, Cornejo2025$Tongue$data[1,,])
Tri-PLS1: three-way PLS regressed onto a y vector
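As a rough illustration of the idea behind tri-PLS1, the base R sketch below computes the weights and scores of a first component in the way the method is commonly described: the weight vectors for modes 2 and 3 are the leading singular vectors of the J x K matrix Z with entries z_jk = sum_i y_i x_ijk. This is a sketch under that assumption and not necessarily how triPLS1() is implemented. Deflation, further components and regression coefficients are left out, and the names `Z`, `wJ`, `wK` and `t1` are hypothetical.

```r
set.seed(1)
I <- 20; J <- 5; K <- 4
X <- array(rnorm(I * J * K), dim = c(I, J, K))  # stand-in for a centered tensor
y <- rnorm(I)                                   # stand-in for a centered response

# Unfold X to I x (J*K); column index is j + (k - 1) * J.
Xmat <- matrix(X, nrow = I)

# Z[j, k] = sum_i y[i] * X[i, j, k]
Z <- matrix(crossprod(Xmat, y), nrow = J, ncol = K)

# Leading singular vectors give the mode-2 and mode-3 weights.
sv <- svd(Z, nu = 1, nv = 1)
wJ <- sv$u[, 1]
wK <- sv$v[, 1]

# Scores: t1[i] = sum_jk X[i, j, k] * wJ[j] * wK[k]
t1 <- Xmat %*% as.vector(outer(wJ, wK))
```

With unit-length weights, the inner product of `t1` and `y` equals the largest singular value of `Z`, which is the quantity this choice of weights maximizes.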
triPLS1(X, y, numComponents, tol = 1e-10, maxIter = 100)
| X | Centered tensor of independent data |
|---|---|
| y | Centered dependent variable |
| numComponents | Number of components to fit |
| tol | Relative change in loss for the model to converge (default 1e-10). |
| maxIter | Maximum number of iterations (default 100). |
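Both X and y are expected to be centered. One common convention for three-way data, assumed here because the documentation does not spell it out, is to center across the subject mode: the mean over subjects is subtracted for every feature and time point combination. A minimal base R sketch with hypothetical variable names:

```r
set.seed(42)
X <- array(rnorm(10 * 3 * 2, mean = 5), dim = c(10, 3, 2))
y <- rnorm(10, mean = 2)

# Mean over subjects (mode 1) for each feature/time combination: a J x K matrix.
col_means <- apply(X, c(2, 3), mean)

# Subtract those means from every subject slice.
Xc <- sweep(X, c(2, 3), col_means)

# Center the response.
yc <- y - mean(y)

max(abs(apply(Xc, c(2, 3), mean)))  # numerically zero after centering
```

When predicting for new subjects, the same training means would normally be subtracted from the new data rather than means computed on the new data itself.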
Model
set.seed(123)
X <- array(rnorm(100 * 5 * 4), dim = c(100, 5, 4)) # Random tensor (100 samples, 5 vars, 4 vars)
y <- rnorm(100) # Random response variable
model <- triPLS1(X, y, numComponents = 2)