| Title: | Methods for Pre-Treatment, Data Mining and Correlation Analyses of Metabolomics Data |
|---|---|
| Description: | A tool kit for pre-treatment, modelling, feature selection and correlation analyses of metabolomics data. |
| Authors: | Jasen Finch [aut, cre] |
| Maintainer: | Jasen Finch <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.15.4 |
| Built: | 2026-05-17 07:36:32 UTC |
| Source: | https://github.com/aberHRML/metabolyseR |
Aggregation of sample features based on a grouping variable.
```r
aggregateMean(d, cls = "class")

## S4 method for signature 'AnalysisData'
aggregateMean(d, cls = "class")

aggregateMedian(d, cls = "class")

## S4 method for signature 'AnalysisData'
aggregateMedian(d, cls = "class")

aggregateSum(d, cls = "class")

## S4 method for signature 'AnalysisData'
aggregateSum(d, cls = "class")
```

| Argument | Description |
|---|---|
| d | S4 object of class |
| cls | info columns across which to aggregate the data |
Sample aggregation allows the electronic pooling of sample features based on a grouping variable. This is useful in situations such as the presence of technical replicates that can be aggregated to reduce the effects of pseudo replication.
An S4 object of class AnalysisData containing the aggregated data.
aggregateMean: Aggregate sample features to the group mean.
aggregateMedian: Aggregate sample features to the group median.
aggregateSum: Aggregate sample features to the group total.
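
The idea of electronic pooling can be sketched in base R, independently of the package. The data frame, its `sample` grouping column and the feature names `f1` and `f2` below are hypothetical, and this is only an illustration of group-wise aggregation, not the package's internal implementation.

```r
# Toy data: two technical replicates for each of three hypothetical samples
intensities <- data.frame(
  sample = c("s1", "s1", "s2", "s2", "s3", "s3"),
  f1 = c(10, 12, 20, 22, 30, 34),
  f2 = c(1, 3, 5, 7, 9, 11)
)

# Pool the replicates by taking the mean of every feature within each group
pooled_mean <- aggregate(cbind(f1, f2) ~ sample, data = intensities, FUN = mean)

# The same grouping with a different summary function gives the group total
pooled_sum <- aggregate(cbind(f1, f2) ~ sample, data = intensities, FUN = sum)

print(pooled_mean)
print(pooled_sum)
```

Each group collapses to a single row, so the six replicate rows become three pooled samples.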
```r
## Each of the following examples shows the application of the aggregation method and then
## a Principal Component Analysis is plotted to show its effect on the data structure.

## Initial example data preparation
library(metaboData)

d <- analysisData(abr1$neg[,200:300],abr1$fact) %>%
  occupancyMaximum(occupancy = 2/3)

d %>%
  plotPCA(cls = 'day')

## Mean aggregation
d %>%
  aggregateMean(cls = c('day','class')) %>%
  plotPCA(cls = 'day',ellipses = FALSE)

## Median aggregation
d %>%
  aggregateMedian(cls = c('day','class')) %>%
  plotPCA(cls = 'day',ellipses = FALSE)

## Sum aggregation
d %>%
  aggregateSum(cls = c('day','class')) %>%
  plotPCA(cls = 'day',ellipses = FALSE)
```
An S4 class to store analysis results.
log: list containing analysis dates and time
parameters: class AnalysisParameters containing the analysis parameters
raw: list containing info and raw data
pre-treated: list containing preTreated info and raw data
modelling: list containing modelling results
correlations: tibble containing weighted edgelist of correlations
Create an AnalysisData S4 object.
```r
analysisData(data, info)
```

| Argument | Description |
|---|---|
| data | table containing sample metabolomic data |
| info | table containing sample meta information |
An S4 object of class AnalysisData.
```r
library(metaboData)

d <- analysisData(data = abr1$neg,info = abr1$fact)

print(d)
```
An S4 class for metabolomic data and sample meta information.
data: sample metabolomic data
info: sample meta information
Return the analysis elements available in metabolyseR.
```r
analysisElements()
```

A character vector of analysis elements.

```r
analysisElements()
```
Initiate an AnalysisParameters object with the default analysis parameters for each of the analysis elements.

```r
analysisParameters(elements = analysisElements())
```

| Argument | Description |
|---|---|
| elements | character vector containing elements for analysis. |
An S4 object of class AnalysisParameters containing the default analysis parameters.
```r
p <- analysisParameters()

print(p)
```
An S4 class to store analysis parameters.
pre-treatment: list containing parameters for data pre-treatment
modelling: list containing parameters for modelling
correlations: list containing parameters for correlations
One-way analysis of variance (ANOVA).
```r
anova(
  x,
  cls = "class",
  pAdjust = "bonferroni",
  comparisons = list(),
  returnModels = FALSE
)

## S4 method for signature 'AnalysisData'
anova(
  x,
  cls = "class",
  pAdjust = "bonferroni",
  comparisons = list(),
  returnModels = FALSE
)
```

| Argument | Description |
|---|---|
| x | S4 object of class |
| cls | a vector of sample info column names to analyse |
| pAdjust | p value adjustment method |
| comparisons | list of comparisons to perform |
| returnModels | should models be returned |
```r
library(metaboData)

d <- analysisData(abr1$neg[,200:300],abr1$fact)

## Perform ANOVA
anova_analysis <- anova(d,cls = 'day')

## Extract significant features
explanatoryFeatures(anova_analysis)
```
Methods for accessing modelling results.
```r
binaryComparisons(x, cls = "class")

## S4 method for signature 'AnalysisData'
binaryComparisons(x, cls = "class")

mtry(x, cls = "class")

## S4 method for signature 'AnalysisData'
mtry(x, cls = "class")

type(x)

## S4 method for signature 'RandomForest'
type(x)

## S4 method for signature 'Univariate'
type(x)

response(x)

## S4 method for signature 'RandomForest'
response(x)

## S4 method for signature 'Univariate'
response(x)

metrics(x)

## S4 method for signature 'RandomForest'
metrics(x)

## S4 method for signature 'list'
metrics(x)

## S4 method for signature 'Analysis'
metrics(x)

predictions(x)

## S4 method for signature 'RandomForest'
predictions(x)

## S4 method for signature 'list'
predictions(x)

## S4 method for signature 'Analysis'
predictions(x)

importanceMetrics(x)

## S4 method for signature 'RandomForest'
importanceMetrics(x)

importance(x)

## S4 method for signature 'RandomForest'
importance(x)

## S4 method for signature 'Univariate'
importance(x)

## S4 method for signature 'list'
importance(x)

## S4 method for signature 'Analysis'
importance(x)

proximity(x, idx = NULL)

## S4 method for signature 'RandomForest'
proximity(x, idx = NULL)

## S4 method for signature 'list'
proximity(x, idx = NULL)

## S4 method for signature 'Analysis'
proximity(x, idx = NULL)

explanatoryFeatures(x, ...)

## S4 method for signature 'Univariate'
explanatoryFeatures(
  x,
  threshold = 0.05,
  value = c("adjusted.p.value", "p.value")
)

## S4 method for signature 'RandomForest'
explanatoryFeatures(
  x,
  metric = "false_positive_rate",
  value = c("value", "p-value", "adjusted_p-value"),
  threshold = 0.05
)

## S4 method for signature 'list'
explanatoryFeatures(x, ...)

## S4 method for signature 'Analysis'
explanatoryFeatures(x, ...)
```

| Argument | Description |
|---|---|
| x | S4 object of class |
| cls | sample information column to use |
| idx | sample information column to use for sample names. If |
| ... | arguments to pass to the method for the specific class |
| threshold | threshold below which explanatory features are extracted |
| value | the importance value to threshold. See the usage section for possible values for each class. |
| metric | importance metric for which to retrieve explanatory features |
binaryComparisons: Return a vector of all possible binary comparisons for a given sample information column.
mtry: Return the default mtry random forest parameter value for a given sample information column.
type: Return the type of random forest analysis.
response: Return the response variable name used for a random forest analysis.
metrics: Retrieve the model performance metrics for a random forest analysis.
predictions: Retrieve the out of bag model response predictions for a random forest analysis.
importanceMetrics: Retrieve the available feature importance metrics for a random forest analysis.
importance: Retrieve feature importance results.
proximity: Retrieve the random forest sample proximities.
explanatoryFeatures: Retrieve explanatory features.
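
As a rough illustration of what "all possible binary comparisons" means, the base R sketch below enumerates every unordered pair of class labels. The class labels are hypothetical and the `~` separator used to join each pair is an assumption made for this example; the exact format returned by `binaryComparisons` may differ.

```r
# Hypothetical class labels found in a sample information column
classes <- c("H", "1", "2", "3")

# Every unordered pair of distinct classes is one binary comparison
class_pairs <- combn(classes, 2)

# Collapse each pair into a single label (the '~' separator is an assumption)
comparisons <- apply(class_pairs, 2, paste, collapse = "~")

print(comparisons)
```

For `k` classes this gives `choose(k, 2)` comparisons, so the number of pairwise models grows quickly as classes are added.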
```r
library(metaboData)

d <- analysisData(abr1$neg[,200:300],abr1$fact)

## Return possible binary comparisons for the `day` response column
binaryComparisons(d,cls = 'day')

## Return the default random forest `mtry` parameter for the `day` response column
mtry(d,cls = 'day')

## Perform random forest analysis
rf_analysis <- randomForest(d,cls = 'day')

## Return the type of random forest
type(rf_analysis)

## Return the response variable name used
response(rf_analysis)

## Retrieve the model performance metrics
metrics(rf_analysis)

## Retrieve the out of bag model response predictions
predictions(rf_analysis)

## Show the available feature importance metrics
importanceMetrics(rf_analysis)

## Retrieve the feature importance results
importance(rf_analysis)

## Retrieve the sample proximities
proximity(rf_analysis)

## Retrieve the explanatory features
explanatoryFeatures(rf_analysis,metric = 'false_positive_rate',threshold = 0.05)
```
Bind the rows of AnalysisData objects contained within a list.

```r
bindRows(d)

## S4 method for signature 'list'
bindRows(d)
```

| Argument | Description |
|---|---|
| d | list object containing S4 objects of class AnalysisData to be bound |
An S4 object of class AnalysisData containing the bound data sets.
```r
library(metaboData)

d <- list(
  negative = analysisData(abr1$neg,abr1$fact),
  positive = analysisData(abr1$pos,abr1$fact)
)

bindRows(d)
```
Change analysis parameters.
```r
changeParameter(x, parameterName, elements = analysisElements()) <- value

## S4 replacement method for signature 'AnalysisParameters'
changeParameter(x, parameterName, elements = analysisElements()) <- value
```

| Argument | Description |
|---|---|
| x | S4 object of class |
| parameterName | name of the parameter to change |
| elements | character vector of analysis elements to target parameter change. Can be any returned by |
| value | New value of the parameter |
For the parameter name selected, all parameters with that name will be altered.
An S4 object of class AnalysisParameters.
```r
p <- analysisParameters('pre-treatment')

changeParameter(p,'cls') <- 'day'

print(p)
```
Query or alter sample meta information in AnalysisData or Analysis class objects.
Replace a given sample info column from an Analysis or AnalysisData object.
```r
clsAdd(d, cls, value, ...)

## S4 method for signature 'AnalysisData'
clsAdd(d, cls, value)

## S4 method for signature 'Analysis'
clsAdd(d, cls, value, type = c("pre-treated", "raw"))

clsArrange(d, cls = "class", descending = FALSE, ...)

## S4 method for signature 'AnalysisData'
clsArrange(d, cls = "class", descending = FALSE)

## S4 method for signature 'Analysis'
clsArrange(
  d,
  cls = "class",
  descending = FALSE,
  type = c("pre-treated", "raw")
)

clsAvailable(d, ...)

## S4 method for signature 'AnalysisData'
clsAvailable(d)

## S4 method for signature 'Analysis'
clsAvailable(d, type = c("pre-treated", "raw"))

clsExtract(d, cls = "class", ...)

## S4 method for signature 'AnalysisData'
clsExtract(d, cls = "class")

## S4 method for signature 'Analysis'
clsExtract(d, cls = "class", type = c("pre-treated", "raw"))

clsRemove(d, cls, ...)

## S4 method for signature 'AnalysisData'
clsRemove(d, cls)

## S4 method for signature 'Analysis'
clsRemove(d, cls, type = c("pre-treated", "raw"))

clsRename(d, cls, newName, ...)

## S4 method for signature 'AnalysisData'
clsRename(d, cls, newName)

## S4 method for signature 'Analysis'
clsRename(d, cls, newName, type = c("pre-treated", "raw"))

clsReplace(d, value, cls = "class", ...)

## S4 method for signature 'AnalysisData'
clsReplace(d, value, cls = "class")

## S4 method for signature 'Analysis'
clsReplace(d, value, cls = "class", type = c("pre-treated", "raw"))
```

| Argument | Description |
|---|---|
| d | S4 object of class Analysis or AnalysisData |
| cls | sample info column to extract |
| value | vector of new sample information for replacement |
| ... | arguments to pass to specific method |
| type | |
| descending | TRUE/FALSE, arrange samples in descending order |
| newName | new column name |
clsAdd: Add a sample information column.
clsArrange: Arrange sample row order by a specified sample information column.
clsAvailable: Retrieve the names of the available sample information columns.
clsExtract: Extract the values of a specified sample information column.
clsRemove: Remove a sample information column.
clsRename: Rename a sample information column.
clsReplace: Replace a sample information column.
```r
library(metaboData)

d <- analysisData(abr1$neg,abr1$fact)

## Add a sample information column named 'new'
d <- clsAdd(d,'new',1:nSamples(d))

print(d)

## Arrange the row orders by the 'day' column
d <- clsArrange(d,'day')

clsExtract(d,'day')

## Retrieve the available sample information column names
clsAvailable(d)

## Extract the values of the 'day' column
clsExtract(d,'day')

## Remove the 'class' column
d <- clsRemove(d,'class')

clsAvailable(d)

## Rename the 'day' column to 'treatment'
d <- clsRename(d,'day','treatment')

clsAvailable(d)

## Replace the values of the 'treatment' column
d <- clsReplace(d,rep(1,nSamples(d)),'treatment')

clsExtract(d,'treatment')
```
Correction of batch/block differences.
```r
correctionCenter(d, block = "block", type = c("mean", "median"))

## S4 method for signature 'AnalysisData'
correctionCenter(d, block = "block", type = c("mean", "median"))
```

| Argument | Description |
|---|---|
| d | S4 object of class |
| block | sample information column name to use containing sample block groupings |
| type | type of average to use |
There can sometimes be artificial batch related variability introduced into metabolomics analyses as a result of analytical instrumentation or sample preparation. With an appropriate randomised block design of sample injection order, batch related variability can be corrected using an average centring correction method of the individual features.
An S4 object of class AnalysisData containing the corrected data.
correctionCenter: Correction using group average centring.
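
Average centring can be illustrated with a small base R sketch. It shows one common form of the technique, in which the block average of a feature is subtracted and the overall average is added back; whether `correctionCenter` restores the overall average in exactly this way is an assumption. The data frame and its `block` and `f1` columns are hypothetical.

```r
# Toy data: one feature measured in two hypothetical blocks with an offset
x <- data.frame(
  block = c("a", "a", "a", "b", "b", "b"),
  f1 = c(10, 12, 14, 20, 22, 24)
)

# Average of the feature within each block, repeated for every sample
block_mean <- ave(x$f1, x$block, FUN = mean)

# Remove the block average, then restore the overall average of the feature
corrected <- x$f1 - block_mean + mean(x$f1)

# After correction both blocks share the same average
print(tapply(corrected, x$block, mean))
```

Using `FUN = median` and `median(x$f1)` instead would give the median variant of the same idea.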
```r
## Initial example data preparation
library(metaboData)

d <- analysisData(abr1$neg[,200:300],abr1$fact) %>%
  occupancyMaximum(occupancy = 2/3)

## Group total ion count distributions prior to correction
d %>%
  plotTIC(by = 'day',colour = 'day')

## Group total ion count distributions after group median correction
d %>%
  correctionCenter(block = 'day',type = 'median') %>%
  plotTIC(by = 'day',colour = 'day')
```
Feature correlation analysis.
```r
correlations(d, ...)

## S4 method for signature 'AnalysisData'
correlations(
  d,
  method = "pearson",
  pAdjustMethod = "bonferroni",
  corPvalue = 0.05,
  minCoef = 0,
  maxCor = Inf
)

## S4 method for signature 'Analysis'
correlations(d)
```

| Argument | Description |
|---|---|
| d | S4 object of class |
| ... | arguments to pass to specific method |
| method | correlation method. One of |
| pAdjustMethod | p-value adjustment method. See |
| corPvalue | p-value cut-off threshold for significance |
| minCoef | minimum absolute correlation coefficient threshold |
| maxCor | maximum number of returned correlations |
Correlation analyses can be used to identify associated features within data sets. This can be useful for identifying clusters of related features that can be used to annotate metabolites within data sets. All features are compared and the returned table of correlations is thresholded to the specified p-value cut-off.
A tibble containing results of significantly correlated features.
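
The general procedure described above (compare all feature pairs, adjust the p-values, then threshold) can be sketched in base R. This is an illustration under assumptions, not the package's implementation: the simulated features `f1` to `f3`, the result column names and the use of `cor.test` are all choices made for this example.

```r
set.seed(1)
n <- 30
base <- rnorm(n)

# Simulated features: f2 closely tracks f1, f3 is independent noise
x <- data.frame(
  f1 = base,
  f2 = base + rnorm(n, sd = 0.1),
  f3 = rnorm(n)
)

# Every unordered pair of features
feature_pairs <- t(combn(names(x), 2))
res <- data.frame(
  feature1 = feature_pairs[, 1],
  feature2 = feature_pairs[, 2],
  coefficient = NA_real_,
  p = NA_real_,
  stringsAsFactors = FALSE
)

# Pearson correlation coefficient and p-value for each pair
for (i in seq_len(nrow(res))) {
  ct <- cor.test(x[[res$feature1[i]]], x[[res$feature2[i]]], method = "pearson")
  res$coefficient[i] <- unname(ct$estimate)
  res$p[i] <- ct$p.value
}

# Adjust for multiple testing, then keep pairs passing both thresholds
res$adjusted_p <- p.adjust(res$p, method = "bonferroni")
significant <- res[res$adjusted_p < 0.05 & abs(res$coefficient) > 0, ]

print(significant)
```

The thresholds mirror the `corPvalue = 0.05` and `minCoef = 0` defaults shown in the usage above.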
```r
library(metaboData)

d <- analysisData(abr1$neg[,200:300],abr1$fact)

correlations(d)
```
Retrieve the default parameters for correlation analysis.
```r
correlationsParameters()
```

```r
## Retrieve the default correlation parameters
p <- correlationsParameters()

## Assign the correlation parameters to analysis parameters
cp <- analysisParameters('correlations')
parameters(cp,'correlations') <- p

print(cp)
```
Accessor methods for the AnalysisData and Analysis S4 classes.

```r
dat(x, ...)

## S4 method for signature 'AnalysisData'
dat(x)

## S4 method for signature 'Analysis'
dat(x, type = c("pre-treated", "raw"))

dat(x, ...) <- value

## S4 replacement method for signature 'AnalysisData'
dat(x) <- value

## S4 replacement method for signature 'Analysis'
dat(x, type = c("pre-treated", "raw")) <- value

sinfo(x, ...)

## S4 method for signature 'AnalysisData'
sinfo(x)

## S4 method for signature 'Analysis'
sinfo(x, type = c("pre-treated", "raw"), value)

sinfo(x, ...) <- value

## S4 replacement method for signature 'AnalysisData'
sinfo(x) <- value

## S4 replacement method for signature 'Analysis'
sinfo(x, type = c("pre-treated", "raw")) <- value

raw(x)

## S4 method for signature 'Analysis'
raw(x)

raw(x) <- value

## S4 replacement method for signature 'Analysis'
raw(x) <- value

preTreated(x)

## S4 method for signature 'Analysis'
preTreated(x)

preTreated(x) <- value

## S4 replacement method for signature 'Analysis'
preTreated(x) <- value

features(x, ...)

## S4 method for signature 'AnalysisData'
features(x)

## S4 method for signature 'Analysis'
features(x, type = c("pre-treated", "raw"))

nSamples(x, ...)

## S4 method for signature 'AnalysisData'
nSamples(x)

## S4 method for signature 'Analysis'
nSamples(x, type = c("pre-treated", "raw"))

nFeatures(x, ...)

## S4 method for signature 'AnalysisData'
nFeatures(x)

## S4 method for signature 'Analysis'
nFeatures(x, type = c("pre-treated", "raw"))

analysisResults(x, element)

## S4 method for signature 'Analysis'
analysisResults(x, element)
```

| Argument | Description |
|---|---|
| x | S4 object of class |
| ... | arguments to pass to the appropriate method |
| type | get or set |
| value | value to set |
| element | analysis element results to return |
dat: Return a metabolomic data table.
dat<-: Set a metabolomic data table.
sinfo: Return a sample information data table.
sinfo<-: Set a sample information data table.
raw: Return the AnalysisData object containing unprocessed metabolomic data from an Analysis object.
raw<-: Set an AnalysisData object to the raw slot of an Analysis class object.
preTreated: Return the AnalysisData object containing pre-treated metabolomic data from an Analysis object.
preTreated<-: Set an AnalysisData object to the pre-treated slot of an Analysis class object.
features: Return the feature names.
nSamples: Return the number of samples.
nFeatures: Return the number of features.
analysisResults: Return results from an Analysis object of an analysis element.
```r
library(metaboData)

d <- analysisData(abr1$neg[,200:300],abr1$fact)

## Return the metabolomic data
dat(d)

## Set the metabolomic data
dat(d) <- abr1$neg[,300:400]

## Return the sample information
sinfo(d)

## Set the sample information
sinfo(d) <- abr1$fact

## Return the feature names
features(d)

## Return the number of samples
nSamples(d)

## Return the number of features
nFeatures(d)
```
Impute missing values using random forest imputation.
```r
imputeAll(d, occupancy = 2/3, parallel = "variables", seed = 1234)

## S4 method for signature 'AnalysisData'
imputeAll(d, occupancy = 2/3, parallel = "variables", seed = 1234)

imputeClass(d, cls = "class", occupancy = 2/3, seed = 1234)

## S4 method for signature 'AnalysisData'
imputeClass(d, cls = "class", occupancy = 2/3, seed = 1234)
```

| Argument | Description |
|---|---|
| d | S4 object of class |
| occupancy | occupancy threshold above which missing values of a feature will be imputed |
| parallel | parallel type to use. See |
| seed | random number seed |
| cls | info column to use for class labels |
Missing values can have an important influence on downstream analyses with zero values heavily influencing the outcomes of parametric tests.
Where and how they are imputed are important considerations and are closely related to variable occupancy.
The methods provided here allow both these aspects to be taken into account and utilise random forest imputation using the missForest package.
An S4 object of class AnalysisData containing the data after imputation.
imputeAll: Impute missing values across all sample features.
imputeClass: Impute missing values class-wise.
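
To make the link between occupancy and imputation concrete, the base R sketch below computes class-wise occupancy for two features and flags where imputation would be considered. It assumes that occupancy means the proportion of non-zero values within a class and that zeros stand for missing values; the toy data and the 0.5 threshold are chosen for illustration only and are not the package defaults.

```r
# Toy data: zeros stand in for missing values (an assumption of this sketch)
x <- data.frame(
  class = c("a", "a", "a", "b", "b", "b"),
  f1 = c(5, 0, 7, 0, 0, 3),
  f2 = c(1, 2, 3, 4, 0, 6)
)

# Occupancy: proportion of non-zero values of each feature within each class
occupancy <- aggregate(
  cbind(f1, f2) ~ class,
  data = x,
  FUN = function(v) mean(v > 0)
)

# Flag the class and feature combinations whose occupancy exceeds the threshold
eligible <- occupancy[, c("f1", "f2")] > 0.5

print(occupancy)
print(eligible)
```

Here `f1` is well occupied in class `a` but sparse in class `b`, which is the kind of case where class-wise and data-set-wide imputation can lead to different results.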
```r
## Each of the following examples shows the application of each imputation method and then
## a Linear Discriminant Analysis is plotted to show its effect on the data structure.

## Initial example data preparation
library(metaboData)

d <- analysisData(abr1$neg[,200:250],abr1$fact) %>%
  occupancyMaximum(occupancy = 2/3)

d %>%
  plotLDA(cls = 'day')

## Missing value imputation across all samples
d %>%
  imputeAll(parallel = 'no') %>%
  plotLDA(cls = 'day')

## Missing value imputation class-wise
d %>%
  imputeClass(cls = 'day') %>%
  plotLDA(cls = 'day')
```
Retain samples, classes or features in an AnalysisData object.
```r
keepClasses(d, cls = "class", classes = c())

## S4 method for signature 'AnalysisData'
keepClasses(d, cls = "class", classes = c())

keepFeatures(d, features = character())

## S4 method for signature 'AnalysisData'
keepFeatures(d, features = character())

keepSamples(d, idx = "fileOrder", samples = c())

## S4 method for signature 'AnalysisData'
keepSamples(d, idx = "fileOrder", samples = c())
```

| Argument | Description |
|---|---|
| d | S4 object of class AnalysisData |
| cls | info column to use for class information |
| classes | classes to keep |
| features | features to keep |
| idx | info column containing sample indexes |
| samples | sample indexes to keep |
An S4 object of class AnalysisData with specified samples, classes or features retained.
keepClasses: Keep classes.
keepFeatures: Keep features.
keepSamples: Keep samples.
```r
library(metaboData)

d <- analysisData(abr1$neg[,200:300],abr1$fact)

## Keep classes
d %>%
  keepClasses(cls = 'day',classes = 'H')

## Keep features
d %>%
  keepFeatures(features = c('N200','N201'))

## Keep samples
d %>%
  keepSamples(idx = 'injorder',samples = c(1,10))
```
Linear regression
```r
linearRegression(
  x,
  cls = "class",
  pAdjust = "bonferroni",
  returnModels = FALSE
)

## S4 method for signature 'AnalysisData'
linearRegression(
  x,
  cls = "class",
  pAdjust = "bonferroni",
  returnModels = FALSE
)
```

| Argument | Description |
|---|---|
| x | S4 object of class |
| cls | vector of sample information column names to regress |
| pAdjust | p value adjustment method |
| returnModels | should models be returned |
An S4 object of class Univariate.
library(metaboData)
d <- analysisData(abr1$neg[,200:300],abr1$fact)

## Perform linear regression
lr_analysis <- linearRegression(d,cls = 'injorder')

## Extract significant features
explanatoryFeatures(lr_analysis)
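As a rough illustration of what per-feature regression with p value adjustment involves, the base R sketch below fits one linear model per feature against a continuous response and applies a Bonferroni correction. It is a toy under stated assumptions, not the internals of linearRegression: the simulated data and the names features, injorder, p_values and adjusted are all hypothetical.

```r
## Toy data (hypothetical): 20 samples, 3 features; only f1 depends on injection order
set.seed(1)
injorder <- 1:20
features <- data.frame(
  f1 = 2 * injorder + rnorm(20),
  f2 = rnorm(20),
  f3 = rnorm(20)
)

## Fit one linear model per feature and collect the slope p value
p_values <- vapply(features, function(feature) {
  fit <- lm(feature ~ injorder)
  summary(fit)$coefficients["injorder", "Pr(>|t|)"]
}, numeric(1))

## Bonferroni adjustment across all tested features
adjusted <- p.adjust(p_values, method = "bonferroni")
adjusted
```

Any method accepted by p.adjust() could be substituted for "bonferroni" in this sketch.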
Multidimensional scaling of random forest proximities.
mds(x, dimensions = 2, idx = NULL)

## S4 method for signature 'RandomForest'
mds(x, dimensions = 2, idx = NULL)

## S4 method for signature 'list'
mds(x, dimensions = 2, idx = NULL)

## S4 method for signature 'Analysis'
mds(x, dimensions = 2, idx = NULL)
x |
S4 object of class |
dimensions |
The number of dimensions by which the data are to be represented. |
idx |
sample information column to use for sample names. If |
A tibble containing the scaled dimensions.
library(metaboData)
x <- analysisData(abr1$neg[,200:300],abr1$fact) %>%
  occupancyMaximum(cls = 'day') %>%
  transformTICnorm()
rf <- randomForest(x,cls = 'day')
mds(rf)
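The idea of scaling proximities to a small number of dimensions can be sketched with base R alone. The example below assumes that proximities are converted to dissimilarities as 1 - proximity and then passed to classical scaling with cmdscale(); whether mds() does exactly this is an assumption, and the proximity matrix is invented for illustration.

```r
## Hypothetical proximity matrix for 4 samples (1 = always in the same terminal node)
proximity <- matrix(
  c(1.0, 0.8, 0.1, 0.2,
    0.8, 1.0, 0.2, 0.1,
    0.1, 0.2, 1.0, 0.9,
    0.2, 0.1, 0.9, 1.0),
  nrow = 4,
  dimnames = list(paste0("s", 1:4), paste0("s", 1:4))
)

## Convert proximities to dissimilarities, then scale to 2 dimensions
scaled <- cmdscale(as.dist(1 - proximity), k = 2)
colnames(scaled) <- c("Dimension 1", "Dimension 2")
scaled
```

Samples with high mutual proximity (s1 with s2, s3 with s4) end up close together along the first dimension.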
Perform analyses containing multiple analysis element steps.
metabolyse(data, info, parameters = analysisParameters(), verbose = TRUE)

reAnalyse(analysis, parameters = analysisParameters(), verbose = TRUE)

## S4 method for signature 'Analysis'
reAnalyse(analysis, parameters = analysisParameters(), verbose = TRUE)
data |
tibble or data.frame containing data to analyse |
info |
tibble or data.frame containing data info or meta data |
parameters |
an object of AnalysisParameters class containing
parameters for analysis. Default calls |
verbose |
should output be printed to the console |
analysis |
an object of class Analysis containing previous analysis results |
Routine analyses are those made up of numerous steps for which the parameters have most likely already been established.
The emphasis here is on convenience with as little code as possible required.
In these analyses, the necessary analysis elements, order and parameters are first prepared and then the analysis routine subsequently performed in a single step.
The metabolyse function provides this utility, where the metabolome data, sample meta information and analysis parameters are provided.
The reAnalyse method can be used to perform further analyses on the results.
An S4 object of class Analysis.
library(metaboData)

## Generate analysis parameters
p <- analysisParameters(c('pre-treatment','modelling'))

## Alter pre-treatment and modelling parameters to use different methods
parameters(p,'pre-treatment') <- preTreatmentParameters(
  list(occupancyFilter = 'maximum',
       transform = 'TICnorm')
)
parameters(p,'modelling') <- modellingParameters('anova')

## Change "cls" parameters
changeParameter(p,'cls') <- 'day'

## Run analysis using a subset of the abr1 negative mode data set
analysis <- metabolyse(abr1$neg[,1:200],
                       abr1$fact,
                       p)

## Re-analyse to include correlation analysis
analysis <- reAnalyse(analysis,
                      parameters = analysisParameters('correlations'))

print(analysis)
Retrieve the available modelling methods and parameters.
modellingMethods()

modellingParameters(methods)
methods |
character vector of available modelling methods |
## Retrieve the available modelling methods
modellingMethods()

## Retrieve the modelling parameters for the anova method
p <- modellingParameters('anova')

## Assign the modelling parameters to analysis parameters
mp <- analysisParameters('modelling')
parameters(mp,'modelling') <- p

print(mp)
Calculate the class occupancies of all features in an AnalysisData object.
occupancy(d, cls = "class")

## S4 method for signature 'AnalysisData'
occupancy(d, cls = "class")
d |
S4 object of class |
cls |
sample information column for which to compute class occupancies |
A tibble containing feature class proportional occupancies.
library(metaboData)
d <- analysisData(abr1$neg[,200:300],abr1$fact)

occupancy(d,cls = 'day')
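To make the notion of proportional class occupancy concrete, the base R sketch below computes, for each class, the proportion of samples in which a feature has a non-zero value. Treating zero as "absent" is an assumption made for this illustration, and the intensities and day objects are hypothetical toy data rather than output from occupancy().

```r
## Toy intensity table (hypothetical): rows are samples, columns are features
intensities <- data.frame(
  f1 = c(10, 0, 12, 0, 0, 0),
  f2 = c(5, 6, 7, 8, 9, 0)
)
day <- c("H", "H", "H", "1", "1", "1")

## Proportion of non-zero values per class, for each feature
class_occupancy <- aggregate(
  intensities > 0,
  by = list(day = day),
  FUN = mean
)
class_occupancy
```

Here f1 is well represented in class H (2 of 3 samples) but absent from class 1, which is the kind of class-dependent pattern that occupancy filtering has to account for.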
Feature filtering based on class occupancy.
occupancyMaximum(d, cls = "class", occupancy = 2/3)

## S4 method for signature 'AnalysisData'
occupancyMaximum(d, cls = "class", occupancy = 2/3)

occupancyMinimum(d, cls = "class", occupancy = 2/3)

## S4 method for signature 'AnalysisData'
occupancyMinimum(d, cls = "class", occupancy = 2/3)
d |
S4 object of class |
cls |
sample information column name to use for class data |
occupancy |
feature occupancy filtering threshold, below which features will be removed |
Occupancy provides a useful metric by which to filter poorly represented features (features containing a majority zero or missing values). An occupancy threshold provides a means of specifying this majority with variables below the threshold excluded from further analyses. However, this can be complicated by an underlying class structure present within the data where a variable may be well represented within one class but not in another.
An S4 object of class AnalysisData containing the class occupancy filtered data.
occupancyMaximum: Maximum occupancy threshold feature filtering, where the maximum occupancy across all classes is required to be above the threshold. Therefore, for a feature to be retained, only a single class needs to have an occupancy above the threshold.
occupancyMinimum: Minimum occupancy threshold feature filtering, where the minimum occupancy across all classes is required to be above the threshold. Therefore, for a feature to be retained, all classes need to have an occupancy above the threshold.
## Each of the following examples shows the application
## of the feature occupancy filtering method and
## then a Principal Component Analysis is plotted to show
## its effect on the data structure.

## Initial example data preparation
library(metaboData)
d <- analysisData(abr1$neg[,200:300],abr1$fact)

## Maximum occupancy threshold feature filtering
d %>% occupancyMaximum(cls = 'day') %>% plotPCA(cls = 'day')

## Minimum occupancy threshold feature filtering
d %>% occupancyMinimum(cls = 'day') %>% plotPCA(cls = 'day')
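The difference between the maximum and minimum rules can be sketched in base R from a table of class occupancies. The occ matrix and threshold below are hypothetical, and the use of >= at the boundary is an assumption, since the text only states that features below the threshold are removed.

```r
## Hypothetical class occupancies: rows are classes, columns are features
occ <- rbind(
  H   = c(f1 = 1.0, f2 = 0.9, f3 = 0.2),
  `1` = c(f1 = 0.1, f2 = 0.8, f3 = 0.3)
)
threshold <- 2/3

## Maximum rule: keep a feature if at least one class reaches the threshold
keep_maximum <- colnames(occ)[apply(occ, 2, max) >= threshold]

## Minimum rule: keep a feature only if every class reaches the threshold
keep_minimum <- colnames(occ)[apply(occ, 2, min) >= threshold]

keep_maximum  # "f1" "f2"
keep_minimum  # "f2"
```

The minimum rule is the stricter of the two: f1 survives the maximum rule because class H is well occupied, but is dropped by the minimum rule because class 1 is not.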
Get or set parameters for AnalysisParameters or Analysis class objects.
parameters(d, ...)

## S4 method for signature 'AnalysisParameters'
parameters(d, element)

## S4 method for signature 'Analysis'
parameters(d)

parameters(d, element) <- value

## S4 replacement method for signature 'AnalysisParameters'
parameters(d, element) <- value

## S4 replacement method for signature 'Analysis'
parameters(d) <- value
d |
S4 object of class |
... |
arguments to pass to the appropriate method |
element |
analysis element for parameters to extract or assign.
Should be one of those returned by |
value |
list containing parameter values |
p <- analysisParameters('pre-treatment')

## extract pre-treatment parameters
parameters(p,'pre-treatment')

## set pre-treatment parameters
parameters(p,'pre-treatment') <- preTreatmentParameters(
  list(
    remove = 'classes',
    QC = c('RSDfilter','removeQC'),
    transform = 'TICnorm'
  )
)

print(p)
Import analysis parameters from a .yaml format file or export an AnalysisParameters object to .yaml format.
parseParameters(path)

exportParameters(d, file = "analysis_parameters.yaml")

## S4 method for signature 'AnalysisParameters'
exportParameters(d, file = "analysis_parameters.yaml")

## S4 method for signature 'Analysis'
exportParameters(d, file = "analysis_parameters.yaml")
path |
file path of .yaml file to parse |
d |
S4 object of class AnalysisParameters or Analysis |
file |
File name and path to export to |
## Import analysis parameters
paramFile <- system.file('defaultParameters.yaml',package = 'metabolyseR')
p <- parseParameters(paramFile)
p

## Not run:
## Export analysis parameters
exportParameters(p,file = 'analysis_parameters.yaml')

## End(Not run)
Plot a heatmap of explanatory features.
plotExplanatoryHeatmap(x, ...)

## S4 method for signature 'Univariate'
plotExplanatoryHeatmap(
  x, threshold = 0.05, title = "", distanceMeasure = "euclidean",
  clusterMethod = "ward.D2", featureNames = TRUE, dendrogram = TRUE,
  featureLimit = Inf, ...
)

## S4 method for signature 'RandomForest'
plotExplanatoryHeatmap(
  x, metric = "false_positive_rate", threshold = 0.05, title = "",
  distanceMeasure = "euclidean", clusterMethod = "ward.D2",
  featureNames = TRUE, dendrogram = TRUE, featureLimit = Inf, ...
)

## S4 method for signature 'list'
plotExplanatoryHeatmap(
  x, threshold = 0.05, distanceMeasure = "euclidean",
  clusterMethod = "ward.D2", featureNames = TRUE, featureLimit = Inf
)

## S4 method for signature 'Analysis'
plotExplanatoryHeatmap(
  x, threshold = 0.05, distanceMeasure = "euclidean",
  clusterMethod = "ward.D2", featureNames = TRUE, featureLimit = Inf
)
x |
object of class |
... |
arguments to pass to method |
threshold |
score threshold to use for specifying explanatory features |
title |
plot title |
distanceMeasure |
distance measure to use for clustering. See details. |
clusterMethod |
clustering method to use. See details |
featureNames |
should feature names be plotted? |
dendrogram |
TRUE/FALSE. Should the dendrogram be plotted? |
featureLimit |
The maximum number of features to plot |
metric |
importance metric on which to retrieve explanatory features |
Distance measures can be one of any that can be used for the method argument of dist().
Cluster methods can be one of any that can be used for the method argument of hclust().
library(metaboData)
x <- analysisData(data = abr1$neg[,200:300],info = abr1$fact)

## random forest classification example
random_forest <- randomForest(x,cls = 'day')
plotExplanatoryHeatmap(random_forest)

## random forest regression example
random_forest <- randomForest(x,cls = 'injorder')
plotExplanatoryHeatmap(random_forest,metric = '%IncMSE',threshold = 2)
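Since distanceMeasure and clusterMethod are described in terms of dist() and hclust(), the following base R sketch shows how the default values "euclidean" and "ward.D2" combine to order features. The matrix m is random toy data, and how the heatmap itself arranges its input is not shown here.

```r
set.seed(42)

## Toy matrix (hypothetical): 6 features (rows) summarised across 4 classes (columns)
m <- matrix(
  rnorm(24),
  nrow = 6,
  dimnames = list(paste0("f", 1:6), c("H", "1", "2", "3"))
)

## Pairwise distances between features, then hierarchical clustering
distances <- dist(m, method = "euclidean")
clusters <- hclust(distances, method = "ward.D2")

## Feature order as it would run along a clustered axis
clusters$labels[clusters$order]
```

Swapping in another value accepted by dist() (for example "manhattan") or hclust() (for example "average") changes only the two method arguments.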
Plot the trend of a feature.
plotFeature(analysis, feature, cls = "class", label = NULL, labelSize = 2, ...)

## S4 method for signature 'AnalysisData'
plotFeature(analysis, feature, cls = "class", label = NULL, labelSize = 2)

## S4 method for signature 'Analysis'
plotFeature(
  analysis, feature, cls = "class", label = NULL, labelSize = 2,
  type = c("pre-treated", "raw")
)
analysis |
an object of class |
feature |
feature name to plot |
cls |
information column to use for class labels |
label |
information column to use for sample labels |
labelSize |
sample label size |
... |
arguments to pass to the appropriate method |
type |
|
d <- analysisData(metaboData::abr1$neg,
                  metaboData::abr1$fact)

## Plot a categorical response variable
plotFeature(d,'N133',cls = 'day')

## Plot a continuous response variable
plotFeature(d,'N133',cls = 'injorder')
Plot Univariate or random forest feature importance.
plotImportance(x, ...)

## S4 method for signature 'Univariate'
plotImportance(x, response = "class", rank = TRUE, threshold = 0.05)

## S4 method for signature 'RandomForest'
plotImportance(x, metric = "false_positive_rate", rank = TRUE)

## S4 method for signature 'list'
plotImportance(x, metric = "false_positive_rate")
x |
S4 object of class |
... |
arguments to pass to specific method |
response |
response results to plot |
rank |
rank feature order for plotting |
threshold |
explanatory threshold line for the output plot |
metric |
importance metric to plot |
library(metaboData)
x <- analysisData(abr1$neg[,200:300],abr1$fact) %>%
  keepClasses(cls = 'day',classes = c('H','1','5')) %>%
  occupancyMaximum(cls = 'day') %>%
  transformTICnorm()
rf <- randomForest(x,cls = 'day')
plotImportance(rf,rank = FALSE)
Plot linear discriminant analysis results of pre-treated data
plotLDA(
  analysis, cls = "class", label = NULL, scale = TRUE, center = TRUE,
  xAxis = "DF1", yAxis = "DF2", shape = FALSE, ellipses = TRUE,
  title = "PC-LDA", legendPosition = "bottom", labelSize = 2, ...
)

## S4 method for signature 'AnalysisData'
plotLDA(
  analysis, cls = "class", label = NULL, scale = TRUE, center = TRUE,
  xAxis = "DF1", yAxis = "DF2", shape = FALSE, ellipses = TRUE,
  title = "PC-LDA", legendPosition = "bottom", labelSize = 2
)

## S4 method for signature 'Analysis'
plotLDA(
  analysis, cls = "class", label = NULL, scale = TRUE, center = TRUE,
  xAxis = "DF1", yAxis = "DF2", shape = FALSE, ellipses = TRUE,
  title = "PC-LDA", legendPosition = "bottom", labelSize = 2,
  type = c("pre-treated", "raw")
)
analysis |
S4 object of class |
cls |
name of sample information column to use for class labels |
label |
name of sample information column to use for sample labels. Set to NULL for no labels. |
scale |
scale the data |
center |
center the data |
xAxis |
discriminant function to plot on the x-axis |
yAxis |
discriminant function to plot on the y-axis |
shape |
TRUE/FALSE use shape aesthetic for plot points. Defaults to TRUE when the number of classes is greater than 12 |
ellipses |
TRUE/FALSE, plot multivariate normal distribution 95% confidence ellipses for each class |
title |
plot title |
legendPosition |
legend position to pass to legend.position argument
of |
labelSize |
label size. Ignored if |
... |
arguments to pass to the appropriate method |
type |
|
library(metaboData)
d <- analysisData(abr1$neg,abr1$fact) %>%
  occupancyMaximum(cls = 'day')

## LDA plot
plotLDA(d,cls = 'day')
Plot multidimensional scaling plot for a RandomForest class object.
plotMDS(
  x, cls = "class", label = NULL, shape = FALSE, ellipses = TRUE,
  title = "", legendPosition = "bottom", labelSize = 2
)

## S4 method for signature 'RandomForest'
plotMDS(
  x, cls = "class", label = NULL, shape = FALSE, ellipses = TRUE,
  title = "", legendPosition = "bottom", labelSize = 2
)

## S4 method for signature 'list'
plotMDS(
  x, label = NULL, shape = FALSE, ellipses = TRUE,
  title = "", legendPosition = "bottom", labelSize = 2
)
x |
S4 object of class |
cls |
sample information column to use for sample labelling. Set to NULL for no labelling. |
label |
sample information column to use for sample labels. Set to NULL for no labels. |
shape |
TRUE/FALSE use shape aesthetic for plot points. Defaults to TRUE when the number of classes is greater than 12 |
ellipses |
TRUE/FALSE, plot multivariate normal distribution 95% confidence ellipses for each class |
title |
plot title |
legendPosition |
legend position to pass to legend.position argument
of |
labelSize |
label size. Ignored if |
library(metaboData)
x <- analysisData(abr1$neg[,200:300],abr1$fact) %>%
  occupancyMaximum(cls = 'day') %>%
  transformTICnorm()
rf <- randomForest(x,cls = 'day')
plotMDS(rf,cls = 'day')
Plot random forest model performance metrics
plotMetrics(x, response = "class")

## S4 method for signature 'RandomForest'
plotMetrics(x)

## S4 method for signature 'list'
plotMetrics(x)
x |
S4 object of class |
response |
response results to plot |
library(metaboData)
x <- analysisData(abr1$neg[,200:300],abr1$fact) %>%
  keepClasses(cls = 'day',classes = c('H','1','5')) %>%
  occupancyMaximum(cls = 'day') %>%
  transformTICnorm()
rf <- randomForest(x,cls = 'day',binary = TRUE)
plotMetrics(rf,response = 'day')
Plot class occupancy distributions.
plotOccupancy(x, cls = "class", ...)

## S4 method for signature 'AnalysisData'
plotOccupancy(x, cls = "class")

## S4 method for signature 'Analysis'
plotOccupancy(x, cls = "class", type = "raw")
x |
S4 object of class |
cls |
sample information column to use for class labels |
... |
arguments to pass to the appropriate method |
type |
|
library(metaboData)
d <- analysisData(abr1$neg,abr1$fact)

## Plot class occupancy distributions
plotOccupancy(d,cls = 'day')
Plot Principal Component Analysis results.
plotPCA(
  analysis, cls = "class", label = NULL, scale = TRUE, center = TRUE,
  xAxis = "PC1", yAxis = "PC2", shape = FALSE, ellipses = TRUE,
  title = "PCA", legendPosition = "bottom", labelSize = 2, ...
)

## S4 method for signature 'AnalysisData'
plotPCA(
  analysis, cls = "class", label = NULL, scale = TRUE, center = TRUE,
  xAxis = "PC1", yAxis = "PC2", shape = FALSE, ellipses = TRUE,
  title = "Principle Component Analysis (PCA)",
  legendPosition = "bottom", labelSize = 2
)

## S4 method for signature 'Analysis'
plotPCA(
  analysis, cls = "class", label = NULL, scale = TRUE, center = TRUE,
  xAxis = "PC1", yAxis = "PC2", shape = FALSE, ellipses = TRUE,
  title = "PCA", legendPosition = "bottom", labelSize = 2,
  type = c("pre-treated", "raw")
)
analysis |
object of class |
cls |
name of class information column to use for sample labelling |
label |
name of class information column to use for sample labels. Set to NULL for no labels. |
scale |
scale the data |
center |
center the data |
xAxis |
principal component to plot on the x-axis |
yAxis |
principal component to plot on the y-axis |
shape |
TRUE/FALSE use shape aesthetic for plot points. Defaults to TRUE when the number of classes is greater than 12 |
ellipses |
TRUE/FALSE, plot multivariate normal distribution 95% confidence ellipses for each class |
title |
plot title |
legendPosition |
legend position to pass to legend.position argument
of |
labelSize |
label size. Ignored if |
... |
arguments to pass to the appropriate method |
type |
|
library(metaboData)
d <- analysisData(abr1$neg,abr1$fact) %>%
  occupancyMaximum(cls = 'day')

## PCA plot
plotPCA(d,cls = 'day')
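The scale, center, xAxis and yAxis arguments map naturally onto a standard principal component analysis. The base R sketch below uses prcomp() on random toy data to show what centring, scaling and selecting two components looks like; it is an assumption that plotPCA works along these lines, and the object names are illustrative only.

```r
set.seed(7)

## Toy data (hypothetical): 10 samples by 5 features
x <- matrix(rnorm(50), nrow = 10, dimnames = list(NULL, paste0("f", 1:5)))

## Centre and scale each feature before extracting components
pca <- prcomp(x, center = TRUE, scale. = TRUE)

## Sample scores on the two components that would be plotted
scores <- pca$x[, c("PC1", "PC2")]

## Proportion of total variance captured by each component
variance_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(variance_explained, 3)
```

Choosing other components for the axes amounts to selecting different columns of pca$x, for example c("PC2", "PC3").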
Plot receiver operating characteristic curves for a RandomForest class object.
plotROC(x, title = "", legendPosition = "bottom")

## S4 method for signature 'RandomForest'
plotROC(x, title = "", legendPosition = "bottom")

## S4 method for signature 'list'
plotROC(x, title = "", legendPosition = "bottom")
x |
S4 object of class |
title |
plot title |
legendPosition |
legend position to pass to legend.position
argument of |
library(metaboData)
x <- analysisData(abr1$neg[,200:300],abr1$fact) %>%
  occupancyMaximum(cls = 'day') %>%
  transformTICnorm()
rf <- randomForest(x,cls = 'day')
plotROC(rf)
Plot RSD distributions of raw data in quality control samples.
plotRSD(analysis, cls = "class", ...)

## S4 method for signature 'AnalysisData'
plotRSD(analysis, cls = "class")

## S4 method for signature 'Analysis'
plotRSD(analysis, cls = "class", type = "raw")
analysis |
object of class |
cls |
information column to use for class labels |
... |
arguments to pass to the appropriate method |
type |
|
library(metaboData)
d <- analysisData(abr1$neg,abr1$fact)

## Plot class RSD distributions
plotRSD(d,cls = 'day')
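For readers unfamiliar with the statistic, the relative standard deviation (RSD) of a feature is its standard deviation divided by its mean. The base R sketch below computes it per feature for a hypothetical set of quality control injections; expressing it as a percentage is an assumption made here, and the qc values are invented.

```r
## Hypothetical QC injections: rows are samples, columns are features
qc <- data.frame(
  f1 = c(100, 102, 98, 101),  # stable feature
  f2 = c(10, 40, 5, 25)       # highly variable feature
)

## RSD per feature, as a percentage of the mean
rsd <- vapply(qc, function(v) sd(v) / mean(v) * 100, numeric(1))
round(rsd, 1)
```

A feature such as f2, whose RSD is large across repeated injections of the same material, is the kind of feature an RSD-based filter would be expected to flag.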
A multidimensional scaling (MDS) plot of supervised random forest analysis
plotSupervisedRF(
  x, cls = "class", rf = list(), label = NULL, shape = FALSE,
  ellipses = TRUE, ROC = TRUE, seed = 1234, title = "",
  legendPosition = "bottom", labelSize = 2, ...
)

## S4 method for signature 'AnalysisData'
plotSupervisedRF(
  x, cls = "class", rf = list(), label = NULL, shape = FALSE,
  ellipses = TRUE, ROC = TRUE, seed = 1234, title = "",
  legendPosition = "bottom", labelSize = 2
)

## S4 method for signature 'Analysis'
plotSupervisedRF(
  x, cls = "class", rf = list(), label = NULL, shape = FALSE,
  ellipses = TRUE, ROC = TRUE, seed = 1234, title = "",
  legendPosition = "bottom", labelSize = 2,
  type = c("pre-treated", "raw")
)
x |
object of class |
cls |
information column to use for sample classes |
rf |
list of additional parameters to pass to |
label |
information column to use for sample labels. Set to |
shape |
TRUE/FALSE use shape aesthetic for plot points. Defaults to TRUE when the number of classes is greater than 12 |
ellipses |
TRUE/FALSE, plot multivariate normal distribution 95% confidence ellipses for each class |
ROC |
should receiver-operator characteristics be plotted? |
seed |
random number seed |
title |
plot title |
legendPosition |
legend position to pass to legend.position argument
of |
labelSize |
label size. Ignored if |
... |
arguments to pass to the appropriate method |
type |
|
library(metaboData)
d <- analysisData(abr1$neg[,200:300],abr1$fact)

## Supervised random forest MDS plot
plotSupervisedRF(d,cls = 'day')
Plot total ion counts of sample data.
plotTIC(analysis, by = "injOrder", colour = "block", ...)

## S4 method for signature 'AnalysisData'
plotTIC(analysis, by = "injOrder", colour = "block")

## S4 method for signature 'Analysis'
plotTIC(
  analysis, by = "injOrder", colour = "block",
  type = c("pre-treated", "raw")
)
analysis |
S4 object of class |
by |
information column to plot against |
colour |
information column to provide colour labels |
... |
arguments to pass to the appropriate method |
type |
|
library(metaboData)
d <- analysisData(abr1$neg,abr1$fact)

## Plot sample TICs
plotTIC(d,by = 'injorder',colour = 'day')

plotTIC(d,by = 'day',colour = 'day')
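A sample's total ion count (TIC) can be thought of as the sum of its feature intensities. The base R sketch below computes it as a row sum and then divides each sample by its own total. That definition, and the link to a TIC-style normalisation, are assumptions made for illustration, and the intensities table is hypothetical.

```r
## Hypothetical intensities: rows are samples, columns are features
intensities <- data.frame(
  f1 = c(10, 20, 30),
  f2 = c(1, 2, 3)
)

## Total ion count per sample
tic <- rowSums(intensities)
tic

## Divide every sample (row) by its own total
tic_normalised <- sweep(as.matrix(intensities), 1, tic, "/")
rowSums(tic_normalised)
```

After this kind of normalisation every sample sums to 1, so differences between samples reflect relative rather than absolute intensity.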
A multidimensional scaling (MDS) plot of unsupervised random forest analysis
plotUnsupervisedRF(
  x, cls = "class", rf = list(), label = NULL, shape = FALSE,
  ellipses = TRUE, seed = 1234, title = "",
  legendPosition = "bottom", labelSize = 2, ...
)

## S4 method for signature 'AnalysisData'
plotUnsupervisedRF(
  x, cls = "class", rf = list(), label = NULL, shape = FALSE,
  ellipses = TRUE, seed = 1234, title = "",
  legendPosition = "bottom", labelSize = 2
)

## S4 method for signature 'Analysis'
plotUnsupervisedRF(
  x, cls = "class", rf = list(), label = NULL, shape = FALSE,
  ellipses = TRUE, seed = 1234, title = "",
  legendPosition = "bottom", labelSize = 2,
  type = c("pre-treated", "raw")
)
x |
object of class |
cls |
sample information column to use for sample labelling |
rf |
list of additional parameters to pass to |
label |
info column to use for sample labels. Set to NULL for no labels. |
shape |
TRUE/FALSE use shape aesthetic for plot points. Defaults to TRUE when the number of classes is greater than 12 |
ellipses |
TRUE/FALSE, plot multivariate normal distribution 95% confidence ellipses for each class |
seed |
random number seed |
title |
plot title |
legendPosition |
legend position to pass to legend.position argument
of |
labelSize |
label size. Ignored if |
... |
arguments to pass to the appropriate method |
type |
|
library(metaboData)

d <- analysisData(abr1$neg[,200:300],abr1$fact)

## Unsupervised random forest MDS plot
plotUnsupervisedRF(d,cls = 'day')
Predict values of random forest model response variables from new data.
predict(
  model,
  new_data,
  idx = NULL,
  type = c("response", "prob", "votes"),
  ...
)

## S4 method for signature 'RandomForest,AnalysisData'
predict(
  model,
  new_data,
  idx = NULL,
  type = c("response", "prob", "votes"),
  ...
)
model |
S4 object of class |
new_data |
S4 object of class |
idx |
sample information column to use for sample names. If |
type |
one of |
... |
arguments to pass to |
The features contained within new_data should match those of the features used to train model.
The features() method can be used to check this.
The argument returnModels = TRUE should also be used when training the RandomForest-class object used for argument model.
library(metaboData)

## Prepare some data
x <- analysisData(abr1$neg[,200:300],abr1$fact) %>%
  occupancyMaximum(cls = 'day') %>%
  transformTICnorm()

## Extract data from which to train a random forest model
training_data <- x %>%
  keepClasses(cls = 'day', classes = c('H','1'))

## Extract data for which response values will be predicted
test_data <- x %>%
  keepClasses(cls = 'day', classes = c('2','3'))

rf <- randomForest(training_data, cls = 'day', returnModels = TRUE)

predict(rf, test_data)
Return pre-treatment elements, methods and parameters.
preTreatmentElements()

preTreatmentMethods(element)

preTreatmentParameters(methods)
element |
pre-treatment element name |
methods |
a named list of element methods |
## Return the available pre-treatment elements
preTreatmentElements()

## Return the available pre-treatment methods for the remove element
preTreatmentMethods('remove')

## Define some default pre-treatment parameters
p <- preTreatmentParameters(
  list(
    remove = 'classes',
    QC = c('RSDfilter','removeQC'),
    transform = 'TICnorm'
  )
)

## Assign the pre-treatment parameters to analysis parameters
ap <- analysisParameters('pre-treatment')
parameters(ap,'pre-treatment') <- p

print(ap)
Quality control (QC) sample pre-treatment methods.
QCimpute(
  d,
  cls = "class",
  QCidx = "QC",
  occupancy = 2/3,
  parallel = "variables",
  seed = 1234
)

## S4 method for signature 'AnalysisData'
QCimpute(
  d,
  cls = "class",
  QCidx = "QC",
  occupancy = 2/3,
  parallel = "variables",
  seed = 1234
)

QCoccupancy(d, cls = "class", QCidx = "QC", occupancy = 2/3)

## S4 method for signature 'AnalysisData'
QCoccupancy(d, cls = "class", QCidx = "QC", occupancy = 2/3)

QCremove(d, cls = "class", QCidx = "QC")

## S4 method for signature 'AnalysisData'
QCremove(d, cls = "class", QCidx = "QC")

QCrsdFilter(d, cls = "class", QCidx = "QC", RSDthresh = 50)

## S4 method for signature 'AnalysisData'
QCrsdFilter(d, cls = "class", QCidx = "QC", RSDthresh = 50)
d |
S4 object of class AnalysisData |
cls |
info column to use for class labels |
QCidx |
QC sample label |
occupancy |
occupancy threshold for filtering |
parallel |
parallel type to use. See |
seed |
random number seed |
RSDthresh |
RSD (%) threshold for filtering |
A QC sample is an average pooled sample, equally representative in composition of all the samples present within an experimental set. Within an analytical run, the QC sample is analysed at equal intervals throughout the run. If there is class structure within the run, the injection order should be randomised within blocks so that the classes are equally represented in each block throughout the run. A QC sample can then be injected and analysed between these randomised blocks. This provides a set of technical injections that allows the variability in instrument performance over the run to be accounted for and the robustness of the acquired variables to be assessed.
The technical reproducibility of an acquired variable can be assessed using its relative standard deviation (RSD) within the QC samples. Variables can then be filtered on their RSD against a threshold value to remove metabolome features that are poorly reproducible across the analytical run. This variable filtering strategy has an advantage over that of occupancy alone as it is not dependent on underlying class structure. Therefore, the variables and variable numbers will not alter if a new class structure is imposed upon the data.
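The RSD filtering idea described above can be sketched in base R. This is an illustration only and not the code used by QCrsdFilter: the intensity matrix, the sample labels and the strict "less than" comparison against the threshold are all assumptions made for the example.

```r
## Hypothetical intensities: rows are injections, columns are features
intensities <- cbind(
  stable   = c(100, 102, 98, 101, 99, 100),
  unstable = c(100, 10, 400, 40, 10, 5)
)

## Hypothetical class labels, with 'QC' marking the pooled QC injections
sample_class <- c('QC', 'A', 'QC', 'B', 'QC', 'A')

## RSD (%) of each feature, calculated from the QC injections only
qc <- intensities[sample_class == 'QC', , drop = FALSE]
qc_rsd <- apply(qc, 2, function(v) sd(v) / mean(v) * 100)

## Keep the features whose QC RSD falls below the threshold (assumed strict)
RSDthresh <- 50
filtered <- intensities[, qc_rsd < RSDthresh, drop = FALSE]

print(round(qc_rsd, 1))
print(colnames(filtered))
```

Because the RSD is calculated from the QC injections alone, the set of retained features does not change if the biological samples are relabelled with a different class structure.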
An S4 object of class AnalysisData containing QC treated data.
QCimpute: Missing value imputation of QC samples.
QCoccupancy: Feature maximum occupancy filtering based on QC samples.
QCremove: Remove QC samples.
QCrsdFilter: Feature filtering based on the RSD of QC sample features.
## Initial example data preparation
library(metaboData)

d <- analysisData(abr1$neg[,1:1000],abr1$fact)

## Plot the feature RSD distributions of the H class only
d %>%
  keepClasses(cls = 'day',classes = 'H') %>%
  plotRSD(cls = 'day')

## Apply QC feature occupancy filtering and QC feature RSD filtering
QC_treated <- d %>%
  QCoccupancy(cls = 'day',QCidx = 'H',occupancy = 2/3) %>%
  QCrsdFilter(cls = 'day',QCidx = 'H',RSDthresh = 50)

print(QC_treated)

## Plot the feature RSD distributions of the H class after QC treatments
QC_treated %>%
  keepClasses(cls = 'day',classes = 'H') %>%
  plotRSD(cls = 'day')
Perform random forest on an AnalysisData object
randomForest(
  x,
  cls = "class",
  rf = list(),
  reps = 1,
  binary = FALSE,
  comparisons = list(),
  perm = 0,
  returnModels = FALSE,
  seed = 1234
)

## S4 method for signature 'AnalysisData'
randomForest(
  x,
  cls = "class",
  rf = list(),
  reps = 1,
  binary = FALSE,
  comparisons = list(),
  perm = 0,
  returnModels = FALSE,
  seed = 1234
)
x |
S4 object of class |
cls |
vector of sample information columns to use for response variable information. Set to NULL for unsupervised. |
rf |
named list of arguments to pass to |
reps |
number of repetitions to perform |
binary |
TRUE/FALSE should binary comparisons be performed. Ignored for unsupervised and regression. Ignored if |
comparisons |
list of comparisons to perform. Ignored for unsupervised and regression. See details. |
perm |
number of permutations to perform. Ignored for unsupervised. |
returnModels |
TRUE/FALSE should model objects be returned. |
seed |
random number seed |
Specified class comparisons should be given as a list named
according to cls. Each comparison should be given as class names
separated by '~' (e.g. '1~2~H').
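As a hedged illustration of the format described above, the following base R sketch builds a comparisons list for a `day` information column and splits each comparison into its class names. The class labels are hypothetical and the parsing step is only there to show how the '~' separator reads. It is not taken from the package internals.

```r
## A comparisons list named according to the `cls` column it applies to
comparisons <- list(day = c('H~1', 'H~2', '1~2~H'))

## Illustrative parsing only: split each comparison into its class names
classes_per_comparison <- lapply(
  comparisons$day,
  function(comparison) strsplit(comparison, '~', fixed = TRUE)[[1]]
)
names(classes_per_comparison) <- comparisons$day

print(classes_per_comparison)
```

A list of this shape would be supplied through the `comparisons` argument alongside `cls = 'day'`.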
An S4 object of class RandomForest.
library(metaboData)

x <- analysisData(abr1$neg[,200:300],abr1$fact) %>%
  occupancyMaximum(cls = 'day') %>%
  transformTICnorm()

rf <- randomForest(x,cls = 'day')

plotMDS(rf,cls = 'day')
An S4 class for random forest results and models.
type: random forest type
response: response variable name
metrics: tibble of model performance metrics
predictions: tibble of model observation predictions
permutations: list of permutations measure and importance results tables
importances: tibble of model feature importances
proximities: tibble of model observation proximities
models: list of random forest models
Exclusion of samples, classes or features from an AnalysisData object.
removeClasses(d, cls = "class", classes = c())

## S4 method for signature 'AnalysisData'
removeClasses(d, cls = "class", classes = c())

removeFeatures(d, features = character())

## S4 method for signature 'AnalysisData'
removeFeatures(d, features = character())

removeSamples(d, idx = "fileOrder", samples = c())

## S4 method for signature 'AnalysisData'
removeSamples(d, idx = "fileOrder", samples = c())
d |
S4 object of class |
cls |
info column to use for class information |
classes |
classes to remove |
features |
features to remove |
idx |
info column containing sample indexes |
samples |
sample indexes to remove |
An S4 object of class AnalysisData with samples, classes or features removed.
removeClasses: Remove classes.
removeFeatures: Remove features.
removeSamples: Remove samples.
library(metaboData)

d <- analysisData(abr1$neg[,200:300],abr1$fact)

## Remove classes
d %>%
  removeClasses(cls = 'day',classes = 'H')

## Remove features
d %>%
  removeFeatures(features = c('N200','N201'))

## Remove samples
d %>%
  removeSamples(idx = 'injorder',samples = c(1,10))
ROC curves for out-of-bag random forest predictions.
roc(x)

## S4 method for signature 'RandomForest'
roc(x)

## S4 method for signature 'list'
roc(x)

## S4 method for signature 'Analysis'
roc(x)
x |
S4 object of class |
A tibble containing the ROC curves.
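For readers unfamiliar with how an ROC curve is derived from class probabilities, the following is a minimal base R sketch for a two-class case. The truth labels and scores are hypothetical, and the calculation is a generic one. It is not the routine that roc() uses and the column names of the returned tibble may differ.

```r
## Hypothetical out-of-bag probabilities for the positive class
truth <- c(1, 1, 0, 1, 0, 0)
score <- c(0.9, 0.8, 0.7, 0.6, 0.3, 0.1)

## Order observations by decreasing score, then accumulate
## true positives and false positives as the threshold is lowered
ord <- order(score, decreasing = TRUE)
tpr <- c(0, cumsum(truth[ord] == 1) / sum(truth == 1))
fpr <- c(0, cumsum(truth[ord] == 0) / sum(truth == 0))

roc_curve <- data.frame(fpr = fpr, tpr = tpr)

## Area under the curve by the trapezium rule
auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)

print(roc_curve)
print(auc)
```

Each row of `roc_curve` is one point on the curve, running from (0, 0) to (1, 1).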
library(metaboData)

x <- analysisData(abr1$neg[,200:300],abr1$fact) %>%
  occupancyMaximum(cls = 'day') %>%
  transformTICnorm()

rf <- randomForest(x,cls = 'day')

roc(rf)
Calculate relative standard deviation (RSD) percentage values for each feature per class for a given sample information column.
rsd(x, cls = "class")

## S4 method for signature 'AnalysisData'
rsd(x, cls = "class")
x |
S4 object of class |
cls |
sample information column to use for class structure |
A tibble containing the computed RSD values.
library(metaboData)

d <- analysisData(abr1$neg[,200:300],abr1$fact)

rsd(d,cls = 'day')
Split an object of class AnalysisData into a list based on a class grouping variable.
split(x, cls = "class")

## S4 method for signature 'AnalysisData'
split(x, cls = "class")
x |
S4 object of class |
cls |
sample information column to use for splitting |
A list of AnalysisData objects.
library(metaboData)

d <- analysisData(abr1$neg,abr1$fact)

## Split the data set based on the 'day' class information column
d <- split(d,cls = 'day')

print(d)
Methods for data scaling, transformation and normalisation.
transformArcSine(d)

## S4 method for signature 'AnalysisData'
transformArcSine(d)

transformAuto(d)

## S4 method for signature 'AnalysisData'
transformAuto(d)

transformCenter(d)

## S4 method for signature 'AnalysisData'
transformCenter(d)

transformLevel(d)

## S4 method for signature 'AnalysisData'
transformLevel(d)

transformLn(d, add = 1)

## S4 method for signature 'AnalysisData'
transformLn(d, add = 1)

transformLog10(d, add = 1)

## S4 method for signature 'AnalysisData'
transformLog10(d, add = 1)

transformPareto(d)

## S4 method for signature 'AnalysisData'
transformPareto(d)

transformPercent(d)

## S4 method for signature 'AnalysisData'
transformPercent(d)

transformRange(d)

## S4 method for signature 'AnalysisData'
transformRange(d)

transformSQRT(d)

## S4 method for signature 'AnalysisData'
transformSQRT(d)

transformTICnorm(d, refactor = TRUE)

## S4 method for signature 'AnalysisData'
transformTICnorm(d, refactor = TRUE)

transformVast(d)

## S4 method for signature 'AnalysisData'
transformVast(d)
d |
S4 object of class |
add |
value to add prior to transformation |
refactor |
TRUE/FALSE. Re-factor the normalised intensity values to a range consistent with the raw values by multiplying by the median sample TIC. |
Prior to downstream analyses, metabolomics data often require transformation to fulfil the assumptions of a particular statistical/data mining technique. Before applying a transformation, it is important to consider the effects that the transformation will have on the data, as this can greatly affect the outcome of further downstream analyses. It is also important to consider at what stage in the pre-treatment routine a transformation is applied as this too could introduce artefacts into the data. The best practice is to apply a transformation as the last step in a pre-treatment routine, after all other steps have been taken. A wide range of transformation methods commonly used for the analysis of metabolomics data is available.
An S4 object of class AnalysisData containing the transformed data.
transformArcSine: Arc-sine transformation.
transformAuto: Auto scaling.
transformCenter: Mean centring.
transformLevel: Level scaling.
transformLn: Natural logarithmic transformation.
transformLog10: Base 10 logarithmic transformation.
transformPareto: Pareto scaling.
transformPercent: Scale as a percentage of the feature maximum intensity.
transformRange: Range scaling. Also known as min-max scaling.
transformSQRT: Square root transformation.
transformTICnorm: Total ion count normalisation.
transformVast: Vast scaling.
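To make a few of these transformations concrete, the sketch below applies the textbook definitions of total ion count normalisation (with the median TIC re-factoring described for the `refactor` argument), auto scaling and Pareto scaling to a small hypothetical matrix in base R. It assumes the conventional definitions of these methods. The package implementations may differ in detail, for example in how missing values are handled.

```r
## Hypothetical data: 3 samples (rows) by 2 features (columns)
x <- cbind(
  f1 = c(10, 20, 30),
  f2 = c(90, 180, 270)
)

## Total ion count (TIC) normalisation: divide each sample by its total
tic <- rowSums(x)
tic_norm <- x / tic

## Re-factor to the scale of the raw values using the median sample TIC
tic_refactored <- tic_norm * median(tic)

## Mean centring, then auto scaling (divide by the feature SD)
## and Pareto scaling (divide by the square root of the feature SD)
centred <- sweep(x, 2, colMeans(x), '-')
feature_sd <- apply(x, 2, sd)
auto_scaled <- sweep(centred, 2, feature_sd, '/')
pareto_scaled <- sweep(centred, 2, sqrt(feature_sd), '/')

print(tic_refactored)
print(auto_scaled)
print(pareto_scaled)
```

In this toy data every sample has the same composition (10% `f1`, 90% `f2`), so TIC normalisation makes all three samples identical, which shows how it removes differences in overall signal between samples.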
## Each of the following examples shows the application of the transformation and then
## a Linear Discriminant Analysis is plotted to show its effect on the data structure.

## Initial example data preparation
library(metaboData)

d <- analysisData(abr1$neg[,200:300],abr1$fact) %>%
  occupancyMaximum(occupancy = 2/3)

d %>%
  plotLDA(cls = 'day')

## Arc-sine transformation
d %>%
  transformArcSine() %>%
  plotLDA(cls = 'day')

## Auto scaling
d %>%
  transformAuto() %>%
  plotLDA(cls = 'day')

## Mean centring
d %>%
  transformCenter() %>%
  plotLDA(cls = 'day')

## Level scaling
d %>%
  transformLevel() %>%
  plotLDA(cls = 'day')

## Natural logarithmic transformation
d %>%
  transformLn() %>%
  plotLDA(cls = 'day')

## Logarithmic transformation
d %>%
  transformLog10() %>%
  plotLDA(cls = 'day')

## Pareto scaling
d %>%
  transformPareto() %>%
  plotLDA(cls = 'day')

## Percentage scaling
d %>%
  transformPercent() %>%
  plotLDA(cls = 'day')

## Range scaling
d %>%
  transformRange() %>%
  plotLDA(cls = 'day')

## Square root transformation
d %>%
  transformSQRT() %>%
  plotLDA(cls = 'day')

## Total ion count normalisation
d %>%
  transformTICnorm() %>%
  plotLDA(cls = 'day')

## Vast scaling
d %>%
  transformVast() %>%
  plotLDA(cls = 'day')
Welch's t-test
ttest(
  x,
  cls = "class",
  pAdjust = "bonferroni",
  comparisons = list(),
  returnModels = FALSE
)

## S4 method for signature 'AnalysisData'
ttest(
  x,
  cls = "class",
  pAdjust = "bonferroni",
  comparisons = list(),
  returnModels = FALSE
)
x |
S4 object of class AnalysisData |
cls |
vector of sample information column names to analyse |
pAdjust |
p value adjustment method |
comparisons |
named list of binary comparisons to analyse |
returnModels |
should models be returned |
An S4 object of class Univariate.
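The underlying calculation can be sketched with base R alone: a Welch's t-test per feature for a single binary comparison, followed by Bonferroni adjustment of the p values. The data below are hypothetical and this is not the code run by ttest(). It only shows the statistics that the `pAdjust` default refers to.

```r
## Hypothetical intensities for two classes across two features
cls <- factor(rep(c('H', '5'), each = 4))
features <- data.frame(
  N1 = c(10, 11, 9, 10, 20, 21, 19, 20),
  N2 = c(5, 7, 6, 4, 6, 5, 7, 4)
)

## Welch's t-test per feature; var.equal = FALSE (the t.test() default)
## means that equal class variances are not assumed
p_values <- vapply(
  features,
  function(v) t.test(v ~ cls, var.equal = FALSE)$p.value,
  numeric(1)
)

## Bonferroni adjustment across the tested features
adjusted <- p.adjust(p_values, method = 'bonferroni')

print(adjusted)
```

Here `N1` differs clearly between the two classes while `N2` has identical class means, so only `N1` remains significant after adjustment.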
library(metaboData)

d <- analysisData(abr1$neg[,200:300],abr1$fact) %>%
  keepClasses(cls = 'day',classes = c('H','5'))

## Perform t-test
ttest_analysis <- ttest(d,cls = 'day')

## Extract significant features
explanatoryFeatures(ttest_analysis)
Tune the mtry and ntree random forest parameters using a grid search approach.
tune(
  x,
  cls = "class",
  mtry_range = floor(seq(mtry(x, cls = cls) - mtry(x, cls = cls)/2,
                         mtry(x, cls = cls) + mtry(x, cls = cls)/2,
                         length.out = 4)),
  ntree_range = 1000,
  seed = 1234
)

## S4 method for signature 'AnalysisData'
tune(
  x,
  cls = "class",
  mtry_range = floor(seq(mtry(x, cls = cls) - mtry(x, cls = cls)/2,
                         mtry(x, cls = cls) + mtry(x, cls = cls)/2,
                         length.out = 4)),
  ntree_range = 1000,
  seed = 1234
)
x |
S4 object of class |
cls |
sample information column to use |
mtry_range |
numeric vector of |
ntree_range |
numeric vector of |
seed |
random number seed |
Parameter tuning is performed by grid search of all combinations of the mtry_range and ntree_range vectors provided.
The optimal parameter values are selected using the out-of-bag error estimates of the margin metric for classification and the rmse (root-mean-square error) metric for regression.
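The shape of the search grid can be sketched in base R. The example assumes a hypothetical default `mtry` of 10, reproduces the default `mtry_range` expression from the usage above and crosses it with two `ntree` values. The `margin` scores are placeholders invented purely to show how an optimum would be picked for classification. No model is fitted here.

```r
## Hypothetical default mtry value for the data set
default_mtry <- 10

## Default mtry_range: four values spanning half the default either side
mtry_range <- floor(seq(default_mtry - default_mtry / 2,
                        default_mtry + default_mtry / 2,
                        length.out = 4))
ntree_range <- c(500, 1000)

## Every combination of the two parameter vectors
grid <- expand.grid(mtry = mtry_range, ntree = ntree_range)

## Placeholder out-of-bag margin scores, one per grid row (illustration only)
grid$margin <- c(0.10, 0.14, 0.12, 0.11, 0.10, 0.15, 0.13, 0.11)

## For classification the combination with the largest margin is selected;
## for regression the smallest rmse would be chosen instead
best <- grid[which.max(grid$margin), ]

print(grid)
print(best)
```

The number of models to fit grows as the product of the two vector lengths, so widening both ranges at once can become expensive.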
A list containing the optimal mtry and ntree parameters.
This is suitable for use as the rf argument in method randomForest().
library(metaboData)

## Prepare some data
x <- analysisData(abr1$neg[,200:300],abr1$fact) %>%
  occupancyMaximum(cls = 'day') %>%
  transformTICnorm()

## Tune the `mtry` parameter for the `day` response
tune(x,cls = 'day')
An S4 class for univariate test models and results.
type: univariate test type
models: list of model objects
results: tibble containing test results