Title: | Modeling Multivariate Origins Determinants - Evolutionary Lineages in Ecology |
---|---|
Description: | Perform multivariate modeling of evolved traits, with special attention to understanding the interplay of the multi-factorial determinants of their origins in complex ecological settings (Stephens, 2007 <doi:10.1016/j.tree.2006.12.003>). This software primarily concentrates on phylogenetic regression analysis, enabling implementation of tree transformation averaging and visualization functionality. Functions additionally support information theoretic approaches (Grueber, 2011 <doi:10.1111/j.1420-9101.2010.02210.x>; Garamszegi, 2011 <doi:10.1007/s00265-010-1028-7>) such as model averaging and selection of phylogenetic models. Accessory functions are also implemented for coef standardization (Cade 2015), selection uncertainty, and variable importance (Burnham & Anderson 2000). There are other numerous functions for visualizing confounded variables, plotting phylogenetic trees, as well as reporting and exporting modeling results. Lastly, as challenges to ecology are inherently multifarious, and therefore often multi-dataset, this package features several functions to support the identification, interpolation, merging, and updating of missing data and outdated nomenclature. |
Authors: | David M Schruth |
Maintainer: | David M Schruth <[email protected]> |
License: | Apache License |
Version: | 0.2.5 |
Built: | 2024-11-07 03:26:12 UTC |
Source: | https://github.com/cran/mmodely |
These function takes the output of pgls.iter and uses its list of objects model fits, optimzations (e.g.AICc) and performs a weighted average on the ctoefficients estimated in the former by weighting by the latter. The parameters can also optionally be converted to binary by specifying "binary=FALSE" or just running the alias wraper function for assessing evidence of variable importance (Burnham & Anderson 2000).
average.fit.models(vars, fits, optims,weight='AICw', by=c('n','q','nXq','rwGsm')[1], round.digits=5, binary=FALSE, standardize=FALSE) variable.importance(vars, fits, optims,weight='AICw', by=c('n','q','nXq','rwGsm')[1], round.digits=5)
average.fit.models(vars, fits, optims,weight='AICw', by=c('n','q','nXq','rwGsm')[1], round.digits=5, binary=FALSE, standardize=FALSE) variable.importance(vars, fits, optims,weight='AICw', by=c('n','q','nXq','rwGsm')[1], round.digits=5)
vars |
variable names of model |
fits |
a list of PGLS model fits |
optims |
a list of PGLS optimization paramters (should include "AICw") |
weight |
a column name in the optims that specifies the weights to be used in the average |
by |
unique identifier used to group sub-datasets for reporting (defaults to n) |
round.digits |
the tnumber of decimal places of the resultant mean to ouput |
binary |
converts all parameters to binary for presense or absense to calculate 'importance' |
standardize |
standardize the coefficient estimates by partial standard deviations, according to Cade (2015) |
A vector of AICc difference weighted [AICw] averages of PGLS coefficients. Also returns model 'selection' errors or the square root of 'uncertainties' (Burnham & Anderson 2000)
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] comp <- comp.data(phylo=phyl, df=data) mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) average.fit.models(vars=c('mass.Kg','group.size'), fits=PGLSi$fits, optims=PGLSi$optim) variable.importance(vars=c('mass.Kg','group.size'), fits=PGLSi$fits, optims=PGLSi$optim)
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] comp <- comp.data(phylo=phyl, df=data) mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) average.fit.models(vars=c('mass.Kg','group.size'), fits=PGLSi$fits, optims=PGLSi$optim) variable.importance(vars=c('mass.Kg','group.size'), fits=PGLSi$fits, optims=PGLSi$optim)
The one in ten rule of thumb for model fitting suggest at least 10 fold as many data as parametes fit. This function allows for easily calculating that ratio on model selected PGLS fits.
calc.q2n.ratio(coefs)
calc.q2n.ratio(coefs)
coefs |
a list of coefficients extracted from fit PGLS models |
the ratio of q to n (on average for all extracted fit models)
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] comp <- comp.data(phylo=phyl, df=data) mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) coefs.objs <- get.pgls.coefs(PGLSi$fits, est='Estimate') calc.q2n.ratio(coefs.objs)
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] comp <- comp.data(phylo=phyl, df=data) mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) coefs.objs <- get.pgls.coefs(PGLSi$fits, est='Estimate') calc.q2n.ratio(coefs.objs)
This function takes a dataframe, list, or a named vector of variable (column) names to subset
cept(x,except='gn_sp')
cept(x,except='gn_sp')
x |
a dataframe, list, or named vector |
except |
a vector of the names of the items in x to exclude |
the subset of x without those 'except' items specified
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) df.except.gnsp <- cept(x=data,except='gn_sp')
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) df.except.gnsp <- cept(x=data,except='gn_sp')
This is a shortcut function that wraps around "comparative.data" for use in the PGLS function.
comp.data(phylo,df,gn_sp='gn_sp')
comp.data(phylo,df,gn_sp='gn_sp')
phylo |
a tre file of the format phylo |
df |
a data.frame with row names matching number and tip labels of 'tree' |
gn_sp |
the column name (e.g. "gn_sp") that indicates how to match df with tree |
a "comparative data" table for use in PGLS modeling
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] comp <- comp.data(phylo=phyl, df=data)
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] comp <- comp.data(phylo=phyl, df=data)
This function simply lists the rows of the data that are not getting matched to tips of the tree.
compare.data.gs.vs.tree.tips(data, phylo, match.on=c('gn_sp','rownames')[1])
compare.data.gs.vs.tree.tips(data, phylo, match.on=c('gn_sp','rownames')[1])
data |
a data frame with genus species information as row names and a column named "gn_sp" |
phylo |
a phylogenetic tree with labeled tip |
match.on |
use a character string specifiying where the 'Genus_species' vector lies |
prints rows that are not matched ot the tree tips
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] compare.data.gs.vs.tree.tips(data, phyl, match.on='rownames')
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] compare.data.gs.vs.tree.tips(data, phyl, match.on='rownames')
Calculate a corrected Akaiki Information Criterion
correct.AIC(AIC, K,n)
correct.AIC(AIC, K,n)
AIC |
a vector of AIC values |
K |
number of parameters |
n |
number of data |
corrected AIC values
correct.AIC(AIC=100,K=10,n=100)
correct.AIC(AIC=100,K=10,n=100)
This function takes a model string and counts the number of predictor variables.
count.mod.vars(model)
count.mod.vars(model)
model |
model specified as a string in the form "y ~ x1 + x2 ..." |
an integer specifying the count of predictor variables
count <- count.mod.vars(model=formula('y ~ x1 + x2')) if(count == 2) { print('sane'); }else{ print('insane')}
count <- count.mod.vars(model=formula('y ~ x1 + x2')) if(count == 2) { print('sane'); }else{ print('insane')}
Count all combinations of predictor variables in a multivariate regression model.
ct.possible.models(q)
ct.possible.models(q)
q |
number of predictor variables |
a count of the number of possible models
ct.possible.models(9)
ct.possible.models(9)
This function takes a dataframe as input and removes any rows that have NA as values.
drop.na.data(df, vars=names(df))
drop.na.data(df, vars=names(df))
df |
a dataframe |
vars |
sub set of variable (column) names to use in searching for missing values |
A subset of 'df' that only has non-missing values in the columns specified by 'vars'
path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(path, row.names=1) df.nona <- drop.na.data(data, vars=names(df))
path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(path, row.names=1) df.nona <- drop.na.data(data, vars=names(df))
This function takes a fit multivariate regression model as input and converts the normal tabular output into a single line using repeated "+"or"-" symbols for significance
fit.1ln.rprt(fit, method=c('std.dev','p-value')[1], decimal.places=3, name.char.len=6, print.inline=TRUE, rtrn.line=FALSE, R2AIC=TRUE,mn='')
fit.1ln.rprt(fit, method=c('std.dev','p-value')[1], decimal.places=3, name.char.len=6, print.inline=TRUE, rtrn.line=FALSE, R2AIC=TRUE,mn='')
fit |
a fit model |
method |
how to calculate the number of pluses or minuses before each coefficient name (default is standard deviations) |
decimal.places |
the number of decimal places to use in reporting p-values |
name.char.len |
the maximum length to use when truncating variable names |
R2AIC |
boolean for also returning/printing AIC and R^2 values |
print.inline |
should the outout string be printed to the terminal? |
rtrn.line |
should the output string be returned as a characters string? |
mn |
model number prefixed to printout if 'print.inline' is TRUE |
A character string of the form "++var1 +var5 var3 | -var2 –var4" indicating signifcance and direction of regression results
path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(path, row.names=1) model.fit <- lm('OC ~ mass.Kg + group.size + arboreal + leap.pct', data=data) fit.1ln.rprt(fit=model.fit, decimal.places=3, name.char.len=6, print.inline=TRUE, rtrn.line=FALSE)
path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(path, row.names=1) model.fit <- lm('OC ~ mass.Kg + group.size + arboreal + leap.pct', data=data) fit.1ln.rprt(fit=model.fit, decimal.places=3, name.char.len=6, print.inline=TRUE, rtrn.line=FALSE)
Get the variable names from a model string by splitting on "+" and '~' using both 'get.mod.outcome' and 'get.mod.vars'. The results are passed to the comp.data function for eventual use in PGLS modeling. 'gn_sp' is included as it is typically required to link tree tips to rows of the comparative data.
get.mod.clmns(model, gs.clmn='gn_sp')
get.mod.clmns(model, gs.clmn='gn_sp')
model |
a model string of the form "y ~ x1 + x2 ..." |
gs.clmn |
the column header for the vector of "Genus_species" names, to link a tree tips to rows |
a vector of characters enummerating the columns to retain in PGLS modeling (input to df param in the 'comp.data' function)
model.columns <- get.mod.clmns(model=formula('y ~ x1 + x2'))
model.columns <- get.mod.clmns(model=formula('y ~ x1 + x2'))
Get the outcome variable from the front of a model formula string. Used as part of 'get.mod.clmns' to be passed to 'comp.data'
get.mod.outcome(model)
get.mod.outcome(model)
model |
a character string of a formula of the form 'y ~ x1 + x2 ...' |
a character string specifying the outcome variable
model.columns <- get.mod.clmns(model=formula('y ~ x1 + x2'))
model.columns <- get.mod.clmns(model=formula('y ~ x1 + x2'))
Split the predictor string of a model formula into it's constituent character strings.
get.mod.vars(model)
get.mod.vars(model)
model |
a character string of a formula of the form 'y ~ x1 + x2' |
a vector of character strings of variable names (e.g. corresponding to column names for comp.data input)
model.variables <- get.mod.vars(model='y ~ x1 + x2')
model.variables <- get.mod.vars(model='y ~ x1 + x2')
Enumerate all combinations of predictor variables in a multivariate regression model.
get.model.combos(outcome.var, predictor.vars, min.q=1)
get.model.combos(outcome.var, predictor.vars, min.q=1)
predictor.vars |
predictor variables names (a vector of character strings) |
outcome.var |
outcome variable name (character string) |
min.q |
minimum number of predictor variables to include in the mode (default is 2) |
a vector of models as character strings of the form "y ~ x1 + x2 ..."
path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(path, row.names=1) get.model.combos(outcome.var='OC', predictor.vars=names(data), min.q=2)
path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(path, row.names=1) get.model.combos(outcome.var='OC', predictor.vars=names(data), min.q=2)
Post PGLS model selection, the list of all possible PGLS model fits can be subset and passed to this function, which harvests out the coefficients or t-values for each model into bins for the coefficients
get.pgls.coefs(pgls.fits, est=c("t value","Estimate","Pr(>|t|)")[1])
get.pgls.coefs(pgls.fits, est=c("t value","Estimate","Pr(>|t|)")[1])
pgls.fits |
a list of PGLS models output from 'pgls' or 'pgls.report' |
est |
a character string indicating if Estimate or t value should be used as data points in the plot, default is 'Estimate' |
A list of PGLS coeficients (lists of estimates and t-values) organized by coeficient-named bins
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] comp <- comp.data(phylo=phyl, df=data) mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) coefs.objs <- get.pgls.coefs(PGLSi$fits, est='Estimate')
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] comp <- comp.data(phylo=phyl, df=data) mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) coefs.objs <- get.pgls.coefs(PGLSi$fits, est='Estimate')
This function uses Pagel's lambda, Blombergs k, and Ancestral Character Estimation [ACE] to calculate statistics on a tree given a specified trait.
get.phylo.stats(phylo, data, trait.clmn, gs.clmn='gn_sp', ace.method='REML',ace.scaled=TRUE, ace.kappa=1)
get.phylo.stats(phylo, data, trait.clmn, gs.clmn='gn_sp', ace.method='REML',ace.scaled=TRUE, ace.kappa=1)
phylo |
PARAMDESCRIPTION |
data |
PARAMDESCRIPTION |
trait.clmn |
PARAMDESCRIPTION |
gs.clmn |
PARAMDESCRIPTION |
ace.method |
PARAMDESCRIPTION |
ace.scaled |
PARAMDESCRIPTION |
ace.kappa |
PARAMDESCRIPTION |
statistics on a particular trait within a tree (Pagel's lambda, Blomberg's k, and the most ancestral ACE estimate)
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] get.phylo.stats(phylo=phyl, data=data, trait.clmn='OC', gs.clmn='gn_sp', ace.method='REML',ace.scaled=TRUE, ace.kappa=1)
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] get.phylo.stats(phylo=phyl, data=data, trait.clmn='OC', gs.clmn='gn_sp', ace.method='REML',ace.scaled=TRUE, ace.kappa=1)
This convienience function checks to make sure that all of the elements the provided character vector adhere to the "Genus species" naming convention format. Default delimiters between genus and species names in the string are " ", "_", or "."
gs.check(genus.species, sep='[ _\\.]')
gs.check(genus.species, sep='[ _\\.]')
genus.species |
a vector of character strings specifiying the combination of Genus [and] species |
sep |
a regular expression between genus and species |
None
path <- system.file("extdata","primate-example.data.csv", package="mmodely") gs.tab <- read.csv(path, row.names=1) gs.tab$gn_sp <- rownames(gs.tab) gs.check(genus.species=gs.tab$gn_sp, sep='[ _\\.]')
path <- system.file("extdata","primate-example.data.csv", package="mmodely") gs.tab <- read.csv(path, row.names=1) gs.tab$gn_sp <- rownames(gs.tab) gs.check(genus.species=gs.tab$gn_sp, sep='[ _\\.]')
This convienience function checks to make sure that all of the elements the provided character vector adhere to the "Genus species" naming convention format. Default delimiters between genus and species names in the string are " ", "_", or "."
gs.names.mismatch.check(df, alias.table.path, gs.clmn='gn_sp')
gs.names.mismatch.check(df, alias.table.path, gs.clmn='gn_sp')
df |
a data frame with genus species information as row names and optionally in a column named "gn_sp" |
alias.table.path |
a file system path (e.g. 'inst/extdata/primate.taxa.aliases.tab') to a lookup table with 'old.name' and 'new.name' as columns |
gs.clmn |
the name of the column containing the 'Genus_species' vector |
None
path <- system.file("extdata","primate-example.data.csv", package="mmodely") gs.tab <- read.csv(path, row.names=1) gs.tab$gn_sp <- rownames(gs.tab) path.look <- system.file("extdata","primate.taxa.aliases.tab", package="mmodely") gs.names.mismatch.check(gs.tab, alias.table.path=path.look, gs.clmn='gn_sp')
path <- system.file("extdata","primate-example.data.csv", package="mmodely") gs.tab <- read.csv(path, row.names=1) gs.tab$gn_sp <- rownames(gs.tab) path.look <- system.file("extdata","primate.taxa.aliases.tab", package="mmodely") gs.names.mismatch.check(gs.tab, alias.table.path=path.look, gs.clmn='gn_sp')
This function takes a data frame (with a genus species column) and proceeds to use an external look-up table to update the names if they've been changed
gs.rename(df, alias.table.path, retro=FALSE, update.gn_sp=FALSE)
gs.rename(df, alias.table.path, retro=FALSE, update.gn_sp=FALSE)
df |
a data frame with genus species information as row names and optionally in a column named "gn_sp" |
alias.table.path |
a file system path (e.g. 'inst/extdata/primate.taxa.aliases.tab') to a lookup table with 'old.name' and 'new.name' as columns |
retro |
a boolean (T/F) parameter specifying if the renaming should go from new to old instead of the default of old to new |
update.gn_sp |
a boolean parameter specifying if the 'gn_sp' column should also be updated with 'new.name's |
the original data frame with (potentially) updated row names and updated gn_sp column values
path.data <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(path.data, row.names=1) path.look <- system.file("extdata","primate.taxa.aliases.tab", package="mmodely") data.renamed <- gs.rename(df=data, alias.table.path=path.look, retro=FALSE, update.gn_sp=FALSE)
path.data <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(path.data, row.names=1) path.look <- system.file("extdata","primate.taxa.aliases.tab", package="mmodely") data.renamed <- gs.rename(df=data, alias.table.path=path.look, retro=FALSE, update.gn_sp=FALSE)
This function finds NA values and interpolates using averaging values of nearby genus and species
interpolate(df, taxa=c('genus','family'), clmns=1:length(df))
interpolate(df, taxa=c('genus','family'), clmns=1:length(df))
df |
a data frame |
taxa |
a vector of taxonomic ranks (corresonding to columns) to assist in guiding the interpolating |
clmns |
the names of the columns to interpolate over |
a modified data frame without missing values in the columns specified
path <- system.file("extdata","primate-example.data.csv", package="mmodely") gs.tab <- read.csv(path, row.names=1) clmns <- match(c('mass.Kg','DPL.km'),names(gs.tab)) df.2 <- interpolate(df=gs.tab, taxa='genus', clmns=clmns)
path <- system.file("extdata","primate-example.data.csv", package="mmodely") gs.tab <- read.csv(path, row.names=1) clmns <- match(c('mass.Kg','DPL.km'),names(gs.tab)) df.2 <- interpolate(df=gs.tab, taxa='genus', clmns=clmns)
This funciton reports column and rowwise missing data. It can also list the rownames for missing columns or the column names for missing rows.
missing.data(x, cols=NULL, rows=NULL)
missing.data(x, cols=NULL, rows=NULL)
x |
a dataframe |
cols |
print the specific rows corresponding to missing values in this column |
rows |
print the specific cols corresponding to missing values in this rowname |
a report on column versus row wise missing data
path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(path, row.names=1) missing.data(data)
path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(path, row.names=1) missing.data(data)
This function uses the (non-missing) values from one column to fill in the missing values of another
missing.fill.in(x, var.from, var.to)
missing.fill.in(x, var.from, var.to)
x |
a dataframe or matrix |
var.from |
secondary variable (of the same type and units) providing values to 'var.to' |
var.to |
primary variable with missing values to fill in by 'var.from' |
a modified dataframe with fewer missing values in the 'var.to' column
df <- data.frame(a=c(1,2,NA),b=c(1,NA,3),c=c(1,2,6)) missing.fill.in(df, 'c','a')
df <- data.frame(a=c(1,2,NA),b=c(1,NA,3),c=c(1,2,6)) missing.fill.in(df, 'c','a')
This function takes phylogenetic tree and a list of (all possible) combinations of variables as a vector of model strings and estimates PGLS fits based on the bounds or tree parameters provided seperately.
pgls.iter(models, phylo, df, gs.clmn='gn_sp', b=list(lambda=c(.2,1),kappa=c(.2,2.8),delta=c(.2,2.8)),l='ML', k='ML',d='ML')
pgls.iter(models, phylo, df, gs.clmn='gn_sp', b=list(lambda=c(.2,1),kappa=c(.2,2.8),delta=c(.2,2.8)),l='ML', k='ML',d='ML')
models |
a vector of all possible model formulas (as character strings) |
phylo |
a phylogenetic tree |
df |
the name of the column used to specify 'Genus_species' |
gs.clmn |
the name of the column containing the 'Genus_species' vector |
b |
a list of vectors of upper and lower bounds for kappa, lambda, and delta |
k |
the fixed or 'ML' value for kappa |
l |
the fixed or 'ML' value for lambda |
d |
the fixed or 'ML' value for delta |
a list of fit PGLS regression models plus 'optim' and 'param' support tables
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] comp <- comp.data(phylo=phyl, df=data) mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1)
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] comp <- comp.data(phylo=phyl, df=data) mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1)
Print (and plot) statistics from a list of PGLSs fitted models and tables of associated parameters.
pgls.iter.stats(PGLSi, verbose=TRUE, plots=FALSE)
pgls.iter.stats(PGLSi, verbose=TRUE, plots=FALSE)
PGLSi |
a list of PGLS iter objects, each of which is list including: fitted PGLS model, a optim table, and a tree-transformation parameter table |
verbose |
the model formula (as acharacter string) |
plots |
the fixed or 'ML' value for kappa |
A summary statistics on each of the objects in the PGLS list of lists
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] comp <- comp.data(phylo=phyl, df=data) mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) pgls.iter.stats(PGLSi, verbose=TRUE, plots=FALSE)
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] comp <- comp.data(phylo=phyl, df=data) mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) pgls.iter.stats(PGLSi, verbose=TRUE, plots=FALSE)
Print the results of a PGLS model fit
pgls.print(pgls, all.vars=names(pgls$data$data)[-1], model.no=NA, mtx.out=NA, write=TRUE, print=FALSE)
pgls.print(pgls, all.vars=names(pgls$data$data)[-1], model.no=NA, mtx.out=NA, write=TRUE, print=FALSE)
pgls |
a fit PGLS model |
all.vars |
the names of all the variables to be reported |
model.no |
the model number (can be the order that models were run |
mtx.out |
should a matrix of the tabular summary results be returned |
write |
should the matrix of summary results be written to disk? |
print |
should the matrix of summary results be printed to screen? |
A matrix of summary results of a fit PGLS model
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") #5. RAxML phylogram based on the 61199 bp concatenation of 69 nuclear and ten mitochondrial genes. phyl <- ape::read.tree(tree.path)[[5]] phyl <- trim.phylo(phylo=phyl, gs.vect=data$gn_sp) comp <- comp.data(phylo=phyl, df=data) a.PGLS <- caper::pgls(formula('OC~mass.Kg + DPL.km'), data=comp) pgls.print(a.PGLS, all.vars=names(a.PGLS$data$data)[-1], model.no=NA, mtx.out='', write=FALSE, print=FALSE)
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") #5. RAxML phylogram based on the 61199 bp concatenation of 69 nuclear and ten mitochondrial genes. phyl <- ape::read.tree(tree.path)[[5]] phyl <- trim.phylo(phylo=phyl, gs.vect=data$gn_sp) comp <- comp.data(phylo=phyl, df=data) a.PGLS <- caper::pgls(formula('OC~mass.Kg + DPL.km'), data=comp) pgls.print(a.PGLS, all.vars=names(a.PGLS$data$data)[-1], model.no=NA, mtx.out='', write=FALSE, print=FALSE)
Output a spreadsheet ready tabular summary of a fit PGLS model
pgls.report(cd, f=formula('y~x'), l=1,k=1,d=1, bounds=list(lambda=c(.2,1),kappa=c(.2,2.7),delta=c(.2,2.7)), anova=FALSE, mod.no='NA', out='pgls.output-temp',QC.plot=FALSE)
pgls.report(cd, f=formula('y~x'), l=1,k=1,d=1, bounds=list(lambda=c(.2,1),kappa=c(.2,2.7),delta=c(.2,2.7)), anova=FALSE, mod.no='NA', out='pgls.output-temp',QC.plot=FALSE)
cd |
a comparative data object, here created by 'comp.data' |
f |
the model formula (as acharacter string) |
k |
the fixed or 'ML' value for kappa |
l |
the fixed or 'ML' value for lambda |
d |
the fixed or 'ML' value for delta |
bounds |
a list of vectors of upper and lower bounds for kappa, lambda, and delta |
anova |
should an anova be run on the fit model and output to the terminal? |
mod.no |
the model number (can be the order that models were run) |
out |
the base filename to be printed out |
QC.plot |
should a quality control plot be output to screen? |
A summary results of a fit PGLS model with ANOVA and tabular spreadsheet ready csv filesystem output.
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") #5. RAxML phylogram based on the 61199 bp concatenation of 69 nuclear and ten mitochondrial genes. phyl <- ape::read.tree(tree.path)[[5]] phyl <- trim.phylo(phylo=phyl, gs.vect=data$gn_sp) comp <- comp.data(phylo=phyl, df=data) pgls.report(comp, f=formula('OC~mass.Kg + DPL.km'), l=1,k=1,d=1, anova=FALSE, mod.no='555', out='', QC.plot=TRUE)
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") #5. RAxML phylogram based on the 61199 bp concatenation of 69 nuclear and ten mitochondrial genes. phyl <- ape::read.tree(tree.path)[[5]] phyl <- trim.phylo(phylo=phyl, gs.vect=data$gn_sp) comp <- comp.data(phylo=phyl, df=data) pgls.report(comp, f=formula('OC~mass.Kg + DPL.km'), l=1,k=1,d=1, anova=FALSE, mod.no='555', out='', QC.plot=TRUE)
Print the results of an unfit PGLS model
pgls.wrap(cd,f,b,l,k,d,all.vars=names(cd$data)[-1], model.no=NA, mtx.out=NA, write=TRUE,print=FALSE)
pgls.wrap(cd,f,b,l,k,d,all.vars=names(cd$data)[-1], model.no=NA, mtx.out=NA, write=TRUE,print=FALSE)
cd |
a 'comparative data' object, here created by 'comp.data(phylo, df, gs.clmn)' |
f |
the model formula (as acharacter string) |
b |
a list of vectors of upper and lower bounds for kappa, lambda, and delta |
l |
the fixed or 'ML' value for lambda |
k |
the fixed or 'ML' value for kappa |
d |
the fixed or 'ML' value for delta |
all.vars |
the names of all the variables to be reported |
model.no |
the model number (can be the order that models were run |
mtx.out |
should a matrix of the tabular summary results be returned |
write |
should the matrix of summary results be written to disk? |
print |
should the matrix of summary results be printed to screen? |
A matrix of summary results of a fit PGLS model
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") #5. RAxML phylogram based on the 61199 bp concatenation of 69 nuclear and ten mitochondrial genes. phyl <- ape::read.tree(tree.path)[[5]] phyl <- trim.phylo(phylo=phyl, gs.vect=data$gn_sp) comp <- comp.data(phylo=phyl, df=data) model <- 'OC ~ mass.Kg + group.size' pgls.wrap(cd=comp,f=model,b=list(kappa=c(.3,3),lambda=c(.3,3),delta=c(.3,3)), l=1,k=1,d=1,all.vars=names(cd.obj$data)[-1])
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") #5. RAxML phylogram based on the 61199 bp concatenation of 69 nuclear and ten mitochondrial genes. phyl <- ape::read.tree(tree.path)[[5]] phyl <- trim.phylo(phylo=phyl, gs.vect=data$gn_sp) comp <- comp.data(phylo=phyl, df=data) model <- 'OC ~ mass.Kg + group.size' pgls.wrap(cd=comp,f=model,b=list(kappa=c(.3,3),lambda=c(.3,3),delta=c(.3,3)), l=1,k=1,d=1,all.vars=names(cd.obj$data)[-1])
Plot a grid of x y plots showing how a third confounding variable 'z' changes the slope
## S3 method for class 'confound.grid' plot(x, Y='y', X='x', confounder='z', breaks=3,...)
## S3 method for class 'confound.grid' plot(x, Y='y', X='x', confounder='z', breaks=3,...)
x |
a data frame |
Y |
the name of the column with the dependent/outcome variable |
X |
the name of the column with the predictor variable |
confounder |
the name of the column with confounding variable |
breaks |
number or vector of breaks to split the plots horizontally (across x) |
... |
other arguments passed to 'plot' |
a confound grid plot
path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(path, row.names=1) data$col <- c('yellow','red')[data$nocturnal+1] plot.confound.grid(x=data, Y='OC', X='leap.pct', confounder='mass.Kg')
path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(path, row.names=1) data$col <- c('yellow','red')[data$nocturnal+1] plot.confound.grid(x=data, Y='OC', X='leap.pct', confounder='mass.Kg')
A plot of AIC (and AICc) vs R^2 (and adjusted R^2) for all of the PGLS iterations
## S3 method for class 'pgls.iters' plot(x, bests=bestBy(x$optim, by=c('n','q','qXn','rwGsm')[1], best=c('AICc','R2.adj')[1], inverse=FALSE), ...)
## S3 method for class 'pgls.iters' plot(x, bests=bestBy(x$optim, by=c('n','q','qXn','rwGsm')[1], best=c('AICc','R2.adj')[1], inverse=FALSE), ...)
x |
a PGLSi[teration] object (a list of pgls model fits as well as optimization and tree parameter tables) |
bests |
a table of the 'best' models to highlight in the plot based on some optimization criterion (e.g. R2) |
... |
other parameters passed to 'plot' |
a plot of all of PGLS iterations
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) # sprinkle in some missing data so as to make model selection more interesting for(pv in pvs){ data[sample(x=1:nrow(data),size=2),pv] <- NA} PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) # find the lowest AIC within each q by n sized sub-dataset plot.pgls.iters(x=PGLSi)
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) # sprinkle in some missing data so as to make model selection more interesting for(pv in pvs){ data[sample(x=1:nrow(data),size=2),pv] <- NA} PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) # find the lowest AIC within each q by n sized sub-dataset plot.pgls.iters(x=PGLSi)
Plots a single panel of R^2 versus AIC, using versions of your choosing.
## S3 method for class 'pgls.R2AIC' plot(x, bests=bestBy(x, by=c('n','q','qXn','rwGsm')[4], best=c('AICc','R2.adj')[1], inverse=c(FALSE,TRUE)[1]),bcl=rgb(1,1,1,maxColorValue=3,alpha=1), nx=2, model.as.title='', ...)
## S3 method for class 'pgls.R2AIC' plot(x, bests=bestBy(x, by=c('n','q','qXn','rwGsm')[4], best=c('AICc','R2.adj')[1], inverse=c(FALSE,TRUE)[1]),bcl=rgb(1,1,1,maxColorValue=3,alpha=1), nx=2, model.as.title='', ...)
x |
a PGLSi[teration]$optim [optimization] table |
bests |
a list of the best PGLS models grouped by variable count and sorted by some metric (e.g. adjusted R2) |
bcl |
background color of plot point |
nx |
point size expansion factor to multiply against sample size ratio (this model to max of all models) |
model.as.title |
uses model.1ln.report to create a short character string of the "best" model results as a title |
... |
other parameters passed to 'plot' |
a plot of R2 versus AIC of many PGLS models
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:6]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) # sprinkle in some missing data so as to make model selection more interesting for(pv in pvs){ data[sample(x=1:nrow(data),size=2),pv] <- NA} PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) plot.pgls.R2AIC(PGLSi$optim) # find the lowest AIC within each q by n sized sub-dataset
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:6]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) # sprinkle in some missing data so as to make model selection more interesting for(pv in pvs){ data[sample(x=1:nrow(data),size=2),pv] <- NA} PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) plot.pgls.R2AIC(PGLSi$optim) # find the lowest AIC within each q by n sized sub-dataset
PGLS regression will use maximum likelihood to estimate tree parameters while also estimating regression parameters. Here we provide a utility function to visualize what this new tree would look like in two dimensions.
## S3 method for class 'transformed.phylo' plot(x, delta=1,kappa=1,...)
## S3 method for class 'transformed.phylo' plot(x, delta=1,kappa=1,...)
x |
a phylogenetic tree |
delta |
an integer between 0 and 3 |
kappa |
an integer between 0 and 3 |
... |
other parameters passed to 'plot' |
a plot of a transformed phylogenetic tree
tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] plot.transformed.phylo(x=phyl, delta=2.3,kappa=2.1)
tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] plot.transformed.phylo(x=phyl, delta=2.3,kappa=2.1)
This function performs a simple scatter plot but also superimposses a linear regression trend (abline) and optionally also the p-value of this line
## S3 method for class 'xy.ab.p' plot(x, x.var, y.var, fit.line=TRUE, p.value=TRUE, slope=TRUE, p.col='red', plot.labels=TRUE, verbose=TRUE, ...)
## S3 method for class 'xy.ab.p' plot(x, x.var, y.var, fit.line=TRUE, p.value=TRUE, slope=TRUE, p.col='red', plot.labels=TRUE, verbose=TRUE, ...)
x |
a data frame |
x.var |
the name of the x variable in df |
y.var |
the name of the y variable in df |
fit.line |
should a fit (ab) line be drawn? |
p.value |
should the p-value be printed on the plot? |
slope |
should the slope be printed on the plot? |
p.col |
should the plot be labeled? |
plot.labels |
should all of thie model fit information be printed out? |
verbose |
should all other information be printed out too? |
... |
other parameters passed to 'plot' |
An x/y scatterplot with regression line
path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(path, row.names=1) plot.xy.ab.p(x=data, x.var='OC', y.var='group.size', fit.line=TRUE, p.value=TRUE, slope=TRUE, p.col='red', plot.labels=TRUE, verbose=TRUE)
path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(path, row.names=1) plot.xy.ab.p(x=data, x.var='OC', y.var='group.size', fit.line=TRUE, p.value=TRUE, slope=TRUE, p.col='red', plot.labels=TRUE, verbose=TRUE)
Get the outcome variable from the front of a model formula string. Used as part of 'get.mod.clmns' to be passed to 'comp.data'
select.best.models(PGLSi, using=c('AICc','R2.adj','AIC','R2')[1], by=c('n','q','nXq','rwGsm')[1])
select.best.models(PGLSi, using=c('AICc','R2.adj','AIC','R2')[1], by=c('n','q','nXq','rwGsm')[1])
PGLSi |
a list of PGLS iter objects, each of which is list including: fitted PGLS model, a optim table, and a tree-transformation parameter table |
using |
performance metric to use in searching for the best model |
by |
unique identifier used to group sub-datasets for reporting (defaults to n) |
a line corresponding to the "best" models from the PGLSi "optim" table
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] comp <- comp.data(phylo=phyl, df=data) mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) a.PGLS <- select.best.models(PGLSi, by=c('R2.adj','AICc')[1])
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] comp <- comp.data(phylo=phyl, df=data) mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) a.PGLS <- select.best.models(PGLSi, by=c('R2.adj','AICc')[1])
Plot the raw distribution of points corresponding to the coefficients harvested from the best model of each subset of the dataset.
sparge.modsel(PC, jit.f=1, R2x=3, nx=2, n.max=max(unlist(PC$n)), zeroline=TRUE, add=FALSE, pd=0, pvs=names(PC$coefs), pvlabs=NULL, xlim=range(unlist(PC$coefs)), MA = NULL, ap=8, ac = 1, ax = nx, ...)
sparge.modsel(PC, jit.f=1, R2x=3, nx=2, n.max=max(unlist(PC$n)), zeroline=TRUE, add=FALSE, pd=0, pvs=names(PC$coefs), pvlabs=NULL, xlim=range(unlist(PC$coefs)), MA = NULL, ap=8, ac = 1, ax = nx, ...)
PC |
a list of vectors of pooled coefficients (or scores) harvested from the 'best' selected modeling runs (out put from 'get.pgls.coefs') |
jit.f |
factor for random jittering (see 'jitter()' |
R2x |
the line width expansion factor according to R^2 value |
nx |
the point size expansion factor according to sample size of model |
n.max |
the maximum sample size used in all models |
zeroline |
should we add an abline at x=0? |
add |
should we add to the existing plot? |
pd |
'position dodge' moves all y axis plotting positions up or down by this provided value (useful for adding multiple distributions for the same param) |
pvs |
the predictor variable vector for ordering the y-axis labels |
pvlabs |
the predictor variable labels for labeling the plot (defaults to pvs) |
xlim |
x axis plot limits |
MA |
matrix of model averages (defaults to NULL) |
ap |
coded numeric point character symbol used for model averaged parameter position |
ac |
color symbol used for model averaged parameters plot character |
ax |
expansion factor to expant model average parameter plot character (defaults to nx) |
... |
other parameters passed on to plot |
a 'sparge' [sprinkle/smear] plot of coefficent distributions
See also 'boxplot' and 'stripchart' in package 'graphics' as well as 'violin', 'bean', 'ridgelines', and 'raincloud' plots.
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) coefs.objs <- get.pgls.coefs(PGLSi$fits, est='Estimate') sparge.modsel(coefs.objs)
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) coefs.objs <- get.pgls.coefs(PGLSi$fits, est='Estimate') sparge.modsel(coefs.objs)
Read in a vector of genus species names and a tree and drop the tips in the tree that match the vector of names.
trim.phylo(phylo, gs.vect)
trim.phylo(phylo, gs.vect)
phylo |
a phylogenetic tree |
gs.vect |
a vector of character strings in the 'Genus_species' format |
a plot of a transformed phylogenetic tree
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phylo <- read.tree(tree.path)[[5]] trim.phylo(phylo, gs.vect=data$gn_sp)
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phylo <- read.tree(tree.path)[[5]] trim.phylo(phylo, gs.vect=data$gn_sp)
An implementation of IC weighting that first calulates the difference in IC values by subtracting all values from the lowest IC value. Second, the changes are expoentiated divided by a sum of the same and exponentiated yet again.
weight.IC(IC)
weight.IC(IC)
IC |
a vector of IC values |
a vector of IC based weights
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] comp <- comp.data(phylo=phyl, df=data) mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) AICc.w <- weight.IC(IC=PGLSi$optim$AICc)
data.path <- system.file("extdata","primate-example.data.csv", package="mmodely") data <- read.csv(data.path, row.names=1) pvs <- names(data[3:5]) data$gn_sp <- rownames(data) tree.path <- system.file("extdata","primate-springer.2012.tre", package="mmodely") phyl <- ape::read.tree(tree.path)[[5]] comp <- comp.data(phylo=phyl, df=data) mods <- get.model.combos(predictor.vars=pvs, outcome.var='OC', min.q=2) PGLSi <- pgls.iter(models=mods, phylo=phyl, df=data, k=1,l=1,d=1) AICc.w <- weight.IC(IC=PGLSi$optim$AICc)