Main function adapt
The main function adapt takes in a phyloseq [3] object that must have at least a count table and sample metadata. The metadata must contain one variable that represents the condition to test on. We currently allow binary and continuous conditions. If the condition variable has more than two groups, the user may single out one group to compare with all the others for DAA. ADAPT also allows adjusting for extra covariates. There are eight arguments in the adapt function:
input_data is the phyloseq object with the count matrix and metadata. This argument is required.
cond.var is the variable in the metadata that represents the condition to test on. This argument is required.
base.cond is the group that the user is interested in comparing against other groups. This argument is only needed if the condition is categorical.
adj.var contains the covariates to adjust for. It can be empty or a vector of variable names.
censor is the value at which all zero counts are censored. By default the zeros are censored at one.
prev.filter is the threshold for filtering out rare taxa. By default taxa with prevalence lower than 0.05 are filtered out before analysis.
depth.filter is the threshold for filtering out samples with low sequencing depth. The default threshold is 1000 reads.
alpha is the cutoff applied to the Benjamini-Hochberg adjusted p-values to call taxa differentially abundant. The default is 0.05.
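To make the roles of depth.filter, prev.filter, censor and alpha concrete, here is a minimal base R sketch on a toy count matrix. It is not ADAPT's internal code: the data and p-values are made up, the order of the two filters and the treatment of values exactly at a threshold are assumptions, and prev.filter is raised from its default so that the filter has a visible effect on four samples.

```r
# Toy count matrix (hypothetical data): rows are taxa, columns are samples.
counts <- matrix(
  c(500,  0, 700,  900,
    300,  0,   0,  600,
      0,  0,   0,   50,
    800, 10, 900, 1200),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("taxon", 1:4), paste0("sample", 1:4))
)

# Thresholds named after the adapt() arguments. prev.filter is raised from
# the default 0.05 purely for illustration.
depth.filter <- 1000
prev.filter  <- 0.5
censor       <- 1
alpha        <- 0.05

# 1. Drop samples whose sequencing depth (column sum) is below the threshold.
#    Keeping samples exactly at the threshold is an assumption.
keep_samples <- colSums(counts) >= depth.filter
kept <- counts[, keep_samples, drop = FALSE]

# 2. Drop taxa whose prevalence (share of samples with a nonzero count)
#    is below the threshold.
prevalence <- rowMeans(kept > 0)
kept <- kept[prevalence >= prev.filter, , drop = FALSE]

# 3. Record which entries are zero, then set them to the censoring value.
is_censored <- kept == 0
censored_counts <- kept
censored_counts[is_censored] <- censor

# 4. Benjamini-Hochberg adjustment of hypothetical per-taxon p-values,
#    followed by the alpha cutoff.
pvals <- c(taxon1 = 0.001, taxon2 = 0.02, taxon4 = 0.2)
padj <- p.adjust(pvals, method = "BH")
da_taxa <- names(padj)[padj < alpha]
```

In this toy example sample2 is removed for low depth, taxon3 is removed for low prevalence, one remaining zero is censored at one, and two of the three hypothetical p-values stay below alpha after adjustment.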
The ADAPT package contains two metagenomics datasets from an early childhood dental caries study [4]. One corresponds to 16S rRNA sequencing of 161 saliva samples collected from 12-month-old infants (ecc_saliva). The other corresponds to shotgun metagenomic sequencing of 30 plaque samples collected from children between 36 and 60 months old (ecc_plaque). In this vignette, let's use the saliva data to find differentially abundant taxa between children who developed early childhood caries (ECC) after 36 months of age and those who didn't. Besides the main variable CaseStatus, there is another variable, Site, representing the site where each sample was collected. Both variables are discrete and contain two categories each.
We can run adapt with or without covariate adjustment. The returned object is a custom S4 object with slots for the analysis name, the reference taxa, the differentially abundant taxa, the detailed analysis results (a data frame) and the input phyloseq object.
saliva_output_noadjust <- adapt(input_data=ecc_saliva, cond.var="CaseStatus", base.cond="Control")
#> Choose 'Control' as the baseline condition
#> 155 taxa and 161 samples being analyzed...
#> Selecting Reference Set... 77 taxa selected as reference
#> 27 differentially abundant taxa detected
saliva_output_adjust <- adapt(input_data=ecc_saliva, cond.var="CaseStatus", base.cond="Control", adj.var="Site")
#> Choose 'Control' as the baseline condition
#> 155 taxa and 161 samples being analyzed...
#> Selecting Reference Set... 77 taxa selected as reference
#> 28 differentially abundant taxa detected
Site is not a confounding variable, so the DAA results differ by only one taxon.
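Since the returned value is an S4 object, its slots can be listed with the standard tools from the methods package before deciding what to extract. The sketch below uses a hypothetical helper, describe_slots, which is not part of ADAPT. The ADAPT call mirrors the unadjusted run above and is only attempted when the package is installed. The slot names are not assumed here and are read from the object itself.

```r
library(methods)

# Hypothetical helper (not part of ADAPT): return the class of the value
# stored in each slot of an S4 object, named by slot.
describe_slots <- function(obj) {
  stopifnot(isS4(obj))
  vapply(slotNames(obj), function(s) class(slot(obj, s))[1], character(1))
}

# Only runs when ADAPT is installed.
if (requireNamespace("ADAPT", quietly = TRUE)) {
  data("ecc_saliva", package = "ADAPT")
  res <- ADAPT::adapt(input_data = ecc_saliva, cond.var = "CaseStatus",
                      base.cond = "Control")
  print(describe_slots(res))
}
```

Once the slot names are known, an individual slot can be read with slot(res, "name") or the @ operator.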