1. Introduction
2. Details
   2.1 Command interface
   2.2 Application

1.  Introduction

The data-adaptive sum test (aSum) by Han and Pan (2010) [1] was the first method to take into account the direction of effects (protective or deleterious) of rare variants within the same genetic region analyzed by a rare variant association test. It is a two-stage approach. In the first stage, the effect size of each rare variant is evaluated in a multivariate regression analysis to identify variants with significant "protective" effects, i.e., variants with a negative log odds ratio and a {$p$} value smaller than {$0.1$}. In the second stage, variants are collapsed across the genetic region as in Morris and Zeggini (2010) [2], but with the coding of protective variants flipped. The test statistic is a score statistic from logistic regression for case-control data.

The implementation of stage 1 in this program differs from the original paper. Instead of estimating the effect size of each variant, it compares the MAF of each variant between cases and controls with an exact test to determine which variants are to be recoded in stage 2. The same {$p<0.1$} criterion is used for stage 1, but in effect it is more stringent than the original criterion based on a multivariate logistic regression analysis.
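The two stages described above can be sketched in a few lines of Python. This is only an illustration, not the program's actual code: the function names are hypothetical, the exact test is assumed to be a one-sided Fisher exact test on allele counts, and the flipped coding is assumed to be {$2 - x$} for a genotype coded as the number of minor alleles {$x$}.

```python
from math import comb

def exact_p_excess_in_controls(case_minor, case_major, ctrl_minor, ctrl_major):
    """One-sided Fisher exact p-value (assumed form of the stage 1 test):
    probability of this few or fewer minor alleles in cases, given the margins."""
    n_case = case_minor + case_major
    n_minor = case_minor + ctrl_minor
    n_total = n_case + ctrl_minor + ctrl_major
    n_major = n_total - n_minor
    denom = comb(n_total, n_case)
    # math.comb returns 0 when k > n, so impossible tables contribute nothing.
    tail = sum(comb(n_minor, k) * comb(n_major, n_case - k)
               for k in range(case_minor + 1))
    return tail / denom

def asum_recode(genotypes, is_case, alpha=0.1):
    """genotypes: one list per sample of minor-allele counts (0/1/2) per variant.
    is_case: one 0/1 flag per sample. Returns (flip flags, per-sample burden)."""
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    n_var = len(genotypes[0])
    flip = []
    for j in range(n_var):
        case_minor = sum(g[j] for g, y in zip(genotypes, is_case) if y)
        ctrl_minor = sum(g[j] for g, y in zip(genotypes, is_case) if not y)
        p = exact_p_excess_in_controls(case_minor, 2 * n_case - case_minor,
                                       ctrl_minor, 2 * n_ctrl - ctrl_minor)
        # Stage 1: mark variants with a significant excess of minor alleles in controls.
        flip.append(p < alpha)
    # Stage 2: collapse across variants, using the flipped coding where marked.
    burden = [sum(2 - g[j] if flip[j] else g[j] for j in range(n_var))
              for g in genotypes]
    return flip, burden
```

The resulting per-sample burden would then be the single covariate tested against case-control status in the second stage.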

2.  Details

2.1  Command interface

vtools show test aSum

Name:          aSum
Description:   Adaptive Sum score test for protective and deleterious variants, Han &
               Pan 2010
usage: vtools associate --method aSum [-h] [--name NAME] [-q1 MAFUPPER]
                                      [-q2 MAFLOWER] [-p N] [--adaptive C]

Adaptive Sum score test for protective and deleterious variants, Han & Pan
2010. In the first stage of the test, each variant site are evaluated for
excess of minor alleles in controls and genotype codings are flipped, and the
second stage performs a burden test similar to BRV (Morris & Zeggini 2009).
This two-stage test is robust to a mixture of protective/risk variants within
one gene, yet is computationally intensive. aSum test is a two-tailed test.

optional arguments:
  -h, --help            show this help message and exit
  --name NAME           Name of the test that will be appended to names of
                        output fields, usually used to differentiate output of
                        different tests, or the same test with different
                        parameters.
  -q1 MAFUPPER, --mafupper MAFUPPER
                        Minor allele frequency upper limit. All variants
                        having sample MAF<=m1 will be included in analysis.
                        Default set to 0.01
  -q2 MAFLOWER, --maflower MAFLOWER
                        Minor allele frequency lower limit. All variants
                        having sample MAF>m2 will be included in analysis.
                        Default set to 0.0
  -p N, --permutations N
                        Number of permutations
  --adaptive C          Adaptive permutation using Edwin Wilson 95 percent
                        confidence interval for binomial distribution. The
                        program will compute a p-value every 1000 permutations
                        and compare the lower bound of the 95 percent CI of
                        p-value against "C", and quit permutations with the
                        p-value if it is larger than "C". It is recommended to
                        specify a "C" that is slightly larger than the
                        significance level for the study. To disable the
                        adaptive procedure, set C=1. Default is C=0.1
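The adaptive procedure described for "C" can be illustrated with the following sketch. It is not the program's code: the function names are hypothetical, the interval is assumed to be a Wilson score interval with z = 1.96, the direction of the comparison against the observed statistic is an assumption, and the p-value is assumed to be estimated as (k+1)/(n+1).

```python
import random
from math import sqrt

def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion k/n (z=1.96 for ~95%)."""
    p = k / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

def adaptive_permutation_pvalue(observed, draw_null_stat, max_perm=5000,
                                c=0.1, check_every=1000):
    """Permutation p-value with early stopping.
    draw_null_stat: callable returning the statistic of one permuted data set.
    Returns (p-value estimate, number of permutations actually used)."""
    exceed = 0
    done = 0
    for i in range(1, max_perm + 1):
        if draw_null_stat() >= observed:   # comparison direction is an assumption
            exceed += 1
        done = i
        if i % check_every == 0 and i < max_perm:
            lower, _ = wilson_interval(exceed, i)
            if lower > c:
                # The p-value is confidently above C; more permutations are wasted effort.
                break
    return (exceed + 1) / (done + 1), done

if __name__ == "__main__":
    rng = random.Random(0)
    # An unremarkable statistic stops after the first check at 1000 permutations.
    print(adaptive_permutation_pvalue(0.0, lambda: rng.gauss(0, 1)))
```

With C=1 the lower bound can never exceed the threshold, so all requested permutations are run, which matches the documented way of disabling the procedure.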


2.2  Application

Example using snapshot vt_ExomeAssociation

vtools associate rare status -m "aSum --name aSum -p 5000" --group_by name2 --to_db asum -j8 > asum.txt

INFO: 3180 samples are found
INFO: 2632 groups are found
INFO: Starting 8 processes to load genotypes
Loading genotypes: 100% [=================================] 3,180 32.6/s in 00:01:37
Testing for association: 100% [=========================================] 2,632/591 10.3/s in 00:04:14
INFO: Association tests on 2632 groups have completed. 591 failed.
INFO: Using annotation DB asum in project test.
INFO: Annotation database used to record results of association tests. Created on Wed, 30 Jan 2013 16:32:32

vtools show fields | grep asum

asum.name2                    name2
asum.sample_size_aSum         sample size
asum.num_variants_aSum        number of variants in each group (adjusted for specified MAF
asum.total_mac_aSum           total minor allele counts in a group (adjusted for MOI)
asum.statistic_aSum           test statistic.
asum.pvalue_aSum              p-value
asum.std_error_aSum           Empirical estimate of the standard deviation of statistic
asum.num_permutations_aSum    number of permutations at which p-value is evaluated

head asum.txt

name2    sample_size_aSum  num_variants_aSum  total_mac_aSum  statistic_aSum  pvalue_aSum  std_error_aSum  num_permutations_aSum
AADACL4  3180              5                  138             2.59057         0.32967      3.85368         1000
ABCG5    3180              6                  87              1.90472         0.335664     3.00098         1000
ABCD3    3180              3                  42              -0.873585       0.635365     2.17424         1000
ABCB6    3180              7                  151             -0.521698       0.632368     3.97958         1000
ABHD1    3180              5                  29              -0.365094       0.548452     1.81627         1000
ABCG8    3180              12                 152             -5.63774        0.95005      4.06417         1000
ABL2     3180              4                  41              0.242453        0.565435     1.98108         1000
ACADL    3180              5                  65              0.457547        0.58042      3.00258         1000
ACAP3    3180              3                  17              0.0273585       0.404595     1.26823         1000

QQ-plot
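The coordinates for a QQ-plot of the p-values in asum.txt could be computed along the following lines. This is a sketch under assumptions: the file is taken to be tab-delimited with the pvalue_aSum column shown in the output, rows whose p-value cannot be parsed (for example from failed groups) are skipped, and the function name qq_points is hypothetical.

```python
import csv
from math import isnan, log10

def qq_points(lines, column="pvalue_aSum"):
    """Return (expected, observed) -log10 p-value pairs, most significant first.
    lines: an open file or any iterable of tab-delimited text lines with a header."""
    pvals = []
    for row in csv.DictReader(lines, delimiter="\t"):
        try:
            p = float(row[column])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric value, skip the row
        if isnan(p) or p <= 0:
            continue
        pvals.append(p)
    pvals.sort()
    n = len(pvals)
    # Expected quantiles of a uniform distribution under the null hypothesis.
    return [(-log10((i + 0.5) / n), -log10(p)) for i, p in enumerate(pvals)]
```

For example, `with open("asum.txt") as f: points = qq_points(f)` yields pairs that can be passed to any plotting library.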

[1] Fang Han and Wei Pan (2010) A Data-Adaptive Sum Test for Disease Association with Multiple Common or Rare Variants. Human Heredity doi:10.1159/000288704. http://www.karger.com/doi/10.1159/000288704

[2] Andrew P. Morris and Eleftheria Zeggini (2010) An evaluation of statistical approaches to rare variant analysis in genetic association studies. Genetic Epidemiology doi:10.1002/gepi.20450. http://doi.wiley.com/10.1002/gepi.20450