1. Introduction
2. Details
   - 2.1 Command interface
   - 2.2 Application

## 1.  Introduction

This page describes a collection of weighted aggregation tests. Unlike plain aggregation methods, which assume that every locus in the genetic region under investigation contributes equally, weighted methods assign a weight to each variant site, and these weights enter the aggregated "burden", e.g., {$$X=\sum_{i=1}^N\omega_iX_i$$} where {$X_i$} is the genotype coding at site {$i$} and the {$\omega_i$} are the weights. The weights often reflect the relative importance of a variant in terms of its contribution to the phenotype.
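The burden formula above can be sketched in a few lines of Python. This is only an illustration of {$X=\sum_i\omega_iX_i$} computed per sample; the function name `weighted_burden` and the toy genotypes and weights are assumptions made for the example and are not part of the program.

```python
from typing import List, Sequence


def weighted_burden(genotypes: Sequence[Sequence[float]],
                    weights: Sequence[float]) -> List[float]:
    """Return one burden score per sample: X = sum_i w_i * X_i."""
    scores = []
    for sample in genotypes:
        if len(sample) != len(weights):
            raise ValueError("each sample needs one genotype per variant site")
        scores.append(sum(w * x for w, x in zip(weights, sample)))
    return scores


# Hypothetical data: 3 samples x 4 variant sites, additive coding (0/1/2).
genotypes = [
    [0, 1, 0, 2],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
weights = [0.5, 2.0, 1.0, 4.0]          # illustrative weights only
burden = weighted_burden(genotypes, weights)

# With all weights equal to 1 the score reduces to a plain aggregation count.
plain = weighted_burden(genotypes, [1.0] * 4)
```

Setting every weight to 1 recovers the unweighted aggregation score, which is the sense in which the weighted tests generalize the plain ones.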

The weighting approach was first proposed by Madsen and Browning (2010) [1], under the assumption that rarer variants tend to be more important (the WSS statistic). This is by far the most popular weighting theme and has been adopted by a number of later methods, such as Lin and Tang (2011) [2] and Wu et al. (2011) [3]. Other weighting themes, such as the KBAC and RBT weightings, make different assumptions but are also based solely on information internal to the data. Price et al. (2010) [4] proposed the use of "external" weights, i.e., using functional annotation sources to calculate weights for rare variants. This weighting theme can also be integrated naturally into many rare-variant methods.
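As an illustration of the "rarer is more important" idea, the sketch below computes weights in the spirit of the Madsen and Browning weighted-sum statistic: an adjusted allele frequency {$q_i=(m_i+1)/(2n+2)$} is estimated in a reference group, and each site is scaled by {$1/\sqrt{n_{total}\,q_i(1-q_i)}$}. The function name `wss_weights`, the toy data, and the choice to return the reciprocal (so that it can be multiplied into the burden) are assumptions of this example; it is not a transcription of the program's code.

```python
import math
from typing import List, Sequence


def wss_weights(genotypes: Sequence[Sequence[int]],
                reference_rows: Sequence[int]) -> List[float]:
    """Frequency-based weights; rarer variants receive larger weights.

    genotypes      -- samples x sites, additive coding (0/1/2), no missing data
    reference_rows -- indices of the samples used to estimate allele frequency
                      (all samples, or controls only)
    """
    n_total = len(genotypes)
    n_ref = len(reference_rows)
    weights = []
    for i in range(len(genotypes[0])):
        # Minor-allele count at site i in the reference group.
        m_ref = sum(genotypes[j][i] for j in reference_rows)
        # Adding 1 and 2 keeps q strictly between 0 and 1.
        q = (m_ref + 1) / (2 * n_ref + 2)
        weights.append(1.0 / math.sqrt(n_total * q * (1 - q)))
    return weights


# Hypothetical data: 4 samples x 2 sites; site 0 is rarer than site 1.
genotypes = [[1, 0], [0, 0], [0, 1], [0, 1]]
weights = wss_weights(genotypes, reference_rows=[0, 1, 2, 3])
```

Passing all sample indices as `reference_rows` corresponds to a weight based on the entire sample; passing only control indices corresponds to the control-based variant.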

The implementations of WeightedBurdenBt and WeightedBurdenQt are similar to those of the aggregation methods, but they allow the use of the following weighting themes:

• WSS weight, based on entire sample
• WSS weight, based on controls or on samples with above/below average phenotype values
• RBT weight
• KBAC weight
• External weights from annotation

Permutation has to be used to obtain the {$p$} value for the WSS (control-based), KBAC and RBT weighting themes.
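The need for permutation can be illustrated with the control-based weight: because the weights are estimated from the controls, they change whenever the phenotype labels change, so the statistic has to be recomputed, weights included, for every shuffled data set. The sketch below does this for a simple one-sided statistic (mean burden in cases minus mean burden in controls). That statistic is a stand-in chosen for brevity, not the regression-based statistic the tests use, and all names and data are hypothetical.

```python
import math
import random
from typing import Sequence


def control_weighted_statistic(genotypes: Sequence[Sequence[int]],
                               status: Sequence[int]) -> float:
    """Mean weighted burden in cases minus mean weighted burden in controls.

    Weights are re-estimated from the controls (status == 0) on every call.
    Assumes at least one case and one control.
    """
    controls = [j for j, s in enumerate(status) if s == 0]
    cases = [j for j, s in enumerate(status) if s == 1]
    n_total, n_ctrl = len(genotypes), len(controls)
    scores = [0.0] * n_total
    for i in range(len(genotypes[0])):
        m_ctrl = sum(genotypes[j][i] for j in controls)
        q = (m_ctrl + 1) / (2 * n_ctrl + 2)
        w = 1.0 / math.sqrt(n_total * q * (1 - q))
        for j in range(n_total):
            scores[j] += w * genotypes[j][i]
    mean_cases = sum(scores[j] for j in cases) / len(cases)
    mean_ctrls = sum(scores[j] for j in controls) / len(controls)
    return mean_cases - mean_ctrls


def permutation_pvalue(genotypes: Sequence[Sequence[int]],
                       status: Sequence[int],
                       n_permutations: int = 1000,
                       seed: int = 0) -> float:
    """One-sided empirical p-value obtained by shuffling phenotype labels."""
    rng = random.Random(seed)
    observed = control_weighted_statistic(genotypes, status)
    shuffled = list(status)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # permute phenotypes ("Y"), keep genotypes fixed
        if control_weighted_statistic(genotypes, shuffled) >= observed:
            hits += 1
    # Adding 1 to numerator and denominator avoids reporting p = 0.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical data: 3 cases carrying rare variants, 3 controls without.
genotypes = [[1, 0], [0, 1], [1, 1], [0, 0], [0, 0], [0, 0]]
status = [1, 1, 1, 0, 0, 0]
p_value = permutation_pvalue(genotypes, status, n_permutations=200, seed=1)
```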

## 2.  Details

### 2.1  Command interface

vtools show test WeightedBurdenBt
Name:          WeightedBurdenBt
Description:   Weighted genotype burden tests for disease traits, using one or many
arbitrary external weights as well as one of 4 internal
weighting themes
usage: vtools associate --method WeightedBurdenBt [-h] [--name NAME]
[--mafupper MAFUPPER]
[--alternative TAILED]
[-p N] [--permute_by XY]
[--extern_weight [EXTERN_WEIGHT [EXTERN_WEIGHT ...]]]
[--weight {Browning_all,Browning,KBAC,RBT}]

Weighted genotype burden tests for disease traits, using one or many arbitrary
external weights as well as one of 4 internal weighting themes. External
weights (variant/genotype annotation field) are passed into the test by
--var_info and --geno_info options. Internal weighting themes are one of
"Browning_all", "Browning", "KBAC" or "RBT". p-value is based on logistic
regression analysis and permutation procedure has to be used for "Browning",
"KBAC" or "RBT" weights.

optional arguments:
-h, --help            show this help message and exit
--name NAME           Name of the test that will be appended to names of
output fields, usually used to differentiate output of
different tests, or the same test with different
parameters.
--mafupper MAFUPPER   Minor allele frequency upper limit. All variants
having sample MAF<=m1 will be included in analysis.
Default set to 0.01
--alternative TAILED  Alternative hypothesis is one-sided ("1") or two-sided
("2"). Default set to 1
-p N, --permutations N
Number of permutations
--permute_by XY       Permute phenotypes ("Y") or genotypes ("X"). Default
is "Y"
--adaptive C          Adaptive permutation using Edwin Wilson 95 percent
confidence interval for binomial distribution. The
program will compute a p-value every 1000 permutations
and compare the lower bound of the 95 percent CI of
p-value against "C", and quit permutations with the
p-value if it is larger than "C". It is recommended to
specify a "C" that is slightly larger than the
significance level for the study. To disable the
adaptive procedure, set C=1. Default is C=0.1
--extern_weight [EXTERN_WEIGHT [EXTERN_WEIGHT ...]]
External weights that will be directly applied to
genotype coding. Names of these weights should be in
one of '--var_info' or '--geno_info'. If multiple
weights are specified, they will be applied to
genotypes sequentially. Note that all weights will be
masked if --use_indicator is evoked.
--weight {Browning_all,Browning,KBAC,RBT}
Internal weighting themes inspired by various
association methods. Valid choices are:
'Browning_all', 'Browning', 'KBAC' and 'RBT'. Default
set to 'Browning_all'. Except for 'Browning_all'
weighting, tests using all other weighting themes has
to calculate p-value via permutation. For details of
the weighting themes, please refer to the online
documentation.
--NA_adjust           This option, if evoked, will replace missing genotype
values with a score relative to sample allele
frequencies. The association test will be adjusted to
incorporate the information. This is an effective
approach to control for type I error due to
differential degrees of missing genotypes among
samples.
--moi {additive,dominant,recessive}
Mode of inheritance. Will code genotypes as 0/1/2/NA
for additive mode, 0/1/NA for dominant or recessive
model. Default set to additive
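The help text for `--extern_weight` says that multiple external weights are applied to the genotype coding sequentially. One plausible reading, sketched below, is that each named weight multiplies the genotype codes in turn. This interpretation, the function name `apply_external_weights`, and the annotation field names are assumptions of the example rather than a description of the program's internals; only variant-level weights (one value per site) are shown.

```python
from typing import Dict, List, Sequence


def apply_external_weights(genotypes: Sequence[Sequence[float]],
                           annotations: Dict[str, Sequence[float]],
                           weight_names: Sequence[str]) -> List[List[float]]:
    """Multiply genotype codes by each named per-variant weight, in order."""
    weighted = [[float(g) for g in row] for row in genotypes]
    for name in weight_names:           # weights are applied one after another
        per_variant = annotations[name]
        for row in weighted:
            for i, w in enumerate(per_variant):
                row[i] *= w
    return weighted


# Hypothetical data: 2 samples x 3 sites and two made-up annotation fields.
genotypes = [[1, 0, 2], [0, 1, 0]]
annotations = {
    "func_score": [0.5, 1.0, 2.0],
    "conservation": [2.0, 0.0, 1.0],
}
weighted = apply_external_weights(genotypes, annotations,
                                  ["func_score", "conservation"])
```

Under this multiplicative reading the order of the weights does not change the result, and a weight of 0 removes a site from the burden entirely.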
vtools show test WeightedBurdenQt
Name:          WeightedBurdenQt
Description:   Weighted genotype burden tests for quantitative traits, using one or
many arbitrary external weights as well as one of 4
internal weighting themes
usage: vtools associate --method WeightedBurdenQt [-h] [--name NAME]
[--mafupper MAFUPPER]
[--alternative TAILED]
[-p N] [--permute_by XY]
[--extern_weight [EXTERN_WEIGHT [EXTERN_WEIGHT ...]]]
[--weight {Browning_all,Browning,KBAC,RBT}]

Weighted genotype burden tests for quantitative traits, using one or many
arbitrary external weights as well as one of 4 internal weighting themes.
External weights (variant/genotype annotation field) are passed into the test
by --var_info and --geno_info options. Internal weighting themes are one of
"Browning_all", "Browning", "KBAC" or "RBT". p-value is based on linear
regression analysis and permutation procedure has to be used for "Browning",
"KBAC" or "RBT" weights.

optional arguments:
-h, --help            show this help message and exit
--name NAME           Name of the test that will be appended to names of
output fields, usually used to differentiate output of
different tests, or the same test with different
parameters.
--mafupper MAFUPPER   Minor allele frequency upper limit. All variants
having sample MAF<=m1 will be included in analysis.
Default set to 0.01
--alternative TAILED  Alternative hypothesis is one-sided ("1") or two-sided
("2"). Default set to 1
-p N, --permutations N
Number of permutations
--permute_by XY       Permute phenotypes ("Y") or genotypes ("X"). Default
is "Y"
--adaptive C          Adaptive permutation using Edwin Wilson 95 percent
confidence interval for binomial distribution. The
program will compute a p-value every 1000 permutations
and compare the lower bound of the 95 percent CI of
p-value against "C", and quit permutations with the
p-value if it is larger than "C". It is recommended to
specify a "C" that is slightly larger than the
significance level for the study. To disable the
adaptive procedure, set C=1. Default is C=0.1
--extern_weight [EXTERN_WEIGHT [EXTERN_WEIGHT ...]]
External weights that will be directly applied to
genotype coding. Names of these weights should be in
one of '--var_info' or '--geno_info'. If multiple
weights are specified, they will be applied to
genotypes sequentially. Note that all weights will be
masked if --use_indicator is evoked.
--weight {Browning_all,Browning,KBAC,RBT}
Internal weighting themes inspired by various
association methods. Valid choices are:
'Browning_all', 'Browning', 'KBAC' and 'RBT'. Default
set to 'Browning_all'. Except for 'Browning_all'
weighting, tests using all other weighting themes has
to calculate p-value via permutation. For details of
the weighting themes, please refer to the online
documentation.
--NA_adjust           This option, if evoked, will replace missing genotype
values with a score relative to sample allele
frequencies. The association test will be adjusted to
incorporate the information. This is an effective
approach to control for type I error due to
differential degrees of missing genotypes among
samples.
--moi {additive,dominant,recessive}
Mode of inheritance. Will code genotypes as 0/1/2/NA
for additive mode, 0/1/NA for dominant or recessive
mode. Default set to additive
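The `--adaptive` option described in the help text stops permuting early once the p-value is clearly not going to be significant. A minimal sketch of that stopping rule follows, using the standard Wilson score interval for a binomial proportion. The function names, the `draw_exceeds` callback (which should return `True` when a permuted statistic is at least as extreme as the observed one), and the use of `(hits + 1) / (done + 1)` as the reported p-value are assumptions made for this example.

```python
import math
import random
from typing import Callable, Tuple


def wilson_interval(successes: int, n: int,
                    z: float = 1.959963984540054) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def adaptive_permutation(draw_exceeds: Callable[[], bool],
                         max_permutations: int,
                         cutoff: float = 0.1,
                         check_every: int = 1000) -> Tuple[float, int]:
    """Return (empirical p-value, number of permutations actually run)."""
    hits = 0
    done = 0
    for done in range(1, max_permutations + 1):
        if draw_exceeds():
            hits += 1
        if done % check_every == 0:
            lower, _ = wilson_interval(hits, done)
            if lower > cutoff:          # p-value is confidently above C: stop
                break
    return (hits + 1) / (done + 1), done


# Hypothetical null-like test: about half of the permuted statistics exceed
# the observed one, so the procedure should stop at the first check.
rng = random.Random(0)
p_value, used = adaptive_permutation(lambda: rng.random() < 0.5, 10000)
```

With `cutoff=1` the lower bound can never exceed the cutoff, so all permutations are run, which matches the documented way of disabling the adaptive procedure.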

### 2.2  Application

Example using snapshot vt_ExomeAssociation