The default version of our dbSNP annotation is currently dbSNP 138 (using hg19 coordinates), as shown below. Users can also retrieve older versions of dbSNP: dbSNP129, dbSNP130, dbSNP131, dbSNP132 and dbSNP135. Versions 129 and 130 use hg18 as the reference genome; versions 131, 132, 135 and later use hg19. An archived version can be used in a variant tools project by referring to its specific name, for example dbSNP-hg18_129.
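The version-to-reference-genome rule above can be sketched as a small lookup. This is only an illustration: the helper `dbsnp_annotation_name` is hypothetical, and the names it builds for versions other than 129 and 135 are assumed to follow the same `dbSNP-<build>_<version>` pattern as `dbSNP-hg18_129`.

```python
# Hypothetical helper, not part of variant tools: builds an annotation
# name following the "dbSNP-<build>_<version>" pattern of "dbSNP-hg18_129".
HG18_VERSIONS = {129, 130}            # versions built on hg18
HG19_VERSIONS = {131, 132, 135, 138}  # versions built on hg19

def dbsnp_annotation_name(version: int) -> str:
    """Return the assumed annotation name for a dbSNP version."""
    if version in HG18_VERSIONS:
        build = "hg18"
    elif version in HG19_VERSIONS:
        build = "hg19"
    else:
        raise ValueError(f"no reference genome known for dbSNP version {version}")
    return f"dbSNP-{build}_{version}"

print(dbsnp_annotation_name(129))  # dbSNP-hg18_129
print(dbsnp_annotation_name(135))  # dbSNP-hg19_135
```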

  1. dbSNP 138 has many more flags and fields than previous versions. It also does not contain all variants that are defined in dbSNP 135 and earlier.
  2. A dbSNP entry might match multiple variants. For example, rs111688037 matches variants T->A and T->C at chr6:31602679.
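
Because one rs name can match several variants, the rs name alone is not a safe unique key. The sketch below groups records by rs name and reports the ambiguous ones. The record layout (chr, pos, ref, alt, name) and the in-memory list are assumptions made for illustration; only the rs111688037 example itself comes from the note above.

```python
from collections import defaultdict

# Illustrative records (chr, pos, ref, alt, rs name) built from the
# rs111688037 example; a real project would read them from the database.
records = [
    ("6", 31602679, "T", "A", "rs111688037"),
    ("6", 31602679, "T", "C", "rs111688037"),
]

# Collect every variant that shares the same rs name.
variants_by_rs = defaultdict(list)
for chrom, pos, ref, alt, name in records:
    variants_by_rs[name].append((chrom, pos, ref, alt))

# rs names that map to more than one variant cannot serve as unique keys;
# (chr, pos, ref, alt) identifies a variant unambiguously instead.
ambiguous = {name: v for name, v in variants_by_rs.items() if len(v) > 1}
print(ambiguous)
```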

version 138

% vtools show annotation dbSNP -v2
Annotation database dbSNP (version hg19_138)
Description:            dbSNP version 138, created using vcf file downloaded
  from NCBI
Database type:          variant
Reference genome hg19:  chr, pos, ref, alt
  chr
  pos
  name                  DB SNP ID (rsname)
  ref                   Reference allele (as on the + strand)
  alt                   Alternative allele (as on the + strand)
  FILTER                Inconsistent Genotype Submission For At Least One
                        Sample
  RS                    dbSNP ID (i.e. rs number)
  RSPOS                 Chr position reported in dbSNP
  RV                    RS orientation is reversed
  VP                    Variation Property.  Documentation is at
                        ftp://ftp.ncbi.nlm.nih.gov/snp/specs/dbSNP_BitField_la
                        test.pdf
  GENEINFO              Pairs each of gene symbol:gene id.  The gene symbol
                        and id are delimited by a colon (:) and each pair is
                        delimited by a vertical bar (|)
  dbSNPBuildID          First dbSNP Build for RS
  SAO                   Variant Allele Origin: 0 - unspecified, 1 - Germline,
                        2 - Somatic, 3 - Both
  SSR                   Variant Suspect Reason Codes (may be more than one
                        value added together) 0 - unspecified, 1 - Paralog, 2
                        - byEST, 4 - oldAlign, 8 - Para_EST, 16 - 1kg_failed,
                        1024 - other
  WGT                   Weight, 00 - unmapped, 1 - weight 1, 2 - weight 2, 3 -
                        weight 3 or more
  VC                    Variation Class
  PM_flag               Variant is Precious(Clinical,Pubmed Cited)
  TPA_flag              Provisional Third Party Annotation(TPA) (currently rs
                        from PHARMGKB who will give phenotype data)
  PMC_flag              Links exist to PubMed Central article
  S3D_flag              Has 3D structure - SNP3D table
  SLO_flag              Has SubmitterLinkOut - From
                        SNP->SubSNP->Batch.link_out
  NSF_flag              Has non-synonymous frameshift A coding region
                        variation where one allele in the set changes all
                        downstream amino acids. FxnClass = 44
  NSM_flag              Has non-synonymous missense A coding region variation
                        where one allele in the set changes protein peptide.
                        FxnClass = 42
  NSN_flag              Has non-synonymous nonsense A coding region variation
                        where one allele in the set changes to STOP codon
                        (TER). FxnClass = 41
  REF_flag_flag         Has reference A coding region variation where one
                        allele in the set is identical to the reference
                        sequence. FxnCode = 8
  SYN_flag              Has synonymous A coding region variation where one
                        allele in the set does not change the encoded amino
                        acid. FxnCode = 3
  U3_flag               In 3' UTR Location is in an untranslated region (UTR).
                        FxnCode = 53
  U5_flag               In 5' UTR Location is in an untranslated region (UTR).
                        FxnCode = 55
  ASS_flag              In acceptor splice site FxnCode = 73
  DSS_flag              In donor splice-site FxnCode = 75
  INT_flag              In Intron FxnCode = 6
  R3_flag               In 3' gene region FxnCode = 13
  R5_flag               In 5' gene region FxnCode = 15
  OTH_flag              Has other variant with exactly the same set of mapped
                        positions on NCBI reference assembly.
  CFL_flag              Has Assembly conflict. This is for weight 1 and 2
                        variant that maps to different chromosomes on
                        different assemblies.
  ASP_flag              Is Assembly specific. This is set if the variant only
                        maps to one assembly
  MUT_flag              Is mutation (journal citation, explicit fact): a low
                        frequency variation that is cited in journal and other
                        reputable sources
  VLD_flag              Is Validated.  This bit is set if the variant has 2+
                        minor allele count based on frequency or genotype
                        data.
  G5A_flag              >5% minor allele frequency in each and all populations
  G5_flag               >5% minor allele frequency in 1+ populations
  HD_flag               Marker is on high density genotyping kit (50K density
                        or greater).  The variant may have phenotype
                        associations present in dbGaP.
  GNO_flag              Genotypes available. The variant has individual
                        genotype (in SubInd table).
  KGValidated_flag      1000 Genome validated
  KGPhase1_flag         1000 Genome phase 1 (incl. June Interim phase 1)
  KGPilot123_flag       1000 Genome discovery all pilots 2010(1,2,3)
  KGPROD_flag           Has 1000 Genome submission
  OTHERKG_flag          non-1000 Genome submission
  PH3_flag              HAP_MAP Phase 3 genotyped: filtered, non-redundant
  CDA_flag              Variation is interrogated in a clinical diagnostic
                        assay
  LSD_flag              Submitted from a locus-specific database
  MTP_flag              Microattribution/third-party annotation(TPA:GWAS,PAGE)
  OM_flag               Has OMIM/OMIA
  NOC_flag              Contig allele not present in variant allele list. The
                        reference sequence allele at the mapped position is
                        not present in the variant allele list, adjusted for
                        orientation.
  WTD_flag              Is Withdrawn by submitter If one member ss is
                        withdrawn by submitter, then this bit is set.  If all
                        member ss' are withdrawn, then the rs is deleted to
                        SNPHistory
  NOV_flag              Rs cluster has non-overlapping allele sets. True when
                        rs set has more than 2 alleles from different
                        submissions and these sets share no alleles in common.
  CAF                   An ordered, comma delimited list of allele frequencies
                        based on 1000Genomes, starting with the reference
                        allele followed by alternate alleles as ordered in the
                        ALT column. Where a 1000Genomes alternate allele is
                        not in the dbSNPs alternate allele set, the allele is
                        added to the ALT column.  The minor allele is the
                        second largest value in the list, and was previously
                        reported in VCF as the GMAF.  This is the GMAF
                        reported on the RefSNP and EntrezSNP pages and
                        VariationReporter
  COMMON                RS is a common SNP.  A common SNP is one that has at
                        least one 1000Genomes population with a minor allele
                        of frequency >= 1% and for which 2 or more founders
                        contribute to that minor allele frequency.
  CLNHGVS               Variant names from HGVS.    The order of these
                        variants corresponds to the order of the info in the
                        other clinical  INFO tags.
  CLNALLE               Variant alleles from REF or ALT columns.  0 is REF, 1
                        is the first ALT allele, etc.  This is used to match
                        alleles with other corresponding clinical (CLN) INFO
                        tags.  A value of -1 indicates that no allele was
                        found to match a corresponding HGVS allele name.
  CLNSRC                Variant Clinical Channels
  CLNORIGIN             Allele Origin. One or more of the following values may
                        be added: 0 - unknown; 1 - germline; 2 - somatic; 4 -
                        inherited; 8 - paternal; 16 - maternal; 32 - de-novo;
                        64 - biparental; 128 - uniparental; 256 - not-tested;
                        512 - tested-inconclusive; 1073741824 - other
  CLNSRCID              Variant Clinical Channel IDs
  CLNSIG                Variant Clinical Significance, 0 - unknown, 1 -
                        untested, 2 - non-pathogenic, 3 - probable-non-
                        pathogenic, 4 - probable-pathogenic, 5 - pathogenic, 6
                        - drug-response, 7 - histocompatibility, 255 - other
  CLNDSDB               Variant disease database name
  CLNDSDBID             Variant disease database ID
  CLNDBN                Variant disease name
  CLNACC                Variant Accession and Versions
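
Several of the fields above pack more than one value into a single entry: SSR and CLNORIGIN are sums of codes, GENEINFO is a list of symbol:id pairs, and CAF is a list of allele frequencies. The following sketch decodes them according to the descriptions in the listing. The function names are hypothetical, the sample values are made up for illustration, and the handling of square brackets and `.` entries in CAF is an assumption about how the raw values may be stored.

```python
# Code tables copied from the SSR and CLNORIGIN descriptions above.
SSR_CODES = {1: "Paralog", 2: "byEST", 4: "oldAlign", 8: "Para_EST",
             16: "1kg_failed", 1024: "other"}
CLNORIGIN_CODES = {1: "germline", 2: "somatic", 4: "inherited", 8: "paternal",
                   16: "maternal", 32: "de-novo", 64: "biparental",
                   128: "uniparental", 256: "not-tested",
                   512: "tested-inconclusive", 1073741824: "other"}

def decode_flags(value, codes, zero_label):
    """Split a summed code into the labels of its individual bits."""
    if value == 0:
        return [zero_label]
    return [label for bit, label in sorted(codes.items()) if value & bit]

def parse_geneinfo(text):
    """Parse 'symbol:id|symbol:id' into a list of (symbol, id) tuples."""
    pairs = []
    for item in text.split("|"):
        symbol, _, gene_id = item.partition(":")
        pairs.append((symbol, int(gene_id)))
    return pairs

def minor_allele_frequency(caf):
    """Return the second largest frequency in a CAF list, or None."""
    # Assumption: values may be wrapped in [] and missing ones written as '.'.
    values = [float(x) for x in caf.strip("[]").split(",") if x not in ("", ".")]
    if len(values) < 2:
        return None
    return sorted(values, reverse=True)[1]

print(decode_flags(18, SSR_CODES, "unspecified"))   # 18 = 2 + 16
print(decode_flags(3, CLNORIGIN_CODES, "unknown"))  # 3 = 1 + 2
print(parse_geneinfo("BRCA1:672|NBR2:10230"))       # illustrative value
print(minor_allele_frequency("[0.9904,0.009642]"))  # illustrative value
```

Testing each bit with `value & bit` rather than comparing against fixed sums keeps the decoder correct for any combination of codes.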

version 135 and earlier

% vtools show annotation dbSNP-hg19_135 -v2
Annotation database dbSNP (version hg19_137)
Description: dbSNP version 137
Database type: variant
Number of records: 58,008,911
Number of distinct variants: 56,738,705
Reference genome hg19: ['chr', 'start', 'refNCBI', 'alt']

Field:           chr
Type:            string
Missing entries: 0 
Unique Entries:  93

Field:           start
Type:            integer
Comment:         start position in chrom (1-based)
Missing entries: 0 
Unique Entries:  46,982,076
Range:           55 - 249239663

Field:           end
Type:            integer
Comment:         end position in chrom (1-based). start=end means zero-length
                 feature
Missing entries: 0 
Unique Entries:  46,756,124
Range:           55 - 249239663

Field:           name
Type:            string
Comment:         dbSNP reference SNP identifier
Missing entries: 0 
Unique Entries:  53,109,372

Field:           strand
Type:            string
Comment:         which DNA strand contains the observed alleles
Missing entries: 0 
Unique Entries:  2

Field:           refNCBI
Type:            string
Comment:         Reference genomic sequence from dbSNP
Missing entries: 0 
Unique Entries:  165,544

Field:           refUCSC
Type:            string
Comment:         Reference genomic sequence from UCSC lookup of
                 chrom,chromStart,chromEnd
Missing entries: 0 
Unique Entries:  160,817

Field:           observed
Type:            string
Comment:         Strand-specific observed alleles
Missing entries: 0 
Unique Entries:  250,691

Field:           alt
Type:            string
Comment:         alternate allele on the '+' strand
Missing entries: 0 
Unique Entries:  112,097

Field:           molType
Type:            string
Comment:         sample type, can be one of unknown, genomic or cDNA
Missing entries: 0 
Unique Entries:  3

Field:           class
Type:            string
Comment:         Class of variant (single, in-del, het, named, mixed,
                 insertion, deletion etc
Missing entries: 0 
Unique Entries:  9

Field:           valid
Type:            string
Comment:         validation status, can be unknown, by-cluster, by-frequency,
                 by-submitter, by-2hit-2allele, by-hapmap, and by-
                 1000genomes
Missing entries: 0 
Unique Entries:  62

Field:           avHet
Type:            float
Comment:         Average heterozygosity from all observations
Missing entries: 0 
Unique Entries:  158,839
Range:           0 - 0.904364

Field:           avHetSE
Type:            float
Comment:         Standard error for the average heterozygosity
Missing entries: 0 
Unique Entries:  106,224
Range:           0 - 0.305748

Field:           func
Type:            string
Comment:         Functional category of the SNP (coding-synon, coding-nonsynon,
                 intron, etc.)
Missing entries: 0 
Unique Entries:  445

Field:           locType
Type:            string
Comment:         Type of mapping inferred from size on reference.
Missing entries: 0 
Unique Entries:  7