
Table 1 Explanation of genetic methods and terms

From: Family studies to find rare high risk variants in migraine

Method/term

Description

Single nucleotide polymorphism (SNP)

A SNP is a substitution of a single base pair in the genome that occurs in >1% of a population, a so-called common variant [84, 85]. SNPs that occur in <1% of a population are considered rare. On average, there is one SNP for every 0.75–1.91 kb throughout the genome [86, 87]. Many of these reside outside protein-coding regions, and a proportion of those reside in other functional elements [88]. Fewer than 1% of SNPs lead to changes in protein function [89].

Following the completion of the HapMap project and the 1000 Genomes Project in particular, the vast majority of SNPs and structural variants have now been mapped throughout the genome [86, 87, 90–92]. More than 38 million SNPs have been identified, and these are estimated to constitute more than 95% of all common SNPs [91]. The SNPs known to date are collected in public databases such as dbSNP [33].

LOD-score

LOD = logarithm of the odds. A measure of how likely it is that two genetic loci are located close to each other on a chromosome and are therefore inherited together (linked). A LOD score > 3 means that the likelihood of the two loci being located close together (linked) is more than 1,000 times the likelihood of no linkage [93].
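As a rough illustration of the definition above, the sketch below computes a two-point LOD score under a simplifying assumption that is not part of the table: all meioses are phase-known, so each can be counted as recombinant or non-recombinant. The function name `lod_score` and its parameters are hypothetical and chosen only for this example.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known meioses (simplified sketch).

    theta is the assumed recombination fraction between the two loci
    (0 = completely linked, 0.5 = unlinked).
    """
    if meioses <= 0 or not 0 <= recombinants <= meioses:
        raise ValueError("need 0 <= recombinants <= meioses and meioses > 0")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")

    # Likelihood of the observed meioses if the loci are linked at theta.
    linked = theta ** recombinants * (1.0 - theta) ** (meioses - recombinants)
    # Likelihood under no linkage (theta = 0.5 for every meiosis).
    unlinked = 0.5 ** meioses

    if linked == 0.0:
        # Observed recombinants are impossible at theta = 0.
        return float("-inf")
    return math.log10(linked / unlinked)


# Ten informative meioses without a single recombinant, evaluated at theta = 0.
score = lod_score(recombinants=0, meioses=10, theta=0.0)
odds = 10 ** score  # odds in favour of linkage versus no linkage
```

In this toy case the odds are 2^10 = 1,024 to 1, which is just above the 1,000-to-1 threshold that corresponds to a LOD score of 3.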

Genome wide association study (GWAS)

The rationale is to find variants that occur more often than expected by chance in the genomes of individuals with a specific phenotype. It is carried out as an association analysis of genotyped cases and controls. SNPs are the most widely used genetic markers. Genomes are genotyped at the specific positions in the DNA where the chosen markers, if present, are located. Every SNP represents a block of genes, a haplotype. The alleles within such a block are inherited together more often than expected by chance; they are said to be in linkage disequilibrium [85]. Tag SNPs present in the sample are tested for association with a phenotype of interest, e.g. migraine, by comparing the frequencies of the SNPs in cases vs. controls.
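The case vs. control frequency comparison described above can be sketched for a single SNP as an allelic chi-square test on a 2×2 table of allele counts. This is one common choice of test, assumed here for illustration; the table does not specify which test is used. The function name `allelic_association` and the allele counts are hypothetical.

```python
import math


def allelic_association(case_alt: int, case_ref: int,
                        ctrl_alt: int, ctrl_ref: int):
    """Allelic chi-square test (1 degree of freedom) for one SNP.

    Arguments are allele counts (two alleles per genotyped individual).
    Returns (chi2, p_value, odds_ratio).
    """
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column of the 2x2 table must be non-empty")

    # Pearson chi-square statistic for a 2x2 table, without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / margins
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    # Odds of carrying the alternative allele in cases relative to controls.
    odds_ratio = (a * d) / (b * c) if b * c > 0 else float("inf")
    return chi2, p_value, odds_ratio


# Made-up counts: 100 cases and 100 controls, i.e. 200 alleles per group.
chi2, p_value, odds_ratio = allelic_association(60, 140, 40, 160)
```

In a real GWAS this comparison is repeated for every genotyped SNP, so the resulting p-values need a correction for multiple testing before any association is declared.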

Next generation sequencing (NGS)

Sequencing of the nucleotides in the entire exome or genome by whole exome or whole genome sequencing (WES or WGS, see below).

Whole exome sequencing (WES)

WES is sequencing of every nucleotide in all exons in a genome. Exons are the protein-coding part of DNA; together they make up the exome. This means that the remaining DNA in between the exons is not sequenced.

Whole genome sequencing (WGS)

WGS is complete sequencing of the entire genome, which consists of almost 3 billion base pairs [89]. Thus, the non-coding parts of the DNA are also sequenced. Non-protein-coding DNA contains many functional elements that influence gene expression and regulation, e.g. RNA-coding sequences, transcription factor binding sites, regions of modification or regions that influence chromatin structure (chromatin being the DNA, RNA and proteins that chromosomes are made of), and other interacting regions [88].

Linkage-analysis

Attempts to find chromosome segments that are shared between affected family members. Thus, no prior hypothesis about the loci involved is needed. To screen for shared DNA blocks, markers are needed. Often, sets of microsatellite markers are used [94]. Microsatellites contain a short sequence of base pairs that is repeated a variable number of times. Every microsatellite represents a block of DNA, a haplotype. Thus, having a specific microsatellite means having a specific haplotype. The aim is then to find linkage between a phenotype, e.g. a disease, and a haplotype. If a haplotype segregates with a disease in a family, they are probably linked.

Haplotype

Each gene has a specific position on a chromosome, a so-called locus. A haplotype is a combination of gene alleles on a chromosome that are inherited together more often than expected by chance. On average, haplotypes span 25,000 nucleotides [84, 85]. Haplotypes are longer in newer, inbred or isolated populations and shorter in old or very outbred populations [91].

Sanger sequencing

A classic method to sequence every nucleotide in a DNA fragment of interest. The method uses modified nucleotides, labeled radioactively or by fluorescence, together with gel electrophoresis [95]. It gives more precise sequencing with fewer read errors than WES/WGS and is used to confirm findings from WES/WGS.

Phasing and imputation

Imputation is performed with various kinds of software and is a way to predict ungenotyped variants that are located between genotyped variants in haplotype blocks, by using a reference sample in which a greater number of variants have been genotyped [96]. Phasing means sorting out which genotypes are located on the paternal chromosome and which on the maternal chromosome [97].

Identity by descent (IBD)

Genomic regions that are identically inherited from parents to more than one child. This means that the siblings will share the DNA combination in that region [63]. IBD can prevail over many generations and reveal the familial relationship (a common ancestor) between very distantly related individuals.