HAPAR: Haplotype Inference by Parsimony

HAPAR is a program that infers haplotypes from genotype data. It uses the parsimony principle, i.e. it tries to find the minimum number of haplotypes that can reconstruct the input genotypes.


download source files

download executable for Windows

download executable for Unix (compiled under Sun Solaris)
 
How to Use HAPAR

If you downloaded the UNIX executable, it needs to be decompressed first.
Run the following two commands in order:
  (1) gunzip hapar.tar.gz
  (2) tar xvf hapar.tar
This produces a file named hapar, which can be executed directly.

Usage:    hapar inputfile

Input file format:
The input file contains the genotype data generated in the experiment.
The first line contains: genotype-number  SNP-number
Each genotype is represented by a consecutive sequence of SNP sites, and each SNP site can be 0 (homozygous wild type), 1 (homozygous mutant), or 2 (heterozygous). A sample input file.
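
The input format above can be read with a few lines of Python. This is only a sketch, not part of HAPAR: the function name parse_genotypes is hypothetical, and it assumes one genotype per line after the header, as in the B2ar example on this page.

```python
def parse_genotypes(text):
    """Parse HAPAR-style input text into (snp_count, list_of_genotypes).

    Assumption: one genotype string per line after the header line.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Header line: genotype-number  SNP-number
    n_genotypes, n_snps = (int(tok) for tok in lines[0].split())
    genotypes = lines[1:]
    if len(genotypes) != n_genotypes:
        raise ValueError("header announces %d genotypes, found %d"
                         % (n_genotypes, len(genotypes)))
    for g in genotypes:
        # Every site must be 0, 1 or 2, and every genotype must cover all SNPs.
        if len(g) != n_snps or set(g) - set("012"):
            raise ValueError("malformed genotype: %r" % g)
    return n_snps, genotypes


sample = "2 4\n0120\n2211\n"
print(parse_genotypes(sample))  # prints (4, ['0120', '2211'])
```

Checking the counts against the header catches truncated or hand-edited files before they are passed to the program.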

Output file format:
The reconstructed haplotypes are written to the file hapar.out
The first line contains: haplotype-number  SNP-number
Each haplotype is represented by a consecutive sequence of SNP sites, and each SNP site can be either 0 (wild-type allele) or 1 (mutant allele). Then the resolutions of the input genotypes are listed. Some genotypes may have more than one possible resolution. A sample output file.
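
What it means for a pair of haplotypes to resolve a genotype follows from the two encodings: where both haplotypes carry the same allele the genotype shows that allele, and where they differ the genotype shows 2. The sketch below illustrates this in Python. The names combine and resolutions are hypothetical, and the brute-force search only finds pairs within a given haplotype list; it is not HAPAR's parsimony algorithm.

```python
from itertools import combinations_with_replacement


def combine(h1, h2):
    """Return the genotype formed by two haplotypes of equal length.

    Sites where both haplotypes agree keep that allele (0 or 1);
    sites where they differ are heterozygous and become 2.
    """
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same number of SNP sites")
    return "".join(a if a == b else "2" for a, b in zip(h1, h2))


def resolutions(genotype, haplotypes):
    """List every unordered pair from `haplotypes` that resolves `genotype`."""
    return [
        (h1, h2)
        for h1, h2 in combinations_with_replacement(haplotypes, 2)
        if combine(h1, h2) == genotype
    ]


print(combine("0101", "0011"))  # prints 0221
print(resolutions("0221", ["0101", "0011", "0111", "0001"]))
```

In the second call two different pairs produce the same genotype, which is why a genotype can have more than one resolution.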

Here is an example taken from real biological data: B2ar (B2-adrenergic receptor gene).
The input file (B2ar.in) is as follows:
18 12
011100011101
120011100101
110011100101
110011120101
111011110020
212222222101
211222212202
112011120202
212222212101
222222222101
212222222202
112011120222
112011120201
212011100101
211011120202
211222212101
212222222202
110011100202

Run: hapar B2ar.in

The output file (hapar.out) is:
10 12
110011100000
011011100101
111011110001
111011110101
011100011101
100011100101
110011100101
110011110101
111011110000
111011110010
Resolution of input Genotypes:
Genotype                   Resolution
011100011101:      011100011101 + 011100011101 ;
120011100101:      100011100101 + 110011100101 ;
110011100101:      110011100101 + 110011100101 ;
110011120101:      110011100101 + 110011110101 ;
111011110020:      111011110000 + 111011110010 ;
212222222101:      011100011101 + 110011100101 ;
211222212202:      011100011101 + 111011110000 ;
112011120202:      110011100000 + 111011110101 ; 110011100101 + 111011110000 ;
212222212101:      011100011101 + 110011110101 ;
222222222101:      011100011101 + 100011100101 ;
212222222202:      110011100000 + 011100011101 ;
112011120222:      110011100101 + 111011110010 ;
112011120201:      111011110001 + 110011100101 ;
212011100101:      011011100101 + 110011100101 ;
211011120202:      011011100101 + 111011110000 ;
211222212101:      111011110101 + 011100011101 ;
212222222202:      110011100000 + 011100011101 ;
110011100202:      110011100000 + 110011100101 ;
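
An output file in this layout can be checked mechanically: every listed pair should consist of haplotypes from the haplotype list and should combine back to its genotype. The Python sketch below is one way to do that. It is not part of HAPAR, the names combine and check_output are hypothetical, and SAMPLE is an abridged excerpt of the output above with the header adjusted to three haplotypes.

```python
def combine(h1, h2):
    """Genotype formed by two haplotypes: equal alleles are kept, differing ones become 2."""
    return "".join(a if a == b else "2" for a, b in zip(h1, h2))


def check_output(text):
    """Return a list of problems found in HAPAR-style output text (empty if none)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Header line: haplotype-number  SNP-number
    n_hap, n_snps = (int(tok) for tok in lines[0].split())
    haplotypes = lines[1:1 + n_hap]
    problems = ["bad haplotype %r" % h for h in haplotypes
                if len(h) != n_snps or set(h) - set("01")]
    known = set(haplotypes)
    for line in lines[1 + n_hap:]:
        genotype, sep, rest = line.partition(":")
        genotype = genotype.strip()
        if not sep or not genotype or set(genotype) - set("012"):
            continue  # not a resolution line (e.g. the two header lines)
        for pair in rest.split(";"):
            if not pair.strip():
                continue  # text after the final ';' is empty
            parts = [p.strip() for p in pair.split("+")]
            if len(parts) != 2 or not set(parts) <= known:
                problems.append("%s: unknown haplotype in %r" % (genotype, pair.strip()))
            elif combine(parts[0], parts[1]) != genotype:
                problems.append("%s: %r does not resolve it" % (genotype, pair.strip()))
    return problems


# Abridged excerpt of the output above (assumption: header changed to 3 haplotypes).
SAMPLE = """3 12
011100011101
100011100101
110011100101
Resolution of input Genotypes:
Genotype                   Resolution
011100011101:      011100011101 + 011100011101 ;
120011100101:      100011100101 + 110011100101 ;
222222222101:      011100011101 + 100011100101 ;
"""

print(check_output(SAMPLE))  # prints []
```

To check a real run, the contents of hapar.out can be read into a string and passed to check_output in place of SAMPLE.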