I need haplotype data from HapMap or 1000 Genomes for the CEU, MKK, TSI, CHB and JPT populations, covering ± 1 Mb around the LCT gene. I'm quite new to this area and don't actually know what "± 1 Mb" means. Which sites can I get the data from, and how? Do I need a program?

As far as I understand, the LCT gene is on chromosome 2, so I used the following command: wget -r --reject="index.html*" Now I have a VCF file.

I also obtained a VCF file from the Data Slicer. I chose the populations and entered 2:135,787,840-135,837,200 as the region. What does this file contain? How can I view its contents? What else do I have to do to get haplotypes? I want to cluster and visualize the haplotypes by population. I would be very grateful for any guidance.


To obtain population-specific VCF files, I first created a sample list for each population (CEU, MKK, …) from the samples panel file, which you can also download from the 1000 Genomes FTP site.

grep CEU integrated_call_samples_v3.20130502.ALL.panel | cut -f1 > CEU.samples.list
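To build the lists for several populations in one go, a loop along these lines might help. This is only a sketch: it runs on a toy stand-in for the panel file with made-up sample IDs, and it assumes the real panel is tab-separated with the population code in the second column. It matches that column exactly with awk instead of grepping the whole line. MKK is left out because, as far as I know, it is a HapMap 3 population and is not in the 1000 Genomes panel.

```shell
# Toy stand-in for integrated_call_samples_v3.20130502.ALL.panel.
# Assumed layout: sample <TAB> pop <TAB> super_pop <TAB> gender.
# The sample IDs S1..S6 are made up for illustration.
printf 'sample\tpop\tsuper_pop\tgender\n'  > toy.panel
printf 'S1\tCEU\tEUR\tmale\n'             >> toy.panel
printf 'S2\tCEU\tEUR\tfemale\n'           >> toy.panel
printf 'S3\tTSI\tEUR\tmale\n'             >> toy.panel
printf 'S4\tCHB\tEAS\tfemale\n'           >> toy.panel
printf 'S5\tJPT\tEAS\tmale\n'             >> toy.panel
printf 'S6\tGBR\tEUR\tfemale\n'           >> toy.panel

# Point this at the real panel file when running on actual data
PANEL=toy.panel

# One sample list per population: keep column 1 where column 2 matches
for POP in CEU TSI CHB JPT; do
    awk -F'\t' -v pop="$POP" '$2 == pop { print $1 }' "$PANEL" > "$POP.samples.list"
done

# Quick check of how many samples ended up in each list
wc -l CEU.samples.list TSI.samples.list CHB.samples.list JPT.samples.list
```

Matching on the population column avoids picking up lines where the code happens to appear somewhere else in the row.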

Then I installed VCFtools and used the vcf-subset command.
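As for "± 1 Mb": it means extending the gene interval by 1 megabase (1,000,000 bp) on each side. A rough sketch of how the window and the subsetting step might fit together is below. It uses the LCT coordinates given in the question; the input file name chr2.vcf.gz and the output name are placeholders, and the tabix/vcf-subset/bgzip pipeline only runs if those tools and files are actually present.

```shell
# LCT interval as entered in the Data Slicer in the question
CHR=2
START=135787840
END=135837200
FLANK=1000000    # 1 Mb = 1,000,000 base pairs

# Widen the interval by 1 Mb upstream and downstream
REGION="${CHR}:$((START - FLANK))-$((END + FLANK))"
echo "Region with flanks: $REGION"

# Placeholder names: replace with your own bgzipped, tabix-indexed VCF
VCF=chr2.vcf.gz
OUT=CEU.LCT.vcf.gz

# Slice the region, keep only the CEU samples, and recompress.
# Skipped quietly if the tools or input files are missing.
if command -v tabix >/dev/null 2>&1 \
   && command -v vcf-subset >/dev/null 2>&1 \
   && command -v bgzip >/dev/null 2>&1 \
   && [ -f "$VCF" ] && [ -f CEU.samples.list ]; then
    tabix -h "$VCF" "$REGION" | vcf-subset -c CEU.samples.list | bgzip -c > "$OUT"
fi
```

The same pipeline can be repeated with the other sample lists to get one VCF per population.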

I got the commands from this FAQ page:

I did not compute haplotypes from the files myself, but I read in another post that you can do that with PLINK. Have you checked that?
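On viewing the file and getting at haplotypes: a VCF is plain text (usually gzipped), and the 1000 Genomes phase 3 genotypes are already phased, written as 0|1, where the allele left of the bar is on one haplotype and the allele right of it on the other. Under that assumption, the two haplotypes of each sample can be read straight out of the genotype columns. The sketch below uses a toy VCF with made-up samples and variants, and assumes biallelic sites with GT as the first FORMAT field.

```shell
# Toy phased VCF: two made-up samples, three made-up variants
{
  printf '##fileformat=VCFv4.1\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n'
  printf '2\t100\trsA\tA\tG\t.\tPASS\t.\tGT\t0|1\t1|1\n'
  printf '2\t200\trsB\tC\tT\t.\tPASS\t.\tGT\t0|0\t1|0\n'
  printf '2\t300\trsC\tG\tA\t.\tPASS\t.\tGT\t1|1\t0|1\n'
} > toy.vcf

# View the content: drop the '##' meta lines, show header and first records
# (for a gzipped file, pipe through `gzip -dc file.vcf.gz` first)
grep -v '^##' toy.vcf | head

# Concatenate alleles per haplotype: two output rows per sample
awk -F'\t' '
  /^##/     { next }                                   # skip meta lines
  /^#CHROM/ { n = NF; for (i = 10; i <= n; i++) name[i] = $i; next }
  {
    for (i = 10; i <= n; i++) {
      split($i, f, ":")        # GT is assumed to be the first FORMAT field
      split(f[1], a, "[|]")    # phased genotype: left | right
      h1[i] = h1[i] a[1]
      h2[i] = h2[i] a[2]
    }
  }
  END {
    for (i = 10; i <= n; i++) {
      print name[i] "_1", h1[i]
      print name[i] "_2", h2[i]
    }
  }
' toy.vcf > toy.haplotypes.txt

cat toy.haplotypes.txt
```

Each row is one haplotype as a string of 0/1 alleles (0 = reference, 1 = alternate), which is a convenient starting point for clustering by population.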

