
I need to get haplotype data from HapMap or 1000 Genomes for CEU, MKK, TSI, CHB and JPT for ± 1 Mb around the LCT gene. I'm quite new to the area, and I don't actually know what "± 1 Mb" means. From which sites, and how, do I get the data? Do I need a program?
As far as I understand, the LCT gene is on chromosome 2, so I used the following command:
wget -r --reject="index.html*" http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr2.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz
Now I have a VCF file.
I also obtained a VCF file from Data Slicer ( http://grch37.ensembl.org/Homo_sapiens/Tools/DataSlicer ). I chose the populations and entered 2:135,787,840-135,837,200 as the region lookup. What does this file contain? How can I view its contents? What else do I have to do to get haplotypes? I will try to cluster and visualize the haplotypes according to population. I would be very grateful for any guidance.
Thanks!
To obtain population-specific VCF files, I first created a sample list for each population (CEU, MKK, …) using the samples panel file, which you can also download from the 1000 Genomes FTP site.
grep CEU integrated_call_samples_v3.20130502.ALL.panel | cut -f1 > CEU.samples.list
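The grep above handles one population. A minimal sketch for all of them at once follows, assuming the panel file is tab-separated with the sample ID in column 1 and the population code in column 2 (that is the layout of the phase 3 panel file, but check your copy). The function name make_sample_lists is made up for this example. It uses awk with an exact match on column 2 rather than grep, so a code cannot accidentally match elsewhere on the line.

```shell
# Write one sample list per population from a 1000 Genomes panel file.
# Usage: make_sample_lists PANEL POP [POP ...]
# (hypothetical helper; assumes column 1 = sample ID, column 2 = population)
make_sample_lists() {
    panel=$1
    shift
    for pop in "$@"; do
        # Keep only rows whose population column equals $pop exactly.
        awk -v pop="$pop" '$2 == pop { print $1 }' "$panel" > "$pop.samples.list"
        # Report the count so an empty list is easy to spot.
        n=$(awk 'END { print NR }' "$pop.samples.list")
        echo "$pop: $n samples"
    done
}

# Example call (not run here):
# make_sample_lists integrated_call_samples_v3.20130502.ALL.panel CEU MKK TSI CHB JPT
```

As far as I know, MKK is a HapMap 3 population and is not part of 1000 Genomes phase 3, so its list will probably come out empty from this panel file; that population would have to come from the HapMap data instead.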
Then I installed VCFtools (https://vcftools.github.io/index.html) and used the vcf-subset command.
I got the commands from this faq page: http://www.internationalgenome.org/faq/how-can-i-get-allele-frequency-my-variant/
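On viewing the file content: a VCF is plain text (usually gzip-compressed), so standard tools are enough to look inside. The sketch below skips the "##" meta lines and prints the "#CHROM" column header plus the first few variant records. The function name vcf_peek is made up for this example.

```shell
# Print the column header line and the first N variant records of a VCF.
# Usage: vcf_peek FILE [N]   (N defaults to 5)
# (hypothetical helper; handles plain and gzip/bgzip-compressed files)
vcf_peek() {
    file=$1
    n=${2:-5}
    case $file in
        *.gz) gzip -dc "$file" ;;   # bgzip output is gzip-compatible
        *)    cat "$file" ;;
    esac | grep -v '^##' | head -n $(( n + 1 ))   # +1 for the #CHROM line
}

# Example call (not run here):
# vcf_peek ALL.chr2.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz 3
```

Each record has one genotype column per sample. In the phase 3 files the genotypes are phased, written with a "|" separator such as 0|1, so the allele to the left of the bar belongs to one haplotype of that sample and the allele to the right belongs to the other. Reading those columns down the file therefore already gives two haplotypes per sample for the region.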
I did not compute haplotypes from the files, but I read in another post that you can do that with PLINK. Have you checked that?
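On the "± 1 Mb" in the question: it usually means extending the gene region by 1 megabase (1,000,000 base pairs) on each side, so the start coordinate is reduced by 1,000,000 and the end coordinate is increased by 1,000,000. A small sketch of that arithmetic, using the coordinates from the question as input; the function name pad_region is made up for this example.

```shell
# Expand a region by a flank on each side and print it as CHR:START-END.
# Usage: pad_region CHR START END [FLANK]   (FLANK defaults to 1000000 = 1 Mb)
# (hypothetical helper, plain integer arithmetic only)
pad_region() {
    chr=$1
    start=$2
    end=$3
    flank=${4:-1000000}
    new_start=$(( start - flank ))
    # Coordinates are 1-based, so do not go below 1.
    if [ "$new_start" -lt 1 ]; then
        new_start=1
    fi
    echo "$chr:$new_start-$(( end + flank ))"
}

pad_region 2 135787840 135837200
# prints 2:134787840-136837200
```

The resulting string can be entered in Data Slicer, or passed as the region argument to tabix (tabix -h FILE.vcf.gz REGION) if the VCF has a .tbi index next to it. Either way, check that the gene coordinates are on the same genome assembly as the VCF; the phase 3 release files are on GRCh37.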