Genetic Structure of the Han Chinese Population Revealed by Genome-wide SNP Variation

Bibliographic Details
Published in: American journal of human genetics, 2009-12, Vol.85 (6), p.775-785
Main Authors: Chen, Jieming, Zheng, Houfeng, Bei, Jin-Xin, Sun, Liangdan, Jia, Wei-hua, Li, Tao, Zhang, Furen, Seielstad, Mark, Zeng, Yi-Xin, Zhang, Xuejun, Liu, Jianjun
Format: Article
Language: English
container_end_page 785
container_issue 6
container_start_page 775
container_title American journal of human genetics
container_volume 85
creator Chen, Jieming
Zheng, Houfeng
Bei, Jin-Xin
Sun, Liangdan
Jia, Wei-hua
Li, Tao
Zhang, Furen
Seielstad, Mark
Zeng, Yi-Xin
Zhang, Xuejun
Liu, Jianjun
description Population stratification is a potential problem for genome-wide association studies (GWAS), confounding results and causing spurious associations. Hence, understanding how allele frequencies vary across geographic regions or among subpopulations is an important prelude to analyzing GWAS data. Using over 350,000 genome-wide autosomal SNPs in over 6000 Han Chinese samples from ten provinces of China, our study revealed a one-dimensional “north-south” population structure and a close correlation between geography and the genetic structure of the Han Chinese. The north-south population structure is consistent with the historical migration pattern of the Han Chinese population. Metropolitan cities in China were, however, more diffused “outliers,” probably because of the impact of modern migration of peoples. At a very local scale within the Guangdong province, we observed evidence of population structure among dialect groups, probably on account of endogamy within these dialects. Via simulation, we show that empirical levels of population structure observed across modern China can cause spurious associations in GWAS if not properly handled. In the Han Chinese, geographic matching is a good proxy for genetic matching, particularly in validation and candidate-gene studies in which population stratification cannot be directly accessed and accounted for because of the lack of genome-wide data, with the exception of the metropolitan cities, where geographical location is no longer a good indicator of ancestral origin. Our findings are important for designing GWAS in the Chinese population, an activity that is expected to intensify greatly in the near future.
doi_str_mv 10.1016/j.ajhg.2009.10.016
format article
eissn 1537-6605
pmid 19944401
coden AJHGAG
publisher Cambridge, MA: Elsevier Inc
rights 2009 The American Society of Human Genetics
date 2009-12-11
tpages 11
oa free_for_read
notes These authors contributed equally to this work
linktohtml https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2790583/
linktopdf https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2790583/pdf/
backlink https://www.ncbi.nlm.nih.gov/pubmed/19944401
fulltext fulltext
identifier ISSN: 0002-9297
ispartof American journal of human genetics, 2009-12, Vol.85 (6), p.775-785
issn 0002-9297
1537-6605
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_2790583
source BACON - Elsevier - GLOBAL_SCIENCEDIRECT-OPENACCESS; PubMed Central
subjects Algorithms
Asian Continental Ancestry Group
Biological and medical sciences
China
Computer Simulation
Ethnic Groups
Fundamental and applied biological sciences. Psychology
General aspects. Genetic counseling
Genetic Variation - genetics
Genetics of eukaryotes. Biological and molecular evolution
Genetics, Population
Genome
Genome-Wide Association Study
Genomics
Humans
Medical genetics
Medical sciences
Migration
Minority & ethnic groups
Models, Genetic
Molecular and cellular biology
Polymorphism, Single Nucleotide
Population genetics
Simulation
title Genetic Structure of the Han Chinese Population Revealed by Genome-wide SNP Variation
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-09-22T19%3A25%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Genetic%20Structure%20of%20the%20Han%20Chinese%20Population%20Revealed%20by%20Genome-wide%20SNP%20Variation&rft.jtitle=American%20journal%20of%20human%20genetics&rft.au=Chen,%20Jieming&rft.date=2009-12-11&rft.volume=85&rft.issue=6&rft.spage=775&rft.epage=785&rft.pages=775-785&rft.issn=0002-9297&rft.eissn=1537-6605&rft.coden=AJHGAG&rft_id=info:doi/10.1016/j.ajhg.2009.10.016&rft_dat=%3Cproquest_pubme%3E1922293731%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c608t-8ec799b057a7faf2e5729e09ce88cb1d62a3042c6a5bfb031e4f8988ae847f353%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=219590088&rft_id=info:pmid/19944401&rfr_iscdi=true