Evolution of Inbreeding Coefficients and Effective Size in the Population of Saguenay Lac-St.-Jean (Québec)
Mourali-Chebil, Soufia
Abstract
We computed mean inbreeding coefficients (F^sub IT^, F^sub IS^, and F^sub ST^) based on approximately 2,700 ascending pedigrees of contemporary people from Saguenay Lac-St.-Jean (Québec, Canada). This allowed us to assess the accumulated inbreeding and to follow the evolution of these coefficients since the founding of Québec. One of the results was the expected increase in F^sub ST^. Relying on this parameter, we computed the effective size (N^sub e^) of the contemporary population, obtaining a value around 1,000, in agreement with previous estimates. We observed a decrease in N^sub e^ through history despite the population’s growing size.
KEY WORDS: INBREEDING, EFFECTIVE SIZE, EVOLUTION, CULTURAL TRANSMISSION OF FITNESS, QUEBEC, SAGUENAY LAC-ST.-JEAN.
The Saguenay Lac-St.-Jean (SLSJ) population (200 km north of Québec City) numbers about 300,000. We can find in this population a high prevalence of specific genetic diseases (e.g., spastic ataxia Charlevoix-Saguenay type and tyrosinemia), whereas other diseases, well known in Europe, are extremely rare in this region (Bouchard and De Braekeleer 1991; Laberge 1969; Mathieu et al. 1990).
Our study is an attempt to understand the high prevalence of specific diseases by focusing on inbreeding and more specifically on the F statistics (Wright 1951, 1965). Indeed, with the F^sub ST^ parameter, for example, we can measure the temporal genetic divergence between one population at a given initial time and this same population some generations later.
The SLSJ population is rather well documented because its foundation is relatively recent. SLSJ was founded in the 19th century by immigrants drawn from the population of the Québec province, which itself descended essentially from a sample of the French population that emigrated in the 17th century (Boleda 1984; Charbonneau et al. 1987; Pouyez et al. 1983). The historic and demographic archives for the SLSJ population show exceptional quality and precision. These records led to the building, by the IREP (Interuniversity Institute for Population Research), of a computerized genealogy database: BALSAC-RETRO. BALSAC stands for the first regions included in the project (Bas, Saguenay Lac-Saint-Jean, and Charlevoix), and RETRO is a reference to the retrospective view of the demographic past of Saguenay Lac-St.-Jean. This vast database makes it possible to test some population-genetic theories. For instance, relying on approximately 2,700 ascending pedigrees supplied by the IREP from BALSAC-RETRO, we address the following questions:
1. The inbreeding coefficient resulting from drift (F^sub ST^) always increases in a closed panmictic population (Wright 1938, 1951). According to Jacquard (1971), because there is no sib mating in human populations, the global inbreeding coefficient F^sub IT^ is not expected to increase as much. Wang (1997) has shown that exclusion of sib mating usually (under random selection) decreases inbreeding. Knowing that the SLSJ population remained closed for some decades after the British conquest and knowing that cousins rarely married each other, because the population is mostly Catholic, what would be the behavior of F^sub IT^?
2. We also estimate the effective size (N^sub e^) by extracting it from F^sub IT^ and F^sub IS^ measured in the pedigrees for different periods (per generation). We expect an increase in N^sub e^ through history because of the population’s growth. So, is the evolution of our computed N^sub e^ similar to the one expected taking as N^sub e^ the harmonic mean (over the periods) of the population size?
To summarize, our purpose is to observe the behavior of our estimated parameters in the context of the intergeneration correlation of effective family size shown in this population (Austerlitz and Heyer 1998, 2000; Gagnon and Heyer 2001; Heyer et al. 2005) that explains the rapid increase in the frequencies of rare deleterious genes.
Materials and Methods
SLSJ Population Register File: BALSAC-RETRO. Our tools are provided by the IREP. This institute produced a computerized population file (BALSAC-RETRO) for SLSJ and Charlevoix. The population file (made up of linked civil status records) includes a central file (BALSAC) and social, cultural, demographic, geographic, and economic data. The BALSAC file does not include any medical or genetic data. A peripheral file, named RETRO, includes genealogical data, going back to the 17th century, on individuals from SLSJ and provides some genetic and medical information (Bouchard and De Braekeleer 1991). Since 1971, the IREP has collected and computerized more than 1.5 million birth, marriage, and death certificates from 1838 on. Because the population is mostly Catholic, these certificates were supplied by the parish registers.
We used some 2,700 pedigrees from the BALSAC-RETRO genealogy database, ascending from contemporary Saguenayans to the 17th-century founders. Our study includes 11 generations, going back from 1986. We defined a generation as a 30-year period because the difference between the dates of marriage of parents and their children, in our pedigrees, has a mean of 30 years. We took as generation 0 the period in which the greatest number of demographic founders (unrelated immigrants contributing to the later gene pool) arrived in the province of Québec; for the study population this period is 1630-1660.
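The 30-year generation scheme described above can be illustrated with a short sketch. It assumes that an event is dated by a single year (for example, a marriage year) and that the periods are half-open, so that 1660 opens generation 1; neither detail is specified in the text, and the function name is hypothetical.

```python
GENERATION_ZERO_START = 1630  # generation 0 is 1630-1660 (from the text)
GENERATION_LENGTH = 30        # years per generation (from the text)


def generation_index(year: int) -> int:
    """Return the generation number of a given year.

    Assumption: periods are half-open, [1630, 1660) is generation 0,
    [1660, 1690) is generation 1, and so on.
    """
    if year < GENERATION_ZERO_START:
        raise ValueError("year precedes generation 0")
    return (year - GENERATION_ZERO_START) // GENERATION_LENGTH


print(generation_index(1645))  # falls in generation 0
print(generation_index(1986))  # the contemporary probands' generation
```

Under these assumptions, 1986 falls in generation 11, counting from generation 0.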
Given that we had the real genealogical structure of the population sample at our disposal, we did not need to formulate any hypothesis about the demographic parameters (e.g., variance of the distribution of the family size or fertility).
Estimation of the Inbreeding Parameters. To estimate the F parameters, we used the gene-dropping method (Thomas 1990; Heyer 1999; O’Brien et al. 1994). Gene dropping is a simulation of the transmission of genes down the pedigrees. Relying on our BALSAC-RETRO database, we listed in input files (1) the founders and (2) information on the 2,700 pedigrees (ego, father, mother, date of marriage of parents, etc.). The computer program (written in C) assigns a particular pair of genes (or alleles) to each effective founder. Because these founders were chosen to be unrelated, each pair of assigned genes (A, a) is unique. To guarantee this uniqueness, with the founders being referenced in the database by numbers, we linked the references of a founder’s genes to his own reference as follows: n(a) = 2n(founder) and n(A) = 2n(founder) + 1.
A generator of pseudo-random numbers designates the alleles inherited by an individual (his ascendancy being available) from his mother and his father. The results of the simulations are kept in an output file with the number of the simulation and the individual’s reference number. Eventually, the program counts the number of appearances of a founder’s gene in the simulations. This provides the probability distribution of the conveyance of each gene arising from a founder. Each gene-dropping simulation is equivalent to a virtual locus. Each simulated allele is neutral.
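As an illustration of the procedure just described, here is a minimal Python sketch of one gene-dropping pass. It is not the authors' C program: the toy pedigree, the function names, and the choice to count alleles in a single target individual are all assumptions made for the example. Only the allele-labeling rule, n(a) = 2n(founder) and n(A) = 2n(founder) + 1, is taken from the text.

```python
import random
from collections import Counter

# Hypothetical toy pedigree: individual -> (father, mother); None for founders.
PEDIGREE = {
    1: None, 2: None,        # two unrelated founders
    3: (1, 2), 4: (1, 2),    # their two children
}
ORDER = [1, 2, 3, 4]         # parents are listed before their children


def founder_alleles(founder_id):
    """Label a founder's two genes as in the text: 2n and 2n + 1."""
    return (2 * founder_id, 2 * founder_id + 1)


def drop_genes(pedigree, order, rng):
    """Run one gene-dropping simulation (one virtual locus)."""
    genotype = {}
    for ind in order:
        parents = pedigree[ind]
        if parents is None:
            genotype[ind] = founder_alleles(ind)
        else:
            father, mother = parents
            # Each parent transmits one of its two alleles at random.
            genotype[ind] = (rng.choice(genotype[father]),
                             rng.choice(genotype[mother]))
    return genotype


def founder_gene_counts(pedigree, order, target, n_sims, seed=0):
    """Count how often each founder allele is carried by `target`."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_sims):
        counts.update(drop_genes(pedigree, order, rng)[target])
    return counts


counts = founder_gene_counts(PEDIGREE, ORDER, target=3, n_sims=10_000)
for allele, n in sorted(counts.items()):
    print(allele, n / 10_000)
```

For the child of two founders, each of the four founder alleles should appear in roughly half of the simulations, which is the Mendelian expectation.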
One of the advantages of the gene-dropping method relative to other methods of pedigree analysis (e.g., potential mates analysis to determine random components of inbreeding; Leslie 1985) is that no particular hypothesis is needed on the rates of fecundity, mortality, and migration, which need to be defined in the assumptions of stable or stationary populations. Similarly, we do not need the combination of pairs of individuals, the removing of incestuous unions, the distribution of age difference between mates, the distinction between discrete and overlapping generations, the sex ratio, and so on. Actually, our genealogical database contains all the founders’ real descending pedigrees. The real situation, as we said, without any hypothesis, is then available, and the gene dropping appears as the simplest method, taking advantage of the pedigrees by simulating the transmission of the genes.
Estimation of F^sub IT^. We used the gene-dropping method to estimate F^sub IT^, starting from N unrelated founders who provided then a stock of 2N different genes (or alleles). What we mean by “different” is not identical by descent. If at the end of a simulation an individual presents twice the same gene, it means that he is inbred.
We carried out 1 million simulations, at the end of which we computed for a given individual the number of simulations for which he is homozygous. This number divided by 1 million gives an estimator of the individual’s level of inbreeding. The mean over a group of individuals is an estimator of F^sub IT^ for this group.
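The estimator described above (the fraction of simulations in which an individual carries two copies of the same founder gene) can be sketched as follows. The pedigree is a hypothetical first-cousin marriage, and the number of simulations is reduced from the 1 million used in the study so that the example runs quickly; the function name is an assumption.

```python
import random

# Hypothetical toy pedigree: individual -> (father, mother); None for founders.
# Individuals 7 and 8 are first cousins; individual 9 is their child.
PEDIGREE = {
    1: None, 2: None, 3: None, 4: None,
    5: (1, 2), 6: (1, 2),
    7: (5, 3), 8: (4, 6),
    9: (7, 8),
}


def estimate_inbreeding(pedigree, n_sims, seed=0):
    """Estimate each individual's inbreeding coefficient by gene dropping."""
    rng = random.Random(seed)
    homozygous = dict.fromkeys(pedigree, 0)
    for _ in range(n_sims):
        genotype = {}
        # The dict lists parents before children, so one pass suffices.
        for ind, parents in pedigree.items():
            if parents is None:
                genotype[ind] = (2 * ind, 2 * ind + 1)  # unique founder genes
            else:
                father, mother = parents
                genotype[ind] = (rng.choice(genotype[father]),
                                 rng.choice(genotype[mother]))
            # Two copies of the same founder gene: identical by descent.
            if genotype[ind][0] == genotype[ind][1]:
                homozygous[ind] += 1
    return {ind: n / n_sims for ind, n in homozygous.items()}


f = estimate_inbreeding(PEDIGREE, n_sims=40_000)
mean_f = sum(f.values()) / len(f)                # mean over the group
n_inbred = sum(1 for v in f.values() if v > 0)   # homozygous at least once
print(f[9], mean_f, n_inbred)
```

The estimate for individual 9 should be close to 1/16, the theoretical inbreeding coefficient of a child of first cousins, and only that individual should be counted as inbred.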
Estimation of the Number of Inbred Individuals. The number of individuals who were homozygous at least once over all the simulations is the number of inbred individuals.
Selection of the Founders. To carry out the present study, we needed to designate the population founders. Having no idea about the initial level of inbreeding, we had to consider it as null and evaluated the subsequent “in addition inbreeding” relative to this initial level. Ideally, we should select, as founders, the first immigrants to Québec; but some of these founders arrived with their families. We then selected unrelated founders (Mourali 2000) to facilitate the computation of inbreeding coefficients. We assigned to them a stock of different alleles with the gene-dropping program and then generated the different genotypes of their descendants in the pedigrees, given that the program simulates the transmission of the genes from the initial collection.
We called this group of unrelated individuals whose genes have participated in the population gene pool the effective founders. In other words, they are the latest unrelated individuals who gave birth to the population. These effective founders may have immigrated to Québec or not. Thus, in the category of the effective founders, we included the demographic founders or the genealogical ones. Here, we ought to make a distinction between these two nonexclusive notions.
First, we have demographic founders, preferentially immigrants to Québec, who have descendants in the database and who are unrelated to other founders. Second, we have genealogical founders, individuals with unknown ancestry who have descendants in the database but who did not necessarily immigrate to Québec. Indeed, a genealogical founder can be included in the database if he is mentioned on the marriage certificate of one of his children, who can be either an immigrant or an immigrant’s ancestor. So a genealogical founder can be a demographic founder or the ancestor of an immigrant.
We rejected the related immigrants to keep their most recent common unrelated ancestors.
For example: Consider the case of a man who has immigrated to Québec and produced his offspring there. If he has brothers, sisters, uncles, aunts, or first-degree cousins in Québec, we do not consider this man a demographic founder because he is related to other founders; we thus reject him and take his father and his mother instead, provided that they have no brothers, sisters, or first-degree cousins in the database. Otherwise, we take the man’s grandparents as demographic founders if they satisfy these same conditions. If not, the algorithm continues until we reach genealogical founders, who are then taken as demographic founders in place of the man considered. If, however, the man and all his ascendants are only children, we take him (and not them) as a demographic founder.
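The selection rule in this example can be sketched as a short recursive function. This is a simplified reading of the text, not the authors' implementation: the function name, the `parents` mapping, and the caller-supplied predicate `has_close_relatives` are all hypothetical, and the test for relatedness itself (siblings, uncles, aunts, first-degree cousins in the database) is left to that predicate.

```python
def select_effective_founders(person, parents, has_close_relatives):
    """Return the set of effective founders that replaces `person`.

    parents: dict mapping an individual to (father, mother); a missing
        entry or a None parent means unknown ancestry.
    has_close_relatives: hypothetical predicate, True when the individual
        has siblings, uncles, aunts, or first-degree cousins in the database.
    """
    father, mother = parents.get(person, (None, None))
    # Unknown or half-known ancestry: stop here (genealogical founder
    # or semifounder).
    if father is None or mother is None:
        return {person}
    # No close relatives in the database: keep the individual himself.
    if not has_close_relatives(person):
        return {person}
    # Otherwise reject him and apply the same rule to both parents.
    return (select_effective_founders(father, parents, has_close_relatives)
            | select_effective_founders(mother, parents, has_close_relatives))


# Two immigrant brothers whose parents have unknown ancestry.
parents = {"immigrant": ("father", "mother"), "brother": ("father", "mother")}
related = {"immigrant", "brother"}
founders = select_effective_founders("immigrant", parents, related.__contains__)
print(sorted(founders))
```

In this toy case the two brothers are rejected and their parents, whose ancestry is unknown, are retained as founders.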
We also selected the semifounders: individuals who have one parent (mother or father) absent from the database. We included semifounders within the genealogical founders; otherwise we could not attribute a genotype to them because of the absence of their missing parent’s genes. We also selected the founders’ ancestors: individuals who appear in the ancestry (when available) of a demographic founder. However, we excluded the founders’ ancestors from our file. Finally, we included only the “effective founders” (see Table 1) and their offspring in the file that we used in the gene-dropping simulations.
Results and Discussion
We have measured the global mean inbreeding coefficient (F^sub IT^) in the SLSJ population for each generation since the foundation of Nouvelle France (the Québec province), that is, since the 17th century. This coefficient increases until the 20th century, then begins to decline. To understand this evolution (Figure 1), we should analyze the two components of F^sub IT^, namely F^sub IS^ and F^sub ST^, separately.
Evolution of F^sub IS^. Similar to F^sub IT^, the nonrandom mating coefficient (F^sub IS^) increases, reaches a maximum, then starts to decrease (Figure 1). Before the third generation, F^sub IS^ is negative. Indeed, at that time the population essentially consists of unrelated founders. Kinship between their descendants needs time to be created. Besides, as in every human population, there is avoidance of close unions (degree 1:1 or 2:1) and of even less close unions (2:2, 2:3, 3:3, etc.) because the population is mostly Catholic.
From the third generation on, F^sub IS^ becomes positive. The increase in F^sub IS^ over time occurs because the founders’ descendants start exchanging spouses, and in this way their own descendants become related. Then, the proportion of consanguineous unions increases with the increase in the proportion of inbred candidates to marriage. This leads to the increase in F^sub IS^.
The relative importance of the F^sub IS^ values during the sixth generation and the seventh generation may reflect what we called the regionalization of the Québec population (Figure 2), that is, a “regional structure” present before the various internal migrations that gave birth to the SLSJ population. Indeed, the SLSJ population was founded in the 19th century by immigrants coming from the whole Québec province. Between that time and the first European colonization of Québec (in other words, during the sixth and seventh generations), these descendants of the early founders were living in diverse regions mostly far away from each other. These regions were essentially endogamous subpopulations. This population structure generated a strong deviation from random mating and thus a high F^sub IS^.
The F^sub IS^ coefficient represented, then, for those generations an overall combination of parameters causing deviation from total panmixia. In this case the parameters are structural random inbreeding between subpopulations (F’^sub ST^) and nonrandom inbreeding within each subpopulation (F’^sub IS^).
This hypothesis could perhaps be confirmed by a further analysis, at the subdivision level, of the values of random and nonrandom inbreeding. This was done, for instance, by Jorde and Morgan (1987) in their study of surnames of Utah Mormons. In their study the processes acting at the within-subdivision level (nonuniformity of geographic distribution of surnames and avoidance of consanguinity) are not apparent at the total population level. In our case we expect the values of inbreeding, between and within subdivisions, to have additive effects that inflate the magnitude of F^sub IS^ at the total population level.
Beginning with the eighth generation, the F^sub IS^ level stabilizes (under the effect of the internal migrations), then initiates a diminution induced by the opening of the matrimonial market. However, despite their variations through history, the F^sub IS^ values remain close enough to 0 so that we can consider mating as occurring at random in this population. This may also mean that there is a global compensation of the avoidance of close unions (degrees 2:2, 2:3, etc.) by remote consanguineous unions, because most of the population is Catholic; otherwise the F^sub IS^ values would have been negative.
Evolution of F^sub ST^. The inbreeding coefficient resulting from drift (F^sub ST^), deduced from the measures of F^sub IT^ and F^sub IS^, has increased continually since the founding of Québec Province (Figure 1). This increase is expected for a closed population, given that F^sub ST^ decreases rapidly when the migration rate increases. Since the English conquest, at the end of the 18th century, Québec remained closed to immigration from the point of view of genes. Indeed, until the middle of the 19th century (when migrations occurred during the Industrial Revolution), Québec Province received fewer French speakers, while the more numerous English-speaking arrivals did not exchange spouses with the French speakers (given the linguistic and religious obstacles). The increase in F^sub ST^ amounts to a deviation from the initial allele frequencies and reflects the allelic homogenization exerted by drift on the gene pool of this population (closed for several decades).
The rapid increase in F^sub ST^ in the last generation is probably due to the bias induced by the selection of the probands used to build the pedigrees. Indeed, many of them have a recessive genetic disease, which leads to a higher F^sub ST^.
Evolution of F^sub IT^. We are now able to interpret the evolution of the global inbreeding coefficient F^sub IT^ (Figure 1): Its increase is brought about by that of the nonrandom mating coefficient (F^sub IS^) as well as by the increase in inbreeding as a result of drift (F^sub ST^). Its decline, in the 20th century, corresponds to the decrease in F^sub IS^: It is the “explosion” of the isolate, in which many migrations occur, a frequent phenomenon in the West in the same time period.
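The deduction of F^sub ST^ from F^sub IT^ and F^sub IS^ presumably rests on Wright’s classical relation (1 - F^sub IT^) = (1 - F^sub IS^)(1 - F^sub ST^); the exact computation used by the authors is not given in this section, so the sketch below is an assumption. The function name and the numerical values are illustrative only and are not results from the study.

```python
def f_st_from(f_it: float, f_is: float) -> float:
    """Solve Wright's relation (1 - F_IT) = (1 - F_IS)(1 - F_ST) for F_ST."""
    if f_is >= 1.0:
        raise ValueError("F_IS must be below 1")
    return (f_it - f_is) / (1.0 - f_is)


# Illustrative values only (not taken from the study).
f_it, f_is = 0.010, 0.002
f_st = f_st_from(f_it, f_is)
print(f_st)

# A negative F_IS (avoidance of close unions) gives F_ST above F_IT.
print(f_st_from(0.001, -0.001))
```

The second call shows why a slightly negative F^sub IS^, as reported for the earliest generations, is compatible with a positive drift component.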
Evolution of Inbred Individuals. The evolution of inbred individuals’ inbreeding is shown in Figure 3. We observe a marked decrease in their global inbreeding coefficient (F^sub IT^) between 1720 and 1986. F^sub IT^ decreased from one generation to the next and stabilized in the 20th century. Indeed, at first, one cannot be inbred unless one is the offspring of a marriage between close relatives (such as first-degree cousins). Later, the relationships between individuals become less and less close: after several generations there is a smaller proportion of closely related individuals, whereas the proportion of remotely related individuals increases. The SLSJ population, as we have already said, was closed for a long period. Moreover, most Saguenayans are descended from the earliest French-speaking founders (Heyer and Tremblay 1995), who were not numerous, about 2,600 people (Table 1). After several generations, then, kinship was necessarily created between individuals. This kinship, however, is not as close as, for example, that between first-degree cousins. Indeed, with most of the population being Catholic, marriages between cousins needed to be authorized by the Church.
The level of total inbreeding reaches, in the last generations, the level of inbred individuals’ inbreeding (Figure 4). This means that, in the end, all of the population is inbred. A similar result was obtained by Jorde (1989) after having divided inbred subjects into different levels of inbreeding coefficients: The frequency of individuals with low inbreeding coefficients increases through time, whereas the frequency of those with high inbreeding coefficients decreases.
The discrepancy between the two evolutions of the N^sub e^ estimates is clearly due to the different contexts hypothesized. Indeed, for the harmonic mean a geometric growth of the population is assumed, whereas in the gene-dropping analysis we made no such hypothesis; the computation of N^sub e^ was performed on an empirical situation in which the pedigrees contained all the needed information. Thus the decrease in N^sub e^ in this case is the real trend.
This unexpected result may be characteristic of a situation rarely considered in theory: the intergeneration correlation of effective family size (Nei and Murata 1966) that has been observed in SLSJ (Austerlitz and Heyer 1998). Furthermore, our N^sub e^ value (1,073) in the last generation agrees with the value (≈1,000) estimated by Austerlitz and Heyer (1998) when they incorporated into their computation of N^sub e^ a “cultural” correlation across generations in the sibship size of each simulated family.
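The two routes to N^sub e^ contrasted above can be illustrated with a minimal sketch. The harmonic mean of per-generation sizes is the standard demographic expectation; for the drift-based route, the sketch assumes the textbook recurrence F' = F + (1 - F)/(2N^sub e^), which may differ from the formula the authors actually applied. The function names and the census sizes are hypothetical.

```python
def harmonic_mean_size(sizes):
    """Harmonic mean of per-generation population sizes."""
    return len(sizes) / sum(1.0 / n for n in sizes)


def ne_from_drift(f_st_prev: float, f_st_next: float) -> float:
    """Effective size implied by one generation's rise in F_ST.

    Assumes the idealized recurrence F' = F + (1 - F) / (2 * Ne).
    """
    delta = (f_st_next - f_st_prev) / (1.0 - f_st_prev)
    if delta <= 0:
        raise ValueError("F_ST must increase to define a finite Ne")
    return 1.0 / (2.0 * delta)


# Hypothetical sizes under geometric growth (doubling each generation).
sizes = [1000 * 2 ** t for t in range(5)]
print(harmonic_mean_size(sizes))

# Hypothetical F_ST values for two successive generations.
print(ne_from_drift(0.0, 0.0005))
```

The harmonic mean stays well below the final census size because it is dominated by the smallest generations, but it still rises as the population grows; a drift-based estimate that falls over time, as reported here, therefore signals a departure from the idealized model.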
Concluding Remarks
We conducted the present study to answer population genetics questions (such as observing the behavior of inbreeding over the generations), taking advantage of the huge genealogy database BALSAC-RETRO.
Our work is an original application that allowed us to extract the effective size (N^sub e^) from the pedigrees for different periods. Indeed, as human population geneticists, we are interested in the reconstitution of the history of human settlement, for which N^sub e^ is a good indicator.
The evolution (an increase followed by a decrease) of the mean global inbreeding (F^sub IT^) is explained by the evolution of nonrandom mating (F^sub IS^) as well as by the evolution of inbreeding resulting from drift (F^sub ST^). As expected, F^sub IS^ decreased in the last generations because of the opening matrimonial field, and F^sub ST^ increased when the population was closed to immigration.
The unexpected result was the decrease in N^sub e^ in this growing population. This is linked to the intergeneration correlation of effective family size induced by the “cultural transmission of fitness”: a nongenetic heredity of reproductive success (Heyer et al. 2005). This intergeneration correlation, together with a great variance in reproductive success (number of offspring per individual), strongly reduces effective size, thereby increasing the effect of genetic drift. Two antagonistic effects are thus at work: the correlation and the variance on the one hand, and the population growth, which helps to maintain a certain level of polymorphism, on the other. In SLSJ the phenomenon led, alas, to an increase in the prevalence of rare genetic diseases.
It would be interesting to look for the same phenomenon in other populations and to see whether the decrease in N^sub e^ has a minimal threshold.
We have to be aware of the fact that all these results depend on the reliability of the sources used in the genealogical reconstruction, where some (thankfully very few; Heyer et al. 1997; Jobling et al. 1999) links may be missing.
Acknowledgments We are indebted to the previous and present members of the Interuniversity Institute for Population Research (IREP), in particular, Michèle Jomphe, France Néron, Gérard Bouchard, and Julie Arsenault.
Received 7 June 2004; revision received 5 June 2006.
Literature Cited
Austerlitz, F., and E. Heyer. 1998. Social transmission of reproductive behavior increases frequency of inherited disorders in a young expanding population. Proc. Natl. Acad. Sci. USA 95:15,140-15,144.
Austerlitz, F., and E. Heyer. 2000. Allelic association is increased by correlation of effective family size. Eur. J. Hum. Genet. 8(12):980-985.
Boleda, M. 1984. Les migrations au Canada sous le régime français (1608-1760). Cah. Quebec. Demogr. 13(1):23-40.
Bouchard, G., and M. De Braekeleer. 1991. Histoire d’un génôme: Population et génétique dans l’est du Québec. Sillery, Canada: Presses de l’Université du Québec.
Charbonneau, H., B. Desjardins, A. Guillemette et al. 1987. Naissance d’une population: Les Français établis au Canada au XVIIe siècle. Cahier 118. Paris: INED, and Montreal: Presses de l’Université de Montréal.
Crow, J. F., and M. Kimura. 1970. An Introduction to Population Genetics Theory. New York: Harper International Editions.
Gagnon, A., and E. Heyer. 2001. Intergenerational correlation of effective family size in early Quebec (Canada). Am. J. Hum. Biol. 13(5):645-659.
Hartl, D. L. 1988. A Primer of Population Genetics, 2nd ed. Sunderland, MA: Sinauer Associates.
Heyer, E. 1999. One founder/one gene hypothesis in a new expanding population: Saguenay (Quebec, Canada). Hum. Biol. 71:99-109.
Heyer, E., J. Puymirat, P. Dieltjes et al. 1997. Estimating Y-chromosome-specific microsatellite mutation frequencies using deep rooting process. Hum. Mol. Genet. 6(5):799-803.
Heyer, E., A. Sibert, and F. Austerlitz. 2005. Cultural transmission of fitness: Genes take the fast lane. Tr. Genet. 21(4):234-239.
Heyer, E., and M. Tremblay. 1995. Variability of the genetic contribution of Quebec population founders associated to some deleterious genes. Am. J. Hum. Genet. 56:970-978.
Jacquard, A. 1971. Exclusion of sib-mating and genetic drift. Theor. Popul. Biol. 2:91-99.
Jobling, M. A., E. Heyer, P. Dieltjes et al. 1999. Y-chromosome-specific microsatellite mutation rates re-examined using a minisatellite, MSY1. Hum. Mol. Genet. 8(11):2117-2120.
Jorde, L. B. 1989. Inbreeding in the Utah Mormons: An evaluation of estimates based on pedigrees, isonymy, and migration matrices. Ann. Hum. Genet. 53:339-355.
Jorde, L. B., and K. Morgan. 1987. Genetic structure of the Utah Mormons: Isonymy analysis. Am. J. Phys. Anthropol. 72:403-412.
Laberge, C. 1969. Hereditary tyrosinemia in a French Canadian isolate. Am. J. Hum. Genet. 21(1):36-45.
Leslie, P. W. 1985. Potential mates analysis and the study of human population structure. Yrbk. Phys. Anthropol. 28:53-78.
Malécot, G. 1966. Probabilités et hérédité. Travaux et documents no. 47. Paris: INED.
Mathieu, J., M. De Braekeleer, and C. Prévost. 1990. Genealogical reconstruction of myotonic dystrophy in Saguenay-Lac-Saint-Jean (Quebec, Canada). Neurology 40:839-842.
Mourali, S. 2000. Consanguinité et distance génétique: Mesure du déficit en hétérozygotes dans deux contextes différents. Doctoral thesis, Ecole Doctorale du Muséum National d’Histoire Naturelle (MNHN), Paris.
Nei, M., and M. Murata. 1966. Effective population size when fertility is inherited. Genet. Res. 8(2):257-260.
O’Brien, E., R. A. Kerber, L. B. Jorde et al. 1994. Founder effect: Assessment of variation in genetic contributions among founders. Hum. Biol. 66(2):185-204.
Pouyez, C., Y. Lavoie, G. Bouchard et al. 1983. Les Saguenayens. Sillery, Canada: Presses de l’Université du Québec.
Thomas, A. 1990. Comparison of an exact and a simulation method for calculating gene extinction probabilities in pedigrees. Zoo Biol. 9(4):259-274.
Wang, J. 1997. Effect of excluding sib matings on inbreeding coefficient and effective size of finite diploid populations. Biometrics 53:1354-1365.
Wright, S. 1938. Size of population and breeding structure in relation to evolution. Science 87:430-431.
Wright, S. 1951. The genetical structure of populations. Ann. Eugen. 15:323-354.
Wright, S. 1965. The interpretation of population structure by F statistics with special regard to systems of mating. Evolution 19:395-420.
SOUFIA MOURALI-CHEBIL1 AND EVELYNE HEYER1
1 Unité Eco-Anthropologie MNHN/CNRS/P7 UMR5145, Musée de l’Homme, Paris, France.
Human Biology, August 2006, v. 78, no. 4, pp. 495-508.
Copyright © 2006 Wayne State University Press, Detroit, Michigan 48201-1309