Takahiro Sakamoto, Hideki Innan, Muller’s ratchet of the Y chromosome with gene conversion, Genetics, Volume 220, Issue 1, January 2022, iyab204, https://doi.org/10.1093/genetics/iyab204
Abstract
Muller’s ratchet is a process in which deleterious mutations are fixed irreversibly in the absence of recombination. The degeneration of the Y chromosome, and the gradual loss of its genes, can be explained by Muller’s ratchet. However, most theories consider single-copy genes, and may not be applicable to Y chromosomes, which have a number of duplicated genes in many species, which are probably undergoing concerted evolution by gene conversion. We developed a model of Muller’s ratchet to explore the evolution of the Y chromosome. The model assumes a nonrecombining chromosome with both single-copy and duplicated genes. We used analytical and simulation approaches to obtain the rate of gene loss in this model, with special attention to the role of gene conversion. Homogenization by gene conversion makes both duplicated copies either mutated or intact. The former promotes the ratchet, and the latter retards, and we ask which of these counteracting forces dominates under which conditions. We found that the effect of gene conversion is complex, and depends upon the fitness effect of gene duplication. When duplication has no effect on fitness, gene conversion accelerates the ratchet of both single-copy and duplicated genes. If duplication has an additive fitness effect, the ratchet of single-copy genes is accelerated by gene duplication, regardless of the gene conversion rate, whereas gene conversion slows the degeneration of duplicated genes. Our results suggest that the evolution of the Y chromosome involves several parameters, including the fitness effect of gene duplication by increasing dosage and gene conversion rate.
Introduction
Nonrecombining chromosomes such as the Y chromosome often degenerate rapidly because deleterious mutations accumulate irreversibly. This process, called Muller’s ratchet (Muller 1964), has been investigated extensively in many theoretical studies (Haigh 1978; Stephan et al. 1993; Gessler 1995; Charlesworth and Charlesworth 1997; Gordo and Charlesworth 2000a, 2000b; Jain 2008; Rouzine et al. 2008; Waxman and Loewe 2010; Neher and Shraiman 2012; Goyal et al. 2012; Metzger and Eule 2013). Although these studies used models of single-copy genes, recent sequencing of the Y chromosome has revealed that many genes have acquired multiple copies through gene duplication. These theoretical results therefore cannot be directly applied to the Y chromosome, because the evolution of duplicated genes is not as simple as that of single-copy genes. Duplicated genes are likely to undergo concerted evolution, during which the duplicated copies coevolve by exchanging their DNA sequences with each other by gene conversion (Ohta 1983; Arnheim 1983). The major effect of gene conversion during concerted evolution is that, under neutrality, the level of divergence between the duplicated copies is kept low, while the level of polymorphism within each copy is increased (Innan 2002, 2003; Teshima and Innan 2004). It has been theoretically demonstrated that the effect of selection is enhanced in duplicated genes; deleterious mutations are more efficiently removed, and beneficial mutations are more likely to become fixed in both of the duplicated copies (Mano and Innan 2008). However, the way in which gene conversion affects the degeneration of the Y chromosome, in which single-copy genes and duplicated genes coexist with complete linkage, is not fully understood.
The aim of this study was to theoretically understand the degeneration process of duplicated genes with special interest in the Y chromosome. Y chromosomes have a unique evolutionary history. They usually evolve from an autosome, on which a sex-determining locus arises. After recombination with the X chromosome is suppressed, the Y chromosome gradually loses functional genes by the accumulation of deleterious mutations (Charlesworth and Charlesworth 2000; Bachtrog 2013). During this process, many genes undergo gene duplication (Skaletsky et al. 2003; Hughes et al. 2010, 2012, 2020; Soh et al. 2014). Duplicated gene copies are either in large palindrome structures, as found in primates (Skaletsky et al. 2003; Hughes et al. 2010, 2012), or arranged in tandem, as found in mouse (Soh et al. 2014), bull (Hughes et al. 2020), and fruit fly (Bachtrog et al. 2019). Since homologous copies often show high sequence identity, frequent gene conversion should have occurred between the copies (Rozen et al. 2003; Hallast et al. 2013; Skov et al. 2017). These findings indicate that the degeneration of the Y chromosome involves both single-copy and duplicated genes. In this work, we develop a theory to address the following questions: (1) After gene duplication, will the degeneration rate become faster or slower? and (2) How does gene duplication affect the rate of degeneration of linked single-copy genes?
To investigate these questions, we used a model of Muller’s ratchet. In a broad sense, Muller’s ratchet is a process by which deleterious mutations are fixed irreversibly in the absence of recombination (Muller 1964). In theoretical reports, it is commonly assumed that all mutations have the same effect on fitness [but see Söderberg and Berg (2007)]. Under these conditions, the fitness of an individual depends only upon how many deleterious mutations it has; individuals in the population can therefore be classified based on the number of deleterious mutations (d), as illustrated in Figure 2A, in which all haploid individuals have four functional genes, represented by different colored boxes. The class d = 0 consists of individuals with no deleterious mutations, which have the highest fitness in the population (i.e., the least-loaded class in this situation). The second class is that of individuals with one deleterious mutation (d = 1), and those who have two deleterious mutations belong to the class d = 2. Because we assume that all mutations are irreversible (i.e., back mutation is ignored), the class of an individual can shift down (e.g., from d = 0 to d = 1) as it accumulates mutations. Under these assumptions, the population can be structured according to d, and Muller’s ratchet proceeds as these classes turn over and over. In practice, if the least-loaded class (d = 0 in Figure 2A) goes extinct, the ratchet clicks and the class of d = 1 becomes the least-loaded class.
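The classical ratchet described above can be sketched as a small simulation. This is only an illustration under assumptions that are not taken from the article: fitness is multiplicative, (1 - s)^d, each offspring gains at most one mutation per generation with probability U (a hypothetical genome-wide rate), and the function name and default values are arbitrary.

```python
import random

def simulate_ratchet(N=200, U=0.1, s=0.02, generations=500, seed=1):
    """Haploid Wright-Fisher sketch: each individual is just its mutation count d."""
    rng = random.Random(seed)
    pop = [0] * N                      # everyone starts in class d = 0
    least_loaded = 0
    clicks = 0
    for _ in range(generations):
        # Selection: parents are drawn in proportion to (1 - s) ** d (assumed form).
        weights = [(1.0 - s) ** d for d in pop]
        pop = rng.choices(pop, weights=weights, k=N)
        # Mutation: with probability U an offspring gains one irreversible mutation.
        pop = [d + 1 if rng.random() < U else d for d in pop]
        # A click is the loss of the current least-loaded class.
        new_min = min(pop)
        clicks += new_min - least_loaded
        least_loaded = new_min
    return clicks, least_loaded
```

Because mutations are irreversible in this sketch, the least-loaded class can only move toward larger d, so the number of clicks always equals the final least-loaded class.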
We here extend the model of Muller’s ratchet to the case of duplicated genes. Figure 2B illustrates an example in which all haploid individuals have four genes which have been duplicated. Again, the population can be structured based on the number of deleterious mutations, and the process of Muller’s ratchet proceeds along the turnover of the classes. The major difference is that a new deleterious mutation which has occurred in the duplicated genes is not “irreversible” because gene conversion could remove it. If the original part of the intact copy without the corresponding mutation is transferred, the mutated copy will lose the mutation. However, if gene conversion occurs in the opposite direction, from the mutated copy to the intact version, the mutation becomes irreversible in this individual, because gene conversion cannot remove it anymore, under the assumption of no back mutation. We refer to the former and latter types of mutations as reversible and irreversible mutations, respectively (represented by yellow and red circles in Figure 2B). Thus, when a new mutation arises in one copy, its fate is not determined, and we can consider that the mutation can contribute to an irreversible click of Muller’s ratchet when the mutation is shared in both copies. In this model, we can simplify the process of Muller’s ratchet if we assume that all mutations are recessive and have the same fitness effect. Individuals can therefore be classified according to the number of irreversible mutations they carry. Note that reversible mutations have no fitness effect. Intuitively, gene conversion should have two opposite effects on the degeneration process (Graves 2004). If gene conversion mutates both copies, producing an irreversible mutation, the degeneration process is accelerated. If, however, gene conversion removes the mutation and restores both copies to the original form, the speed of degeneration is slowed.
We used this model to explore the way in which gene conversion affects the degeneration of duplicated genes. We were particularly interested in the interactions between the two counteracting effects of gene conversion. We provide analytical expressions of the speed of Muller’s ratchet under this simplified model with a constant selection coefficient. We also consider the effect of variable selection coefficients and the degree of dominance, using mathematical analysis and simulations.
Several studies have investigated the effect of gene conversion on the degeneration of Y chromosomes (Connallon and Clark 2010; Marais et al. 2010), but their focuses have been different from those of the present study. Connallon and Clark (2010) investigated the role of gene conversion on the conservation of duplicated pairs. Because their model assumes that deleterious mutations have lethal effects, a mutation cannot be shared by both duplicated copies, because this situation would be lethal. Therefore, there is no gene loss through the accumulation of deleterious mutations. Marais et al. (2010) investigated the evolution of the human Y chromosome using simulations, in which gene conversion between duplicated genes was taken into account. Their focus was on the process of fixation of a modifier of the gene conversion rate, rather than the long-term degeneration process.
Model
General model
We used a discrete-generation Wright–Fisher model of a haploid population with size N. Each chromosome consists of L1 single-copy genes and L2 pairs of duplicated genes (Figure 1A). We assume no inter-chromosomal recombination, or crossing-over, so all genes on the same chromosome are completely linked. A chromosome therefore behaves as a single haploid individual. We are interested in the way in which the genes lose their functions through Muller’s ratchet, resulting in a reduction in the number of functional genes on the chromosome. To this end, we applied a simple loss-of-function model to each gene and gene pair. For the ith single-copy gene, we assume the fitness effect of losing the gene function to be si. We only consider loss-of-function mutations, so that one mutation is sufficient to make a gene a pseudogene, with loss of the gene function. Throughout this article, we say a gene is “lost” when it loses the function. The rate of loss-of-function mutation is u per copy per generation, and no back mutation is allowed. Every mutation therefore results in an irreversible loss of the gene (see the right state with a red circle in Figure 1B).
For a pair of duplicated genes (Figure 1C), we set the fitness as follows. To be comparable with the case of single-copy genes, the fitness of the state with one functional copy is 1 (middle in Figure 1C). The other copy is inactivated by a loss-of-function mutation, which is shown by a yellow circle in Figure 1C because it is not irreversible, but “reversible”. A reversible mutation can disappear when the intact sequence is transferred from the other functional copy by gene conversion. If gene conversion occurs in the opposite direction, the loss-of-function mutation is transferred to the functional copy, resulting in a state in which the loss-of-function mutation is shared by both copies (the right state with red circles in both copies in Figure 1C). In this state, the gene has completely lost its function: the mutation has become irreversible, since gene conversion can no longer rescue the gene function, and the fitness is given by 1 − t2,i for the i-th pair of duplicated genes. The fitness of the state with two intact copies (left in Figure 1C) is given by 1 + t1,i. Having two copies therefore confers a selective advantage of t1,i. It is well documented that Y chromosomes have a number of duplicated genes, some of which seem to provide an advantage by increasing the dosage of the gene product (Hughes et al. 2010, 2012, 2020; Soh et al. 2014). Under this model, if we assume t1,i = 0, the functional allele is in complete dominance, whereas if t1,i = t2,i is assumed, the dominance effect is additive. Loss-of-function mutations arise at a rate of u per copy per generation, and no back mutation is allowed. Gene conversion occurs at a rate of c per copy per generation in both directions between the duplicated copies. We assume that a gene conversion event transfers the entire genic region (represented by a single box in Figure 1C).
It should be noted that two independent mutations can cause a loss of the gene function, as illustrated in the lower case of the right part of Figure 1C. This situation is more complex, because the gene function is lost but the two mutations are still reversible. Nevertheless, we treat this situation as if the gene function is irreversibly lost, which is true in our model, in which a gene conversion event transfers the entire genic region.
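The fate of a single duplicated pair under mutation and gene conversion can be sketched as a three-state process. The state labels, the function name, and the restriction to at most one event per generation (reasonable only when u and c are small) are assumptions made for this illustration. It also reads "rate c per copy in both directions" as each copy being overwritten by the other with probability c, which is one possible interpretation of the model text.

```python
import random

INTACT, REVERSIBLE, LOST = 0, 1, 2  # hypothetical labels: number of inactivated copies

def next_pair_state(state, u, c, rng):
    """Advance one duplicated pair by one generation (at most one event, assumed)."""
    r = rng.random()
    if state == INTACT:
        # Either of the two copies can receive a loss-of-function mutation.
        return REVERSIBLE if r < 2.0 * u else INTACT
    if state == REVERSIBLE:
        if r < c:
            return INTACT      # intact copy overwrites the mutated one
        if r < 2.0 * c:
            return LOST        # mutated copy overwrites the intact one
        if r < 2.0 * c + u:
            return LOST        # independent mutation hits the remaining copy
        return REVERSIBLE
    return LOST                # no back mutation, so loss is absorbing
```

Treating two independent mutations as an irreversible loss follows the simplification stated in the text, since a conversion event transfers the entire genic region and cannot restore an intact copy.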
Based on the fitness of all individuals in the current population, the next generation is generated following the Wright–Fisher model.
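A minimal sketch of this resampling step follows. The exact fitness function of the article, its Equation (1), is not reproduced in this text, so the sketch assumes that fitness is multiplicative across loci, that a lost single-copy gene contributes a factor 1 - s, and that a duplicated pair contributes 1 + t1, 1, or 1 - t2 according to whether it has zero, one, or two inactivated copies. The function names and the chromosome encoding (a tuple of lost flags plus a tuple of pair states) are hypothetical.

```python
import random

def chromosome_fitness(single_lost, pair_states, s, t1, t2):
    """Multiplicative fitness of one chromosome (assumed form, see lead-in)."""
    w = 1.0
    for lost in single_lost:
        if lost:
            w *= 1.0 - s
    pair_w = (1.0 + t1, 1.0, 1.0 - t2)   # indexed by number of inactivated copies
    for state in pair_states:
        w *= pair_w[state]
    return w

def wright_fisher_step(pop, s, t1, t2, rng):
    """Draw the next generation: parents are sampled in proportion to fitness."""
    weights = [chromosome_fitness(sl, ps, s, t1, t2) for sl, ps in pop]
    return rng.choices(pop, weights=weights, k=len(pop))
```

Sampling with replacement keeps the population size fixed at N, which is the defining feature of the Wright–Fisher scheme; mutation and gene conversion would be applied to the sampled chromosomes in a separate step.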
Simplification for mathematical analyses
Since the general model described above is too complicated for mathematical analysis, we make the following two simplifying assumptions. First, we assume that all genes have the same fitness effect. Therefore, si = s, for all i. Second, we assume that the fitness effect of losing the function is the same for a single-copy gene and a pair of duplicated genes (s = t2). Under these assumptions, we derive the rate of gene loss in two special cases: one in which the functional copy is in complete dominance (t1 = 0), and one in which it is additive (t1 = t2). In the complete dominance case, the fitness of an individual depends on the number of irreversible mutations, d, and the population can be structured based on d (Figure 2). In the additive case, the fitness of an individual depends on the total number of mutations, and individuals can be classified based on this number (Figure 3).
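The two classification schemes can be written as a pair of small functions. The encoding is an assumption for illustration: single-copy genes are booleans (True when lost) and each duplicated pair is stored as its number of inactivated copies (0, 1, or 2). In the additive case a pair that has lost both copies counts as two mutations, as the article states when it treats that case.

```python
def load_complete_dominance(single_lost, pair_states):
    """d: number of irreversible mutations (lost single-copy genes plus lost pairs)."""
    return sum(single_lost) + sum(1 for state in pair_states if state == 2)

def load_additive(single_lost, pair_states):
    """Total mutation count: every inactivated copy counts once."""
    return sum(single_lost) + sum(pair_states)
```

For a chromosome with one lost single-copy gene and pairs in states 0, 1, and 2, the first function returns 2 and the second returns 4, because the reversible mutation is invisible to selection only under complete dominance.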
Results
We investigated the way in which a nonrecombining chromosome loses functional genes through Muller’s ratchet. Our model assumes a chromosome carrying L1 single-copy genes and L2 pairs of duplicated genes, and we compared the rates of gene loss for single-copy genes and duplicated genes. We theoretically consider the speed of gene loss per gene, conditional on L1 and L2. Most of the following is a theoretical analysis, to which the simplifying assumptions detailed above apply. These assumptions are relaxed in the simulation-based analysis, where noted.
Duplication with no fitness effect of dosage
To verify the performance of Equation (8), we carried out forward simulations, and some of the results are shown in Figures 4 and 5. Our simulation assumed a Wright–Fisher population with N haploid individuals with L1 single-copy genes and L2 pairs of duplicated genes. Mutation and gene conversion rates and their fitness effects were as described in the Model section. We assumed that N = 10,000 and . Then, in every generation, N haploid individuals were randomly generated based on the fitness of the individuals in the previous generation [see Equation (1)]. The selection parameters were set to s = 0.01 for a strong selection case and s = 0.0025 for a weak selection case. The purpose of this simulation was to obtain and , the average time required for one click of the ratchet at single-copy genes and duplicated genes, respectively, conditional on L1 and L2. From them, the two gene loss rates per gene, R1 and R2, can be computed as and , respectively. However, if we simply run a simulation, L1 and L2 decrease during the run, making it difficult to evaluate T conditional on a specified pair of L1 and L2. To solve this problem, we used an ad hoc method in our simulation, in which L1 and L2 were kept constant by adding an intact gene (or a pair of duplicated genes) whenever we observed a loss of a gene (or a gene pair). This treatment allowed us to approximately obtain the expectation of T conditional on L1 and L2. See Appendix A, in which we demonstrate that this heuristic treatment worked quite well. In each simulation run, after a burn-in period of 100N generations, we scored the waiting time T for every click for both classes of gene, and the run was terminated when we observed 10,000 clicks or 10,000N generations had passed.
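The replenishment step of this ad hoc method might look like the sketch below for single-copy genes. It assumes that a gene loss is "observed" when the loss-of-function state is fixed in every chromosome of the population, which is an interpretation and not a detail given in the text; the function name and data layout (a list of mutable lists of booleans, True when lost) are hypothetical.

```python
def replenish_fixed_losses(pop, generation, last_click, waiting_times):
    """Record a waiting time for each fixed loss and replace the gene by an intact one.

    pop: list of chromosomes, each a list of booleans (True = gene lost).
    Returns the generation of the most recent click.
    """
    n_loci = len(pop[0])
    for locus in range(n_loci):
        # Assumed criterion: the gene counts as lost once every chromosome lacks it.
        if all(chrom[locus] for chrom in pop):
            waiting_times.append(generation - last_click)
            last_click = generation
            # Keep the gene number constant by restoring an intact gene at this locus.
            for chrom in pop:
                chrom[locus] = False
    return last_click
```

Calling this once per generation after the burn-in period would accumulate the waiting times T, whose mean estimates the expected time per click conditional on a fixed number of genes.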
Figure 4 compares the results of two extreme cases: In one, the simulated chromosome consists only of duplicated genes ( and ), and in the other, only single-copy genes are present in the chromosomes ( and ). In the case of all duplicated genes, three levels of dominance were considered ( and t2 presented in blue, green and red, respectively), and the gene conversion rate was changed from to . The result for the case of all single-copy genes is shown by the black broken line in Figure 4.
Figure 4B shows the result for the weak selection case, in which selection is so weak that multiple clicks occur in a sequential manner with overlapping fixation processes. The qualitative effect of gene conversion on R2 appears similar to that in Figure 4A: If c is very small, R2 is almost identical to R1 in the case of all single-copy genes (black dashed line). With increasing c, R2 increases up to an intermediate c, and then decreases. The overall quantitative effect of gene conversion is smaller in comparison with Figure 4A, because Ne is always very small in this regime, and the effect of U on R2 is small.
Figure 5 assumes that single-copy genes and duplicated genes coexist. In this simulation, it was assumed that , and the other parameters were the same as those used in Figure 4. The gene loss rates are shown by open circles for single-copy genes (R1) and by filled circles for duplicated genes (R2). Let us focus on the results of no dominance (blue circles). Equation (8) (blue dashed line in the left panel for R1, blue solid line for R2 in the right panel in Figure 5) agrees well with the simulation results, unless the gene conversion rate is very large.
In the strong selection case (Figure 5A), R1 and R2 are very similar to each other. The increase in R1 and R2 from that for the case of all single-copy genes (black dashed line) is smaller than that in Figure 4A, which is merely due to the smaller L2 assumed in Figure 5. R2 is slightly larger than R1 because the rate of generation of irreversible mutations is larger for duplicated genes [see Equation (9)].
In the weak selection case (Figure 5B), R1 is less sensitive to c and almost identical to R1 for the case of all single-copy genes, while R2 shows a similar behavior to that for the case of all duplicates in Figure 4B. This is because weak selection causes a small Ne, a situation in which random genetic drift dominates. In such a case, the gene loss rate is roughly proportional to the rate of generation of irreversible mutations, which is constant at u in single-copy genes for any value of c.
As we consider the case of no dominance in this section, let us focus on Figure 6A. Equation (10) (solid lines) is in good agreement with the simulation results. When the gene conversion rate is low (), the gene loss rates for both single-copy genes and duplicated genes are very similar to those in the case of all single-copy genes, consistent with the results in Figure 5. When the gene conversion rate is high (), the gene loss process is slightly accelerated in duplicated genes, consistent with Figure 5B. We also considered the case where si and are heterogeneous among loci and obtained qualitatively similar patterns (see Appendix B for details).
Duplication with an additive fitness effect of dosage
We consider the case of t1 = t2, in which duplication has an additive effect on fitness. In this case, the fitness of individuals can be specified by the sum of the number of reversible mutations and irreversible mutations, . The treatment for single-copy genes is the same as in the previous section, while some modifications are needed for duplicated genes. Unlike the previous section, a single irreversible mutation in duplicated genes should be counted as two deleterious mutations, because having an irreversible mutation is as deleterious as having two reversible mutations. Based on , the Muller’s ratchet process is illustrated in Figure 3. This ratchet is different from that in the previous section (see Figure 2) because a ratchet click can occur in either direction. Let us consider the situation where the least-loaded class has mutations. A ratchet click occurs in the forward direction when the class goes extinct, as in the previous section. We also need to consider a click in the backward direction, when an individual with mutations arises by gene conversion, and its descendants become the majority of the population, thereby constituting the new least-loaded class, with . Clicks in both directions simply change the number of reversible mutations, and only a part of the forward clicks can cause gene loss. The analytical approach we use here is quite different from that in the previous case, but is similar to that of Goyal et al. (2012), who incorporated back mutations into the model of Muller’s ratchet for single-copy genes. Following Goyal et al. (2012), we consider the click process in each direction separately.
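The bidirectional ratchet can be illustrated by counting forward and backward clicks from a recorded trajectory. This sketch assumes the caller supplies, per generation, the mutation count of the established least-loaded class (that is, after a newly arisen lower class has become the majority, as the text requires for a backward click); how to decide that a class is established is left out, and the function name is hypothetical.

```python
def count_clicks(min_load_series):
    """Count forward and backward clicks in a series of least-loaded class values."""
    forward = 0
    backward = 0
    for prev, cur in zip(min_load_series, min_load_series[1:]):
        if cur > prev:
            forward += cur - prev     # least-loaded class went extinct
        elif cur < prev:
            backward += prev - cur    # a less-loaded class took over
    return forward, backward
```

For the series [3, 3, 4, 3, 3, 5] the function returns (3, 1): three forward clicks and one backward click.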
We will below obtain in the following derivation for the backward process.
Together with Equations (14) and (15), we are now ready to compute . This treatment is general, in that we can obtain the temporal change in from any initial condition.
To check the performance of Equation (13), we compare it with simulation results in Figures 4 and 5 (see above for details about the simulations). Although we can compute R1 and R2 using Equation (13) for any initial values of , we use a treatment with no initial conditions specified, because we are interested in R1 and R2 conditional on L1 and L2 in a steady state, in which the initial condition is relatively unimportant. To obtain R1 and R2 conditional on L1 and L2, we assume that the frequency of reversible mutation is in equilibrium. Then, is determined from Equations (13) and (14) such that . Using this , we derived R1 and R2 by Equation (13).
Figure 4 shows the results of two extreme cases: In one case, the chromosome consists only of duplicated genes (), and in the other case, it consists only of single-copy genes (). In the case of all duplicates, since Equation (13) is applicable to the case of t1 = t2, let us focus on the result represented by the red circles. The result for the case of all single-copy genes is represented by black broken lines. Equation (13) agrees well with the simulation results, unless the gene conversion rate is very high.
In the strong selection case (Figure 4A), R2 is strongly affected by the gene conversion rate. When the gene conversion rate is very low (), R2 is much higher than R1 in the case of all single-copy genes, because is elevated due to the increase in copy number caused by duplication [see Equation (11)]. As the gene conversion rate increases, R2 decreases, and then drops dramatically when , producing a bad fit between Equation (13) and the simulation result. This situation arises because reversible mutations at duplicated genes are quickly removed by gene conversion in this regime. Individuals with a reversible mutation in the second most loaded class are quickly transferred into the least-loaded class, which increases the size of the least-loaded class and retards its extinction. In this situation, in the population does not follow a Poisson distribution, explaining why the simulation result is not well explained by our derivation [Equation (12)], in which a Poisson distribution is used for the distribution of .
In the weak selection case (Figure 4B), R2 is relatively robust to the gene conversion rate. In this regime, is small enough for random genetic drift to dominate, and R2 is roughly proportional to the rate of generation of irreversible mutations per gene, [see Equation (13)]. When the gene conversion rate is very low (), R2 is almost identical to R1 in the case of all single-copy genes (Figure 4B). This phenomenon can be explained by considering the fate of a newly arisen mutation, as shown in Figure 1C, in which gene conversion should be ignored. If a mutation arises in one copy, given a small , the mutation can become fixed in one copy with a specific probability. Once it is fixed, as our model does not allow back mutation, the state with one reversible mutation (middle in Figure 1C) is prolonged, because this mutation cannot be removed (if gene conversion is ignored). The next event that could happen is that an independent mutation fixes in the other copy, causing a loss of the duplicated genes (the lower case in the right part in Figure 1C). Thus, if we consider a chromosome with L2 pairs of duplicates in a steady state, it is likely that most pairs would be in this state (), because genes with their functions already lost are out of the system. Given this situation, the rate at which another mutation causing a gene loss is generated is approximately u per gene, which is identical to that of the all single-copy case, explaining the similar gene loss rates in the two cases. As the gene conversion rate increases, R2 decreases because gene conversion removes reversible mutations to some extent, resulting in .
In Figure 5, we consider a chromosome in which single-copy genes and duplicated genes coexist. It was assumed that and other parameters were the same as those used for Figure 4. In the additive selection case (red circles), Equation (13) is in good agreement with the simulation results unless the gene conversion rate is very large.
In the strong selection case, R2 exhibits behavior similar to that shown in Figure 4A, but R1 is quite different. When the gene conversion rate is very low (), R1 is much higher than that of the all single-copy case, because elevated by gene duplication reduces , so that the ratchet clicks in the forward direction occur more frequently [see Equations (11)–(13)]. Unless the gene conversion rate is very high (), R1 is quite robust to c, because gene conversion in duplicated genes should not have a direct effect on mutations in single-copy genes. When the gene conversion rate is very high (), R1 starts decreasing with increasing c. The fit of Equation (13) to the simulation result is not good for the same reason as that in Figure 4A.
When selection is weak, R2 is very similar to that in Figure 4B. R1 is less affected by the gene conversion rate, because Ne is small enough for random genetic drift to dominate, and the gene loss rate is roughly proportional to the rate of generation of irreversible mutations, which is constant at u in single-copy genes.
To check the accuracy of Equation (19), we performed simulations (Figure 6C). In the initial state of the simulation, each chromosome had 1,000 intact single-copy genes and 1,000 intact duplicated pairs of genes ( and ). Two gene conversion rates () were considered. It was found that, at both of the two gene conversion rates, single-copy genes decreased faster and duplicated genes decreased more slowly than in the case of all single-copy genes (black circles in Figure 6C), because the rapid decrease in the number of single-copy genes slows ratchet clicks in the forward direction. As a consequence, more duplicated genes remain functional than in the case of all single-copy genes. The deviation from the case of all single-copy genes is larger when the gene conversion rate is higher. A very similar result was obtained when the assumption of constant selection coefficient was violated. See Appendix B for details.
Duplication with an intermediate fitness effect of dosage
Finally, we consider the case of an intermediate degree of dosage effect, where is assumed. Since it was difficult to obtain analytical results, we investigated this case using simulations. The green circles in Figure 4 show R2, the gene loss rate in the case of all duplicated genes (). In the strong selection case, R2 generally decreases as c increases, whereas in the weak selection case, R2 is almost identical to R1 in the case of all single-copy genes. When both single-copy genes and duplicated genes coexist (, in green in Figure 5), the pattern is generally similar to that in Figure 4. In the strong selection case, R1 and R2 show a similar pattern to the additive case. Figure 6B shows the long-term degeneration pattern. In Appendix B, we consider the case where si and are heterogeneous among loci. Overall, the behavior when seems to be intermediate between those of the cases of no fitness effect of dosage (t1 = 0) and additive effect (t1 = t2).
Discussion
Muller’s ratchet is a process in which a nonrecombining chromosome irreversibly accumulates deleterious mutations (Muller 1964). Muller’s ratchet has been considered to play an important role in the evolution of the Y chromosome (Bachtrog 2008). Previous theories regarding Muller’s ratchet considered only single-copy genes, and the way in which Muller’s ratchet works in duplicated genes has not been fully understood. Because there are a number of duplicated genes on the Y chromosome in many species (Skaletsky et al. 2003; Hughes et al. 2010, 2012, 2020; Soh et al. 2014; Bachtrog et al. 2019; Peichel et al. 2020), in this work we developed a theory for the process of Muller’s ratchet on a nonrecombining chromosome in which single-copy and duplicated genes coexist. Mutations in duplicated genes can be considered to be a kind of epistatic mutation, because the strength of selection on a mutation depends on whether the other copy already has a mutation or not. Several studies have investigated the effect of epistasis on the process of Muller’s ratchet (Charlesworth et al. 1993; Kondrashov 1994; Butcher 1995; Jain 2008). However, these studies have focused on more complex epistatic interactions, in which the fitness effect of a mutation depends upon mutations at all other loci.
This work focuses on the role of gene conversion between duplicates, which has two opposite effects on the degeneration process. Degeneration is promoted if gene conversion leads to the mutation of both copies, while degeneration is retarded if gene conversion restores both copies to the original state. Our theoretical results demonstrate that the effect of gene conversion is complex, depending on the fitness effect of dosage change by gene duplication. When duplication has no fitness effect by dosage, gene conversion increases the rates of loss of both single-copy and duplicated genes (see Figure 5). In the case of an additive dosage effect on fitness, the gene loss rate of single-copy genes is elevated by gene duplication, while gene conversion prevents duplicated genes from losing their functions (see Figure 5). These patterns are more clearly observed when selection is strong.
This complex nature of the effect of Muller’s ratchet on the Y chromosome has not previously been identified. Most of the previous studies have considered only the positive effects of gene conversion. For example, Connallon and Clark (2010) showed that gene conversion works positively in the conservation of essential duplicated genes, assuming mutations are lethal when present in both copies. Our work demonstrates that this is not the case when duplicated genes are not essential (see the Introduction for the difference in the models). Mano and Innan (2008) considered a “single” pair of duplicated genes in a Wright–Fisher population, so the effect of linked genes was ignored, and demonstrated that deleterious mutations could be efficiently removed by gene conversion, which suggests that gene conversion could slow the degeneration process of duplicated genes [similar to Connallon and Clark (2010)]. However, we found that, when multiple genes are completely linked and multiple mutations exist simultaneously, gene duplication can accelerate the degeneration process in some cases.
Our results suggest that understanding how the Y chromosome evolves requires the consideration of a number of parameters, including the number of duplicated genes and their fitness effect through dosage increase. Unfortunately, very few empirical data allow us to estimate those parameters. Theory suggests that the dosage increase of many duplicated genes may be beneficial, because a duplication is more likely to be fixed in the population when it has a direct selective advantage (Clark 1994; Connallon and Clark 2010; Innan and Kondrashov 2010). If so, our theory predicts that duplicated genes are well conserved by gene conversion, while linked single-copy genes are lost rapidly. However, the actual situation is likely to be much more complicated, particularly in the early stages of the evolution of the Y chromosome, when duplication-rich Y chromosomes may develop. Duplicated genes with no dosage effect on fitness may be fixed together with a linked gene that has a selective advantage due to a dosage effect. More data with reliable estimates of those selection parameters will give us deeper insights into the evolution of the Y chromosome.
Another important parameter is the gene conversion rate, which has been relatively well estimated in the palindrome regions of the human Y chromosome. Rozen et al. (2003) estimated the gene conversion rate of human palindrome as per nucleotide per generation, based on the amount of divergence between duplicates. Hallast et al. (2013) used a phylogenetic approach, and reported a similar but slightly smaller value ( - ). At such a high gene conversion rate, our theory predicts that gene conversion plays a significant role in the degeneration of the Y chromosome. Although the gene conversion rate in other species has not been as well investigated, high sequence identity between duplicated copies is observed in many species (Soh et al. 2014; Hughes et al. 2020), suggesting that our theory would apply to a wide range of species.
Data availability
The authors state that all data necessary for confirming the conclusions presented in the manuscript are represented fully within the manuscript. The code used for numerical analyses and simulations is available at https://github.com/TSakamoto-evo/Y-ratchet.
Supplementary material is available at GENETICS online.
Funding
This work was partly supported by grants from the Graduate University for Advanced Studies, SOKENDAI, and Japan Society for the Promotion of Science (JSPS) to H.I. and T.S. (JSPS KAKENHI Grant Numbers JP19H03207, JP20J21760).
Conflicts of interest
The authors declare that there is no conflict of interest.
Appendices
Appendix A: On the ad hoc treatment in the simulation to keep L1 and L2 constant
In the simulation to obtain R1 and R2 conditional on L1 and L2 in a steady state, we used an ad hoc treatment in which an intact single-copy gene, or an intact pair of duplicated genes, suddenly appears in all individuals when an irreversible mutation is fixed in the population. The problem is that this method skips a burn-in period, in which some mutations could be accumulated. The performance of this ad hoc method was verified using the following simulation. Let us assume that we would like to obtain R1 and R2 conditional on and . Then, we need to consider a degeneration process of a chromosome on which single-copy genes and duplicated genes are initially located, where should be sufficiently large. Then, the system waits for the state with the focal pair , and we can continue the simulation run to obtain the waiting time for the next click. Very few simulation runs hit , and all other runs are terminated when or . This method is honest and correct, but requires a very large number of runs to accumulate a reasonable number of simulation runs that hit . This is why we used the ad hoc treatment in the main text. We here show how the ad hoc treatment works in comparison with the correct simulation with a limited set of parameters.
We assumed that N = 500, and . The gene conversion rate changed from to . In Supplementary Figure S1A we assumed strong selection (), while in Supplementary Figure S1B, weak selection () was assumed. All population scaled parameters (Nu, Nc, Ns, Nt1, and Nt2) are identical to those in Figure 5 if scaled by the population size N. This process demonstrates that the results of the two methods are almost identical, indicating that our ad hoc method works well.
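The bookkeeping behind the ad hoc treatment can be sketched as follows. This is a minimal illustration, not the code from the repository cited above. The function name `replenish_fixed_losses` and the data layout (each chromosome stored as a list of mutated-copy counts per gene) are hypothetical. The sketch also assumes that a gene counts as irreversibly lost once every one of its copies is mutated in every individual.

```python
def replenish_fixed_losses(population, copies):
    """Detect fixed gene losses and replace them with intact genes.

    population: list of chromosomes; each chromosome is a list giving the
        number of mutated copies at each gene (hypothetical layout).
    copies: number of copies per gene (1 = single-copy, 2 = duplicated).

    Returns the indices of genes whose loss was fixed ("clicks").
    Those genes are reset to intact in all individuals, in place, so the
    numbers of functional single-copy and duplicated genes stay constant.
    """
    if not population:
        return []
    clicked = []
    for gene, n_copies in enumerate(copies):
        # Assumption: loss is fixed when all copies are mutated in everyone.
        if all(chrom[gene] == n_copies for chrom in population):
            clicked.append(gene)
            for chrom in population:
                chrom[gene] = 0  # an intact gene (or pair) appears everywhere
    return clicked


# Usage: gene 0 is single-copy, genes 1 and 2 are duplicated pairs.
population = [[1, 2, 0], [1, 2, 1]]
copies = [1, 2, 2]
clicks = replenish_fixed_losses(population, copies)
```

In a full simulation this function would be called once per generation, and the time between successive clicks would be recorded to estimate the rates of gene loss.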
Appendix B: Variable selection coefficients across loci
In this Appendix, we relax the assumption that all genes have identical effects on fitness, and explore the effect of varying selection coefficients across genes on the long-term degeneration process. Three types of dosage effect on duplication are considered: , and . We performed simulations for these three types of dosage effect, with variation allowed between individual gene pairs. We randomly chose si and , assuming they follow an exponential distribution with mean , and they were shared by the simulations for the three types of dosage effect to focus on the effect due to the dosage type alone. We also simulated the case of all single-copy genes, in which we used the parameters determined above ( is assumed for ). All other parameters are the same as those used in the main text. Supplementary Figure S2 shows the simulation result, which is qualitatively very similar to Figure 6.
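The idea of drawing the selection coefficients once and sharing them across the three dosage types can be sketched as below. The function name `draw_selection_coefficients`, the seed, the number of genes, the mean value, and the labels `type_A`, `type_B`, and `type_C` are all placeholders chosen for illustration. They are not the values or names used in this study.

```python
import random


def draw_selection_coefficients(n_genes, mean_s, seed):
    """Draw one selection coefficient per gene from an exponential distribution.

    random.expovariate takes the rate (1 / mean), so the draws have mean mean_s.
    A fixed seed makes the draw reproducible.
    """
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mean_s) for _ in range(n_genes)]


# Draw once (placeholder values), then hand the same list to every dosage type
# so that any difference between runs is due to the dosage type alone.
shared_s = draw_selection_coefficients(n_genes=100, mean_s=0.01, seed=42)
runs = {label: shared_s for label in ("type_A", "type_B", "type_C")}
```

Reusing one list, rather than redrawing for each dosage type, removes sampling noise in the coefficients as a source of difference between the three sets of simulations.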