Excerpts from "Evolution of Primate ABO Blood Group Genes and Their Homologous Genes." The following is the complete title.
Naruya Saitou and Fumi-ichiro Yamamoto, Evolution of Primate ABO Blood Group Genes and Their Homologous Genes, Molecular Biology and Evolution, 14(4):399-411, 1997.
There seem to be some interesting new findings.
These findings are completely opposite to the established theories, which hold that Type O is the oldest, Type A is the next oldest, Type B is no older than 15,000-25,000 years, and Type AB is far more recent.
I hope molecular biology is correct, because I am Type AB. :-)
Excerpts from ABO-international ML:
According to the established theories, Type AB is not older than Jesus Christ. But molecular biology says Type AB is several million years old. Can molecular biology cut the Gordian knot?
There are three common alleles (A, B, and O) at the human ABO blood group locus. We compared nucleotide sequences of these alleles, and relatively large numbers of nucleotide differences were found among them. These differences correspond to the divergence time of at least a few million years, which is unusually large for a human allelic divergence under neutral evolution. We constructed phylogenetic networks of human and nonhuman primate ABO alleles, and at least three independent appearances of B alleles from the ancestral A form were observed. These results suggest that some kind of balancing selection may have been operating at the ABO locus. We also constructed phylogenetic trees of ABO and their evolutionarily related α-1,3-galactosyltransferase genes, and the divergence time between these two gene families was estimated to be roughly 400 MYA.
Comparison of the Five Human ABO Gene Sequences
We first compared five cDNA sequences for the human ABO gene presented by Yamamoto et al. (1990a, 1990b). A1-1, A1-2, and B alleles cover the complete coding sequences, while O-1 and O-2 allele sequences lack three nucleotides corresponding to the first codon (Yamamoto et al. 1990b). Polymorphic nucleotide sites are shown in table 1. Because we dealt with closely related sequences, the phylogenetic network method and the maximum-parsimony method were used. Figure 1 shows the unique phylogenetic network (A) and four equally parsimonious trees (B-E). Because a partition defined by a single gap (position 261) and one defined by a nucleotide configuration at position 297 are incompatible, there is one loop in network A. If we cut one of those branches that form this loop, four alternative trees are produced (trees B-E). Since allele A1-1 was assumed to be identical with the ancestral sequence of human ABO genes (see below), this information was used to locate the root (designated by a broken line) in those equally parsimonious trees. It should be noted that topologies of trees B and C are identical and only some branch lengths are different.
Although 18 changes are required
in all four trees, two insertion/deletion events are involved in trees D and E, while only
one deletion is required for trees B and C. Saitou and Ueda (1994) showed that the
evolutionary rate of insertions and deletions in primates was about one order slower than
that of nucleotide substitutions. Therefore, trees B and C are much more probable than
trees D and E. In the former two trees, however, the numbers of nucleotide substitutions
along the branches leading to O-1 and O-2 are quite different. Interestingly, allele O-1
is identical to allele A1-1 except for the single nucleotide deletion, while
allele O-2 differs from allele O-1 at nine nucleotide sites. It is possible that allele
O-1 might be a product of intragenic recombinations or gene conversions, because those
events homogenize different alleles. However, we have to assume at least two such events
to explain the observed sequence pattern. It suggests that intragenic recombinations or
gene conversions occur rather frequently at this locus. In fact, Ogasawara et al. (1996b)
did find a probable recombinant between an O and a B allele.
We estimated divergence times among human ABO alleles based on
nucleotide differences. The nucleotide difference per site between A1-2 and B-1 is
0.008 (=8/1,062), while that between B-1 and O-2 is 0.013 (=14/1,059). These
values are much larger than the average number of nucleotide differences among different
nucleotide sequences within a locus in human (Li and Sadler 1991). Saitou (1991) estimated
the average number of nucleotide differences of noncoding regions between human and
chimpanzee to be 0.014/site. If we use 5 Myr for the divergence time between human and
chimpanzee, the rate of nucleotide substitution becomes 1.4 x 10^-9/site/year,
and those allelic nucleotide differences for the human ABO genes correspond to 2.7-4.7
Myr. Those values are unusually large for different alleles of a typical human locus (see
discussion below), and yet those divergence times may still be underestimations. Because
the ABO gene is functional, use of an evolutionary rate for noncoding-region DNA is
expected to give underestimations of the divergence time. Gene conversion and
recombination are also possible causes for underestimation, for they homogenize different
alleles.
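The arithmetic behind these estimates can be retraced in a few lines. The following is only a sketch, not the authors' code: it assumes the rate is obtained as d/(2T) from the human-chimpanzee comparison and that the allelic divergence time is likewise d/(2r), since those assumptions reproduce the quoted figures. The function names are illustrative.

```python
def substitution_rate(distance_per_site, divergence_time_years):
    """Rate per site per year, assuming both lineages accumulate changes (d = 2rT)."""
    return distance_per_site / (2.0 * divergence_time_years)


def divergence_time_myr(differences, sites, rate):
    """Divergence time in Myr between two alleles, assuming t = d / (2r)."""
    d = differences / sites
    return d / (2.0 * rate) / 1e6


# Human-chimpanzee noncoding difference of 0.014/site and a 5 Myr split (from the text).
rate = substitution_rate(0.014, 5e6)

# Allelic differences quoted in the text: 8/1,062 (A1-2 vs B-1) and 14/1,059 (B-1 vs O-2).
t_a_b = divergence_time_myr(8, 1062, rate)
t_b_o = divergence_time_myr(14, 1059, rate)

print(f"rate = {rate:.2e} per site per year")
print(f"A1-2 vs B-1: {t_a_b:.1f} Myr")
print(f"B-1 vs O-2: {t_b_o:.1f} Myr")
```

Under these assumptions the two comparisons give about 2.7 and 4.7 Myr, matching the range stated above.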
The Phylogenetic Network of Human ABO Gene Alleles
We used all of the 20 sequence data shown in table 1. Because many of the sequences analyzed did not identify the nucleotides including variant positions 109, 191, 192, and 203, those positions were not used for the following analysis. The maximum-parsimony method was first used, and 39 equally parsimonious trees were produced by using PAUP 3.1.1 with the branch-and-bound option (trees not shown). Clearly, the maximum-parsimony method is not appropriate for delineating the complex nature of the ABO allele polymorphism.
The Phylogenetic Tree of Primate ABO Gene Alleles
Figure 4 is the phylogenetic tree of the primate ABO genes based on the phylogenetic network of figure 3 [omitted]. All the parallelograms that existed in the network were eliminated by cutting some of those edges. In the case of the four rectangles around human and gorilla sequences, for example, the edge connecting a gorilla sequence (go-1,2,4,5) and a node neighboring the human B-3 sequence was eliminated, because an extant gorilla sequence is unlikely to be identical with an ancestral sequence of human alleles. The resulting tree of figure 4 is one of 10 equally parsimonious trees (not shown) that were found by using PAUP with the branch-and-bound option. All of the 60 substitution events were unambiguously located at one of the branches of the tree.
It should be noted that the
topology of the tree in figure 4 is different from that of Martinko et al. (1993), where
human B allele was clustered with gorilla B alleles. Although this relationship was also
one of the most parsimonious trees in our data set, we believe that the topology of figure 4
is more probable than their tree, according to our argument above on the network of figure
3 [omitted]. This tree indicates that the common ancestral gene for the hominoid and Old
World monkey ABO blood group is A type, and three B alleles evolved independently on the
human, gorilla, and baboon lineages. Those changes correspond to the two amino acid
substitutions (266: L->M and 268: G->A; see table 3) that are responsible for the
change of substrate specificity (Yamamoto and Hakomori 1990).
It has been known that the human ABO-like blood group also exists
in nonhuman primates. Moor-Jankowski, Wiener, and Rogers (1964) summarized the data, and
they are presented in table 5 in an abbreviated form. Because orangutan and gibbon both
possess A and B alleles, it is possible that B alleles evolved independently in those
lineages too. Macaque species and New World monkeys also have both A and B alleles.
However, this prediction (repeated emergence of the B allele) awaits the determination of nucleotide sequences for these species in the future. Because all the nucleotide changes were inferred in the gene tree of figure 4, directions of changes could be determined except for 12 substitutions on the branch connecting hominoid and Old World monkeys. This process involved the reconstruction of the nucleotides in all the interior nodes, and those reconstructions are considered to be reliable because the compared sequences are closely related (see Yang, Kumar, and Nei 1995). It has enabled us to estimate the pattern of nucleotide substitutions, as shown in table 4. Transitions occurred two times as frequently as transversions, so the transition parameter (α) for the two-parameter model (Kimura 1980) is roughly four times higher than the transversion parameter (β). When we consider the direction of substitutions, it is clear that there is a bias toward AT richness; G-to-A and C-to-T changes are much more abundant than A-to-G and T-to-C changes.
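The step from "transitions twice as frequent as transversions" to "transition parameter roughly four times the transversion parameter" follows from the structure of Kimura's two-parameter model. The sketch below makes that step explicit; the function name is illustrative and not taken from the paper.

```python
def transition_to_transversion_rate(ts_tv_ratio):
    """Return alpha/beta under Kimura's two-parameter model.

    Each nucleotide can change by one transition (rate alpha) or by
    two different transversions (rate beta each), so the expected
    transition/transversion count ratio is alpha / (2 * beta).
    """
    return 2.0 * ts_tv_ratio


# Observed pattern in the text: transitions about twice as frequent as transversions.
alpha_over_beta = transition_to_transversion_rate(2.0)
print(alpha_over_beta)
```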
Possibility of Natural Selection on ABO Genes
We found unusually large
coalescence times for human ABO alleles. It is of interest to compare those with
coalescence times for nonhuman primate ABO genes. We thus estimated coalescence times of
the ABO alleles in each species in figure 4 as follows. The numbers of nucleotide
substitutions from the most recent ancestor sequence to the extant sequences in each
species were first estimated applying Ishida et al.'s (1995) method. The resultant values
are 0.0072, 0.0085, 0.0023, 0.0132, and 0.0070 for human, chimpanzee, gorilla, orangutan,
and baboon, respectively. We then used for the calibration the evolutionary rate (1.4 x 10^-9)
based on the human-chimpanzee comparison as used in the previous section. The results are
2.6, 3.0, 0.8, 4.7, and 2.5 Myr for human, chimpanzee, gorilla, orangutan, and baboon,
respectively. It is interesting that not only human, but also chimpanzee, orangutan, and
baboon showed large coalescence times.
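As a rough check, the five coalescence times can be recomputed from the quoted substitution numbers. The excerpt does not state the exact formula, so the division by twice the rate used below is an assumption, chosen because it reproduces the reported values; the names are illustrative.

```python
RATE = 1.4e-9  # substitutions per site per year, from the human-chimpanzee calibration

# Numbers of substitutions per site quoted in the text for each species.
substitutions = {
    "human": 0.0072,
    "chimpanzee": 0.0085,
    "gorilla": 0.0023,
    "orangutan": 0.0132,
    "baboon": 0.0070,
}


def coalescence_time_myr(d, rate=RATE):
    """Coalescence time in Myr, assuming t = d / (2 * rate)."""
    return d / (2.0 * rate) / 1e6


times = {species: round(coalescence_time_myr(d), 1) for species, d in substitutions.items()}

for species, t in times.items():
    print(f"{species}: {t} Myr")
```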
We applied the same method to the 238-bp HOX2 intergenic sequence
data of Ruano et al. (1992) to see if the long coalescence times are special for the ABO
gene. The numbers of nucleotide substitutions from the most recent ancestor sequence to
the extant sequences were estimated to be 0.008 and 0.011 for chimpanzee and gorilla,
respectively. The coalescence times are thus estimated to be 3 Myr for chimpanzee and 3.8
Myr for gorilla. The HOX2 intergenic region is assumed to be under neutral evolution
(Kimura 1983). Although only short nucleotide sequences were compared, it is thus
suggested that nonhuman primates have large coalescence times even for a DNA region under
neutral evolution. It is possible that nonhuman primates are highly geographically
isolated and this caused the coalescence times to be much longer than that for human. Such
long coalescence times for nonhuman primates are a clear contrast to human, where the
nucleotide difference per site between randomly chosen genes of a locus was estimated to
be only 0.0004 (Li and Sadler 1991). This situation is similar to that of primate MHC
genes, where the coalescence time of the class II DRB1 locus was estimated to be more
than 30 MYA, probably caused by a strong balancing selection (Takahata 1993a).
Therefore, long coalescence times estimated for the human ABO gene both from the complete
cDNA region and from a partial cDNA region suggest the possibility of the existence of
some kind of balancing selection at this gene, at least for human.
Higher rates of nonsynonymous substitutions to synonymous
ones were observed for the antigen recognition sites of human and mice MHC class I and
class II proteins (Hughes and Nei 1988; Nei and Hughes 1991). Those higher rates are
considered to be clear evidence of the existence of positive selection on those MHC
genes. We, therefore, estimated numbers of synonymous and nonsynonymous substitutions
between human ABO A1-1 allele and B-1 allele by using Nei and Gojobori's
(1986) method. The ODEN computer package (Ina 1994) was used for computation. Estimated
numbers of synonymous and nonsynonymous sites out of the complete cDNA sequences are 258.2
and 800.8, respectively (the initiation codon ATG was eliminated from the comparison, the
sum being 1,059). Because the numbers of synonymous and nonsynonymous nucleotide
differences between the two alleles are 3 and 4, respectively (see table 1), the
proportions of synonymous and nonsynonymous differences were 0.0116 (=3/258.2) and 0.0050
(=4/800.8), respectively. The numbers of synonymous and nonsynonymous substitutions were
thus estimated to be 0.0117 ± 0.0068 and 0.0050 ± 0.0025, respectively.
The ratio of nonsynonymous/synonymous
substitutions became 0.43. Takahata (1993b) estimated this ratio for the human
ABO gene, and it turned out to be 2.0. That comparison was, however, based on partial ABO
sequences (405 bp long) corresponding to the region sequenced by Martinko et al. (1993) for
nonhuman primate ABO genes. Therefore, it is not clear whether that high ratio can be
considered evidence for the existence of positive selection on the ABO gene. We saw that B
alleles evolved independently at least three times in primate evolution in the previous
section. To examine whether this phenomenon is statistically significant or not, we
performed a goodness-of-fit test between the observed and the expected numbers of
substitutions per site for the ABO gene based on the tree of figure 4 (table 6). Expected
numbers of substitutions were computed applying a Poisson distribution under the
assumption of equal substitution rate at every site. The mean number (X) of nucleotide
substitutions per site per whole tree was estimated to be 0.14. A highly significant
difference (P < 0.001) between observed and expected numbers was observed (table 6).
This difference was mainly caused by unusually high substitutions at the nucleotide sites
(796 and 803) responsible for the functional difference between A and B transferases.
Unless those positions are mutational hot spots, this recurrent occurrence of B alleles
does not seem to reflect the pattern of mutations. Because the neutrally evolving genes
are expected to accumulate nucleotide changes according to their mutation pattern (Kimura
1983), we have to consider the existence of some kind of positive selection on the ABO
blood group locus. The best candidate may be the overdominant selection, for the emergence
of new alleles produces heterozygotes. Natural selection on the ABO gene has long been
studied (e.g., Chung and Morton 1961; Hiraizumi 1990). However, all of the studies were
based on nonmolecular data, and only a short time span was considered to detect any type
of selection. It is now increasingly clear that analysis of long-term evolution is more
powerful than that of short term evolution. Therefore, we believe that our present study,
based on the accumulation of mutations over a long evolutionary time period, opened a new
aspect for the study of natural selection on the ABO gene.
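The synonymous and nonsynonymous estimates quoted in the paragraph above can be retraced from the site and difference counts. This is a sketch only, not the ODEN implementation: it assumes the Jukes-Cantor correction d = -(3/4) ln(1 - 4p/3) applied to each proportion, with the usual large-sample standard error, which is the correction step of Nei and Gojobori's method. The function names are illustrative.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected number of substitutions per site."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jukes_cantor_se(p, sites):
    """Approximate standard error of the Jukes-Cantor distance."""
    return math.sqrt(p * (1.0 - p) / sites) / (1.0 - 4.0 * p / 3.0)


# Counts quoted in the text for the A1-1 versus B-1 comparison.
syn_sites, nonsyn_sites = 258.2, 800.8
syn_diffs, nonsyn_diffs = 3, 4

p_s = syn_diffs / syn_sites        # proportion of synonymous differences
p_n = nonsyn_diffs / nonsyn_sites  # proportion of nonsynonymous differences

d_s, se_s = jukes_cantor(p_s), jukes_cantor_se(p_s, syn_sites)
d_n, se_n = jukes_cantor(p_n), jukes_cantor_se(p_n, nonsyn_sites)
ratio = d_n / d_s

print(f"synonymous:    {d_s:.4f} +/- {se_s:.4f}")
print(f"nonsynonymous: {d_n:.4f} +/- {se_n:.4f}")
print(f"nonsynonymous/synonymous ratio: {ratio:.2f}")
```

A ratio well below 1, as here, is what the text contrasts with the value of 2.0 obtained from the shorter 405-bp region.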
Perspectives
We have presented various
phylogenetic analyses of primate ABO genes and their related genes in this paper. We found
some unusual evolutionary patterns for the ABO genes; however, those phylogenetic analyses
can inevitably provide only indirect evidence for the possibility of natural selection.
More direct evidence is needed from experimental studies. For example, Borén et al.
(1993) found that a gram-negative bacterium Helicobacter pylori, a possible
causative agent in gastritis and gastric ulcers, binds to the carbohydrate structure Leb,
while it does not bind to ALeb determinant antigens. Borén et al. thus
suggested that the availability of receptors for this bacterium may be reduced in
individuals with A and B phenotypes compared to those with O phenotype. More such
experimental studies directly connecting microorganisms and specific blood group types
will definitely be necessary to clarify the biological mechanism of the existence of the
ABO blood group.
Natural antibodies against A and B determinants are present in
individuals who do not express these antigens (Landsteiner's Law). High titers of these
antibodies have been attributed to constant stimulation by bacterial flora in intestines,
some of which share the epitopes of A and B antigens (Springer, Horton, and Forbes 1959).
If natural antibodies are important for fighting against parasites, then individuals with
O phenotype may be selectively more advantageous than those with A, B, or AB phenotypes,
for O individuals are expected to have both anti-A and anti-B antibodies. If so, the
argument of Galili and Andrews (1995) on the GAL pseudogene may also apply to the ABO
gene, because the O allele is by definition nonfunctional. In fact, it has been one of the
major mysteries of the ABO gene that the nonfunctional O allele is one of the common
alleles. The O allele is even fixed in some South American populations (see table 141 of
Roychoudhury and Nei 1988). This pattern of allele frequency distribution is clearly out
of the scope of the standard mutation-drift balance model, where null (nonfunctional)
alleles are expected to remain rare.
It should be remembered that several glycosyltransferases are
involved in the production of complex carbohydrate structures to manifest the blood group
antigens. When one transferase happens to lose its function, the precursor carbohydrate
structure (H determinant in the case of the ABO blood group system) will accumulate
without further modification. Clearly, more studies are needed to grasp the whole story of
the evolution of the ABO blood group gene.
Prof. Naruya Saitou's book in Japanese, "Iden-shi wa 35 oku-nen no yume o miru -- bakuteria kara hito no shinka made" [Genes Dream a 3.5-Billion-Year Dream: Evolution from Bacteria to Humans] (Yamato Shobo), is also available. It includes a summary of the above paper.
E-mail: abofan@js2.so-net.ne.jp