• Building a phylogenetic tree

    From Popping Mad@21:1/5 to All on Mon Apr 3 06:07:38 2023
    https://www.khanacademy.org/science/ap-biology/natural-selection/phylogeny/a/building-an-evolutionary-tree


    Building a phylogenetic tree
    The logic behind phylogenetic trees. How to build a tree using data
    about features that are present or absent in a group of organisms.
    Key points:
    Phylogenetic trees represent hypotheses about the evolutionary
    relationships among a group of organisms.
    A phylogenetic tree may be built using morphological (body shape),
    biochemical, behavioral, or molecular features of species or other groups.
    In building a tree, we organize species into nested groups based on
    shared derived traits (traits different from those of the group's ancestor).
    The sequences of genes or proteins can be compared among species and
    used to build phylogenetic trees. Closely related species typically have
    few sequence differences, while less related species tend to have more.
    Introduction
    We're all related, and I don't just mean us humans, though that's most
    definitely true! Instead, all living things on Earth can trace their
    descent back to a common ancestor. Any smaller group of species can also
    trace its ancestry back to a common ancestor, often a much more recent one.
    Given that we can't go back in time and see how species evolved, how can
    we figure out how they are related to one another? In this article,
    we'll look at the basic methods and logic used to build phylogenetic
    trees, or trees that represent the evolutionary history and
    relationships of a group of organisms.
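
    As a small illustration of the key point about sequence differences,
    here is a minimal Python sketch that counts mismatches between pairs of
    aligned sequences. The species names and the short sequences are
    invented for illustration, and the sketch assumes the sequences have
    already been aligned to the same length.

```python
from itertools import combinations

# Hypothetical, already-aligned sequences (invented for illustration).
sequences = {
    "mouse_A": "ACGTACGTAC",
    "mouse_B": "ACGTACGTAT",
    "mouse_C": "ACGAACGGAT",
}

def count_differences(seq1, seq2):
    """Count the aligned positions at which two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq1, seq2))

# Difference counts for every pair of species.
pairwise = {
    (x, y): count_differences(sequences[x], sequences[y])
    for x, y in combinations(sorted(sequences), 2)
}

for pair, n in pairwise.items():
    print(pair, n)
```

    With these made-up sequences, mouse_A and mouse_B differ at a single
    position, so on this crude measure they would be the closest pair.
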
    Overview of phylogenetic trees
    In a phylogenetic tree, the species of interest are shown at the tips of
    the tree's branches. The branches themselves connect up in a way that represents the evolutionary history of the species—that is, how we think
    they evolved from a common ancestor through a series of divergence (splitting-in-two) events. At each branch point lies the most recent
    common ancestor shared by all of the species descended from that branch
    point. The lines of the tree represent long series of ancestors that
    extend from one species to the next.

    Image modified from Taxonomy and phylogeny: Figure 2, by Robert Bear et
    al., CC BY 4.0
    For a more detailed explanation, check out the article on phylogenetic
    trees.
    Even once you feel comfortable reading a phylogenetic tree, you may have
    the nagging question: How do you build one of these things? In this
    article, we'll take a closer look at how phylogenetic trees are constructed.
    The idea behind tree construction
    How do we build a phylogenetic tree? The underlying principle is
    Darwin’s idea of “descent with modification.” Basically, by looking at the pattern of modifications (novel traits) in present-day organisms, we
    can figure out—or at least, make hypotheses about—their path of descent from a common ancestor.
    As an example, let's consider the phylogenetic tree below (which shows
    the evolutionary history of a made-up group of mouse-like species). We
    see three new traits arising at different points during the evolutionary history of the group: a fuzzy tail, big ears, and whiskers. Each new
    trait is shared by all of the species descended from the ancestor in
    which the trait arose (shown by the tick marks), but absent from the
    species that split off before the trait appeared.

    When we are building phylogenetic trees, traits that arise during the
    evolution of a group and differ from the traits of the ancestor of the
    group are called derived traits. In our example, a fuzzy tail, big ears,
    and whiskers are derived traits, while a skinny tail, small ears, and
    lack of whiskers are ancestral traits. An important point is that a
    derived trait may appear through either loss or gain of a feature. For instance, if there were another change on the E lineage that resulted in
    loss of a tail, taillessness would be considered a derived trait.
    Derived traits shared among the species or other groups in a dataset are
    key to helping us build trees. As shown above, shared derived traits
    tend to form nested patterns that provide information about when
    branching events occurred in the evolution of the species.
    When we are building a phylogenetic tree from a dataset, our goal is to
    use shared derived traits in present-day species to infer the branching
    pattern of their evolutionary history. The trick, however, is that we
    can’t watch our species of interest evolving and see when new traits
    arose in each lineage.
    Instead, we have to work backwards. That is, we have to look at our
    species of interest – such as A, B, C, D, and E – and figure out which traits are ancestral and which are derived. Then, we can use the shared
    derived traits to organize the species into nested groups like the ones
    shown above. A tree made in this way is a hypothesis about the
    evolutionary history of the species – typically, one with the simplest possible branching pattern that can explain their traits.
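
    The nested-group idea can be sketched in a few lines of Python. This is
    only an illustration under stated assumptions: the scoring of which
    species carries which derived trait is invented here (the article's
    figure is not reproduced in this post), and the function names are
    hypothetical. The sketch collects, for each trait, the species sharing
    the derived state and checks that those groups nest inside one another.

```python
# Hypothetical presence/absence scores: 1 = derived state, 0 = ancestral state.
# Which species carries which trait is invented for illustration.
traits = {
    "fuzzy tail": {"A": 0, "B": 1, "C": 1, "D": 1, "E": 1},
    "big ears":   {"A": 0, "B": 0, "C": 1, "D": 1, "E": 1},
    "whiskers":   {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1},
}

def derived_groups(traits):
    """Map each trait to the set of species that share its derived state."""
    return {
        name: frozenset(sp for sp, state in scores.items() if state == 1)
        for name, scores in traits.items()
    }

def is_nested(groups):
    """True if every pair of groups is disjoint or one contains the other."""
    sets = list(groups.values())
    return all(
        a.isdisjoint(b) or a <= b or b <= a
        for i, a in enumerate(sets)
        for b in sets[i + 1:]
    )

groups = derived_groups(traits)
nested = is_nested(groups)

# List the groups from most inclusive to least inclusive.
ordered = sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)
for name, members in ordered:
    print(f"{name}: {''.join(sorted(members))}")
print("nested:", nested)
```

    If two groups overlapped without one containing the other, the traits
    would conflict, and no single branching pattern could explain both with
    one origin each.
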

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Harshman@21:1/5 to Popping Mad on Mon Apr 3 06:21:16 2023
    I should point out that this is not actually quite how it's done. There
    is no need to determine derived states in advance; that determination
    comes after the tree is formed and rooted. All you need to do is score
    putatively homologous traits, e.g. by aligning DNA sequences. Then you
    use some model of evolutionary processes (which may be as simple as
    "parsimony", i.e. the minimum number of changes) to decide which unrooted
    tree best fits the data. After that you declare a root, most likely by
    deciding that one taxon is the outgroup to the rest.
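
    A minimal Python sketch of that workflow for four taxa, under stated
    assumptions: the taxon names and character data are invented, and the
    counting of changes uses Fitch's algorithm, a standard way to get the
    minimum number of changes on a given tree (the post does not name a
    specific algorithm). With four taxa there are only three unrooted
    topologies, so all of them can be scored directly.

```python
# Hypothetical aligned characters for four taxa (invented for illustration).
data = {
    "W": "AACGT",
    "X": "AACGA",
    "Y": "GTCGA",
    "Z": "GTCCA",
}

def fitch(tree, site):
    """Return (state set, change count) for one site via Fitch's algorithm.

    `tree` is either a taxon name or a pair (left, right) of subtrees.
    """
    if isinstance(tree, str):
        return {data[tree][site]}, 0
    left_states, left_cost = fitch(tree[0], site)
    right_states, right_cost = fitch(tree[1], site)
    common = left_states & right_states
    if common:
        return common, left_cost + right_cost
    return left_states | right_states, left_cost + right_cost + 1

def parsimony_score(tree):
    """Total minimum number of changes over all sites."""
    n_sites = len(next(iter(data.values())))
    return sum(fitch(tree, site)[1] for site in range(n_sites))

# The three unrooted topologies for four taxa. Each is written with an
# arbitrary root on the internal branch; the parsimony score does not
# depend on where that root is placed.
topologies = {
    "WX|YZ": (("W", "X"), ("Y", "Z")),
    "WY|XZ": (("W", "Y"), ("X", "Z")),
    "WZ|XY": (("W", "Z"), ("X", "Y")),
}

scores = {name: parsimony_score(tree) for name, tree in topologies.items()}
best = min(scores, key=scores.get)
print(scores, "best:", best)
```

    No character is labelled ancestral or derived anywhere in this sketch.
    Rooting is a separate, later step: declaring W the outgroup, for
    example, would place the root on the branch leading to W.
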

  • From Peter Nyikos@21:1/5 to Popping Mad on Mon Apr 3 12:26:12 2023
    On Monday, April 3, 2023 at 6:07:54 AM UTC-4, Popping Mad wrote:
    https://www.khanacademy.org/science/ap-biology/natural-selection/phylogeny/a/building-an-evolutionary-tree


    This is quite a nice, down-to-earth treatment of the topic. I'm glad you
    also started a thread on a professional article; I replied to that one
    an hour and a half ago:

    Re: Phylogenetic Trees: The What and The Why
    https://groups.google.com/g/sci.bio.paleontology/c/rX_v-fcwq7s/m/3Y7nZwYrAwAJ

    I happen to prefer doing several posts along the same thread to starting
    a new thread for discussing closely related articles, but YMMV.

    Building a phylogenetic tree
    The logic behind phylogenetic trees. How to build a tree using data
    about features that are present or absent in a group of organisms.
    Key points:
    Phylogenetic trees represent hypotheses about the evolutionary
    relationships among a group of organisms.
    A phylogenetic tree may be built using morphological (body shape),
    biochemical, behavioral, or molecular features of species or other groups.
    In building a tree, we organize species into nested groups based on
    shared derived traits (traits different from those of the group's ancestor).
    The sequences of genes or proteins can be compared among species and
    used to build phylogenetic trees. Closely related species typically have
    few sequence differences, while less related species tend to have more.


    "less related" can easily be defined by using "the path metric" between
    the various pairs of species involved, as in the sentence,
    "Species A is more closely related to species B than it is to species C because
    the path metric from A to C is greater than the one from A to B."
    This metric is formally defined in:

    https://people.math.wisc.edu/~roch/research_files/review-steel-ams.pdf

    I gave an explanation of the path metric, at a level intermediate between
    the above article and the one we are reviewing here, in the post I
    referenced above.
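
    For readers who want something concrete, here is a minimal Python
    sketch, assuming the path metric between two tips is the sum of the
    branch lengths along the unique path connecting them (the linked review
    gives the formal definition). The tree, its branch lengths, and the
    function names are invented for illustration.

```python
# Hypothetical unrooted tree with invented branch lengths.
# Tips: A, B, C, D; internal nodes: u, v.
edges = [
    ("A", "u", 1.0),
    ("B", "u", 2.0),
    ("u", "v", 1.5),
    ("C", "v", 3.0),
    ("D", "v", 0.5),
]

def build_adjacency(edges):
    """Turn an edge list into an undirected weighted adjacency map."""
    adjacency = {}
    for a, b, length in edges:
        adjacency.setdefault(a, {})[b] = length
        adjacency.setdefault(b, {})[a] = length
    return adjacency

def path_metric(adjacency, start, goal):
    """Sum the branch lengths along the unique path from start to goal."""
    stack = [(start, None, 0.0)]  # (node, node we came from, distance so far)
    while stack:
        node, parent, dist = stack.pop()
        if node == goal:
            return dist
        for neighbor, length in adjacency[node].items():
            if neighbor != parent:
                stack.append((neighbor, node, dist + length))
    raise ValueError(f"no path from {start} to {goal}")

adjacency = build_adjacency(edges)
d_ab = path_metric(adjacency, "A", "B")
d_ac = path_metric(adjacency, "A", "C")
print(d_ab, d_ac)
```

    On this made-up tree the distance from A to B is smaller than the
    distance from A to C, so in the sense of the sentence above A would be
    more closely related to B than to C.
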


    Introduction
    We're all related—and I don't just mean us humans, though that's most definitely true! Instead, all living things on Earth can trace their
    descent back to a common ancestor.

    This is debatable; but substitute "eukaryotes" for "living things" and you
    are on solid ground. If you try to include prokaryotes, you get involved
    in endosymbiosis and also a possibly excessive amount of lateral genetic transfer.


    Any smaller group of species can also
    trace its ancestry back to a common ancestor, often a much more recent one.
    Given that we can't go back in time and see how species evolved, how can
    we figure out how they are related to one another? In this article,
    we'll look at the basic methods and logic used to build phylogenetic
    trees, or trees that represent the evolutionary history and
    relationships of a group of organisms.

    Overview of phylogenetic trees
    In a phylogenetic tree, the species of interest are shown at the tips of
    the tree's branches.

    This is in contrast to what are called "evolutionary trees," also known as Besseyan cactuses or commagrams,
    which also show species at branching points.
    Here is an example which puts species at all but one of the branching points:
    https://en.wikipedia.org/wiki/Evolutionary_taxonomy#/media/File:DidymCactus.png


    [snip of introductory material]


    How do we build a phylogenetic tree? The underlying principle is
    Darwin’s idea of “descent with modification.” Basically, by looking at the pattern of modifications (novel traits) in present-day organisms, we
    can figure out—or at least, make hypotheses about—their path of descent from a common ancestor.
    [...]
    When we are building phylogenetic trees, traits that arise during the evolution of a group and differ from the traits of the ancestor of the
    group are called derived traits. In our example [of a made-up group of mouse species],
    a fuzzy tail, big ears, and whiskers are derived traits, while a skinny tail,
    small ears, and lack of whiskers are ancestral traits. An important point is that a
    derived trait may appear through either loss or gain of a feature. For instance, if there were another change on the E lineage that resulted in loss of a tail, taillessness would be considered a derived trait.

    Yes, unless the whole group had a tailless ancestor somewhere down the line. Tails go back to before vertebrates evolved, so it's only of interest if that tailless species were a mouse, or at worst a rodent.

    Derived traits shared among the species or other groups in a dataset are
    key to helping us build trees. As shown above, shared derived traits
    tend to form nested patterns that provide information about when
    branching events occurred in the evolution of the species.
    When we are building a phylogenetic tree from a dataset, our goal is to
    use shared derived traits in present-day species to infer the branching pattern of their evolutionary history. The trick, however, is that we can’t watch our species of interest evolving and see when new traits
    arose in each lineage.
    Instead, we have to work backwards. That is, we have to look at our
    species of interest – such as A, B, C, D, and E – and figure out which traits are ancestral and which are derived. Then, we can use the shared derived traits to organize the species into nested groups like the ones shown above. A tree made in this way is a hypothesis about the
    evolutionary history of the species – typically, one with the simplest possible branching pattern that can explain their traits.

    The part after the dash is the "maximum parsimony" (MP) method.
    There are others, like maximum likelihood (ML) and neighbor joining (NJ),
    the latter being used sometimes due to a drawback of MP:

    "However, as Theorem 2 suggests, the space of trees is large and it turns out that constructing a maximum parsimony tree T∗ is in fact computationally intractable. (See e.g. Section 5.3 of the book under review for more details.) Nevertheless a natural
    heuristic for minimizing the parsimony score of C, which has proved useful in practice, is to perform a local search on tree space."
    --https://people.math.wisc.edu/~roch/research_files/review-steel-ams.pdf


    Peter Nyikos
    Professor, Dept. of Mathematics -- standard disclaimer--
    University of South Carolina
    http://people.math.sc.edu/nyikos

  • From Peter Nyikos@21:1/5 to Peter Nyikos on Mon Apr 3 12:40:02 2023
    On Monday, April 3, 2023 at 3:26:14 PM UTC-4, Peter Nyikos wrote:
    On Monday, April 3, 2023 at 6:07:54 AM UTC-4, Popping Mad wrote:

    In a phylogenetic tree, the species of interest are shown at the tips of the tree's branches.

    This is in contrast to what are called "evolutionary trees," also known as Besseyan cactuses or commagrams,
    which also show species at branching points.
    Here is an example which puts species at all but one of the branching points:
    https://en.wikipedia.org/wiki/Evolutionary_taxonomy#/media/File:DidymCactus.png

    Correction: all but two of the branching points. They are represented by balloons
    with question marks in them: one turquoise, the other white, near the top, with 1.5 also in it.


    Peter Nyikos
    Professor, Dept. of Mathematics -- standard disclaimer--
    University of South Carolina
    http://people.math.sc.edu/nyikos

  • From John Harshman@21:1/5 to Peter Nyikos on Mon Apr 3 17:16:35 2023
    On 4/3/23 12:26 PM, Peter Nyikos wrote:
    The part after the dash is the "maximum parsimony" (MP) method.
    There are others, like ML and NJ, the latter being used sometimes
    due to a drawback of MP:

    "However, as Theorem 2 suggests, the space of trees is large and it turns out that constructing a maximum parsimony tree T∗ is in fact computationally intractable. (See e.g. Section 5.3 of the book under review for more details.) Nevertheless a
    natural heuristic for minimizing the parsimony score of C, which has proved useful in practice, is to perform a local search on tree space."
    --https://people.math.wisc.edu/~roch/research_files/review-steel-ams.pdf

    That's not a difference between MP and ML. It's a feature of any method
    that uses a search of possible trees, in which one evaluates some
    measure of fit for a number of trees. It's only guaranteed to find the
    best tree if you evaluate all possible trees. Since that's usually
    prohibitive, one resorts to heuristics.
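
    To give a sense of why evaluating all possible trees is usually
    prohibitive, this Python sketch counts the unrooted, fully bifurcating
    trees on n labeled taxa using the standard formula (2n-5)!!, the
    product of the odd numbers up to 2n-5. The function name is made up
    for this illustration.

```python
def count_unrooted_binary_trees(n_taxa):
    """Number of distinct unrooted binary trees on n labeled tips: (2n-5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least three taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5 (the leading factor 1 is implicit).
    for odd in range(3, 2 * n_taxa - 4, 2):
        count *= odd
    return count

for n in (4, 5, 10, 20):
    print(n, count_unrooted_binary_trees(n))
```

    Four taxa give 3 trees and five give 15, but ten already give 2,027,025,
    and the count keeps growing faster than exponentially after that.
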
