Tree of Life Activity Part 3

Compute and Visualize a Phylogenetic Tree

COVID-19 is caused by a type of coronavirus, SARS-CoV-2, which can infect a variety of animal species, including cats, dogs, lions, and tigers. The coronavirus gets inside the cells of animal hosts and causes infection by binding to a specific protein on the cell surface, called ACE2, which is encoded by the ACE2 gene.


ACE2 is an important gene in animals because it is involved in homeostasis. ACE2 is especially interesting to scientists right now because of its association with the coronavirus.


Scientists can use a phylogenetic tree to understand how the DNA sequence of ACE2 is similar or different among many different animals, and to identify animals that could potentially be infected by the coronavirus.


In today's activity, we will use the sequence of the ACE2 gene from different animal species to construct a phylogenetic tree and compare how related they are.


Once we create our tree, we will be able to make inferences about which species are more closely related to each other, based on our visual analysis!






We will use DNA sequences for the following animals:

	Common Name    Scientific Name
	Human          Homo sapiens
	Mouse          Mus musculus
	Dog            Canis lupus familiaris
	Cow            Bos taurus
	Cat            Felis catus
	Frog           Xenopus tropicalis
	Pika           Ochotona princeps
	Coelacanth     Latimeria chalumnae
	Hedgehog       Erinaceus europaeus
	Big Brown Bat  Eptesicus fuscus
	Hummingbird    Calypte anna
	Bald Eagle     Haliaeetus leucocephalus
	Camel          Camelus dromedarius
	Koala          Phascolarctos cinereus
	Vampire Bat    Desmodus rotundus
	Narwhal        Monodon monoceros
	Horseshoe Bat  Rhinolophus ferrumequinum
      

Compute and visualize a phylogenetic tree

You will need to navigate to your Jupyter notebook to complete this activity.

In this activity we will use a Jupyter Notebook that has some pre-written code. Here is a button to open Jupyter if you don't already have it open:

Open Jupyter Hub

Download the notebook for this activity by running the following code in a cell in your Jupyter notebook environment:

import os
os.system("wget http://compbiocamp.cgrb.oregonstate.edu/notebooks/Activity3_TreeOfLife_Part3.ipynb")


After we align our sequences with clustalw2, we will want to look at the file containing the multiple sequence alignment, called "ACE2.aln".

To do this, you will need to open a "Bash" kernel from your Jupyter notebook page.



Once the Bash kernel is open, we will type the command below and press "play":

less ACE2.aln


The contents of the file should look like this:



The alignments in this file let us see where gene sequences are very similar.

The '*' character in the last row marks a perfectly conserved position: all species share the same nucleotide in that column.

Human              TTTGCTTGGTGATATGTGGGGTAGATTTTGGACAAATCTGTACTCTTTGACAGTTCCCTT
Mouse              TTTGCTTGGTGATATGTGGGGTAGATTTTGGACAAATCTGTACCCTTTGACTGTTCCCTT
Hedgehog           TTTGCTTGGTGATATGTGGGGTAGATTTTGGACAAATCTATACCGCTTGACAGTCCCCTA
Camel              TTTGCTTGGCGATATGTGGGGTAGATTTTGGACAAATCTATACTCTTTGACAGTCCCCTT
Narwhal            TTTGCTTGGTGATATGTGGGGGAGATTTTGGACAAATCTGTACCCTTTGACAGTCCCCTT
Big_Brown_Bat      TTTGCTTGGTGATATGTGGGGTAGATTTTGGACAAATCTGTACAATCTGACAGTCCCCTT
Horseshoe_Bat      TTTGCTTGGTGATATGTGGGGTAGATTTTGGACAAATCTGTACCCTTTGACAGTCCCCTT
Vampire_Bat        TTTGCTCGGTGATATGTGGGGTAGATTTTGGACAAATCTGTACAATTTGACAGCCCCCTT
Cat                TTTGCTTGGCGATATGTGGGGTCGATTTTGGACAAATCTGTACCCTTTGACAGTCCCCTT
Dog                TTTGCTTGGTGATATGTGGGGTAGATTTTGGACAAATCTGTACCCTTTGACAGTCCCCTT
Cow                TTTGCTTGGTGATATGTGGGGGAGATTTTGGACAAATCTGTACTCTTTGACAGTCCCCTT
Pika               TTTGCTTGGTGACATGTGGGGTAGATTTTGGACAAACCTGTACTCTTTGACAGTCCCCTT
Koala              TTTGCTTGGTGATATGTGGGGCAGATTTTGGACAAATCTATATTCACTGACAGTGCCCTA
Hummingbird        CTTGCTGGGTGATATGTGGGGTAGATTTTGGACAAATCTGTATCCCTTGACTGTTCCCTA
Bald_Eagle         CTTGCTGGGTGATATGTGGGGTAGATTTTGGACAAATCTGTATGCCTTGACCGTTCCCTA
Coelacanth         TTTACTTGGTGACATGTGGGGAAGATTTTGGACAAACTTGTACCCCTTGGCTGTCCCATA
Frog               TTTGCTTGGTGATATGTGGGGAAGATTTTGGACAAATTTGTATCCTCTGATGGTCCCCTA
                    ** ** ** ** ********  *************  * **     **   *  ** * 
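
The '*' row can be reproduced with a few lines of Python. The sketch below is only an illustration and is not part of the activity notebook: the function name conservation_row is made up here, and the three input strings are just the first 10 columns of the Human, Hummingbird, and Coelacanth rows shown above. It marks a column with '*' only when every sequence has the same nucleotide there.

```python
def conservation_row(sequences):
    """Return a string with '*' under every column where all sequences agree.

    Hypothetical helper for illustration; clustalw2 writes this row itself.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("aligned sequences must all have the same length")

    row = []
    # zip(*sequences) walks through the alignment one column at a time
    for column in zip(*sequences):
        first = column[0]
        # A column is conserved only if it has no gap and every character matches
        if first != "-" and all(base == first for base in column):
            row.append("*")
        else:
            row.append(" ")
    return "".join(row)


# First 10 columns of three rows from the alignment above
human = "TTTGCTTGGT"
hummingbird = "CTTGCTGGGT"
coelacanth = "TTTACTTGGT"

print(repr(conservation_row([human, hummingbird, coelacanth])))
# prints ' ** ** ***'
```

With only three species, column 10 counts as conserved. In the full alignment it does not, because Camel and Cat have a C at that position.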


We can also see where parts of the sequence are similar only among certain species.

The '-' character marks a gap: a position where a species has no matching nucleotide in the alignment.

Question: in the section of the alignment shown below, which species are most similar? Most different?

Human              AA---TCTATGTTTTTCCTCTTGAGGTGATTTTGTTGTATGTAAATGTTAATTTCAT--G
Mouse              AG---CACTTGTC---------------ATCTTCCTGTATGTAAATGCTAACTTCAT--A
Hedgehog           AAG-CCGTTTGCATTTCTCCTTGAGGTGATTTGATTATCCATAAATATTAATTTCA----
Camel              AA--TCTATTTTATTTCCTCTTGAGGTGATTTTATTGTATGTAAATGTTAGTTTCAC--G
Narwhal            AA--TCTCTTTTATTTCCTCTTGAGGTGATCTTATTGTATGTAAATGTTAATTTCAC--A
Big_Brown_Bat      AA--TCCATTTTATTTCCTCTTGAGGTGATTTTATGGCATGTAAATGTTAATTTCAC--A
Horseshoe_Bat      AA--TCTATTTTATTTCCTCTTGAGGTGATTTCATTGTATGTAAATGTTAATTTCGT--A
Vampire_Bat        CA--TCATTTATATTTTCCCTTGAGGTGATTTTATCATACATACATGTTAATTTCAC--A
Cat                AA-------TCTATTTCCTCTTGAGGTGATTTCATTGTATGTAAATGTTGATTTTAC--G
Dog                ------------------------------------------------------------
Cow                TAG-TCATGAGAAGC------TAAAATAGGACTCGTGTACTTCTGTGTCAAG---AT--A
Pika               AGT-CCTTTTCTTTTTGAGGTGAAGTTAATGTGGCAGGCCGAAAGAAGCGAGAACA---A
Koala              AT----CTTTTCATTT----TTTTACTCATTTCATTTCCTGTCAATACTAACATTATGTA
Hummingbird        ATG-AATCTACA------------------------------------------------
Bald_Eagle         ------------------------------------------------------------
Coelacanth         AAA-ATAAATATGAAAAAGTCTCAATTAATTCCCTTTTAATTGTGCTCTTAAA-------
Frog               AGCATTTGATGGAAATGATAATAAAGCGATTGACGGAAATGACAATAAAGGGATTGA-TG
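
One way to put a number on "similar" is percent identity: the share of aligned positions where two species have the same nucleotide. The sketch below is a minimal illustration, not the method clustalw2 uses to build the tree. The function name percent_identity is made up here, and skipping every column that contains a gap is an assumption (other tools may count gaps as mismatches). The inputs are only the first 12 columns of the Camel, Narwhal, and Dog rows above.

```python
def percent_identity(seq_a, seq_b):
    """Percent of matching positions between two aligned sequences.

    Hypothetical helper for illustration. Columns where either sequence
    has a gap ('-') are skipped. Returns None if no columns can be compared.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip gap columns (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return None
    return 100.0 * matches / compared


# First 12 columns of three rows from the alignment above
camel = "AA--TCTATTTT"
narwhal = "AA--TCTCTTTT"
dog = "------------"

print(percent_identity(camel, narwhal))  # prints 90.0
print(percent_identity(camel, dog))      # prints None
```

A row made entirely of gaps, like Dog in this section, cannot be compared at all, which is why the function returns None instead of 0.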


