Searching for a Motif

In computational biology, we sometimes refer to important subsequences as motifs. Suppose we know a particular motif and want to locate it within a larger sequence.

For example, the SARS-CoV-2 genome is about 30,000 nucleotides long, and the gene that encodes the spike protein is nearly 4,000 nucleotides in length. The 27-nucleotide sequence "GGCGGCTTCAATTTCAGCCAGATTCTG" codes for a small protein domain called the fusion peptide.
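To make the idea of locating a motif concrete, here is a minimal Python sketch. It assumes the sequences are plain strings of A, C, G, and T. The helper name `find_motif` and the flanking bases in `toy_sequence` are made up for illustration and are not real genome data; the activity notebook may take a different approach.

```python
# The 27-nucleotide motif quoted in the text above.
MOTIF = "GGCGGCTTCAATTTCAGCCAGATTCTG"


def find_motif(sequence, motif):
    """Return the 0-based start position of every occurrence of motif in sequence."""
    positions = []
    start = sequence.find(motif)  # str.find returns -1 when there is no match
    while start != -1:
        positions.append(start)
        # Move forward by one base so overlapping occurrences are also found.
        start = sequence.find(motif, start + 1)
    return positions


# Hypothetical flanking bases for illustration only; NOT real genome data.
toy_sequence = "ATGCATGC" + MOTIF + "TTAGGCAT"

print(len(MOTIF))                       # 27
print(find_motif(toy_sequence, MOTIF))  # [8]
```

The same function should work on a full genome once it has been read into a string, and an empty list means the motif was not found.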




First, let's access the notebooks for this module and this activity by copy-pasting the following code into a Jupyter notebook environment:

	import os

	# Download the four notebooks for this module into the current directory.
	base_url = "http://compbiocamp.cgrb.oregonstate.edu/notebooks/"
	for name in ["Motif_Search", "Genome_Assembly", "Plotting", "Sequence_Comparison"]:
	    os.system("wget " + base_url + name + ".ipynb")
      

Open Jupyter Hub

Now open the notebook for this activity, Motif_Search.ipynb. The notebook will provide further instructions.
