Imagine that you have just received an important message. Unfortunately, the message has some problems: it has been cut up into pieces, and the pieces are all out of order!
What do you notice about the pieces of the message?
0 | UMANSSHARE60 |
1 | %OFTHEIRDNAW |
2 | FTHEIRDNAWIT |
3 | E60%OFTHEIRD |
4 | ARE60%OFTHEI |
5 | EIRDNAWITHAB |
6 | RDNAWITHABAN |
7 | MANSSHARE60% |
8 | SSHARE60%OFT |
9 | ANSSHARE60%O |
10 | SHARE60%OFTH |
11 | HARE60%OFTHE |
12 | IRDNAWITHABA |
13 | AWITHABANANA |
14 | HUMANSSHARE6 |
15 | NAWITHABANAN |
16 | 60%OFTHEIRDN |
17 | DNAWITHABANA |
18 | HEIRDNAWITHA |
19 | THEIRDNAWITH |
20 | OFTHEIRDNAWI |
21 | RE60%OFTHEIR |
22 | 0%OFTHEIRDNA |
23 | NSSHARE60%OF |
See if you can work out the order of the message, using the fact that the last 11 characters of each piece are the first 11 characters of the next piece.
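If you would like to check your answer by computer, the overlap rule can be sketched in a few lines of Python. This is only a minimal sketch, not the method used in the notebook: the function name `assemble` and the toy pieces are made up for illustration, and the code assumes that every piece has the same length, that no two pieces start with the same characters, and that exactly one piece can come first. The toy word is used instead of the pieces above so the puzzle is not spoiled.

```python
def assemble(pieces, overlap):
    """Chain pieces so that the last `overlap` characters of each piece
    match the first `overlap` characters of the next piece."""
    # Look up a piece by its first `overlap` characters
    # (assumption: these prefixes are unique).
    by_prefix = {p[:overlap]: p for p in pieces}

    # The first piece is the only one whose prefix is not the
    # suffix of some other piece.
    suffixes = {p[-overlap:] for p in pieces}
    starts = [p for p in pieces if p[:overlap] not in suffixes]
    if len(starts) != 1:
        raise ValueError("expected exactly one possible starting piece")

    current = starts[0]
    message = current
    for _ in range(len(pieces) - 1):
        # Follow the overlap to the next piece and append only its new characters.
        current = by_prefix[current[-overlap:]]
        message += current[overlap:]
    return message


# Toy example: 4-character pieces that overlap by 3 characters.
toy_pieces = ["SEMB", "ASSE", "MBLY", "SSEM", "EMBL"]
print(assemble(toy_pieces, overlap=3))  # prints ASSEMBLY
```

To try it on the puzzle, you could type the 24 pieces into a list and call `assemble(pieces, overlap=11)`.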
Q: What does this have to do with computational biology?
A: To sequence the genome of an organism, we read the nucleotide sequence from the physical molecules in shorter, overlapping chunks called reads. Then, we assemble the reads into the full sequence.
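To make the idea of reads concrete, here is a small, idealized Python sketch that cuts a sequence into overlapping chunks and shuffles them, the same way the message above was cut up. The function name `make_reads` and the toy sequence are made up for illustration. The sketch assumes one error-free read starting at every position, which is a simplification: real sequencing reads start at scattered positions and can contain errors.

```python
import random


def make_reads(sequence, read_length, seed=0):
    """Cut `sequence` into every overlapping chunk of `read_length`
    characters, then shuffle the chunks."""
    reads = [
        sequence[i:i + read_length]
        for i in range(len(sequence) - read_length + 1)
    ]
    # A fixed seed keeps the shuffle repeatable between runs.
    random.Random(seed).shuffle(reads)
    return reads


reads = make_reads("GATTACA", read_length=4)
print(len(reads))     # prints 4
print(sorted(reads))  # prints ['ATTA', 'GATT', 'TACA', 'TTAC']
```

Neighboring reads here share 3 of their 4 characters, just as neighboring pieces of the message share 11 of their 12.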
Now please click the notebook for this activity: Genome_Assembly.ipynb. The notebook will provide more instructions.