Problem Set 4
- Due May 21, 2021 by 11:59pm
- Points 100
This assignment is worth 15% of your final grade. It is graded out of 100 points.
Objectives
- Gain experience using standard short-read alignment, genotyping, and visualization techniques.
- Learn how to detect artifacts in alignment or variant calling that can arise from next-generation sequencing analysis.
- Learn how to use methods for filtering and prioritizing variants in medical NGS studies.
- Explore new long-read sequencing technologies.
Data for this problem set is provided in /datasets/cs284-sp21-A00-public/ps4.
This problem set consists of 4 parts, each in a separate notebook so that you can work on each problem and validate its notebook separately:
- CSE284-PS4-PART1.ipynb: Sequence alignment and visualization (15 points)
- CSE284-PS4-PART2.ipynb: Writing a simple SNP caller (35 points)
- CSE284-PS4-PART3.ipynb: SNP calling with long reads (25 points)
- CSE284-PS4-PART4.ipynb: Mutation hunting in Kabuki Syndrome (25 points)