Research: GenBank - Curto Results File

Thursday, October 9, 2014

GenBank - Curto Results File

Started from master file BLAST-combined.fasta; needed to extract the curto sequences from it

Build algorithm to go through the file, pull out the curto sequences, and write them to a new file

Make a reference text file with curto accession numbers

curto-accession-numbers.txt

Run code to make new curto only file

curto-only2.py
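
A minimal sketch of this kind of filter, not the actual curto-only2.py. It assumes the accession number is the first whitespace-delimited token of each FASTA header and that curto-accession-numbers.txt lists one accession per line; the function names are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_accession(records, wanted):
    """Keep records whose accession is in `wanted`.

    Assumption: the accession is the first token of the header line.
    """
    for header, seq in records:
        parts = header.split()
        if parts and parts[0] in wanted:
            yield header, seq


def filter_fasta_file(fasta_path, accession_path, out_path):
    """Write only the wanted records from fasta_path to out_path."""
    with open(accession_path) as fh:
        wanted = {line.strip() for line in fh if line.strip()}
    with open(fasta_path) as src, open(out_path, "w") as dst:
        for header, seq in filter_by_accession(read_fasta(src), wanted):
            dst.write(">%s\n%s\n" % (header, seq))
```

If the headers in BLAST-combined.fasta carry the accession in a different position (for example inside a pipe-delimited field), the line that extracts `parts[0]` would need to change accordingly.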

Problem - new file had >4000 sequences (should be 982)

Multiple duplicate accession numbers - need to remove
Build algorithm to keep unique accession numbers

duplicate-removal.py
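
A rough sketch of the duplicate removal, not the actual duplicate-removal.py. It assumes records arrive as (header, sequence) pairs, that the accession is the first token of the header, and that keeping the first occurrence of each accession is acceptable; `unique_by_accession` is a hypothetical name.

```python
def unique_by_accession(records):
    """Yield only the first record seen for each accession number.

    Assumption: the accession is the first whitespace-delimited
    token of the header; later duplicates are dropped.
    """
    seen = set()
    for header, seq in records:
        parts = header.split()
        accession = parts[0] if parts else header
        if accession in seen:
            continue
        seen.add(accession)
        yield header, seq
```

Counting the records before and after this step is a quick check that the output lands on the expected number of sequences.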

All curto and frigo bacteria -> curto-and-frigo-only.fasta

Added outlier sequence (AB695377.1 Sediminihabitans luteus) for phylo reference

curto-and-frigo-only-with-outlier.fasta

Create OTUs within curto and frigo genera

Used QIIME pick_otus.py

Use default similarity threshold (97%; n = 41)
Use curto-and-frigo-only-with-outlier.fasta as input file

curto-and-frigo-only-with-outlier_otus.txt
Generated biom file - not important with such closely related taxa

otu-table.biom

Pick representative sequence for each OTU

pick_rep_set.py
rep_set.fna

Assign Taxonomy to each rep set to make sure everything has worked so far

assign_taxonomy.py
/taxonomy-results/rep_set_tax_assignments.txt

Must align the rep sequences to a template - Greengenes core database (16S gene)

align_seqs.py
Use PyNAST with min length of 75% of the median sequence length
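
To illustrate the length cutoff only, here is a small sketch that computes 75% of the median sequence length and separates out sequences that fall below it. It assumes the median is taken over the input rep sequences and that a sequence exactly at the cutoff is kept; the function names are hypothetical, and this does not reproduce anything else the aligner checks.

```python
from statistics import median


def min_length_cutoff(lengths, fraction=0.75):
    """Return fraction * median of the given sequence lengths."""
    return fraction * median(lengths)


def split_by_length(seqs, fraction=0.75):
    """Split a dict of id -> sequence into (kept, too_short).

    Assumption: sequences exactly at the cutoff are kept.
    """
    cutoff = min_length_cutoff([len(s) for s in seqs.values()], fraction)
    kept = {k: s for k, s in seqs.items() if len(s) >= cutoff}
    too_short = {k: s for k, s in seqs.items() if len(s) < cutoff}
    return kept, too_short
```

Running this on rep_set.fna before alignment would show which rep sequences are at risk from the length filter alone; sequences can still fail to align for other reasons.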

Filter alignment (filter_alignment.py)

Remove positions which are gaps in every sequence (common for PyNAST, as typical sequences cover only 200-400 bases, and they are being aligned against the full 16S gene)
Removed some OTUs due to failure to align (moved to new file):

OTU2
OTU3
OTU4
OTU8
OTU10
OTU11
OTU26
OTU30
OTU37
OTU38
OTU39
OTU40

Removal of these OTUs reduced the overall number of sequences by 50
Should have 933 sequences left in 30 OTUs
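
The counts above can be double-checked from the OTU map. The sketch below assumes the map is tab-separated with the OTU ID in the first column and the member sequence IDs in the remaining columns, and that the failed OTU IDs are written exactly as they appear in that file; the function names are hypothetical.

```python
def parse_otu_map(lines):
    """Parse OTU map lines into a dict of otu_id -> list of sequence IDs."""
    otus = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if not fields or not fields[0]:
            continue
        otus[fields[0]] = [f for f in fields[1:] if f]
    return otus


def summarize_after_removal(otus, failed_ids):
    """Return (OTUs kept, sequences remaining, sequences removed)."""
    kept = {otu: ids for otu, ids in otus.items() if otu not in failed_ids}
    removed_seqs = sum(len(ids) for otu, ids in otus.items() if otu in failed_ids)
    remaining_seqs = sum(len(ids) for ids in kept.values())
    return len(kept), remaining_seqs, removed_seqs
```

Passing the lines of curto-and-frigo-only-with-outlier_otus.txt and the set of OTUs that failed to align should return the number of OTUs and sequences left, which can be compared against the figures noted here.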

Make phylogeny - newick file

make_phylogeny.py
Need to reroot the tree (in FigTree) on the outlier

For Reference: QIIME Review
