- Problem: most sequences do not have enough information in iso_source to conclude where the sequence originated
- Use the title of the paper (listed in the GenBank file) to look up the paper
- Find the geographic location and origin of the sequence in the Methods section
- Time-intensive. Is there a better way to do this?
- Categorize the sequences as Terrestrial, Aquatic, or Air-borne
- Subcategorize into the fields listed in the Excel doc
- BLAST-gg-aligned-with-otus.xlsx
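One possible way to cut down the manual work for the top-level categories is a whole-word keyword match on the isolation source text. The sketch below is only an illustration: the keyword lists and the `categorize` function name are assumptions, not taken from the Excel doc, and anything that comes back as "Unknown" would still need the manual paper lookup.

```python
import re

# Hypothetical keyword lists (assumptions for illustration only);
# adjust them to match the categories actually used in the spreadsheet.
HABITAT_KEYWORDS = {
    "Aquatic": {"water", "seawater", "lake", "river", "marine", "ocean"},
    "Terrestrial": {"soil", "rhizosphere", "forest", "desert", "compost"},
    "Air-borne": {"air", "aerosol", "dust"},
}


def categorize(isolation_source):
    """Return a habitat name, or "Unknown" if zero or several habitats match."""
    if not isolation_source:
        return "Unknown"
    # Compare whole words so that "air" does not match inside "prairie".
    words = set(re.findall(r"[a-z]+", isolation_source.lower()))
    matches = [
        habitat
        for habitat, keywords in HABITAT_KEYWORDS.items()
        if words & keywords
    ]
    # Ambiguous sources (more than one habitat) are left for manual review.
    return matches[0] if len(matches) == 1 else "Unknown"


if __name__ == "__main__":
    for text in ("lake water", "forest soil", "indoor air", "prairie"):
        print(text, "->", categorize(text))
```

Returning "Unknown" for ambiguous text is a deliberate choice here: it keeps the automatic pass conservative and leaves the hard cases for the paper lookup.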
Thursday, October 16, 2014
GenBank - Relate Metadata to Phylo
Use the isolation source (extracted from GenBank files using extract-data-from-genbank.py) to determine which habitat each sequence was obtained from.
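As a rough sketch of what such an extraction step could look like (this is not the actual extract-data-from-genbank.py, whose contents are not shown here), the snippet below pulls the `/isolation_source` qualifier and the first `TITLE` entry out of GenBank flat-file text using only the standard library. The function name `extract_source_and_title` and the sample record are made up for illustration.

```python
import re

# Made-up record for illustration; not a real GenBank entry.
SAMPLE = """\
LOCUS       XX000000                1400 bp    DNA     linear   ENV 01-JAN-2010
REFERENCE   1  (bases 1 to 1400)
  AUTHORS   Smith,J. and Doe,A.
  TITLE     Bacterial diversity in a
            freshwater lake
  JOURNAL   Unpublished
FEATURES             Location/Qualifiers
     source          1..1400
                     /organism="uncultured bacterium"
                     /isolation_source="lake water"
"""


def extract_source_and_title(record_text):
    """Return (isolation_source, title); either may be None if absent."""
    # Qualifier values can wrap across lines, so collapse whitespace afterwards.
    match = re.search(r'/isolation_source="([^"]*)"', record_text)
    iso_source = " ".join(match.group(1).split()) if match else None

    # Collect the first TITLE line plus its more deeply indented continuations.
    title_parts = []
    in_title = False
    for line in record_text.splitlines():
        indent = len(line) - len(line.lstrip(" "))
        if not in_title and line.lstrip(" ").startswith("TITLE"):
            in_title = True
            title_parts.append(line.split("TITLE", 1)[1].strip())
        elif in_title and indent > 2:
            title_parts.append(line.strip())
        elif in_title:
            break  # reached the next keyword, e.g. JOURNAL
    title = " ".join(title_parts) if title_parts else None
    return iso_source, title


if __name__ == "__main__":
    print(extract_source_and_title(SAMPLE))
```

Keeping the paper title alongside the isolation source means the lookup step has both pieces in one place for sequences whose source text is too sparse.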
Labels:
GenBank