names file

umberar7 · April 12, 2010, 4:24pm

Hi,
I’m following the Costello pyrosequencing example to analyze my dataset. I’ve completed all the steps up to the alpha diversity measurements. I’m having difficulty deciding which names file I should use:
My fasta file (unique) has 33,255 sequences.
My names file has 33,255 unique sequences (I have 33,255 rows in the names file), but the second column lists all the non-unique sequences.
In order to continue with the alpha diversity measurements, I need to create a groups file that contains all the sequence IDs listed in the names file (unique and non-unique). So I run list.seqs to produce an accnos file, which I can use to produce a new groups file containing all 50,970 sequence IDs listed in the names file (unique and non-unique sequence IDs). Is this correct? Or should I be running unique.seqs on my fasta file containing only the unique sequences to get a names file containing only unique sequence IDs? I hope this is not too confusing… Thanks
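
For readers checking the same numbers on their own data, here is a minimal sketch of the count described above. It assumes the usual two-column names file layout (representative ID, a tab, then a comma-separated list of every ID it stands for); the function name `count_names` and the example IDs are made up for illustration.

```python
def count_names(lines):
    """Return (rows, total_ids) for a two-column, tab-separated names file.

    Each non-empty line is assumed to be:
        representative_id <TAB> id1,id2,...
    """
    rows = 0
    total_ids = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _representative, members = line.split("\t")
        rows += 1
        total_ids += len(members.split(","))
    return rows, total_ids


# Made-up example: 2 unique sequences standing in for 3 reads.
example = ["seqA\tseqA,seqB\n", "seqC\tseqC\n"]
print(count_names(example))
```

On a real file, the first number should match the number of sequences in the unique fasta file, and the second is the number of IDs the groups file has to cover.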

pschloss · April 12, 2010, 6:22pm

If you’re following the script, then you should already have the group file that you need to run read.otu. The original groups file is generated by the trim.seqs command, and sequences are then removed from it as you go along in the screen.seqs commands.

umberar7 · April 12, 2010, 6:25pm

Actually, when I ran trim.seqs I did not generate a groups file. I think this is because I didn’t use the oligos option (my dataset already had the barcodes and primer sequences removed before I received it, so I don’t have an oligos file). Instead, I left out the group file option in screen.seqs and only created a groups file from my list file (created by cluster()). What should I have used to generate a groups file, and at what point should I have made it?

pschloss · April 12, 2010, 8:39pm

If you’re doing 454 sequence analysis, the easiest way to do it is to let trim.seqs make the groups file for you. Alternatively, what you described in your first post sounds right.

Pat

umberar7 · April 12, 2010, 9:24pm

Sounds good. So if I don’t include the oligos= option, I will not get a group file (which is the problem I encountered), so instead I created a group file by using list.seqs(fasta=…trim.fasta) and used this group file for downstream processing.
Thanks

pschloss · April 13, 2010, 11:57am

right, but you’ll have to do some file manipulation to make the group file.
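
One possible shape for that file manipulation, as a rough sketch only: expand every ID in the names file into a two-column group file line (sequence ID, tab, group name). The thread does not say how reads map to samples once the barcodes are gone, so the rule in `group_from_prefix` (IDs of the form `sampleName_readNumber`) is purely a hypothetical stand-in, as are both function names.

```python
def names_to_group_lines(names_lines, group_for):
    """Yield 'sequence_id<TAB>group' for every ID listed in a names file.

    names_lines: iterable of 'representative<TAB>id1,id2,...' lines.
    group_for:   function mapping a sequence ID to its group name.
    """
    for line in names_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _representative, members = line.split("\t")
        for seq_id in members.split(","):
            yield f"{seq_id}\t{group_for(seq_id)}"


def group_from_prefix(seq_id):
    # Hypothetical rule: everything before the first underscore is the sample.
    return seq_id.split("_", 1)[0]


# Made-up example IDs; a real run would read the names file from disk.
example = ["gut_1\tgut_1,gut_2\n", "skin_5\tskin_5\n"]
for row in names_to_group_lines(example, group_from_prefix):
    print(row)
```

If the sample cannot be read off the sequence ID, `group_for` would need to look it up somewhere else, for example in a table supplied by the sequencing facility.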
