I feel this should be obvious, but which command(s) in mothur will create a .names and a .groups file? Or do I create these myself using a program like Excel?
list.seqs will generate a list of the sequence names (it writes them to an .accnos file).
The groups file is where you specify which sequences belong in which treatment/sample; the program has no way of knowing this, so I believe you have to define this yourself ahead of time. However, there are functions in mothur that can help with this if you have a large number of sequences. For example, you can use list.seqs to pull out the names, then use a simple sed statement (this step NOT in mothur, just at the command line) such as:
sed 's/$/ AppendThisStringToEachLine/' InputFileName > OutputFileName
to add the group name to the end of each line in that list of sequence names. Do this for each sample's list, then use merge.files in mothur to put them all together into a single groups file.
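If you would rather do the whole thing in one step, here is a rough Python sketch of the same idea. It assumes you have one fasta file per sample and that the sequence name is the header text up to the first whitespace; the function names and the file/group names in the usage note are made up for illustration, so adjust them to your data.

```python
def fasta_names(fasta_path):
    """Yield each sequence name: the header text after '>' up to the first whitespace."""
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    yield fields[0]


def write_group_file(sample_fastas, out_path):
    """Write one 'name<TAB>group' line per sequence and return how many were written.

    sample_fastas maps a group (sample) name to the path of that sample's fasta file.
    """
    count = 0
    # newline="\n" keeps plain Unix line endings regardless of platform.
    with open(out_path, "w", newline="\n") as out:
        for group, fasta_path in sample_fastas.items():
            for name in fasta_names(fasta_path):
                out.write(f"{name}\t{group}\n")
                count += 1
    return count
```

For example, write_group_file({"soilA": "soilA.fasta", "soilB": "soilB.fasta"}, "combined.groups") would write every sequence from both (hypothetical) files into one two-column file. mothur also has a make.group command that is meant for this job, so it is worth checking that first.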
unique.seqs will also generate a .names file.
This is all very new to me, so it’s probably a silly question: I’m going through the MiSeq SOP but skipping the first stage, make.contigs. I’ve been trying to create a *.groups file manually by manipulating the fasta file in Excel (please don’t laugh). When running screen.seqs on the fasta and groups files I get these error messages: “Your groupfile does not include the sequence “X” please correct”, but only for some of the sequences. When I search manually, I can find this sequence name in the groups file.
If I try to work around this by leaving out the groups file at this stage, creating it instead from the fasta file produced by unique.seqs, and then moving on to count.seqs, I get this: “[ERROR]: “X” is not in your groupfile” for all of my sequences.
I’m using 2 processors.
Thank you very much!
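One way to narrow down the “not in your groupfile” errors described above is to compare the names in the two files strictly and print them with repr(), so that stray characters a spreadsheet export can leave behind (trailing spaces, quotes, carriage returns) become visible. This is only a diagnostic sketch: it says nothing about how mothur itself parses the files, it assumes the groups file is tab-separated with the sequence name in the first column, and the function names are made up.

```python
def read_fasta_names(fasta_path):
    """Return the sequence names (header text up to the first whitespace), in file order."""
    names = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.append(fields[0])
    return names


def read_group_first_column(group_path):
    """Return the set of raw first-column values, split on tabs only.

    newline="" stops Python from translating line endings, so any stray
    characters in the file survive and show up in the comparison.
    """
    names = set()
    with open(group_path, newline="") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.strip():
                names.add(line.split("\t")[0])
    return names


def compare_names(fasta_path, group_path):
    """Return (names in the fasta but not the groups file, names only in the groups file)."""
    fasta = read_fasta_names(fasta_path)
    group = read_group_first_column(group_path)
    missing = [name for name in fasta if name not in group]
    extra = sorted(group - set(fasta))
    return missing, extra


def report(fasta_path, group_path):
    """Print both lists with repr() so hidden characters are visible."""
    missing, extra = compare_names(fasta_path, group_path)
    for name in missing:
        print("in fasta only: ", repr(name))
    for name in extra:
        print("in groups only:", repr(name))
```

If a name appears in both lists with a small difference between the two printed forms, that difference is a likely reason the files do not match.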