Hello, I am an undergraduate senior and very new to mothur. For a school project I am following the mothur tutorial on the Galaxy server, using published NCBI SRA datasets instead of the tutorial data. I am running into a problem with my count tables (the output from Count.seqs) lacking metadata. The NCBI source page says the layout is paired, so I am skipping the pairing step and Make.contigs (when I try these steps, the command fails). Before any data-cleaning steps, the two samples contained 160,000 and 190,000 sequences, respectively.
These are the steps I ran before hitting the problem:
1. Download and Extract Reads in FASTQ/FASTA (used two different samples from same study)
2. Create collection of the two outputs
3. FASTQ to FASTA converter on collection
4. Summary.seqs on converted collection (logfile=yes)
5. Make.groups on converted collection (automatically from collection)
6. Screen.seqs on converted collection and Make.groups output (maxlength=251, maxambig=0, groupfile=Make.groups output, logfile=yes)
7. Unique.seqs on Screen.seqs output (output format=Name file, logfile=yes)
8. Count.seqs on names output from Unique.seqs and group file from Screen.seqs
This is where the issue occurs. Count.seqs usually fails (a few times, for some reason, only one of the two samples failed). If I exclude the group file, I am missing metadata for the later steps. This becomes a real problem when I get to the Cluster.split command.
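One thing I can check outside Galaxy is whether the sequence names in the names file and the group file actually agree. I do not know that a mismatch is the cause; this is only a guess. Below is a minimal Python sketch of that check, assuming the usual mothur layouts: a names file with two columns (representative name, then a comma-separated list of all names it stands for) and a group file with two columns (sequence name, then group). The function names are my own and are not part of mothur or Galaxy.

```python
def read_name_file(lines):
    """Collect every sequence name listed in the second column of a names file."""
    names = set()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        names.update(fields[1].split(","))
    return names


def read_group_file(lines):
    """Map each sequence name in a group file to its group (sample)."""
    groups = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        groups[fields[0]] = fields[1]
    return groups


def compare_names(names, groups):
    """Return (names with no group, grouped names absent from the names file)."""
    grouped = set(groups)
    return sorted(names - grouped), sorted(grouped - names)


# Tiny made-up example, not my real data.
name_lines = ["seqA\tseqA,seqB", "seqC\tseqC"]
group_lines = ["seqA\tS1", "seqB\tS1", "seqD\tS2"]

no_group, no_name = compare_names(
    read_name_file(name_lines), read_group_file(group_lines)
)
print("In names file but not in group file:", no_group)
print("In group file but not in names file:", no_name)
```

On the real outputs, the same functions can be given open file handles (for example `open("my.names")` and `open("my.groups")`, both hypothetical file names) in place of the example lists. If both lists come back empty, the names agree and the failure must have another cause.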
If anyone has any feedback, I would really appreciate it.