creating group files

jugsy67 · April 23, 2010, 1:36pm

Does anyone have a foolproof and easy approach to creating group files in a Windows environment?

Pat has helped me twice. I used Excel and exported the columns, but some odd invisible characters get inserted that mess up subsequent processes, and I can’t see them in any text editor that I use or that has been suggested to me. I don’t have access to TextWrangler for the Mac (which Pat tells me shows the offending characters), and I feel bad asking Pat to help with such trivial issues.

cheers

Julian

hoytpr · April 23, 2010, 9:45pm

I struggled quite a bit at first with group files, and would be glad to try to help. Could you upload a sample file as an attachment?

Pete

jangidk · April 24, 2010, 5:02am

There are two ways I used on Windows before switching to Linux.

You can make the entries in an Excel worksheet and then save it either as a text document or as a CSV file. This sometimes does not get rid of the unusual characters because of the way Excel saves the columns into the text format. When I saw these characters, I would simply copy the character and do a find/replace with nothing.
Once you have the entries in the Excel sheet, you can copy and paste the column entries into a Notepad document. This generally worked for me without any errors or unusual characters.
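
The find/replace cleanup described above can also be scripted. The following is only a sketch, assuming the group file is meant to be two tab-separated columns (sequence name, then group name) and that the usual offenders are a byte-order mark, Windows or old Mac line endings, and non-breaking spaces. The function name `clean_group_text` is made up for illustration.

```python
import re

def clean_group_text(raw: bytes) -> str:
    """Return a tab-delimited, two-column version of an exported sheet."""
    try:
        # "utf-8-sig" silently drops a leading byte-order mark if present.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Fallback assumption: a single-byte Windows/Western export.
        text = raw.decode("latin-1")
    # Normalise CRLF (Windows) and bare CR (old Mac) line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Non-breaking spaces look like ordinary spaces in most editors.
    text = text.replace("\u00a0", " ")
    rows = []
    for line in text.split("\n"):
        # Accept either tab- or comma-separated input; drop empty cells.
        fields = [f.strip() for f in re.split(r"[\t,]", line) if f.strip()]
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(f"expected 2 columns, got {len(fields)}: {line!r}")
        rows.append("\t".join(fields))
    return "\n".join(rows) + "\n"
```

When writing the result back out on Windows, opening the output file with `newline="\n"` keeps Python from reintroducing carriage returns.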

Hope that helps.

jugsy67 · April 26, 2010, 9:50am

Thanks for the help, Pete, but I need to sort this out at this end. Pat has helped, so I know what the issue is, but I am not sure of the solution.

Jangidk’s suggestion was one I tried, but Notepad didn’t show the characters that Pat saw with TextWrangler.

Julian

laalaa99stl · April 26, 2010, 2:16pm

I recommend the FOSS Notepad++ for editing text on Windows,
including its ability to view hidden characters:
http://notepad-plus.sourceforge.net/
Shareware alternatives are EditPlus and UltraEdit.
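
If no editor at hand shows the hidden characters, a few lines of Python can list them instead. This is a rough sketch, not a tested tool: it treats anything other than printable ASCII and tabs as suspicious, and the function name `hidden_chars` is invented here.

```python
import unicodedata

def hidden_chars(text: str):
    """List (line, column, description) for every suspicious character."""
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            # Tabs and printable ASCII (space through tilde) are expected.
            if ch == "\t" or " " <= ch <= "~":
                continue
            # Control characters have no Unicode name, so supply a default.
            name = unicodedata.name(ch, "CONTROL CHARACTER")
            hits.append((line_no, col, f"U+{ord(ch):04X} {name}"))
    return hits
```

To run it on a real file, read the file with `open(path, newline="", encoding="utf-8", errors="replace")` so that carriage returns are not translated away before the check sees them.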

Robin

hoytpr · April 26, 2010, 7:43pm

I was just wondering which characters you were seeing, that’s all. I agree that Notepad++ is a good one, but if you have really hard-to-eliminate characters, I bet you would be pretty happy with EMBOSS’s TrimSeq (http://www.molgen.mpg.de/~beck/EMBOSS/trimseq.html).

I like to use MetaPad (http://liquidninja.com/metapad/), which has great features to strip trailing whitespace AND to strip first characters.

Also, I’ve been able to see some characters at the ends of files or titles using the old software BioEdit (http://www.mbio.ncsu.edu/BioEdit/bioedit.html). I’ll usually copy them to MetaPad to fix them.

Finally, if you want, you can use grep on your Windows machine! Just download the Unix utilities:

Between MetaPad and Grep I got everything done. Good luck!

(No spaces are allowed in group names!)
Pete
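
The no-spaces rule mentioned above can be checked before handing the file to mothur. A minimal sketch, assuming a two-column, tab-separated layout; `group_file_problems` is a hypothetical helper, not a mothur command.

```python
def group_file_problems(text: str):
    """Return (line number, message) pairs for lines that look malformed."""
    problems = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 2:
            problems.append((line_no, "expected 2 tab-separated columns"))
        elif not all(fields) or any(ch.isspace() for f in fields for ch in f):
            # Catches spaces inside a group name such as "group B".
            problems.append((line_no, "space or empty value in name or group"))
    return problems
```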

pschloss · April 29, 2010, 12:21pm

I’m sorry it is such a pain to generate a group file. Part of the reason that we don’t have a command for this is that people seem to be starting from such diverse points. Perhaps it would be better to just jump in and go with something than to do nothing. Let me know what you guys think of this idea…

The user would enter a command like this: make.group(fasta=A.fasta-B.fasta-C.fasta, group=A-B-C)
The user provides a separate fasta file for each group, and each fasta file contains the sequences in that group
The labels (A, B, C) would be applied to the 3 fasta files in that order
The output would be composite.fasta, a concatenation of the 3 fasta files, and composite.groups, which would be the group file.
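
To make the proposal concrete, here is a rough Python sketch of the behaviour described in the list above. It is not mothur's implementation; it assumes the sequence name is the first whitespace-delimited token after `>` and that the group file is two tab-separated columns. The function name `make_group` only mirrors the proposed command.

```python
from pathlib import Path

def make_group(fasta_paths, labels,
               out_fasta="composite.fasta", out_groups="composite.groups"):
    """Concatenate fasta files and write one 'name<TAB>label' line per sequence."""
    if len(fasta_paths) != len(labels):
        raise ValueError("need exactly one label per fasta file")
    # newline="\n" avoids writing Windows line endings into the outputs.
    with open(out_fasta, "w", newline="\n") as fasta_out, \
         open(out_groups, "w", newline="\n") as groups_out:
        for path, label in zip(fasta_paths, labels):
            for line in Path(path).read_text().splitlines():
                if not line.strip():
                    continue  # drop blank lines
                if line.startswith(">"):
                    tokens = line[1:].split()
                    if not tokens:
                        raise ValueError(f"header without a name in {path}")
                    # Assumption: the name is the first token of the header.
                    groups_out.write(f"{tokens[0]}\t{label}\n")
                fasta_out.write(line + "\n")
```

Called as `make_group(["A.fasta", "B.fasta", "C.fasta"], ["A", "B", "C"])`, this would write the two composite files into the current directory.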

Would this work for people? I guess I don’t know whether people would have their sequences separated by groups. Any other permutations that people can think of?

Pat

westcott · July 20, 2010, 11:31am

Just to update, mothur now contains the make.group Pat described above.

Carolinervlld · October 11, 2014, 2:39am

Hello, I tried to use the make.group command but cannot put more than 3 fasta files in it… Is this normal? Regards, Caroline.

westcott · October 14, 2014, 6:43pm

Hi Caroline, thanks for reporting this issue. It will be fixed in the next release. As a workaround, you can do the following:

make.group(fasta=fastafile1.fasta-fastafile2.fasta-fastafile3.fasta, groups=A-B-C)
make.group(fasta=fastafile4.fasta-fastafile5.fasta-fastafile6.fasta, groups=D-E-F)
merge.files(input=groupFile1-groupFile2, output=completeGroupFile)
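
With many fasta files, typing the batches by hand gets tedious. The sketch below just builds the command strings for the workaround above, in batches of three. It uses the same placeholder names as the post (groupFile1, groupFile2, completeGroupFile); the real group file names that make.group writes need to be substituted before running merge.files. The function `workaround_commands` is hypothetical.

```python
def workaround_commands(fasta_files, groups, batch_size=3,
                        output="completeGroupFile"):
    """Build make.group / merge.files command strings in small batches."""
    if len(fasta_files) != len(groups):
        raise ValueError("need exactly one group label per fasta file")
    commands = []
    placeholders = []
    for start in range(0, len(fasta_files), batch_size):
        fasta_arg = "-".join(fasta_files[start:start + batch_size])
        groups_arg = "-".join(groups[start:start + batch_size])
        commands.append(f"make.group(fasta={fasta_arg}, groups={groups_arg})")
        # Placeholder only: replace with the file make.group actually wrote.
        placeholders.append(f"groupFile{len(placeholders) + 1}")
    if len(placeholders) > 1:
        merged = "-".join(placeholders)
        commands.append(f"merge.files(input={merged}, output={output})")
    return commands
```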

wielern · November 9, 2014, 3:08pm

Hi

I have trimmed and paired fasta files, which I grouped and merged. But when I ran summary.seqs on the output file of the merge command, it reported a total number of sequences of 1, which isn’t possible. What am I missing?

thank you,
nimrod
