I am trying the pca command on 4 different data sets. Each data set contains 14 groups. Two of the data sets use v4 Illumina tag sequences and two use v6 Illumina tag sequences. For each region (v4 and v6), one data set has 10,000 sequences per group and the other has 230,000 sequences per group. When I run pca on the 10,000-sequence data sets, both the v4 and v6 sets work well. However, at the 230,000-sequence level, only the v4 set works. The v6 run has timed out twice now, once after 5 days and once after 10 days, while the v4 set took only a few hours. This is the opposite of everything else I have run in mothur, where the v4 runs consistently take longer than the v6 runs because the v4 sequences average about 250 bp versus the 60ish bp of the v6 tag. Is there an algorithmic reason this is occurring (more difficulty with shorter reads?), or is it some other problem I haven’t found yet (a typo in my batch file)?
Thanks,
Zak