We’re subsampling a practice dataset after make.contigs (just to reduce compute time while people are learning). Some people are on 1.38.1 and one is on 1.38.0. After screen.seqs (maxambig=0, maxlength=275) and unique.seqs, the 2 with 1.38.1 have exactly the same number of unique and total seqs. The person with 1.38.0 has a different but close number. Does this imply that the two versions run sub.sample differently?
Could you explain further? The sub.sample command selects sequences at random, so some differences between runs are to be expected.
Yeah I thought subsample was random. I can’t figure out how 2 people got what appears to be the exact same subsample. Here’s what we did.
make.contigs(file=stability.txt)
sub.sample(fasta=current, groups=current, size=20000, persample=T)
total seqs # 240000
screen.seqs(fasta=current, group=current, maxambig=0, maxlength=275)
unique.seqs(fasta=current)
One person (running 1.38.0) had
unique seqs 77289
total # of seqs: 187529
But both the people running 1.38.1 got
Which looks like they subsampled the exact same sequences to end up with the same # of uniques and total?
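One possible explanation, offered as an assumption and not as confirmed mothur behavior: if both 1.38.1 runs started their random number generator from the same seed, a "random" subsample would still come out identical on every machine. The Python sketch below only illustrates that general idea. The `subsample` helper, the read names, and the seed values are all made up for the example and have nothing to do with mothur's internals.

```python
import random


def subsample(names, size, seed=None):
    """Pick `size` names at random without replacement.

    `seed` is a hypothetical knob for this sketch: a fixed value makes
    the draw reproducible, while None lets Python seed itself from the OS.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(names, size))


# Made-up read names standing in for the contigs from one sample.
reads = [f"seq{i}" for i in range(1000)]

run_a = subsample(reads, 100, seed=42)  # person A, fixed seed
run_b = subsample(reads, 100, seed=42)  # person B, same fixed seed
run_c = subsample(reads, 100, seed=7)   # person C, different seed

print("A and B identical:", run_a == run_b)
print("A and C identical:", run_a == run_c)
```

If something like this is happening, the two matching runs would share the same sequence names and not just the same counts, so comparing the subsampled fasta or group files between the two people would be a quick check.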