sand_filter_kernel - filter read sequences sequentially


sand_filter_kernel [options] <sequence file> [second sequence file]


sand_filter_kernel filters a list of genomic sequences and produces a list of candidate pairs for more detailed alignment. It is not normally called by the user; instead, sand_filter_master(1) invokes it for each sequential step of a distributed alignment workload.

If one sequence file is given, sand_filter_kernel looks for similarities among all sequences in that file. If two files are given, it looks for similarities between sequences in the first file and sequences in the second file. The output is a list of candidate pairs, each giving the names of the two candidate sequences and a starting position for alignment.
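The difference between the one-file and two-file modes can be pictured with a small Python sketch. It only illustrates which pairs of sequences are considered; it is not the kernel's code. The function name pairs_to_compare is hypothetical, and the sketch assumes that in one-file mode each unordered pair of distinct sequences is considered once, which the text above does not state explicitly.

```python
from itertools import combinations, product

def pairs_to_compare(first, second=None):
    """Return the sequence-name pairs a filter pass would consider.

    Assumption (not confirmed by the man page): with one input, every
    unordered pair of distinct sequences is considered exactly once.
    """
    if second is None:
        # One file: compare sequences within the same set.
        return list(combinations(first, 2))
    # Two files: compare each sequence in the first set with each in the second.
    return list(product(first, second))
```

For three sequences in one file this yields three pairs; for three sequences against a one-sequence second file it also yields three pairs, one per sequence in the first file.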


-s <size>
Size of the "rectangle" used for filtering. Pass d instead of a number to have the size determined dynamically.
-r <file>
A meryl file of repeat mers to be ignored.
-k <size>
The k-mer size to use in candidate selection (default is 22).
-w <number>
The minimizer window size to use in candidate selection (default is 22).
-o <filename>
The output file. Default is stdout.
-d <subsystem>
Enable debug messages for this subsystem. Try -d all to start.
-v
Show version string.
-h
Show help screen.
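To give a feel for what the -k and -w options control, here is a minimal Python sketch of the general minimizer technique. It is not the code used by sand_filter_kernel. It assumes that a window spans w consecutive k-mers and that the minimizer is the lexicographically smallest k-mer in the window (leftmost on ties); the real tool may define windows and ordering differently, and it also skips repeat mers given with -r, which this sketch omits. The function names are hypothetical.

```python
def kmers(seq, k):
    """Return every substring of length k, in order of position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def minimizers(seq, k, w):
    """Return the sorted (position, k-mer) minimizers of seq.

    Assumption: a window is w consecutive k-mers, and its minimizer is
    the lexicographically smallest k-mer, leftmost on ties.
    """
    mers = kmers(seq, k)
    chosen = set()
    for start in range(len(mers) - w + 1):
        window = mers[start:start + w]
        offset = min(range(w), key=lambda i: (window[i], i))
        chosen.add((start + offset, window[offset]))
    return sorted(chosen)

def candidate_positions(a, b, k, w):
    """Return (position in a, position in b) for each shared minimizer."""
    index = {}
    for pos, mer in minimizers(a, k, w):
        index.setdefault(mer, []).append(pos)
    return [(pa, pb)
            for pb, mer in minimizers(b, k, w)
            for pa in index.get(mer, [])]
```

Under these assumptions, a larger k makes chance matches between unrelated sequences less likely, while a larger w selects fewer minimizers per sequence and so yields fewer candidates to examine.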


On success, returns zero. On failure, returns non-zero.
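A script that does call the kernel directly can rely on this exit status. The Python sketch below assembles a command line from the options listed above and treats any non-zero status as a failure. The helper names build_kernel_argv and run_kernel and all file names are hypothetical, and the sketch assumes sand_filter_kernel is on the PATH.

```python
import subprocess

def build_kernel_argv(seq_file, second_file=None, k=None, w=None,
                      repeats=None, output=None):
    """Build an argument list using only the options documented above."""
    argv = ["sand_filter_kernel"]
    if k is not None:
        argv += ["-k", str(k)]      # k-mer size
    if w is not None:
        argv += ["-w", str(w)]      # minimizer window size
    if repeats is not None:
        argv += ["-r", repeats]     # meryl file of repeat mers to ignore
    if output is not None:
        argv += ["-o", output]      # output file (stdout if omitted)
    argv.append(seq_file)
    if second_file is not None:
        argv.append(second_file)
    return argv

def run_kernel(argv):
    """Run the kernel and raise if it reports failure (non-zero status)."""
    result = subprocess.run(argv)
    if result.returncode != 0:
        raise RuntimeError(
            "sand_filter_kernel failed with status %d" % result.returncode)
    return result.returncode
```

For example, build_kernel_argv("mydata.cfa", k=20, w=24, output="mydata.cand") produces the argument list for a single-file run with a k-mer size of 20 and a window size of 24.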


Users do not normally invoke sand_filter_kernel directly. Instead, options such as the k-mer size, minimizer window, and repeat file are passed to sand_filter_master(1), which accepts the same arguments. For example, to run a filter with a k-mer size of 20, a window size of 24, and a repeat file named mydata.repeats:
% sand_filter_master -k 20 -w 24 -r mydata.repeats mydata.cand


The Cooperative Computing Tools are Copyright (C) 2003-2004 Douglas Thain and Copyright (C) 2005-2015 The University of Notre Dame. This software is distributed under the GNU General Public License. See the file COPYING for details.


CCTools 6.0.0 from source released on