Sketching and sequence distances
Ali Osman Berk Şapcı edited this page Nov 25, 2025 · 1 revision
In addition to indexing multiple references together and performing one-to-many queries, krepp can also create a sketch from a single FASTA/Q file (a complete or draft assembly, a genome skim, or even a sample) for practical one-to-one analysis.
For this, you can simply run
    krepp sketch -i /path/to/input.fasta -o /path/to/sketch --num-threads 16

where -i is a URL or a filepath containing reference sequences from which k-mers will be extracted, and -o is the path to save the resulting sketch as a single binary file.
A sketch cannot be used for phylogenetic placement, but you can efficiently seek query sequences in a sketch to get distance estimates by running
    krepp seek -i /path/to/sketch -q /path/to/query.fastq --num-threads 16

The output is in a tab-separated format with two columns: the sequence ID and the distance estimate.
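Because the output has only two tab-separated columns, it is easy to post-process with standard tools. The following is a minimal sketch, not part of krepp: it assumes the output of krepp seek has already been saved to a file (this page does not say whether the command writes to standard output or to a file), and the function name summarize_distances is hypothetical. It also assumes that any line whose second column is not a plain non-negative number (for example a header or a missing estimate, if such lines occur) should be skipped.

```shell
# Summarize a two-column TSV: sequence ID <TAB> distance estimate.
# Hypothetical helper; the input file is assumed to hold saved `krepp seek` output.
summarize_distances() {
  # LC_ALL=C keeps the decimal separator a dot regardless of the user's locale.
  LC_ALL=C awk -F '\t' '
    # Only consider rows whose second field looks like a non-negative number.
    $2 ~ /^[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$/ {
      d = $2 + 0              # force numeric interpretation
      n++
      sum += d
      if (n == 1 || d < min) { min = d; id = $1 }
    }
    END {
      if (n > 0)
        printf "n=%d mean=%.4f min=%.4f (%s)\n", n, sum / n, min, id
      else
        print "n=0"
    }
  ' "$1"
}
```

Calling summarize_distances on the saved file prints the number of sequences with a numeric estimate, their mean distance, and the smallest distance together with the ID of the sequence that produced it.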