
Quickstart

Ali Osman Berk Şapcı edited this page Nov 24, 2025 · 2 revisions

Install

Install krepp via Bioconda:

conda install bioconda::krepp

Details are here.

Get a reference index ready

Select a pre-built index from the catalogue of available indexes, or build your own. For example, you can download Web of Life v1 (WoL-v1; download size: 41 GB), a microbial index of bacterial and archaeal genomes.

wget --no-check-certificate https://ter-trees.ucsd.edu/data/krepp/index_WoLv1-k29w35-h14.tar.gz
tar -xzvf index_WoLv1-k29w35-h14.tar.gz

The resulting directory is an index that you can query directly. Querying against this index requires about 53 GB of memory.
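Because the memory requirement is substantial, it can help to check the machine before starting a query. The following is a minimal sketch for Linux hosts only; it assumes /proc/meminfo exposes a MemAvailable line (reported in kB), and the function names are illustrative and not part of krepp. The comparison is approximate, since it treats the quoted 53 GB as binary gigabytes.

```shell
#!/bin/sh
# Sketch: check available RAM on a Linux host before querying a large index.
# The 53 GB threshold is the figure quoted on this page; the helper names
# below are illustrative and not part of krepp.

# mem_available_gb [meminfo_file]
# Prints MemAvailable in whole GB (rounded down); reads /proc/meminfo by default.
mem_available_gb() {
  awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# has_enough_memory required_gb [meminfo_file]
# Succeeds (exit status 0) when available memory is at least required_gb.
has_enough_memory() {
  avail=$(mem_available_gb "${2:-/proc/meminfo}")
  [ -n "$avail" ] && [ "$avail" -ge "$1" ]
}

if has_enough_memory 53; then
  echo "at least 53 GB available"
else
  echo "less than 53 GB available (or MemAvailable could not be read)" >&2
fi
```

The optional second argument exists only so the check can be exercised against a saved copy of meminfo.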

Metadata for the genomes in this index is here. It is useful because krepp reports results by genome ID, not by taxonomic label. The phylogeny is also available and can be used for phylogenetic placement.

Estimate distances of reads to reference genomes

Once the index is extracted, run krepp dist to estimate the distance from each read to every sufficiently close reference genome in WoL-v1.

krepp dist -i index_WoLv1-k29w35-h14 -q /path/to/query.fastq --num-threads 16

By default, krepp outputs to stdout; you can use the -o option to write the result into a file.

If you are interested in the composition of the entire query file (e.g., a metagenomic sample) rather than individual reads, use the --summarize flag to count the number of hits for each reference genome in the index across your query FASTQ.
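As a sketch of how the two options described above might be combined, the wrapper below runs krepp dist with --summarize and writes the result to a file with -o. It uses only flags named on this page and assumes they can be used together; the function name, the KREPP override variable, and the output path are hypothetical.

```shell
#!/bin/sh
# Sketch: summarize a whole sample and write the result to a file.
# Only flags named on this page are used. The function name, the KREPP
# variable (lets you point at a different krepp binary), and the paths
# are illustrative assumptions.

# summarize_sample index_dir query_fastq output_file [threads]
summarize_sample() {
  "${KREPP:-krepp}" dist -i "$1" -q "$2" --num-threads "${4:-16}" --summarize -o "$3"
}

# Example call (uncomment after adjusting the paths):
# summarize_sample index_WoLv1-k29w35-h14 /path/to/query.fastq /path/to/summary.out
```

Setting KREPP=echo prints the command line instead of running it, which is a cheap way to review the arguments first.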

Run krepp dist --help to see other options, or refer to this section.

Place reads onto a backbone phylogeny

In addition to estimating distances, krepp can place reads on a backbone tree. If the index was built with a phylogeny, as the WoL-v1 index was, you don't need to specify a backbone tree.

krepp place -i index_WoLv1-k29w35-h14 -q /path/to/query.fastq --num-threads 16 -o /path/to/output.jplace

By default, krepp reports placements in the jplace format. You can use gappa to process and visualize jplace files downstream. For tabular output, which can be easier to process, use the --tabular flag. The --summarize option applies to krepp place as well.

If the index that you will use does not have a phylogeny, or if you would like to use a different phylogeny than the one in the index, you can provide a backbone tree (a Newick file) with the -t (--nwk-file) option. In this case, the tip labels of the tree and the genome IDs in the index have to match.
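Since the tip labels and genome IDs have to match, a quick comparison before running krepp place can save a failed run. The sketch below assumes a Newick file with unquoted tip labels that contain no spaces, and a plain text file with one genome ID per line (how you obtain that list, for example from the index metadata, is up to you). The function names are illustrative and not part of krepp.

```shell
#!/bin/sh
# Sketch: compare Newick tip labels with a list of genome IDs.
# Assumes unquoted labels without spaces; internal node labels (those that
# follow a closing parenthesis) are ignored.

# tip_labels newick_file
# Prints the sorted, unique tip labels, i.e. names that directly follow "(" or ",".
tip_labels() {
  tr -d '\n\r' < "$1" |
    grep -oE '[(,][^(),:;]+' |
    sed 's/^[(,]//' |
    sort -u
}

# compare_labels newick_file ids_file
# Prints "tree_only<TAB>label" for tips absent from the ID list and
# "index_only<TAB>id" for IDs absent from the tree; prints nothing on a full match.
compare_labels() {
  tips_sorted=$(mktemp)
  ids_sorted=$(mktemp)
  tip_labels "$1" > "$tips_sorted"
  sort -u "$2" > "$ids_sorted"
  comm -23 "$tips_sorted" "$ids_sorted" | awk '{ print "tree_only\t" $0 }'
  comm -13 "$tips_sorted" "$ids_sorted" | awk '{ print "index_only\t" $0 }'
  rm -f "$tips_sorted" "$ids_sorted"
}
```

Empty output from compare_labels means the two label sets are identical under these assumptions.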

Alternatively, you can perform taxonomic assignments via the -l (--lineage-file) option. This option expects a Greengenes2/GTDB style lineage file, which is essentially a mapping between reference genome IDs in the index and taxonomic lineages (see an example here). For WoL-v1, you can use this file. It is also possible to decorate a phylogeny with taxonomic labels using tax2tree.
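To make the lineage-file idea concrete, here is a rough sanity check. It assumes the common Greengenes2/GTDB layout: one line per genome, a genome ID and a lineage separated by a tab, with ranks separated by semicolons and prefixed like d__ or p__. This page does not spell out the exact format krepp expects, so treat the rules below as assumptions; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: sanity-check a Greengenes2/GTDB style lineage file.
# Assumed layout (not confirmed by this page): "<genome ID><TAB><lineage>",
# where the lineage is a ";"-separated list of rank-prefixed names,
# for example "d__Bacteria; p__Firmicutes; c__Bacilli".

# check_lineage_file lineage_tsv
# Prints one message per problem found; exit status is non-zero if any were found.
check_lineage_file() {
  awk -F '\t' '
    NF != 2 {
      printf "line %d: expected 2 tab-separated fields, found %d\n", NR, NF
      bad = 1
      next
    }
    {
      # Split the lineage on ";" with optional trailing spaces.
      n = split($2, ranks, /; */)
      for (i = 1; i <= n; i++)
        if (ranks[i] !~ /^[dkpcofgs]__/) {
          printf "line %d: unexpected rank \"%s\"\n", NR, ranks[i]
          bad = 1
        }
    }
    END { exit bad }
  ' "$1"
}
```

A file with a header row would be flagged on line 1 by this check, so remove the header or skip that message if your file has one.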

For the other options, either run krepp place --help or refer to this section.
