Example Project

This example will demonstrate:

1. Genome assembly (output: assembled genome in FASTA format)
2. Genome completeness evaluation (output: completeness notation)
3. Recombination test (output: recombination test table and plot)
4. Tree construction (output: phylogenetic tree file in Newick format, .nwk)

Detailed instructions can be found below.

Docker (Optional)

We provide a Docker image to make installation more convenient:

docker pull osvolo/anasfv:latest
docker container run -it osvolo/anasfv /bin/bash

Prepare your data

The test files are already included in the Docker container. If you don't use Docker, you can obtain them as follows.

① ONT reads (FASTA or FASTQ file). You can download our test file with the following command:

wget https://github.com/lrslab/anasfv/releases/download/test_data.fasta/test_data.fasta

② Other ASFV genomes. You can directly use the single_fasta directory in this project, which contains 406 downloaded ASFV genomes.

git clone https://github.com/nimua/single_fasta.git

Alternatively, run download_asfv_genome.py, which creates a single_fasta directory in the working directory and downloads all of the latest ASFV genomes from NCBI into it.

download_asfv_genome.py

These genomes are used for mapping assembly and tree building.
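
Before moving on to the tasks, it can be useful to confirm that the genome directory looks as expected. The following is a minimal sketch, not part of anasfv: it assumes every genome file ends in .fasta and that each file is meant to hold a single record (as the directory name suggests). The function names are hypothetical.

```python
from pathlib import Path


def fasta_summary(path):
    """Return (record_count, total_bases) for one FASTA file."""
    records = 0
    bases = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                records += 1  # header line starts a new record
            else:
                bases += len(line)  # sequence line
    return records, bases


def summarise_directory(directory, pattern="*.fasta"):
    """Map each matching file name in `directory` to (record_count, total_bases).

    The "*.fasta" pattern is an assumption about how the files are named.
    """
    return {
        path.name: fasta_summary(path)
        for path in sorted(Path(directory).glob(pattern))
    }
```

Calling summarise_directory("single_fasta") returns one entry per genome file; any entry whose record count is not 1, or whose base count is 0, is worth inspecting before assembly or tree building.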

Task 1 (Assembling a genome):

Use ONT reads of PCR-amplified ASFV to assemble a genome. (This task is optional. If you have already obtained an assembled genome by other methods, you can go straight to Tasks 2, 3, and 4.)

mapping_assembly.py -p 4 -r single_fasta -i test_data.fasta -o genome.fasta --medaka r941_min_high_g303

Next, polish the homopolymers. As the reference genome, select the closest ASFV genome in NCBI that was not sequenced with ONT (you can find it with blastn). Using MN194591.1.fasta as an example:

polish_asfv.py -i single_fasta/MN194591.1.fasta -r single_fasta/OR180113.1.fasta -m R9.4.pkl
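
To see which regions homopolymer polishing is concerned with, the sketch below lists runs of a single repeated base in a sequence. It only illustrates what a homopolymer run is and is not the algorithm used by polish_asfv.py; the function name and the min_length threshold of 4 are assumptions chosen for the example.

```python
from itertools import groupby


def homopolymer_runs(sequence, min_length=4):
    """Return (start, base, length) for each run of one base of at least min_length.

    `start` is a 0-based position. The default threshold is an arbitrary
    illustrative value, not one taken from anasfv.
    """
    runs = []
    position = 0
    for base, group in groupby(sequence.upper()):
        length = sum(1 for _ in group)
        if length >= min_length:
            runs.append((position, base, length))
        position += length
    return runs


if __name__ == "__main__":
    print(homopolymer_runs("ACGTTTTTAGGGGCA"))
```

Comparing the runs found in a genome before and after polishing gives a rough view of where lengths changed.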

Task 2 (Genome completeness evaluation):

Consensus gene sets have been established only for genotype I and genotype II. Use -c to select the consensus gene set. Using MN194591.1.fasta as an example:

completeness.py single_fasta/MN194591.1.fasta -c II > MN194591.1_completeness.tsv

Example of result:

file_name size prodigal_gene_num with_MGF without_MGF
MN194591.1.fasta 191911 242 C:57.43%[D:0.0%],F:39.19%,M:3.38%,n:148 C:51.3%[D:0.0%],F:44.35%,M:4.35%,n:115
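
If you want to use the completeness notation downstream, it can be split into numbers. The sketch below is not part of anasfv. It assumes the output columns are whitespace-separated as in the example above, and it treats the letters C, D, F, and M as BUSCO-style labels (complete, duplicated, fragmented, missing) with n as the gene set size. That reading is an assumption, although it agrees with C, F, and M summing to 100% in the example. The function names are hypothetical.

```python
import re

# One "key:value" pair such as "C:57.43%" or "n:148".
FIELD_RE = re.compile(r"([A-Za-z]):([\d.]+)%?")


def parse_completeness(notation):
    """Turn e.g. 'C:57.43%[D:0.0%],F:39.19%,M:3.38%,n:148' into a dict."""
    values = {}
    for key, number in FIELD_RE.findall(notation):
        # n is a count; everything else is a percentage
        values[key] = int(number) if key == "n" else float(number)
    return values


def parse_row(line):
    """Parse one data row of the completeness table (5 whitespace-separated fields)."""
    file_name, size, gene_num, with_mgf, without_mgf = line.split()
    return {
        "file_name": file_name,
        "size": int(size),
        "prodigal_gene_num": int(gene_num),
        "with_MGF": parse_completeness(with_mgf),
        "without_MGF": parse_completeness(without_mgf),
    }


if __name__ == "__main__":
    row = ("MN194591.1.fasta 191911 242 "
           "C:57.43%[D:0.0%],F:39.19%,M:3.38%,n:148 "
           "C:51.3%[D:0.0%],F:44.35%,M:4.35%,n:115")
    print(parse_row(row))
```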

Task 3 (Recombination test):

Using OQ504956.1 as an example:

recombination_test.py single_fasta/OQ504956.1.fasta > OQ504956.1_recombination_test.tsv
recombination_plot.py OQ504956.1_recombination_test.tsv

Figure: Recombination plot of OQ504956.1.

Task 4 (Constructing a tree):

Use all genome files in "./single_fasta" to build the tree. The final output is "tree.nwk", in Newick format.

make_tree.py -p 4 -f single_fasta -o tree --udance ./uDance --iteration

Figure: Phylogenetic tree built from single_fasta (visualized with iTOL).
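
Before loading tree.nwk into a viewer, you may want to check how many genomes ended up in the tree. The following is a minimal sketch using only the standard library, not part of anasfv. It assumes leaf labels are unquoted and contain no parentheses, commas, colons, semicolons, or whitespace; the function name is hypothetical.

```python
import re

# A leaf label directly follows "(" or ","; labels that follow ")" belong to
# internal nodes (for example support values) and are skipped.
LEAF_RE = re.compile(r"(?<=[(,])\s*([^(),:;\s]+)")


def leaf_names(newick_text):
    """Return the leaf labels of a Newick tree, in order of appearance."""
    return LEAF_RE.findall(newick_text)


if __name__ == "__main__":
    example = "((A:0.1,B:0.2)0.95:0.3,(C:0.1,D:0.4)0.80:0.2,MN194591.1:0.5);"
    names = leaf_names(example)
    print(len(names), names)
```

To check a real tree, read the file contents (for example with open("tree.nwk").read()) and pass them to leaf_names. Whether the leaf labels match the file names in single_fasta depends on how make_tree.py names the tips, so treat any comparison between the two as approximate.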