Constructing a Tree
We provide make_tree.py for one-step tree generation, and get_cds_alignments.py to generate the CDS alignment files needed for tree construction. You can also use these CDS alignments as input to build trees with other methods. To download the latest ASFV genomes from NCBI, use download_asfv_genome.py. Alternatively, use the "single_fasta" directory provided on GitHub, which contains 312 genomes.
make_tree.py
Description
The 188 CDS from NC_044959.2 (ASFV Georgia 2007/1) are used as a reference, and Exonerate performs pairwise sequence comparison against them to obtain the corresponding CDS for each ASFV isolate. These results are then used as input for tree construction with uDance in "denovo" mode. The resulting tree then serves as a backbone for a second iteration of tree construction, with the uDance backbone option set to 'tree', which allows the re-placement of unplaceable taxa (taxa that could not find their optimal placement in the tree).
Arguments
| Argument name | Required | Description |
|---|---|---|
| -p, --processes | No | number of processes (default = 4) |
| -f, --file | Yes | a directory of multiple ASFV genome fasta files as input |
| -o, --output | Yes | name of output directory |
| --udance | Yes | path to udance directory |
Example
make_tree.py -p 4 -f single_fasta -o tree --udance ./uDance --iteration
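If you want to launch make_tree.py from another Python script, the command line can be assembled from the documented arguments. The following is a minimal sketch, not part of this repository: the helper names build_make_tree_command and run_make_tree are hypothetical, it assumes make_tree.py is in the current working directory, and it uses only the flags listed in the Arguments table (the --iteration flag shown in the example is not documented there, so it is left out).

```python
import subprocess
import sys


def build_make_tree_command(genome_dir, output_dir, udance_dir,
                            processes=4, script="make_tree.py"):
    """Assemble the make_tree.py command from the documented arguments.

    The helper name and the default script location are assumptions
    made for illustration only.
    """
    return [
        sys.executable, script,
        "-p", str(processes),   # number of processes (default = 4)
        "-f", genome_dir,       # directory of ASFV genome fasta files
        "-o", output_dir,       # name of output directory
        "--udance", udance_dir, # path to udance directory
    ]


def run_make_tree(genome_dir, output_dir, udance_dir, processes=4):
    """Run make_tree.py and raise CalledProcessError if it fails."""
    cmd = build_make_tree_command(genome_dir, output_dir, udance_dir, processes)
    subprocess.run(cmd, check=True)
```

Calling run_make_tree("single_fasta", "tree", "./uDance") would then correspond to the command-line example above without the --iteration flag.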
Output
The final tree is written as a Newick file named 'tree.nwk' in the output directory specified by the -o parameter.
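As a quick sanity check, you may want to list the isolates that ended up in the final tree. The sketch below is one possible way to do this with the Python standard library only. The function names are hypothetical, and the regular expression assumes simple, unquoted leaf labels without parentheses, commas, colons, or semicolons; for anything more complex, a dedicated Newick parser is a safer choice.

```python
import re
from pathlib import Path


def leaf_names(newick):
    """Return leaf labels from a Newick string.

    A leaf label directly follows '(' or ','. Internal node labels and
    support values follow ')' and are therefore skipped.
    Assumes unquoted labels (an assumption, not guaranteed by the tool).
    """
    labels = re.findall(r"[(,]\s*([^(),:;\s][^(),:;]*)", newick)
    return [label.strip() for label in labels]


def read_leaf_names(output_dir):
    """Read 'tree.nwk' from the make_tree.py output directory."""
    text = Path(output_dir, "tree.nwk").read_text()
    return leaf_names(text)
```

For example, read_leaf_names("tree") would return the leaf labels of the tree produced by the command-line example above, which can be compared against the genome files in the input directory.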
get_cds_alignments.py
Description
Extracts the CDS for each ASFV isolate and performs multiple sequence alignment. The output can be used directly as input for tree building (make_tree.py already includes this step).
Arguments
| Argument name | Required | Description |
|---|---|---|
| -f, --file | Yes | a directory of multiple ASFV genome fasta files as input |
| -c, --core | No | number of processes (default = 32) |
Example
get_cds_alignments.py -f single_fasta
Output
A directory named "alignments" containing the multiple sequence alignment for each CDS.
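Before passing the alignments to another tree-building method, it can be useful to confirm that every sequence in a file has the same length. The sketch below assumes the files in "alignments" are in FASTA format, which this README does not state, so adjust the parser if the format differs. All function names are hypothetical.

```python
from pathlib import Path


def read_fasta(text):
    """Parse FASTA text into {record_name: sequence}.

    Assumes FASTA-formatted input (an assumption about the output of
    get_cds_alignments.py). The record name is the first word after '>'.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = (line[1:].split() or [""])[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def is_aligned(records):
    """True if all sequences share one length (gaps included)."""
    return len({len(seq) for seq in records.values()}) <= 1


def check_alignment_dir(directory="alignments"):
    """Map each file name in the directory to its is_aligned result."""
    return {
        path.name: is_aligned(read_fasta(path.read_text()))
        for path in sorted(Path(directory).iterdir())
        if path.is_file()
    }
```

Any file that maps to False in the result of check_alignment_dir() contains sequences of unequal length and is worth inspecting before tree construction.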