Badlon 0.1.3

Badlon
Installation
Badlon can be installed with pip:
pip install badlon

You can now run the tool from any directory as badlon.
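A quick way to confirm the installation is to check that the executables are on PATH. The sketch below is only an illustration: the helper name missing_tools is hypothetical, and the tool list is simply the set of command names used in this README (badlon plus the external tools called in the pipeline).

```python
import shutil

# Command names as they appear in this README; adjust if your install differs.
PIPELINE_TOOLS = ["badlon", "PanACoTA", "sibeliaz", "maf2synteny"]


def missing_tools(tools=PIPELINE_TOOLS):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("missing tools:", ", ".join(missing) if missing else "none")
```

An empty result means every command in the list can be started from the current shell.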
Pipeline Usage
Modules
Badlon includes several modules for processing data. They can be listed with the help command:
$ badlon --help
usage: badlon [-h] {prepare,analysis,match} ...

Tool for block based analysis of bacterial populations. Choose one of available modules.

positional arguments:
{prepare,analysis,match}
prepare Prepare draft dataset for SibeliaZ.
analysis Analyze pan-genome and other block-based features based on synteny blocks.
match Performs matching of block and genes based on coordinates.

optional arguments:
-h, --help show this help message and exit

Here is the recommended pipeline for processing data with badlon:
Step 1: Preparing data with the PanACoTA pipeline
If your genomes are in a folder called some_folder (one file per genome), we suggest preparing the data for badlon with the PanACoTA pipeline.
To do so, use the following commands:
1.1 Preparing data and tables with the PanACoTA prepare module:
PanACoTA prepare --norefseq --min 0 --max 1 -o 1-prepare -d some_folder --cutn 125

--min 0 --max 1 keeps all genomes; these values, like all other parameters, can be changed depending on the task.
For the other parameters, see the PanACoTA prepare documentation.

1.2 Annotating genomes with PanACoTA annotate module:
PanACoTA annotate --info 1-prepare/L* -r 2-annotate -n ESCO --threads 16

You can change the label -n ESCO depending on your species (ESCO stands for Escherichia coli).
For the other parameters, see the PanACoTA annotate documentation.

1.3 Calling orthologous genes with the PanACoTA pangenome module:
PanACoTA pangenome -l 2-annotate/LSTINFO-* -n ESCO -d 2-annotate/Proteins/ -o 3-pangenome

You can change -i, the minimum sequence identity required for sequences to be placed in the same cluster (a float between 0 and 1). The default is 0.8.
For the other parameters, see the PanACoTA pangenome documentation.

Step 2: Preparing data for alignment with badlon prepare module
The prepare module builds the input for the SibeliaZ package while keeping all the necessary information: genome labels and chromosome numbers.
Parameters can be checked with the help option:
$ badlon prepare --help
usage: badlon prepare [-h] --folder FOLDER [--contigs CONTIGS]
[--output OUTPUT]
[--annotate_subfolder ANNOTATE_SUBFOLDER]
[--min_len MIN_LEN]

optional arguments:
-h, --help show this help message and exit
--contigs CONTIGS, -c CONTIGS
Number of maximum contigs to take from every genome.
By default, keeps all.
--output OUTPUT, -o OUTPUT
Output file path.
--annotate_subfolder ANNOTATE_SUBFOLDER, -a ANNOTATE_SUBFOLDER
Subfolder of PanACoTA contains results of annotate
module. Used for finding LSTINFO file. Default is
'2-annotate'.
--min_len MIN_LEN, -l MIN_LEN
Minimum contig length, less then that value will be
filtered. Default is 1000.

Required arguments:
--folder FOLDER, -f FOLDER
Folder with PanACoTA output. Will be used to search
genome files based on LSTINFO file from annotate
module.

Example command:
badlon prepare -f 2-annotate -o for_sibeliaz.fna
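The two filtering options in the help text above (--min_len and --contigs) can be pictured with a short sketch. This is not badlon's code: the function name filter_contigs and the choice to keep the first N contigs in file order are assumptions made for illustration. Only the default minimum length of 1000 and the "keep all by default" behaviour come from the help output.

```python
def filter_contigs(contigs, min_len=1000, max_contigs=None):
    """Drop contigs shorter than `min_len`, then keep at most `max_contigs`.

    `contigs` is a list of (name, sequence) pairs for one genome.
    Assumption: when a cap is given, the first contigs in input order are kept.
    """
    kept = [(name, seq) for name, seq in contigs if len(seq) >= min_len]
    if max_contigs is not None:
        kept = kept[:max_contigs]
    return kept


example = [("c1", "A" * 1500), ("c2", "A" * 999), ("c3", "A" * 1000)]
print([name for name, _ in filter_contigs(example)])  # prints ['c1', 'c3']
```

A contig of exactly min_len bases is kept here, since the help text only says that contigs shorter than the value are filtered out.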

Step 3: Obtaining blocks with SibeliaZ
3.1 Running SibeliaZ with the command recommended in the badlon prepare output.
Example:
sibeliaz -k 15 -a 100 -n -t 32 -o sibeliaz_out for_sibeliaz.fna

Note that -a should be approximately number_of_genomes * 20; badlon prepare calculates this value automatically.
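If you need to derive the -a value yourself, the rule of thumb above is plain arithmetic. The sketch below assumes you already know the number of genomes; the function names are hypothetical, and the remaining flags are copied unchanged from the example command.

```python
def sibeliaz_abundance(n_genomes, factor=20):
    """Rule of thumb from this README: -a is about number_of_genomes * 20."""
    if n_genomes < 1:
        raise ValueError("at least one genome is required")
    return n_genomes * factor


def sibeliaz_command(n_genomes, fasta="for_sibeliaz.fna", out_dir="sibeliaz_out",
                     k=15, threads=32):
    """Build the SibeliaZ call as an argument list (e.g. for subprocess.run)."""
    return ["sibeliaz", "-k", str(k), "-a", str(sibeliaz_abundance(n_genomes)),
            "-n", "-t", str(threads), "-o", out_dir, fasta]


print(" ".join(sibeliaz_command(5)))
# prints: sibeliaz -k 15 -a 100 -n -t 32 -o sibeliaz_out for_sibeliaz.fna
```

With five genomes this reproduces the example command, where -a is 100.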

3.2 Obtaining blocks from alignment
Use the command recommended in the badlon prepare module output. It is usually the following (minimal block size 3000):
cd sibeliaz_out
echo $'30 150\n100 500\n500 1500' > fine.txt
maf2synteny -s fine.txt -b 3000 blocks_coords.gff

Step 4: Calculating block-based statistics and charts with the badlon analysis module
Parameters can be checked with the help option:
$ badlon analysis --help
usage: badlon analysis [-h] --blocks_file BLOCKS_FILE --type {chr,contig}
[--output OUTPUT]

optional arguments:
-h, --help show this help message and exit
--output OUTPUT, -o OUTPUT
Path to output folder. Default: blockomics_output.

Required arguments:
--blocks_file BLOCKS_FILE, -b BLOCKS_FILE
Blocks resulted as output of original Sibelia or
maf2synteny tool. Usually it's
sibeliaz_out/3000/block_coords.txt file.
--type {chr,contig}, -t {chr,contig}
Type of genome assembly, either 'chr' or 'contig'

Example command:
cd ..
badlon analysis -b sibeliaz_out/3000/blocks_coords.txt -t contig
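As an illustration of what a block-based pan-genome summary can look like, the sketch below splits blocks into those present in every genome (core) and the rest (accessory). The function name, the input mapping and the core/accessory definition are assumptions chosen for the example; they do not describe what badlon analysis computes internally.

```python
def classify_blocks(block_to_genomes, n_genomes):
    """Split block ids into core (found in all genomes) and accessory (found in some).

    `block_to_genomes` maps a block id to the genomes in which the block occurs.
    """
    core, accessory = [], []
    for block_id, genomes in sorted(block_to_genomes.items()):
        target = core if len(set(genomes)) == n_genomes else accessory
        target.append(block_id)
    return core, accessory


presence = {1: {"g1", "g2", "g3"}, 2: {"g1", "g3"}, 3: {"g2"}}
core, accessory = classify_blocks(presence, n_genomes=3)
print(core, accessory)  # prints [1] [2, 3]
```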

Step 5 (optional): Matching blocks and gene annotations with the badlon match module
Parameters can be checked with the help option:
$ badlon match --help
usage: badlon match [-h] --blocks_file BLOCKS_FILE --annotated_folder
ANNOTATED_FOLDER --pangenome_file PANGENOME_FILE --type
{chr,contig} [--output OUTPUT]

optional arguments:
-h, --help show this help message and exit
--output OUTPUT, -o OUTPUT
Path to output folder. Default: blockomics_output.

Required arguments:
--blocks_file BLOCKS_FILE, -b BLOCKS_FILE
Blocks folder resulted as output of original Sibelia
or maf2synteny tool. Usually it's `sibeliaz_out/3000/`
folder.
--annotated_folder ANNOTATED_FOLDER, -a ANNOTATED_FOLDER
LSTINFO folder path, output of `annotate` step of
PanACoTA.
--pangenome_file PANGENOME_FILE, -pg PANGENOME_FILE
File .lst with orthologous genes, output of
`pangenome` step of PanACoTA.
--type {chr,contig}, -t {chr,contig}
Type of genome assembly, either 'chr' or 'contig'

Example command:
badlon match -b sibeliaz_out/3000/blocks_coords.txt -a 2-annotate/ -pg 3-pangenome/*.lst -t contig
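Matching by coordinates comes down to finding which genes overlap which blocks on the same sequence. The sketch below shows that idea in its simplest form. It is not badlon's implementation: the Interval type, the function names and the use of inclusive start/end coordinates are all assumptions for illustration.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    seq_id: str   # chromosome or contig identifier
    start: int    # inclusive start (assumed convention)
    end: int      # inclusive end (assumed convention)
    label: str    # block id or gene name


def overlaps(a, b):
    """True if both intervals lie on the same sequence and share a position."""
    return a.seq_id == b.seq_id and a.start <= b.end and b.start <= a.end


def match_genes_to_blocks(blocks, genes):
    """Map each block label to the labels of the genes overlapping it."""
    return {b.label: [g.label for g in genes if overlaps(b, g)] for b in blocks}


blocks = [Interval("chr1", 100, 500, "block1"), Interval("chr1", 600, 900, "block2")]
genes = [Interval("chr1", 450, 550, "geneA"),
         Interval("chr1", 550, 590, "geneB"),
         Interval("chr2", 100, 200, "geneC")]
print(match_genes_to_blocks(blocks, genes))
# prints {'block1': ['geneA'], 'block2': []}
```

The pairwise scan is quadratic in the number of features, which is fine for a sketch; for large datasets, sorting the intervals or using an interval tree would scale better.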
