bioconvert 1.1.1


Bioconvert
Bioconvert is a collaborative project to facilitate the interconversion of life science data from one format to another.












Contributions:
Want to add a converter? Please join https://github.com/bioconvert/bioconvert/issues/1




Overview
The life sciences use many different data formats. Some are old or have a complex syntax, so converting between them can be a challenge. Bioconvert aims to provide a common tool and interface for converting life science data from one format to another.
Many conversion tools already exist, but they may be dispersed, focused on a few specific formats, difficult to install, or not optimised. With Bioconvert, we plan to cover a wide spectrum of format conversions; we re-use existing tools when possible and provide facilities to compare different conversion tools or methods via benchmarking. New implementations are provided when they are considered better than existing ones.
As of January 2023, Bioconvert supported 50 formats and 100 direct conversions.



Installation
Bioconvert is developed in Python. Please use conda or any other Python environment manager, and install Bioconvert with the pip command:
pip install bioconvert
About 50% of the conversions should work out of the box. However, many conversions require external tools, which is why we recommend using a conda environment: most external tools are available on the bioconda channel.
For instance, to convert a SAM file to a BAM file you would need to install samtools as follows:
conda install -c bioconda samtools
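A missing external tool usually only shows up when the conversion is run, so it can help to check for it beforehand. The sketch below is not part of Bioconvert: `missing_tools` is a hypothetical helper that relies on the standard library's `shutil.which` to report which executables cannot be found on the PATH.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that are not found on the PATH.

    Hypothetical helper for illustration; not part of Bioconvert.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


# samtools is the external tool named in the SAM-to-BAM example above.
missing = missing_tools(["samtools"])
if missing:
    print("Install these first:", ", ".join(missing))
else:
    print("All required tools are available.")
```

The same check could be run with any list of executable names before starting a long batch of conversions.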
Since Bioconvert is available on bioconda, one solution that installs Bioconvert and all its dependencies is to use conda/mamba:
conda create --name bioconvert mamba
conda activate bioconvert
mamba install bioconvert
bioconvert --help
See the Installation section for more details and alternative solutions (docker, singularity).


Quick Start
There are many conversions available. Type:
bioconvert --help
to list the available conversions. For example, to convert a FastQ file into
a FastA file, you could run any of the following:
bioconvert fastq2fasta input.fastq output.fasta
bioconvert fastq2fasta input.fq output.fasta
bioconvert fastq2fasta input.fq.gz output.fasta.gz
bioconvert fastq2fasta input.fq.gz output.fasta.bz2
When there is no ambiguity, you can be implicit:
bioconvert input.fastq output.fasta
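To illustrate how a converter name could be worked out from two file names in the implicit form, here is a minimal sketch. It is not Bioconvert's actual logic: the helper names, the set of compression suffixes, and the alias table (`fq`, `fa`) are all assumptions made for this example.

```python
from pathlib import Path

# Assumed for this sketch: compression suffixes to ignore and
# short extensions mapped to a canonical format name.
COMPRESSION = {".gz", ".bz2"}
ALIASES = {"fq": "fastq", "fa": "fasta"}


def guess_format(filename):
    """Return the format implied by a file name, ignoring compression."""
    suffixes = [s.lower() for s in Path(filename).suffixes]
    while suffixes and suffixes[-1] in COMPRESSION:
        suffixes.pop()
    if not suffixes:
        raise ValueError(f"cannot guess the format of {filename!r}")
    ext = suffixes[-1].lstrip(".")
    return ALIASES.get(ext, ext)


def guess_converter(infile, outfile):
    """Build a converter name such as 'fastq2fasta' from two file names."""
    return f"{guess_format(infile)}2{guess_format(outfile)}"


print(guess_converter("input.fastq", "output.fasta"))      # fastq2fasta
print(guess_converter("input.fq.gz", "output.fasta.bz2"))  # fastq2fasta
```

A file name without a recognisable extension raises `ValueError` here, which corresponds to the ambiguous case where the converter has to be named explicitly.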
The default conversion method is used, but you may choose another one. Check out the available methods with:
bioconvert fastq2fasta --show-methods
For more help about a conversion, just type:
bioconvert fastq2fasta --help
and more generally:
bioconvert --help
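When many files need the same conversion, the explicit command form shown above can be generated from a script. The following is a sketch under stated assumptions: `build_commands` and `run_all` are hypothetical helpers, the file names are placeholders, and the output directory is assumed to exist already. By default nothing is executed; the commands are only printed.

```python
import subprocess
from pathlib import Path


def build_commands(fastq_files, outdir):
    """Return one `bioconvert fastq2fasta` argument list per input file."""
    outdir = Path(outdir)
    commands = []
    for fastq in map(Path, fastq_files):
        fasta = outdir / (fastq.stem + ".fasta")
        commands.append(["bioconvert", "fastq2fasta", str(fastq), str(fasta)])
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if a conversion fails.
            subprocess.run(cmd, check=True)


# Placeholder file names; dry_run=True only prints the commands.
run_all(build_commands(["a.fastq", "b.fastq"], "fasta"))
```

Passing `dry_run=False` would run each command in turn and stop at the first failure.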
You may also call BioConvert from a Python shell:
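# Import a converter
from bioconvert.fastq2fasta import FASTQ2FASTA

# Input and output file names (placeholders; use your own paths)
infile = "input.fastq"
outfile = "output.fasta"

# Instantiate with the input and output file names
convert = FASTQ2FASTA(infile, outfile)

# Run the conversion itself
convert()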
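To make concrete what a fastq2fasta conversion does to the data, here is a pure-Python sketch that is independent of Bioconvert and is not its implementation. It assumes plain four-line FASTQ records (identifier, sequence, separator, quality); `fastq_to_fasta` is a hypothetical name used for illustration.

```python
import io


def fastq_to_fasta(fastq_handle, fasta_handle):
    """Copy identifiers and sequences from FASTQ to FASTA.

    Assumes four-line FASTQ records; returns the number of records written.
    """
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of input
        if not header.strip():
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"unexpected FASTQ header: {header!r}")
        sequence = fastq_handle.readline()
        fastq_handle.readline()  # '+' separator line, ignored
        fastq_handle.readline()  # quality string, dropped in FASTA
        fasta_handle.write(">" + header[1:].rstrip("\n") + "\n")
        fasta_handle.write(sequence.rstrip("\n") + "\n")
        count += 1
    return count


fastq = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nFFFF\n")
fasta = io.StringIO()
print(fastq_to_fasta(fastq, fasta))  # 2
print(fasta.getvalue(), end="")
```

The quality line is simply discarded, which is why the reverse direction (fasta2fastq) cannot recover the original qualities.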


Available Converters

Conversion table

Converter           Default method
abi2fasta           BIOPYTHON
abi2fastq           BIOPYTHON
abi2qual            BIOPYTHON
bam2bedgraph        BEDTOOLS
bam2bigwig          DEEPTOOLS
bam2cov             BEDTOOLS
bam2cram            SAMTOOLS
bam2fasta           SAMTOOLS
bam2fastq           SAMTOOLS
bam2json            BAMTOOLS
bam2sam             SAMBAMBA
bam2tsv             SAMTOOLS
bam2wiggle          WIGGLETOOLS
bcf2vcf             BCFTOOLS
bcf2wiggle          WIGGLETOOLS
bed2wiggle          WIGGLETOOLS
bedgraph2bigwig     UCSC
bedgraph2cov        BIOCONVERT
bedgraph2wiggle     WIGGLETOOLS
bigbed2bed          DEEPTOOLS
bigbed2wiggle       WIGGLETOOLS
bigwig2bedgraph     DEEPTOOLS
bigwig2wiggle       WIGGLETOOLS
bplink2plink        PLINK
bplink2vcf          PLINK
bz22gz              Unix commands
clustal2fasta       BIOPYTHON
clustal2nexus       GOALIGN
clustal2phylip      BIOPYTHON
clustal2stockholm   BIOPYTHON
cram2bam            SAMTOOLS
cram2fasta          SAMTOOLS
cram2fastq          SAMTOOLS
cram2sam            SAMTOOLS
csv2tsv             BIOCONVERT
csv2xls             Pandas
dsrc2gz             DSRC software
embl2fasta          BIOPYTHON
embl2genbank        BIOPYTHON
fasta2clustal       BIOPYTHON
fasta2faa           BIOCONVERT
fasta2fasta_agp     BIOCONVERT
fasta2fastq         PYSAM
fasta2genbank       BIOCONVERT
fasta2nexus         GOALIGN
fasta2phylip        BIOPYTHON
fasta2twobit        UCSC
fasta_qual2fastq    PYSAM
fastq2fasta         BIOCONVERT
fastq2fasta_qual    BIOCONVERT
fastq2qual          READFQ
genbank2embl        BIOPYTHON
genbank2fasta       BIOPYTHON
genbank2gff3        BIOCODE
gfa2fasta           BIOCONVERT
gff22gff3           BIOCONVERT
gff32gff2           BIOCONVERT
gff32gtf            BIOCONVERT
gz2bz2              pigz/pbzip2 software
gz2dsrc             DSRC software
json2yaml           Python
maf2sam             BIOCONVERT
newick2nexus        GOTREE
newick2phyloxml     GOTREE
nexus2clustal       GOALIGN
nexus2fasta         BIOPYTHON
nexus2newick        GOTREE
nexus2phylip        GOALIGN
nexus2phyloxml      GOTREE
ods2csv             pyexcel library
pdb2faa             BIOCONVERT
phylip2clustal      BIOPYTHON
phylip2fasta        BIOPYTHON
phylip2nexus        GOALIGN
phylip2stockholm    BIOPYTHON
phylip2xmfa         BIOPYTHON
phyloxml2newick     GOTREE
phyloxml2nexus      GOTREE
plink2bplink        PLINK
plink2vcf           PLINK
sam2bam             SAMTOOLS
sam2cram            SAMTOOLS
sam2paf             BIOCONVERT
scf2fasta           BIOCONVERT
scf2fastq           BIOCONVERT
sra2fastq           FASTQDUMP
stockholm2clustal   BIOPYTHON
stockholm2phylip    BIOPYTHON
tsv2csv             BIOCONVERT
twobit2fasta        DEEPTOOLS
vcf2bcf             BCFTOOLS
vcf2bed             BIOCONVERT
vcf2bplink          PLINK
vcf2plink           PLINK
vcf2wiggle          WIGGLETOOLS
wig2bed             BEDOPS
xls2csv
xlsx2csv            Pandas library
xmfa2phylip         BIOPYTHON
yaml2json           Pandas library




Contributors
Setting up and maintaining Bioconvert has been possible thanks to its users and contributors. Thanks to all of them.



Changes


Version   Description

1.1.1     Fix benchmark labels.
          NEW: fast52pod5 conversion
          FIX: set goalign and gotree instead of go requirements

1.1.0     Implement ability to benchmark the CPU and memory usage (not just time)
          benchmark incorporates CPU/memory usage

1.0.0     Fix bam2fastq for paired data that computed useless intermediate file
          https://github.com/bioconvert/bioconvert/issues/325
          more realistic fastq simulator
          pin openpyxl to <=3.0.10 to prevent regression error in v3.1.0

0.6.3     add picard method in bam2sam
          Fixed all CI workflows to use mamba
          drop python3.7 support and add 3.10 support
          update bedops test file to fit the latest bedops 2.4.41 version
          revisit logging system

0.6.2     added gff3 to gtf conversion.
          Added pdb to faa conversion
          Added missing --reference argument to the cram2sam conversion

0.6.1     output file can be in sub-directories, allowing syntax such as
          'bioconvert fastq2fasta test.fastq outputs/test.fasta'
          fix all CI actions
          add more examples as notebooks in ./examples
          add a Snakefile for the paper in ./doc/Snakefile_paper

0.6.0     Fix bug in bam2sam (method sambamba)
          Fix graph layout
          add threading in fastq2fasta (seqkit method)
          multibenchmark feature added
          stable version used for web interface

0.5.2     Update requirements and environment.yml and add a conda spec-file.txt file

0.5.1     add genbank2gff3 requirement material in bioconvert.utils.biocode

0.5.0     Add CI actions for all converters
          remove sniffer (now in biosniff on pypi https://pypi.org/project/biosniff/)
          A complete benchmarking suite (see doc/Snakefile_benchmark file and benchmarking)
          documentation and tests for all converters
          removed the validators (we assume inputs are correct)

0.4.X     (Aug 2019) added nexus2fasta, cram2fasta, fasta2faa … ; 1-to-many and
          many-to-one converters are now part of the API.

0.3.X     (May 2019) new methods abi2qual, bigbed2bed, etc.; added --threads option

0.2.X     (Aug 2018) abi2fastx, bioconvert_stats tool added

0.1.X     major refactoring to have subcommands with implicit/explicit mode

License:

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
