fragmentstein 2023.4a0

Creator: bradpython12

Last updated:

Add to Cart

Description:

fragmentstein 2023.4a0

Fragmentstein
version 2023.04
Creating a BAM files from non-sensitive fragments data (i.e. FinaleDB frag.tsv.bgz file) using sequences extracted from a reference genome.
Contents

Dependencies
Installation
Test usage
Arguments
Credits

Make sure you have all the dependencies and you will be able to run the program.
Dependencies

samtools version 1.7 or higher;
bedtools version v2.30.0 or higher;
awk version 20200816 or higher;
gunzip (gzip) version 1.6 or higher;
Python version 3.10 or higher, only if you install it as a Python package;

Installation
For installing fragmentstein from the Python PyPi repository:
pip install fragmentstein

Optional, you can install it in a dedicated Python environment:
conda create -n fragmentstein python=3.10 samtools bedtools -c bioconda
conda activate fragmentstein
pip install fragmentstein

The same you can do using the Mamba package manager:
mamba create -n fragmentstein python=3.10 samtools bedtools -c bioconda
mamba activate fragmentstein
pip install fragmentstein

Afterwards, you can use fragmentstein directly from your shell:
fragmentstein -h

Alternatively, you can install it from sources:
Clone the repository.
git clone https://github.com/uzh-dqbm-cmi/fragmentstein
cd fragmentstein

Add the path of the './scripts/fragmentstein.sh' into your PATH, best in your ~/.bashrc or ~/.zshrc using the following command:
echo 'export PATH=$(pwd)/scripts/fragmentstein.sh:$PATH' >> ~/.bashrc

The fragmentstein.sh script should be available in your shell:
fragmentstein.sh -h

Test usage
The following examples will show you how to do a test run
mkdir results
fragmentstein.sh -i -i tests/data/test_sample1.tsv.bgz -o results/test_sample1.bam \
-g tests/data/resources/test_ref_hg38.fna -c tests/data/resources/test_ref.chrom.sizes

You can install the Python wrapper also from sources as follows:
First install the Python dependency management and packaging tool called Poetry:
curl -sSL https://install.python-poetry.org | python3 -

Followed by installing the fragmentstein Python wrapper from the root of the cloned repository:
poetry install

To run tests use the following command:
poetry run pytest

Arguments
Required arguments
-i or --input Path to finaleDB frag.tsv.bgz file or .bed or .bedpe file. Expected are either a 6-column BED file or a 10-column paired-end BEDPE file.
-g or --genome Path to the reference genome fasta file.
-c or --chrom_sizes Chromosome sizes file.
Optional arguments
-o or --output Path to and name of the output BAM file. Default is to substitute the .tsv.gz part of the extension with .bam.
-r or --read_length Both reverse and forward reads of a fragment will have this length unless the fragment is shorter than the read length. Default: 101.
-qf or --map_quality_filter Minimum mapping quality. Setting it to '0' accepts all fragments. Default: 30.
-qd or --map_quality_default Mapping quality to set for example if missing from the input files or if you want to change it for downstream analyses. Default: 60.
-bq or --base_quality ASCII of Phred-scaled base QUALity+33. Default: F (quality: 37).
-N or --replace_incomplete_nucleotides Replace all incompletely specified nucleotides with N.
-s or --sort Sort the output BAM file by coordinate. No value has to be specified, just type -s for sorting.
-t or --threads Number of parallel threads to be used when possible. Default: 1.
--temp Temporary folder where to store intermediate temporary files. Default: same folder as the output file.
Credits
Fragmentstein is developed and maintained by Zsolt Balázs and Todor Gitchev.

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.

Customer Reviews

There are no reviews.