ftarc 0.2.4

Creator: bigcodingguy24

ftarc
FASTQ-to-analysis-ready-CRAM Workflow Executor for Human Genome Sequencing



Installation
$ pip install -U ftarc

Required commands:

pigz
pbzip2
bgzip
tabix
samtools (and plot-bamstats)
gnuplot
java
gatk
cutadapt
fastqc
trim_galore
bwa or bwa-mem2
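
Before running the workflow, it can help to confirm that the commands listed above are on PATH. The following is a minimal sketch using only the Python standard library; the helper name `missing_commands` and the `REQUIRED` grouping are illustrative and not part of ftarc.

```python
import shutil

# Commands taken from the list above. Each tuple is a group of alternatives:
# one installed command per group is enough (e.g. bwa or bwa-mem2).
REQUIRED = [
    ("pigz",), ("pbzip2",), ("bgzip",), ("tabix",),
    ("samtools",), ("plot-bamstats",), ("gnuplot",), ("java",),
    ("gatk",), ("cutadapt",), ("fastqc",), ("trim_galore",),
    ("bwa", "bwa-mem2"),
]


def missing_commands(required=REQUIRED, which=shutil.which):
    """Return the groups for which no alternative is found on PATH."""
    return [group for group in required
            if not any(which(cmd) for cmd in group)]


if __name__ == "__main__":
    for group in missing_commands():
        print("missing:", " or ".join(group))
```

The `which` parameter exists only so the lookup can be replaced in tests; by default the check uses `shutil.which`.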

Docker image
Pull the image from Docker Hub.
$ docker image pull dceoy/ftarc

Usage
Create analysis-ready CRAM files from FASTQ files



input files                     output files
------------------------------  -------------------
read1/read2 FASTQ (Illumina)    analysis-ready CRAM





Download hg38 resource data.
$ ftarc download --dest-dir=/path/to/download/dir



Write input file paths and configurations into ftarc.yml.
$ ftarc init
$ vi ftarc.yml # => edit

Example of ftarc.yml:
---
reference_name: hs38DH
adapter_removal: true
metrics_collectors:
  fastqc: true
  picard: true
  samtools: true
resources:
  reference_fa: /path/to/GRCh38_full_analysis_set_plus_decoy_hla.fa
  known_sites_vcf:
    - /path/to/Homo_sapiens_assembly38.dbsnp138.vcf.gz
    - /path/to/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
    - /path/to/Homo_sapiens_assembly38.known_indels.vcf.gz
runs:
  - fq:
      - /path/to/sample01.WGS.R1.fq.gz
      - /path/to/sample01.WGS.R2.fq.gz
  - fq:
      - /path/to/sample02.WGS.R1.fq.gz
      - /path/to/sample02.WGS.R2.fq.gz
  - fq:
      - /path/to/sample03.WGS.R1.fq.gz
      - /path/to/sample03.WGS.R2.fq.gz
    read_group:
      ID: FLOWCELL-1
      PU: UNIT-1
      SM: sample03
      PL: ILLUMINA
      LB: LIBRARY-1
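
For many samples, the `runs:` block can be generated instead of typed by hand. The sketch below pairs FASTQ files and prints that block as text. It assumes file names follow the `<prefix>.R1.fq.gz` / `<prefix>.R2.fq.gz` pattern used in the example; the helper names `pair_fastqs` and `render_runs` are hypothetical and not part of ftarc.

```python
import re
from collections import defaultdict

# Assumed naming convention, taken from the example paths above.
_READ_RE = re.compile(r"^(?P<prefix>.+)\.R(?P<read>[12])\.fq\.gz$")


def pair_fastqs(paths):
    """Group FASTQ paths into [R1, R2] pairs that share a prefix."""
    reads = defaultdict(dict)
    for path in paths:
        match = _READ_RE.match(path)
        if match is None:
            raise ValueError(f"unrecognized FASTQ name: {path}")
        reads[match["prefix"]][match["read"]] = path
    pairs = []
    for prefix in sorted(reads):
        found = reads[prefix]
        if set(found) != {"1", "2"}:
            raise ValueError(f"incomplete pair for {prefix}")
        pairs.append([found["1"], found["2"]])
    return pairs


def render_runs(pairs):
    """Render the pairs as the text of a `runs:` block for ftarc.yml."""
    lines = ["runs:"]
    for r1, r2 in pairs:
        lines += ["  - fq:", f"      - {r1}", f"      - {r2}"]
    return "\n".join(lines) + "\n"
```

The output can be pasted into ftarc.yml in place of the `runs:` section; optional keys such as `read_group` still need to be added per run.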



Create analysis-ready CRAM files from FASTQ files
$ ftarc pipeline --yml=ftarc.yml --workers=2

Standard workflow:

1. Trim adapters
   trim_galore

2. Map reads to a human reference genome
   bwa mem (or bwa-mem2 mem)

3. Mark duplicates
   gatk MarkDuplicates
   gatk SetNmMdAndUqTags

4. Apply BQSR (Base Quality Score Recalibration)
   gatk BaseRecalibrator
   gatk ApplyBQSR

5. Remove duplicates
   samtools view

6. Validate output CRAM files
   gatk ValidateSamFile

7. Collect QC metrics
   fastqc
   samtools
   gatk





Preprocessing and QC-check


Validate BAM or CRAM files using Picard
$ ftarc validate /path/to/genome.fa /path/to/aligned.cram



Collect metrics from FASTQ files using FastQC
$ ftarc fastqc read1.fq.gz read2.fq.gz



Collect metrics from BAM or CRAM files using Samtools and Picard
$ ftarc samqc /path/to/genome.fa /path/to/aligned.cram



Apply BQSR to BAM or CRAM files using GATK
$ ftarc bqsr \
    --known-sites-vcf=/path/to/Homo_sapiens_assembly38.dbsnp138.vcf.gz \
    --known-sites-vcf=/path/to/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz \
    --known-sites-vcf=/path/to/Homo_sapiens_assembly38.known_indels.vcf.gz \
    /path/to/genome.fa /path/to/markdup.cram
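
The bqsr command repeats `--known-sites-vcf` once per VCF file. When the command is driven from a script, the argument list can be assembled programmatically. This is a minimal sketch that uses only the options shown above; the function name `bqsr_argv` is illustrative and not part of ftarc.

```python
import shlex


def bqsr_argv(reference_fa, cram, known_sites_vcfs):
    """Build the `ftarc bqsr` argument list, one --known-sites-vcf per VCF."""
    argv = ["ftarc", "bqsr"]
    argv += [f"--known-sites-vcf={vcf}" for vcf in known_sites_vcfs]
    argv += [reference_fa, cram]
    return argv


if __name__ == "__main__":
    argv = bqsr_argv(
        "/path/to/genome.fa",
        "/path/to/markdup.cram",
        ["/path/to/Homo_sapiens_assembly38.dbsnp138.vcf.gz"],
    )
    # Print the command for review; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```

Building a list rather than a single string avoids shell quoting problems when paths contain spaces.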



Remove duplicates in marked BAM or CRAM files
$ ftarc dedup /path/to/genome.fa /path/to/markdup.cram
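
The standard workflow lists samtools view as the tool for removing duplicates. In the SAM format, MarkDuplicates-style tools record a duplicate by setting bit 0x400 (decimal 1024) in the FLAG field, and `samtools view -F 1024` excludes reads with that bit set. Whether ftarc uses exactly this option is an assumption; the sketch below only illustrates the flag test on hypothetical (name, flag) records.

```python
# SAM FLAG bit meaning "PCR or optical duplicate" (decimal 1024).
DUPLICATE_FLAG = 0x400


def is_duplicate(flag):
    """Return True when the duplicate bit is set in a SAM FLAG value."""
    return bool(flag & DUPLICATE_FLAG)


def drop_duplicates(records):
    """Keep (name, flag) records whose duplicate bit is not set."""
    return [rec for rec in records if not is_duplicate(rec[1])]


if __name__ == "__main__":
    # Hypothetical records: the second one has the duplicate bit set.
    reads = [("read1", 99), ("read2", 99 + 1024), ("read3", 147)]
    print(drop_duplicates(reads))
```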



Run ftarc --help for more information.

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
