ftarc 0.2.4
ftarc
FASTQ-to-analysis-ready-CRAM Workflow Executor for Human Genome Sequencing
Installation
$ pip install -U ftarc
Dependent commands:
pigz
pbzip2
bgzip
tabix
samtools (and plot-bamstats)
gnuplot
java
gatk
cutadapt
fastqc
trim_galore
bwa or bwa-mem2
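Before running the workflow, it can help to confirm that the commands listed above are on PATH. The following is a minimal sketch and not part of ftarc: `missing_commands` is a hypothetical helper name, and the command names are copied from the list above (either bwa or bwa-mem2 is treated as sufficient).

```sh
#!/bin/sh
# Print each command from the argument list that is not found on PATH.
# `missing_commands` is a hypothetical helper, not part of ftarc.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Command names taken from the dependency list above.
missing_commands pigz pbzip2 bgzip tabix samtools plot-bamstats gnuplot \
  java gatk cutadapt fastqc trim_galore

# Either aligner is enough, so report only when both are absent.
if [ -n "$(missing_commands bwa)" ] && [ -n "$(missing_commands bwa-mem2)" ]; then
  printf '%s\n' 'bwa or bwa-mem2'
fi
```

An empty output means every listed command was found.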
Docker image
Pull the image from Docker Hub.
$ docker image pull dceoy/ftarc
Usage
Create analysis-ready CRAM files from FASTQ files
input files: read1/read2 FASTQ (Illumina)
output files: analysis-ready CRAM
Download hg38 resource data.
$ ftarc download --dest-dir=/path/to/download/dir
Write input file paths and configurations into ftarc.yml.
$ ftarc init
$ vi ftarc.yml # => edit
Example of ftarc.yml:
---
reference_name: hs38DH
adapter_removal: true
metrics_collectors:
  fastqc: true
  picard: true
  samtools: true
resources:
  reference_fa: /path/to/GRCh38_full_analysis_set_plus_decoy_hla.fa
  known_sites_vcf:
    - /path/to/Homo_sapiens_assembly38.dbsnp138.vcf.gz
    - /path/to/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
    - /path/to/Homo_sapiens_assembly38.known_indels.vcf.gz
runs:
  - fq:
      - /path/to/sample01.WGS.R1.fq.gz
      - /path/to/sample01.WGS.R2.fq.gz
  - fq:
      - /path/to/sample02.WGS.R1.fq.gz
      - /path/to/sample02.WGS.R2.fq.gz
  - fq:
      - /path/to/sample03.WGS.R1.fq.gz
      - /path/to/sample03.WGS.R2.fq.gz
    read_group:
      ID: FLOWCELL-1
      PU: UNIT-1
      SM: sample03
      PL: ILLUMINA
      LB: LIBRARY-1
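When there are many samples, the `runs` entries can be generated rather than typed by hand. The sketch below is one way to do that, under the assumption that FASTQ pairs are named `<sample>.R1.fq.gz` and `<sample>.R2.fq.gz`, as in the example paths; `print_runs` is a hypothetical helper, not an ftarc command, and it does not emit `read_group` fields.

```sh
#!/bin/sh
# Print a `runs:` section with one `fq` pair per read1/read2 FASTQ pair.
# Assumes the naming <sample>.R1.fq.gz / <sample>.R2.fq.gz (an assumption
# based on the example paths). `print_runs` is a hypothetical helper.
print_runs() {
  fq_dir=$1
  printf 'runs:\n'
  for r1 in "$fq_dir"/*.R1.fq.gz; do
    [ -e "$r1" ] || continue              # the glob matched nothing
    r2="${r1%.R1.fq.gz}.R2.fq.gz"         # derive the read2 path
    if [ ! -e "$r2" ]; then
      printf 'missing read2 file for %s\n' "$r1" >&2
      continue
    fi
    printf '  - fq:\n      - %s\n      - %s\n' "$r1" "$r2"
  done
}

# Usage: print_runs /path/to/fastq/dir
```

The output can be pasted over the `runs` section of ftarc.yml. Read1 files without a matching read2 file are reported on standard error and skipped.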
Create analysis-ready CRAM files from FASTQ files
$ ftarc pipeline --yml=ftarc.yml --workers=2
Standard workflow:
1. Trim adapters
   - trim_galore
2. Map reads to a human reference genome
   - bwa mem (or bwa-mem2 mem)
3. Mark duplicates
   - gatk MarkDuplicates
   - gatk SetNmMdAndUqTags
4. Apply BQSR (Base Quality Score Recalibration)
   - gatk BaseRecalibrator
   - gatk ApplyBQSR
5. Remove duplicates
   - samtools view
6. Validate output CRAM files
   - gatk ValidateSamFile
7. Collect QC metrics
   - fastqc
   - samtools
   - gatk
Preprocessing and QC-check
Validate BAM or CRAM files using Picard
$ ftarc validate /path/to/genome.fa /path/to/aligned.cram
Collect metrics from FASTQ files using FastQC
$ ftarc fastqc read1.fq.gz read2.fq.gz
Collect metrics from BAM or CRAM files using Samtools
$ ftarc samqc /path/to/genome.fa /path/to/aligned.cram
Apply BQSR to BAM or CRAM files using GATK
$ ftarc bqsr \
--known-sites-vcf=/path/to/Homo_sapiens_assembly38.dbsnp138.vcf.gz \
--known-sites-vcf=/path/to/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz \
--known-sites-vcf=/path/to/Homo_sapiens_assembly38.known_indels.vcf.gz \
/path/to/genome.fa /path/to/markdup.cram
Remove duplicates in marked BAM or CRAM files
$ ftarc dedup /path/to/genome.fa /path/to/markdup.cram
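Duplicate marking sets bit 0x400 (decimal 1024) in a read's SAM FLAG field, and removal amounts to dropping reads that have that bit set. The sketch below illustrates the bit test only; that ftarc's `samtools view` step applies an equivalent filter such as `-F 1024` is an assumption, and `is_duplicate` is a hypothetical helper.

```sh
#!/bin/sh
# Succeed if a SAM FLAG value has the duplicate bit (0x400 = 1024) set.
# `is_duplicate` is a hypothetical helper for illustration only.
is_duplicate() {
  [ $(( $1 & 1024 )) -ne 0 ]
}

# Example FLAG values: 99 (paired, proper pair, mate reverse, first in pair)
# and 1123 (the same flags plus the duplicate bit: 99 + 1024).
for flag in 99 1123; do
  if is_duplicate "$flag"; then
    echo "$flag: duplicate, dropped by a filter such as 'samtools view -F 1024'"
  else
    echo "$flag: kept"
  fi
done
```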
Run ftarc --help for more information.