PMPrimer 1.0.7

Last updated:

0 purchases

PMPrimer 1.0.7 Image
PMPrimer 1.0.7 Images
Add to Cart

Description:

PMPrimer 1.0.7

PMPrimer: a Python-based tool for automated design and evaluation of multiplex PCR primer pairs for diverse templates
Unlike single sequence, primer design on multiple sequences needs data quality filtering, multiple sequence alignment, conservative region identification, primer design for multiple templates, and evaluation of coverage and taxon specificity on multiple templates.
PMPrimer is a Python-based tool for automated design and evaluation of multiplex PCR primer pairs for diverse templates. The greatest strength of PMPrimer is the ability to identify conservative region using Shannon's entropy method, tolerate gaps using haplotype-based method, and evaluate multiplex PCR primer pairs according to template coverage and taxon specificity. PMPrimer were tested in datasets with diversified conservation and data size, including tuf genes of Staphylococci, hsp65 genes of Mycobacteriaceae, and 16S ribosomal RNA genes of Archaea. By comparison with previously designed primers and existing tools, PMPrimer has outstanding performance, such as automation, big data support, higher accuracy, and reasonable running time.
PMPrimer is a command-line tool. To handle diversified characteristics of target sequences, PMPrimer has multiple built-in parameters to make appropriate adjustments for data processing. The only necessary input file for PMPrimer is multiple sequences of target gene in FASTA format. Results are provided as JSON and CSV format files, which can be easily loaded into other programming languages for further analysis.
0x01 Installation
The software requires third-party packages as follow: primer3-py, biopython, matplotlib, pandas, seaborn and blast (2.13.0).
All python-based third-party packages will be automatically installed by pip. User needs install blast according to the documentation at BLAST OFFICIAL WEBSITE.
PMPrimer has been tested with Python3.8+ on Linux and Windows.


Install by pip
We recommend that you install PMPrimer through pip using the command: pip install PMPrimer or pip3 install PMPrimer. pip will automatically install all python-based packages and add a command pmprimer in your system.


Install by code
Download PMPrimer source code from this repository, decompress the source code, install all dependency packages by yourself in your system, then you can run this software by use python3 PMPrimer/pmprimer.py.


0x02 Usage
Use help command pmprimer --help to show usage message. The detailed information is provided as follow.


Input file

--file -f

The only necessary input file for PMPrimer is multiple sequences of target gene in FASTA format. It can be unaligned or aligned. If you assign an unaligned file, you need assignment -a muscle parameter in primer design model to do multiple sequence alignment.
Use like : pmprimer -f seqs.fasta or pmprimer --file seqs.fasta


Basic process

--progress -p

In basic process, PMPrimer assesses data quality, calculates length distribution of input data, filters low quality templates (such as too small and too long templates) according to length distribution, removes redundant templates in terminal taxa in default. If input file has been manually checked, you can use -p notlen to disable length filter based on length distribution. Similarly, you can use -p notsameseq to disable redundant filter based on sequence comparison in terminal taxa. Moreover, if you want to draw heatmap according to distances of multiple templates, you can use -p matrix to trigger drawing heatmap according to distances of multiple templates.
All parameter as follows :

notlen Do not processing sequences according to distribution of length
notsameseq Do not remove redundancy sequences in terminal taxa
matrix Draw heatmap according to distances of multiple templates

Add parameters behind -p, use like : -p notlen notsameseq matrix, order doesn't matter.


Primer design

--alldesign -a

This module includes three steps, multiple sequence alignment by muscle parameter, conservative region identification by threshold minlen gaps merge parameters, and primer design by tm primer2 parameters. If you assign an unaligned file, you must asign muscle parameter in multiple sequence alignment step. You can adjust parameters for conservative region identification to obtain optimal conservative regions. You need assign primer2 to trigger primer design when you get optimal conservative regions.
All parameter as follows :

muscle Use MUSCLE to align multiple sequences, the alignment sequences will be saved as *.mc.fasta
threshold Threshold for identify conservative region, default is 0.95 frequency for major allele, use like threshold:0.95
minlen Minimum length for identifying conservative region, default is 15, use like minlen:15
gaps Maximum rate of gaps, default is 0.1, use like gaps:0.1
merge Use merge module to merge conservative regions
rank1 Rank according to diversity score of non conservative regions and display them
rank2 Rank according to diversity score of conservative regions and display them
haplo Display haploType sequence number of conservative regions
tm Minimum melting temperature, default is 50, use like tm:50.0
primer2 Primer design

Add parameters behind -a, use like : -a threshold:0.96 minlen:16 merge primer2, order doesn't matter.


Amplicon selection and evaluation

--evaluate -e

To evaluate amplicon specificity, we use blast to align amplicon primer pairs to target fasta files. We can assign multiple target fasta files by using "," to split file pathes.
All parameter as follows :

hpcnt Maximum count of primer haploType sequences, default is 10, use like hpcnt:10
minlen Minimum length of amplicon, default is 150bp, use like : minlen:150
maxlen Maximum length of amplicon, default is 1500bp, use like : maxlen:1500
blast Use blast to evaluate amplicon specificity by aligning amplicon primer pairs to target fasta files, use ',' to split multiple target files, use like blast:seqs.fasta,hosts.fasta
rmlow Remove primers with melting temperature lower than parameter tm in primer design module
save Save final results

Add parameters behind -e, use like : -e hpcnt:50 maxlen:1200 save, order doesn't matter.


Debug mode

--debuglevel -d

All parameter is number :

1 Basic debug info
2 Module debug info
4 Complex debug info

If need debug info, use like : -d 1 in normal conditions, this number use binary calculate.


0x03 Output


length filter result
If you use default parameter in basic process (not use notlen), the software will filter input sequences according to length distribution and save the length filter result as seqs.filt.fasta.


redundant filter result
If you use default parameter in basic process (not use notsameseq), the software will filter input sequences according to sequence comparison in terminal taxa and save the redundant filter result as seqs.filt.fasta.


heatmap result
If you use matrix parameter in basic process, the software will create a heatmap figure according to distances of multiple templates and save the figure as tmp.png.


multiple sequence alignment
If you use muscle parameter in primer design module to do multiple sequence alignment, the software will save multiple sequence alignment as seqs.mc.fasta.


amplicon primer pairs
The final result is saved as JSON and CSV format files (timestamp_recommand_region_primer.json and timestamp_recommand_region_primer.csv). The JSON file can be easily loaded into Python, R, or other programming languages for further analysis. The CSV file provides a user readable result file, which has seven columns titled as "Amplicon, Coverage, Taxon specificity, Effective length, Forward degenerate primer, Forward haplotype primers, Forward primer info, Reverse degenerate primer, Reverse haplotype primers, Reverse primer info". In Forward/Reverse primer info, five numbers are included, which are occurrence number of primer in multiple templates, melting temperature of primer, melting temperature of hairpin structure, melting temperature of homodimer structure, and coverage rate of primer in multiple templates.


0x04 Tips


If input file need data process, use like :
pmprimer -f seqs.fasta -p default

Filt sequences according to distribution of length,remove redundancy sequences in terminal taxa.



If input file need align, use like :
pmprimer -f seqs.fasta -a muscle primer2

This command will save alignments as 'seqs.mc.fasta'



If input file is alignments, use like :
pmprimer -f seqs.fasta -a primer2


If you want check result of conservative region identification, you can use rank1, rank2, and haplo parameters to show the result, use like :
pmprimer -f seqs.fasta -a rank1 rank2 haplo
After you obtain optimal conservative regions, You can use primer2 to trigger primer design.


0x05 Datasets In Paper
Dataset in paper can obtained from PMPrimer Datasets


16S ribosomal RNA (rRNA) genes of Archaea
Command in paper is : pmprimer -f Archaea_16SrRNA.rep.mc.fasta -a threshold:0.85 gaps:1.0 merge primer2 tm:45.0 -e hpcnt:600 save


hsp65 (groEL2) genes of Mycobacteriaceae
Command in paper is : pmprimer -f Mycobacteriaceae_groEL2.filt.mc.fasta -a primer2 -e hpcnt:30 save


tuf genes of Staphylococci
Command in paper is : pmprimer -f Staphylococcus_tuf.filt.mc.fasta -a threshold:0.995 minlen:5 merge primer2 -e save


0x06 Citing
Please cite this paper when useing PMPrimer for your publications.

A tool to automatically design multiplex PCR primer pairs for specific targets using diverse templates


https://doi.org/10.1038/s41598-023-43825-0

License:

For personal and professional use. You cannot resell or redistribute these repositories in their original state.

Customer Reviews

There are no reviews.