Scikit Fingerprints 1.8.0 | GitLocker.com Product

scikit-fingerprints

scikit-fingerprints is a Python library for efficient
computation of molecular fingerprints.

Table of Contents

Description
Supported platforms
Installation
Quickstart
Project overview
Contributing
License

Description
Molecular fingerprints are crucial in various scientific fields, including drug discovery, materials science, and
chemical analysis. However, existing Python libraries for computing molecular fingerprints often lack performance,
user-friendliness, and support for modern programming standards. This project aims to address these shortcomings by
creating an efficient and accessible Python library for molecular fingerprint computation.
You can find the documentation HERE
Main features:

scikit-learn compatible
feature-rich, with >30 fingerprints
parallelization
sparse matrix support
commercial-friendly MIT license

Supported platforms

                   python3.9               python3.10  python3.11  python3.12
Ubuntu - latest    ✅                      ✅          ✅          ✅
Windows - latest   ✅                      ✅          ✅          ✅
macOS - latest     only macOS 13 or newer  ✅          ✅          ✅

Installation
You can install the library using pip:
pip install scikit-fingerprints

If you need bleeding-edge features and don't mind potentially unstable or undocumented functionality,
you can also install directly from GitHub:
pip install git+https://github.com/scikit-fingerprints/scikit-fingerprints.git

Quickstart
Most fingerprints are based on molecular graphs (2D-based), and you can use SMILES
input directly:
from skfp.fingerprints import AtomPairFingerprint

smiles_list = ["O=S(=O)(O)CCS(=O)(=O)O", "O=C(O)c1ccccc1O"]

atom_pair_fingerprint = AtomPairFingerprint()

X = atom_pair_fingerprint.transform(smiles_list)
print(X)

For fingerprints using conformers (3D-based), you need to create molecules first
and compute their conformers. Such fingerprints have the requires_conformers
attribute set to True.
from skfp.preprocessing import ConformerGenerator, MolFromSmilesTransformer
from skfp.fingerprints import WHIMFingerprint

smiles_list = ["O=S(=O)(O)CCS(=O)(=O)O", "O=C(O)c1ccccc1O"]

mol_from_smiles = MolFromSmilesTransformer()
conf_gen = ConformerGenerator()
fp = WHIMFingerprint()
print(fp.requires_conformers) # True

mols_list = mol_from_smiles.transform(smiles_list)
mols_list = conf_gen.transform(mols_list)

X = fp.transform(mols_list)
print(X)

You can also use scikit-learn functionality such as pipelines and feature unions
to build complex workflows. Popular datasets, e.g. from the MoleculeNet benchmark,
can be loaded directly.
from skfp.datasets.moleculenet import load_clintox
from skfp.metrics import multioutput_auroc_score
from skfp.model_selection.scaffold_split import scaffold_train_test_split
from skfp.fingerprints import ECFPFingerprint, MACCSFingerprint
from skfp.preprocessing import MolFromSmilesTransformer

from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline, make_union

smiles, y = load_clintox()
smiles_train, smiles_test, y_train, y_test = scaffold_train_test_split(
    smiles, y, test_size=0.2
)

pipeline = make_pipeline(
    MolFromSmilesTransformer(),
    make_union(ECFPFingerprint(count=True), MACCSFingerprint()),
    RandomForestClassifier(random_state=0),
)
pipeline.fit(smiles_train, y_train)

y_pred_proba = pipeline.predict_proba(smiles_test)
auroc = multioutput_auroc_score(y_test, y_pred_proba)
print(f"AUROC: {auroc:.2%}")

Project overview
scikit-fingerprints brings molecular fingerprints and related functionality into
the scikit-learn ecosystem. With a familiar class-based design and the .transform() method,
fingerprints can be computed from SMILES strings or RDKit Mol objects. The resulting NumPy
arrays or SciPy sparse arrays can be used directly in ML pipelines.
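
As a quick illustration of the two accepted input types, the sketch below computes the same fingerprint once from SMILES strings and once from RDKit Mol objects, then compares the results. MACCSFingerprint is used only as an example; any 2D fingerprint is assumed to behave the same way.

```python
import numpy as np
from rdkit import Chem

from skfp.fingerprints import MACCSFingerprint

smiles_list = ["O=S(=O)(O)CCS(=O)(=O)O", "O=C(O)c1ccccc1O"]

# Build RDKit Mol objects by hand for the same molecules.
mols_list = [Chem.MolFromSmiles(smi) for smi in smiles_list]

fp = MACCSFingerprint()
X_from_smiles = fp.transform(smiles_list)
X_from_mols = fp.transform(mols_list)

# Both inputs describe the same molecules, so the arrays should match.
print(type(X_from_smiles).__name__, X_from_smiles.shape)
print(np.array_equal(X_from_smiles, X_from_mols))
```
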
Main features:

Scikit-learn compatible: scikit-fingerprints uses the familiar scikit-learn
interface and conforms to its API requirements. You can include molecular
fingerprints in pipelines, concatenate them with feature unions, and process with
ML algorithms.

Performance optimization: both speed and memory usage are optimized by
using parallelism (with Joblib) and sparse CSR matrices (with SciPy). Heavy
computation is typically delegated to the C++ code of RDKit.

Feature-rich: in addition to computing fingerprints, you can load popular
benchmark datasets (e.g. from MoleculeNet), perform splitting (e.g. scaffold
split), generate conformers, and tune hyperparameters with optimized cross-validation.

Well-documented: each public function and class has extensive documentation,
including relevant implementation details, caveats, and literature references.

Extensibility: any functionality can be easily modified or extended by
inheriting from existing classes.

High code quality: pre-commit hooks check each commit for code quality (e.g. black,
flake8), typing (mypy), and security (e.g. bandit, safety). The CI/CD process with
GitHub Actions also includes over 250 unit and integration tests.

Contributing
Please read CONTRIBUTING.md and CODE_OF_CONDUCT.md for details on our code of
conduct and the process for submitting pull requests to us.

Citing
If you use scikit-fingerprints in your work, please cite our paper, available on arXiv:
@misc{scikit-fingeprints,
    title={Scikit-fingerprints: easy and efficient computation of molecular fingerprints in Python},
    author={Jakub Adamczyk and Piotr Ludynia},
    year={2024},
    eprint={2407.13291},
    archivePrefix={arXiv},
    primaryClass={cs.SE},
    url={https://arxiv.org/abs/2407.13291},
}

License
This project is licensed under the MIT License - see the LICENSE.md file for details.

