datascience_cookiecutter 0.3.5

Creator: bradpython12

Table of Contents

Motivation
Features
Installation
Basic Usage
Default Template
Customizing Templates
Makefile
PDM
pytest

🍪 Data Science Cookiecutter
The Data Science Cookiecutter 🍪 is an opinionated, yet configurable, Python project that provides a template for organizing and setting up data science projects. It uses the Cookiecutter project structure to create a standardized and reproducible project layout.
🎯 Motivation
Data science projects often require a well-structured project layout to ensure reproducibility and collaboration. The Data Science Cookiecutter aims to solve this problem by providing a project template. While it follows certain opinions about project organization, it also allows for easy customization to fit different project needs.
✨ Features

Standardized project structure for data science projects
Automatic generation of project files and folders
Customizable templates for different project needs
Customizable for multiple programming languages (the default template currently covers only Python)
Easy project initialization with just a few command-line arguments

⚙️ Installation
The Data Science Cookiecutter is on PyPI and can be installed using pip, Poetry, PDM, or conda. For example, with PDM:
pdm add datascience-cookiecutter

🚀 Basic Usage
To create a new data science project using the Data Science Cookiecutter, follow these steps:

Open a terminal or command prompt.
cd to the directory where you want to create the project.
Run the command cookiecutter myproject, where myproject is the name of your project.
Profit 🎉

📁 Default Template
.
├── Makefile <- Makefile for project automation
├── README.md <- Project documentation and instructions
├── pyproject.toml <- Configuration file for dependencies and project metadata
├── data <- Folder to store data
│ ├── final <- Folder for final processed data
│ ├── processed <- Folder for intermediate processed data
│ ├── raw <- Folder for raw data
│ └── sim <- Folder for simulated data
├── dev <- Folder for development-related files
│ ├── notebooks <- Folder for Jupyter notebooks
│ └── scripts <- Folder for development scripts
├── docs <- Folder for project documentation
├── myproject <- Placeholder folder for the project itself (replaced with your project name)
│ ├── __init__.py <- Python package initialization file
│ └── main.py <- Main Python script for the project
├── references <- Folder for reference materials
├── reports <- Folder for project reports
│ ├── img <- Folder for images and visualizations used in reports
│ └── report.md <- Sample report file (Markdown format)
└── tests <- Folder for project tests
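
As a quick sanity check after generating a project, the layout above can be compared against what is actually on disk. The following is a minimal sketch, not part of the package: the helper name missing_entries and the two constant lists are hypothetical, and it assumes the project root folder carries the same name as the inner package folder.

```python
from pathlib import Path

# Folders and files taken from the default layout shown above.
# The package folder is named after the project, so it is added per call.
EXPECTED_DIRS = [
    "data/final", "data/processed", "data/raw", "data/sim",
    "dev/notebooks", "dev/scripts",
    "docs", "references", "reports/img", "tests",
]
EXPECTED_FILES = ["Makefile", "README.md", "pyproject.toml", "reports/report.md"]


def missing_entries(project_root: Path) -> list[str]:
    """Return the expected folders and files that are absent under project_root."""
    # Assumption: the root folder and the package folder share the project name.
    name = project_root.name
    dirs = EXPECTED_DIRS + [name]
    files = EXPECTED_FILES + [f"{name}/__init__.py", f"{name}/main.py"]
    missing = [d for d in dirs if not (project_root / d).is_dir()]
    missing += [f for f in files if not (project_root / f).is_file()]
    return missing
```

Calling missing_entries(Path("myproject")) on a freshly generated project should return an empty list if the layout matches the tree above.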

🛠️ Customizing Templates
If you want to customize the default template used by cookiecutter, you can create a templates.py file in your $HOME/.config/cookiecutter directory. Follow these steps:

Open a text editor and create a new file called templates.py.
Import the necessary classes Folder and FileTemplate by adding the following lines to templates.py:

from datascience_cookiecutter import Folder, FileTemplate

Define your custom template using the Folder and FileTemplate classes. Here's a minimal example:
MYTEMPLATE = Folder(
    name="{{name}}",
    subfolders=[
        Folder(name="src", files=[FileTemplate(filename="main.py", content="print('Hello, world!')")]),
        Folder(name="data"),
        Folder(name="docs"),
    ],
    files=[
        FileTemplate(filename="README.md", content="# My {{name}}"),
    ],
)

Occurrences of {{name}} are replaced by the project name you pass on the command line, as in cookiecutter myprojectname. To use your custom template, run the cookiecutter command with the --template option followed by the name of your custom template. For example:
$ cookiecutter myproject --template=MYTEMPLATE

Enjoy customizing your templates! ✨🧙‍♂️
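
To make the {{name}} substitution concrete, here is a small sketch of how such a template tree could be expanded. The Folder and FileTemplate classes below are local stand-ins that only mirror the constructor arguments used in the example above. They are not the classes shipped by datascience_cookiecutter, and the package's real rendering logic may differ. The helpers render and planned_paths are hypothetical names.

```python
from dataclasses import dataclass, field


# Local stand-ins that mirror the arguments used in templates.py above.
# They are NOT imported from datascience_cookiecutter.
@dataclass
class FileTemplate:
    filename: str
    content: str = ""


@dataclass
class Folder:
    name: str
    subfolders: list["Folder"] = field(default_factory=list)
    files: list[FileTemplate] = field(default_factory=list)


def render(text: str, name: str) -> str:
    """Replace every {{name}} placeholder with the project name."""
    return text.replace("{{name}}", name)


def planned_paths(folder: Folder, name: str, prefix: str = "") -> dict[str, str]:
    """Map each relative file path in the tree to its rendered content."""
    base = f"{prefix}{render(folder.name, name)}/"
    paths = {
        base + render(f.filename, name): render(f.content, name)
        for f in folder.files
    }
    for sub in folder.subfolders:
        paths.update(planned_paths(sub, name, base))
    return paths


template = Folder(
    name="{{name}}",
    subfolders=[
        Folder(name="src", files=[FileTemplate("main.py", "print('Hello, world!')")]),
    ],
    files=[FileTemplate("README.md", "# My {{name}}")],
)
plan = planned_paths(template, "myproject")
```

Here plan maps myproject/README.md to the rendered heading and myproject/src/main.py to the script body, which is the kind of expansion the text above describes for cookiecutter myproject --template=MYTEMPLATE.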
🛠️ Makefile
A Makefile is a file containing a set of instructions, known as targets, used to automate tasks in software development. It provides a convenient way to define and organize common commands for building, testing, and managing a project.
In the provided Makefile, you have the following targets:

install: Installs project dependencies using pdm install.
test: Runs project tests with pytest.
format: Applies code formatting using isort and black.
lint: Performs linting and static type checking using ruff and mypy.

To use the Makefile, open a terminal or command prompt, navigate to your project directory, and run the desired target using the make command followed by the target name. For example:
make install

❤️ PDM
PDM is a Python package manager and build tool, an alternative to tools such as pip or Poetry. It aims to simplify the management of project dependencies, virtual environments, and distribution builds. See the PDM documentation for installation instructions. If you don't want to use it, you can customize the template to create your own Makefile and pyproject.toml.
The template, like PDM, follows the PEP 621 standard, which stores project metadata in a pyproject.toml file instead of setup.py. This file holds the project metadata and dependencies, including details such as the required Python version and the project entry point.
🔬 pytest
Pytest is a Python testing framework that allows you to write simple and scalable tests with a clean and expressive syntax. It provides powerful features like fixtures, test discovery, and test selection.
For more information, you can visit the official pytest website: pytest.org
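
As a minimal sketch of what a file in the tests folder could contain: pytest discovers functions whose names start with test_ in files named test_*.py and reports any failing assert. The file name and the toy mean function are hypothetical. In a real project the function under test would be imported from the package.

```python
# tests/test_mean.py (hypothetical file name)


def mean(values):
    """Toy function under test; normally imported from the project package."""
    return sum(values) / len(values)


def test_mean_of_integers():
    # A plain assert is enough; pytest reports the compared values on failure.
    assert mean([1, 2, 3]) == 2


def test_mean_rejects_empty_input():
    # Dividing by len([]) raises ZeroDivisionError, which this test expects.
    try:
        mean([])
    except ZeroDivisionError:
        pass
    else:
        raise AssertionError("expected ZeroDivisionError")
```

With the default template, running make test from the project root invokes pytest, which would collect both functions.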

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
