anyGPT 0.1.0

Creator: bradpython12

Last updated:

Add to Cart

Description:

anyGPT 0.1.0

anyGPT







anyGPT is a general purpose library for training any type of GPT model. Support for gpt-1, gpt-2, and gpt-3 models.
Inspired by nanoGPT by Andrej Karpathy, the goal
of this project is to provide tools for the training and usage of GPT style large language models. The aim is to provide
a tool that is

production ready
easily configurable
scalable
free and open-source
accessible by general software engineers and enthusiasts
easily reproducible and deployable

You don't need a Ph.D. in Machine Learning or Natural Language Processing to use anyGPT.
Installation

NOTE: It is recommended that you set up of a python virtual environment
using mamba, conda,
or poetry.
To install anyGPT:

$ pip install anyGPT

Using Docker
The Docker image supports GPU passthrough for training and inference. In order to enable GPU
passthrough please follow the guide for installing the NVidia Container Toolkit for your OS.

NOTE On Windows you need to follow the guide to get NVidia Container Toolkit setup on WSL2. Docker WSL2 Backend is required.

Once NVidia Container Toolkit and Docker is setup correctly, build the Docker image
$ docker build -t anygpt .

Use the following command to login to the container interactively, and use
anygpt as if it was on your local host
$ docker run --gpus all -it anygpt

Mounting Volumes
It is recommended to mount a local directory into your container in order to
share data between your local host and the container. This will allow you to
save trained checkpoints, reuse datasets between runs and more.
$ docker run --gpus all -v /path/to/local/dir:/data -it anygpt

The above example mounts /path/to/local/dir to the /data directory in the container, and all data and changes are shared between them dynamically.
Non interactive Docker
The above documentation explains how to run a Docker container with an interactive session of anyGPT. You can
also run anyGPT commands to completion using Docker by overriding the entrypoint
$ docker run --gpus=all -v /path/to/your/data:/data --entrypoint anygpt-run -it anygpt /data/test.pt "hello world"

The above command runs anygpt-run with the parameters /data/test.pt "hello world"
Dependencies

torch >= 2.0.0
numpy
transformers
datasets
tiktoken
wandb
tqdm
PyYAML
lightning
tensorboard

Features
Current

CLI and config file driven GPT training
Supports CPU, GPU, TPU, IPU, and HPU
Distributed training strategies for training at scale
Easy spin up using Docker
FastAPI end-points for containerized microservice deployment
HuggingFace integration

Load pre-trained gpt models



Roadmap

Documentation
HuggingFace integration

push to hub


Easy spin VM spinup/getting started with

Downloading of pre-trained models
Gradio ChatGPT style interface for testing and experimentation


Fine-tuning of pre-trained models
Reinforcement Learning from Human Feedback and Rules Base Reward Modeling for LLM alignment
More dataformat support beyond hosted text files


Usage
Data Preparation
$ anygpt-prepare-data -n shakespeare_complete -u https://www.gutenberg.org/cache/epub/100/pg100.txt

Training
Create a config file. In this example, I'll call it gpt-2-30M.yaml. You can also check out the example configuration files.
model_config:
name: 'gpt-2-30M'
block_size: 256
dropout: 0.2
embedding_size: 384
num_heads: 6
num_layers: 6

training_config:
learning_rate: 1.0e-3
batch_size: 8
accumulate_gradients: 8
beta2: 0.99
min_lr: 1.0e-4
max_steps: 5000
val_check_interval: 200
limit_val_batches: 100

io_config:
experiment_name: 'gpt-2'
dataset: 'shakespeare_complete'

$ anygpt-train gpt-2-30M.yaml

Inference
$ anygpt-run results/gpt-2-pretrain/version_0/checkpoints/last.pt \
"JAQUES.
All the world’s a stage,
And all the men and women merely players;
They have their exits and their entrances,
And one man in his time plays many parts,"

Inference Microservice
anyGPT supports running models as a hosted microservice with a singular endpoint for inference.
To launch the microservice, use the anygpt-serve entrypoint.
Commandline Options
$ anygpt-serve -h
usage: anyGPT inference service [-h] [--port PORT] [--log-level LOG_LEVEL] model

Loads an anyGPT model and hosts it on a simple microservice that can run inference over the network.

positional arguments:
model Path t0 the trained model checkpoint to load

options:
-h, --help show this help message and exit
--port PORT Port to start the microservice on (default: 5000)
--host HOST Host to bind microservice to (default: 127.0.0.1)
--log-level LOG_LEVEL
uvicorn log level (default: info)

Example
$ anygpt-serve results/gpt-2-pretrain/version_0/checkpoints/last.pt --port 5000 --log-level info

Sending Requests
anygpt-serve uses FastAPI to serve the microservice.
To see the available microservice api go to the /docs endpoint in your browser once the microservice is started
Spinning up Microservice in Docker
$ docker run --gpus=all -v /path/to/your/data:/data -p 5000:5000 --entrypoint anygpt-serve -it anygpt /data/test.pt --port 5000 --host 0.0.0.0 --log-level info
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:5000 (Press CTRL+C to quit)

Documentation

Limitations
TBD
License
The goal of this project is to enable organizations, both large and small, to train and use GPT style
Large Language Models. I believe the future is open-source, with people and organizations being able to
train from scratch or fine-tune models and deploy to production without relying on gatekeepers. So I'm releasing this
under an MIT license for the benefit of all and in the hope that the community will find it useful.
Released under MIT by @any-LABS.

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.

Customer Reviews

There are no reviews.