Personal Information Tagger Based on Named Entity Recognition

Description:

This project builds a Named Entity Recognition (NER) model capable of identifying key elements in text, such as names, places, brands, and monetary values. It is particularly useful for structuring unstructured text and extracting important information from large datasets. The project uses the RoBERTa architecture and Hugging Face Transformers, and is deployed with AWS ECR and AWS EC2.
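
A quick illustration of the kind of tagging involved, using the Hugging Face Transformers pipeline API with a publicly available multilingual NER checkpoint (the checkpoint name is an assumption for demo purposes, not this project's own model):

```python
from transformers import pipeline

# Any XLM-R-based NER checkpoint works for a demo; this one is on the Hub.
ner = pipeline("token-classification",
               model="Davlan/xlm-roberta-base-ner-hrl",
               aggregation_strategy="simple")

for entity in ner("Sarah paid Nike 120 dollars in Berlin."):
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))
# e.g. PER Sarah / ORG Nike / LOC Berlin
```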

Features:

  • NER Tagging: Identifies and tags entities such as names, places, brands, and monetary values.
  • Cross-Lingual Support: Built on the XTREME benchmark, which covers 40 typologically diverse languages.
  • Model Architecture: Uses RoBERTa for tokenization and training (see the tokenizer sketch after this list).
  • API Integration: The model is served with FastAPI, which provides an interactive Swagger UI out of the box.
  • Scalable Deployment: Hosted on AWS ECR and AWS EC2 for efficient, scalable access.
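
A small sketch of the sub-word tokenization step referenced above, assuming a RoBERTa-family checkpoint (xlm-roberta-base here is an assumption):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

# SentencePiece splits rare words into sub-word pieces; "▁" marks word starts.
print(tokenizer.tokenize("Jack Sparrow lives in Port Royal."))
# e.g. ['▁Jack', '▁Spar', 'row', '▁lives', '▁in', '▁Port', '▁Royal', '.']
```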

Requirements:

  • Software:
    • Python 3.7
    • Conda for environment management
    • AWS CLI for deployment
  • Libraries:
    • PyTorch
    • Transformers (Hugging Face)
    • FastAPI
    • Swagger UI (bundled with FastAPI) for API documentation

Install dependencies:

 

```bash
pip install -r requirements.txt
```

Instructions:

  1. Setup Environment:

    • Create a new Conda environment:
       

      ```bash
      conda create -p ./env python=3.7 -y
      ```

    • Activate the environment:
       

      ```bash
      conda activate ./env
      ```

  2. Prepare Dataset:

    • Use the XTREME benchmark for labeled data across typologically diverse languages (a loading sketch follows this step).
    • Visualize and spot-check the labeled entities using tools like spaCy's displaCy.
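
    A minimal sketch of pulling the NER portion of XTREME (the PAN-X/WikiANN subset) through the Hugging Face datasets library; the subset name and field names follow the public dataset card:

      ```python
      from datasets import load_dataset

      # Load the English PAN-X (WikiANN) NER split of XTREME; swap the
      # language code (e.g. "PAN-X.de") for other languages.
      panx_en = load_dataset("xtreme", name="PAN-X.en")

      sample = panx_en["train"][0]
      print(sample["tokens"])    # list of words
      print(sample["ner_tags"])  # integer tags (O, B-PER, I-PER, B-ORG, ...)
      ```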
  3. Model Training:

    • Train the NER model using the RoBERTa architecture and tokenizer from Hugging Face (a training sketch follows this step).
    • Implement and test both the training and prediction pipelines.
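
    A compressed training sketch, assuming the PAN-X subset of XTREME and the xlm-roberta-base checkpoint (both assumptions; the repo's actual pipeline, paths, and hyperparameters may differ):

      ```python
      from datasets import load_dataset
      from transformers import (AutoTokenizer, AutoModelForTokenClassification,
                                DataCollatorForTokenClassification,
                                Trainer, TrainingArguments)

      dataset = load_dataset("xtreme", name="PAN-X.en")
      labels = dataset["train"].features["ner_tags"].feature.names

      checkpoint = "xlm-roberta-base"  # assumption: a cross-lingual RoBERTa variant
      tokenizer = AutoTokenizer.from_pretrained(checkpoint)
      model = AutoModelForTokenClassification.from_pretrained(
          checkpoint, num_labels=len(labels))

      def tokenize_and_align(batch):
          # Tokenize pre-split words, then align word-level tags to sub-word
          # tokens; special tokens and word continuations get -100 so the
          # loss function ignores them.
          enc = tokenizer(batch["tokens"], truncation=True,
                          is_split_into_words=True)
          enc["labels"] = []
          for i, tags in enumerate(batch["ner_tags"]):
              prev, ids = None, []
              for word_id in enc.word_ids(batch_index=i):
                  ids.append(-100 if word_id is None or word_id == prev
                             else tags[word_id])
                  prev = word_id
              enc["labels"].append(ids)
          return enc

      encoded = dataset.map(tokenize_and_align, batched=True,
                            remove_columns=dataset["train"].column_names)

      trainer = Trainer(
          model=model,
          args=TrainingArguments(output_dir="ner-model",
                                 per_device_train_batch_size=16,
                                 num_train_epochs=3),
          train_dataset=encoded["train"],
          eval_dataset=encoded["validation"],
          data_collator=DataCollatorForTokenClassification(tokenizer),
      )
      trainer.train()
      trainer.save_model("ner-model")
      ```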
  4. Run Inference:

    • Start the application:
       

      ```bash
      python app.py
      ```

    • Access Swagger UI for API testing:
      • URL: http://localhost:8080/docs
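
    The repository's app.py is not reproduced here; the following is a hypothetical minimal version illustrating how FastAPI exposes the model with an auto-generated Swagger UI (the model path and endpoint name are assumptions):

      ```python
      import uvicorn
      from fastapi import FastAPI
      from pydantic import BaseModel
      from transformers import pipeline

      app = FastAPI(title="Personal Information Tagger")

      # Assumption: the fine-tuned model was saved to ./ner-model.
      ner = pipeline("token-classification", model="./ner-model",
                     aggregation_strategy="simple")

      class TextIn(BaseModel):
          text: str

      @app.post("/predict")
      def predict(payload: TextIn):
          # Cast numpy scores to plain floats so they serialize as JSON.
          entities = [{**e, "score": float(e["score"])}
                      for e in ner(payload.text)]
          return {"entities": entities}

      if __name__ == "__main__":
          # Port 8080 matches the Swagger UI URL above.
          uvicorn.run(app, host="0.0.0.0", port=8080)
      ```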
  5. Deploy on AWS:

    • Build and push the Docker image to AWS ECR (see the example commands below).
    • Deploy the containerized application to an AWS EC2 instance.
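
    As a rough guide, the push to ECR typically looks like the following; the account ID, region, and repository name are placeholders:

      ```bash
      # One-time: create the ECR repository ("pii-tagger" is a placeholder name).
      aws ecr create-repository --repository-name pii-tagger

      # Authenticate Docker against your private registry.
      aws ecr get-login-password --region <region> \
        | docker login --username AWS --password-stdin <account-id>.dkr.ecr.<region>.amazonaws.com

      # Build, tag, and push the image.
      docker build -t pii-tagger .
      docker tag pii-tagger:latest <account-id>.dkr.ecr.<region>.amazonaws.com/pii-tagger:latest
      docker push <account-id>.dkr.ecr.<region>.amazonaws.com/pii-tagger:latest

      # On the EC2 instance: pull and run, mapping the app's port.
      docker pull <account-id>.dkr.ecr.<region>.amazonaws.com/pii-tagger:latest
      docker run -d -p 8080:8080 <account-id>.dkr.ecr.<region>.amazonaws.com/pii-tagger:latest
      ```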

License:

Licensed for personal and professional use. You may not resell or redistribute these repositories in their original state.
