datacatalog-object-storage-processor 0.1.2

Creator: coderz1093

datacatalog-object-storage-processor
A package for performing Data Catalog operations on object storage solutions.


Table of Contents

1. Environment setup
    1.1. Get the code
    1.2. Auth credentials
        1.2.1. Create a service account and grant it the roles below
        1.2.2. Download a JSON key and save it as
    1.3. Virtualenv
        1.3.1. Install Python 3.6+
        1.3.2. Create and activate an isolated Python environment
        1.3.3. Install the dependencies
        1.3.4. Set environment variables
    1.4. Docker
2. Create DataCatalog entries based on object storage files
    2.1. python main.py
3. Delete object storage entries in an entry group
Disclaimers



1. Environment setup
1.1. Get the code
git clone https://github.com/mesmacosta/datacatalog-object-storage-processor
cd datacatalog-object-storage-processor

1.2. Auth credentials
1.2.1. Create a service account and grant it the roles below

Data Catalog Admin
Storage Admin, or a custom role with the storage.buckets.list permission

1.2.2. Download a JSON key and save it as

./credentials/datacatalog-object-storage-processor-sa.json
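
Before running the tool, it can help to confirm that the key landed at the expected path and looks like a service account key. The following is a minimal sketch, assuming the usual key file layout (a JSON object with `type`, `client_email`, and `private_key` fields). The helper name `check_service_account_key` is hypothetical and not part of this package.

```python
import json
from pathlib import Path

# Path taken from step 1.2.2 of this README.
DEFAULT_KEY_PATH = Path("./credentials/datacatalog-object-storage-processor-sa.json")

# Fields a service account key file is assumed to carry.
REQUIRED_FIELDS = ("type", "client_email", "private_key")


def check_service_account_key(path=DEFAULT_KEY_PATH):
    """Return a list of problems found with the key file (empty if none)."""
    path = Path(path)
    if not path.is_file():
        return [f"key file not found: {path}"]
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"key file is not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["key file does not contain a JSON object"]
    # Report every required field that is absent.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    # A present but different "type" usually means the wrong kind of credential.
    if data.get("type") not in (None, "service_account"):
        problems.append(f"unexpected key type: {data['type']}")
    return problems
```

Calling `check_service_account_key()` with no arguments checks the default path above; an empty list means nothing obviously wrong was found.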

1.3. Virtualenv
Using virtualenv is optional, but strongly recommended unless you use Docker.
1.3.1. Install Python 3.6+
1.3.2. Create and activate an isolated Python environment
pip install --upgrade virtualenv
python3 -m virtualenv --python python3 env
source ./env/bin/activate

1.3.3. Install the dependencies
pip install --upgrade --editable .

1.3.4. Set environment variables
export GOOGLE_APPLICATION_CREDENTIALS=./credentials/datacatalog-object-storage-processor-sa.json
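
Because the exported path is relative, it only resolves correctly when commands run from the repository root. A small sketch of a pre-flight check follows; the function name `resolve_credentials_path` is hypothetical and only illustrates how the variable could be validated before invoking the tool.

```python
import os
from pathlib import Path


def resolve_credentials_path(environ=os.environ):
    """Return the absolute path named by GOOGLE_APPLICATION_CREDENTIALS.

    Raises RuntimeError if the variable is unset or does not point to a file.
    """
    value = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not value:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    # Resolve against the current working directory, as the client libraries would.
    path = Path(value).expanduser().resolve()
    if not path.is_file():
        raise RuntimeError(
            f"GOOGLE_APPLICATION_CREDENTIALS points to a missing file: {path}"
        )
    return path
```

Exporting the absolute path printed by this check avoids surprises when the command is later run from a different directory.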

1.4. Docker
Docker can be used instead of a local Python environment to run all the scripts. In that case, skip the Virtualenv steps in section 1.3.

2. Create DataCatalog entries based on object storage files
2.1. python main.py

Python:

datacatalog-object-storage-processor \
object-storage create-entries --type cloud-storage \
--project-id my_project \
--entry-group-name my_entry_group_name \
--bucket-prefix my_bucket
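
For scripted use, such as a scheduled job, the same command can be driven from Python. This is a sketch that assumes the `datacatalog-object-storage-processor` executable is on `PATH` (installed in step 1.3.3). The function names are hypothetical wrappers; the flags are the ones shown above.

```python
import subprocess

CLI = "datacatalog-object-storage-processor"


def build_create_entries_command(project_id, entry_group_name, bucket_prefix):
    """Build the argument list for the create-entries command shown above."""
    return [
        CLI,
        "object-storage", "create-entries",
        "--type", "cloud-storage",
        "--project-id", project_id,
        "--entry-group-name", entry_group_name,
        "--bucket-prefix", bucket_prefix,
    ]


def create_entries(project_id, entry_group_name, bucket_prefix):
    """Run the CLI and raise CalledProcessError if it exits with a non-zero status."""
    command = build_create_entries_command(project_id, entry_group_name, bucket_prefix)
    return subprocess.run(command, check=True)
```

Passing the arguments as a list, rather than a single shell string, avoids quoting problems when a project ID or prefix contains unusual characters.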


Docker:

docker build --rm --tag datacatalog-object-storage-processor .
docker run --rm --tty -v your_credentials_folder:/data datacatalog-object-storage-processor \
--type cloud-storage \
--project-id my_project \
--entry-group-name my_entry_group_name \
--bucket-prefix my_bucket
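
Docker generally treats a bare name on the left side of `-v` as a named volume rather than a host folder, so `your_credentials_folder` should normally be an absolute path. The sketch below builds the `docker run` argument list with the folder resolved to an absolute path. The function name is hypothetical, and the image tag is the one used in the `docker build` step above.

```python
from pathlib import Path

IMAGE = "datacatalog-object-storage-processor"


def build_docker_run_command(credentials_folder, project_id, entry_group_name, bucket_prefix):
    """Build the docker run argument list with an absolute host path for the mount."""
    # Resolve relative or ~-prefixed folders so Docker sees a host path, not a volume name.
    host_folder = Path(credentials_folder).expanduser().resolve()
    return [
        "docker", "run", "--rm", "--tty",
        "-v", f"{host_folder}:/data",
        IMAGE,
        "--type", "cloud-storage",
        "--project-id", project_id,
        "--entry-group-name", entry_group_name,
        "--bucket-prefix", bucket_prefix,
    ]
```

The resulting list can be passed to `subprocess.run(command, check=True)` once the image has been built.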

3. Delete object storage entries in an entry group
Delete the entries in a given entry group:
datacatalog-object-storage-processor \
object-storage delete-entries --type cloud-storage \
--project-id my_project \
--entry-group-name my_entry_group_name
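
Since deletion is destructive, a scripted caller may want to ask for confirmation first. The following is a sketch under the same assumption that the CLI is on `PATH`. The wrapper names and the `confirm` and `run` parameters are hypothetical, added so the prompt and the process call can be replaced in tests.

```python
import subprocess

CLI = "datacatalog-object-storage-processor"


def build_delete_entries_command(project_id, entry_group_name):
    """Build the argument list for the delete-entries command shown above."""
    return [
        CLI,
        "object-storage", "delete-entries",
        "--type", "cloud-storage",
        "--project-id", project_id,
        "--entry-group-name", entry_group_name,
    ]


def delete_entries(project_id, entry_group_name, confirm=input, run=subprocess.run):
    """Ask for confirmation, then run the CLI. Return True if the command was run."""
    answer = confirm(
        f"Delete entries in entry group '{entry_group_name}' "
        f"of project '{project_id}'? [y/N] "
    )
    # Anything other than an explicit "y" cancels the deletion.
    if answer.strip().lower() != "y":
        return False
    run(build_delete_entries_command(project_id, entry_group_name), check=True)
    return True
```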

Disclaimers
This is not an officially supported Google product.

History
0.1.0 (2020-05-01)

First release on PyPI.
