Last updated:
0 purchases
pyblux 0.0.5
pyblux: A suite of fast, easy-to-use, and intuitive Python ETL utilities.
What is it?
pyblux is a Python package that provides a suite of ETL utilities built to make the interactions with databases in the cloud as well as on-premise fast, easy and intuitive.
Features
Support multiple databases, including Postgres, MySql, MS SQL, SQLIte, Teradata and Oracle.
The get_engine method makes it easy to connect to databases in a simple and intuitive manner.
Blux.sql method from the Blux class helps run fast queries. It Povides output results as namedtuple or dictionary and it supports parameterised queries and in-flight transformation of data.
Logger class helps setup logging via log file or console.
send_teams_notification method provides an easy way to send alerts to a MS Teams channel via an incoming webhook.
send_email method helps to send email with HTML content
Helpful error messages display the failed query SQL
DBAPI2 specification was used in order to simplify coding for queries on relational database systems using Python.
Where to get it
The source code is currently hosted on GitHub at:
https://github.com/bertin.nono/pyblux
Binary installers for the latest released version are available at the Python
Package Index (PyPI)
# or PyPI
pip install pyblux
pyblux provides support for the databases below:
Teradata
PostgreSQL
MySQL and MariaDB
SQLite
Oracle
Microsoft SQL Server
Dependencies
Depending on the use case, the database package should be installed.
Teradata : Install the Teradata SQL Driver Dialect for SQLAlchemy
pip install teradatasqlalchemy
PostgreSQL: Install Psycopg which is the most popular PostgreSQL database adapter for the Python programming language.
pip install psycopg2-binary
MySQL and MariaDB: Install the Pure Python MySQL Driver
pip install PyMySQL
[SQLite]: No install required
Oracle: Intall cx_Oracle which is a Python extension module that enables access to Oracle Database.
pip install cx-Oracle
Microsoft SQL Server: Install the pyodbc which is an open source Python module that makes accessing ODBC databases simple.
pip install pyodbc
Documentation
Classes
Methods
References
Classes
Blux:
Establishes a connection engine to a database system referenced by the dialect attribute run fast queries.
Note: The dialect is the system SQLAlchemy uses to communicate with various types of DBAPI implementations and databases.
class Blux:
"""
This class connects to a local database session using the db `dialect` library.
"""
def __init__(self, engine=None,dialect=None):
"""
Args:
engine (str): Database connection engine.
dialect (str): database system name(postgres, oracle, teradata,...)
Note: Database Connection package must be installed in order to use this backend.
"""
self.engine = engine
self._dialect = dialect
self.__errlimit = 1
self.__warnings = []
self.__errors = []
self.__logons = []
Example:
table_list = """SELECT table_schema, table_name FROM information_schema.tables"""
postgres_engine = get_engine(dialect='PG', host="localhost", port=5432, database="mydata", user="postgres_user", password="123")
blux= Blux(engine=postgres_engine, dialect='postgres')
dataframe = blux.sql(query=table_list)
table = 'test'
database = 'postgres'
#load dataframe to table
blux.sql(dataframe=dataframe,database=database,table=table, dialect='postgres')
Logger:
provides a custom logging handler called logger. Helps Debug SQL and monitor progress with logging.
class Logger:
"""
This class connects to a local database session using the db `connnection` library.
"""
def __init__(self, logname:str, filename:str, level=logging.INFO, console:bool=True):
"""
Args:
logname (str): Logger Name.
filename (str): log file path
level (str): Logger Level (DEBUG, INFO, WARNING, ERROR)
console (cool): print to console
"""
self._logname = logname
self._filename = filename
self._level = level
self._console = console
Example:
import logging
from pyblux import logger
pyblux_logger = Logger(logname=ETL.NAME, filename=log_file,level=logging.INFO, console=True)
logger=pyblux_logger.logger( verbose=True)
Output from a call for get_engine will look like:
2021-07-07 15:06:22,411 get_engine:
2021-07-07 15:06:22,413 get_engine:
2021-07-07 15:06:22,416 get_engine:
Methods:
get_engine:
Creates a database connection engine.
get_engine(user:str,password:str,host:str,port:int,database:str,dialect:str,verbose:bool=False,parameter:str=None,raw_engine:bool=True,logger:Callable=print)
Database connection details are defined by get_engine objects (see below).
Example:
import pandas
from pyblux.utils import get_engine
from pyblux.blux import Blux
oracle_engine = get_engine(dialect='oracle', host="localhost", port=1521,database="mydata", user="oracle_user", password="123")
teradata_engine = get_engine(dialect='mssql', host="localhost", port=5432, database="mydata", user="postgres_user", password="123")
postgres_engine = get_engine(dialect='mysql', host="localhost", port=5432, database="mydata", user="postgres_user", password="123")
postgres_engine = get_engine(dialect='teradata', host="localhost", port=5432, database="mydata", user="postgres_user", password="123")
postgres_engine = get_engine(dialect='postgres', host="localhost", port=5432, database="mydata", user="postgres_user", password="123")
Passwords
It is best practice for Database passwords to be stored in environment variables.
This can be done on the command line via:
export password=secret-password on Linux
set password=secret-password on Windows
Or in a Python terminal via:
import os
os.environ['password'] = 'secret-password'
No password is required for SQLite databases.
When URL that includes the password contains Password conatains special characters
Example:
Connection_String = postgresql+psycopg2://user:p@ssword%to%encode@hosturl/defaultdb
The above password encoded using urllib.parse:
pwd=urllib.parse.quote_plus("p@ssword%to%encode")
print(pwd)
Connection_String = postgresql+psycopg2://user:urllib.parse.quote_plus("p@ssword%to%encode")@hosturl/defaultdb
get_connection:
Gets a regular database connection.
get_connection(user:str,password:str,host:str,port:int,database:str,dialect:str,verbose:bool=False,parameter:str=None,logger:Callable=print):
"""
Get a regular connection for Teradata , Oracle, Aurora/Postgres, Aurora/MySql/MariaDB, SQLite, and Microsoft SQL Server
Returns
-------
Connection Object
"""
Example:
from pyblux.utils import get_connection
oracle_conn = get_connection(dialect='oracle', host="localhost", port=1521,database="mydata", user="oracle_user", password="123")
teradata_conn = get_connection(dialect='mssql', host="localhost", port=5432, database="mydata", user="postgres_user", password="123")
postgres_conn = get_connection(dialect='mysql', host="localhost", port=5432, database="mydata", user="postgres_user", password="123")
postgres_conn = get_connection(dialect='teradata', host="localhost", port=5432, database="mydata", user="postgres_user", password="123")
postgres_conn = get_connection(dialect='postgres', host="localhost", port=5432, database="mydata", user="postgres_user", password="123")
create_table_from_dataframe:
Creates table from a dataframe attributes and fastload data into in it.
create_table_from_dataframe(dataframe:pd.DataFrame=None,table:str=None,Blux:Blux=None,verbose:bool=False,logger:Callable=print)
is_exist:
Checks is a table or view exist.
is_exist(table:str='', Blux:Blux=None,verbose:bool=False,logger:Callable=print)
drop_table:
Checks is a table or view exist and then drops it if it exists.
drop_table(table:str=None,Blux:Blux=None,verbose:bool=False,logger:Callable=print)
send_teams_notifications:
Send a Card to a MS Teams Channel
send_teams_notification ( hookurl: str, title: str='' , text: str='', message: str ='', status: str ='', error_message: str='', activitySubtitle: str='', activityText: str='')
send_email:
Send an HTML formated email that can include a dataframe
send_email(server:str, port:int,sender: str, receivers: list, subject: str, body_text: str, attachment: any = None,df: pd.DataFrame = None)
Maintainers:
Bertin Nono
Development status
Stable
Licence
MIT
References
psycopg2
cx_Oracle
sqlalchemy
Change Log
(09/12/2021)
Initial Release
For personal and professional use. You cannot resell or redistribute these repositories in their original state.
There are no reviews.