pandaspg 0.0.3.3

Creator: railscoder56

Last updated:

Add to Cart

Description:

pandaspg 0.0.3.3

Description
This library contains several functions that allow you to migrate data from a CSV file or Pandas Dataframe into a PostgreSQL database using the libraries Psycopg2 and Pandas. Specifically, it includes functions for:

Loading data from a CSV file to a Pandas dataframe
Mapping Pandas Dataframe columns to their datatypes
Mapping Pandas Dataframe columns to suitable PostgreSQL datatypes
Connect to a PostgreSQL database
Creating new tables on a PostgreSQL database
Inserting data from a Pandas DataFrame into a table in a PostgreSQL database


Installation
Install The Required Dependencies
pip install pandas psycopg2

Install The pandaspg Package
pip install pandaspg

💡 Make sure you have a PostgreSQL database up and running!

Usage
I will walk you through a step-by-step example of how to migrate data from a CSV file into a PostgreSQL database. For demonstration purposes, download the exoplanets_07-04-28.csv file from my [exoplanets](https://github.com/eadwulf/exoplanets) repo.

Download The CSV file
wget https://raw.githubusercontent.com/Eadwulf/exoplanets/main/exoplanets_07-04-2023.csv

💡 This file contains a total of 34,112 rows and 92 columns with a total size of 28.6 MB

Import The Library
import pandaspg

Create a Pandas dataframe with the CSV file data
dataframe = pandaspg.csv_to_dataframe('exoplanets_04-07-2023.csv')

Generate a dictionary mapping the dataframe columns to their datatype
column_datatypes_dict = pandaspg.get_dataframe_column_dtypes_dict(dataframe)

Generate a dictionary mapping the dataframe columns to a suitable PostgreSQL datatype
pg_column_datatypes_dict = pandaspg.map_pandas_to_postgresql_datatypes(
column_datatypes_dict)

Connect to an existing and running PostgreSQL database
connection = pandaspg.connect_to_postgresql(database='analysis',
user='postgres',
password='postgres',
host='localhost',
port=5432)

💡 Replace the above parameters as you see fit.
Create a PostgreSQL table
pandaspg.create_postgresql_table(
connection, 'exoplanets_csv', pg_column_datatypes_dict)

💡 The parameter `exoplanets_csv` is the name of the database to be created.
Insert the from the dataframe to the recently created table
pandaspg.insert_dataframe_into_postgresql(
connection, 'exoplanets_csv', dataframe)

Close the connection with the database
connection.close()


The Full Example
import pandaspg

dataframe = pandaspg.csv_to_dataframe('exoplanets_04-07-2023.csv')

column_datatypes_dict = pandaspg.get_dataframe_column_dtypes_dict(dataframe)

pg_column_datatypes_dict = pandaspg.map_pandas_to_postgresql_datatypes(
column_datatypes_dict)

connection = pandaspg.connect_to_postgresql(database='analysis',
user='postgres',
password='postgres',
host='localhost',
port=5432)

pandaspg.create_postgresql_table(
connection, 'exoplanets_csv', pg_column_datatypes_dict)

pandaspg.insert_dataframe_into_postgresql(
connection, 'exoplanets_csv', dataframe)

connection.close()


Inspect The Results
Enter the PostgreSQL prompt
psql -U postgres -d analysis

List the tables in the analysis database
\dt

Retrieve the data from the exoplanets_csv table
SELECT * FROM exoplanets_csv;

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.

Customer Reviews

There are no reviews.