datachecks 0.3.1

Creator: bradpython12

Description:

Open Source Data Quality Monitoring.









⭐️ If you like it, star the repo ⭐ | Documentation | Slack Community


Why Data Monitoring?
APM (Application Performance Monitoring) tools monitor the performance of applications and are a mandatory part of the dev stack. Without APM tools, it is very difficult to monitor application performance.



But for data products, regular APM tools are not enough. We need a new kind of tool that can monitor the performance of data applications.
Data monitoring tools monitor the data quality of databases and data pipelines. They identify potential issues in those databases and pipelines, help find the root cause of data quality issues, and help improve data quality.
What is datachecks?
Datachecks is an open-source data monitoring tool that helps to monitor the data quality of databases and data pipelines.
It identifies potential issues in databases and data pipelines, helps to find the root cause of data quality issues, and helps to improve data quality.
Datachecks can generate several reliability, uniqueness, and completeness metrics from several data sources.
Reports: Data Quality Visualisation
You can generate a beautiful data quality report with all the metrics using just one command.
This HTML report can be shared with the team.



CLI: Data Quality Visualisation in Bash
A data quality report can also be generated in the terminal, which is very useful for debugging. All it takes is one command.



Getting Started
Install datachecks with the command that is specific to the database.
Install Datachecks
To install datachecks with all its dependencies, use the command below.
pip install datachecks -U

Create the config file
With a simple config file, you can generate data quality reports for your data sources.
For a sample config and more details, please visit the config guide.



Run from CLI
Generate Report in Terminal
datachecks inspect -C config.yaml

Generate HTML Report
datachecks inspect -C config.yaml --html-report
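
The two commands above can also be driven from a Python script, for example in a scheduled job. The following is a minimal sketch that only assembles and runs the exact CLI invocations shown in this README; the helper names `build_inspect_command` and `run_inspect` are hypothetical and are not part of datachecks.

```python
import shlex
import subprocess


def build_inspect_command(config_path, html_report=False):
    """Build the argument list for `datachecks inspect`.

    Only the flags shown in this README are used: -C and --html-report.
    """
    command = ["datachecks", "inspect", "-C", str(config_path)]
    if html_report:
        command.append("--html-report")
    return command


def run_inspect(config_path, html_report=False):
    """Run the CLI and return its exit code.

    Assumes datachecks is installed and available on PATH.
    """
    completed = subprocess.run(
        build_inspect_command(config_path, html_report), check=False
    )
    return completed.returncode


if __name__ == "__main__":
    # Print the command line instead of running it, so the sketch works
    # even where datachecks is not installed.
    print(shlex.join(build_inspect_command("config.yaml", html_report=True)))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the config path contains spaces.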

For more details, please visit the Quick Start Guide.
Supported Data Sources
Datachecks supports SQL and search data sources. The supported data sources are listed below.



| Data Source | Type | Supported |
| --- | --- | --- |
| Postgres | Transactional Database | :thumbsup: |
| MySQL | Transactional Database | :thumbsup: |
| MS SQL Server | Transactional Database | :thumbsup: |
| OpenSearch | Search Engine | :thumbsup: |
| Elasticsearch | Search Engine | :thumbsup: |
| GCP BigQuery | Data Warehouse | :thumbsup: |
| Databricks | Data Warehouse | :thumbsup: |
| Snowflake | Data Warehouse | :thumbsup: |
| AWS Redshift | Data Warehouse | :x: |



Metric Types



| Metric | Description |
| --- | --- |
| Reliability Metrics | Reliability metrics detect whether tables/indices/collections are updating with timely data |
| Numeric Distribution Metrics | Numeric distribution metrics detect changes in numeric distributions, e.g. of values, variance, skew and more |
| Uniqueness Metrics | Uniqueness metrics detect when data constraints are breached, e.g. duplicates or an unexpected number of distinct values |
| Completeness Metrics | Completeness metrics detect when there are missing values in datasets, e.g. null or empty values |
| Validity Metrics | Validity metrics detect whether data is formatted correctly and represents a valid value |
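
To make the metric categories above concrete, here is a small plain-Python sketch of what each category measures on an in-memory column. It is an illustration only: it does not use the datachecks API, and the function names and the email pattern are assumptions chosen for the example, not how datachecks computes its metrics.

```python
import re
from datetime import datetime, timezone
from statistics import mean, pvariance


def is_missing(value):
    """Treat None and empty strings as missing values."""
    return value is None or value == ""


def null_count(values):
    """Completeness: how many values are missing."""
    return sum(1 for v in values if is_missing(v))


def duplicate_count(values):
    """Uniqueness: how many present values repeat an earlier value."""
    present = [v for v in values if not is_missing(v)]
    return len(present) - len(set(present))


def distinct_count(values):
    """Uniqueness: how many different present values exist."""
    return len({v for v in values if not is_missing(v)})


def invalid_count(values, pattern):
    """Validity: how many present values do not fully match the pattern."""
    regex = re.compile(pattern)
    return sum(
        1 for v in values if not is_missing(v) and not regex.fullmatch(v)
    )


def numeric_summary(values):
    """Numeric distribution: average and population variance."""
    numbers = [v for v in values if v is not None]
    return {"avg": mean(numbers), "variance": pvariance(numbers)}


def freshness(last_updated, now):
    """Reliability: time elapsed since the most recent update."""
    return now - last_updated


if __name__ == "__main__":
    emails = ["a@example.com", None, "a@example.com", "", "not-an-email"]
    prices = [10.0, 12.0, None, 14.0]
    # Hypothetical, deliberately simple email pattern for the example.
    email_pattern = r"[^@\s]+@[^@\s]+\.[^@\s]+"

    print("null_count:", null_count(emails))
    print("duplicate_count:", duplicate_count(emails))
    print("distinct_count:", distinct_count(emails))
    print("invalid_count:", invalid_count(emails, email_pattern))
    print("numeric_summary:", numeric_summary(prices))
    print(
        "freshness:",
        freshness(
            datetime(2024, 1, 1, 12, tzinfo=timezone.utc),
            datetime(2024, 1, 2, tzinfo=timezone.utc),
        ),
    )
```

A monitoring tool would typically compute values like these on a schedule and compare them against thresholds or earlier runs; the sketch only shows the per-run calculation.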



Overview



What does Datachecks not do?



Community & Support
For additional information and help, you can use one of these channels:

Slack (Live chat with the team, support, discussions, etc.)
GitHub issues (Bug reports, feature requests)

Contributions
:raised_hands: We greatly appreciate contributions - be it a bug fix, new feature, or documentation!
Check out the contributions guide and open issues.
Datachecks contributors: :blue_heart:






Telemetry
Usage Analytics & Data Privacy
License
This project is licensed under the terms of the Apache 2.0 License.

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
