pyQualitas 1.0.9
PyQualitas
This project aims to develop a Python library that helps ensure data quality. It is inspired by deequ and
dataflare, which also focus on data quality.
Requirements:
PySpark - Version 3.3.0
Pandas - Version 1.5.0
Jinja2 - Version 3.1.2
Slack-SDK - Version 3.19.3
PyMSTeams - Version 0.2.2
Installation:
The package can be installed as follows:
"pip install pyQualitas"
The test version of this package can be installed as follows:
"pip install -i https://test.pypi.org/simple/ pyQualitas"
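To confirm that the installation succeeded, the installed versions can be read with the Python standard library. The following is a minimal sketch: the helper name `installed_version` is hypothetical, and the distribution names used for the dependencies (for example `slack-sdk` and `pymsteams`) are assumptions based on the requirements listed above, not taken from the package metadata. It only reports what is installed next to the listed version and does not decide whether a different version is acceptable.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Versions as listed under Requirements; distribution names are assumed.
LISTED = {
    "pyQualitas": None,  # no specific version listed for the package itself
    "pyspark": "3.3.0",
    "pandas": "1.5.0",
    "Jinja2": "3.1.2",
    "slack-sdk": "3.19.3",
    "pymsteams": "0.2.2",
}


def report(listed=LISTED):
    """Build one human-readable line per distribution."""
    lines = []
    for name, listed_version in listed.items():
        found = installed_version(name) or "not installed"
        suffix = f" (listed: {listed_version})" if listed_version else ""
        lines.append(f"{name}: {found}{suffix}")
    return lines


for line in report():
    print(line)
```

A line reading "not installed" for pyQualitas suggests that the pip command ran against a different Python environment than the one executing this script.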
Use Cases:
The main goal of this library is to help QA engineers ensure the quality of data. Given the volume of data and the frequency of releases in the industry, the Quality Assurance team carries an enormous responsibility to verify and sign off on the quality of the data generated by the application.
This is very hard to achieve with manual testing alone. Scheduling automated validation helps meet release timelines and maintain high data quality with less effort.
The library provides various tests that come in handy during regression testing. Because the project is implemented in Python, the learning curve is shorter than for comparable libraries written in Scala.
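To illustrate the kind of check that automated validation performs, here is a minimal sketch in plain Python. It does not use or reflect the pyQualitas API; the function names `check_completeness` and `check_uniqueness` and the sample rows are hypothetical and serve only to show the idea.

```python
from collections import Counter


def check_completeness(rows, column):
    """Pass when no row has a missing (None) value in `column`."""
    missing = sum(1 for row in rows if row.get(column) is None)
    return {
        "check": f"completeness({column})",
        "passed": missing == 0,
        "failures": missing,  # number of rows with a missing value
    }


def check_uniqueness(rows, column):
    """Pass when every non-missing value in `column` occurs exactly once."""
    counts = Counter(
        row[column] for row in rows if row.get(column) is not None
    )
    duplicated = sum(1 for n in counts.values() if n > 1)
    return {
        "check": f"uniqueness({column})",
        "passed": duplicated == 0,
        "failures": duplicated,  # number of distinct values that repeat
    }


# Hypothetical sample data: one missing email, one repeated order_id.
rows = [
    {"order_id": 1, "email": "a@example.com"},
    {"order_id": 2, "email": None},
    {"order_id": 2, "email": "c@example.com"},
]

results = [
    check_completeness(rows, "email"),
    check_uniqueness(rows, "order_id"),
]

for result in results:
    status = "PASS" if result["passed"] else "FAIL"
    print(f'{status} {result["check"]} failures={result["failures"]}')
```

Running a set of checks like these on a schedule after each release, and reporting any failing result, is the regression-testing workflow described above.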
The documentation can be found at the following link:
https://github.com/IamVenkatesh/pyQualitas/wiki
For personal and professional use. You cannot resell or redistribute these repositories in their original state.