outlier-remover-101703283 0.0.0

Creator: railscoder56

Description:

outlier_remover_101703283
For : Project-2 (UCS633)
Submitted by : Katinder Kaur
Roll no : 101703283
Group : 3COE13
outlier_remover_101703283 is a Python library for dealing with anomalies, or outliers, in a dataset. Outliers are very common, especially in raw data. Removing them is an important preprocessing step, because their presence can significantly degrade a model's performance and prediction accuracy.
There are several methods to detect and remove outliers. This script uses the Interquartile Range (IQR) to detect anomalous data.
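To illustrate the idea, here is a minimal sketch of IQR-based row removal using only the Python standard library. It is not the package's own code: the function names, the 1.5 multiplier (the conventional Tukey fence) and the "inclusive" quartile method are all assumptions made for this example, and the package may compute its bounds differently.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences for a list of numbers.

    k=1.5 is the conventional multiplier; it is an assumption here,
    not something the package documents.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outlier_rows(rows, skip_columns=()):
    """Drop every row with a value outside the fences of any analysed column."""
    if not rows:
        return []
    keep = [True] * len(rows)
    for col in range(len(rows[0])):
        if col in skip_columns:
            continue  # e.g. an index or a categorical column
        lower, upper = iqr_bounds([row[col] for row in rows])
        for i, row in enumerate(rows):
            if not lower <= row[col] <= upper:
                keep[i] = False
    return [row for row, ok in zip(rows, keep) if ok]


# Column 0 is an index, so it is skipped; column 1 holds the measurements.
rows = [[1, 10.0], [2, 12.0], [3, 11.0], [4, 13.0], [5, 12.0], [6, 95.0]]
cleaned = remove_outlier_rows(rows, skip_columns={0})
print(len(rows) - len(cleaned), "row(s) removed")
```

In this sample the fences for column 1 are 9.0 and 15.0, so only the row containing 95.0 is dropped.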
Installation
Use the package manager pip to install outlier_remover_101703283.
pip install outlier_remover_101703283

Usage
For command prompt:
usage: outlier_remover [-h] [-o OUTPUTDATAFILE] [-c COLUMNSTOSKIP]
                       InputDataFile

positional arguments:
  InputDataFile         Enter the name of input CSV file with .csv extention

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUTDATAFILE, --OutputDataFile OUTPUTDATAFILE
                        Enter the name of output CSV file with .csv extention
  -c COLUMNSTOSKIP, --ColumnsToSkip COLUMNSTOSKIP
                        Enter the columns to be left out of analysis

Enter the input CSV file name, including the .csv extension:
outlier_remover sample_inputfile.csv

After the records with anomalous values are removed, the resulting data is saved by default to sample_input_sansOutliers.csv (i.e. a file name ending in _sansOutliers.csv).
Custom output file name:
The output file name can be given explicitly with the -o flag:
outlier_remover sample_inputfile.csv -o my_outputfile.csv

In this case the output data is stored in a CSV file named my_outputfile.csv.
Skipping out columns:
In some cases you may want to leave certain features out of the analysis (for example categorical data or indices). Use the -c flag to do so:
outlier_remover sample_inputfile.csv -c 0,2,8

or
outlier_remover sample_inputfile.csv -c "0,2,8"

Note : Column numbers start from 0.
View help
To view usage help, use
outlier_remover -h
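For readers curious how such a command line could be wired up, the following is a hypothetical argparse sketch that accepts the same arguments as the usage text above and turns a -c value such as 0,2,8 into a list of zero-based column numbers. The helper names build_parser and parse_columns are invented for this example, and the package's actual implementation may differ.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented usage (illustrative only)."""
    parser = argparse.ArgumentParser(prog="outlier_remover")
    parser.add_argument("InputDataFile",
                        help="name of the input CSV file")
    parser.add_argument("-o", "--OutputDataFile",
                        help="name of the output CSV file")
    parser.add_argument("-c", "--ColumnsToSkip", default="",
                        help="comma-separated columns to leave out of analysis")
    return parser


def parse_columns(spec):
    """Convert a string such as '0,2,8' into [0, 2, 8]; empty input gives []."""
    return [int(part) for part in spec.split(",") if part.strip()]


# Pass the arguments as a list, as the shell would after splitting the command.
args = build_parser().parse_args(["sample_inputfile.csv", "-c", "0,2,8"])
print(args.InputDataFile, parse_columns(args.ColumnsToSkip))
```

Because the shell strips the quotes before the program sees them, -c 0,2,8 and -c "0,2,8" reach a parser like this one as the same string.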

For Python IDLE:
>>> from outlier_remover.outlier_remover import outlier_remover
>>> list_of_columns_to_skip=[]
>>> outlier_remover('inputfile.csv','outputfile.csv',list_of_columns_to_skip)
Removed 2 row(s) successfully.
Save successful!
Check outputfile.csv for results


>>> from outlier_remover.outlier_remover import outlier_removerfn
>>> outlier_removerfn('sample2.csv')
Removed 1 row(s) successfully.
Save successful!
Check sans_outliers.csv for results

License
MIT

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
