cognitivefactory-features-maximization-metric 1.0.0

Creator: bradpython12


Features Maximization Metric




Implementation of the Features Maximization Metric, an unbiased metric for estimating the quality of an unsupervised classification.
Quick description
Features Maximization (FMC) is a features selection method described in Lamirel, J.-C., Cuxac, P., & Hajlaoui, K. (2016). A Novel Approach to Feature Selection Based on Quality Estimation Metrics. In Advances in Knowledge Discovery and Management (pp. 121–140). Springer International Publishing. https://doi.org/10.1007/978-3-319-45763-5_7.
This metric is computed by applying the following steps:


Compute the Features F-Measure metric (based on Features Recall and Features Predominance metrics).

(a) The Features Recall FR[f][c] for a given class c and a given feature f is the ratio between
the sum of the vector weights of feature f for data in class c
and the sum of the vector weights of feature f for all data.
It answers the question: "Can the feature f distinguish the class c from the other classes c'?"


(b) The Features Predominance FP[f][c] for a given class c and a given feature f is the ratio between
the sum of the vector weights of feature f for data in class c
and the sum of the vector weights of all features f' for data in class c.
It answers the question: "Does the feature f identify the class c better than the other features f'?"


(c) The Features F-Measure FM[f][c] for a given class c and a given feature f is
the harmonic mean of the Features Recall (a) and the Features Predominance (b).
It answers the question: "How much information does the feature f contain about the class c?"
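
As an illustration of definitions (a) to (c), here is a minimal pure-Python sketch. It is not the package's implementation: the function name `features_f_measure`, the dense list-of-lists input, and the convention of returning 0.0 when a denominator is zero are all assumptions made for this example.

```python
def features_f_measure(weights, labels):
    """Return (FR, FP, FM), each indexed as metric[f][c].

    weights: one row per data point, one non-negative weight per feature.
    labels: the class assigned to each data point.
    (Hypothetical helper, written only to illustrate the definitions.)
    """
    classes = sorted(set(labels))
    n_features = len(weights[0])

    # class_sum[c][f]: sum of the weights of feature f over data in class c.
    class_sum = {c: [0.0] * n_features for c in classes}
    for row, c in zip(weights, labels):
        for f, w in enumerate(row):
            class_sum[c][f] += w

    fr, fp, fm = {}, {}, {}
    for f in range(n_features):
        # Sum of the weights of feature f over all data.
        feature_total = sum(class_sum[c][f] for c in classes)
        fr[f], fp[f], fm[f] = {}, {}, {}
        for c in classes:
            # Sum of the weights of all features over data in class c.
            class_total = sum(class_sum[c])
            # Assumption: an undefined ratio (zero denominator) counts as 0.0.
            recall = class_sum[c][f] / feature_total if feature_total else 0.0
            predominance = class_sum[c][f] / class_total if class_total else 0.0
            denom = recall + predominance
            fr[f][c] = recall
            fp[f][c] = predominance
            # Harmonic mean of recall and predominance.
            fm[f][c] = 2 * recall * predominance / denom if denom else 0.0
    return fr, fp, fm


# Toy data: 4 data points, 3 features, 2 classes.
weights = [[1, 0, 1], [1, 0, 0], [0, 2, 1], [0, 2, 0]]
labels = ["a", "a", "b", "b"]
fr, fp, fm = features_f_measure(weights, labels)
print(fm[0]["a"])  # approximately 0.8
```

In this toy data, feature 0 only appears in class "a" (recall 1.0) and carries two thirds of that class's total weight (predominance 2/3), so its F-Measure for "a" is about 0.8 and its F-Measure for "b" is 0.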



Compute the Features Selection (based on F-Measure Overall Average comparison).

(d) The F-Measure Overall Average is the average of the Features F-Measure (c) over all classes c and all features f.
It answers the question: "What is the mean amount of information contained by the features across all classes?"


(e) A feature f is Selected if and only if there exists at least one class c for which the Features F-Measure (c) FM[f][c] is greater than the F-Measure Overall Average (d).
It answers the question: "Which features contain more information than the mean amount of information in the dataset?"


(f) A Feature f is Deleted if and only if the Features F-Measure (c) FM[f][c] is always lower than the F-Measure Overall Average (d) for each class c.
It answers the question: "What are the features which do not contain more information than the mean of information in the dataset ?"
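
The selection rules (d) to (f) can be sketched as follows. This is an illustrative example rather than the package's code: `select_features` is a hypothetical helper, the F-Measure values are made up, and features that are not selected are simply treated as deleted.

```python
def select_features(fm):
    """Split features into selected and deleted ones.

    fm: Features F-Measure, indexed as fm[f][c].
    (Hypothetical helper, written only to illustrate rules (d) to (f).)
    """
    # (d) Overall average over every (feature, class) pair.
    values = [v for per_class in fm.values() for v in per_class.values()]
    overall_average = sum(values) / len(values)

    # (e) Selected: at least one class where FM[f][c] exceeds the average.
    selected = [
        f
        for f, per_class in fm.items()
        if any(v > overall_average for v in per_class.values())
    ]
    # (f) Deleted: here, every feature that was not selected.
    deleted = [f for f in fm if f not in selected]
    return overall_average, selected, deleted


# Made-up F-Measure values for 3 features and 2 classes.
fm = {
    "f0": {"a": 0.8, "b": 0.0},
    "f1": {"a": 0.0, "b": 0.9},
    "f2": {"a": 0.2, "b": 0.1},
}
overall_average, selected, deleted = select_features(fm)
print(selected, deleted)
```

With these values the overall average is about 0.33, so `f0` and `f1` are selected (each exceeds the average in one class) and `f2` is deleted.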



Compute the Features Contrast and Features Activation (based on F-Measure Marginal Averages comparison).

(g) The F-Measure Marginal Average for a given feature f is the average of the Features F-Measure (c) for that feature over all classes c.
It answers the question: "What is the mean amount of information contained by the feature f across all classes?"


(h) The Features Contrast FC[f][c] for a given class c and a given selected feature f is the ratio between
the Features F-Measure (c) FM[f][c]
and the F-Measure Marginal Average (g) of the selected feature f,
with this ratio raised to the power of an Amplification Factor.
It answers the question: "How relevant is the feature f for distinguishing the class c?"


(i) A selected feature f is Active for a given class c if and only if the Features Contrast (h) FC[f][c] is greater than 1.0.
It answers the question: "For which classes is a selected feature f relevant?"
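
Steps (g) to (i) can be sketched in the same spirit. Again, this is an assumption-laden illustration and not the package's API: `features_contrast` is a hypothetical helper, the F-Measure values are made up, the default amplification factor of 1 is chosen for the example, and the power is applied to the whole ratio.

```python
def features_contrast(fm, selected, amplification_factor=1):
    """Return (contrast, active) for the selected features.

    fm: Features F-Measure, indexed as fm[f][c].
    selected: features kept by the selection step.
    (Hypothetical helper, written only to illustrate steps (g) to (i).)
    """
    contrast, active = {}, {}
    for f in selected:
        # (g) Marginal average of feature f over all classes.
        marginal = sum(fm[f].values()) / len(fm[f])
        # (h) Ratio FM[f][c] / marginal, raised to the amplification factor.
        # Assumption: a zero marginal average yields a contrast of 0.0.
        contrast[f] = {
            c: (v / marginal) ** amplification_factor if marginal else 0.0
            for c, v in fm[f].items()
        }
        # (i) Feature f is active for class c when its contrast exceeds 1.0.
        active[f] = [c for c, fc in contrast[f].items() if fc > 1.0]
    return contrast, active


# Made-up F-Measure values; "f0" and "f1" are assumed to be selected.
fm = {
    "f0": {"a": 0.8, "b": 0.0},
    "f1": {"a": 0.0, "b": 0.9},
    "f2": {"a": 0.2, "b": 0.1},
}
contrast, active = features_contrast(fm, ["f0", "f1"])
amplified, _ = features_contrast(fm, ["f0", "f1"], amplification_factor=2)
print(active)
```

Here `f0` has a marginal average of 0.4, so its contrast is about 2.0 for class "a" and 0.0 for class "b": it is active for "a" only. Raising the amplification factor to 2 squares the ratio, which widens the gap between active and inactive classes without changing which side of 1.0 a contrast falls on.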



This metric is an efficient method to:

identify the relevant features of a dataset model;
describe the association between vector features and data classes;
increase the contrast between data classes.

Documentation

Main documentation

Installation
Features Maximization Metric requires Python 3.8 or above.
To install with pip:
# install package
python3 -m pip install cognitivefactory-features-maximization-metric

To install with pipx:
# install pipx
python3 -m pip install --user pipx

# install package
pipx install --python python3 cognitivefactory-features-maximization-metric

Development
To work on this project or contribute to it, please read:

the Copier PDM template documentation;
the Contributing page for environment setup and development help;
the Code of Conduct page for contribution rules.

References

Features Maximization Metric: Lamirel, J.-C., Cuxac, P., & Hajlaoui, K. (2016). A Novel Approach to Feature Selection Based on Quality Estimation Metrics. In Advances in Knowledge Discovery and Management (pp. 121–140). Springer International Publishing. https://doi.org/10.1007/978-3-319-45763-5_7
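V-Measure: Rosenberg, A., & Hirschberg, J. (2007). V-Measure: A Conditional Entropy-Based External Cluster Evaluation Measure. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL) (pp. 410–420).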

How to cite
Schild, E. (2023). cognitivefactory/features-maximization-metric. Zenodo. https://doi.org/10.5281/zenodo.7646382.

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
