
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/evaluation/plot_metrics.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_evaluation_plot_metrics.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_evaluation_plot_metrics.py:


=======================================
Metrics specific to imbalanced learning
=======================================

Specific metrics have been developed to evaluate classifier which
has been trained using imbalanced data. :mod:`imblearn` provides mainly
two additional metrics which are not implemented in :mod:`sklearn`: (i)
geometric mean and (ii) index balanced accuracy.

.. GENERATED FROM PYTHON SOURCE LINES 11-15

.. code-block:: default


    # Authors: Guillaume Lemaitre <g.lemaitre58@gmail.com>
    # License: MIT








.. GENERATED FROM PYTHON SOURCE LINES 16-20

.. code-block:: default

    print(__doc__)

    RANDOM_STATE = 42








.. GENERATED FROM PYTHON SOURCE LINES 21-22

First, we will generate some imbalanced dataset.

.. GENERATED FROM PYTHON SOURCE LINES 24-39

.. code-block:: default

    from sklearn.datasets import make_classification

    X, y = make_classification(
        n_classes=3,
        class_sep=2,
        weights=[0.1, 0.9],
        n_informative=10,
        n_redundant=1,
        flip_y=0,
        n_features=20,
        n_clusters_per_class=4,
        n_samples=5000,
        random_state=RANDOM_STATE,
    )








.. GENERATED FROM PYTHON SOURCE LINES 40-41

We will split the data into a training and testing set.

.. GENERATED FROM PYTHON SOURCE LINES 43-49

.. code-block:: default

    from sklearn.model_selection import train_test_split

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, stratify=y, random_state=RANDOM_STATE
    )








.. GENERATED FROM PYTHON SOURCE LINES 50-53

We will create a pipeline made of a :class:`~imblearn.over_sampling.SMOTE`
over-sampler followed by a :class:`~sklearn.linear_model.LogisticRegression`
classifier.

.. GENERATED FROM PYTHON SOURCE LINES 53-59

.. code-block:: default


    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import StandardScaler

    from imblearn.over_sampling import SMOTE








.. GENERATED FROM PYTHON SOURCE LINES 60-68

.. code-block:: default

    from imblearn.pipeline import make_pipeline

    model = make_pipeline(
        StandardScaler(),
        SMOTE(random_state=RANDOM_STATE),
        LogisticRegression(max_iter=10_000, random_state=RANDOM_STATE),
    )








.. GENERATED FROM PYTHON SOURCE LINES 69-73

Now, we will train the model on the training set and get the prediction
associated with the testing set. Be aware that the resampling will happen
only when calling `fit`: the number of samples in `y_pred` is the same than
in `y_test`.

.. GENERATED FROM PYTHON SOURCE LINES 75-78

.. code-block:: default

    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)








.. GENERATED FROM PYTHON SOURCE LINES 79-82

The geometric mean corresponds to the square root of the product of the
sensitivity and specificity. Combining the two metrics should account for
the balancing of the dataset.

.. GENERATED FROM PYTHON SOURCE LINES 84-88

.. code-block:: default

    from imblearn.metrics import geometric_mean_score

    print(f"The geometric mean is {geometric_mean_score(y_test, y_pred):.3f}")





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    The geometric mean is 0.940




.. GENERATED FROM PYTHON SOURCE LINES 89-91

The index balanced accuracy can transform any metric to be used in
imbalanced learning problems.

.. GENERATED FROM PYTHON SOURCE LINES 93-103

.. code-block:: default

    from imblearn.metrics import make_index_balanced_accuracy

    alpha = 0.1
    geo_mean = make_index_balanced_accuracy(alpha=alpha, squared=True)(geometric_mean_score)

    print(
        f"The IBA using alpha={alpha} and the geometric mean: "
        f"{geo_mean(y_test, y_pred):.3f}"
    )





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    The IBA using alpha=0.1 and the geometric mean: 0.884




.. GENERATED FROM PYTHON SOURCE LINES 104-111

.. code-block:: default

    alpha = 0.5
    geo_mean = make_index_balanced_accuracy(alpha=alpha, squared=True)(geometric_mean_score)

    print(
        f"The IBA using alpha={alpha} and the geometric mean: "
        f"{geo_mean(y_test, y_pred):.3f}"
    )




.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    The IBA using alpha=0.5 and the geometric mean: 0.884





.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.017 seconds)


.. _sphx_glr_download_auto_examples_evaluation_plot_metrics.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example




    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_metrics.py <plot_metrics.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_metrics.ipynb <plot_metrics.ipynb>`


.. include:: plot_metrics.recommendations


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
