
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/datasets/plot_make_imbalance.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_datasets_plot_make_imbalance.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_datasets_plot_make_imbalance.py:


============================
Create an imbalanced dataset
============================

An illustration of the :func:`~imblearn.datasets.make_imbalance` function, which
creates an imbalanced dataset from a balanced one. The example also shows that
:func:`~imblearn.datasets.make_imbalance` accepts a pandas DataFrame as input.

.. GENERATED FROM PYTHON SOURCE LINES 10-16

.. code-block:: default


    # Authors: Dayvid Oliveira
    #          Christos Aridas
    #          Guillaume Lemaitre <g.lemaitre58@gmail.com>
    # License: MIT








.. GENERATED FROM PYTHON SOURCE LINES 17-23

.. code-block:: default

    print(__doc__)

    import seaborn as sns

    sns.set_context("poster")








.. GENERATED FROM PYTHON SOURCE LINES 24-30

Generate the dataset
--------------------

First, we generate a dataset and convert it to a
:class:`~pandas.DataFrame` with arbitrary column names. We then plot the
original, balanced dataset.

.. GENERATED FROM PYTHON SOURCE LINES 32-48

.. code-block:: default

    import matplotlib.pyplot as plt
    import pandas as pd
    from sklearn.datasets import make_moons

    X, y = make_moons(n_samples=200, shuffle=True, noise=0.5, random_state=10)
    X = pd.DataFrame(X, columns=["feature 1", "feature 2"])
    ax = X.plot.scatter(
        x="feature 1",
        y="feature 2",
        c=y,
        colormap="viridis",
        colorbar=False,
    )
    sns.despine(ax=ax, offset=10)
    plt.tight_layout()




.. image-sg:: /auto_examples/datasets/images/sphx_glr_plot_make_imbalance_001.png
   :alt: plot make imbalance
   :srcset: /auto_examples/datasets/images/sphx_glr_plot_make_imbalance_001.png
   :class: sphx-glr-single-img





.. GENERATED FROM PYTHON SOURCE LINES 49-55

Make a dataset imbalanced
-------------------------

Now, we show the helper :func:`~imblearn.datasets.make_imbalance`, which
randomly selects a subset of samples so that the class distribution matches
the one specified by its parameters.
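
Conceptually, shrinking a class amounts to drawing a random subset of its
samples without replacement while keeping every other class intact. The
snippet below is a minimal sketch of that idea using only the standard
library. It is not the implementation used by
:func:`~imblearn.datasets.make_imbalance`, and the helper name
``undersample_indices`` and the toy labels are hypothetical.

.. code-block:: python

    import random
    from collections import Counter


    def undersample_indices(y, sampling_strategy, seed=0):
        """Hypothetical helper: return the sorted indices to keep.

        Labels listed in ``sampling_strategy`` are reduced to the requested
        count; all samples of the other labels are kept.
        """
        rng = random.Random(seed)
        kept = []
        for label in sorted(set(y)):
            indices = [i for i, value in enumerate(y) if value == label]
            n_wanted = sampling_strategy.get(label, len(indices))
            # Draw without replacement so that no sample is duplicated.
            kept.extend(rng.sample(indices, n_wanted))
        return sorted(kept)


    # Toy labels (an assumption for illustration): 10 samples per class.
    y_toy = [0] * 10 + [1] * 10
    kept = undersample_indices(y_toy, {1: 3})
    print(Counter(y_toy[i] for i in kept))  # Counter({0: 10, 1: 3})

Only the class named in the dictionary shrinks; class ``0`` keeps all of its
samples.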

.. GENERATED FROM PYTHON SOURCE LINES 57-65

.. code-block:: default

    from collections import Counter


    def ratio_func(y, multiplier, minority_class):
        # Keep only a fraction ``multiplier`` of the minority-class samples.
        target_stats = Counter(y)
        return {minority_class: int(multiplier * target_stats[minority_class])}
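
To make the role of ``ratio_func`` concrete, the following sketch applies it
to a toy label list (an assumption made for illustration, not the moons
data). The function is repeated here so that the snippet runs on its own.

.. code-block:: python

    from collections import Counter


    def ratio_func(y, multiplier, minority_class):
        target_stats = Counter(y)
        return {minority_class: int(multiplier * target_stats[minority_class])}


    # Toy labels (hypothetical): 100 samples in each class.
    y_toy = [0] * 100 + [1] * 100
    for multiplier in (0.9, 0.5, 0.1):
        # Each call returns the number of class-1 samples to keep.
        print(multiplier, ratio_func(y_toy, multiplier, minority_class=1))

The returned dictionary names only the minority class, and ``int`` truncates
the product, so the requested count never exceeds the number of available
samples when ``multiplier`` is at most 1.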









.. GENERATED FROM PYTHON SOURCE LINES 66-102

.. code-block:: default

    from imblearn.datasets import make_imbalance

    fig, axs = plt.subplots(nrows=2, ncols=3, figsize=(15, 10))

    X.plot.scatter(
        x="feature 1",
        y="feature 2",
        c=y,
        ax=axs[0, 0],
        colormap="viridis",
        colorbar=False,
    )
    axs[0, 0].set_title("Original set")
    sns.despine(ax=axs[0, 0], offset=10)

    multipliers = [0.9, 0.75, 0.5, 0.25, 0.1]
    for ax, multiplier in zip(axs.ravel()[1:], multipliers):
        X_resampled, y_resampled = make_imbalance(
            X,
            y,
            sampling_strategy=ratio_func,
            multiplier=multiplier,
            minority_class=1,
        )
        X_resampled.plot.scatter(
            x="feature 1",
            y="feature 2",
            c=y_resampled,
            ax=ax,
            colormap="viridis",
            colorbar=False,
        )
        ax.set_title(f"Sampling ratio = {multiplier}")
        sns.despine(ax=ax, offset=10)

    plt.tight_layout()
    plt.show()



.. image-sg:: /auto_examples/datasets/images/sphx_glr_plot_make_imbalance_002.png
   :alt: Original set, Sampling ratio = 0.9, Sampling ratio = 0.75, Sampling ratio = 0.5, Sampling ratio = 0.25, Sampling ratio = 0.1
   :srcset: /auto_examples/datasets/images/sphx_glr_plot_make_imbalance_002.png
   :class: sphx-glr-single-img
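
As a complement to the callable used above, ``sampling_strategy`` can also be
given as a dictionary mapping a class label to the number of samples to keep.
The sketch below assumes that form, together with a fixed ``random_state``
(neither appears in the example above), and checks the resulting class counts
with :class:`~collections.Counter`. The target of 25 samples is an arbitrary
choice for illustration.

.. code-block:: python

    from collections import Counter

    from imblearn.datasets import make_imbalance
    from sklearn.datasets import make_moons

    # Same balanced dataset as above: 100 samples per class.
    X, y = make_moons(n_samples=200, shuffle=True, noise=0.5, random_state=10)
    print(Counter(y))

    # Keep 25 samples of class 1; class 0 is not listed and is left as is.
    X_res, y_res = make_imbalance(
        X, y, sampling_strategy={1: 25}, random_state=0
    )
    print(Counter(y_res))

Counting labels before and after resampling is a quick way to confirm that
the requested distribution was actually obtained.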






.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.396 seconds)


.. _sphx_glr_download_auto_examples_datasets_plot_make_imbalance.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example




    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_make_imbalance.py <plot_make_imbalance.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_make_imbalance.ipynb <plot_make_imbalance.ipynb>`


.. include:: plot_make_imbalance.recommendations


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
