
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/parallel_random_state.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_parallel_random_state.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_parallel_random_state.py:


===================================
Random state within joblib.Parallel
===================================

Parallel execution affects random number generation differently depending
on the backend.

In particular, when using multiple processes, every process can produce the
same random sequence. This example illustrates the problem and shows
how to work around it.

.. GENERATED FROM PYTHON SOURCE LINES 14-19

.. code-block:: default


    import numpy as np
    from joblib import Parallel, delayed









.. GENERATED FROM PYTHON SOURCE LINES 20-21

A utility function for the example

.. GENERATED FROM PYTHON SOURCE LINES 21-27

.. code-block:: default

    def print_vector(vector, backend):
        """Helper function to print the generated vector with a given backend."""
        print('\nThe different generated vectors using the {} backend are:\n {}'
              .format(backend, np.array(vector)))









.. GENERATED FROM PYTHON SOURCE LINES 28-36

Sequential behavior
####################

 ``stochastic_function`` generates five random integers. When calling the
 function several times, we expect to obtain different vectors. For
 instance, if we call the function five times sequentially, we can check
 that the generated vectors are all different.

.. GENERATED FROM PYTHON SOURCE LINES 36-48

.. code-block:: default



    def stochastic_function(max_value):
        """Randomly generate five integers in [0, max_value)."""
        return np.random.randint(max_value, size=5)


    n_vectors = 5
    random_vector = [stochastic_function(10) for _ in range(n_vectors)]
    print('\nThe different generated vectors in a sequential manner are:\n {}'
          .format(np.array(random_vector)))





.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    The different generated vectors in a sequential manner are:
     [[2 5 0 3 8]
     [1 6 9 1 6]
     [2 6 8 0 6]
     [0 0 2 2 7]
     [2 2 7 5 1]]
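The vectors differ because every call to ``np.random.randint`` advances the
same global state. As a minimal sketch of that mechanism, reseeding the global
state replays the same draws. The seed value ``0`` is an arbitrary choice for
illustration and is not part of the original example.

```python
import numpy as np


def stochastic_function(max_value):
    """Randomly generate five integers in [0, max_value)."""
    return np.random.randint(max_value, size=5)


np.random.seed(0)  # arbitrary seed, chosen only for this illustration
first = stochastic_function(10)
second = stochastic_function(10)  # continues the sequence started above

np.random.seed(0)  # reset the global state to the same starting point
replayed = stochastic_function(10)

# The replayed vector matches the first one drawn after the original seeding.
print(np.array_equal(first, replayed))
```

Keeping this in mind helps with the parallel case: whatever copies the global
state also copies the sequence that follows from it.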




.. GENERATED FROM PYTHON SOURCE LINES 49-54

Parallel behavior
##################

 Joblib provides three different backends: loky (default), threading, and
 multiprocessing.

.. GENERATED FROM PYTHON SOURCE LINES 54-60

.. code-block:: default


    backend = 'loky'
    random_vector = Parallel(n_jobs=2, backend=backend)(delayed(
        stochastic_function)(10) for _ in range(n_vectors))
    print_vector(random_vector, backend)





.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    The different generated vectors using the loky backend are:
     [[0 8 2 0 0]
     [9 8 0 7 9]
     [8 2 0 2 2]
     [2 0 5 0 0]
     [0 2 9 6 5]]




.. GENERATED FROM PYTHON SOURCE LINES 61-67

.. code-block:: default


    backend = 'threading'
    random_vector = Parallel(n_jobs=2, backend=backend)(delayed(
        stochastic_function)(10) for _ in range(n_vectors))
    print_vector(random_vector, backend)





.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    The different generated vectors using the threading backend are:
     [[8 9 6 2 0]
     [3 5 2 7 2]
     [1 2 9 3 8]
     [9 2 8 4 0]
     [5 6 0 0 7]]




.. GENERATED FROM PYTHON SOURCE LINES 68-78

The loky and threading backends behave exactly as in the sequential case and
do not require more care. However, this is not the case for the
multiprocessing backend with the "fork" or "forkserver" start method, because
the global numpy random state is duplicated exactly in all the workers.
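To make the duplication concrete without starting any process, the sketch
below imitates what a worker inherits: it snapshots the global state and loads
that snapshot into two separate ``RandomState`` objects. This is only a
simulation under that assumption; ``draw_in_simulated_worker`` is a
hypothetical helper, and the seed ``0`` is arbitrary.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make this sketch reproducible

# Snapshot of the global state, standing in for what each worker inherits.
inherited_state = np.random.get_state()


def draw_in_simulated_worker(state, max_value=10):
    """Draw five integers from a private copy of the inherited state."""
    rng = np.random.RandomState()
    rng.set_state(state)
    return rng.randint(max_value, size=5)


worker_a = draw_in_simulated_worker(inherited_state)
worker_b = draw_in_simulated_worker(inherited_state)

# Both simulated workers start from the same state, so they draw the
# same vector.
print(worker_a, worker_b)
```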

Note: on platforms where the default start method is "spawn", we do not
have this problem, but we cannot use this start method in a Python script
without the ``if __name__ == "__main__"`` construct. So let's end this example
early if that's the case:

.. GENERATED FROM PYTHON SOURCE LINES 78-86

.. code-block:: default


    import multiprocessing as mp
    if mp.get_start_method() != "spawn":
        backend = 'multiprocessing'
        random_vector = Parallel(n_jobs=2, backend=backend)(delayed(
            stochastic_function)(10) for _ in range(n_vectors))
        print_vector(random_vector, backend)





.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    The different generated vectors using the multiprocessing backend are:
     [[7 8 4 0 3]
     [7 8 4 0 3]
     [0 8 4 8 8]
     [0 8 4 8 8]
     [8 8 3 1 9]]




.. GENERATED FROM PYTHON SOURCE LINES 87-95

Some of the generated vectors are exactly the same, which can be a
problem for the application.

Technically, the reason is that all forked Python processes inherit the
exact same random state. As a result, we obtain the same randomly
generated vectors twice because we are using ``n_jobs=2``. A solution is to
set the random state within the function which is passed to
:class:`joblib.Parallel`.

.. GENERATED FROM PYTHON SOURCE LINES 95-102

.. code-block:: default



    def stochastic_function_seeded(max_value, random_state):
        """Generate five integers in [0, max_value) from a private state."""
        rng = np.random.RandomState(random_state)
        return rng.randint(max_value, size=5)









.. GENERATED FROM PYTHON SOURCE LINES 103-106

``stochastic_function_seeded`` accepts a random seed as an argument. Passing
``None`` at every function call makes each call create a ``RandomState``
seeded with fresh, unpredictable data (operating system entropy when
available). In this case, we see that the generated vectors are all different.

.. GENERATED FROM PYTHON SOURCE LINES 106-112

.. code-block:: default


    if mp.get_start_method() != "spawn":
        random_vector = Parallel(n_jobs=2, backend=backend)(delayed(
            stochastic_function_seeded)(10, None) for _ in range(n_vectors))
        print_vector(random_vector, backend)





.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    The different generated vectors using the multiprocessing backend are:
     [[5 1 6 9 0]
     [6 9 0 7 4]
     [6 1 1 9 0]
     [6 3 5 9 8]
     [7 1 3 3 2]]
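The choice of seed matters here. As a small sketch run in a single process,
the calls below contrast ``None`` with a fixed integer; the wide range
``2**31 - 1`` and the seed ``0`` are assumptions made only so that the
difference is easy to see.

```python
import numpy as np


def stochastic_function_seeded(max_value, random_state):
    """Generate five integers in [0, max_value) from a private state."""
    rng = np.random.RandomState(random_state)
    return rng.randint(max_value, size=5)


wide = 2**31 - 1  # wide range, so accidental collisions are negligible

# With None, each call seeds itself with fresh data.
unseeded_a = stochastic_function_seeded(wide, None)
unseeded_b = stochastic_function_seeded(wide, None)

# With the same fixed seed, each call restarts the same sequence.
fixed_a = stochastic_function_seeded(wide, 0)
fixed_b = stochastic_function_seeded(wide, 0)

print(np.array_equal(unseeded_a, unseeded_b))
print(np.array_equal(fixed_a, fixed_b))
```

Passing one fixed integer to every task would therefore bring the duplicated
vectors back, which is why a per-task seed is needed for reproducible runs.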




.. GENERATED FROM PYTHON SOURCE LINES 113-123

Fixing the random state to obtain deterministic results
########################################################

 The pattern of ``stochastic_function_seeded`` has another advantage: it
 lets us control the random state by passing a known seed. For best results
 [1]_, the random state is initialized with a sequence built from a root seed
 and a job identifier. For instance, we can reproduce the same vectors by
 passing a fixed state as follows.

 .. [1]  https://numpy.org/doc/stable/reference/random/parallel.html

.. GENERATED FROM PYTHON SOURCE LINES 123-133

.. code-block:: default


    if mp.get_start_method() != "spawn":
        seed = 42
        random_vector = Parallel(n_jobs=2, backend=backend)(delayed(
            stochastic_function_seeded)(10, [i, seed]) for i in range(n_vectors))
        print_vector(random_vector, backend)

        random_vector = Parallel(n_jobs=2, backend=backend)(delayed(
            stochastic_function_seeded)(10, [i, seed]) for i in range(n_vectors))
        print_vector(random_vector, backend)




.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    The different generated vectors using the multiprocessing backend are:
     [[5 7 2 3 5]
     [1 7 4 0 6]
     [4 9 5 6 4]
     [1 8 1 0 4]
     [8 3 5 7 0]]

    The different generated vectors using the multiprocessing backend are:
     [[5 7 2 3 5]
     [1 7 4 0 6]
     [4 9 5 6 4]
     [1 8 1 0 4]
     [8 3 5 7 0]]
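The NumPy page cited as [1] also describes spawning child seeds from a
``SeedSequence``. The following is a sketch of that alternative, not the
approach used in this example, which passes ``[i, seed]`` lists to
``RandomState``. It assumes NumPy 1.17 or later, and ``make_task_rngs`` and
``draw`` are hypothetical helper names. The generators are used sequentially
here to keep the sketch self-contained.

```python
import numpy as np


def make_task_rngs(root_seed, n_tasks):
    """Create one independent Generator per task from a single root seed."""
    children = np.random.SeedSequence(root_seed).spawn(n_tasks)
    return [np.random.default_rng(child) for child in children]


def draw(rng, max_value=10):
    """Draw five integers in [0, max_value) from the given Generator."""
    return rng.integers(max_value, size=5)


# Two runs built from the same root seed produce the same per-task vectors.
first_run = [draw(rng) for rng in make_task_rngs(42, 5)]
second_run = [draw(rng) for rng in make_task_rngs(42, 5)]

print(np.array(first_run))
print(all(np.array_equal(a, b) for a, b in zip(first_run, second_run)))
```

In a parallel setting, each child seed or generator would be handed to one
task, so the result depends on the task index and not on which worker runs it.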





.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.211 seconds)


.. _sphx_glr_download_auto_examples_parallel_random_state.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example




    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: parallel_random_state.py <parallel_random_state.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: parallel_random_state.ipynb <parallel_random_state.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
