% \VignetteIndexEntry{Getting Started with doMC and foreach} % \VignetteDepends{doMC} % \VignetteDepends{foreach} % \VignettePackage{doMC} \documentclass[12pt]{article} \usepackage{amsmath} \usepackage[pdftex]{graphicx} \usepackage{color} \usepackage{xspace} \usepackage{url} \usepackage{fancyvrb} \usepackage{fancyhdr} \usepackage[ colorlinks=true, linkcolor=blue, citecolor=blue, urlcolor=blue] {hyperref} \usepackage{lscape} \usepackage{Sweave} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % define new colors for use \definecolor{darkgreen}{rgb}{0,0.6,0} \definecolor{darkred}{rgb}{0.6,0.0,0} \definecolor{lightbrown}{rgb}{1,0.9,0.8} \definecolor{brown}{rgb}{0.6,0.3,0.3} \definecolor{darkblue}{rgb}{0,0,0.8} \definecolor{darkmagenta}{rgb}{0.5,0,0.5} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \newcommand{\bld}[1]{\mbox{\boldmath $#1$}} \newcommand{\shell}[1]{\mbox{$#1$}} \renewcommand{\vec}[1]{\mbox{\bf {#1}}} \newcommand{\ReallySmallSpacing}{\renewcommand{\baselinestretch}{.6}\Large\normalsize} \newcommand{\SmallSpacing}{\renewcommand{\baselinestretch}{1.1}\Large\normalsize} \newcommand{\halfs}{\frac{1}{2}} \setlength{\oddsidemargin}{-.25 truein} \setlength{\evensidemargin}{0truein} \setlength{\topmargin}{-0.2truein} \setlength{\textwidth}{7 truein} \setlength{\textheight}{8.5 truein} \setlength{\parindent}{0.20truein} \setlength{\parskip}{0.10truein} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \pagestyle{fancy} \lhead{} \chead{Getting Started with doMC and foreach} \rhead{} \lfoot{} \cfoot{} \rfoot{\thepage} \renewcommand{\headrulewidth}{1pt} \renewcommand{\footrulewidth}{1pt} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \title{Getting Started with doMC and foreach} \author{Steve Weston} \begin{document} \maketitle \thispagestyle{empty} \section{Introduction} The \texttt{doMC} package is a ``parallel backend'' for the \texttt{foreach} package. It provides a mechanism needed to execute \texttt{foreach} loops in parallel. The \texttt{foreach} package must be used in conjunction with a package such as \texttt{doMC} in order to execute code in parallel. The user must register a parallel backend to use, otherwise \texttt{foreach} will execute tasks sequentially, even when the \%dopar\% operator is used.\footnote{\texttt{foreach} will issue a warning that it is running sequentially if no parallel backend has been registered. It will only issue this warning once, however.} The \texttt{doMC} package acts as an interface between \texttt{foreach} and the \texttt{multicore} functionality of the \texttt{parallel} package, originally written by Simon Urbanek and incorporated into \texttt{parallel} for R 2.14.0. The \texttt{multicore} functionality currently only works with operating systems that support the \texttt{fork} system call (which means that Windows isn't supported). Also, \texttt{multicore} only runs tasks on a single computer, not a cluster of computers. That means that it is pointless to use \texttt{doMC} and \texttt{multicore} on a machine with only one processor with a single core. To get a speed improvement, it must run on a machine with multiple processors, multiple cores, or both. \section{A word of caution} Because the \texttt{multicore} functionality starts its workers using \texttt{fork} without doing a subsequent \texttt{exec}, it has some limitations. Some operations cannot be performed properly by forked processes. For example, connection objects very likely won't work. In some cases, this could cause an object to become corrupted, and the R session to crash. In addition, it usually isn't safe to run \texttt{doMC} and \texttt{multicore} from a GUI environment. \section{Registering the \texttt{doMC} parallel backend} To register \texttt{doMC} to be used with \texttt{foreach}, you must call the \texttt{registerDoMC} function. This function takes only one argument, named ``cores''. This specifies the number of worker processes that it will use to execute tasks, which will normally be equal to the total number of cores on the machine. You don't need to specify a value for it, however. By default, the \texttt{multicore} package will use the value of the ``cores'' option, as specified with the standard ``options'' function. If that isn't set, then \texttt{multicore} will try to detect the number of cores, and use approximately half that many workers. Remember: unless \texttt{registerDoMC} is called, \texttt{foreach} will {\em not} run in parallel. Simply loading the \texttt{doMC} package is not enough. \section{An example \texttt{doMC} session} Before we go any further, let's load \texttt{doMC}, register it, and use it with \texttt{foreach}: <>= library(doMC) registerDoMC(2) foreach(i=1:3) %dopar% sqrt(i) @ \begin{quote} Note well that this is {\em not} a practical use of \texttt{doMC}. This is my ``Hello, world'' program for parallel computing. It tests that everything is installed and set up properly, but don't expect it to run faster than a sequential \texttt{for} loop, because it won't! \texttt{sqrt} executes far too quickly to be worth executing in parallel, even with a large number of iterations. With small tasks, the overhead of scheduling the task and returning the result can be greater than the time to execute the task itself, resulting in poor performance. In addition, this example doesn't make use of the vector capabilities of \texttt{sqrt}, which it must to get decent performance. This is just a test and a pedagogical example, {\em not} a benchmark. \end{quote} But returning to the point of this example, you can see that it is very simple to load \texttt{doMC} with all of its dependencies (\texttt{foreach}, \texttt{iterators}, \texttt{multicore}, etc), and to register it. For the rest of the R session, whenever you execute \texttt{foreach} with \texttt{\%dopar\%}, the tasks will be executed using \texttt{doMC} and \texttt{multicore}. Note that you can register a different parallel backend later, or deregister \texttt{doMC} by registering the sequential backend by calling the \texttt{registerDoSEQ} function. \section{A more serious example} Now that we've gotten our feet wet, let's do something a bit less trivial. One good example is bootstrapping. Let's see how long it takes to run 10,000 bootstrap iterations in parallel on \Sexpr{getDoParWorkers()} cores: <>= x <- iris[which(iris[,5] != "setosa"), c(1,5)] trials <- 10000 ptime <- system.time({ r <- foreach(icount(trials), .combine=cbind) %dopar% { ind <- sample(100, 100, replace=TRUE) result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit)) coefficients(result1) } })[3] ptime @ Using \texttt{doMC} and \texttt{multicore} we were able to perform 10,000 bootstrap iterations in \Sexpr{ptime} seconds on \Sexpr{getDoParWorkers()} cores. By changing the \texttt{\%dopar\%} to \texttt{\%do\%}, we can run the same code sequentially to determine the performance improvement: <>= stime <- system.time({ r <- foreach(icount(trials), .combine=cbind) %do% { ind <- sample(100, 100, replace=TRUE) result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit)) coefficients(result1) } })[3] stime @ The sequential version ran in \Sexpr{stime} seconds, which means the speed up is about \Sexpr{round(stime / ptime, digits=1)} on \Sexpr{getDoParWorkers()} workers.\footnote{If you build this vignette yourself, you can see how well this problem runs on your hardware. None of the times are hardcoded in this document. You can also run the same example which is in the examples directory of the \texttt{doMC} distribution.} Ideally, the speed up would be \Sexpr{getDoParWorkers()}, but no multicore CPUs are ideal, and neither are the operating systems and software that run on them. At any rate, this is a more realistic example that is worth executing in parallel. I'm not going to explain what it's doing or how it works here. I just want to give you something more substantial than the \texttt{sqrt} example in case you want to run some benchmarks yourself. You can also run this example on a cluster by simply registering a different parallel backend that supports clusters in order to take advantage of more processors. \section{Getting information about the parallel backend} To find out how many workers \texttt{foreach} is going to use, you can use the \texttt{getDoParWorkers} function: <>= getDoParWorkers() @ This is a useful sanity check that you're actually running in parallel. If you haven't registered a parallel backend, or if your machine only has one core, \texttt{getDoParWorkers} will return one. In either case, don't expect a speed improvement. \texttt{foreach} is clever, but it isn't magic. The \texttt{getDoParWorkers} function is also useful when you want the number of tasks to be equal to the number of workers. You may want to pass this value to an iterator constructor, for example. You can also get the name and version of the currently registered backend: <>= getDoParName() getDoParVersion() @ This is mostly useful for documentation purposes, or for checking that you have the most recent version of \texttt{doMC}. \section{Specifying multicore options} The \texttt{doMC} package allows you to specify various options when running \texttt{foreach} that are supported by the underlying \texttt{mclapply} function: ``preschedule'', ``set.seed'', ``silent'', and ``cores''. You can learn about these options from the \texttt{mclapply} man page. They are set using the \texttt{foreach} \texttt{.options.multicore} argument. Here's an example of how to do that: <>= mcoptions <- list(preschedule=FALSE, set.seed=FALSE) foreach(i=1:3, .options.multicore=mcoptions) %dopar% sqrt(i) @ The ``cores'' options allows you to temporarily override the number of workers to use for a single \texttt{foreach} operation. This is more convenient than having to re-register \texttt{doMC}. Although if no value of ``cores'' was specified when \texttt{doMC} was registered, you can also change this value dynamically using the \texttt{options} function: \begin{verbatim} > registerDoMC() > getDoParWorkers() [1] 3 > options(cores=2) > getDoParWorkers() [1] 2 > options(cores=3) > getDoParWorkers() [1] 3 \end{verbatim} If you did specify the number of cores when registering \texttt{doMC}, the ``cores'' option is ignored: <>= registerDoMC(2) options(cores=4) getDoParWorkers() @ As you can see, there are a number of options for controlling the number of workers to use with \texttt{multicore}, but the default behaviour usually does what you want. \section{Conclusion} The \texttt{doMC} and \texttt{multicore} packages provide a nice, efficient parallel programming platform for multiprocessor/multicore computers running operating systems such as Linux and Mac OS X. It is very easy to install, and very easy to use. In short order, an average R programmer can start executing parallel programs, without any previous experience in parallel computing. \end{document}