%%\VignetteEngine{knitr::knitr} %%\VignetteIndexEntry{Using proftools} \documentclass[nojss]{jss} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %% declarations for jss.cls %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %% almost as usual \author{Luke Tierney\\University of Iowa \And Riad Jarjour\\University of Iowa} \title{Examining \proglang{R} Profiling Data: The \pkg{proftools} Package} %% for pretty printing and a nice hypersummary also set: \Plainauthor{Luke Tierney, Riad Jarjour} %% comma-separated \Plaintitle{Examining R Profiling Data: The proftools Package} %% without formatting %$ \Shorttitle{\pkg{foo}: A Capitalized Title} %% a short title (if necessary) %% an abstract and keywords \Abstract{ This note introduces the \pkg{proftools} package for examining data collected by \proglang{R}'s sampling profiler. \pkg{proftools} includes facilities for summarizing results at the function, call, and source line level; for filtering to narrow the focus to functions of primary interest; and for visualizing profiling data. Use of the package is illustrated with a small running example. } \Keywords{\pkg{proftools}, profiling, \proglang{R}, \code{Rprof}} \Plainkeywords{proftools, profiling, R, Rprof} %% without formatting %% at least one keyword must be supplied %% publication information %% NOTE: Typically, this can be left commented and will be filled out by the technical editor %% \Volume{50} %% \Issue{9} %% \Month{June} %% \Year{2012} %% \Submitdate{2012-06-04} %% \Acceptdate{2012-06-04} %% The address of (at least) one author should be given %% in the following format: \Address{ Luke Tierney\\ Department of Statistics and Actuarial Science\\ Faculty of Statistics\\ University of Iowa\\ Iowa City, USA\\ E-mail: \email{luke-tierney@uiowa.edu}\\ URL: \url{http://homepage.stat.uiowa.edu/~luke/} Riad Jarjour\\ Department of Statistics and Actuarial Science\\ Faculty of Statistics\\ University of Iowa\\ Iowa City, USA\\ E-mail: \email{riad-jarjour@uiowa.edu} } %% It is also possible to add a telephone and fax number %% before the e-mail in the following format: %% Telephone: +43/512/507-7103 %% Fax: +43/512/507-2851 %% for those who use Sweave please include the following line (with % symbols): %% need no \usepackage{Sweave.sty} %% end of declarations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{document} %% include your article here, just as usual %% Note that you should use the \pkg{}, \proglang{} and <<>>=} commands. <>= library("proftools") options(keep.source = TRUE) knitr::opts_chunk$set(size = "small", fig.align = "center") @ %% \section{Introduction} Profiling is a program analysis method for determining where a program run spends most of its execution time, and can be very helpful in guiding programmer effort for improving program performance. \proglang{R} includes a sampling based profiling mechanism that records information about calls on the stack at specified time intervals. If available, information about the specific source code lines active at the sampling point is recorded as well. Information about time spent in the garbage collector can also be collected. The collected profiling data is written to a file, by default the file \code{Rprof.out} in the current working directory. The function \code{summaryRprof}, available in the \pkg{base} package in \proglang{R}, provides a simple interface for examining this data. The \pkg{profvis} package \citep{Chang:profvis} provides a visualization and another representation of the data. The \pkg{proftools} package provides a much more extensive set of tools for summarizing, visualizing, and filtering this data. \section{Collecting Profile Data} The \pkg{proftools} package includes an example script \code{bootlm.R} in the \code{samples} directory that runs several examples from the \pkg{boot} package \citep{CantyRipley:Boot} and then fits a simple linear model. The file path can be obtained as <<>>= srcfile <- system.file("samples", "bootlmEx.R", package = "proftools") @ %% The traditional way to collect profiling data in R is to call \code{Rprof} to start profiling, run the code to be profiled, and then call \code{Rprof} again to end profiling. For example, to profile the code in \code{bootlm.R}, and collect both source line and GC information, you could use <>= profout <- tempfile() Rprof(file = profout, gc.profiling = TRUE, line.profiling = TRUE) source(srcfile) Rprof(NULL) pd <- readProfileData(profout) unlink(profout) @ %% The \pkg{proftools} package provides the simpler alternative <>= pd <- profileExpr(source(srcfile)) @ %% By default \code{profileExpr} enables GC and source information to be collected. It also trims off stack information leading up to the \code{profileExpr} call. \textbf{RStudio} \citep{RStudio} provides a \textbf{Profile} menu for collecting profile data. The data are stored in a cache directory. The function \code{readRStudioProfileCacheData} returns the data for the most recent profile run in the cache, or \code{NULL} if none is available. \section{Summary Functions} The most basic summary function is \code{funSummary} for summarizing profile results at the function level. It produces information similar to the result returned by R's \code{summaryRprof} but in a more usable form: <<>>= head(funSummary(pd), 10) @ %% The result returned by \code{funSummary}, and most summary functions, is a data frame, so \code{head} can be used to focus on the top entries. By default, when source information is available results are summarized at the call level, so multiple calls to \code{boot} from different source lines are shown separately. This can be suppressed by supplying the \code{srclines = FALSE} argument: <<>>= head(funSummary(pd, srclines = FALSE), 10) @ %% Data can also be summarized by call: <<>>= head(callSummary(pd), 10) @ %% When source information is available in the profile data the \code{srcSummary} function can be used to summarize at the source line level; only lines appearing in the sample are included: <<>>= srcSummary(pd) @ %% The function \code{annotateSource} can show the full files with profiling annotations. A useful way to examine profile data is to look for hot execution paths. This approach sorts functions called at top level, i.e. at the bottom of the call stack, by the total amount of time spent in their top level calls; within each top level call to a function \code{f} the functions called by \code{f} are sorted by the amount of time spent in them within the top level call to \code{f}; and the process continues for higher level calls. The function \code{hotPaths} produces a hot path summary; the \code{total.pct} argument causes leaf functions in stack traces to be pruned back until the execution time percentage for each stack trace is at least \code{total.pct}: <<>>= hotPaths(pd, total.pct = 10.0) @ %% Examining the result of \code{hotPaths} starting with high values of \code{total.pct} and then moving to lower values is a useful way to explore where the computational effort is concentrated. An alternative for limiting the depth to which stack traces are followed is provided by the \code{maxdepth} argument. %% Other summary: \code{pathSummary} \section{Filtering Profile Data} The hot path summary shows information associated with the source command itself that is not directly relevant to our analysis. The \code{filterProfileData} function can be used to select or omit certain functions, drop functions with small self or total times, narrow to a particular time interval, among others. For example, by selecting only stack traces that include calls to \code{withVisible} and then trimming off the leading four calls we can focus just on the work done in the sourced file: <<>>= filteredPD <- filterProfileData(pd, select = "withVisible", skip = 4) @ %% The hot path summary for this reduced profile is <<>>= hotPaths(filteredPD, total.pct = 10.0) @ %% We can use the \code{focus} filter to further narrow our examination to stack frames containing calls to \code{glm} and also remove all calls preceding the first \code{glm} call from the selected stack frames. For this reduced data it also makes sense to follow the paths further by lowering \code{total.pct} to 5\%: <<>>= glmPD <- filterProfileData(filteredPD, focus = "glm") hotPaths(glmPD, total.pct = 5.0) @ %% \section{Visualizing Profile Data} \href{https://en.wikipedia.org/wiki/Call_graph}{Call graphs} annotated with profile information are a very popular way to view profiling results. \code{plotProfileCallGraph} uses the \pkg{graph} \citep{Gentleman:graph} and \pkg{Rgraphviz} \citep{Gentry:Rgraphviz} packages from Bioconductor to render an annotated call graph. The default style for the graph is based on the style used in \href{https://github.com/gperftools}{Google's profiling library} but can be customized in a number of ways. A call graph for the full profile data is produced by <>= plotProfileCallGraph(pd) @ %% and is shown in Figure \ref{fig:fullCallGraph}. By default the call graph size is limited to at most 30 nodes; nodes with lower total hit counts are dropped. The \code{maxnodes} argument can be used to adjust this limit. Also by default color is used to encode the total hit score for the function nodes. This can be suppressed with the \code{score = "none"} argument. We can obtain a more readable graph by filtering. For example, to examine the \code{glm.fit} call and its callees we can use <>= plotProfileCallGraph(filterProfileData(pd, focus = "glm.fit")) @ %% The result is shown in Figure \ref{fig:filteredCallGraph}. A printed version of the call graph, similar to the call graph produced by \code{gprof} \citep{Graham:1982:GCG:800230.806987}, can be obtained with \code{printProfileCallGraph}. For the subgraph of the \code{glm.fit} calls, for example, <>= printProfileCallGraph(filterProfileData(pd, focus = "glm.fit")) @ %% produces the printed representation shown in Figure \ref{fig:printGraph}. \begin{figure} \tiny <>= <> @ %% \caption{Printed call graph.} \label{fig:printGraph} \end{figure} Another visualization sometimes used is a \href{http://www.brendangregg.com/flamegraphs.html}{flame graph}. A flame graph for the original data is produced by <>= flameGraph(pd) @ %% The result is shown in Figure \ref{fig:flameGraph}. A flame graph of the filtered data is produced by <>= flameGraph(filteredPD) @ %% and shown in Figure \ref{fig:filteredFlameGraph} %% Furthermore, an svg file of the graph can be outputted by using the %% svg argument: %% <>= %% flameGraph(pd, svg = "graph.svg", order = "hot") %% @ The vertical positions of rectangles in these flame graphs represent call depth on the stack. The widths of the rectangles represent the amount of time spent in a call at a particular call or set of calls at a particular depth. The \code{order} argument determines the ordering of call rectangles at a particular level within a call at the lower level. The default order is \code{"hot"}; it uses the hot path ordering with the call with the largest amount of time first. This produces a visual representation of the hot path summary. The \code{"alpha"} ordering orders the calls alphabetically. Specifying \code{order = "time"} shows the calls in the order in which they occurred: <>= flameGraph(pd, order = "time") @ %% Figure \ref{fig:timeGraph} shows the \code{boot} calls preceding the data generation for the \code{lm} call and the \code{lm} call itself. Default colors for flame graphs are based on the \code{rainbow} palette. Alternative colors can be specified by providing a \code{colormap} function. If space permits, the call labels are printed within the rectangles representing the calls. A method for the \code{identify} function is available that can be used to identify the individual calls when they are not visible. An option to this method is to request that all instances of the identified calls be outlined on the plot. The \code{identify} method uses the value returned by \code{flameGraph}; thus it would be used as <>= fg <- flameGraph(pd) identify(fg) @ A third visualization that is sometimes used is a callee tree map. This is produced by <>= calleeTreeMap(pd) @ %% and shown in Figure \ref{fig:calleeTreeMap}. A callee tree map shows a tree map \citep{ShneidermanTreemaps} of the calls in the call tree. The tiling algorithm used depends on the \code{squarify} argument. If \code{squarify} is \code{TRUE} then the \emph{squarified} algorithm \citep{BrulsVanWijk:sqarified} is used; otherwise, the longer side is partitioned. Again a method for \code{identify} is provided for \code{calleeTreeMap} objects. Clicking on rectangles in the plot returns a list of the call stacks for the identified rectangles. \section{Graphical User Interfaces} The function \code{writeCallgrindFile} can be used to write the profile data in \href{valgrind.org}{Valgrind's} \code{callgrind} format for use with the \href{http://kcachegrind.sourceforge.net/html/Home.html}{\code{kcachegrind}} or \code{qcachegrind} graphical user interfaces available on Linux and Mac OS X. Figure \ref{kcachegrindFig} shows the \code{kcachegrind} interface for the example profile data written out using \code{writeCallgrindFile}. \begin{figure} \centering \includegraphics[width=\maxwidth]{kcachegrind} \caption{\code{kcachegrind} interface for examining profile data for the example.} \label{kcachegrindFig} \end{figure} By default \code{writeCallgrindFile} assumes the common case where, as here, the profiled code is in a file run using \code{source} and source code information is retained. If this is the case, \code{writeCallgrindFile} removes calls associated with the \code{source} call and adds a \code{} entry; this persuades \code{kcachegrind} to show the top level line information. Graphical user interfaces within R will be made available in the \href{https://github.com/ltierney/Rpkg-proftools-GUI}{\pkg{proftoolsGUI}} package. The current development version provides two interfaces, one based on \pkg{gWidgets2} \citep{Verzani:gWidgets2} and one on \pkg{shiny} \citep{Chang:shiny}. Figure \ref{shinyGUIFig} shows the shiny interface at the time of writing. \begin{figure} \centering \includegraphics[width=\maxwidth]{shinyGUI} \caption{\code{proftoolsGUI} \code{shiny} interface for examining profile data for the example.} \label{shinyGUIFig} \end{figure} \bibliography{proftools} \end{document}