<%@meta language="R-vignette" content="--------------------------------
%\VignetteIndexEntry{doFuture: An Overview on using Foreach to Parallelize via the Future Framework}
%\VignetteAuthor{Henrik Bengtsson}
%\VignetteKeyword{R}
%\VignetteKeyword{package}
%\VignetteKeyword{vignette}
%\VignetteKeyword{foreach}
%\VignetteKeyword{future}
%\VignetteKeyword{promise}
%\VignetteKeyword{lazy evaluation}
%\VignetteKeyword{synchronous}
%\VignetteKeyword{asynchronous}
%\VignetteKeyword{parallel}
%\VignetteKeyword{cluster}
%\VignetteEngine{R.rsp::rsp}
%\VignetteTangle{FALSE}
--------------------------------------------------------------------"%>
# doFuture: An Overview on using Foreach to Parallelize via the Future Framework

## TL;DR

To run `foreach()` in parallel, install the R packages **[doFuture]**
and **[futurize]**, and call:

```r
library(futurize)
plan(multisession)

y <- foreach(x = 1:4, y = 1:10) %do% {
  z <- x + y
  slow_sqrt(z)
} |> futurize()
```

That's it - easy!


## Introduction

The **[foreach]** package implements a map-reduce API with functions
`foreach()` and `times()` that provide powerful methods for iterating
over one or more sets of elements, with the option to do so in
parallel.
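
As a minimal sketch of this API, the following runs sequentially with `%do%`.  The variable names are illustrative only.

```r
library(foreach)

# Iterate over two sets of elements in lockstep; by default the
# results are returned as a list, one element per iteration
prods <- foreach(a = 1:3, b = 4:6) %do% {
  a * b
}

# Reduce the per-iteration results with a combine function
total <- foreach(a = 1:3, b = 4:6, .combine = `+`) %do% {
  a * b
}

# times() evaluates an expression a fixed number of times
roots <- times(3) %do% sqrt(16)
```

The `.combine` argument is the "reduce" step of the map-reduce pattern; without it, the per-iteration values are collected as is.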

The **[future]** package provides a generic API for using futures in
R.  A future is a simple yet powerful mechanism to evaluate an R
expression and retrieve its value at some point in time.  Futures can
be resolved in many different ways depending on which strategy is
used. You can resolve them sequentially, in parallel on your local
computer, on remote computers, in the cloud, on a high-performance
compute (HPC) cluster, or via any available [future backend].
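
A minimal sketch of a single future, assuming two local background workers are acceptable on your machine:

```r
library(future)

# Choose how futures are resolved; here, in two background R sessions
plan(multisession, workers = 2)

# Create a future; the expression starts evaluating without blocking
f <- future({
  Sys.sleep(1)
  sqrt(16)
})

# ... other work can be done here while the future is being resolved ...

# Retrieve the value, waiting for the future to finish if needed
v <- value(f)

# Switching strategy requires no changes to the code above
plan(sequential)
```

Only the `plan()` call decides where the expression is evaluated; the code that creates and queries the future stays the same.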

The **[doFuture]** package provides a bridge between **foreach** and
the **future** parallelization framework. Specifically, the
**doFuture** package provides three alternatives for using futures
with **foreach**:

 1. `y <- foreach(...) %do% { ... } |> futurize()`

 2. `y <- foreach(...) %dofuture% { ... }`

 3. `registerDoFuture()` + `y <- foreach(...) %dopar% { ... }`
 


### Alternative 1: `futurize()` (recommended)

The _first alternative_ (recommended) uses `futurize()` of the
**[futurize]** package. An example is:

```r
library(futurize)
plan(multisession)

y <- foreach(x = 1:4, y = 1:10) %do% {
  z <- x + y
  slow_sqrt(z)
} |> futurize()
```

This alternative is the recommended and cleanest way to let
`foreach()` parallelize via the future framework, especially if you
start out from scratch.  All you need to remember is to pipe the
`foreach()` call to `futurize()`, and, yes, it is correct to use
`%do%` here.  In addition to `multisession`, parallelization can be
done via any compliant [future backend].  Identification of globals,
random number generation (RNG), and error handling are handled the
same way as elsewhere in the future ecosystem.  We recommend using
`futurize()`, because it is consistent with how we parallelize
`lapply()` and `purrr::map()` using **futurize**.  With `futurize()`,
you do not have to explicitly load **doFuture** - instead **doFuture**
will serve `futurize()` under the hood.

See `help("futurize", package = "futurize")` for more details and
examples on this approach.
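
As a sketch of the consistency mentioned above, the same pipe can be applied to a `foreach()` call and to an `lapply()` call.  Here `slow_sqrt()` is a hypothetical stand-in, since this vignette does not define it, and the `lapply()` form is assumed to work as described in the **futurize** documentation.  Attaching **foreach** explicitly is a precaution and may not be needed.

```r
library(futurize)
library(foreach)  # precaution; may already be available
plan(multisession)

# Hypothetical stand-in for the slow_sqrt() used in the examples
slow_sqrt <- function(x) {
  Sys.sleep(0.1)
  sqrt(x)
}

xs <- 1:4

# foreach(): write the sequential %do% form, then pipe it to futurize()
y1 <- foreach(x = xs) %do% {
  slow_sqrt(x)
} |> futurize()

# lapply(): the same pattern (assumed to behave as documented by futurize)
y2 <- lapply(xs, slow_sqrt) |> futurize()

plan(sequential)
```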



### Alternative 2: `%dofuture%`

The _second alternative_ (formerly recommended), which uses
`%dofuture%`, avoids having to use `registerDoFuture()`.  The
`%dofuture%` operator provides a more consistent behavior than
`%dopar%`, e.g. there is a single set of foreach arguments instead of
one set per possible adapter. An example is:

```r
library(doFuture)
plan(multisession)

y <- foreach(x = 1:4, y = 1:10) %dofuture% {
  z <- x + y
  slow_sqrt(z)
}
```

This alternative was the recommended way to let `foreach()`
parallelize via the future framework, but now we recommend using
`futurize()` instead, especially if you start out from scratch.

See `help("%dofuture%", package = "doFuture")` for more details and
examples on this approach.
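
A self-contained variant of the example above may look as follows.  The `slow_sqrt()` definition is a hypothetical stand-in, since this vignette does not define it, and the use of `.combine = c` and two workers are choices made for illustration.

```r
library(doFuture)
plan(multisession, workers = 2)

# Hypothetical stand-in for the slow_sqrt() used in the examples
slow_sqrt <- function(x) {
  Sys.sleep(0.1)
  sqrt(x)
}

# foreach() iterates over 'x' and 'y' in lockstep and stops when the
# shortest one is exhausted, so this runs four iterations.  The global
# function slow_sqrt() is identified and exported to the workers.
y <- foreach(x = 1:4, y = 1:10, .combine = c) %dofuture% {
  z <- x + y
  slow_sqrt(z)
}

# Parallel-safe random numbers are requested via the 'seed' future option
r <- foreach(i = 1:3, .combine = c,
             .options.future = list(seed = TRUE)) %dofuture% {
  rnorm(1)
}

plan(sequential)
```

Future-specific settings go in the one `.options.future` argument, regardless of which future backend is set by `plan()`.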


### Alternative 3: `registerDoFuture()` + `%dopar%`

The _third alternative_ is based on the traditional **foreach**
approach where one registers a foreach adapter to be used by
`%dopar%`.  A popular adapter is `doParallel::registerDoParallel()`,
which parallelizes on the local machine using the **parallel**
package.  This package provides `registerDoFuture()`, which
parallelizes using the **future** package, meaning any
future-compliant parallel backend can be used.

An example is:

```r
library(doFuture)
registerDoFuture()
plan(multisession)

y <- foreach(x = 1:4, y = 1:10) %dopar% {
  z <- x + y
  slow_sqrt(z)
}
```

This alternative is useful if you already have a lot of R code that
uses `%dopar%` and you just want to switch to using the future
framework for parallelization.  Using `registerDoFuture()` is also
useful when you wish to use the future framework with packages and
functions that use `foreach()` and `%dopar%` internally, but do not
yet support `futurize()`, e.g. **[NMF]**.

See `help("registerDoFuture", package = "doFuture")` for more details
and examples on this approach.
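
To sketch the second use case, consider a function that calls `%dopar%` internally and that you cannot modify.  The function `legacy_sqrt()` below is hypothetical and only stands in for such third-party code.

```r
library(doFuture)

# Hypothetical function standing in for third-party code that uses
# foreach() with %dopar% internally
legacy_sqrt <- function(xs) {
  foreach(x = xs, .combine = c) %dopar% {
    sqrt(x)
  }
}

# Register the doFuture adapter once; %dopar% now resolves via futures
registerDoFuture()
plan(multisession, workers = 2)

# Query which foreach adapter is currently registered
adapter <- getDoParName()

# The unmodified function now parallelizes via the future framework
y <- legacy_sqrt(1:4)

plan(sequential)
```

The function itself is unchanged; only the registered adapter and the `plan()` determine how its iterations are processed.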



[doFuture]: https://doFuture.futureverse.org
[futurize]: https://futurize.futureverse.org
[future]: https://future.futureverse.org
[foreach]: https://cran.r-project.org/package=foreach
[batchtools]: https://cran.r-project.org/package=batchtools
[future.batchtools]: https://future.batchtools.futureverse.org
[NMF]: https://cran.r-project.org/package=NMF
[future backend]: https://www.futureverse.org/backends.html
