---
title: "Benchmarks"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Benchmarks}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

In the same way that using R packages 
[**cpp11**](https://CRAN.R-project.org/package=cpp11) or 
[**Rcpp**](https://CRAN.R-project.org/package=Rcpp) allows you to 
bridge your R code with C++ code that runs faster than R, you can use **luajr** 
to bridge your R code with Lua code that runs faster than R.

Often, C/C++ code runs about 5-100 times faster than the equivalent R code.
In my experience, **luajr** code typically presents a similar speedup. The 
amount of speedup also depends strongly on what you're doing.

Why use **luajr** then? **Rcpp** and **cpp11** require a C++ toolchain (e.g. 
gcc, clang, etc.) and requires long compilation times, whereas **luajr** 
doesn't. This means that **luajr** is usable when a C++ compiler isn't 
available, or when compilation times are prohibitive or an annoyance.

In this vignette, we show a few example benchmarks to show situations where 
**luajr** offers a substantial improvement in code execution speed and when it
doesn't. 

## Example 1: when to stick with R

Lots of R's built-in functions are already speedy because they rely on compiled
C code. For example, consider the following ways you might sum all the elements
of a numeric vector:

```{r}
library(luajr)

x <- rnorm(1e6)

# Method one: built-in sum
s1 <- sum(x)

# Method two: sum in R
sum_R <- function(x) {
    s <- 0
    for (y in x) {
        s <- s + y
    }
    return (s)
}
s2 <- sum_R(x)

# Method three: sum in Lua
sum_L <- lua_func("function(x)
    local s = 0
    for i = 1, #x do
        s = s + x[i]
    end
    return s
end")
s3 <- sum_L(x)
```

Each method produces the same answer. Using `bench::mark()` on a 2025-era 
MacBook Pro, the execution time for each method is as follows:

| Method       | Median runtime |
|:-------------|---------------:|
|`sum(x)`      |     664 µs     |
|`sum_R(x)`    |    6340 µs     |
|`sum_L(x)`    |   **546 µs**   |

R's built-in `sum()` function calls an internal R function written in C, so it
is already relatively fast. The "naïve" `sum_R()` function, which calculates
the sum manually, is around 10x slower. And the Lua version just barely edges
out R's built-in `sum()` function, but it's worth noting that Lua arithmetic
may not handle `NA` values correctly for numeric vectors and certainly won't
handle `NA` values correctly for integers (without explicit handling).

## Example 2: logistic map

Consider the following:

```{r}
logistic_map_R = function(x0, burn, iter, A)
{
    result_x = numeric(length(A) * iter)
    
    j = 1
    for (a in A) {
        x = x0
        for (i in 1:burn) { 
            x = a * x * (1 - x)
        }
        for (i in 1:iter) { 
            result_x[j] = x
            x = a * x * (1 - x)
            j = j + 1
        }
    }
    
    return (list2DF(list(a = rep(A, each = iter), x = result_x)))
}

logistic_map_L = lua_func(
"function(x0, burn, iter, A)
    local dflen = #A * iter
    local aa = luajr.numeric(dflen, 0)
    local xx = luajr.numeric(dflen, 0)
    
    local j = 1
    for k, a in pairs(A) do
        local x = x0
        for i = 1, burn do
            x = a * x * (1 - x)
        end
        for i = 1, iter do
            aa[j] = a
            xx[j] = x
            x = a * x * (1 - x)
            j = j + 1
        end
    end
    
    local result = luajr.dataframe()
    result:set('a', aa)
    result:set('x', xx)
    
    return result
end", "native, native, native, auto")

# To be compiled using Rcpp::cppFunction()
logistic_map_C =
'DataFrame logistic_map(double x0, unsigned int burn, unsigned int iter, NumericVector A)
{
    unsigned int dflen = A.length() * iter;
    NumericVector da(dflen, 0);
    NumericVector dx(dflen, 0);
    
    unsigned int j = 0;
    for (auto a : A)
    {
        double x = x0;
        for (unsigned int i = 0; i < burn; ++i)
            x = a * x * (1 - x);
        for (unsigned int i = 0; i < iter; ++i, ++j)
        {
            dx[j] = x;
            da[j] = a;
            x = a * x * (1 - x);
        }
    }

    return DataFrame::create(Named("a") = da, Named("x") = dx);
}'
```

Here we are comparing three different versions (R, Lua, C++) of running a 
parameter sweep of the logistic map, a chaotic dynamical system popularized by 
Bob May in a [1976 Nature article](https://abel.math.harvard.edu/archive/118r_spring_05/docs/may.pdf).
The output looks like this:

```{r}
#| fig.alt: >
#|   Plot of fixed points of the logistic map chaotic dynamical system.
#|   The input x-axis value a ranges from 2.0 to 3.85. Between 2.0 and 
#|   3.0 there is a single fixed point; above 3.0 the fixed points exhibit
#|   bifurcations and chaotic behaviour.
logistic_map = logistic_map_L(0.5, 100, 100, 200:385/100)
plot(logistic_map$a, logistic_map$x, pch = ".")
```

The times taken by each function on a 2025-era MacBook Pro are as follows:

|Call                                          | Method       | Median runtime |
|----------------------------------------------|--------------|---------------:|
|`logistic_map_R(0.5, 100, 100, 200:385/100))` | R function   |       850 µs   |
|`logistic_map_L(0.5, 100, 100, 200:385/100))` | Lua function |      **80 µs** |
|`logistic_map_C(0.5, 100, 100, 200:385/100))` | C++ function |       100 µs   |

The version written in Lua is around 10 times faster than the version in R, 
and even somewhat outperforms **Rcpp**. Note that the relative speed of Lua 
versus C++ depends on the number of iterations, and whether you use **Rcpp** or
**cpp11**, which each seem to have marginal advantages in different cases. 
Nonetheless, the Lua version seems to execute about as quickly as the C++ 
version, within plus or minus 20%.

The speedup of the Lua function relative to R was much more notable in an 
earlier test where the R version first created the data frame and then 
performed the iteration, i.e. with the line `result$x[j] = x` instead of 
`result_x[j] = x`. The median runtime for that R version was two orders of 
magnitude slower than the Lua version; the extra overhead associated with 
`data.frame` methods was pointed out by Tim Taylor. This goes to show that 
when aiming for efficient R code, it helps to know a few tricks.

## Example 3: Lorenz attractor

Even greater speedups can be seen when conducting more intensive calculations.
For example, solving a system of ordinary differential equations (ODEs) is a 
common scientific task. Normally this is done in R using a general-purpose 
package such as [**deSolve**](https://CRAN.R-project.org/package=deSolve). In 
this benchmark we will instead do simple Euler integration of the famous
chaotic [Lorenz system](https://en.wikipedia.org/wiki/Lorenz_system).

To calculate this system entirely in R, you might write something like the 
following:

```{r}
# Lorenz attractor in R

# Take a single step, size dt, of the Lorenz system
step1 = function(x, rho, sigma, beta, dt)
{
    dx = sigma * (x[2] - x[1])
    dy = x[1] * (rho - x[3]) - x[2]
    dz = x[1] * x[2] - beta * x[3]
    x[1] = x[1] + dx * dt
    x[2] = x[2] + dy * dt
    x[3] = x[3] + dz * dt
    return (x)
}

# Calculate Lorenz system trajectory
lorenz1 = function(n, init, rho, sigma, beta)
{
    x = init
    result = matrix(0, nrow = n, ncol = 3)
    for (i in 1:n) {
        result[i, ] = x
        x = step1(x, rho, sigma, beta, 0.01)
    }
    return (result)
}

# Generate and plot result
result1 = lorenz1(5000, c(1,1,1), 28, 10, 8/3)
plot(result1[,1], result1[,2], type = "l")
```

Similar code can be written entirely in Lua using **luajr**:

```{r}
# Lorenz attractor in Lua
library(luajr)

lorenz2 = lua_func("
function(n, init, rho, sigma, beta)
    local step2 = function(x, rho, sigma, beta, dt)
        local dx = sigma * (x[2] - x[1])
        local dy = x[1] * (rho - x[3]) - x[2]
        local dz = x[1] * x[2] - beta * x[3]
        x[1] = x[1] + dx * dt
        x[2] = x[2] + dy * dt
        x[3] = x[3] + dz * dt
        return x
    end

    local x = init
    local result = luajr.matrix(n, 3)

    for i = 1,n do
        result[i] = x[1]
        result[i + n] = x[2]
        result[i + 2*n] = x[3]
        x = step2(x, rho, sigma, beta, 0.01)
    end

    return result
end", "native auto native")

result2 = lorenz2(5000, c(1,1,1), 28, 10, 8/3)
plot(result2[,1], result2[,2], type = "l")
```

Execution times are as follows:

|Call                                          | Method       | Median runtime |
|----------------------------------------------|--------------|---------------:|
|`lorenz1(5000, c(1, 1, 1), 28, 10, 8/3)`      | R function   |      3300 µs   |
|`lorenz2(5000, c(1, 1, 1), 28, 10, 8/3)`      | Lua function |      **38 µs** |

The Lua code is 85 times faster than the equivalent R code.

As mentioned before, a library such as **deSolve** is normally used to 
integrate systems of ODEs and both versions above do execute rather quickly
anyway -- it may not be worth the trouble here to save 3 milliseconds! But
for intensive and complex simulations, it may help to get a substantial speed 
boost over what base R can do.
