---
title: "Reproducible Output"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Reproducible Output}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```

Reproducibility is a cornerstone of scientific research. **localLLM** is designed with reproducibility as a first-class feature, ensuring that your LLM-based analyses can be reliably replicated.

## Deterministic Generation by Default

All generation functions in localLLM (`quick_llama()`, `generate()`, and `generate_parallel()`) use **deterministic greedy decoding** by default, so running the same prompt twice with the same model and settings produces identical results.

```{r}
library(localLLM)

# Run the same query twice
response1 <- quick_llama("What is the capital of France?")
response2 <- quick_llama("What is the capital of France?")

# Results are identical
identical(response1, response2)
```

```
#> [1] TRUE
```
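
To see why greedy decoding is repeatable, consider a toy next-token distribution. The sketch below uses only base R and is not localLLM's internal implementation; the token names and probabilities are made up for illustration.

```{r}
# Toy next-token probabilities (made-up values, for illustration only)
probs <- c(Paris = 0.90, Lyon = 0.06, Marseille = 0.04)

# Greedy decoding: always pick the most probable token.
# No random numbers are involved, so the choice never changes.
greedy_pick <- function(p) names(p)[which.max(p)]

# Repeat the choice several times and check that it is always the same
picks <- replicate(5, greedy_pick(probs))
length(unique(picks)) == 1
```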

## Seed Control for Stochastic Generation

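With `temperature > 0`, generation is stochastic, but fixing the `seed` makes it repeatable: two calls with the same prompt, parameters, and seed return the same text.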

```{r}
# Stochastic generation with seed control
response1 <- quick_llama(
  "Write a haiku about data science",
  temperature = 0.9,
  seed = 92092
)

response2 <- quick_llama(
  "Write a haiku about data science",
  temperature = 0.9,
  seed = 92092
)

# Still reproducible with matching seeds
identical(response1, response2)
```

```
#> [1] TRUE
```

```{r}
# Different seeds produce different outputs
response3 <- quick_llama(
  "Write a haiku about data science",
  temperature = 0.9,
  seed = 12345
)

identical(response1, response3)
```

```
#> [1] FALSE
```
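
The role of the seed can be illustrated with R's own random number generator. This is a base-R analogy under assumed toy probabilities, not localLLM's sampler; `sample_tokens()` is a hypothetical helper written for this example.

```{r}
# Toy vocabulary and probabilities (hypothetical values)
tokens <- c("data", "model", "noise", "signal")
probs  <- c(0.4, 0.3, 0.2, 0.1)

# Draw a short "sequence" of tokens after fixing the RNG seed
sample_tokens <- function(seed, n = 10) {
  set.seed(seed)
  sample(tokens, size = n, replace = TRUE, prob = probs)
}

run_a <- sample_tokens(92092)
run_b <- sample_tokens(92092)

# Same seed, same sequence of draws
identical(run_a, run_b)
```

The draws are random, but the seed fixes the random number stream, so the whole sequence repeats exactly.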

## Input/Output Hash Verification

All generation functions compute SHA-256 hashes for both inputs and outputs. These hashes enable verification that collaborators used identical configurations and obtained matching results.

```{r}
result <- quick_llama("What is machine learning?")

# Access the hashes
hashes <- attr(result, "hashes")
print(hashes)
```

```
#> $input
#> [1] "a3f2b8c9d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1"
#>
#> $output
#> [1] "b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5"
```

The input hash includes:

- Model identifier
- Prompt text
- Generation parameters (`temperature`, `seed`, `max_tokens`, etc.)

The output hash covers the generated text, allowing collaborators to verify they obtained matching results.
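
A comparison between collaborators could look like the following sketch. `compare_hashes()` is a hypothetical helper, not part of localLLM; it assumes only that each hash object is a list with `input` and `output` elements, as in the printout above. The hash strings are short placeholders.

```{r}
# Hypothetical helper (not part of localLLM): compare two hash lists
compare_hashes <- function(mine, theirs) {
  c(
    same_configuration = identical(mine$input, theirs$input),
    same_output        = identical(mine$output, theirs$output)
  )
}

# Placeholder values standing in for attr(result, "hashes")
mine   <- list(input = "aaa111", output = "bbb222")
theirs <- list(input = "aaa111", output = "ccc333")

compare_hashes(mine, theirs)
```

A matching input hash with a differing output hash suggests that the same configuration produced different text, which is worth investigating before comparing results.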

### Hashes with explore()

For multi-model comparisons, `explore()` computes hashes per model:

```{r}
# `models` and `template_builder` are assumed to be defined earlier in your session
res <- explore(
  models = models,
  prompts = template_builder,
  hash = TRUE
)

# View hashes for each model
hash_df <- attr(res, "hashes")
print(hash_df)
```

```
#>   model_id                         input_hash                        output_hash
#> 1  gemma4b a3f2b8c9d4e5f6a7b8c9d0e1f2a3b4c5... b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9...
#> 2  llama3b c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0... d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1...
```

Set `hash = FALSE` to disable hash computation if not needed.
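
Two hash tables can be compared per model with a base-R join. This sketch assumes the column names shown in the printout above (`model_id`, `input_hash`, `output_hash`) and uses placeholder hash values rather than real output.

```{r}
# Placeholder hash tables with the columns shown above
mine <- data.frame(
  model_id    = c("gemma4b", "llama3b"),
  input_hash  = c("in-1", "in-2"),
  output_hash = c("out-1", "out-2"),
  stringsAsFactors = FALSE
)
theirs <- data.frame(
  model_id    = c("gemma4b", "llama3b"),
  input_hash  = c("in-1", "in-2"),
  output_hash = c("out-1", "out-X"),
  stringsAsFactors = FALSE
)

# Join on model_id and flag which models agree
cmp <- merge(mine, theirs, by = "model_id", suffixes = c("_mine", "_theirs"))
cmp$input_match  <- cmp$input_hash_mine  == cmp$input_hash_theirs
cmp$output_match <- cmp$output_hash_mine == cmp$output_hash_theirs

cmp[, c("model_id", "input_match", "output_match")]
```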

## Automatic Documentation

Use `document_start()` and `document_end()` to log what happens between the two calls. The log records:

- Timestamps
- Model metadata (paths, parameters)
- Summaries of function calls
- SHA-256 fingerprint of the entire run

```{r}
# Start documentation
document_start(path = "analysis-log.txt")

# Run your analysis
result1 <- quick_llama("Classify this text: 'Great product!'")
result2 <- explore(models = models, prompts = prompts)  # assumes `models` and `prompts` exist

# End documentation
document_end()
```

The log file serves as an audit trail of the session. The example below shows a log for a session with a single `quick_llama()` call:

```
localLLM Run Log
File: /path/to/analysis-log.txt
Started: 2025-01-15 14:30:22 EST
Ended: 2025-01-15 14:35:12 EST
Duration: 289.9 seconds

Events:
- [2025-01-15 14:30:22 EST] document_start
    {
      "package_version": "1.2.1",
      "r_version": "4.4.1",
      "platform": "aarch64-apple-darwin22.6.0",
      "os": "Darwin",
      "user": "researcher",
      "working_directory": "/home/user/analysis"
    }

- [2025-01-15 14:30:25 EST] quick_llama
    {
      "model": "Llama-3.2-3B-Instruct-Q5_K_M.gguf",
      "prompt_count": 1,
      "n_gpu_layers": 999,
      "n_ctx": 2048,
      "max_tokens": 100,
      "temperature": 0,
      "seed": 1234,
      "auto_format": true,
      "clean": false
    }

- [2025-01-15 14:30:25 EST] quick_llama_hash
    {
      "input_hash": "a3f2b8c9...",
      "output_hash": "b4c5d6e7..."
    }

- [2025-01-15 14:35:12 EST] document_end
    {
      "duration_seconds": 289.9,
      "total_events": 4
    }

Hash (SHA-256): e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2...
```
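
Because the log is plain text, it can be inspected with base R. The sketch below assumes that event lines follow the `- [timestamp] name` pattern from the example above; the regular expressions would need adjusting if the actual format differs.

```{r}
# A few lines in the format shown above
# (in practice: log_lines <- readLines("analysis-log.txt"))
log_lines <- c(
  "Events:",
  "- [2025-01-15 14:30:22 EST] document_start",
  "    {",
  "- [2025-01-15 14:30:25 EST] quick_llama",
  "- [2025-01-15 14:30:25 EST] quick_llama_hash",
  "- [2025-01-15 14:35:12 EST] document_end"
)

# Keep event lines and strip the "- [timestamp] " prefix
event_lines <- grep("^- \\[", log_lines, value = TRUE)
events <- sub("^- \\[[^]]*\\] ", "", event_lines)

# Count how often each event type occurs
table(events)
```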

## Best Practices for Reproducible Research

### 1. Always Set Seeds

Even with `temperature = 0`, explicitly setting seeds documents your intent:

```{r}
result <- quick_llama(
  "Analyze this text",
  temperature = 0,
  seed = 42  # Explicit for documentation
)
```

### 2. Log Your Environment

Record your setup at the start of the analysis:

```{r}
# Check hardware profile
hw <- hardware_profile()
print(hw)
```

```
#> $os
#> [1] "Darwin"
#>
#> $cpu_cores
#> [1] 10
#>
#> $ram_total
#> [1] 17179869184
#>
#> $gpu
#> $gpu$name
#> [1] "Apple M2 Pro"
```
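
Alongside the hardware profile, R's own session details are worth recording. This sketch uses only base R (`sessionInfo()` and `R.version.string`); the output file name and location are arbitrary choices for the example.

```{r}
# Capture a timestamp, the R version, and loaded packages as text
session_lines <- c(
  paste("Recorded:", format(Sys.time(), "%Y-%m-%d %H:%M:%S %Z")),
  R.version.string,
  capture.output(print(sessionInfo()))
)

# Write to a file; a temporary directory is used here for illustration,
# in a real project you would store it next to your analysis outputs
env_file <- file.path(tempdir(), "session-info.txt")
writeLines(session_lines, env_file)
```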

### 3. Use Document Functions for Audit Trails

Wrap your entire analysis in documentation calls:

```{r}
document_start(path = "my_analysis_log.txt")

# All your analysis code here
# ...

document_end()
```

### 4. Share Hashes for Verification

When publishing or sharing results, include the hashes so others can check them against their own runs:

```{r}
result <- quick_llama("Your prompt here", seed = 42)

# Report these in your paper/documentation
cat("Input hash:", attr(result, "hashes")$input, "\n")
cat("Output hash:", attr(result, "hashes")$output, "\n")
```

### 5. Version Control Your Models

Track which model versions you used:

```{r}
# List cached models with metadata
cached <- list_cached_models()
print(cached[, c("name", "size_bytes", "modified")])
```
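
File size and modification time alone do not show that two model files are identical; a checksum does. The sketch below uses base R's `tools::md5sum()` on a small stand-in file. Note that this is MD5 rather than SHA-256, and that mapping the rows of `list_cached_models()` to file paths is not covered here.

```{r}
# Stand-in for a model file; replace with the path to your .gguf file
model_path <- tempfile(fileext = ".gguf")
writeBin(as.raw(1:10), model_path)

# Record name, size, and checksum for the model file
model_record <- data.frame(
  file       = basename(model_path),
  size_bytes = file.size(model_path),
  md5        = unname(tools::md5sum(model_path)),
  stringsAsFactors = FALSE
)
print(model_record)
```

Storing such a record with your results lets collaborators confirm they are running the same model file.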

## Summary

| Feature | Function/Parameter | Purpose |
|---------|-------------------|---------|
| Deterministic output | `temperature = 0` (default) | Same input = same output |
| Seed control | `seed = 42` | Reproducible stochastic generation |
| Hash verification | `attr(result, "hashes")` | Verify identical configurations and matching outputs |
| Audit trails | `document_start()`/`document_end()` | Complete session logging |
| Hardware info | `hardware_profile()` | Record execution environment |

With these tools, you can make your LLM-based analyses reproducible and verifiable.
