---
title: "Meta-analyzing correlations"
author: "Brenton Wiernik"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Meta-analyzing correlations}
  %\VignetteEngine{knitr::rmarkdown}
  %\usepackage[utf8]{inputenc}
csl: "apa.csl"
bibliography: vignette.yaml
---

**Note: This vignette is a work in progress.** 

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(psychmeta)
options(width = 160)
```

This vignette will walk you through estimating barebones meta-analyses of
correlations between multiple constructs.
For more vignettes, see the [**psychmeta** overview](overview.html).

## Getting Started

To begin, you will need your meta-analytic data sheet for analysis. We recommend
the **rio** package for importing data to **R**. For an introduction to **rio**, see
`vignette("rio", "rio")`.

**psychmeta** assumes that your data are in "long" format, with each row corresponding
to one effect size. For example, this is the format used in this data frame:

```{r}
knitr::kable(data_r_meas_multi[1:10,])
```

In this table, 
  - **`sample_id`** contains labels indicating the sample each effect size is drawn from; 
  - **`moderator`** is a moderator variable, each row containing the effect size's level for that moderator; 
  - **`x_name`** and **`y_name`** are columns indicating the variables/constructs being related in the effect size; 
  - **`n`** is the sample size for the effect size; 
  - **`rxyi`** is the effect size (the correlation between the two constructs/variables); 
  - **`rxxi`** and **`ryyi`** are the sample reliability values for the measures of the `x_name` and `y_name` variables, respectively; 
  - **`citekey`** contains the citations keys for each study (used to generate bibliographies of included studies).

You can see this data set includes correlations among three variables: `X`, `Y`,
and `Z`, and that each sample contributes several effect sizes, one each for
for different pairs of variables/constructs.

If your data are in a different format, you can use the `reshape_wide2long()`
function to reshape it.

## Estimating a Barebones Meta-Analysis

Let's assume your data frame is called `coding_sheet`.

```{r}
coding_sheet <- data_r_meas_multi

head(coding_sheet)
```

The primary function to conduct meta-analyses of correlations is `ma_r()`. To 
conduct barebones meta-analyses, run:

```{r}
ma_res <- ma_r(rxyi = rxyi, 
               n = n, 
               construct_x = x_name,
               construct_y = y_name,
               sample_id = sample_id, 
               moderators = moderator,
               data = coding_sheet
               )
```

- **`data`** is your data frame. 
- **`rxyi`**, **`n`**, **`construct_x`**, **`construct_y`**, **`sample_id`**, 
  and **`moderators`** are the names of the columns in your data frame that 
  contain the appropriate values. 
  - **`rxyi`** is the correlation effect sizes; 
  - **`n`** is the sample sizes; 
  - **`construct_x`** and **`construct_y`** are the labels for the variables/constructs 
    being correlated; 
  - **`sample_id`** is the sample identification labels; 
  - **`moderators`** is a vector of moderator variable names for the meta-analyses. 
  - Column names can be provided either with quotes (e.g., `"rxyi"`, `"n"`) or 
    without (e.g., `rxyi`, `n`).

To conduct a barebones meta-analysis, at minimum, `n` and `rxyi` are needed.

### Modeling Options

- By default, correlations are weighted by sample size. You can specify alternative
  weights using the `wt_type` argument.

- Random-effects variance (τ^2^ or *SD~res~*^2^) is estimated using the 
  Hunter-Schmidt estimator, computed using the unbiased sample variance estimator 
  (i.e., dividing by $k-1$ rather than $k$). To use the maximum-likelihood estimator
  instead, specify `var_unbiased = FALSE`.

- Barebones results are corrected for the small-sample bias in the correlation
  coefficient. To disable this correction, specify `correct_bias = FALSE`.

- By default, confidence and credibility intervals are constructed using a _t_
  distribution with $k-1$ degrees of freedom. To use a normal distribution instead,
  specify, `conf_method = "norm"` and `cred_method = "norm"`. To customize the
  coverage levels for these intervals, use the `conf_level` and `cred_level` arguments.

## The **psychmeta** Meta-Analysis Object

A **psychmeta** meta-analsyis object is a data frame, with each row being a meta-analysis 
or subanalysis and each column containing information about or results from that 
analysis. For example, the results of the analysis above look like this:

```{r}
ma_res
```

Each row corresponds to a different pair of variables/constructs (X-Y; X-Z; Y-Z)
and level of the moderator variable (overall/all levels pooled together; 
moderator = 1; moderator = 2).

- **`analysis_id`** is a numeric label for each analysis;
- **`pair_id`** is a numeric label for each pair of variables/constructs (X-Y; 
  X-Z; Y-Z); 
- **`construct_x`** and **`construct_y`** indicate which variables/constructs 
  are being meta-analyzed.

- **`analysis_type`** indicates the type of analysis. 
  - "Overall" means an overall meta-analysis, pooling across all moderator levels. 
  - "Simple Moderator" means a subgroup moderator analysis of only studies with 
    the specified levels of the moderator variable(s) in the next column(s).

    (See below for how to conduct meta-analyses with multiple moderator variables
    or with continuous moderators)

- **`meta_tables`** contains the principal meta-analysis results tables.
- **`escalc`** contains tables of effect sizes, sampling error variances, weights, 
  residuals, and other data. These tables can be used for follow-up analyses or
  with the **metafor** package for additional meta-analysis techniques.

## Viewing Results Summaries

To view meta-anlaysis results tables, use the `summary()` function:

```{r}
summary(ma_res)
```

In this table, 

- **`analysis_id`**, **`pair_id`**, **`construct_x`**, **`construct_y`**, and the
  moderator columns are defined as above.
- **`k`** is the number of effect sizes contributing to each meta-analysis. `N` is the
  total sample size contributing to each meta-analysis.
- **`mean_r`** is the weighted mean correlation.
- **`sd_r`** is the weighted observed standard deviation of correlations. 
- **`se_r`** is the standard error of `mean_r`. 
- **`sd_res`** is the estimated random-effects standard deviation (residual _SD_ 
  of correlations after accounting for sampling error). 
- **`CI_LL_95`** and **`CI_UL_95`** are the upper and lower bounds of the confidence 
  interval for `mean_r`; the number indicates the coverage level (default: 95%). 
- **`CR_LL_80`** and **`CR_UL_80`** are the upper and lower bounds of the credibility 
  interval for the estimated population distribution; the number indicates the 
  coverage level (default: 80%).

To view additional results, such as observed variance (`var_r`) or standard
deviation of sampling errors (`sd_e`), use the `get_metatab()` function and select
the appropriate columns:

```{r}
names(get_metatab(ma_res))

get_metatab(ma_res)$var_r
```

To view all columns of this table, convert it to a `data.frame` or `tibble`:

```{r}
dplyr::as_tibble(get_metatab(ma_res))
```

```{r}
as.data.frame(get_metatab(ma_res))
```

### Moderator Analyses

Results for subgroup analyses for different levels of categorical moderators are
shown in the rows of the meta-analysis results table. To estimate confidence
intervals for differences between levels or an omnibus ANOVA statistic, use the
`anova()` function:

```{r}
anova(ma_res)
```

### Correcting for Statistical Artifacts

See Artifact corrections](artifact_corrections.html).

## Outputting Results

To output the main meta-analysis results table to RMarkdown, Word, HTML, PDF, or 
other formats, use the `metabulate()` function. For example, to output the above 
results to a Word document, run:

```{r, eval=FALSE}
metabulate(ma_res, file = "meta-analysis_results.docx", output_format = "word")
```

## Follow-Up Analyses

### Plotting

You can add plots for each meta-analysis in `ma_res` using the `plot_forest()`
and `plot_funnel()` functions:

```{r}
ma_res <- plot_funnel(ma_res)
ma_res <- plot_forest(ma_res)
```

You can view these plots using the `get_plots()` function. This will return a
list of all of the plots in this results. Specify which meta-analysis you want 
to view plots for by passing its `analysis_id` to `[[`:

```{r, fig.show='hold'}
get_plots(ma_res)[["forest"]][[2]]
get_plots(ma_res)[["funnel"]][[2]]
```

For forest plots, if you select an "Overall" meta-analysis, it will include 
plots faceted by moderator levels (`"moderated"`) and not (`"unmoderated"`):

```{r, fig.show='hold'}
get_plots(ma_res)[["forest"]][[1]][["moderated"]][["barebones"]]
get_plots(ma_res)[["forest"]][[1]][["unmoderated"]][["barebones"]]
```

### Heterogeneity Analyses

**psychmeta** reports the random-effects standard deviaton (τ or *SD_res_*) 
and credibility intervals (`mean_r` ± _crit_ × *SD~res~*) in the main 
meta-analaysis results tables. To view confidence intervals for *SD_res_* or 
additional heterogeneity statistics, use the `heterogeneity()` function:

```{r}
ma_res <- heterogeneity(ma_res)
get_heterogeneity(ma_res)[[1]][["barebones"]]
```

### Publication Bias and Sensitivity Analyses

**psychmeta** supports cumulative meta-analysis for publication/small-sample bias
detection, leave-1-out sensitivity analyses, and bootstrap confidence intervals
using the sensitivity function:

```{r, fig.show='hold'}
ma_res <- sensitivity(ma_res)
get_cumulative(ma_res)[[1]][["barebones"]]
get_cumulative(ma_res)[[1]][["barebones"]][["plots"]]
```

```{r}
get_bootstrap(ma_res)[[1]][["barebones"]]
```
