---
title: "Demo"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Demo}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(pureseqtmr)
library(testthat)
library(knitr)
```

`pureseqtmr` is a package to call PureseqTM from R.
PureseqTM predicts the topology of a membrane protein, where the topology can
be either inside, or not inside the membrane.

To be able to call PureseqTM, it needs to be installed:

```
pureseqtmr::install_pureseqtm()
```

Note that this code is not actually run, to comply with CRAN guidelines.

PureseqTM supplies some example files. Use `get_example_filenames` to
get the path to all these files:

```{r}
if (is_pureseqtm_installed()) {
  get_example_filenames()
}
```

In this example, `1bhaA.fasta` will be used. To obtain the full path,
use `get_example_filename`. `get_example_filename` will give an
error if the file is not found.

```{r}
if (is_pureseqtm_installed()) {
  fasta_filename <- get_example_filename("1bhaA.fasta")
  head(readLines(fasta_filename))
}
```

Getting the topology of this protein:

```{r}
if (is_pureseqtm_installed()) {
  topology <- predict_topology(fasta_filename)
  kable(topology)
}
```

Or show the topology as a plot:

```{r fig.width=7}
if (is_pureseqtm_installed()) {
  plot_topology(topology)
}
```

One needs the exact same code for a full proteome.
Here we use a `pureseqtmr` example file,
which is the COVID-19 reference proteome,
as downloaded from `https://www.uniprot.org/proteomes/UP000464024`.

```{r}
fasta_filename <- system.file(
  "extdata",
  "UP000464024.fasta",
  package = "pureseqtmr"
)
expect_true(file.exists(fasta_filename))
```
Show the (top of the) proteome:

```{r}
head(readLines(fasta_filename))
```

Getting the topology of this protein:

```{r}
if (is_pureseqtm_installed()) {
  topology <- predict_topology(fasta_filename)
}
```

Instead of directly showing the raw data, the protein names
are shortened first:

```{r}
if (is_pureseqtm_installed()) {
  topology$name <- stringr::str_match(
    string = topology$name,
    pattern = "..\\|.*\\|(.*)_SARS2"
  )[, 2]
}
```

Show the topology as a plot:

```{r fig.width=7}
if (is_pureseqtm_installed()) {
  plot_topology(topology)
}
```

And tally the number of transmembrane helices per protein:

```{r}
if (is_pureseqtm_installed()) {
  kable(tally_tmhs(topology))
}
```

