Best Practices in R

2020-08-22

developed by Emil Hvitfeldt

Welcome!

Change Settings

Keyboard shortcut to open settings
⌘ + , in Mac OS,
ctrl + , in Windows

✓ - Uncheck "Restore .RData into workspace at startup"

✓ - Set "Save workspace to .RData on exit" to "Never"

Settings window

Change Appearance

RStudio themes

Fonts

Font Sizes

Editor Themes

Settings window

Pane layouts

Change the layout of the panes

Source on top?

Source down to the right?

It's all up to you!

Settings window

Some like having both the source and console panes open, while still leaving room for the viewer.


RStudio Projects

Keep all files from one project together. Use RStudio projects.

Self-contained

Project-oriented

Keep all the files associated with a project together: input data, R scripts, analytic results, figures.

usethis

usethis::create_project("project_name")

RStudio Projects - Creation

Click File > New Project

Or click the project menu in the upper right corner

Folder Structure

name_of_project
|--raw_data
    |--WhateverData.xlsx
    |--report_2017.csv
|--output_data
    |--summary2017.csv
|--rmd
    |--01-analysis.Rmd
|--docs
    |--01-analysis.html
    |--01-analysis.pdf
|--scripts
    |--exploratory_analysis.R
    |--pdf_scraper.R
|--figures
    |--weather_2017.png
|--name_of_project.Rproj
|--run_all.R

Raw data is kept separate from cleaned data
Reports and scripts are separated
Generated and imported figures have their own place
Files are numbered using 2 digits
Reusable and easily understandable

Folder Structure

library(fs)
folder_names <- c("raw_data", "output_data", "rmd", "docs", 
                  "scripts", "figures")
dir_create(folder_names)

Never modify raw data; only read it (forever untouched).

Paths

library(tidyverse)
# data import
data <- read_csv("/Users/Emil/Research/Health/amazing_data.csv")

## Error: '/Users/Emil/Research/Health/amazing_data.csv' does not exist.

Only use relative paths, never absolute paths
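
As a minimal sketch of what a relative path looks like, the snippet below builds one with base R's `file.path()`. It assumes the working directory is the project root, which is what RStudio sets when you open the project's `.Rproj` file. The helper `is_absolute()` is hypothetical and only a rough check written for this illustration.

```r
# Relative to the project root, so it works on any machine
# that has a copy of the project folder.
data_path <- file.path("raw_data", "amazing_data.csv")
data_path
#> [1] "raw_data/amazing_data.csv"

# Hypothetical helper (not from any package): a rough test for paths that
# start at the file system root, the home directory or a drive letter.
is_absolute <- function(path) grepl("^(/|~|[A-Za-z]:)", path)

is_absolute("/Users/Emil/Research/Health/amazing_data.csv")
#> [1] TRUE
is_absolute(data_path)
#> [1] FALSE
```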

Introducing the here package.

library(here)
here()

## [1] "/Users/Emil/Research/Health"

library(here)
data <- read_csv(here("raw_data", "amazing_data.csv"))

Naming Things

Organization
Ease of use

Naming Things - Files

NO

report.pdf
reportv2.pdf
reportthisisthelastone.pages
Figure 2.png 
3465-234szx.r
foo.R

YES

2018-10-01_01_report-for-cdc.pdf
01_data.rmd
01_data.pdf
02_data-filtering.rmd
02_data-filtering.pdf

See Jenny Bryan's "Naming things".

Avoid spaces, punctuation, special characters and case sensitivity
Deliberate use of delimiters
File name should describe the contents of the file
Put something numeric first
Left pad numbers with zeroes
Use the ISO 8601 standard for dates (YYYY-MM-DD)

Numbers and dates come first, zero-padded, to preserve chronological and logical ordering.
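
The naming rules can be applied in code as well as by hand. The sketch below uses only base R; `make_file_name()` is a hypothetical helper written for this illustration, not a function from any package.

```r
# Hypothetical helper: date first, then a zero-padded number, then a
# lowercase description with "-" inside parts and "_" between parts.
make_file_name <- function(date, number, description, ext) {
  slug <- gsub("[^a-z0-9]+", "-", tolower(description))
  slug <- gsub("^-|-$", "", slug)
  sprintf("%s_%02d_%s.%s",
          format(as.Date(date), "%Y-%m-%d"), number, slug, ext)
}

make_file_name("2018-10-01", 1, "Report for CDC", "pdf")
#> [1] "2018-10-01_01_report-for-cdc.pdf"

# Why left-padding matters: plain character sorting puts 10 before 2.
unpadded <- sprintf("%d_data.csv", c(1, 2, 10))
padded   <- sprintf("%02d_data.csv", c(1, 2, 10))

sort(unpadded, method = "radix")
#> [1] "10_data.csv" "1_data.csv"  "2_data.csv"
sort(padded, method = "radix")
#> [1] "01_data.csv" "02_data.csv" "10_data.csv"
```

`method = "radix"` is used so that the ordering does not depend on the locale of the machine running the example.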

Naming Things - Files

library(fs)
dir_ls("data/", regexp = "health-study")

## 2018-02-23_health-study_power-100_group-A1.csv
## 2018-02-23_health-study_power-100_group-B1.csv
## 2018-02-23_health-study_power-100_group-C1.csv
## 2018-02-23_health-study_power-200_group-A1.csv
## 2018-02-23_health-study_power-200_group-B1.csv
## 2018-02-23_health-study_power-200_group-C1.csv

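# x holds the file names without the leading folder
x <- path_file(dir_ls("data/", regexp = "health-study"))
stringr::str_split_fixed(x, "[_\\.]", 5)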

##      [,1]         [,2]           [,3]        [,4]       [,5] 
## [1,] "2018-02-23" "health-study" "power-100" "group-A1" "csv"
## [2,] "2018-02-23" "health-study" "power-100" "group-B1" "csv"
## [3,] "2018-02-23" "health-study" "power-100" "group-C1" "csv"
## [4,] "2018-02-23" "health-study" "power-200" "group-A1" "csv"
## [5,] "2018-02-23" "health-study" "power-200" "group-B1" "csv"
## [6,] "2018-02-23" "health-study" "power-200" "group-C1" "csv"
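
Because the delimiters are used consistently, the file names can be turned into a small metadata table. The following is a base R sketch of the same idea, using two of the file names above typed in by hand; the column names `date`, `study`, `power` and `group` are assumptions chosen for this illustration.

```r
files <- c(
  "2018-02-23_health-study_power-100_group-A1.csv",
  "2018-02-23_health-study_power-200_group-B1.csv"
)

# Drop the extension, then split on the "_" delimiter.
parts <- strsplit(tools::file_path_sans_ext(files), "_", fixed = TRUE)

# One row per file, one column per name component.
meta <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
names(meta) <- c("date", "study", "power", "group")

# The "-" delimiter separates a label from its value inside a component.
meta$date  <- as.Date(meta$date)
meta$power <- as.integer(sub("power-", "", meta$power, fixed = TRUE))
meta$group <- sub("group-", "", meta$group, fixed = TRUE)

meta
```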

Naming Things - Files

library(tidyverse)
map_df(dir_ls("data/", regexp = "health-study"), read_csv)
# or
dir_ls("data/", regexp = "health-study") %>%
  map_df(read_csv)
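
For readers without the tidyverse installed, here is a rough base R equivalent of reading every matching file and stacking the results. It is a sketch under stated assumptions: the two CSV files are made up and written to a temporary folder so the example runs anywhere, and `read_one()` is a hypothetical helper.

```r
# Create two small CSV files in a temporary folder to stand in for data/.
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, value = c(0.5, 0.7)),
          file.path(data_dir, "2018-02-23_health-study_power-100_group-A1.csv"),
          row.names = FALSE)
write.csv(data.frame(id = 3:4, value = c(0.9, 1.1)),
          file.path(data_dir, "2018-02-23_health-study_power-100_group-B1.csv"),
          row.names = FALSE)

# Consistent names make it easy to select the files with a pattern.
files <- list.files(data_dir, pattern = "health-study", full.names = TRUE)

# Hypothetical helper: read one file and record where its rows came from.
read_one <- function(path) {
  out <- read.csv(path)
  out$source_file <- basename(path)
  out
}

combined <- do.call(rbind, lapply(files, read_one))
nrow(combined)
#> [1] 4
```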

Naming Things - Objects

Only use lowercase letters, numbers, and _
Use names that are not jargony: weight instead of K
Use informative names

# Bad
df
e
tuningVar
# Good
health_data
error
tuning_var

lowercase letters + numbers = alpha-numeric characters (ish)

What To Avoid - attach()

Never use `attach()`

attach(mtcars)
mean(mpg)

## [1] 20.09062

Loads lots of names into the search path, ambiguous selections.

Try `with()` or `withr` instead
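
A minimal sketch of the `with()` alternative, using the same `mtcars` example. The column names are looked up inside the data frame for this one expression only, and nothing is added to the search path.

```r
# Evaluate the expression with the columns of mtcars in scope.
with(mtcars, mean(mpg))
#> [1] 20.09062

# Unlike after attach(mtcars), the data frame is not on the search path.
"mtcars" %in% search()
#> [1] FALSE
```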

What To Avoid - rm(list=ls())

Never use `rm(list=ls())`

Instead, restart the R session

`CTRL+SHIFT+F10` for Windows

`CMD+SHIFT+ALT+F10` for Mac OS
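
One reason for this advice is that `rm(list = ls())` only removes objects from the global environment. It does not give you a fresh session. The sketch below illustrates this with an option and an attached package; `stats4` is used only because it ships with R, and `leftover` is a made-up object name.

```r
# Things a script might do before "cleaning up".
options(digits = 4)
library(stats4)
leftover <- 1

rm(list = ls())

# The object is gone ...
exists("leftover")
#> [1] FALSE

# ... but changed options and attached packages are still there.
getOption("digits")
#> [1] 4
"package:stats4" %in% search()
#> [1] TRUE
```

Restarting the R session resets all three.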

R Markdown documents versus R scripts

You can use R scripts for simple self-contained tasks.

`source()` R scripts into your R Markdown document where you will do analyses, visualizations and reporting.

R Markdown

- 01-import.R
- 02-clean-names.R
- 03-tidy.R
- etc

Include at the start of R Markdown file

```{r load_scripts, include = FALSE}
library(here)
source(here("scripts", "01-import.R"))
source(here("scripts", "02-clean-names.R"))
source(here("scripts", "03-tidy.R"))
```


Naming Chunks

Names can be placed after the comma

```{r, chunk-label, results='hide', fig.height=4}
```

or before

```{r chunk-label, results='hide', fig.height=4}
```

"In general it is recommended to use alphabetic characters with words separated by - and avoid other characters." - Yihui Xie


Makes navigating the R Markdown document easier
Makes your R Markdown easier to understand
Clarifies error reports or progress of knitting
Keeps caches valid when chunks are moved around

The menu in the lower left corner of the RStudio source pane lets you jump between sections and chunks.

Caching of unnamed chunks is based on their numbering, so moving them invalidates the cache.

Setup Chunk

In a fresh R Markdown document you see this

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

The setup chunk is run before any other code. Use this to your advantage.

Setting figure path

```{r setup, include=FALSE}
knitr::opts_chunk$set(fig.path = "figures/")
```

fig.path: ('figure/'; character) prefix to be used for figure filenames (fig.path and chunk labels are concatenated to make filenames)
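
To make the concatenation concrete, the sketch below mimics how a figure file name is put together from `fig.path` and the chunk label. `figure_file()` is a hypothetical helper written for this illustration, not a knitr function, and it assumes the default PNG output and knitr's usual pattern of appending the plot number within the chunk.

```r
# Hypothetical helper: prefix + chunk label + "-" + plot number + extension.
figure_file <- function(fig_path, chunk_label, index = 1, ext = "png") {
  paste0(fig_path, chunk_label, "-", index, ".", ext)
}

# First plot of a chunk labelled "weather-2017" with fig.path = "figures/".
figure_file("figures/", "weather-2017")
#> [1] "figures/weather-2017-1.png"

# An unnamed chunk gets a generated label, which says nothing about the plot.
figure_file("figures/", "unnamed-chunk-3")
#> [1] "figures/unnamed-chunk-3-1.png"
```

This is another reason to name chunks: the label ends up in the figure file name.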

Styling Code

Use consistent style when writing code

http://style.tidyverse.org/

All about preferences but keep it consistent!!!

Use the styler package to style your code for you
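
As a small illustration of what consistent style buys you, here is the same function written twice by hand. `rescale()` and `rescale_styled()` are made-up examples; the second follows the spacing and assignment conventions of the tidyverse style guide, which is the kind of formatting styler is designed to apply automatically.

```r
# Inconsistent: "=" for assignment, no spaces, everything on one line.
rescale=function(x){(x-min(x))/(max(x)-min(x))}

# Consistent: "<-" for assignment, spaces around operators, body indented.
rescale_styled <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Both behave identically; only readability differs.
identical(rescale(c(1, 2, 3)), rescale_styled(c(1, 2, 3)))
#> [1] TRUE
```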

Keep .Rprofile Clean

Your computer contains a file called .Rprofile.

This file runs first in every session. Think of it as a configuration file.

options(stringsAsFactors = FALSE)
options(max.print = 100)

Keep .Rprofile Clean

Only put interactive code in it

Yes

# add this with usethis::use_usethis()
library(usethis)

No

library(tidyverse)

Use it to change options and load packages for interactive use
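
A minimal sketch of an `.Rprofile` that follows this rule, assuming you want a shorter print limit and usethis available at the console. Everything sits inside `if (interactive())`, so scripts run with `Rscript` or during knitting never depend on it. Loading an analysis package such as tidyverse here would hide a dependency that your scripts should declare themselves.

```r
# Sketch of a .Rprofile: convenience settings for console work only.
if (interactive()) {
  # Affects how much is printed, not what your code computes.
  options(max.print = 100)

  # Development helper; skipped quietly if usethis is not installed.
  if (requireNamespace("usethis", quietly = TRUE)) {
    suppressMessages(library(usethis))
  }
}
```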

Comment Your Code

Functions: Arguments and purpose

Code: What or why, NOT how

# Takes a data.frame (data) and a character vector of column names
# (names), and converts the named columns from factor variables to
# character variables. Leaves character variables unchanged.
factor_to_text <- function(data, names) {
  for (i in seq_along(names)) {
    if (is.factor(data[, names[i], drop = TRUE])) {
      data[, names[i]] <- as.character(data[, names[i], drop = TRUE])
    }
  }
  data
}
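
One common convention for documenting arguments and purpose is roxygen2-style comments. The sketch below applies it to a similar function and adds a usage example; `factors_to_character()` is a hypothetical rewrite for illustration, not the author's function, and the inline comment explains why rather than how.

```r
#' Convert factor columns to character columns
#'
#' @param data A data.frame.
#' @param names Character vector of column names to check and convert.
#' @return `data`, with the named factor columns converted to character.
factors_to_character <- function(data, names) {
  # Only factors are touched, so columns that are already character
  # (or numeric) pass through unchanged.
  is_fct <- vapply(data[names], is.factor, logical(1))
  if (any(is_fct)) {
    data[names[is_fct]] <- lapply(data[names[is_fct]], as.character)
  }
  data
}

converted <- factors_to_character(iris, c("Species", "Sepal.Length"))
class(converted$Species)
#> [1] "character"
class(converted$Sepal.Length)
#> [1] "numeric"
```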

Updating R and RStudio

The most recent version of R can be downloaded from The Comprehensive R Archive Network (CRAN)

Download the most recent version of RStudio at their downloads page

How to ask for help (`datapasta` and `reprex`)

The `reprex` package helps you create a reproducible example

`datapasta` lets you easily copy and paste small samples of data into RStudio

How to ask for help (`reprex`)

Check out the package website and RStudio webinar on creating reproducible examples

Art by Allison Horst

Where to get help

RStudio has a helpful community if you have questions (everyone does!)

RStudio Community:

RStudio has a dedicated forum for questions related to R and RStudio: https://community.rstudio.com/

Where else to get help

Stack Overflow

Check out the questions tagged r on Stack Overflow: https://stackoverflow.com/questions/tagged/r

`#rstats` on Twitter

If you have a Twitter account, check out #rstats: https://twitter.com/hashtag/rstats

Art by Allison Horst
