class: center, middle, inverse, title-slide # Writing reproducible examples to get ‘good help’ ### Katie Jolly @ RLadies Seattle katiejolly.io/rladies-reprex-2020 ### 2020-03-24 --- ## Reprex [Life] Kit <img src="images/reprex_kit.png" width="1600" /> --- ## Reproducible examples Reproducible examples [allow someone else to recreate your problem by just copying and pasting R code](http://adv-r.had.co.nz/Reproducibility.html) <br> <br> .center[  ] <br> <br> But they can be tedious to write. The {reprex} package helps with that! --- class: middle ## A reproducible example ``` r seq(1, 10) # I want this to count by 2 #> [1] 1 2 3 4 5 6 7 8 9 10 ``` <sup>Created on 2020-03-19 by the [reprex package](https://reprex.tidyverse.org) (v0.3.0)</sup> --- ## Writing reproducible examples Let's say you want a reproducible example for generating a sequence of numbers that counts by 2, but can only make it count by 1 currently. There are a few ways to do this: \#1 **Copy** your code to your clipboard and then run `reprex()` \#2 **Wrap** your code with `reprex()` ```r reprex(seq(1, 10)) ``` \#3 **Wrap** an input filepath with `reprex()` ```r reprex(input = "my_reprex.R") ``` \#4 Use the reprex **RStudio Addin** You'll then have a **reprex** in your clipboard to paste! Since `reprex()` is meant to be run interactively, avoid leaving it in your code. --- ## \#1: Write Comments liberally In order to leave descriptive comments throughout, it's often easier to write multi-line code. ```r reprex({ # <- notice the {} enclosing the code # I want to count by 2 seq(1, 10) # But I can only figure out how to count by 1 }) ``` ``` r # I want to count by 2 seq(1, 10) #> [1] 1 2 3 4 5 6 7 8 9 10 # But I can only figure out how to count by 1 ``` This also helps readers spot your problem and see your thought process. It can also greatly help you resolve your own question! --- ## \#2: Use data that others can use The general advice is to make your data as **boring** as possible. This means try to use data that is readily available like `mtcars` or `iris`. But let's say you have a more specific question. You might post this code on StackOverflow and ask: Why are all my averages NA? ```r df %>% group_by(person) %>% summarise(average = mean(total)) ``` ``` ## # A tibble: 3 x 2 ## person average ## <chr> <dbl> ## 1 A NA ## 2 B NA ## 3 C NA ``` But, no one else can run your code to see where things go wrong :( --- Without more context, it's really tough to see the problem. With a combination of {datapasta} & {reprex} we can write a much better question! In general we want the smallest dataset possible, so I like to use `head()` to get just a few rows. ```r library(datapasta) x <- tribble_construct(head(df)) # writes code to reconstruct table cat(x) ``` ``` ## tibble::tribble( ## ~person, ~date, ~total, ## "A", "2020-01-01", "1", ## "B", "2020-01-01", "2", ## "C", "2020-01-01", "3", ## "A", "2020-02-01", "2", ## "B", "2020-02-01", "1", ## "C", "2020-02-01", "2" ## ) ``` --- Then we can post a better question with the data included: ``` r library(dplyr) df <- tibble::tribble( ~person, ~date, ~total, "A", "2020-01-01", "3", "B", "2020-01-01", "3", "C", "2020-01-01", "1", "A", "2020-02-01", "1", "B", "2020-02-01", "3", "C", "2020-02-01", "1" ) df %>% group_by(person) %>% summarise(average = mean(total)) # I expected this to give 1 for C, for example. But it's NA #> # A tibble: 3 x 2 #> person average #> <chr> <dbl> #> 1 A NA #> 2 B NA #> 3 C NA ``` And someone will be able to help you see that your numbers are actually characters, thus giving NA! Much easier than the previous guesswork. --- ## \#3 Shorter code is better & clean up after yourself Maybe later on you want to plot your data and change the colors. But we don't need to know that to answer the wrangling question! <code class ='r hljs remark-code'>library(tidyverse)<br><span style='background-color:pink'>library(colorspace)</span><br><span style='background-color:pink'>library(ggthemes)</span><br><br>df %>%<br> group_by(person) %>%<br> summarise(average = mean(total)) # I expected this to give 1 for C, for example. But it's NA</code> And if you write a file, delete it after! But do **not** delete anything you didn't write. ```r ... write_file(df, "output.csv") file.remove("output.csv") ``` --- ## \#4 `sessionInfo` is a helpful look at your environment If you have old/out-of-date packages, this is easiest to see with your session information. It's a summary of your R version, packages, and their versions. There are two ways to include this information: ```r # argument to reprex() reprex(..., si = TRUE) # OR # manually include the function reprex({ ... sessionInfo() }) ``` --- ## \#5 Test your question in a fresh R session before posting it Often you'll solve your own problem by writing a **good** question! You may find that you had overwritten an object or another somewhat-invisible error. To make sure that the code will run on someone else's machine, make sure to test it in a **NEW** R session. Don't just clear your environment! This is not the same thing. You can do this by clicking `Session` -> `Restart R` Or with the keyboard shortcut `ctrl[cmd]+shift+F10` <br> .center[ <img src="https://www.stickpng.com/assets/images/585e4831cb11b227491c338e.png" width="150" /> ] --- class: center, middle <img src="images/reprex_kit.png" width="1600" /> --- class: middle ## 5 Main Takeaways \#1: Write comments liberally \#2: Use data that others can use \#3: Shorter code is better & clean up after yourself \#4: `sessionInfo()` is a helpful look at your environment \#5: Test your question in a fresh R session before posting it --- ## Where to find more information? * [Reprex do's and don'ts](https://reprex.tidyverse.org/articles/reprex-dos-and-donts.html) (post) * [How to use reprex](https://reprex.tidyverse.org/articles/articles/learn-reprex.html) (video + slides) * [Using datapasta with reprex](https://reprex.tidyverse.org/articles/articles/datapasta-reprex.html) (post) * [What's a reproducible example ('reprex') and how do I do one](https://community.rstudio.com/t/faq-whats-a-reproducible-example-reprex-and-how-do-i-do-one/5219/3) (forum) * [How to ask questions so they get answered! Possibly by yourself! rOpenSci Community Call v13](https://ropensci.org/commcalls/2017-03-07/) (video)