ETC5523: Communicating with Data

Tutorial 1

Author

Michael Lydeamore

Published

January 1, 2001

🎯 Objectives

  • understand what elements make certain questions harder or easier to answer
  • classify types of communication
  • modify communication to be more effective
  • Install the R-packages below:
install.packages(c("rmarkdown", "tidyverse", "reprex"))
  • Download the daily maximum temperature from the Australian Government, Bureau of Meteorology, at the Melbourne Airport weather station here. You can find the note about this data here

  • You can similarly download the daily minimum temperature at the Melbourne Airport weather station from here and the note about this data here

👥 Exercise 1A

Asking questions

Study each of the examples from Stack Overflow and write the good points of their communication and where applicable, how you would improve communicating the author’s problem.

  1. Problem rendering a list of multiple object types in Rmarkdown
  2. How to paste variable to all values in column
  3. Convert only certain rows from long to wide in R
  4. How to transform 3 seperate numeric variables into 3 levels of one variable?
  5. Looping variables in the parameters of the YAML header of an R Markdown file and automatically outputting a PDF for each variable
  1. Problem rendering a list of multiple object types in Rmarkdown
    • Notice that there is even a comment talking about producing reproducible examples. If you followed the comment, you’d have found a whole discussion about how to make a great R reproducible example.
    • This question was edited later by the author and now inludes a reproducible example. The author makes it quite clear what they want for the output as well by posting the image of what they want.
    • This edited question does fairly well in communicating their problem. The only other thing that may be helpful is if the author also included the session information.
    • The question showed that the author may not know how to access elements of a list. To access the first element of the list, the author should have used lists[[1]] instead of lists[1].
    • The code below does what the author wants.
Code for author
library(highcharter)
library(kableExtra)
gen_list <- function(x) {
    df_x <- data.frame(x_ = x, y = 5, z = 6)

    obj_1 <- df_x %>%
        knitr::kable(caption = "test") %>%
        kableExtra::kable_styling(bootstrap_options = "striped", full_width = F)

    obj_2 <- highcharter::hchart(df_x, "line", highcharter::hcaes(x = x_, y = y))

    list(obj_1, obj_2)[[1]]
}

lists <- lapply(c(1, 2, 3), gen_list)

lists[[1]]
  1. How to paste variable to all values in column
    • The author makes it easy to see their problem and the expected solution.
    • The author provides data however it is not in a form that can be easily used by other people.
    • The author shares what they have tried and didn’t work for them.
    • The package loaded or session information is not shared.
    • Their example is not a minimum reproducible example.
  2. Convert only certain rows from long to wide in R
    • This question is well written with data provided as code that others can copy-and-paste, the problem is easy to see aided by the depictions of the expected outputs.
    • Perhaps because it is well written, the question is answered soon after it is posted.
  3. How to transform 3 seperate numeric variables into 3 levels of one variable?
    • The problem is hard to understand and there are no code provided when the question is technical part of R programming.
    • Perhaps because the problem is poorly phrased, the post has received a negative vote and a moderator has closed the question.
  4. Looping variables in the parameters of the YAML header of an R Markdown file and automatically outputting a PDF for each variable
    • The question is extensively written describing the author’s problem.
    • Even though the problem is extensively written, the writing is not concise. How committed are the stackoverflow fairies to read through this long question? You need to consider your audience each time.
    • Reading carefully you can see what the author may want. The author is in fact just looking to iterate over row like below code.
for (i in 1:nrow(positions)) {
  render_function(positions$position[i], positions$company[i])
}

💻 Exercise 1B

Reproducible example

Generate a reproducible example to answer the questions 2 from 1A using the reprex R-package.

How to paste variable to all values in column

library(tidyve#| rse)
df <- tribble(
    ~page, ~word,
    97, "hello",
    3, "good",
    7, "age",
    20, "morning",
    100, "nice",
    4, "detail",
    30, "assessment"
)
prefix <- "abc"
df <- df %>%
    mutate(new_column = paste(prefix, df$word, sep = "", collapse = NULL))

Created on 2021-07-25 by the reprex package (v2.0.0)

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.1 (2020-06-06)
#>  os       macOS  10.16                
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_AU.UTF-8                 
#>  ctype    en_AU.UTF-8                 
#>  tz       Australia/Melbourne         
#>  date     2022-07-20                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date       lib source        
#>  assertthat    0.2.1   2019-03-21 [2] CRAN (R 4.0.0)
#>  backports     1.2.1   2020-12-09 [1] CRAN (R 4.0.2)
#>  broom         0.7.6   2021-04-05 [1] CRAN (R 4.0.2)
#>  cellranger    1.1.0   2016-07-27 [2] CRAN (R 4.0.0)
#>  cli           2.5.0   2021-04-26 [1] CRAN (R 4.0.2)
#>  colorspace    2.0-1   2021-05-04 [1] CRAN (R 4.0.2)
#>  crayon        1.4.1   2021-02-08 [1] CRAN (R 4.0.2)
#>  DBI           1.1.1   2021-01-15 [1] CRAN (R 4.0.2)
#>  dbplyr        2.1.1   2021-04-06 [1] CRAN (R 4.0.2)
#>  digest        0.6.27  2020-10-24 [1] CRAN (R 4.0.2)
#>  dplyr       * 1.0.6   2021-05-05 [1] CRAN (R 4.0.2)
#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.0.2)
#>  evaluate      0.14    2019-05-28 [2] CRAN (R 4.0.0)
#>  fansi         0.4.2   2021-01-15 [2] CRAN (R 4.0.2)
#>  forcats     * 0.5.1   2021-01-27 [1] CRAN (R 4.0.2)
#>  fs            1.5.0   2020-07-31 [1] CRAN (R 4.0.2)
#>  generics      0.1.0   2020-10-31 [2] CRAN (R 4.0.2)
#>  ggplot2     * 3.3.3   2020-12-30 [1] CRAN (R 4.0.1)
#>  glue          1.4.2   2020-08-27 [1] CRAN (R 4.0.2)
#>  gtable        0.3.0   2019-03-25 [2] CRAN (R 4.0.0)
#>  haven         2.4.1   2021-04-23 [2] CRAN (R 4.0.2)
#>  highr         0.9     2021-04-16 [2] CRAN (R 4.0.2)
#>  hms           1.0.0   2021-01-13 [2] CRAN (R 4.0.2)
#>  htmltools     0.5.1.1 2021-01-22 [1] CRAN (R 4.0.2)
#>  httr          1.4.2   2020-07-20 [1] CRAN (R 4.0.2)
#>  jsonlite      1.7.2   2020-12-09 [1] CRAN (R 4.0.2)
#>  knitr         1.33    2021-04-24 [1] CRAN (R 4.0.2)
#>  lifecycle     1.0.0   2021-02-15 [1] CRAN (R 4.0.2)
#>  lubridate     1.7.10  2021-02-26 [1] CRAN (R 4.0.2)
#>  magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.0.2)
#>  modelr        0.1.8   2020-05-19 [2] CRAN (R 4.0.0)
#>  munsell       0.5.0   2018-06-12 [2] CRAN (R 4.0.0)
#>  pillar        1.6.0   2021-04-13 [1] CRAN (R 4.0.2)
#>  pkgconfig     2.0.3   2019-09-22 [2] CRAN (R 4.0.0)
#>  purrr       * 0.3.4   2020-04-17 [2] CRAN (R 4.0.0)
#>  R6            2.5.0   2020-10-28 [1] CRAN (R 4.0.2)
#>  Rcpp          1.0.6   2021-01-15 [1] CRAN (R 4.0.2)
#>  readr       * 1.4.0   2020-10-05 [2] CRAN (R 4.0.2)
#>  readxl        1.3.1   2019-03-13 [2] CRAN (R 4.0.0)
#>  reprex        2.0.0   2021-04-02 [1] CRAN (R 4.0.2)
#>  rlang         0.4.11  2021-04-30 [1] CRAN (R 4.0.2)
#>  rmarkdown     2.8     2021-05-07 [1] CRAN (R 4.0.2)
#>  rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.0.1)
#>  rvest         1.0.0   2021-03-09 [1] CRAN (R 4.0.2)
#>  scales        1.1.1   2020-05-11 [2] CRAN (R 4.0.0)
#>  sessioninfo   1.1.1   2018-11-05 [2] CRAN (R 4.0.0)
#>  stringi       1.6.2   2021-05-17 [1] CRAN (R 4.0.2)
#>  stringr     * 1.4.0   2019-02-10 [2] CRAN (R 4.0.0)
#>  styler        1.4.1   2021-03-30 [1] CRAN (R 4.0.2)
#>  tibble      * 3.1.1   2021-04-18 [1] CRAN (R 4.0.2)
#>  tidyr       * 1.1.3   2021-03-03 [1] CRAN (R 4.0.2)
#>  tidyselect    1.1.1   2021-04-30 [1] CRAN (R 4.0.2)
#>  tidyverse   * 1.3.1   2021-04-15 [1] CRAN (R 4.0.2)
#>  utf8          1.2.1   2021-03-12 [1] CRAN (R 4.0.2)
#>  vctrs         0.3.8   2021-04-29 [1] CRAN (R 4.0.2)
#>  withr         2.4.2   2021-04-18 [1] CRAN (R 4.0.2)
#>  xfun          0.23    2021-05-15 [1] CRAN (R 4.0.2)
#>  xml2          1.3.2   2020-04-23 [2] CRAN (R 4.0.0)
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.0.2)
#> 
#> [1] /Users/etan0038/Library/R/4.0/library
#> [2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

👥 Exercise 1C

Join the GitHub Classroom

  1. Go to your Moodle course page and find the invite link to the GitHub Classroom.
  2. Make sure to select your name from identifier! (Let the teaching team know if you cannot find it)
  3. Join the GitHub Classroom before the end of Tue 2nd August 11.55pm.

👥️ Exercise 1D

Classify the types of communication

For the following scenarios, classify the type of communication.

  1. In the news: “Coldest morning in four years. This comes as several more Melbourne suburbs recorded their coldest morning since 2018. Coldstream – a township in the Greater Melbourne area won the unenviable title recording the coldest morning, with residents waking up to -3.7C temperatures.”
  2. In someone’s code: “# FIXME: use sum instead of +”
  3. In a conversation between person A and person B:
    A: Based on the sales of the last round, we have to change the promotion of the products.
    B: How do we need to change it?
    A: Well the new brand of the skin care range didn’t do well.
    B: I thought market research suggested that was what the consumers wanted?
    A: Yes, I think we didn’t promote it enough.

Reflect on your own communication style for intrapersonal, interpersonal, small group, public and mass communications. How does it differ?

  1. Mass
  2. Intrapersonal
  3. Interpersonal
  • Public or mass communications don’t easily allow for transactional exchanges so information presented must be more polished or complete.
  • In mass communication, you are likely “competing” for attention so you want to make the main message stand out.
  • Public communications don’t allow for individual feedbacks but you can encourage participation by asking questions with simple responses (e.g. yes and no).
  • In a small group, you can utilise the dynamics of the situation and relations in the group to tailor your message.
  • In an interpersonal communication, your communication can be more intimate so you can have go into more depth and details about the data.
  • Intrapersonal communication can be unpolished and there is less concern for damaging relationships – it offers an opportunity to make notes for your future self, self monitor and reflect

👥️ Exercise 1E

Effective communciation

This year’s winter in Melbourne is particularly cold. Using the daily maximum and/or minimum temperature data from the Bureau of Meteorology at the Melbourne Airport, construct a narrative to demonstrate this for 3 possible audiences below:

  1. Small group
  2. Public
  3. Mass

What elements of your communciation differ for the three different scenarios above?

Note: there is no one right answer for this. There are number of ways you can communicate a story about this year’s winter being particularly cold.

For example, in a small group, you may wish to show an interactive plot of the time series of daily minimum and maximum temperature grouped by year (like in Plot 1). This can be helpful to show variation and generate discussion that each member of the group can contribute to.

For the public audience, you may like to design a plot or compute statistics that is more to the point. E.g. the minimum temperature was -1.5 degrees Celsius on 20th July 2022 which was the coldest has been the 7th coldest day since measurements were taken from 1970.

For the masses, you can do something similar as for the public audience but since you are competing for more attention, you may like to make the main message stand out by either bolding the message or using a colorful plot.

library(tidyverse)
# first we need to do some data wrangling to clean up the data
dfmin <- read_csv("data/melb-airport-min-temp-data.csv") %>%
    select(
        year = Year,
        month = Month,
        day = Day,
        temp = `Minimum temperature (Degree C)`
    ) %>%
    mutate(date = as.Date(paste(year, month, day, sep = "-"))) %>%
    # winter in Australia is June to August
    filter(month %in% c("06", "07", "08")) %>%
    mutate(
        md = paste0(month, "-", day),
        type = "min"
    )

dfmax <- read_csv("data/melb-airport-max-temp-data.csv") %>%
    select(
        year = Year,
        month = Month,
        day = Day,
        temp = `Maximum temperature (Degree C)`
    ) %>%
    mutate(date = as.Date(paste(year, month, day, sep = "-"))) %>%
    # winter in Australia is June to August
    filter(month %in% c("06", "07", "08")) %>%
    mutate(
        md = paste0(month, "-", day),
        type = "max"
    )


df <- rbind(dfmin, dfmax)

g <- ggplot(df, aes(md, temp, group = paste(year, type), text = year)) +
    geom_line(
        data = ~ filter(., year != "2022" & type == "min"),
        color = "#2a4d69"
    ) +
    geom_line(
        data = ~ filter(., year != "2022" & type == "max"),
        color = "#C00040"
    ) +
    geom_line(
        data = ~ filter(., year == "2022" & type == "max"),
        color = "#eea990"
    ) +
    geom_line(
        data = ~ filter(., year == "2022" & type == "min"),
        color = "#eea990"
    ) +
    labs(x = "Month-Day", y = "Temperature (°C)", title = "Plot 1") +
    theme(legend.position = "none")

plotly::ggplotly(g)