Chapter 4 Texting

# read_csv(), the join verbs, and str_replace_all() all come from the tidyverse
library(tidyverse)

canvass <- read_csv("data/canvassing_results.csv")

## Parsed with column specification:
## cols(
##   van_id = col_double(),
##   date = col_date(format = ""),
##   vol_yes = col_double()
## )

van_names <- read_csv("data/van_names.csv")

## Parsed with column specification:
## cols(
##   van_id = col_double(),
##   first_name = col_character(),
##   last_name = col_character(),
##   date_of_birth = col_date(format = ""),
##   age = col_double()
## )

turf_lookup <- read_csv("data/van_turf_lookup.csv")

## Parsed with column specification:
## cols(
##   van_id = col_double(),
##   turf_code = col_character()
## )

universe <- inner_join(canvass, van_names) %>% 
  group_by(van_id) %>% 
  # for duplicates grab the one where they indicated vol yes
  top_n(1, wt = vol_yes) %>% 
  # make sure there is only 1 observation per person in the case that 
  # duplicates had the same vol_yes result
  sample_n(size = 1) %>% 
  ungroup() %>% 
  # join region for hypothetical polling location / organizer info
  left_join(turf_lookup)

## Joining, by = "van_id"

## Joining, by = "van_id"
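The de-duplication above relies on a random tie-break from sample_n(), so the surviving row can differ between runs. A minimal sketch of a deterministic variant follows, assuming a made-up table (toy_canvass is hypothetical, not the real canvassing data): sort so that any vol yes row comes first within each person, then keep the first row per van_id.

```r
library(dplyr)

# Hypothetical toy data: van_id 1 and 3 appear twice
toy_canvass <- tibble(
  van_id  = c(1, 1, 2, 3, 3),
  vol_yes = c(0, 1, 0, 1, 1)
)

deduped <- toy_canvass %>%
  # put the vol yes row first within each person
  arrange(van_id, desc(vol_yes)) %>%
  # keep only the first row per van_id, with all other columns
  distinct(van_id, .keep_all = TRUE)

deduped
```

Each van_id now appears exactly once, and a person who said yes on any contact keeps the yes.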

We want each message to come from the appropriate regional organizing director (ROD).

We will make up some fake names for our RODs by creating a named vector where the name is the turf code and the value is the organizer’s name (sampled from the babynames package).

organizers <- c("Rosaleen", "Larissa", "Lafayette", "Theo", "Zamere", "Colleen")
names(organizers) <- LETTERS[1:6]

organizers

##           A           B           C           D           E           F 
##  "Rosaleen"   "Larissa" "Lafayette"      "Theo"    "Zamere"   "Colleen"

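We will use stringr::str_replace_all() to create an organizer column. Given a named vector, it treats each name as a pattern and each value as its replacement, so every turf code is swapped for the matching organizer’s name.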

universe %>% 
  mutate(organizer = str_replace_all(turf_code, organizers)) %>% 
  select(organizer, everything())

## # A tibble: 6,004 x 9
##    organizer van_id date       vol_yes first_name last_name date_of_birth
##    <chr>      <dbl> <date>       <dbl> <chr>      <chr>     <date>       
##  1 Larissa        1 2019-01-05       1 Timika     Ehrsam    1970-01-03   
##  2 Zamere         2 2019-01-30       1 Johanna    Gorden    1973-12-15   
##  3 Zamere         3 2019-02-27       0 Parys      Stoelting 1997-02-15   
##  4 Larissa        4 2019-01-06       0 Lavell     Dewall    1992-10-23   
##  5 Lafayette      5 2019-03-02       0 Brenisha   Pachter   1977-07-05   
##  6 Zamere         6 2019-01-13       1 Hoang      Millon    1992-02-29   
##  7 Larissa        7 2019-01-14       1 Rishith    Pisciotto 1987-09-25   
##  8 Lafayette      8 2019-01-28       1 Maximilian Arakawa   1999-05-14   
##  9 Theo           9 2019-02-18       0 Timmie     Schlag    1965-06-12   
## 10 Rosaleen      10 2019-02-05       0 Swetha     Spreitzer 2000-12-08   
## # … with 5,994 more rows, and 2 more variables: age <dbl>, turf_code <chr>
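The str_replace_all() call above treats each name in organizers as a regular expression, so a code like "A" would also match inside a longer code such as "AB". That is fine for single-letter turf codes. If your codes are longer, one alternative is to index the named vector directly. This is a sketch under the assumption of a small made-up table (toy_universe is hypothetical):

```r
library(dplyr)

organizers <- c("Rosaleen", "Larissa", "Lafayette", "Theo", "Zamere", "Colleen")
names(organizers) <- LETTERS[1:6]

# Hypothetical miniature of `universe`
toy_universe <- tibble(
  van_id    = 1:4,
  turf_code = c("B", "E", "C", "A")
)

toy_organizers <- toy_universe %>%
  # exact lookup by name; unname() drops the turf-code names from the result
  mutate(organizer = unname(organizers[turf_code]))

toy_organizers
```

A turf code that is not a name in organizers comes back as NA, which makes unmatched rows easy to spot.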

Say each region has its own unique polling location (realistically, this will come from a much more fine-grained dataset that you can join on).

We can specify the polling locations with a case_when() call, building on the previous pipeline. In case_when(), each line has a logical condition on the left of the ~ and the value to return on the right, i.e. something == TRUE ~ "if true value". Rows that match none of the conditions get NA.

universe_locations <- universe %>% 
  mutate(organizer = str_replace_all(turf_code, organizers),
         polling_place = case_when(
           turf_code == "A" ~ "Community Center",
           turf_code == "B" ~ "High School",
           turf_code == "C" ~ "Town Hall",
           turf_code == "D" ~ "Elementary School",
           turf_code == "E" ~ "Rotary Club",
           turf_code == "F" ~ "Senior Center"
         )
  )

universe_locations

## # A tibble: 6,004 x 10
##    van_id date       vol_yes first_name last_name date_of_birth   age
##     <dbl> <date>       <dbl> <chr>      <chr>     <date>        <dbl>
##  1      1 2019-01-05       1 Timika     Ehrsam    1970-01-03       49
##  2      2 2019-01-30       1 Johanna    Gorden    1973-12-15       46
##  3      3 2019-02-27       0 Parys      Stoelting 1997-02-15       22
##  4      4 2019-01-06       0 Lavell     Dewall    1992-10-23       27
##  5      5 2019-03-02       0 Brenisha   Pachter   1977-07-05       42
##  6      6 2019-01-13       1 Hoang      Millon    1992-02-29       27
##  7      7 2019-01-14       1 Rishith    Pisciotto 1987-09-25       32
##  8      8 2019-01-28       1 Maximilian Arakawa   1999-05-14       20
##  9      9 2019-02-18       0 Timmie     Schlag    1965-06-12       54
## 10     10 2019-02-05       0 Swetha     Spreitzer 2000-12-08       19
## # … with 5,994 more rows, and 3 more variables: turf_code <chr>,
## #   organizer <chr>, polling_place <chr>
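With a real, finer-grained polling place dataset you would join rather than hard-code each case. The sketch below shows what that could look like, assuming a hypothetical lookup table (polling_places) and a made-up miniature universe (toy_universe) that includes one turf code with no polling place on file:

```r
library(dplyr)

# Hypothetical lookup table: one row per turf code
polling_places <- tibble(
  turf_code     = LETTERS[1:6],
  polling_place = c("Community Center", "High School", "Town Hall",
                    "Elementary School", "Rotary Club", "Senior Center")
)

# Hypothetical miniature universe; turf "Z" is not in the lookup
toy_universe <- tibble(
  van_id    = 1:3,
  turf_code = c("B", "E", "Z")
)

# left_join() keeps every person and fills polling_place with NA when unmatched
toy_locations <- left_join(toy_universe, polling_places, by = "turf_code")

toy_locations
```

Naming the key with by = "turf_code" also avoids the "Joining, by" message and guards against joining on an unintended shared column.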

Generally, it is useful to segment texting scripts to allow for more tailored messaging. In particular, treat your potential volunteers differently from those who have not indicated a desire to volunteer.

Let’s go ahead and create two different tibbles, one for vol yes and one for vol no. We will then write a custom script for each.

vol_yes <- filter(universe_locations, vol_yes == 1)
vol_no <- filter(universe_locations, vol_yes == 0)

At this point you should always check whether your segmentation has missed anyone; for example, filter() drops rows where vol_yes is NA. The row counts of the two tables should add up to the number of rows in the original tibble (universe_locations). Let’s perform that sanity check before moving on.

nrow(vol_yes) + nrow(vol_no) == nrow(universe_locations)

## [1] TRUE

This returns TRUE, so we are good to move on! If any rows were missing, I would recommend finding a way to fold them into some generic universe.
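If the check ever returns FALSE, you will want to see who fell through. A minimal sketch of one way to find them, assuming made-up data with a missing vol_yes value (toy_locations is hypothetical): anti_join() returns the rows of the original table that are in neither segment.

```r
library(dplyr)

# Hypothetical data: van_id 3 has no recorded vol_yes answer
toy_locations <- tibble(
  van_id  = 1:5,
  vol_yes = c(1, 0, NA, 1, 0)
)

toy_yes <- filter(toy_locations, vol_yes == 1)
toy_no  <- filter(toy_locations, vol_yes == 0)

# The sanity check fails because filter() drops the NA row
nrow(toy_yes) + nrow(toy_no) == nrow(toy_locations)

# Rows of the original table that landed in neither segment
missed <- toy_locations %>%
  anti_join(bind_rows(toy_yes, toy_no), by = "van_id")

missed
```

The missed tibble can then get its own generic script.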

The next step is to create the script. The glue package lets us build character strings that embed expressions or column values from a tibble by wrapping them in curly braces. Learn more in the glue documentation.

vol_yes_message <- vol_yes %>% 
  mutate(message = glue::glue("Hi {first_name} this is {organizer} with Abraham Lincoln for the Union! The election is right around the corner. We need all the help we can get, can we count on you to volunteer at {polling_place} on election day?"))

vol_no_message <- vol_no %>% 
  mutate(message = glue::glue("Hi {first_name} this is {organizer} with Abraham Lincoln for the Union! The election is right around the corner. Your polling location is at the {polling_place}. Can we count on your vote?"))

messages <- bind_rows(vol_yes_message, vol_no_message)

## Warning in bind_rows_(x, .id): Vectorizing 'glue' elements may not preserve
## their attributes

## Warning in bind_rows_(x, .id): Vectorizing 'glue' elements may not preserve
## their attributes

select(messages, message)

## # A tibble: 6,004 x 1
##    message                                                                 
##    <chr>                                                                   
##  1 Hi Timika this is Larissa with Abraham Lincoln for the Union! The elect…
##  2 Hi Johanna this is Zamere with Abraham Lincoln for the Union! The elect…
##  3 Hi Hoang this is Zamere with Abraham Lincoln for the Union! The electio…
##  4 Hi Rishith this is Larissa with Abraham Lincoln for the Union! The elec…
##  5 Hi Maximilian this is Lafayette with Abraham Lincoln for the Union! The…
##  6 Hi Deeksha this is Lafayette with Abraham Lincoln for the Union! The el…
##  7 Hi Therron this is Lafayette with Abraham Lincoln for the Union! The el…
##  8 Hi Lonney this is Lafayette with Abraham Lincoln for the Union! The ele…
##  9 Hi Robby this is Theo with Abraham Lincoln for the Union! The election …
## 10 Hi Siah this is Rosaleen with Abraham Lincoln for the Union! The electi…
## # … with 5,994 more rows
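To see what glue is doing on a smaller scale, here is a sketch on a made-up two-row tibble (toy_voters and the shortened script are hypothetical). Wrapping the result in as.character() stores a plain character column instead of a glue object, which may avoid attribute warnings like the ones bind_rows() printed above.

```r
library(dplyr)
library(glue)

# Hypothetical two-person universe
toy_voters <- tibble(
  first_name    = c("Timika", "Parys"),
  organizer     = c("Larissa", "Zamere"),
  polling_place = c("High School", "Rotary Club")
)

toy_messages <- toy_voters %>%
  # each {column} is filled in row by row; as.character() drops the glue class
  mutate(message = as.character(
    glue("Hi {first_name} this is {organizer}. Your polling location is at the {polling_place}.")
  ))

toy_messages$message
```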

Once you have created your custom messaging, you can write it to a CSV and upload it to a peer-to-peer texting platform like Relay or Hustle. There you can, hopefully, map the VAN IDs so that the text messages are recorded in VAN (talk to your VAN admin about setting up these integrations).
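Writing the file could look like the sketch below. The file name, the temporary directory, and the toy_messages tibble are all assumptions for illustration; in practice you would write the full table of messages to wherever your platform upload expects it.

```r
library(readr)
library(tibble)

# Hypothetical stand-in for the real table of messages
toy_messages <- tibble(
  van_id  = c(1, 2),
  message = c("Hi Timika this is Larissa.", "Hi Johanna this is Zamere.")
)

# Hypothetical output location; swap in your own path
out_path <- file.path(tempdir(), "texting_universe.csv")
write_csv(toy_messages, out_path)

# Read the file back to confirm it round-trips before uploading
round_trip <- read_csv(out_path)
round_trip
```

Keeping van_id in the export is what lets the platform map replies back to VAN.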

One problem you might face with platforms like Relay and Hustle is that custom fields can have a character limit. There are a few ways to handle this. One is to recreate the custom messages within the platform itself. However, I have historically found this somewhat cumbersome. My workaround was to split each sentence into its own custom field.

We can split the message into sentences. This creates a list column, which we then unnest (see working with list columns by Garrett Grolemund).

final_message <- messages %>% 
  # split each message into a list of sentences
  mutate(message = str_split(message, boundary("sentence"))) %>% 
  # one row per sentence; naming the column avoids the `cols` warning
  unnest(cols = c(message)) %>% 
  group_by(van_id) %>% 
  # number the sentences within each person
  mutate(message_number = row_number(),
         message_number = paste0("message_", message_number)) %>% 
  ungroup() %>% 
  # one column per sentence
  spread(message_number, message)

select(final_message, contains("message_"))

## # A tibble: 6,004 x 4
##    message_1              message_2        message_3              message_4
##    <chr>                  <chr>            <chr>                  <chr>    
##  1 "Hi Timika this is La… "The election i… We need all the help … <NA>     
##  2 "Hi Johanna this is Z… "The election i… We need all the help … <NA>     
##  3 "Hi Hoang this is Zam… "The election i… We need all the help … <NA>     
##  4 "Hi Rishith this is L… "The election i… We need all the help … <NA>     
##  5 "Hi Maximilian this i… "The election i… We need all the help … <NA>     
##  6 "Hi Deeksha this is L… "The election i… We need all the help … <NA>     
##  7 "Hi Therron this is L… "The election i… We need all the help … <NA>     
##  8 "Hi Lonney this is La… "The election i… We need all the help … <NA>     
##  9 "Hi Robby this is The… "The election i… We need all the help … <NA>     
## 10 "Hi Siah this is Rosa… "The election i… We need all the help … <NA>     
## # … with 5,994 more rows
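Before uploading, it is worth confirming that each sentence actually fits in a custom field. The sketch below splits one example message and checks each piece against a limit. The 80-character limit is a made-up number, so check your platform for the real one. Note that boundary("sentence") keeps the trailing space on each piece (hence the quotes in the output above), so the pieces are trimmed first.

```r
library(stringr)

example_message <- "Hi Timika this is Larissa with Abraham Lincoln for the Union! The election is right around the corner. Can we count on your vote?"

# str_split() returns a list; take the first element and trim trailing spaces
sentences <- str_trim(str_split(example_message, boundary("sentence"))[[1]])

# Hypothetical limit; look up the actual value for your platform
field_limit <- 80

# TRUE for every sentence that fits in a custom field
fits <- str_length(sentences) <= field_limit

sentences
fits
```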

Final step: upload to Relay, Hustle, or whatever platform you use. Blast ’em.