Home Forums R help Loops

  • Kene David

    Administrator
    December 16, 2022 at 3:57 pm

    Hello! Here are some possible solutions.


    library(tidyverse)
    library(tibble)

    Define the data


    df <- tribble(
      ~respondent, ~eval_period, ~formal_emp_ugx, ~personal_emp_ugx, ~casual_emp_ugx,
      "Aaron",     "Baseline",   500,             600,               700,
      "Bob",       "Endline",    600,             700,               800,
      "Charlie",   "Midline",    700,             800,               900
    )

    You can use the across function

    There are two options. First, you can use the across() function.


    df %>%
      mutate(across(
        .cols = c("formal_emp_ugx", "personal_emp_ugx", "casual_emp_ugx"),
        .fns = ~ case_when(
          eval_period == "Baseline" ~ .x / 3700,
          eval_period == "Endline" ~ .x / 3800,
          eval_period == "Midline" ~ .x / 3500
        )
      ))
    ## # A tibble: 3 × 5
    ##   respondent eval_period formal_emp_ugx personal_emp_ugx casual_emp_ugx
    ##   <chr>      <chr>                <dbl>            <dbl>          <dbl>
    ## 1 Aaron      Baseline             0.135            0.162          0.189
    ## 2 Bob        Endline              0.158            0.184          0.211
    ## 3 Charlie    Midline              0.2              0.229          0.257

    The tilde (~) tells R that what follows is an operation to apply to each of the selected columns; it is shorthand for an anonymous function. Inside that operation, .x stands for whichever column is currently being processed (“formal_emp_ugx”, “personal_emp_ugx” and “casual_emp_ugx” in turn).
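
    To make the ~ and .x shorthand concrete, here is a minimal sketch (not part of the original answer) that compares it with an ordinary anonymous function. The one-row data frame `toy` and the single fixed rate of 3700 are made up purely for illustration.

    ```r
    library(tidyverse)

    # Hypothetical one-row example, for illustration only
    toy <- tribble(
      ~respondent, ~formal_emp_ugx, ~casual_emp_ugx,
      "Aaron",     500,             700
    )

    # Formula shorthand: .x is the column currently being processed
    with_tilde <- toy %>%
      mutate(across(.cols = c("formal_emp_ugx", "casual_emp_ugx"), .fns = ~ .x / 3700))

    # The same operation written as an explicit anonymous function
    with_function <- toy %>%
      mutate(across(.cols = c("formal_emp_ugx", "casual_emp_ugx"), .fns = function(x) x / 3700))

    # Compare the two results
    all.equal(with_tilde, with_function)
    ```

    Both spellings apply the division to each listed column in turn; the tilde form is just shorter to type.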

    Once this is done, you can rename the variables.

    Here are some tutorials on how to use the across function:

    Pivot longer

    You could also first pivot the data to a longer format


    df_long <- df %>%
      pivot_longer(
        cols = c("formal_emp_ugx", "personal_emp_ugx", "casual_emp_ugx"),
        values_to = "ugx"
      )

    Then it becomes easy to do what you need:


    df_long %>%
      mutate(usd = case_when(
        eval_period == "Baseline" ~ ugx / 3700,
        eval_period == "Endline" ~ ugx / 3800,
        eval_period == "Midline" ~ ugx / 3500
      ))
    ## # A tibble: 9 × 5
    ##   respondent eval_period name               ugx   usd
    ##   <chr>      <chr>       <chr>            <dbl> <dbl>
    ## 1 Aaron      Baseline    formal_emp_ugx     500 0.135
    ## 2 Aaron      Baseline    personal_emp_ugx   600 0.162
    ## 3 Aaron      Baseline    casual_emp_ugx     700 0.189
    ## 4 Bob        Endline     formal_emp_ugx     600 0.158
    ## 5 Bob        Endline     personal_emp_ugx   700 0.184
    ## 6 Bob        Endline     casual_emp_ugx     800 0.211
    ## 7 Charlie    Midline     formal_emp_ugx     700 0.2
    ## 8 Charlie    Midline     personal_emp_ugx   800 0.229
    ## 9 Charlie    Midline     casual_emp_ugx     900 0.257

    And you can pivot back at the end:


    df %>%
      pivot_longer(
        cols = c("formal_emp_ugx", "personal_emp_ugx", "casual_emp_ugx"),
        values_to = "ugx"
      ) %>%
      mutate(usd = case_when(
        eval_period == "Baseline" ~ ugx / 3700,
        eval_period == "Endline" ~ ugx / 3800,
        eval_period == "Midline" ~ ugx / 3500
      )) %>%
      pivot_wider(names_from = name, values_from = c(usd, ugx))
    ## # A tibble: 3 × 8
    ##   respondent eval_period usd_formal_em…¹ usd_p…² usd_c…³ ugx_f…⁴ ugx_p…⁵ ugx_c…⁶
    ##   <chr>      <chr>                 <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
    ## 1 Aaron      Baseline              0.135   0.162   0.189     500     600     700
    ## 2 Bob        Endline               0.158   0.184   0.211     600     700     800
    ## 3 Charlie    Midline               0.2     0.229   0.257     700     800     900
    ## # … with abbreviated variable names ¹usd_formal_emp_ugx, ²usd_personal_emp_ugx,
    ## #   ³usd_casual_emp_ugx, ⁴ugx_formal_emp_ugx, ⁵ugx_personal_emp_ugx,
    ## #   ⁶ugx_casual_emp_ugx
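
    As a variation on the pivot approach, the three exchange rates could be kept in a small lookup table and joined on eval_period, so the case_when() does not have to be repeated. This is only a sketch and not part of the original answer: the `rates` table, its `ugx_per_usd` column and the `df_usd` result are names made up for illustration, filled with the same rates used above.

    ```r
    library(tidyverse)

    df <- tribble(
      ~respondent, ~eval_period, ~formal_emp_ugx, ~personal_emp_ugx, ~casual_emp_ugx,
      "Aaron",     "Baseline",   500,             600,               700,
      "Bob",       "Endline",    600,             700,               800,
      "Charlie",   "Midline",    700,             800,               900
    )

    # Hypothetical lookup table: one exchange rate per evaluation period
    rates <- tribble(
      ~eval_period, ~ugx_per_usd,
      "Baseline",   3700,
      "Endline",    3800,
      "Midline",    3500
    )

    df_usd <- df %>%
      pivot_longer(
        cols = c("formal_emp_ugx", "personal_emp_ugx", "casual_emp_ugx"),
        values_to = "ugx"
      ) %>%
      left_join(rates, by = "eval_period") %>%  # attach the rate for each row's period
      mutate(usd = ugx / ugx_per_usd) %>%       # convert using the joined rate
      select(-ugx_per_usd)                      # drop the helper column

    df_usd
    ```

    If a rate changes or a new evaluation period is added, only the `rates` table needs editing.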
    

  • Immaculate

    Member
    December 16, 2022 at 6:05 pm

    Thanks! This has been helpful🙏🏾

  • Immaculate

    Member
    December 16, 2022 at 6:38 pm

    The first solution replaces the values of the existing columns but doesn’t create new columns/variables. What changes can be made so that new columns are created?

    • Kene David

      Administrator
      December 16, 2022 at 6:45 pm

      For this you can use the `.names` argument. Like this:

      df %>%
        mutate(across(
          .cols = c("formal_emp_ugx", "personal_emp_ugx", "casual_emp_ugx"),
          .fns = ~ case_when(
            eval_period == "Baseline" ~ .x / 3700,
            eval_period == "Endline" ~ .x / 3800,
            eval_period == "Midline" ~ .x / 3500
          ),
          .names = "{.col}_usd"
        ))

      That will leave you with names that end in “ugx_usd” though (for example, formal_emp_ugx_usd). To fix that, you can use the `rename_with()` function, which can rename many columns at the same time:

      df %>%
        mutate(across(
          .cols = c("formal_emp_ugx", "personal_emp_ugx", "casual_emp_ugx"),
          .fns = ~ case_when(
            eval_period == "Baseline" ~ .x / 3700,
            eval_period == "Endline" ~ .x / 3800,
            eval_period == "Midline" ~ .x / 3500
          ),
          .names = "{.col}_usd"
        )) %>%
        rename_with(.fn = ~ str_replace_all(.x, "ugx_usd", "usd"))
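
      If you would rather skip the separate renaming step, `.names` is a glue specification, so it may be possible to build the final name in one go. A hedged sketch, not part of the original reply: it relies on glue evaluating the expression inside the braces, which is worth checking against your installed dplyr version, and it uses base R `sub()` to swap the suffix. The name `df_both` is made up for illustration.

      ```r
      library(tidyverse)

      df <- tribble(
        ~respondent, ~eval_period, ~formal_emp_ugx, ~personal_emp_ugx, ~casual_emp_ugx,
        "Aaron",     "Baseline",   500,             600,               700,
        "Bob",       "Endline",    600,             700,               800,
        "Charlie",   "Midline",    700,             800,               900
      )

      df_both <- df %>%
        mutate(across(
          .cols = c("formal_emp_ugx", "personal_emp_ugx", "casual_emp_ugx"),
          .fns = ~ case_when(
            eval_period == "Baseline" ~ .x / 3700,
            eval_period == "Endline" ~ .x / 3800,
            eval_period == "Midline" ~ .x / 3500
          ),
          # Assumption: glue evaluates sub() here, so a column such as
          # formal_emp_ugx produces a new column named formal_emp_usd
          .names = "{sub('_ugx$', '_usd', .col)}"
        ))

      names(df_both)
      ```

      The original UGX columns stay in place and the converted USD columns are added next to them.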
  • Immaculate

    Member
    December 19, 2022 at 9:39 am

    Thank you very much 🙏🏾