Loops
Posted by Immaculate on December 16, 2022 at 8:21 am
I am trying to replicate in R some analysis I had done in STATA, and I need your help. Is there a way to use a for loop to create new variables whose values are adjustments of existing variables? I was converting income from Uganda shillings to USD, but the exchange rates differed across the baseline, midline and endline periods. I used “mutate” and “case_when” for each variable but found the code lengthy, and thought a loop, which I used in STATA, would make my work easier. I have attached screenshots of the STATA code and R code for your reference.
5 Replies
-
Hello! Here are some possible solutions.
library(tidyverse)
library(tibble)
Define the data
df <- tribble(
  ~respondent, ~eval_period, ~formal_emp_ugx, ~personal_emp_ugx, ~casual_emp_ugx,
  "Aaron",   "Baseline", 500, 600, 700,
  "Bob",     "Endline",  600, 700, 800,
  "Charlie", "Midline",  700, 800, 900
)
You can use the across function
There are two options. First, you can use the `across()` function.
df %>%
  mutate(across(.cols = c("formal_emp_ugx", "personal_emp_ugx", "casual_emp_ugx"),
                .fns = ~ case_when(eval_period == "Baseline" ~ .x/3700,
                                   eval_period == "Endline" ~ .x/3800,
                                   eval_period == "Midline" ~ .x/3500)))
## # A tibble: 3 × 5
##   respondent eval_period formal_emp_ugx personal_emp_ugx casual_emp_ugx
##   <chr>      <chr>                <dbl>            <dbl>          <dbl>
## 1 Aaron      Baseline             0.135            0.162          0.189
## 2 Bob        Endline              0.158            0.184          0.211
## 3 Charlie    Midline              0.2              0.229          0.257
The tilde, `~`, tells R that what follows is an operation to apply to each of the selected columns. The `.x` placeholder stands for whichever of the listed columns is currently being processed (“formal_emp_ugx”, “personal_emp_ugx”, “casual_emp_ugx”).
Once this is done, you can rename the variables.
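To make the `~` and `.x` shorthand more concrete, here is a minimal sketch comparing it with an ordinary anonymous function. The names `df_demo`, `out_formula` and `out_function` are made up for this illustration, and a single fixed rate of 3700 is used only to keep the example short.

```r
library(dplyr)

# Hypothetical toy data with a single income column
df_demo <- tibble(
  eval_period = c("Baseline", "Endline"),
  formal_emp_ugx = c(500, 600)
)

# Formula shorthand: .x stands for the column currently being processed
out_formula <- df_demo %>%
  mutate(across(formal_emp_ugx, ~ .x / 3700))

# The same operation written as a regular anonymous function
out_function <- df_demo %>%
  mutate(across(formal_emp_ugx, function(col) col / 3700))

# Compare the two results
all.equal(out_formula, out_function)
```

Both forms should give the same result; the formula version is simply less typing when the operation is short.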
Here are some tutorials on how to use the across function:
- Official documentation
- Article from Rebecca Barter
- Video from IDG tech talk
Pivot longer
You could also first pivot the data to a longer format
df_long <- df %>%
  pivot_longer(cols = c("formal_emp_ugx", "personal_emp_ugx", "casual_emp_ugx"),
               values_to = "ugx")
Then it becomes easy to do what you need:
df_long %>%
  mutate(usd = case_when(eval_period == "Baseline" ~ ugx/3700,
                         eval_period == "Endline" ~ ugx/3800,
                         eval_period == "Midline" ~ ugx/3500))
## # A tibble: 9 × 5
##   respondent eval_period name               ugx   usd
##   <chr>      <chr>       <chr>            <dbl> <dbl>
## 1 Aaron      Baseline    formal_emp_ugx     500 0.135
## 2 Aaron      Baseline    personal_emp_ugx   600 0.162
## 3 Aaron      Baseline    casual_emp_ugx     700 0.189
## 4 Bob        Endline     formal_emp_ugx     600 0.158
## 5 Bob        Endline     personal_emp_ugx   700 0.184
## 6 Bob        Endline     casual_emp_ugx     800 0.211
## 7 Charlie    Midline     formal_emp_ugx     700 0.2
## 8 Charlie    Midline     personal_emp_ugx   800 0.229
## 9 Charlie    Midline     casual_emp_ugx     900 0.257
And you can pivot back at the end:
df %>%
  pivot_longer(cols = c("formal_emp_ugx", "personal_emp_ugx", "casual_emp_ugx"),
               values_to = "ugx") %>%
  mutate(usd = case_when(eval_period == "Baseline" ~ ugx/3700,
                         eval_period == "Endline" ~ ugx/3800,
                         eval_period == "Midline" ~ ugx/3500)) %>%
  pivot_wider(names_from = name, values_from = c(usd, ugx))
## # A tibble: 3 × 8
##   respondent eval_period usd_formal_em…¹ usd_p…² usd_c…³ ugx_f…⁴ ugx_p…⁵ ugx_c…⁶
##   <chr>      <chr>                 <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
## 1 Aaron      Baseline              0.135   0.162   0.189     500     600     700
## 2 Bob        Endline               0.158   0.184   0.211     600     700     800
## 3 Charlie    Midline               0.2     0.229   0.257     700     800     900
## # … with abbreviated variable names ¹usd_formal_emp_ugx, ²usd_personal_emp_ugx,
## #   ³usd_casual_emp_ugx, ⁴ugx_formal_emp_ugx, ⁵ugx_personal_emp_ugx,
## #   ⁶ugx_casual_emp_ugx
-
The first solution replaces the values of existing columns but doesn’t create new columns/variables. What changes can be made so that new columns are created?
-
For this you can use the `.names` argument. Like this:
df %>%
  mutate(across(.cols = c("formal_emp_ugx", "personal_emp_ugx", "casual_emp_ugx"),
                .fns = ~ case_when(eval_period == "Baseline" ~ .x/3700,
                                   eval_period == "Endline" ~ .x/3800,
                                   eval_period == "Midline" ~ .x/3500),
                .names = "{.col}_usd"
  ))
That will leave you with names that end in “ugx_usd” though (for example, “formal_emp_ugx_usd”). To fix that you can use the `rename_with()` function, which can rename many columns at the same time:
df %>%
  mutate(across(.cols = c("formal_emp_ugx", "personal_emp_ugx", "casual_emp_ugx"),
                .fns = ~ case_when(eval_period == "Baseline" ~ .x/3700,
                                   eval_period == "Endline" ~ .x/3800,
                                   eval_period == "Midline" ~ .x/3500),
                .names = "{.col}_usd"
  )) %>%
  rename_with(.fn = ~ str_replace_all(.x, "ugx_usd", "usd"))
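Since the original question asked about a for loop, here is a rough sketch of what a loop-based version could look like. It is only one possible approach, not necessarily a match for the STATA code in the screenshots. The `rates` lookup vector and the `ugx_cols` and `new_col` names are assumptions made for this illustration; the exchange rates are the same ones used in the examples above.

```r
library(tibble)

# Same example data as in the earlier replies
df <- tribble(
  ~respondent, ~eval_period, ~formal_emp_ugx, ~personal_emp_ugx, ~casual_emp_ugx,
  "Aaron",   "Baseline", 500, 600, 700,
  "Bob",     "Endline",  600, 700, 800,
  "Charlie", "Midline",  700, 800, 900
)

# Hypothetical lookup vector: one exchange rate per evaluation period
rates <- c(Baseline = 3700, Endline = 3800, Midline = 3500)

# Columns to convert
ugx_cols <- c("formal_emp_ugx", "personal_emp_ugx", "casual_emp_ugx")

for (col in ugx_cols) {
  # Build the new column name, e.g. formal_emp_ugx -> formal_emp_usd
  new_col <- sub("_ugx$", "_usd", col)
  # Look up each row's rate by its period and divide; unname() drops the
  # period labels that named-vector indexing would otherwise carry along
  df[[new_col]] <- df[[col]] / unname(rates[df$eval_period])
}

df
```

The loop keeps the original UGX columns and adds three new USD columns next to them. Keeping the rates in one named vector means a rate only has to be changed in one place.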
-