FoSSA: Fundamentals of Statistical Software & Analysis

Learning Outcomes

By the end of this section, students will be able to:

  • Explain the importance of the parametric assumptions and determine if they have been met
  • Explain the basic principles of rank based non-parametric statistical tests
  • Describe the use of a range of common non-parametric tests
  • Conduct and interpret common non-parametric tests

You can download a copy of the slides here: B3.5 Friedman Test

B3.5 PRACTICAL: R

We want to examine if there is a significant difference in body condition score among the three measures taken during the study for all mice.

As with the repeated measures ANOVA, before we begin using the Friedman test, we must convert our data into long form. We will do this in a similar manner to before, first by creating a subject id variable, and then reshaping the data:

> mice$id <- 1:nrow(mice)
> mice.bcs <- mice %>%
>   gather(key = "time", value = "BCS", BCS_baseline, BCS_mid, BCS_.end) %>%
>   convert_as_factor(id, time)

We can now use the friedman.test function. It requires three inputs: the data, the comparison you are making, and the blocking factor (subject id). It takes the form:

> friedman.test(data = name, dependent variable ~ independent variable | id)

In this case, we will use:

> friedman.test(data=mice.bcs, BCS ~ time | id)

Which produces the following results:

The body condition score did not differ significantly (p>0.05) between the time points of measurement when considering all the mice.
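If the mice data are not to hand, the layout of the call can be tried on a small made-up data set. The sketch below is an illustration only: the data frame `toy` and its values are invented (five subjects, three time points, no tied scores within a subject) and are not taken from the course dataset. It uses base R only.

```r
# Hypothetical long-format data: 5 subjects, each scored at 3 time points.
# These values are invented for illustration, not taken from the mice data.
toy <- data.frame(
  id   = factor(rep(1:5, each = 3)),
  time = factor(rep(c("baseline", "mid", "end"), times = 5),
                levels = c("baseline", "mid", "end")),
  BCS  = c(3, 2, 1,
           3, 2, 1,
           3, 1, 2,
           3, 2, 1,
           2, 3, 1)
)

# Same formula layout as above: outcome ~ repeated factor | subject id
res <- friedman.test(BCS ~ time | id, data = toy)

res$statistic   # Friedman chi-squared
res$parameter   # degrees of freedom (number of time points - 1)
res$p.value     # p-value from the chi-squared approximation
```

For these invented scores the statistic is 6.4 on 2 degrees of freedom, giving a p-value of about 0.041. Storing the result in an object, as here, makes it easy to pull out individual values for reporting.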

Next, we may want to consider the different strains individually. Create a subset containing the N-ras knockout mice only:

> mice3 <- mice[mice$Strain=="KO N-ras",]

The Friedman test must have all of the id variables in ascending order from 1 to n, so we must remake the id variable in the same manner as before, namely:

> mice3$id <- 1:nrow(mice3)

We then reshape our data into long form again. Note that this is exactly the same code as above aside from changing the dataset names, so can be run very quickly.

> mice.bcs3 <- mice3 %>%
>   gather(key = "time", value = "BCS", BCS_baseline, BCS_mid, BCS_.end) %>%
>   convert_as_factor(id, time)
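Before running the test on a reshaped subset, it can be worth confirming that every subject has exactly one measurement at every time point. To our understanding, friedman.test stops with an error if this is not the case. The following is a minimal sketch of such a check using base R; the data frame `toy` is invented for illustration, with one row deliberately left out.

```r
# Hypothetical long-format data with one measurement missing
# (subject 2 has no "end" row); values are invented for illustration.
toy <- data.frame(
  id   = factor(c(1, 1, 1, 2, 2, 3, 3, 3)),
  time = factor(c("baseline", "mid", "end",
                  "baseline", "mid",
                  "baseline", "mid", "end")),
  BCS  = c(3, 2, 1, 3, 2, 2, 3, 1)
)

# Cross-tabulate subjects against time points: a complete, unreplicated
# design has exactly one observation in every cell.
cells <- table(toy$id, toy$time)
is_complete <- all(cells == 1)
is_complete

# Subjects that break the pattern (here, subject 2)
incomplete_ids <- rownames(cells)[rowSums(cells != 1) > 0]
incomplete_ids
```

With your own data, replace `toy` with the long-format data frame you have just created. If `is_complete` is FALSE, the listed subjects need to be investigated before the test is run.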

We are now able to run the Friedman test on this subset of our data.

Question B3.5: Use a Friedman test to assess if there is a difference in body condition score for the N-ras knockout mice only.

Answer

We first run the Friedman test:

> friedman.test(data=mice.bcs3, BCS ~ time | id)

The RStudio output looks like this:

We can see there is a significant difference (p<0.05) in the body condition score of the N-ras knockout mice between the time points of measurement.
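The test result alone does not show the direction of the change. One way to see it is to rank the scores within each subject and average those ranks at each time point, which is what a Friedman test works with. The sketch below does this in base R on an invented data frame `toy` (hypothetical values, not the mice data).

```r
# Hypothetical long-format data (invented values, not the mice data)
toy <- data.frame(
  id   = rep(1:5, each = 3),
  time = factor(rep(c("baseline", "mid", "end"), times = 5),
                levels = c("baseline", "mid", "end")),
  BCS  = c(3, 2, 1,  3, 2, 1,  3, 1, 2,  3, 2, 1,  2, 3, 1)
)

# Rank the three scores within each subject (ties would get average ranks)
toy$rank <- ave(toy$BCS, toy$id, FUN = rank)

# Average those ranks at each time point
mean_ranks <- tapply(toy$rank, toy$time, mean)
mean_ranks
```

For these invented scores the mean ranks are 2.8 at baseline, 2.0 at the mid point and 1.2 at the end, so the scores tend to fall over time. The same approach can be applied to your own long-format data by substituting its name for `toy`.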

B3.5 PRACTICAL: Stata

To perform the Friedman Test in Stata we need to install a new package called ‘emh’ written by the Stata community, so type:

ssc install emh

This package will run a Friedman test, but it needs your data to be in long format instead of wide format. (Note: there is a Stata Friedman package for data in wide format called ‘friedman’ but it does not include a correction for ties and it will produce different results).
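To see why a correction for ties changes the result, it may help to write out the usual textbook form of the Friedman statistic. This is a sketch in standard notation, not taken from the documentation of either Stata package: n is the number of subjects, k the number of time points, and R_j the sum of the within-subject ranks at time point j.

```latex
% Friedman statistic without a correction for ties
Q = \frac{12}{n k (k+1)} \sum_{j=1}^{k} R_j^{2} - 3 n (k+1)

% With ties: t_{i,g} is the number of observations in the g-th group
% of tied values within subject i (an untied value has t = 1 and adds 0)
Q_{\text{ties}} = \frac{Q}{1 - \dfrac{\sum_{i=1}^{n} \sum_{g} \left( t_{i,g}^{3} - t_{i,g} \right)}{n k \left( k^{2} - 1 \right)}}
```

The divisor is 1 when there are no ties and falls below 1 as ties accumulate, so the corrected statistic is never smaller than the uncorrected one. Body condition scores take only a few distinct values, so tied scores within a mouse are likely and the two versions can differ noticeably.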

Long format means that all the BCS variables are in one column, with a second column specifying which time they were measured at (beginning, middle or end). We need an ID variable to reshape our data, and we need to make sure the BCS variables have a standardised name with a number at the end. Since we are changing the structure of our data for this exercise, and we will want to return our data to the original format after we are finished with this test, we also want to use the ‘preserve’ and ‘restore’ commands. Considering all of this, run the following commands:

preserve

 gen id=_n

 keep Strain_group BCS_baseline BCS_mid BCS_end id

 rename BCS_baseline BCS1

 rename BCS_mid BCS2

 rename BCS_end BCS3

 reshape long BCS, i(id) j(time)

 br

These commands save your original data format, they reshape your data so that all the BCS measurements are in one column, and they make a new column called ‘time’ which indicates if the BCS measurements were taken at the beginning (1), middle (2) or end (3).

Now you are ready to run the Friedman test:

emh BCS time, anova transformation(rank)

This shows that when considered as a whole group the body condition score of the mice did not significantly differ between the timepoints at which this was measured.

Next, we may want to consider the different strains individually. Use the ‘if’ option to look at this for the N-ras knockout mice only (i.e. “if Strain_group==3”).

When you are done, type the command ‘restore’ to restore your dataset to its original structure.

Question B3.5: Use a Friedman test to assess if there is a difference in body condition score only for Strain_group==3.

Answer

The test statistics tell us that there is a significant difference for this group (Q(2)=13.33, p<0.01).

B3.5 PRACTICAL: SPSS

Part 1

Use the Friedman Test to determine if there is a significant difference in body condition score across all mice in the study between each of the three measures taken during the trial.

Select

Analyze >> Nonparametric Tests  >> Legacy Dialogs >> K Related Samples

SPSS assumes that each row is a separate participant or case, so for all repeated measures tests it requires each measure to be a separate variable.

Move all three of the variables you wish to consider into the ‘Test Variables’ box. If you have more timepoints, you can move all of them into the ‘Test Variables’ box at the same time; it is not limited to three.

Make sure ‘Friedman’ is selected at the bottom of the box before you press ‘OK’ to run the test.

Part 2

In this situation we may want to consider the different strains individually. Use the ‘Select Cases’ function to look at this for the N-ras knockout mice only.

Select

Data >> Select Cases

Select ‘If condition is satisfied’ and then press the blue button labelled ‘If’.

Here we can specify we only want cases where the variable Strain_group is equal to 3.

Press continue and then OK to run.

Then re-run the Friedman test the same as in Part 1.

Answer

Part 1

This shows that when considered as a whole group the body condition score of the mice did not significantly differ between the timepoints at which this was measured.

Part 2

If the ‘Select cases’ has been done correctly you will see a filter variable appear which tells you which rows are selected. Unselected rows will also appear with a line through them.

The test statistics tell us that there is a significant difference for this group, and the ranks tell us that their body condition was decreasing throughout the trial.

You can now try re-running this for each of the strains in the trial if you wish.
