FoSSA: Fundamentals of Statistical Software & Analysis

Learning Outcomes

By the end of this section, students will be able to:

  • Explain the importance of the parametric assumptions and determine if they have been met
  • Explain the basic principles of rank-based non-parametric statistical tests
  • Describe the use of a range of common non-parametric tests
  • Conduct and interpret common non-parametric tests

You can download a copy of the slides here: B3.2 Mann-Whitney U Test

B3.2 PRACTICAL: R

The Mann-Whitney U test is also sometimes called the Wilcoxon Rank-Sum test.
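
The two names describe the same statistic from two angles: the rank-sum form adds up the ranks of one group, and U subtracts the smallest value that sum could take. The sketch below illustrates this in base R with made-up numbers (group_a and group_b are hypothetical and are not the course data), and compares the hand calculation with the statistic that base R's wilcox.test() labels W.

```r
# Made-up scores for two hypothetical groups (NOT the course dataset)
group_a <- c(1.5, 2.0, 2.5, 3.5)
group_b <- c(3.0, 4.0, 4.5, 5.0)

# Rank all observations together, then sum the ranks belonging to group_a
all_ranks <- rank(c(group_a, group_b))
rank_sum_a <- sum(all_ranks[seq_along(group_a)])

# U for group_a: its rank sum minus the smallest value that sum could take
n_a <- length(group_a)
u_a <- rank_sum_a - n_a * (n_a + 1) / 2

# Base R's wilcox.test() reports this same quantity as its statistic "W"
w <- unname(wilcox.test(group_a, group_b)$statistic)
print(c(rank_sum = rank_sum_a, U = u_a, W = w))
```

A small U for one group means most of its values rank below those of the other group.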

When first examining your data, you may want to check the distribution of the variables of interest and calculate appropriate summary statistics for them.
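
As a rough sketch of that first look, the median and interquartile range can be calculated within each group using base R. The data frame mice_demo below is a hypothetical stand-in with invented values; only the column names Strain and BCS_baseline are taken from this practical.

```r
# Hypothetical stand-in for the course dataset: the values below are made up
# for illustration and are NOT the real mice data
mice_demo <- data.frame(
  Strain = rep(c("Wild", "KO Cdkn1a"), each = 6),
  BCS_baseline = c(3, 3, 4, 2, 3, 4, 3, 2, 3, 3, 4, 2)
)

# Median and interquartile range of body condition score within each strain
bcs_median <- tapply(mice_demo$BCS_baseline, mice_demo$Strain, median)
bcs_iqr <- tapply(mice_demo$BCS_baseline, mice_demo$Strain, IQR)
print(bcs_median)
print(bcs_iqr)

# Frequency table of scores by strain, as a quick look at the distribution
print(table(mice_demo$Strain, mice_demo$BCS_baseline))
```

The median and IQR are reported here rather than the mean and standard deviation because they match the rank-based logic of the test.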

We will use the wilcox_test function from the rstatix package to perform the Mann-Whitney U test. We must specify the data and the variables to be considered, in the form dependent variable ~ grouping variable.

We want to use the Mann-Whitney U Test to determine if there is a significant difference in body condition score between the wild type mice and the Cdkn1a knockout mice at the start of the study (BCS_baseline).

This test can only compare two groups, so we use the comparisons argument of the function to specify the two groups being compared:

> wilcox_test(mice, BCS_baseline ~ Strain, comparisons = list(c("KO Cdkn1a", "Wild")))

The RStudio output looks like this:

We can see that there is no significant difference (p>0.05) in body condition score between the wild type mice and the Cdkn1a knockout mice at the baseline.
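
If rstatix is not available, one possible alternative is to subset the data to the two strains and call base R's wilcox.test(). This is only a sketch under stated assumptions: mice_three is a hypothetical data frame with invented values, so its p-value says nothing about the real mice.

```r
# Hypothetical three-strain data frame; values are invented for illustration
mice_three <- data.frame(
  Strain = rep(c("Wild", "KO Cdkn1a", "KO N-ras"), each = 5),
  BCS_baseline = c(3, 4, 3, 5, 4, 3, 2, 4, 3, 3, 2, 3, 2, 4, 3)
)

# Keep only the two strains being compared and drop the unused level
two_groups <- subset(mice_three, Strain %in% c("KO Cdkn1a", "Wild"))
two_groups$Strain <- factor(two_groups$Strain)

# Base R Mann-Whitney U (Wilcoxon rank-sum) test; exact = FALSE requests the
# normal approximation rather than an exact p-value
result <- wilcox.test(BCS_baseline ~ Strain, data = two_groups, exact = FALSE)
print(result$p.value)
```

Subsetting first plays the same role as the comparisons argument: the test only ever sees two groups.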

Question B3.2: Is there a significant difference in body condition score between the two different knockout strains at the end of the trial?

Answer

We can run this comparison by specifying these two strains in the comparisons argument of the function:

> wilcox_test(mice, BCS_end ~ Strain, comparisons = list(c("KO Cdkn1a", "KO N-ras")))

The RStudio output looks like this:

We can see that the two knockout strains are significantly different (p<0.05) in body condition score at the end of the study.

B3.2 PRACTICAL: Stata

The Mann-Whitney U test is also sometimes called the Wilcoxon Rank-Sum test.

When first examining your data, you may want to check the distribution of the variables of interest and calculate appropriate summary statistics for them. To calculate the median, there is a function under the egen command that you can look up (type help egen). You can calculate the IQR by hand from the output of the summarize, detail command, or you can type egen iqr=iqr(var1) and then tab iqr.

For the Mann-Whitney U test (or Wilcoxon rank-sum test), the Stata code is:

ranksum var1, by(var2)

Use the Mann-Whitney U Test to determine if there is a significant difference in body condition score between the wild type mice and the Cdkn1a knockout mice at the start of the study (BCS_baseline).

This test can only compare two groups, so we need to recode our strain variable so that it contains only the two groups being compared:

recode Strain_group (1=1 "Wild") (2=2 "Cdkn1a") (3=.), gen(strain1_2) label(strain12)

tab strain1_2, m

ranksum BCS_baseline, by(strain1_2)

When using small sample sizes (N<200), Stata will report the exact significance alongside the asymptotic significance, so we can report P=0.18 in this case. There is no significant difference in BCS between these groups at baseline.

Question B3.2: Is there a significant difference in body condition score between the two different knockout strains (strains 2 and 3) at the end of the trial?

Answer

recode Strain_group (1=.) (2=2 "Cdkn1a") (3=3 "N-ras"), gen(strain2_3) label(strain23)

ranksum BCS_end, by(strain2_3)

Here P<0.001, so there is a significant difference in BCS between these two groups at the end of the trial.

B3.2 PRACTICAL: SPSS

Use the Mann-Whitney U Test to determine if there is a significant difference in body condition score between the wild type mice and the Cdkn1a knockout mice at the start of the study (BCS_baseline).

Select

Analyze >> Nonparametric Tests >> Legacy Dialogs >> 2 Independent Samples

SPSS assumes that each row is a separate participant or case, so for all independent tests it requires the dependent variable to be in a single column, with a separate grouping variable.

Move the dependent variable of interest (BCS_baseline) into the Test Variable List.

Assign ‘Strain_group’ as the grouping variable and then click ‘Define Groups’. Here you need to add the numerical grouping value of the two groups you wish to test, in this case 1 for wild type and 2 for Cdkn1a knockout.

Make sure Mann-Whitney U is selected at the bottom of the box before you press ‘OK’ to run the test.

Now use the same process to test for any significant differences in body condition score between the two different knockout strains at the end of the trial.

Answer

When using small sample sizes such as this, SPSS will report the exact significance alongside the asymptotic significance, so we can report P=0.289 in this case. There is no significant difference in BCS between these groups at baseline.

Here P<0.001, so there is a significant difference in BCS between these two groups at the end of the trial.
