In this document we will briefly practice some foundational concepts of statistical inference and hypothesis testing. Follow the instructions in the comments of each code chunk.

Exercise 1 - Running a \(t\)-test

# Download the charity donation dataset from the URL provided
url <- "https://peopleanalytics-regression-book.org/data/charity_donation.csv"
charity_data <- read.csv(url)
# Create two vectors capturing the total donations for those who have 
# recently donated and those who have not
donations_recent <- charity_data$total_donations[charity_data$recent_donation == 1]
donations_notrecent <- charity_data$total_donations[charity_data$recent_donation == 0]
# Calculate the difference in the means of these two vectors
# Round this to 2 decimal places
# (named mean_diff to avoid masking the base R function diff())
mean_diff <- mean(donations_recent) - mean(donations_notrecent)
(mean_diff <- round(mean_diff, 2))
## [1] 781.38
# Test the hypothesis that recent donors have a different mean donation amount. 
# Ensure that your test is saved as an object with a name of your choice.
(diff_test <- t.test(donations_recent, donations_notrecent)) 
## 
##  Welch Two Sample t-test
## 
## data:  donations_recent and donations_notrecent
## t = 2.443, df = 130.34, p-value = 0.01591
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##   148.6116 1414.1404
## sample estimates:
## mean of x mean of y 
##  2948.313  2166.937
# EXTENSION: Test the one-sided hypothesis that recent donors donate MORE on average
# (Hint: seek help on the t.test function to work out how to do this)
t.test(donations_recent, donations_notrecent, alternative = "greater")
## 
##  Welch Two Sample t-test
## 
## data:  donations_recent and donations_notrecent
## t = 2.443, df = 130.34, p-value = 0.007953
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  251.5076      Inf
## sample estimates:
## mean of x mean of y 
##  2948.313  2166.937
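
The two-sided and one-sided tests above share the same test statistic and degrees of freedom; only the tail area used for the p-value differs. As a rough check, the sketch below recomputes both p-values from the rounded `t` and `df` figures printed in the output, so it should agree with the printed p-values only approximately. The object names (`t_stat`, `welch_df`, `p_two_sided`, `p_greater`) are illustrative and not part of the exercise.

```r
# Rounded values copied from the t.test output above
t_stat <- 2.443
welch_df <- 130.34

# Two-sided p-value: area in both tails beyond |t|
p_two_sided <- 2 * pt(abs(t_stat), df = welch_df, lower.tail = FALSE)

# One-sided p-value for alternative = "greater": upper tail only
p_greater <- pt(t_stat, df = welch_df, lower.tail = FALSE)

# The t distribution is symmetric and t is positive here,
# so the one-sided p-value is half the two-sided one
c(two_sided = p_two_sided, greater = p_greater)
```

Had the observed difference been negative, the "greater" alternative would instead give a p-value above 0.5, so the direction of the alternative should be chosen before looking at the data.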

Exercise 2 - Interpreting a \(t\)-test

# The results of a t-test are actually a named list.
# You can access specific elements of the list using $ (e.g. test$p.value)
# Return the standard error value for the difference in mean total donations. 
# Round this to 2 decimal places
(se <- diff_test$stderr |> 
   round(2))
## [1] 319.85
# Return the p-value and the 95% confidence interval for the population
# difference in means. Round these to 2 decimal places
(pval <- diff_test$p.value |> 
   round(2))
## [1] 0.02
# (named conf_int to avoid masking the stats function confint())
(conf_int <- diff_test$conf.int |> 
    round(2))
## [1]  148.61 1414.14
## attr(,"conf.level")
## [1] 0.95
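
The quantities extracted above are linked: the test statistic is the difference in sample means divided by its standard error, and the confidence interval is that difference plus or minus a critical value of the \(t\) distribution multiplied by the standard error. The following sketch uses only rounded figures printed earlier in this document, so it should reproduce the reported statistic and interval bounds to within rounding error. Names such as `est_diff`, `se_diff` and `t_crit` are hypothetical and chosen for illustration.

```r
# Rounded values copied from the output earlier in this document
mean_recent <- 2948.313
mean_notrecent <- 2166.937
se_diff <- 319.85
welch_df <- 130.34

# Estimated difference in means and the resulting t statistic
est_diff <- mean_recent - mean_notrecent
t_stat <- est_diff / se_diff

# Two-sided 95% interval: estimate +/- critical value * standard error
t_crit <- qt(0.975, df = welch_df)
ci_two_sided <- est_diff + c(-1, 1) * t_crit * se_diff

# Lower bound of the one-sided 95% interval (alternative = "greater")
ci_lower_one_sided <- est_diff - qt(0.95, df = welch_df) * se_diff

round(c(t = t_stat,
        lower = ci_two_sided[1],
        upper = ci_two_sided[2],
        one_sided_lower = ci_lower_one_sided), 2)
```

Because the two-sided interval excludes zero, it tells the same story as a p-value below 0.05: a population difference of zero is not among the values the data make plausible at the 95% level.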

Use these values to write an interpretation of your \(t\)-test, explaining what you have observed in the sample and what this means you can infer about the population.