Jacob Liljehult
Clinical Nurse Specialist
MSc in Health Science, PhD
Department of Neurology
Nordsjællands Hospital
This page gives examples of how to analyse data where the exposure is categorical but the outcome is quantitative. The data may be a single sample tested against a given reference value, two groups tested against each other, or three (or more) groups tested against each other.
Hypothesis test of a mean against a given reference value
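gn <- 72                                          # reference value to test against
x <- strokedata$age
z <- (mean(x) - gn) / (sd(x) / sqrt(length(x)))   # z statistic: distance from gn in standard errors
2 * pnorm(abs(z), lower.tail = FALSE)             # two-sided p-value from the normal distribution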
[1] 0.6994493
t.test(x, mu = gn)
One Sample t-test
data: x
t = -0.38606, df = 1030, p-value = 0.6995
alternative hypothesis: true mean is not equal to 72
95 percent confidence interval:
71.09142 72.60984
sample estimates:
mean of x
71.85063
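The t statistic and p-value printed by `t.test()` can be reproduced by hand, which also shows how the test relates to the z calculation above. The following is a minimal sketch on simulated data; the vector `x` and its parameters are hypothetical stand-ins and are not the `strokedata` set used on this page.

```r
# Hypothetical data: simulated ages, NOT the strokedata set used on this page
set.seed(1)
x  <- rnorm(200, mean = 71, sd = 12)
gn <- 72                          # reference value to test against

n      <- length(x)
se     <- sd(x) / sqrt(n)         # standard error of the mean
t_stat <- (mean(x) - gn) / se     # same formula as z, but referred to a t distribution

# Two-sided p-values: t distribution with n - 1 degrees of freedom vs. normal approximation
p_t <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)
p_z <- 2 * pnorm(abs(t_stat), lower.tail = FALSE)

res <- t.test(x, mu = gn)
c(by_hand = p_t, t.test = res$p.value, normal_approx = p_z)
```

With 1030 degrees of freedom, as in the output above, the t distribution is very close to the normal distribution, which is why the z-based and t-based p-values on this page agree to three decimals.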
library(ggplot2)
ggplot(aes(x = age), data = strokedata) + geom_boxplot() +
geom_vline(xintercept = gn, color = "orange", size = 1)
Boxplot of age; the orange line marks the reference value being tested against.
Parametric test for the difference between two groups
t.test(age ~ gender, data = strokedata)
Welch Two Sample t-test
data: age by gender
t = 5.4993, df = 991.93, p-value = 4.851e-08
alternative hypothesis: true difference in means between group F and group M is not equal to 0
95 percent confidence interval:
2.714512 5.726636
sample estimates:
mean in group F mean in group M
74.06531 69.84473
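The non-integer degrees of freedom in the output above come from the Welch correction, which `t.test()` applies by default because it does not assume equal variances in the two groups. As a rough sketch of what happens under the hood, the calculation can be written out on simulated data; the data frame `d` below is hypothetical and only mimics the structure of `strokedata`.

```r
# Hypothetical data: simulated ages for two groups, NOT the strokedata set
set.seed(2)
d <- data.frame(
  age    = c(rnorm(120, mean = 74, sd = 13), rnorm(150, mean = 70, sd = 11)),
  gender = rep(c("F", "M"), times = c(120, 150))
)

f <- d$age[d$gender == "F"]
m <- d$age[d$gender == "M"]

vf <- var(f) / length(f)          # squared standard error, group F
vm <- var(m) / length(m)          # squared standard error, group M

# Difference in means divided by its (unpooled) standard error
t_stat <- (mean(f) - mean(m)) / sqrt(vf + vm)

# Welch-Satterthwaite approximation to the degrees of freedom
df_w <- (vf + vm)^2 / (vf^2 / (length(f) - 1) + vm^2 / (length(m) - 1))

p_val <- 2 * pt(abs(t_stat), df = df_w, lower.tail = FALSE)

res <- t.test(age ~ gender, data = d)   # var.equal = FALSE is the default
c(t = t_stat, df = df_w, p = p_val)
```

Passing `var.equal = TRUE` to `t.test()` would instead give the classical pooled-variance test with integer degrees of freedom.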
Non-parametric test for the difference between two groups
wilcox.test(age ~ gender, data = strokedata)
Wilcoxon rank sum test with continuity correction
data: age by gender
W = 160684, p-value = 3.742e-09
alternative hypothesis: true location shift is not equal to 0
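The W statistic can be hard to interpret on its own. One way to read it, sketched here on simulated data, is as the number of (F, M) pairs in which the observation from the first group is the larger one. The data frame `d` is a hypothetical stand-in for `strokedata`, and the pair-counting interpretation assumes there are no tied values.

```r
# Hypothetical data: simulated ages for two groups, NOT the strokedata set
set.seed(3)
d <- data.frame(
  age    = c(rnorm(40, mean = 74, sd = 13), rnorm(50, mean = 70, sd = 11)),
  gender = rep(c("F", "M"), times = c(40, 50))
)

r   <- rank(d$age)                       # ranks in the pooled sample
n_f <- sum(d$gender == "F")
n_m <- sum(d$gender == "M")

# Rank sum of the first group, minus its smallest possible value
W <- sum(r[d$gender == "F"]) - n_f * (n_f + 1) / 2

# Equivalent view: count the pairs where the F value exceeds the M value
pairs_f_larger <- sum(outer(d$age[d$gender == "F"], d$age[d$gender == "M"], ">"))

res <- wilcox.test(age ~ gender, data = d)
c(W = W, pairs = pairs_f_larger, share = W / (n_f * n_m))
```

Dividing W by the number of pairs gives the estimated probability that a randomly chosen observation from the first group is larger than one from the second; a value near 0.5 indicates no shift between the groups.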
ggplot(aes(x = age, fill = gender), data = strokedata) + geom_boxplot() +
geom_vline(xintercept = median(strokedata$age), color = "orange", size = 1)
Boxplot of age by gender. The orange line marks the median age of the whole sample.
Parametric test for the difference between three or more groups
anova(lm(age ~ smoking, data = strokedata))
| | Df | Sum Sq | Mean Sq | F value | Pr(>F) | |
|---|---|---|---|---|---|---|
| smoking | 2 | 12322 | 6161.2 | 43.724 | < 2.2e-16 | *** |
| Residuals | 996 | 140346 | 140.9 | | | |
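Each column of the ANOVA table follows from the one before it: Mean Sq is Sum Sq divided by Df, and the F value is the ratio of the two mean squares (here 6161.2 / 140.9). The sketch below rebuilds the table by hand on simulated data. The data frame `d` and the smoking level names are hypothetical, since the levels in `strokedata` are not shown on this page.

```r
# Hypothetical data: three simulated groups, NOT the strokedata set
set.seed(4)
d <- data.frame(
  age     = c(rnorm(60, mean = 75, sd = 12),
              rnorm(60, mean = 70, sd = 12),
              rnorm(60, mean = 66, sd = 12)),
  smoking = factor(rep(c("never", "former", "current"), each = 60))
)

grand    <- mean(d$age)
grp_mean <- ave(d$age, d$smoking)            # each observation's own group mean

ss_between <- sum((grp_mean - grand)^2)      # "smoking" row, Sum Sq
ss_within  <- sum((d$age - grp_mean)^2)      # "Residuals" row, Sum Sq

df_between <- nlevels(d$smoking) - 1
df_within  <- nrow(d) - nlevels(d$smoking)

# F is the between-group mean square over the within-group mean square
f_stat <- (ss_between / df_between) / (ss_within / df_within)
p_val  <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

tab <- anova(lm(age ~ smoking, data = d))
c(F = f_stat, p = p_val)
```

A large F means the group means differ more than the spread within the groups would suggest; the test does not say which groups differ from each other.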
ggplot(aes(x = age, fill = smoking), data = strokedata) + geom_boxplot() +
geom_vline(xintercept = median(strokedata$age), color = "orange", size = 1)
Boxplot of age by smoking status. The orange line marks the median age of the whole sample.