Statistics
examples in R

Jacob Liljehult
Clinical nurse specialist
cand.scient.san, Ph.d.

Department of Neurology
Nordsjællands Hospital

Grouped quantitative data

This page gives examples of how to analyse data where the exposure is categorical but the outcome is quantitative. The data can be a single sample tested against a given reference value, two groups tested against each other, or three (or more) groups tested against each other.
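The strokedata data frame used in the examples is not included on this page. To try the code without it, a simulated stand-in could look like the sketch below. The name strokedata_sim, the sample size, the distribution parameters and the smoking category names are all assumptions made for illustration, not properties of the real data.

```r
# Hypothetical stand-in for strokedata (the real dataset is not shown here).
set.seed(1)  # make the simulation reproducible
n <- 200     # assumed sample size

strokedata_sim <- data.frame(
  # quantitative outcome
  age = round(rnorm(n, mean = 72, sd = 12)),
  # categorical exposure with two levels
  gender = factor(sample(c("F", "M"), n, replace = TRUE),
                  levels = c("F", "M")),
  # categorical exposure with three levels (level names are assumed)
  smoking = factor(sample(c("never", "former", "current"), n, replace = TRUE),
                   levels = c("never", "former", "current"))
)

str(strokedata_sim)  # one numeric column and two factors
```

Replacing strokedata with strokedata_sim in the examples lets them run, although the results will of course differ from the output shown on this page.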

One-sample tests

Hypothesis test of the mean against a given reference value

One-sample z-test

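gn <- 72                                         # reference value to test against
x <- strokedata$age
z <- (mean(x) - gn) / (sd(x) / sqrt(length(x)))  # distance from gn in standard errors
pnorm(abs(z), lower.tail = FALSE) * 2            # two-sided p-value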

[1] 0.6994493
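Written out, the calculation above corresponds to the following formulas, where the symbols are notation introduced here: \bar{x} is the sample mean, s the sample standard deviation, n the number of observations, \mu_0 the reference value (gn in the code) and \Phi the standard normal distribution function. The sign of z depends only on the order of the subtraction and does not affect the two-sided p-value.

```latex
z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}},
\qquad
p = 2\,\bigl(1 - \Phi(|z|)\bigr)
```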

One-sample t-test

t.test(x, mu = gn)

One Sample t-test

data: x
t = -0.38606, df = 1030, p-value = 0.6995
alternative hypothesis: true mean is not equal to 72
95 percent confidence interval:
  71.09142 72.60984
sample estimates:
mean of x
  71.85063
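The printed output is a formatted view of a list returned by t.test(), so the individual numbers can be extracted and checked by hand. The sketch below uses simulated values (x_sim is a hypothetical stand-in for the age variable) and recomputes the t statistic and p-value from their definitions.

```r
# Simulated ages; a hypothetical stand-in for strokedata$age.
set.seed(2)
x_sim <- rnorm(100, mean = 70, sd = 10)
mu0 <- 72  # reference value, as in the example above

res <- t.test(x_sim, mu = mu0)

res$statistic  # t value
res$parameter  # degrees of freedom (n - 1)
res$p.value    # two-sided p-value
res$conf.int   # 95% confidence interval for the mean

# The same quantities computed from their definitions
se <- sd(x_sim) / sqrt(length(x_sim))
t_manual <- (mean(x_sim) - mu0) / se
p_manual <- 2 * pt(abs(t_manual), df = length(x_sim) - 1, lower.tail = FALSE)

all.equal(unname(res$statistic), t_manual)
all.equal(res$p.value, p_manual)
```

The only difference from the z-test is that the p-value comes from the t distribution with n - 1 degrees of freedom instead of the normal distribution, which is why the two p-values above agree so closely for a sample of this size.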

Visualisation

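library(ggplot2)
# Boxplot of age with the reference value gn drawn as a vertical line.
# linewidth replaces size for lines in ggplot2 3.4.0 and later.
ggplot(strokedata, aes(x = age)) + geom_boxplot() +
  geom_vline(xintercept = gn, color = "orange", linewidth = 1)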

Boxplot of age; the orange line marks the reference value being tested against.



Two groups

Student's t-test

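Parametric test for a difference between two groups. By default t.test() does not assume equal variances and therefore runs Welch's version of the test, as the header of the output shows; var.equal = TRUE gives the classical Student's t-test.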

t.test(age ~ gender, data = strokedata)

Welch Two Sample t-test

data: age by gender
t = 5.4993, df = 991.93, p-value = 4.851e-08
alternative hypothesis: true difference in means between group F and group M is not equal to 0
95 percent confidence interval:
2.714512 5.726636
sample estimates:
mean in group F mean in group M
  74.06531   69.84473
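To see how the default Welch test relates to the classical Student's t-test, the two can be run side by side. The sketch below uses simulated data (the data frame sim and its group sizes and variances are assumptions for illustration) and compares the method labels and the degrees of freedom.

```r
# Simulated two-group data with unequal variances (hypothetical values).
set.seed(3)
sim <- data.frame(
  age    = c(rnorm(60, mean = 74, sd = 14), rnorm(80, mean = 70, sd = 10)),
  gender = factor(rep(c("F", "M"), times = c(60, 80)))
)

welch   <- t.test(age ~ gender, data = sim)                    # default: Welch
student <- t.test(age ~ gender, data = sim, var.equal = TRUE)  # pooled variance

# The method label shows which version was run
welch$method
student$method

# Student's test uses n1 + n2 - 2 degrees of freedom;
# Welch's approximation gives a value that is at most that large.
c(welch = unname(welch$parameter), student = unname(student$parameter))
```

When the group variances and sizes are similar the two tests give nearly identical results; the Welch version is the safer default when they are not.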

Wilcoxon rank sum test

Non-parametric test for a difference between two groups

wilcox.test(age ~ gender, data = strokedata)

Wilcoxon rank sum test with continuity correction

data: age by gender
W = 160684, p-value = 3.742e-09
alternative hypothesis: true location shift is not equal to 0
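The test above only reports a p-value. If an estimate of the size of the shift is also wanted, wilcox.test() can return one together with a confidence interval when conf.int = TRUE is set. The sketch below uses simulated data (sim is a hypothetical stand-in for the real dataset).

```r
# Simulated two-group data (hypothetical values).
set.seed(4)
sim <- data.frame(
  age    = c(rnorm(50, mean = 74, sd = 12), rnorm(50, mean = 70, sd = 12)),
  gender = factor(rep(c("F", "M"), each = 50))
)

w <- wilcox.test(age ~ gender, data = sim, conf.int = TRUE)

w$statistic  # W statistic
w$p.value    # two-sided p-value
w$estimate   # estimated location shift (first level minus second: F minus M)
w$conf.int   # 95% confidence interval for the shift
```

The estimate is the median of all pairwise differences between the two groups, not the difference between the group medians, so it will usually differ slightly from the latter.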

# Boxplots of age by gender; the vertical line marks the overall median age.
ggplot(strokedata, aes(x = age, fill = gender)) + geom_boxplot() +
  geom_vline(xintercept = median(strokedata$age), color = "orange", linewidth = 1)

Boxplot of age by gender. The orange line marks the median age of the whole sample.



Three or more groups

ANOVA

anova( lm(age ~ smoking, data = strokedata) )

Analysis of Variance Table

Response: age
           Df Sum Sq Mean Sq F value    Pr(>F)
smoking     2  12322  6161.2  43.724 < 2.2e-16 ***
Residuals 996 140346   140.9
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
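The F value in the table is the ratio of the two Mean Sq entries, and a significant F test only says that at least one group mean differs from the others. The sketch below, on simulated data (sim and its smoking level names are assumptions for illustration), checks the ratio and follows up with Tukey's pairwise comparisons, one common choice for a post hoc analysis.

```r
# Simulated three-group data (hypothetical values and level names).
set.seed(5)
sim <- data.frame(
  age     = c(rnorm(40, mean = 75, sd = 12),
              rnorm(40, mean = 72, sd = 12),
              rnorm(40, mean = 66, sd = 12)),
  smoking = factor(rep(c("never", "former", "current"), each = 40))
)

tab <- anova(lm(age ~ smoking, data = sim))
tab

# F is the between-group mean square divided by the residual mean square
f_manual <- tab[["Mean Sq"]][1] / tab[["Mean Sq"]][2]
all.equal(f_manual, tab[["F value"]][1])

# Pairwise differences between groups with adjusted p-values
tuk <- TukeyHSD(aov(age ~ smoking, data = sim))
tuk
```

With three groups there are three pairwise comparisons, each reported with a difference in means, a confidence interval and a p-value adjusted for multiple testing.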

ggplot(strokedata, aes(x = age, fill = smoking)) + geom_boxplot() +
  geom_vline(xintercept = median(strokedata$age), color = "orange", linewidth = 1)

Boxplot of age by smoking status. The orange line marks the median age of the whole sample.