The R program below generates hypothetical data concerning student characteristics (GPA and Effort) and whether or not the student should be hired (Hirable).
The following results were extracted from the output and should be used to answer parts a and b. There is no need to run a computer program.
n for training data set = 100
Prior probabilities for Hirable: Yes = .55, No = .45
Frequency Counts

| Hirable | GPA | Effort: lots | Effort: some |
|---|---|---|---|
| No (n=45) | 1.poor | 8 | 8 |
| No (n=45) | 2.average | 8 | 12 |
| No (n=45) | 3.excellent | 3 | 6 |
| Yes (n=55) | 1.poor | 9 | 9 |
| Yes (n=55) | 2.average | 5 | 6 |
| Yes (n=55) | 3.excellent | 18 | 8 |

Posterior Probabilities

| GPA | Effort | Hirable: No | Hirable: Yes |
|---|---|---|---|
| 1.poor | lots | 0.392 | 0.608 |
| 1.poor | some | 0.551 | 0.449 |
| 2.average | lots | 0.569 | 0.431 |
| 2.average | some | 0.715 | 0.285 |
| 3.excellent | lots | 0.201 | 0.799 |
| 3.excellent | some | 0.324 | 0.676 |

Questions:
- a) Verify the posterior probability computations for a student with a poor GPA who showed lots of Effort.

Posterior Probabilities

| GPA | Effort | Hirable: No | Hirable: Yes |
|---|---|---|---|
| 1.poor | lots | 0.392 | 0.608 |

- b) How would a student with a poor GPA who showed lots of Effort be classified? Hirable Yes? No?
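
One way to check parts a and b by hand is sketched below in base R. It assumes the naive Bayes model uses no Laplace smoothing, so that each conditional probability is a simple proportion taken from the frequency counts above. The matrices `counts_no` and `counts_yes` and the helper `nb_posterior` are hypothetical names introduced only for this illustration; they are not part of the assignment's program.

```r
# Frequency counts copied from the table above (rows: GPA, columns: Effort)
gpa_levels    <- c("1.poor", "2.average", "3.excellent")
effort_levels <- c("lots", "some")
counts_no  <- matrix(c(8,  8,
                       8, 12,
                       3,  6), nrow = 3, byrow = TRUE,
                     dimnames = list(GPA = gpa_levels, Effort = effort_levels))
counts_yes <- matrix(c( 9, 9,
                        5, 6,
                       18, 8), nrow = 3, byrow = TRUE,
                     dimnames = list(GPA = gpa_levels, Effort = effort_levels))

# Hypothetical helper: naive Bayes posterior for one (GPA, Effort) pair,
# assuming no Laplace smoothing
nb_posterior <- function(gpa, effort) {
  n_no  <- sum(counts_no)   # 45
  n_yes <- sum(counts_yes)  # 55
  prior    <- c(No = n_no, Yes = n_yes) / (n_no + n_yes)
  # P(GPA | Hirable) and P(Effort | Hirable) from the marginal counts
  p_gpa    <- c(No  = sum(counts_no[gpa, ])  / n_no,
                Yes = sum(counts_yes[gpa, ]) / n_yes)
  p_effort <- c(No  = sum(counts_no[, effort])  / n_no,
                Yes = sum(counts_yes[, effort]) / n_yes)
  score <- prior * p_gpa * p_effort  # unnormalized posterior
  score / sum(score)                 # normalize so the two values sum to 1
}

post <- nb_posterior("1.poor", "lots")
print(round(post, 3))          # No 0.392, Yes 0.608
print(names(which.max(post)))  # "Yes"
```

Under these assumptions the unnormalized scores are 0.45 × (16/45) × (19/45) ≈ 0.0676 for No and 0.55 × (18/55) × (32/55) ≈ 0.1047 for Yes, which normalize to the tabulated 0.392 and 0.608. The larger posterior is for Yes, so the student would be classified as Hirable = Yes.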
R program
#HW09
library(faraway)
library(caret)
library(e1071)
library(psych)
library(naivebayes)
set.seed(1432)
### Simulate example data
n <- 100
train = data.frame(Hirable = sample(c("Yes", "No"), n, TRUE),
                   GPA = sample(c("1.poor", "2.average", "3.excellent"), n, TRUE),
                   Effort = sample(c("lots", "some"), n, TRUE))
xtabs(~ GPA + Effort + Hirable, data=train)
nb <- naive_bayes(Hirable ~ GPA + Effort, data=train)
summary(nb)
test1 = data.frame(GPA = c("1.poor"), Effort = c("lots"))
test2 = data.frame(GPA = c("1.poor"), Effort = c("some"))
test3 = data.frame(GPA = c("2.average"), Effort = c("lots"))
test4 = data.frame(GPA = c("2.average"), Effort = c("some"))
test5 = data.frame(GPA = c("3.excellent"), Effort = c("lots"))
test6 = data.frame(GPA = c("3.excellent"), Effort = c("some"))
test=rbind(test1,test2,test3,test4,test5,test6)
test
# Classification
predict(nb, test, type = "class")
nb %class% test
# Posterior probabilities
predict(nb, test, type = "prob")
nb %prob% test