library(tidyverse)
library(tidymodels)
library(knitr)
library(NHANES)
AE 16: Conditions for logistic regression
Go to the course GitHub organization and locate your ae-16 repo to get started.
Render, commit, and push your responses to GitHub by the end of class. The responses are due in your GitHub repo no later than Saturday, November 18 at 11:59pm.
Packages
Today’s data is the nhanes_adult data set, derived from the NHANES data set in the NHANES R package. It contains information for U.S. adults (age 18+) who participated in the 2009-2010 and 2011-2012 years of the National Health and Nutrition Examination Survey (NHANES). You can find details about how participants are selected on the CDC website.
We will use the following variables:
- HealthGen: Self-reported rating of participant’s health in general. One of Excellent, Vgood, Good, Fair, or Poor.
- Age: Age at time of screening (in years). Participants 80 or older were recorded as 80.
- PhysActive: Participant does moderate to vigorous-intensity sports, fitness, or recreational activities.
nhanes_adult <- read_csv("data/nhanes-adult.csv")
The goal of this AE is to check the conditions for a model that uses age and physical activity to understand the odds an adult in the United States rates their health as “excellent”.
Click here to see notes on conditions for logistic regression.
- Create a new variable called HealthExcellent that takes the value 1 if HealthGen = “Excellent” and 0 otherwise. Make a table showing the distribution of HealthExcellent.
# add code
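As a rough illustration of the pattern this exercise asks for, the sketch below builds a 0/1 indicator with if_else() and tabulates it with count(). The toy tibble and its values are made up for illustration and are not the NHANES data; the same steps would apply to nhanes_adult.

```r
library(tidyverse)

# Toy data (hypothetical values, NOT the NHANES data)
toy <- tibble(
  HealthGen = c("Excellent", "Good", "Vgood", "Excellent", "Poor", "Fair")
)

# Indicator: 1 if the rating is "Excellent", 0 otherwise
toy <- toy |>
  mutate(HealthExcellent = if_else(HealthGen == "Excellent", 1, 0))

# Distribution of the indicator: counts and proportions
health_dist <- toy |>
  count(HealthExcellent) |>
  mutate(prop = n / sum(n))

health_dist
```

Storing the indicator as a number keeps it easy to average later; converting it to a factor may be preferable before fitting a model with tidymodels.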
- Calculate the empirical logit of HealthExcellent == 1 for each level of PhysActive. Then describe what the empirical logit means in the context of the data.
# add code
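The empirical logit for a group is the log of the observed odds, log(p̂ / (1 - p̂)), where p̂ is the proportion of observations in that group with HealthExcellent equal to 1. The sketch below shows one way to compute it per group; the toy tibble is made up for illustration and is not the NHANES data.

```r
library(tidyverse)

# Toy data (hypothetical values, NOT the NHANES data)
toy <- tibble(
  PhysActive      = c("Yes", "Yes", "Yes", "Yes", "No", "No", "No", "No"),
  HealthExcellent = c(1, 1, 0, 0, 1, 0, 0, 0)
)

# Per group: observed proportion of 1s, then the log-odds of that proportion
emp_logit_tbl <- toy |>
  group_by(PhysActive) |>
  summarise(
    p_hat     = mean(HealthExcellent),
    emp_logit = log(p_hat / (1 - p_hat))
  )

emp_logit_tbl
```

In this toy example, half of the "Yes" group has HealthExcellent equal to 1, so the odds are 1 and the empirical logit is 0; a negative value means the outcome is less likely than not in that group.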
- Check the model conditions for a logistic regression model that uses age and physical activity to understand the odds an adult in the United States rates their health as “excellent”.
State whether each condition is satisfied and briefly explain your response.
  - Linearity
  - Randomness
  - Independence
You can make the plots to check linearity “manually” using ggplot or using the functions from the Stat2Data package.
If you use the Stat2Data package, you need to add library(Stat2Data) to the load-packages code chunk at the top of the document. You may also need to install the package by running the code below in the console.
install.packages("Stat2Data")
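For the “manual” ggplot route, one possible approach is to bin the quantitative predictor, compute the empirical logit in each bin, and plot it against the bin’s mean; a roughly straight pattern is consistent with linearity. The sketch below uses simulated data (not the NHANES data), and the choice of five bins is an assumption made for illustration.

```r
library(tidyverse)

# Simulated data (hypothetical, NOT the NHANES data)
set.seed(16)
toy <- tibble(
  Age = sample(18:80, 500, replace = TRUE),
  HealthExcellent = rbinom(500, size = 1, prob = plogis(-1 - 0.02 * (Age - 18)))
)

# Split Age into 5 bins with roughly equal counts, then compute
# the empirical logit of HealthExcellent within each bin
binned <- toy |>
  mutate(age_bin = cut_number(Age, n = 5)) |>
  group_by(age_bin) |>
  summarise(
    mean_age  = mean(Age),
    p_hat     = mean(HealthExcellent),
    emp_logit = log(p_hat / (1 - p_hat))
  )

# Empirical logit vs. mean age per bin, with a reference line
ggplot(binned, aes(x = mean_age, y = emp_logit)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(x = "Mean age in bin", y = "Empirical logit of HealthExcellent")
```

A bin in which every response is 0 or every response is 1 has an infinite empirical logit, so the number of bins may need adjusting for sparse data.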
Submission
To submit the AE:
- Render the document to produce the PDF with all of your work from today’s class.
- Push all your work to your ae-16 repo on GitHub. (You do not submit AEs on Gradescope.)