Piping

Author

Jeffrey R. Stevens

Published

February 17, 2023

For these exercises, we’ll use the dog breed traits data set.

Build a single pipeline that performs the following steps and assign the result to traits:
import data from https://jeffreyrstevens.quarto.pub/dpavir/data/dog_breed_traits.csv
subset only the columns Breed through Coat Length
remove the Drooling Level column

# >
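As a hedged illustration of the pattern rather than the solution, the sketch below runs the same kind of steps on a tiny inline CSV. The values and the object name toy_traits are invented, and dplyr plus the native pipe (|>, R 4.1 or later) are assumptions about the setup. In the actual exercise the inline text would be replaced by reading the URL above, for example with readr::read_csv().

```r
library(dplyr)

# Toy stand-in for the real file; the values are invented for illustration.
csv_text <- "Breed,Affectionate With Family,Drooling Level,Coat Length,Openness To Strangers
Beagle,3,1,Short,3
Poodle,5,1,Long,5
Boxer,4,3,Short,4"

toy_traits <- read.csv(text = csv_text, check.names = FALSE) |>
  select(Breed:`Coat Length`) |>   # keep a contiguous range of columns
  select(-`Drooling Level`)        # then drop one column by name

names(toy_traits)
```

Column names that contain spaces need backticks inside select(), which is one reason the next exercise renames them.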

Rename the columns of traits to "breed", "affectionate", "children", "other_dogs", "shedding", "grooming", "coat_type", "coat_length".

# >
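One possible way to replace every column name at once, sketched on a made-up two-column data frame: base R's setNames() takes the full vector of new names and works inside a pipe. The objects toy and new_names are hypothetical. dplyr::rename(new = old) is an alternative when only some columns change.

```r
# Made-up data with awkward column names.
toy <- data.frame(
  Breed = c("Beagle", "Poodle"),
  `Shedding Level` = c(3, 1),
  check.names = FALSE
)

# One new name per existing column, in the same order as the columns.
new_names <- c("breed", "shedding")

toy_renamed <- toy |>
  setNames(new_names)

names(toy_renamed)
```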

Build a pipeline that performs the following steps and assign the result to traits2:
rescale all of the rating columns by subtracting 1 from every value
create a new column called coat that combines the coat_type and coat_length columns by pasting their values together, separated by "-"
create a new column called shed that dichotomizes shedding, such that values of 3 and above are “A lot” and values below 3 are “Not much”, and place the new column after shedding
calculate the row-wise mean of the children and other_dogs ratings in a new column called mean_rating and place it after other_dogs

# >
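The steps above can be sketched on made-up data as follows. This is a minimal illustration that assumes dplyr and the native pipe, not the intended answer: the toy data frame has fewer columns than the real one, and the choice of across(), ifelse(), and a simple average are just one way to express each step.

```r
library(dplyr)

# Made-up ratings on a 1-5 scale.
toy <- data.frame(
  breed = c("Beagle", "Poodle", "Boxer"),
  children = c(5, 5, 5),
  other_dogs = c(5, 3, 3),
  shedding = c(4, 1, 2),
  coat_type = c("Smooth", "Curly", "Smooth"),
  coat_length = c("Short", "Long", "Short")
)

toy2 <- toy |>
  # shift every rating column from 1-5 to 0-4
  mutate(across(children:shedding, ~ .x - 1)) |>
  # paste two character columns together with a hyphen
  mutate(coat = paste(coat_type, coat_length, sep = "-")) |>
  # dichotomize the rescaled shedding values and position the result
  mutate(shed = ifelse(shedding >= 3, "A lot", "Not much"), .after = shedding) |>
  # row-wise mean of two columns, positioned after other_dogs
  mutate(mean_rating = (children + other_dogs) / 2, .after = other_dogs)

toy2
```

Because the steps run in order, shed is computed on the rescaled values; swapping the order of the steps would change which breeds count as “A lot”.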

Build a pipeline that performs the following steps and assign the result to coat_grooming:
subset only the grooming and coat_type columns
fit a linear model with lm() using the formula grooming ~ coat_type (remember to use a placeholder for the data argument)
apply the summary() function
print the results to the console

# >
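A minimal sketch of the placeholder idea on made-up data, assuming dplyr and R 4.2 or later. lm() takes the formula as its first argument, so the piped data frame has to be routed to data explicitly. With the native pipe the placeholder is _ and must be passed to a named argument; with the magrittr pipe (%>%) the placeholder is . instead. The names toy and toy_model are hypothetical.

```r
library(dplyr)

# Made-up grooming ratings for three coat types.
toy <- data.frame(
  breed = c("A", "B", "C", "D", "E", "F"),
  grooming = c(1, 2, 3, 4, 2, 3),
  coat_type = c("Smooth", "Smooth", "Curly", "Curly", "Wiry", "Wiry")
)

toy_model <- toy |>
  select(grooming, coat_type) |>
  lm(grooming ~ coat_type, data = _) |>  # _ marks where the piped data goes
  summary()

print(toy_model)
```

Assigning the pipeline to an object stores the summary without displaying it, so a separate print() call (or typing the object name) is what sends the results to the console.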