Piping
For these exercises, we’ll use the dog breed traits data set.
- Create a pipeline to do all of the following:
  - assign the pipeline to `traits`
  - import data from https://jeffreyrstevens.quarto.pub/dpavir/data/dog_breed_traits.csv
  - subset only the columns Breed through Coat Length
  - remove the Drooling Level column
  - rename the columns to `"breed", "affectionate", "children", "other_dogs", "shedding", "grooming", "coat_type", "coat_length"`
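
One possible way to write this pipeline is sketched below. It assumes the readr and dplyr packages are installed, R 4.1 or later for the base pipe `|>`, and that the column names in the CSV match the exercise text exactly (`Breed`, `Coat Length`, `Drooling Level`). Using `setNames()` for the renaming step is only one option; `rename()` would also work.

```r
library(readr)
library(dplyr)

traits <- read_csv(
  "https://jeffreyrstevens.quarto.pub/dpavir/data/dog_breed_traits.csv",
  show_col_types = FALSE  # suppress the column-specification message
) |>
  # keep the contiguous block of columns from Breed to Coat Length
  # (assumes these are the exact column names in the file)
  select(Breed:`Coat Length`) |>
  # drop the drooling column from that block
  select(-`Drooling Level`) |>
  # replace all remaining column names, in order
  setNames(c(
    "breed", "affectionate", "children", "other_dogs",
    "shedding", "grooming", "coat_type", "coat_length"
  ))
```

Because `setNames()` assigns names by position, it fails with an error if the number of remaining columns is not eight, which is a quick check that the two `select()` steps did what was intended.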
- Do the following using `traits`:
  - assign to `traits2`
  - rescale all of the ratings columns by subtracting 1 from all of the values
  - create a new column called `coat` that combines the `coat_type` and `coat_length` columns by pasting the values of those two columns separated by `-`
  - create a new column called `shed` that dichotomizes `shedding` such that values of 3 and above are “A lot” and values below 3 are “Not much”, and place the new column after `shedding`
  - calculate the mean rating for the `children` and `other_dogs` columns in a column called `mean_rating` and place it after `other_dogs`
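
The steps above could be sketched as follows. To keep the example self-contained, it builds a small stand-in for `traits` with made-up values (not the real data), and it assumes dplyr is installed and R 4.1 or later for `|>` and the `\(x)` lambda syntax. The steps run in the order listed, so the cutoff of 3 for `shed` is applied to the already rescaled `shedding` values; that ordering is an interpretation of the exercise.

```r
library(dplyr)

# Hypothetical stand-in for `traits`; the values are made up for illustration
traits <- tibble(
  breed        = c("Breed A", "Breed B", "Breed C"),
  affectionate = c(5, 4, 3),
  children     = c(5, 3, 1),
  other_dogs   = c(4, 3, 2),
  shedding     = c(4, 2, 3),
  grooming     = c(2, 1, 3),
  coat_type    = c("Double", "Smooth", "Wiry"),
  coat_length  = c("Short", "Short", "Medium")
)

traits2 <- traits |>
  # rescale every ratings column by subtracting 1
  mutate(across(affectionate:grooming, \(x) x - 1)) |>
  # paste coat type and length together, separated by "-"
  mutate(coat = paste(coat_type, coat_length, sep = "-")) |>
  # dichotomize shedding and place the result right after it
  mutate(
    shed = if_else(shedding >= 3, "A lot", "Not much"),
    .after = shedding
  ) |>
  # row-wise mean of the two ratings, placed after other_dogs
  mutate(
    mean_rating = (children + other_dogs) / 2,
    .after = other_dogs
  )
```

The new columns are created in separate `mutate()` calls because `.after` applies to every new column created in the same call.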
- Do the following using `traits2`:
  - assign to `coat_grooming`
  - subset only the `grooming` and `coat_type` columns
  - run a linear model (`lm`) using the formula `grooming ~ coat_type` (remember to use a placeholder for the data)
  - apply the `summary()` function
  - print the results to console
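
A minimal sketch of this last pipeline is shown below. It uses a small made-up stand-in for `traits2` (hypothetical values, only the two columns needed) so that it runs on its own, and it assumes dplyr is installed and R 4.2 or later, since the base pipe placeholder `_` was introduced in that version and must be passed to a named argument.

```r
library(dplyr)

# Hypothetical stand-in for `traits2`; the values are made up for illustration
traits2 <- tibble(
  grooming  = c(1, 2, 3, 1, 0, 1),
  coat_type = c("Double", "Double", "Double", "Smooth", "Smooth", "Smooth")
)

coat_grooming <- traits2 |>
  select(grooming, coat_type) |>
  # data is not lm()'s first argument, so the `_` placeholder
  # tells the pipe where to put the incoming data frame
  lm(grooming ~ coat_type, data = _) |>
  summary()

# print the model summary to the console
print(coat_grooming)
```

Without the placeholder, `|>` would pass the data frame as the first argument of `lm()`, which expects the formula in that position.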