Merging rows

Author

Jeffrey R. Stevens

Published

March 3, 2023

For these exercises, we’ll use the dog breed traits and dog breed popularity rankings data sets.

  1. Load tidyverse, import dog_breed_traits_clean.csv to traits, import dog_breed_ranks.csv to ranks, and import dog_breed_ranks.csv to popularity.
# >
  1. First, set a random seed with set.seed(2). Then create a subset of ranks that is a random selection of 10% of the rows, sort it by breed name, and assign it to ranks2.
# >
  1. Use a filtering join to return the subset of traits that matches the breeds in ranks2 and assign this to traits2.
# >
  1. Use a filtering join to return the subset of traits that excludes the breeds in ranks2.
# >
  1. Now we want to filter traits based on the breeds in popularity. Notice that the breed column in popularity is named Breed, whereas the breed column in traits is named breed. Because column names are case-sensitive, a join cannot match these two columns automatically. Use join_by() in a filtering join to return the rows of traits whose breeds appear in popularity. How many rows are returned?
# >
  1. Use filter() (not joins) to return the subset of traits that excludes the breeds in ranks2.
# >
  1. Append traits2 to the bottom of itself.
# >
  1. Append traits2 to the right of itself.
# >
  1. Append traits2 to the right of ranks2.
# >
  1. Why is appending traits2 to the right of ranks2 not a good idea? What would be a better way to combine the two data frames?
# >
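
As a warm-up before working through the exercises, here is a minimal sketch of the functions involved, run on toy data. The pets and owners data frames and their columns are hypothetical and made up for illustration; they are not the dog breed files, and this is not a solution to the exercises. It assumes dplyr 1.1.0 or later, which provides join_by().

```r
library(dplyr)

# Hypothetical toy data (not the dog breed data sets)
pets <- tibble(name = c("Ada", "Bo", "Cy", "Di"), age = c(3, 5, 2, 7))
# Note the capitalized key column and the different row order
owners <- tibble(Name = c("Di", "Bo"), owner = c("Lee", "Kim"))

# Filtering joins keep or drop rows of pets without adding columns.
# join_by() maps the differently named key columns (name vs. Name).
matched <- semi_join(pets, owners, by = join_by(name == Name))
unmatched <- anti_join(pets, owners, by = join_by(name == Name))

# The same exclusion using filter() and %in% instead of a join
unmatched2 <- filter(pets, !name %in% owners$Name)

# bind_rows() stacks data frames vertically
stacked <- bind_rows(matched, matched)  # 4 rows, 2 columns

# bind_cols() pastes data frames side by side purely by row position,
# so here Bo's row ends up next to Di's owner
side_by_side <- bind_cols(matched, owners)

# A mutating join aligns rows by key rather than by position
joined <- left_join(matched, owners, by = join_by(name == Name))
```

Comparing side_by_side with joined shows why binding columns is fragile: it only lines up correctly when both data frames happen to share the same row order, whereas a join matches rows on the key itself.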