2023-02-08
Excel (.xls/.xlsx): Binary matrix file with formatting, formulas, multiple sheets
Comma-separated values (.csv): Plain text matrix file without formatting, etc. (also TSV)
Other program-specific files: SPSS, SAS, etc.
Text files (.txt): Plain text file of raw text
Start saving data as CSVs and convert other formats to CSV
Download data for dog breed popularity.
Create data/ directory in your dpavir2023 course directory.
Save dog_breed_popularity.csv into the data/ directory.
View file in RStudio file manager
read.table()
Header row (turn off with header = FALSE)
Comma separated (change with sep = ";" or use read.csv2())
Outputs data frame
read.csv(file = "path/to/file.csv")
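A minimal sketch of overriding those defaults, assuming a small semicolon-separated file with no header row. The file is written to a temporary location and its breed values are made up for illustration; they are not the course data.

```r
# Write a small semicolon-separated file with no header row.
# The values are invented for this example.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Beagle;6", "Bulldog;5", "Poodle;7"), tmp)

# Override the read.csv() defaults: no header row, semicolon separator.
# col.names supplies the names that the missing header would have provided.
breeds <- read.csv(file = tmp, header = FALSE, sep = ";",
                   col.names = c("breed", "rank"))

# The result is a base R data frame.
str(breeds)
```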
Control column names with col_names (including renaming)
Control column types with col_types
Control missing values with na and quoted_na (quoted_na is deprecated as of readr 2.0.0)
Can skip rows before reading data with skip or cut off with n_max
Outputs tibble
read_csv(file = "path/to/file.csv")
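The arguments above can be combined in one call. This is a sketch under stated assumptions: the file is a temporary one with made-up contents (a junk first line, a header, and the string "missing" standing in for a missing value), not the course data.

```r
library(readr)

# Made-up file: one junk line, a header row, then four data rows.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "exported from a made-up source",
  "Breed,Rank",
  "Beagle,6",
  "Bulldog,missing",
  "Poodle,7",
  "Boxer,14"
), tmp)

breeds <- read_csv(
  file = tmp,
  skip = 2,                        # skip the junk line and the original header
  col_names = c("breed", "rank"),  # supply our own (renamed) column names
  col_types = "ci",                # character, then integer
  na = c("", "NA", "missing"),     # treat "missing" as NA
  n_max = 3                        # read only the first three data rows
)

# The result is a tibble with three rows; the Boxer row is cut off.
breeds
```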
Both read.csv() and read_csv() can import CSV files available online when given the URL as the path.
https://jeffreyrstevens.quarto.pub/dpavir/data/dog_breed_traits.csv
Character/factor columns in quotes with quote = TRUE
Remove row names with row.names = FALSE (col.names = FALSE works only in write.table(); write.csv() ignores it with a warning)
write.csv(df, file = "path/to/file.csv")
write_csv(df, file = "path/to/file.csv")
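A short round trip with base R, assuming a small made-up data frame and a temporary file, shows what row.names = FALSE changes in the written file.

```r
# Made-up data frame for illustration.
df <- data.frame(breed = c("Beagle", "Poodle"), rank = c(6L, 7L))
tmp <- tempfile(fileext = ".csv")

# row.names = FALSE keeps the 1, 2, ... row labels out of the file.
# Character columns and the header are quoted by default (quote = TRUE).
write.csv(df, file = tmp, row.names = FALSE)
readLines(tmp)

# Reading the file back recovers the same columns.
df2 <- read.csv(tmp)
```

Without row.names = FALSE, write.csv() adds an unnamed first column of row labels, which shows up as an extra column when the file is read back.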
Functions: read_xls(), read_xlsx(), read_excel()
Specify sheet with sheet argument
Specify subset of cells with range argument
Like read_csv(), has col_names, col_types, na, skip, n_max
read_excel(path = "path/to/file.xlsx")
library(readxl)
library(here)
mydf5 <- read_excel(here("data/dog_breed_data.xlsx"), sheet = "Sheet2")
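To try the sheet and range arguments without the course file, here is a sketch that uses the example workbook bundled with {readxl}. The workbook datasets.xlsx and its mtcars sheet come from the package, not from these slides.

```r
library(readxl)

# Path to an example workbook that ships with readxl.
path <- readxl_example("datasets.xlsx")

# List the sheet names available in the workbook.
excel_sheets(path)

# Read only cells A1:C4 of the "mtcars" sheet:
# the header row plus three data rows, first three columns.
cars <- read_excel(path = path, sheet = "mtcars", range = "A1:C4")
dim(cars)
```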
haven::read_sav("mtcars.sav")
haven::read_sas("mtcars.sas7bdat")
haven::read_dta("mtcars.dta")
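The files named above are not provided here, so this sketch first writes the built-in mtcars data to a temporary SPSS file with haven::write_sav() and then reads it back. The temporary file path is an assumption made for illustration.

```r
library(haven)

# Write the built-in mtcars data to a temporary SPSS file.
tmp <- tempfile(fileext = ".sav")
write_sav(mtcars, tmp)

# Read it back; the result is a tibble (row names are not stored).
cars <- read_sav(tmp)
dim(cars)
```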
Register your Qualtrics credentials with qualtRics::qualtrics_api_credentials()
Get survey ID by viewing qualtRics::all_surveys()
Import data with qualtRics::fetch_survey()
Never have to download Qualtrics data again!
Download choice text by default or numeric values with label = FALSE
Set time zone with time_zone = "America/Chicago"
Turn off sublabels with add_var_labels = FALSE
mydf6 <- qualtRics::fetch_survey("SV_xxxxxxxxxxxxx", save_dir = "data", label = FALSE, convert = FALSE,
force_request = TRUE, time_zone = "America/Chicago")
OneDrive {Microsoft365R}
Google sheets {googlesheets4}
Box {boxr}