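Study notes, for reference only
Reading and writing data with the readr and readxl packages
Reading data
- Related functions
The readr and readxl packages provide a set of functions for reading data. The main ones are:
# readr package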
read_delim(file, delim, quote = "\"", escape_backslash = FALSE,
escape_double = TRUE, col_names = TRUE, col_types = NULL,
locale = default_locale(), na = c("", "NA"), quoted_na = TRUE,
comment = "", trim_ws = FALSE, skip = 0, n_max = Inf,
guess_max = min(1000, n_max), progress = show_progress(),
skip_empty_rows = TRUE)
read_csv(file, col_names = TRUE, col_types = NULL,
locale = default_locale(), na = c("", "NA"), quoted_na = TRUE,
quote = "\"", comment = "", trim_ws = TRUE, skip = 0,
n_max = Inf, guess_max = min(1000, n_max),
progress = show_progress(), skip_empty_rows = TRUE)
# readxl package
read_excel(path, sheet = NULL, range = NULL, col_names = TRUE,
col_types = NULL, na = "", trim_ws = TRUE, skip = 0,
n_max = Inf, guess_max = min(1000, n_max),
progress = readxl_progress(), .name_repair = "unique")
read_xls(path, sheet = NULL, range = NULL, col_names = TRUE,
col_types = NULL, na = "", trim_ws = TRUE, skip = 0,
n_max = Inf, guess_max = min(1000, n_max),
progress = readxl_progress(), .name_repair = "unique")
- Parameters
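The parameter list is left empty in these notes. As a rough illustration only, the sketch below exercises a few arguments from the read_csv() signature above (skip, na, col_types, n_max) on a small throwaway file. The file contents and the column names id, score, and name are made up for the example and are not from the original data.

```r
library(readr)

# Hypothetical sample file: one junk line, a header row, then three records.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "exported by some tool",
  "id,score,name",
  "1,3.5,a",
  "2,-,b",
  "3,4.0,c"
), path)

demo <- read_csv(
  path,
  skip = 1,                      # drop the junk line before the header
  na = c("", "NA", "-"),         # also treat "-" as a missing value
  col_types = cols(              # declare column types instead of guessing
    id = col_integer(),
    score = col_double(),
    name = col_character()
  ),
  n_max = 2                      # read at most two data rows
)

print(demo)
```

Declaring col_types explicitly avoids depending on type guessing (governed by guess_max), which can matter when the first rows of a file are not representative.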
- Example
Input:
library(readr)
library(readxl)
cp <- read_delim("comp.csv", delim = ",")
cp.csv <- read_csv("comp.csv")
cp.xl <- read_excel("comp.xlsx")
#summary(cp.csv)
#summary(cp.xl)
system.time(read_csv("data.csv"))
system.time(read.csv("data.csv"))
Output:
> system.time(read_csv("data.csv"))
user system elapsed
0.88 0.00 0.89
Warning message:
Duplicated column names deduplicated: 'DATE_R' => 'DATE_R_1' [48]
> system.time(read.csv("data.csv"))
user system elapsed
3.77 0.05 3.86
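In this run, read_csv() from readr finished in about 0.89 s elapsed, against 3.86 s for base R's read.csv() on the same file, so it was roughly four times faster. Note that only readr was timed here; readxl was not part of the comparison.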
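Since data.csv is not included with these notes, the comparison can be repeated on synthetic data. The following is a minimal sketch, assuming a made-up three-column table of 100,000 rows; the column names and the size are arbitrary choices, and the timings will vary by machine and readr version.

```r
library(readr)

# Build a hypothetical data set and save it with base R.
set.seed(1)
n <- 1e5
big <- data.frame(
  id    = seq_len(n),
  value = rnorm(n),
  group = sample(letters, n, replace = TRUE)
)
path <- tempfile(fileext = ".csv")
write.csv(big, path, row.names = FALSE)

# Time both readers on the same file.
# col_types = cols() lets readr guess types without printing the column spec.
t_readr <- system.time(a <- read_csv(path, col_types = cols()))
t_base  <- system.time(b <- read.csv(path))

print(t_readr)
print(t_base)
```

A single system.time() call is a coarse measurement; repeating each call several times gives a steadier picture.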
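Writing data
Besides reading data, the readr package also provides functions for writing it, that is, saving a data.frame object as a delimited text file such as CSV. Note that write_excel_csv() produces a CSV file that Excel opens cleanly (it adds a UTF-8 byte order mark), not an .xlsx workbook.
- Related functions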
write_delim(x, path, delim = " ", na = "NA", append = FALSE,
col_names = !append, quote_escape = "double")
write_csv(x, path, na = "NA", append = FALSE, col_names = !append,
quote_escape = "double")
write_excel_csv(x, path, na = "NA", append = FALSE,
col_names = !append, delim = ",", quote_escape = "double")
- Parameters
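The signatures above show that col_names defaults to !append. As a small sketch of what that means in practice, the example below writes one made-up data frame and then appends a second one to the same temporary file; the data frames first and second are hypothetical.

```r
library(readr)

out <- tempfile(fileext = ".csv")
first  <- data.frame(x = c(1, 2), y = c("a", "b"))
second <- data.frame(x = c(3, 4), y = c("c", "d"))

# First call creates the file and writes the header row.
write_csv(first, out)

# Second call appends; col_names defaults to !append, so no second header is written.
write_csv(second, out, append = TRUE)

lines <- readLines(out)
print(lines)
```

The second argument is passed by position here because its name differs between readr versions (path in the signatures above, file in newer releases).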
- Example
Input:
df <- data.frame(x = c(1,2,3,4,5), y = c(6,7,NA,9,0))
write_delim(df, "df1.csv", delim = ",")
write_csv(df, "df2.csv", na = "-")
df1.csv:
df2.csv:
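The contents of the two files can be checked from R itself. The sketch below repeats the example with temporary files (an assumption, so that nothing is written to the working directory) and prints each file line by line; the point of interest is how the missing value in y is written in each case.

```r
library(readr)

df <- data.frame(x = c(1, 2, 3, 4, 5), y = c(6, 7, NA, 9, 0))

f1 <- tempfile(fileext = ".csv")
f2 <- tempfile(fileext = ".csv")

write_delim(df, f1, delim = ",")  # default na = "NA"
write_csv(df, f2, na = "-")       # missing values written as "-"

l1 <- readLines(f1)
l2 <- readLines(f2)

print(l1)
print(l2)
```

When reading the second file back, the same marker has to be passed to the reader, for example read_csv(f2, na = "-"), otherwise the "-" is treated as text.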