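Assignment:
One of the most exciting areas in all of data science right now is wearable computing (see this article). Companies such as Fitbit, Nike, and Jawbone Up are racing to develop the most advanced algorithms to attract new users. The data linked from the course website represent data collected from the accelerometers of the Samsung Galaxy S smartphone. A full description is available at the site where the data was obtained:
http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones
Here are the data for the project:
https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip
Assignment requirements: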
You should create one R script called run_analysis.R that does the following.
- Merges the training and the test sets to create one data set.
- Extracts only the measurements on the mean and standard deviation for each measurement.
- Uses descriptive activity names to name the activities in the data set
- Appropriately labels the data set with descriptive variable names.
- From the data set in step 4, creates a second, independent tidy data set with the average of each variable for each activity and each subject.
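The second and third requirements above come down to selecting columns by name and mapping integer codes to labels. A minimal sketch in base R, assuming a small made-up data frame (the `toy` object and its values are invented for illustration; only the `mean()` / `std()` naming pattern and the six activity labels follow the data set):

```r
# Toy data: column names imitate entries of features.txt, values are made up
toy <- data.frame(
  `tBodyAcc-mean()-X` = c(0.1, 0.2, 0.3),
  `tBodyAcc-std()-X`  = c(0.01, 0.02, 0.03),
  `tBodyAcc-max()-X`  = c(5, 6, 7),
  activity            = c(1, 4, 6),
  check.names = FALSE
)

# Keep only columns whose name contains mean() or std(), plus the label column
is_mean_or_std <- grepl("mean\\(\\)|std\\(\\)", names(toy))
selected <- toy[, is_mean_or_std | names(toy) == "activity"]

# Turn the integer codes 1-6 into descriptive activity names
activity_names <- c("WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS",
                    "SITTING", "STANDING", "LAYING")
selected$activity <- factor(selected$activity, levels = 1:6, labels = activity_names)

print(names(selected))
print(as.character(selected$activity))
```

Escaping the parentheses in the pattern is a deliberate choice in this sketch: it matches names containing the literal text `mean()` or `std()`, so a name such as `meanFreq()` would not be selected. Whether such columns should count as "mean" measurements is a judgment call the assignment leaves open.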
# run_analysis.R
# Load the dplyr package
library(dplyr)
# Root folder of the unzipped data set
data_dir <- "/Users/fushanshan/Downloads/UCI HAR Dataset"
# Read one split ("train" or "test") and bind its columns in a fixed order:
# the 561 features (X), then the activity code (y), then the subject id.
# Reading the files by name avoids depending on the order list.files() returns.
read_split <- function(split) {
  x <- read.table(file.path(data_dir, split, paste0("X_", split, ".txt")))
  y <- read.table(file.path(data_dir, split, paste0("y_", split, ".txt")))
  s <- read.table(file.path(data_dir, split, paste0("subject_", split, ".txt")))
  cbind(x, y, s)
}
# Get X, y and subject for the training set and combine them into train_Data
train_Data <- read_split("train")
# Get X, y and subject for the test set and combine them into test_Data
test_Data <- read_split("test")
# Merge the two sets into one dataset
dataset <- rbind(train_Data, test_Data)
# Read the feature labels
features <- read.table(file.path(data_dir, "features.txt"), stringsAsFactors = FALSE)
# Assign the labels to the columns of dataset (561 features, then activity, then subject)
colnames(dataset) <- c(features[, 2], "activity", "subject")
# Keep only the mean() and std() measurements, plus activity and subject
keep <- grepl("mean\\(\\)|std\\(\\)|^activity$|^subject$", colnames(dataset))
dataset <- dataset[, keep]
# Replace the activity codes 1-6 with the corresponding activity names
activity_names <- c("WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS",
                    "SITTING", "STANDING", "LAYING")
dataset$activity <- factor(dataset$activity, levels = 1:6, labels = activity_names)
# Average each variable for each activity and each subject to form the new data set
new_table <- dataset %>%
  group_by(activity, subject) %>%
  summarise(across(everything(), mean), .groups = "drop")
# Write out the data
write.table(new_table, file = "/Users/fushanshan/Downloads/new_table.txt", row.names = FALSE, quote = FALSE)
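The averaging for each activity and each subject, and the write-out that follows it, can be tried on toy data before running on the full set. A minimal sketch, assuming a made-up data frame `toy` with a single measurement column; it uses base R `aggregate` and a temporary file instead of the path used in the script:

```r
# Toy data: two subjects, two activities, two rows per pair (values are made up)
toy <- data.frame(
  subject  = c(1, 1, 1, 1, 2, 2, 2, 2),
  activity = c("WALKING", "WALKING", "SITTING", "SITTING",
               "WALKING", "WALKING", "SITTING", "SITTING"),
  value    = c(1, 3, 10, 20, 2, 4, 30, 50)
)

# Average the measurement column for each activity/subject pair
tidy <- aggregate(toy["value"],
                  by = list(activity = toy$activity, subject = toy$subject),
                  FUN = mean)

# Write the result as plain text and read it back to confirm the round trip
out_file <- tempfile(fileext = ".txt")
write.table(tidy, file = out_file, row.names = FALSE, quote = FALSE)
check <- read.table(out_file, header = TRUE)

print(check)
```

The result has one row per activity/subject pair, which is the shape the final requirement asks for. Reading the file back with `header = TRUE` is a quick way to confirm that the column names survived the write.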
GitHub repository: github