I. Basic R operations
1. Create a variable named x containing 5 random deviates drawn from the standard normal distribution;
e.g. x<-rnorm(5)
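A minimal sketch of the step above. The seed value is an assumption added only so that repeated runs give the same draws; it is not part of the original notes.

```r
# Draw 5 random deviates from the standard normal distribution N(0, 1).
set.seed(42)    # assumption: any fixed seed works; used only for reproducibility
x <- rnorm(5)   # same as rnorm(5, mean = 0, sd = 1)
length(x)       # 5
is.numeric(x)   # TRUE
```

Without set.seed(), each call to rnorm(5) returns different values.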
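2. Assignment (weight, used in the steps below, must be assigned the same way, as a numeric vector of the same length as age)
age<-c(1,3,5,2,11,9,3,9,12,3)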
3. Compute the mean
mean(weight)
4. Compute the standard deviation
sd(weight)
5. Compute the correlation coefficient
cor(age,weight)
6. Draw a scatter plot
plot(age,weight)
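Steps 2 to 6 can be run together as the sketch below. The notes never assign weight, so the weight values here are an assumption chosen for illustration only; substitute your own data.

```r
age <- c(1, 3, 5, 2, 11, 9, 3, 9, 12, 3)
# Assumption: illustrative weights, not taken from the original notes.
weight <- c(4.4, 5.3, 7.2, 5.2, 8.5, 7.3, 6.0, 10.4, 10.2, 6.1)

# cor() and plot() need two vectors of equal length.
stopifnot(length(age) == length(weight))

mean(weight)       # arithmetic mean; 7.06 for these values
sd(weight)         # sample standard deviation
cor(age, weight)   # Pearson correlation coefficient
plot(age, weight)  # scatter plot of weight against age
```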
7. Show the current working directory
> getwd()
[1] "C:/Users/yangxuejun/Documents"
8. Create a directory and set it as the current working directory
> dir.create("E:/Raction")
> setwd("E:/Raction")
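A portable sketch of step 8. The path "E:/Raction" exists only on a machine with an E: drive, so this example assumes a folder under tempdir() instead; the name target is hypothetical.

```r
# Assumption: a folder under tempdir() stands in for "E:/Raction".
target <- file.path(tempdir(), "Raction")

# dir.create() warns if the directory already exists, so check first.
if (!dir.exists(target)) dir.create(target, recursive = TRUE)

old <- setwd(target)  # setwd() invisibly returns the previous directory
getwd()               # now points at the new folder
setwd(old)            # restore the original working directory
```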
9. List the objects in the current workspace
> ls()
[1] "age" "weight"
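II. Creating a dataset
A dataset is usually a rectangular array of data, with rows representing observations and columns representing variables
2.1 R data structures
1. Vectors
All data in a single vector must have the same mode
1) The combine function c() is used to create a vector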
a<-c(1,2,3,4,5,6)
2) Accessing elements
> a<-c(1,2,3,4,5)
> a[3]
[1] 3
> a[c(1,3,4)]
[1] 1 3 4
> a[2:6]
[1] 2 3 4 5 NA
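A short sketch of the indexing rules shown above, including why a[2:6] ends in NA and what "same mode" means in practice. The vector b is a hypothetical example.

```r
a <- c(1, 2, 3, 4, 5)

a[3]             # 3
a[c(1, 3, 4)]    # 1 3 4
a[2:6]           # 2 3 4 5 NA  (position 6 does not exist, so R returns NA)
a[2:length(a)]   # 2 3 4 5     (stays inside the vector)

# Same mode: mixing numbers and strings coerces everything to character.
b <- c(1, "two", 3)
mode(b)          # "character"
```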
2. Matrices
Matrices are created with the function matrix()
1) Creating a matrix
> y<-matrix(1:20,nrow=5,ncol=4)
> y
     [,1] [,2] [,3] [,4]
[1,]    1    6   11   16
[2,]    2    7   12   17
[3,]    3    8   13   18
[4,]    4    9   14   19
[5,]    5   10   15   20
> cells<-c(1,26,24,68)
> rnames<-c("R1","R2")
> cnames<-c("C1","C2")
> mymatrix<-matrix(cells,nrow=2,ncol=2,byrow=TRUE,dimnames=list(rnames,cnames)) # create the matrix, filling it by row
> mymatrix
   C1 C2
R1  1 26
R2 24 68
> mymatrix<-matrix(cells,nrow=2,ncol=2,byrow=FALSE,dimnames=list(rnames,cnames)) # create the matrix, filling it by column
> mymatrix
   C1 C2
R1  1 24
R2 26 68
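The effect of byrow can be checked directly. In this sketch the names by_row and by_col are hypothetical, and dimnames are left out to keep the comparison short.

```r
cells <- c(1, 26, 24, 68)

by_row <- matrix(cells, nrow = 2, ncol = 2, byrow = TRUE)
by_col <- matrix(cells, nrow = 2, ncol = 2)  # byrow = FALSE is the default

by_row[1, 2]               # 26: the second cell goes to row 1, column 2
by_col[1, 2]               # 24: the second cell goes to row 2, column 1 instead
all(by_row == t(by_col))   # TRUE: filling by row is the transpose of filling by column here
```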
2) Matrix subscripts
> x<-matrix(1:10,nrow=2)
> x
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
> x[2,]
[1]  2  4  6  8 10
> x[,2]
[1] 3 4
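A few more subscript forms on the same matrix, as a sketch. The drop = FALSE line is an addition to the notes: it keeps the result as a matrix instead of simplifying it to a vector.

```r
x <- matrix(1:10, nrow = 2)

x[2, ]                 # row 2:    2 4 6 8 10
x[, 2]                 # column 2: 3 4
x[1, 4]                # single element: 7
x[1, c(4, 5)]          # row 1, columns 4 and 5: 7 9
x[, 2, drop = FALSE]   # column 2 kept as a 2 x 1 matrix
```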
3. Arrays
1) Creating an array: use the array() function
> dim1<-c("A1","A2")
> dim2<-c("B1","B2","B3")
> dim3<-c("C1","C2","C3","C4")
> z<-array(1:24,c(2,3,4),dimnames=list(dim1,dim2,dim3))
> z
, , C1
   B1 B2 B3
A1  1  3  5
A2  2  4  6
, , C2
   B1 B2 B3
A1  7  9 11
A2  8 10 12
, , C3
   B1 B2 B3
A1 13 15 17
A2 14 16 18
, , C4
   B1 B2 B3
A1 19 21 23
A2 20 22 24
2) Accessing elements
> z[1,2,3]
[1] 15
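A sketch of how array subscripts work, using the same array. Access by dimension name and the column-major position check are additions to the notes.

```r
dim1 <- c("A1", "A2")
dim2 <- c("B1", "B2", "B3")
dim3 <- c("C1", "C2", "C3", "C4")
z <- array(1:24, c(2, 3, 4), dimnames = list(dim1, dim2, dim3))

z[1, 2, 3]             # 15
z["A1", "B2", "C3"]    # 15, same element selected by name
z[, , "C3"]            # the whole 2 x 3 slice for C3

# Arrays are filled in column-major order, so element [i, j, k] of a
# 2 x 3 x 4 array sits at position i + (j - 1) * 2 + (k - 1) * 2 * 3.
1 + (2 - 1) * 2 + (3 - 1) * 6   # 15
```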
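4. Data frames: when the data have more than one mode they cannot be placed in a matrix, and a data frame is the best choice
Use data.frame() to create one
1) Creating a data frame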
> patientID<-c(1,2,3,4)
> age<-c(25,34,28,52)
> diabetes<-c("T1","T2","T2","T1")
> status<-c("Poor","Improved","Excellent","Poor")
> patientdata<-data.frame(patientID,age,diabetes,status)
> patientdata
  patientID age diabetes    status
1         1  25       T1      Poor
2         2  34       T2  Improved
3         3  28       T2 Excellent
4         4  52       T1      Poor
2) Selecting elements of a data frame
> patientdata[1:2]
  patientID age
1         1  25
2         2  34
3         3  28
4         4  52
> patientdata[c("diabetes","status")]
  diabetes    status
1       T1      Poor
2       T2  Improved
3       T2 Excellent
4       T1      Poor
The $ notation selects one specific variable from a given data frame
> table(patientdata$diabetes,patientdata$status)
     Excellent Improved Poor
  T1         0        0    2
  T2         1        1    0
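The selection forms above, gathered into one runnable sketch. The [[ ]] form and the name tab are additions to the notes.

```r
patientdata <- data.frame(
  patientID = c(1, 2, 3, 4),
  age       = c(25, 34, 28, 52),
  diabetes  = c("T1", "T2", "T2", "T1"),
  status    = c("Poor", "Improved", "Excellent", "Poor")
)

patientdata$age          # one column as a vector: 25 34 28 52
patientdata[["age"]]     # same column, selected by name as a string
mean(patientdata$age)    # 34.75

# Cross-tabulate diabetes type against status.
tab <- table(patientdata$diabetes, patientdata$status)
tab["T1", "Poor"]        # 2
```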
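3) attach(): adds the data frame to R's search path
detach(): removes the data frame from R's search path
with(): evaluates the expressions inside its braces with the data frame's variables in scope
Because age, diabetes, patientID and status already exist in the global environment, attach() reports that the attached columns are masked by them:
> attach(patientdata)
The following objects are masked _by_ .GlobalEnv:
    age, diabetes, patientID, status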
> summary(age)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  25.00   27.25   31.00   34.75   38.50   52.00
> detach(patientdata)
> with(patientdata,{summary(age)})
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  25.00   27.25   31.00   34.75   38.50   52.00
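Assignments made inside with() exist only within its braces; use the special assignment operator <<- to create objects that persist outside the with() construct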
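A sketch of the difference between <- and <<- inside with(). The object names stats_local and stats_kept are hypothetical, and the example assumes it is run at top level in a session where neither name already exists.

```r
patientdata <- data.frame(age = c(25, 34, 28, 52))

with(patientdata, {
  stats_local <- summary(age)   # ordinary assignment: discarded when with() ends
  stats_kept <<- summary(age)   # <<- assigns outside the with() construct
})

exists("stats_local")   # FALSE
exists("stats_kept")    # TRUE
stats_kept["Mean"]      # 34.75
```

Unlike attach(), with() does not change the search path, so it avoids the masking message shown earlier.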
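5. Factors
Variables can be classified as nominal, ordinal, or continuous;
categorical (nominal) and ordered categorical (ordinal) variables are called factors in R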
> diabetes<-c("T1","T2","T3","T4")
> status<-c("Poor","Improved","Excellent","Poor")
> diabetes<-factor(diabetes)
> status<-factor(status,ordered=TRUE)
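One caveat, sketched below: without a levels argument, factor() sorts the levels alphabetically, which for status gives Excellent < Improved < Poor. The explicit levels vector and the names status_default and status_ranked are additions to the notes.

```r
status <- c("Poor", "Improved", "Excellent", "Poor")

# Default: levels are sorted alphabetically.
status_default <- factor(status, ordered = TRUE)
levels(status_default)    # "Excellent" "Improved" "Poor"

# Assumption: the intended ranking is Poor < Improved < Excellent.
status_ranked <- factor(status, ordered = TRUE,
                        levels = c("Poor", "Improved", "Excellent"))
levels(status_ranked)     # "Poor" "Improved" "Excellent"

status_ranked[1] < status_ranked[2]   # TRUE: Poor < Improved
as.integer(status_ranked)             # 1 2 3 1 (the internal level codes)
```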
6. Lists: use the function list() to create a list
> g<- "my first list"
> h<-c(25,26,18,39)
> j<-matrix(1:10,nrow=5)
> k<-c("one","two","three")
> mylist<-list(title=g,ages=h,j,k)
> mylist
$title
[1] "my first list"
$ages
[1] 25 26 18 39
[[3]]
     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10
[[4]]
[1] "one" "two" "three"
> mylist[[2]]
[1] 25 26 18 39
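A sketch of the ways to reach into the same list. Access by name and the [ ] versus [[ ]] comparison are additions to the notes.

```r
g <- "my first list"
h <- c(25, 26, 18, 39)
j <- matrix(1:10, nrow = 5)
k <- c("one", "two", "three")
mylist <- list(title = g, ages = h, j, k)

mylist[[2]]          # the component itself: 25 26 18 39
mylist[["ages"]]     # same component, selected by name
mylist$ages          # same again, using $

class(mylist[2])     # "list": single brackets return a sub-list
class(mylist[[2]])   # "numeric": double brackets return the component

mylist[[4]][2]       # "two": index into the extracted component
```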