R Data Structures 4: Data Frames

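A data frame holds a dataset whose columns can have different data types, rather than a single type. Because different columns can contain data of different modes (numeric, character, and so on), a data frame is a more general concept than a matrix. A dataset with mixed modes cannot be stored in one matrix without coercing everything to a common type, so in that case a data frame is the best choice.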
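The difference from a matrix can be seen in a small sketch. The columns `name`, `age`, and `member` and their values are made up for illustration and are not part of any dataset used in this post.

```r
# Hypothetical columns of three different modes (illustrative values only)
name   <- c("Ann", "Bob", "Cyd")
age    <- c(31, 45, 27)
member <- c(TRUE, FALSE, TRUE)

# cbind() builds a matrix, which can hold only one mode,
# so every value is coerced to character
m <- cbind(name, age, member)
matrix_mode <- mode(m)

# data.frame() keeps each column's own type
# (stringsAsFactors = FALSE is set explicitly so the result is the same
# on R versions before and after 4.0.0)
df <- data.frame(name, age, member, stringsAsFactors = FALSE)
column_classes <- sapply(df, class)

print(matrix_mode)
print(column_classes)
```

In the matrix the ages and the logical flags end up as strings, while the data frame keeps them numeric and logical.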


# head() and tail() show the first and last few rows of mtcars, giving us a rough overview of the data

head(mtcars)

tail(mtcars)


str(mtcars)

  • For a data frame, str() tells you:

    • The total number of observations (e.g. 32 car types)

    • The total number of variables (e.g. 11 car features)

    • A full list of the variable names (e.g. mpg, cyl ... )

    • The data type of each variable (e.g. num for car features)

    • The first few observations of each variable
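The same pieces of information that str() prints can also be pulled out one at a time. The following is a minimal sketch using base R functions on mtcars; the variable names on the left-hand side are chosen here for illustration.

```r
# mtcars ships with R in the datasets package
n_obs     <- nrow(mtcars)            # number of observations (rows)
n_vars    <- ncol(mtcars)            # number of variables (columns)
var_names <- names(mtcars)           # variable names
var_types <- sapply(mtcars, class)   # class of each variable

print(c(n_obs, n_vars))
print(var_names)
print(var_types)
```

This is handy when a script needs the numbers themselves rather than a printed summary.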

# Create a data frame. The general form is:

mydataframe <- data.frame(col1, col2, col3, ...)

planets <- c("Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", "Uranus",

   "Neptune")

type <- c("Terrestrial planet", "Terrestrial planet", "Terrestrial planet",

   "Terrestrial planet", "Gass giant", "Gass giant", "Gass giant", "Gass giant")

diameter <- c(0.382, 0.949, 1, 0.532, 11.209, 9.449, 4.007, 3.883)

rotation <- c(58.64, -243.02, 1, 1.03, 0.41, 0.43, -0.72, 0.67)

rings <- c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)



# Create the data frame:

planets.df <- data.frame(planets,type,diameter,rotation,rings)

planets.df

> planets.df

  planets               type diameter rotation rings
1 Mercury Terrestrial planet    0.382    58.64 FALSE
2   Venus Terrestrial planet    0.949  -243.02 FALSE
3   Earth Terrestrial planet    1.000     1.00 FALSE
4    Mars Terrestrial planet    0.532     1.03 FALSE
5 Jupiter         Gass giant   11.209     0.41  TRUE
6  Saturn         Gass giant    9.449     0.43  TRUE
7  Uranus         Gass giant    4.007    -0.72  TRUE
8 Neptune         Gass giant    3.883     0.67  TRUE

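> str(planets.df)

'data.frame':	8 obs. of  5 variables:
 $ planets : Factor w/ 8 levels "Earth","Jupiter",..: 4 8 1 3 2 6 7 5
 $ type    : Factor w/ 2 levels "Gass giant","Terrestrial planet": 2 2 2 2 1 1 1 1
 $ diameter: num  0.382 0.949 1 0.532 11.209 ...
 $ rotation: num  58.64 -243.02 1 1.03 0.41 ...
 $ rings   : logi  FALSE FALSE FALSE FALSE TRUE TRUE ...

# Note: the Factor columns above come from an R version before 4.0.0. Since R 4.0.0, data.frame() defaults to stringsAsFactors = FALSE, so planets and type appear as chr instead.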


# Select a single attribute (column): the diameter of the planets in rows 3 to 8

furthest.planets.diameter <- planets.df[3:8,"diameter"]

furthest.planets.diameter

> furthest.planets.diameter

[1]  1.000  0.532 11.209  9.449  4.007  3.883


# Use the $ operator to extract a whole column as a vector, so the values are shown in full rather than truncated

rings.vector <- planets.df$rings

rings.vector

> rings.vector

[1] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
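Besides `$`, base R offers other ways to reach a column. The sketch below compares them on a two-row stand-in data frame (`df` here is a hypothetical cut-down copy of planets.df so that the snippet runs on its own).

```r
# Small stand-in for planets.df so the snippet is self-contained
df <- data.frame(
  planets = c("Mercury", "Jupiter"),
  rings   = c(FALSE, TRUE)
)

by_dollar  <- df$rings        # vector, selected by name with $
by_name    <- df[["rings"]]   # vector, selected by name with [[ ]]
by_index   <- df[, "rings"]   # vector, selected with [row, column]
one_col_df <- df["rings"]     # single brackets without a comma keep a data frame

print(by_dollar)
print(class(one_col_df))
```

The first three forms return the same logical vector; only the last one returns a one-column data frame.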


# Select the information on planets with rings:

rings.vector <- planets.df$rings

planets.with.rings.df <- planets.df[rings.vector,]

planets.with.rings.df


> planets.with.rings.df

  planets       type diameter rotation rings
5 Jupiter Gass giant   11.209     0.41  TRUE
6  Saturn Gass giant    9.449     0.43  TRUE
7  Uranus Gass giant    4.007    -0.72  TRUE
8 Neptune Gass giant    3.883     0.67  TRUE


# This is equivalent to the following statement:

subset(planets.df, subset=(planets.df$rings == TRUE))
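The equivalence can be checked directly. This is a minimal sketch on a four-row stand-in data frame (`df` is hypothetical and only mirrors the planets and rings columns), comparing logical indexing with subset().

```r
# Small stand-in for planets.df so the snippet is self-contained
df <- data.frame(
  planets = c("Earth", "Mars", "Jupiter", "Saturn"),
  rings   = c(FALSE, FALSE, TRUE, TRUE)
)

# Logical indexing: keep the rows where rings is TRUE
by_index <- df[df$rings, ]

# subset(): column names can be used directly inside the condition
by_subset <- subset(df, subset = rings == TRUE)

# all.equal() returns TRUE when the two results match
same <- isTRUE(all.equal(by_index, by_subset))
print(same)
```

One difference worth knowing: subset() drops rows where the condition is NA, while logical indexing with an NA returns a row filled with NA.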


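order() is a function that, when applied to a variable, returns the positions of its elements in sorted order: first the index of the smallest element, then the index of the next smallest, and so on. Take the vector a <- c(100, 9, 101). Here order(a) returns 2, 1, 3, because the smallest value (9) is at position 2 and the largest (101) is at position 3.

a[order(a)] returns a in sorted order.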
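The example with the vector a can be run as is. The variable names other than `a` are introduced here for illustration.

```r
a <- c(100, 9, 101)

# Positions of the elements from smallest to largest
idx_increasing <- order(a)                     # 2 1 3

# Indexing with those positions gives the sorted vector
sorted_a <- a[idx_increasing]                  # 9 100 101

# decreasing = TRUE reverses the direction: largest first
idx_decreasing <- order(a, decreasing = TRUE)  # 3 1 2

print(idx_increasing)
print(sorted_a)
print(idx_decreasing)
```

Because order() returns positions rather than values, the same index vector can be used to reorder the rows of a whole data frame.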


# Row positions of the planets, largest diameter first:

positions <- order(planets.df$diameter, decreasing = TRUE)

# Create new 'ordered' data frame:

largest.first.df <- planets.df[positions,]

# Print the ordered data frame:

largest.first.df

> largest.first.df

  planets               type diameter rotation rings
5 Jupiter         Gass giant   11.209     0.41  TRUE
6  Saturn         Gass giant    9.449     0.43  TRUE
7  Uranus         Gass giant    4.007    -0.72  TRUE
8 Neptune         Gass giant    3.883     0.67  TRUE
3   Earth Terrestrial planet    1.000     1.00 FALSE
2   Venus Terrestrial planet    0.949  -243.02 FALSE
4    Mars Terrestrial planet    0.532     1.03 FALSE
1 Mercury Terrestrial planet    0.382    58.64 FALSE

