How to change the order of DataFrame columns?

Translated from: How to change the order of DataFrame columns?

I have the following DataFrame (df):

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5))

I add more column(s) by assignment:

df['mean'] = df.mean(1)

How can I move the column mean to the front, i.e. set it as the first column while leaving the order of the other columns untouched?


#1

Reference: https://stackoom.com/question/tAVR/如何更改DataFrame列的順序


#2

One easy way would be to reassign the dataframe with a list of the columns, rearranged as needed.

This is what you have now:

In [6]: df
Out[6]:
          0         1         2         3         4      mean
0  0.445598  0.173835  0.343415  0.682252  0.582616  0.445543
1  0.881592  0.696942  0.702232  0.696724  0.373551  0.670208
2  0.662527  0.955193  0.131016  0.609548  0.804694  0.632596
3  0.260919  0.783467  0.593433  0.033426  0.512019  0.436653
4  0.131842  0.799367  0.182828  0.683330  0.019485  0.363371
5  0.498784  0.873495  0.383811  0.699289  0.480447  0.587165
6  0.388771  0.395757  0.745237  0.628406  0.784473  0.588529
7  0.147986  0.459451  0.310961  0.706435  0.100914  0.345149
8  0.394947  0.863494  0.585030  0.565944  0.356561  0.553195
9  0.689260  0.865243  0.136481  0.386582  0.730399  0.561593

In [7]: cols = df.columns.tolist()

In [8]: cols
Out[8]: [0L, 1L, 2L, 3L, 4L, 'mean']

Rearrange cols in any way you want. This is how I moved the last element to the first position:

In [12]: cols = cols[-1:] + cols[:-1]

In [13]: cols
Out[13]: ['mean', 0L, 1L, 2L, 3L, 4L]

Then reorder the dataframe like this:

In [16]: df = df[cols]  #    OR    df = df.loc[:, cols]

In [17]: df
Out[17]:
       mean         0         1         2         3         4
0  0.445543  0.445598  0.173835  0.343415  0.682252  0.582616
1  0.670208  0.881592  0.696942  0.702232  0.696724  0.373551
2  0.632596  0.662527  0.955193  0.131016  0.609548  0.804694
3  0.436653  0.260919  0.783467  0.593433  0.033426  0.512019
4  0.363371  0.131842  0.799367  0.182828  0.683330  0.019485
5  0.587165  0.498784  0.873495  0.383811  0.699289  0.480447
6  0.588529  0.388771  0.395757  0.745237  0.628406  0.784473
7  0.345149  0.147986  0.459451  0.310961  0.706435  0.100914
8  0.553195  0.394947  0.863494  0.585030  0.565944  0.356561
9  0.561593  0.689260  0.865243  0.136481  0.386582  0.730399
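
For reference, the same steps can be written as a short script. This is a minimal sketch assuming a recent pandas release on Python 3, where the integer column labels print as 0, 1, ... rather than 0L. The seed is an arbitrary choice made only so the example is repeatable.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # arbitrary seed, only so the example is repeatable
df = pd.DataFrame(np.random.rand(10, 5))
df['mean'] = df.mean(axis=1)

cols = df.columns.tolist()      # [0, 1, 2, 3, 4, 'mean']
cols = cols[-1:] + cols[:-1]    # move the last label to the front
df = df[cols]                   # select the columns in the new order

print(df.columns.tolist())      # ['mean', 0, 1, 2, 3, 4]
```

Indexing with a list of labels returns a new frame, so the result has to be assigned back to df as shown.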

#3

How about:

df.insert(0, 'mean', df.mean(1))

http://pandas.pydata.org/pandas-docs/stable/dsintro.html#column-selection-addition-deletion
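
Note that insert raises a ValueError if a column with that name already exists (unless allow_duplicates=True is passed), so the line above is suited to creating mean directly at position 0. If the column has already been appended at the end, as in the question, one option is to remove it with pop first. A minimal sketch, with a small made-up frame for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(6.0).reshape(2, 3))  # illustrative data
df['mean'] = df.mean(axis=1)                     # 'mean' is the last column

# pop() removes the column and returns it as a Series;
# insert() then puts it back at position 0, modifying df in place.
df.insert(0, 'mean', df.pop('mean'))

print(df.columns.tolist())  # ['mean', 0, 1, 2]
```

Unlike the list-based approach, this changes df in place and does not need the full list of column names.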


#4

This question has been answered before, but reindex_axis is now deprecated, so I would suggest using:

df.reindex(sorted(df.columns), axis=1)
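
One caveat: with the frame from the question, the labels are a mix of integers and the string 'mean', and on Python 3 sorted raises a TypeError when it compares them. Sorting also does not by itself put mean first. The sketch below assumes that frame and shows two possible workarounds: sorting with key=str, and passing an explicit order to reindex.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5))
df['mean'] = df.mean(axis=1)

# key=str lets integer and string labels be compared with each other
by_name = df.reindex(sorted(df.columns, key=str), axis=1)

# An explicit list gives full control over the order;
# reindex returns a new frame and leaves df unchanged
mean_first = df.reindex(['mean', 0, 1, 2, 3, 4], axis=1)

print(by_name.columns.tolist())     # [0, 1, 2, 3, 4, 'mean']
print(mean_first.columns.tolist())  # ['mean', 0, 1, 2, 3, 4]
```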

#5

You could also do something like this:

df = df[['mean', '0', '1', '2', '3']]

You can get the list of columns with:

cols = list(df.columns.values)

The output will be:

['0', '1', '2', '3', 'mean']

...which is then easy to rearrange manually before dropping it into the first statement.


#6

This function saves you from having to list every variable in your dataset just to reorder a few of them.

def order(frame, var):
    if isinstance(var, str):
        var = [var]  # let the function take a string or a list
    varlist = [w for w in frame.columns if w not in var]
    frame = frame[var + varlist]
    return frame

It takes two arguments: the first is the dataset, and the second is the column or list of columns that you want to bring to the front.

So in my case I have a data set called frame with variables A1, A2, B1, B2, Total and Date. If I want to bring Total to the front then all I have to do is:

frame = order(frame,['Total'])

If I want to bring Total and Date to the front then I do:

frame = order(frame,['Total','Date'])

EDIT:

Another useful way to use this is, if you have an unfamiliar table and you're looking for variables with a particular term in them, like VAR1, VAR2, ... you may execute something like:

frame = order(frame,[v for v in frame.columns if "VAR" in v])
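
A quick way to try the function is on a small throwaway frame. The sketch below restates order so that it runs on its own and uses the column names from the example above; the data values are placeholders.

```python
import pandas as pd

def order(frame, var):
    """Return frame with the columns in var moved to the front."""
    if isinstance(var, str):
        var = [var]  # accept a single name or a list of names
    rest = [c for c in frame.columns if c not in var]
    return frame[var + rest]

# Placeholder data; only the column names matter here
frame = pd.DataFrame(
    [[1, 2, 3, 4, 10, '2020-01-01']],
    columns=['A1', 'A2', 'B1', 'B2', 'Total', 'Date'],
)

print(order(frame, 'Total').columns.tolist())
# ['Total', 'A1', 'A2', 'B1', 'B2', 'Date']
print(order(frame, ['Total', 'Date']).columns.tolist())
# ['Total', 'Date', 'A1', 'A2', 'B1', 'B2']
```

The remaining columns keep their original relative order, because the list comprehension walks frame.columns from left to right.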