Pandas DataFrame: renaming multiple columns that share the same name

吴建中

2023-03-14

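Question:

I have several columns with the same name in a df and need to rename them. The usual rename renames every one of them. Is there any way to rename the blah columns below to blah1, blah4, blah5?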

    In [6]:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame(np.arange(2*5).reshape(2,5))
    df.columns = ['blah','blah2','blah3','blah','blah']
    df
    Out[6]:
       blah  blah2  blah3  blah  blah
    0     0      1      2     3     4
    1     5      6      7     8     9

    In [7]:

    df.rename(columns={'blah':'blah1'})
    Out[7]:
       blah1  blah2  blah3  blah1  blah1
    0      0      1      2      3      4
    1      5      6      7      8      9

Answer:

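I was looking for a solution within Pandas rather than a general Python one. When a column label is duplicated, the column Index's get_loc() function returns a boolean mask array whose True values mark the positions where the duplicates were found. I then use that mask to assign new values at those positions. In my case I know ahead of time how many dups I am going to get and what I am going to assign to them, but it looks like df.columns.get_duplicates() returns a list of all dups, which you can combine with get_loc() if you need a more generic dup-weeding action. (get_duplicates() was deprecated in later pandas releases; df.columns.duplicated() covers the same need.)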
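As a minimal sketch of the mask idea described above, assuming the sample frame from the question and that the target names (blah1, blah4, blah5) are known in advance: the variable names `mask` and `new_cols` are illustrative only. Note that get_loc() returns a boolean mask only when the label is duplicated and the index is not sorted, which is the case here.

```python
import numpy as np
import pandas as pd

# Sample frame from the question, with three columns named 'blah'.
df = pd.DataFrame(np.arange(2 * 5).reshape(2, 5))
df.columns = ['blah', 'blah2', 'blah3', 'blah', 'blah']

# For a duplicated label in a non-sorted index, get_loc returns a boolean mask.
mask = df.columns.get_loc('blah')

# Index objects are immutable, so edit a NumPy copy of the labels instead.
new_cols = df.columns.to_numpy().copy()
new_cols[mask] = ['blah1', 'blah4', 'blah5']
df.columns = new_cols
```

The list assigned through the mask must have exactly as many entries as there are True values in the mask, so this form only suits cases where the number of duplicates is known beforehand.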

Updated as of September 2020

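    # Work on a Series copy of the column names so entries can be reassigned.
    cols = pd.Series(df.columns)
    # Visit each duplicated name once.
    for dup in df.columns[df.columns.duplicated(keep=False)].unique():
        # Boolean mask of every position holding this name
        # (assumes the columns are not sorted, so get_loc returns a mask).
        mask = df.columns.get_loc(dup)
        cols[mask] = [dup + '.' + str(d_idx) if d_idx != 0 else dup
                      for d_idx in range(mask.sum())]
    df.columns = cols

       blah  blah2  blah3  blah.1  blah.2
    0     0      1      2       3       4
    1     5      6      7       8       9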

A better new method (updated 03 Dec 2019)

The code below is better than the code above. It is copied from another answer (@SatishSK):

    import numpy as np
    import pandas as pd

    # sample df with duplicate blah column
    df = pd.DataFrame(np.arange(2*5).reshape(2,5))
    df.columns = ['blah','blah2','blah3','blah','blah']
    df

    # you just need the following 4 lines to rename duplicates
    # df is the dataframe whose duplicated columns you want to rename

    cols = pd.Series(df.columns)

    # for each duplicated name, keep the first occurrence and suffix the rest with .1, .2, ...
    for dup in cols[cols.duplicated()].unique():
        cols[cols[cols == dup].index.values.tolist()] = [dup + '.' + str(i) if i != 0 else dup for i in range(sum(cols == dup))]

    # rename the columns with the cols list.
    df.columns = cols

    df

Output:

       blah  blah2  blah3  blah.1  blah.2
    0     0      1      2       3       4
    1     5      6      7       8       9
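For reuse across several frames, the same renaming rule can be wrapped in a small helper. The following is only a sketch of one possible variant, not the method from either answer: it swaps the explicit loop for groupby().cumcount(), and the function name `dedupe_columns` and its `sep` parameter are hypothetical.

```python
import pandas as pd

def dedupe_columns(columns, sep='.'):
    """Return column names with later duplicates suffixed .1, .2, ... (hypothetical helper)."""
    cols = pd.Series(list(columns))
    # cumcount numbers each occurrence of a name: 0 for the first, 1 for the second, ...
    counts = cols.groupby(cols).cumcount()
    # Keep first occurrences unchanged; suffix later ones with their running count.
    return cols.where(counts == 0, cols + sep + counts.astype(str)).tolist()

new_names = dedupe_columns(['blah', 'blah2', 'blah3', 'blah', 'blah'])
```

Usage would be `df.columns = dedupe_columns(df.columns)`. Like the loop versions, this does not check whether a generated name such as blah.1 already exists among the original columns.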
