Replacing column values based on another DataFrame in Python pandas - is there a better way?

禄俊逸

2023-03-14

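Question:

Note: for simplicity I'm using a toy example, because copying and pasting DataFrames into Stack Overflow is difficult (please let me know if there is an easy way to do this).

Is there a way to merge the values from one DataFrame into another without getting the _X, _Y columns? I'd like the values in one column to replace all the zero values in the other column.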

df1:

Name   Nonprofit    Business    Education

X      1             1           0
Y      0             1           0   <- Y and Z have zero values for Nonprofit and Educ
Z      0             0           0
Y      0             1           0

df2:

Name   Nonprofit    Education
Y       1            1     <- this df has the correct values. 
Z       1            1
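For readers who want to reproduce the example, the two frames above can be rebuilt from plain dictionaries. This is only a sketch of the toy data as shown in the tables; nothing here comes from the asker's real dataset.

```python
import pandas as pd

# Toy frames matching the df1 and df2 tables above.
df1 = pd.DataFrame({
    "Name": ["X", "Y", "Z", "Y"],
    "Nonprofit": [1, 0, 0, 0],
    "Business": [1, 1, 0, 1],
    "Education": [0, 0, 0, 0],
})
df2 = pd.DataFrame({
    "Name": ["Y", "Z"],
    "Nonprofit": [1, 1],
    "Education": [1, 1],
})

# to_string(index=False) gives a compact text table that pastes well into a post.
print(df1.to_string(index=False))
print(df2.to_string(index=False))
```

As for the copy/paste problem mentioned in the note, pandas also provides pd.read_clipboard(), which may be able to read a whitespace-separated table copied from a post back into a DataFrame.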



pd.merge(df1, df2, on='Name', how='outer')

Name   Nonprofit_X    Business    Education_X     Nonprofit_Y     Education_Y
Y       1                1          1                1               1
Y      1                 1          1                1               1
X      1                 1          0               nan             nan   
Z      1                 1          1                1               1
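If a merge is still preferred, one possible way to avoid leftover suffixed columns is a left merge with an explicit suffix, followed by folding the extra columns back in and dropping them. The sketch below is an illustration, not the asker's code: the "_fix" suffix is an arbitrary name chosen here, and it assumes each Name occurs only once in df2. It takes df2's value wherever df2 has a row for that Name, which for this toy data is the same as replacing the zeros.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Name": ["X", "Y", "Z", "Y"],
    "Nonprofit": [1, 0, 0, 0],
    "Business": [1, 1, 0, 1],
    "Education": [0, 0, 0, 0],
})
df2 = pd.DataFrame({"Name": ["Y", "Z"], "Nonprofit": [1, 1], "Education": [1, 1]})

# A left merge keeps every row of df1; df2's columns get the "_fix" suffix.
merged = df1.merge(df2, on="Name", how="left", suffixes=("", "_fix"))

for col in ["Nonprofit", "Education"]:
    # Names missing from df2 have NaN in the _fix column, so fall back to df1's value.
    merged[col] = merged[col + "_fix"].fillna(merged[col]).astype(int)

# Drop the helper columns so only the original four columns remain.
result = merged.drop(columns=["Nonprofit_fix", "Education_fix"])
print(result)
```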

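In a previous post I tried combine_first and dropna(), but neither of them does the job.

I want to replace the zeros in df1 with the values from df2. In addition, I want every row with the same Name to be changed according to df2.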
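A likely reason combine_first does not help here is that it only fills missing (NaN) values, and a zero is not missing. The small sketch below uses two made-up Series (s1 and s2 are illustrative names, not from the question) to show the difference, and how masking the zeros first changes the result.

```python
import numpy as np
import pandas as pd

s1 = pd.Series([1, 0, np.nan], index=["X", "Y", "Z"])
s2 = pd.Series([9, 9, 9], index=["X", "Y", "Z"])

# combine_first fills only the NaN at Z; the 0 at Y is kept as a valid value.
filled = s1.combine_first(s2)

# Turning zeros into NaN first lets combine_first replace them as well.
masked = s1.mask(s1 == 0).combine_first(s2)

print(filled)
print(masked)
```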

Name    Nonprofit     Business    Education
Y        1             1           1
Y        1             1           1 
X        1             1           0
Z        1             0           1

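(To clarify: the value in the 'Business' column where Name = Z should be 0.)

My existing solution does the following: I take a subset based on the names that exist in df2, and then replace those values with the correct value. However, I'd like a less hacky way to do this.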

pubunis_df = df2
sdf = df1

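# str_to_regex and searchnamesre are the asker's own helpers; their definitions are not shown in the post.
regex = str_to_regex(', '.join(pubunis_df.ORGS))

pubunis = searchnamesre(sdf, 'ORGS', regex)

# .ix has been removed from pandas; .loc is the label-based equivalent.
sdf.loc[pubunis.index, ['Education', 'Public']] = 1
searchnamesre(sdf, 'ORGS', regex)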

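Answer:

Use the boolean mask from isin to filter the df, and assign the desired row values from the rhs df (here df is the question's df1, and df1 is the question's df2):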

In [27]:

df.loc[df.Name.isin(df1.Name), ['Nonprofit', 'Education']] = df1[['Nonprofit', 'Education']]
df
Out[27]:
  Name  Nonprofit  Business  Education
0    X          1         1          0
1    Y          1         1          1
2    Z          1         0          1
3    Y          1         1          1

[4 rows x 4 columns]
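One caveat about the one-liner above: when a DataFrame is assigned through .loc, pandas aligns it on index labels and column names, not on Name, so with recent pandas versions rows whose index label is absent from the right-hand frame may end up as NaN. A variation that looks values up by Name explicitly is sketched below. It uses the question's naming (df1 is the frame to fix, df2 holds the correct values), and it assumes each Name appears only once in df2; lookup and mask are names introduced here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Name": ["X", "Y", "Z", "Y"],
    "Nonprofit": [1, 0, 0, 0],
    "Business": [1, 1, 0, 1],
    "Education": [0, 0, 0, 0],
})
df2 = pd.DataFrame({"Name": ["Y", "Z"], "Nonprofit": [1, 1], "Education": [1, 1]})

# Index df2 by Name so it can serve as a lookup table (assumes unique names in df2).
lookup = df2.set_index("Name")

# Rows of df1 whose Name has a corrected value in df2.
mask = df1["Name"].isin(lookup.index)

for col in ["Nonprofit", "Education"]:
    # map() translates each Name into df2's value, so repeated names all get the same fix.
    df1.loc[mask, col] = df1.loc[mask, "Name"].map(lookup[col])

print(df1)
```

Because the lookup is keyed on Name, both Y rows receive the same corrected values, while the X row and the Business column are left untouched.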
