Question:

Get matching values between two DataFrames using pandas

卢翔宇
2023-03-14

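I have two DataFrames with multiple columns.

I want to compare df1['id'] with df2['id'] and return a new DataFrame whose 'correct_id' column holds the matching values. Example: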

df1:

     id      Name
0   123      Paul
1  c345      Jean
2     0    Alicia
3   345  Jennifer

df2:

    id      Name
0  123      Paul
1  980      Jean
2    0    Alicia
3  945  Jennifer

Here is my code:

import pandas as pd 

df1=pd.DataFrame({'id':['123','c345','0','345'],
    'Name':['Paul','Jean','Alicia','Jennifer'],
})
print(df1)

df2 = pd.DataFrame({'id':[123,980,0,945],
    'Name':['Paul','Jean','Alicia','Jennifer'],})

print(df2)

df1['id'] = pd.to_numeric(df1['id'], errors='coerce')
df1["correct_id"] = (df1["id"].isin(df2["id"]) * df1["id"]).replace(0, "N/A")
print(df1)

The result I get is:

     id      Name correct_id
0  123.0      Paul      123.0
1    NaN      Jean        NaN
2    0.0    Alicia        N/A
3  345.0  Jennifer        N/A

Expected output:

     id      Name correct_id
0   123      Paul        123
1  c345      Jean        N/A
2     0    Alicia          0
3   345  Jennifer        N/A

How can I fix this, please?

1 answer

程仲卿
2023-03-14

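You can try this (run it on the original df1, before the pd.to_numeric conversion, so that 'id' still holds the strings):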

new_df = df1.copy()

new_df.loc[:, "correct_id"] = df1.loc[
    df1["id"].astype(str).isin(df2["id"].astype(str).values), "id"
]
new_df.fillna("N/A", inplace=True)

print(new_df)
# Output:
#      id      Name correct_id
# 0   123      Paul        123
# 1  c345      Jean        N/A
# 2     0    Alicia          0
# 3   345  Jennifer        N/A
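
For context, the attempt in the question seems to fail for two reasons: pd.to_numeric(..., errors='coerce') turns 'c345' into NaN and makes the whole column float, and multiplying the boolean mask by the id yields 0 for every non-match, so the later replace(0, "N/A") also wipes out the legitimate id 0. As a variation on the answer above, here is a minimal sketch that does the same string comparison but fills the placeholder with Series.where instead of fillna. The names ids, mask and result are only illustrative, and it assumes that comparing the ids as strings is acceptable for your data.

```python
import pandas as pd

df1 = pd.DataFrame({
    "id": ["123", "c345", "0", "345"],
    "Name": ["Paul", "Jean", "Alicia", "Jennifer"],
})
df2 = pd.DataFrame({
    "id": [123, 980, 0, 945],
    "Name": ["Paul", "Jean", "Alicia", "Jennifer"],
})

# Compare both columns as strings (assumption: string equality is what
# "matching" means here), so 'c345' is kept and 0 is an ordinary value.
ids = df1["id"].astype(str)
mask = ids.isin(df2["id"].astype(str))

result = df1.copy()
# Keep the id where the mask is True, otherwise use the placeholder.
result["correct_id"] = ids.where(mask, "N/A")

print(result)
```

Unlike fillna("N/A", inplace=True), which fills missing values in every column, this only touches correct_id, which may matter if other columns can legitimately contain NaN.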