Inserting new rows at specific indices in a pandas DataFrame

马正初

2023-03-14

Question:

I have the following DataFrame df with three columns, identifier, values and subid:

     identifier   values    subid
0      1          101       1
1      1          102       1
2      1          103       2 #index in list x        
3      1          104       2
4      1          105       2
5      2          106       3   
6      2          107       3
7      2          108       3
8      2          109       4 #index in list x
9      2          110       4
10     3          111       5
11     3          112       5 
12     3          113       6 #index in list x

I have a list of indices, for example:

x = [2, 8, 12]
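
To reproduce the setup, the frame and the list above can be built as follows. This is only a sketch: it assumes all three columns hold plain integers and that the frame uses the default 0..12 index.

```python
import pandas as pd

# Sample frame from the question; integer dtypes are an assumption
df = pd.DataFrame({
    "identifier": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3],
    "values": list(range(101, 114)),
    "subid": [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6],
})
x = [2, 8, 12]  # a new row should be inserted before each of these index labels
```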

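I want to insert a row before each index listed in x. For example, the row inserted before index 2 should have the same identifier as the row at index 2, i.e. 1; the same values as the row at index 2, i.e. 103; but its subid should be (subid at index 2) - 1, or simply the subid of the previous row, i.e. 1.

This is the final df I expect: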

   identifier   values    subid
0      1          101       1
1      1          102       1
2      1          103       1 #new row inserted     
3      1          103       2 #index in list x        
4      1          104       2
5      1          105       2
6      2          106       3   
7      2          107       3
8      2          108       3
9      2          109       3 #new row inserted
10     2          109       4 #index in list x
11     2          110       4
12     3          111       5
13     3          112       5 
14     3          113       5 #new row inserted
15     3          113       6 #index in list x

The code I have been trying:

 m = df.index       #storing the indices of the df
 #m

 for i in m:
     if i in x:     #x is the given list of indices
         df.iloc[i-1]["identifier"] = df.iloc[i]["identifier"]
         df.iloc[i-1]["values"] = df.iloc[i]["values"]
         df.iloc[i-1]["subid"] = (df.iloc[i]["subid"]-1)
 df

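The code above only overwrites the row at index (i-1) instead of inserting a new row with those values. Please help.

Please let me know if anything is unclear.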

Answer:
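
The overwrite described in the question is expected: assigning to a position that already exists changes that row in place and never grows the frame. A minimal sketch of this, using a hypothetical three-row frame named `demo` (not part of the question):

```python
import pandas as pd

# Hypothetical three-row frame, used only to illustrate the behaviour
demo = pd.DataFrame({
    "identifier": [1, 1, 1],
    "values": [101, 102, 103],
    "subid": [1, 1, 2],
})

rows_before = len(demo)
# Assigning to an existing label overwrites that row; no row is added
demo.loc[1, ["identifier", "values", "subid"]] = [1, 103, 1]
rows_after = len(demo)

print(rows_before, rows_after)
```

The original row 1 (values 102) is lost and the row count is unchanged, which is why the new rows have to be concatenated in rather than assigned.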

Preserving the index order is the tricky part. I am not sure this is the most efficient approach, but it should work.

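import pandas as pd

x = [2, 8, 12]
rows = []

# Build one dict per new row; 'index' records the original position it goes before.
# df.iloc[i] is used with index labels, so this relies on the default 0..n-1 index.
for i in df.index:
    if i in x:
        rows.append({
            'index': i,
            'identifier': df.iloc[i].identifier,
            'values': df.iloc[i]['values'],  # bracket access, since .values is a Series attribute
            'subid': df.iloc[i].subid - 1,
        })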

Then iterate over the list of new rows and perform incremental concatenations, inserting each new row at the correct position.

offset = 0  # number of rows already inserted; shifts every later insert position

for d in rows:
    pos = d['index'] + offset
    df = pd.concat([df.head(pos), pd.DataFrame([d]), df.tail(len(df) - pos)])
    offset += 1
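
The offset arithmetic can be checked on its own. As a small sketch (the name `final_positions` is just for illustration), each earlier insertion pushes the later rows down by one, so the k-th new row lands at its original index plus k:

```python
x = [2, 8, 12]

# With x sorted ascending, the k-th new row (0-based) is preceded by k earlier
# insertions, so its final position is x[k] + k.
final_positions = [i + k for k, i in enumerate(sorted(x))]
print(final_positions)  # [2, 9, 14]
```

These are the positions of the rows marked "new row inserted" in the expected result.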


# Discard the old labels (drop=True avoids a leftover level_0 column),
# then remove the helper 'index' column that came from the row dicts
df.reset_index(drop=True, inplace=True)
df.drop('index', axis=1, inplace=True)
df

    identifier  values  subid
0            1     101      1
1            1     102      1
2            1     103      1
3            1     103      2
4            1     104      2
5            1     105      2
6            2     106      3
7            2     107      3
8            2     108      3
9            2     109      3
10           2     109      4
11           2     110      4
12           3     111      5
13           3     112      5
14           3     113      5
15           3     113      6
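
As an alternative to the loop above, the same result can probably be reached without tracking an offset: copy the rows at x, adjust subid, and let a stable sort on the index interleave them. This is a different technique from the one in the answer, sketched under the assumption that the frame has the default integer index and integer columns.

```python
import pandas as pd

# Sample frame from the question (integer dtypes assumed)
df = pd.DataFrame({
    "identifier": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3],
    "values": list(range(101, 114)),
    "subid": [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6],
})
x = [2, 8, 12]

new_rows = df.loc[x].copy()   # same identifier and values as the rows at x
new_rows["subid"] -= 1        # subid of the preceding group

# new_rows is listed first and mergesort is stable, so for equal index labels
# each copy sorts ahead of the original row it was taken from.
result = (
    pd.concat([new_rows, df])
    .sort_index(kind="mergesort")
    .reset_index(drop=True)
)
print(result)
```

The stable sort matters here: with an unstable sort kind, the copy and the original sharing a label could come out in either order.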
