In pandas, does the .iloc method give a copy or a view?

刘子实

2023-03-14

Question:

I find the results a bit random. Sometimes I get a copy, sometimes a view. For example:

df = pd.DataFrame([{'name':'Marry', 'age':21},{'name':'John','age':24}],index=['student1','student2'])

df
              age   name
   student1   21  Marry
   student2   24   John

Now, let me try to make some modifications to it.

df2 = df.loc['student1']
df2[0] = 23
df
              age   name
   student1   21  Marry
   student2   24   John

As you can see, nothing changed; df2 is a copy. But if I add another student to the DataFrame...

df.loc['student3'] = ['old','Tom']
df
               age   name
    student1   21  Marry
    student2   24   John
    student3  old    Tom

And try to change the age again:

df3=df.loc['student1']
df3[0]=33
df
               age   name
    student1   33  Marry
    student2   24   John
    student3  old    Tom

Now df3 has suddenly become a view. What is going on? I guess the value 'old' is the key?

Answer:

In general, you can get a view if the DataFrame has a single dtype, which is not the case for your original DataFrame:

In [4]: df
Out[4]:
          age   name
student1   21  Marry
student2   24   John

In [5]: df.dtypes
Out[5]:
age      int64
name    object
dtype: object

However, when you do:

In [6]: df.loc['student3'] = ['old','Tom']
   ...:

the first column gets coerced to object, because a single column cannot hold mixed dtypes:

In [7]: df.dtypes
Out[7]:
age     object
name    object
dtype: object
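
The coercion can be observed directly by comparing the column's dtype before and after the row is added. This is a minimal sketch, assuming a reasonably recent pandas; the names `before` and `after` are illustrative only and do not appear in the original answer:

```python
import pandas as pd

df = pd.DataFrame(
    {"age": [21, 24], "name": ["Marry", "John"]},
    index=["student1", "student2"],
)

# dtype of the 'age' column while it holds only integers
before = df["age"].dtype

# Adding a row whose 'age' is a string forces the whole column to object
df.loc["student3"] = ["old", "Tom"]
after = df["age"].dtype

print(before, "->", after)
```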

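In this case, the underlying .values will always return an array backed by the same underlying buffer, and changes to that array will be reflected in the DataFrame: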

In [11]: vals = df.values

In [12]: vals
Out[12]:
array([[21, 'Marry'],
       [24, 'John'],
       ['old', 'Tom']], dtype=object)

In [13]: vals[0,0] = 'foo'

In [14]: vals
Out[14]:
array([['foo', 'Marry'],
       [24, 'John'],
       ['old', 'Tom']], dtype=object)

In [15]: df
Out[15]:
          age   name
student1  foo  Marry
student2   24   John
student3  old    Tom

On the other hand, with mixed dtypes, as in your original DataFrame:

In [26]: df = pd.DataFrame([{'name':'Marry', 'age':21},{'name':'John','age':24}]
    ...: ,index=['student1','student2'])
    ...:

In [27]: vals = df.values

In [28]: vals
Out[28]:
array([[21, 'Marry'],
       [24, 'John']], dtype=object)

In [29]: vals[0,0] = 'foo'

In [30]: vals
Out[30]:
array([['foo', 'Marry'],
       [24, 'John']], dtype=object)

In [31]: df
Out[31]:
          age   name
student1   21  Marry
student2   24   John
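
Instead of mutating an array and then inspecting the frame, memory sharing can also be checked with `np.shares_memory`. The sketch below assumes NumPy is installed alongside pandas and covers only the mixed-dtype case, where the 2D array has to be assembled from separately stored columns. Behavior for single-dtype frames differs between pandas versions (for example when Copy-on-Write is enabled), so it is deliberately left out here:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"age": [21, 24], "name": ["Marry", "John"]},
    index=["student1", "student2"],
)

vals = df.to_numpy()              # object array assembled from both columns
age_data = df["age"].to_numpy()   # integer data of the 'age' column

# False: the assembled object array does not alias the integer column
shares = np.shares_memory(vals, age_data)

vals[0, 0] = "foo"                # changes only the assembled array
unchanged = df.loc["student1", "age"] == 21

print(shares, unchanged)
```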

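Note, however, that a view is only returned when one is possible (i.e. when the selection is a proper slice); otherwise a copy is made regardless of dtypes: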

In [39]: df.loc['student3'] = ['old','Tom']


In [40]: df2
Out[40]:
          name
student3   Tom
student2  John

In [41]: df2.loc[:] = 'foo'

In [42]: df2
Out[42]:
         name
student3  foo
student2  foo

In [43]: df
Out[43]:
          age   name
student1   21  Marry
student2   24   John
student3  old    Tom
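
Because the view-or-copy outcome depends on both the dtypes and the indexer, code that relies on it is fragile. A more predictable pattern, offered here as a suggestion rather than as part of the original answer, is to write through a single `.loc` call when the original should change, and to call `.copy()` when an independent object is wanted:

```python
import pandas as pd

df = pd.DataFrame(
    {"age": [21, 24], "name": ["Marry", "John"]},
    index=["student1", "student2"],
)

# Intend to modify the original: assign through one .loc call
df.loc["student1", "age"] = 33

# Intend to work on an independent row: take an explicit copy
row = df.loc["student1"].copy()
row["age"] = 99   # does not touch df

print(df.loc["student1", "age"])  # 33
```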
