Question:

Sklearn error: "None of [Int64Index([2, 3], dtype='int64')] are in the [columns]"

柳宾实

2023-03-14

Can someone explain why this code:

from sklearn.model_selection import train_test_split
import pandas as pd
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC
import numpy as np

#df = pd.read_csv('missing_data.csv',sep=',')

df = pd.DataFrame(np.array([[1, 2, 3,4,5,6,7,8,9,1],
                            [4, 5, 6,3,4,5,7,5,4,1],
                            [7, 8, 9,6,2,3,6,5,4,1],
                            [7, 8, 9,6,1,3,2,2,4,0],
                            [7, 8, 9,6,5,6,6,5,4,0]]),
                            columns=['a', 'b', 'c','d','e','f','g','h','i','j'])

X_train = df.iloc[:,:-1]
y_train = df.iloc[:,-1]


clf=SVC(kernel='linear')
kfold = StratifiedKFold(n_splits=2,random_state=42,shuffle=True)
for train_index,test_index in kfold.split(X_train,y_train):
    x_train_fold,x_test_fold = X_train[train_index],X_train[test_index]
    y_train_fold,y_test_fold = y_train[train_index],y_train[test_index]
    clf.fit(x_train_fold,y_train_fold)

raises this error:

Traceback (most recent call last):
  File "test_traintest.py", line 23, in <module>
    x_train_fold,x_test_fold = X_train[train_index],X_train[test_index]
  File "/Users/slowat/anaconda/envs/nlp_course/lib/python3.7/site-packages/pandas/core/frame.py", line 3030, in __getitem__
    indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
  File "/Users/slowat/anaconda/envs/nlp_course/lib/python3.7/site-packages/pandas/core/indexing.py", line 1266, in _get_listlike_indexer
    self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
  File "/Users/slowat/anaconda/envs/nlp_course/lib/python3.7/site-packages/pandas/core/indexing.py", line 1308, in _validate_read_indexer
    raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Int64Index([2, 3], dtype='int64')] are in the [columns]"

I have seen this answer, but my columns are all the same length.

1 answer

郑鸿朗

2023-03-14

KFold.split() returns the train and test indices as integer positions, which should be used with a DataFrame like this:

X_train.iloc[train_index]

With your syntax, X_train[train_index], you are trying to use them as column names. Change your code to:

from sklearn.model_selection import train_test_split
import pandas as pd
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC
import numpy as np

#df = pd.read_csv('missing_data.csv',sep=',')

df = pd.DataFrame(np.array([[1, 2, 3,4,5,6,7,8,9,1],
                            [4, 5, 6,3,4,5,7,5,4,1],
                            [7, 8, 9,6,2,3,6,5,4,1],
                            [7, 8, 9,6,1,3,2,2,4,0],
                            [7, 8, 9,6,5,6,6,5,4,0]]),
                            columns=['a', 'b', 'c','d','e','f','g','h','i','j'])

X_train = df.iloc[:,:-1]
y_train = df.iloc[:,-1]


clf=SVC(kernel='linear')
kfold = StratifiedKFold(n_splits=2,random_state=42,shuffle=True)
for train_index,test_index in kfold.split(X_train,y_train):
    x_train_fold,x_test_fold = X_train.iloc[train_index],X_train.iloc[test_index]
    y_train_fold,y_test_fold = y_train.iloc[train_index],y_train.iloc[test_index]
    clf.fit(x_train_fold,y_train_fold)

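Note that we use .iloc rather than .loc. This is because .iloc works with integer positions, like the ones we get from split(), whereas .loc works with index values (labels). In your case it doesn't matter, because the pandas index matches the integer positions, but in other projects you may run into cases where it doesn't, so use .iloc.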
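To make the difference concrete, here is a minimal sketch on a made-up frame (the column name, values, and index labels 100..103 are assumptions for illustration, not taken from the question). Because the index labels no longer coincide with the row positions, only .iloc does what the cross-validation loop needs:

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose index labels are NOT 0..n-1
df = pd.DataFrame({"a": [10, 20, 30, 40]}, index=[100, 101, 102, 103])

# The kind of integer-position array that split() yields
positions = np.array([2, 3])

# .iloc selects rows by position: the third and fourth rows
by_position = df.iloc[positions]
print(by_position["a"].tolist())

# .loc selects rows by label; labels 2 and 3 do not exist here
try:
    df.loc[positions]
except KeyError as exc:
    loc_error = exc
    print("loc failed:", exc)

# Plain brackets with a list-like key look up COLUMN labels,
# which is what produced the KeyError in the question
try:
    df[positions]
except KeyError as exc:
    bracket_error = exc
    print("brackets failed:", exc)
```

The exact wording of the KeyError message depends on the pandas version (newer releases print Index rather than Int64Index), but the cause is the same.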

Alternatively, when you extract X_train and y_train, you can convert them to numpy arrays:

X_train = df.iloc[:,:-1].to_numpy()
y_train = df.iloc[:,-1].to_numpy()

Then your original code will work as written, because numpy arrays handle integer-array indexing directly.
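For completeness, a sketch of the whole loop using the numpy variant, with the same toy data as in the question. The fold_sizes list is an addition for illustration only, so that the split can be inspected afterwards:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

df = pd.DataFrame(np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9, 1],
                            [4, 5, 6, 3, 4, 5, 7, 5, 4, 1],
                            [7, 8, 9, 6, 2, 3, 6, 5, 4, 1],
                            [7, 8, 9, 6, 1, 3, 2, 2, 4, 0],
                            [7, 8, 9, 6, 5, 6, 6, 5, 4, 0]]),
                  columns=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'])

# Convert to numpy once; plain brackets then index rows by position
X_train = df.iloc[:, :-1].to_numpy()
y_train = df.iloc[:, -1].to_numpy()

clf = SVC(kernel='linear')
kfold = StratifiedKFold(n_splits=2, random_state=42, shuffle=True)

fold_sizes = []  # illustration only: (train size, test size) per fold
for train_index, test_index in kfold.split(X_train, y_train):
    x_train_fold, x_test_fold = X_train[train_index], X_train[test_index]
    y_train_fold, y_test_fold = y_train[train_index], y_train[test_index]
    clf.fit(x_train_fold, y_train_fold)
    fold_sizes.append((len(train_index), len(test_index)))

print(fold_sizes)
```

One trade-off of this approach is that the column names are lost after to_numpy(), so keep the DataFrame and use .iloc if you need them later.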
