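When I tried to convert a date column in a DataFrame to pandas' native datetime type, I ran into the error below. Converting only the first few rows works, but converting the whole column fails.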
reader = pd.read_csv(f'新建文件夹/2020-12-22-5-10.csv', usecols=['passCarTime'],dtype={'passCarTime':'string'})
pd.to_datetime(reader.passCarTime.head())
Out[98]:
0 2020-12-22 10:00:00
1 2020-12-22 10:00:00
2 2020-12-22 10:00:00
3 2020-12-22 10:00:00
4 2020-12-22 10:00:00
Name: passCarTime, dtype: datetime64[ns]
pd.to_datetime(reader.passCarTime)
Traceback (most recent call last):
File "D:\PyCharm2020\python2020\lib\site-packages\pandas\core\arrays\datetimes.py", line 2085, in objects_to_datetime64ns
values, tz_parsed = conversion.datetime_to_datetime64(data)
File "pandas\_libs\tslibs\conversion.pyx", line 350, in pandas._libs.tslibs.conversion.datetime_to_datetime64
TypeError: Unrecognized value type: <class 'str'>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\PyCharm2020\python2020\lib\site-packages\IPython\core\interactiveshell.py", line 3427, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-99-e1b00dc18517>", line 1, in <module>
pd.to_datetime(reader.passCarTime)
File "D:\PyCharm2020\python2020\lib\site-packages\pandas\core\tools\datetimes.py", line 801, in to_datetime
cache_array = _maybe_cache(arg, format, cache, convert_listlike)
File "D:\PyCharm2020\python2020\lib\site-packages\pandas\core\tools\datetimes.py", line 178, in _maybe_cache
cache_dates = convert_listlike(unique_dates, format)
File "D:\PyCharm2020\python2020\lib\site-packages\pandas\core\tools\datetimes.py", line 465, in _convert_listlike_datetimes
result, tz_parsed = objects_to_datetime64ns(
File "D:\PyCharm2020\python2020\lib\site-packages\pandas\core\arrays\datetimes.py", line 2090, in objects_to_datetime64ns
raise e
File "D:\PyCharm2020\python2020\lib\site-packages\pandas\core\arrays\datetimes.py", line 2075, in objects_to_datetime64ns
result, tz_parsed = tslib.array_to_datetime(
File "pandas\_libs\tslib.pyx", line 364, in pandas._libs.tslib.array_to_datetime
File "pandas\_libs\tslib.pyx", line 591, in pandas._libs.tslib.array_to_datetime
File "pandas\_libs\tslib.pyx", line 726, in pandas._libs.tslib.array_to_datetime_object
File "pandas\_libs\tslib.pyx", line 717, in pandas._libs.tslib.array_to_datetime_object
File "pandas\_libs\tslibs\parsing.pyx", line 243, in pandas._libs.tslibs.parsing.parse_datetime_string
File "D:\PyCharm2020\python2020\lib\site-packages\dateutil\parser\_parser.py", line 1374, in parse
return DEFAULTPARSER.parse(timestr, **kwargs)
File "D:\PyCharm2020\python2020\lib\site-packages\dateutil\parser\_parser.py", line 649, in parse
raise ParserError("Unknown string format: %s", timestr)
dateutil.parser._parser.ParserError: Unknown string format: passCarTime
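This isn't my field and my English isn't good, so I can't really tell what went wrong here. I looked through the date column in the file and found no missing values and no badly formatted dates... which is strange. Comments are welcome, thanks in advance!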
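When converting, add the parameter errors='coerce'. The last line of the traceback, "Unknown string format: passCarTime", says that the value that could not be parsed is the literal string passCarTime, so the column name itself appears as a value somewhere in the column (possibly a header row repeated inside the file), and head() simply does not reach it. With errors='coerce', any value that cannot be parsed becomes NaT instead of raising an exception.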
reader = pd.read_csv(f'新建文件夹/2020-12-22-5-10.csv', usecols=['passCarTime'],dtype={'passCarTime':'string'})
reader.passCarTime = pd.to_datetime(reader.passCarTime,errors='coerce') # this is the change: unparseable values become NaT
reader.passCarTime.head()
Out[120]:
0 2020-12-22 10:00:00
1 2020-12-22 10:00:00
2 2020-12-22 10:00:00
3 2020-12-22 10:00:00
4 2020-12-22 10:00:00
Name: passCarTime, dtype: datetime64[ns]
reader.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 307707 entries, 0 to 307706
Data columns (total 1 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 passCarTime 307703 non-null datetime64[ns]
dtypes: datetime64[ns](1)
memory usage: 2.3 MB
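Note that info() reports 307703 non-null values out of 307707 entries, so 4 values were turned into NaT. To see which rows those are rather than silently dropping them, something like the following may help. This is only a minimal sketch: the small `raw` Series is a made-up stand-in for the real CSV column, and the names `bad_mask`, `bad_values` and `cleaned` are just illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the CSV column: valid timestamps plus a stray
# header string, which is what the traceback suggests is in the real file.
raw = pd.Series(
    ["2020-12-22 10:00:00", "passCarTime", "2020-12-22 10:00:01"],
    name="passCarTime",
    dtype="string",
)

# Unparseable values become NaT instead of raising.
parsed = pd.to_datetime(raw, errors="coerce")

# Rows that had a value in the input but could not be parsed as a date.
bad_mask = parsed.isna() & raw.notna()
bad_values = raw[bad_mask]
print(bad_values)

# Keep only the rows that parsed successfully.
cleaned = parsed[~bad_mask]
```

With the real data, replacing `raw` by reader.passCarTime (read as strings, before conversion) should show the offending rows and their index positions, which makes it easier to check the file at those lines.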