Python: How to move files with Unicode file names to a Unicode folder

邢硕

2023-03-14

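Question:

I'm having trouble moving Unicode-named files between Unicode-named folders in a Python script under Windows...

What syntax would you use to find all files of type *.ext in a folder and move them to a relative location?

Assume the files and folders are Unicode.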

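Answer:

The basic problem is an unconverted mix of Unicode strings and byte strings. The fix is either to convert everything to a single format or to sidestep the problem with a bit of trickery. All of my solutions use the glob and shutil standard library modules.

For the sake of example, I have some Unicode file names ending in ods, and I want to move them to a subdirectory named א (Hebrew Aleph, a Unicode character).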

First solution - express the directory name as a byte string:

>>> import glob
>>> import shutil
>>> files=glob.glob('*.ods')      # List of Byte string file names
>>> for file in files:
...     shutil.copy2(file, 'א')   # Byte string directory name
...

Second solution - convert the file names to Unicode:

>>> import glob
>>> import shutil
>>> files=glob.glob(u'*.ods')     # List of Unicode file names
>>> for file in files:
...     shutil.copy2(file, u'א')  # Unicode directory name

Thanks to Ezio Melotti, Python bug list.
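
The snippets above copy files with shutil.copy2, while the question asks about moving them. As a minimal sketch, assuming Python 3 (where every str is already Unicode, so the byte/Unicode mix cannot arise as long as all paths are text), the move could look like the following. The function name move_matching and its parameters are hypothetical and not part of the original answer.

```python
import glob
import os
import shutil


def move_matching(src_dir, pattern, dest_dir):
    """Move every file in src_dir that matches pattern into dest_dir.

    All arguments are text (Unicode) strings, so paths are never mixed
    with byte strings. Returns the sorted list of destination paths.
    """
    # Create the destination folder if it does not exist yet.
    os.makedirs(dest_dir, exist_ok=True)
    moved = []
    for path in glob.glob(os.path.join(src_dir, pattern)):
        if os.path.isfile(path):  # skip directories that happen to match
            target = os.path.join(dest_dir, os.path.basename(path))
            shutil.move(path, target)
            moved.append(target)
    return sorted(moved)
```

For the example in this answer, a call such as move_matching('.', '*.ods', 'א') would be the equivalent of the copy loops above, except that the source files are removed after the transfer.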

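Third solution - avoid the destination Unicode directory name

Although this isn't the best solution in my opinion, there is a nice trick here that's worth mentioning.

Change into the destination directory with os.chdir(), then copy the files to it by referring to it as .: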

# -*- coding: utf-8 -*-
import os
import shutil
import glob

os.chdir('א')                   # CD to the destination Unicode directory
print os.getcwd()               # DEBUG: Make sure you're in the right place
files=glob.glob('../*.ods')     # List of Byte string file names
for file in files:
        shutil.copy2(file, '.') # Copy each file
# Don't forget to go back to the original directory here, if it matters
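
The closing comment above warns about returning to the original directory. One way to make that automatic is a small context manager. This is only a sketch, assuming Python 3; the names working_directory and copy_into are hypothetical helpers introduced for illustration.

```python
import contextlib
import glob
import os
import shutil


@contextlib.contextmanager
def working_directory(path):
    """Temporarily change the current directory, restoring it on exit."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs even if the copy raises, so the caller's cwd is preserved.
        os.chdir(previous)


def copy_into(dest_dir, pattern):
    """Copy files matching pattern in dest_dir's parent into dest_dir."""
    with working_directory(dest_dir):
        for path in glob.glob(os.path.join(os.pardir, pattern)):
            # '.' is the destination directory we just changed into.
            shutil.copy2(path, os.curdir)
```

With this, copy_into('א', '*.ods') performs the same steps as the script above and leaves the process in the directory it started from.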

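A deeper explanation

The straightforward approach, shutil.copy2(src, dest), fails because shutil joins the Unicode directory name with the byte-string file name without converting either one, so Python 2 tries to decode the bytes implicitly as ASCII: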

>>> files=glob.glob('*.ods')
>>> for file in files:
...     shutil.copy2(file, u'א')
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/lib/python2.6/shutil.py", line 98, in copy2
    dst = os.path.join(dst, os.path.basename(src))
  File "/usr/lib/python2.6/posixpath.py", line 70, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 1: 
                    ordinal not in range(128)

As mentioned earlier, this can be avoided by using the byte string 'א' instead of the Unicode string u'א'.
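
The same "convert to a single format" idea can be sketched for Python 3, which is an assumption here since the traceback above comes from Python 2.6. In Python 3, os.path.join refuses to mix str and bytes and raises TypeError instead of attempting an implicit ASCII decode, and os.fsdecode converts a byte-string path to text using the file system encoding. The helpers as_text and join_text are hypothetical names used only for illustration.

```python
import os


def as_text(path):
    """Return path as a text (Unicode) string, decoding bytes if needed."""
    return os.fsdecode(path)


def join_text(directory, name):
    """Join two path parts after normalizing both to text."""
    return os.path.join(as_text(directory), as_text(name))


# Mixing the two types directly is rejected in Python 3.
try:
    os.path.join("א", b"report.ods")
except TypeError as exc:
    print("mixed types rejected:", exc)

# Normalizing first gives a plain text path.
print(join_text("א", b"report.ods"))
```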

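Is this a bug?

In my opinion it is a bug, because Python cannot expect the base directory name to always be str rather than unicode. I have reported the issue in the Python bug list and am waiting for a response.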

Further reading

Python's official Unicode HOWTO
