Question:

Different results from the same NLTK-based Python code on different computers

单于俊智

2023-03-14

I wrote the following code. It works fine on my computer, but on other computers it returns empty results. Can you help me solve this problem?

import string
import nltk
from nltk.tokenize import RegexpTokenizer
from nltk.corpus import stopwords

def preprocess(sentence):
    sentence = sentence.lower()
    specialChrs={'\xc2',''} 
    pattern=pattern = r'''(?x)               # set flag to allow verbose regexps
              ([A-Z]\.)+         # abbreviations, e.g. U.S.A.
              | \$?\d+%?
              | \$?\d+(,|.\d+)*
              | \w+([-'/]\w+)*    # words w/ optional internal hyphens/apostrophe
              |/\m+([-'/]\w+)*
            '''
    tokenizer = RegexpTokenizer(pattern)
    tokens = tokenizer.tokenize(sentence)
    print tokens
    realToken= [e for e in tokens if  len(e)>= 3 and len(e)<10]
    stopWords = set(stopwords.words('english'))
    stop_words = [w for w in realToken if not w in stopWords]
    filtered_words = [w for w in stop_words if not w in specialChrs]
    print filtered_words
   # final_words = [w for w in filtered_words if not w[0]=='0' and w[1]=='x']
    return filtered_words


str='I have one generalized rule, where in shellscript I check for all need packages, if any package does not exist, then install it other wise skip to next check. As I need to check and execute few other python as well shellscripts, I am using it. Is using shellscript for this is bad idea?'
preprocess(str)

This is part of the output on my computer:

['i', 'have', 'one', 'generalized', 'rule', 'where', 'in', 'shellscript', 'i', 'check', 'for', 'all', 'need', ..., 'idea']

Results on the other computer:

[('', '', '', ''), ('', '', '', ''), ('', '', '', ''), ('', '', '', ''), ('', '', '', ''), ('', '', '', ''), ('', '', '', ''),... ]

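My computer's information:

Python 2.7.12 |Anaconda 2.3.0 (64-bit)| (default, Jul  2 2016, 17:42:40) [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2 Type "help", "copyright", "credits" or "license" for more information. Anaconda is brought to you by Continuum Analytics. Please check out: http://continuum.io/thanks and https://anaconda.org

import nltk

print('The nltk version is {}.'.format(nltk.__version__))

The nltk version is 3.2.1.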

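My friend's computer:

Python 2.7.12 |Anaconda 4.1.1 (64-bit)| (default, Jun 29 2016, 11:42:40) [MSC v.1500 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. Anaconda is brought to you by Continuum Analytics. Please check out: http://continuum.io/thanks and https://anaconda.org

import nltk

print('The nltk version is {}.'.format(nltk.__version__))

The nltk version is 3.2.1.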

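I also tested my code on another computer and got the same result.

That computer's information is:

Python 2.7.3 (default, Oct 26 2016, 21:01:49) [GCC 4.6.3] on linux2 Type "help", "copyright", "credits" or "license" for more information.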

1 answer

周枫涟

2023-03-14

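Your question has already been answered on this page.

To solve the problem, change your regular expression as follows, replacing each capturing group `(...)` with a non-capturing group `(?:...)`.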

`pattern = r'''(?x)          # set flag to allow verbose regexps
            (?:[A-Z]\.)+        # abbreviations, e.g. U.S.A.
         | \$?\d+(?:\.\d+)?%?
         | \w+(?:-\w+)*        # words with optional internal hyphens
         |/\m+(?:[-'/]\w+)*
      '''`
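
A likely reason the non-capturing groups matter, which the answer does not spell out: Python's `re.findall` returns the contents of the groups, not the whole match, whenever the pattern contains capturing groups. As far as I can tell, `RegexpTokenizer` in recent NLTK versions relies on `findall`, which would explain the tuples of empty strings. The sketch below uses only the standard `re` module and a shortened pattern of my own (an assumption for illustration, not the full pattern from the question).

```python
import re

text = "i have one generalized rule"

# Shortened illustrative patterns (hypothetical, not the asker's full pattern).
# Capturing groups: findall returns one tuple of group contents per match.
capturing = r"\w+(-\w+)*|\$?\d+(\.\d+)?%?"
# Non-capturing groups (?:...): findall returns the full matched text.
non_capturing = r"\w+(?:-\w+)*|\$?\d+(?:\.\d+)?%?"

with_groups = re.findall(capturing, text)
without_groups = re.findall(non_capturing, text)

# The number of capturing groups is a quick way to spot the problem.
n_capturing = re.compile(capturing).groups
n_non_capturing = re.compile(non_capturing).groups

print(with_groups)      # tuples of empty strings, one per word
print(without_groups)   # the words themselves
print(n_capturing)
print(n_non_capturing)
```

If `re.compile(pattern).groups` is greater than zero for a pattern passed to a `findall`-based tokenizer, the tokens will come back as group contents and not as the matched words.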
