Question:

Java regex: matching words with dots between letters, and including special words that contain dots

易风华

2023-03-14

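I have two questions. First, how do I include words that have dots between letters, such as "C.J.Johnson"? Second, is it possible to create a list of words containing dots that my regex will include? Basically, I want to search a text file for a set of words and list every sentence that contains them. My code: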

public void search_sentences() throws FileNotFoundException, IOException {
        //FileReader fr1 = new FileReader(get_File());
        BufferedReader br1 = new BufferedReader(new InputStreamReader(new FileInputStream(get_File()),  "UTF-8"));
        ArrayList<String> words = new ArrayList();
        PrintWriter writer = new PrintWriter("rivit.txt", "UTF-8");
        String str="";
        //String [] words = {};
        String sanat = get_Text();
        for(String w: sanat.split(", ")){
            words.add(w);
        }
        String word_re = words.get(0);

        for (int i = 1; i < words.size(); i++)
                word_re += "|" + words.get(i);
            word_re = "[^.!?]*\\b(" + word_re + ")\\b[^.!?]*[.!?]";
            while(br1.ready()) { str += br1.readLine(); }
            Pattern re = Pattern.compile(word_re, 
                    Pattern.MULTILINE | Pattern.COMMENTS | 
                    Pattern.CASE_INSENSITIVE);
            Matcher match = re.matcher(str);
            String sentenceString="";
            while (match .find()) {
                sentenceString = match.group(0);
                if(!txtFile.isSelected()){
                tekstiAlue.append(sentenceString);
                } else {
                    writer.println(sentenceString);
                }

            }
        writer.close();

    }

I think the first one is doable. I've tried adding //s to

1 answer

吕俊才

2023-03-14

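As Jeff Holt said in a comment, "look at Pattern.quote()":

[...] produces a String that can be used to create a Pattern that would match the string s as if it were a literal pattern.

Metacharacters or escape sequences in the input sequence will be given no special meaning.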
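To make the effect of quoting concrete, here is a minimal sketch using the name from the question. The class name `QuoteDemo` and the helper `matchesWhole` are made up for illustration; only `Pattern.quote`, `Pattern.compile` and `Matcher.matches` come from the JDK.

```java
import java.util.regex.Pattern;

final class QuoteDemo {

    // Hypothetical helper: true if the whole input matches the given regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String name = "C.J.Johnson";

        // Unquoted, each '.' is a metacharacter that matches any character.
        System.out.println(matchesWhole(name, "CxJyJohnson"));                // true

        // Quoted, the dots only match literal dots.
        System.out.println(matchesWhole(Pattern.quote(name), "CxJyJohnson")); // false
        System.out.println(matchesWhole(Pattern.quote(name), "C.J.Johnson")); // true
    }
}
```

In the question's code, the words are concatenated into the pattern unquoted, so a dotted word would behave like the first case.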

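import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Returns every sentence of fullText that contains word.
// specials lists dotted words (e.g. "i.e.") whose dots must not end a sentence.
public static List<String> findSentencesContaining(String fullText, String word, String[] specials) {
    Pattern p = buildRegexToFindSentencesContaining(word, specials);
    List<String> sentences = new ArrayList<>();
    for (Matcher m = p.matcher(fullText); m.find(); )
        sentences.add(m.group().replaceAll("\\s+", " ").trim()); // normalize group of whitespace into a single space
    return sentences;
}
public static Pattern buildRegexToFindSentencesContaining(String word, String[] specials) {
    // Sentence filler: any special word, or any character that does not end a sentence.
    StringJoiner regexText = new StringJoiner("|", "(?:", "|[^.!?])*").setEmptyValue("[^.!?]*");
    for (String s : specials)
        regexText.add(toWordRegex(s));
    // filler, the searched word, filler, then the sentence terminator
    String regex = regexText + toWordRegex(word) + regexText + "[.!?]";
    return Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
}
private static String toWordRegex(String word) {
    // Quote the word so that dots and other metacharacters are matched literally.
    String regex = Pattern.quote(word);
    // Add \b only on a side where the word itself starts or ends with a word character.
    if (word.matches("\\b.*"))
        regex = "\\b" + regex;
    if (word.matches(".*\\b"))
        regex = regex + "\\b";
    return regex;
}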

String fullText = "This is a test. We're testing that sentences\n" +
                  "can span multiple lines, i.e. that line" +
                  "terminators can appear in a sentence. We're\n" +
                  "also testing that sentences can contain\n" +
                  "special words containing sentence-ending\n" +
                  "\"words\", e.g. \"i.e.\" and \"etc.\". In\n" +
                  "addition, (special) word matching is\n" +
                  "case-insensitive.";
String[] specials = { "i.e.", "e.g.", "etc." };
for (String word : new String[] { "test", "also", "we're", "is", "happy" }) {
    System.out.println("Sentences containing word \"" + word + "\":");
    List<String> sentences = findSentencesContaining(fullText, word, specials);
    if (sentences.isEmpty())
        System.out.println("  ** NOT FOUND");
    else {
        for (String sentence : sentences)
            System.out.println("  " + sentence);
    }
}

Output

Sentences containing word "test":
  This is a test.
Sentences containing word "also":
  We're also testing that sentences can contain special words containing sentence-ending "words", e.g. "i.e." and "etc.".
Sentences containing word "we're":
  We're testing that sentences can span multiple lines, i.e. that lineterminators can appear in a sentence.
  We're also testing that sentences can contain special words containing sentence-ending "words", e.g. "i.e." and "etc.".
Sentences containing word "is":
  This is a test.
  In addition, (special) word matching is case-insensitive.
Sentences containing word "happy":
  ** NOT FOUND
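The method above searches for one word at a time, while the question builds its pattern from a list of words. The following is a rough sketch of how the two could be combined, not part of the original answer. The class name `MultiWordSentenceSearch` and the methods `findSentencesContainingAny` and `containsTerminator` are hypothetical. It assumes that any searched word containing `.`, `!` or `?` (such as "C.J.Johnson") should also be treated as a special word, so that its dots do not cut the sentence short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MultiWordSentenceSearch {

    // Same idea as the answer's toWordRegex: quote the word, and add \b only
    // on a side where the word itself begins or ends with a word character.
    static String toWordRegex(String word) {
        String regex = Pattern.quote(word);
        if (word.matches("\\b.*"))
            regex = "\\b" + regex;
        if (word.matches(".*\\b"))
            regex = regex + "\\b";
        return regex;
    }

    // Hypothetical helper: does the word contain a sentence-ending character?
    static boolean containsTerminator(String word) {
        return word.indexOf('.') >= 0 || word.indexOf('!') >= 0 || word.indexOf('?') >= 0;
    }

    // Returns the sentences of fullText that contain at least one of the words.
    static List<String> findSentencesContainingAny(String fullText, List<String> words, List<String> specials) {
        List<String> sentences = new ArrayList<>();
        if (words.isEmpty())
            return sentences; // nothing to search for

        // Sentence filler: a special word, or any character that does not end a sentence.
        StringJoiner filler = new StringJoiner("|", "(?:", "|[^.!?])*").setEmptyValue("[^.!?]*");
        for (String s : specials)
            filler.add(toWordRegex(s));

        // Alternation of all searched words, each one quoted.
        StringJoiner wanted = new StringJoiner("|", "(?:", ")");
        for (String w : words) {
            wanted.add(toWordRegex(w));
            if (containsTerminator(w))
                filler.add(toWordRegex(w)); // assumption: dotted search words act as specials too
        }

        Pattern p = Pattern.compile(filler + wanted.toString() + filler + "[.!?]",
                                    Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        for (Matcher m = p.matcher(fullText); m.find(); )
            sentences.add(m.group().replaceAll("\\s+", " ").trim());
        return sentences;
    }

    public static void main(String[] args) {
        String text = "We met C.J.Johnson today. Nobody else came. "
                    + "He left early, i.e. before noon.";
        List<String> found = findSentencesContainingAny(
                text, List.of("C.J.Johnson", "noon"), List.of("i.e."));
        for (String sentence : found)
            System.out.println(sentence);
        // We met C.J.Johnson today.
        // He left early, i.e. before noon.
    }
}
```

Only words that contain a terminator are added to the filler. Adding every searched word would let the same text match in more than one way and could cause heavy backtracking on long sentences.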
