æ, ø, and å are the last letters of the Norwegian alphabet:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Æ Ø Å
When we try to sort with Hibernate Search (Lucene), Å is grouped with A, Ø is grouped with O, and Æ is grouped with A, which is incorrect. For example:
Current result:
Aaalu,Åaalu,Baalu,Zaalu,
Expected result:
Aaalu,Baalu,Zaalu,Åaalu,
Here is the code I am working with:
@AnalyzerDef(name = "myOwnAnalyzer",
tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
filters = {
@TokenFilterDef(factory = ASCIIFoldingFilterFactory.class),
@TokenFilterDef(factory = LowerCaseFilterFactory.class),
@TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
@Parameter(name = "pattern", value = "('-&\\.,\\(\\))"),
@Parameter(name = "replacement", value = " "),
@Parameter(name = "replace", value = "all")
}),
@TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
@Parameter(name = "pattern", value = "([^0-9\\p{L} ])"),
@Parameter(name = "replacement", value = ""),
@Parameter(name = "replace", value = "all")
}),
@TokenFilterDef(factory = TrimFilterFactory.class)
}
)
public class KikaPaya implements Serializable {
@Fields({ @Field(index = Index.YES, store = Store.YES), @Field(name = "KikaPayaName_for_sort", index = Index.YES, analyzer = @Analyzer(definition = "myOwnAnalyzer")) })
@Column(name = "NAME", length = 100)
private String name;
Main:
FullTextEntityManager ftem = Search.getFullTextEntityManager(factory.createEntityManager());
QueryBuilder qb = ftem.getSearchFactory().buildQueryBuilder().forEntity( KikaPaya.class ).get();
org.apache.lucene.search.Query query = qb.all().getQuery();
FullTextQuery fullTextQuery = ftem.createFullTextQuery(query, KikaPaya.class);
fullTextQuery.setSort(new Sort(new SortField("KikaPayaName_for_sort", SortField.STRING, true)));
fullTextQuery.setFirstResult(0).setMaxResults(150);
int size = fullTextQuery.getResultSize();
List<KikaPaya> result = fullTextQuery.getResultList();
for (KikaPaya user : result) {
logger.info("KikaPaya Name:" + user.getName());
}
Here are the Hibernate and Lucene versions (which I cannot change):
<hibernate.version>4.2.8.Final</hibernate.version>
<hibernate.search.version>4.3.0.Final</hibernate.search.version>
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-entitymanager</artifactId>
<version>4.2.8.Final</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-core</artifactId>
<version>3.6.2</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-analyzers</artifactId>
<version>3.6.2</version>
</dependency>
Can anyone suggest a way to get the correct result?
You can use the org.apache.lucene.collation.CollationKeyFilter class with Hibernate Search 4.3.0.Final. Create your own collation filter factory:
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.collation.CollationKeyFilter;
import org.apache.solr.analysis.BaseTokenFilterFactory;
import java.text.Collator;
import java.util.Locale;
public final class NorwegianCollationFactory extends BaseTokenFilterFactory {
@Override
public TokenStream create(TokenStream input) {
Collator norwegianCollator = Collator.getInstance(new Locale("no", "NO"));
return new CollationKeyFilter(input, norwegianCollator);
}
}
Then use this collation factory in the AnalyzerDef:
@AnalyzerDef(name = "myOwnAnalyzer",
tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
filters = {
@TokenFilterDef(factory = ASCIIFoldingFilterFactory.class),
@TokenFilterDef(factory = LowerCaseFilterFactory.class),
@TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
@Parameter(name = "pattern", value = "('-&\\.,\\(\\))"),
@Parameter(name = "replacement", value = " "),
@Parameter(name = "replace", value = "all")
}),
@TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
@Parameter(name = "pattern", value = "([^0-9\\p{L} ])"),
@Parameter(name = "replacement", value = ""),
@Parameter(name = "replace", value = "all")
}),
@TokenFilterDef(factory = TrimFilterFactory.class),
@TokenFilterDef(factory = NorwegianCollationFactory.class)
}
)
public class KikaPaya implements Serializable {
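To check the ordering that the Norwegian Collator produces, independently of Lucene and Hibernate Search, a small standalone sketch like the one below may help. It uses only java.text.Collator from the JDK; the class name NorwegianSortDemo, the method sortNorwegian, and the sample names are hypothetical and not part of the original code. The non-ASCII letters are written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class NorwegianSortDemo {

    // Returns a new list sorted with the JDK's Norwegian collation rules.
    public static List<String> sortNorwegian(List<String> names) {
        // Same language and country as new Locale("no", "NO") in the factory above.
        Collator collator = Collator.getInstance(Locale.forLanguageTag("no-NO"));
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(collator); // Collator implements Comparator<Object>
        return sorted;
    }

    public static void main(String[] args) {
        // Sample names starting with A-ring, Z, O-slash, A, AE ligature, and B.
        List<String> names = Arrays.asList(
                "\u00C5se", "Zara", "\u00D8ystein", "Arne", "\u00C6rlig", "Bj\u00F8rn");
        System.out.println(sortNorwegian(names));
    }
}
```

On a JDK that ships Norwegian locale data, this should place Æ, Ø, and Å after Z, in that order. Two caveats are worth verifying in your own setup. First, Norwegian collation rules may treat the digraph "aa" as a variant of "å", so sample values such as "Aaalu" might not sort where a plain A would. Second, filters run in the order they are listed, and ASCIIFoldingFilterFactory folds Å to A before the collation filter sees the token, so it may need to be removed from the sort analyzer for the collation keys to differ.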