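I have asked this question several times in different ways. Every time I make a breakthrough, I run into another problem. That is partly because I am not yet proficient in Java and have difficulty with collections such as Map, so please bear with me.
I have two maps like this: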
Map1 -{ORGANIZATION=[Fulton Tax Commissioner 's Office, Grady Hospital, Fulton Health Department], LOCATION=[Bellwood, Alpharetta]}
Map2 - {ORGANIZATION=[Atlanta Police Department, Fulton Tax Commissioner, Fulton Health Department], LOCATION=[Alpharetta], PERSON=[Bellwood, Grady Hospital]}
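The maps are defined as: LinkedHashMap<String, List<String>>
I am comparing these two maps by value. There are only three keys: ORGANIZATION, PERSON and LOCATION. Map1 is the gold set against which I compare Map2. The problem is that when I iterate over the values of the ORGANIZATION key in Map1 and look for matches in Map2, I get incorrect results: even though my first entry has a partial match in Map2 (Fulton Tax Commissioner), the comparison fails because it does not match the first entry of Map2 (Atlanta Police Department). I am looking for both exact and partial matches. The outcome should be incremented true-positive, false-positive and false-negative counters, so that I can eventually compute precision and recall for named entity recognition.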
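To make the end goal concrete, here is a minimal sketch of how the three counters could be turned into precision, recall and F1. The class and method names (NerMetrics, precision, recall, f1) are hypothetical and not part of the code in this thread, and returning 0.0 for an empty denominator is an assumed convention.

```java
public class NerMetrics {
    // precision = TP / (TP + FP); 0.0 when nothing was predicted (assumed convention)
    public static double precision(int tp, int fp) {
        return (tp + fp) == 0 ? 0.0 : (double) tp / (tp + fp);
    }

    // recall = TP / (TP + FN); 0.0 when the gold set is empty (assumed convention)
    public static double recall(int tp, int fn) {
        return (tp + fn) == 0 ? 0.0 : (double) tp / (tp + fn);
    }

    // F1 is the harmonic mean of precision and recall
    public static double f1(int tp, int fp, int fn) {
        double p = precision(tp, fp);
        double r = recall(tp, fn);
        return (p + r) == 0.0 ? 0.0 : 2 * p * r / (p + r);
    }

    public static void main(String[] args) {
        // Expected ORGANIZATION counts from the question: TP=2, FP=1, FN=1
        System.out.println("precision = " + precision(2, 1));
        System.out.println("recall    = " + recall(2, 1));
        System.out.println("f1        = " + f1(2, 1, 1));
    }
}
```

With the expected ORGANIZATION counts, precision and recall are both 2/3.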
Edit
The result I am expecting is:
Organization:
True Positive Count = 2
False Negative Count = 1
False Positive Count = 1
Person:
False Positive Count = 2
Location:
True Positive Count = 1
False Negative Count = 1
The output I am currently getting is:
Organization:
True Positive Count = 1
False Negative Count = 2
False Positive Count = 0
Person:
True Positive Count = 0
False Negative Count = 0
False Positive Count = 2
Location:
True Positive Count = 0
False Negative Count = 1
False Positive Count = 0
Code:
private static List<Integer> compareMaps(LinkedHashMap<String, List<String>> annotationMap, LinkedHashMap<String, List<String>> rageMap)
{
List<Integer> compareResults = new ArrayList<Integer>();
if (!annotationMap.entrySet().containsAll(rageMap.entrySet())){
for (Entry<String, List<String>> rageEntry : rageMap.entrySet()){
if (rageEntry.getKey().equals("ORGANIZATION") && !(annotationMap.containsKey(rageEntry.getKey()))){
for (int j = 0; j< rageEntry.getValue().size(); j++) {
orgFalsePositiveCount++;
}
}
if (rageEntry.getKey().equals("PERSON") && !(annotationMap.containsKey(rageEntry.getKey()))){
// System.out.println(rageEntry.getKey());
// System.out.println(annotationMap.entrySet());
for (int j = 0; j< rageEntry.getValue().size(); j++) {
perFalsePositiveCount++;
}
}
if (rageEntry.getKey().equals("LOCATION") && !(annotationMap.containsKey(rageEntry.getKey()))){
for (int j = 0; j< rageEntry.getValue().size(); j++) {
locFalsePositiveCount++;
}
}
}
}
for (Entry<String, List<String>> entry : annotationMap.entrySet()){
int i_index = 0;
if (rageMap.entrySet().isEmpty()){
orgFalseNegativeCount++;
continue;
}
// for (Entry<String, List<String>> rageEntry : rageMap.entrySet()){
if (entry.getKey().equals("ORGANIZATION")){
for(String val : entry.getValue()) {
if (rageMap.get(entry.getKey()) == null){
orgFalseNegativeCount++;
continue;
}
recusion: for (int i = i_index; i< rageMap.get(entry.getKey()).size();){
String rageVal = rageMap.get(entry.getKey()).get(i);
if(val.equals(rageVal)){
orgTruePositiveCount++;
i_index++;
break recusion;
}
else if((val.length() > rageVal.length()) && val.contains(rageVal)){ //|| dataB.get(entryA.getKey()).contains(entryA.getValue())){
orgTruePositiveCount++;
i_index++;
break recusion;
}
else if((val.length() < rageVal.length()) && rageVal.contains(val)){
orgTruePositiveCount++;
i_index++;
break recusion;
}
else if(!val.contains(rageVal)){
orgFalseNegativeCount++;
i_index++;
break recusion;
}
else if(!rageVal.contains(val)){
orgFalsePositiveCount++;
i_index++;
break recusion;
}
}
}
}
......................... //(Same for person and location)
compareResults.add(orgTruePositiveCount);
compareResults.add(orgFalseNegativeCount);
compareResults.add(orgFalsePositiveCount);
compareResults.add(perTruePositiveCount);
compareResults.add(perFalseNegativeCount);
compareResults.add(perFalsePositiveCount);
compareResults.add(locTruePositiveCount);
compareResults.add(locFalseNegativeCount);
compareResults.add(locFalsePositiveCount);
System.out.println(compareResults);
return compareResults;
}
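I wrote this class to compare the maps:
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Map.Entry;
import java.util.function.BiPredicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;
public class MapComparison<K, V> {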
private final Map<K, Collection<ValueCounter>> temp;
private final Map<K, Collection<V>> goldMap;
private final Map<K, Collection<V>> comparedMap;
private final BiPredicate<V, V> valueMatcher;
public MapComparison(Map<K, Collection<V>> mapA, Map<K, Collection<V>> mapB, BiPredicate<V, V> valueMatcher) {
this.goldMap = mapA;
this.comparedMap = mapB;
this.valueMatcher = valueMatcher;
this.temp = new HashMap<>();
goldMap.forEach((key, valueList) -> {
temp.put(key, valueList.stream().map(value -> new ValueCounter(value, true)).collect(Collectors.toList()));
});
comparedMap.entrySet().stream().forEach(entry -> {
K key = entry.getKey();
Collection<V> valueList = entry.getValue();
if(temp.containsKey(key)) {
Collection<ValueCounter> existingMatches = temp.get(key);
Stream<V> falsePositives = valueList.stream().filter(v -> existingMatches.stream().noneMatch(mv -> mv.match(v)));
falsePositives.forEach(fp -> existingMatches.add(new ValueCounter(fp, false)));
} else {
temp.putIfAbsent(key, valueList.stream().map(value -> new ValueCounter(value, false)).collect(Collectors.toList()));
}
});
}
public String formatMatchedCounters() {
StringBuilder sb = new StringBuilder();
for(Entry<K, Collection<ValueCounter>> e : temp.entrySet()) {
sb.append(e.getKey()).append(":");
int[] counters = e.getValue().stream().collect(() -> new int[3], (a, b) -> {
a[0] += b.truePositiveCount;
a[1] += b.falsePositiveCount;
a[2] += b.falseNegativeCount;
}, (c, d) -> {
c[0] += d[0];
c[1] += d[1];
c[2] += d[2];
});
sb.append(String.format("\ntruePositiveCount=%s\nfalsePositiveCount=%s\nfalseNegativeCount=%s\n\n", counters[0], counters[1], counters[2]));
}
return sb.toString();
}
private class ValueCounter {
private final V goldValue;
private int truePositiveCount = 0;
private int falsePositiveCount = 0;
private int falseNegativeCount = 0;
ValueCounter(V value, boolean isInGoldMap) {
this.goldValue = value;
if(isInGoldMap) {
falseNegativeCount = 1;
} else {
falsePositiveCount = 1;
}
}
boolean match(V otherValue) {
boolean result = valueMatcher.test(goldValue, otherValue);
if(result) {
truePositiveCount++;
falseNegativeCount = 0;
}
return result;
}
}
}
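Essentially, it builds a union of the entries of both maps, where each entry carries its own mutable counters for matched values. The method formatMatchedCounters() simply iterates over these counters and sums them for each key.
The following test:
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiPredicate;
import org.junit.Before;
import org.junit.Test;
public class MapComparisonTest {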
private Map<String, Collection<String>> goldMap;
private Map<String, Collection<String>> comparedMap;
private BiPredicate<String, String> valueMatcher;
@Before
public void initMaps() {
goldMap = new HashMap<>();
goldMap.put("ORGANIZATION", Arrays.asList("Fulton Tax Commissioner", "Grady Hospital", "Fulton Health Department"));
goldMap.put("LOCATION", Arrays.asList("Bellwood", "Alpharetta"));
comparedMap = new HashMap<>();
comparedMap.put("ORGANIZATION", Arrays.asList("Atlanta Police Department", "Fulton Tax Commissioner", "Fulton Health Department"));
comparedMap.put("LOCATION", Arrays.asList("Alpharetta"));
comparedMap.put("PERSON", Arrays.asList("Bellwood", "Grady Hospital"));
valueMatcher = String::equalsIgnoreCase;
}
@Test
public void test() {
MapComparison<String, String> comparison = new MapComparison<>(goldMap, comparedMap, valueMatcher);
System.out.println(comparison.formatMatchedCounters());
}
}
produces the following result:
ORGANIZATION:
truePositiveCount=2
falsePositiveCount=1
falseNegativeCount=1
LOCATION:
truePositiveCount=1
falsePositiveCount=0
falseNegativeCount=1
PERSON:
truePositiveCount=0
falsePositiveCount=2
falseNegativeCount=0
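Note that I don't know how you want to compare similar values (e.g. "Fulton Tax Commissioner 's Office" vs. "Fulton Tax Commissioner"), so I decided to put that decision into the signature (in this case, a BiPredicate as a parameter).
For example, the string comparison could be implemented with the Levenshtein distance (here via StringUtils from Apache Commons Lang):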
valueMatcher = (s1, s2) -> StringUtils.getLevenshteinDistance(s1, s2) < 5;
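As a dependency-free alternative for the partial matches the question asks about, a matcher along the following lines might work. This is only a sketch: the names PartialMatchers and CONTAINS_EITHER_WAY are hypothetical, and treating case-insensitive containment in either direction as a match is an assumption about what "partial match" should mean.

```java
import java.util.Locale;
import java.util.function.BiPredicate;

public class PartialMatchers {
    // Hypothetical matcher: true when the two strings are equal ignoring case,
    // or when one of them contains the other.
    public static final BiPredicate<String, String> CONTAINS_EITHER_WAY = (a, b) -> {
        String x = a.trim().toLowerCase(Locale.ROOT);
        String y = b.trim().toLowerCase(Locale.ROOT);
        // An empty string is contained in every string, so handle it explicitly
        if (x.isEmpty() || y.isEmpty()) {
            return x.equals(y);
        }
        return x.contains(y) || y.contains(x);
    };

    public static void main(String[] args) {
        // Values taken from the maps in the question
        System.out.println(CONTAINS_EITHER_WAY.test(
                "Fulton Tax Commissioner 's Office", "Fulton Tax Commissioner"));
        System.out.println(CONTAINS_EITHER_WAY.test(
                "Grady Hospital", "Atlanta Police Department"));
    }
}
```

It could be passed as the valueMatcher argument in place of String::equalsIgnoreCase.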
If I understood you correctly, this may help.
I created a custom string class that overrides equals to allow partial matches:
public class MyCustomString {
private String myString;
public MyCustomString(String myString) {
this.myString = myString;
}
public String getMyString() {
return myString;
}
public void setMyString(String myString) {
this.myString = myString;
}
@Override
public boolean equals(Object obj) {
if (obj == null) {
return false;
}
if (getClass() != obj.getClass()) {
return false;
}
final MyCustomString other = (MyCustomString) obj;
if (!Objects.equals(this.myString, other.myString) && !other.myString.contains(this.myString)) {
return false;
}
return true;
}
// add getter and setter for myString
// or delegate needed methods to myString object.
@Override
public int hashCode() {
int hash = 3;
hash = 47 * hash + Objects.hashCode(this.myString);
return hash;
}
}
Here is the code I tried with the first part of your maps.
LinkedHashMap<String, List<MyCustomString>> sampleMap1 = new LinkedHashMap<String, List<MyCustomString>>();
sampleMap1.put("a", new ArrayList<>());
sampleMap1.get("a").add(new MyCustomString("Fulton Tax Commissioner 's Office"));
sampleMap1.get("a").add(new MyCustomString("Grady Hospital"));
sampleMap1.get("a").add(new MyCustomString("Fulton Health Department"));
LinkedHashMap<String, List<MyCustomString>> sampleMap2 = new LinkedHashMap<String, List<MyCustomString>>();
sampleMap2.put("a", new ArrayList<>());
sampleMap2.get("a").add(new MyCustomString("Atlanta Police Department"));
sampleMap2.get("a").add(new MyCustomString("Fulton Tax Commissioner"));
sampleMap2.get("a").add(new MyCustomString("Fulton Health Department"));
HashMap<String, Integer> resultMap = new HashMap<String, Integer>();
for (Map.Entry<String, List<MyCustomString>> entry : sampleMap1.entrySet()) {
String key1 = entry.getKey();
List<MyCustomString> value1 = entry.getValue();
List<MyCustomString> singleListOfMap2 = sampleMap2.get(key1);
if (singleListOfMap2 == null) {
// all entries are false negatives
System.out.println("Number of false N" + value1.size());
// skip the loop below, which would otherwise throw a NullPointerException
continue;
}
for (MyCustomString singleStringOfMap2 : singleListOfMap2) {
if (value1.contains(singleStringOfMap2)) {
//True positive
System.out.println("true");
} else {
//false negative
System.out.println("false N");
}
}
int size = singleListOfMap2.size();
System.out.println(size + " - number of true");
//false positive = size - true
}
for (String string : sampleMap2.keySet()) {
if (sampleMap1.get(string) == null) {
//all these are false positive
System.out.println("number of false P: " + sampleMap2.get(string).size());
}
}
I came up with a simplified version. This is the output I get:
Organization:
False Positive: Atlanta Police Department
True Positive: Fulton Tax Commissioner
True Positive: Fulton Health Department
False Negative: Grady Hospital
Person:
False Positive: Bellwood
False Positive: Grady Hospital
Location:
True Positive: Alpharetta
False Negative: Bellwood
[2, 1, 1, 0, 0, 2, 1, 1, 0]
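Here is the code I created:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
public class MapCompare {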
public static boolean listContains(List<String> annotationList, String value) {
if(annotationList.contains(value)) {
// 100% Match
return true;
}
for(String s: annotationList) {
if (s.contains(value) || value.contains(s)) {
// Partial Match
return true;
}
}
return false;
}
public static List<Integer> compareLists(List<String> annotationList, List<String> rageList){
List<Integer> compareResults = new ArrayList<Integer>();
if(annotationList == null || rageList == null) return Arrays.asList(0, 0, 0);
Integer truePositiveCount = 0;
Integer falseNegativeCount = 0;
Integer falsePositiveCount = 0;
for(String r: rageList) {
if(listContains(annotationList, r)) {
System.out.println("\tTrue Positive: " + r);
truePositiveCount ++;
} else {
System.out.println("\tFalse Positive: " + r);
falsePositiveCount ++;
}
}
for(String s: annotationList) {
if(listContains(rageList, s) == false){
System.out.println("\tFalse Negative: " + s);
falseNegativeCount ++;
}
}
compareResults.add(truePositiveCount);
compareResults.add(falseNegativeCount);
compareResults.add(falsePositiveCount);
System.out.println();
return compareResults;
}
private static List<Integer> compareMaps(LinkedHashMap<String, List<String>> annotationMap, LinkedHashMap<String, List<String>> rageMap) {
List<Integer> compareResults = new ArrayList<Integer>();
System.out.println("Organization:");
compareResults.addAll(compareLists(annotationMap.get("ORGANIZATION"), rageMap.get("ORGANIZATION")));
System.out.println("Person:");
compareResults.addAll(compareLists(annotationMap.get("PERSON"), rageMap.get("PERSON")));
System.out.println("Location:");
compareResults.addAll(compareLists(annotationMap.get("LOCATION"), rageMap.get("LOCATION")));
System.out.println(compareResults);
return compareResults;
}
public static void main(String[] args) {
LinkedHashMap<String, List<String>> Map1 = new LinkedHashMap<>();
List<String> m1l1 = Arrays.asList("Fulton Tax Commissioner's Office", "Grady Hospital", "Fulton Health Department");
List<String> m1l2 = Arrays.asList("Bellwood", "Alpharetta");
List<String> m1l3 = Arrays.asList();
Map1.put("ORGANIZATION", m1l1);
Map1.put("LOCATION", m1l2);
Map1.put("PERSON", m1l3);
LinkedHashMap<String, List<String>> Map2 = new LinkedHashMap<>();
List<String> m2l1 = Arrays.asList("Atlanta Police Department", "Fulton Tax Commissioner", "Fulton Health Department");
List<String> m2l2 = Arrays.asList("Alpharetta");
List<String> m2l3 = Arrays.asList("Bellwood", "Grady Hospital");
Map2.put("ORGANIZATION", m2l1);
Map2.put("LOCATION", m2l2);
Map2.put("PERSON", m2l3);
compareMaps(Map1, Map2);
}
}
Hope this helps!