Question:

java.util.logging.Logger prints the same message multiple times after a loop

关飞翼
2023-03-14

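I have a for loop, and after its block closes, the logger prints a message. I don't understand what happens next: the message appears about as many times as the loop has iterations, which is strange. In this case, the call

log.log(Level.INFO , "Translated Setences from "+countOfTranslated+" / "+sentenses.size()+" successfully");

comes after the for loop block has closed, so the repetition makes no sense to me. If anyone knows what is going on, please share. Here is my full code: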

package mehritco.ir.megnatis.institute.reflex.nlp;


import java.util.logging.Level;
import java.util.logging.Logger;

import javax.annotation.Nullable;

import org.json.JSONObject;

import edu.stanford.nlp.coref.data.CorefChain;
import edu.stanford.nlp.ling.*;
import edu.stanford.nlp.ie.util.*;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.semgraph.*;
import edu.stanford.nlp.sentiment.SentimentCoreAnnotations;

import edu.stanford.nlp.util.CoreMap;
import mehritco.ir.megnatis.institute.reflex.instagram.Application;
import mehritco.ir.megnatis.institute.reflex.instagram.repository.RepositoryTranslate;
import mehritco.ir.megnatis.tools.file.BasicLocation;


import java.util.*;

/**
 * https://stanfordnlp.github.io/CoreNLP/human-languages.html
 * https://stanfordnlp.github.io/stanfordnlp/models.html
 * @author Megnatis
 *
 */
public class TextAnalyzer {
    public static String unTranslatedText = """
    درما در اینجا هستیم . سلام خوبی؟ وقت بخیر چیکار میکنی
تسلیت 
راستی یادت نره بیای . دوست دارم عزیزم
چه میشه کرد؟
اونجایی
            """;
    
    /**
     * @link https://stanfordnlp.github.io/CoreNLP/ssplit.html#sentence-splitting-from-java
     * @param paragraph Max-length is 1024 byte 
     * @return
     */
    public static ArrayList<String> breakParagraphToSentence(String paragraph){
        ArrayList<String> sentenses = new ArrayList<String>();
        Properties props = new Properties();
        
         props.setProperty("annotators", "tokenize, ssplit");
         /**
             * @Link https://stanfordnlp.github.io/CoreNLP/ssplit.html#options
             * Whether to treat newlines as sentence breaks. This property has 3 legal values.
             * “always” means that a newline is always a sentence break
             * (but there still may be multiple sentences per line).
             */
         props.setProperty("ssplit.newlineIsSentenceBreak", "always");
            /**
             * @link https://stanfordnlp.github.io/CoreNLP/tokenize.html
             * Java character offsets (stored in the CharacterOffset{Begin,End}Annotation) 
             * are in terms of 16-bit char’s, with Unicode characters outside the basic 
             * multilingual plane encoded as two chars via a surrogate pair.
             * if need emoji use true .
             */
         props.setProperty("tokenize.codepoint", "true");
         
         StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
         CoreDocument document = new CoreDocument(paragraph);
            pipeline.annotate(document);
            for (CoreSentence sentence : document.sentences()) {
                sentenses.add(sentence.text());//split sentence with newLine char
            }
        return sentenses;
    }
    /**
     * For create Paragraph from TranslatedText
     * @param sentenses
     * @param log
     * @return String translated Paragraph
     */
    @Nullable
    public static String getParagraphInTranslated(ArrayList<String> sentenses , Logger log) {
        //Check null sentence or not
                if(sentenses == null || sentenses.size() < 0 || sentenses.isEmpty()) {
                    return null ;
                }
                String allSentenceToGetter = "";
                int countOfTranslated = 0;
                for (String sentense : sentenses) {
                      RepositoryTranslate repositoryTranslate = new RepositoryTranslate();
                      Logger logger = Application.setupLog(BasicLocation.getBaseFileDir()+"logs");
                      JSONObject translateJson =  repositoryTranslate.translate(sentense, "fa", "en", logger);
                      boolean isTranslated = translateJson.optBoolean(RepositoryTranslate.IS_TRANSLATE);
                      if(isTranslated) {
                          String textFromTranslate = translateJson.optString(RepositoryTranslate.TRANSLATE_TEXT);
                          allSentenceToGetter += (textFromTranslate+"\n") ;
                          countOfTranslated++;
                      }
                }
                
                log.log(Level.INFO , "Translated Setences from "+countOfTranslated+" / "+sentenses.size()+" successfully");
                return allSentenceToGetter;
    }
    /**
     * @param sentenses Use breakParagraphToSentence() to get input
     */
    public static void analyzer(ArrayList<String> sens , Logger log) {
        String paragraphTranslated = null;
        //Check null sentence or not
        if(sens == null || sens.size() < 0 || sens.isEmpty()) {
            return ;
        }else {
            paragraphTranslated = getParagraphInTranslated(sens, log);
        }
        if(paragraphTranslated == null) {
            return;
        }
         Properties properties = new Properties();
         properties.setProperty("annotators", "tokenize, ssplit, pos, lemma, parse, sentiment");
         properties.setProperty("ssplit.newlineIsSentenceBreak", "always");
         properties.setProperty("tokenize.codepoint", "true");
         StanfordCoreNLP pipeline = new StanfordCoreNLP(properties);
        
         Annotation annotation = pipeline.process(paragraphTranslated);
            List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);

           // sentences
            for (CoreMap sentence : sentences) {
              String sentiment = sentence.get(SentimentCoreAnnotations.SentimentClass.class);
              System.out.println(sentiment + "\t" + sentence);
            }
            //Tokenizing process https://stanfordnlp.github.io/CoreNLP/tokenize.html
            CoreDocument doc = new CoreDocument(unTranslatedText);
            pipeline.annotate(doc);
//          for (CoreMap sentence : sentences) {
//            // Get the parse tree for each sentence
//            Tree parseTree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
//            // Do something interesting with the parse tree!
//            System.out.println(parseTree);
//          }
            for (CoreLabel tok : doc.tokens()) {
                System.out.println(String.format("%s\t%d\t%d", tok.word(), tok.beginPosition(), tok.endPosition()));
              }
        
    }
    
    

      public static void main(String[] args)  {
          Logger logger = Application.setupLog(BasicLocation.getBaseFileDir()+"logs");
          analyzer(breakParagraphToSentence(unTranslatedText), logger);


        }

    }

The output is:


[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
 -> 2022/05/04 07:40:47.140 {INFO} [mehritco.ir.megnatis.institute.reflex.nlp.TextAnalyzer At Method :getParagraphInTranslated (Line :0) ]  : Translated Setences from 7 / 7 successfully
 -> 2022/05/04 07:40:47.150 {INFO} [mehritco.ir.megnatis.institute.reflex.nlp.TextAnalyzer At Method :getParagraphInTranslated (Line :0) ]  : Translated Setences from 7 / 7 successfully
 -> 2022/05/04 07:40:47.155 {INFO} [mehritco.ir.megnatis.institute.reflex.nlp.TextAnalyzer At Method :getParagraphInTranslated (Line :0) ]  : Translated Setences from 7 / 7 successfully
 -> 2022/05/04 07:40:47.162 {INFO} [mehritco.ir.megnatis.institute.reflex.nlp.TextAnalyzer At Method :getParagraphInTranslated (Line :0) ]  : Translated Setences from 7 / 7 successfully
 -> 2022/05/04 07:40:47.167 {INFO} [mehritco.ir.megnatis.institute.reflex.nlp.TextAnalyzer At Method :getParagraphInTranslated (Line :0) ]  : Translated Setences from 7 / 7 successfully
 -> 2022/05/04 07:40:47.172 {INFO} [mehritco.ir.megnatis.institute.reflex.nlp.TextAnalyzer At Method :getParagraphInTranslated (Line :0) ]  : Translated Setences from 7 / 7 successfully
 -> 2022/05/04 07:40:47.177 {INFO} [mehritco.ir.megnatis.institute.reflex.nlp.TextAnalyzer At Method :getParagraphInTranslated (Line :0) ]  : Translated Setences from 7 / 7 successfully
 -> 2022/05/04 07:40:47.182 {INFO} [mehritco.ir.megnatis.institute.reflex.nlp.TextAnalyzer At Method :getParagraphInTranslated (Line :0) ]  : Translated Setences from 7 / 7 successfully
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
[main] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words-distsim.tagger ... done [0.7 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse
[main] INFO edu.stanford.nlp.parser.common.ParserGrammar - Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ... done [0.7 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator sentiment
[main] INFO edu.stanford.nlp.sentiment.SentimentModel - Loading sentiment model edu/stanford/nlp/models/sentiment/sentiment.ser.gz ... done [0.1 sec].
Neutral We are here.
Neutral Hi how are you?
Neutral Good morning, what are you doing?
Neutral condolences
Negative    I really do not remember.
Positive    I love you baby
Neutral What can be done?
Neutral There
درما    1   5
در  6   8
اینجا   9   14
هستیم   15  20
.   21  22
سلام    23  27
خوبی    28  32
؟   32  33
وقت 34  37
بخیر    38  42
چیکار   43  48
میکنی   49  54
تسلیت   55  60
راستی   61  66
یادت    67  71
نره 72  75
بیای    76  80
.   81  82
دوست    83  87
دارم    88  92
عزیزم   93  98
چه  99  101
میشه    102 106
کرد 107 110
؟   110 111
اونجایی 112 119

The expected output is:


[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
 -> 2022/05/04 07:40:47.140 {INFO} [mehritco.ir.megnatis.institute.reflex.nlp.TextAnalyzer At Method :getParagraphInTranslated (Line :0) ]  : Translated Setences from 7 / 7 successfully
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
[main] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words-distsim.tagger ... done [0.7 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse
[main] INFO edu.stanford.nlp.parser.common.ParserGrammar - Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ... done [0.7 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator sentiment
[main] INFO edu.stanford.nlp.sentiment.SentimentModel - Loading sentiment model edu/stanford/nlp/models/sentiment/sentiment.ser.gz ... done [0.1 sec].
Neutral We are here.
Neutral Hi how are you?
Neutral Good morning, what are you doing?
Neutral condolences
Negative    I really do not remember.
Positive    I love you baby
Neutral What can be done?
Neutral There
درما    1   5
در  6   8
اینجا   9   14
هستیم   15  20
.   21  22
سلام    23  27
خوبی    28  32
؟   32  33
وقت 34  37
بخیر    38  42
چیکار   43  48
میکنی   49  54
تسلیت   55  60
راستی   61  66
یادت    67  71
نره 72  75
بیای    76  80
.   81  82
دوست    83  87
دارم    88  92
عزیزم   93  98
چه  99  101
میشه    102 106
کرد 107 110
؟   110 111
اونجایی 112 119

1 answer

微生曾琪
2023-03-14

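Evidently this line

    paragraphTranslated = getParagraphInTranslated(sens, log);  

is executed 8 times.

That can happen if

  • the text contains 8 paragraphs, or
  • the code that splits the text into paragraphs or sentences has a general bug. In that case, you will find it in the debugger.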
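
One more cause worth ruling out while debugging: the code in the question calls `Application.setupLog(...)` once in `main` and once per loop iteration, so with 7 sentences it is called 8 times in total. The source of `setupLog` is not shown, so the following is only a sketch under the assumption that it attaches a new handler to the same named logger on every call. In `java.util.logging`, `Logger.getLogger` returns the same instance for a given name and every attached handler publishes each record, so a single `log.log(...)` call would then show up 8 times. The `setupLog` stand-in, `CollectingHandler`, and the logger names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class DuplicateHandlerDemo {

    /** Handler that stores each published message in memory so the outputs can be counted. */
    static final class CollectingHandler extends Handler {
        private final List<String> sink;

        CollectingHandler(List<String> sink) {
            this.sink = sink;
        }

        @Override
        public void publish(LogRecord record) {
            if (isLoggable(record)) {
                sink.add(record.getMessage());
            }
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    /**
     * Hypothetical stand-in for Application.setupLog (an assumption, not the real code):
     * it looks up the logger by name and attaches a NEW handler on every call.
     */
    static Logger setupLog(String name, List<String> sink) {
        Logger logger = Logger.getLogger(name); // same name -> same Logger instance
        logger.setUseParentHandlers(false);     // keep the root console handler out of the count
        logger.addHandler(new CollectingHandler(sink));
        return logger;
    }

    /** Calls setupLog the given number of times (at least once), logs ONE message, counts the outputs. */
    static int linesPrintedForOneLogCall(String loggerName, int setupCalls) {
        List<String> sink = new ArrayList<>();
        Logger logger = setupLog(loggerName, sink);
        for (int i = 1; i < setupCalls; i++) {
            setupLog(loggerName, sink); // each extra call adds one more handler to the same logger
        }
        logger.log(Level.INFO, "Translated Setences from 7 / 7 successfully");
        return sink.size();
    }

    public static void main(String[] args) {
        System.out.println(linesPrintedForOneLogCall("demo.once", 1));  // prints 1
        System.out.println(linesPrintedForOneLogCall("demo.eight", 8)); // prints 8
    }
}
```

If this turns out to be the cause, creating the logger once and passing it into the loop, instead of calling `setupLog` per sentence, should leave a single handler. Printing `logger.getHandlers().length` just before the `log.log(...)` call is a quick way to check.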