How Reference Search (Find Usages) Works in IntelliJ IDEA

段干靖
2023-12-01

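We all know IntelliJ IDEA: a powerful IDE whose Community Edition is open source, and on which Android Studio is built. We use the Find Usages feature all the time to look up references to a class or method, so let's see how it is implemented internally.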

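IDEA is open source on GitHub: https://github.com/JetBrains/intellij-community. After cloning it you can open the project in IDEA itself, which feels wonderfully odd: the IDE can be used to develop itself :-D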

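The IDEA code base is enormous: the Java and Python sources alone add up to more than four million lines. In a project this size, looking for the relevant test cases is a good way to find the entry point of a feature. Searching for a Find Usages test turns up the class FindUsagesTest in the package com.intellij.java.psi.search. Most of its test cases call one function, ReferencesSearch.search, which looks promising, so let's follow it to its definition: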

  /**
   * Searches for references to the specified element in the scope in which such references are expected to be found, according to
   * dependencies and access rules.
   *
   * @param element the element (declaration) the references to which are requested.
   * @return the query allowing to enumerate the references.
   */
  @NotNull
  public static Query<PsiReference> search(@NotNull PsiElement element) {
    return search(element, GlobalSearchScope.allScope(PsiUtilCore.getProjectInReadAction(element)), false);
  }

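Roughly: it looks for references to the given element within a search scope (here GlobalSearchScope.allScope). Note that the function returns a Query interface rather than a ready-made list of results.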

This looks like the entry point for reference search, so let's keep following it.

  public static Query<PsiReference> search(@NotNull PsiElement element, @NotNull SearchScope searchScope, boolean ignoreAccessScope) {
    return search(new SearchParameters(element, searchScope, ignoreAccessScope));
  }

It packs the arguments from the previous step into a SearchParameters object. Skipping the unimportant details, we move on.

  /**
   * Searches for references to the specified element according to the specified parameters.
   *
   * @param parameters the parameters for the search (contain also the element the references to which are requested).
   * @return the query allowing to enumerate the references.
   */
  @NotNull
  public static Query<PsiReference> search(@NotNull final SearchParameters parameters) {
    final Query<PsiReference> result = INSTANCE.createQuery(parameters);
    if (parameters.isSharedOptimizer) {
      return uniqueResults(result);
    }

    final SearchRequestCollector requests = parameters.getOptimizer();

    final PsiElement element = parameters.getElementToSearch();

    return uniqueResults(new MergeQuery<>(result, new SearchRequestQuery(PsiUtilCore.getProjectInReadAction(element), requests)));
  }

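This step creates two Query objects, merges them, and returns the result wrapped by uniqueResults (a UniqueResultsQuery). The SearchRequestQuery looks important, so keep it in mind. The Query is returned so that callers can invoke its lookup methods, so let's see how one of them, findAll, is implemented: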

  @Override
  @NotNull
  public Collection<T> findAll() {
    List<T> result = Collections.synchronizedList(new ArrayList<>());
    Processor<T> processor = Processors.cancelableCollectProcessor(result);
    forEach(processor);
    return result;
  }

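The intent is clear: a result List is handed to a collecting Processor, filled during the traversal, and returned to the caller. The Processor only accumulates results into the List; the real work happens in forEach, which in turn calls forEach on the wrapped query, myOriginal: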

  private boolean process(@NotNull Set<M> processedElements, @NotNull Processor<? super T> consumer) {
    return myOriginal.forEach(new MyProcessor(processedElements, consumer));
  }
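
To make the Query/Processor contract concrete, here is a minimal plain-Java sketch of the pattern. The nested Processor and Query interfaces and the fromList and unique helpers are hypothetical stand-ins written for this post, not the platform's API; the only things taken from the excerpts above are the shape of findAll and the idea of wrapping an original query to filter duplicates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class QuerySketch {
    // Hypothetical stand-in for the platform's Processor.
    interface Processor<T> {
        boolean process(T item); // return false to stop the traversal early
    }

    // Hypothetical stand-in for the platform's Query.
    interface Query<T> {
        boolean forEach(Processor<? super T> consumer);

        // Same shape as the findAll shown above: collect results via a processor.
        default List<T> findAll() {
            List<T> result = new ArrayList<>();
            forEach(item -> { result.add(item); return true; });
            return result;
        }
    }

    // A query backed by a fixed list, standing in for a real search.
    static <T> Query<T> fromList(List<T> items) {
        return consumer -> {
            for (T item : items) {
                if (!consumer.process(item)) return false;
            }
            return true;
        };
    }

    // Wraps a query so that each distinct result reaches the consumer only once.
    static <T> Query<T> unique(Query<T> original) {
        return consumer -> {
            Set<T> seen = new HashSet<>();
            return original.forEach(item -> !seen.add(item) || consumer.process(item));
        };
    }

    public static void main(String[] args) {
        Query<String> query = unique(fromList(Arrays.asList("a", "b", "a", "c")));
        System.out.println(query.findAll()); // prints [a, b, c]
    }
}
```

Nothing is computed until forEach or findAll is called, which is why returning a Query instead of a list lets callers stop early or add filters without redoing the search.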

myOriginal is the MergeQuery created earlier. Its forEach ends up in processSubQuery:

  private <V extends T> boolean processSubQuery(@NotNull Query<V> subQuery, @NotNull final Processor<? super T> consumer) {
    return subQuery.forEach(consumer);
  }
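
The merging step can be sketched in a few lines. This is an illustration only: the Query interface, of and merge below are hypothetical names, and the sketch assumes the sub-queries are simply run one after the other, which may differ from what the real MergeQuery does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class MergeQuerySketch {
    // Hypothetical minimal query: feeds results to the consumer until it returns false.
    interface Query<T> {
        boolean forEach(Predicate<? super T> consumer);
    }

    // A query over a fixed list, standing in for a real search.
    static <T> Query<T> of(List<T> items) {
        return consumer -> {
            for (T item : items) {
                if (!consumer.test(item)) return false;
            }
            return true;
        };
    }

    // Runs the first sub-query, then the second, unless the consumer asked to stop.
    static <T> Query<T> merge(Query<T> first, Query<T> second) {
        return consumer -> first.forEach(consumer) && second.forEach(consumer);
    }

    public static void main(String[] args) {
        List<Integer> out = new ArrayList<>();
        merge(of(Arrays.asList(1, 2)), of(Arrays.asList(3))).forEach(out::add);
        System.out.println(out); // prints [1, 2, 3]
    }
}
```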

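In other words, MergeQuery simply calls forEach on each of its sub-queries. We flagged SearchRequestQuery as the prime suspect earlier, so let's follow it first. Its forEach eventually reaches processResults: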

  @Override
  protected boolean processResults(@NotNull Processor<? super PsiReference> consumer) {
    return PsiSearchHelper.getInstance(myProject).processRequests(myRequests, consumer);
  }

That calls processRequests on PsiSearchHelper:

  @Override
  public boolean processRequests(@NotNull SearchRequestCollector collector, @NotNull Processor<? super PsiReference> processor) {
    ......
    do {
    ......
      if (!processGlobalRequestsOptimized(globals, progress, localProcessors)) {
        return false;
      }
      for (RequestWithProcessor local : locals) {
        progress.checkCanceled();
        if (!processSingleRequest(local.request, local.refProcessor)) {
          return false;
        }
      }
    ......
    }
    while (true);
  }

Some unrelated code is omitted here. Two calls stand out: processGlobalRequestsOptimized and processSingleRequest. First, the implementation of processGlobalRequestsOptimized:

  private boolean processGlobalRequestsOptimized(@NotNull MultiMap<Set<IdIndexEntry>, RequestWithProcessor> singles,
                                                 @NotNull ProgressIndicator progress,
                                                 @NotNull final Map<RequestWithProcessor, Processor<PsiElement>> localProcessors) {
    ......
    if (singles.size() == 1) {
      final Collection<? extends RequestWithProcessor> requests = singles.values();
      if (requests.size() == 1) {
        final RequestWithProcessor theOnly = requests.iterator().next();
        return processSingleRequest(theOnly.request, theOnly.refProcessor);
      }
    }
    ......
    return result;
  }

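Ignoring the unrelated code, when there is exactly one request it falls back to the same processSingleRequest we saw one level up, so let's analyze that simple case first: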

  private boolean processSingleRequest(@NotNull PsiSearchRequest single, @NotNull Processor<? super PsiReference> consumer) {
    final EnumSet<Options> options = EnumSet.of(Options.PROCESS_ONLY_JAVA_IDENTIFIERS_IF_POSSIBLE);
    if (single.caseSensitive) options.add(Options.CASE_SENSITIVE_SEARCH);
    if (shouldProcessInjectedPsi(single.searchScope)) options.add(Options.PROCESS_INJECTED_PSI);

    return bulkProcessElementsWithWord(single.searchScope, single.word, single.searchContext, options, single.containerName,
                                       adaptProcessor(single, consumer)
    );
  }

It first sets up the search options and then calls bulkProcessElementsWithWord. Before following that call, look at adaptProcessor:

  @NotNull
  private static BulkOccurrenceProcessor adaptProcessor(@NotNull PsiSearchRequest singleRequest,
                                                       @NotNull Processor<? super PsiReference> consumer) {
    ......
    final RequestResultProcessor wrapped = singleRequest.processor;
    return new BulkOccurrenceProcessor() {
      @Override
      public boolean execute(@NotNull PsiElement scope, @NotNull int[] offsetsInScope, @NotNull StringSearcher searcher) {
        ......
          return LowLevelSearchUtil.processElementsAtOffsets(scope, searcher, !ignoreInjectedPsi,
                                                             getOrCreateIndicator(), offsetsInScope,
                                                             (element, offsetInElement) -> {
            if (ignoreInjectedPsi && element instanceof PsiLanguageInjectionHost) return true;
            return wrapped.processTextOccurrence(element, offsetInElement, consumer);
          });
      }
    };
  }
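
Stripped of the platform types, adaptProcessor is a plain adapter: it turns a callback that handles one occurrence at a time into a callback that receives a whole batch of offsets. A minimal sketch, where OccurrenceHandler and BulkHandler are hypothetical stand-ins for RequestResultProcessor and BulkOccurrenceProcessor and a String stands in for the PSI scope element:

```java
import java.util.ArrayList;
import java.util.List;

final class AdapterSketch {
    // Hypothetical per-occurrence callback, standing in for RequestResultProcessor.
    interface OccurrenceHandler {
        boolean onOccurrence(String scopeText, int offset);
    }

    // Hypothetical bulk callback, standing in for BulkOccurrenceProcessor.
    interface BulkHandler {
        boolean execute(String scopeText, int[] offsets);
    }

    // Adapts a per-occurrence handler to the bulk interface, stopping at the first false.
    static BulkHandler adapt(OccurrenceHandler wrapped) {
        return (scopeText, offsets) -> {
            for (int offset : offsets) {
                if (!wrapped.onOccurrence(scopeText, offset)) return false;
            }
            return true;
        };
    }

    public static void main(String[] args) {
        List<Integer> seen = new ArrayList<>();
        BulkHandler bulk = adapt((text, offset) -> seen.add(offset));
        System.out.println(bulk.execute("foo bar foo", new int[] {0, 8}) + " " + seen); // prints true [0, 8]
    }
}
```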

adaptProcessor ultimately delegates to wrapped.processTextOccurrence. Keep that in mind, and go back up one level to continue with bulkProcessElementsWithWord:

  private boolean bulkProcessElementsWithWord(@NotNull SearchScope searchScope,
                                              @NotNull final String text,
                                              final short searchContext,
                                              @NotNull EnumSet<Options> options,
                                              @Nullable String containerName, @NotNull final BulkOccurrenceProcessor processor) {
    ......
    if (searchScope instanceof GlobalSearchScope) {
      StringSearcher searcher = new StringSearcher(text, options.contains(Options.CASE_SENSITIVE_SEARCH), true,
                                                   searchContext == UsageSearchContext.IN_STRINGS,
                                                   options.contains(Options.PROCESS_ONLY_JAVA_IDENTIFIERS_IF_POSSIBLE));

      return processElementsWithTextInGlobalScope((GlobalSearchScope)searchScope, searcher, searchContext,
                                                  options.contains(Options.CASE_SENSITIVE_SEARCH), containerName, progress, processor);
    }
    ......
    return JobLauncher.getInstance().invokeConcurrentlyUnderProgress(Arrays.asList(scopeElements), progress, localProcessor);
  }

Next, the implementation of processElementsWithTextInGlobalScope:

  private boolean processElementsWithTextInGlobalScope(@NotNull final GlobalSearchScope scope,
                                                       @NotNull final StringSearcher searcher,
                                                       final short searchContext,
                                                       final boolean caseSensitively,
                                                       @Nullable String containerName,
                                                       @NotNull ProgressIndicator progress,
                                                       @NotNull final BulkOccurrenceProcessor processor) {
    progress.pushState();
    boolean result;
    try {
      progress.setText(PsiBundle.message("psi.scanning.files.progress"));

      String text = searcher.getPattern();
      Set<VirtualFile> fileSet = new THashSet<>();
      getFilesWithText(scope, searchContext, caseSensitively, text, fileSet);

      progress.setText(PsiBundle.message("psi.search.for.word.progress", text));

      final Processor<PsiElement> localProcessor = localProcessor(progress, searcher, processor);
      ......
      result = fileSet.isEmpty() || processPsiFileRoots(new ArrayList<>(fileSet), fileSet.size(), 0, progress, localProcessor);
    }
    finally {
      progress.popState();
    }
    return result;
  }
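
The method above works in two phases: getFilesWithText first narrows the scope to the files that contain the word, and only those files are then processed. The sketch below shows that idea with a toy in-memory inverted index. WordIndexSketch and all of its members are hypothetical; the real platform uses its own persistent index, which this does not attempt to model.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class WordIndexSketch {
    // word -> names of the files containing it; a toy stand-in for a word index.
    private final Map<String, Set<String>> index = new HashMap<>();
    private final Map<String, String> contents = new HashMap<>();

    void addFile(String name, String text) {
        contents.put(name, text);
        for (String word : text.split("\\W+")) {
            if (!word.isEmpty()) index.computeIfAbsent(word, k -> new TreeSet<>()).add(name);
        }
    }

    // Phase 1: a cheap index lookup narrows the search to candidate files.
    Set<String> filesWithWord(String word) {
        return index.getOrDefault(word, Collections.emptySet());
    }

    // Phase 2: only the candidate files are scanned; records where the text first appears.
    Map<String, Integer> firstOffsets(String word) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String file : filesWithWord(word)) {
            result.put(file, contents.get(file).indexOf(word));
        }
        return result;
    }

    public static void main(String[] args) {
        WordIndexSketch sketch = new WordIndexSketch();
        sketch.addFile("A.java", "class A { B b; }");
        sketch.addFile("B.java", "class B {}");
        sketch.addFile("C.java", "class C {}");
        System.out.println(sketch.filesWithWord("B")); // prints [A.java, B.java]
        System.out.println(sketch.firstOffsets("B"));  // prints {A.java=10, B.java=6}
    }
}
```

C.java is never opened in phase 2, which is the whole point: the expensive per-file work is limited to files the index says can contain a match.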

localProcessor looks suspicious, so let's follow it:

  private static Processor<PsiElement> localProcessor(@NotNull final ProgressIndicator progress,
                                                      @NotNull final StringSearcher searcher,
                                                      @NotNull final BulkOccurrenceProcessor processor) {
    return new ReadActionProcessor<PsiElement>() {
      @Override
      public boolean processInReadAction(PsiElement scopeElement) {
        ......
        return scopeElement.isValid() &&
               processor.execute(scopeElement, LowLevelSearchUtil.getTextOccurrencesInScope(scopeElement, searcher, progress), searcher);
      }
    };
  }
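
The offsets passed to execute come from a text search inside the scope element. As a rough illustration of what "occurrences of a word as an identifier" means, here is a standalone sketch. identifierOccurrences is a hypothetical helper written for this post and makes no claim about how StringSearcher or LowLevelSearchUtil are implemented.

```java
import java.util.ArrayList;
import java.util.List;

final class OccurrenceSketch {
    // Returns the offsets where `word` occurs in `text` as a whole Java identifier.
    static List<Integer> identifierOccurrences(String text, String word) {
        List<Integer> offsets = new ArrayList<>();
        if (word.isEmpty()) return offsets;
        int from = 0;
        while (true) {
            int i = text.indexOf(word, from);
            if (i < 0) break;
            int end = i + word.length();
            // Reject matches that are only part of a longer identifier, e.g. Foo inside FooBar.
            boolean startOk = i == 0 || !Character.isJavaIdentifierPart(text.charAt(i - 1));
            boolean endOk = end == text.length() || !Character.isJavaIdentifierPart(text.charAt(end));
            if (startOk && endOk) offsets.add(i);
            from = i + 1;
        }
        return offsets;
    }

    public static void main(String[] args) {
        System.out.println(identifierOccurrences("Foo f = new Foo(); FooBar x;", "Foo")); // prints [0, 12]
    }
}
```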

At last we see where processor.execute is called. This processor is the one returned by adaptProcessor, so what runs is wrapped.processTextOccurrence, and wrapped is a SingleTargetRequestResultProcessor.

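So when was wrapped injected? Recall that the MergeQuery was built from two queries: one is the SearchRequestQuery, the other is an ExecutorsQuery (the result of INSTANCE.createQuery). When the ExecutorsQuery runs, it goes through a series of steps, driven by the search parameters, that end with wrapped pointing to a Processor of type SingleTargetRequestResultProcessor.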

So what finally runs is processTextOccurrence in SingleTargetRequestResultProcessor:

  @Override
  public boolean processTextOccurrence(@NotNull PsiElement element, int offsetInElement, @NotNull final Processor<? super PsiReference> consumer) {
    ......
    final List<PsiReference> references = ourReferenceService.getReferences(element,
                                                                            new PsiReferenceService.Hints(myTarget, offsetInElement));
    ......
    return true;
  }
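
The body of processTextOccurrence is elided above, so the following is only a guess at the general idea, not a transcription: among the references found on the element, keep those that cover the occurrence offset and resolve to the search target. Ref and referencesTo are hypothetical names, and a String stands in for the resolved declaration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ReferenceFilterSketch {
    // Hypothetical reference: a [start, end) range inside an element plus the name it resolves to.
    static final class Ref {
        final int start;
        final int end;
        final String resolvesTo;

        Ref(int start, int end, String resolvesTo) {
            this.start = start;
            this.end = end;
            this.resolvesTo = resolvesTo;
        }
    }

    // Keeps references that cover the occurrence offset and resolve to the search target.
    static List<Ref> referencesTo(List<Ref> refs, int offset, String target) {
        List<Ref> hits = new ArrayList<>();
        for (Ref ref : refs) {
            boolean covers = ref.start <= offset && offset < ref.end;
            if (covers && target.equals(ref.resolvesTo)) hits.add(ref);
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Ref> refs = Arrays.asList(new Ref(0, 3, "Foo"), new Ref(0, 3, "Bar"), new Ref(10, 13, "Foo"));
        System.out.println(referencesTo(refs, 1, "Foo").size()); // prints 1
    }
}
```

The text match alone is not enough: two identifiers with the same spelling can resolve to different declarations, so each candidate still has to be checked against the target.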

Following getReferences through a chain of calls...

  private static PsiReferenceRegistrarImpl createRegistrar(Language language) {
    ......
    List<PsiReferenceProviderBean> referenceProviderBeans = REFERENCE_PROVIDER_EXTENSION.allForLanguageOrAny(language);
    for (final PsiReferenceProviderBean providerBean : referenceProviderBeans) {
      final ElementPattern<PsiElement> pattern = providerBean.createElementPattern();
      if (pattern != null) {
        registrar.registerReferenceProvider(pattern, new PsiReferenceProvider() {

          PsiReferenceProvider myProvider;

          @NotNull
          @Override
          public PsiReference[] getReferencesByElement(@NotNull PsiElement element, @NotNull ProcessingContext context) {
            if (myProvider == null) {

              myProvider = providerBean.instantiate();
              if (myProvider == null) {
                myProvider = NULL_REFERENCE_PROVIDER;
              }
            }
            return myProvider.getReferencesByElement(element, context);
          }
        });
      }
    }

    registrar.markInitialized();

    return registrar;
  }
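
The anonymous PsiReferenceProvider above is a lazy wrapper: the real provider is created only on the first call, and a null-object provider is used if that fails. Here is a self-contained sketch of the same pattern using reflection. Provider, UpperCaseProvider and lazy are hypothetical names for illustration, and the Provider signature is deliberately simpler than the real PsiReferenceProvider.

```java
final class LazyProviderSketch {
    // Counts constructor calls so the laziness can be observed.
    static int instantiations = 0;

    // Hypothetical provider type; the real PsiReferenceProvider has a different signature.
    interface Provider {
        String[] references(String element);
    }

    // A sample provider that would normally be named in a configuration file.
    public static final class UpperCaseProvider implements Provider {
        public UpperCaseProvider() {
            instantiations++;
        }

        @Override
        public String[] references(String element) {
            return new String[] {element.toUpperCase()};
        }
    }

    // Wraps a class name and instantiates it reflectively on first use only.
    static Provider lazy(String className) {
        return new Provider() {
            private Provider delegate;

            @Override
            public String[] references(String element) {
                if (delegate == null) {
                    try {
                        delegate = (Provider) Class.forName(className).getDeclaredConstructor().newInstance();
                    } catch (ReflectiveOperationException ex) {
                        delegate = ignored -> new String[0]; // fall back to a no-op provider
                    }
                }
                return delegate.references(element);
            }
        };
    }

    public static void main(String[] args) {
        Provider provider = lazy(UpperCaseProvider.class.getName());
        System.out.println(instantiations);                   // prints 0
        System.out.println(provider.references("foo")[0]);    // prints FOO
        provider.references("bar");
        System.out.println(instantiations);                   // prints 1
    }
}
```

Deferring instantiation this way means that registering many providers costs almost nothing until one of them is actually asked for references.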

What ultimately gets called is getReferencesByElement on a PsiReferenceProvider, and myProvider is instantiated from a PsiReferenceProviderBean. Looking at what that class does, we find this comment:

  /**
   * Registers a {@link PsiReferenceProvider} in plugin.xml
   */
  public class PsiReferenceProviderBean extends AbstractExtensionPointBean implements KeyedLazyInstance<PsiReferenceProviderBean> {

    public static final ExtensionPointName<PsiReferenceProviderBean> EP_NAME =
      new ExtensionPointName<>("com.intellij.psi.referenceProvider");

    @Attribute("language")
    public String language = Language.ANY.getID();

    @Attribute("providerClass")
    public String className;
    ......
  }

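So PsiReferenceProvider classes are registered in plugin.xml and instantiated reflectively only when they are needed. Looking at the classes that extend PsiReferenceProvider, JavaClassReferenceProvider looks like the implementation we want. Following its getReferencesByElement through another chain of calls leads to reparse in JavaClassReferenceSet, which finally looks like the core of class reference search: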

  private void reparse(@NotNull String str, @NotNull PsiElement element, final boolean isStaticImport, JavaClassReferenceSet context) {
    myElement = element;
    myContext = context;
    List<JavaClassReference> referencesList = new ArrayList<>();
    int currentDot = -1;
    int referenceIndex = 0;
    boolean allowDollarInNames = isAllowDollarInNames();
    boolean allowSpaces = isAllowSpaces();
    boolean allowGenerics = false;
    boolean allowWildCards = JavaClassReferenceProvider.ALLOW_WILDCARDS.getBooleanValue(getOptions());
    boolean allowGenericsCalculated = false;
    boolean parsingClassNames = true;

    while (parsingClassNames) {
      int nextDotOrDollar = -1;
      for (int curIndex = currentDot + 1; curIndex < str.length(); ++curIndex) {
        char ch = str.charAt(curIndex);

        if (ch == DOT || ch == DOLLAR && allowDollarInNames) {
          nextDotOrDollar = curIndex;
          break;
        }

        if (ch == LT || ch == COMMA) {
          if (!allowGenericsCalculated) {
            allowGenerics = !isStaticImport && PsiUtil.isLanguageLevel5OrHigher(element);
            allowGenericsCalculated = true;
          }

          if (allowGenerics) {
            nextDotOrDollar = curIndex;
            break;
          }
        }
      }

      if (nextDotOrDollar == -1) {
        nextDotOrDollar = currentDot + 1;
        for (int i = nextDotOrDollar; i < str.length() && Character.isJavaIdentifierPart(str.charAt(i)); ++i) nextDotOrDollar++;
        parsingClassNames = false;
        int j = skipSpaces(nextDotOrDollar, str.length(), str, allowSpaces);

        if (j < str.length()) {
          char ch = str.charAt(j);
          boolean recognized = false;

          if (ch == '[') {
            j = skipSpaces(j + 1, str.length(), str, allowSpaces);
            if (j < str.length() && str.charAt(j) == ']') {
              j = skipSpaces(j + 1, str.length(), str, allowSpaces);
              recognized = j == str.length();
            }
          }

          Boolean aBoolean = JavaClassReferenceProvider.JVM_FORMAT.getValue(getOptions());
          if (!recognized && (aBoolean == null || !aBoolean.booleanValue())) {
            nextDotOrDollar = -1; // abort resolve
          }
        }
      }

      if (nextDotOrDollar != -1 && nextDotOrDollar < str.length()) {
        char c = str.charAt(nextDotOrDollar);
        if (c == LT) {
          boolean recognized = false;
          int start = skipSpaces(nextDotOrDollar + 1, str.length(), str, allowSpaces);
          int j = str.lastIndexOf(GT);
          int end = skipSpacesBackward(j, 0, str, allowSpaces);
          if (end != -1 && end > start) {
            if (myNestedGenericParameterReferences == null) myNestedGenericParameterReferences = new ArrayList<>(1);
            myNestedGenericParameterReferences.add(new JavaClassReferenceSet(
              str.substring(start, end), myElement, myStartInElement + start, isStaticImport, myProvider, this));
            parsingClassNames = false;
            j = skipSpaces(j + 1, str.length(), str, allowSpaces);
            recognized = j == str.length();
          }
          if (!recognized) {
            nextDotOrDollar = -1; // abort resolve
          }
        }
        else if (c == COMMA && myContext != null) {
          if (myContext.myNestedGenericParameterReferences == null) myContext.myNestedGenericParameterReferences = new ArrayList<>(1);
          int start = skipSpaces(nextDotOrDollar + 1, str.length(), str, allowSpaces);
          myContext.myNestedGenericParameterReferences.add(new JavaClassReferenceSet(
            str.substring(start), myElement, myStartInElement + start, isStaticImport, myProvider, this));
          parsingClassNames = false;
        }
      }

      int maxIndex = nextDotOrDollar > 0 ? nextDotOrDollar : str.length();
      int beginIndex = skipSpaces(currentDot + 1, maxIndex, str, allowSpaces);
      int endIndex = skipSpacesBackward(maxIndex, beginIndex, str, allowSpaces);
      boolean skipReference = false;
      if (allowWildCards && str.charAt(beginIndex) == QUESTION) {
        int next = skipSpaces(beginIndex + 1, endIndex, str, allowSpaces);
        if (next != beginIndex + 1) {
          String keyword = str.startsWith(EXTENDS, next) ? EXTENDS : str.startsWith(SUPER, next) ? SUPER : null;
          if (keyword != null) {
            next = skipSpaces(next + keyword.length(), endIndex, str, allowSpaces);
            beginIndex = next;
          }
        }
        else if (endIndex == beginIndex + 1) {
          skipReference = true;
        }
      }
      if (!skipReference) {
        TextRange textRange = TextRange.create(myStartInElement + beginIndex, myStartInElement + endIndex);
        JavaClassReference currentContextRef = createReference(
          referenceIndex, str.substring(beginIndex, endIndex), textRange, isStaticImport);
        referenceIndex++;
        referencesList.add(currentContextRef);
      }
      if ((currentDot = nextDotOrDollar) < 0) {
        break;
      }
    }

    myReferences = referencesList.toArray(new JavaClassReference[0]);
  }
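
The core loop of reparse is easier to see in a reduced form. The sketch below keeps only the main idea, splitting a qualified class name into one reference per component with its text range, and deliberately ignores generics, arrays, wildcards, whitespace and the option flags handled by the real method (for instance, it always treats '$' as a separator). Segment and split are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class QualifiedNameSketch {
    // Hypothetical reference: one name component and its [start, end) range in the string.
    static final class Segment {
        final String text;
        final int start;
        final int end;

        Segment(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return text + "[" + start + "," + end + ")";
        }
    }

    // Splits a qualified name on '.' and '$', producing one segment per component.
    static List<Segment> split(String qualifiedName) {
        List<Segment> segments = new ArrayList<>();
        int begin = 0;
        for (int i = 0; i <= qualifiedName.length(); i++) {
            boolean atEnd = i == qualifiedName.length();
            if (atEnd || qualifiedName.charAt(i) == '.' || qualifiedName.charAt(i) == '$') {
                segments.add(new Segment(qualifiedName.substring(begin, i), begin, i));
                begin = i + 1;
            }
        }
        return segments;
    }

    public static void main(String[] args) {
        // prints [java[0,4), util[5,9), Map[10,13), Entry[14,19)]
        System.out.println(split("java.util.Map$Entry"));
    }
}
```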

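It is long, but it is essentially a simple language parser, which matches my initial guess: the search works by parsing the characters of the source text, and all the reference information ends up in the returned list of PsiReference objects.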

IDEA really does have an excellent architecture, though it has its rough edges too :-D
