JavaScanningTestNetBeans70


Revision as of 09:42, 24 February 2011

I'm going to test Java scanning in a Beta dev build of NetBeans IDE 7.0 and will update these notes as I go. I'm starting with a clean userdir and this dev build:

Product Version         = NetBeans IDE Dev (Build 101129-b6130c56ca59) (#b6130c56ca59)
Operating System        = Linux version 2.6.35-23-generic running on amd64
Java; VM; Vendor        = 1.6.0_22; Java HotSpot(TM) 64-Bit Server VM 17.1-b03; 
Runtime                 = Java(TM) SE Runtime Environment 1.6.0_22-b04
Java Home               = /usr/lib/jvm/java-6-sun-1.6.0.22/jre

and I'm running the IDE with

-J-Dorg.netbeans.modules.parsing.impl.indexing.RepositoryUpdater.level=FINE

as suggested in issue 177950 to get more detailed logging.

Test 1 - opening and closing projects

I'm following the instructions in issue 169252 and starting by opening the web.jsf project. The initial scanning took:

Complete indexing of 610 source roots took: 644022 ms 
(New or modified files: 18148, Deleted files: 0) [Adding listeners took: 1747 ms]
Beta2:

Complete indexing of 605 source roots took: 584011 ms 
(New or modified files: 18203, Deleted files: 0) [Adding listeners took: 1114 ms]
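
The summary lines above are easy to compare mechanically. Below is a minimal sketch of a parser for them, assuming the line format is exactly as quoted in these notes; the class `IndexingSummary` and its methods are hypothetical helpers for illustration, not part of NetBeans.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not NetBeans API) for the summary lines quoted in these notes. */
final class IndexingSummary {
    // Matches e.g. "Complete indexing of 610 source roots took: 644022 ms"
    // and "Complete indexing of 44 binary roots took: 205 ms".
    private static final Pattern LINE = Pattern.compile(
            "Complete indexing of (\\d+) (source|binary) roots took: (\\d[\\d,]*) ms");

    final int roots;     // number of roots reported
    final String kind;   // "source" or "binary"
    final long millis;   // reported duration in milliseconds

    private IndexingSummary(int roots, String kind, long millis) {
        this.roots = roots;
        this.kind = kind;
        this.millis = millis;
    }

    /** Returns the parsed summary, or empty if the line is not a summary line. */
    static Optional<IndexingSummary> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        // Tolerate a thousands separator in the duration, e.g. "2,045 ms".
        long millis = Long.parseLong(m.group(3).replace(",", ""));
        return Optional.of(new IndexingSummary(Integer.parseInt(m.group(1)), m.group(2), millis));
    }

    /** Ratio of this run's duration to a baseline run's duration. */
    double ratioTo(IndexingSummary baseline) {
        return (double) millis / baseline.millis;
    }

    public static void main(String[] args) {
        IndexingSummary dev = parse("Complete indexing of 610 source roots took: 644022 ms").get();
        IndexingSummary beta2 = parse("Complete indexing of 605 source roots took: 584011 ms").get();
        System.out.printf("Beta2 / dev build initial scan = %.2f%n", beta2.ratioTo(dev));
    }
}
```

Running this over a whole log would give one record per scan, which makes it simpler to compare builds than reading the raw lines.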

The number by itself does not say much, and it may be perfectly OK. It is relative, and that's not what I'm after. web.jsf has quite a big list of classpath dependencies, and scanning all of them might just need that much time. I will test that assumption later. Let's restart the IDE and see how long it takes to reload the indexed info. Ideally that should be quick:

Complete indexing of 610 source roots took: 279685 ms 
(New or modified files: 0, Deleted files: 0) [Adding listeners took: 1952 ms]
Beta2:

I'm quite consistently getting one of two sets of times:
T 000:00:52.288
T 000:00:51.793
T 000:00:40.538
or
T 000:01:38.709
T 000:01:36.411
T 000:01:38.702
T 000:01:51.139

It is much better in Beta2, but there seems to be an issue that randomly
causes the indexing to take twice as long.
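
To put a number on "twice", the two groups of Beta2 restart times can be averaged. This is a minimal sketch; it assumes the `T` stamps are hours:minutes:seconds.milliseconds, which these notes do not state explicitly, and `RestartTimings` is a hypothetical helper class.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for comparing the two groups of Beta2 restart times. */
final class RestartTimings {
    /** Converts e.g. "T 000:01:38.709" to milliseconds (assumed format: hours:minutes:seconds.millis). */
    static long toMillis(String stamp) {
        String[] parts = stamp.replaceFirst("^T\\s+", "").trim().split("[:.]");
        long hours = Long.parseLong(parts[0]);
        long minutes = Long.parseLong(parts[1]);
        long seconds = Long.parseLong(parts[2]);
        long millis = Long.parseLong(parts[3]);
        return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis;
    }

    /** Average duration of a group of stamps, in milliseconds. */
    static double averageMillis(List<String> stamps) {
        return stamps.stream().mapToLong(RestartTimings::toMillis).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        List<String> fast = Arrays.asList("T 000:00:52.288", "T 000:00:51.793", "T 000:00:40.538");
        List<String> slow = Arrays.asList(
                "T 000:01:38.709", "T 000:01:36.411", "T 000:01:38.702", "T 000:01:51.139");
        double fastAvg = averageMillis(fast);
        double slowAvg = averageMillis(slow);
        System.out.printf("fast avg = %.0f ms, slow avg = %.0f ms, slow/fast = %.2f%n",
                fastAvg, slowAvg, slowAvg / fastAvg);
    }
}
```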

Well, the original result (279685 ms) is not good. It's almost half the time of the full scan. Let's 'Go to Type' AntProjectHelper and open it. The response is instant. Great. Now CTRL+SHIFT+1 and open the project.ant module. It is one of the dependencies, so no scanning should be necessary:

Complete indexing of 2 source roots took: 4842 ms 
(New or modified files: 40, Deleted files: 0) [Adding listeners took: 9 ms]
newRootsToScan(2)=
 file:/home/dev/main/project.ant/antsrc/
 file:/home/dev/main/project.ant/test/unit/src/

Pretty good. Let's close both projects. For some reason, closing both projects triggers scanning although the IDE is now completely empty: no project, no file in the editor, everything is closed. Scanning took:

Complete indexing of 611 source roots took: 248263 ms 
(New or modified files: 0, Deleted files: 0) [Adding listeners took: 5 ms]
Beta2:

Scanning is much faster but still happens:

Resolving dependencies took: 2,045 ms
Complete indexing of 44 binary roots took: 205 ms
Complete indexing of 419 source roots took: 9893 ms
(New or modified files: 0, Deleted files: 0) [Adding listeners took: 550 ms]

That's wrong and useless. First, all these projects were scanned a couple of minutes ago and I have not changed a single file, so the IDE should not need to recheck all of them again. Second, I closed the projects, so I will not need this parsing information. From the logging (see log1) it looks like PATHS_REMOVED was correctly received, but for some reason all previous dependencies are kept and rescanned.

Now let's try to reopen the closed projects. The IDE was not restarted, so it should know that all source folders were scanned about 1 minute ago and are up to date. Let's try Open Recent Project: project.ant. Scanning took (the label changed after a while to "Refreshing indicies"):

Complete indexing of 118 source roots took: 31835 ms 
(New or modified files: 0, Deleted files: 0) [Adding listeners took: 233 ms]
Beta2:

Scanning is much faster but still happens:

Complete indexing of 75 source roots took: 3219 ms
(New or modified files: 0, Deleted files: 0)

which is followed by a ton of INFO exceptions. I'm not sure how relevant they are, but see the attachment (exp1). They finished with:

Finished RefreshCifIndices@6bde06[followUpJob=false, 
checkEditor=false indexer=TaskListIndexer/2 () in 21,000 ms with result Done

Let's also reopen web.jsf:

Complete indexing of 493 source roots took: 196282 ms 
(New or modified files: 0, Deleted files: 0) [Adding listeners took: 598 ms]

I know that web.jsf has lots of dependencies, but the figures for closing and reopening projects are quite shocking, bordering on unusable: about 4 minutes of scanning after the projects were closed, and about 4 minutes after they were reopened. That explains why I see scanning progress so frequently. I thought that, to make the IDE and scanning faster, it would be better to close the projects I do not need, reopen the ones I work on, and keep the number of open projects between 5 and 10. But for projects like web.jsf this does not make any difference. And for the NetBeans EE team, most of the projects we work with have dependencies similar to those of web.jsf.

Test 2 - Direct versus Indirect dependencies

Summary of the problem

Let me start from the beginning just to double check my understanding of the problem. Please correct me where necessary.

Let's say, for example, that:

  • compilation classpath of Module A contains Module B; and
  • compilation classpath of Module B contains Module C; and
  • compilation classpath of Module C contains Jar D;

that means that:

  • B is a direct dependency of A; and
  • C is a direct dependency of B; and
  • D is a direct dependency of C; and
  • C and D are indirect dependencies of A; and
  • D is an indirect dependency of B

Let's say we open project A in the IDE. Both the direct and the indirect classpath are indexed, that is A, B, C and D.
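
The direct/indirect split above can be written down as a small graph computation. The following is a minimal sketch using the A, B, C, D example; `ClasspathDeps` and its methods are hypothetical names for illustration and are not how the NetBeans indexer is actually implemented.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration of direct versus indirect classpath dependencies. */
final class ClasspathDeps {
    /** The example from the text: A -> B -> C -> D. */
    static Map<String, List<String>> example() {
        Map<String, List<String>> cp = new HashMap<>();
        cp.put("A", Arrays.asList("B"));
        cp.put("B", Arrays.asList("C"));
        cp.put("C", Arrays.asList("D"));
        return cp;
    }

    /** Entries that appear directly on the compilation classpath of the module. */
    static Set<String> direct(Map<String, List<String>> cp, String module) {
        return new LinkedHashSet<>(cp.getOrDefault(module, Collections.<String>emptyList()));
    }

    /** Entries reachable only through other dependencies (transitive closure minus direct). */
    static Set<String> indirect(Map<String, List<String>> cp, String module) {
        Set<String> direct = direct(cp, module);
        Set<String> seen = new LinkedHashSet<>(direct);
        Deque<String> queue = new ArrayDeque<>(direct);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String dep : cp.getOrDefault(current, Collections.<String>emptyList())) {
                if (seen.add(dep)) {
                    queue.add(dep);
                }
            }
        }
        seen.removeAll(direct);
        seen.remove(module); // guard against cyclic classpaths
        return seen;
    }

    public static void main(String[] args) {
        Map<String, List<String>> cp = example();
        System.out.println("direct(A)   = " + direct(cp, "A"));
        System.out.println("indirect(A) = " + indirect(cp, "A"));
        System.out.println("indirect(B) = " + indirect(cp, "B"));
    }
}
```

With indexing of both sets, opening A means scanning everything in `direct(A)` and `indirect(A)` in addition to A itself.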

Jesse's argument is that only direct classpath indexing is necessary. He says "Accuracy is important (1) for code actually in open projects, (2) for signatures which might affect error badges on open projects (i.e. direct deps)". I do agree with that. If Module A is opened, why should Modules C and D be scanned when the user has not shown any intention of using them?

Honza and Tomas's argument is that scanning of indirect dependencies is necessary. Two examples were given:

  • Example 1) If a class from a direct dependency is opened in the IDE, for example class Foo from Module B, then Java features like Code Completion, Goto Source, etc. require that all direct dependencies of Module B (that is, A's indirect dependencies) are indexed. There are a couple of possible answers:
    • Answer A) Module B is not opened in the IDE, so Goto Source, Code Completion, etc. may not work properly and the user has to open Module B first. I think this answer is wrong: it would force the user to do steps which the IDE should do automatically, so I do not consider this option.
    • Answer B) Trigger scanning of Module B's direct dependencies when the first class from Module B is opened in the editor.
  • Example 2) GoTo Type would not offer classes from indirect dependencies. Honza says "I, for one, quite often open classes from un-opened projects via Go to Type, and I do not really care if the class is in direct dependency of in indirect dependency. (I typically try to type the class name, and in the rare case the containing source root is not accessible transitively from the opened projects, I open it)". I do this frequently myself, but I consider it more of a workaround for a missing feature: what I really want is to tell the IDE where all my projects are, regardless of dependencies and of which projects I have open, and then have GoTo Type offer all of them. For lack of this feature, the workaround is to open one of the top-level projects, and then GoTo Type works in most cases. It would be good if it always worked, that is, even for a type which is in neither the direct nor the indirect dependencies of the opened projects.

Evaluation

I looked at the web.jsf project as an example and collected some data. The attached document lists direct and indirect dependencies for web.jsf/src and web.jsf/test/qa-functional/src/. There are 6 reports: three for web.jsf and three for the functional tests of web.jsf. Direct and indirect dependencies are listed first, followed by a simplified tree of these dependencies.

Here is my conclusion: scanning of indirect dependencies does not scale. The higher a project sits in the dependency tree, the longer the scanning takes, regardless of the size of its direct dependencies and regardless of the number of open projects. In my experience, confirmed by Denis, this can take up to 10 minutes, which is not acceptable.

Some ideas for how this could be improved:

  • Idea A) Give scanned roots different importance and handle them accordingly. For example, any open project or direct dependency has scanning priority P1, indirect dependencies have priority P2, and scanning of any other sources is P3 (not sure there is such a case). P2 and P3 scanning should be done in the background in a low-priority task that does not block anything; some scanning could be done at most once a day(??).
  • Idea B) Consider the depth of indirect dependencies. Looking at the dependency tree of web.jsf/src, the deeper you go into indirect dependencies, the less likely it is that the user will need them, so their scanning should be done later or with lower priority.
  • Idea C) Decide on a strategy based on the size of the dependency graph: for a small graph, index everything in one go; for large graphs, apply scanning priorities, or simply do not scan everything, or scan indirect dependencies only to a certain depth.
  • Idea D) Try different scanning approaches experimentally and record how often scanning is done, why, and how long it takes. That could provide data on which we could tune it or decide in favor of a particular solution.
  • Idea E) Have a dedicated low-priority thread that indexes everything that has not been indexed yet, so that GoTo Type works.
  • Note X) By their nature, features that depend on Java indexing must gracefully handle the state when scanning data are not ready yet. So everything should already be in place for a "scanning on demand" mode.
  • Note Y) Nothing is going to help much in the case of the web.jsf functional tests. From the tree of its dependencies it looks very shallow. Is that right, or have I made a mistake in generating this tree? Most of the dependencies seem to come directly from web.jsf/test/qa-functional/src/ and indirectly from web.kit/test/qa-functional/src/.
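
Ideas A and B can be combined into one rough sketch: compute each root's depth from the open projects, treat depth 0 (open project) and depth 1 (direct dependency) as P1, and leave deeper roots as P2 background work ordered by depth. The class `ScanPriorities` and the two-level cut-off are assumptions for illustration only, not an existing NetBeans mechanism.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of depth-based scanning priorities (Ideas A and B). */
final class ScanPriorities {
    /** Breadth-first depth of every reachable root; open projects have depth 0. */
    static Map<String, Integer> depths(Map<String, List<String>> cp, Collection<String> openProjects) {
        Map<String, Integer> depth = new LinkedHashMap<>(); // insertion order = non-decreasing depth
        Deque<String> queue = new ArrayDeque<>();
        for (String project : openProjects) {
            if (depth.putIfAbsent(project, 0) == null) {
                queue.add(project);
            }
        }
        while (!queue.isEmpty()) {
            String current = queue.poll();
            int next = depth.get(current) + 1;
            for (String dep : cp.getOrDefault(current, Collections.<String>emptyList())) {
                if (depth.putIfAbsent(dep, next) == null) {
                    queue.add(dep);
                }
            }
        }
        return depth;
    }

    /** Assumed mapping: open projects and direct dependencies are P1, everything deeper is P2. */
    static int priority(int depth) {
        return depth <= 1 ? 1 : 2;
    }

    /** Roots with the given priority, shallowest first. */
    static List<String> rootsWithPriority(Map<String, Integer> depths, int wanted) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : depths.entrySet()) {
            if (priority(e.getValue()) == wanted) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> cp = new HashMap<>();
        cp.put("A", Arrays.asList("B"));
        cp.put("B", Arrays.asList("C"));
        cp.put("C", Arrays.asList("D"));
        Map<String, Integer> depths = depths(cp, Arrays.asList("A"));
        System.out.println("scan now (P1):        " + rootsWithPriority(depths, 1));
        System.out.println("scan in background (P2): " + rootsWithPriority(depths, 2));
    }
}
```

A real implementation would also need Note X to hold: features would have to cope with P2 roots not being indexed yet.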