
ParsingAPITCReview

Technical Council Review Specification - NetBeans 7.0.

Status

Go for Phase 1. Outstanding tasks:

  • PJ - involve Marek Fukala to assess the embedding aspects of Parsing API vs. GSF.
  • PJ - investigate and confirm staffing in the areas of migrating GSF-based languages to Parsing API, and TaskList
  • JT - performance team to investigate the possibility of enabling Go To Type while scanning is in progress


Notes from Inception Review held on Oct 30, 2008

Petr Jiřička (PJ)
Jaroslav Tulach (JT)
Tomáš Zezula (TZ)
Tor Norbye (TN)
Vítězslav Stejskal (VS)
Jesse Glick (JG)
Jan Bečička (JB)
Jan Lahoda (JL)


PJ: critical mass is here
PJ: welcome, first reviewers.
PJ: different from API review, dependencies between teams, benefits
JT: devrev inception hopefully not needed after TC review
TZ: giving overview and reason for the parsing API work
TZ: 3 clones of Retouche (javafx, java, gsf). Unify! Unify critical parts.
TZ: Threading model, caching, indexing, registration, embedding processors, base layer.
TZ: Big changes in javafx, java, gsf
TZ: easy upgrade for gsf
TN: let's bridge the old interface
VS: Isn't the API closer to Java than to GSF? => impossible to bridge everything!?
TN: GSF has many parts. Utilities are different. The rest, yes. Moreover, there is the embedding in the parser
VS: We will delete the old GSF APIs (embedding, parsing).
PJ: Is it 1-1 mapping?
TZ: No. Embedding computed on demand = different model.
TN: Unification is good. Ruby, python, ok. Groovy, scala complicated.
TN: Biggest part of GSF is utilities - that has to remain. Minimizes the impact of rewrite.
TN: Marek Fukala shall evaluate the impact of embedding.
TN: Incremental parsing of embeddings is important. That is what the new Parsing API does, right?
TZ: Nods. Twice.
TN: We need preindex data.
JT: In JAR files directly without any copying. OK.
JT: Really incompatible rewrite?
TZ: Yes.
JT: Will the api be exposed only to javafx, java, gsf?
TZ: People will use new Parsing API methods. They work correctly with embedding.
JT: Stability of the new API, same as JavaSource? Deprecate JavaSource?
TZ: Simple things OK with old API.
JG: No strong reason to deprecate old API.
PJ: JavaFX - will it rewrite for 7.0? 
TZ: Nobody knows. It only affects performance.
JT: Future requirement for JavaFX: If they want to become part of the release, they need to switch.

PJ: Planning. Are all tasks covered?
JB: API, we have several implementations (schliemann, java). We will integrate it by Nov 14, 2008. (we cover)
JB: 2nd step: rewrite of GSF and one GSF language (javascript) - by Christmas, 2008 (we cover + help of TN)
VS: We shall try to rewrite more to test embeddings, we have a prototype of java in JSP
VS: I want more! Two GSF langs that embed into each other. We need to make it simple!
JB: then we can start rewrite of the rest languages (help by PJ).

PJ: Tasklist & Parsing API.
TZ: perf team wants one touch of each file in project. right now: java, tasklist, gsf
JT: tasklist shall do nothing after up to date check
TN: Caching across sessions
JG: Filed a bug to show hints in tasklist
JT: Indexing. Can it be used in tasklist?
TZ: Two layers: general API for indexing lifecycle. Plus access to Lucene. 
TZ: Lucene no good for storing tasklist data - use plain file.
TN: GSF indexing API is tied to classpath, language, etc.
TZ: When a file is changed, the tasklist will be called by the indexing API for the given mimetype (or all)
TZ: There is EmbeddingIndexer and also CustomIndexer
JT: The file shall be opened only once. Can they share one input stream?
TZ: Yes, they can. There is indexable, why not FileObject? 35% slower than plain File.
task for JT: speedup FileObject.
JL: One opening of file is hard in Java. We definitely cannot parse file by file.
JG: I started to use Open File over Open Type. It is faster.
JT: Nice VOC!
JT: We rely on Parsing API and its improvements. Plus we want the tasklist to switch to Parsing API.

VS: Parsing API is not the ultimate solution. 
TN: Huge improvement by caching of PHP data.
TZ: GoTo Type needs access to current document.
JT: Can we go without it?
TZ: We need to put the cursor in the position of the type => we need to parse the new file
JG: Can you suspend parsing?
TZ: JavaC is only fast if it is parsing in batch.
JT: Perf team will take this offline.


PJ: Important task. Tough schedule. Mitigation plan? Can we phase things?
JB: Possible to skip GSF, deliver just Parsing API. 
PJ: Under-development: /0
VS: Check point in Nov 30, 2008.

PJ: Go for phase I.

Parsing API

Motivation

NetBeans is no longer only a Java IDE; it supports various languages built on top of various frameworks. Most of these languages use the Retouche core, which was cloned into several different versions: the Java core, the GSF core and the JavaFX core. The cloning introduced several problems. Duplication of classes negatively affects performance. Each clone of the Retouche core was made from a different version and carries its own specific changes, which makes it impossible to propagate fixes among them. Because the Retouche core also provides the concurrency model, it is very hard to combine these frameworks safely, which is needed for languages such as Groovy, JSP and Scala. The main goal of the Parsing API is to provide a single, simple infrastructure for plugging in new language support.

Features and tasks planned for NetBeans 7.0

  1. Common ClassPath to replace the Java ClassPath and the GSF ClassPath. The common ClassPath was already introduced in NB 6.5, but GSF needs to be updated to use it.
  2. Parsing infrastructure
     • Supports scheduled tasks as well as synchronous tasks
     • Defines threading and mutual exclusion
     • Supports sharing of parser data
     • Supports embedding
     • Supports partial reparse
     • Defines registration of parsers and embedding providers
  3. Indexing and search engine
     • Provides scanning infrastructure: a common RepositoryUpdater, so sources are scanned just once and file events are processed once
     • Allows registration of index providers for different languages
     • Builds a file path index
     • Supports the task list
     • Provides a ready-to-use Lucene indexer and searcher
     • Supports versioning of indexes
     • Can index not just files on disk, but also files in .jar files and files in the System File System

More detailed requirements can be found at Parsing API Requirements Page.

Significant features and tasks not planned for NetBeans 7.0

Not yet determined, but it is clear that not all requirements from the Parsing API Requirements Page can be delivered in the NetBeans 7.0 time frame. The goal is to deliver the core features and postpone those that are either less critical or can be implemented at the language level. There are no plans to use the Parsing API in C++ or JavaFX.


Interactions with other features and teams

  1. The Java core has to be rewritten to use the Parsing API.
  2. The GSF core has to be rewritten to use the Parsing API.
  3. Schliemann has to be rewritten to use the Parsing API.
  4. All languages based on GSF or Schliemann need to be updated to use the changed GSF or Schliemann.

High level design

The Parsing API should provide a common layer for language integration, as shown in the following picture.

http://wiki.netbeans.org/attach/ParsingAPITCReview/overview_ParsingAPITCReview.png

The Parsing API module provides both an API and an SPI. The API is used by language clients requesting parser operations, for example code completion, hints and other editor features. The SPI is implemented by parser providers and embedding providers. The module itself can be split into two separate areas, parsing and indexing.
The parsing support provides the main parser loop, which dispatches registered tasks according to their priorities and executes them. The loop caches the last used parser(s), so the same file (document) is parsed just once and the result is reused until it becomes invalid, for example through a document modification. The loop is also responsible for listening for changes, invalidating results and calling the parsers. Tasks are registered with the loop per MIME type using factories. In addition, the Parsing API provides synchronous calls of parser operations. A synchronous call uses the same cached parser; the parsing infrastructure is responsible for mutual exclusion between synchronous calls and the loop's worker thread.
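
The parser loop described above can be illustrated with a minimal Java sketch. It is not the Parsing API itself: ParserLoopSketch, Task, schedule, documentChanged, runPending and runUserTask are hypothetical names chosen for illustration, the "parser" is just a function from text to a string, and the sketch assumes that a lower priority number is dispatched first. Registration of tasks per MIME type through factories is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.Function;

/** Hypothetical sketch; none of these names are taken from the real Parsing API. */
final class ParserLoopSketch {

    /** A scheduled task; a lower priority value is dispatched first (assumption). */
    private static final class Task {
        final String name;
        final int priority;
        Task(String name, int priority) { this.name = name; this.priority = priority; }
    }

    private final Object lock = new Object();        // mutual exclusion for all parser access
    private final Function<String, String> parser;   // stand-in for a real, expensive parser
    private final PriorityQueue<Task> queue =
            new PriorityQueue<>(Comparator.comparingInt((Task t) -> t.priority));
    private String text;                              // current document content
    private String cachedResult;                      // null means "no valid result"
    private int parseCount;                           // how often the parser really ran

    ParserLoopSketch(Function<String, String> parser, String initialText) {
        this.parser = parser;
        this.text = initialText;
    }

    /** Registers a task for the next pass of the loop. */
    void schedule(String name, int priority) {
        synchronized (lock) { queue.add(new Task(name, priority)); }
    }

    /** A document modification invalidates the cached result. */
    void documentChanged(String newText) {
        synchronized (lock) { text = newText; cachedResult = null; }
    }

    /** One pass of the loop; a real implementation would run this on a worker thread. */
    List<String> runPending() {
        List<String> log = new ArrayList<>();
        while (true) {
            synchronized (lock) {                     // locked per task, so synchronous calls can interleave
                Task next = queue.poll();
                if (next == null) return log;
                log.add(next.name + ":" + currentResult());
            }
        }
    }

    /** Synchronous call: uses the same cached result and the same lock as the loop. */
    <R> R runUserTask(Function<String, R> action) {
        synchronized (lock) { return action.apply(currentResult()); }
    }

    int parseCount() {
        synchronized (lock) { return parseCount; }
    }

    /** Parses only when no valid cached result exists; callers must hold the lock. */
    private String currentResult() {
        if (cachedResult == null) {
            cachedResult = parser.apply(text);
            parseCount++;
        }
        return cachedResult;
    }
}
```

In this sketch, scheduling two tasks and calling runPending() runs both against a single parse; a later runUserTask(...) reuses that result, and only documentChanged(...) forces the next caller to parse again.
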
Indexing allows different language supports to create and maintain language-specific metadata. When a project is opened, the indexing support collects the project's source files, splits them into categories according to MIME type and passes them to the registered indexers. An indexer analyses the source and stores the metadata either using the Lucene support provided by this module or in custom storage. The indexing infrastructure then listens for file events and calls the corresponding indexers to update the metadata. Indexing also provides support for storing metadata in the Lucene index and retrieving it from there.
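
The dispatch described above (group files by MIME type, hand each group to the indexers registered for it, repeat for file events) might look roughly like the following Java sketch. All names here (IndexingSketch, Indexer, register, scan, fileChanged) are hypothetical and are not the module's actual SPI; MIME detection is a toy lookup by file extension, and metadata storage (Lucene or custom) is left to whatever the indexer callback does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; none of these names are taken from the real indexing SPI. */
final class IndexingSketch {

    /** Stand-in for a language-specific indexer; it decides how to store its metadata. */
    interface Indexer {
        void index(List<String> files);
    }

    private final Map<String, List<Indexer>> indexersByMimeType = new HashMap<>();

    /** Registers an indexer for one MIME type. */
    void register(String mimeType, Indexer indexer) {
        indexersByMimeType.computeIfAbsent(mimeType, k -> new ArrayList<>()).add(indexer);
    }

    /** Toy MIME detection by extension (assumption made for this sketch only). */
    static String mimeTypeOf(String path) {
        if (path.endsWith(".java")) return "text/x-java";
        if (path.endsWith(".js")) return "text/javascript";
        return "content/unknown";
    }

    /** Initial scan: every file is categorised once, then each group goes to its indexers. */
    void scan(List<String> sourceFiles) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String file : sourceFiles) {
            groups.computeIfAbsent(mimeTypeOf(file), k -> new ArrayList<>()).add(file);
        }
        for (Map.Entry<String, List<String>> group : groups.entrySet()) {
            List<Indexer> indexers =
                    indexersByMimeType.getOrDefault(group.getKey(), Collections.<Indexer>emptyList());
            for (Indexer indexer : indexers) {
                indexer.index(group.getValue());
            }
        }
    }

    /** File event: only the indexers for the changed file's MIME type are called. */
    void fileChanged(String path) {
        scan(Collections.singletonList(path));
    }
}
```

With an indexer registered for "text/x-java", scan(...) over a mixed list of .java and .js paths passes it only the .java files, and a later fileChanged("b.js") does not call it at all.
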
More information about the Parsing API can be found on the Parsing API Home Page and in the Parsing API Inception Review Issue.