ProjectTypeMusings


Current revision as of 21:03, 14 June 2012

DRAFT

Musings about mistakes made in the NB 4.0+ project system (specifically in individual project types rather than the generic architecture) and possible future directions.

XXX link to Issuezilla, wiki, etc. where appropriate.


Failings of "canned" project types

The "canned" Ant-based project types introduced in NB 4.0, such as j2seproject and web/project, have proven difficult to maintain, both for fixing bugs and for introducing useful new functionality.

Limited expressiveness of Ant

Since Ant lacks any true scripting constructs, relying instead on a simple macro language with no support for loops, conditionals, or serious data manipulation, it is very difficult to write generic Ant scripts which handle the range of actions wanted in a given project type while being parametrized only by data in *.properties files. (Deciphering and modifying the Ant scripts themselves is tricky at best, uncomputable at worst.)

One example is the building of subprojects. It might seem possible to encode a list of subproject paths in a project property, like so:

subprojects=lib1:lib2:../shared

and build these all with:

<subant path="${subprojects}" target="build"/>

or similar. But this system breaks down when some subprojects need to be invoked with different targets. There is also no straightforward way to prevent rebuilding a subsubproject which is reachable by two different paths, since Ant cannot (easily) pass information from a subbuild back to a parent build.
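
For contrast, here is a minimal Java sketch of the bookkeeping that a general-purpose language makes easy: a depth-first walk which remembers which subprojects have already been handled, so that a shared subsubproject is built only once. The class SubprojectOrder, its methods, and the dependency map are hypothetical names chosen for illustration; nothing like this exists in Ant or NB.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Ant or NetBeans.
public class SubprojectOrder {

    /**
     * Returns the projects reachable from root in dependency-first order,
     * listing each project exactly once even if several paths lead to it.
     */
    public static List<String> buildOrder(String root, Map<String, List<String>> deps) {
        Set<String> done = new LinkedHashSet<>();
        visit(root, deps, done, new LinkedHashSet<>());
        return new ArrayList<>(done);
    }

    private static void visit(String project, Map<String, List<String>> deps,
                              Set<String> done, Set<String> inProgress) {
        if (done.contains(project)) {
            return; // already scheduled via another path: do not rebuild
        }
        if (!inProgress.add(project)) {
            throw new IllegalStateException("Dependency cycle at " + project);
        }
        for (String dep : deps.getOrDefault(project, Collections.emptyList())) {
            visit(dep, deps, done, inProgress);
        }
        inProgress.remove(project);
        done.add(project); // all dependencies are scheduled before the project itself
    }
}
```

The "done" set is exactly the information that a subbuild would need to pass back to its parent build.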

Another example is copying classpath entries to the WEB-INF/lib/ folder of a WAR. Ant's <copy> does not accept a path, so a fixed copy target cannot handle a classpath of arbitrary length. Obviously a line or two of any scripting language would fix this, but not in Ant.
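
As a rough illustration of how little code is involved outside Ant, the following Java sketch splits a classpath string and copies each JAR into a library folder. ClasspathCopier and copyEntries are hypothetical names, and skipping directory entries and missing files is an assumption made to keep the example short.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Ant or NetBeans.
public class ClasspathCopier {

    /**
     * Copies every regular file named in a classpath (entries separated by
     * File.pathSeparator) into libDir, e.g. WEB-INF/lib, and returns the
     * copies made. Directory entries and missing files are skipped.
     */
    public static List<Path> copyEntries(String classpath, Path libDir) throws IOException {
        Files.createDirectories(libDir);
        List<Path> copied = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (entry.isEmpty()) {
                continue; // tolerate stray separators
            }
            Path source = Paths.get(entry);
            if (!Files.isRegularFile(source)) {
                continue; // assumption: only JAR-like files belong in the lib folder
            }
            Path target = libDir.resolve(source.getFileName());
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
            copied.add(target);
        }
        return copied;
    }
}
```

The loop over a classpath of arbitrary length is the part which a fixed Ant copy target cannot express.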

Two workarounds have been used in NB, neither satisfactory:

Some project types encode such structural information in project.xml and generate a different build-impl.xml from that depending on an XSL stylesheet. For example, a j2seproject with two subprojects gets two <ant> tasks generated in a target to build its dependencies. Besides the numerous problems encountered with XSLT across various JDK versions, this has the undesirable effect of making the build scripts change in response to common user customizations. People do not know whether or not to commit changes to build-impl.xml. genfiles.properties tries to track the status of this regeneration but introduces its own problems, especially for VCS merges. Changes in the stylesheet between IDE versions also cause silent changes in build-impl.xml with no coherent versioning.

Other project types introduce custom Ant tasks which can process the project's metadata directly. This is much more flexible, but it then means that a separate JAR has to be supplied just to build the project. If the JAR is kept only in the IDE installation, the project cannot be built from a VCS checkout; but if it is kept in the project, users will be aggravated at the IDE overhead needed just to run a simple application, multiple related projects may not correctly share one copy of the JAR, and the IDE needs to carefully manage upgrades to the JAR as bugs are fixed or features added.

Another possible workaround would be to use Ant's <script> to handle special cases like this, essentially replacing large blocks of build-impl.xml with JavaScript, Ruby, etc. This would be potentially more comfortable for NB developers (at the risk of making the average user even less likely to be able to follow how build-impl.xml actually works); but since Ant does not ship any scripting engine by default, the IDE would need to do so, with all the drawbacks of the previous workaround (compounded by a much larger and more rapidly-changing JAR).

Confusion between build.xml vs. build-impl.xml

Despite prominent comments at the tops of these files, many users do not grasp that build-impl.xml is intended to be maintained by the IDE whereas build.xml is intended for hand-written overrides. Upon seeing where some functionality is implemented, it is natural to customize it by editing that file directly.

Anyway, writing overrides in build.xml generally seems beyond the capacity of most users, even those with prior Ant experience. The macros and large number of properties used in build-impl.xml are overwhelming.

Use of Ant for critical-path operations

While it is nice to know that doing a build from the IDE invokes exactly the same code path as would be used on a continuous build server (modulo any settings in nbproject/private/private.properties), most users seem to care more about speed and functionality of a handful of basic Java operations, namely: compile, debug, and unit test.

Compilation from Ant is not ideally suited to an IDE. The whole source tree (and perhaps even subprojects) must be checked for modifications, even though the IDE can usually know which files have been edited recently. (There are potential problems when running external VCS tools.) Inter-class dependency management using <depend> is not very efficient, but not using this task is arguably worse. The whole process involves significant overhead beyond what javac (e.g. via JSR 199) necessarily needs.

Launching the target process (perhaps under a debugger or profiler) is certainly possible from Ant, but then Ant offers no java.lang.Process handle or any other means of directly monitoring the process. System I/O management is difficult, there is no way to request a thread dump, etc. Starting and stopping application servers from Ant is possible but awkward.

Unit testing from Ant means that the IDE needs to use hacks (VERBOSE-level log messages) just to monitor which tests are being run at which time. <junit> lacks any means of running a single test method, so the IDE does as well.

While you can in the current system take advantage of Ant's (limited) scriptability to customize these targets (e.g. to start a daemon process before beginning unit tests), it is doubtful many users have the need, expertise, or inclination to do so.

Some project operations are less commonly run; inherently slower; or likely to involve much more variation. For example, final packaging and creation of Javadoc do not need to be run constantly, can legitimately take twenty seconds, and may well require special customization. For such operations, usage of Ant seems more appropriate.

Lack of coherent library management

Ant provides no notion of a "library", just low-level classpath customization, so the IDE is forced to create a library management infrastructure of its own and generate Ant targets to use it. The IDE's version has yet to even properly support libraries stored in a version-controlled directory, much less transitive dependencies, discovery of updates, JNI support, license management, etc. Retrofitting such features into the existing code would be very costly. The most glaring lack is integration with the now well-established public repositories of open-source code libraries.

Redundancy and inconsistency between project types

While the several canned project types provide various differentiating features, they also share a large amount of common functionality which is unfortunately not reflected well at the code level. The result is a large degree of code duplication, and numerous undocumented (and often unknown) inconsistencies between project types. This situation can perhaps be addressed piecemeal by introducing new shared support code but such work rarely gets prioritized.

Sharing common functionality in generated Ant scripts is especially troublesome because the abstractions available from Ant are so weak.

Difficulty extending build functionality of project types

While NB 6.0 introduces a limited SPI for inserting fragments of Ant code into existing project types from other modules, it is difficult to make this work consistently across project types, since there is no preexisting standard for how to hook together such fragments. Thus, it remains difficult to add support for special build steps such as obfuscation from third-party modules. Even if the extensibility were made easier, there would remain the deployment problem of bundling and maintaining support JARs needed for such extensions.


Failings of "freeform" projects

Although the "freeform" project type (...from Existing Ant Script) is surprisingly widely used (maybe up to 25% of projects), it has been on the whole a failure for usability.

Conceptual differentiation

Most people who create freeform projects probably should not have. A user has an existing Ant script, sees that there is an option to use it, and takes it, without understanding the implication that the IDE will expect all project configuration to be done via this script. The distinction is poorly explained in the GUI, and the insistence on consistency with the GUIs of the canned project types further confuses things.

The worst problem is that the Classpath customization panel in the GUI gives the mistaken impression that it is analogous to a similarly-named panel in j2seproject, when in fact it only records and does not control the classpath. In particular, people expect the Library Manager to work with this, which makes little sense since an arbitrary Ant script cannot be expected to understand the IDE's library concept; but there is no clearly visible means of associating sources and Javadoc to a JAR in a freeform project, which is done using this manager in a j2seproject. (In fact the manager can be used for this purpose for freeform projects as well, if you are willing to manually duplicate the binary classpath information - but scarcely anyone realizes this.)

Initial setup

Setting up a new freeform project is notoriously cumbersome. It is necessary to manually specify all the Java source packages and their classpaths, or the editor will be unusable. The critical configuration of specifying output directories is not even hinted at in the wizard. There have been proposals to try to partially or fully automate the setup wizard, by inspecting or even trying to run the Ant script, but no progress to date.

Special target setup

Currently all IDE actions on the freeform project must run Ant targets. This extends to compiling individual file selections and running or debugging the program. Since most hand-written scripts lack such targets, the IDE tries in some (not all) cases to generate a target. But its guesses are often wrong and sometimes manual customization is required. While the flexibility for unusual situations is of occasional value (e.g. ability to launch files with certain extensions as if they were main classes), most people do not care and just expect these project operations to be run in a predetermined way from the IDE with no Ant involvement. All interesting customizations would anyway require direct editing of project.xml, which is typically lengthy, and the documentation is nowhere accessible from the IDE.

Inappropriateness of GUI customizer

The Properties dialog for freeform projects was added at the last minute for NB 4.0. It has never come close to being able to represent the range of customizations that can be made in project.xml; worse, using it can silently clobber some customizations. The GUI dialog encourages the user to duplicate information available elsewhere: for example, a classpath specified in a *.properties file should really be loaded as such in project.xml, but this is not an option from the GUI. If everything you really need can be controlled from this dialog, you should have been using a j2seproject to begin with anyway; the advantages of freeform projects are realized only by directly editing project.xml.

Failure to disseminate maintained setups

The original intention was that the first developer to use NB on a complex Ant-based project would spend a little time setting up the project nicely, this configuration would be checked into version control, and colleagues would continue to tweak the setup. Future developers on the project would just check out and open the project. To this end, support was added for loading *.properties files to facilitate one-point maintenance of project configuration which could also be interpreted from Ant, etc.

In practice, after some years very few open-source projects seem to have existing freeform project setups. There are even guides published on netbeans.org on how to configure a well-known software project as a freeform project for use in NetBeans - which would be gratuitous if it were already done for you! It seems people are reluctant to commit the nbproject/ directory to version control, probably in part due to the difficulty in understanding what it means. Since so few such projects exist, there is no good source of examples online, compounding the documentation issue. Many of the freeform projects in available repositories were probably added by people with no knowledge of the project.xml format whatsoever, so are little better than what a novice would anyway create in a few minutes, and in some cases probably a j2seproject would have been the better choice.

Inflexibilities

For the occasional special project which could really use a carefully tuned project definition - e.g. OpenJDK - project format limitations are a serious issue. Some IDE features would really benefit from use of a scripting language in place of Ant and static XML declarations.


General mistakes

There are also some design mistakes which are shared by freeform and canned project types.

XML

Using XML for configuration data can be useful for complex systems, but also introduces a number of headaches. Reading and writing XML data is quite cumbersome (NB has never used JDOM or higher-level libraries). XML namespaces confuse even experienced developers and most versions of structured XML output libraries have never correctly supported namespaces.

Declaring formats using XML Schema seems like the right thing to do, yet few people understand Schema and fewer want to; NB developers have never consistently produced new schema revisions that accurately reflect what the code is actually reading and writing, and nb.org CVS sources persistently contain dozens of projects which do not validate according to their nominal schema.

XSLT has proven a disaster for generating build scripts; depending on the exact library version used, indentation may be lost, attributes reordered, or namespaces even written incorrectly. Writing XSLT scripts to perform nontrivial data transformations is also notoriously hard.

Parsing XML is also quite slow and contributes to the inefficiency of opening many projects at once.

In retrospect, *.properties files would probably have sufficed for many tasks, while being faster, easier to read and write, and more simply versioned.

nbproject/private/

The intention of the nbproject/private/ directory was to capture per-user or per-checkout customizations to a project which should not be shared. The assumptions were that

  1. A combination of the suggestive name, documentation, and integration with the IDE's VCS support would make sure this directory was never checked into VCS (and was also listed in the VCS ignore pattern).
  2. People sharing projects outside VCS, e.g. as ZIP files, would use the IDE to create such archives, and this would automatically exclude the private directory.

In practice, neither of these assumptions held, and nbproject/private/ directories have proliferated in version control systems, often even in nb.org CVS. The impact is very bad because new project checkouts are misconfigured, and attempted fixes result in mysterious VCS commits which weaken confidence in the IDE. (Although hand-written Ant scripts do sometimes have user.properties or site.properties files, these are known to the developers, who know how to deal with them.)

Another problem is that a user who moves a project to a new location without using the IDE can end up with misconfigured metadata after the move. Had the move been done from the IDE, the directory would have been deleted (or corrected) as part of the operation.

In retrospect, while most developers grasp that a subdirectory build/ or perhaps dist/ is a build product which should be ignored, nothing else unsharable should have been stored inside the project directory. It would have sufficed to keep such information in the user directory, tied to an absolute project pathname, together with some simple heuristic to clean up stale metadata for long-deleted projects.
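
A minimal Java sketch of that alternative follows, assuming one properties file per project in the user directory, named by a hash of the absolute project path, and a cleanup pass which deletes files whose project directory no longer exists. PrivateSettingsStore, the private-project-settings folder, and the project.dir key are all hypothetical; this is not how NB stores anything today.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;
import java.util.UUID;

// Hypothetical sketch for illustration only; not an existing NetBeans API.
public class PrivateSettingsStore {
    private static final String PROJECT_DIR_KEY = "project.dir";
    private final Path storeDir;

    public PrivateSettingsStore(Path userDir) {
        this.storeDir = userDir.resolve("private-project-settings");
    }

    private static String absolute(Path projectDir) {
        return projectDir.toAbsolutePath().normalize().toString();
    }

    // One file per project, named by a UUID derived from the absolute path.
    private Path fileFor(Path projectDir) {
        byte[] key = absolute(projectDir).getBytes(StandardCharsets.UTF_8);
        return storeDir.resolve(UUID.nameUUIDFromBytes(key) + ".properties");
    }

    public void save(Path projectDir, Properties settings) throws IOException {
        Files.createDirectories(storeDir);
        Properties copy = new Properties();
        copy.putAll(settings);
        copy.setProperty(PROJECT_DIR_KEY, absolute(projectDir)); // needed for cleanup
        try (OutputStream out = Files.newOutputStream(fileFor(projectDir))) {
            copy.store(out, null);
        }
    }

    public Properties load(Path projectDir) throws IOException {
        Properties settings = new Properties();
        Path file = fileFor(projectDir);
        if (Files.isRegularFile(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                settings.load(in);
            }
        }
        return settings;
    }

    /** Simple heuristic: drop settings whose project directory has vanished. */
    public int removeStale() throws IOException {
        if (!Files.isDirectory(storeDir)) {
            return 0;
        }
        int removed = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(storeDir, "*.properties")) {
            for (Path file : files) {
                Properties stored = new Properties();
                try (InputStream in = Files.newInputStream(file)) {
                    stored.load(in);
                }
                String dir = stored.getProperty(PROJECT_DIR_KEY);
                if (dir == null || !Files.isDirectory(Paths.get(dir))) {
                    Files.delete(file);
                    removed++;
                }
            }
        }
        return removed;
    }
}
```

Because the settings are keyed by absolute path, a project moved outside the IDE simply starts with empty private settings instead of inheriting stale ones.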

Of course, a major component of these directories today is absolute path names for libraries or other resources required by the project, which ought to be addressed by better sharability management.

Prospects for Maven-based projects

TBD. In principle Maven is far better suited to IDE usage than Ant: its metadata is essentially declarative; it has its own system for managing orderly build process extensions; it has a well-established mechanism for using and versioning libraries and other dependencies; it has a rich library of integrations with useful tools in the Java world that do not need to be invented by an IDE; and there is a reasonable expectation that a preexisting Maven project can be opened directly in the IDE with no special setup.

The drawbacks are reported difficulty tracking down the cause of errors; the uncertainty about the transition to Maven 2 and its API stability; the unacceptable overhead of running Maven for critical-path operations such as compilation; less widespread acceptance than Ant in the developer world; and potential problems providing clear GUI differentiation between different project "types".

It can be expected that there will be plenty of bugs and limitations in Maven as it exists today that would need to be resolved for IDE usage, but this was also true for Ant. Maven is open-source, so it is reasonable to devote some time to contributions to Maven itself.


Prospects for freeform Ant-based projects done differently

AutomaticProjects
