JSR-175… Great spec for the language, totally ignores tools

Java really needs annotations so we can stop shipping around all these separate files that together represent one logical entity. JSR-175 addresses the need on the declaration side but fails to produce a processing model.

I love the new annotations. We have been waiting for them for a long time at BEA and in the meantime have been using Javadoc tags to do the markup. See my previous blog entry about EJBGen for an example. Being able to move that markup into the language is a great advantage and will let us catch at compile time some of the errors that we normally catch only when someone runs EJBGen on their source file.

However, after reading the specification, I feel there is a gaping hole in it. The document contains no specification for how these annotations are processed. It does not say what an end user needs to do, once they have an annotated source file and a set of annotations, in order to completely process that source file. If I have a .java file I know I need to run javac. If I have a .g file I know to run Antlr. Since all these files are going to end in .java, I think I should have to run javac and only javac. The annotations should tell javac what tools are needed. Additionally, the specification does not say how these annotated classes are efficiently found at runtime. Let’s run through some scenarios that I would like to see addressed:

1) End user has a set of SOURCE annotations and a source file with those annotations. How will this be processed?
2) End user has a set of annotations and a source file with annotations that are semantically incorrect because there is some conflict between two annotations that is not syntactically wrong. When is the error caught?
3) Javac decides to recompile a source file with annotations when it was told to compile a dependent class. How does the end user know to rerun the annotation processing tool?
4) Class annotations have some global effect. How does a runtime efficiently find these classes and apply the global effect? For instance, a servlet with its registration annotated within it a la XDoclet.
5) I am given a Jar file that contains annotations and classes that use those annotations. How do I determine the tool or runtime requirements to use that jar file?
6) Given a tool that needs to process the Jar file in 5), how do I determine which classes have annotations that I am interested in without scanning every class?

I don’t like putting up these scenarios without some idea of how I would handle them if I were writing the specification. For the cases where we need to process or reprocess annotations with a tool, or report errors at compilation time, I believe that the best solution is to make javac extensible. Annotation tool jars could be configured for javac’s use and they would register themselves as interested in particular annotations. Annotations would be able to specify whether or not a tool is required to be registered in order for them to work properly. The API for these tools would include the ability to abort the compilation with an error if they determine that the annotations they find violate any of their internal semantics. It would also give the required hook for reprocessing classes that are recompiled by javac, without scanning all of the class files and comparing them with whatever the output of the processing tool is.

For instance, if I try to compile a new annotated RMI class and there is no RMI stub generator registered, then I would get an error that the compilation could not complete. In this case there may be a default registration that uses the JDK’s rmic to process the file, but in the case of, say, WebLogic it would register an empty processor because we do all the work at runtime.
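To make the idea concrete, here is a rough sketch of what such a hook might look like. Everything in it is hypothetical: the `RequiresTool` meta-annotation, the `RemoteObject` annotation, the `AnnotationTool` interface and the `CompilerHook` class are names invented for illustration, and reflection over runtime-visible annotations merely stands in for what javac would do internally.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical meta-annotation: an annotation type marked with it
// needs a registered tool in order to work properly.
@Retention(RetentionPolicy.RUNTIME)
@interface RequiresTool {}

// Hypothetical annotation for a class that needs stub generation.
@Retention(RetentionPolicy.RUNTIME)
@RequiresTool
@interface RemoteObject { String name(); }

// Hypothetical contract that an annotation tool jar would implement.
interface AnnotationTool {
    boolean handles(Class<? extends Annotation> type);
    // Returns semantic errors; any error should abort the compilation.
    List<String> process(Class<?> compiled, Annotation annotation);
}

// The "empty processor" case: satisfies the requirement, does nothing.
final class NoOpTool implements AnnotationTool {
    public boolean handles(Class<? extends Annotation> type) {
        return type == RemoteObject.class;
    }
    public List<String> process(Class<?> compiled, Annotation annotation) {
        return Collections.emptyList();
    }
}

// Stand-in for the compiler side of the extension point.
final class CompilerHook {
    private final List<AnnotationTool> tools = new ArrayList<>();

    void register(AnnotationTool tool) { tools.add(tool); }

    // Called for every class that is compiled or recompiled, so the
    // interested tools are rerun without the user having to remember.
    List<String> onClassCompiled(Class<?> compiled) {
        List<String> errors = new ArrayList<>();
        for (Annotation a : compiled.getAnnotations()) {
            boolean handled = false;
            for (AnnotationTool tool : tools) {
                if (tool.handles(a.annotationType())) {
                    handled = true;
                    errors.addAll(tool.process(compiled, a));
                }
            }
            if (!handled && a.annotationType().isAnnotationPresent(RequiresTool.class)) {
                errors.add(compiled.getName() + ": no tool registered for @"
                        + a.annotationType().getSimpleName());
            }
        }
        return errors;
    }
}

@RemoteObject(name = "accounts")
class Account {}
```

With no tool registered, `new CompilerHook().onClassCompiled(Account.class)` reports one error; after `register(new NoOpTool())` it reports none, which is the WebLogic case described above.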

In order to quickly find annotated classes within a classpath or a Jar file, I believe that a manifest or index should be generated by javac detailing all the classes that have annotations and the types of those annotations. Otherwise, let’s say we have a system like XDoclet where you specify the servlet path for a servlet class. Every time I deploy a webapp I would have to scan all the classes in the entire classpath to determine if there are any servlets that need to be registered. This clearly requires some sort of index mechanism, probably at the package.class level and additionally at the jar manifest level, where all the package.class information is aggregated.
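As a sketch of the kind of index I have in mind, assuming a made-up one-line-per-annotation text format (`annotationType=classA,classB`) and a hypothetical `AnnotationIndex` class, neither of which comes from JSR-175:

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical index: annotation type name -> names of classes that use it.
final class AnnotationIndex {
    private final Map<String, Set<String>> classesByAnnotation = new TreeMap<>();

    // What the compiler would call for each annotated class it emits.
    void record(String annotationType, String className) {
        classesByAnnotation
            .computeIfAbsent(annotationType, k -> new TreeSet<>())
            .add(className);
    }

    // What a runtime or tool would call instead of scanning every class.
    Set<String> classesWith(String annotationType) {
        return classesByAnnotation.getOrDefault(annotationType, Collections.emptySet());
    }

    // Aggregates a package-level index into a jar-level one.
    void merge(AnnotationIndex other) {
        other.classesByAnnotation.forEach(
            (type, classes) -> classes.forEach(c -> record(type, c)));
    }

    // Assumed on-disk format: one "type=class1,class2" line per annotation.
    String serialize() {
        StringBuilder out = new StringBuilder();
        classesByAnnotation.forEach((type, classes) ->
            out.append(type).append('=')
               .append(String.join(",", classes)).append('\n'));
        return out.toString();
    }

    static AnnotationIndex parse(String text) {
        AnnotationIndex index = new AnnotationIndex();
        for (String line : text.split("\n")) {
            int eq = line.indexOf('=');
            if (eq < 0) continue; // skip blank or malformed lines
            String type = line.substring(0, eq);
            for (String c : line.substring(eq + 1).split(",")) {
                if (!c.isEmpty()) index.record(type, c);
            }
        }
        return index;
    }
}
```

A servlet container could then ask the jar-level index for `classesWith("example.ServletPath")` (an invented annotation name) at deployment time and load only those classes.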

When given a jar file with annotated classes and annotations, there should be some way to indicate in the jar file what runtime or compile-time processor is required in order to use the classes. Simply handing these out without any programmatic way to determine dependencies is liable to be frustrating to the end user. There are dependency mechanisms built into the Jar format that can be exploited for this.
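One way this could work is a main attribute in the jar manifest. The sketch below uses the standard `java.util.jar.Manifest` and `Attributes` classes, but the attribute name `Required-Annotation-Processors`, the processor names and the `ProcessorRequirement` helper are all assumptions of mine, not anything the Jar specification or JSR-175 defines.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

final class ProcessorRequirement {
    // Hypothetical attribute name, invented for this sketch.
    static final Attributes.Name REQUIRED =
        new Attributes.Name("Required-Annotation-Processors");

    // Builds a manifest that lists the processors the jar depends on.
    static Manifest declare(String... processors) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version must be set or the main section is not written.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(REQUIRED, String.join(" ", processors));
        return manifest;
    }

    // Returns the declared processors that are not available locally.
    static List<String> missing(Manifest manifest, Set<String> available) {
        List<String> result = new ArrayList<>();
        String value = manifest.getMainAttributes().getValue(REQUIRED);
        if (value == null) return result; // jar declares no requirements
        for (String processor : value.trim().split("\\s+")) {
            if (!processor.isEmpty() && !available.contains(processor)) {
                result.add(processor);
            }
        }
        return result;
    }

    // Writes and re-reads the manifest, as it would be stored in a jar.
    static Manifest roundTrip(Manifest manifest) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            manifest.write(out);
            return new Manifest(new ByteArrayInputStream(out.toByteArray()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A tool or container handed such a jar could call `missing(...)` with the set of processors it knows about and refuse to proceed, with a useful message, if the list is not empty.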

I feel like JSR-175 is only handling half the problem. Can you imagine if the other Java specifications simply gave you a data format without specifying how that data leads to processing? It’s like God made a new creature with no ecosystem for it to live in.

P.S. Other problems of note include the requirement that the annotations themselves be available at compilation. Javac could generate fake annotations based on usage. You would have to get rid of default annotations though.