Tuesday, 25 September 2012

Performance of Apache Commons StringUtils.split and Google Guava Splitter

Last Friday I had a discussion with a colleague at SEITENBAU about the semantics of the split method of the Apache Commons Lang StringUtils class. In the end we compared the Google Guava Splitter API with the Apache Commons Lang StringUtils split methods. Our conclusion was that source code based on Guava is easier to understand and much clearer.

After comparing the APIs, we wondered which of the two has the faster split implementation. So I built a simple performance test for the two String split implementations. The result surprised me: in my test case the StringUtils split method is much faster than the Guava Splitter split method.

Test setup: I generate 5000 random strings, each with a length of 10000. The test strings contain commas, which serve as the separators in the test. I invoke the Apache Commons split method and the Guava Splitter with the same test data; the performance results are shown in the table below.

Test run                               1        2        3        4
Apache Commons StringUtils.split(…)    126 ms   122 ms   121 ms   122 ms
Google Guava splitter.split(…)         352 ms   350 ms   346 ms   349 ms


Here is the source of my simple performance test:
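The setup described above could look roughly like the following sketch. It is a minimal illustration and not the original listing: the class name `SplitBenchmarkSketch`, the helper names, the alphabet and the seed are all assumptions. To keep it runnable with the JDK alone, `String.split` is used as a stand-in, and the comments in `main` show where the two library calls would be plugged in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.ToIntFunction;

// Hypothetical benchmark harness; all names here are illustrative.
class SplitBenchmarkSketch {

    // Receives the token total so the splitting work is observably used.
    static volatile long sink;

    // Builds `count` pseudo-random strings of `length` characters drawn from
    // lowercase letters plus ',' (so roughly 1 in 27 characters is a comma).
    static List<String> generateTestData(int count, int length, long seed) {
        final String alphabet = "abcdefghijklmnopqrstuvwxyz,";
        Random random = new Random(seed);
        List<String> data = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            StringBuilder sb = new StringBuilder(length);
            for (int j = 0; j < length; j++) {
                sb.append(alphabet.charAt(random.nextInt(alphabet.length())));
            }
            data.add(sb.toString());
        }
        return data;
    }

    // Applies `tokenCounter` to every input and returns the elapsed
    // wall-clock time in milliseconds.
    static long timeMillis(List<String> data, ToIntFunction<String> tokenCounter) {
        long tokens = 0;
        long start = System.nanoTime();
        for (String s : data) {
            tokens += tokenCounter.applyAsInt(s);
        }
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        sink = tokens;
        return elapsedMillis;
    }

    // JDK-only stand-in for the library calls under test.
    static int countTokensJdk(String s) {
        return s.split(",").length;
    }

    public static void main(String[] args) {
        List<String> data = generateTestData(5000, 10000, 42L);
        for (int run = 1; run <= 4; run++) {
            long ms = timeMillis(data, SplitBenchmarkSketch::countTokensJdk);
            System.out.println("run " + run + ": " + ms + " ms");
        }
        // With both libraries on the classpath, the candidates would be:
        //   timeMillis(data, s -> StringUtils.split(s, ',').length);
        //   Splitter splitter = Splitter.on(',');
        //   timeMillis(data, s -> Iterables.size(splitter.split(s)));
        // Guava's split(...) returns a lazy Iterable, so it has to be
        // iterated for the splitting work to actually happen.
    }
}
```

A single timed pass like this includes JIT warm-up, so the numbers are only a rough indication; a benchmark framework would give more reliable figures.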
Why
Does anybody have an idea why the Guava API is slower than the StringUtils split method in my test? I have read that the Guava Splitter performance should be very good, so I am surprised by the result.

Here are the dependencies I used for the performance test:

Sunday, 9 September 2012

Equinox CM and ECF inconsistent Behavior

This evening I had a long OSGi debugging session with the Equinox Configuration Admin and the ECF-based Remote Services implementation. In the end I found out that it was my mistake: I had used the Configuration Admin incorrectly. I have updated this blog post accordingly.

But the problem is that the behavior of the Configuration Admin depends on which bundle calls the createFactoryConfiguration(...) method, and this makes debugging hard.

The problem was that when a factory configuration was created for the first time via my remote bundle (which runs locally on the same system, in another OSGi framework), I got an exception. The remote bundle invokes the method "createWall(…)"; see the code sample:

When the first caller of this method was the remote bundle, I got the following exception (if the second caller of the Configuration Admin was a local bundle, it got the same exception, e.g. when a configuration was created via the Apache web console):

When the same logic is called for the first time from a local bundle, everything works fine. After some time debugging, I found the issue: the point is not whether the bundle is local or remote.

In the implementation of the method "createWall(...)", the Configuration Admin method "createFactoryConfiguration(...)" is called with location null. According to the JavaDoc of this method, when the location is null, the configuration is bound to the location of the first bundle that registers a corresponding service.

The OSGi Compendium Specification JavaDoc for createFactoryConfiguration says:
"The Configuration is bound to the location specified. If this location is null it will be bound to the location of the first bundle that registers a Managed Service Factory with a corresponding PID."
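The binding rule quoted above can be illustrated with a small toy model. This is a sketch under simplifying assumptions, not the Equinox implementation: the class `LocationBindingSketch`, the PID "wall.factory" and the bundle locations are hypothetical, and the model tracks a single configuration per factory PID.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the location-binding rule; not real Configuration Admin code.
class LocationBindingSketch {

    // factory PID -> bound bundle location (a null value means "not bound yet")
    private final Map<String, String> boundLocation = new HashMap<>();

    // Mirrors the shape of createFactoryConfiguration(factoryPid, location).
    void createFactoryConfiguration(String factoryPid, String location) {
        boundLocation.put(factoryPid, location);
    }

    // Simulates a bundle registering a Managed Service Factory for factoryPid.
    // Returns true if that bundle would receive the configuration.
    boolean registerManagedServiceFactory(String factoryPid, String bundleLocation) {
        if (!boundLocation.containsKey(factoryPid)) {
            return false; // no configuration exists for this PID
        }
        String bound = boundLocation.get(factoryPid);
        if (bound == null) {
            // Unbound configuration: the first registering bundle wins.
            boundLocation.put(factoryPid, bundleLocation);
            return true;
        }
        return bound.equals(bundleLocation);
    }

    String locationOf(String factoryPid) {
        return boundLocation.get(factoryPid);
    }

    public static void main(String[] args) {
        // Location null: the binding depends on registration order.
        LocationBindingSketch cm = new LocationBindingSketch();
        cm.createFactoryConfiguration("wall.factory", null);
        cm.registerManagedServiceFactory("wall.factory", "bundle://remote");
        cm.registerManagedServiceFactory("wall.factory", "bundle://local");
        System.out.println(cm.locationOf("wall.factory")); // prints bundle://remote

        // Explicit location: the binding is fixed at creation time.
        LocationBindingSketch fixed = new LocationBindingSketch();
        fixed.createFactoryConfiguration("wall.factory", "bundle://local");
        fixed.registerManagedServiceFactory("wall.factory", "bundle://remote");
        System.out.println(fixed.locationOf("wall.factory")); // prints bundle://local
    }
}
```

In the first case the outcome changes if the two registrations are swapped, which is the order dependence that makes this behavior hard to debug.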

In my case the first bundle was sometimes, indirectly, the remote bundle. That caused the problem and the exception. My fix was to set the location to the location of the bundle that creates the wall configuration; see the code sample:

In the end I found out that the two OSGi frameworks had the same configuration area, i.e. the same local directory. Provider and host were running on the same system. It was a coincidence that the fix worked. If the PID for the configuration folder was not locked by the other OSGi instance, everything worked. So in the end I must say it was my mistake, because both OSGi instances had the same directory configured as their configuration area. But the design that the location depends on the first caller makes debugging very difficult. Does anybody know if this is the expected behavior? It is really hard to debug when the directory where the configuration is stored can depend on the first caller. Is it best practice to always set the location parameter?