Continuous Updating

What is Continuous Updating?

Continuous Updating means updating your software projects regularly. It mainly focuses on the libraries and platforms you use, like Guava, Spring or Java itself.

But in my opinion it also means regularly questioning the design decisions you have already made, your architecture, or in short: everything. So imagine you got the time and budget to rebuild your current system.
What would you change if you started fresh?!

Keeping up2date with libraries

Sadly we all know how library updates are usually handled. You create your maven project, add the dependencies you need, and then never touch the library versions again, unless you need a new feature or have to update because of another dependency.

Continuous Updating of libraries is very important for the following reasons:

  • You never reach the point where updating becomes really difficult and painful
  • You get security fixes
  • You get bug fixes you might not even know of
  • You get performance improvements
  • You get feature updates for free
    • Syntax features, like API extensions to support lambdas
    • Evolution of a feature, because we are no cavemen anymore. For example, the new date and time API in Java 8.
    • Small or big extensions of what a library already offers. For example, Guava added one more eviction strategy.
  • Updating other dependent libraries will be a smaller step
  • You have a community which can help you if something goes wrong

But wait

Updates can also break your system. I mean, an update can introduce bugs or security holes, or might not yet be ready for use in a production environment.

In the end you need to update at some point. But is that directly at the release of Java 9? Or do you wait until the first bugfix release?

Well, the answer is, as always: it depends. How high is the risk that it will break something?

You need to lower the risk with:

  • Good (>90%) test coverage with unit and integration tests. Otherwise you might not catch the one special use case where it’s broken.
  • A good test environment that is as close to production as possible
  • Updating one small software project at a time. If the project is small (e.g. a micro service), then the risk and impact are simple to assess.
  • For the production environment:
    • Run multiple servers with different versions of the library in question.
    • Use the feature toggle pattern to affect only a small number of users if the library change is a major change in how it works, like going from Apache HttpClient 3 to 4.
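
The feature toggle idea from the list above could be sketched roughly like this. It is only an illustration: the class name, the method names and the percentage-based bucketing are my own assumptions, and the two return values are placeholders for real calls into the old and the new library.

```java
class LibraryToggle {
    // Hypothetical toggle: sends a stable share of users to the new library version.
    static boolean useNewClient(String userId, int rolloutPercent) {
        // floorMod keeps the bucket in 0..99 even for negative hash codes
        int bucket = Math.floorMod(userId.hashCode(), 100);
        return bucket < rolloutPercent;
    }

    static String fetch(String userId, int rolloutPercent) {
        // Placeholders: a real system would call the old or the new client here.
        return useNewClient(userId, rolloutPercent) ? "new-client" : "old-client";
    }

    public static void main(String[] args) {
        // With 5 percent rollout most users still get the old client.
        System.out.println(fetch("alice", 5));
    }
}
```

Because the bucket is derived from the user id, the same user always lands on the same side of the toggle, which makes problems easier to reproduce.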

Depending on the amount of risk involved you choose how much effort you are willing to invest to mitigate the risk.

Wait library updating – we forgot the bigger picture

Library updating also means checking whether the library still meets our standards. Or are you still on log4j 1 instead of logback or log4j 2?

That means for me:

  • Living community that maintains it and still produces regular releases
  • Best suited for your use case
  • State of the art/current best practice for what it does. E.g. Guava caching should be replaced by Caffeine.
  • Software license is still usable in your use case. E.g. the library switched to GPL and you use it commercially.

So throw away the log4j 1. Because frankly it’s ancient history. 🙂

The same also counts for your server software, OS, everything!

Many people nowadays have switched from their ancient JBoss 4 to micro services, for example. Keep up2date with modern design patterns. But also don’t jump on the train just because it looks fancy or seems to solve everything.

In the end the software you use needs to be more than just marketing.

Posted in common practice

Unknown Java Features Part 2: TimeUnit

Simple Way to use different time units

TimeUnit is the simplest way to work with different time units since Java 5.

Aren’t you annoyed about writing out the conversion all the time, e.g. from minutes to milliseconds? Like “5 * 60 * 1000”?

Sure, it’s no rocket science, but it’s difficult to read. The next step is often to extract it as a constant. Then you at least know immediately that it’s 5 minutes.

But there is an easier way to express the same thing. The TimeUnit enum. So let’s see it in action.
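
A minimal sketch of what this can look like. The class and constant names are made up for the example; the TimeUnit calls themselves are the standard java.util.concurrent API.

```java
import java.util.concurrent.TimeUnit;

class TimeUnitConversion {
    // The old way: magic arithmetic (constant names are hypothetical)
    static final long TIMEOUT_OLD_MILLIS = 5 * 60 * 1000;
    // The same value with the unit spelled out
    static final long TIMEOUT_MILLIS = TimeUnit.MINUTES.toMillis(5);

    public static void main(String[] args) {
        System.out.println(TIMEOUT_MILLIS);                                  // 300000
        System.out.println(TimeUnit.MILLISECONDS.toSeconds(TIMEOUT_MILLIS)); // 300
        // convert(amount, sourceUnit) converts into the unit it is called on
        System.out.println(TimeUnit.HOURS.convert(120, TimeUnit.MINUTES));   // 2
    }
}
```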


With TimeUnit you just express the unit directly and you can still extract it to a constant if needed.

Wanna go to sleep, wait or join?

There are shortcut functions for those as well:
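
A small sketch of the three shortcuts, sleep, timedJoin and timedWait. The class name, the worker thread and the lock object are only there for illustration.

```java
import java.util.concurrent.TimeUnit;

class TimeUnitShortcuts {
    // Returns true once the worker thread has finished.
    static boolean runDemo() {
        try {
            // Instead of Thread.sleep(50)
            TimeUnit.MILLISECONDS.sleep(50);

            Thread worker = new Thread(() -> System.out.println("working"));
            worker.start();
            // Instead of worker.join(1000)
            TimeUnit.SECONDS.timedJoin(worker, 1);

            Object lock = new Object();
            synchronized (lock) {
                // Instead of lock.wait(10); nobody notifies here,
                // so this returns after roughly 10 ms at the latest
                TimeUnit.MILLISECONDS.timedWait(lock, 10);
            }
            return !worker.isAlive();
        } catch (InterruptedException e) {
            // Restore the interrupt flag for callers
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

Like Object.wait, timedWait has to be called while holding the monitor of the object, hence the synchronized block.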



TimeUnit is an easy way to convert between time units and to express them more clearly. Shortcut methods like sleep round things off.

Posted in common practice, Java, Unknown feature

Unknown Java Features Part 1: BitSet

BitSet – the holy grail of working with bits

What does it solve?

java.util.BitSet solves the memory consumption problem of boolean[], adding a tiny CPU overhead and offering a lot of helpful methods. It has been available since Java 1.0.

Boolean and memory in detail

Have you ever wondered why a boolean takes up 1 byte instead of 1 bit? It is because of the way memory is accessed. In memory everything is aligned to fit your architecture, to 4 or 8 bytes depending on whether it is a 32bit or 64bit system. So even a 1 byte boolean consumes 8 bytes if necessary. From here on I will only calculate the 64bit consumption to keep it simple.

An array or an object is an improvement, because the space taken by the other attributes fills up the 8 bytes or a multiple of 8 bytes.

Comparing different memory consumptions:

java.lang.Boolean (not cached): 4 bytes pointer to instance + 16 bytes instance header +  1 byte boolean + 7 bytes padding to be at 8 bytes = 28 bytes per boolean

8 booleans in an object (don’t use it. Just to make a point): 4 byte pointer + 16  byte header + 8 * 1 byte boolean= 28 / 8 = 3.5 bytes per boolean

boolean[8] : 4 bytes pointer + 16 bytes instance header + 8 bytes special array header + 8*1 byte boolean = 36 / 8 = 4.5 bytes per boolean

Of course it doesn’t make sense to make an object of x booleans just to save some memory. But it’s good to know that memory is automatically saved here if you got multiple boolean, short, etc attributes in your object.

How does BitSet solve the memory consumption?

In short, BitSet saves the booleans as bits. Internally it uses a long[] for that. It has two additional attributes, a boolean and an int. I ignore any constants here, of course.

Size comparison boolean[] versus BitSet

Let’s have a higher calculation size of 1000 booleans.

boolean[1000]: 4 bytes pointer + 16 bytes header + 8 bytes special array header + 1000*1 boolean = 1028 bytes/1000 = 1.028 bytes per boolean

new BitSet(1000): 4 bytes pointer + 16 bytes header + 16 byte for boolean, int, pointer to long[] and padding + long[16] to save 1000 bits  (24+ 8*16=152 bytes) = 188 bytes/1000 = 0.188 bytes per boolean

For 1000 booleans BitSet saves 840 bytes. You can have 5 BitSets for one boolean[]! Of course with more booleans this will increase even further.

What does it cost?

To map the booleans to bits we need some CPU cycles. But the overhead is quite low, to be honest: some bit shifting here, some range checks there. Of course a plain boolean array access is faster in terms of micro optimization. But the memory benefit probably outweighs the CPU cost in most cases.

Wait what else does BitSet offer?

BitSet offers all methods you can think of for boolean.

  • Automatic size increase. The internal array at least doubles whenever it has to grow!
  • Setting bits to true/false in range
  • Finding next/previous clear(false) or set(true) bit
  • inverting bits
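  • logical calculations with other BitSets! (AND, OR, XOR, AND NOT, intersects…) – be aware that a BitSet is mutable. and, or, xor and andNot change the instance you call them on and return nothing, so clone() it first if you need to keep the original.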
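
A short sketch of these methods in action. The class and method names are made up; note that in java.util.BitSet the logical operations (and, or, xor, andNot) modify the instance they are called on and return nothing.

```java
import java.util.BitSet;

class BitSetDemo {
    static BitSet demo() {
        BitSet a = new BitSet();
        a.set(0, 4);                            // bits 0..3 set (end index is exclusive)
        a.clear(1);                             // a is now {0, 2, 3}
        System.out.println(a.nextClearBit(0));  // 1
        System.out.println(a.nextSetBit(1));    // 2

        BitSet b = new BitSet();
        b.set(2);
        b.set(5);                               // the BitSet grows as needed
        System.out.println(a.intersects(b));    // true, both contain bit 2

        a.and(b);                               // changes a in place, b stays untouched
        System.out.println(a);                  // {2}

        a.flip(0, 3);                           // inverts bits 0..2
        return a;                               // {0, 1}
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```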

When should you use it?

Here are some rules of thumb:

  • When it makes your code more readable
  • When you benefit from using the methods, instead of implementing them yourself (don’t reinvent the wheel)
  • When you would otherwise have a lot of booleans in an array
  • When you do logical calculations (XOR, OR, AND)

When shouldn’t you use it?

  • When every nanosecond in read access counts more than a lot of memory usage
  • When it’s more obvious to use a boolean[] for the domain you are in
  • When your boolean array size is only in the hundreds. Then the memory advantage is only minimal, so it’s probably not worth it.


  • In general favor BitSet over boolean[], like you would favor ArrayList over []. Except that BitSet is not just a wrapper.
  • The memory usage of boolean[] is quite high compared to BitSet. BitSet has a 5 times (1000 booleans) to nearly 8 times memory advantage, depending on the size.
  • On smaller sizes use what makes the code more readable
Posted in Unknown feature

Linux – a fight against prejudices

Reaction on Linux

If I tell someone that I use Linux, I get many prejudiced reactions:

  • being laughed at
  • being told that I spend too much time getting the OS running
  • being taken as a joke

The reality on Linux

My current situation

A year ago I switched my main desktop PC at home to Linux (Xubuntu at first, now Mint). For the time being I still had Windows 7 installed, but I rarely used it. Some months later, after not using Windows at all, I removed it completely and converted the remaining partitions to ext4.

What do I miss?

  • Nothing, and apart from that
  • that some applications don’t support Linux. But then I either don’t buy them or, if I have to, use Wine to get them running.
  • that some hardware vendors don’t support Linux. That only happens rarely nowadays and mostly for peripheral devices.
  • better graphics driver support

What I don’t miss?

  • Viruses / Virus scanner
  • adware bloated applications (Linux has only a few, at most)
  • Install Windows updates now/automatic restart
  • changing USB ports because it isn’t recognized any more
  • Do you want to switch to Windows Basic Theme
  • Pressing Shift x times message
  • Changing keyboard layout by default by pressing alt shift l
  • Changing display rotation by shortcut
  • Internet Explorer, which is used built-in in many applications
  • Windows 8
  • The monopoly on document file formats
  • and probably many more 🙂
  • Being administrator by default like in Windows XP

What do I like on Linux?


You are free to customize your Linux the way you like it. Beginning with the distribution flavour, desktop design, terminal software, software versions (ranging from bleeding edge to hardened) and many more possibilities.

Free software?

If we talk about freedom, then of course we have to talk about free software. Sadly, what many people don’t get is that the main focus is not on free in terms of money, but free in terms of source code, aka open source. I have no problem paying money for an application. I often do that when buying Linux games on Steam or Humble Bundle.

Open Source has the big advantage that you can see for yourself what the code does, change it or fix it. Or you can use existing software to enhance your own application.


I like how you can write commands, have auto-completion and have a wide variety of different terminal applications. I like to use mplayer to play my videos and music.

And many more…

  • Many simple tools for daily jobs
    • Simple Scan. There’s probably no easier scanner software out there.
    • youtube-dl. Simple command line tool to download youtube videos.
  • plenty of games. Yet there could be more. Steam counts 1132 games.
    • Steam box on its way with more native games to come
    • Vulkan, the new graphics API succeeding OpenGL
  • network configuration profiles
  • Multiple workspaces (coming with Windows 10)
  • Desktop Notifications service (coming with Windows 10)
  • file system organization
    • separation of user content and everything else is already in the data structure
      • easy to backup a user
  • no gui needed for servers


Linux is a cool, feature-rich, customizable OS. It is a real competitor to all the other operating systems out there. Even if only 1.5 % of users worldwide are using Linux, in absolute numbers that is still a lot of people.


Posted in Linux

Autoboxing Performance


Because of my previous post about autoboxing I get many search queries about autoboxing performance. Now I’m curious, too. So let’s get ready for some benchmarks.

When we talk about boxing performance, then we always have two sides. Throughput and memory consumption.

Throughput with Arrays

For micro benchmarking I use JMH framework, which I already described in my other post here.

To have a bigger test set I create an array of 2 million entries. We measure the throughput (operations/second).
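
To give an idea of what such a comparison measures, here is a rough sketch of three variants: boxing only, boxing plus unboxing, and pure primitives. The class and method names and the method bodies are my own assumptions for illustration and are not the benchmark code behind the numbers in this post. In a JMH project each method would additionally carry the @Benchmark annotation.

```java
class BoxingVariants {
    static final int SIZE = 2_000_000;

    // Autoboxing on every write: int -> Integer
    static Integer[] boxInts() {
        Integer[] values = new Integer[SIZE];
        for (int i = 0; i < SIZE; i++) {
            values[i] = i;              // compiles to Integer.valueOf(i)
        }
        return values;
    }

    // Boxing on write plus unboxing on read
    static long boxAndUnboxInts() {
        Integer[] values = boxInts();
        long sum = 0;
        for (Integer value : values) {
            sum += value;               // compiles to value.intValue()
        }
        return sum;
    }

    // No wrapper objects at all
    static int[] primitiveInts() {
        int[] values = new int[SIZE];
        for (int i = 0; i < SIZE; i++) {
            values[i] = i;
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(boxAndUnboxInts());
    }
}
```

Every method returns its result so that the work cannot be thrown away as dead code.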


The result (ops/s, higher is better)

Benchmark                  Score      Error
testIntegerBoxing          176 919    ± 5 843
testIntegerInclUnboxing    193 915    ± 4 597
testPrimitiveIntegers      521 617    ± 5 347

Primitives seem to be 2-3 times faster than their Integer equivalents. But wait a minute, why is boxing plus unboxing faster than boxing alone? The reason seems to be how Integer is handled for the array and the return value.

Still, we can create nearly 200 k arrays of 2 million entries each in one second using autoboxing and Integer arrays! But we are very deep in the basic operations here, where some milliseconds can have a big impact on the overall application performance.

Now I’ve done the same with the byte datatype. The result is the following:

Benchmark               Score        Error
testByteBoxing            398 484    ±  3 886
testByteInclUnboxing    1 029 260    ±  7 629
testPrimitiveByte       1 025 612    ± 10 827

That seems even weirder. The performance loss here comes only from handling Byte[] and returning it instead of byte[]. Boxing itself has no performance penalty here, because all Byte values are cached inside the JVM. The same value range (-128 to 127) is cached for the other integral wrapper types, too.
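
The cache can be made visible with identity comparisons. This is only a small sketch with made-up class and method names; it relies on the fact that autoboxing goes through valueOf, which has to return cached instances for values from -128 to 127.

```java
class WrapperCacheDemo {
    static boolean sameByteInstance() {
        Byte a = Byte.valueOf((byte) 42);
        Byte b = Byte.valueOf((byte) 42);
        return a == b;          // identity comparison: same cached instance
    }

    static boolean sameSmallIntegerInstance() {
        Integer a = 127;        // autoboxing calls Integer.valueOf(127)
        Integer b = 127;
        return a == b;          // guaranteed to be the same cached instance
    }

    static boolean equalLargeIntegers() {
        Integer a = 1000;
        Integer b = 1000;
        // a == b is not guaranteed outside the cached range,
        // so always compare values with equals
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(sameByteInstance());
        System.out.println(sameSmallIntegerInstance());
        System.out.println(equalLargeIntegers());
    }
}
```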

Memory consumption

Explanation of WORD

The memory consumption of objects and data types in the JVM is pretty straightforward. The most important factor is the architecture data type “WORD”. The WORD size depends on your system architecture and operating system. On most modern PCs this is currently either 64 bit or 32 bit. It defines what chunk size the CPU can handle.

Wrapper Types

Boolean, Byte

If autoboxing (or manual valueOf) is used there is no need to create a new instance, because they are already cached in the VM. Every JVM has to cache this.

But we still need a pointer (“oop”) to the instance.

32bit: 4 byte pointer
64bit: 4 or 8 byte pointer (see Hotspot enhancements)

Short, Integer, Character, Float

-128 to 127 is cached. Character is unsigned, so only 0 to 127 is cached. Float and Double values have no cache.
Everything else always yields a new instance. A Short or Character variable needs 4/8 bytes, too, because it has to fit the word size (padding). The same applies to int in a 64bit environment.

32bit: 4 bytes pointer + 8 bytes instance header + 4 bytes content = 16 bytes
64bit: 4/8 bytes pointer + 16 bytes instance header + 8 bytes content = 28/32 bytes

Long, Double

32bit: 4 bytes pointer + 8 bytes instance header + 8 bytes content = 20 bytes
64bit: 4/8 bytes pointer + 16 bytes instance header + 8 bytes content = 28/32 bytes


All data types will have at least word size. Therefore…

boolean, byte, short, char, int, float

32bit: 4 byte
64bit: 8 byte

long, double

32bit, 64bit: 8 byte

primitives in arrays

In arrays primitives retain their size, but they still have to fit to the WORD size.

Some examples:

boolean[299]: 299 * 1 +12/24 header + 4/8 pointer = 315/331 bytes + padding = 316 / 336 bytes
int[123]: 123 * 4 + 12/24 header + 4/8 pointer = 508/524 + padding = 508/528 bytes

object sizes

Of course objects have to fit to 4/8 bytes. But internally attributes can retain their sizes. That means primitives have their original 1 to 8 byte and objects probably have a reduced 4 byte pointer in 64 bit systems (see Hotspot enhancements).

For example:
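
A minimal sketch of a class that would match the field list in the calculation that follows. The class name, the field types and the name of the short field are assumptions made for illustration.

```java
import java.util.Date;

class User {
    int id;             // 4 bytes
    String name;        // reference: 4 or 8 bytes
    Date lastLogin;     // reference: 4 or 8 bytes
    boolean enabled;    // 1 byte
    short loginCount;   // 2 bytes (hypothetical field name)
}
```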


Looking at the flat memory consumption (ignoring the object attributes itself) we have following consumption:

32bit: 4 (id) + 4 (name) + 4 (lastLogin) + 1 (enabled) + 2 (short) = 15 + padding = 16
most 64bit: 16
rare 64bit: 4 + 8 + 8 + 1 + 2 = 23 + padding = 24

all with wrapper instead of primitives:

32 bit: 5 * 4 = 20
most 64 bit: 5 * 4 = 20 + padding = 24
rare 64 bit: 5 * 8 = 40

Hotspot enhancements

Compressed Oops

This feature reduces the pointer size on 64bit systems from 8 bytes to 4 bytes if the heap is smaller than 32 GB. This works because the pointer no longer refers to the hardware address. Instead, the pointer is an offset from the start of the heap space, and it counts in units of 8-byte-aligned objects rather than single bytes.


Boxing is still quite fast. I suggest using primitives for mandatory attributes in an object, for heavy CPU calculations and as a last resort in bottleneck fights.

I wouldn’t recommend using primitives for optional attributes, because that creates evil magic numbers. Of course, in very rare cases where every tiny bit of performance counts, this might be needed too.

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.” (Donald Knuth)

The difference on a higher scale

Let’s say we have a JEE application with 5 million instances where the mandatory id is an Integer instead of a primitive int.

Assuming we have a 64bit system with 4 byte pointers:
Integer: 28 * 5 million = 140 000 000 bytes
int: 4 * 5 million = 20 000 000 bytes

That makes a difference of 120 MB we can save.

Computer setup

CPU: 4 core Intel(R) Core(TM) i5-4670K CPU @ 3.40GHz
Java: Java 8 update 25 64bit
OS: Linux Mint 17.1 64bit
Kernel: 3.13.0-37-generic
IDE used: Eclipse Luna

Posted in Java, Performance

Micro benchmarking with OpenJDK’s JMH

Micro benchmarking in Java

What is micro benchmarking?

It’s for benchmarking very small parts of logic, as small as ArrayList.add or String.split.

Why do we have a problem with micro benchmarking?

Often you encounter some blog post where someone claims that x is faster than y. It is faster in the execution the author has done, but there are many circumstances to take into consideration before you really can say that x is faster than y.
Especially for Java, where we have a Virtual Machine, JIT, GC, etc, which makes getting scientific micro benchmarking results rather difficult.

Solution – use JMH framework for micro benchmarking

There are some tools out there, which make it easy to write micro benchmarkable code. I will here focus on jmh. This tool is developed by the openjdk team. So we can expect that they know the VM internals and its pitfalls.

As described on their main page just grab the maven archetype “jmh-java-benchmark-archetype” and you are good to go. An archetype is like a blueprint for your maven project.

You can integrate it into eclipse like any other maven project. Just be aware that validation is only done if you build the project with maven. The validation is very important. Otherwise you start the application out of your IDE and wonder why it doesn’t work as expected.

micro benchmarking

After the archetype is generated you have a class named MyBenchmark with a method testMethod. It looks like:


Let’s start micro benchmarking

To start you can just fill the provided testMethod with code to benchmark. For example:

You may ask why I changed the method signature. It’s important to return the value; otherwise we have dead code, which will be removed by the JIT compiler.

Now build it

Now execute “mvn clean install” to build it with maven, which additionally does some validation regarding jmh. So if something goes wrong, let maven build it.

Or you can just press save in eclipse. 😛

Execute it

Use either “java -jar target/benchmarks.jar”.

Or execute the main method org.openjdk.jmh.Main

Now the benchmark is executed in the default configuration. This takes pretty long, because many warmup iterations and many measurement iterations are done. But with that setup you are ready to go and do some simple benchmarking.

Enhance your benchmark

There are several ways to enhance your benchmarks. First of all I recommend the samples provided by jmh. They are really cool and explain themselves in their javadoc. Yet I will go into details of some points.

Make your own configuration of how to micro benchmark

20 warmup iterations, 20 measured iterations. Man that takes time. I don’t need it to be that exact. So let’s do our own configuration. Just create our own micro benchmark configuration using a main method. Let maven build again and we are ready to go 😉


Everything you see in the first lines of the console can be changed in the main method. For example:

Here we use AverageTime instead of Throughput. So we get the execution time of the method.
It is important to change the time unit too; otherwise you may get 0 seconds everywhere, because the time unit is too coarse.

But imo throughput mode is better for micro benchmarking, because the execution time often goes down to nanoseconds, which is just too fast to measure right.

Add another method for micro benchmarking

Just add another method and you can directly compare two different implementations. It’s important to remember that you have to rebuild with maven whenever you change the benchmark!

Don’t forget in Throughput mode higher is better.

Now we can conclude from this micro benchmark that splitting with apache commons is faster than Java split for a two-char separator. But for a one-char separator Java’s core implementation performs similarly.

Code for above benchmark:


Posted in Tooling

is now

Standing still is going backwards. That’s why I decided to move to my own domain and take matters into my own hands.

If you miss a feature on the website you can leave a comment here any time.

Posted in Uncategorized

Singleton Pattern

The singleton pattern is one every developer must know. It is a class which has only one instance, which is handled inside it. You can only get the instance, but never (reflection etc excluded) create a new instance.
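
A minimal sketch of the classic form, assuming a hypothetical Configuration class: the constructor is private and the one instance is handed out by a static method.

```java
final class Configuration {
    // The single instance, created when the class is initialized
    private static final Configuration INSTANCE = new Configuration();

    // Private constructor: nobody else can call new Configuration()
    private Configuration() {
    }

    static Configuration getInstance() {
        return INSTANCE;
    }

    public static void main(String[] args) {
        // Both calls return the very same object
        System.out.println(getInstance() == getInstance());
    }
}
```

Since class initialization is thread-safe in Java, this variant needs no extra synchronization for creating the instance.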


  • It can be shared between threads, because the state is handled in the one instance (the instance itself still has to be thread-safe)
  • The implementation is well encapsulated.
  • single point of access


  • leads to a god class with too much code and too many responsibilities
  • Unit-Testing can be difficult
    • you may need a setter method for the instance or mocking framework
  • Hides dependencies
  • The possibilities of extensions are limited.

Now what?

The question is do you really need a singleton?

In some cases you would instantiate it only once anyway (e.g. in a field declaration of the calling class). Then just make it a normal class. “You ain’t gonna need it” (the singleton).

You can create a usual interface and implementation and as default implementation use the singleton you would have used. Then inject it where you need it.
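
This could be sketched as follows. All names (TimeSource, SystemTimeSource, SessionService) are made up for the example: the singleton is only the default implementation of an interface, and the calling class receives it through its constructor.

```java
interface TimeSource {
    long now();
}

// The singleton you would have used, now just one implementation of the interface
final class SystemTimeSource implements TimeSource {
    static final SystemTimeSource INSTANCE = new SystemTimeSource();

    private SystemTimeSource() {
    }

    @Override
    public long now() {
        return System.currentTimeMillis();
    }
}

class SessionService {
    private final TimeSource timeSource;

    // Default: use the singleton
    SessionService() {
        this(SystemTimeSource.INSTANCE);
    }

    // Injection point: tests can pass in their own implementation
    SessionService(TimeSource timeSource) {
        this.timeSource = timeSource;
    }

    boolean isExpired(long expiresAtMillis) {
        return timeSource.now() >= expiresAtMillis;
    }
}
```

A unit test can then call new SessionService(() -> 100L) and gets a fixed time without any setter on the singleton or a mocking framework.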

Don’t get me wrong, I really like the Singleton Pattern, but you should ask yourself if it is really needed. Maybe it can be avoided. Here you can maybe find some more alternatives.

Posted in common practice, Java, tips

Use Jaxb where possible

I was asked in a job interview if I know something about DOM or SAX parsing. I know something about it, but I have only used it once. Why should I bother with this when I can just use JAXB?
What is JAXB? I would say it is an XML object mapper.
JAXB is the simplest way to parse and write XML files. Since Java 6 it is even included in the Java Runtime. JAXB of course needs a schema as reference; otherwise how should it know its mapping?

To use it you need data classes that have JAXB annotations. The easiest way to get them is to let JAXB generate them for you. You can even automate this process in your build with various maven plugins.

And what about the performance? The performance is great; it can even outperform DOM. SAX is still faster, but it may not be worth the hassle to get the performance benefit. Yes, if you have many files or very big files to process, then SAX is the better choice because of on-the-fly processing and lower overhead, but usually you don’t need that.


Of course for simple XML files JAXB is too much. Here even I would just parse it using SAX.

Use JAXB where possible, especially for complex XML/XSD files. For simple files I would still use SAX, but JAXB can also be a good option.


Update 1

Clarified when SAX is useful.

Posted in Java, tips

Java 7 is here… it’s about time

After a long time Java 7 is finally here. It has many features.

The most interesting features in my opinion are:

  • NIO 2, which is a real revolution for Java IO
  • Syntax extensions: automatic resource cleanup (instead of close in finally), String in switch, binary literals, underscores as separators in numeric literals and shorter generics with the diamond operator
  • a new garbage collector (G1), although it is not yet the default
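
The syntax extensions from the list above, put together in one small sketch. The class, method and constant names are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class Java7Syntax {
    static final int MASK = 0b1010;         // binary literal, decimal 10
    static final int MILLION = 1_000_000;   // underscores as separators

    // try-with-resources: the reader is closed automatically
    static String firstLine(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // String in switch
    static String describe(String command) {
        switch (command) {
            case "start": return "starting";
            case "stop":  return "stopping";
            default:      return "unknown";
        }
    }

    // Diamond operator: the type argument is inferred on the right-hand side
    static List<String> names() {
        List<String> names = new ArrayList<>();
        names.add("duke");
        return names;
    }

    public static void main(String[] args) {
        System.out.println(firstLine("first\nsecond"));
        System.out.println(describe("start"));
        System.out.println(MASK + " " + MILLION + " " + names());
    }
}
```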

You can download it from Oracle’s download site for developers:

The default user download site still points to Java 6.

Posted in News