Wednesday, October 14, 2015

Null is optional

Null references are commonly used to express the absence of a value. Most programming languages allow them without any explicit syntax, and that causes problems. Let's take a look at the following Java method:

User findUser(String name);

What happens when a user with the given name can't be found? Returning null represents the concept of an absent value in probably the simplest way, but there is a specific risk: the interface doesn't reveal the possibility of a null reference. Why is that a big deal? Null references need handling, otherwise a NullPointerException may occur at runtime - a very common error.
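A minimal illustration of the risk, using the User and Address types that appear later in this post:

User user = findUser("johndoe");
// nothing in the signature warns that user may be null; if no user was found,
// the next line throws a NullPointerException at runtime
Address address = user.address();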

Generally, not all values in a system are nullable. Indicating exactly where handling is needed reduces both the risk and the unnecessary effort. Annotations or documentation help, but they're not ideal: they're easy to miss and lack compile-time safety.
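For example, a nullability annotation (available from several annotation libraries) documents the intent, but enforcing it still depends on extra tooling or on discipline:

@Nullable
User findUser(String name);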

An interesting alternative to address the issue is using the type system: the option type. Java 8, for instance, has introduced Optional: "A container object which may or may not contain a non-null value."

Optional<User> findUser(String name);
This interface communicates clearly that a user may not be present in the result and the client is forced to take that into account:
// findUser("jonhdoe").address(); compile error!
Handling the absent value:
Optional<User> user = findUser("johndoe");
if(user.isPresent()) {
 user.get().address();
}
Note that this example is intentionally kept somewhat similar to how null references are often handled. A more idiomatic way of doing the same:
findUser("johndoe").ifPresent(user -> user.address());

It's interesting to consider the effects of the pattern in the wider context of a system. With consistent use of Optional, it is possible to establish a powerful convention of avoiding the use of null references altogether. It transforms an interface from:

interface User {
  String name();
  Address address();
  BankAccount account();
}
to:
interface User {
  String name();
  Address address();
  Optional<BankAccount> account();
}
Given the convention, the client can safely assume that a user may not have a bank account and that it always has a name and an address. Such a practice also facilitates change: if address becomes optional at some point in the future, all client code will be forced to conform.
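A hypothetical call site shows the effect (assuming Address exposes a city() accessor):

// while address() returns Address directly:
String city = user.address().city();

// once address() returns Optional<Address>, the line above no longer compiles;
// the absent case has to be handled explicitly:
String city = user.address().map(Address::city).orElse("unknown");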

The Optional type is a nice fit for functional programming; its API simplifies implementation in many scenarios:

Optional<String> foo = Optional.ofNullable(some3rdPartyNullableVal);
String fooWithDefault = foo.orElse("No value.. take this default one.");
String guaranteedFoo = foo.orElseThrow(() -> new RuntimeException("I really can't work without a value though"));
if(foo.equals(otherOptional)) { 
  ...
}
Optional<Integer> fooInt = foo.map(Integer::valueOf);

Conclusion

Avoiding implicit nulls can have nice effects on a codebase, making it a much safer place to be. Disadvantages of overusing Optional? Optional is a container object, so the heap consumption needs to be considered in extreme cases. 3rd party libraries may also force the use of some null references.

An example of the use of Optional: Java 8's Stream API.
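Terminal operations like findFirst() or reduce(), for instance, return an Optional instead of null when there may be no result:

Optional<String> match = Stream.of("alpha", "beta", "gamma")
    .filter(s -> s.startsWith("b"))
    .findFirst();
match.ifPresent(System.out::println); // prints "beta"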

Monday, February 23, 2015

Sneaky exceptions

Streams and lambdas are powerful constructs that facilitate a functional style in Java 8. There is a specific inconvenience related to checked exception handling though. I'll illustrate the issue through a simple example: converting a collection of Strings - containing URLs - to a collection of URL objects using the Stream API.
List<String> urlStrings = Arrays.asList(
    "http://google.com",
    "http://microsoft.com",
    "http://amazon.com"
);

List<URL> urls = urlStrings.stream().map(URL::new).collect(Collectors.toList());
This would look nice but it won't compile: the constructor of URL throws a checked exception, MalformedURLException. The map method expects a Function, and its apply method - implemented by the lambda expression - doesn't declare any checked exceptions:
@FunctionalInterface
public interface Function<T, R> {
    R apply(T t);
}
Placing the stream operation into a try block or declaring the exception in the throws clause of the containing method wouldn't help - the Function / lambda expression itself won't compile. The only viable option is to handle the exception within the Function. This sacrifices quite a bit of brevity:
List<URL> urls = urlStrings
    .stream()
    .map(url -> {
        try {
            return new URL(url);
        } catch (MalformedURLException e) {
            throw new SomeRuntimeException(e);
        }
    }).collect(Collectors.toList());
Compact code is one thing, but the bigger issue is that the original exception can't be directly propagated - to be handled outside the stream, for instance. There is a pattern that may come in handy in such situations: the sneaky throw. The intent is to throw the original checked exception as is, but hide it from the method's signature - effectively imitating the characteristics of runtime exceptions. In our case this would be useful for defining Functions that may throw checked exceptions which propagate from the stream operation.

The first step is to define an alternative Function that declares checked exceptions.
public class Sneaky {
    @FunctionalInterface
    public static interface SneakyFunction<T, R> {
        R apply(T t) throws Exception;
    }
    ...
}
This allows us to write lambda expressions identical to normal Functions. It won't be useful in itself though: Stream.map() expects a Function implementation. We need a way to transform our SneakyFunction into a Function.
public class Sneaky {
    @FunctionalInterface
    public static interface SneakyFunction<T, R> {
        R apply(T t) throws Exception;
    }

    public static <T, R> Function<T, R> function(SneakyFunction<T, R> function) {
    ...
 }
}
Generics and type erasure can be used to hide the checked exception from the method signature and make it behave just like a runtime exception: at the call site the type parameter is bound to RuntimeException, so the compiler doesn't require the exception to be declared or caught, while the unchecked cast disappears at runtime and the original checked exception is thrown as is.
public class Sneaky {
    @FunctionalInterface
    public static interface SneakyFunction<T, R> {
        R apply(T t) throws Exception;
    }

    public static <T, R> Function<T, R> function(SneakyFunction<T, R> function) {
        return o -> {
            try {
                return function.apply(o);
            } catch (Exception e) {
                Sneaky.<RuntimeException>sneakyException(e);
                return null;
            }
        };
    }

    @SuppressWarnings("unchecked")
    private static <T extends Throwable> T sneakyException(Throwable t) throws T {
        throw (T) t;
    }
}
Transforming SneakyFunction to Function allows checked exception throws in the transformation code:
List<URL> urls = urlStrings.stream().map(Sneaky.function(URL::new)).collect(Collectors.toList());
This code compiles and runs fine, throws MalformedURLException when needed, and it's possible to propagate the exception easily. There are simpler and "safer" solutions for this specific problem of course, but that's not the point. The need for propagating checked exceptions from Streams is not uncommon, and this pattern can come in handy despite its controversies. The same pattern may be applied to other functional or pseudo-functional interfaces, like Consumer (see the sketch below). Sneaky throws are used in quite a few libraries, and some libraries support them directly, like Project Lombok.
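As an example, a Consumer variant could look roughly like this, following the same structure as the SneakyFunction above (the save() call in the usage line is just a hypothetical stand-in for any method that throws a checked exception):

@FunctionalInterface
public static interface SneakyConsumer<T> {
    void accept(T t) throws Exception;
}

public static <T> Consumer<T> consumer(SneakyConsumer<T> consumer) {
    return t -> {
        try {
            consumer.accept(t);
        } catch (Exception e) {
            // rethrown as is, hidden from the signature - same trick as above
            Sneaky.<RuntimeException>sneakyException(e);
        }
    };
}

// usage: urls.forEach(Sneaky.consumer(url -> save(url)));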

Friday, January 16, 2015

Functional, fluent JDBC - FluentJdbc

Some projects relying on a relational database do just fine using a high-level ORM; some need more low-level control over the queries. Using JDBC directly for application development can be really cumbersome. It's so low-level that it's better suited for tool development than application development.
 
There are many abstraction layer libraries over JDBC that add much needed syntactic sugar and resource handling safety. Even so, sometimes I find it difficult to find a suitable one for the applications I work on, either because they constrain in some way or have other unwanted properties (like transitive dependencies).

In some cases I implemented abstractions of my own, and a library started to emerge. I decided to collect those ideas and turn them into a library for easier reuse: one that makes the most common SQL querying operations as simple and declarative as possible, is light-weight, and has no dependencies. Most similar libraries don't make use of a fluent API, which opens up some interesting possibilities in terms of both readability and flexibility. I also wanted to support operations that work with large amounts of data (streaming, batching).

Some of FluentJdbc's features:
  • execution of select/insert/update/delete/alter/... statements as one-liners
  • parameter mapping (named, positional, supporting java.time, Optional, plugins for any types)
  • access to generated keys
  • result -> pojo mapping
  • transaction handling
  • big data (streaming select / batch insert)
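A rough taste of the fluent style (class and method names here are from memory and may differ from the current API; the github page has the authoritative examples):

FluentJdbc fluentJdbc = new FluentJdbcBuilder()
    .connectionProvider(new DataSourceConnectionProvider(dataSource))
    .build();
fluentJdbc.query()
    .update("UPDATE CUSTOMER SET NAME = ? WHERE ID = ?")
    .params("John Doe", 5L)
    .run();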
Check out FluentJdbc's github page for usage details. The artifact is up in maven central, if you want to give it a shot:
<dependency>
    <groupId>org.codejargon</groupId>
    <artifactId>fluentjdbc</artifactId>
    <version>1.3</version>
</dependency>
Note: Java 8 only

Tuesday, November 4, 2014

A common pitfall of Domain-Driven Design

The philosophy of domain-driven design - placing the primary focus on the core domain - is a very natural and powerful concept for application design. Despite the core idea being simple, the learning curve is apparently steep. DDD is very often implemented wrongly: a non-ideal architectural decision can ruin the potential of obtaining and taking advantage of a rich domain model.

There is a specific pattern I've come across surprisingly often in systems built on the notion of DDD. I think it's useful to know about this anti-pattern, and why/how systems end up implementing it, as it tends to steer a design in the wrong direction.

Given some domain, the development starts with a single layer including some prototype of a domain model. Then the need for some infrastructure comes into the picture - most commonly some sort of persistence solution. That's placed into another layer with the good intention of keeping responsibilities separate.
The result is this:
[Figure: Domain model prototype]
This won't work well; there is a problem with the dependency. The domain model depends on the infrastructure layer to do its job, and the infrastructure layer can't access the domain objects - to persist them, for instance - due to the direction of the dependency. There are many ways to solve this:
  • Pushing part of the model to the infrastructure layer. This is a responsibility leak, something to be avoided.
  • Defining transfer objects in the infrastructure layer. The domain will still be concerned with infrastructure specifics.
  • Let's assume a somewhat better alternative is chosen: moving the domain model to its own layer:
[Figure: Dedicated domain model]

The infrastructure trouble is solved, and the layer design of many applications stops at this point. In my experience it is a really common pattern. It looks like the real deal, with the domain model having a dedicated layer. So let's take a look at the issues:

The domain model can't access the infrastructure layer. As a result, parts of the domain model that rely on infrastructure - actually most of it - have to be moved up to the application service layer. Gradually - to keep things consistent and symmetrical - most logic will be moved out of the domain layer, leaving mere property bags behind with little or no logic. The end result of all the efforts is the classic case of an anemic domain model:
[Figure: Anemic domain model]
 
Where did things go wrong? As it turns out, at the very beginning, when the infrastructure layer was introduced. In DDD the domain model is supposed to have the main focus, yet it was made dependent on a layer with infrastructural responsibilities. Subsequent efforts try to address this, and while they succeed on the surface, sacrifices are made in the domain model.

The main problem was caused by the violation of the dependency inversion principle: a higher level module (domain) should not depend on a lower level one (infrastructure). This problem can be properly solved by applying dependency inversion:
[Figure: Rich domain model]

This simple change makes a world of difference: obstructions are out of the way of building a rich domain model that implements all the domain functionality. Other layers become thin - no domain logic in them - and can concentrate on their own responsibilities (e.g. application services or infrastructure integration details).
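In code, the inversion typically means the domain layer owns the abstraction and the infrastructure layer implements it; a minimal sketch with a hypothetical UserRepository:

// domain layer: the model defines the collaborator it needs
public interface UserRepository {
    Optional<User> findByName(String name);
    void save(User user);
}

// infrastructure layer: depends on the domain abstraction, not the other way around
public class JdbcUserRepository implements UserRepository {
    // JDBC-specific persistence details live here
    ...
}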

In the context of tackling high-complexity problems with DDD, the pattern described above is definitely something to look out for. It's very easy to compromise a domain model with constraints imposed by fundamental layering issues.

Monday, February 10, 2014

Coding style

Style has an important role when it comes to writing clean, readable code. Conforming to conventions and best practices is a good starting point. Beyond that there is still plenty of room for choice, and those choices depend greatly on the level of the developers and on personal preferences.

There is a very specific danger with conventions and best practices. Most of them are distilled experience of the industry and are there for good reasons. Some of them may be questionable but are rarely - if ever - questioned. They get incorporated into the coding style and, without a conscious effort to pinpoint them, become invisible. I think all developers should investigate their code from time to time and inspect even the most trivial elements for such issues.

A couple of things that I think are worth looking at:

Getter/setter methods
Since IDEs can generate them so easily, these are very often abused. Some developers have the habit of generating them right away, throwing away the most important element of object oriented programming - encapsulation - without further thought. Mutators are created on objects that could be immutable. The get...() prefix is often meaningless, and actually harms readability:
  person.getName();
  person.getAddress();
  person.getMobileNumber(); 
  
  person.name();
  person.address();
  person.mobileNumber();
Constant naming conventions
The convention is to use uppercase names, which can be quite unreadable - especially if there are many long-named constants in a class. Unless it's part of the API, the fact that a field is constant is an implementation detail.
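A small contrived comparison:
  // conventional uppercase naming
  private static final int MAXIMUM_CONCURRENT_DOWNLOAD_COUNT = 4;

  // arguably more readable when the constant is just an implementation detail
  private static final int maximumConcurrentDownloadCount = 4;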

Final keyword on local variables
Some static code analysis tools show warnings when local variables are not marked final if they've been assigned only once. While this adds value in some specific cases, applied to all local variables without reason the clutter it introduces can make the code harder to read. One can argue that their rigorous use is good in long and complex methods, but then the problem is the complex method in the first place.
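An illustration of the clutter in question:
  // every assigned-once local marked final:
  final String normalizedName = name.trim().toLowerCase();
  final int length = normalizedName.length();

  // the same without the keyword noise:
  String normalizedName = name.trim().toLowerCase();
  int length = normalizedName.length();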

Exception suffix
A really controversial one, but I think it's worth at least toying with the idea. Almost all exceptions have the ...Exception suffix in their names. This is actually not really necessary, and it does add some noise. Note that in whatever context we encounter an exception, the syntax already makes it obvious:
  • Definition of an exception - Something extends Exception
  • Throwing an exception - throw new Something
  • Declaration of an exception thrown - throws Something
  • Catching an exception - catch(Something)
Dropping the suffix works on exceptions that are named as events, not just plain nouns - a useful practice in its own right, by the way. Compare FileNotFound vs FileNotFoundException. Event-based names and the lack of suffix applied throughout can have nice effects in a codebase. Some libraries do this, e.g. javassist, bouncycastle.
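Dropping the suffix, with an event-style name, in the contexts listed above:
  public class FileNotFound extends RuntimeException { ... }

  throw new FileNotFound(path);

  catch (FileNotFound e) { ... }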

These are just some specific examples; the important part is to notice the things we do automatically and be conscious about the reason why we're doing them, making sure they actually make sense.

Wednesday, June 6, 2012

Refactoring legacy systems

Working on a greenfield project once in a while is nice and liberating. More often though, maintenance of legacy systems is the reality. People define the term "legacy system" differently. A naive - and common - definition is legacy = old. Since the industry evolves so quickly, old must be bad as a consequence. I think there is more to it than that. There are new systems - most of them actually - that can be considered legacy even before they're made.

Systems are built to satisfy requirements. Requirements have the tendency to change often. Ideally a system can be feasibly adapted to these changes. My definition of legacy is simple: when such adapting is not feasible. This can happen for various different reasons: bad internal quality, badly chosen or obsolete technology, compatibility constraints, and a lot more.

The pace at which systems turn into legacy can be controlled, and some improvements can be made even when the damage is already done. Compatibility constraints (like APIs) can be tough to challenge, and changing certain technologies may be difficult. Improving internal quality through continuous refactoring is usually available as a viable option. In refactoring, the technical to-do is usually the easy part. The hard part is how to do it in the context of the development process, especially if new functionality needs to be delivered continuously. In such a scenario, several ideas come to mind.

One is to make small changes at a time. This approach can be problematic if the development team is not disciplined enough: bad code is viral and tends to reinfect the already healed parts. Also, certain types of changes can't reasonably be split into several smaller ones. A bit more elaborate strategy is called for. An enticing option is to create bubbles: carefully separate a part of the system from the rest and refactor the bubble to its "ideal" state. Eventually bubbles grow, merge and take over the system. This is a most useful approach, and can result in success.

These bubbles have interesting pitfalls though. The anti-corruption layer that protects a bubble has its cost. If the bubble is left alone and not nurtured actively, the additional complexity it introduces becomes a burden on the system. In some cases it's worse to live with it than with the problems of the original system. Fragmented pieces of hanging refactorings can be expensive.

Friday, May 6, 2011

Mocking frameworks vs Unit Testing style

I've never paid much attention to what mocking framework I use. I've used EasyMock and JMock, and I observed no big difference between the two really. Recently I started to change some existing JMock unit tests to use Mockito instead. It seemed to have a more elegant interface, and it turned out this had implications for my tests beyond mere syntactic sugar. It actually gave some new insight on the subject.

A little toy code to illustrate this:
public interface GreetingFormula {
 String formula();
}

public class Greeting {
 private GreetingFormula gf;

 public Greeting(GreetingFormula gf) {
  this.gf = gf;
 }

 public String greet(String name) {
  return gf.formula() + " " + name;
 }
}
In my unit test for Greeting I want a mock object for the GreetingFormula interface. I want to verify that formula() has actually been called, and also want to define some return value for its method. A JMock implementation would look like this:
public class GreetingTest {
 private final static String formulaText = "Hello";
 private final static String name = "Mr. Doe";
 private final static String expectedGreeting = "Hello Mr. Doe";

 private Mockery mockingContext;
 private GreetingFormula greetingFormula;
 private Greeting greeting;
 
 @Test
 public void greetPutsNameIntoFormula() {
  mockingContext.checking(new Expectations() {
   {
    one(greetingFormula).formula();
    will(returnValue(formulaText));
   }
  });
  Assert.assertEquals(expectedGreeting, greeting.greet(name));
 }

 @Before
 public void setUp() {
  mockingContext = new JUnit4Mockery();
  greetingFormula = mockingContext.mock(GreetingFormula.class);
  greeting = new Greeting(greetingFormula);    
 }
}
The solution may look OK at first glance, but there is a subtle issue here. In the expectations two very different concepts are mixed together: how the mock object should behave (return the right value), and what should be verified on the mock object afterwards (that it has been called exactly once). Let's take a look at how the same thing is done using Mockito.
public class GreetingTest {
 private final static String formulaText = "Hello";
 private final static String name = "Mr. Doe";
 private final static String expectedGreeting = "Hello Mr. Doe";

 private GreetingFormula greetingFormula;
 private Greeting greeting;
 
 @Test
 public void greetPutsNameToFormula() {
  when(greetingFormula.formula()).thenReturn(formulaText);
  assertEquals(expectedGreeting, greeting.greet(name));
  verify(greetingFormula).formula();     
 }

 @Before
 public void setUp() {
  greetingFormula = mock(GreetingFormula.class);
  greeting = new Greeting(greetingFormula);    
 }
}
Mockito enforces the clear separation of the concepts of mock behavior and verification by its design; there is simply no other valid way of doing it. This might not look like a big deal on a simple example like this, but in fact it is. Unit test code with a lot of mockery gains a lot of clarity simply by expressing these steps in the right order:

  • mock behaviour

  • perform test

  • mock verification