(Note: This is the second part of a whitepaper I wrote a couple of years ago which I never had a chance to publish in full. You’ll find the first part here.)
Although the functional programming community struggles to agree on an exact definition of what makes a language inherently functional, an agreed common ground exists:
- Computation via functional composition
- Higher-order functions
- Favouring immutability
Functional languages are also declarative, which means they focus on what operations should be executed, rather than how they should be executed. This distinction immediately puts them at odds with the dominant object-oriented/imperative approach to software development and more in line with other declarative languages like SQL, HTML, CSS and the formulae of Excel.
Not surprisingly, functions take centre stage in the definition of functional languages. While most languages offer similar ways of creating named blocks of code (e.g., methods, procedures, subroutines), functional languages define the term “function” very specifically. In this world, a function:
- Creates no side effects
- References no global state
- Is referentially transparent
Here is a simple Java function that violates the strict definition of pure functions by outputting content to the console, which is a form of side effect.
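A minimal sketch of such a function, assuming a currency-conversion example; the class name `ImpureConverter`, the method `convert` and its parameters are hypothetical:

```java
class ImpureConverter {
    // Converts an amount at the given rate and returns the result,
    // but also writes to the console along the way.
    static double convert(double amount, double rate) {
        double converted = amount * rate;
        // Side effect: this output is not visible in the returned value.
        System.out.println("Converted " + amount + " at rate " + rate);
        return converted;
    }
}
```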
A Java function with a side effect.
A side effect is any change to the execution environment that is not reflected in the value the function returns. Other common examples of side effects include updating databases or other data stores within the function. Code that produces side effects is much harder to reason about because the scope of change the function can produce is so wide.
Here is another version of the previous function that removes the side effect but accesses global state in the form of a collaborator that provides the current exchange rate.
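A minimal sketch of this version, in which a hypothetical `ExchangeRates` class stands in for the collaborator; all names and the rate value are assumptions made for illustration:

```java
// Hypothetical collaborator: holds the current exchange rate as shared,
// mutable state that any part of the program can change.
class ExchangeRates {
    static double currentRate = 1.25;
}

class StatefulConverter {
    // No console output any more, but the result depends on state
    // that is not passed in as an argument.
    static double convert(double amount) {
        return amount * ExchangeRates.currentRate;
    }
}
```

Calling `convert` with the same argument can give different results depending on what `ExchangeRates.currentRate` holds at the time of the call.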
A Java function using non-local state.
As with side effects, to reason about how this function works, we also need to understand how the collaborator behaves, increasing the complexity of the problem significantly.
The final version of this function is one that includes no side effects and references no global state.
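A minimal sketch of this version, again with hypothetical names; the exchange rate is now an ordinary argument rather than something looked up elsewhere:

```java
class PureConverter {
    // The result depends only on the two arguments:
    // no output, no lookups, no shared state.
    static double convert(double amount, double rate) {
        return amount * rate;
    }
}
```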
A pure Java function.
This function is therefore referentially transparent, which means it will return exactly the same value every time it is evaluated with the same arguments, so any call can safely be replaced by its result. Referential transparency offers enormous benefits to both the programmer and compiler in terms of caching, parallelisation and other performance optimisations.
The absence of side effects and global state is what classifies a function as a pure function, and languages that provide only pure functions are called pure functional languages (e.g., Haskell). In practice, many functional languages are impure, and developers make disciplined use of side effects for things like logging and database access. Pure functional languages like Haskell use constructs such as the IO monad (see http://www.haskell.org/tutorial/io.html) to perform these types of operations while maintaining referential transparency and avoiding the downsides of uncontrolled side effects.
Using functions as the centrepiece of computation is not the only key differentiator for functional languages. Functions are first class in these languages and, as such, can be passed as arguments to other functions, as well as returned as values from functions. A function that accepts other functions as arguments or returns a function as its result is called a higher-order function.
The judicious use of higher-order functions is what provides much of the expressiveness of functional languages and removes the need for much of the boilerplate code found in non-functional languages. A classic example is performing some arbitrary processing (e.g., sorting, filtering, collecting) on a list of values. Solutions in imperative languages require explicit looping constructs (e.g., a “for” or “while” loop) and references to the start and end of the list, with the processing logic buried inside the loop. Conversely, functional languages provide a number of higher-order functions that contain the logic necessary to iterate through the list. Calling these functions requires an additional function argument (a predicate, in the case of filtering) that performs the necessary business logic: all loop initialization, iteration and control is abstracted away within the higher-order function.
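The contrast can be sketched in Java itself, assuming Java 8 or later for lambdas and `java.util.function.Predicate`. The class `FilterExample` and its methods are hypothetical, and the hand-written `filter` stands in for the kind of higher-order function a functional language would supply as standard:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class FilterExample {
    // Imperative style: loop control and business logic are interleaved.
    static List<Integer> evensWithLoop(List<Integer> values) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < values.size(); i++) {
            if (values.get(i) % 2 == 0) {
                result.add(values.get(i));
            }
        }
        return result;
    }

    // Higher-order function: owns the iteration and takes the
    // business logic as a predicate argument.
    static <T> List<T> filter(List<T> values, Predicate<T> keep) {
        List<T> result = new ArrayList<>();
        for (T value : values) {
            if (keep.test(value)) {
                result.add(value);
            }
        }
        return result;
    }

    // The caller supplies only the predicate; no loop is visible here.
    static List<Integer> evensWithFilter(List<Integer> values) {
        return filter(values, n -> n % 2 == 0);
    }
}
```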
Below is a trivial example of how Java mimics some of the benefits of higher-order functions, by using interfaces (in this case java.util.Comparator) to provide function objects. In this example, the ListSorter parameter to the sortUsingComparator method exists only for its implementation of the compare method.
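A minimal sketch of how this might look. The names `ListSorter` and `sortUsingComparator` come from the description above, but how they are arranged, the enclosing `SortingExample` class and the choice to sort strings by length are all assumptions:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Function object: this class exists only to carry the compare logic.
class ListSorter implements Comparator<String> {
    @Override
    public int compare(String a, String b) {
        // Assumed ordering for illustration: shorter strings first.
        return Integer.compare(a.length(), b.length());
    }
}

class SortingExample {
    // The iteration and swapping live inside Collections.sort;
    // the caller supplies only the ordering, wrapped in an object.
    static List<String> sortUsingComparator(List<String> values, ListSorter sorter) {
        List<String> copy = new ArrayList<>(values); // leave the input untouched
        Collections.sort(copy, sorter);
        return copy;
    }
}
```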
Using Java interfaces to mimic a higher-order function.
Another key difference between functional and imperative/object-oriented languages comes from their respective approaches to state management. Imperative/object-oriented systems are built around the notion of ongoing changes to the state of the software at runtime. Conversely, functional systems emphasize immutability of state and seek to minimize the scope of state that is mutable.
At runtime, a Java system is a network of objects, each with state represented by the values of its attributes. Over time, different pieces of code will read and write these attributes, mutating the state of the object and its enclosing system. In order to understand what state a particular object is in, you need to understand what values its attributes contain.
By contrast, functional systems avoid mutable state wherever they can. At runtime there is no equivalent of a Java object’s attributes that are changed over time. “Variables” can be initialized with values, but once initialized they keep those values for as long as they exist. Re-assignment is not permitted, so the value of a variable doesn’t vary across its lifetime.
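Java can approximate this behaviour with `final` fields. The sketch below uses a hypothetical `Money` class whose fields are set once in the constructor; "changing" a value means building a new object rather than updating the old one:

```java
// Hypothetical immutable value class: no setters, all fields final.
final class Money {
    private final long cents;
    private final String currency;

    Money(long cents, String currency) {
        this.cents = cents;
        this.currency = currency;
    }

    long cents() {
        return cents;
    }

    String currency() {
        return currency;
    }

    // Returns a new Money instead of modifying this one.
    // (Currency mismatch is ignored here to keep the sketch short.)
    Money plus(Money other) {
        return new Money(cents + other.cents, currency);
    }
}
```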
Immutability helps greatly when you need to reason about a piece of code within a system, and one of the earliest times this reasoning needs to occur is when you are writing test cases. If you need to test a Java method that takes two arguments and returns a single value (see below), the number of logical test cases is tied to the set of values the two arguments can contain. If that same method also uses the value of one of its object’s attributes, then all the values of that attribute now need to be considered. And the more this method relies on its state, the harder it is to reason about its behaviour.
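A minimal sketch of the two situations, using a hypothetical `DiscountCalculator` class; the names and the discount logic are assumptions made for illustration:

```java
class DiscountCalculator {
    private double discountRate; // mutable attribute, 0.0 by default

    void setDiscountRate(double discountRate) {
        this.discountRate = discountRate;
    }

    // Depends only on its two arguments: the test cases are
    // determined by the values of unitPrice and quantity.
    static double total(double unitPrice, int quantity) {
        return unitPrice * quantity;
    }

    // Also depends on discountRate: every test case must now
    // account for the attribute's value at the time of the call.
    double discountedTotal(double unitPrice, int quantity) {
        return unitPrice * quantity * (1.0 - discountRate);
    }
}
```

The second method returns different results for identical arguments whenever `setDiscountRate` has been called in between, which is exactly what enlarges the set of cases a test has to cover.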
Conversely, a function in a language such as Haskell will only act upon its arguments and return a result. Building the test cases for this function is a much smaller, quicker and simpler task.
Immutability is not a property uniquely associated with functional languages. Programmers can use discipline to create immutable data structures in any language (more on this in a later post) and several non-functional languages include immutable data structures by default or via third-party libraries.