Implementing and Using Memoization in PHP
Memoization is a simple optimisation technique to understand and in most cases implement. The base idea is to speed up function calls by avoiding the re-calculation of previously processed input results (very cache-like). Storing these results in a key-value lookup store can result in major speed increases when repetitive function calls occur. Dynamic programming algorithms such as the Knapsack problem benefit greatly from using this technique. However, the trade off is that due to its caching underpinnings, functions with side-effects and reliance on external factors such as the current time may return incorrect results. Having spent sometime looking into languages such as Groovy which include this functionality out of the box, I wished to see if it was possible to create an implementation in PHP. Further research found php-memoize which looks like a great C module, showing promise in implementing the concept at a language library level. Though I am impressed with the discussed module I wished to see if it was possible to create an implementation in PHP userspace.
Below depicts a simple example implementation of a memoization function which takes advantage of PHP’s first-class function support.
Using a static ‘cache’ array allows us to keep a persistent lookup table between calls of the returned function. In conjunction with this, hashing the serialised arguments array allows us to create a unique key we can set and lookup in the associative array per call. The method supports either anonymous or string identified functions, using the ‘call_user_func_array’ method.
Now that we have an implementation it is time to benchmark its performance, to see the resulting gains. Below is a simple function which prints out the resulting time duration taken for the supplied function to run.
We can now use the above function to benchmark the performance of the side-effect free ‘sleepz’ function declared below.
As you can deduce from the output, the second call to the memoized function is significantly quicker than the first. Comparing the non-memoized and first memoized function calls results in very similar time duration, as they are doing the same work. Once the function calls result has been stored in the memoized cache, the second call simply needs to lookup the result hit and return the contents. At an API level this call looks the same as any other, but thanks to the cache, performance gains occur.
One issue that appears when you spend a little time with the above implementation are recursive function calls. Unlike other languages this high-level implementation is unable to rewrite internal function calls, however, to get around this we are able to use PHP’s first-class function support. Declaring use of a reference to the function assignment variable successfully allows us to recursively call the memoized implementation. This detail now requires the code logic be present in an anonymous function. As this is a thought exercise rather than a production ready implementation I am happy with the capabilities currently available in the language.