Just-in-time Compiling with Numba

This week I worked on implementing a numba version of PerfForesightConsumerType in ConsIndShockNumba.py. The class PerfForesightConsumerTypeNumba is a child class of PerfForesightConsumerType, inheriting most of the existing structure but swapping the solver for a numba implementation called ConsPerfForesightSolverNumba.

The class ConsPerfForesightSolverNumba itself inherits from the standard ConsPerfForesightSolver class, but modifies certain methods to allow for numba just-in-time compiling. In particular, the methods makePFcFunc and addSSmNrm are computationally intensive and would benefit most from jit-compiling.
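As a minimal, self-contained sketch of this subclassing pattern (class names mirror the post, but the method bodies are placeholders, not HARK's actual code):

```python
# Placeholder sketch of the inheritance pattern described above; the
# real classes live in HARK's ConsIndShockModel / ConsIndShockNumba.py.
class ConsPerfForesightSolver:
    def makePFcFunc(self):
        # the standard (pure-Python) implementation
        return "standard solver"


class ConsPerfForesightSolverNumba(ConsPerfForesightSolver):
    # only the computationally intensive methods are overridden with
    # numba-backed versions; everything else is inherited unchanged
    def makePFcFunc(self):
        return "numba solver"
```

The child class pays the maintenance cost of numba only where it matters, while inheriting everything else.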

First, I created a global jitted function called makePFcFuncNumba that takes only primitives and (numpy) arrays. While most of the code can stay as it is in makePFcFunc, numba implementations of numpy ufuncs can only take numpy arrays, not lists. This is easily fixed by replacing any [x, y] with np.array([x, y]), as in this instance. Any assignments to objects, or code that cannot be implemented in numba, are left to the class method to take care of, such as creating a LinearInterp instance for cfunc here.

Implementing addSSmNrmNumba took a bit more work. This method calls cfunc, an instance of LinearInterp, to find roots using scipy.optimize.newton. Again, numba does not (yet) support user-defined classes (although there is @jitclass, which could be interesting to look into) or non-jitted functions such as those from scipy, so we need alternatives to both. As a replacement for LinearInterp, I used interpolation.py from EconForge. This replacement is straightforward: instead of cfunc(m), I take the parameters that created cfunc and use interp(mNrmNow, cNrmNow, m), which returns the interpolated value of the consumption function at m.

In the standard (non-numba) implementation of ConsPerfForesightSolver, scipy's newton takes two arguments, func and x0. According to the documentation, without fprime scipy defaults to the secant method. Fortunately, I know just the library that has jitted optimization functions: QuantEcon. I thus use quantecon.optimize.newton_secant in place of scipy's newton for root-finding. Another shortcoming of numba is that lambda functions cannot be passed as arguments to jitted functions such as quantecon's newton_secant, so I create an additional jitted function, searchSSfuncNumba, to be used by newton_secant.

These small changes yield significant gains in speed, even though I haven't parallelized or used GPU computing, only just-in-time compiling. For some basic benchmarks on my system (Intel Core i7-8565U CPU @ 1.80GHz, 1992 MHz, 4 cores, 8 logical processors, 16.0 GB RAM), see PerfForesightConsumerType.ipynb and PerfForesightConsumerTypeNumba.ipynb for a comparison. Using %timeit, the standard solution reports 234 ms ± 82 ms per loop (mean ± std. dev. of 7 runs, 1 loop each), while the numba-fied solution comes in at 63.1 ms ± 1.36 ms per loop (mean ± std. dev. of 7 runs, 10 loops each), meaning the numba version is almost 4 times faster. It is important to note, however, that %timeit runs several loops of the same command and reports averages, and with numba the first call always takes significantly longer, since that is when just-in-time compilation happens. So if we were to compare first run to first run, the numba implementation would be significantly slower. Nevertheless, PerfForesightConsumerType is a simple model, and as a proof of concept this exercise was instructive.

I will next work on implementing a numba version of IndShockConsumerType. Perhaps with a more complex model the gains from using numba will be more apparent, even at first run.

Alan Lujan
PhD Candidate in Economics

My research interests include poverty and inequality, household finance, and computational economics.