# Just-in-time Compiling with Numba

This week I worked on implementing a numba version of `PerfForesightConsumerType` in `ConsIndShockNumba.py`. The class `PerfForesightConsumerTypeNumba` is a child class of `PerfForesightConsumerType`, inheriting most of the existing structure but changing the solver to a `numba` implementation called `ConsPerfForesightSolverNumba`.

The class `ConsPerfForesightSolverNumba` itself inherits from the standard `ConsPerfForesightSolver` class, but modifies certain methods to allow for `numba` just-in-time compiling. In particular, the methods `makePFcFunc` and `addSSmNrm` are computationally intensive and would benefit most from jit-compiling.
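A minimal sketch of this inheritance pattern (the method bodies here are purely illustrative stand-ins, not HARK's real implementations):

```python
class ConsPerfForesightSolver:
    """Stand-in for the standard solver class (illustrative only)."""

    def makePFcFunc(self):
        # pure-Python implementation in the standard solver
        return "python"


class ConsPerfForesightSolverNumba(ConsPerfForesightSolver):
    """Inherits everything, overriding only the computationally intensive methods."""

    def makePFcFunc(self):
        # in the real class, this would delegate the numeric work
        # to a globally jitted helper function
        return "numba"


solver = ConsPerfForesightSolverNumba()
result = solver.makePFcFunc()
```

Everything not overridden falls through to the parent class, so only the hot paths need a `numba` rewrite.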

First, I created a global jitted function called `makePFcFuncNumba` that takes only primitives and (`numpy`) arrays. While most of the code can stay as in `makePFcFunc`, `numba` implementations of `numpy` ufuncs can only take `numpy` arrays, not lists. This is easily fixed by replacing any `[x,y]` with `np.array([x,y])`. Any assignments to objects, or code that cannot be implemented in `numba`, are left to the class method to take care of, such as creating a `LinearInterp` instance for `cfunc`.

Implementing `addSSmNrmNumba` took a bit more work. This method calls `cfunc`, which is an instance of `LinearInterp`, to find roots using `scipy.optimize.newton`. Again, `numba` does not (yet) support user-defined classes (although there is `@jitclass`, which could be interesting to look into) or non-jitted functions such as those from `scipy`, so we need alternatives to both. As a replacement for `LinearInterp`, I used interpolation.py from EconForge. This replacement is straightforward: instead of `cfunc(m)`, I take the parameters that created `cfunc` and use `interp(mNrmNow, cNrmNow, m)`, which returns the interpolated value of the consumption function at `m`.

In the standard (non-`numba`) implementation of `ConsPerfForesightSolver`, `scipy`'s `newton` takes two parameters, `func` and `x0`. According to the documentation, this means that without `fprime`, `scipy` defaults to the secant method. Fortunately, I know just the library that has jitted optimization functions: QuantEcon. I thus use `quantecon.optimize.newton_secant` in place of `scipy`'s `newton` for root-finding. Another shortcoming of `numba` is that lambda functions cannot be passed as arguments to jitted functions, such as QuantEcon's `newton_secant`. Thus I create an additional function, `searchSSfuncNumba`, to be used by `newton_secant`.

These small changes create significant gains in speed, even though I haven't parallelized or used GPU computing, only just-in-time compiling. For some basic benchmarks on my system (`Intel Core i7-8565U CPU @ 1.80GHz, 1992 Mhz, 4 Cores, 8 Logical Processors, 16.0 GB RAM`), see `PerfForesightConsumerType.ipynb` and `PerfForesightConsumerTypeNumba.ipynb` for a comparison. Using `%timeit`, the standard solution reports `234 ms ± 82 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)`, while the numba-fied solution comes in at `63.1 ms ± 1.36 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)`, meaning the numba version is almost 4 times faster. It is important to note, however, that `%timeit` runs several loops of the same command and reports averages, but with numba implementations the first call always takes significantly more time, as it performs just-in-time compiling. So if we were to compare first run to first run, the numba implementation would be significantly slower. Nevertheless, `PerfForesightConsumerType` is a simple model, and as a proof of concept this exercise was instructive.

I will next work on implementing a numba version of `IndShockConsumerType`. Perhaps with a more complex model the gains from using numba will be more apparent, even at first run.