Optimizing of the plotA3A4 routine¶

In order to make the method plotA3A4 useful in a practical manner, it needs to be somewhat fast in its computation. Otherwise, one will not use it and instead use binning, or even worse another program! And as it is already mentioned in the documentation, the computation time for generating all pixels individually for the methods plotA3A4 and plotQPatches is too long for practical use. Thus, an optimization of the underlying algorithm made sense to pursue. Before changing any code, the end test was set up: By first running the old method and then the new, two results are generated and they are checked to be identical. Is so, and the with a speed-up, the goal is reached.

Thus, before headlessly trying to perform optimization, a test of the current speed is needed. The following is a dump of times for the un-optimized function using four files and one plane with 8 sub-planes (T0Phonon10meV.nxs, T0Phonon10meV90A4InterlaceA3.nxs, T0Phonon10meV93_5A4.nxs, and T0Phonon10meV93_5A4InterlaceA3.nxs, planes 8 through 15):

Function	Time [s]	Uncertainty [s]
testFiles	0.0003143	2.2458e-05
getA3A4	2.8371810e-05	1.0929e-05
getData	0.0561441	0.0118
concatINormMon	0.0405629	0.0068
A4Instr	0.0001758	2.9835-05
genPointsAndBoundary	0.2238357	0.0151
voronoiTessellation	14.070171	1.6175
calcCentroids	0.4847603	0.0366
sortPoints	25.009635	0.7448
calculateQ	3.4248633	0.0778
binPlanes	0.0295210	0.0007
genPatchesAndCollection	8.7695337	0.0920
plotter	0.1842771	0.0093

From this it is clear that three functions are to be looked at: voronoiTessellation, sortPoints, and genPatchesAndCollection.

Voronoi Tessellation subroutine¶

Individual functions are timed by the using my_timing_N decorator. This allows for a partitioning of the program into smaller pieces that are easier to be optimized individually. Also when changing code in these sub-functions, the test was of course that the output is identical to the old one. This method is not the optimal for speeding up the code as it is most probable that changing larger coding structures might allow for a quicker speed-up.

Below is a table of the resulting computational times for the tests used showing a clear speed-up of around 1.7 for real data.

Tests (5 runs)	Original [s]	Optimized [s]	Gain
100000 random points between 0 and 1	10.27 \(\pm\) 0.15	4.45 \(\pm\) 0.08	2.310 \(\pm\) 0.05
1000000 random points between 0 and 1	106.7 \(\pm\) 0.8	47.75 \(\pm\) 0.7	2.235 \(\pm\) 0.04
4 Data files and 8 planes `*`	10.93 \(\pm\) 0.08	6.3 \(\pm\) 0.2	1.72 \(\pm\) 0.06

*: set-up from above.

The speed-up comes mainly from two changes; when generating the return data the intersections points of all of the polygons where recalculated while only the polygons cut by the boundary needed to be calculated. Secondly, when testing if all data points are within the boundary a list comprehension is changed into the vectorized contains function from the shapely.vectorized sub-library.

Optimizing of the plotA3A4 routine¶

Voronoi Tessellation subroutine¶

Table of Contents

Previous topic

This Page