On a current project, I need (or simply want) to plot information using the Matplotlib library. However, the plots generated by the program have no bearing on the remainder of the program: they are generated externally to the main computation and saved to disk (and/or displayed). When coded serially, some plots can take half a minute or more to draw.
Here, Matplotlib was used to plot geospatial data for small portions of a sphere. A separate file called plotting_toolbox.py was created to hold a function and its sub-functions for plotting similar data for the region in question. Its main function, plotter(*), is used when I need to plot different attributes in different figures to highlight some aspect of the problem I am solving. At the time of this writing, the code is embargoed. However, the function is built as follows:
import matplotlib.pyplot as plt

def plotter(title, out_file_name=None, roads=None, fire_stations=None, counties=False, display=True, dpi_override=300):
    print("Plotting", title, "(", out_file_name, ")")
    fig, ax = plt.subplots()
    _setup(ax, fig, title, dpi_override)
    if roads is not None:
        _plot_roads(ax, roads)
    if fire_stations is not None:
        ...  # fire station plotting omitted (embargoed)
    if out_file_name is not None:
        ...  # saving the figure omitted (embargoed)

def _setup(ax, fig, title, dpi_override):
    fig.dpi = dpi_override

def _plot_roads(ax, roads):
    # Each road is a sequence of (x, y) points, drawn as a thin blue line.
    for road in roads:
        x, y = zip(*road)
        ax.plot(x, y, 'b,-', linewidth=0.5, zorder=35)
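The x, y = zip(*road) line in _plot_roads is what turns one road into the two coordinate sequences that ax.plot expects. A small sketch of that step, assuming each road is a list of (x, y) pairs (the coordinates below are made up for illustration):

```python
# Hypothetical road: three (x, y) points, invented for this example.
road = [(-93.60, 41.60), (-93.50, 41.70), (-93.40, 41.65)]

# zip(*road) transposes the list of pairs into one tuple of x values
# and one tuple of y values.
x, y = zip(*road)

print(x)  # (-93.6, -93.5, -93.4)
print(y)  # (41.6, 41.7, 41.65)
```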
For example, if I wanted a map to display just the fire stations, I may call
plotter('fire stations', out_file_name='fire_stations.png', fire_stations=fs)
If I wanted a map of the roads and counties, I may call
plotter('Roads', out_file_name='roads.png', roads=rds, counties=True)
and so forth.
Depending on the size of the road file and other data in the program, it can take some time to process and plot the graphs. Using multiprocessing, we can do the plotting in a separate Python process and let the computational part of the program continue to run in the original process.
In Python, to import the multiprocessing module, we call:
from multiprocessing import Process
and can use the Process class as indicated in the Python documentation, such as:
p = Process(target=f, args=('bob',))
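Creating the Process object does not run anything by itself; the child only begins work once start() is called. A minimal, self-contained sketch of the full pattern follows. The function name greet is a stand-in chosen for this example, not part of the project code:

```python
from multiprocessing import Process


def greet(name):
    # Stand-in for real work; this body runs in the child process.
    message = f"hello {name}"
    print(message)
    return message


if __name__ == "__main__":
    # The guard matters on Windows, where each child process re-imports
    # this module when it starts.
    p = Process(target=greet, args=("bob",))
    p.start()  # launch the child; the parent keeps running
    print("parent continues while the child works")
    p.join()   # wait for the child to finish before exiting
```

The trailing comma in args=("bob",) is what makes it a one-element tuple rather than a parenthesized string.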
In this case, the plotter(*) function has multiple optional arguments. To retain the simplicity of calling the function with optional parameters, I created a new function called plotter_args_parser(*), which takes the identical arguments as plotter(*). Because Process receives the target's positional arguments as a single tuple through args, plotter_args_parser(*) exists to resolve the optional arguments to their default values. It simply returns a tuple of all the arguments, so that any argument not supplied keeps its default value.
def plotter_args_parser(title, out_file_name=None, roads=None, fire_stations=None, counties=False, display=True, dpi_override=300):
    return (title, out_file_name, roads, fire_stations, counties, display, dpi_override)
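To make the effect concrete, here is a short sketch that repeats the function as shown and calls it with only some of the optional arguments. The argument values are illustrative:

```python
def plotter_args_parser(title, out_file_name=None, roads=None, fire_stations=None,
                        counties=False, display=True, dpi_override=300):
    # Return every argument in positional order; anything not supplied
    # by the caller appears with its default value.
    return (title, out_file_name, roads, fire_stations, counties, display, dpi_override)


args = plotter_args_parser('Roads', out_file_name='roads.png', counties=True)
print(args)  # ('Roads', 'roads.png', None, None, True, True, 300)
```

The resulting tuple lines up with the positional parameters of plotter(*), so it can be handed directly to Process as args.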
When used in conjunction with plotter_mp(*), we see:
def plotter_mp(title, out_file_name=None, roads=None, fire_stations=None, counties=False, display=True, dpi_override=300):
    print('Plotting with Multiprocessing:', title)
    p_args = plotter_args_parser(title, out_file_name, roads, fire_stations, counties, display, dpi_override)
    p = Process(target=plotter, args=p_args)
    p.start()  # without start(), the new process never runs
and the plot will be saved and/or output whenever it is finished.
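As a variation, Process also accepts a kwargs dictionary, so keyword arguments can be forwarded without building the tuple by hand. The sketch below is not the project's code: slow_task is a hypothetical stand-in for plotter(*), and run_in_process is a hypothetical helper that returns the Process so the caller can wait for it explicitly:

```python
from multiprocessing import Process


def slow_task(title, out_file_name=None, display=True):
    # Hypothetical stand-in for plotter(*); it only reports its arguments.
    print("Plotting", title, "->", out_file_name, "display =", display)


def run_in_process(target, *args, **kwargs):
    # Forward positional and keyword arguments unchanged; the target's own
    # defaults apply to anything the caller did not supply.
    p = Process(target=target, args=args, kwargs=kwargs)
    p.start()
    return p


if __name__ == "__main__":
    # The guard is required on Windows, where children re-import this module.
    jobs = [
        run_in_process(slow_task, "fire stations", out_file_name="fire_stations.png"),
        run_in_process(slow_task, "Roads", out_file_name="roads.png", display=False),
    ]
    # ... the computational part of the program would continue here ...
    for job in jobs:
        job.join()  # block until this plot's process has finished
    print([job.exitcode for job in jobs])
```

Keeping the returned Process objects makes it explicit when every plot is done, which is useful if later steps depend on the saved files.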
When executed on a 4-core AMD A8-7600 (3.10 GHz) computer with 16.0 GB RAM (15.0 GB usable) running 64-bit Windows 10, the program took approximately 10 to 11 minutes to complete without multiprocessing. When the 11 images were plotted in separate processes, the program took around 8 to 9 minutes to complete, a savings of about 10-20%. Multiprocessing was also leveraged to read the large data files; however, that only shaved seconds off the serial implementation.
To maintain the flexibility that I had when originally making the plotter(*) function, I created two wrapper functions, plotter_args_parser(*) and plotter_mp(*). The former turns the arguments into a tuple, and the latter wraps the Process class and lets the new Python process do its plotting until it is finished.