Simulation Timestep & Substep Profiling

In this notebook we profile the simulation's substeps and timesteps. Profiling the substeps shows whether any of the partial state update blocks (one substep per block) is a performance bottleneck, and profiling the timesteps shows whether performance degrades over the course of the simulation run.

import project_path
import notebooks.setup
import copy
import logging
from radcad import Engine
from time import time
from cadCAD_tools.profiling.visualizations import visualize_substep_impact, visualize_elapsed_time_per_ts

import experiments.notebooks.visualizations as visualizations
import experiments.default_experiment as default_experiment
from experiments.run import run
WARNING:root:SUBGRAPH_API_KEY not defined
def update_run_time(params, substep, state_history, previous_state, policy_input):
    # Record the current wall-clock time in the `run_time` state variable
    return 'run_time', time()

measure_run_time_block = {
    'policies': {},
    'variables': {
        'run_time': update_run_time
    }
}
simulation = default_experiment.experiment.simulations[0]
profiled_state_update_blocks = []
profiled_state_update_blocks.append(measure_run_time_block)

for block in simulation.model.state_update_blocks:
    profiled_state_update_blocks.append(block)
    profiled_state_update_blocks.append(measure_run_time_block)
    
simulation.model.state_update_blocks = profiled_state_update_blocks
simulation.model.initial_state.update({'run_time': 0})
default_experiment.experiment.engine = Engine(drop_substeps=False)
df, exceptions = run(default_experiment.experiment)
INFO:root:Running experiment
INFO:root:Starting simulation 0 / run 0 / subset 0
INFO:root:Experiment complete in 3.8434646129608154 seconds
INFO:root:Post-processing results
INFO:root:Post-processing complete in 2.7345986366271973 seconds
df['run_time']
1        1.723639e+09
2        1.723639e+09
3        1.723639e+09
4        1.723639e+09
5        1.723639e+09
             ...     
11156    1.723639e+09
11157    1.723639e+09
11158    1.723639e+09
11159    1.723639e+09
11160    1.723639e+09
Name: run_time, Length: 11160, dtype: float64
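
Because the measurement block is interleaved before and after every original state update block, the duration of each substep can be recovered by differencing consecutive `run_time` samples. A minimal sketch of that calculation, using a made-up mini-DataFrame (the real column values will differ):

```python
import pandas as pd

# Hypothetical mini-result: run_time holds a wall-clock timestamp recorded
# by the measurement block before and after each original substep.
df = pd.DataFrame({
    "timestep": [1, 1, 1, 2, 2, 2],
    "substep":  [1, 2, 3, 1, 2, 3],
    "run_time": [100.00, 100.02, 100.05, 100.05, 100.07, 100.11],
})

# Each substep's duration is the difference between consecutive samples
df["substep_duration"] = df["run_time"].diff()

print(df[["timestep", "substep", "substep_duration"]])
```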

By profiling and visualizing the substep (partial state update block) performance, we can see that the last two partial state update blocks, which calculate the system metrics, take significantly more time than the other partial state update blocks. To improve the model implementation, it might be worth moving the metrics that aren't needed during runtime into a post-processing step. The downside of this technique is that it would introduce more software complexity in post-processing.

visualize_substep_impact(df, relative=True)
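
The relative view normalizes each substep's time by the total time of its timestep. A minimal sketch of that normalization, assuming a hypothetical `psub_time` column of per-substep durations:

```python
import pandas as pd

# Hypothetical per-substep durations for two timesteps
df = pd.DataFrame({
    "timestep":  [1, 1, 2, 2],
    "substep":   [1, 2, 1, 2],
    "psub_time": [0.02, 0.08, 0.03, 0.07],
})

# Normalize each substep's time by the total time of its timestep;
# this is the quantity plotted by the relative=True view
df["relative_psub_time"] = (
    df.groupby("timestep")["psub_time"].transform(lambda x: x / x.sum())
)

print(df)
```

Using `transform` keeps the result aligned with the original rows without relying on `apply`'s group-key behavior.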

By profiling the elapsed time per timestep, we can see that the time since start increases linearly, meaning each timestep has roughly the same execution time throughout the simulation. This is good! It means there is no performance degradation due to, for example, the growing size of the state history.

visualize_elapsed_time_per_ts(df, relative=False)
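
The quantities behind this plot can be reproduced directly from the `run_time` samples. A hedged sketch with made-up numbers, assuming one sample per timestep:

```python
import pandas as pd

# Hypothetical one-row-per-timestep run_time samples (wall-clock seconds)
df = pd.DataFrame({
    "timestep": [1, 2, 3, 4],
    "run_time": [100.0, 100.1, 100.2, 100.3],
})

# Time since the start of the run: a straight line means constant cost
df["elapsed"] = df["run_time"] - df["run_time"].iloc[0]

# Per-timestep cost: roughly constant values indicate no degradation
df["per_timestep"] = df["run_time"].diff()

print(df)
```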

You can also profile the memory use of a Python function or script, but we'll leave that up to you to implement! See https://pypi.org/project/memory-profiler/
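
The linked memory-profiler package supports line-by-line memory profiling; as a dependency-free alternative, Python's built-in `tracemalloc` can measure the peak allocation of a code block. The `build_state_history` function below is a made-up stand-in for a simulation that accumulates state history:

```python
import tracemalloc

def build_state_history(n):
    # Stand-in for a simulation loop that accumulates state history
    return [{"timestep": i, "state": list(range(100))} for i in range(n)]

tracemalloc.start()
history = build_state_history(10_000)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current: {current / 1e6:.1f} MB, peak: {peak / 1e6:.1f} MB")
```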