Experiment README#

Overview of Experiment Architecture#

The experiment architecture is composed of the following four elements – the model, default experiment, experiment templates, and experiment notebooks:

  1. The model is initialized with a default Initial State and set of System Parameters defined in the model module.

  2. The default experiment – in the experiments.default_experiment module – is an experiment composed of a single simulation that uses the default radCAD model Initial State and System Parameters. Additional default simulation execution settings, such as the number of timesteps and runs, are also set in the default experiment.

  3. The experiment templates – in the experiments.templates module – contain pre-configured analyses based on the default experiment. Examples include experiments.templates.time_domain_analysis (simulation in the time-domain over a period of 5 years) and experiments.templates.eth_price_sweep_analysis (simulation in the phase-space sweeping over discrete ETH Price values).

  4. The experiment notebooks perform various scenario analyses by importing existing experiment templates, optionally modifying the Initial State and System Parameters within the notebook, and then executing them.

Experiment Workflow#

If you just want to run (execute) existing experiment notebooks, simply open the respective notebook and execute all cells.

Depending on the chosen template and planned analysis, the required imports might differ slightly from the standard dependencies below:

# Import the setup module:
# * sets up the Python path
# * runs shared notebook-configuration methods, such as loading IPython modules
import setup

# External dependencies
import copy
import logging
import numpy as np
import pandas as pd
import plotly.express as px
from pprint import pprint

# Project dependencies
import model.constants as constants
import experiments.notebooks.visualizations as visualizations
from experiments.run import run
from experiments.utils import display_code

We can then import the default experiment, and create a copy of the simulation object – we create a new copy for each analysis we’d like to perform:

import experiments.default_experiment as default_experiment
import experiments.templates.time_domain_analysis as time_domain_analysis
import experiments.templates.eth_price_eth_staked_grid_analysis as eth_price_eth_staked_grid_analysis

simulation_analysis_1 = copy.deepcopy(default_experiment.experiment.simulations[0])
simulation_analysis_2 = copy.deepcopy(time_domain_analysis.experiment.simulations[0])
simulation_analysis_3 = copy.deepcopy(eth_price_eth_staked_grid_analysis.experiment.simulations[0])

We can use the display_code method to see the configuration of the default experiment before making changes:

display_code(default_experiment)
"""
The default experiment with default model Initial State, System Parameters, and Simulation Configuration.

The defaults are defined in their respective modules:
* Initial State in `model/state_variables.py`
* System Parameters in `model/system_parameters.py`
* Simulation Configuration in `experiments/simulation_configuration.py`
"""

from radcad import Simulation, Experiment, Backend

from model import model
from experiments.simulation_configuration import TIMESTEPS, DELTA_TIME, MONTE_CARLO_RUNS


# Create Model Simulation
simulation = Simulation(
    model=model,
    timesteps=TIMESTEPS,
    runs=MONTE_CARLO_RUNS
)
# Create Experiment of single Simulation
experiment = Experiment([simulation])
# Configure Simulation & Experiment engine
simulation.engine = experiment.engine
experiment.engine.backend = Backend.SINGLE_PROCESS
experiment.engine.deepcopy = False
experiment.engine.drop_substeps = True

Modifying State Variables#

To view the Initial State of the State Variables (the radCAD model-configuration setting initial_state) and the values they have been set to, we can inspect the dictionary as follows:

pprint(simulation_analysis_1.model.initial_state)
{'amount_slashed': 0,
 'attestation_penalties': 0,
 'average_effective_balance': 32000000000.0,
 'base_fee_per_gas': 1,
 'base_reward': 0,
 'block_proposer_reward': 0,
 'eth_price': 1251.477131147541,
 'eth_staked': 33927258.04728536,
 'eth_supply': 116250000.0,
 'head_reward': 0,
 'network_issuance': 0,
 'number_of_active_validators': 1058287,
 'number_of_awake_validators': 1058287,
 'number_of_validators_in_activation_queue': 0,
 'pow_issuance': 0,
 'source_reward': 0,
 'stage': None,
 'supply_inflation': 0,
 'sync_committee_penalties': 0,
 'sync_reward': 0,
 'target_reward': 0,
 'timestamp': None,
 'total_base_fee': 0,
 'total_network_costs': 0,
 'total_online_validator_rewards': 0,
 'total_priority_fee_to_miners': 0,
 'total_priority_fee_to_validators': 0,
 'total_profit': 0,
 'total_profit_yields': 0,
 'total_realized_mev_to_miners': 0,
 'total_realized_mev_to_validators': 0,
 'total_revenue': 0,
 'total_revenue_yields': 0,
 'validating_penalties': 0,
 'validating_rewards': 0,
 'validator_cloud_costs': array([[0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.]]),
 'validator_costs': array([[0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.]]),
 'validator_count_distribution': array([[0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0]]),
 'validator_eth_staked': array([[0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0]]),
 'validator_hardware_costs': array([[0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.]]),
 'validator_profit': array([[0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0]]),
 'validator_profit_yields': array([[0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0]]),
 'validator_revenue': array([[0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0]]),
 'validator_revenue_yields': array([[0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0]]),
 'validator_third_party_costs': array([[0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.]]),
 'validator_uptime': 1,
 'whistleblower_rewards': 0}

To modify the value of State Variables for a specific analysis, you need to select the relevant simulation and update the chosen model Initial State. For example, updating the eth_supply Initial State to 100e6 (100 million ETH):

simulation_analysis_1.model.initial_state.update({
    "eth_supply": 100e6, 
})
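
Since initial_state is a plain Python dictionary, a one-line sanity check (a minimal sketch) confirms the override took effect before running the simulation:

# Sanity check: confirm the Initial State override took effect
assert simulation_analysis_1.model.initial_state["eth_supply"] == 100e6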

Modifying System Parameters#

To view the System Parameters (the radCAD model-configuration setting params) and the values they have been set to, we can inspect the dictionary as follows:

pprint(simulation_analysis_1.model.params)
{'BASE_FEE_MAX_CHANGE_DENOMINATOR': [8],
 'BASE_REWARD_FACTOR': [64],
 'CHURN_LIMIT_QUOTIENT': [65536],
 'EFFECTIVE_BALANCE_INCREMENT': [1000000000.0],
 'ELASTICITY_MULTIPLIER': [2],
 'MAX_EFFECTIVE_BALANCE': [32000000000.0],
 'MAX_VALIDATOR_COUNT': [None],
 'MIN_PER_EPOCH_CHURN_LIMIT': [4],
 'MIN_SLASHING_PENALTY_QUOTIENT': [64],
 'PROPORTIONAL_SLASHING_MULTIPLIER': [2],
 'PROPOSER_REWARD_QUOTIENT': [8],
 'PROPOSER_WEIGHT': [8],
 'SYNC_REWARD_WEIGHT': [2],
 'TIMELY_HEAD_WEIGHT': [14],
 'TIMELY_SOURCE_WEIGHT': [14],
 'TIMELY_TARGET_WEIGHT': [26],
 'WEIGHT_DENOMINATOR': [64],
 'WHISTLEBLOWER_REWARD_QUOTIENT': [512],
 'base_fee_process': [<function Parameters.<lambda> at 0x7fc945b7a9d0>],
 'daily_pow_issuance': [13527.628415300547],
 'date_eip1559': [datetime.datetime(2021, 8, 4, 0, 0)],
 'date_pos': [datetime.datetime(2022, 9, 15, 0, 0)],
 'date_start': [datetime.datetime(2024, 8, 14, 12, 27, 44, 11384)],
 'dt': [225],
 'eth_price_process': [<function Parameters.<lambda> at 0x7fc945b6b700>],
 'eth_staked_process': [<function Parameters.<lambda> at 0x7fc945b6b820>],
 'gas_target_process': [<function Parameters.<lambda> at 0x7fc945b7ac10>],
 'mev_per_block': [0],
 'priority_fee_process': [<function Parameters.<lambda> at 0x7fc945b7aaf0>],
 'slashing_events_per_1000_epochs': [1],
 'stage': [<Stage.ALL: 1>],
 'validator_cloud_costs_per_epoch': [array([0.     , 0.00027, 0.     , 0.     , 0.00136, 0.     , 0.     ])],
 'validator_hardware_costs_per_epoch': [array([0.0014, 0.    , 0.    , 0.0007, 0.    , 0.    , 0.    ])],
 'validator_percentage_distribution': [array([0.37, 0.13, 0.27, 0.05, 0.02, 0.08, 0.08])],
 'validator_process': [<function Parameters.<lambda> at 0x7fc945b6b940>],
 'validator_third_party_costs_per_epoch': [array([0.  , 0.  , 0.12, 0.  , 0.  , 0.15, 0.12])],
 'validator_uptime_process': [<function Parameters.<lambda> at 0x7fc945b7a5e0>]}

To modify the value of System Parameters for a specific analysis, you need to select the relevant simulation, and update the chosen model System Parameter (which is a list of values). For example, updating the BASE_REWARD_FACTOR System Parameter to a sweep of two values, 64 and 32:

simulation_analysis_1.model.params.update({
    "BASE_REWARD_FACTOR": [64, 32],
})
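
Each value in the list becomes a separate subset, as the simulation logs and results below show. As a sketch of how multi-parameter sweeps compose (assuming radCAD's usual sweep semantics, where values are paired by index and shorter lists reuse their last value), sweeping two parameters at once pairs the values per subset; the copy and values here are illustrative only:

# Illustration only (not used in the analyses below):
# subset 0 -> BASE_REWARD_FACTOR=64, PROPOSER_WEIGHT=8
# subset 1 -> BASE_REWARD_FACTOR=32, PROPOSER_WEIGHT=16
simulation_sweep_example = copy.deepcopy(simulation_analysis_1)
simulation_sweep_example.model.params.update({
    "BASE_REWARD_FACTOR": [64, 32],
    "PROPOSER_WEIGHT": [8, 16],
})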

Executing Experiments#

We can now execute our custom analysis and retrieve the post-processed Pandas DataFrame using the run(...) method:

df, exceptions = run(simulation_analysis_1)
INFO:root:Running experiment
INFO:root:Starting simulation 0 / run 0 / subset 0
INFO:root:Starting simulation 0 / run 0 / subset 1
INFO:root:Experiment complete in 0.10578322410583496 seconds
INFO:root:Post-processing results
INFO:root:Post-processing complete in 0.25049495697021484 seconds

Post-processing and Analysing Results#

We can see that we had no exceptions for the single simulation we executed:

exceptions[0]['exception'] is None
True
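
For experiments with more simulations and runs, a one-line check (a small sketch using the same exceptions list) covers every entry at once:

# True only if no run raised an exception
all(entry["exception"] is None for entry in exceptions)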

We can simply display the Pandas DataFrame to inspect the results. This DataFrame already has some default post-processing applied (see experiments/post_processing.py):

df
stage timestamp eth_price eth_supply eth_staked supply_inflation network_issuance pow_issuance number_of_validators_in_activation_queue average_effective_balance ... target_reward_eth head_reward_eth block_proposer_reward_eth sync_reward_eth whistleblower_rewards_eth amount_slashed_eth daily_revenue_yields_pct cumulative_revenue_yields_pct daily_profit_yields_pct cumulative_profit_yields_pct
1 4.0 2024-08-14 12:27:44.011384 1251.477131 9.999926e+07 33886784.0 -0.002693 -737.355367 0 0 3.200000e+10 ... 1032.257086 555.830738 324.099556 81.024889 0.014063 0.1125 0.008023 0.008023 0.007244 0.007244
2 4.0 2024-08-15 12:27:44.011384 1251.477131 9.999853e+07 33908384.0 -0.002687 -735.760076 0 0 3.200000e+10 ... 1032.915064 556.185034 324.306142 81.076536 0.014063 0.1125 0.008023 0.016046 0.007244 0.014488
3 4.0 2024-08-16 12:27:44.011384 1251.477131 9.999779e+07 33929984.0 -0.002682 -734.164786 0 0 3.200000e+10 ... 1033.573041 556.539330 324.512729 81.128182 0.014063 0.1125 0.008022 0.024068 0.007243 0.021731
4 4.0 2024-08-17 12:27:44.011384 1251.477131 9.999706e+07 33951584.0 -0.002676 -732.569496 0 0 3.200000e+10 ... 1034.231019 556.893626 324.719315 81.179829 0.014063 0.1125 0.008022 0.032090 0.007243 0.028974
5 4.0 2024-08-18 12:27:44.011384 1251.477131 9.999633e+07 33973184.0 -0.002670 -730.974206 0 0 3.200000e+10 ... 1034.888997 557.247921 324.925902 81.231475 0.014063 0.1125 0.008021 0.040111 0.007243 0.036217
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
717 4.0 2025-08-04 12:27:44.011384 1251.477131 9.931584e+07 41554784.0 -0.006841 -1860.343153 0 0 3.200000e+10 ... 569.080456 306.427938 178.675183 44.668796 0.014063 0.1125 0.003840 1.450916 0.003287 1.249498
718 4.0 2025-08-05 12:27:44.011384 1251.477131 9.931398e+07 41576384.0 -0.006839 -1859.625962 0 0 3.200000e+10 ... 569.376262 306.587218 178.768057 44.692014 0.014063 0.1125 0.003840 1.454756 0.003287 1.252785
719 4.0 2025-08-06 12:27:44.011384 1251.477131 9.931212e+07 41597984.0 -0.006836 -1858.908771 0 0 3.200000e+10 ... 569.672067 306.746498 178.860932 44.715233 0.014063 0.1125 0.003840 1.458595 0.003287 1.256071
720 4.0 2025-08-07 12:27:44.011384 1251.477131 9.931026e+07 41619584.0 -0.006834 -1858.191580 0 0 3.200000e+10 ... 569.967873 306.905778 178.953806 44.738452 0.014063 0.1125 0.003839 1.462435 0.003286 1.259358
721 4.0 2025-08-08 12:27:44.011384 1251.477131 9.930840e+07 41641184.0 -0.006831 -1857.474389 0 0 3.200000e+10 ... 570.263679 307.065058 179.046681 44.761670 0.014063 0.1125 0.003839 1.466274 0.003286 1.262644

720 rows × 156 columns

We can also use Pandas for numerical analyses:

# Get the maximum validating rewards in ETH for each subset
df.groupby('subset')['validating_rewards'].max() / constants.gwei
subset
0    2827.106753
1    1413.553376
Name: validating_rewards, dtype: float64
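
Any standard Pandas workflow applies. For example (a sketch using the supply_inflation column visible in the DataFrame above), we can compare the supply inflation at the final timestep of each subset:

# Get the supply inflation at the final timestep for each subset, as a percentage
df.groupby('subset')['supply_inflation'].last() * 100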

Visualizing Results#

Once we have the results post-processed and in a Pandas DataFrame, we can use Plotly for plotting our results:

# Plot the total validating rewards in ETH for each subset
px.line(df, x='timestamp', y='validating_rewards_eth', facet_col='subset')
# Plot the individual validating rewards in ETH for each subset
visualizations.plot_validating_rewards(df, subplot_titles=["Base Reward Factor = 64", "Base Reward Factor = 32"])

Creating New, Customized Experiment Notebooks#

If you want to create an entirely new analysis, you’ll need to create a new experiment notebook, which entails the following steps (a minimal end-to-end sketch follows the list):

  • Step 1: Select an experiment template from the experiments/templates/ directory to start from. If you’d like to create your own template, the example_analysis.py template, which extends the default experiment to override default State Variables and System Parameters, is a good example to copy.

  • Step 2: Create a new notebook in the experiments/notebooks/ directory, using the template.ipynb notebook as a guide, and import the experiment from the experiment template.

  • Step 3: Customize the experiment for your specific analysis.

  • Step 4: Execute your experiment, post-process and analyze the results, and create Plotly charts!
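
Putting the steps together, a first notebook cell might look like the following minimal sketch, reusing the imports and template from earlier in this README; the parameter override is illustrative only:

import copy
import experiments.templates.time_domain_analysis as time_domain_analysis
from experiments.run import run

# Steps 1 & 2: import the chosen experiment template and copy it to avoid mutation
simulation = copy.deepcopy(time_domain_analysis.experiment.simulations[0])

# Step 3: customize the experiment, e.g. sweep the BASE_REWARD_FACTOR System Parameter
simulation.model.params.update({"BASE_REWARD_FACTOR": [64, 32]})

# Step 4: execute the experiment and analyze the post-processed results
df, exceptions = run(simulation)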

Advanced Experiment-configuration & Simulation Techniques#

Setting Simulation Timesteps and Unit of Time dt#

from experiments.simulation_configuration import TIMESTEPS, DELTA_TIME, SIMULATION_TIME_MONTHS

We can configure the number of simulation timesteps TIMESTEPS from a simulation time in months SIMULATION_TIME_MONTHS, multiplied by the number of epochs in a month, and divided by the simulation unit of time DELTA_TIME:

SIMULATION_TIME_MONTHS / 12  # Divide months by 12 to get number of years
1.0

DELTA_TIME sets how many epochs are simulated per timestep. If we don’t need finer granularity (e.g. 1 epoch per timestep), we can set DELTA_TIME to a larger value for better performance. The default value is 1 day, or 225 epochs, meaning all our time-based State Variables are aggregated over a period of 1 day (we call this “aggregation”), which is convenient.

DELTA_TIME
225

TIMESTEPS is now simply the simulation time in months, multiplied by the number of epochs in a month, divided by DELTA_TIME:

TIMESTEPS = constants.epochs_per_month * SIMULATION_TIME_MONTHS // DELTA_TIME
TIMESTEPS
360

Finally, to set the simulation timesteps (note, you may have to update the environmental processes that depend on the number of timesteps, and override the relevant parameters):

simulation_analysis_1.timesteps = TIMESTEPS
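
For example (a sketch with an illustrative constant value), a stateless environmental process can be overridden directly, as long as it remains valid for the new number of timesteps; pre-generated sample arrays would need to be regenerated to match:

# Override the ETH price process with a constant 2000 USD/ETH (illustrative value)
simulation_analysis_1.model.params.update({
    "eth_price_process": [lambda _run, _timestep: 2000],
})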

Considerations When Performing Efficient Phase-space Simulations#

In simulation_analysis_3, timesteps is decreased to 1, while dt is increased to TIMESTEPS * DELTA_TIME, so that a single timestep covers the full duration of the simulation. This produces the final result in a single processing cycle, generating the full phase-space with very low processing overhead. This is achieved by ignoring all time-series information between the beginning and end of the simulation.

There is a test function test_dt(...) in tests/test_integration.py that can be used to verify that no information is lost due to the approximations taken along the time axis for the specific State Variables that you are interested in, and that your custom code has not introduced mechanisms that might not work well with this kind of approximation.

An example of a mechanism that would not work with this kind of approximation is one that implements some form of feedback loop, where the state at the next timestep depends on the current state.
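
For intuition, consider this self-contained toy sketch (not model code): a mechanism without feedback, such as a fixed issuance per epoch, aggregates exactly as issuance_per_epoch * dt, whereas a mechanism whose issuance is proportional to the current supply compounds each epoch, so a single aggregated step diverges from the epoch-by-epoch result:

rate_per_epoch = 1e-6  # illustrative issuance rate per epoch
dt = 81_000            # e.g. TIMESTEPS * DELTA_TIME over the full simulation

# Epoch-by-epoch feedback loop: issuance proportional to the current supply
supply = 1.0
for _ in range(dt):
    supply += supply * rate_per_epoch

print(supply)                    # ~1.0844: the compounded, epoch-by-epoch result
print(1.0 + rate_per_epoch * dt) # 1.081: a single aggregated step of dt epochs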

Changing the Ethereum Network Upgrade Stage#

The model supports different Ethereum network upgrade stages; the default experiment operates in the “post-merge” Proof of Stake stage.

Stage is an Enum; we can import it and see what options we have:

from model.types import Stage

The model is well documented, and we can view the Python docstring to see what a Stage is, and create a dictionary to view the Enum members:

print(Stage.__doc__)
{e.name: e.value for e in Stage}
Stages of the Ethereum network upgrade process finite-state machine
{'ALL': 1, 'BEACON_CHAIN': 2, 'EIP1559': 3, 'PROOF_OF_STAKE': 4}

The PROOF_OF_STAKE stage, for example, assumes the Beacon Chain has been implemented, EIP-1559 has been enabled, and POW issuance is disabled:

display_code(Stage)
class Stage(Enum):
    """Stages of the Ethereum network upgrade process finite-state machine"""

    ALL = 1
    """Transition through all stages"""
    BEACON_CHAIN = 2
    """Beacon Chain implemented; EIP1559 disabled; POW issuance enabled"""
    EIP1559 = 3
    """Beacon Chain implemented; EIP1559 enabled; POW issuance enabled"""
    PROOF_OF_STAKE = 4
    """Beacon Chain implemented; EIP1559 enabled; POW issuance disabled"""

As before, we can update the “stage” System Parameter to set the relevant Stage:

simulation_analysis_1.model.params.update({
    "stage": [Stage.PROOF_OF_STAKE]
})

Performing Large-scale Experiments#

When executing an experiment, we have three degrees of freedom – simulations, runs, and subsets (parameter sweeps).

We can have multiple simulations for a single experiment, multiple runs for every simulation, and we can have multiple subsets for every run. Remember that simulation, run, and subset are simply additional State Variables set by the radCAD engine during execution – we then use those State Variables to index the results for a specific dimension, e.g. simulation 1, run 5, and subset 2.

Each dimension has a generally accepted purpose:

  • Simulations are used for A/B testing

  • Runs are used for Monte Carlo analysis

  • Subsets are used for parameter sweeps

In some cases, we break these “rules” to allow for more degrees of freedom or easier configuration.

One example of this is the eth_price_eth_staked_grid_analysis experiment template we imported earlier:

display_code(eth_price_eth_staked_grid_analysis)
"""
# ETH Price / ETH Staked Grid Analysis

Creates a cartesian product grid of ETH price and ETH staked processes, for phase-space analyses.
"""

import numpy as np
import copy
from radcad.utils import generate_cartesian_product_parameter_sweep

from model.state_variables import eth_staked, eth_supply, eth_price_max
from experiments.default_experiment import experiment, TIMESTEPS, DELTA_TIME


# Make a copy of the default experiment to avoid mutation
experiment = copy.deepcopy(experiment)

sweep = generate_cartesian_product_parameter_sweep({
    # ETH price range from 100 USD/ETH to the maximum over the last 12 months
    "eth_price_samples": np.linspace(start=100, stop=eth_price_max, num=20),
    # ETH staked range from current ETH staked to minimum of 2 x ETH staked and 30% of total ETH supply
    "eth_staked_samples": np.linspace(start=eth_staked, stop=min(eth_staked * 2, eth_supply * 0.3), num=20),
})

parameter_overrides = {
    "eth_price_process": [
        lambda run, _timestep: sweep["eth_price_samples"][run - 1]
    ],
    "eth_staked_process": [
        lambda run, _timestep: sweep["eth_staked_samples"][run - 1]
    ]
}

# Override default experiment parameters
experiment.simulations[0].model.params.update(parameter_overrides)
# Set runs to number of combinations in sweep
experiment.simulations[0].runs = len(sweep["eth_price_samples"])
# Run single timestep, set unit of time to multiple epochs
experiment.simulations[0].timesteps = 1
experiment.simulations[0].model.params.update({"dt": [TIMESTEPS * DELTA_TIME]})

Here, we create a grid of two State Variables – ETH price and ETH staked – using the eth_price_process and eth_staked_process.

Instead of sweeping the two System Parameters to create different subsets, we pre-generate all possible combinations of the two values first and use the specific run to index the data, i.e. for each run we get a new ETH price and ETH staked sample.

This allows the experimenter (you!) to apply a parameter sweep on top of this analysis if they choose to – we have kept one degree of freedom free.
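
When post-processing such a grid, the sample a given run used can be recovered from the template’s module-level sweep dictionary (a sketch, assuming the sweep shown in the code above is importable):

from experiments.templates.eth_price_eth_staked_grid_analysis import sweep

# Runs are indexed from one, so run N used the sample at index N - 1
run_number = 5  # illustrative run number
eth_price_sample = sweep["eth_price_samples"][run_number - 1]
eth_staked_sample = sweep["eth_staked_samples"][run_number - 1]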

Composing an Experiment Using simulations, runs, and subsets#

from radcad import Experiment, Engine, Backend


# Create a new Experiment of three Simulations:
# * Simulation Analysis 1 has one run and two subsets – a parameter sweep of two values (BASE_REWARD_FACTOR = [64, 32])
# * Simulation Analysis 2 has one run and one subset – a basic simulation configuration
# * Simulation Analysis 3 has 400 runs (20 * 20) and one subset – a parameter grid indexed using `run`
experiment = Experiment([simulation_analysis_1, simulation_analysis_2, simulation_analysis_3])

Configuring the radCAD Engine for High Performance#

To improve simulation performance for large-scale experiments, we can set the following settings using the radCAD Engine. Both Experiments and Simulations have the same Engine; when executing an Experiment we set these settings on the Experiment instance:

# Configure Experiment Engine
experiment.engine = Engine(
    # Use a single process; the overhead of creating multiple processes
    # for parallel-processing is only worthwhile when the Simulation runtime is long
    backend = Backend.SINGLE_PROCESS,
    # Disable System Parameter and State Variable deepcopy:
    # * Deepcopy prevents mutation of state at the cost of lower performance
    # * Disabling it leaves it up to the experimenter to use Python best practices to avoid
    # state mutation, like manually using `copy` and `deepcopy` methods before
    # performing mutating calculations when necessary
    deepcopy = False,
    # If we don't need the state history from individual substeps,
    # we can get rid of them for higher performance
    drop_substeps = True,
)

# Disable logging
# For large experiments, there is a lot of logging, which can get messy...
logger = logging.getLogger()
logger.disabled = True

# Execute Experiment
raw_results = experiment.run()
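
Once the experiment has completed, we can re-enable logging for any follow-up analyses:

# Re-enable logging
logger.disabled = False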

Indexing a Large-scale Experiment Dataset#

# Create a Pandas DataFrame from the raw results
df = pd.DataFrame(experiment.results)
df
stage timestamp eth_price eth_supply eth_staked supply_inflation network_issuance pow_issuance number_of_validators_in_activation_queue average_effective_balance ... total_network_costs total_revenue total_profit total_revenue_yields total_profit_yields simulation subset run substep timestep
0 NaN NaT 1251.477131 1.000000e+08 3.392726e+07 0.000000 0.000000 0 0 3.200000e+10 ... 0.000000e+00 0.000000e+00 0.000000e+00 0.000000 0.000000 0 0 1 0 0
1 4.0 2024-08-14 12:27:44.011384 1251.477131 9.999926e+07 3.388678e+07 -0.002693 -737.355367 0 0 3.200000e+10 ... 3.303383e+05 3.402462e+06 3.072124e+06 0.029304 0.026459 0 0 1 15 1
2 4.0 2024-08-15 12:27:44.011384 1251.477131 9.999853e+07 3.390838e+07 -0.002687 -735.760076 0 0 3.200000e+10 ... 3.305396e+05 3.404459e+06 3.073919e+06 0.029302 0.026457 0 0 1 15 2
3 4.0 2024-08-16 12:27:44.011384 1251.477131 9.999779e+07 3.392998e+07 -0.002682 -734.164786 0 0 3.200000e+10 ... 3.307408e+05 3.406455e+06 3.075714e+06 0.029301 0.026456 0 0 1 15 3
4 4.0 2024-08-17 12:27:44.011384 1251.477131 9.999706e+07 3.395158e+07 -0.002676 -732.569496 0 0 3.200000e+10 ... 3.309421e+05 3.408452e+06 3.077510e+06 0.029299 0.026455 0 0 1 15 4
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2598 4.0 2024-08-14 12:27:44.011384 4181.280000 1.159975e+08 3.477524e+07 -0.002204 -252484.052470 0 0 3.200000e+10 ... 2.780804e+08 4.146644e+09 3.868564e+09 0.028933 0.026993 2 0 398 15 1
2599 NaN NaT 1251.477131 1.162500e+08 3.392726e+07 0.000000 0.000000 0 0 3.200000e+10 ... 0.000000e+00 0.000000e+00 0.000000e+00 0.000000 0.000000 2 0 399 0 0
2600 4.0 2024-08-14 12:27:44.011384 4181.280000 1.159962e+08 3.482512e+07 -0.002215 -253841.316072 0 0 3.200000e+10 ... 2.778517e+08 4.140969e+09 3.863117e+09 0.028852 0.026916 2 0 399 15 1
2601 NaN NaT 1251.477131 1.162500e+08 3.392726e+07 0.000000 0.000000 0 0 3.200000e+10 ... 0.000000e+00 0.000000e+00 0.000000e+00 0.000000 0.000000 2 0 400 0 0
2602 4.0 2024-08-14 12:27:44.011384 4181.280000 1.159975e+08 3.487500e+07 -0.002204 -252533.998373 0 0 3.199999e+10 ... 2.782246e+08 4.146435e+09 3.868211e+09 0.028849 0.026913 2 0 400 15 1

2603 rows × 52 columns

# Select each Simulation dataset
df_0 = df[df.simulation == 0]
df_1 = df[df.simulation == 1]
df_2 = df[df.simulation == 2]

datasets = [df_0, df_1, df_2]

# Determine size of Simulation datasets
for index, data in enumerate(datasets):
    runs = len(data.run.unique())
    subsets = len(data.subset.unique())
    timesteps = len(data.timestep.unique())
    
    print(f"Simulation {index} has {runs} runs * {subsets} subsets * {timesteps} timesteps = {runs * subsets * timesteps} rows")
Simulation 0 has 1 runs * 2 subsets * 361 timesteps = 722 rows
Simulation 1 has 1 runs * 1 subsets * 1081 timesteps = 1081 rows
Simulation 2 has 400 runs * 1 subsets * 2 timesteps = 800 rows
# Indexing simulation 0, run 1 (indexed from one!), subset 1, timestep 1
df.query("simulation == 0 and run == 1 and subset == 1 and timestep == 1")
stage timestamp eth_price eth_supply eth_staked supply_inflation network_issuance pow_issuance number_of_validators_in_activation_queue average_effective_balance ... total_network_costs total_revenue total_profit total_revenue_yields total_profit_yields simulation subset run substep timestep
362 4.0 2024-08-14 12:27:44.011384 1251.477131 9.999801e+07 33886784.0 -0.007277 -1992.33316 0 0 3.200000e+10 ... 245527.210611 1.831886e+06 1.586359e+06 0.015777 0.013663 0 1 1 15 1

1 rows × 52 columns
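
Finally, because Simulation 2 collapses the time axis, its results at timestep 1 can be plotted directly as a phase space (a sketch using columns visible in the DataFrame above):

# Phase-space view of Simulation 2: one point per run at timestep 1
df_phase = df_2.query("timestep == 1")
px.scatter(df_phase, x="eth_staked", y="eth_price", color="total_profit_yields")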