dmx.compressor.utils.benchmark

Functions

`collect_layer_activations`(model_maker, mode)	Collect the output activations for each named DMX module in the model for the given mode
`compute_error`(out1, out2)	Computes the MSE error and the maximum delta between the tensors in the given pair of tensor collections
`compute_maxdelta_error`(t_list1, t_list2)	Compute the maximum delta observed between two tensor elements across all pairs of tensors in the given lists
`compute_mse_error`(t_list1, t_list2)	Compute sum of MSE errors between corresponding pairs of tensors in the given tensor lists
`evaluate_vsimd_ops_deltas`(results)	Generate summary table with the runtime impact of each type of vsimd op
`gather_tensors`(tensor_collection)	Gathers all torch tensors from arbitrary nested structures of Lists and Dicts
`measure_mode_perf`(model, model_runner, ...)	Measure model's runtime statistics for a given mode
`measure_model_accuracy`(model_maker, modes)	Entry function for measuring a model's accuracy across various modes
`measure_model_error`(model_maker, modes[, ...])	Entry function for measuring the error at the output of each layer of the model for each mode relative to that layer's output in the reference mode
`measure_model_runtime`(model_maker, modes)	Entry function for measuring various runtime statistics
`prepare_model`(model, evaluation_mode, ...)	Prepares a DMXModel if needed in Baseline, Basic, or Basic_NoVSIMD modes

Classes

EVALUATION_MODE(value)

class dmx.compressor.utils.benchmark.EVALUATION_MODE(value)

Bases: Enum

BASELINE = 'Baseline'

BASIC = 'Basic'

BASIC_NOVSIMD = 'Basic_NoVSIMD'

FP8 = 'fp8'

VANILLA = 'Vanilla'
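
Because EVALUATION_MODE is a standard Enum, members can be looked up by their string value as well as by name. The sketch below uses a local stand-in that mirrors the members listed above, so it runs without the library installed; it is not the library's own definition. Note that the value of FP8 is lowercase ('fp8'), unlike the other members, and value lookup is case-sensitive.

```python
from enum import Enum


# Local stand-in mirroring the documented members (an assumption for
# illustration; import the real class from dmx.compressor.utils.benchmark).
class EVALUATION_MODE(Enum):
    BASELINE = "Baseline"
    BASIC = "Basic"
    BASIC_NOVSIMD = "Basic_NoVSIMD"
    FP8 = "fp8"
    VANILLA = "Vanilla"


# Lookup by value returns the member itself.
mode = EVALUATION_MODE("Basic_NoVSIMD")

# Lookup by name uses subscript syntax.
same_mode = EVALUATION_MODE["BASIC_NOVSIMD"]

# Build a list of modes from strings, e.g. parsed from a command line.
modes = [EVALUATION_MODE(v) for v in ("Vanilla", "Baseline", "fp8")]
```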

dmx.compressor.utils.benchmark.collect_layer_activations(model_maker: Callable[[], Tuple[Module, Callable, Callable, device]], mode: EVALUATION_MODE)

Collect the output activations for each named DMX module in the model for the given mode

Parameters:

model_maker (Callable[[], Tuple[torch.nn.Module, Callable, Callable, torch.device]]) – A callable that returns the model to be measured, together with some callables to run a sample input through the model or to evaluate the model’s accuracy
mode (EVALUATION_MODE) – The mode for which we want to collect the layer activations
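
A model_maker matching the documented type might look like the sketch below. The tuple order (model, runner, accuracy evaluator, device), the evaluator's signature, and the names make_model, model_runner and accuracy_evaluator are assumptions made for illustration; only the overall return type comes from the signature above.

```python
from typing import Callable, Tuple

import torch
from torch import nn


def make_model() -> Tuple[nn.Module, Callable, Callable, torch.device]:
    """Hypothetical model_maker: builds a tiny model plus its helper callables."""
    device = torch.device("cpu")
    model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4)).to(device)
    model.eval()

    # Fixed sample batch and labels captured by the closures below.
    sample = torch.randn(2, 8, device=device)
    labels = torch.tensor([0, 1], device=device)

    def model_runner(m: nn.Module) -> None:
        # Runs one sample input through the model and discards the output.
        with torch.no_grad():
            m(sample)

    def accuracy_evaluator(m: nn.Module) -> float:
        # Assumed signature: returns the fraction of correct predictions.
        with torch.no_grad():
            preds = m(sample).argmax(dim=-1)
        return (preds == labels).float().mean().item()

    return model, model_runner, accuracy_evaluator, device


model, model_runner, accuracy_evaluator, device = make_model()
run_result = model_runner(model)
accuracy = accuracy_evaluator(model)
```

A callable of this shape is what the entry functions in this module take as their first argument, for example collect_layer_activations(make_model, EVALUATION_MODE.BASELINE).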

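dmx.compressor.utils.benchmark.compute_error(out1: Tensor | List[Any] | Tuple[Any] | Dict[str, Any], out2: Tensor | List[Any] | Tuple[Any] | Dict[str, Any])

Computes the MSE error and the maximum delta between the tensors in the given pair of tensor collections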

Parameters:

out1 (Union[torch.Tensor, List[Any], Tuple[Any], Dict[str, Any]])
out2 (Union[torch.Tensor, List[Any], Tuple[Any], Dict[str, Any]])

dmx.compressor.utils.benchmark.compute_maxdelta_error(t_list1: List[Tensor], t_list2: List[Tensor]) → float

Compute the maximum delta observed between two tensor elements across all pairs of tensors in the given lists

Parameters:

t_list1 (List[torch.Tensor])
t_list2 (List[torch.Tensor])

Return type:

float
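
One plausible reading of this description is the largest absolute element-wise difference over all corresponding tensor pairs. The sketch below implements that reading under the assumption that the two lists are aligned and that paired tensors have matching shapes; maxdelta_sketch is a hypothetical name, not the library's implementation.

```python
from typing import List

import torch


def maxdelta_sketch(t_list1: List[torch.Tensor], t_list2: List[torch.Tensor]) -> float:
    """Largest absolute element-wise difference across all tensor pairs."""
    return max(
        ((a.float() - b.float()).abs().max().item() for a, b in zip(t_list1, t_list2)),
        default=0.0,  # empty input lists yield 0.0 in this sketch
    )


list_a = [torch.tensor([1.0, 2.0]), torch.tensor([0.0])]
list_b = [torch.tensor([1.5, 2.0]), torch.tensor([-2.0])]
max_delta = maxdelta_sketch(list_a, list_b)  # the second pair differs by 2.0
```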

dmx.compressor.utils.benchmark.compute_mse_error(t_list1: List[Tensor], t_list2: List[Tensor]) → float

Compute sum of MSE errors between corresponding pairs of tensors in the given tensor lists

Parameters:

t_list1 (List[torch.Tensor])
t_list2 (List[torch.Tensor])

Return type:

float
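
As a rough illustration of "sum of MSE errors between corresponding pairs", the sketch below computes the mean squared error of each aligned pair and adds the results. The name mse_sum_sketch and the cast to float32 are assumptions; the library may handle dtypes or shapes differently.

```python
from typing import List

import torch
import torch.nn.functional as F


def mse_sum_sketch(t_list1: List[torch.Tensor], t_list2: List[torch.Tensor]) -> float:
    """Sum of per-pair mean squared errors over two aligned tensor lists."""
    total = 0.0
    for a, b in zip(t_list1, t_list2):
        # mse_loss averages the squared differences within one pair.
        total += F.mse_loss(a.float(), b.float()).item()
    return total


list_a = [torch.tensor([1.0, 2.0]), torch.tensor([0.0])]
list_b = [torch.tensor([1.5, 2.0]), torch.tensor([-2.0])]
# First pair: (0.25 + 0) / 2 = 0.125; second pair: 4.0; sum: 4.125
mse_total = mse_sum_sketch(list_a, list_b)
```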

dmx.compressor.utils.benchmark.evaluate_vsimd_ops_deltas(results: Dict[str, Any])

Generate summary table with the runtime impact of each type of vsimd op

Parameters:

results (Dict[str, Any]) – The runtime measurement results

dmx.compressor.utils.benchmark.gather_tensors(tensor_collection: Tensor | List[Any] | Tuple[Any] | Dict[str, Any]) → List[Tensor]

Gathers all torch tensors from arbitrary nested structures of Lists and Dicts

Parameters:

tensor_collection (Union[torch.Tensor, List[Any], Tuple[Any], Dict[str, Any]]) – A Torch tensor or an arbitrary collection of tensors such as what you would typically get as an output from a HuggingFace model

Return type:

List[torch.Tensor]
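
Flattening a nested output like this is naturally recursive. The following is a minimal sketch of the idea, assuming depth-first traversal in container order and silently skipping non-tensor leaves such as None; gather_tensors_sketch is a hypothetical name, and the library's traversal order may differ.

```python
from typing import Any, List

import torch


def gather_tensors_sketch(collection: Any) -> List[torch.Tensor]:
    """Recursively collect tensors from nested lists, tuples and dicts."""
    if isinstance(collection, torch.Tensor):
        return [collection]
    if isinstance(collection, dict):
        items = collection.values()
    elif isinstance(collection, (list, tuple)):
        items = collection
    else:
        return []  # non-tensor leaf (None, int, str, ...) is ignored

    gathered: List[torch.Tensor] = []
    for item in items:
        gathered.extend(gather_tensors_sketch(item))
    return gathered


# Shape loosely resembling a model output with logits and cached states.
nested = {
    "logits": torch.zeros(2, 3),
    "past": [(torch.ones(1), torch.ones(2)), None],
}
flat = gather_tensors_sketch(nested)
```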

dmx.compressor.utils.benchmark.measure_mode_perf(model: Module, model_runner: Callable[[Module | DmxModel], None], device: device, evaluation_mode: EVALUATION_MODE, n_warmup_runs: int = 1, n_measure_runs: int = 5)

Measure model’s runtime statistics for a given mode

Parameters:

model (torch.nn.Module) – PyTorch model to measure
model_runner (Callable[[Union[torch.nn.Module, DmxModel]], None]) – Callable to run a sample input through the model
device (torch.device) – Device on which to run the model
evaluation_mode (EVALUATION_MODE) – Model mode (Vanilla torch, Baseline, Basic without VSIMD ops, Basic)
n_warmup_runs (int) – Number of warmup runs before gathering statistics
n_measure_runs (int) – Number of runs across which to average the gathered statistics
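
The roles of n_warmup_runs and n_measure_runs can be illustrated with a generic timing loop: warmup runs are executed and discarded, then the measured runs are averaged. This is only a sketch of that pattern using wall-clock time; time_callable_sketch is a hypothetical helper, and the statistics the library actually gathers (and any device synchronization it performs) are not described here.

```python
import time
from statistics import mean
from typing import Callable


def time_callable_sketch(
    run: Callable[[], None],
    n_warmup_runs: int = 1,
    n_measure_runs: int = 5,
) -> float:
    """Return the mean wall-clock seconds per call, after discarded warmups."""
    for _ in range(n_warmup_runs):
        run()  # warmup: results are not recorded

    samples = []
    for _ in range(n_measure_runs):
        start = time.perf_counter()
        run()
        samples.append(time.perf_counter() - start)
    return mean(samples)


calls = []
avg_seconds = time_callable_sketch(lambda: calls.append(1), n_warmup_runs=2, n_measure_runs=3)
```

The callable is invoked n_warmup_runs + n_measure_runs times in total, so a model_runner passed to a loop like this should be safe to call repeatedly.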

dmx.compressor.utils.benchmark.measure_model_accuracy(model_maker: Callable[[], Tuple[Module, Callable, Callable, device]], modes: List[EVALUATION_MODE])

Entry function for measuring a model’s accuracy across various modes

Parameters:

model_maker (Callable[[], Tuple[torch.nn.Module, Callable, Callable, torch.device]]) – A callable that returns the model to be measured, together with some callables to run a sample input through the model or to evaluate the model’s accuracy
modes (List[EVALUATION_MODE]) – List of modes on which to measure the model’s accuracy

dmx.compressor.utils.benchmark.measure_model_error(model_maker: Callable[[], Tuple[Module, Callable, Callable, device]], modes: List[EVALUATION_MODE], reference_mode: EVALUATION_MODE = EVALUATION_MODE.BASELINE)

Entry function for measuring the error at the output of each layer of the model for each mode relative to that layer’s output in the reference mode

Parameters:

model_maker (Callable[[], Tuple[torch.nn.Module, Callable, Callable, torch.device]]) – A callable that returns the model to be measured, together with some callables to run a sample input through the model or to evaluate the model’s accuracy
modes (List[EVALUATION_MODE]) – The modes for which we want to evaluate the errors in the layer’s activations
reference_mode (EVALUATION_MODE) – The reference mode whose layer activations serve as the ground truth used to evaluate the per-layer errors for the other modes
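
The idea of comparing each layer's output against a reference can be sketched in plain PyTorch with forward hooks. Everything below is an assumption made for illustration: the helper collect_activations_sketch, the use of leaf modules rather than DMX modules, and the fp16 round trip standing in for a non-reference mode. It does not show how the library itself collects or compares activations.

```python
import copy
from typing import Dict

import torch
from torch import nn


def collect_activations_sketch(model: nn.Module, sample: torch.Tensor) -> Dict[str, torch.Tensor]:
    """Record the output of every leaf module during one forward pass."""
    activations: Dict[str, torch.Tensor] = {}
    handles = []
    for name, module in model.named_modules():
        if name == "" or len(list(module.children())) > 0:
            continue  # skip the root and container modules

        def hook(_module, _inputs, output, name=name):
            activations[name] = output.detach()

        handles.append(module.register_forward_hook(hook))

    with torch.no_grad():
        model(sample)
    for handle in handles:
        handle.remove()  # leave the model without hooks
    return activations


torch.manual_seed(0)
reference = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4)).eval()
# Stand-in for a lower-precision mode: round the weights through fp16.
candidate = copy.deepcopy(reference).half().float()
sample = torch.randn(2, 8)

ref_acts = collect_activations_sketch(reference, sample)
cand_acts = collect_activations_sketch(candidate, sample)

# Per-layer MSE of the candidate relative to the reference activations.
per_layer_mse = {
    name: torch.mean((cand_acts[name] - ref_acts[name]) ** 2).item()
    for name in ref_acts
}
```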

dmx.compressor.utils.benchmark.measure_model_runtime(model_maker: Callable[[], Tuple[Module, Callable, Callable, device]], modes: List[EVALUATION_MODE])

Entry function for measuring various runtime statistics

Parameters:

model_maker (Callable[[], Tuple[torch.nn.Module, Callable, Callable, torch.device]]) – A callable that returns the model to be measured, together with some callables to run a sample input through the model or to evaluate the model’s accuracy
modes (List[EVALUATION_MODE]) – List of modes on which to evaluate the model’s runtime

dmx.compressor.utils.benchmark.prepare_model(model: Module, evaluation_mode: EVALUATION_MODE, model_runner: Callable[[Module | DmxModel], None])

Prepares a DMXModel if needed in Baseline, Basic, or Basic_NoVSIMD modes

Parameters:

model (torch.nn.Module) – PyTorch model to prepare
evaluation_mode (EVALUATION_MODE) – The mode for the DMXModel to create out of the torch model
model_runner (Callable[[Union[torch.nn.Module, DmxModel]], None]) – A function for running a sample input through the model