dmx.compressor.utils.benchmark

Functions

collect_layer_activations(model_maker, mode)

Collect the output activations for each named DMX module in the model for the given mode

compute_error(out1, out2)

Computes the MSE error and the maximum delta between the tensors in the given pair of tensor collections

compute_maxdelta_error(t_list1, t_list2)

Compute the maximum delta observed between two tensor elements across all pairs of tensors in the given lists

compute_mse_error(t_list1, t_list2)

Compute sum of MSE errors between corresponding pairs of tensors in the given tensor lists

evaluate_vsimd_ops_deltas(results)

Generate summary table with the runtime impact of each type of vsimd op

gather_tensors(tensor_collection)

Gathers all torch tensors from arbitrary nested structures of Lists and Dicts

measure_mode_perf(model, model_runner, ...)

Measure model's runtime statistics for a given mode

measure_model_accuracy(model_maker, modes)

Entry function for measuring a model's accuracy across various modes

measure_model_error(model_maker, modes[, ...])

Entry function for measuring the error at the output of each layer of the model for each mode relative to that layer's output in the reference mode

measure_model_runtime(model_maker, modes)

Entry function for measuring various runtime statistics

prepare_model(model, evaluation_mode, ...)

Prepares a DMXModel if needed in Baseline, Basic, or Basic_NoVSIMD modes

Classes

EVALUATION_MODE(value)

class dmx.compressor.utils.benchmark.EVALUATION_MODE(value)

Bases: Enum

BASELINE = 'Baseline'
BASIC = 'Basic'
BASIC_NOVSIMD = 'Basic_NoVSIMD'
FP8 = 'fp8'
VANILLA = 'Vanilla'
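
The member values listed above are plain strings, and their casing is not uniform ('fp8' is lowercase while the others are capitalized), which matters when a mode is looked up from user input. The following is a minimal sketch of the standard Enum lookups using a local stand-in class; `EvaluationModeMirror` is a hypothetical name that only copies the documented members and is not the library class itself.

```python
from enum import Enum


class EvaluationModeMirror(Enum):
    """Local stand-in that copies the documented members (not the library class)."""

    BASELINE = "Baseline"
    BASIC = "Basic"
    BASIC_NOVSIMD = "Basic_NoVSIMD"
    FP8 = "fp8"
    VANILLA = "Vanilla"


# Look up a member by its string value, e.g. when parsing a command-line argument.
mode = EvaluationModeMirror("Basic_NoVSIMD")

# Look up a member by name and read back its value.
fp8_value = EvaluationModeMirror["FP8"].value

# Enumerate all values in definition order.
all_values = [member.value for member in EvaluationModeMirror]
```

Value lookups are case-sensitive, so `EvaluationModeMirror("FP8")` would raise a `ValueError` in this sketch.
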
dmx.compressor.utils.benchmark.collect_layer_activations(model_maker: Callable[[], Tuple[Module, Callable, Callable, device]], mode: EVALUATION_MODE)

Collect the output activations for each named DMX module in the model for the given mode

Parameters:
  • model_maker (Callable[[], Tuple[torch.nn.Module, Callable, Callable, torch.device]]) – A callable that returns the model to be measured, together with some callables to run a sample input through the model or to evaluate the model’s accuracy

  • mode (EVALUATION_MODE) – The mode for which we want to collect the layer activations
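
To make the `model_maker` contract more concrete, here is a minimal sketch of a callable matching the documented type `Callable[[], Tuple[torch.nn.Module, Callable, Callable, torch.device]]`. Several details are assumptions rather than documented behavior: the order of the two callables (runner first, accuracy evaluator second), the evaluator's signature and return value, and the toy model and random sample input.

```python
from typing import Callable, Tuple

import torch


def model_maker() -> Tuple[torch.nn.Module, Callable, Callable, torch.device]:
    """Hypothetical model_maker: returns (model, runner, evaluator, device)."""
    device = torch.device("cpu")
    model = torch.nn.Sequential(
        torch.nn.Linear(8, 16), torch.nn.ReLU(), torch.nn.Linear(16, 4)
    ).to(device)
    model.eval()
    sample = torch.randn(2, 8, device=device)

    def model_runner(m) -> None:
        # Runs one sample input through the model; the output is discarded.
        with torch.no_grad():
            m(sample)

    def accuracy_evaluator(m) -> float:
        # Assumed shape of an evaluator: fraction of predictions matching dummy labels.
        with torch.no_grad():
            predictions = m(sample).argmax(dim=-1)
        labels = torch.zeros(2, dtype=torch.long, device=device)
        return (predictions == labels).float().mean().item()

    return model, model_runner, accuracy_evaluator, device
```
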

dmx.compressor.utils.benchmark.compute_error(out1: Tensor | List[Any] | Tuple[Any] | Dict[str, Any], out2: Tensor | List[Any] | Tuple[Any] | Dict[str, Any])

Computes the MSE error and the maximum delta between the tensors in the given pair of tensor collections

Parameters:
  • out1 (Union[torch.Tensor, List[Any], Tuple[Any], Dict[str, Any]])

  • out2 (Union[torch.Tensor, List[Any], Tuple[Any], Dict[str, Any]])

dmx.compressor.utils.benchmark.compute_maxdelta_error(t_list1: List[Tensor], t_list2: List[Tensor]) → float

Compute the maximum delta observed between two tensor elements across all pairs of tensors in the given lists

Parameters:
  • t_list1 (List[torch.Tensor])

  • t_list2 (List[torch.Tensor])

Return type:

float
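
One plausible reading of this description is sketched below: take the largest absolute element-wise difference within each pair of corresponding tensors, then the largest of those across all pairs. `max_delta_sketch` is a hypothetical stand-in, not the library implementation, and it assumes the two lists are aligned and that paired tensors have matching shapes.

```python
from typing import List

import torch


def max_delta_sketch(t_list1: List[torch.Tensor], t_list2: List[torch.Tensor]) -> float:
    """Largest absolute element-wise difference over all corresponding pairs."""
    return max(
        ((a - b).abs().max().item() for a, b in zip(t_list1, t_list2)),
        default=0.0,  # empty lists yield 0.0 in this sketch
    )


first = [torch.tensor([1.0, 2.0]), torch.tensor([0.0])]
second = [torch.tensor([1.5, 2.0]), torch.tensor([3.0])]
delta = max_delta_sketch(first, second)  # the second pair dominates: |0.0 - 3.0|
```
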

dmx.compressor.utils.benchmark.compute_mse_error(t_list1: List[Tensor], t_list2: List[Tensor]) → float

Compute sum of MSE errors between corresponding pairs of tensors in the given tensor lists

Parameters:
  • t_list1 (List[torch.Tensor])

  • t_list2 (List[torch.Tensor])

Return type:

float
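
As an illustration of summing per-pair MSE values, the sketch below computes the mean squared error of each corresponding pair with `torch.nn.functional.mse_loss` and adds the results. `summed_mse_sketch` is a hypothetical stand-in; whether the library uses mean reduction within each pair is an assumption.

```python
from typing import List

import torch
import torch.nn.functional as F


def summed_mse_sketch(t_list1: List[torch.Tensor], t_list2: List[torch.Tensor]) -> float:
    """Sum of per-pair mean squared errors between corresponding tensors."""
    return sum(F.mse_loss(a, b).item() for a, b in zip(t_list1, t_list2))


first = [torch.tensor([1.0, 2.0]), torch.tensor([0.0])]
second = [torch.tensor([1.5, 2.0]), torch.tensor([3.0])]
# Pair 1: mean(0.25, 0.0) = 0.125; pair 2: 9.0; sum = 9.125
total = summed_mse_sketch(first, second)
```
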

dmx.compressor.utils.benchmark.evaluate_vsimd_ops_deltas(results: Dict[str, Any])

Generate summary table with the runtime impact of each type of vsimd op

Parameters:

results (Dict[str, Any]) – The runtime measurement results

dmx.compressor.utils.benchmark.gather_tensors(tensor_collection: Tensor | List[Any] | Tuple[Any] | Dict[str, Any]) → List[Tensor]

Gathers all torch tensors from arbitrary nested structures of Lists and Dicts

Parameters:

tensor_collection (Union[torch.Tensor, List[Any], Tuple[Any], Dict[str, Any]]) – A Torch tensor or an arbitrary collection of tensors such as what you would typically get as an output from a HuggingFace model

Return type:

List[torch.Tensor]
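
A recursive traversal is one straightforward way to flatten such nested outputs. The sketch below is a hypothetical stand-in (`gather_tensors_sketch`), not the library code; it assumes that non-tensor leaves such as `None` or integers are skipped and that tensors are returned in traversal order.

```python
from typing import Any, List

import torch


def gather_tensors_sketch(collection: Any) -> List[torch.Tensor]:
    """Collect every tensor found in nested lists, tuples, and dicts."""
    if isinstance(collection, torch.Tensor):
        return [collection]
    if isinstance(collection, dict):
        children = collection.values()
    elif isinstance(collection, (list, tuple)):
        children = collection
    else:
        return []  # non-tensor leaf: skipped in this sketch
    found: List[torch.Tensor] = []
    for child in children:
        found.extend(gather_tensors_sketch(child))
    return found


# A nested structure loosely resembling a model output with cached states.
nested = {
    "logits": torch.ones(2),
    "past": [(torch.zeros(1), torch.zeros(3)), None],
    "n": 3,
}
tensors = gather_tensors_sketch(nested)
```
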

dmx.compressor.utils.benchmark.measure_mode_perf(model: Module, model_runner: Callable[[Module | DmxModel], None], device: device, evaluation_mode: EVALUATION_MODE, n_warmup_runs: int = 1, n_measure_runs: int = 5)

Measure model’s runtime statistics for a given mode

Parameters:
  • model (torch.nn.Module) – PyTorch model to measure

  • model_runner (Callable[[Union[torch.nn.Module, DmxModel]], None]) – Callable to run a sample input through the model

  • device (torch.device) – device on which to run the model

  • evaluation_mode (EVALUATION_MODE) – Model mode (Vanilla torch, Baseline, Basic without VSIMD ops, Basic)

  • n_warmup_runs (int) – Number of warmup runs before gathering statistics

  • n_measure_runs (int) – Number of runs across which to average the gathered statistics
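
The warmup-then-measure pattern implied by `n_warmup_runs` and `n_measure_runs` can be sketched with the standard library alone. `time_runs_sketch` and the statistics it reports are hypothetical; the defaults mirror the documented signature, but which statistics the library actually gathers is not stated here.

```python
import time
from statistics import mean
from typing import Callable, Dict


def time_runs_sketch(
    run_once: Callable[[], None],
    n_warmup_runs: int = 1,
    n_measure_runs: int = 5,
) -> Dict[str, float]:
    """Discard warmup runs, then time the measured runs and summarize them."""
    for _ in range(n_warmup_runs):
        run_once()  # warmup: results are not timed

    durations = []
    for _ in range(n_measure_runs):
        start = time.perf_counter()
        run_once()
        durations.append(time.perf_counter() - start)

    return {"mean_s": mean(durations), "min_s": min(durations), "max_s": max(durations)}


calls = []
stats = time_runs_sketch(lambda: calls.append(1), n_warmup_runs=2, n_measure_runs=3)
```

When the workload runs asynchronously on an accelerator, wall-clock timing like this generally needs a device synchronization before each clock read.
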

dmx.compressor.utils.benchmark.measure_model_accuracy(model_maker: Callable[[], Tuple[Module, Callable, Callable, device]], modes: List[EVALUATION_MODE])

Entry function for measuring a model’s accuracy across various modes

Parameters:
  • model_maker (Callable[[], Tuple[torch.nn.Module, Callable, Callable, torch.device]]) – A callable that returns the model to be measured, together with some callables to run a sample input through the model or to evaluate the model’s accuracy

  • modes (List[EVALUATION_MODE]) – List of modes on which to measure the model’s accuracy

dmx.compressor.utils.benchmark.measure_model_error(model_maker: Callable[[], Tuple[Module, Callable, Callable, device]], modes: List[EVALUATION_MODE], reference_mode: EVALUATION_MODE = EVALUATION_MODE.BASELINE)

Entry function for measuring the error at the output of each layer of the model for each mode relative to that layer’s output in the reference mode

Parameters:
  • model_maker (Callable[[], Tuple[torch.nn.Module, Callable, Callable, torch.device]]) – A callable that returns the model to be measured, together with some callables to run a sample input through the model or to evaluate the model’s accuracy

  • modes (List[EVALUATION_MODE]) – The modes for which we want to evaluate the errors in the layer’s activations

  • reference_mode (EVALUATION_MODE) – The reference mode whose layer activations serve as the ground truth used to evaluate the per-layer errors for the other modes
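
The comparison against a reference mode can be pictured as follows. This sketch assumes the per-mode activations are already available as a dict mapping a mode name to a dict of layer name to output tensor; that layout, the helper name `per_layer_errors_sketch`, and the report format are all hypothetical.

```python
from typing import Dict

import torch
import torch.nn.functional as F

Activations = Dict[str, Dict[str, torch.Tensor]]


def per_layer_errors_sketch(
    activations: Activations, reference_mode: str
) -> Dict[str, Dict[str, Dict[str, float]]]:
    """Compare each mode's layer outputs against the reference mode's outputs."""
    reference = activations[reference_mode]
    report: Dict[str, Dict[str, Dict[str, float]]] = {}
    for mode, layers in activations.items():
        if mode == reference_mode:
            continue  # the reference is the ground truth, not a candidate
        report[mode] = {
            name: {
                "mse": F.mse_loss(out, reference[name]).item(),
                "max_delta": (out - reference[name]).abs().max().item(),
            }
            for name, out in layers.items()
            if name in reference  # layers missing from the reference are skipped
        }
    return report


acts = {
    "Baseline": {"fc": torch.tensor([1.0, 2.0])},
    "Basic": {"fc": torch.tensor([1.0, 4.0])},
}
report = per_layer_errors_sketch(acts, reference_mode="Baseline")
```
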

dmx.compressor.utils.benchmark.measure_model_runtime(model_maker: Callable[[], Tuple[Module, Callable, Callable, device]], modes: List[EVALUATION_MODE])

Entry function for measuring various runtime statistics

Parameters:
  • model_maker (Callable[[], Tuple[torch.nn.Module, Callable, Callable, torch.device]]) – A callable that returns the model to be measured, together with some callables to run a sample input through the model or to evaluate the model’s accuracy

  • modes (List[EVALUATION_MODE]) – List of modes on which to evaluate the model’s runtime

dmx.compressor.utils.benchmark.prepare_model(model: Module, evaluation_mode: EVALUATION_MODE, model_runner: Callable[[Module | DmxModel], None])

Prepares a DMXModel if needed in Baseline, Basic, or Basic_NoVSIMD modes

Parameters:
  • model (torch.nn.Module) – torch model to prepare

  • evaluation_mode (EVALUATION_MODE) – The mode for the DMXModel to create out of the torch model

  • model_runner (Callable[[Union[torch.nn.Module, DmxModel]], None]) – A function for running a sample input through the model