dmx.compressor.utils.benchmark
Functions
| collect_layer_activations | Collect the output activations for each named DMX module in the model for the given mode |
| compute_error | Computes the MSE error and the maximum delta between the tensors in the given pair of tensor collections |
| compute_maxdelta_error | Compute the maximum delta observed between two tensor elements across all pairs of tensors in the given lists |
| compute_mse_error | Compute sum of MSE errors between corresponding pairs of tensors in the given tensor lists |
| evaluate_vsimd_ops_deltas | Generate summary table with the runtime impact of each type of vsimd op |
| gather_tensors | Gathers all torch tensors from arbitrary nested structures of Lists and Dicts |
| measure_mode_perf | Measure model's runtime statistics for a given mode |
| measure_model_accuracy | Entry function for measuring a model's accuracy across various modes |
| measure_model_error | Entry function for measuring the error at the output of each layer of the model for each mode relative to that layer's output in the reference mode |
| measure_model_runtime | Entry function for measuring various runtime statistics |
| prepare_model | Prepares a DMXModel if needed in Baseline, Basic, or Basic_NoVSIMD modes |
Classes
| EVALUATION_MODE |
- class dmx.compressor.utils.benchmark.EVALUATION_MODE(value)
Bases: Enum
- BASELINE = 'Baseline'
- BASIC = 'Basic'
- BASIC_NOVSIMD = 'Basic_NoVSIMD'
- FP8 = 'fp8'
- VANILLA = 'Vanilla'
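Because EVALUATION_MODE is a standard Enum, its members can be looked up by value or by name. The sketch below uses a stand-in class, `EvaluationModeSketch` (a hypothetical name), that only copies the member names and values listed above so that it runs without the dmx package installed.

```python
from enum import Enum


class EvaluationModeSketch(Enum):
    """Stand-in that mirrors the documented member names and values."""

    BASELINE = "Baseline"
    BASIC = "Basic"
    BASIC_NOVSIMD = "Basic_NoVSIMD"
    FP8 = "fp8"
    VANILLA = "Vanilla"


# Standard Enum behaviour: look a member up by its value or by its name.
mode_from_value = EvaluationModeSketch("Basic_NoVSIMD")
mode_from_name = EvaluationModeSketch["FP8"]

# Iteration follows definition order, which is handy for building a modes list.
all_values = [mode.value for mode in EvaluationModeSketch]
```

Note that the values are case-sensitive and not uniformly capitalized ('fp8' is lowercase), so a lookup by value has to match the strings exactly.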
- dmx.compressor.utils.benchmark.collect_layer_activations(model_maker: Callable[[], Tuple[Module, Callable, Callable, device]], mode: EVALUATION_MODE)
Collect the output activations for each named DMX module in the model for the given mode
- Parameters:
model_maker (Callable[[], Tuple[torch.nn.Module, Callable, Callable, torch.device]]) – A callable that returns the model to be measured, together with some callables to run a sample input through the model or to evaluate the model’s accuracy
mode (EVALUATION_MODE) – The mode for which we want to collect the layer activations
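The model_maker argument recurs in several entries on this page, so a minimal sketch of one may help. The signature only says it returns a Tuple[Module, Callable, Callable, device]. The order of the two callables (runner first, accuracy evaluator second), the evaluator's signature and return type, and the toy model are all assumptions made for illustration.

```python
from typing import Callable, Tuple

import torch


def model_maker() -> Tuple[torch.nn.Module, Callable, Callable, torch.device]:
    """Hypothetical model_maker; the tuple order of the callables is assumed."""
    device = torch.device("cpu")
    model = torch.nn.Sequential(
        torch.nn.Linear(8, 16),
        torch.nn.ReLU(),
        torch.nn.Linear(16, 4),
    ).to(device).eval()
    sample = torch.randn(2, 8, device=device)

    def model_runner(m) -> None:
        # Runs one sample input through the model and discards the output.
        with torch.no_grad():
            m(sample)

    def accuracy_evaluator(m) -> float:
        # Placeholder metric (assumption): fraction of rows predicted as class 0.
        with torch.no_grad():
            preds = m(sample).argmax(dim=-1)
        return float((preds == 0).float().mean())

    return model, model_runner, accuracy_evaluator, device


model, model_runner, accuracy_evaluator, device = model_maker()
```

Building the model inside the callable, rather than passing one instance around, presumably lets each mode start from a fresh model.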
- dmx.compressor.utils.benchmark.compute_error(out1: Tensor | List[Any] | Tuple[Any] | Dict[str, Any], out2: Tensor | List[Any] | Tuple[Any] | Dict[str, Any])
Computes the MSE error and the maximum delta between the tensors in the given pair of tensor collections
- Parameters:
out1 (Union[torch.Tensor, List[Any], Tuple[Any], Dict[str, Any]])
out2 (Union[torch.Tensor, List[Any], Tuple[Any], Dict[str, Any]])
- dmx.compressor.utils.benchmark.compute_maxdelta_error(t_list1: List[Tensor], t_list2: List[Tensor]) → float
Compute the maximum delta observed between two tensor elements across all pairs of tensors in the given lists
- Parameters:
t_list1 (List[torch.Tensor])
t_list2 (List[torch.Tensor])
- Return type:
float
- dmx.compressor.utils.benchmark.compute_mse_error(t_list1: List[Tensor], t_list2: List[Tensor]) → float
Compute sum of MSE errors between corresponding pairs of tensors in the given tensor lists
- Parameters:
t_list1 (List[torch.Tensor])
t_list2 (List[torch.Tensor])
- Return type:
float
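The two metrics above can be sketched in a few lines of PyTorch. This is not the library's implementation: the function names are hypothetical, and treating "delta" as the absolute element-wise difference is an assumption.

```python
from typing import List

import torch


def mse_sum_sketch(t_list1: List[torch.Tensor], t_list2: List[torch.Tensor]) -> float:
    """Sum of the per-pair mean squared errors."""
    total = 0.0
    for a, b in zip(t_list1, t_list2):
        total += torch.mean((a.float() - b.float()) ** 2).item()
    return total


def max_delta_sketch(t_list1: List[torch.Tensor], t_list2: List[torch.Tensor]) -> float:
    """Largest absolute element-wise difference over all pairs (assumed meaning)."""
    max_delta = 0.0
    for a, b in zip(t_list1, t_list2):
        max_delta = max(max_delta, (a.float() - b.float()).abs().max().item())
    return max_delta


xs = [torch.tensor([1.0, 2.0]), torch.tensor([[0.0, 0.0]])]
ys = [torch.tensor([1.0, 4.0]), torch.tensor([[3.0, 0.0]])]

mse = mse_sum_sketch(xs, ys)         # pair MSEs are 2.0 and 4.5
max_delta = max_delta_sketch(xs, ys)  # largest difference is |0.0 - 3.0|
```

Because the MSE values are summed rather than averaged, the result grows with the number of tensor pairs, so it is best compared between runs that produce the same set of tensors.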
- dmx.compressor.utils.benchmark.evaluate_vsimd_ops_deltas(results: Dict[str, Any])
Generate summary table with the runtime impact of each type of vsimd op
- Parameters:
results (Dict[str, Any]) – The runtime measurement results
- dmx.compressor.utils.benchmark.gather_tensors(tensor_collection: Tensor | List[Any] | Tuple[Any] | Dict[str, Any]) → List[Tensor]
Gathers all torch tensors from arbitrary nested structures of Lists and Dicts
- Parameters:
tensor_collection (Union[torch.Tensor, List[Any], Tuple[Any], Dict[str, Any]]) – A Torch tensor or an arbitrary collection of tensors such as what you would typically get as an output from a HuggingFace model
- Return type:
List[torch.Tensor]
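A recursive flattening of this kind might look like the sketch below. It is an illustration, not the library's code: `gather_tensors_sketch` is a hypothetical name, and silently skipping non-tensor leaves such as None or floats is an assumption.

```python
from typing import Any, List

import torch


def gather_tensors_sketch(collection: Any) -> List[torch.Tensor]:
    """Depth-first collection of every tensor in nested lists, tuples and dicts."""
    if isinstance(collection, torch.Tensor):
        return [collection]
    if isinstance(collection, dict):
        children = collection.values()
    elif isinstance(collection, (list, tuple)):
        children = collection
    else:
        # Assumption: non-tensor leaves (None, numbers, strings) are ignored.
        return []
    found: List[torch.Tensor] = []
    for child in children:
        found.extend(gather_tensors_sketch(child))
    return found


nested = {
    "logits": torch.zeros(2, 3),
    "hidden_states": (torch.ones(1), [torch.ones(2), None]),
    "loss": 0.5,
}
shapes = [tuple(t.shape) for t in gather_tensors_sketch(nested)]
```

Dict values are visited in insertion order, so two outputs with the same structure yield tensors in the same order, which is what a pairwise comparison needs.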
- dmx.compressor.utils.benchmark.measure_mode_perf(model: Module, model_runner: Callable[[Module | DmxModel], None], device: device, evaluation_mode: EVALUATION_MODE, n_warmup_runs: int = 1, n_measure_runs: int = 5)
Measure model’s runtime statistics for a given mode
- Parameters:
model (torch.nn.Module) – PyTorch model to measure
model_runner (Callable[[Union[torch.nn.Module, DmxModel]], None]) – Callable to run a sample input through the model
device (torch.device) – Device on which to run the model
evaluation_mode (EVALUATION_MODE) – Model mode (Vanilla torch, Baseline, Basic without VSIMD ops, Basic)
n_warmup_runs (int) – Number of warmup runs before gathering statistics
n_measure_runs (int) – Number of runs across which to average the gathered statistics
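The n_warmup_runs and n_measure_runs parameters suggest the usual warm-up-then-measure pattern. The sketch below shows that pattern with the standard library only. The helper name, the returned keys, and the use of wall-clock time are assumptions, and the statistics that measure_mode_perf actually gathers are not documented here.

```python
import statistics
import time
from typing import Callable, Dict


def time_runs_sketch(
    run_once: Callable[[], None],
    n_warmup_runs: int = 1,
    n_measure_runs: int = 5,
) -> Dict[str, float]:
    """Discard warm-up runs, then time the remaining runs individually."""
    for _ in range(n_warmup_runs):
        run_once()  # warm caches, lazy initialisation, etc.; not timed

    samples = []
    for _ in range(n_measure_runs):
        start = time.perf_counter()
        run_once()
        samples.append(time.perf_counter() - start)

    return {
        "mean_s": statistics.mean(samples),
        "min_s": min(samples),
        "max_s": max(samples),
    }


calls = []
stats = time_runs_sketch(lambda: calls.append(1), n_warmup_runs=2, n_measure_runs=3)
```

When the model runs on a CUDA device, kernels execute asynchronously, so a wall-clock measurement like this one would also need a `torch.cuda.synchronize()` call before each timestamp.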
- dmx.compressor.utils.benchmark.measure_model_accuracy(model_maker: Callable[[], Tuple[Module, Callable, Callable, device]], modes: List[EVALUATION_MODE])
Entry function for measuring a model’s accuracy across various modes
- Parameters:
model_maker (Callable[[], Tuple[torch.nn.Module, Callable, Callable, torch.device]]) – A callable that returns the model to be measured, together with some callables to run a sample input through the model or to evaluate the model’s accuracy
modes (List[EVALUATION_MODE]) – List of modes on which to measure the model’s accuracy
- dmx.compressor.utils.benchmark.measure_model_error(model_maker: Callable[[], Tuple[Module, Callable, Callable, device]], modes: List[EVALUATION_MODE], reference_mode: EVALUATION_MODE = EVALUATION_MODE.BASELINE)
Entry function for measuring the error at the output of each layer of the model for each mode relative to that layer’s output in the reference mode
- Parameters:
model_maker (Callable[[], Tuple[torch.nn.Module, Callable, Callable, torch.device]]) – A callable that returns the model to be measured, together with some callables to run a sample input through the model or to evaluate the model’s accuracy
modes (List[EVALUATION_MODE]) – The modes for which we want to evaluate the errors in the layer’s activations
reference_mode (EVALUATION_MODE) – The reference mode whose layer activations serve as the ground truth used to evaluate the per-layer errors for the other modes
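The reference-mode comparison described above can be pictured as a lookup of each layer's output in the reference mode followed by an error computation. The sketch below is a dependency-free illustration under stated assumptions: the helper name, the nested-dict layout (mode, then layer name, then output), and the use of plain floats in place of tensors are all hypothetical.

```python
from typing import Any, Callable, Dict


def per_layer_errors_sketch(
    activations: Dict[str, Dict[str, Any]],
    reference_mode: str,
    error_fn: Callable[[Any, Any], float],
) -> Dict[str, Dict[str, float]]:
    """Compare each mode's per-layer outputs against the reference mode."""
    reference = activations[reference_mode]
    report: Dict[str, Dict[str, float]] = {}
    for mode, layers in activations.items():
        if mode == reference_mode:
            continue  # the reference is the ground truth, not a candidate
        report[mode] = {
            name: error_fn(reference[name], output)
            for name, output in layers.items()
            if name in reference  # only layers present in both modes
        }
    return report


# Floats stand in for layer activations to keep the example self-contained.
activations = {
    "Baseline": {"fc1": 1.0, "fc2": 2.0},
    "Basic": {"fc1": 1.5, "fc2": 2.0},
}
report = per_layer_errors_sketch(activations, "Baseline", lambda ref, out: abs(ref - out))
```

Reading such a report layer by layer shows where a mode first diverges from the reference, which a single end-to-end accuracy number cannot show.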
- dmx.compressor.utils.benchmark.measure_model_runtime(model_maker: Callable[[], Tuple[Module, Callable, Callable, device]], modes: List[EVALUATION_MODE])
Entry function for measuring various runtime statistics
- Parameters:
model_maker (Callable[[], Tuple[torch.nn.Module, Callable, Callable, torch.device]]) – A callable that returns the model to be measured, together with some callables to run a sample input through the model or to evaluate the model’s accuracy
modes (List[EVALUATION_MODE]) – List of modes on which to evaluate the model’s runtime
- dmx.compressor.utils.benchmark.prepare_model(model: Module, evaluation_mode: EVALUATION_MODE, model_runner: Callable[[Module | DmxModel], None])
Prepares a DMXModel if needed in Baseline, Basic, or Basic_NoVSIMD modes
- Parameters:
model (torch.nn.Module) – torch model to prepare
evaluation_mode (EVALUATION_MODE) – The mode for the DMXModel to create out of the torch model
model_runner (Callable[[Union[torch.nn.Module, DmxModel]], None]) – A function for running a sample input through the model