dmx.compressor.advanced_recipe

Classes

DmxApproximationFunctionTuningHyperparams([...])

Approximation function extra_params tuning hyperparameters with default values

DmxApproximationFunctionTuningRecipe(hp_gen, ...)

Approximation function extra_params tuning recipe

DmxBaseRecipe(hp_gen, **kwargs)

Abstract base class for ADVANCED mode recipes.

DmxGPTQRecipe(hp_gen, **kwargs)

GPTQ recipe

DmxModuleGPTQHyperparams([microblock_size, ...])

DmxModule GPTQ hyperparameters with default values

DmxModuleQuantizerCalibrationHyperparams([...])

Calibration hyperparameters, with default values, for the boundary cast quantizers of a DmxModule

DmxModuleSmoothQuantHyperparams([...])

DmxModule SmoothQuant hyperparameters with default values

DmxQuantizerCalibrationHyperparams(...)

Fake quantizer (i.e. CastTo) calibration hyperparameters with default values.

DmxQuantizerCalibrationRecipe(hp_gen, **kwargs)

Fake quantizer calibration recipe

DmxSLaNCHyperparams([position, mlp_type, ...])

SLaNC hyperparameters with default values

DmxSLaNCRecipe(hp_gen, **kwargs)

SLaNC norm tuning recipe for LayerNorm and RMSNorm. Paper: https://arxiv.org/abs/2410.10553

DmxSmoothQuantRecipe(hp_gen, **kwargs)

SmoothQuant recipe

class dmx.compressor.advanced_recipe.DmxApproximationFunctionTuningHyperparams(search_space: Space | None = None)

Bases: object

Approximation function extra_params tuning hyperparameters with default values

search_space: Space | None = None

class dmx.compressor.advanced_recipe.DmxApproximationFunctionTuningRecipe(hp_gen, **kwargs)

Bases: DmxBaseRecipe

Approximation function extra_params tuning recipe

class dmx.compressor.advanced_recipe.DmxBaseRecipe(hp_gen: Callable, **kwargs)

Bases: ABC

Abstract base class for ADVANCED mode recipes.

applied_to(_model, save_checkpoint_to: str | None = None)

class dmx.compressor.advanced_recipe.DmxGPTQRecipe(hp_gen, **kwargs)

Bases: DmxBaseRecipe

GPTQ recipe
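
The `percdamp` and `block_size` fields of DmxModuleGPTQHyperparams share their names and default values with parameters of the reference GPTQ implementation. There, `percdamp` adds a fraction of the mean Hessian diagonal to every diagonal entry before inversion, and `block_size` sets how many weight columns are processed per block. The sketch below illustrates those two roles in plain Python, assuming the dmx fields carry the same meaning. The helper names `damp_hessian` and `column_blocks` are hypothetical and not part of dmx.

```python
def damp_hessian(hessian, percdamp=0.01):
    """Return a copy of `hessian` with percdamp * mean(diag) added to its diagonal."""
    n = len(hessian)
    mean_diag = sum(hessian[i][i] for i in range(n)) / n
    damp = percdamp * mean_diag
    return [
        [value + damp if i == j else value for j, value in enumerate(row)]
        for i, row in enumerate(hessian)
    ]


def column_blocks(num_columns, block_size=128):
    """Split column indices into consecutive [start, stop) blocks."""
    return [
        (start, min(start + block_size, num_columns))
        for start in range(0, num_columns, block_size)
    ]


# Toy 2x2 Hessian; the mean of its diagonal is 3.0, so damp = 0.01 * 3.0.
hessian = [[2.0, 0.5], [0.5, 4.0]]
damped = damp_hessian(hessian)

# 300 columns with the default block size yield two full blocks and a remainder.
blocks = column_blocks(300)
```

A larger `percdamp` makes the damped Hessian better conditioned at the cost of a less exact second-order correction.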

class dmx.compressor.advanced_recipe.DmxModuleGPTQHyperparams(microblock_size: int = 1, block_size: int = 128, percdamp: float = 0.01)

Bases: object

DmxModule GPTQ hyperparameters with default values

block_size: int = 128
microblock_size: int = 1
percdamp: float = 0.01

class dmx.compressor.advanced_recipe.DmxModuleQuantizerCalibrationHyperparams(inputs: List[DmxQuantizerCalibrationHyperparams] | None = None, outputs: List[DmxQuantizerCalibrationHyperparams] | None = None, weight: DmxQuantizerCalibrationHyperparams | None = None, weight_storage: DmxQuantizerCalibrationHyperparams | None = None)

Bases: object

Calibration hyperparameters, with default values, for the boundary cast quantizers of a DmxModule

inputs: List[DmxQuantizerCalibrationHyperparams] | None = None
outputs: List[DmxQuantizerCalibrationHyperparams] | None = None
weight: DmxQuantizerCalibrationHyperparams | None = None
weight_storage: DmxQuantizerCalibrationHyperparams | None = None

class dmx.compressor.advanced_recipe.DmxModuleSmoothQuantHyperparams(migration_strength: float = 0.5, fuse_to_weight: bool = False)

Bases: object

DmxModule SmoothQuant hyperparameters with default values

fuse_to_weight: bool = False
migration_strength: float = 0.5

class dmx.compressor.advanced_recipe.DmxQuantizerCalibrationHyperparams(observer_cls: torch.ao.quantization.observer.ObserverBase = dmx.compressor.numerical.observer.HistogramObserver, qscheme_to_overload: torch.qscheme = torch.per_tensor_symmetric, group_size: int | None = None, ch_axis: int | None = None)

Bases: object

Fake quantizer (i.e. CastTo) calibration hyperparameters with default values

ch_axis: int | None = None
group_size: int | None = None
observer_cls

alias of HistogramObserver

qscheme_to_overload: qscheme = torch.per_tensor_symmetric

class dmx.compressor.advanced_recipe.DmxQuantizerCalibrationRecipe(hp_gen, **kwargs)

Bases: DmxBaseRecipe

Fake quantizer calibration recipe
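
The default `qscheme_to_overload` is `torch.per_tensor_symmetric`, meaning one scale for the whole tensor with the zero point fixed at 0. The sketch below contrasts that with grouped scales in plain Python. It is only an illustration under several assumptions: an int8-style grid (`qmax=127`), a simple absolute-maximum statistic rather than the default HistogramObserver, and `group_size` read as the number of consecutive elements that share a scale. The function names are hypothetical and not part of dmx.

```python
def symmetric_scales(values, group_size=None, qmax=127):
    """One symmetric scale per group; a single group when group_size is None.

    Assumes `values` is a non-empty flat list of floats.
    """
    size = len(values) if group_size is None else group_size
    scales = []
    for start in range(0, len(values), size):
        absmax = max(abs(v) for v in values[start:start + size])
        # Guard against an all-zero group, which would give a zero scale.
        scales.append(absmax / qmax if absmax > 0 else 1.0)
    return scales


def fake_quantize(values, group_size=None, qmax=127):
    """Quantize to the symmetric integer grid and map back to floats."""
    size = len(values) if group_size is None else group_size
    scales = symmetric_scales(values, group_size, qmax)
    result = []
    for index, value in enumerate(values):
        scale = scales[index // size]
        level = max(-qmax, min(qmax, round(value / scale)))
        result.append(level * scale)
    return result


values = [0.5, -1.0, 0.25, 8.0]
per_tensor = fake_quantize(values)                # one scale, set by 8.0
per_group = fake_quantize(values, group_size=2)   # separate scale for [0.5, -1.0]
```

In this toy input the outlier 8.0 dominates the per-tensor scale, so the grouped variant reproduces -1.0 more closely.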

class dmx.compressor.advanced_recipe.DmxSLaNCHyperparams(position: str | None = None, mlp_type: str | None = None, device: device | None = None, prev_ln_weight: Module | None = None, fc1: Module | None = None, fc2: Module | None = None, up_proj: Module | None = None, down_proj: Module | None = None, gate_proj: Module | None = None, v_proj: Module | None = None, o_proj: Module | None = None)

Bases: object

SLaNC hyperparameters with default values

device: device | None = None
down_proj: Module | None = None
fc1: Module | None = None
fc2: Module | None = None
gate_proj: Module | None = None
mlp_type: str | None = None
o_proj: Module | None = None
position: str | None = None
prev_ln_weight: Module | None = None
up_proj: Module | None = None
v_proj: Module | None = None

class dmx.compressor.advanced_recipe.DmxSLaNCRecipe(hp_gen, **kwargs)

Bases: DmxBaseRecipe

SLaNC norm tuning recipe for LayerNorm and RMSNorm. Paper: https://arxiv.org/abs/2410.10553

class dmx.compressor.advanced_recipe.DmxSmoothQuantRecipe(hp_gen, **kwargs)

Bases: DmxBaseRecipe

SmoothQuant recipe
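
In the SmoothQuant paper, the migration strength α sets a per-channel smoothing factor s_j = max|X_j|^α / max|W_j|^(1-α). Activations are divided by s_j and the matching weights are multiplied by it, which moves quantization difficulty from activations to weights. The sketch below works through that formula in plain Python, assuming the `migration_strength` field of DmxModuleSmoothQuantHyperparams plays the role of α. The function name `smoothing_scales` and the sample values are hypothetical.

```python
def smoothing_scales(act_absmax, weight_absmax, migration_strength=0.5):
    """Per-channel SmoothQuant factors s_j = a_j**alpha / w_j**(1 - alpha)."""
    alpha = migration_strength
    return [
        a ** alpha / w ** (1.0 - alpha)
        for a, w in zip(act_absmax, weight_absmax)
    ]


# Per-input-channel absolute maxima of activations and weights (toy values).
act_absmax = [16.0, 4.0]
weight_absmax = [1.0, 4.0]

scales = smoothing_scales(act_absmax, weight_absmax)

# Activations shrink by s_j while weights grow by s_j, so X @ W is unchanged.
smoothed_act = [a / s for a, s in zip(act_absmax, scales)]
smoothed_weight = [w * s for w, s in zip(weight_absmax, scales)]
```

With α = 0.5 the activation and weight ranges of each channel end up equal; values closer to 1 shift more of the range onto the weights.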