dmx.compressor.modeling.nn.custom_modules

Functions

apply_rotary_embeddings(x, cos_embedding, ...)

Classes

`ApplyRotaryPosEmb`()
`ApplyRotaryPosEmbBase`(*args, **kwargs)
`BloomGELU`(*args, **kwargs)
`ClippedGELU`(*args, **kwargs)
`FastGELU`(*args, **kwargs)
`GemmaRMSNorm`(dim[, eps])	An extension of RMSNorm layer to support DmxModule configurations.
`NewGELU`(*args, **kwargs)
`QuickGELU`(*args, **kwargs)
`RotaryEmbedding`(config[, device])

class dmx.compressor.modeling.nn.custom_modules.ApplyRotaryPosEmb

Bases: DmxModule, ApplyRotaryPosEmbBase

to_compiler_graph(): Returns a compiler friendly graph

training: bool

class dmx.compressor.modeling.nn.custom_modules.ApplyRotaryPosEmbBase(*args, **kwargs)

Bases: Module

forward(q, k, cos, sin, unsqueeze_dim=1)

Applies Rotary Position Embedding to the query and key tensors.

Parameters:

q (torch.Tensor) – The query tensor.
k (torch.Tensor) – The key tensor.
cos (torch.Tensor) – The cosine part of the rotary embedding.
sin (torch.Tensor) – The sine part of the rotary embedding.
unsqueeze_dim (int, optional, defaults to 1) – The dimension along which to unsqueeze cos[position_ids] and sin[position_ids] so that they broadcast against q and k. cos[position_ids] and sin[position_ids] have the shape [batch_size, seq_len, head_dim]. If q and k have the shape [batch_size, heads, seq_len, head_dim], set unsqueeze_dim=1; if q and k have the shape [batch_size, seq_len, heads, head_dim], set unsqueeze_dim=2.

Returns:

tuple(torch.Tensor) comprising the query and key tensors rotated using the Rotary Position Embedding.

rotate_half(x): Rotates half the hidden dims of the input.

class dmx.compressor.modeling.nn.custom_modules.BloomGELU(*args, **kwargs)

Bases: GELUBase

functional_forward(x)

training: bool

class dmx.compressor.modeling.nn.custom_modules.ClippedGELU(*args, **kwargs)

Bases: GELUBase

functional_forward(x)

training: bool

class dmx.compressor.modeling.nn.custom_modules.FastGELU(*args, **kwargs)

Bases: GELUBase

functional_forward(x)

training: bool

class dmx.compressor.modeling.nn.custom_modules.GemmaRMSNorm(dim: int, eps: float = 1e-06)

Bases: DmxModule, GemmaRMSNorm

An extension of the RMSNorm layer to support DmxModule configurations. This module performs RMS-based layer normalization on the input tensor. The normalization is parameterized by dim (the hidden size) and an optional eps value for numerical stability.

Parameters:

dim (int) – The size of the hidden layer (number of hidden units).
eps (float, optional) – A small constant added to the denominator for numerical stability. Defaults to 1e-6.

_forward(_input: Tensor) → Tensor: Computes the forward pass of the RMS layer normalization.

classmethod from_raw(raw: Module) → DmxModule

Creates a new RMSNorm object (DmxModule) from a given PyTorch RMSNorm layer.

Parameters: raw (torch.nn.Module) – A PyTorch RMSNorm layer to be converted.
Returns: An RMSNorm object that has the same configuration as the input PyTorch RMSNorm layer.
Return type: DmxModule

functional_forward(x, weight, eps)

to_compiler_graph() → Graph: Returns a compiler-friendly graph.

training: bool
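
The RMS normalization described for GemmaRMSNorm can be sketched as a standalone function taking the same arguments as functional_forward(x, weight, eps). This is a sketch under stated assumptions: `gemma_rms_norm` is a hypothetical name, and the `(1 + weight)` scale follows the Gemma convention used in Hugging Face Transformers, which this page does not itself confirm for the dmx module.

```python
import torch

def gemma_rms_norm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Hypothetical sketch of RMS normalization over the last dimension."""
    x_f = x.float()
    # Mean of squares over the hidden dimension; eps keeps the denominator away from zero.
    mean_sq = x_f.pow(2).mean(dim=-1, keepdim=True)
    normed = x_f * torch.rsqrt(mean_sq + eps)
    # Assumption: Gemma-style scale of (1 + weight), so a zero weight is the identity scale.
    return (normed * (1.0 + weight.float())).type_as(x)

dim = 8                          # hidden size, illustrative
x = torch.randn(2, 4, dim)
weight = torch.zeros(dim)        # zero weight leaves the normalized values unscaled
y = gemma_rms_norm(x, weight)
print(y.pow(2).mean(dim=-1))     # each entry should be close to 1
```

Unlike LayerNorm, no mean is subtracted; only the root mean square of the hidden units is used for rescaling.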

class dmx.compressor.modeling.nn.custom_modules.NewGELU(*args, **kwargs)

Bases: GELUBase

functional_forward(x)

training: bool

class dmx.compressor.modeling.nn.custom_modules.QuickGELU(*args, **kwargs)

Bases: GELUBase

functional_forward(x)

training: bool
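
This page lists the GELU variants without their formulas. For orientation only, the scalar sketch below compares exact GELU with the two approximations these names usually denote: a tanh-based form (often called "new" GELU) and a sigmoid-based form (often called "quick" GELU). Whether NewGELU and QuickGELU in this module use exactly these formulas is an assumption, and the function names below are hypothetical.

```python
import math

def gelu_exact(x: float) -> float:
    """Exact GELU: x times the standard normal CDF at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    """Tanh approximation (assumed to correspond to the 'new' GELU variant)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))

def gelu_quick(x: float) -> float:
    """Sigmoid approximation x * sigmoid(1.702 * x) (assumed 'quick' GELU variant)."""
    return x / (1.0 + math.exp(-1.702 * x))

# Print the three values side by side at a few sample points.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"x={x:+.1f}  exact={gelu_exact(x):+.4f}  tanh={gelu_tanh(x):+.4f}  quick={gelu_quick(x):+.4f}")
```

The tanh form tracks exact GELU closely, while the sigmoid form trades some accuracy for a cheaper computation.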

class dmx.compressor.modeling.nn.custom_modules.RotaryEmbedding(config, device=None)

Bases: DmxModule, LlamaRotaryEmbedding

classmethod from_raw(raw: Module) → DmxModule

to_compiler_graph() → Graph: Returns a compiler-friendly graph.

training: bool

dmx.compressor.modeling.nn.custom_modules.apply_rotary_embeddings(x, cos_embedding, sin_embedding)
