dmx.compressor.modeling.nn.custom_modules

Functions

apply_rotary_embeddings(x, cos_embedding, ...)

Classes

ApplyRotaryPosEmb()

ApplyRotaryPosEmbBase(*args, **kwargs)

BloomGELU(*args, **kwargs)

ClippedGELU(*args, **kwargs)

FastGELU(*args, **kwargs)

GemmaRMSNorm(dim[, eps])

An extension of the RMSNorm layer to support DmxModule configurations.

NewGELU(*args, **kwargs)

QuickGELU(*args, **kwargs)

RotaryEmbedding(config[, device])

class dmx.compressor.modeling.nn.custom_modules.ApplyRotaryPosEmb

Bases: DmxModule, ApplyRotaryPosEmbBase

to_compiler_graph()

Returns a compiler-friendly graph.

training: bool
class dmx.compressor.modeling.nn.custom_modules.ApplyRotaryPosEmbBase(*args, **kwargs)

Bases: Module

forward(q, k, cos, sin, unsqueeze_dim=1)

Applies Rotary Position Embedding to the query and key tensors.

Parameters:
  • q (torch.Tensor) – The query tensor.

  • k (torch.Tensor) – The key tensor.

  • cos (torch.Tensor) – The cosine part of the rotary embedding.

  • sin (torch.Tensor) – The sine part of the rotary embedding.

  • unsqueeze_dim (int, optional, defaults to 1) – The dimension along which to unsqueeze cos[position_ids] and sin[position_ids] so that they broadcast against q and k. For example, cos[position_ids] and sin[position_ids] have the shape [batch_size, seq_len, head_dim]. If q and k have the shape [batch_size, heads, seq_len, head_dim], setting unsqueeze_dim=1 makes cos[position_ids] and sin[position_ids] broadcastable to the shapes of q and k. Similarly, if q and k have the shape [batch_size, seq_len, heads, head_dim], set unsqueeze_dim=2.

Returns:

tuple(torch.Tensor) comprising the query and key tensors rotated using the Rotary Position Embedding.
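The effect of unsqueeze_dim on broadcasting can be checked with shapes alone. The following is a minimal plain-Python sketch, not the module's code: unsqueeze_shape and broadcast_shape are hypothetical helpers that mimic how a size-1 axis is inserted and then broadcast, and the sizes are arbitrary.

```python
def unsqueeze_shape(shape, dim):
    """Return `shape` with a new axis of size 1 inserted at position `dim`."""
    shape = list(shape)
    return tuple(shape[:dim] + [1] + shape[dim:])


def broadcast_shape(a, b):
    """Return the broadcast of two equal-rank shapes, or raise ValueError."""
    if len(a) != len(b):
        raise ValueError("ranks differ")
    out = []
    for m, n in zip(a, b):
        # Two axes are compatible when they match or when one of them is 1.
        if m != n and 1 not in (m, n):
            raise ValueError(f"cannot broadcast {a} with {b}")
        out.append(max(m, n))
    return tuple(out)


# Arbitrary illustrative sizes (assumptions, not values from the library).
batch_size, heads, seq_len, head_dim = 2, 8, 16, 64
cos_shape = (batch_size, seq_len, head_dim)

# Layout [batch_size, heads, seq_len, head_dim] -> unsqueeze_dim=1
q_shape = (batch_size, heads, seq_len, head_dim)
shape_heads_first = broadcast_shape(q_shape, unsqueeze_shape(cos_shape, 1))

# Layout [batch_size, seq_len, heads, head_dim] -> unsqueeze_dim=2
q_shape_alt = (batch_size, seq_len, heads, head_dim)
shape_seq_first = broadcast_shape(q_shape_alt, unsqueeze_shape(cos_shape, 2))
```

Choosing the wrong value (for example unsqueeze_dim=2 with the heads-first layout) makes the heads and seq_len axes line up against each other, so the shapes no longer broadcast unless those two sizes happen to match.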

rotate_half(x)

Rotates half the hidden dims of the input.
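As an illustration of the rotation, here is a plain-Python sketch on a single head vector. It assumes the commonly used formulation x * cos + rotate_half(x) * sin, with rotate_half mapping [a, b] to [-b, a]; the helper cos_sin_for_position and the base of 10000 are assumptions for the example, and the module itself operates on batched torch.Tensor inputs.

```python
import math


def rotate_half(x):
    """Swap the two halves of x and negate the new first half: [a, b] -> [-b, a]."""
    half = len(x) // 2
    return [-v for v in x[half:]] + list(x[:half])


def apply_rotary(x, cos, sin):
    """Element-wise x * cos + rotate_half(x) * sin for a single head vector."""
    rotated = rotate_half(x)
    return [xi * c + ri * s for xi, ri, c, s in zip(x, rotated, cos, sin)]


def cos_sin_for_position(position, head_dim, base=10000.0):
    """Build cos/sin vectors for one position (hypothetical helper).

    The per-pair frequencies are repeated so both halves share the same angles.
    """
    half = head_dim // 2
    angles = [position * base ** (-2.0 * i / head_dim) for i in range(half)]
    angles = angles + angles
    return [math.cos(a) for a in angles], [math.sin(a) for a in angles]


q = [1.0, 2.0, 3.0, 4.0]
cos, sin = cos_sin_for_position(position=3, head_dim=len(q))
q_rotated = apply_rotary(q, cos, sin)
```

Under these assumptions each pair (x[i], x[i + head_dim // 2]) is rotated by a position-dependent angle, so the vector's norm is unchanged and position 0 leaves the vector as it is.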

class dmx.compressor.modeling.nn.custom_modules.BloomGELU(*args, **kwargs)

Bases: GELUBase

functional_forward(x)
training: bool
class dmx.compressor.modeling.nn.custom_modules.ClippedGELU(*args, **kwargs)

Bases: GELUBase

functional_forward(x)
training: bool
class dmx.compressor.modeling.nn.custom_modules.FastGELU(*args, **kwargs)

Bases: GELUBase

functional_forward(x)
training: bool
class dmx.compressor.modeling.nn.custom_modules.GemmaRMSNorm(dim: int, eps: float = 1e-06)

Bases: DmxModule, GemmaRMSNorm

An extension of the RMSNorm layer to support DmxModule configurations. This module performs RMS-based layer normalization on the input tensor. The normalization is characterized by the hidden size (dim) and an optional eps value for numerical stability.

Parameters:
  • dim (int) – The size of the hidden layer (number of hidden units).

  • eps (float, optional) – A small constant added to the denominator for numerical stability. Defaults to 1e-6.

_forward(_input: Tensor) -> Tensor: Computes the forward pass of the RMS layer normalization.
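The normalization can be sketched in plain Python on a single vector. This is an illustration under an assumption, not the module's implementation: it follows the Gemma convention of scaling the normalized value by (1 + weight), and rms_norm is a hypothetical helper name.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Normalize x by its root mean square, then scale by (1 + weight).

    The (1 + weight) scaling is an assumed Gemma-style convention.
    """
    mean_square = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_square + eps)  # eps guards against division by zero
    return [v * inv_rms * (1.0 + w) for v, w in zip(x, weight)]


x = [0.5, -1.5, 2.0, 4.0]
weight = [0.0] * len(x)  # zero weight means a scale of exactly 1 under this convention
y = rms_norm(x, weight)
```

With a zero weight the output has a root mean square of approximately 1, and eps only matters when the input is close to all zeros.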

classmethod from_raw(raw: Module) → DmxModule

Creates a new RMSNorm object (DmxModule) from a given PyTorch RMSNorm layer.

Parameters:

raw (torch.nn.Module) – A PyTorch RMSNorm layer to be converted.

Returns:

An RMSNorm object that has the same configuration as the input PyTorch RMSNorm layer.

Return type:

DmxModule

functional_forward(x, weight, eps)

to_compiler_graph() → Graph

Returns a compiler-friendly graph.

training: bool
class dmx.compressor.modeling.nn.custom_modules.NewGELU(*args, **kwargs)

Bases: GELUBase

functional_forward(x)
training: bool
class dmx.compressor.modeling.nn.custom_modules.QuickGELU(*args, **kwargs)

Bases: GELUBase

functional_forward(x)
training: bool
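The GELU variants in this module are listed without formulas. As a rough guide only, the sketch below assumes that NewGELU, FastGELU and QuickGELU follow the approximations used by the same-named Hugging Face activations; the function names and constants are assumptions, written as plain-Python scalar functions rather than the tensor-based functional_forward.

```python
import math


def exact_gelu(x):
    """Reference GELU using the Gaussian error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def new_gelu(x):
    """tanh approximation (assumed form of NewGELU)."""
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


def fast_gelu(x):
    """Same tanh approximation with sqrt(2/pi) precomputed (assumed form of FastGELU)."""
    return 0.5 * x * (1.0 + math.tanh(x * 0.7978845608 * (1.0 + 0.044715 * x * x)))


def quick_gelu(x):
    """Sigmoid approximation x * sigmoid(1.702 * x) (assumed form of QuickGELU)."""
    return x / (1.0 + math.exp(-1.702 * x))


values = {name: fn(1.0) for name, fn in [
    ("exact", exact_gelu), ("new", new_gelu), ("fast", fast_gelu), ("quick", quick_gelu),
]}
```

Under these assumptions the tanh-based forms agree with each other almost exactly and track the erf-based GELU closely, while the sigmoid form is a coarser approximation.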
class dmx.compressor.modeling.nn.custom_modules.RotaryEmbedding(config, device=None)

Bases: DmxModule, LlamaRotaryEmbedding

classmethod from_raw(raw: Module) → DmxModule

to_compiler_graph() → Graph

Returns a compiler-friendly graph.

training: bool
dmx.compressor.modeling.nn.custom_modules.apply_rotary_embeddings(x, cos_embedding, sin_embedding)