dmx.compressor.sparse.BlockTopK

class dmx.compressor.sparse.BlockTopK(K=4, block_size=8, block_dim=-1, mask_gradient=False)

Fine-grained structured sparsity that keeps K non-zero elements in every block of block_size elements along block_dim.
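To make the K-out-of-block_size pattern concrete, here is a minimal pure-Python sketch of the masking rule on a flat list of scores. The helper name `block_topk_mask` is hypothetical and the code is not the library's implementation. It only assumes that the K highest-scoring elements in each block are kept and the rest are zeroed.

```python
def block_topk_mask(scores, k=4, block_size=8):
    """Return a 0/1 mask that keeps the k largest scores in each block.

    Illustrative sketch only: works on a flat list rather than on a
    chosen tensor dimension, and is not dmx's actual code.
    """
    if len(scores) % block_size != 0:
        raise ValueError("len(scores) must be a multiple of block_size")
    mask = [0] * len(scores)
    for start in range(0, len(scores), block_size):
        block = range(start, start + block_size)
        # Indices of the k largest scores in this block; the sort is
        # stable, so ties go to the earlier position.
        keep = sorted(block, key=lambda i: scores[i], reverse=True)[:k]
        for i in keep:
            mask[i] = 1
    return mask


scores = [0.9, 0.1, 0.5, 0.3, 0.8, 0.2, 0.7, 0.4,
          0.0, 0.6, 0.1, 0.9, 0.2, 0.3, 0.8, 0.5]
mask = block_topk_mask(scores, k=4, block_size=8)
print(mask)
```

With the default K=4 and block_size=8, every block of eight elements ends up with exactly four ones in its mask, i.e. 50% sparsity per block.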

Methods

`__init__`([K, block_size, block_dim, ...])
`apply`(*args, **kwargs)
`backward`(ctx, grad_output)	Define a formula for differentiating the operation with backward mode automatic differentiation.
`forward`(ctx, score, params)	Define the forward of the custom autograd Function.
`from_shorthand`(sh)
`get_mask`(score)
`jvp`(ctx, *grad_inputs)	Define a formula for differentiating the operation with forward mode automatic differentiation.
`mark_dirty`(*args)	Mark given tensors as modified in an in-place operation.
`mark_non_differentiable`(*args)	Mark outputs as non-differentiable.
`mark_shared_storage`(*pairs)
`maybe_clear_saved_tensors`
`name`
`register_hook`
`register_prehook`
`save_for_backward`(*tensors)	Save given tensors for a future call to `backward()`.
`save_for_forward`(*tensors)	Save given tensors for a future call to `jvp()`.
`set_materialize_grads`(value)	Set whether to materialize grad tensors.
`setup_context`(ctx, inputs, output)	There are two ways to define the forward pass of an autograd.Function.
`vjp`(ctx, *grad_outputs)	Define a formula for differentiating the operation with backward mode automatic differentiation.
`vmap`(info, in_dims, *args)	Define the behavior for this autograd.Function underneath `torch.vmap()`.

Attributes