dmx.compressor.numerical.format.FloatingPoint

class dmx.compressor.numerical.format.FloatingPoint(mantissa=23, exponent=8, bias=None, flush_subnormal=True, unsigned=False, rounding='nearest')

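A floating-point format simulated in FP32 using QPyTorch. The default arguments (mantissa=23, exponent=8) correspond to the IEEE 754 single-precision layout, which has 23 fraction bits and 8 exponent bits.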
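
To illustrate what "simulated in FP32" means in practice, the following pure-Python sketch rounds a single value to the nearest number representable with a given mantissa and exponent width, while still storing the result in an ordinary float. It is not the library's or QPyTorch's code: the helper name quantize_float is hypothetical, and the IEEE-style default bias, the reserved top exponent code, saturation on overflow, and flushing of subnormals to zero are all assumptions made for the example.

```python
import math


def quantize_float(x: float, mantissa: int = 23, exponent: int = 8) -> float:
    """Round x to the nearest value with `mantissa` fraction bits.

    Assumptions (not taken from the library): IEEE-style bias, top exponent
    code reserved, overflow saturates, subnormal results flush to zero.
    """
    if x == 0.0 or math.isnan(x) or math.isinf(x):
        return x

    bias = 2 ** (exponent - 1) - 1          # assumed default bias
    e_min = 1 - bias                        # smallest normal exponent
    e_max = (2 ** exponent - 2) - bias      # largest normal exponent

    # frexp gives abs(x) = m * 2**e with 0.5 <= m < 1; shift so that the
    # significand lies in [1, 2), matching the usual 1.fff... convention.
    _, e = math.frexp(abs(x))
    e -= 1

    if e < e_min:
        # Below the normal range: flush to (signed) zero.
        return math.copysign(0.0, x)

    # Spacing between representable values in this binade.
    step = 2.0 ** (e - mantissa)
    # Python's round() rounds halves to even, i.e. round-to-nearest-even.
    q = round(abs(x) / step) * step

    # Saturate at the largest finite value of the target format.
    max_value = (2.0 - 2.0 ** -mantissa) * 2.0 ** e_max
    return math.copysign(min(q, max_value), x)


# A format with 5 exponent bits and 2 mantissa bits:
print(quantize_float(1.2, mantissa=2, exponent=5))   # 1.25
print(quantize_float(1e6, mantissa=2, exponent=5))   # 57344.0 (saturated)
print(quantize_float(1e-6, mantissa=2, exponent=5))  # 0.0 (flushed)
```

The result is always an ordinary float that happens to lie on the coarser grid of the target format, which is the general idea behind simulating a narrow format inside a wider one.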

__init__(mantissa=23, exponent=8, bias=None, flush_subnormal=True, unsigned=False, rounding='nearest')

Methods

__init__([mantissa, exponent, bias, ...])

cast(x, *args)

from_shorthand(sh)

Attributes

bfp_id

bit_precision

blocked

bytes_per_elem

largest_representable_power_of_two
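
The attributes above are listed without descriptions. As a rough guide to how quantities of this kind follow from the constructor arguments, the sketch below derives a bit width, a byte size, a bias, and the exponent of the largest representable power of two. The function format_summary and its dictionary keys are hypothetical, and the IEEE-style conventions it uses (one sign bit unless unsigned, default bias of 2**(exponent-1) - 1, top exponent code reserved) are assumptions; the values the library actually reports may differ.

```python
def format_summary(mantissa=23, exponent=8, bias=None, unsigned=False):
    """Derive basic size and range figures for a floating-point layout.

    All conventions here are IEEE-style assumptions, not library behavior.
    """
    if bias is None:
        bias = 2 ** (exponent - 1) - 1      # assumed default bias

    sign_bits = 0 if unsigned else 1
    total_bits = sign_bits + exponent + mantissa

    # Largest exponent of a finite value, assuming the all-ones exponent
    # code is reserved (as it is in IEEE 754).
    max_exponent = (2 ** exponent - 2) - bias

    return {
        "total_bits": total_bits,
        "bytes": total_bits / 8,
        "bias": bias,
        "largest_power_of_two_exponent": max_exponent,
    }


print(format_summary())                        # FP32-like defaults
print(format_summary(mantissa=2, exponent=5))  # an 8-bit format
```

With the default arguments this yields 32 bits (4 bytes), a bias of 127, and a largest power of two of 2**127, which matches IEEE 754 single precision.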