Question:

Storing the float16 maximum value in float32

林浩漫

2023-03-14

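How can I store the largest float16 (https://en.wikipedia.org/wiki/Half-precision_floating-point_format) value in float32 (https://en.wikipedia.org/wiki/Single-precision_floating-point_format) format?

I want a function that converts 0x7bff to 65504. 0x7bff is the largest value that can be represented in half-precision floating point:

0 11110 1111111111 -> decimal value: 65504

I want to use 0x7bff in my program to represent the actual bits.

float fp16_max = bit_cast(0x7bff);
// want "std::cout << fp16_max" to be 65504

I tried to implement such a function, but it does not seem to work: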

float bit_cast (uint32_t fp16_bits) {
    float i;
    memcpy(&i, &fp16_bits, 4);
    return i; 
}    
float test = bit_cast(0x7bff);
// print out test: 4.44814e-41

3 answers

刘乐童

2023-03-14

How do I store the float16 maximum value in float32 format?

65504

You can simply convert the integer to float:

float half_max = 65504;

If you want to compute the value, you can use ldexpf:

float half_max = (2 - ldexpf(1, -10)) * ldexpf(1, 15);

Or, for any IEEE floating-point format:

// in case of half float
int bits = 16;
int man_bits = 10;

// the calculation
int exp_bits = bits - man_bits - 1;
int exp_max = (1 << (exp_bits - 1)) - 1;
long double max = (2 - ldexp(1, -1 * man_bits)) * ldexp(1, exp_max);
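
As a quick sanity check of the general formula above, here is a minimal sketch that wraps it in a function. The name ieee_max_finite is hypothetical (it is not part of the answer), and the result is returned as a double, so it only covers formats whose maximum fits in a double.

```cpp
#include <cmath>

// Largest finite value of an IEEE-754 binary interchange format.
// bits:     total width of the format (16 for half, 32 for single)
// man_bits: number of explicitly stored significand bits (10 for half)
// NOTE: ieee_max_finite is a hypothetical helper name used for illustration.
double ieee_max_finite(int bits, int man_bits)
{
    int exp_bits = bits - man_bits - 1;         // one bit is the sign
    int exp_max  = (1 << (exp_bits - 1)) - 1;   // largest unbiased exponent
    // Significand of all ones is 2 - 2^-man_bits; scale it by 2^exp_max.
    return (2.0 - std::ldexp(1.0, -man_bits)) * std::ldexp(1.0, exp_max);
}
```

For half precision this is (2 - 2^-10) * 2^15 = 65536 - 32 = 65504, and calling ieee_max_finite(32, 23) gives the largest finite single-precision value.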

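Bit casting 0x7bff does not work, because 0x7bff is the representation in binary16 format (in some endianness), not in binary32 format. You cannot bit cast between conflicting representations.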
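
To make that concrete: copied unchanged into a 32-bit float, the pattern 0x00007bff is a binary32 subnormal, 31743 * 2^-149, which is about 4.448e-41, the value the question printed. If the bits have to stay as 0x7bff in the program, one option is to re-encode the binary16 fields as binary32 fields before the memcpy. The following is a sketch only, assuming that float is IEEE-754 binary32 and that integers and floats share the same byte order; half_bits_to_float is a hypothetical name, and NaN payloads are simply shifted, not otherwise preserved.

```cpp
#include <cstdint>
#include <cstring>

// Convert a binary16 bit pattern to the float with the same value.
// ASSUMPTION: float is IEEE-754 binary32 on this platform.
// NOTE: half_bits_to_float is a hypothetical helper name.
static float half_bits_to_float(std::uint16_t h)
{
    std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
    int           exp  = (h >> 10) & 0x1f;   // 5-bit biased exponent
    std::uint32_t man  = h & 0x3ffu;         // 10-bit significand field
    std::uint32_t out;

    if (exp == 0x1f) {
        // Infinity or NaN: exponent field of all ones in binary32 too.
        out = sign | 0x7f800000u | (man << 13);
    } else if (exp == 0 && man == 0) {
        // Signed zero.
        out = sign;
    } else {
        if (exp == 0) {
            // Subnormal: shift until the leading one reaches bit 10,
            // lowering the exponent once per shift, then drop that bit.
            exp = 1;
            while ((man & 0x400u) == 0) { man <<= 1; --exp; }
            man &= 0x3ffu;
        }
        // Rebias the exponent (127 - 15 = 112) and widen the significand
        // field from 10 to 23 bits.
        out = sign | (static_cast<std::uint32_t>(exp + 112) << 23) | (man << 13);
    }

    float f;
    std::memcpy(&f, &out, sizeof f);   // now a valid binary32 pattern
    return f;
}
```

With this helper, float fp16_max = half_bits_to_float(0x7bff); stores 65504 (the binary32 pattern 0x477fe000).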

严天逸

2023-03-14

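By its declaration, your value is already a 32-bit float; there is no need to cast here. I think you can simply write:

float i = fp16_max;

The assumption here is that your "magic" bit_cast function already correctly returns a 32-bit float. Since you have not shown us what bit_cast does or what it actually returns, I will assume that it does return a correct float value.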

方奕

2023-03-14

#include <cmath>
#include <cstdio>


/*  Decode the IEEE-754 binary16 encoding into a floating-point value.
    Details of NaNs are not handled.
*/
static float InterpretAsBinary16(unsigned Bits)
{
    //  Extract the fields from the binary16 encoding.
    unsigned SignCode        = Bits >> 15;
    unsigned ExponentCode    = Bits >> 10 & 0x1f;
    unsigned SignificandCode = Bits       & 0x3ff;

    //  Interpret the sign bit.
    float Sign = SignCode ? -1 : +1;

    //  Partition into cases based on exponent code.

    float Significand, Exponent;

    //  An exponent code of all ones denotes infinity or a NaN.
    if (ExponentCode == 0x1f)
        return Sign * (SignificandCode == 0 ? INFINITY : NAN);

    //  An exponent code of all zeros denotes zero or a subnormal.
    else if (ExponentCode == 0)
    {
        /*  Subnormal significands have a leading zero, and the exponent is the
            same as if the exponent code were 1.
        */
        Significand = 0 + SignificandCode * 0x1p-10;
        Exponent    = 1 - 0xf;
    }

    //  Other exponent codes denote normal numbers.
    else
    {
        /*  Normal significands have a leading one, and the exponent is biased
            by 0xf.
        */
        Significand = 1 + SignificandCode * 0x1p-10;
        Exponent    = ExponentCode - 0xf;
    }

    //  Combine the sign, significand, and exponent, and return the result.
    return Sign * std::ldexp(Significand, Exponent);
}


int main(void)
{
    unsigned Bits = 0x7bff;
    std::printf(
        "Interpreting the bits 0x%x as an IEEE-754 binary16 yields %.99g.\n",
        Bits,
        InterpretAsBinary16(Bits));
}
