Question:

Why does the C# MemoryStream reserve so much memory?

罗法

2023-03-14

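Our software decompresses certain byte data through a GZipStream that reads from a MemoryStream. The data is decompressed in 4 KB blocks and written to another MemoryStream.

We have noticed that the memory allocated by the process is much higher than the size of the decompressed data.

Example: a compressed byte array of 2,425,536 bytes decompresses to 23,050,718 bytes. The memory profiler we use shows that the method MemoryStream.set_Capacity(Int32 value) allocated 67,104,936 bytes. That is a factor of 2.9 between reserved memory and memory actually written.

Note: MemoryStream.set_Capacity is called from MemoryStream.EnsureCapacity, which is itself called from MemoryStream.Write in our function.

Why does the MemoryStream reserve so much capacity, even though it is only appended to in 4 KB blocks?

Here is the code snippet that decompresses the data: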

private byte[] Decompress(byte[] data)
{
    using (MemoryStream compressedStream = new MemoryStream(data))
    using (GZipStream zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
    using (MemoryStream resultStream = new MemoryStream())
    {
        byte[] buffer = new byte[4096];
        int iCount = 0;

        while ((iCount = zipStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            resultStream.Write(buffer, 0, iCount);
        }
        return resultStream.ToArray();
    }
}

Note: in case it is relevant, this is the system configuration:

Windows XP 32-bit
.NET 3.5
Compiled with Visual Studio 2008

3 answers

谭曦

2023-03-14

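It looks like you are looking at the total amount of memory allocated, not at the last call. Because MemoryStream doubles its size on reallocation, the buffer grows by roughly a factor of two each time, so the total allocated memory is approximately a sum of powers of 2:

$\sum_{i=0}^{k} 2^i = 2^{k+1} - 1$

(where $k$ is the number of reallocations, e.g. $k = 1 + \log_2 \mathrm{StreamSize}$)

That is what you are seeing.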
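
The sum can be checked against the numbers in the question. The sketch below replays the growth rule from the EnsureCapacity code quoted in this thread for 23,050,718 bytes written in 4,096-byte pieces. The class and method names are hypothetical, it assumes every Read returns a full 4 KB chunk, and the 12-byte per-array header on a 32-bit CLR is an assumption that the thread does not state.

```csharp
using System;

static class GrowthSimulation
{
    // Same rule as the quoted EnsureCapacity: the new capacity is the largest of
    // the required size, 256, and twice the old capacity. (long is used here
    // only to keep the sketch free of int overflow.)
    static long NextCapacity(long capacity, long required)
    {
        long newCapacity = required;
        if (newCapacity < 256) newCapacity = 256;
        if (newCapacity < capacity * 2) newCapacity = capacity * 2;
        return newCapacity;
    }

    // Sums the sizes of every buffer allocated while appending totalBytes
    // in chunkSize pieces, and counts the allocations.
    public static long TotalAllocated(long totalBytes, int chunkSize, out int allocations)
    {
        long capacity = 0, position = 0, total = 0;
        allocations = 0;
        while (position < totalBytes)
        {
            long required = position + Math.Min(chunkSize, totalBytes - position);
            if (required > capacity)
            {
                capacity = NextCapacity(capacity, required);
                total += capacity;   // a fresh array of the new capacity is allocated
                allocations++;
            }
            position = required;
        }
        return total;
    }

    static void Main()
    {
        int allocations;
        long total = TotalAllocated(23050718, 4096, out allocations);
        Console.WriteLine("{0} buffers, {1} bytes of payload", allocations, total);
        // Assumption: 12 bytes of array header per buffer on a 32-bit CLR.
        Console.WriteLine("{0} bytes including headers", total + 12L * allocations);
    }
}
```

Under those assumptions the stream allocates 14 buffers of 2^12 through 2^25 bytes, which is 2^26 - 2^12 = 67,104,768 bytes of payload. Adding 12 bytes per buffer gives 67,104,936, the figure reported by the profiler. Only the last buffer (33,554,432 bytes) is still in use at the end; the earlier ones are garbage.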

解柏

2023-03-14

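MemoryStream doubles its internal buffer when it runs out of space. This can lead to 2x waste. I cannot tell why you are seeing more than that, but this basic behavior is expected.

If you do not like this behavior, write your own stream that stores its data in smaller chunks (e.g. a List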
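
A minimal sketch of such a chunked store follows. The sentence above is cut off in the source, so the List<byte[]> layout, the class name ChunkedBuffer, and its members are all assumptions made for illustration; this is not a full Stream subclass.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the framework: keeps written bytes in
// fixed-size chunks, so growing never copies or reallocates earlier data.
sealed class ChunkedBuffer
{
    private readonly List<byte[]> _chunks = new List<byte[]>();
    private readonly int _chunkSize;
    private long _length;

    public ChunkedBuffer(int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");
        _chunkSize = chunkSize;
    }

    public long Length { get { return _length; } }

    public void Write(byte[] buffer, int offset, int count)
    {
        while (count > 0)
        {
            int used = (int)(_length % _chunkSize);            // bytes already in the last chunk
            if (used == 0) _chunks.Add(new byte[_chunkSize]);  // last chunk is full, or none exists yet
            byte[] last = _chunks[_chunks.Count - 1];
            int n = Math.Min(count, _chunkSize - used);
            Buffer.BlockCopy(buffer, offset, last, used, n);
            offset += n;
            count -= n;
            _length += n;
        }
    }

    // Copies all chunks into one array of exactly Length bytes.
    public byte[] ToArray()
    {
        byte[] result = new byte[_length];
        long remaining = _length;
        int pos = 0;
        foreach (byte[] chunk in _chunks)
        {
            int n = (int)Math.Min(remaining, chunk.Length);
            Buffer.BlockCopy(chunk, 0, result, pos, n);
            pos += n;
            remaining -= n;
        }
        return result;
    }
}
```

In the question's Decompress method, resultStream could be swapped for new ChunkedBuffer(65536). The chunks then add up to roughly the decompressed size plus at most one partly filled chunk, and the only other large allocation is the final array returned by ToArray.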

冷善

2023-03-14

Because that is the algorithm it uses to expand its capacity.

public override void Write(byte[] buffer, int offset, int count) {

    //... Removed Error checking for example

    int i = _position + count;
    // Check for overflow
    if (i < 0)
        throw new IOException(Environment.GetResourceString("IO.IO_StreamTooLong"));

    if (i > _length) {
        bool mustZero = _position > _length;
        if (i > _capacity) {
            bool allocatedNewArray = EnsureCapacity(i);
            if (allocatedNewArray)
                mustZero = false;
        }
        if (mustZero)
            Array.Clear(_buffer, _length, i - _length);
        _length = i;
    }

    //... 
}

private bool EnsureCapacity(int value) {
    // Check for overflow
    if (value < 0)
        throw new IOException(Environment.GetResourceString("IO.IO_StreamTooLong"));
    if (value > _capacity) {
        int newCapacity = value;
        if (newCapacity < 256)
            newCapacity = 256;
        if (newCapacity < _capacity * 2)
            newCapacity = _capacity * 2;
        Capacity = newCapacity;
        return true;
    }
    return false;
}

public virtual int Capacity 
{
    //...

    set {
         //...

        // MemoryStream has this invariant: _origin > 0 => !expandable (see ctors)
        if (_expandable && value != _capacity) {
            if (value > 0) {
                byte[] newBuffer = new byte[value];
                if (_length > 0) Buffer.InternalBlockCopy(_buffer, 0, newBuffer, 0, _length);
                _buffer = newBuffer;
            }
            else {
                _buffer = null;
            }
            _capacity = value;
        }
    }
}

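So every time you hit the capacity limit, it doubles the capacity. The reason it does this is that the Buffer.InternalBlockCopy operation is slow for large arrays, so if it had to resize on every Write call, performance would drop significantly.

There are a few things you can do to improve this: set the initial capacity to at least the size of the compressed array, and then grow by a factor smaller than 2.0 to reduce the amount of memory in use.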

const double ResizeFactor = 1.25;

private byte[] Decompress(byte[] data)
{
    using (MemoryStream compressedStream = new MemoryStream(data))
    using (GZipStream zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
    // Set the initial capacity to the compressed size + 25%. The constructor takes an int, so cast.
    using (MemoryStream resultStream = new MemoryStream((int)(data.Length * ResizeFactor)))
    {
        byte[] buffer = new byte[4096];
        int iCount = 0;

        while ((iCount = zipStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            long required = resultStream.Length + iCount;
            if (resultStream.Capacity < required)
            {
                // Resize to 125% instead of 200%, but never below what this write
                // needs; otherwise Write would fall back to its own doubling.
                resultStream.Capacity = (int)Math.Max(resultStream.Capacity * ResizeFactor, required);
            }

            resultStream.Write(buffer, 0, iCount);
        }
        return resultStream.ToArray();
    }
}

If you like, you can use a fancier algorithm, such as resizing based on the current compression ratio:

const double MinResizeFactor = 1.05;

private byte[] Decompress(byte[] data)
{
    using (MemoryStream compressedStream = new MemoryStream(data))
    using (GZipStream zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
    // Set the initial capacity to the compressed size plus the minimum resize factor.
    using (MemoryStream resultStream = new MemoryStream((int)(data.Length * MinResizeFactor)))
    {
        byte[] buffer = new byte[4096];
        int iCount = 0;

        while ((iCount = zipStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            long required = resultStream.Length + iCount;
            if (resultStream.Capacity < required)
            {
                // Compression ratio observed so far. The +1 prevents division by zero; it may not be necessary in practice.
                double sizeRatio = ((double)resultStream.Position + iCount) / (compressedStream.Position + 1);

                // Resize to the larger of: the current capacity times the minimum
                // resize factor, or the compressed length times (compression ratio +
                // minimum resize factor - 1). Never go below what this write needs.
                double target = Math.Max(resultStream.Capacity * MinResizeFactor,
                                         (sizeRatio + (MinResizeFactor - 1)) * compressedStream.Length);
                resultStream.Capacity = (int)Math.Max(target, required);
            }

            resultStream.Write(buffer, 0, iCount);
        }
        return resultStream.ToArray();
    }
}
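
When the input is a single-member gzip stream, the initial capacity can be taken from the data itself instead of estimated. The gzip format (RFC 1952) ends with a 4-byte little-endian ISIZE field holding the uncompressed length modulo 2^32. The following is only a sketch under that assumption: GzipSizing and ReadIsize are hypothetical names, and ISIZE is treated as a hint, since Write still grows the buffer if the value turns out to be too small.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class GzipSizing
{
    // Reads ISIZE from the gzip trailer: the last 4 bytes, little-endian,
    // holding the uncompressed length modulo 2^32 (RFC 1952).
    public static uint ReadIsize(byte[] gzipData)
    {
        // 10-byte header plus 8-byte trailer is the bare minimum.
        if (gzipData == null || gzipData.Length < 18)
            throw new ArgumentException("Too short to be a gzip stream.", "gzipData");
        int n = gzipData.Length;
        return (uint)gzipData[n - 4]
             | ((uint)gzipData[n - 3] << 8)
             | ((uint)gzipData[n - 2] << 16)
             | ((uint)gzipData[n - 1] << 24);
    }

    public static byte[] Decompress(byte[] data)
    {
        // Clamp the hint so it always fits the int capacity parameter.
        int capacity = (int)Math.Min(ReadIsize(data), int.MaxValue);

        using (MemoryStream compressedStream = new MemoryStream(data))
        using (GZipStream zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
        using (MemoryStream resultStream = new MemoryStream(capacity))
        {
            byte[] buffer = new byte[4096];
            int iCount;
            while ((iCount = zipStream.Read(buffer, 0, buffer.Length)) > 0)
            {
                // No reallocation happens as long as ISIZE was accurate.
                resultStream.Write(buffer, 0, iCount);
            }
            return resultStream.ToArray();
        }
    }
}
```

With an accurate ISIZE the result stream allocates its buffer once, and the only other large allocation is the copy made by ToArray. Data from an untrusted source can carry a bogus ISIZE, so a real implementation would probably want an upper bound on the capacity it is willing to reserve.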
