Transform Stream into Byte Array

0

Hello!

I have a Stream containing a ZIP file of approx. 450 MB, and I need to convert it to a byte array. The usual way to do this seems to be MemoryStream (System.IO.MemoryStream). Here is the code I used:

Stream receiveStream = response.GetResponseStream();
byte[] dadosArquivo;

using (MemoryStream ms = new MemoryStream())
{
    receiveStream.CopyTo(ms);
    // Declared outside the using block so it is still in scope for the return.
    dadosArquivo = ms.ToArray();
}
return dadosArquivo;

The problem is that the CopyTo call throws an OutOfMemoryException. In my tests, MemoryStream seems to hit a limit at roughly 256 MB of stream data.

Some extra info:

  • I get this Stream from the response to an HTTP request (HttpWebResponse);
  • I use MemoryStream for the conversion because it was the only approach I found in my searches;
  • As for memory, I'm using a machine with 6 GB of RAM. I ran the same test on another machine with 8 GB and MemoryStream hit the same limit.

The stack trace of the error follows:

System.OutOfMemoryException was caught HResult=-2147024882
  Message=Exception of type 'System.OutOfMemoryException' was thrown.
  Source=mscorlib
  StackTrace:
       at System.IO.MemoryStream.set_Capacity(Int32 value)
       at System.IO.MemoryStream.EnsureCapacity(Int32 value)
       at System.IO.MemoryStream.Write(Byte[] buffer, Int32 offset, Int32 count)
       at System.IO.Stream.InternalCopyTo(Stream destination, Int32 bufferSize)
       at System.IO.Stream.CopyTo(Stream destination)
       at HiperPdvLibrary.Integracao.Api.ApiRequest.GetByteRequest(HttpStatusCode& status)
  InnerException: 

Has anyone run into this situation, or does anyone have another suggestion for doing this conversion, perhaps by processing the stream in parts?

Hugs

    
asked by anonymous 14.07.2015 / 13:18

2 answers

1
The problem is as follows: MemoryStream starts by allocating a small buffer (for example, a 4-byte array). When the buffer fills up, MemoryStream creates a new buffer twice the size, copies the contents to the new buffer, and discards the old one.

Pseudo-code:

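void AddByteToMemoryStream(MemoryStream ms, byte b)
{
    if (ms.Length == ms.Capacity)
    {
        // Buffer is full: allocate one twice the size, copy the old
        // contents across, then let the old buffer be discarded.
        var newBuffer = new byte[ms.Capacity * 2];
        Array.Copy(ms.Buffer, newBuffer, ms.Length);
        ms.Buffer = newBuffer;
    }

    ms.Add(b);
}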

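So when the buffer reaches 256 MB and we try to write one more byte, a new 512 MB buffer is created. While the contents are being copied, both buffers exist at once, so at that moment at least 768 MB (256 MB + 512 MB) is in use. That alone is not necessarily a problem.

More importantly, the new buffer needs 512 MB of contiguous memory. The address space is likely to be fragmented (especially in a 32-bit process, where it is limited regardless of how much RAM the machine has), so the allocation fails.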
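
The growth pattern described above can be observed on a real MemoryStream by writing one byte at a time and recording each change of the Capacity property. This is only a small sketch for illustration; the class and method names (CapacityDemo, ObserveCapacities) are made up here, and the exact capacities may differ between .NET versions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class CapacityDemo
{
    // Writes totalBytes single bytes into a MemoryStream and returns
    // every new Capacity value that was observed along the way.
    public static List<int> ObserveCapacities(int totalBytes)
    {
        var seen = new List<int>();
        using (var ms = new MemoryStream())
        {
            int last = ms.Capacity;
            for (int i = 0; i < totalBytes; i++)
            {
                ms.WriteByte(0);
                if (ms.Capacity != last)
                {
                    // The internal buffer was reallocated.
                    last = ms.Capacity;
                    seen.Add(last);
                }
            }
        }
        return seen;
    }
}
```

Calling CapacityDemo.ObserveCapacities(10000) and printing the list shows only a handful of reallocations, each to a larger buffer than the one before. With a 450 MB download, the last few of those steps are the huge allocations that fail.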

There is no simple way to solve the problem, but I suggest these solutions:

Pre-allocate memory

If you know at the outset that you will receive a 450MB file, try allocating 460MB at the start to avoid unnecessary allocations.

var ms = new MemoryStream(460000000);
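
Even with a pre-sized MemoryStream, ms.ToArray() copies the data into a second array, so two large arrays exist for a moment. If the size is known up front, one option is to skip MemoryStream and read straight into a single byte array. The sketch below assumes the server sends a Content-Length header, so that response.ContentLength is not -1; StreamBytes and ReadKnownLength are hypothetical names used only for this example.

```csharp
using System;
using System.IO;

static class StreamBytes
{
    // Reads exactly 'length' bytes from 'source' into one pre-sized array.
    // Assumption: 'length' is the real size of the stream content.
    public static byte[] ReadKnownLength(Stream source, long length)
    {
        if (length < 0 || length > int.MaxValue)
            throw new ArgumentOutOfRangeException("length");

        var data = new byte[length];
        int offset = 0;
        while (offset < data.Length)
        {
            // Read may return fewer bytes than requested, so loop until full.
            int read = source.Read(data, offset, data.Length - offset);
            if (read == 0)
                throw new EndOfStreamException("Stream ended before 'length' bytes were read.");
            offset += read;
        }
        return data;
    }
}
```

Usage would look something like byte[] dadosArquivo = StreamBytes.ReadKnownLength(receiveStream, response.ContentLength);. This still needs one contiguous 450 MB block, so it can fail in a fragmented or 32-bit process, but it avoids the repeated doubling and the extra copy.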

Stream chunks to the destination

The best solution, in my opinion, is to avoid having the whole file in memory. If the purpose is to receive a file via HTTP and then save it to disk, you can do direct streaming:

/// <summary>
/// Copies the contents of input to output. Doesn't close either stream.
/// </summary>
public static void CopyStream(Stream input, Stream output)
{
    byte[] buffer = new byte[8 * 1024];
    int len;
    while ( (len = input.Read(buffer, 0, buffer.Length)) > 0)
    {
        output.Write(buffer, 0, len);
    }    
}

using (Stream file = File.Create(filename))
{
    CopyStream(receiveStream, file);
}

(Code by Jon Skeet)

This code copies the data to disk in 8 KB blocks as they arrive over HTTP, so the program's memory use stays small no matter how large the file is.
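
Since the question already calls Stream.CopyTo, the same chunked copy can also be done with that built-in method, which copies through a small internal buffer. Below is a minimal sketch that saves a stream to a temporary file; Download and SaveToTempFile are hypothetical names, and using a temp file is just an assumption about where the data should go.

```csharp
using System;
using System.IO;

static class Download
{
    // Copies 'source' to a new temporary file and returns the file's path.
    // The source stream is not closed here.
    public static string SaveToTempFile(Stream source)
    {
        string path = Path.GetTempFileName();
        using (FileStream file = File.Create(path))
        {
            // CopyTo moves the data in small blocks, never the whole file at once.
            source.CopyTo(file);
        }
        return path;
    }
}
```

The ZIP can then be opened from the returned path instead of from a byte array, and deleted when it is no longer needed.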

    
14.07.2015 / 19:27
1

Hello!

I was able to find a solution to my problem. It is a re-implementation of the MemoryStream class called MemoryTributary, and it allocates memory differently from MemoryStream. The link below has details of the implementation and a comparison with MemoryStream.

MemoryTributary on CodeProject
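
To give a rough idea of how a buffer can avoid one huge contiguous allocation, here is a minimal sketch that stores the data in many small blocks. It is not the MemoryTributary source and does not try to match its API; ChunkedBuffer, ReadFrom and WriteTo are hypothetical names, and the 64 KB block size is an arbitrary choice for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Illustration only: holds stream data in a list of fixed-size blocks,
// so no single allocation is ever larger than BlockSize.
sealed class ChunkedBuffer
{
    private const int BlockSize = 64 * 1024;
    private readonly List<byte[]> blocks = new List<byte[]>();
    private int usedInLastBlock = BlockSize; // forces a block on first read

    public long Length { get; private set; }

    // Reads 'source' to its end, appending the data block by block.
    public void ReadFrom(Stream source)
    {
        while (true)
        {
            if (usedInLastBlock == BlockSize)
            {
                blocks.Add(new byte[BlockSize]);
                usedInLastBlock = 0;
            }
            byte[] block = blocks[blocks.Count - 1];
            int read = source.Read(block, usedInLastBlock, BlockSize - usedInLastBlock);
            if (read == 0) break;
            usedInLastBlock += read;
            Length += read;
        }
    }

    // Writes all stored bytes, in order, to 'destination'.
    public void WriteTo(Stream destination)
    {
        long remaining = Length;
        foreach (byte[] block in blocks)
        {
            int count = (int)Math.Min(BlockSize, remaining);
            if (count == 0) break;
            destination.Write(block, 0, count);
            remaining -= count;
        }
    }
}
```

The trade-off is that the data is never available as one byte[]; whatever consumes it has to accept a stream or work block by block.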

    
14.07.2015 / 22:06