This project has moved. For the latest updates, please go here.

Large Files - OutOfMemory Exception

May 31, 2011 at 8:23 PM

Hi all,

I'm brand new to ClosedXML so apologies if I am repeating a question already asked and addressed. I looked around but had trouble finding info about it.

I am trying to simply open a large Excel file (17MB) for some simple read-only processing. I try to do the following code and it starts eating up memory like crazy:

var /*XLWorkbook*/ workbook = new XLWorkbook("story.xlsx");
var /*IXLWorksheet*/ ws = workbook.Worksheet("Sheet1");

It goes very slow on the first line of code and doesn't get past it. It eats up all of my system memory until the OOM exception is thrown. I also tried using a File Stream to see if this would help:

FileStream fs = new FileStream(docName, FileMode.Open, FileAccess.ReadWrite, FileShare.Read);
var /*XLWorkbook*/ workbook = new XLWorkbook(fs);
var /*IXLWorksheet*/ ws = workbook.Worksheet(sheetName);

Same problem once again. What is the recommended approach to handle large files with ClosedXML?  Thanks in advance for a quick response.

Doug

Coordinator
May 31, 2011 at 9:52 PM

If the information is not confidential send me a PM so you can send me the file for testing. Otherwise how many worksheets and cells per worksheet does your file has?

May 31, 2011 at 10:03 PM

The file can be found here:

http://www.nirodhasoftware.com/usuhs/events.xlsx

Coordinator
Jun 1, 2011 at 12:47 AM
Edited Jun 1, 2011 at 12:52 AM

Thanks for the feedback. I'm improving the performance based on that file. Let's see if I can make it load that file in a reasonable amount of time, if at all.

Coordinator
Jun 2, 2011 at 7:26 PM

To be honest, as it is right now, I don't think I'll be able to get it to the point where you can open a file of that size in a timely manner. I'm making a lot of improvements (all of which will make it to the next release) but even then I'm just shaving seconds from something that takes minutes.

Sorry.

Coordinator
Jun 7, 2011 at 4:54 PM
Edited Jun 7, 2011 at 5:44 PM

Edit - Scratch that... There was an error in the calculation =/

Still working on the performance issue...

Edit2: Bleh, I mixed up the test files. I still can't load your file =(

Sorry (again)

Jun 14, 2011 at 10:42 AM

Hi, i was also having Out Of Memory error creating a file larger then 10mb.  

I solved this by running the program in 64bit mode. This meant the program could use more then 4gb's of memory. Now runs a treat although it can easily chew up 6-7gb on my machine. Not a big issue!

Hope this helps.

Zed

Jul 4, 2011 at 1:41 PM

Despite the fact that more and more users have 8GB available, i still think that the footprint is quite high.

Here are my numbers:

30K rows, file size 4,2MB --> InsertData and SaveWorkbook consume 1,7 GB in max

70K rows, file size 8,6 MB--> Data is 2,3x, Runtime is 2,5x, Memusage 2,7x compared to 30K example.

I don't consider a 70K Excel file as large. 2007 and 2010 can address 1Mio rows, and those are the release where OpenXML/CloseXML are made for ...

 

Would the "Open XML SDK SAX-Like Approach" discussed for example here

<http://blogs.msdn.com/b/brian_jones/archive/2010/06/22/writing-large-excel-files-with-the-open-xml-sdk.aspx>

something which could be implemented in ClosedXML?

 

Peter

Coordinator
Jul 4, 2011 at 3:46 PM

I'm always improving the performance and memory usage of this library. The only thing I can say is that we're not there yet.

As for SAX, ClosedXML already uses it to read the files and I'll implement it to write the files when time permits.

-Manuel

Jul 7, 2011 at 5:30 PM

That would be a really great. Thanks.

 

Regarding SAX for reading files: i just read the 70k lines xlsx and the memory consumption is also quite high. I thought SAX will dramatically reduce the memory consumption. Do i understand something completely wrong?

Coordinator
Jul 7, 2011 at 5:58 PM

The most common way to use the OpenXML SDK is to let the SDK to load the entire file and then ask for the contents that you want. With SAX you don't load the entire file all at once, SAX starts reading and you pick up the pieces that you need along the way.

We're still loading the contents of the file into memory as ClosedXML objects. The difference is that with SAX we don't ask the OpenXML SDK to load the entire file before giving us the contents. We use SAX to load the contents as they're being read.

I hope that clarifies the issue.

Aug 18, 2011 at 8:28 AM
Edited Aug 18, 2011 at 8:28 AM

Thanks a lot MDealeon for building ClosedXML. This really helped me a lot while manipulating excel files.

But I am also encountering the out-of-memory exception in my project while working with larger volume of records. Please let us know when you publish the next version considering this issue.

Thanks a lot once again...!

-Jahedur Rahman

Coordinator
Aug 18, 2011 at 2:05 PM

Please download the latest check-in which has major memory and performance improvements and let me know...

Aug 19, 2011 at 6:01 AM

Yes using the latest version (v0.58.0) solved this problem to a certain extent.

But still getting it while using below code for 50,000 rows x 100 columns:

var wb = new XLWorkbook(filePath);

            var ws = wb.Worksheet(1);

            for (int i = 1; i <= 50000; i++)
            {
                for (int j = 1; j <= 100; j++)
                {
                    ws.Cell(i, j).Value = i.ToString() + ":" + j.ToString();
                }
            }

            wb.Save();

But it worked good for 50000 rows x 20 columns.

I am wondering if implementing iDisposable interface in the workbook, worksheet to enable Using statement would allow us not to force running Garbage Collector, because the mamory consumed by the system is not released immediately after the workbook is saved.

Thanks again...

Jahedur Rahman

 


Coordinator
Aug 19, 2011 at 3:50 PM

You'll always be able to increase the file size to a point where you run out of memory.

I'll continue to improve the memory consumption just like I continue to improve the performance of the library, but this just became a non issue.

Oct 11, 2011 at 11:11 AM

Thanks MDeLeon...

In the current days, I am working with some tools using ClosedXML that in some cases require inserting more than 1,00,000 of rows in a worksheet, and facing the OOM exception. But I am looking forward to your upcoming releases with a hope that memory usage would be reduced in the later versions and will solve the problem.

Thanks a lot again for building this wonderful tool.

Jahedur Rahman

Coordinator
Oct 11, 2011 at 4:49 PM

How many cells are we talking about?

How much memory does your box have?

Oct 17, 2011 at 4:32 AM
Edited Oct 17, 2011 at 4:34 AM

It's inserting around 1,00,000 rows * 22 cols into the worksheet.

My box is having 4 GB of RAM. But the application takes around 1.5 GB and throws this exception, though the box was having around 1 GB unused memory.

Oct 25, 2011 at 10:23 PM

sounds like your application runs only with 32bit.