Posted by: lrrp | February 29, 2008

Generating Huge Reports in JasperReports

There are a few things to take care of when implementing JasperReports for a huge dataset, so that memory is used efficiently and the application does not run out of memory.

They are:

1) Pagination of the data and use of JRDataSource,

2) Virtualization of the report.

When there is a huge dataset, it is not a good idea to retrieve all the data at one time. The application will hog memory and run out of it even before reaching the Jasper report engine to fill in the data. To avoid that, the service layer/DB layer should return the data in pages; you gather the data in chunks and hand the records over through the JRDataSource interface. When the records in the current chunk are used up, fetch the next chunk, until all the chunks are consumed.

When I say JRDataSource, I do not mean the Collection data sources. Implement the JRDataSource interface yourself and provide the data through next() and getFieldValue().

To provide an example, I took the “virtualizer” example from the JasperReports samples and modified it a bit for this article. To see how to implement JRDataSource, have a look at the inner class “InnerDS” in the example.
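
The chunking logic described above can be sketched roughly as follows. This is not the “InnerDS” class from the sample, only a minimal illustration under some assumptions: PageFetcher is a hypothetical stand-in for your service/DB layer, and getFieldValue() takes a plain field name so that the sketch compiles without the JasperReports jar. A real implementation would implement net.sf.jasperreports.engine.JRDataSource and look the value up by the name of the JRField it receives.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical service-layer contract: one page of rows, empty when no rows are left. */
interface PageFetcher {
    List<Map<String, Object>> fetchPage(int pageIndex, int pageSize);
}

/**
 * Chunked data source with the same next()/getFieldValue() shape as JRDataSource.
 * Assumption: fields are looked up by String name so no library classes are needed.
 */
class PagedDataSource {
    private final PageFetcher fetcher;
    private final int pageSize;
    private List<Map<String, Object>> chunk = Collections.<Map<String, Object>>emptyList();
    private int indexInChunk = -1;
    private int nextPageIndex = 0;
    private boolean exhausted = false;

    PagedDataSource(PageFetcher fetcher, int pageSize) {
        this.fetcher = fetcher;
        this.pageSize = pageSize;
    }

    /** Advances to the next record, fetching the next chunk when the current one is used up. */
    public boolean next() {
        if (exhausted) {
            return false;
        }
        indexInChunk++;
        if (indexInChunk >= chunk.size()) {
            // Current chunk is over: replace it with the next page from the service layer.
            List<Map<String, Object>> fetched = fetcher.fetchPage(nextPageIndex++, pageSize);
            chunk = (fetched == null) ? Collections.<Map<String, Object>>emptyList() : fetched;
            indexInChunk = 0;
            if (chunk.isEmpty()) {
                exhausted = true;
                return false;
            }
        }
        return true;
    }

    /** Returns the named field of the current record. */
    public Object getFieldValue(String fieldName) {
        if (exhausted || indexInChunk < 0) {
            throw new IllegalStateException("no current record");
        }
        return chunk.get(indexInChunk).get(fieldName);
    }
}

class PagedDataSourceDemo {
    /** Fake service layer that serves totalRows rows with the fields "id" and "name". */
    static PageFetcher fakeService(final int totalRows) {
        return (pageIndex, pageSize) -> {
            List<Map<String, Object>> page = new ArrayList<>();
            int start = pageIndex * pageSize;
            int end = Math.min(start + pageSize, totalRows);
            for (int i = start; i < end; i++) {
                Map<String, Object> row = new HashMap<>();
                row.put("id", i);
                row.put("name", "row-" + i);
                page.add(row);
            }
            return page;
        };
    }

    public static void main(String[] args) {
        PagedDataSource ds = new PagedDataSource(fakeService(25), 10);
        int count = 0;
        while (ds.next()) {
            count++;
        }
        System.out.println("rows read: " + count);
    }
}
```

Only one chunk is referenced at a time, so the previous chunk becomes eligible for garbage collection as soon as the next one is fetched. The page size is then the knob that trades memory for the number of round trips to the database.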

Even after the data is returned in chunks, the final report has to be a single file. The Jasper engine builds the JasperPrint object for this. To avoid memory piling up at this stage, JasperReports provides a really cool feature called the Virtualizer. A Virtualizer basically serializes the pages and writes them to the file system to avoid the out-of-memory condition. There are 3 types of Virtualizer as of now: JRFileVirtualizer, JRSwapFileVirtualizer, and JRGzipVirtualizer.

JRFileVirtualizer is a really simple virtualizer: you specify the number of pages to keep in memory and the directory into which the Jasper engine can swap the excess pages as files. The disadvantage of this Virtualizer is the file handling overhead. It creates a lot of files during virtualization and finally produces the required report file from those files. If the dataset is not that large, you can go for JRFileVirtualizer.

The second Virtualizer is JRSwapFileVirtualizer, which overcomes the disadvantage of JRFileVirtualizer. JRSwapFileVirtualizer creates only one swap file, which grows as needed based on the sizes you specify. For the JRSwapFile you have to specify the swap directory, the block size, and the number of blocks by which the file grows when it fills up. Then, while creating the JRSwapFileVirtualizer, pass the JRSwapFile as a parameter along with the number of pages to keep in memory. This Virtualizer is the best fit for a huge dataset.

The third Virtualizer is a special one that does not write the data into files. Instead, it compresses the JasperPrint object using the Gzip algorithm and so reduces heap memory consumption. The Ultimate Guide of JasperReports says that JRGzipVirtualizer can reduce the memory consumption by 1/10th. If you are sure your dataset is not that big and you want to avoid file I/O, you can go for JRGzipVirtualizer.
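
To make the idea behind file-based virtualization concrete, here is a toy sketch in plain Java. It is not JasperReports code and makes no claim about how the library implements its virtualizers; ToyFilePageStore and all its method names are made up for illustration. It keeps at most a fixed number of "pages" on the heap and serializes the least recently used ones to one file each, which also shows where the per-file overhead attributed to JRFileVirtualizer above would come from.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy page store (hypothetical): at most maxInMemory pages stay on the heap. */
class ToyFilePageStore {
    private final int maxInMemory;
    private final Path directory;
    // Access-ordered map: iteration starts at the least recently used page.
    private final LinkedHashMap<Integer, Serializable> inMemory =
            new LinkedHashMap<>(16, 0.75f, true);
    private final Map<Integer, Path> onDisk = new HashMap<>();

    ToyFilePageStore(int maxInMemory, Path directory) {
        if (maxInMemory < 1) {
            throw new IllegalArgumentException("maxInMemory must be at least 1");
        }
        this.maxInMemory = maxInMemory;
        this.directory = directory;
    }

    /** Convenience factory that swaps into a fresh temporary directory. */
    static ToyFilePageStore inTempDirectory(int maxInMemory) {
        try {
            return new ToyFilePageStore(maxInMemory, Files.createTempDirectory("toy-virtualizer"));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    void put(int pageNo, Serializable page) {
        inMemory.put(pageNo, page);
        swapOutExcess();
    }

    /** Returns the page, reading it back from disk if it was swapped out. */
    Serializable get(int pageNo) {
        Serializable page = inMemory.get(pageNo);
        if (page == null && onDisk.containsKey(pageNo)) {
            Path file = onDisk.remove(pageNo);
            try {
                try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
                    page = (Serializable) in.readObject();
                }
                Files.delete(file);
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
            inMemory.put(pageNo, page);
            swapOutExcess();
        }
        return page;
    }

    /** Serializes least recently used pages to one file each until the limit holds. */
    private void swapOutExcess() {
        while (inMemory.size() > maxInMemory) {
            Iterator<Map.Entry<Integer, Serializable>> it = inMemory.entrySet().iterator();
            Map.Entry<Integer, Serializable> eldest = it.next();
            try {
                Path file = Files.createTempFile(directory, "page-" + eldest.getKey() + "-", ".ser");
                try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
                    out.writeObject(eldest.getValue());
                }
                onDisk.put(eldest.getKey(), file);
            } catch (IOException e) {
                throw new IllegalStateException(e);
            }
            it.remove();
        }
    }

    int pagesInMemory() { return inMemory.size(); }

    int pagesOnDisk() { return onDisk.size(); }

    /** Deletes all swap files and the directory. */
    void cleanup() {
        try {
            for (Path file : onDisk.values()) {
                Files.deleteIfExists(file);
            }
            onDisk.clear();
            Files.deleteIfExists(directory);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}

class ToyVirtualizerDemo {
    public static void main(String[] args) {
        ToyFilePageStore store = ToyFilePageStore.inTempDirectory(2);
        for (int i = 0; i < 5; i++) {
            store.put(i, "page " + i);
        }
        System.out.println("in memory: " + store.pagesInMemory()
                + ", on disk: " + store.pagesOnDisk());
        System.out.println(store.get(0)); // read back from its swap file
        store.cleanup();
    }
}
```

A swap-file design would replace the one-file-per-page step with block allocation inside a single growing file, and a Gzip-style design would replace the file write with in-memory compression. With the real library, the chosen virtualizer is handed to the fill process through the JRParameter.REPORT_VIRTUALIZER parameter, as the code in response 4 below does.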

Check the sample to learn more about the coding part. To keep it simple, I have reused the “virtualizer” sample and added the JRDataSource implementation with paging. I ran the attached sample for four scenarios. To tighten the limits and see the real effects, I ran the application with 10 MB as the max heap size (-Xmx10M).

1a) No Virtualizer, which ended up in out of memory with 10MB max heap size limit.

export:
[java] Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
[java] Java Result: 1

1b) No Virtualizer with default heap size limit (64M)

export2:
[java] null
[java] Filling time : 44547
[java] PDF creation time : 22109
[java] XML creation time : 10157
[java] HTML creation time : 12281
[java] CSV creation time : 2078

2) With JRFileVirtualizer
exportFV:
[java] Filling time : 161170
[java] PDF creation time : 38355
[java] XML creation time : 14483
[java] HTML creation time : 17935
[java] CSV creation time : 5812

3) With JRSwapFileVirtualizer
exportSFV:
[java] Filling time : 51879
[java] PDF creation time : 32501
[java] XML creation time : 14405
[java] HTML creation time : 16579
[java] CSV creation time : 5365

4a) With JRGzipVirtualizer, with lots of GC
exportGZV:
[java] Filling time : 84062
[java] Exception in thread "RMI TCP Connection(22)-127.0.0.1" java.lang.OutOfMemoryError: Java heap space
[java] Exception in thread "RMI TCP Connection(24)-127.0.0.1" java.lang.OutOfMemoryError: Java heap space
[java] Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
[java] Exception in thread "RMI TCP Connection(25)-127.0.0.1" java.lang.OutOfMemoryError: Java heap space
[java] Exception in thread "RMI TCP Connection(27)-127.0.0.1" java.lang.OutOfMemoryError: Java heap space
[java] Java Result: 1

4b) With JRGzipVirtualizer (max: 13MB)
exportGZV2:
[java] Filling time : 59297
[java] PDF creation time : 35594
[java] XML creation time : 16969
[java] HTML creation time : 19468
[java] CSV creation time : 10313

I have shared the updated virtualizer sample files at Updated Virtualizer Sample files

Responses

  1. Hi,
    This article is really cool.
    I was searching for sample implementations of a customized JRDataSource and found this article. It's really interesting and thought-provoking, since I am working on this technology myself.
    The reason I started searching was a bug in my Jasper report while filling data into the report. I am using the JRBeanCollectionDataSource class, and the fields in the first row of a group show wrong results: what it shows is the sum of all the rows in the group (for that particular field) + the actual value of the field. After reading the article I smell a rat: this could be a mishap due to the use of JRBeanCollectionDataSource. Anyway, I was trying to get hold of the sample you have in this article, in case it helps me fix this bug.

    Thanks,
    Manoj

  2. Nice, but where is the link to the source code? That would be nice and appreciated.

  3. Follow this link for sample code

    http://jasperforge.org/plugins/espforum/view.php?group_id=102&forumid=103&topicid=17387

    • Hi
      chandu
      Please send an example program on the Virtualizer in JasperReports. What is the main use of the virtualizer?

  4. I need to generate PDF reports from iReport with a very big table, for example 100,000 records (growing by 5,000 per month), which takes about 5 minutes!!
    In my first attempt I split it into 4 rounds with limit = 25,000 to display the records, but that is not a solution. I also tried raising the memory to 256 MB with JAVA_OPTS = "-Xmx256m", but it does not make much difference.
    I'm looking for a way to generate the PDF asynchronously, so that the first records are shown in the PDF while the following records are still being compiled.
    protected void generatePDFOutput(String folder, String reportName,
            HttpServletResponse resp, Map parameters)
            throws JRException, NamingException, SQLException, IOException {

        JasperReport reportCompiled = getCompiledReport(folder, reportName, parameters);

        // Equivalent to: new InitialContext(null).
        DataSourceTransactionManager dataSource =
                (DataSourceTransactionManager) this.getMiDataSource();

        byte[] bytes;
        Connection conn = dataSource.getDataSource().getConnection();
        try {
            bytes = JasperRunManager.runReportToPdf(reportCompiled, parameters, conn);
        } finally {
            // Release the connection even if filling the report fails
            conn.close();
        }

        resp.setHeader("Content-Disposition",
                "attachment;filename=\"" + reportName + ".pdf" + "\"");
        resp.setContentType("application/octet-stream");
        resp.setContentLength(bytes.length);
        ServletOutputStream outputStream = resp.getOutputStream();
        outputStream.write(bytes, 0, bytes.length);
        outputStream.flush();
        outputStream.close();
    }

    /**
     * Method to compile the report. It also registers the file
     * virtualizer in the fill parameters.
     * @return
     * @throws JRException
     */
    private JasperReport getCompiledReport(String folder, String reportName,
            Map parameters) throws JRException {

        context = this.getServlet().getServletContext();

        // Keep at most 100 pages in memory; swap the rest into the temp path
        JRFileVirtualizer virtualizer = new JRFileVirtualizer(100, getPathTemp());
        virtualizer.setReadOnly(false);
        parameters.put(JRParameter.REPORT_VIRTUALIZER, virtualizer);

        File reportFile =
                new File(context.getRealPath(folder + reportName + ".jasper"));

        // If compiled file is not found, then
        // compile XML template
        if (!reportFile.exists()) {
            JasperCompileManager.compileReportToFile(
                    context.getRealPath(folder + reportName + ".jrxml"));
        }

        JasperReport jasperReport =
                (JasperReport) JRLoader.loadObject(reportFile.getPath());

        return jasperReport;
    }

  5. great article! for those who are also looking for the source code: http://sites.google.com/site/raffimd/jasperreports/virtualizer.zip

  6. I was trying to generate a report using jasper with a dataset of 3million records with JRFileVirtualizer. It did not generate the report even after six hours. The page memory seems to hit the max. Does anyone know what needs to be done to generate the report?

    • Hey Sanker,
      It seems that we’re facing the same issue….
      Do you already have a solution for this?

      tnx.

  7. Hi lrrp,

    Where can I find the example files of this example??

    Thanks in Advance,
    DNV Srikanth.

  8. I have the same issue too.
    I have to generate a report that consists of millions of records and produces about 20,000 pages. I have tried many approaches; in my case there is no issue with my query, the SQL execution takes 1 minute at most. But the filling, transformation and PDF generation take almost 6 hours.
    FYI my servers spec:
    blade server with oracle 11g database
    Mem: 8Gb
    Processor: 2 x Quad Core Intel Xeon

    • Were you able to resolve this issue? We are stuck with similar issue.

      • Same problem here, about 5 to 7 million entries. Any solution?

  9. Thank you for such an inspirational post. It left me with a couple of lasting thoughts. Shanti!

  10. Hi, I use JRBeanCollectionDataSource to create the JasperPrint, where the collection (of POJOs) has a size of 1 lakh (100,000), so it throws an out-of-memory error (Java heap space). Can someone help with how to send the data to the JasperPrint?

    satheesh.K

