In this page

Overview

This page describes tuning Better Excel Exporter for Jira for improved performance. It most importantly explains techniques to create custom Excel templates that can efficiently export large amount of data from Jira. By "large amount" we mean XLSX files with 50,000 - 1,000,000 cells.

If you export smaller data sets using your custom template, you will unlikely face performance problems. In that case, you can just skip this page.

Please note that albeit we do our best to make sure that the Better Excel Exporter app performs well under a wide variety of circumstances and the techniques below are generally good practices, every template and every situation may have its unique performance characteristics.

If you are having performance problems:

  1. Read this page and make those practical changes that are applicable to your template
  2. Ask for our help

Techniques

Use the <mt:autosize> tag judiciously

When the <mt:autosize> tag is processed, all rows and columns are searched through for the highest and widest cell content. Even worse, the content must be rendered using its specific font settings to find out its actual display dimensions. This is (apparently) instant in Excel, but can be time consuming in the server-side when rendering large worksheets.

As a best practice, use the <mt:autosize> tag only if column and row sizing actually matter. It matters in worksheets created for "human reading". It does not matter in worksheets that are not read by people, but act as source data for pivot tables or charts.

Limit the scope for <mt:autosize>

When you have lots of cells in a worksheet, but column and row sizing is important, you can't avoid using <mt:autosize> tag. In this case, use the limit parameters to reduce the range of columns and rows it will check to find the optimal width and height. It can reduce the tag processing time to a fraction.

Use copyColumnWidthsRight="false" for horizontal iterations

If you use the copyRight="true" parameter with the <jt:forEach> tag to iterate horizontally, remember to use copyColumnWidthsRight="false", too:

| <jt:forEach items="${issues}" var="issue" copyRight="true" copyColumnWidthsRight="false">...</jt:forEach> |

Without this, the renderer will waste quite some time to copy the width of a template column to the final columns. It is totally unnecessary if you also use the <mt:autosize> to optimize sizing. In fact, this is rather rare (although possible) that you want the final columns to inherit the width of the template column.

Create custom-tailored static templates

When you need a report with a fixed set of fields (columns), consider creating a specific template with the hard-coded static columns rather than using the fully dynamic issue-navigator.xlsx. The former can be more performant, because the renderer does not need to multiplicate columns dynamically.

Example scenario: you need 20 columns with fixed fields but using the general issue-navigator.xlsx for a large number of issues is not fast enough. Solution: create a new template, enter the template expression to the 20 columns, set the visual properties, set the formatting for the cells and use this static template instead.

Keep the template clean and tidy

It is easy to create blank garbage cells in Excel. For instance, accidentally click into a cell, enter a space and leave the cell. Or just accidentally set the font-weight to "bold" in an otherwise unused cell. Even if this is hard to see (and would be considered innocent anyway), that cell is actually existing now and it will stay there, eventually causing performance problems.

How can this be a problem? The rendering engine processes these blank cells, too, which can be a factor of performance. For example if you multiple nested <jt:forEach> loops and there are some blank cells around, the renderer will waste time and memory multiplicating those. For large exports, they can invisibly grow to as large as tens of thousands of cells, depending on the template.

Generally speaking, we suggest manually cleaning the template from garbage:

  1. Click the "A1" cell in the worksheet and press CTRL+SHIFT+END. This will select the rectangle shaped area of "living" cells, including blank ones.
  2. Locate all those columns and rows which are within the selected area, but do not contain any actual value. These are the garbage columns and rows.
  3. Select the garbage columns by clicking the column headers and click Delete.
  4. Repeat this for garbage rows, too.
  5. Save the file.
  6. Close and re-open the file. Press CTRL+SHIFT+END again. If the clean-up was correctly done, then only the expected area should be selected (i.e. no garbage anymore).

This technique is also explained by Microsoft here and here.

Coloring blank cells in output files

To fight with garbage, we offer a hidden feature called "blank cell coloring" that helps to find the blank rows, columns, cells in the output Excel files. When we know the blank cells in the output files, we can find the blank cells in the template from which those were generated and remove them.

How does it work? After activating the feature, the renderer will post-process each worksheet after its rendering has been completed and color those cells to pink that exist, but has no actual content. You can then open the file in Excel and check the pink cells. It's simple and visual:

How to activate this?

  1. Go to the Jira screen from which you wanted to export.
  2. Open the Export drop-down.
  3. Right-click the Excel view name and click Copy link address. (Your browser may call this function differently, we just want to copy the URL of that link.)
  4. Open a new browser tab.
  5. Paste the URL to the address bar and add an extra request parameter to end of the URL: &blankcellcoloring=true (don't forget the leading ampersand!).
  6. Hit enter, and the Excel file will be created with the blank cells colored to pink.

For instance, this original URL:

http://jira.mycompany.com/rest/com.midori.jira.plugin.betterexcel/1.0/xls/xls-view/14/render?tempMax=1000&context=issue_navigator&jql=project+%3D+FOO

Will be transformed to:

http://jira.mycompany.com/rest/com.midori.jira.plugin.betterexcel/1.0/xls/xls-view/14/render?tempMax=1000&context=issue_navigator&jql=project+%3D+FOO&blankcellcoloring=true

When looking at the output file, don't forget that there are pink cells that are not garbage!

  • For example, for unresolved issues the Resolution column will be pink, but this is not garbage. (The screenshot above shows "legally blank" cells.)
  • In contrary, if you find pink rows under your data or pink columns right next to that, that is garbage.

Denormalize data to enable simpler templates

When you have a collection of collections of collections kind of data and you want to render this with multiple embedded <jt:forEach> tags using different indentations or directions (horizontal vs. vertical), the renderer will have hard times to find out what cell regions should be optimally shifted to make space for the next item in the next collection. For example: you want to export project categories with projects with issues with worklogs using four <jt:forEach> tags in some tricky layout. (This can be even more complicated if also apply conditions that filter the items on the fly.)

You can denormalize this data to a single collection, where each item in the denormalized collection is a tuple of (project category, project, issue, worklog). Then use a single <jt:forEach> tag to iterate over this!

Don't worry about the extra memory required by denormalized data structure:

  • It will redundantly store nothing else, but Java object references that have small memory footprint.
  • Using the collection types correctly, either the JDK collections or the new collections from Guava, will make this efficient. (Multimap is a personal favourite of ours.)
  • Because you make the template simpler, the renderer will need less memory.

This technique may sound counter-intuitive, but the explanation is a simple: working with collections is far more efficient than multiplicating and shifting cells and updating cell references for the renderer.

Avoid shifting large regions

Shifting large regions of cells is expensive, especially if that's done multiple times during an iteration.

Example scenario: you create a template in which the issues is exported in the topmost row, but you also insert N rows under this with formulas, static content, etc. Every time when the renderer makes space for a new exported issue, that N rows needs to be shifted downwards. For 2000 issues this means 2000*N row shifting, that can potentially mean hundreds of thousands of cell shifting.

If you restructure your template and put your extra content above the issue data, then it doesn't need to be shifted at all! Formulas can, of course, forward-reference the cells, therefore it is perfectly possible to have a "Total" row in the top part of the template.

Use templates in XLS file format (instead of XLSX)

In certain situations, rendering XLS templates (XLS is Excel's proprietary file format) can be faster and can require less memory than XLSX templates! This is because the XLS data model can be more efficiently represented in memory than the XLSX data model.

You should expect performance improvements only for larger amount of exported data. You can verify if this makes any difference in your use case:

  1. Open your original XLSX template in Excel.
  2. Save it as XLS (Excel 97-2003 Workbook).
  3. Upload the XLS variant to the Better Excel Exporter templates.
  4. Create a new Excel view to render the XLS template.
  5. Run measurements to compare the XLS and XLSX templates' performance. Please note that due to caching and general system warmup, the second, third, fourth rendering can complete significantly faster than the first one. Therefore, repeat your measurements until the duration becomes stable (e.g. the variance is less than 10%), and compare the durations only then.

Note: the <mt:autosize> tag has no effect in XLS templates. (It may or may not matter in your use case.)

Increase the application memory (heap size)

Exporting can be a extremely memory-intensive task, depending on the scale of the data and the complexity of the template.

Why? While generating the final XLSX file the complete Excel document model is kept in memory so that the renderer can go back and forth to any cell and modify its content, style, formatting at any time during the rendering process. Note that storing the document in memory using a mutable and structured representation consumes much more memory than when that is finally written to the compressed XLSX file. In other words, the process of creating an XLSX file can require several magnitude more memory than its final filesize.

There are two typical symptoms of not giving enough memory for the renderer:

  1. The export file is created successfully, but it requires more time than it should, as the Java Virtual Machine (JVM) gets busy with memory management and garbage collection.
  2. The JVM may eventually run out of memory, resulting in a classic Java OutOfMemoryError.

The solution is trivial: increase the available memory following Atlassian's guide.

Notes:

  • If you have a non-optimized template (see all other items!), it will not magically "fix" that, but even in that case it will mitigate the problems caused by inefficiency.
  • Although we were thinking about introducing a "streaming" mode to the renderer, that would allow writing out the "already completed" parts of the XLSX document to the file and thus reduce the memory requirements, this would introduce limitations and is not available yet.

Other techniques

Limit the number of concurrent Excel renderings

Exporting large Excel spreadsheets can be resource intensive. Most importantly, it may require lots of memory to compile the in-memory document model. Although the memory will be released after the spreadsheet rendering is completed, the rendering process itself may put significant load to the Jira server temporarily.

If you run multiple large Excel spreadsheet renderings at the same time (concurrently), the individual loads will effectively add up, therefore the total load can be even more significant. But note that typical spreadsheets with 50-100 rows are not generating significant load, not even when generated concurrently.

To control the total load, it is possible to set the maximum number of concurrent renderings via the standard Java system property called jxls.threads.maxConcurrent. You should set this system property to an integer like 3 or 5. For a detailed how-to, please see the Setting properties and options on startup page in the official Jira documentation.

How does it work? If you set this to 1 (most extreme setting), then the second user trying to export an Excel spreadsheet concurrently with an already running export will receive a mini spreadsheet (which is inexpensive to create by the app). The mini Excel spreadsheet explains that he needs to wait a bit until the first rendering completes. If he re-tries the export later, he will receive the actual Excel spreadsheet.

It's that simple.

Limit the number of concurrent Excel renderings in automation rules

Note that limiting the concurrent renderings may lead to unwanted behavior when used with certain types of automations.

When using the Scheduled trigger and unchecking the Process all issues produced by this trigger in bulk option, the Automation app will process all issue separately and with parallel threads. That means, if you want to export 100 issues each to its own Excel spreadsheet, then rendering the 100 spreadsheets will be attempted to renderer in parallel and will meet the limit immediately. This is "as expected", but we wanted to mention this explicitly.

Questions?

Ask us any time.