In this page

Overview

This page describes tuning Better Excel Exporter for Jira for improved performance. It most importantly explains techniques to create custom Excel templates that can efficiently export large amount of data from Jira. By "large amount" we mean XLSX files with 50,000 - 1,000,000 cells.

If you export smaller data sets using your custom template, you will unlikely face performance problems. In that case, you can just skip this page.

Please note that albeit we do our best to make sure that the Better Excel Exporter app performs well under a wide variety of circumstances and the techniques below are generally good practices, every template and every situation may have its unique performance characteristics.

If you are having performance problems:

  1. Read this page and make those practical changes that are applicable to your template
  2. Ask for our help

Techniques

Use the <mt:autosize> tag judiciously

When the <mt:autosize> tag is processed, all rows and columns are searched through for the highest and widest cell content. Even worse, the content must be rendered using its specific font settings to find out its actual display dimensions. This is (apparently) instant in Excel, but can be time consuming in the server-side when rendering large worksheets.

As a best practice, use the <mt:autosize> tag only if column and row sizing actually matter. It matters in worksheets created for "human reading". It does not matter in worksheets that are not read by people, but act as source data for pivot tables or charts.

Limit the scope for <mt:autosize>

When you have lots of cells in a worksheet, but column and row sizing is important, you can't avoid using <mt:autosize> tag. In this case, use the limit parameters to reduce the range of columns and rows it will check to find the optimal width and height. It can reduce the tag processing time to a fraction.

Use copyColumnWidthsRight="false" for horizontal iterations

If you use the copyRight="true" parameter with the <jt:forEach> tag to iterate horizontally, remember to use copyColumnWidthsRight="false", too:

| <jt:forEach items="${issues}" var="issue" copyRight="true" copyColumnWidthsRight="false">...</jt:forEach> |

Without this, the renderer will waste quite some time to copy the width of a template column to the final columns. It is totally unnecessary if you also use the <mt:autosize> to optimize sizing. In fact, this is rather rare (although possible) that you want the final columns to inherit the width of the template column.

Create custom-tailored static templates

When you need a report with a fixed set of fields (columns), consider creating a specific template with the hard-coded static columns rather than using the fully dynamic issue-navigator.xlsx. The former can be more performant, because the renderer does not need to multiplicate columns dynamically.

Example scenario: you need 20 columns with fixed fields but using the general issue-navigator.xlsx for a large number of issues is not fast enough. Solution: create a new template, enter the template expression to the 20 columns, set the visual properties, set the formatting for the cells and use this static template instead.

Keep the template clean and tidy

It is easy to create blank garbage cells in Excel. For instance, accidentally click into a cell, enter a space and leave the cell. Or just accidentally set the font-weight to "bold" in an otherwise unused cell. Even if this is hard to see (and would be considered innocent anyway), that cell is actually existing now and it will stay there, eventually causing performance problems.

How can this be a problem? The rendering engine processes these blank cells, too, which can be a factor of performance. For example if you multiple nested <jt:forEach> loops and there are some blank cells around, the renderer will waste time and memory multiplicating those. For large exports, they can invisibly grow to as large as tens of thousands of cells, depending on the template.

Generally speaking, we suggest manually cleaning the template from garbage:

  1. Click the "A1" cell in the worksheet and press CTRL+SHIFT+END. This will select the rectangle shaped area of "living" cells, including blank ones.
  2. Locate all those columns and rows which are within the selected area, but do not contain any actual value. These are the garbage columns and rows.
  3. Select the garbage columns by clicking the column headers and click Delete.
  4. Repeat this for garbage rows, too.
  5. Save the file.
  6. Close and re-open the file. Press CTRL+SHIFT+END again. If the clean-up was correctly done, then only the expected area should be selected (i.e. no garbage anymore).

This technique is also explained by Microsoft here and here.

Coloring blank cells in output files

To fight with garbage, we offer a hidden feature called "blank cell coloring" that helps to find the blank rows, columns, cells in the output Excel files. When we know the blank cells in the output files, we can find the blank cells in the template from which those were generated and remove them.

How does it work? After activating the feature, the renderer will post-process each worksheet after its rendering has been completed and color those cells to pink that exist, but has no actual content. You can then open the file in Excel and check the pink cells. It's simple and visual:

How to activate this?

  1. Go to AppsManage AppsExcel Views (in the left).
  2. Append "__BLANK" to the name of the view you want to export with. For example, rename "Excel (Current Fields)" to "Excel (Current Fields)__BLANK".
  3. Export using this view, and the Excel file will be created with the blank cells colored to pink.
  4. When finished, remember to drop "__BLANK" from the view name.

When looking at the output file, don't forget that there are pink cells that are not garbage!

  • For example, for unresolved issues the Resolution column will be pink, but this is not garbage. (The screenshot above shows "legally blank" cells.)
  • In contrary, if you find pink rows under your data or pink columns right next to that, that is garbage.

Denormalize data to enable simpler templates

When you have a collection of collections of collections kind of data and you want to render this with multiple embedded <jt:forEach> tags using different indentations or directions (horizontal vs. vertical), the renderer will have hard times to find out what cell regions should be optimally shifted to make space for the next item in the next collection. For example: you want to export project categories with projects with issues with worklogs using four <jt:forEach> tags in some tricky layout. (This can be even more complicated if also apply conditions that filter the items on the fly.)

You can denormalize this data to a single collection, where each item in the denormalized collection is a tuple of (project category, project, issue, worklog). Then use a single <jt:forEach> tag to iterate over this!

Don't worry about the extra memory required by denormalized data structure:

  • It will redundantly store nothing else, but Java object references that have small memory footprint.
  • Using the collection types correctly, either the JDK collections or the new collections from Guava, will make this efficient. (Multimap is a personal favourite of ours.)
  • Because you make the template simpler, the renderer will need less memory.

This technique may sound counter-intuitive, but the explanation is a simple: working with collections is far more efficient than multiplicating and shifting cells and updating cell references for the renderer.

Avoid shifting large regions

Shifting large regions of cells is expensive, especially if that's done multiple times during an iteration.

Example scenario: you create a template in which the issues is exported in the topmost row, but you also insert N rows under this with formulas, static content, etc. Every time when the renderer makes space for a new exported issue, that N rows needs to be shifted downwards. For 2000 issues this means 2000*N row shifting, that can potentially mean hundreds of thousands of cell shifting.

If you restructure your template and put your extra content above the issue data, then it doesn't need to be shifted at all! Formulas can, of course, forward-reference the cells, therefore it is perfectly possible to have a "Total" row in the top part of the template.

Use templates in XLS file format (instead of XLSX)

In certain situations, rendering XLS templates (XLS is Excel's proprietary file format) can be faster and can require less memory than XLSX templates! This is because the XLS data model can be more efficiently represented in memory than the XLSX data model.

You should expect performance improvements only for larger amount of exported data. You can verify if this makes any difference in your use case:

  1. Open your original XLSX template in Excel.
  2. Save it as XLS (Excel 97-2003 Workbook).
  3. Upload the XLS variant to the Better Excel Exporter templates.
  4. Create a new Excel view to render the XLS template.
  5. Run measurements to compare the XLS and XLSX templates' performance. Please note that due to caching and general system warmup, the second, third, fourth rendering can complete significantly faster than the first one. Therefore, repeat your measurements until the duration becomes stable (e.g. the variance is less than 10%), and compare the durations only then.

Note: the <mt:autosize> tag has no effect in XLS templates. (It may or may not matter in your use case.)

Questions?

Ask us any time.