Microsoft Power BI is a potent but intuitive tool that you can use to build visualizations, reports, and dashboards. You can even create complex calculations, transform data according to your liking, and model it directly in Power BI; instead of doing all this computation-heavy activity in the source environment or purchasing and learning to use other tools. However, one area that bothers many developers and organizations is that the size of the Power BI PBIX file can get dangerously large when you’re dealing with large amounts of data.
It may reach hundreds of MBs in size, leading to tons of problems, discussed in the next section. Today, we will be offering some tips that you can leverage to reduce the size of your PBIX file. These will lead to several other benefits too.
Dealing with Large PBIX File Sizes
When you are dealing with a large amount of data, the size of your PBIX file can get too large since all the data and calculations are stored directly in a single file. This happens especially while accessing multiple heterogeneous data sources – and utilizing Power BI’s Import Mode. Of course, you could consider transitioning to Direct Query mode so that no data resides in the file, but that is not possible for all use-cases due to bandwidth and performance considerations. Most organizations and developers tend to use Import mode and have everything in a single place.
This can have multiple detrimental effects, including difficulties in sharing files across the organization, longer publish times, and general performance detriments. If it’s a small file, you can simply upload or share it with others in an email within seconds. However, if your file size is hundreds of MBs, this becomes a much more difficult task, both for the person sharing the file and those consuming it. Since there’s a vast amount of data in your file, there’s also the likelihood that your report’s performance will suffer while handling the data.
It is essential to reduce the size of your Microsoft Power BI PBIX file to work around all the aforementioned problems, even when you’re working in Import mode.
How to Work Around Large File Sizes
As with most problems, there is no one-shoe-fits-all solution. However, depending upon your use case and your data, the following processes may drastically bring down the size of your power BI PBIX file:
- The biggest culprit is often that you have imported a lot of columns that you are not actually using. For example, a table may have 200+ columns, but you’re probably only using 20 or so in your visualizations and joins. It’s better to remove all the unneeded columns. Head over to Power Query and then select the “Keep Columns” option to keep only the columns you need. The most significant benefit of this approach is that removing unneeded columns doesn’t remove the associated data permanently. It gets compressed in an optimized manner. If you later realize that you accidentally removed a column; you can simply go to the “Keep Columns” step in Power Query and restore the withdrawn column within a few seconds. There is no disadvantage to cleaning up your data; everything can be rolled back quickly if needed. The screenshot below shows an example:
- Apart from removing the columns you don’t need, it’s also essential to remove the rows you don’t need. If you’re only visualizing data from the past 3 months, it makes little sense to store historical data from the past five years. There is no need to store any “deprecated” data either. You can remove this data by navigating to Power Query filtering based on your date and/or other columns. Keep only the data that you need. Once again, you can roll back any removals through Power Query. Check out this example in the screenshot below:
- It is also ideal for managing the grain of your data. If your visualizations show data daily, you don’t need to store date data as a date/time column; simply reduce the granularity. This will decrease the amount of data you are saving. While this may sound like a minor change, it can make a notable difference if you have millions of rows. See how you can change the data type below:
- Another good idea is to reduce reliance on Calculated Columns. Only use them when necessary since you cannot store them in a compressed format. It is preferable to use Measures where possible as they are calculated at runtime and do not use extra space.
- Disable Auto Date/Time loading in the Data Loading options in Power BI settings. This ensures that such tables are not automatically created on loads. You can see this setting below:
- While it may not make a massive difference, try removing unused and unneeded visuals and pages from your Power BI report.
As we discussed before, Power BI is a powerful tool but only when you realize its full potential. Non-destructive compression of unneeded data is a significant capability that can lead to massive gains in performance while drastically reducing your file size by several orders of magnitude. Depending upon your use case and your data, you can reduce the size of your PBIX file by up to 95%, which, in turn, downstream several benefits such as ease of file sharing, publishing, and performance gains.
This entire activity requires a solid understanding of the model and the underlying data. Xavor has proven experience in Power BI and its related domains, so if you are facing problems related to your data, modeling, and performance issues in general, reach out to us for a consultation.