When we launched Helm CONNECT in 2016, we had a fraction of the customers and users we do now. Today, there are 240 companies with thousands of users on the system, all with permissions that must be calculated each time they log in to Helm CONNECT. As a result of the way we were calculating and caching user permissions, our server was getting overloaded during peak login times, causing delays on page loads and overall system slowness. This wasn’t an issue when we first started, but, as our user base grew, the impact on performance became more obvious. In Version 1.21 of Helm CONNECT, to reduce the load on the server, we improved the method we use to calculate our permission tree, resulting in faster load times and a better overall user experience.
The Root of the Problem: Calculating User Permissions
Permissions tell Helm CONNECT which features each user can see and what actions they’re allowed to perform. A quick glance at the permission tree on the Setup > Users > Roles tab will give you a sense of the many possible options and combinations.
However, the permission tree visible on the Roles tab is just the tip of the iceberg. If you could see the entire permission tree in the “backend” of Helm CONNECT, you’d discover that there are actually 4120 permissions that must be checked each time a user logs in!
Selecting just one check box in the front-facing permission tree on the Roles tab can trigger 30 different backend permissions, each of which must then be filtered based on the user’s division.
This brings us to the root of the problem: the way we calculate and cache user permissions. Each time a user logs in to Helm CONNECT or performs any API actions, we must check their permissions against the following:
- The master list of all Helm CONNECT features and permissions
- The user’s tenant permissions
- The user’s role-based permissions
- The user’s division
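To make the four checks above concrete, here is a minimal sketch of how they could combine into one set of effective permissions. It is an illustration only, not the Helm CONNECT implementation: the `Permission` class, the `effective_permissions` function, the permission names, and the rule that a permission with no listed divisions applies everywhere are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Permission:
    """One entry in a hypothetical master list of permissions."""
    name: str
    # Divisions where the permission applies. Assumption for this sketch:
    # an empty set means the permission applies in every division.
    divisions: frozenset = frozenset()


def effective_permissions(master, tenant_allowed, role_granted, user_division):
    """Return the names of permissions that pass all four checks."""
    result = set()
    for perm in master:                      # 1. master list
        if perm.name not in tenant_allowed:  # 2. tenant permissions
            continue
        if perm.name not in role_granted:    # 3. role-based permissions
            continue
        if perm.divisions and user_division not in perm.divisions:
            continue                         # 4. division filter
        result.add(perm.name)
    return result


# Hypothetical data: three permissions, one restricted to the "West" division.
master = [
    Permission("logs.view"),
    Permission("logs.edit", frozenset({"West"})),
    Permission("billing.view"),
]
tenant_allowed = {"logs.view", "logs.edit"}
role_granted = {"logs.view", "logs.edit", "billing.view"}

east = effective_permissions(master, tenant_allowed, role_granted, "East")
west = effective_permissions(master, tenant_allowed, role_granted, "West")
```

In this sketch, the "East" user ends up with only `logs.view`, while the "West" user also gets `logs.edit`. With thousands of backend permissions instead of three, repeating this loop for every user at login is the kind of work that adds up.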
Based on the user access patterns we observed in Helm CONNECT, we determined that the best way to handle this huge volume of data was to process it when the user logs in and then cache the results in memory. However, the challenge with this method is that login traffic is concentrated at certain times of day. At the beginning of each workday, large numbers of users log in to Helm CONNECT, requiring the system to complete thousands of calculations for each user and slowing the server down. As a result, users experienced system slowness and pages that were slow to load. In the image below, this is represented by the flat lines occurring between 04:00-04:30, 05:00-05:30, and 06:00-07:00.
The Solution: Adding More Cache
In Helm CONNECT 1.21, we addressed the issue of server slowness at peak times by adding more caching. Previously, we had only one cache, called the “Identity Cache.” Now, we also have a second cache, called the “Feature Permissions Cache.” By adding this additional cache, we’ve greatly simplified the work that the login server and database need to do to calculate the end result of the permission controls.
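As a rough illustration of why a second cache can reduce login work, the sketch below layers a shared cache underneath a per-user cache. This article does not describe what the Feature Permissions Cache stores or how it is keyed, so everything here is an assumption: the `PermissionService` class, the choice to key the shared cache by tenant and role, and the `compute_role_permissions` callback that stands in for the expensive calculation.

```python
class PermissionService:
    """Two-level cache sketch: a per-user cache over a shared cache."""

    def __init__(self, compute_role_permissions):
        # Hypothetical expensive function: (tenant_id, role_id) -> permissions.
        self._compute = compute_role_permissions
        # Stand-in for the "Identity Cache": one entry per logged-in user.
        self._identity_cache = {}
        # Stand-in for the "Feature Permissions Cache". Assumption: it is
        # shared by every user with the same tenant and role.
        self._feature_cache = {}
        self.expensive_calls = 0  # counts full recalculations

    def _feature_permissions(self, tenant_id, role_id):
        key = (tenant_id, role_id)
        if key not in self._feature_cache:
            self.expensive_calls += 1
            self._feature_cache[key] = frozenset(self._compute(tenant_id, role_id))
        return self._feature_cache[key]

    def login(self, user_id, tenant_id, role_id):
        # Only a user's first login reaches the shared cache, and only the
        # first user of each (tenant, role) pair pays for the calculation.
        if user_id not in self._identity_cache:
            self._identity_cache[user_id] = self._feature_permissions(
                tenant_id, role_id
            )
        return self._identity_cache[user_id]


def slow_lookup(tenant_id, role_id):
    # Placeholder for the real tenant/role/division calculation.
    return {f"{role_id}.view", f"{role_id}.edit"}


service = PermissionService(slow_lookup)
for user_id in ("ann", "bob", "cho"):
    service.login(user_id, tenant_id="acme", role_id="captain")
service.login("dee", tenant_id="acme", role_id="engineer")
```

In this sketch, four users log in but the expensive calculation runs only twice, once per distinct role. Under the stated assumptions, that is the effect a shared cache has during a morning login rush, when many users with the same role arrive within minutes of each other.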
Adding the second cache has had a big impact on overall system performance. We first deployed the new code at a non-peak time (15:00 Pacific Time), so the server wasn’t under a high load. We didn’t expect to see much of a change right away because not many people log in at that time of day. However, as soon as we deployed the new code, the server load immediately dropped by more than 50 percent!
A few months later, we examined the impact of the new code during peak login hours and the results were similar: the new login code resulted in a 60 percent reduction in server load.
So what do all these backend changes mean for you? As soon as you upgrade to Version 1.21, you should expect to see faster login times, faster page loads, and an overall improved user experience with the system.