Performance Impact of Exceptions: Why Ops, Test and Dev Need to Care

Exceptions can have a severe impact on resource utilization as well as end-user performance

Does your Ops team care about the number of Exceptions thrown in the application - do they even monitor this number? Does your Test Team report the list of Exceptions thrown during a load test to engineering or are they just sending those that end up in a logfile? Is development interested in the Exceptions that are thrown within frameworks while executing their unit tests? Why should they care? Is there a real impact on performance that comes from a couple of exceptions?

Two years ago Alois Reitbauer wrote a nice article about The Cost of an Exception, a cost that is typically hard to evaluate. After a recent deployment of a new version we saw that 30% of the CPU on our application server was consumed by creating Exception objects. These Exceptions never made it to a logfile, so nobody really cared until we identified them as a performance problem for both the infrastructure and the end user. The root cause was simple, but it is hard to find unless you look at all Exceptions thrown, not just those that bubble up to the end user or show up as SEVERE messages in log files.

The big lesson learned was that Exceptions can have a severe impact on resource utilization as well as end-user performance. After this discovery Ops, Test and Dev are now watching out for high Exception creation in order to ensure that code changes, configuration changes or deployment mistakes are detected before they impact the end user.

Symptom: High CPU Utilization on an Application Server
During a recent production load test that we ran against an updated version of our community site we noticed that the CPU on our application server was behaving differently compared to previous tests. We ran this test outside of regular business hours in order not to impact the regular users on the production system. We expected CPU utilization to increase with load, but compared with a previous production load test it was much higher than expected. The following screenshot shows the Process Health Dashboard of our Java Application Server (Tomcat) where the CPU displayed the unexpected behavior:

The Application Server shows much higher CPU than we expected.

Root Cause: CPU Hotspots Reveal Exception Handling as Main Performance Problem
The next step was to identify the hotspots in the application causing the high CPU utilization. The following screenshot shows the top CPU-consuming methods in a 5-minute interval on our application server just when CPU utilization began crossing the 60% mark. 96 seconds (s) out of the 300s (5 minutes), or about 32%, were consumed by fillInStackTrace(), which is called every time an Exception object is created:

Creating Exceptions calls the fillInStackTrace method contributing to high CPU utilization on the AppServer.

fillInStackTrace() is called from the Throwable constructor. That means that every Exception that gets created ends up calling this method, which turns out to be our hotspot. We also see that in 79% of the cases it is a MissingResourceException, thrown when one of the i18n utility classes tries to get text from the deployed resource bundles.
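The relationship between object creation and fillInStackTrace() can be checked with a small sketch. The class names below are hypothetical and not taken from the application described here. The example counts calls to fillInStackTrace() for exceptions that are created but never thrown, and contrasts them with a stackless variant that uses the standard four-argument constructor with writableStackTrace set to false. Stackless exceptions trade diagnostic information for speed, so this is an illustration of where the cost comes from and not a recommendation for this particular case.

```java
final class StackTraceCostDemo {
    // Number of times fillInStackTrace() has run for CountingException.
    static int fillCount = 0;

    // Ordinary exception: the Throwable constructor calls fillInStackTrace().
    static class CountingException extends RuntimeException {
        CountingException(String message) {
            super(message);
        }

        @Override
        public synchronized Throwable fillInStackTrace() {
            fillCount++;
            return super.fillInStackTrace();
        }
    }

    // Stackless variant: writableStackTrace=false skips the stack walk.
    static class StacklessException extends RuntimeException {
        StacklessException(String message) {
            super(message, null, false, false);
        }
    }

    // Creates (but never throws) n exceptions and returns the number of
    // fillInStackTrace() calls that this caused.
    static int createWithoutThrowing(int n) {
        fillCount = 0;
        for (int i = 0; i < n; i++) {
            new CountingException("never thrown");
        }
        return fillCount;
    }

    public static void main(String[] args) {
        System.out.println("fillInStackTrace calls: " + createWithoutThrowing(1000));
        System.out.println("frames captured by an ordinary exception: "
                + new CountingException("x").getStackTrace().length);
        System.out.println("frames captured by a stackless exception: "
                + new StacklessException("cheap").getStackTrace().length);
    }
}
```

The count equals the number of objects created, even though nothing is thrown or caught. The cost is paid at construction time, which is why Exceptions that are swallowed silently still show up in a CPU profile.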

Sheer Volume Is the Problem - Not the Individual Exception
As with a lot of things, it is not a single Exception that consumes the CPU but the sum of all Exceptions. How many did it take to consume 30% CPU? In our case about 182,000 Exceptions in 5 minutes!
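A rough back-of-the-envelope calculation puts these numbers in proportion. The sketch below assumes that all 96 seconds measured in fillInStackTrace() belong to the 182,000 Exceptions and that the 300-second window corresponds to one CPU's worth of time. Both are simplifications, so the result is only an order-of-magnitude estimate.

```java
import java.util.Locale;

final class ExceptionCostEstimate {
    // Average number of exceptions created per second over the window.
    static double perSecond(long exceptions, double windowSeconds) {
        return exceptions / windowSeconds;
    }

    // Average CPU time per exception in milliseconds.
    static double cpuMillisPerException(double cpuSeconds, long exceptions) {
        return cpuSeconds * 1000.0 / exceptions;
    }

    public static void main(String[] args) {
        long exceptions = 182_000;   // Exceptions counted in the interval
        double windowSeconds = 300;  // 5-minute interval
        double cpuSeconds = 96;      // time spent in fillInStackTrace()

        System.out.printf(Locale.ROOT, "rate: %.0f exceptions/s%n",
                perSecond(exceptions, windowSeconds));
        System.out.printf(Locale.ROOT, "cost: %.2f ms CPU per exception%n",
                cpuMillisPerException(cpuSeconds, exceptions));
        System.out.printf(Locale.ROOT, "share of window: %.0f%%%n",
                cpuSeconds / windowSeconds * 100);
    }
}
```

Under these assumptions the server created roughly 607 Exceptions per second at about half a millisecond of CPU each. Neither number looks alarming on its own, which is why the problem only becomes visible in the aggregate.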

182,000 Exceptions thrown in 5 minutes cause the 30% CPU overhead.

An Obsolete Plugin Is the Root Cause
We quickly identified the problem by looking at the PureStack and PurePath information and the log files, and with help from the great support team at Atlassian. It turned out that after we upgraded our Confluence instance to a newer version we forgot to upgrade one of the plugins that we no longer use. The old version of the plugin caused these Exceptions when Confluence iterated through the different resource packages. As we don't use the problematic plugin, no end user would have complained about broken functionality. The only way this problem manifested itself was unusually high CPU consumption that, under heavy load, impacts all users on the system.
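To illustrate how a resource lookup can generate Exceptions that nobody ever sees, here is a minimal sketch using the standard java.util.ResourceBundle API. It is not the Confluence or plugin code, which is not shown in this article. The class, the method names and the demo bundle are hypothetical. The first lookup relies on catching MissingResourceException for absent keys; the second checks containsKey() first and so avoids creating an Exception for a miss.

```java
import java.util.ListResourceBundle;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

final class SafeBundleLookup {
    // Exception-driven lookup: every missing key creates a
    // MissingResourceException that is swallowed right away.
    static String lookupWithException(ResourceBundle bundle, String key, String fallback) {
        try {
            return bundle.getString(key);
        } catch (MissingResourceException e) {
            return fallback;
        }
    }

    // Check-first lookup: a missing key costs a key lookup, not an exception.
    static String lookupWithCheck(ResourceBundle bundle, String key, String fallback) {
        return bundle.containsKey(key) ? bundle.getString(key) : fallback;
    }

    // Tiny in-memory bundle so the example runs without properties files.
    static final class DemoBundle extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] { { "greeting", "Hello" } };
        }
    }

    public static void main(String[] args) {
        ResourceBundle bundle = new DemoBundle();
        System.out.println(lookupWithException(bundle, "missing.key", "n/a"));
        System.out.println(lookupWithCheck(bundle, "missing.key", "n/a"));
        System.out.println(lookupWithCheck(bundle, "greeting", "n/a"));
    }
}
```

Both methods return the same values, so callers and end users cannot tell them apart. The difference only shows up in the number of Exception objects created under load.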

Lessons Learned for Dev, Test and Ops
Knowing that Exceptions can hurt performance means that we need to prevent too many Exceptions from being thrown. All teams involved in the application lifecycle can do their part to make sure that the problem won't occur or, if it does happen, that it is addressed proactively. Here is how:

  • Operations: They now monitor and alert on unusual behavior in the number of Exceptions thrown in production. This catches problems that are introduced with configuration changes or with the deployment of new code that hasn't been thoroughly tested. It also detects deployment issues such as missing files, which result in similar Exceptions.
  • Testing: They look at the number of Exceptions after every test run, just as they did after this one. Comparing it with previous tests allows them to identify any regression that was introduced.
  • Development: We develop our own plugins and extensions to Confluence. This story taught us that during development we also need to make sure that our custom plugins don't access any APIs that cause internal Exceptions. We automated that check through tests executed in our continuous integration. These tests also capture the number of Exceptions and fail the build if we observe atypical behavior.
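The comparison against a previous run that all three teams rely on can be sketched as a simple threshold check. How the Exception counts are collected depends on the monitoring in place and is not covered here. The class name, the method names, the tolerance value and the sample baseline are all assumptions made for illustration.

```java
final class ExceptionCountGate {
    // Returns true when the observed count exceeds the baseline by more
    // than the tolerated fraction (0.2 means 20% above baseline).
    static boolean isRegression(long baseline, long observed, double tolerance) {
        return observed > baseline * (1.0 + tolerance);
    }

    // Fails (for example inside a CI test) when the count regresses.
    static void assertWithinBudget(long baseline, long observed, double tolerance) {
        if (isRegression(baseline, observed, tolerance)) {
            throw new AssertionError("Exception count " + observed
                    + " exceeds baseline " + baseline
                    + " by more than " + (tolerance * 100) + "%");
        }
    }

    public static void main(String[] args) {
        long baseline = 1_200;   // hypothetical count from a previous run
        double tolerance = 0.2;  // hypothetical: allow 20% growth

        // A small fluctuation stays within the budget.
        System.out.println(isRegression(baseline, 1_250, tolerance));
        // A jump to 182,000 Exceptions is flagged.
        System.out.println(isRegression(baseline, 182_000, tolerance));
    }
}
```

Operations could use the same comparison for alerting in production, while Test and Dev would call something like assertWithinBudget() at the end of a load test or build. A relative tolerance keeps normal fluctuation between runs from failing the build.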

More Stories By Andreas Grabner

Andreas Grabner has been helping companies improve their application performance for 15+ years. He is a regular contributor within Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi
