Welcome!

IoT User Interface Authors: Elizabeth White, Peter Silva, Yakov Fain, John Basso, Derek Weeks

Related Topics: @DevOpsSummit, Java IoT, Linux Containers, IoT User Interface, @CloudExpo, @BigDataExpo

@DevOpsSummit: Blog Post

What If It Is the Network? Dive Deep to Find the Root Cause

How do you identify the real cause behind the network problems?

Modern Application Performance Management (APM) solutions can be tremendously helpful in delivering end-to-end visibility into the application delivery chain: across all tiers and network sections, all the way to the end user. In previous blog posts we showed how to narrow down to various root causes of the problems that the end users might experience. Those issues ranged from infrastructure through application and network, and through the end-user client application or inefficient use of the software. When the problem comes from the end user application, e.g., a Web 2.0 Web site, user experience management (UEM) solutions can offer broad analysis of possible root causes. Similarly, when an APM fault domain algorithm points to the application, the DevOps team can go deep into the actually executed code and database queries to identify the root cause of the problem.

But what do you do when your APM tool points to the network as the fault domain? How do you identify the real cause behind the network problems? Most of the APM tools stop there, forcing the network team to use separate solutions to monitor the actual network packets.

In this article we show how an Application-Aware Network Performance Management (AANPM) suite can be used to not only zero in on the network problems as the fault domain, but also dive deeper to show the actual trace of network packets in the selected context, captured back at the time when the problem happened.

Isolating Fault Domain to the Network
In one of our blog posts we wrote how Fonterra used our APM tools to identify the problem with SAP application used in the milk churn scanning process. The operations team could easily isolate the fault domain to network problems (see Figure 1); they required, however, further analysis to identify the root cause behind that network problem.

Figure 1: The performance report indicates network problems as the fault domain

In some cases information about loss rate or zero window events is enough to successfully find and resolve the problem. In general, finding the root cause may require you to analyze more detailed, packet level views in order to see exactly what is causing this network performance problem. These details can not only help to determine why we experienced packet loss or zero window events, but also whether the problem was gradually ramping up or if there was a sudden flow control blockage, which would indicate congestion.

For example, a number of users start to experience performance degradation of the service and APM points to the network as the fault domain. The detailed, packet-level analysis can show that the whole service delivery process was blocked by failed initial name resolution.

What Really Happened in the Network?
Why is detailed packet-level analysis so important when our AANPM points to the network?

Let's first consider what happens when we determine fault domain with one of the application delivery tiers. The engineers responsible for that application can start analyzing logs or, better, drill down to single transaction execution steps and often isolate the problem to the actual line of code that was causing the whole performance degradation of the whole application.

However, when our AANPM tells us it is the network, there are no logs or code execution steps to drill down to. Unless we can deliver conclusive and actionable evidence in the form of detailed, packet-level analysis, the network team might have a problem determining the root cause and may remain skeptical whether the network is at fault at all.

This is exactly what happened to one of our customers. An APM solution had correctly identified that there was a performance problem with the web server. The reports showed who was affected and where the users affected by that problem were located when the problem was occurring. The system also pointed toward the network as the primary fault domain.

The network team tried to determine the root cause of the problem. They needed packet level data for that. But, capturing all traffic with a network protocol analyzer after the incident happened not only overloaded the IT team with unnecessary data, but eventually turned out to be a hit and miss.

What the team needed were the network packets at the time the problem occurred, and only those few packets that related to the actual communication realizing affected transactions.

Figure 2: You can drill down to analyze captured network packets in the context of given user operations

For Figure 3, and further insight, click here for the full article.

More Stories By Sebastian Kruk

Sebastian Kruk is a Technical Product Strategist, Center of Excellence, at Compuware APM Business Unit.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@CloudExpo Stories
Fact: storage performance problems have only gotten more complicated, as applications not only have become largely virtualized, but also have moved to cloud-based infrastructures. Storage performance in virtualized environments isn’t just about IOPS anymore. Instead, you need to guarantee performance for individual VMs, helping applications maintain performance as the number of VMs continues to go up in real time. In his session at Cloud Expo, Dhiraj Sehgal, Product and Marketing at Tintri, wil...
19th Cloud Expo, taking place November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy. Meanwhile, 94% of enterpri...
Enterprises have forever faced challenges surrounding the sharing of their intellectual property. Emerging cloud adoption has made it more compelling for enterprises to digitize their content, making them available over a wide variety of devices across the Internet. In his session at 19th Cloud Expo, Santosh Ahuja, Director of Architecture at Impiger Technologies, will introduce various mechanisms provided by cloud service providers today to manage and share digital content in a secure manner....
SYS-CON Events announced today that 910Telecom will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Housed in the classic Denver Gas & Electric Building, 910 15th St., 910Telecom is a carrier-neutral telecom hotel located in the heart of Denver. Adjacent to CenturyLink, AT&T, and Denver Main, 910Telecom offers connectivity to all major carriers, Internet service providers, Internet backbones and ...
To leverage Continuous Delivery, enterprises must consider impacts that span functional silos, as well as applications that touch older, slower moving components. Managing the many dependencies can cause slowdowns. See how to achieve continuous delivery in the enterprise.
There is growing need for data-driven applications and the need for digital platforms to build these apps. In his session at 19th Cloud Expo, Muddu Sudhakar, VP and GM of Security & IoT at Splunk, will cover different PaaS solutions and Big Data platforms that are available to build applications. In addition, AI and machine learning are creating new requirements that developers need in the building of next-gen apps. The next-generation digital platforms have some of the past platform needs a...
SYS-CON Events announced today Telecom Reseller has been named “Media Sponsor” of SYS-CON's 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Telecom Reseller reports on Unified Communications, UCaaS, BPaaS for enterprise and SMBs. They report extensively on both customer premises based solutions such as IP-PBX as well as cloud based and hosted platforms.
DevOps at Cloud Expo – being held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA – announces that its Call for Papers is open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the world's largest enterprises – and delivering real results. Am...
Internet of @ThingsExpo, taking place November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 19th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The Internet of Things (IoT) is the most profound change in personal and enterprise IT since the creation of the Worldwide Web more than 20 years ago. All major researchers estimate there will be tens of billions devices - comp...
SYS-CON Events announced today that eCube Systems, a leading provider of middleware modernization, integration, and management solutions, will exhibit at @DevOpsSummit at 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. eCube Systems offers a family of middleware evolution products and services that maximize return on technology investment by leveraging existing technical equity to meet evolving business needs. ...
StarNet Communications Corp has announced the addition of three Secure Remote Desktop modules to its flagship X-Win32 PC X server. The new modules enable X-Win32 to safely tunnel the remote desktops from Linux and Unix servers to the user’s PC over encrypted SSH. Traditionally, users of PC X servers deploy the XDMCP protocol to display remote desktop environments such as the Gnome and KDE desktops on Linux servers and the CDE environment on Solaris Unix machines. XDMCP is used primarily on comp...
Traditional on-premises data centers have long been the domain of modern data platforms like Apache Hadoop, meaning companies who build their business on public cloud were challenged to run Big Data processing and analytics at scale. But recent advancements in Hadoop performance, security, and most importantly cloud-native integrations, are giving organizations the ability to truly gain value from all their data. In his session at 19th Cloud Expo, David Tishgart, Director of Product Marketing ...
DevOps at Cloud Expo, taking place Nov 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 19th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long dev...
The 19th International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Digital Transformation, Microservices and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportuni...
Aspose.Total for .NET is the most complete package of all file format APIs for .NET as offered by Aspose. It empowers developers to create, edit, render, print and convert between a wide range of popular document formats within any .NET, C#, ASP.NET and VB.NET applications. Aspose compiles all .NET APIs on a daily basis to ensure that it contains the most up to date versions of each of Aspose .NET APIs. If a new .NET API or a new version of existing APIs is released during the subscription peri...
Data is the fuel that drives the machine learning algorithmic engines and ultimately provides the business value. In his session at Cloud Expo, Ed Featherston, a director and senior enterprise architect at Collaborative Consulting, will discuss the key considerations around quality, volume, timeliness, and pedigree that must be dealt with in order to properly fuel that engine.
SYS-CON Events announced today that StarNet Communications will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. StarNet Communications’ FastX is the industry first cloud-based remote X Windows emulator. Using standard Web browsers (FireFox, Chrome, Safari, etc.) users from around the world gain highly secure access to applications and data hosted on Linux-based servers in a central data center. ...
SYS-CON Events announced today that Hitrons Solutions will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Hitrons Solutions Inc. is distributor in the North American market for unique products and services of small and medium-size businesses, including cloud services and solutions, SEO marketing platforms, and mobile applications.
As the world moves toward more DevOps and Microservices, application deployment to the cloud ought to become a lot simpler. The Microservices architecture, which is the basis of many new age distributed systems such as OpenStack, NetFlix and so on, is at the heart of Cloud Foundry - a complete developer-oriented Platform as a Service (PaaS) that is IaaS agnostic and supports vCloud, OpenStack and AWS. Serverless computing is revolutionizing computing. In his session at 19th Cloud Expo, Raghav...
Pulzze Systems was happy to participate in such a premier event and thankful to be receiving the winning investment and global network support from G-Startup Worldwide. It is an exciting time for Pulzze to showcase the effectiveness of innovative technologies and enable them to make the world smarter and better. The reputable contest is held to identify promising startups around the globe that are assured to change the world through their innovative products and disruptive technologies. There w...