By Leon Fayer
July 27, 2012 12:00 PM EDT
I've talked at length about the importance of business process monitoring alongside system monitoring, but in discussions I have found that an overview and simple examples are sometimes not enough to convince people of the benefits of this approach. Business owners think they don't need to know anything about the operational performance of their systems as long as they have their numbers, and engineers often don't feel they need to invest time in understanding the business they support in detail, finding the examples shown too "common sense."
One thing we ask engineering candidates during interviews at work is to describe how they would go about troubleshooting a hypothetical issue, given only minimal information. We often hear things from our clients like "our website is slow" and "something's wrong with registration" with no additional information, and in order to figure out the potential issue we need to review the whole system at a glance. For large applications, with a myriad of moving and interweaving components, this is not an easy task. This is one of the reasons we look for the best of the best. But if you are monitoring all of those components, the task can be simplified in a lot of cases.
So let's examine a real problem. A large e-commerce company called and said that they were seeing less money coming in from web transactions. They have a pretty complex system with a lot of different revenue generation points, so this observation shed very little light on the root cause of the problem. Luckily, both systems and business processes were being monitored with Circonus, so the data was available to review.
As any engineer knows, step one of troubleshooting is to confirm the problem, so looking at the revenue trends seemed like a good starting point.
The graph clearly shows that, starting around April 30th, the trend looked abnormal in comparison to the previous few weeks. So it seemed like there was an actual problem, and potentially the issue could lie in the payment processor itself or somewhere in the system, preventing certain users from making a purchase. So let's overlay the traffic trends, collected from Google Analytics, against the revenue graph and see if there are any common trends.
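The comparison against the previous few weeks can also be made numerically rather than by eye. What follows is a minimal Python sketch, not a description of how Circonus detects anomalies: the function name `abnormal_days`, the threshold, and the sample numbers are all made up for illustration. It flags the days whose revenue falls more than a chosen number of standard deviations below the baseline mean.

```python
from statistics import mean, stdev


def abnormal_days(baseline, recent, threshold=3.0):
    """Return indexes of days in `recent` that fall more than `threshold`
    standard deviations below the mean of `baseline`.

    `baseline` and `recent` are lists of daily revenue figures
    (hypothetical inputs; any daily metric would work the same way).
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline gives no scale to measure deviation by.
        return []
    return [i for i, value in enumerate(recent) if (mu - value) / sigma > threshold]


# Illustrative numbers only, not the client's data.
baseline = [100, 102, 98, 101, 99, 100, 103]
recent = [101, 60, 58]
print(abnormal_days(baseline, recent))  # → [1, 2]
```

A check like this only confirms that something changed. It says nothing about why, which is where the rest of the investigation comes in.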
Even though the traffic showed a clear drop at the same time as revenue, the ratio remained the same, allowing us to exclude the payment processor and other application logic from the equation (for now).
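The reasoning here is that if revenue per visit holds steady while both revenue and traffic fall, the checkout path is probably converting as usual and the problem lies upstream. A rough Python sketch of that check is below. The function names, the 10% tolerance, and the sample figures are assumptions chosen for illustration, not values from the actual investigation.

```python
def revenue_per_visit(revenue, visits):
    """Daily revenue divided by daily visits, skipping zero-traffic days."""
    return [r / v for r, v in zip(revenue, visits) if v > 0]


def ratio_is_stable(before, after, tolerance=0.10):
    """True if the mean ratio after the drop is within `tolerance`
    (as a fraction) of the mean ratio before it."""
    mean_before = sum(before) / len(before)
    mean_after = sum(after) / len(after)
    return abs(mean_after - mean_before) <= tolerance * mean_before


# Illustrative numbers only: revenue and traffic fall together.
before = revenue_per_visit([10000, 10200, 9900], [50000, 51000, 49500])
after = revenue_per_visit([6000, 6100, 5900], [30000, 30500, 29500])
print(ratio_is_stable(before, after))
```

If the ratio had dropped as well, meaning the same traffic was producing less revenue, the payment processor and application logic would have stayed on the list of suspects.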
Note: This is the first potential breaking point in the process. It is very tempting to look at the ratios, attribute the revenue decrease to the traffic decrease, and stop the investigation. Unfortunately, 99% of the time nothing "just" happens, so on we go.
Now for the next step: what would be a logical cause for a drop in overall traffic to the site? Response time is probably the first thought that should come to mind. So let's look at what the HTTP checks collected.
Load times didn't seem to be deviating from the norm, but the HTTP response metric doesn't provide full visibility into the load times for a dynamic application, so let's check the health of the database and CPU usage on the server(s) to validate that the underlying platform is not the bottleneck. There are numerous metrics for monitoring database and system health that should be collected, and in this case are, but when researching the root cause of an elusive problem, diving deep into a specific component can waste time early in the process.
Both of the metrics appear well within the norm, so at first glance it seems like the problem is not a systems issue.
Note: This is the second point of the investigation where the process can break down. A lot of technologists will report either that the problem could not be confirmed or that it is just an anomaly, because the system monitors don't exhibit any issues. This is exactly why the technology team's understanding of the business is vital.
With that said, what would be the next logical process to validate? It is not uncommon for an e-commerce site to see a drop in purchases if it either stops promoting or runs an ineffective marketing campaign: traffic to the site slows down, which in turn decreases the number of transactions. This company, in particular, sends out tens of millions of emails a day, which bring in new users and, subsequently, new conversions. So let's take a look at the email deliverability and bounce rates collected from the company's MTAs.
Bingo! The bounce rates skyrocketed at the same time as the drop in traffic and revenue occurred. Upon closer investigation, it appeared that one of the major ESPs had accidentally blocked the delivery domain, and the emails did not go through to the recipients. The issue was resolved (after some discussions with the ESP) and the trends returned to the expected level.
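The "same time" observation comes from overlaying graphs, but it can also be expressed as a number. The sketch below computes a plain Pearson correlation between a daily bounce rate series and a daily traffic series. It is an illustration under stated assumptions: the `pearson` helper and the sample figures are hypothetical, and a strong correlation is only a pointer toward a cause, not proof of one.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Illustrative numbers only: bounce rate jumps as visits fall.
bounce_rate = [0.02, 0.02, 0.03, 0.30, 0.32, 0.31]
visits = [50000, 51000, 49500, 30000, 30500, 29500]
print(round(pearson(bounce_rate, visits), 3))
```

A value close to -1 means the two series move in opposite directions almost in lockstep, which is what the overlaid graphs showed visually.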
Keep in mind that, had email deliverability not been the issue, there were multiple other metrics on the list to verify, both system (operational and development alike) and business. The amazing part of all of this is that I was able to view the whole system at a glance in just one graph. Granted, stacking everything on one graph is probably not the best everyday approach, but it is very useful in certain cases when a direct overlay correlation is needed. For everything else, a real-time dashboard that displays all the vital points of the business at any given moment is a must-have for anyone responsible for business and/or system health.
Everyone responsible for the success of a business, regardless of role, needs the ability to see the status of the whole business at a glance at any given point. System engineers don't need to know all the ins and outs of marketing, but they should be aware of the overall organizational goals and should be able to spot irregularities in the business trends. Similarly, CEOs don't need to know how systems work in the background, but they should be able to correlate high email bounce rates (if email is critical to the business) with a decrease in purchases.
The point of all of this is twofold: to show that everything should be monitored, and to suggest some tools and methods that can enable users in all roles, within any organization, to ensure the success of the business. Get 'em, learn 'em, use 'em! You will thank me later.