|By Andreas Grabner||
|October 6, 2014 04:00 AM EDT||
Web Service Monitoring 101: Identifying Bad Deployments
Have you ever deployed a change to production and thought "All went well - Systems are operating as expected!" but then you had to deal with users complaining that they keep running into errors?
When deployments fail you don't want your users to be the first to tell you about it: Sit down with the Business and Dev to define how and what to monitor
We recently moved some of our systems between two of our data centers - even moving some components to the public cloud. Everything was prepared well, system monitoring was set up and everyone gave the thumbs up to execute the move. Immediately following, our Operations dashboards continued to show green. Soon thereafter I received a complaint from a colleague who reported that he couldn't use one of the migrated services (our free dynaTrace AJAX Edition) anymore as the authentication web service seemed to fail. The questions we asked ourselves were:
- Impact: Was this a problem related to his account only or did it impact more users?
- Root Cause: What is the root cause and how was this problem introduced?
- Alerting: Why don't our Ops monitoring dashboards show any failed web service calls?
It turned out that the problem was in fact
- Caused by an outdated configuration file deployment
- It only impacted employees whose accounts were handled by a different authentication back-end service
- Didn't show up in Ops dashboards because the used SOAP Framework always return HTTP 200 transporting any success/failure information in the response body which doesn't show up in any web server log file
In this blog I give you a little more insight on how we triaged the problem and some best practices we derived from that incident in order to level-up technical implementations and production monitoring. Only if you monitor all your system components and correlate the results with deployment tasks will you be able to deploy with more confidence without disrupting your business.
Bad Monitoring: When Your End Users become your Alerting System
So - when I got a note from a colleague that he could no longer use dynaTrace AJAX Edition to analyze the web site performance of a particular web site, I launched my copy to verify this behavior. It failed with my credentials, which proved that it was not a local problem on my colleague's machine:
Business Problem: Our end users can't use our free product due to failing authentication service
Asking our Ops Team that manages and monitors these web services resulted in the following response:
"We do not see any errors on the Web Server nor do we have any reported availability problems on our authentication service. It's all green on our infrastructure dashboards as can be seen on the following screenshot:"
Infrastructure is all green: No HTTP-based errors or SLA problems based on IIS log or on any of the resources on the host
Web Server Log Monitoring Is Not Enough
As mentioned in the initial paragraph, it turned out that our SOAP Framework always returns HTTP 200 with the actual error in the response body. This is not an uncommon "Best (or worst) Practice" as you can see for instance on the following discussion on GitHub.
The problem with that approach though is that "traditional" operations monitoring based on web server log files will not detect any of these "logical/business" problems. As you don't want to wait until your users start complaining, it's time to level-up your monitoring approach. How can this be done? Those developing and those monitoring the system need to sit down and figure out a way how to monitor the usage of these services and need to talk with business to figure out which level of detail to report and alert on.
How can you find out if your current monitoring approach works? Start by looking more closely at problems reported by your users but that you don't get any automatic alerts on. Then, talk with engineers and see whether they use frameworks like mentioned here.
For further insight, and for lessons learned, click here for the full article.
Too often with compelling new technologies market participants become overly enamored with that attractiveness of the technology and neglect underlying business drivers. This tendency, what some call the “newest shiny object syndrome,” is understandable given that virtually all of us are heavily engaged in technology. But it is also mistaken. Without concrete business cases driving its deployment, IoT, like many other technologies before it, will fade into obscurity.
Sep. 1, 2015 11:45 PM EDT Reads: 388
Any Ops team trying to support a company in today’s cloud-connected world knows that a new way of thinking is required – one just as dramatic than the shift from Ops to DevOps. The diversity of modern operations requires teams to focus their impact on breadth vs. depth. In his session at DevOps Summit, Adam Serediuk, Director of Operations at xMatters, Inc., will discuss the strategic requirements of evolving from Ops to DevOps, and why modern Operations has begun leveraging the “NoOps” approa...
Sep. 1, 2015 08:15 PM EDT Reads: 420
IBM’s Blue Box Cloud, powered by OpenStack, is now available in any of IBM’s globally integrated cloud data centers running SoftLayer infrastructure. Less than 90 days after its acquisition of Blue Box, IBM has integrated its Blue Box Cloud Dedicated private-cloud-as-a-service into its broader portfolio of OpenStack® based solutions. The announcement, made today at the OpenStack Silicon Valley event, further highlights IBM’s continued support to deliver OpenStack solutions across all cloud depl...
Sep. 1, 2015 07:00 PM EDT Reads: 271
In their Live Hack” presentation at 17th Cloud Expo, Stephen Coty and Paul Fletcher, Chief Security Evangelists at Alert Logic, will provide the audience with a chance to see a live demonstration of the common tools cyber attackers use to attack cloud and traditional IT systems. This “Live Hack” uses open source attack tools that are free and available for download by anybody. Attendees will learn where to find and how to operate these tools for the purpose of testing their own IT infrastructu...
Sep. 1, 2015 06:45 PM EDT Reads: 476
Red Hat is investing in Tesora, the number one contributor to OpenStack Trove Database as a Service (DBaaS) also ranked among the top 20 companies contributing to OpenStack overall. Tesora, the company bringing OpenStack Trove Database as a Service (DBaaS) to the enterprise, has announced that Red Hat and others have invested in the company as a part of Tesora's latest funding round. The funding agreement expands on the ongoing collaboration between Tesora and Red Hat, which dates back to Febr...
Sep. 1, 2015 04:30 PM EDT Reads: 388
SYS-CON Events announced today that DataClear Inc. will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. The DataClear ‘BlackBox’ is the only solution that moves your PC, browsing and data out of the United States and away from prying (and spying) eyes. Its solution automatically builds you a clean, on-demand, virus free, new virtual cloud based PC outside of the United States, and wipes it clean...
Sep. 1, 2015 04:15 PM EDT Reads: 436
WSM International, the pioneer and leader in server migration services, has announced an agreement with WHOA.com, a leader in providing secure public, private and hybrid cloud computing services. Under terms of the agreement, WSM will provide migration services to WHOA.com customers to relocate some or all of their applications, digital assets, and other computing workloads to WHOA.com enterprise-class, secure cloud infrastructure. The migration services include detailed evaluation and planning...
Sep. 1, 2015 04:00 PM EDT Reads: 196
Cloud and datacenter migration innovator AppZero has joined the Microsoft Enterprise Cloud Alliance Program. AppZero is a fast, flexible way to move Windows Server applications from any source machine – physical or virtual – to any destination server, in any cloud or datacenter, using its patented container technology. AppZero’s container is also called a Virtual Application Appliance (VAA). To facilitate Microsoft Azure onboarding, AppZero has two purpose-built offerings: AppZero SP for Azure,...
Sep. 1, 2015 04:00 PM EDT Reads: 211
SYS-CON Events announced today that G2G3 will exhibit at SYS-CON's @DevOpsSummit Silicon Valley, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Based on a collective appreciation for user experience, design, and technology, G2G3 is uniquely qualified and motivated to redefine how organizations and people engage in an increasingly digital world.
Sep. 1, 2015 03:00 PM EDT Reads: 511
SYS-CON Events announced today that IceWarp will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. IceWarp, the leader of cloud and on-premise messaging, delivers secured email, chat, documents, conferencing and collaboration to today's mobile workforce, all in one unified interface
Sep. 1, 2015 03:00 PM EDT Reads: 442
In 2014, the market witnessed a massive migration to the cloud as enterprises finally overcame their fears of the cloud’s viability, security, etc. Over the past 18 months, AWS, Google and Microsoft have waged an ongoing battle through a wave of price cuts and new features. For IT executives, sorting through all the noise to make the best cloud investment decisions has become daunting. Enterprises can and are moving away from a "one size fits all" cloud approach. The new competitive field has ...
Sep. 1, 2015 02:45 PM EDT
With the proliferation of connected devices underpinning new Internet of Things systems, Brandon Schulz, Director of Luxoft IoT – Retail, will be looking at the transformation of the retail customer experience in brick and mortar stores in his session at @ThingsExpo. Questions he will address include: Will beacons drop to the wayside like QR codes, or be a proximity-based profit driver? How will the customer experience change in stores of all types when everything can be instrumented and a...
Sep. 1, 2015 12:45 PM EDT Reads: 479
SYS-CON Events announced today that HPM Networks will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. For 20 years, HPM Networks has been integrating technology solutions that solve complex business challenges. HPM Networks has designed solutions for both SMB and enterprise customers throughout the San Francisco Bay Area.
Sep. 1, 2015 12:30 PM EDT Reads: 916
This Enterprise Strategy Group lab validation report of the NEC Express5800/R320 server with Intel® Xeon® processor presents the benefits of 99.999% uptime NEC fault-tolerant servers that lower overall virtualized server total cost of ownership. This report also includes survey data on the significant costs associated with system outages impacting enterprise and web applications. Click Here to Download Report Now!
Sep. 1, 2015 12:30 PM EDT Reads: 262
Enterprises can achieve rigorous IT security as well as improved DevOps practices and Cloud economics by taking a new, cloud-native approach to application delivery. Because the attack surface for cloud applications is dramatically different than for highly controlled data centers, a disciplined and multi-layered approach that spans all of your processes, staff, vendors and technologies is required. This may sound expensive and time consuming to achieve as you plan how to move selected applicati...
Sep. 1, 2015 12:30 PM EDT
Through WebRTC, audio and video communications are being embedded more easily than ever into applications, helping carriers, enterprises and independent software vendors deliver greater functionality to their end users. With today’s business world increasingly focused on outcomes, users’ growing calls for ease of use, and businesses craving smarter, tighter integration, what’s the next step in delivering a richer, more immersive experience? That richer, more fully integrated experience comes ab...
Sep. 1, 2015 11:45 AM EDT Reads: 678
SYS-CON Events announced today that Pythian, a global IT services company specializing in helping companies leverage disruptive technologies to optimize revenue-generating systems, has been named “Bronze Sponsor” of SYS-CON's 17th Cloud Expo, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Founded in 1997, Pythian is a global IT services company that helps companies compete by adopting disruptive technologies such as cloud, Big Data, advance...
Sep. 1, 2015 11:45 AM EDT Reads: 326
Organizations from small to large are increasingly adopting cloud solutions to deliver essential business services at a much lower cost. According to cyber security experts, the frequency and severity of cyber-attacks are on the rise, causing alarm to businesses and customers across a variety of industries. To defend against exploits like these, a company must adopt a comprehensive security defense strategy that is designed for their business. In 2015, organizations such as United Airlines, Sony...
Sep. 1, 2015 11:45 AM EDT Reads: 495
As more and more data is generated from a variety of connected devices, the need to get insights from this data and predict future behavior and trends is increasingly essential for businesses. Real-time stream processing is needed in a variety of different industries such as Manufacturing, Oil and Gas, Automobile, Finance, Online Retail, Smart Grids, and Healthcare. Azure Stream Analytics is a fully managed distributed stream computation service that provides low latency, scalable processing of ...
Sep. 1, 2015 11:15 AM EDT Reads: 297
Culture is the most important ingredient of DevOps. The challenge for most organizations is defining and communicating a vision of beneficial DevOps culture for their organizations, and then facilitating the changes needed to achieve that. Often this comes down to an ability to provide true leadership. As a CIO, are your direct reports IT managers or are they IT leaders? The hard truth is that many IT managers have risen through the ranks based on their technical skills, not their leadership ab...
Sep. 1, 2015 11:15 AM EDT Reads: 373