Click here to close now.

Welcome!

AJAX & REA Authors: AppDynamics Blog, Elizabeth White, Carmen Gonzalez, Pat Romanski, Liz McMillan

Blog Feed Post

Black Friday Horror Story Averted with Alerting and Monitoring

image_pdfimage_print

AppDynamics recently announced the launch of our Application Intelligence Platform, which is the underlying infrastructure that delivers our portfolio of products to customers. A key component of the Application Intelligence platform is the notion of extensibility – we can integrate with many of the existing tools you already have in place so you can leverage AppDynamics analytics in the tools and dashboards your team knows and loves with minimal effort. These extensions as we call them are available on the AppDynamics eXchange section of our Community for download, and customers even have the option to submit extensions they’ve written themselves to be included in the eXchange.

To illustrate the power of the 75+ extensions we’ve published in our community, I’ll walk you through two scenarios that involve several common technologies that are prevalent across our customer base.

____

Before AppDynamics:

Jerry has been tossing and turning all night long. In fact, he’s had difficulty sleeping the past three weeks. His sub-optimal sleep patterns are in large part a result of the production application environment he is responsible for. “Things always seem to break in the middle of the night,” Jerry complained to his wife earlier that day.

As the DevOps lead for their company’s mission critical ecom app, Jerry is copied on most urgent application related alerts so that he can help manually forward the details he gets from his current monitoring tools to the admin from his team who happens to be on call at the time. Tonight, he only received 5 such notifications which is less than normal, but still sufficient to wake him up throughout the night. As he squints in the darkness and his eyes adjust to the bright screen, he sees a new notification that troubles him… “SSL Certificate Expired?” he mumbles to himself. “How is that possible?”

He checks the clock – 5:30AM. The person who handles the SSL Certificate isn’t going to be awake for a few more hours. Jerry’s heart drops because he knows that for every hour his ecommerce application is down it costs his company about $10,000 of revenue. “Why wasn’t this on my radar?” Jerry says. “We could’ve planned for this.”

Jerry gets to work early and starts sending emails and calling stakeholders to schedule an 8:30AM conference call. By 9:15AM the action items and deliverables are clear. By 10:30AM the SSL Certificate is renewed and the ecom store is back online servicing customers. Whew. “That could’ve been a lot worse than just 5 hours of downtime and $50K of revenue impact,” Jerry reasons with a colleague.

Back at his desk, Jerry looks at his calendar, his next meeting is ‘testing & capacity planning’ which is a weekly recurring meeting with him and his team.

Jerry’s company is preparing for the holiday season (Black Friday, Cyber Monday, etc.) which is still a few months away but for ecommerce stores, these peak seasons are huge operational and business challenges. You know that $10K per hour of revenue metric?  During those peak days in the holiday season that quadruples to $40K of revenue per hour. The ecommerce store can’t have any hiccups during that time or the impact would be massive, and that’s why this particular recurring meeting leading up to the code freeze are very important.

Jerry greets his team and looks over the shoulder of one of his sys admins. She’s just got the application infrastructure diagram drawn on the white board and has the first load test done and now they are analyzing the results. Looks like most of the synthetic tests they’ve run completed with relatively few errors and utilization was within the acceptable range even as the load increased over the duration of the load test. So far so good.

Jerry moves on to peek over the shoulder of his DBA who is currently analyzing the Cassandra cluster metrics after the load test. Disk I/O looked good and memory looked OK. Over the course of the next hour Jerry’s team tests 6 different load testing and failover scenarios. Today’s tests are done – until next week.

“Everything looks good… a little too good,” Jerry says to himself. “My team and I understand things like utilization and throughput but how does that translate to things my boss and the rest of the business care about?”

If only there was another approach to monitoring that would save Jerry from the fire drills, cut down on the constant testing and debugging, and give him a real-time view into how customers were engaging with his ecommerce application…

Luckily for Jerry, AppDynamics does just that…  Let’s look at this same situation one year later.

After AppDynamics:

Jerry wakes up from a great night’s sleep and checks his email for the daily AppDynamics events digest that gets sent to him with all of the application events over the last 24 hours. Only one event in the digest. Ever since Jerry’s organization invested into AppDynamics’ products that are delivered on the Application Intelligence Platform, his dev team has gotten code-level visibility into the root cause of performance issues inside his ecommerce application and has substantially cut down the number of bugs in the software. That means less production issues for his team to deal with downstream.

Using the PagerDuty alerting extension, the one issue that was sent in Jerry’s digest triggered the creation of a help ticket and was automatically assigned to the technician on duty with no manual intervention on Jerry’s part.

By the time Jerry checked on the status, the issue was already resolved. Nice.

On his way to work, Jerry smiles and thinks about last year’s SSL Certificate debacle. Since installing the SSL Certificate Monitoring extension from AppDynamics, his team has been able to build a dashboard that shows the number of days left until the SSL Certificate expires. No more SSL Certs expiring without anyone knowing ahead of time.

Jerry arrives at work and goes to his recurring ‘testing and capacity planning’ meeting that his team sets up every year around this time. Since deploying AppDynamics and installing two additional AppDynamics extensions – the Cassandra monitoring extension and the Amazon Web Services (AWS) cloud connector extension – his testing and capacity planning work for the holiday season has gotten a lot easier.

First, AppDynamics has given him and his team a great topology view that has relieved them of their needs for Visio diagrams and whiteboarded architectures. Being able to have a real-time view of how the different components of an application interact with each other, and have that map update automatically as new code is released, was hugely valuable for Jerry’s team.

Screen Shot 2014-06-18 at 10.26.02 AM

Second, during Cassandra testing, in addition to getting basic metrics like disk I/O and memory, the Cassandra extension provides configurable metrics like:

  • Cache size, capacity, hit count, hit rate, request count

  • Total latency, statistics, timeout requests, unavailable requests

  • Bloom filter disk space used, false positives, false ratio

  • SSTables compression ratio, live tables, disk space, compacted row size

  • Row size histogram

  • Column count histogram

  • Memtable columns, data size, switch count

  • Pending tasks

  • Read latency

  • Write latency

  • Pending and completed tasks

  • Compaction tasks pending and completed

  • Timeouts

  • Dropped messages

  • Streams

  • Total disk space used

  • Thread pool tasks: active, completed, blocked, pending

By leveraging these metrics, Jerry’s team is able to get granular visibility into Cassandra performance and see exactly where performance bottlenecks occur. This visibility has cut down the time needed to test their Cassandra implementation drastically. Pinpointing exactly where the performance issues are and what caused them enable Jerry’s team to proactively address Cassandra performance issues before they affect end users.

Finally, while capacity planning, Jerry now leverages the Amazon Web Services (AWS) cloud connector extension which allows his team to easily scale up and scale down in the cloud automatically based on policies that can involve a number of rules including:

•       Overall application health (load, response time, number of slow calls, etc.)

•       Business transaction health (load, response time, number of slow calls, etc.)

•       End User Experience health (pages / iFrames / AJAX requests per minute, first byte time, DOM ready time, etc.)

•       Databases & Remote Services health (calls per minute, errors per minute, etc)

•       Error rates (exceptions, return codes, etc.)

This year, Jerry’s team is putting a few different health rules in place that will automatically scale up the AWS EC2 resources when certain load & response time metrics are breached and scale down when those metrics go back down to a normal level. Jerry has also added an authorization step to these workflows that will alert him and ask for permission before spinning instances up or down. That way, they only pay for the EC2 resources they need to use and Jerry still has full control.

Screen Shot 2014-06-12 at 3.49.26 PMScreen Shot 2014-06-12 at 3.49.51 PM

Screen Shot 2014-06-12 at 3.50.17 PM

Jerry leaves the testing meeting with full confidence that his team has a good grasp on the upcoming peak season and has the visibility in place that will allow his team to quickly deal with any performance issues as they arise.

_____

As you can see, Jerry is in a lot better spot this year than he was 1 year ago. By leveraging AppDynamics he has one platform that can easily connect to the rest of the technologies he already uses and provide him a single UI in which he can manage the performance of his environment.

If you’d like to try AppDynamics for free and test drive some of the extensions we’ve highlighted in this blog post, click here.

The post Black Friday Horror Story Averted with Alerting and Monitoring written by appeared first on Application Performance Monitoring Blog from AppDynamics.

Read the original blog entry...

More Stories By AppDynamics Blog

In high-production environments where release cycles are measured in hours or minutes — not days or weeks — there's little room for mistakes and no room for confusion. Everyone has to understand what's happening, in real time, and have the means to do whatever is necessary to keep applications up and running optimally.

DevOps is a high-stakes world, but done well, it delivers the agility and performance to significantly impact business competitiveness.

@CloudExpo Stories
SAP is delivering break-through innovation combined with fantastic user experience powered by the market-leading in-memory technology, SAP HANA. In his General Session at 15th Cloud Expo, Thorsten Leiduck, VP ISVs & Digital Commerce, SAP, discussed how SAP and partners provide cloud and hybrid cloud solutions as well as real-time Big Data offerings that help companies of all sizes and industries run better. SAP launched an application challenge to award the most innovative SAP HANA and SAP HANA...
P2P RTC will impact the landscape of communications, shifting from traditional telephony style communications models to OTT (Over-The-Top) cloud assisted & PaaS (Platform as a Service) communication services. The P2P shift will impact many areas of our lives, from mobile communication, human interactive web services, RTC and telephony infrastructure, user federation, security and privacy implications, business costs, and scalability. In his session at @ThingsExpo, Robin Raymond, Chief Architect...
The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long development cycles that produce software that is obsolete at launch. DevOps may be disruptive, but it is essential. The DevOps Summit at Cloud Expo – to be held June 3-5, 2015, at the Javits Center in New York City – will expand the DevOps community, enable a wide...
Enterprises are fast realizing the importance of integrating SaaS/Cloud applications, API and on-premises data and processes, to unleash hidden value. This webinar explores how managers can use a Microservice-centric approach to aggressively tackle the unexpected new integration challenges posed by proliferation of cloud, mobile, social and big data projects. Industry analyst and SOA expert Jason Bloomberg will strip away the hype from microservices, and clearly identify their advantages and d...
With worldwide spending on cloud services and infrastructure growing by 23% in 2015 to $118B, it is clear that cloud services are here to stay. Yet, the rate of cloud adoption varies by companies and markets around the world. With thousands of outages and hijacks across the Internet every day, one reason for hesitation is the faith in quality Internet performance. In his session at 16th Cloud Expo, Michael Kane, Senior Manager at Dyn, will explore how Internet performance affects your end-user...
With SaaS use rampant across organizations, how can IT departments track company data and maintain security? More and more departments are commissioning their own solutions and bypassing IT. A cloud environment is amorphous and powerful, allowing you to set up solutions for all of your user needs: document sharing and collaboration, mobile access, e-mail, even industry-specific applications. In his session at 16th Cloud Expo, Shawn Mills, President and a founder of Green House Data, will discus...
Cloud Expo, Inc. has announced today that Andi Mann returns to DevOps Summit 2015 as Conference Chair. The 4th International DevOps Summit will take place on June 9-11, 2015, at the Javits Center in New York City. "DevOps is set to be one of the most profound disruptions to hit IT in decades," said Andi Mann. "It is a natural extension of cloud computing, and I have seen both firsthand and in independent research the fantastic results DevOps delivers. So I am excited to help the great team at ...
Organizations today are confounded by an avalanche of data that needs to be processed and managed on a daily basis. Through relevant use cases and a thought-provoking dialogue on an organization’s ‘Data to Decisions’ journey, Andrew Clyne, Chief Data Officer at CenturyLink Cognilytics, will reveal in his session at Big Data Expo how your organization can monetize data as a strategic asset. State-of-the-art Big Data and Advanced Analytics capabilities provided as a managed service can enable da...
Explosive growth in connected devices. Enormous amounts of data for collection and analysis. Critical use of data for split-second decision making and actionable information. All three are factors in making the Internet of Things a reality. Yet, any one factor would have an IT organization pondering its infrastructure strategy. How should your organization enhance its IT framework to enable an Internet of Things implementation? In his session at Internet of @ThingsExpo, James Kirkland, Chief Ar...
Move from reactive to proactive cloud management in a heterogeneous cloud infrastructure. In his session at 16th Cloud Expo, Manoj Khabe, Innovative Solution-Focused Transformation Leader at Vicom Computer Services, Inc., will show how to replace a help desk-centric approach with an ITIL-based service model and service-centric CMDB that’s tightly integrated with an event and incident management platform. Learn how to expand the scope of operations management to service management. He will al...
For IoT to grow as quickly as analyst firms’ project, a lot is going to fall on developers to quickly bring applications to market. But the lack of a standard development platform threatens to slow growth and make application development more time consuming and costly, much like we’ve seen in the mobile space. In his session at @ThingsExpo, Mike Weiner is Product Manager of the Omega DevCloud with KORE Telematics Inc., will discuss the evolving requirements for developers as IoT matures and co...
There is no question that the cloud is where businesses want to host data. Until recently hypervisor virtualization was the most widely used method in cloud computing. Recently virtual containers have been gaining in popularity, and for good reason. In the debate between virtual machines and containers, the latter have been seen as the new kid on the block – and like other emerging technology have had some initial shortcomings. However, the container space has evolved drastically since coming on...
Mobile commerce traffic is surpassing desktop, yet less than 20% of sales in the U.S. are mobile commerce sales. In his session at 15th Cloud Expo, Dan Franklin, Segment Manager, Commerce, at Verizon Digital Media Services, defined mobile devices and discussed how next generation means simplification. It means taking your digital content and turning it into instantly gratifying experiences.
What do a firewall and a fortress have in common? They are no longer strong enough to protect the valuables housed inside. Like the walls of an old fortress, the cracks in the firewall are allowing the bad guys to slip in - unannounced and unnoticed. By the time these thieves get in, the damage is already done and the network is already compromised. Intellectual property is easily slipped out the back door leaving no trace of forced entry. If we want to reign in on these cybercriminals, it's hig...
SYS-CON Events announced today that DragonGlass, an enterprise search platform, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. After eleven years of designing and building custom applications, OpenCrowd has launched DragonGlass, a cloud-based platform that enables the development of search-based applications. These are a new breed of applications that utilize a search index as their backbone for data...
Converging digital disruptions is creating a major sea change - Cisco calls this the Internet of Everything (IoE). IoE is the network connection of People, Process, Data and Things, fueled by Cloud, Mobile, Social, Analytics and Security, and it represents a $19Trillion value-at-stake over the next 10 years. In her keynote at @ThingsExpo, Manjula Talreja, VP of Cisco Consulting Services, will discuss IoE and the enormous opportunities it provides to public and private firms alike. She will shar...
Container frameworks, such as Docker, provide a variety of benefits, including density of deployment across infrastructure, convenience for application developers to push updates with low operational hand-holding, and a fairly well-defined deployment workflow that can be orchestrated. Container frameworks also enable a DevOps approach to application development by cleanly separating concerns between operations and development teams. But running multi-container, multi-server apps with containers ...
Software Development Solution category in The 2015 American Business Awards, and will ultimately be a Gold, Silver, or Bronze Stevie® Award winner in the program. More than 3,300 nominations from organizations of all sizes and in virtually every industry were submitted this year for consideration. "We are honored to be recognized as a leader in the software development industry by the Stevie Awards judges," said Steve Brodie, CEO of Electric Cloud. "We introduced ElectricFlow and our Deploy app...
In their general session at 16th Cloud Expo, Michael Piccininni, Global Account Manager – Cloud SP at EMC Corporation, and Mike Dietze, Regional Director at Windstream Hosted Solutions, will review next generation cloud services, including the Windstream-EMC Tier Storage solutions, and discuss how to increase efficiencies, improve service delivery and enhance corporate cloud solution development. Speaker Bios Michael Piccininni is Global Account Manager – Cloud SP at EMC Corporation. He has b...
The time is ripe for high speed resilient software defined storage solutions with unlimited scalability. ISS has been working with the leading open source projects and developed a commercial high performance solution that is able to grow forever without performance limitations. In his session at DevOps Summit, Alex Gorbachev, President of Intelligent Systems Services Inc., will share foundation principles of Ceph architecture, as well as the design to deliver this storage to traditional SAN st...