By Rebecca Clinard
September 24, 2012 07:30 AM EDT
The ability to conduct effective performance testing has become a highly desired skillset within the IT industry. Unfortunately, these highly sought-after skills are consistently in short supply. "Front-end testers" can work with a tool to create a realistic load, and although this is an important skillset, creating the load is just the beginning of any performance project. It is the ability to understand the load patterns and tune the environment that makes the unique talents of a "performance engineer" worth their weight in gold.
Performance engineers require skills in data analysis (such as reading resource usage patterns), modeling, capacity planning, and tuning in order to detect, isolate, and alleviate saturation points within a deployment. Performance testing generates concurrency conditions and exposes resource competition at a server level. When the competition results in a resource (such as a thread pool) becoming over-utilized, this resource becomes a bottleneck or a saturation point. Performance engineers need to first understand the underlying architectures and develop a sense of where to look for potential scalability issues. Many of these "senses" or skills come from experience: working in many multi-tier environments and successfully tuning bottlenecks. Here are some tips to make the challenging but rewarding transition from a front-end tester to a performance engineer.
Wisdom, Determination, Patience, and Communication
Who said there isn't a whole lot of psychology in technology? Whether you are determining the current capacity of a deployment or recreating a production problem, it's often a very complex task: so many moving parts within the infrastructure, so many numbers to analyze from so many sources, data sets of raw test results to turn into understandable formats, so many people to keep in the loop, so much technical coordination... I could go on and on. It's your professional soft skills that will keep you on the right course. It requires determination to peel back the layers of the onion and investigate each tier of the deployment. It requires the wisdom to spot trends instead of pursuing the tangents of anomalies. It requires the dedication to keep an eye on many different metrics and isolate resource saturation. And it requires the patience to reproduce scenarios in order to draw conclusions based on evidence. And you need to accomplish all of this while being an excellent communicator!
Methodical Approach - The Constant
Spend your time wisely in the beginning and set up the most realistic test scenarios. Then "set" the performance scenario in stone. This means do not change even the most minute details in your test case: all transaction flows, all mixtures, all think times, all behaviors. No variations at this point. This is the "constant" in your experiment, and it is the only way you can reproduce and compare results. Any deviation within the test case scenario will result in different throughputs, which in turn affect resource patterns. Not following this tip will surely put you on a collision course with Analysis Paralysis!
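One way to hold the scenario constant is to record it as an immutable definition and stamp every result set with a fingerprint of that definition. The following is only a minimal sketch: the `Scenario` fields, the transaction names, and the numbers are hypothetical placeholders, not values from any particular load tool.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Scenario:
    # All field names and values here are illustrative assumptions.
    transaction_mix: tuple   # (transaction name, share of traffic) pairs
    think_time_s: float      # pause between user actions, in seconds
    ramp_users_per_s: float  # rate at which new users log in
    target_users: int        # active users at steady state


def fingerprint(scenario: Scenario) -> str:
    """Return a short, stable hash of every scenario parameter."""
    payload = json.dumps(asdict(scenario), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


mix = (("login", 0.2), ("search", 0.5), ("checkout", 0.3))
baseline = Scenario(mix, think_time_s=5.0, ramp_users_per_s=3.0, target_users=2000)
rerun = Scenario(mix, think_time_s=5.0, ramp_users_per_s=3.0, target_users=2000)
drifted = Scenario(mix, think_time_s=4.0, ramp_users_per_s=3.0, target_users=2000)

# Only runs that share a fingerprint should be compared with each other.
comparable = fingerprint(baseline) == fingerprint(rerun)
not_comparable = fingerprint(baseline) != fingerprint(drifted)
```

Storing the fingerprint next to each result set makes it obvious when two runs are not comparable because someone adjusted a think time or a transaction mix between them.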
Architectural Diagram - Identify Potential Bottlenecks by Visualization
Make sure you ask for and receive an architectural diagram of the entire deployment. Map out business transactions to the resources they use within the environment. Make sure you understand all the transaction flows, from the front-end load balancers down to the shared database. Study the deployment and hook up precise monitors, leaving no blind spots. Visualize where contention or bottlenecks COULD occur. Each resource in the environment must be monitored for signs of saturation. In reality, identifying where to look for bottlenecks is the more difficult task. Alleviating these bottlenecks is the easy (and most rewarding) part. But without an architectural map, your journey will easily end in the frustration of getting lost in the dark.
Tuning Hardware and Software Level Bottlenecks
"Tuning is an Art". "Tuning is a Science". Which is it? Hardware servers are restricted by their physical resources (disk I/O, memory, CPU). Software servers are much more configurable, and this is where expertise is needed for tuning. Performance engineers must understand the workings of a "server" in terms of thread pools, caching policies, memory allocations, connection pooling, etc. Tuning is a balancing act: you tune the software servers to take full advantage of the hardware resources without causing a flood. Simply opening up all the gates isn't going to help when the backend is saturated with requests. Tuning must be conservative, weighing all the benefits as well as the consequences.
Proof: Reproducible Results
Typically, a seasoned performance engineer will tune a layer of the environment only when the results are reproducible. Always use trends instead of points in time; mere spikes are not cause for architectural changes. As a rule of thumb, you should reproduce the problem three times before you make a change. Sometimes this takes a while, so be prepared to be patient. For example, if you are emulating a production login rate of 3 users per second, but the performance deterioration doesn't occur until you have 2000 active users, it will take a while to see it. Making an unnecessary change simply muddies the waters. Keep them clear and recreate those exact conditions.
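To put a number on "a while", the login example can be worked through as simple arithmetic. This sketch assumes a constant login rate and that no user logs off during the ramp, which is a simplification; the function name is just illustrative.

```python
def seconds_to_reach(active_users_target: int, logins_per_second: float) -> float:
    """Ramp time, assuming a constant login rate and no logoffs during the ramp."""
    if logins_per_second <= 0:
        raise ValueError("login rate must be positive")
    return active_users_target / logins_per_second


ramp_s = seconds_to_reach(2000, 3)  # about 666.7 seconds
three_runs_s = 3 * ramp_s           # ramp time alone for three reproductions

print(f"one ramp:   {ramp_s / 60:.1f} minutes")        # one ramp:   11.1 minutes
print(f"three runs: {three_runs_s / 60:.1f} minutes")  # three runs: 33.3 minutes
```

So each run needs roughly 11 minutes of ramp before the deterioration can even appear, and three reproductions cost over half an hour of ramp time before any steady-state observation is counted.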
Tune the First Occurring Bottleneck
Make sure you tune the layer that showed contention earliest in the performance test, not the first bottleneck you happened to identify. When monitoring a large, complex system, there are many counters to keep in your sights. Don't jump the gun and tune a thread pool the moment you see it become saturated; this could actually be a symptom of the problem, not the root cause. Correlate (graphing is easiest) the point in time at which performance degraded with the first saturation within the environment. Understandably, there is a ton of information to look at. Keep it simpler by looking only at free resources as percentages (free threads, free cache, and free file descriptors); this will allow you to spot a bottleneck more quickly. When a free resource runs low, there's a possible bottleneck. Understanding resource utilization and free resources will allow you to recognize a bottleneck before it affects end-user response time. In other words, watch as the resource becomes utilized. When the free percentage gets low, keep that resource on your radar as a possible cause of performance degradation.
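The correlation described above can be sketched in a few lines: for each monitored resource, find the first time its free percentage stays low, then order the resources by that time. The 10% threshold, the three-sample "sustained" rule, and the metric names and readings are all assumptions chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

Sample = Tuple[float, float]  # (seconds since test start, percent free)


def first_time_below(samples: List[Sample], threshold_pct: float,
                     sustained: int) -> Optional[float]:
    """Start time of the first run of `sustained` consecutive low samples."""
    run_start: Optional[float] = None
    run_len = 0
    for t, free_pct in samples:
        if free_pct < threshold_pct:
            if run_len == 0:
                run_start = t
            run_len += 1
            if run_len >= sustained:
                return run_start
        else:
            run_len = 0  # a recovery means it was a spike, not a trend
    return None


def saturation_order(metrics: Dict[str, List[Sample]],
                     threshold_pct: float = 10.0,
                     sustained: int = 3) -> List[Tuple[float, str]]:
    """Resources that saturated, earliest first."""
    hits = []
    for name, samples in metrics.items():
        t = first_time_below(samples, threshold_pct, sustained)
        if t is not None:
            hits.append((t, name))
    return sorted(hits)


# Hypothetical readings from one test run.
metrics = {
    "app_threads_free_pct": [(60, 80), (120, 40), (180, 8), (240, 5), (300, 2)],
    "db_connections_free_pct": [(60, 50), (120, 9), (180, 4), (240, 1), (300, 0)],
    "file_descriptors_free_pct": [(60, 90), (120, 5), (180, 85), (240, 80), (300, 82)],
}

order = saturation_order(metrics)
# order[0] is the layer to investigate first: the database connections ran low
# at t=120, a minute before the application thread pool did at t=180.
```

In this made-up data, the thread pool looks saturated, but the database connection pool ran low first, so the thread pool is more likely a symptom. The single dip in free file descriptors is ignored because it did not persist.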
Iterative Tuning Process
Tuning is an iterative process. Know that once you have alleviated one bottleneck, you will surely encounter another one. But do not fret: all aspects of servers are limited, and since nothing is infinite, you will eventually reach the end. Tuning manipulates the gates; requests that cannot obtain a resource are queued and must wait to be serviced. Tuning becomes a process you must repeat until the workload reaches target capacity with acceptable response times.
Validate, validate, validate. Just as important as recreating and tuning based upon proof is validating that the tuning change had the desired effect. Did it indeed impact scalability in a positive way? Often, performance engineers test out theories, and sometimes the validation stage will cause a change to be reverted. It's OK that not every change will make it to production. The key is to use a very scientific approach in which you prove the result as well as the requirement.
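A validation step can be as simple as comparing repeated runs before and after the change under the same fixed scenario. The sketch below is one possible shape for that check; the 5% minimum gain and the throughput figures are assumptions, and a real project would pick its own acceptance criteria.

```python
from statistics import median
from typing import List, Tuple


def validate_change(baseline_runs: List[float], tuned_runs: List[float],
                    min_gain_pct: float = 5.0) -> Tuple[str, float]:
    """Compare median throughput of repeated runs before and after a change.

    Returns ("keep" or "revert", percentage gain rounded to one decimal).
    The 5% default is an arbitrary illustrative threshold.
    """
    base = median(baseline_runs)
    tuned = median(tuned_runs)
    gain_pct = (tuned - base) / base * 100.0
    decision = "keep" if gain_pct >= min_gain_pct else "revert"
    return decision, round(gain_pct, 1)


# Hypothetical throughput (requests per second) from three runs each.
baseline = [410.0, 400.0, 405.0]

print(validate_change(baseline, [450.0, 446.0, 455.0]))  # ('keep', 11.1)
print(validate_change(baseline, [404.0, 409.0, 401.0]))  # ('revert', -0.2)
```

Using the median of several runs rather than a single measurement follows the same principle as the reproducibility tip: a decision should rest on a trend, not on one point in time.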
I hope you gleaned some pearls of wisdom.
Creating the load and emulating production workload is a means to an end: you obviously need to create the load before you can plan capacity or understand the scalability of the deployment. But it is the skills in performance analysis that are most valuable. The performance engineer who walks into a project, takes the lead, wastes no time in learning the environment, creates and/or executes realistic tests, identifies current capacity, isolates and alleviates bottlenecks, documents results, mentors the juniors, and communicates clearly and effectively with everyone from developers on up to the CIO/CTO is truly a GOLD MINE.
Becoming a true performance engineer is no easy task, but it's well worth the effort!