|By Michael Kopp||
|October 27, 2012 10:00 AM EDT||
Traditional Enterprise Database vendors often bring up the lack of professional monitoring and management tool support for NoSQL solutions. Their argument is that enterprise applications require sophisticated tuning and monitoring of the database in order to ensure a performant and smooth operation. NoSQL Vendors, while arguing that this lack is not enough to favor RDBMS over their respective solutions, do agree. Several vendors try to differentiate themselves by providing enterprise level monitoring and management software, for example, Cassandra, MongoDB, HBase or others. Both are of course correct that monitoring and management of especially the performance aspect is important, but at the same time they are making the same mistake that RDBMS vendors have made for the last decade: they ignore the application.
Application Performance Management for Databases
What matters in the end is not the database performance itself, but the performance of the application that uses the database. We have explained the different problem patterns numerous times in this blog (...) so I won't go into them. All of them however have one thing in common: the application logic drives how the database is used and there is only so much that you can tune on the database to cover for mistakes on the application side. So we need to monitor and optimize that usage pattern itself. Application Logic again is driven by input data or in most cases end user interaction, thus we need to understand how end user behavior and end user actions drive the database usage. On the other hand we need to understand the impact of the database on these actions. What's important to understand is that the database can work and perform to the highest standards and still be the main bottleneck as far as the application is concerned - if it is wrongly used or has a bad access pattern. RDBMS and NoSQL Databases have that in common. Therefore the fundamental way I, as a performance engineer, do application performance analysis and management does not change:
First we need to understand if a particular business transaction slows down, has a general performance problem and if this has an impact on the end user:
Do any of the business transactions violate a baseline or have negative end user experience
Next we would isolate the high level cause of the slow down or performance issue. There are many ways of doing this, but it's always some kind of fault domain isolation:
Transaction Flow shows that we spend ~15 % waiting for the Database
This Transaction Flow shows that the Business Backend is calling a Cassandra Database Cluster
This tells us if we spend time waiting on the database. We see that there is not much difference between a regular database and something like an Apache Cassandra!
What's important here is that if the database shows up as the main contributor, this does not mean that the database itself is at fault, it might just be the applications usage of it. Thus I would need to check the usage and access pattern next:
This shows the select statements executed within a particular transaction type
This shows all Cassandra database statements against all participating Cassandra servers executed in a particular transaction
Here I might see that the reason for bad performance is that we execute too many statements per transaction, or that we read too much data. If that is the case I would need to check the application logic itself and potentially need a developer to fix it. The developer would of course want to understand where, why and in which particular transaction the statements are executed.
This shows a single Transaction (PurePath) and the Cassandra Statements executed within it
If however a particular statement is slow we, might very well have a database issue and I would talk to the DBA. The only difference in case of a NoSQL solution in this process is that you often have a database cluster, so I would want to understand if the problem is isolated to a particular node or not. And the DBA will want to understand if my access pattern leads to a good distribution across that cluster or if I am hammering away at a subset of them.
This hotspot view shows that Cassandra Server Node3 consumes much more Wait and I/O time than the others
The nice thing is that all in all my analysis does not differ between JDBC, ADO, Cassandra or one of the many NoSQL solutions.
APM Solution Support
There is of course a caveat here; it requires some level of support of the APM solution of choice. Sometimes it might be enough to see API calls on the NoSQL client in my response time breakdown. More often than not of course a little bit more context is desired, like which ColumnFamily is accessed and how many rows are read or which Database Node in the Cluster is serving a read. For this and the afore mentioned reasons I argue that APM solution support of your chosen Database or NoSQL solution is as important as the monitoring of the database itself.
I have spent considerable time tuning SQL statements and indexes, but in the end the best optimizations have always been those on the application and how the application uses the database. SQL Tuning almost always adds complexity and often is a workaround over bad application or data structure design. In the NoSQL world "SQL statement" tuning for the most part is a task of the past, but Data Structure Design has retained its importance! At the same time logic that traditionally resided in the database is now in the application layer, making application design even more important than before. So while some things have shifted, from an Application Performance Engineering Perspective I have to say: nothing really changed, it's still about the application. Now more than ever!
The industrial software market has treated data with the mentality of “collect everything now, worry about how to use it later.” We now find ourselves buried in data, with the pervasive connectivity of the (Industrial) Internet of Things only piling on more numbers. There’s too much data and not enough information. In his session at @ThingsExpo, Bob Gates, Global Marketing Director, GE’s Intelligent Platforms business, to discuss how realizing the power of IoT, software developers are now focu...
Mar. 1, 2015 03:15 PM EST Reads: 1,293
Operational Hadoop and the Lambda Architecture for Streaming Data Apache Hadoop is emerging as a distributed platform for handling large and fast incoming streams of data. Predictive maintenance, supply chain optimization, and Internet-of-Things analysis are examples where Hadoop provides the scalable storage, processing, and analytics platform to gain meaningful insights from granular data that is typically only valuable from a large-scale, aggregate view. One architecture useful for capturing...
Mar. 1, 2015 02:00 PM EST Reads: 1,272
SYS-CON Events announced today that Vitria Technology, Inc. will exhibit at SYS-CON’s @ThingsExpo, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. Vitria will showcase the company’s new IoT Analytics Platform through live demonstrations at booth #330. Vitria’s IoT Analytics Platform, fully integrated and powered by an operational intelligence engine, enables customers to rapidly build and operationalize advanced analytics to deliver timely business outcomes ...
Mar. 1, 2015 01:45 PM EST Reads: 1,176
DevOps is about increasing efficiency, but nothing is more inefficient than building the same application twice. However, this is a routine occurrence with enterprise applications that need both a rich desktop web interface and strong mobile support. With recent technological advances from Isomorphic Software and others, it is now feasible to create a rich desktop and tuned mobile experience with a single codebase, without compromising performance or usability.
Mar. 1, 2015 01:15 PM EST Reads: 1,066
SYS-CON Events announced today Arista Networks will exhibit at SYS-CON's DevOps Summit 2015 New York, which will take place June 9-11, 2015, at the Javits Center in New York City, NY. Arista Networks was founded to deliver software-driven cloud networking solutions for large data center and computing environments. Arista’s award-winning 10/40/100GbE switches redefine scalability, robustness, and price-performance, with over 3,000 customers and more than three million cloud networking ports depl...
Mar. 1, 2015 01:00 PM EST Reads: 1,536
The speed of software changes in growing and large scale rapid-paced DevOps environments presents a challenge for continuous testing. Many organizations struggle to get this right. Practices that work for small scale continuous testing may not be sufficient as the requirements grow. In his session at DevOps Summit, Marc Hornbeek, Sr. Solutions Architect of DevOps continuous test solutions at Spirent Communications, will explain the best practices of continuous testing at high scale, which is r...
Mar. 1, 2015 01:00 PM EST Reads: 1,144
Security can create serious friction for DevOps processes. We've come up with an approach to alleviate the friction and provide security value to DevOps teams. In her session at DevOps Summit, Shannon Lietz, Senior Manager of DevSecOps at Intuit, will discuss how DevSecOps got started and how it has evolved. Shannon Lietz has over two decades of experience pursuing next generation security solutions. She is currently the DevSecOps Leader for Intuit where she is responsible for setting and driv...
Mar. 1, 2015 12:00 PM EST Reads: 2,347
The explosion of connected devices / sensors is creating an ever-expanding set of new and valuable data. In parallel the emerging capability of Big Data technologies to store, access, analyze, and react to this data is producing changes in business models under the umbrella of the Internet of Things (IoT). In particular within the Insurance industry, IoT appears positioned to enable deep changes by altering relationships between insurers, distributors, and the insured. In his session at @Things...
Mar. 1, 2015 12:00 PM EST Reads: 1,217
SYS-CON Events announced today that Open Data Centers (ODC), a carrier-neutral colocation provider, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place June 9-11, 2015, at the Javits Center in New York City, NY. Open Data Centers is a carrier-neutral data center operator in New Jersey and New York City offering alternative connectivity options for carriers, service providers and enterprise customers.
Mar. 1, 2015 12:00 PM EST Reads: 1,879
Thanks to Docker, it becomes very easy to leverage containers to build, ship, and run any Linux application on any kind of infrastructure. Docker is particularly helpful for microservice architectures because their successful implementation relies on a fast, efficient deployment mechanism – which is precisely one of the features of Docker. Microservice architectures are therefore becoming more popular, and are increasingly seen as an interesting option even for smaller projects, instead of bein...
Mar. 1, 2015 12:00 PM EST Reads: 2,534
Even as cloud and managed services grow increasingly central to business strategy and performance, challenges remain. The biggest sticking point for companies seeking to capitalize on the cloud is data security. Keeping data safe is an issue in any computing environment, and it has been a focus since the earliest days of the cloud revolution. Understandably so: a lot can go wrong when you allow valuable information to live outside the firewall. Recent revelations about government snooping, along...
Mar. 1, 2015 11:00 AM EST Reads: 6,938
In his session at DevOps Summit, Tapabrata Pal, Director of Enterprise Architecture at Capital One, will tell a story about how Capital One has embraced Agile and DevOps Security practices across the Enterprise – driven by Enterprise Architecture; bringing in Development, Operations and Information Security organizations together. Capital Ones DevOpsSec practice is based upon three "pillars" – Shift-Left, Automate Everything, Dashboard Everything. Within about three years, from 100% waterfall, C...
Mar. 1, 2015 11:00 AM EST Reads: 2,695
PubNub on Monday has announced that it is partnering with IBM to bring its sophisticated real-time data streaming and messaging capabilities to Bluemix, IBM’s cloud development platform. “Today’s app and connected devices require an always-on connection, but building a secure, scalable solution from the ground up is time consuming, resource intensive, and error-prone,” said Todd Greene, CEO of PubNub. “PubNub enables web, mobile and IoT developers building apps on IBM Bluemix to quickly add sc...
Mar. 1, 2015 10:00 AM EST Reads: 4,710
Data-intensive companies that strive to gain insights from data using Big Data analytics tools can gain tremendous competitive advantage by deploying data-centric storage. Organizations generate large volumes of data, the vast majority of which is unstructured. As the volume and velocity of this unstructured data increases, the costs, risks and usability challenges associated with managing the unstructured data (regardless of file type, size or device) increases simultaneously, including end-to-...
Mar. 1, 2015 09:45 AM EST Reads: 2,163
The excitement around the possibilities enabled by Big Data is being tempered by the daunting task of feeding the analytics engines with high quality data on a continuous basis. As the once distinct fields of data integration and data management increasingly converge, cloud-based data solutions providers have emerged that can buffer your organization from the complexities of this continuous data cleansing and management so that you’re free to focus on the end goal: actionable insight.
Mar. 1, 2015 09:30 AM EST Reads: 1,649
With several hundred implementations of IoT-enabled solutions in the past 12 months alone, this session will focus on experience over the art of the possible. Many can only imagine the most advanced telematics platform ever deployed, supporting millions of customers, producing tens of thousands events or GBs per trip, and hundreds of TBs per month. With the ability to support a billion sensor events per second, over 30PB of warm data for analytics, and hundreds of PBs for an data analytics arc...
Mar. 1, 2015 09:00 AM EST Reads: 1,256
Between the compelling mockups and specs produced by your analysts and designers, and the resulting application built by your developers, there is a gulf where projects fail, costs spiral out of control, and applications fall short of requirements. In his session at DevOps Summit, Charles Kendrick, CTO and Chief Architect at Isomorphic Software, will present a new approach where business and development users collaborate – each using tools appropriate to their goals and expertise – to build mo...
Mar. 1, 2015 09:00 AM EST Reads: 2,896
The Internet of Things (IoT) is causing data centers to become radically decentralized and atomized within a new paradigm known as “fog computing.” To support IoT applications, such as connected cars and smart grids, data centers' core functions will be decentralized out to the network's edges and endpoints (aka “fogs”). As this trend takes hold, Big Data analytics platforms will focus on high-volume log analysis (aka “logs”) and rely heavily on cognitive-computing algorithms (aka “cogs”) to mak...
Mar. 1, 2015 09:00 AM EST Reads: 1,025
Since 2008 and for the first time in history, more than half of humans live in urban areas, urging cities to become “smart.” Today, cities can leverage the wide availability of smartphones combined with new technologies such as Beacons or NFC to connect their urban furniture and environment to create citizen-first services that improve transportation, way-finding and information delivery. In her session at @ThingsExpo, Laetitia Gazel-Anthoine, CEO of Connecthings, will focus on successful use c...
Mar. 1, 2015 04:00 AM EST Reads: 2,907
SYS-CON Events announced today that GENBAND, a leading developer of real time communications software solutions, has been named “Silver Sponsor” of SYS-CON's WebRTC Summit, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. The GENBAND team will be on hand to demonstrate their newest product, Kandy. Kandy is a communications Platform-as-a-Service (PaaS) that enables companies to seamlessly integrate more human communications into their Web and mobile applicatio...
Feb. 28, 2015 05:00 PM EST Reads: 1,398