Welcome!

Machine Learning Authors: Yeshim Deniz, Zakia Bouachraoui, Liz McMillan, Pat Romanski, Elizabeth White

Related Topics: Java IoT, Industrial IoT, Microservices Expo, Microsoft Cloud, Machine Learning , Recurring Revenue

Java IoT: Article

Mainstream Business Applications and In-Memory Databases

Databases serving business applications are heading towards memory-centric design and implementation

(Please refer to the following article: Oracle 12c In-Memory Database is Out - Hardly Anybody Notices for update on Oracle 12c databases)

Contemporary large servers are routinely configured with 2TB of RAM. It is thus possible to fit an entire average size OLTP database in memory directly accessible by CPU. There is a long history of academic research on how to best utilize relatively abundant computer memory. This research is becoming increasingly relevant as databases serving business applications are heading towards memory centric design and implementation.

If you simply place Oracle RDBMS's files on Solid State Disk, or configure buffer cache (SGA) large enough to contain the whole database, Oracle will not magically become an IMDB database, nor it will perform much faster.  In order to properly utilize memory, IMDB databases require purposely architected, configured, balanced and optimized hardware (CPU, RAM, Flash, busses, cluster interconnect), as well as RDBMS software written with RAM as center point in mind. IMDB's main premise is that data primarily resides in RAM and is only persisted to disk for protection and auxiliary functions.  Data structures, methods and processes typically associated with an RDBMS (index types, join methods, data layout, internal data flows and processing) need to be (re)designed with RAM-centric axiom in mind.

There are a few fully functional IMDB products available on the market today. SAP Hana is one of the most recent additions to IMDB class of products.

SAP Hana is a hybrid row/columnar data store.  It is designed under the assumption that the most of operations ( either OLTP or OLAP ) are reading ( as opposed to writing ) the data. Hana aspires to serve equally well both OLTP and OLAP workloads since it considers them similar in terms of workload characteristics.

Hana's row oriented tables are always in memory, while column tables are loaded to memory on demand. Column store data is dictionary encoded and compressed and thus more expensive (in terms of system resources) for inserts and updates than row store. In order to handle relatively demanding writes to column store, Hana's database memory is logically split into main store and delta store.  Main store is optimized for reads and efficient memory consumption. Writes to columnar data are handled via delta storage which has basic compression level and is optimized for write. Cache Sensitive B+ tree (CSB+) used for faster search on delta. Delta and main store are merged automatically or manually. The need for delta store effectively halves size of database that Hana can handle since delta store must equal size of the main store.

While SAP Hana column store performance is in some cases is spectacular (aggregations over small number of columns), reconstruction of large rows is fairly slow because of the need to reconstruct rows.

It is possible to build clusters with dozens of nodes (SAP recently tested 100 node, 100 TB RAM cluster), where data is distributed across nodes.

In Hana's clustered, distributed database, all data is not directly accessible by a single CPU (only local RAM is directly addressable) which has negative  performance implications.

There are other memory centric databases available on the market today - IBM Solid DB, Oracle Times Ten, Volt DB. We expect that major vendors like Oracle further modify their mainstream RDBMS to adjust to increased role of RAM and other types of memory in modern hardware.

More Stories By Ranko Mosic

Ranko Mosic, BScEng, is specializing in Big Data/Data Architecture consulting services ( database/data architecture, machine learning ). His clients are in finance, retail, telecommunications industries. Ranko is welcoming inquiries about his availability for consulting engagements and can be reached at 408-757-0053 or [email protected]

CloudEXPO Stories
Your job is mostly boring. Many of the IT operations tasks you perform on a day-to-day basis are repetitive and dull. Utilizing automation can improve your work life, automating away the drudgery and embracing the passion for technology that got you started in the first place. In this presentation, I'll talk about what automation is, and how to approach implementing it in the context of IT Operations. Ned will discuss keys to success in the long term and include practical real-world examples. Get started on automating your way to a brighter future!
The challenges of aggregating data from consumer-oriented devices, such as wearable technologies and smart thermostats, are fairly well-understood. However, there are a new set of challenges for IoT devices that generate megabytes or gigabytes of data per second. Certainly, the infrastructure will have to change, as those volumes of data will likely overwhelm the available bandwidth for aggregating the data into a central repository. Ochandarena discusses a whole new way to think about your next-gen applications and how to address the challenges of building applications that harness all data types and sources.
Dynatrace is an application performance management software company with products for the information technology departments and digital business owners of medium and large businesses. Building the Future of Monitoring with Artificial Intelligence. Today we can collect lots and lots of performance data. We build beautiful dashboards and even have fancy query languages to access and transform the data. Still performance data is a secret language only a couple of people understand. The more business becomes digital the more stakeholders are interested in this data including how it relates to business. Some of these people have never used a monitoring tool before. They have a question on their mind like "How is my application doing" but no idea how to get a proper answer.
DXWorldEXPO LLC announced today that Big Data Federation to Exhibit at the 22nd International CloudEXPO, colocated with DevOpsSUMMIT and DXWorldEXPO, November 12-13, 2018 in New York City. Big Data Federation, Inc. develops and applies artificial intelligence to predict financial and economic events that matter. The company uncovers patterns and precise drivers of performance and outcomes with the aid of machine-learning algorithms, big data, and fundamental analysis. Their products are deployed by some of the world's largest financial institutions. The company develops and applies innovative machine-learning technologies to big data to predict financial, economic, and world events. The team is a group of passionate technologists, mathematicians, data scientists and programmers in Silicon Valley with over 100 patents to their names. Big Data Federation was incorporated in 2015 and is ...
All in Mobile is a place where we continually maximize their impact by fostering understanding, empathy, insights, creativity and joy. They believe that a truly useful and desirable mobile app doesn't need the brightest idea or the most advanced technology. A great product begins with understanding people. It's easy to think that customers will love your app, but can you justify it? They make sure your final app is something that users truly want and need. The only way to do this is by researching target group and involving users in the designing process.