Welcome!

Machine Learning Authors: Mehdi Daoudi, Mark Ross-Smith, Jason Bloomberg, Carmen Gonzalez, Jeffrey Abbott

Related Topics: Java IoT, Industrial IoT, Microservices Expo, Microsoft Cloud, Machine Learning , Recurring Revenue

Java IoT: Article

Mainstream Business Applications and In-Memory Databases

Databases serving business applications are heading towards memory-centric design and implementation

(Please refer to the following article: Oracle 12c In-Memory Database is Out - Hardly Anybody Notices for update on Oracle 12c databases)

Contemporary large servers are routinely configured with 2TB of RAM. It is thus possible to fit an entire average size OLTP database in memory directly accessible by CPU. There is a long history of academic research on how to best utilize relatively abundant computer memory. This research is becoming increasingly relevant as databases serving business applications are heading towards memory centric design and implementation.

If you simply place Oracle RDBMS's files on Solid State Disk, or configure buffer cache (SGA) large enough to contain the whole database, Oracle will not magically become an IMDB database, nor it will perform much faster.  In order to properly utilize memory, IMDB databases require purposely architected, configured, balanced and optimized hardware (CPU, RAM, Flash, busses, cluster interconnect), as well as RDBMS software written with RAM as center point in mind. IMDB's main premise is that data primarily resides in RAM and is only persisted to disk for protection and auxiliary functions.  Data structures, methods and processes typically associated with an RDBMS (index types, join methods, data layout, internal data flows and processing) need to be (re)designed with RAM-centric axiom in mind.

There are a few fully functional IMDB products available on the market today. SAP Hana is one of the most recent additions to IMDB class of products.

SAP Hana is a hybrid row/columnar data store.  It is designed under the assumption that the most of operations ( either OLTP or OLAP ) are reading ( as opposed to writing ) the data. Hana aspires to serve equally well both OLTP and OLAP workloads since it considers them similar in terms of workload characteristics.

Hana's row oriented tables are always in memory, while column tables are loaded to memory on demand. Column store data is dictionary encoded and compressed and thus more expensive (in terms of system resources) for inserts and updates than row store. In order to handle relatively demanding writes to column store, Hana's database memory is logically split into main store and delta store.  Main store is optimized for reads and efficient memory consumption. Writes to columnar data are handled via delta storage which has basic compression level and is optimized for write. Cache Sensitive B+ tree (CSB+) used for faster search on delta. Delta and main store are merged automatically or manually. The need for delta store effectively halves size of database that Hana can handle since delta store must equal size of the main store.

While SAP Hana column store performance is in some cases is spectacular (aggregations over small number of columns), reconstruction of large rows is fairly slow because of the need to reconstruct rows.

It is possible to build clusters with dozens of nodes (SAP recently tested 100 node, 100 TB RAM cluster), where data is distributed across nodes.

In Hana's clustered, distributed database, all data is not directly accessible by a single CPU (only local RAM is directly addressable) which has negative  performance implications.

There are other memory centric databases available on the market today - IBM Solid DB, Oracle Times Ten, Volt DB. We expect that major vendors like Oracle further modify their mainstream RDBMS to adjust to increased role of RAM and other types of memory in modern hardware.

More Stories By Ranko Mosic

Ranko Mosic, BScEng, is specializing in Big Data/Data Architecture consulting services ( database/data architecture, machine learning ). His clients are in finance, retail, telecommunications industries. Ranko is welcoming inquiries about his availability for consulting engagements and can be reached at 408-757-0053 or [email protected]

@CloudExpo Stories
More and more brands have jumped on the IoT bandwagon. We have an excess of wearables – activity trackers, smartwatches, smart glasses and sneakers, and more that track seemingly endless datapoints. However, most consumers have no idea what “IoT” means. Creating more wearables that track data shouldn't be the aim of brands; delivering meaningful, tangible relevance to their users should be. We're in a period in which the IoT pendulum is still swinging. Initially, it swung toward "smart for smart...
The WebRTC Summit New York, to be held June 6-8, 2017, at the Javits Center in New York City, NY, announces that its Call for Papers is now open. Topics include all aspects of improving IT delivery by eliminating waste through automated business models leveraging cloud technologies. WebRTC Summit is co-located with 20th International Cloud Expo and @ThingsExpo. WebRTC is the future of browser-to-browser communications, and continues to make inroads into the traditional, difficult, plug-in web co...
"A lot of times people will come to us and have a very diverse set of requirements or very customized need and we'll help them to implement it in a fashion that you can't just buy off of the shelf," explained Nick Rose, CTO of Enzu, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
Buzzword alert: Microservices and IoT at a DevOps conference? What could possibly go wrong? In this Power Panel at DevOps Summit, moderated by Jason Bloomberg, the leading expert on architecting agility for the enterprise and president of Intellyx, panelists peeled away the buzz and discuss the important architectural principles behind implementing IoT solutions for the enterprise. As remote IoT devices and sensors become increasingly intelligent, they become part of our distributed cloud enviro...
SYS-CON Events announced today that MobiDev, a client-oriented software development company, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place June 6-8, 2017, at the Javits Center in New York City, NY, and the 21st International Cloud Expo®, which will take place October 31-November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. MobiDev is a software company that develops and delivers turn-key mobile apps, websites, web services, and complex softw...
Cloud Expo, Inc. has announced today that Andi Mann returns to 'DevOps at Cloud Expo 2017' as Conference Chair The @DevOpsSummit at Cloud Expo will take place on June 6-8, 2017, at the Javits Center in New York City, NY. "DevOps is set to be one of the most profound disruptions to hit IT in decades," said Andi Mann. "It is a natural extension of cloud computing, and I have seen both firsthand and in independent research the fantastic results DevOps delivers. So I am excited to help the great t...
With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo 2016 in New York. Learn what is going on, contribute to the discussions, and ensure that your enterprise is as "IoT-Ready" as it can be! Internet of @ThingsExpo, taking place June 6-8, 2017, at the Javits Center in New York City, New York, is co-located with 20th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry p...
Creating replica copies to tolerate a certain number of failures is easy, but very expensive at cloud-scale. Conventional RAID has lower overhead, but it is limited in the number of failures it can tolerate. And the management is like herding cats (overseeing capacity, rebuilds, migrations, and degraded performance). In his general session at 18th Cloud Expo, Scott Cleland, Senior Director of Product Marketing for the HGST Cloud Infrastructure Business Unit, discussed how a new approach is neces...
The buzz continues for cloud, data analytics and the Internet of Things (IoT) and their collective impact across all industries. But a new conversation is emerging - how do companies use industry disruption and technology enablers to lead in markets undergoing change, uncertainty and ambiguity? Organizations of all sizes need to evolve and transform, often under massive pressure, as industry lines blur and merge and traditional business models are assaulted and turned upside down. In this new da...
DevOps tends to focus on the relationship between Dev and Ops, putting an emphasis on the ops and application infrastructure. But that’s changing with microservices architectures. In her session at DevOps Summit, Lori MacVittie, Evangelist for F5 Networks, will focus on how microservices are changing the underlying architectures needed to scale, secure and deliver applications based on highly distributed (micro) services and why that means an expansion into “the network” for DevOps.
In his session at 19th Cloud Expo, Claude Remillard, Principal Program Manager in Developer Division at Microsoft, contrasted how his team used config as code and immutable patterns for continuous delivery of microservices and apps to the cloud. He showed how the immutable patterns helps developers do away with most of the complexity of config as code-enabling scenarios such as rollback, zero downtime upgrades with far greater simplicity. He also demoed building immutable pipelines in the cloud ...
@DevOpsSummit at Cloud taking place June 6-8, 2017, at Javits Center, New York City, is co-located with the 20th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long developm...
Traditional on-premises data centers have long been the domain of modern data platforms like Apache Hadoop, meaning companies who build their business on public cloud were challenged to run Big Data processing and analytics at scale. But recent advancements in Hadoop performance, security, and most importantly cloud-native integrations, are giving organizations the ability to truly gain value from all their data. In his session at 19th Cloud Expo, David Tishgart, Director of Product Marketing ...
The security needs of IoT environments require a strong, proven approach to maintain security, trust and privacy in their ecosystem. Assurance and protection of device identity, secure data encryption and authentication are the key security challenges organizations are trying to address when integrating IoT devices. This holds true for IoT applications in a wide range of industries, for example, healthcare, consumer devices, and manufacturing. In his session at @ThingsExpo, Lancen LaChance, vic...
While many government agencies have embraced the idea of employing cloud computing as a tool for increasing the efficiency and flexibility of IT, many still struggle with large scale adoption. The challenge is mainly attributed to the federated structure of these agencies as well as the immaturity of brokerage and governance tools and models. Initiatives like FedRAMP are a great first step toward solving many of these challenges but there are a lot of unknowns that are yet to be tackled. In hi...
Who are you? How do you introduce yourself? Do you use a name, or do you greet a friend by the last four digits of his social security number? Assuming you don’t, why are we content to associate our identity with 10 random digits assigned by our phone company? Identity is an issue that affects everyone, but as individuals we don’t spend a lot of time thinking about it. In his session at @ThingsExpo, Ben Klang, Founder & President of Mojo Lingo, discussed the impact of technology on identity. Sho...
What are the new priorities for the connected business? First: businesses need to think differently about the types of connections they will need to make – these span well beyond the traditional app to app into more modern forms of integration including SaaS integrations, mobile integrations, APIs, device integration and Big Data integration. It’s important these are unified together vs. doing them all piecemeal. Second, these types of connections need to be simple to design, adapt and configure...
"Splunk basically takes machine data and we make it usable, valuable and accessible for everyone. The way that plays in DevOps is - we need to make data-driven decisions to delivering applications," explained Andi Mann, Chief Technology Advocate at Splunk and @DevOpsSummit Conference Chair, in this SYS-CON.tv interview at @DevOpsSummit at 19th Cloud Expo, held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA.
"Logz.io is a log analytics platform. We offer the ELK stack, which is the most common log analytics platform in the world. We offer it as a cloud service," explained Tomer Levy, co-founder and CEO of Logz.io, in this SYS-CON.tv interview at DevOps Summit, held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA.
WebRTC is about the data channel as much as about video and audio conferencing. However, basically all commercial WebRTC applications have been built with a focus on audio and video. The handling of “data” has been limited to text chat and file download – all other data sharing seems to end with screensharing. What is holding back a more intensive use of peer-to-peer data? In her session at @ThingsExpo, Dr Silvia Pfeiffer, WebRTC Applications Team Lead at National ICT Australia, looked at differ...