

The Offline Web – Introduced Complexity

The Pitfalls of Creating Data in an Occasionally Connected Application

Until recently, Web applications were "connected-only" applications. Users could use the application only by connecting to the central server, and all data access happened in a single place. For many years, people accepted that as a limitation of Web applications. But it isn't a limitation any longer.

Over the past year, technologies such as Adobe AIR and Google Gears have come along that allow offline access to both the data and the application. This is a huge boon to Web developers, who until now were hampered by the offline problem of Web applications. Both of these solutions employ a local, lightweight database to serve and store the data while the application is offline.

Migrating your "connected-only" application to an "occasionally connected" application will likely require significant architectural changes to support the change tracking and conflict detection needed for synchronization. It's extremely important to consider all of these architectural changes and to design a synchronization strategy carefully from the start.

To illustrate the complexities that occasionally connected applications bring, and to show the sort of architectural changes that may be required, this article focuses on only one problem: How do you create and uniquely identify data while offline?

While this is among the simplest of the synchronization problems that you'll encounter, it provides an excellent example of the changes you must consider when designing occasionally connected applications.

Why Good Old Auto-Increment Isn't Enough
To be able to synchronize and detect conflicts, all of the data records must have a unique identifier that distinguishes them from all other records. In the database world, this would be called the record's primary key. When a new record is created, it must be assigned a unique primary key to ensure that it can be uniquely identified across all instances of the application. In a centrally accessed system, the most common way to generate unique values is to use an auto-incrementing integer. Each created record is assigned a primary key that's one greater than that of the last created record. But this fast, simple method that works so well for connected-only applications simply doesn't work for occasionally connected ones.
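As a minimal sketch of the centrally accessed case described above, the following uses Python's built-in sqlite3 module as a stand-in for the central server's database. The `contacts` table and the `create_contact_db` and `add_contact` helpers are assumptions made for illustration, not part of any product mentioned in this article.

```python
import sqlite3


def create_contact_db():
    """Create an in-memory database with a hypothetical contacts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contacts ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "name TEXT NOT NULL)"
    )
    return conn


def add_contact(conn, name):
    """Insert a contact and return the primary key the database assigned."""
    cur = conn.execute("INSERT INTO contacts (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid


# One central database hands out every key, so no two records can share one.
server = create_contact_db()
first_id = add_contact(server, "Alice")
second_id = add_contact(server, "Bob")
```

Because a single database assigns every key, each new record simply receives the next integer in sequence.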

To show why it doesn't work, consider an application that lets multiple users share a single contact list. The master version of the contact list resides on the central server, and a copy of that list is stored in the local database of each user's application. Assume that two users, User A and User B, synchronize their contact lists with the server at the beginning of the day and then proceed to work offline. Later, while working offline, each user adds a new contact to the local copy. What value should they use as a primary key? Auto-incrementing seems to work fine on the server, so what happens if auto-increment is used in the offline applications? The answer: collisions. Both local databases continue counting from the same last-synchronized value, so the two different contacts receive the same primary key.
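The scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ContactStore` is a hypothetical in-memory stand-in for a database, `find_collisions` is a hypothetical helper, and the contact names are made up.

```python
class ContactStore:
    """Hypothetical stand-in for one database holding the contact list."""

    def __init__(self, rows=None):
        # Copy the rows so each store evolves independently after "sync".
        self.rows = dict(rows or {})

    def add(self, name):
        """Assign the next auto-increment style key, as a local database would."""
        new_id = max(self.rows, default=0) + 1
        self.rows[new_id] = name
        return new_id


def find_collisions(a, b, baseline):
    """Return keys created since the baseline in both stores for different data."""
    new_in_a = set(a.rows) - set(baseline.rows)
    new_in_b = set(b.rows) - set(baseline.rows)
    return sorted(k for k in new_in_a & new_in_b if a.rows[k] != b.rows[k])


# Morning sync: both users start from the same copy of the server's list.
server = ContactStore({1: "Alice", 2: "Bob"})
user_a = ContactStore(server.rows)
user_b = ContactStore(server.rows)

# Offline: each user adds a different contact to the local copy.
id_a = user_a.add("Carol")
id_b = user_b.add("Dave")

# Both local counters continued from 2, so both new records claim key 3.
collisions = find_collisions(user_a, user_b, server)
```

When the two users reconnect, the server sees two different records that both claim the same primary key, and it cannot tell from the key alone whether they are the same contact or two distinct ones.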

More Stories By Eric Farrar

Eric Farrar is a senior product manager at Sybase, an SAP company, working on the SQL Anywhere embedded database. He is focused on bringing the power of embedded database technology into the new world of the web and cloud computing environments.
