Instant Instant Messaging: Just Add Web Sockets

Chat requires full-duplex communication

Chat rooms and live peer-to-peer chat on the web are high on the list of stunning rich application features that can still drop jaws. Facebook recently launched an integrated web chat implementation to much fanfare. Their impressive Erlang and C++ chat infrastructure showcases real-time web techniques and live, interactive interface elements. However, following Facebook's lead is actually not that hard if you use the right approach.

The two main difficulties faced by Facebook, Google, or any other web giant that wishes to deploy a chat application are scaling out an inconceivably large messaging back end and sending real-time messages to the browser. The first problem is one that many of us would like to have, but few actually encounter. The second problem is getting much, much easier to solve. Web chat is about to go mainstream, thanks in large part to a new HTML 5 feature called Web Sockets.

Chat Is Full-Duplex; the Web Is Not
The challenges in bringing real-time messaging (chat, for example) to web applications stem from the fundamental design of the web. It is, after all, a system designed for navigating hypertext documents. The request-response model is perfect for document retrieval, but it is a poorer fit for an application platform. Some applications, chat among them, are exceptionally bad fits.

Chat requires full-duplex communication. That is, data must be able to flow in both directions at once. This is a clear necessity for chat, as messages may originate on either the server or the client and must be transmitted without delay. After all, instant messaging is of little use if it isn't instantaneous. HTTP's request-response model rules this out: the server can only send data in reply to a client request, so the two sides cannot each transmit at will at the same time.

Since nearly anything is possible, some tenacious developers have shoehorned full-duplex communication into existing web browsers. Comet long-polling, iframe streaming, and other techniques have all been applied in pursuit of this goal. Each technique has weaknesses, however, and even Comet techniques that allow data to stream down from the server cannot avoid initiating a new request for each upstream message. While a new request might not seem like much, each HTTP action causes several hundred bytes of header data to be generated, transmitted, and parsed. Even when bandwidth and computing resources are cheap, the additional latency incurred by a full round trip hurts real-time interactivity.
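To make the header overhead concrete, here is a rough TypeScript sketch that builds one plausible request head and compares its size with the chat text it carries. The host, path, cookie, and header values are invented for illustration, and real browsers send different (often longer) headers, so treat the result as an order-of-magnitude estimate rather than a measurement.

```typescript
// Hypothetical request head for one upstream chat message sent over HTTP.
// The header names are standard; the path, host, and values are invented.
const requestHead: string = [
  "POST /chat/send HTTP/1.1",
  "Host: chat.example.com",
  "User-Agent: Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
  "Accept: */*",
  "Accept-Language: en-US,en;q=0.9",
  "Accept-Encoding: gzip, deflate",
  "Content-Type: text/plain",
  "Content-Length: 2",
  "Cookie: session=0123456789abcdef0123456789abcdef",
  "Connection: keep-alive",
  "",
  "",
].join("\r\n");

// Everything here is plain ASCII, so string length equals bytes on the wire.
const headerBytes: number = requestHead.length;

// Bytes sent upstream when every chat message needs its own HTTP request.
function upstreamBytes(texts: string[], perRequestHeaderBytes: number): number {
  return texts.reduce((total, t) => total + perRequestHeaderBytes + t.length, 0);
}

const messages: string[] = ["hi", "how are you?", "brb"];
const payloadBytes: number = messages.reduce((n, t) => n + t.length, 0);
const totalBytes: number = upstreamBytes(messages, headerBytes);

console.log(`header bytes per request: ${headerBytes}`);
console.log(`chat text: ${payloadBytes} bytes, sent on the wire: ${totalBytes} bytes`);
```

With these made-up headers, each request wraps a few hundred bytes of header data around a message that is only a handful of bytes long, and that cost is paid again for every upstream message.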

Using today's browser technology for both upstream and downstream, there is significant overhead to real-time communication. Despite this, some have been willing to stomach the complexity and inefficiency to run applications like chat on the web without succumbing to the temptation of proprietary plugins.

HTML 5 Web Sockets
The HTML 5 standard specifies new APIs for storage, drawing, drag-and-drop, and other areas that have made web programming painful. Browsers have already begun incorporating parts of HTML 5 (canvas, for example) even though the specification is far from complete. The HTML 5 Communication section includes two additional connectivity features: Server-Sent Events, a standardization of HTTP push, and Web Sockets, a cross-domain safe, full-duplex connection. Server-Sent Events will make real-time updates and notifications easy, and Web Sockets provide the functionality necessary to build chat for the web without the previously required hackery.
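As a minimal sketch of what chat over a Web Socket could look like, the TypeScript below wires a chat room to a socket object. It relies only on the `onopen`, `onmessage`, `send`, and `close` members of the browser `WebSocket` interface. The names `joinChat`, `ChatSocket`, `ChatMessage`, and `ChatRoom`, and the JSON message shape, are assumptions made for this example and are not part of any specification. An in-memory stand-in replaces the real socket so the sketch runs outside a browser.

```typescript
// The subset of the browser WebSocket interface this sketch relies on.
// Event parameters are typed loosely (any) to keep the example short.
interface ChatSocket {
  onopen: ((ev: any) => any) | null;
  onmessage: ((ev: any) => any) | null;
  send(data: string): void;
  close(): void;
}

// Assumed wire format for this example: one JSON object per message.
interface ChatMessage { from: string; text: string; }
interface ChatRoom { say(text: string): void; leave(): void; }

function joinChat(
  socket: ChatSocket,
  user: string,
  onMessage: (message: ChatMessage) => void,
): ChatRoom {
  const outbox: string[] = []; // messages typed before the socket opened
  let isOpen = false;

  socket.onopen = () => {
    isOpen = true;
    // Flush anything queued while the connection was being established.
    for (const encoded of outbox.splice(0)) socket.send(encoded);
  };

  // Downstream: the server can push a message at any time, no request needed.
  socket.onmessage = (ev) => {
    onMessage(JSON.parse(String(ev.data)) as ChatMessage);
  };

  return {
    // Upstream: reuse the same open connection, no new HTTP request.
    say(text: string): void {
      const encoded = JSON.stringify({ from: user, text });
      if (isOpen) socket.send(encoded);
      else outbox.push(encoded);
    },
    leave(): void {
      socket.close();
    },
  };
}

// Demo with an in-memory stand-in so the sketch runs without a browser.
const sent: string[] = [];
const received: ChatMessage[] = [];
const fakeSocket: ChatSocket = {
  onopen: null,
  onmessage: null,
  send: (data) => { sent.push(data); },
  close: () => {},
};

const room = joinChat(fakeSocket, "alice", (m) => { received.push(m); });
room.say("hello");        // queued: the connection is not open yet
fakeSocket.onopen?.({});  // the connection opens and the queue is flushed
fakeSocket.onmessage?.({ data: '{"from":"bob","text":"hi alice"}' }); // server push
console.log(sent, received);
```

In a browser, the stand-in would be replaced by a real socket, for example `new WebSocket("ws://chat.example.com/room")`, where the URL is hypothetical. The point of the design is that sending and receiving share one long-lived connection, so neither direction has to wait for the other.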

About the Author: Frank Salim

Frank Salim is a polyglot programmer with a keen interest in making life easier for his fellow coders. He leads WebSocket development at Kaazing and is the front man for Kaazing's open source project at kaazing.org. Salim is an open source advocate and a committer in several open source projects. He is a regular author and contributor to the online tech magazine Comet Daily.

Most Recent Comments
393rocker 11/02/09 10:02:00 AM EST

do u have a source code of implementing IM? if i start from the scratch..
