Long Polling Explained

How long polling could be used in Spring MVC web applications

Let us start with the news publishing service called NewsPostingThread, which produces news messages such as "broadcasting news article # 1". At the heart of this service is lockObject, which serves as the synchronization lock. When a request-processing thread asks for the latest news, it blocks on lockObject until the news-generating thread calls lockObject.notifyAll(). At that point the request-processing thread prepares and sends the response to the client.

@Component
public class NewsPostingThread implements Runnable {
    ...
    @Override
    public void run() {
        while (true) {
            try {
                // Simulate news arriving at random intervals of up to 10 seconds.
                Thread.sleep(random.nextInt(10000));
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            counter.incrementAndGet();
            // Wrap the counter around before it can overflow.
            if (counter.get() > Long.MAX_VALUE - 10) {
                counter.set(0);
            }
            // Wake up every request thread blocked in getLatestNews().
            synchronized (lockObject) {
                lockObject.notifyAll();
            }
        }
    }

    public String getLatestNews() {
        try {
            log.debug("waiting for the latest news");
            // Block until the publishing thread calls notifyAll().
            synchronized (lockObject) {
                lockObject.wait();
            }
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return "broadcasting news article # " + counter.get();
    }
}
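
The blocking mechanism described above can also be tried outside of Spring. The following is a minimal sketch, not the article's class: NewsBoard, publish and awaitNewerThan are hypothetical names, and the version counter and timeout are additions made for illustration. It waits in a loop on a condition, since Object.wait() is allowed to return spuriously, and it gives up after a timeout instead of blocking forever.

```java
// Hypothetical standalone class; not the article's NewsPostingThread.
final class NewsBoard {
    private final Object lock = new Object();
    private long version = 0; // guarded by lock

    /** Publishes one news item and wakes every waiting reader. */
    void publish() {
        synchronized (lock) {
            version++;
            lock.notifyAll();
        }
    }

    /**
     * Blocks until the version exceeds lastSeen or timeoutMillis elapses,
     * then returns the current version.
     */
    long awaitNewerThan(long lastSeen, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        synchronized (lock) {
            // Re-check the condition in a loop: wait() may return spuriously.
            while (version <= lastSeen) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    break; // timed out without news
                }
                try {
                    lock.wait(remaining);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            return version;
        }
    }

    public static void main(String[] args) {
        NewsBoard board = new NewsBoard();
        // A publisher thread that posts one item after a short delay.
        new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                return;
            }
            board.publish();
        }).start();
        System.out.println("broadcasting news article # " + board.awaitNewerThan(0, 5_000));
    }
}
```

A caller that remembers the last version it saw can pass it back in on the next request, so an item published between two requests is not missed.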

Let us now take a look at the part of the Spring MVC controller that calls NewsPostingThread to get the news:

@Controller
public class LongWaitController {
    ...
    @RequestMapping(value="/longPollingAjax")
    @ResponseBody
    public String ajaxReply(final HttpServletRequest request,
                            final HttpServletResponse response,
                            ModelMap mm) throws Exception {
        log.debug("ajaxReply");
        String news = newsPostingThread.getLatestNews();
        log.debug("Data is " + news);
        return "{\"data\": \"" + news + "\" }";
    }
}

ajaxReply(...) is a simple controller method that processes long polling AJAX requests and returns raw data to the calling client. While processing the client request, the call to getLatestNews() blocks until news becomes available.

Now let us look at the client side, where the long polling request originates. AJAX is used to create the request, and there are two ways to use it to implement long polling. One way is to use the JavaScript setInterval function to schedule polling: it calls poll() repeatedly at the specified interval. You might need to coordinate the request timeout setting on the server with the interval passed to setInterval so that requests do not start queuing up, since AJAX requests might not return in the same order they were issued. In the code below, once the user clicks the button with id send, the poll function is invoked every 200 milliseconds:

$("#send").unbind("click").bind("click", function(event) {
    $("#result").html('Started looking for news ....');
    /* Poll with GET every 200 ms and put the results in a div */
    setInterval(function() { poll(); }, 200);
});

function poll() {
    $.ajax({
        url: "/app/longPollingAjax",
        type: "GET",
        dataType: "json",
        success: function(response) {
            $("#result").html('Submitted successfully ' + response.data);
        },
        error: function() {
            $("#result").html('There is error while submit');
        }
    });
}

Alternatively, a better solution is to use the jQuery Ajax complete callback, which fires when the Ajax call finishes, after the success and error callbacks have been executed. This way, when the button with id send is pushed, the first poll() request is initiated. If the request is successful, the result div shows the latest news from the server, after which complete is called and a new poll request is issued. If the request times out or fails, complete is still called and a new poll request is issued.

$("#send").unbind("click").bind("click", function(event) {
    $("#result").html('Started looking for news ....');
    /* Issue the first GET request and put the results in a div */
    poll();
});

function poll() {
    $.ajax({
        url: "/app/longPollingAjax",
        type: "GET",
        dataType: "json",
        success: function(response) {
            $("#result").html('Submitted successfully ' + response.data);
        },
        error: function() {
            $("#result").html('There is error while submit');
        },
        complete: function() {
            poll(); // re-issue the polling request after the current request completes
        }
    });
}

Enter Servlet spec 3.0

Before the Servlet 3.0 spec, there were two server threading models: thread per connection and thread per request. In the thread per connection model, a thread is associated with every TCP/IP connection, and the server can scale to a very high number of requests per second when they come from the same clients. Yet for a site that does not use long polling, this model can have a hard time scaling to support thousands of users. The reason is that on most web sites, users initiate an action and then the connection stays mostly idle while they read the page and decide on the next action. Hence threads that are tied to a connection sit idle. To improve scalability, web servers adopted the thread per request model. In this model, after servicing a request, the thread can be reused to service a request from a different client. This model allows much greater scaling of the user base at the minor expense of increased time to service each request, which is due to the thread scheduling that takes place.

Yet with long polling, the difference in scalability between thread per connection and thread per request is blurred. This happens because each request is stuck waiting for an event (in this case a news event) before generating the response. Waiting in the servlet is inefficient since, at the very least, the server thread that could be used to serve a different request is blocked. This results in poor scalability as users are added to the application.

The Servlet 3.0 spec comes to the rescue: it defines an asynchronous way for the server to process requests. To do that, the servlet passes the request to an AsyncContext. The response processing is then handed to a different thread, and the servlet processing thread can handle a request from a different user. Ideally, the web server container would maintain at least two thread pools: one to process servlet requests and another to process long-running requests that are passed to the asynchronous context.

Let us see what this long polling example looks like when it is rewritten to use the asynchronous features of a Servlet 3.0 container through Spring MVC's asynchronous servlet abstractions.

@Controller
public class SubscribeToBroadcast {
    ...
    @RequestMapping(value="/pollBroadcast")
    @ResponseBody
    public DeferredResult<String> ajaxReply(final HttpServletRequest request,
                                            final HttpServletResponse response,
                                            ModelMap mm) throws Exception {
        final DeferredResult<String> dr = new DeferredResult<String>(
                TimeUnit.MINUTES.toMillis(1), TIMEOUT_RESULT);
        broadcastCounter.addSubscribed(dr);
        return dr;
    }
}

The ajaxReply method here returns a DeferredResult, which allows the result to be produced in a thread of the developer's choice. In this case the method returns a string representation of a JSON object, hence DeferredResult<String> is used. A new deferred result is created that expires in 1 minute, with a default response (TIMEOUT_RESULT) returned upon expiry. The deferred result is then added to the news subscription list, and the servlet thread is done.

Let us take a look at the BroadcastCounter - the news publishing service.

@Component
public class BroadcastCounter {
    private static final Logger log = Logger.getLogger(BroadcastCounter.class);
    private Thread t;
    private AtomicLong counter = new AtomicLong();
    private List<DeferredResult<String>> subscribedClient =
            Collections.synchronizedList(new ArrayList<DeferredResult<String>>());

    public BroadcastCounter() {
        t = new Thread(
            new Runnable() {
                @Override
                public void run() {
                    while (true) {
                        counter.incrementAndGet();
                        if (counter.get() > Long.MAX_VALUE - 100) {
                            counter.set(0);
                        }
                        try {
                            Thread.sleep(10000);
                        } catch (InterruptedException e) {
                            log.error(e);
                        }
                        // Deliver the news to every waiting client, then drop the subscription.
                        synchronized (subscribedClient) {
                            Iterator<DeferredResult<String>> it = subscribedClient.iterator();
                            while (it.hasNext()) {
                                DeferredResult<String> dr = it.next();
                                dr.setResult("{ \"data\" : \"Deferred Broadcast News # " + counter + "\" }");
                                it.remove();
                            }
                        }
                    }
                }
            });
        t.setDaemon(true);
        t.setName("BroadcastDeferredThread");
        t.start();
    }

    public void addSubscribed(DeferredResult<String> client) {
        synchronized (subscribedClient) {
            subscribedClient.add(client);
        }
    }
}

At the heart of this class is the subscribed client list, where all the deferred results are stored until a new event occurs. Servlets add a deferred result to the list when a request is received, and the BroadcastDeferredThread goes through the list and assigns the news to each waiting deferred result. The broadcast thread runs indefinitely: it increments the news counter, sleeps for 10 seconds, and then broadcasts the news to all of the subscribed clients. If a deferred result has already expired, the call to setResult is ignored by the Spring API.
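
To experiment with the park-and-complete idea without a servlet container, here is a minimal sketch in plain Java. It is not the article's code: BroadcastSketch, subscribe and broadcast are hypothetical names, and java.util.concurrent.CompletableFuture (with completeOnTimeout, which needs Java 9 or later) is used as a stand-in for Spring's DeferredResult.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for BroadcastCounter; CompletableFuture plays the role of DeferredResult.
final class BroadcastSketch {
    private final List<CompletableFuture<String>> subscribers = new ArrayList<>();
    private long counter = 0;

    /** Parks a pending response; no thread blocks while it waits. */
    synchronized CompletableFuture<String> subscribe(long timeoutMillis, String timeoutResult) {
        CompletableFuture<String> pending = new CompletableFuture<String>()
                .completeOnTimeout(timeoutResult, timeoutMillis, TimeUnit.MILLISECONDS);
        subscribers.add(pending);
        return pending;
    }

    /** Completes every parked response with the next news item; returns how many were delivered. */
    synchronized int broadcast() {
        counter++;
        int delivered = 0;
        for (CompletableFuture<String> pending : subscribers) {
            // complete() returns false when the future has already timed out.
            if (pending.complete("Deferred Broadcast News # " + counter)) {
                delivered++;
            }
        }
        subscribers.clear(); // a subscription lasts for one news item only
        return delivered;
    }

    public static void main(String[] args) {
        BroadcastSketch broadcaster = new BroadcastSketch();
        CompletableFuture<String> first = broadcaster.subscribe(60_000, "timeout");
        CompletableFuture<String> second = broadcaster.subscribe(60_000, "timeout");
        first.thenAccept(news -> System.out.println("first got: " + news));
        second.thenAccept(news -> System.out.println("second got: " + news));
        System.out.println("delivered to " + broadcaster.broadcast() + " subscribers");
    }
}
```

The point of the sketch is that subscribe returns immediately: the caller's thread is free as soon as the pending result is in the list, and only the broadcasting thread touches it again.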

Here is a sample page with the result from running this task.

Deferred Result Example

Scalability Testing

To compare the scalability of the pre-Servlet 3.0 solution with the asynchronous servlet, I wrote a program (WebClient) that creates multiple threads, retrieves the web page content, and measures the time it took to perform the retrieval. For this analysis the specific timings were not essential, e.g. whether one call took 9,000 or 10,000 milliseconds; what matters is the trend as the number of threads accessing the site increases. Here are the test results:

Servlet                             # threads   time (millis)    st. dev
pre-Servlet 3.0 spec servlet               68           9,424       2.19
pre-Servlet 3.0 spec servlet              128           9,993       2.33
pre-Servlet 3.0 spec servlet              256          10,149     785.31
pre-Servlet 3.0 spec servlet              512          19,610     898.06
pre-Servlet 3.0 spec servlet             1024          38,679   1,743.00
pre-Servlet 3.0 spec servlet             2048          76,925   3,780.28
Servlet 3.0 asynchronous servlet           68           9,076       4.87
Servlet 3.0 asynchronous servlet          128          10,028       2.77
Servlet 3.0 asynchronous servlet          256          10,034       3.96
Servlet 3.0 asynchronous servlet          512          10,014       9.44
Servlet 3.0 asynchronous servlet         1024          10,027      21.57
Servlet 3.0 asynchronous servlet         2048          10,014      39.40
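
The WebClient program itself is not reproduced here. As a rough idea of what such a measurement can look like, here is a minimal sketch under stated assumptions: LoadTestSketch and its methods are hypothetical names, the page retrieval is replaced by an arbitrary Runnable, and the standard deviation is computed as a population standard deviation, which may differ from how the figures in the table were calculated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical harness; the task stands in for "retrieve the web page content".
final class LoadTestSketch {

    /** Runs the task once on each of `threads` threads and returns the elapsed millis per thread. */
    static long[] measure(int threads, Runnable task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<Long> timed = () -> {
                long start = System.nanoTime();
                task.run();
                return (System.nanoTime() - start) / 1_000_000L;
            };
            List<Future<Long>> pending = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                pending.add(pool.submit(timed));
            }
            long[] millis = new long[threads];
            for (int i = 0; i < threads; i++) {
                millis[i] = pending.get(i).get();
            }
            return millis;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    static double mean(long[] values) {
        double sum = 0;
        for (long v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Population standard deviation (divides by n, not n - 1). */
    static double stdDev(long[] values) {
        double m = mean(values);
        double squares = 0;
        for (long v : values) {
            squares += (v - m) * (v - m);
        }
        return Math.sqrt(squares / values.length);
    }

    public static void main(String[] args) {
        // Stand-in workload: each "request" simply sleeps for 100 ms.
        long[] millis = measure(8, () -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.printf("mean=%.1f ms, st. dev=%.2f ms%n", mean(millis), stdDev(millis));
    }
}
```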


Performance Test

As expected, as the number of threads accessing the synchronous long-polling solution increases, the response time increases. Up to just under 256 threads, the synchronous and asynchronous solutions perform very similarly. This is most likely because the web server (Jetty 8.1.13.v20130916) has about that many threads servicing requests, so as requests come in from the test program, almost none have to wait to be serviced. Beyond 256 threads, the difference between the two solutions is highly noticeable.

As you look at the test data for the synchronous service at or above 256 threads, two notable things happen: the response time increases with the number of threads, and so does the standard deviation. Both increase because fewer threads are available to process client requests as the number of clients grows. Depending on when a particular request comes in, all threads on the server might already be busy waiting for the news update. In this situation, the time a new request has to wait depends not only on when it came in, e.g. 0.1 or 0.2 seconds before the news is available to be broadcast, but also on how many requests are waiting in line before it can even be serviced.
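
The queuing effect can be illustrated with a back-of-the-envelope estimate. This is a sketch under assumptions, not a model the measurements were derived from: it assumes the server has 256 request threads (the guess made above), that news is published every 10 seconds, and that every blocked thread is released on each news item, so clients are answered in "waves" of 256. WaveEstimate and lastWaveMillis are hypothetical names.

```java
// Hypothetical estimate of how long the last wave of blocked clients waits.
final class WaveEstimate {

    /** Returns ceil(clients / serverThreads) news intervals, in milliseconds. */
    static long lastWaveMillis(int clients, int serverThreads, long newsIntervalMillis) {
        long waves = (clients + serverThreads - 1L) / serverThreads; // integer ceiling
        return waves * newsIntervalMillis;
    }

    public static void main(String[] args) {
        // Assumed values: 256 server threads and a 10-second news interval.
        for (int clients : new int[] {256, 512, 1024, 2048}) {
            System.out.println(clients + " clients: "
                    + lastWaveMillis(clients, 256, 10_000) + " ms");
        }
    }
}
```

Under these assumptions the estimate doubles each time the number of clients doubles past 256 (20,000, 40,000 and 80,000 ms for 512, 1024 and 2048 clients), which is the same trend the synchronous rows of the table show.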

The asynchronous test results show a remarkable difference in comparison. There is no ramp-up in the response time as the number of requests grows, and the standard deviation increases only slightly. The results suggest that as requests come in, they are all set aside to wait for the news update, while the server threads that accepted them are returned to the pool to service other user requests. So, if you need to use long polling and your server supports the Servlet 3.0 specification, use asynchronous servlets, since they scale better.

Realistic Example: News Publish/Subscribe Site

The last example is a little more interactive: it allows one person to subscribe to news on one of four channels while another person posts news on these channels. There is also an auto-posting service that posts news as well.

Client Example

News Service Example

At the heart of this service is the NewsDistributionEngine class. It has three methods: addMessageConsumer, removeMessageConsumer and postMessage. addMessageConsumer adds a consumer request (a DeferredResult) to the list of consumers interested in a news topic; this happens when the AJAX client polls the news service after the user has subscribed to a news topic. The lists of consumers interested in specific topics are held in a map, which links a topic id to the list of interested consumers. removeMessageConsumer removes the consumer from that list. postMessage is called by clients that post news: either the web page or an automatic broadcasting client. To process the news and update all clients, the application starts a new thread that first gets the list of clients subscribed to the topic and, for each request that has not timed out, posts the news to the client. Clients are then removed from the subscribed clients, since a subscription only lasts for one request. The client re-subscribes to the topic via long polling AJAX.
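
The topic map described above can be sketched in plain Java as follows. This is an illustration under assumptions rather than the actual NewsDistributionEngine: the class name TopicEngineSketch is hypothetical, topic ids are assumed to be integers, CompletableFuture stands in for DeferredResult, and delivery happens on the calling thread instead of a newly started one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch; method names follow the article, everything else is assumed.
final class TopicEngineSketch {
    // topic id -> consumers waiting for the next news item on that topic
    private final Map<Integer, List<CompletableFuture<String>>> consumersByTopic = new HashMap<>();

    synchronized void addMessageConsumer(int topicId, CompletableFuture<String> consumer) {
        consumersByTopic.computeIfAbsent(topicId, id -> new ArrayList<>()).add(consumer);
    }

    synchronized void removeMessageConsumer(int topicId, CompletableFuture<String> consumer) {
        List<CompletableFuture<String>> waiting = consumersByTopic.get(topicId);
        if (waiting != null) {
            waiting.remove(consumer);
        }
    }

    /** Delivers the news to every consumer of the topic and returns how many received it. */
    int postMessage(int topicId, String news) {
        List<CompletableFuture<String>> waiting;
        synchronized (this) {
            // Detach the list: a subscription lasts for one request only.
            waiting = consumersByTopic.remove(topicId);
        }
        if (waiting == null) {
            return 0;
        }
        int delivered = 0;
        for (CompletableFuture<String> consumer : waiting) {
            // complete() returns false for consumers that were already completed.
            if (consumer.complete(news)) {
                delivered++;
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        TopicEngineSketch engine = new TopicEngineSketch();
        CompletableFuture<String> sports = new CompletableFuture<>();
        engine.addMessageConsumer(1, sports);
        sports.thenAccept(news -> System.out.println("topic 1: " + news));
        System.out.println("delivered: " + engine.postMessage(1, "home team wins"));
    }
}
```

Detaching the whole list under the lock and completing the consumers outside of it keeps the lock short, so new subscriptions are not held up while news is being delivered.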

SubscribeToNews is a Spring MVC component that has three methods: show the subscription page, process a subscription, and poll for news. It uses HttpSession to remember the client's last subscription, which the poll-for-news method then uses to subscribe to the appropriate news channel. PublishNewsController is also a Spring MVC component; it shows the news posting page and, when news is posted, calls NewsDistributionEngine to process it.

Conclusion
The article has shown three examples of long-polling applications: one using the pre-Servlet 3.0 spec, a second using Servlet 3.0 asynchronous servlets, and a third that is a bit more interactive. In these examples I showed how to implement long polling on both the server and the client side. As you choose an implementation of long polling, the results show that asynchronous servlets scale much better than synchronous servlets. I hope you find these examples useful in understanding how to implement long polling. The code used in these examples can be found at ...

More Stories By Henry Naftulin

Henry Naftulin is a seasoned, result-oriented senior enterprise software developer with 10 years of experience in enterprise-scale software applications. He is highly skilled in Java (J2EE), web and database enterprise application development, and is recognized as a strong team player, technical leader and mentor for junior team members. His latest interests include e-Logistics/Supply Chain, financial applications and performance tuning.
