Are You Thinking About Big Data When Doing IoT? - You Should Be

Based on all estimates by industry analysts and current trends, the IoT is growing at an incredible rate and is here to stay

There is no denying that the Internet of Things (IoT) is a hot topic. Gartner positions IoT at the peak of the ‘hype cycle.’ From a size perspective, these ‘Things’ can be anything from a small sensor to a large appliance. The data transmitted by these devices tends, for the most part, to be small: tiny packets of information destined for consumption and analysis, bringing value to the business.

Is there hype? Yes. As with any new technology, there is always a level of hype involved. Are the data packets involved small? For the most part, yes (there are always exceptions). Both may be true, yet the Internet of Things is growing at breakneck speed. No matter which analyst you read, the growth predictions are staggering. Gartner predicts that we will hit over 20 billion (with a B) devices by 2020. IHS predicts even larger numbers, with 30 billion by 2020 and over 75 billion devices by 2025. No matter what, that's a lot of devices, and no matter how small the packets, multiplied by the number of devices, that's a lot of data.
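
To put a rough number on "a lot of data," here is a back-of-the-envelope sketch. The device count is the Gartner prediction quoted above; the payload size and reporting rate are purely illustrative assumptions, so treat the result as an order-of-magnitude feel rather than a forecast.

```python
# Back-of-the-envelope estimate of aggregate IoT data volume.
# The device count is the Gartner prediction quoted in the article; the
# payload size and message rate are illustrative assumptions only.

def daily_volume_bytes(devices: int, payload_bytes: int, messages_per_day: int) -> int:
    """Total bytes produced per day by a fleet of identical devices."""
    return devices * payload_bytes * messages_per_day

DEVICES = 20_000_000_000      # 20 billion devices (Gartner prediction for 2020)
PAYLOAD_BYTES = 100           # assumed: one small sensor reading per message
MESSAGES_PER_DAY = 24 * 60    # assumed: one message per minute

total = daily_volume_bytes(DEVICES, PAYLOAD_BYTES, MESSAGES_PER_DAY)
print(f"{total / 1e15:.2f} PB per day")  # prints "2.88 PB per day"
```

Even with a 100-byte packet, the assumed fleet produces petabytes every day. Change either assumption and the total scales linearly with it.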

It's not the things, it's the data
What I find interesting is that the focus of discussion when talking IoT is so often the devices, the sensors, the hardware itself. The latest Fitbit or smartwatch. The new smart appliances, the new smart or self-driving cars (which are amalgamations of many ‘things’). Yes, those technologies are interesting (okay, fascinating, I will admit; my inner geek loves getting down into the actual technologies), but when we are looking at the world of IoT, we should take a step back and look at the big picture. What value are these devices providing?

What I am about to say may sound like heresy to many. IoT is not about the devices. The devices are not the end goal. The devices are tools, mechanisms, conduits of information. They provide (and consume) information. Massive amounts of information. A former colleague of mine was, for years, fond of saying, ‘Ed, it's all about the data.’ In the burgeoning world of IoT, that statement identifies the true business value of IoT. Information.

Watching out for potholes
Recently, Ford announced they were testing a pothole detector and alert system for cars. Living in New England, let me tell you, potholes are the bane of a car driver's existence. Many a car ends up in the repair shop during pothole season. Given that, the concept is intriguing. The manufacturer has cameras mounted on the vehicles. The cameras scan the roadway around the vehicle looking for signs of potholes. Image recognition allows it to make this determination. If a pothole is detected, the system will allow the car to avoid hitting the pothole, and thus potential damage to the vehicle.

Now some would say, ‘What does that have to do with big data?’ The system is self-contained within the vehicle. To be useful, the system needs to react in near real-time to the situation. It doesn't have time to send all the data back to the cloud for analysis to determine if there is a pothole. Also, what if it loses network connection? All valid points. Let's take a step back, and look at the bigger picture.

  • How does the system recognize a pothole? Image recognition. What does image recognition need? Lots of data about what potholes look like. Machine learning algorithms help it determine if it's seeing a pothole, and those algorithms need data to do that.
  • What will be the source of those pothole images? Wouldn't it be useful if images of any potholes the system encounters become part of the source data for the image recognition system to improve its detection? Wouldn't it be useful to provide that back to a central location to improve the algorithms and detection software, which could then be sent back to all the other vehicles to improve their capability?
  • What about all the cars without the system? Wouldn't it be nice if the pothole locations were flagged to the various GPS applications people use so they are aware of the pothole and its location?
  • What about the local public works department? Wouldn't it be nice if they were automatically notified about the new pothole identified so it could be repaired?
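
The last two questions can be sketched in a few lines. This is a hypothetical illustration, not a description of Ford's system: the `PotholeReport` record, the grid-rounding approach, and the two-vehicle confirmation threshold are all assumptions made for the example. The idea is that vehicles send small reports to a central service, which groups nearby reports and only flags locations that more than one vehicle has seen.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PotholeReport:
    """Hypothetical record a vehicle might send after a detection."""
    vehicle_id: str
    latitude: float
    longitude: float


def grid_cell(report: PotholeReport, precision: int = 4) -> tuple[float, float]:
    """Round coordinates so nearby reports share a cell.

    Four decimal places is roughly 11 m of latitude (an assumed granularity).
    """
    return (round(report.latitude, precision), round(report.longitude, precision))


def confirmed_potholes(reports, min_vehicles: int = 2):
    """Return the cells reported by at least `min_vehicles` distinct vehicles."""
    vehicles_by_cell = defaultdict(set)
    for report in reports:
        vehicles_by_cell[grid_cell(report)].add(report.vehicle_id)
    return sorted(
        cell for cell, vehicles in vehicles_by_cell.items()
        if len(vehicles) >= min_vehicles
    )


reports = [
    PotholeReport("car-1", 42.36011, -71.05891),
    PotholeReport("car-2", 42.36013, -71.05889),  # same spot, different car
    PotholeReport("car-1", 42.37000, -71.06000),  # seen by only one car
]
print(confirmed_potholes(reports))  # prints [(42.3601, -71.0589)]
```

The confirmed list is what could be pushed to GPS applications or a public works queue. Requiring a second vehicle is one simple way to keep a single false detection from triggering a repair ticket.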

Ingestion considerations
Given the importance of data to the success of any IoT implementation, how that information is ingested is critical.

  • Data Quality - In the world of data, quality has always been an important consideration. Data cleansing and scrubbing is standard practice already in many organizations. It has become critical for IoT implementations. Ingesting dirty data into even the best IoT implementation will bring it to a grinding halt.
  • Data Volume - As I have mentioned already, the data packets for an individual device/sensor are often small. That being said, multiplied by the sheer number of devices, the volume can quickly overwhelm a network or storage environment if not planned for appropriately. These considerations must also take location into account.
  • Data Timeliness - Besides volume, new and timely data is also a consideration. In the pothole example, if the last update was weeks ago, how valid is the location anymore?
  • Data Pedigree - Where did the data come from? Is it a valid source? The pedigree is less important when using internal systems, as the source is well known, but IoT systems, by their nature, frequently will be getting their data from devices and sources outside the normal perimeter. This requires extra effort to ensure you trust the information being consumed.
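
Three of these considerations (quality, timeliness, and pedigree) lend themselves to a simple gate at ingestion time. The sketch below is a minimal illustration under stated assumptions: the field names, the trusted-source allowlist, and the seven-day freshness window are all hypothetical, and timestamps are assumed to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone

TRUSTED_SOURCES = {"fleet-gateway", "city-sensor-net"}  # hypothetical allowlist
MAX_AGE = timedelta(days=7)                             # assumed freshness window


def ingestion_problems(reading: dict, now: datetime) -> list[str]:
    """Return the reasons a reading should be rejected (empty list = accept)."""
    problems = []

    # Data quality: coordinates must be present and physically plausible.
    lat, lon = reading.get("latitude"), reading.get("longitude")
    if not isinstance(lat, (int, float)) or not -90 <= lat <= 90:
        problems.append("quality: bad latitude")
    if not isinstance(lon, (int, float)) or not -180 <= lon <= 180:
        problems.append("quality: bad longitude")

    # Data timeliness: reject readings with no timestamp or an old one.
    observed = reading.get("observed_at")
    if not isinstance(observed, datetime) or now - observed > MAX_AGE:
        problems.append("timeliness: missing or stale timestamp")

    # Data pedigree: only accept readings from sources we have chosen to trust.
    if reading.get("source") not in TRUSTED_SOURCES:
        problems.append("pedigree: untrusted source")

    return problems


now = datetime(2017, 6, 1, tzinfo=timezone.utc)
fresh = {"latitude": 42.36, "longitude": -71.06,
         "observed_at": now - timedelta(hours=2), "source": "fleet-gateway"}
stale = {"latitude": 142.0, "longitude": -71.06,
         "observed_at": now - timedelta(weeks=3), "source": "unknown-device"}

print(ingestion_problems(fresh, now))  # prints []
print(ingestion_problems(stale, now))
```

Returning the list of reasons, rather than a bare pass/fail, makes it easier to see which of the considerations a rejected reading failed.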

No technology negates the need for good design and planning
Based on all estimates by industry analysts and current trends, the Internet of Things is growing at an incredible rate and is here to stay. There is a big radar blip of data outside your data center that is not going anywhere. That data provides great value, but also many challenges that need to be taken into consideration. If you are doing IoT and are not looking at Big Data, you are missing an opportunity and business value. As many of my readers have heard me say frequently, no technology negates the need for good design and planning. The Internet of Things and the accompanying Big Data demand it if you are to be successful.

More Stories By Ed Featherston

Ed Featherston is VP, Principal Architect at Cloud Technology Partners. He brings 35 years of technology experience in designing, building, and implementing large complex solutions. He has significant expertise in systems integration, Internet/intranet, and cloud technologies. He has delivered projects in various industries, including financial services, pharmacy, government and retail.


