@DXWorldExpo: Blog Feed Post

Aggregated Data Dilemma

Valuable performance and behavioral nuances can be buried in the aggregated data

Okay, I am weird (tell me something that I don’t know, say most of my friends).  For Christmas I wanted a Nike Apple Watch to go with my existing Fitbit and Garmin fitness trackers (I look sort of like a cyborg in the photo below…which is always cool).

While I was intrigued by the ability to do all sorts of cool things on the Apple Watch (like take a phone call and talk into my wrist watch like Dick Tracy), the thing that most intrigued me was the ability to buy third-party apps that could yield detailed exercise and health data.  I was hoping that this detailed exercise and health data could help me understand what effect particular behaviors or activities (or lack of particular behaviors and activities) were having on my overall health.

Why is this important to me?  You can thank articles like “Unexpected Heart Attack Triggers” for my health and exercise anxiety.  The article highlighted several things that can trigger a heart attack including:

  • Lack of sleep (definitely an issue, especially when I’m traveling so much)
  • Migraine Headaches (how can you work in technology and not have headaches)
  • Cold Weather (need to find more clients in warmer weather)
  • Big, Heavy Meals (with the exception of Chipotle, right?)
  • Getting Out of Bed in the Morning (see, I knew that was a big danger!!)
  • Alcohol (just like to drink a beer now and then)
  • Coffee (I drink Chai Tea Lattes, that’s technically not coffee, and I know that I shouldn’t admit that I drink Chai Tea Lattes)

So there are many items on that above list that could trigger a heart attack, and I enjoy many of the things on that list (like sleeping and eating and the occasional beer).  Consequently, I thought I’d put my data science experience to work to monitor my exercise and diet behaviors and predict potential health outcomes.

Personal Fitness Analytics
I tested the downloadable data from each of the three devices. The Fitbit offered the easiest way to download my fitness data (and I have TONS of useful fitness and diet tracking suggestions if anyone at Fitbit, Garmin or Apple ever reads this blog!!). The problem with the fitness data is that I can only get daily-level data (see Table 1).

Table 1:  Daily Fitness Tracking Data

I can add more external data to the aggregated fitness data (e.g., days of the week, days when I travel, how much I travel on those travel days) to come up with some simple plots.

For example, Figure 2 shows a visual correlation between the calories that I burn per step and the days that I travel.  My assumption is that I burn more calories per step when I am doing something that requires more exertion (like running or climbing steps), so it makes sense that on days when I am traveling, and have fewer opportunities for highly exertive activities, I burn fewer calories per step.

Figure 2:  How Many Calories I Burn Per Step When Traveling
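A minimal sketch of how a comparison like the one in Figure 2 might be computed from daily-level records. The field names (`steps`, `calories`, `travel`) and every number below are hypothetical placeholders, not the author's data or the export format of any particular device.

```python
from statistics import mean

# Hypothetical daily records; values are invented purely for illustration.
daily = [
    {"date": "day-1", "steps": 10000, "calories": 3000, "travel": False},
    {"date": "day-2", "steps": 12000, "calories": 3000, "travel": False},
    {"date": "day-3", "steps": 8000, "calories": 2000, "travel": True},
    {"date": "day-4", "steps": 5000, "calories": 1000, "travel": True},
]


def calories_per_step(day):
    """Calories burned divided by steps taken for one day."""
    return day["calories"] / day["steps"]


def mean_calories_per_step(days, travel):
    """Average calories per step over travel (True) or non-travel (False) days."""
    values = [
        calories_per_step(d)
        for d in days
        if d["travel"] is travel and d["steps"] > 0  # skip days with no steps
    ]
    return mean(values) if values else None


print("travel days:    ", mean_calories_per_step(daily, True))
print("non-travel days:", mean_calories_per_step(daily, False))
```

A single ratio per day is about as far as daily totals can go: it says nothing about when in the day the calories were burned.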

While this information is “interesting,” unfortunately, data at the aggregated daily level is not actionable.  If I had more detailed or granular fitness data, I’d like to chart what happens to my heart rate (and related stress levels):

  • During an airplane flight
  • When racing through an airport to catch a connecting flight
  • Waking up very early in the morning while traveling
  • Immediately after eating a large meal
  • While I’m doing my taxes (I hate doing my taxes)

The problem is that the data provided by my fitness band is aggregated to a level that is not actionable.  If I had my fitness data at 5 or 10-minute intervals, then I could more easily spot unusual health outcomes and determine (and eventually predict?) what behaviors (e.g., flying in an airplane, eating large meals, heavy exercise exertion, waking up extremely early) might be causing health concerns.
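As a rough illustration of what "spotting unusual health outcomes" could look like once interval data is available, the sketch below flags readings that deviate sharply from a trailing window of recent readings. The rolling z-score rule, the window size, the threshold and the sample heart-rate values are all assumptions chosen for illustration; the original post does not prescribe a detection method.

```python
from collections import deque
from statistics import mean, pstdev


def flag_unusual(readings, window=12, threshold=3.0):
    """Return indices of readings far from the trailing-window mean.

    A reading is flagged when it differs from the mean of the previous
    `window` readings by more than `threshold` standard deviations.
    """
    history = deque(maxlen=window)  # only the most recent `window` readings
    flagged = []
    for i, value in enumerate(readings):
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                flagged.append(i)
        history.append(value)
    return flagged


# Hypothetical heart-rate readings at 5-minute intervals (made-up values).
heart_rate = [60, 62, 61, 63, 60, 62, 61, 63, 60, 62, 61, 63, 110, 62, 61]
print(flag_unusual(heart_rate))
```

With a window of 12 readings at 5-minute intervals, each reading is compared with roughly the previous hour. The flagged timestamps could then be lined up against a log of events such as flights or meals.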

Power of Granular Data
Big Data and data science are all about granular data because valuable performance and behavioral nuances can be buried in the aggregated data.  For example, the chart in Figure 3 shows how additional performance nuances are being uncovered as we transition from a 5-minute to a 1-minute and finally to a 5-second interval in the capture of the performance data.

Figure 3:  Performance Nuances Uncovered in Granular Data

As the data gets more granular, the behavioral and performance nuances buried in the data start to surface. Data at the 5-minute and 1-minute intervals in Figure 3 tells you very little. Aggregated data is the antithesis of data science. Data at the 5-second interval highlights some potential performance concerns.  In this example, data at the 5-second interval starts to become actionable.

For example, I might notice that my heart rate looks too sedentary whenever I sit too long on a cross-country flight, or that my stress level jumps whenever I get another “flight delayed” message while trying to catch a connecting flight. I might then learn to do some in-seat exercises and walk around during those long flights, or to practice controlled breathing and some simple yoga when enduring yet another flight delay (SFO airport does have a yoga room, and now I know why).
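The way aggregation buries short-lived events, as described in this section, can be sketched with synthetic numbers. The series below is invented (a flat signal with one 15-second spike) and the helper `downsample_mean` is a hypothetical name; it is not the data behind Figure 3.

```python
from statistics import mean


def downsample_mean(samples, factor):
    """Average consecutive groups of `factor` samples (last group may be shorter)."""
    return [mean(samples[i:i + factor]) for i in range(0, len(samples), factor)]


# Synthetic 5-second readings covering 5 minutes (60 samples).
five_second = [70.0] * 60
five_second[30:33] = [150.0, 160.0, 150.0]  # one 15-second spike

one_minute = downsample_mean(five_second, 12)   # 12 x 5 s = 1 minute
five_minute = downsample_mean(five_second, 60)  # 60 x 5 s = 5 minutes

print("peak at 5-second resolution:", max(five_second))
print("peak at 1-minute resolution:", round(max(one_minute), 1))
print("peak at 5-minute resolution:", round(max(five_minute), 1))
```

In this synthetic case the peak of 160 shrinks to about 91 in the 1-minute averages and to about 74 in the single 5-minute average, which is barely distinguishable from the baseline of 70.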

Preparing for an IoT World of Granular Data
Understanding the challenges of capturing and analyzing real-time granular machine and device-generated data will become even more critical as we move into the Internet of Things (IoT), where hundreds of sensors are kicking off tens, hundreds or even thousands of data points per minute.  This will force two specific challenges upon those of us coming from the more traditional human-generated big data world:

  • Real-time data capture and compression
  • Real-time analytics at the edge
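One simple way to picture capture and compression at the edge is a deadband filter: the device forwards a reading only when it has moved more than a set tolerance from the last reading it forwarded. This is a sketch of one common technique, chosen here for illustration; the post does not name a specific compression scheme, and the function name, tolerance and sample values below are hypothetical.

```python
def deadband_compress(samples, tolerance):
    """Keep a (timestamp, value) pair only when the value has moved more
    than `tolerance` away from the last value that was kept."""
    kept = []
    last_value = None
    for timestamp, value in samples:
        if last_value is None or abs(value - last_value) > tolerance:
            kept.append((timestamp, value))
            last_value = value
    return kept


# Hypothetical (seconds, heart rate) readings at 5-second intervals.
stream = [(0, 70), (5, 71), (10, 70), (15, 85), (20, 86), (25, 70)]
print(deadband_compress(stream, tolerance=5))
# → [(0, 70), (15, 85), (25, 70)]
```

The trade-off is that small fluctuations inside the tolerance band are discarded, so the tolerance has to be set with the downstream analysis in mind.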

For my fitness focus, I might need to expand my Personal Fitness Analysis to capture and analyze more of this detailed data in (near) real-time so that I can become aware of behaviors that are hurting or improving my health and fitness.  Ultimately, my goal is to change my behaviors, but I need to understand (and quantify?) what behaviors lead to desirable health and fitness outcomes (e.g., improved blood pressure, lower weight, less stress).

The post Aggregated Data Dilemma appeared first on InFocus Blog | Dell EMC Services.


More Stories By William Schmarzo

Bill Schmarzo, author of “Big Data: Understanding How Data Powers Big Business”, is responsible for setting the strategy and defining the Big Data service line offerings and capabilities for the EMC Global Services organization. As part of Bill’s CTO charter, he is responsible for working with organizations to help them identify where and how to start their big data journeys. He has written several white papers, is an avid blogger and is a frequent speaker on the use of Big Data and advanced analytics to power organizations’ key business initiatives. He also teaches the “Big Data MBA” at the University of San Francisco School of Management.

Bill has nearly three decades of experience in data warehousing, BI and analytics. Bill authored EMC’s Vision Workshop methodology that links an organization’s strategic business initiatives with their supporting data and analytic requirements, and co-authored with Ralph Kimball a series of articles on analytic applications. Bill has served on The Data Warehousing Institute’s faculty as the head of the analytic applications curriculum.

Previously, Bill was the Vice President of Advertiser Analytics at Yahoo and the Vice President of Analytic Applications at Business Objects.
