Click here to close now.

Welcome!

IoT User Interface Authors: Lori MacVittie, Liz McMillan, Sebastian Bos, Frank McCourt, Sematext Blog

Related Topics: @CloudExpo Blog, Java IoT, @MicroservicesE Blog, Open Source Cloud, Agile Computing, Apache

@CloudExpo Blog: Article

Big Data Security for Apache Hadoop

Ten Tips for HadoopWorld attendees

Big Data takes center stage today at the Strata Conference & Hadoop World in New York, the world’s largest gathering of the Apache Hadoop™ community. A key conversation topic will be how organizations can improve data security for Hadoop and the applications that run on the platform. As you know, Hadoop and similar data stores hold a lot of promise for organizations to finally gain some value out of the immense amount of data they're capturing. But HDFS, Hive and other nascent NoSQL technologies were not necessarily designed with comprehensive security in mind. Often what happens as big data projects grow is sensitive data like HIPAA data, PII and financial records get captured and stored. It's important this data remains secure at rest.

I polled my fellow co-workers at Gazzang last week, and asked them to come up with a top ten list for securing Apache Hadoop. Here's what they delivered. Enjoy:

Think about security before getting started – You don’t wait until after a burglary to put locks on your doors, and you should not wait until after a breach to secure your data. Make sure a serious data security discussion takes place before installing and feeding data into your Hadoop cluster.

Consider what data may get stored – If you are using Hadoop to store and run analytics against regulatory data, you likely need to comply with specific security requirements. If the stored data does not fall under regulatory jurisdiction, keep in mind the risks to your public reputation and potential loss of revenue if data such as personally identifiable information (PII) were breached.

Encrypt data at rest and in motion – Add transparent data encryption at the file layer as a first step toward enhancing the security of a big data project. SSL encryption can protect big data as it moves between nodes and applications.

As Securosis analyst Adrian Lane wrote in a recent blog, “File encryption addresses two attacker methods for circumventing normal application security controls. Encryption protects in case malicious users or administrators gain access to data nodes and directly inspect files, and it also renders stolen files or disk images unreadable. It is transparent to both Hadoop and calling applications and scales out as the cluster grows. This is a cost-effective way to address several data security threats.”

Store the keys away from the encrypted data – Storing encryption keys on the same server as the encrypted data is akin to locking your house and leaving the key in your front door. Instead, use a key management system that separates the key from the encrypted data.

Institute access controls – Establishing and enforcing policies that govern which people and processes can access data stored within Hadoop is essential for keeping rogue users and applications off your cluster.

Require multi-factor authentication - Multi-factor authentication can significantly reduce the likelihood of an account being compromised or access to Hadoop data being granted to an unauthorized party.

Use secure automation – Beyond data encryption, organizations should look to DevOps tools such as Chef or Puppet for automated patch and configuration management.

Frequently audit your environment – Project needs, data sets, cloud requirements and security risks are constantly changing. It’s important to make sure you are closely monitoring your Hadoop environment and performing frequent checks to ensure performance and security goals are being met.

Ask tough questions of your cloud provider – Be sure you know what your cloud provider is responsible for. Will they encrypt your data? Who will store and have access to your keys? How is your data retired when you no longer need it? How do they prevent data leakage?

Centralize accountability – Centralizing the accountability for data security ensures consistent policy enforcement and access control across diverse organizational silos and data sets.

Did we miss anything? If so, please comment below, and enjoy Strata +HadoopWorld.

More Stories By David Tishgart

After spending years at large corporations including Dell, AMD and BMC, David Tishgart joined the startup ranks leading product marketing for Gazzang. Focused on security for big data, he helps communicate the benefits and challenges that big data can present, offering practical solutions. When not ranting about encryption and key management, you can find David clamoring for a big data application that can fine tune his fantasy football team.

@CloudExpo Stories
Discussions about cloud computing are evolving into discussions about enterprise IT in general. As enterprises increasingly migrate toward their own unique clouds, new issues such as the use of containers and microservices emerge to keep things interesting. In this Power Panel at 16th Cloud Expo, moderated by Conference Chair Roger Strukhoff, panelists addressed the state of cloud computing today, and what enterprise IT professionals need to know about how the latest topics and trends affect t...
DevOps tends to focus on the relationship between Dev and Ops, putting an emphasis on the ops and application infrastructure. But that’s changing with microservices architectures. In her session at DevOps Summit, Lori MacVittie, Evangelist for F5 Networks, will focus on how microservices are changing the underlying architectures needed to scale, secure and deliver applications based on highly distributed (micro) services and why that means an expansion into “the network” for DevOps.
One of the hottest areas in cloud right now is DRaaS and related offerings. In his session at 16th Cloud Expo, Dale Levesque, Disaster Recovery Product Manager with Windstream's Cloud and Data Center Marketing team, will discuss the benefits of the cloud model, which far outweigh the traditional approach, and how enterprises need to ensure that their needs are properly being met.
Internet of Things (IoT) will be a hybrid ecosystem of diverse devices and sensors collaborating with operational and enterprise systems to create the next big application. In their session at @ThingsExpo, Bramh Gupta, founder and CEO of robomq.io, and Fred Yatzeck, principal architect leading product development at robomq.io, discussed how choosing the right middleware and integration strategy from the get-go will enable IoT solution developers to adapt and grow with the industry, while at th...
"We provide a web application framework for building really sophisticated web applications that run on a browser without any installation need so we get used for biotech, defense, and banking applications," noted Charles Kendrick, CTO and Chief Architect at Isomorphic Software, in this SYS-CON.tv interview at @DevOpsSummit (http://DevOpsSummit.SYS-CON.com), held June 9-11, 2015, at the Javits Center in New York
"Plutora provides release and testing environment capabilities to the enterprise," explained Dalibor Siroky, Director and Co-founder of Plutora, in this SYS-CON.tv interview at @DevOpsSummit, held June 9-11, 2015, at the Javits Center in New York City.
The time is ripe for high speed resilient software defined storage solutions with unlimited scalability. ISS has been working with the leading open source projects and developed a commercial high performance solution that is able to grow forever without performance limitations. In his session at Cloud Expo, Alex Gorbachev, President of Intelligent Systems Services Inc., shared foundation principles of Ceph architecture, as well as the design to deliver this storage to traditional SAN storage co...
Public Cloud IaaS started its life in the developer and startup communities and has grown rapidly to a $20B+ industry, but it still pales in comparison to how much is spent worldwide on IT: $3.6 trillion. In fact, there are 8.6 million data centers worldwide, the reality is many small and medium sized business have server closets and colocation footprints filled with servers and storage gear. While on-premise environment virtualization may have peaked at 75%, the Public Cloud has lagged in adop...
The Internet of Things is not only adding billions of sensors and billions of terabytes to the Internet. It is also forcing a fundamental change in the way we envision Information Technology. For the first time, more data is being created by devices at the edge of the Internet rather than from centralized systems. What does this mean for today's IT professional? In this Power Panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, panelists addressed this very serious issue of pro...
SYS-CON Events announced today that BMC will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. BMC delivers software solutions that help IT transform digital enterprises for the ultimate competitive business advantage. BMC has worked with thousands of leading companies to create and deliver powerful IT management services. From mainframe to cloud to mobile, BMC pairs high-speed digital innovation with robust...
Even as cloud and managed services grow increasingly central to business strategy and performance, challenges remain. The biggest sticking point for companies seeking to capitalize on the cloud is data security. Keeping data safe is an issue in any computing environment, and it has been a focus since the earliest days of the cloud revolution. Understandably so: a lot can go wrong when you allow valuable information to live outside the firewall. Recent revelations about government snooping, along...
There will be 150 billion connected devices by 2020. New digital businesses have already disrupted value chains across every industry. APIs are at the center of the digital business. You need to understand what assets you have that can be exposed digitally, what their digital value chain is, and how to create an effective business model around that value chain to compete in this economy. No enterprise can be complacent and not engage in the digital economy. Learn how to be the disruptor and not ...
Internet of Things is moving from being a hype to a reality. Experts estimate that internet connected cars will grow to 152 million, while over 100 million internet connected wireless light bulbs and lamps will be operational by 2020. These and many other intriguing statistics highlight the importance of Internet powered devices and how market penetration is going to multiply many times over in the next few years.
"We got started as search consultants. On the services side of the business we have help organizations save time and save money when they hit issues that everyone more or less hits when their data grows," noted Otis Gospodnetić, Founder of Sematext, in this SYS-CON.tv interview at @DevOpsSummit, held June 9-11, 2015, at the Javits Center in New York City.
It is one thing to build single industrial IoT applications, but what will it take to build the Smart Cities and truly society-changing applications of the future? The technology won’t be the problem, it will be the number of parties that need to work together and be aligned in their motivation to succeed. In his session at @ThingsExpo, Jason Mondanaro, Director, Product Management at Metanga, discussed how you can plan to cooperate, partner, and form lasting all-star teams to change the world...
Containers have changed the mind of IT in DevOps. They enable developers to work with dev, test, stage and production environments identically. Containers provide the right abstraction for microservices and many cloud platforms have integrated them into deployment pipelines. DevOps and Containers together help companies to achieve their business goals faster and more effectively. In his session at DevOps Summit, Ruslan Synytsky, CEO and Co-founder of Jelastic, reviewed the current landscape of...
Malicious agents are moving faster than the speed of business. Even more worrisome, most companies are relying on legacy approaches to security that are no longer capable of meeting current threats. In the modern cloud, threat diversity is rapidly expanding, necessitating more sophisticated security protocols than those used in the past or in desktop environments. Yet companies are falling for cloud security myths that were truths at one time but have evolved out of existence.
The Internet of Things is not only adding billions of sensors and billions of terabytes to the Internet. It is also forcing a fundamental change in the way we envision Information Technology. For the first time, more data is being created by devices at the edge of the Internet rather than from centralized systems. What does this mean for today's IT professional? In this Power Panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, panelists will addresses this very serious issue o...
SYS-CON Events announced today that JFrog, maker of Artifactory, the popular Binary Repository Manager, will exhibit at SYS-CON's @DevOpsSummit Silicon Valley, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Based in California, Israel and France, founded by longtime field-experts, JFrog, creator of Artifactory and Bintray, has provided the market with the first Binary Repository solution and a software distribution social platform.
In the midst of the widespread popularity and adoption of cloud computing, it seems like everything is being offered “as a Service” these days: Infrastructure? Check. Platform? You bet. Software? Absolutely. Toaster? It’s only a matter of time. With service providers positioning vastly differing offerings under a generic “cloud” umbrella, it’s all too easy to get confused about what’s actually being offered. In his session at 16th Cloud Expo, Kevin Hazard, Director of Digital Content for SoftL...