|By Greg Schulz||
|November 27, 2012 11:30 AM EST||
Recently while I was in Europe presenting some sessions at conferences and doing some seminars, I was invited by Ed Saipetch (@edsai) of Inktank.com to attend the first Ceph Day in Amsterdam.
As luck or fate would turn out, I was in Nijkerk which is about an hour train ride from Amsterdam central station plus a free day in my schedule. After a morning train ride and nice walk from Amsterdam Central I arrived at the Tobacco Theatre (a former tobacco trading venue) where Ceph Day was underway, and in time for lunch of Krokettens sandwich.
Let's take a quick step back and address for those not familiar what is Ceph (Cephalanthera) and why it was worth spending a day to attend this event. Ceph is an open source distributed object scale out (e.g. cluster or grid) software platform running on industry standard hardware.
Ceph is used for deploying object storage, cloud storage and managed services, general purpose storage for research, commercial, scientific, high performance computing (HPC) or high productivity computing (commercial) along with backup or data protection and archiving destinations. Other software similar in functionality or capabilities to Ceph include OpenStack Swift, Basho Riak CS, Cleversafe, Scality and Caringo among others. There are also the tin wrapped software (e.g. appliances or pre-packaged) solutions such as Dell DX (Caringo), DataDirect Networks (DDN) WOS, EMC ATMOS and Centera, Amplidata and HDS HCP among others. From a service standpoint, these solutions can be used to build services similar Amazon S3 and Glacier, Rackspace Cloud files and Cloud Block, DreamHost DreamObject and HP Cloud storage among others.
At the heart of Ceph is RADOS a distributed object store that consists of peer nodes functioning as object storage devices (OSD). Data can be accessed via REST (Amazon S3 like) APIs, Libraries, CEPHFS and gateway with information being spread across nodes and OSDs using a CRUSH based algorithm (note Sage Weil is one of the authors of CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data). Ceph is scalable in terms of performance, availability and capacity by adding extra nodes with hard disk drives (HDD) or solid state devices (SSDs). One of the presentations pertained to DreamHost that was an early adopter of Ceph to make their DreamObjects (cloud storage) offering.
In addition to storage nodes, there are also an odd number of monitor nodes to coordinate and manage the Ceph cluster along with optional gateways for file access. In the above figure (via DreamHost), load balancers sit in front of gateways that interact with the storage nodes. The storage node in this example is a physical server with 12 x 3TB HDDs each configured as a OSD.
In the DreamHost example above, there are 90 storage nodes plus 3 management nodes, the total raw storage capacity (no RAID) is about 3PB (12 x 3TB = 36TB x 90 = 3.24PB). Instead of using RAID or mirroring, each objects data is replicated or copied to three (e.g. N=3) different OSDs (on separate nodes), where N is adjustable for a given level of data protection, for a usable storage capacity of about 1PB.
Note that for more usable capacity and lower availability, N could be set lower, or a larger value of N would give more durability or data protection at higher storage capacity overhead cost. In addition to using JBOD configurations with replication, Ceph can also be configured with a combination of RAID and replication providing more flexibility for larger environments to balance performance, availability, capacity and economics.
One of the benefits of Ceph is the flexibility to configure it how you want or need for different applications. This can be in a cost-effective hardware light configuration using JBOD or internal HDDs in small form factor generally available servers, or high density servers and storage enclosures with optional RAID adapters along with SSD. This flexibility is different from some cloud and object storage systems or software tools which take a stance of not using or avoiding RAID vs. providing options and flexibility to configure and use the technology how you see fit.
Here are some links to presentations from Ceph Day:
Introduction and Welcome by Wido den Hollander
Ceph: A Unified Distributed Storage System by Sage Weil
Ceph in the Cloud by Wido den Hollander
DreamObjects: Cloud Object Storage with Ceph by Ross Turk
Cluster Design and Deployment by Greg Farnum
Notes on Librados by Sage Weil
While at Ceph day, I was able to spend a few minutes with Sage Weil Ceph creator and founder of inktank.com to record a pod cast (listen here) about what Ceph is, where and when to use it, along with other related topics. Also while at the event I had a chance to sit down with Curtis (aka Mr. Backup) Preston where we did a simulcast video and pod cast. The simulcast involved Curtis recording this video with me as a guest discussing Ceph, cloud and object storage, backup, data protection and related themes while I recorded this pod cast.
One of the interesting things I heard, or actually did not hear while at the Ceph Day event that I tend to hear at related conferences such as SNW is a focus on where and how to use, configure and deploy Ceph along with various configuration options, replication or copy modes as opposed to going off on erasure codes or other tangents. In other words, instead of focusing on the data protection protocol and algorithms, or what is wrong with the competition or other architectures, the Ceph Day focused was removing cloud and object storage objections and enablement.
Thanks again to Sage Weil for taking time out of his busy schedule to record a pod cast talking about Ceph, as well 42on.com and inktank for hosting, and the invitation to attend the first Ceph Day in Amsterdam.
Ok, nuff said.
All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO All Rights Reserved
You think you know what’s in your data. But do you? Most organizations are now aware of the business intelligence represented by their data. Data science stands to take this to a level you never thought of – literally. The techniques of data science, when used with the capabilities of Big Data technologies, can make connections you had not yet imagined, helping you discover new insights and ask new questions of your data. In his session at @ThingsExpo, Sarbjit Sarkaria, data science team lead ...
Apr. 29, 2016 09:15 PM EDT Reads: 251
The IoT has the potential to create a renaissance of manufacturing in the US and elsewhere. In his session at 18th Cloud Expo, Florent Solt, CTO and chief architect of Netvibes, will discuss how the expected exponential increase in the amount of data that will be processed, transported, stored, and accessed means there will be a huge demand for smart technologies to deliver it. Florent Solt is the CTO and chief architect of Netvibes. Prior to joining Netvibes in 2007, he co-founded Rift Technol...
Apr. 29, 2016 09:00 PM EDT Reads: 1,516
If there is anything we have learned by now, is that every business paves their own unique path for releasing software- every pipeline, implementation and practices are a bit different, and DevOps comes in all shapes and sizes. Software delivery practices are often comprised of set of several complementing (or even competing) methodologies – such as leveraging Agile, DevOps and even a mix of ITIL, to create the combination that’s most suitable for your organization and that maximize your busines...
Apr. 29, 2016 08:30 PM EDT Reads: 1,738
Struggling to keep up with increasing application demand? Learn how Platform as a Service (PaaS) can streamline application development processes and make resource management easy.
Apr. 29, 2016 07:45 PM EDT Reads: 1,999
New Relic, Inc. has announced a set of new features across the New Relic Software Analytics Cloud that offer IT operations teams increased visibility, and the ability to diagnose and resolve performance problems quickly. The new features further IT operations teams’ ability to leverage data and analytics, as well as drive collaboration and a common, shared understanding between teams. Software teams are under pressure to resolve performance issues quickly and improve availability, as the comple...
Apr. 29, 2016 07:30 PM EDT Reads: 2,397
The proper isolation of resources is essential for multi-tenant environments. The traditional approach to isolate resources is, however, rather heavyweight. In his session at 18th Cloud Expo, Igor Drobiazko, co-founder of elastic.io, will draw upon their own experience with operating a Docker container-based infrastructure on a large scale and present a lightweight solution for resource isolation using microservices. He will also discuss the implementation of microservices in data and applicat...
Apr. 29, 2016 05:15 PM EDT Reads: 1,684
Join IBM June 8 at 18th Cloud Expo at the Javits Center in New York City, NY, and learn how to innovate like a startup and scale for the enterprise. You need to deliver quality applications faster and cheaper, attract and retain customers with an engaging experience across devices, and seamlessly integrate your enterprise systems. And you can't take 12 months to do it.
Apr. 29, 2016 04:30 PM EDT Reads: 1,766
See storage differently! Storage performance problems have only gotten worse and harder to solve as applications have become largely virtualized and moved to a cloud-based infrastructure. Storage performance in a virtualized environment is not just about IOPS, it is about how well that potential performance is guaranteed to individual VMs for these apps as the number of VMs keep going up real time. In his session at 18th Cloud Expo, Dhiraj Sehgal, in product and marketing at Tintri, will discu...
Apr. 29, 2016 04:30 PM EDT Reads: 617
Machine Learning helps make complex systems more efficient. By applying advanced Machine Learning techniques such as Cognitive Fingerprinting, wind project operators can utilize these tools to learn from collected data, detect regular patterns, and optimize their own operations. In his session at 18th Cloud Expo, Stuart Gillen, Director of Business Development at SparkCognition, will discuss how research has demonstrated the value of Machine Learning in delivering next generation analytics to im...
Apr. 29, 2016 03:45 PM EDT Reads: 1,597
This is not a small hotel event. It is also not a big vendor party where politicians and entertainers are more important than real content. This is Cloud Expo, the world's longest-running conference and exhibition focused on Cloud Computing and all that it entails. If you want serious presentations and valuable insight about Cloud Computing for three straight days, then register now for Cloud Expo.
Apr. 29, 2016 03:30 PM EDT Reads: 1,659
As you respond to increasing requests for new analytics, you need fast and flexible technology in your arsenal so that you can deploy the right workload to the right platform for the need at hand. Do you need self-service and fast time to value? Do you have data and application control and privacy needs, along with strict SLAs to meet? IBM dashDB™ is data warehouse technology powered by in-memory computing and in-database analytics that are designed for fast results, scalability and more.
Apr. 29, 2016 03:15 PM EDT Reads: 1,535
SYS-CON Events announced today that SoftLayer, an IBM Company, has been named “Gold Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2016, at the Javits Center in New York, New York. SoftLayer, an IBM Company, provides cloud infrastructure as a service from a growing number of data centers and network points of presence around the world. SoftLayer’s customers range from Web startups to global enterprises.
Apr. 29, 2016 03:00 PM EDT Reads: 843
So, you bought into the current machine learning craze and went on to collect millions/billions of records from this promising new data source. Now, what do you do with them? Too often, the abundance of data quickly turns into an abundance of problems. How do you extract that "magic essence" from your data without falling into the common pitfalls? In her session at @ThingsExpo, Natalia Ponomareva, Software Engineer at Google, will provide tips on how to be successful in large scale machine lear...
Apr. 29, 2016 02:45 PM EDT Reads: 844
Up until last year, enterprises that were looking into cloud services usually undertook a long-term pilot with one of the large cloud providers, running test and dev workloads in the cloud. With cloud’s transition to mainstream adoption in 2015, and with enterprises migrating more and more workloads into the cloud and in between public and private environments, the single-provider approach must be revisited. In his session at 18th Cloud Expo, Yoav Mor, multi-cloud solution evangelist at Cloudy...
Apr. 29, 2016 02:30 PM EDT Reads: 1,402
The paradigm has shifted. A Gartner survey shows that 43% of organizations are using or plan to implement the Internet of Things in 2016. However, not just a handful of companies are still using the old-style ad-hoc trial-and-error ways, unaware of the critical barriers, paint points, traps, and hidden roadblocks. How can you become a winner? In his session at @ThingsExpo, Tony Shan will present a methodical approach to guide the holistic adoption and enablement of IoT implementations. This ov...
Apr. 29, 2016 02:00 PM EDT Reads: 1,541
In his session at 18th Cloud Expo, Sagi Brody, Chief Technology Officer at Webair Internet Development Inc., will focus on real world deployments of DDoS mitigation strategies in every layer of the network. He will give an overview of methods to prevent these attacks and best practices on how to provide protection in complex cloud platforms. He will also outline what we have found in our experience managing and running thousands of Linux and Unix managed service platforms and what specifically c...
Apr. 29, 2016 01:45 PM EDT Reads: 1,072
SYS-CON Events announced today that Peak 10, Inc., a national IT infrastructure and cloud services provider, will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. Peak 10 provides reliable, tailored data center and network services, cloud and managed services. Its solutions are designed to scale and adapt to customers’ changing business needs, enabling them to lower costs, improve performance and focus inter...
Apr. 29, 2016 01:30 PM EDT Reads: 808
Artificial Intelligence has the potential to massively disrupt IoT. In his session at 18th Cloud Expo, AJ Abdallat, CEO of Beyond AI, will discuss what the five main drivers are in Artificial Intelligence that could shape the future of the Internet of Things. AJ Abdallat is CEO of Beyond AI. He has over 20 years of management experience in the fields of artificial intelligence, sensors, instruments, devices and software for telecommunications, life sciences, environmental monitoring, process...
Apr. 29, 2016 01:30 PM EDT Reads: 783
SYS-CON Events announced today that Stratoscale, the software company developing the next generation data center operating system, will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. Stratoscale is revolutionizing the data center with a zero-to-cloud-in-minutes solution. With Stratoscale’s hardware-agnostic, Software Defined Data Center (SDDC) solution to store everything, run anything and scale everywhere...
Apr. 29, 2016 01:00 PM EDT Reads: 1,484
SYS-CON Events announced today that Ericsson has been named “Gold Sponsor” of SYS-CON's @ThingsExpo, which will take place on June 7-9, 2016, at the Javits Center in New York, New York. Ericsson is a world leader in the rapidly changing environment of communications technology – providing equipment, software and services to enable transformation through mobility. Some 40 percent of global mobile traffic runs through networks we have supplied. More than 1 billion subscribers around the world re...
Apr. 29, 2016 01:00 PM EDT Reads: 757