Apache Bigtop and Juju: a charming approach to big data

Jorge O. Castro

on 7 June 2016

Juju users have been enjoying our collection of Big Data charms for over two years. During this time, we’ve learned a lot about what our users want from this complex corner of big software.

The Apache Bigtop community distills best practices for installing big data software. They extensively test and package Apache big data projects to ensure users are able to manage their deployments. This community also provides a foundation for other projects and products to build from.

Juju charms distill the operational intelligence needed to run big software such as OpenStack, Kubernetes, and the Bigtop stack across clouds and architectures. Charms provide an open source, service-oriented approach to sharing how complex software should be modeled. By modeling Bigtop with charms, we can rapidly deploy and test complex solutions at scale and across clouds. These tests uncover hard problems, which prompts community collaboration, which in turn makes individual Bigtop components better. Bigtop charms allow operators and developers to use, test, and benchmark the Bigtop stack on their laptops, bare metal, or favorite cloud.

Every time a Bigtop charm gets better, Bigtop also gets better, and vice versa. Since Juju can repeatedly deploy charms on multiple clouds and architectures, it allows us to quickly identify issues in individual components as well as in their relationships to other components in the Bigtop stack.

Today we are thrilled to announce that the Bigtop charms, bundles, and test plans now live alongside the Bigtop source!

The source layers for the Apache Hadoop charms, along with instructions for building the charms from these layers, can now be found in the Apache Bigtop repository at: https://github.com/apache/bigtop/tree/master/bigtop-packages/src/charm

The Bigtop repository also includes the hadoop-processing bundle, which encapsulates how to deploy and relate these charms to get a fully-functioning, scalable Hadoop cluster in minutes: https://github.com/apache/bigtop/tree/master/bigtop-deploy/juju/hadoop-processing

Additionally, we included a test plan for the Cloud Weather Report project to run the Bigtop solutions across multiple clouds and report back testing and benchmarking results: https://github.com/apache/bigtop/tree/master/bigtop-tests/cloud-weather-report

Jump in and submit a pull request for any of those charms or bundles. If there is a specific workload that you would like to share with others, please mail the Bigtop Dev List. To see current ideas, take a look at our Apache Bigtop Charming Effort community wiki and/or chime in on the in-progress JIRA bugs: https://goo.gl/Hhda2M

If you already have Juju 2 installed, give the latest Bigtop Hadoop bundle a try with:

juju deploy hadoop-processing

If you’re new to Juju, follow the getting started instructions to start developing on these Bigtop solutions. If you need help with that, we’d be happy to walk you through the process on the Juju mailing list.

For those of you looking for some face-to-face training, join us at the next Juju Charmer Summit on September 12-14, 2016 in Pasadena, CA, for charm development, design, and best practices. It’s free for anyone to attend.

The Bigtop community is vibrant, collaborative, and friendly. We are excited to be a part of it and to help make Apache big data software better!
