In the early 2010s, Apache Hadoop captured the imagination of the tech community. A free and powerful open-source platform, it gave users a way to process unimaginably large quantities of data, and offered a dazzling variety of tooling to suit nearly every use case – MapReduce for odd jobs like processing text, audio or video; Hive for SQL-based data warehousing; Pig, an unusual language with a similar data warehousing goal; HBase, Oozie, Sqoop, Flume and a whole parade of other tools for processing massive datasets at scale.
With the surge in interest in Hadoop, a number of startups and established software companies rushed to provide commercial offerings of it. Cloudera was one of the first into the game with CDH, a Hadoop distribution. CDH was initially offered as Deb packages and RPMs, and Cloudera quickly followed it with Cloudera Manager, a sophisticated, web-based management system to deploy, maintain, and operate Hadoop clusters. With the introduction of Cloudera Manager,…