questions – Cloudera Engineering Blog

Apache Phoenix Joins Cloudera Labs

May 6, 2015, 1:23 pm

We are happy to announce the inclusion of Apache Phoenix in Cloudera Labs. Apache Phoenix is an efficient SQL skin for Apache HBase that has created a lot of buzz. Many companies are successfully using...

How-to: Get Started with CDH on OpenStack with Sahara

May 18, 2015, 9:03 am

The recent OpenStack Kilo release adds many features to the Sahara project, which provides a simple means of provisioning an Apache Hadoop (or Spark) cluster on top of OpenStack. This how-to, from...

Deploying Apache Kafka: A Practical FAQ

July 1, 2015, 9:21 am

This post contains answers to common questions about deploying and configuring Apache Kafka as part of a Cloudera-powered enterprise data hub. Cloudera added support for Apache Kafka, the open standard...

What’s New in Cloudera Director 1.5?

August 12, 2015, 1:13 pm

Cloudera Director 1.5 is now available; this post describes what’s inside, including a new open source plugin interface. Cloudera Director is the manifestation of Cloudera’s commitment to providing a...

Using Apache Spark for Massively Parallel NLP at TripAdvisor

August 24, 2015, 8:44 am

Thanks to Jeff Palmucci, Director of Machine Learning at TripAdvisor, for permission to republish the following (originally appeared in TripAdvisor’s Engineering/Operations blog). Here at TripAdvisor...

YCSB, the Open Standard for NoSQL Benchmarking, Joins Cloudera Labs

August 31, 2015, 8:21 am

YCSB, the open standard for comparative performance evaluation of data stores, is now available to CDH users for their Apache HBase deployments via new packages from Cloudera Labs. Many factors go into...

Untangling Apache Hadoop YARN, Part 1

September 4, 2015, 11:04 am

In this multipart series, fully explore the tangled ball of thread that is YARN. YARN (Yet Another Resource Negotiator) is the resource management layer for the Apache Hadoop ecosystem. YARN has been...

Meet Cloudera’s Apache Spark Committers

September 9, 2015, 6:01 am

The super-active Apache Spark community is exerting a strong gravitational pull within the Apache Hadoop ecosystem. I recently had the opportunity to ask Cloudera’s Apache Spark committers (Sean Owen,...

How-to: Prepare Unstructured Data in Impala for Analysis

September 17, 2015, 8:27 am

Learn how to build an Impala table around data that comes from non-Impala, or even non-SQL, sources. As data pipelines start to include more aspects such as NoSQL or loosely specified schemas, you...

How-to: Use Apache Solr to Query Indexed Data for Analytics

October 14, 2015, 8:00 am

Bet you didn’t know this: In some cases, Solr offers lightning-fast response times for business-style queries. If you were to ask well-informed technical people about use cases for Solr, the most...

New in Cloudera Enterprise 5.5: Analytics for Metadata Management

November 23, 2015, 8:24 am

Starting in Cloudera Enterprise 5.5, Cloudera Navigator offers interactive visual analytics that help answer important questions about the data that’s in your CDH clusters. The new analytics system in...

Progress Report: Hive-on-Spark Nears Production Readiness

December 2, 2015, 7:55 am

Contributors from Intel, Cloudera, and the rest of the community have been making strong progress on the Hive-on-Spark initiative. This post provides an update. Since its inception about one year ago,...

Apache Phoenix Joins Cloudera Labs

May 6, 2015, 1:23 pm

We are happy to announce the inclusion of Apache Phoenix in Cloudera Labs. [Update: A new package for Apache Phoenix 4.5.2 on CDH 5.4.x was released on Nov. 19, 2015.] Apache Phoenix is an efficient...

The New Hadoop Application Architectures Book is Here!

July 15, 2014, 8:50 am

There’s an important new addition coming to the Apache Hadoop book ecosystem. It’s now in early release! We are very happy to announce that the new Apache Hadoop book we have been writing for O’Reilly...

New in CDH 5.1: Document-level Security for Cloudera Search

July 23, 2014, 8:24 am

Cloudera Search now supports fine-grain access control via document-level security provided by Apache Sentry. In my previous blog post, you learned about index-level security in Apache Sentry...

Running CDH 5 on GlusterFS 3.3

August 18, 2014, 9:45 am

The following post was written by Jay Vyas (@jayunit100) and originally published in the Gluster.org Community. I have recently spent some time getting Cloudera’s CDH 5 distribution of Apache Hadoop to...

Apache Kafka for Beginners

September 12, 2014, 11:10 am

When used in the right way and for the right use case, Kafka has unique attributes that make it a highly attractive option for data integration. Apache Kafka is creating a lot of buzz these days. While...

Secrets of Cloudera Support: Using OpenStack to Shorten Time-to-Resolution

September 24, 2014, 8:39 am

Automating the creation of short-lived clusters for testing purposes frees our support engineers to spend more time on customer issues. The first step for any support engineer is often to replicate the...

NoSQL in a Hadoop World

November 5, 2014, 8:51 am

The number of powerful data query tools in the Apache Hadoop ecosystem can be confusing, but understanding a few simple things about your needs usually makes the choice easy. Ah, the good old days. I...
