Channel: questions – Cloudera Engineering Blog
Browsing all 25 articles


Apache Phoenix Joins Cloudera Labs

We are happy to announce the inclusion of Apache Phoenix in Cloudera Labs. Apache Phoenix is an efficient SQL skin for Apache HBase that has created a lot of buzz. Many companies are successfully using...




How-to: Get Started with CDH on OpenStack with Sahara

The recent OpenStack Kilo release adds many features to the Sahara project, which provides a simple means of provisioning an Apache Hadoop (or Spark) cluster on top of OpenStack. This how-to, from...


Deploying Apache Kafka: A Practical FAQ

This post contains answers to common questions about deploying and configuring Apache Kafka as part of a Cloudera-powered enterprise data hub. Cloudera added support for Apache Kafka, the open standard...


What’s New in Cloudera Director 1.5?

Cloudera Director 1.5 is now available; this post describes what’s inside, including a new open source plugin interface. Cloudera Director is the manifestation of Cloudera’s commitment to providing a...


Using Apache Spark for Massively Parallel NLP at TripAdvisor

Thanks to Jeff Palmucci, Director of Machine Learning at TripAdvisor, for permission to republish the following (originally appeared in TripAdvisor’s Engineering/Operations blog). Here at TripAdvisor...



YCSB, the Open Standard for NoSQL Benchmarking, Joins Cloudera Labs

YCSB, the open standard for comparative performance evaluation of data stores, is now available to CDH users for their Apache HBase deployments via new packages from Cloudera Labs. Many factors go into...


Untangling Apache Hadoop YARN, Part 1

In this multipart series, fully explore the tangled ball of thread that is YARN. YARN (Yet Another Resource Negotiator) is the resource management layer for the Apache Hadoop ecosystem. YARN has been...


Meet Cloudera’s Apache Spark Committers

The super-active Apache Spark community is exerting a strong gravitational pull within the Apache Hadoop ecosystem. I recently had the opportunity to ask Cloudera’s Apache Spark committers (Sean Owen,...



How-to: Prepare Unstructured Data in Impala for Analysis

Learn how to build an Impala table around data that comes from non-Impala, or even non-SQL, sources. As data pipelines start to include more aspects such as NoSQL or loosely specified schemas, you...



How-to: Use Apache Solr to Query Indexed Data for Analytics

Bet you didn’t know this: In some cases, Solr offers lightning-fast response times for business-style queries. If you were to ask well-informed technical people about use cases for Solr, the most...


New in Cloudera Enterprise 5.5: Analytics for Metadata Management

Starting in Cloudera Enterprise 5.5, Cloudera Navigator offers interactive visual analytics that help answer important questions about the data that’s in your CDH clusters. The new analytics system in...


Progress Report: Hive-on-Spark Nears Production Readiness

Contributors from Intel, Cloudera, and the rest of the community have been making strong progress on the Hive-on-Spark initiative. This post provides an update. Since its inception about one year ago,...



Apache Phoenix Joins Cloudera Labs

We are happy to announce the inclusion of Apache Phoenix in Cloudera Labs. [Update: A new package for Apache Phoenix 4.5.2 on CDH 5.4.x was released on Nov. 19, 2015.] Apache Phoenix is an efficient...



The New Hadoop Application Architectures Book is Here!

There’s an important new addition coming to the Apache Hadoop book ecosystem. It’s now in early release! We are very happy to announce that the new Apache Hadoop book we have been writing for O’Reilly...



New in CDH 5.1: Document-level Security for Cloudera Search

Cloudera Search now supports fine-grained access control via document-level security provided by Apache Sentry. In my previous blog post, you learned about index-level security in Apache Sentry...


Running CDH 5 on GlusterFS 3.3

The following post was written by Jay Vyas (@jayunit100) and originally published in the Gluster.org Community. I have recently spent some time getting Cloudera’s CDH 5 distribution of Apache Hadoop to...



Apache Kafka for Beginners

When used in the right way and for the right use case, Kafka has unique attributes that make it a highly attractive option for data integration. Apache Kafka is creating a lot of buzz these days. While...


Secrets of Cloudera Support: Using OpenStack to Shorten Time-to-Resolution

Automating the creation of short-lived clusters for testing purposes frees our support engineers to spend more time on customer issues. The first step for any support engineer is often to replicate the...


NoSQL in a Hadoop World

The number of powerful data query tools in the Apache Hadoop ecosystem can be confusing, but understanding a few simple things about your needs usually makes the choice easy. Ah, the good old days. I...
