Category Archives: hardware

IBM PureApplications for Hybrid IaaS Cloud

IBM PureApplications provides an on-premises cloud; #PureApp for SoftLayer provides off-premises cloud solutions. ibm.co/TNzV8m @Prolifics

The video includes a clip from my manager at @Prolifics, Mike Hastie.

Running Hadoop on VMware

Normally we’d like to think of Hadoop running on hundreds of racks of commodity hardware, but that doesn’t mean that we should forget all of the reasons why we love virtualization.

This case study explains the how and why, and provides benchmarks from an experiment running Hadoop on VMware. Of course the experiment was successful; the study was published by VMware.

The moral of the story is that just because Hadoop can run on commodity hardware doesn’t mean that it has to, or that commodity hardware is the best way to deploy it.

Hard to believe, but here’s how to install Hadoop on a Raspberry Pi

I haven’t tried this myself (I don’t have a Raspberry Pi, only an Arduino), and even if the install works I’m not sure what the runtime could accomplish, but this guy has published a short list of instructions for installing Hadoop on a Raspberry Pi.

Intel distribution of Hadoop optimized for its own hardware

Hadoop is generally assumed to run on clusters of generic commodity hardware. Intel has just released a customized, optimized distribution that it claims is up to 30x faster when run on the Xeon E7 v2 family of processors, which is hardly generic or commodity.

Don’t run Hadoop on a SAN

By definition, a SAN is about consolidating data and Hadoop is about distributing data. Can they co-exist? Not according to this article.

If you take data off a Hadoop node’s local disks and put it on a SAN, you give up data locality and reduce performance. You want data to reach the CPU at local bus speed, not network speed. And a heavy Hadoop load could saturate the network between the nodes and the SAN.
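
To put rough numbers behind that, here’s a small back-of-the-envelope sketch in Python comparing the aggregate bandwidth of local disks across a cluster with a shared SAN front end. Every figure in it (node count, disks per node, per-disk throughput, SAN port count and speed) is an illustrative assumption, not data from the article.

# Back-of-the-envelope: aggregate local-disk bandwidth vs a shared SAN front end.
# All figures below are round, illustrative assumptions, not benchmarks.

nodes = 20
disks_per_node = 12            # JBOD local disks per data node
disk_mb_per_s = 100            # rough sequential throughput per spindle

san_fc_ports = 4               # front-end ports on the SAN controller
fc_gbit_per_port = 8           # 8 Gbit Fibre Channel per port

local_aggregate = nodes * disks_per_node * disk_mb_per_s        # MB/s across the cluster
san_aggregate = san_fc_ports * fc_gbit_per_port * 1000 / 8      # MB/s, shared by all nodes

print(f"Local disks, whole cluster: {local_aggregate:,.0f} MB/s")
print(f"SAN front end, shared:      {san_aggregate:,.0f} MB/s")
print(f"Local advantage:            {local_aggregate / san_aggregate:.0f}x")

With those assumed numbers the local disks win by roughly 6x, and the gap only widens as you add nodes, because every node adds local spindles while the SAN front end stays fixed.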

Heterogeneous hardware

Nodes within a cluster do NOT need to be identical in CPU, RAM, or disk storage. The scheduler takes each node’s capacity into account. For a prototype, or the early release of an Agile system, you can just toss in whatever you have and then evolve toward an environment that’s easier to support. The key is that Hadoop couldn’t care less.
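
You can see this for yourself on a YARN cluster: the ResourceManager reports whatever capacity each NodeManager registered with, so a mixed-hardware cluster simply shows up as a mix of memory and vcore figures. A minimal Python sketch against the ResourceManager REST API; the hostname/port is a placeholder for your own ResourceManager, and the exact JSON field names can vary a bit between Hadoop versions.

# List each node's advertised capacity via the YARN ResourceManager REST API.
# RM_URL is a hypothetical address; substitute your own ResourceManager.
import requests

RM_URL = "http://resourcemanager.example.com:8088"

resp = requests.get(f"{RM_URL}/ws/v1/cluster/nodes", timeout=10)
resp.raise_for_status()

node_list = (resp.json().get("nodes") or {}).get("node") or []
for n in node_list:
    # Field names may differ slightly across Hadoop versions, hence .get()
    print(
        n.get("nodeHostName", n.get("id")),
        "state:", n.get("state"),
        "avail memory MB:", n.get("availMemoryMB"),
        "avail vcores:", n.get("availableVirtualCores"),
    )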

HP Hardware for Hadoop

We generally think of hardware for Hadoop as generic and commodity, but HP claims to offer specialized (fastest, pre-optimized, pre-tested) hardware for its own reference architecture. How can it be pre-optimized when they have no knowledge of the target dataset?
