Tag Archives: infoworld.com

3 types of clueless Big Data “experts”

I just read a very funny (and informative) article on InfoWorld about clueless “cloud experts”. It translates easily to any tech vertical, and it made me recall so many examples of people who don’t understand Big Data.

  1. I built Big Data applications years ago.
    I have a good friend (who I hope never reads this) who insists that he built a Big Data application in 1992 using Apple HyperCard, with both the executable and the data distributed on a single CD-ROM. Of course, that was “a lot” of data in 1992. So one question, if we want to be pedantic: if you don’t use Hadoop, can it be a Big Data application?
  2. Big Data has no privacy. Isn’t that what the NSA proved?
    This misconception is the exact opposite of the truth. The NSA uses Accumulo, a very secure key-value store that runs on top of Hadoop and enforces cell-level access controls, and siphons data from all sorts of systems all over the planet. Sure, it probably pulls from some Hadoop systems, but for the NSA to get so much data, doesn’t it make sense that the vast majority must be coming from ordinary non-Hadoop systems?
  3. Big Data is the answer for everything.
    I know a guy who suggested using Hadoop (running the Teradata distribution, no less!) to store data feeds that we weren’t ready to run ETL on yet. Wouldn’t a simple fileshare be a lot easier?

Source:

 

Don’t run Hadoop on a SAN

By definition, a SAN is about consolidating data and Hadoop is about distributing data. Can they co-exist? Not according to this article.

If you take data out of a Hadoop node and put it on a SAN, you’re reducing performance. Hadoop schedules work on the nodes that already hold the data, so you want that data to reach the CPU at bus speed, not network speed. A heavy Hadoop load could also saturate your network.
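To put rough numbers on that argument, here is a back-of-envelope sketch in Python. Every figure in it (cluster size, disk throughput, SAN link speed, data volume) is an assumption I made up for illustration, not a measurement from the article or from any real cluster, and `scan_seconds` is just a hypothetical helper name.

```python
def scan_seconds(total_gb, nodes, per_node_mb_s, shared_link_mb_s=None):
    """Estimate the seconds needed to read total_gb once across the cluster.

    Each node can read at per_node_mb_s. If shared_link_mb_s is given,
    all reads funnel through one shared link, which caps the aggregate rate.
    """
    aggregate_mb_s = nodes * per_node_mb_s
    if shared_link_mb_s is not None:
        aggregate_mb_s = min(aggregate_mb_s, shared_link_mb_s)
    return total_gb * 1000 / aggregate_mb_s


# All values below are illustrative assumptions.
TOTAL_GB = 10_000        # 10 TB to scan
NODES = 20               # worker nodes
LOCAL_DISK_MB_S = 500    # assumed local disk throughput per node
SAN_LINK_MB_S = 1_250    # assumed shared link, roughly 10 Gbit/s

local = scan_seconds(TOTAL_GB, NODES, LOCAL_DISK_MB_S)
san = scan_seconds(TOTAL_GB, NODES, LOCAL_DISK_MB_S, SAN_LINK_MB_S)

print(f"local disks: {local:.0f} s")
print(f"shared SAN link: {san:.0f} s")
```

With these made-up numbers the local disks deliver 20 × 500 MB/s in aggregate, while the shared link tops out at 1,250 MB/s, so the same scan takes about eight times longer through the SAN. Real SANs have multiple paths and caches, so treat this as the shape of the problem rather than a benchmark.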

source: