Strata + Hadoop World 2014 Conference


In October I attended the two-day Strata + Hadoop World 2014 conference for the first time. It was held at New York City’s Javits Center and organized by Cloudera and O’Reilly Media. There were more than 5,000 attendees from industries including business, finance, media, fashion, retail, and government. The topics covered included business and industry, data science, Hadoop, and security.

Each of the two days began with a 2-hour plenary session where prominent business people from large companies and startups presented.

My favorites:

Rana el Kaliouby, co-founder and CSO of Affectiva, had an enlightening plenary session on human emotions, measurement and marketing. Her company’s database includes 2 million facial images from 80+ countries. These images are used to evaluate human emotions. Since human emotions predict human behavior, this is extremely beneficial in marketing and other fields.

Julia Angwin of ProPublica presented on big data and privacy. Her takeaway: privacy may ultimately be impossible, so let’s move to getting assurances that our personal data will be protected.



I spoke to the representative at the SAS booth and mentioned that I was using SAS less frequently (since the license is so expensive) and was instead using open source software such as Python and R. He informed me, to my great astonishment, that SAS is now offering a free download.

Revolution R

I also learned of Revolution R, an enhanced version of R. Its enhancements include multi-core processing and help with managing R package versions so that results can be easily replicated. It is available as a free download. The company also offers Revolution R Plus, which is available by annual subscription and includes technical support and training. I was going to download and explore this software, but I recently upgraded my Mac to Yosemite, and the software has so far been tested only on Mavericks.


Recently, a former classmate posted on a community page about his need for software that could match similar names in text. My initial recommendation was to write custom regular expressions. At this conference, however, I learned of Tamr, which does data curation, and I sent him the information. Hopefully, he will find it helpful.

Special thanks to NYC PyLadies and Strata Hadoop for sponsoring me.



