It’s an absolute myth that you can send an algorithm over raw data and have insights pop up. … the predicament of data wrangling [is] big data’s “iceberg” issue, meaning attention is focused on the result that is seen rather than all the unseen toil beneath.
“For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights” via NYTimes
A great article on the inherent difficulties of dealing with unstructured data. “Data wrangling” is just as important as the actual magic delivered by “data science.”