At the outset I’d like to wish you a Merry Christmas and a Happy New Year.
You could well call 2010 the year of high definition and 3D. We saw the launch of HD satellite TV, HD cameras, 3D movies and TVs, and now we even have HD contact lenses! How about high-definition data?
While technology vendors are trying to squeeze more pixels into displays, I am suggesting a refinement of bits. There is a stream of data flowing through enterprise networks and settling in storage repositories. You can classify all of it as either structured or unstructured. Structured data is organized in rows, columns and tables in databases, so it’s easy to sift through it and search for something meaningful. But what about unstructured data, the type that does not have a data model?
According to Merrill Lynch, as much as 80 percent of business data is in unstructured form. A lot of this is user-generated content such as e-mail, audio and video clips, and the documents that users exchange: PDFs, PPTs and DOC files.
There is a lot of meaningful information in these files. But how do you organize, classify and tag it? That’s what I mean by ‘high-definition data’: getting more meaningful information out of the large morass of data that has accumulated in storage pools.
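To make the idea of tagging concrete, here is a minimal sketch in Python of keyword-based tagging for plain text such as an e-mail body. It is only an illustration and not how any commercial product works: the tag names, the keyword lists and the `tag_document` function are all hypothetical, and real tools rely on far richer text analytics.

```python
import re
from collections import Counter

# Hypothetical tag vocabulary: each tag maps to keywords that suggest it.
TAG_KEYWORDS = {
    "finance": {"invoice", "budget", "payment", "revenue"},
    "hr": {"leave", "salary", "recruitment", "appraisal"},
    "sales": {"customer", "order", "quote", "discount"},
}


def tag_document(text, min_hits=1):
    """Return the tags whose keywords occur at least min_hits times in text."""
    # Lower-case the text and count every alphabetic word.
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    tags = []
    for tag, keywords in TAG_KEYWORDS.items():
        # Total occurrences of this tag's keywords in the document.
        hits = sum(words[k] for k in keywords)
        if hits >= min_hits:
            tags.append(tag)
    return sorted(tags)


email = ("Please approve the budget so we can process the customer order "
         "and send the invoice.")
print(tag_document(email))  # prints ['finance', 'sales']
```

Once each file carries tags like these, the tags themselves form a small structured index that can be searched and counted like any other table.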
There are commercial solutions available for analyzing and understanding unstructured data for business applications. Companies like Inxight and SPSS offer such tools, and there are more specialized offerings such as Attensity360 and Sysomos.
Are you using these tools? How accurate and effective are they? Do they actually help you mine useful data or see patterns and trends?
Write to me and let me know.
Meanwhile, enjoy your end-of-year vacations and don’t think too much about work!