Text mining, or text analytics, is the process of deriving high-quality information from text. With the humongous amounts of content being generated across the globe, it is easy to see why analyzing this content and deriving relevant insights from it can be such a game changer. By all indications, text analytics is growing rapidly. According to a press release by Allied Market Research, the global text analytics market has the potential to reach $6.5 billion by 2020, registering a CAGR of 25.2% during 2014-2020.


But analyzing text is hard, especially if you try to automate it. It’s not just about counting a bunch of keywords and crunching the data; understanding context and social mores is also important. Humor, sarcasm, and colloquial slang are tough enough for humans to understand, let alone a computer!


Typically, text mining includes tasks such as text categorization, text clustering, concept/entity extraction, production of granular taxonomies, sentiment analysis, document summarization, and entity relation modeling (i.e., learning relations between named entities).


As described in this KMWorld article, text analytics has gone through several distinct generations in the course of its evolution. It has become more sophisticated, moving from keyword-based analysis to clustering of concepts and sentiment analysis. The next advance will be towards making things easier for users through greater automation.


With its ability to provide insights from unstructured data, text analytics has extensive applications in predictive analytics.



We’re helping one of our customers, an ISV, use text analytics to prioritize product features. For example, the company can mine user reviews of its new mobile phone from social media and various review sites, and use that data to understand user sentiment and classify user feedback. This can help the company determine the popularity of various features of the phone, for instance how many people liked the camera or how many are using the new stylus, and those insights can be used to improve the product.
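To make the review-mining idea concrete, here is a minimal sketch in Python of counting feature mentions and attaching a rough sentiment to each. Everything in it is an assumption for illustration: the feature keywords, the positive and negative word lists, the sample reviews, and the function names `tokenize` and `summarize_feedback` are all hypothetical, and a real project would more likely use a trained sentiment model than hand-written word lists.

```python
import re
from collections import defaultdict

# Hypothetical feature keywords and sentiment word lists (illustrative only).
FEATURES = {
    "camera": {"camera", "photos"},
    "stylus": {"stylus", "pen"},
    "battery": {"battery"},
}
POSITIVE = {"love", "great", "good", "excellent", "sharp"}
NEGATIVE = {"bad", "poor", "terrible", "laggy", "drains"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def summarize_feedback(reviews):
    """Count, per feature, how many sentences mention it and how many of
    those sentences lean positive or negative."""
    summary = defaultdict(lambda: {"mentions": 0, "positive": 0, "negative": 0})
    for review in reviews:
        # Work sentence by sentence so sentiment words are only attributed
        # to features named in the same sentence.
        for sentence in re.split(r"[.!?]+", review):
            tokens = set(tokenize(sentence))
            if not tokens:
                continue
            pos = len(tokens & POSITIVE)
            neg = len(tokens & NEGATIVE)
            for feature, keywords in FEATURES.items():
                if tokens & keywords:
                    stats = summary[feature]
                    stats["mentions"] += 1
                    if pos > neg:
                        stats["positive"] += 1
                    elif neg > pos:
                        stats["negative"] += 1
                    # A tie (mixed or neutral sentence) counts as a mention only.
    return dict(summary)


# Made-up sample reviews, not real customer data.
reviews = [
    "I love the camera. The battery drains fast.",
    "The stylus is great for notes, but the camera is poor in low light.",
    "Great camera!",
]

result = summarize_feedback(reviews)
for feature in sorted(result):
    print(feature, result[feature])
```

Note how the second review, which praises the stylus and criticizes the camera in one sentence, ends up counted as a mention with no sentiment for either feature. That is exactly the kind of case where simple keyword counting falls short and context-aware analysis is needed.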


The insurance industry is another great candidate for using text analytics to predict consumer behavior. Customer experience is a key element for success in the insurance sector, since there is very little product differentiation. Using text analytics, insurance providers can study customer interactions by specific product or service, as well as by the marketing channels used, the operations employed, and so on. In addition, automatic opinion and sentiment analysis techniques make it possible to identify the sentiment relating to any specific aspect of a product, channel or procedure. Similarly, by analyzing social media networks, insurers can identify trends in the sector and perceptions of competitors that can guide the strategic direction of their business.


Currently, effective text analytics requires a considerable amount of human intervention, and this is not likely to change in the near future. Evans Data Corporation’s latest survey-driven communiqué suggests that, despite all the buzz about automation and Artificial Intelligence, 97.4% of computers still need a human touch in order to function.


Cloud-based managed analytics services such as Cortana Analytics (from Microsoft) and Watson Analytics (from IBM) provide libraries for text mining, and as they evolve, text mining will reach even small companies that cannot afford it now.


As per the KMWorld article, greater automation can only be achieved if the text analytics solution becomes more intelligent, gains the ability to make inferences, and provides or seeks out knowledge that is relevant to users.


But what does the future hold for text analytics? We’ll just have to wait and see.