BI Encyclopedia

Text Mining

What is Text Mining?

Text mining is the process of deriving insights from text. These insights are typically obtained by identifying patterns and trends within the text, using methods such as statistical pattern learning. The process usually involves structuring the input text, deriving patterns from the structured data, and finally evaluating and interpreting the output.

The goal of text mining is essentially to turn text into data for analysis by applying natural language processing (NLP) and analytical methods. To accomplish this, text mining draws on information retrieval, lexical analysis to study word frequency distributions, pattern recognition, tagging and annotation, information extraction, data mining techniques, visualization, and predictive analytics.
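As a rough illustration of the lexical analysis mentioned above, the following Python sketch counts how often each word appears in a short passage. The function name `word_frequencies`, the sample sentence, and the simple tokenization rule (runs of lowercase letters and apostrophes) are assumptions made for this example; real projects usually rely on a dedicated NLP tokenizer.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Lowercase the text, split it into word tokens, and count each token."""
    # Assumption: a "word" is any run of letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Hypothetical sample text, used only for illustration.
sample = "Text mining turns text into data. Mining text reveals patterns in text."
freqs = word_frequencies(sample)

# The two most frequent tokens and their counts.
print(freqs.most_common(2))  # [('text', 4), ('mining', 2)]
```

The resulting counts are a simple example of structured data derived from unstructured text, and they can feed later steps such as visualization or predictive modeling.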

Some subtasks of text mining include:

Information retrieval or identification: collecting or identifying the set of texts to analyze

Recognition of pattern-identified entities: features such as telephone numbers, e-mail addresses, and quantities

Relationship, fact, and event extraction: identifying associations among entities and other information in text

Sentiment analysis: discerning subjective material, such as opinions and attitudes, in text

Quantitative text analysis
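
To make one of these subtasks concrete, the sketch below shows how pattern-identified entities such as e-mail addresses and telephone numbers might be recognized with regular expressions in Python. The patterns are deliberately simplified assumptions (for instance, only phone numbers written as 555-123-4567 are matched), and the names `EMAIL_RE`, `PHONE_RE`, and `extract_entities` are hypothetical, chosen for this example.

```python
import re

# Simplified patterns, assumed for illustration; production systems need
# broader rules to cover international formats and edge cases.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def extract_entities(text):
    """Return the e-mail addresses and phone numbers found in the text."""
    return {
        "emails": EMAIL_RE.findall(text),
        "phones": PHONE_RE.findall(text),
    }


# Hypothetical sample sentence.
sample = "Contact jane.doe@example.com or call 555-123-4567 before Friday."
entities = extract_entities(sample)
print(entities)
# {'emails': ['jane.doe@example.com'], 'phones': ['555-123-4567']}
```

Entities extracted this way can then serve as inputs to relationship, fact, and event extraction, which looks for associations among them.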