Deriving Marketing Intelligence from Online Discussion
Natalie Glance
Matthew Hurst
Kamal Nigam
Matthew Siegler
Robert Stockton
Takashi Tomokiyo

Intelliseek Applied Research Center
Pittsburgh, PA 15217

Weblogs and message boards provide online forums for discussion that record the voice of the public. Woven into this mass of discussion is a wide range of opinion and commentary about consumer products. This presents an opportunity for companies to understand and respond to the consumer by analyzing this unsolicited feedback. Given the volume, format and content of the data, the appropriate approach to understanding it is to use large-scale web and text data mining technologies. This paper argues that applications for mining large volumes of textual data for marketing intelligence should provide two key elements: a suite of powerful mining and visualization technologies, and an interactive analysis environment that allows for rapid generation and testing of hypotheses. This paper presents such a system, which gathers and annotates online discussion relating to consumer products using a wide variety of state-of-the-art techniques, including crawling, wrapping, search, text classification and computational linguistics. Marketing intelligence is derived through an interactive analysis framework uniquely configured to leverage the connectivity and content of annotated online discussion.

Categories and Subject Descriptors: H.3.3: Information Search and Retrieval
General Terms: Algorithms, Experimentation
Keywords: text mining, content systems, computational linguistics, machine learning, information retrieval

from online public communications. For example, there are message boards devoted to a specific gaming platform, newsgroups centered around a particular make and model of motorcycle, and weblogs devoted to a new drug on the market. Both the consumer and the corporation can benefit if online consumer sentiment is attended to: the consumer has a voice to which the corporation can respond, both on the personal level and on the product design level.

This paper describes an end-to-end commercial system that is used to support a number of marketing intelligence and business intelligence applications. In short, we describe a mature system which leverages online data to help make informed and timely decisions with respect to brands, products and strategies in the corporate space. This system processes online content for entities interested in tracking the opinion of the online public (often as a proxy for the general public). Applications of this data include:

• Early alerting - informing subscribers when a rare but critical, or even fatal, condition occurs.
• Buzz tracking - following trends in topics of discussion and understanding what new topics are forming.
• Sentiment mining - extracting aggregate measures of positive vs. negative opinion.

Early implementations of these applications in the industry were enabled by sample-and-analyze systems, in which a human analyst read a tiny fraction of the available data and made observations and recommendations. Because these approaches cannot handle realistically sized data sets, modern approaches are built on technology solutions which use comprehensive crawling, text mining, classification and other data-driven methods to describe the opinion reported in online data.

Other systems described in the research literature have also focused on aggregating knowledge from the web. The WebKB project [9] was an early effort to automatically extract factual information about computer science research departments, people, and research projects using departmental web sites. Their emphasis was on the application of machine learning techniques to the extraction of data and facts, without emphasis placed on the access...
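As a rough illustration of the buzz tracking and sentiment mining applications listed earlier, the sketch below counts posts per topic per day and computes the share of positive opinion per topic. It is not the system described in this paper: the sample posts, the topic names, the polarity labels and the function names `buzz_by_day` and `sentiment_share` are all hypothetical, and the labels are assumed to come from some upstream classifier.

```python
from collections import Counter, defaultdict

# Hypothetical sample of already-annotated posts: (day, topic, polarity).
# In a real pipeline these labels would come from upstream classifiers.
POSTS = [
    ("2005-01-01", "battery", "negative"),
    ("2005-01-01", "screen", "positive"),
    ("2005-01-02", "battery", "negative"),
    ("2005-01-02", "battery", "positive"),
    ("2005-01-02", "screen", "positive"),
]


def buzz_by_day(posts):
    """Count posts per topic per day (a very simple buzz trend)."""
    counts = defaultdict(Counter)
    for day, topic, _polarity in posts:
        counts[topic][day] += 1
    return {topic: dict(days) for topic, days in counts.items()}


def sentiment_share(posts):
    """Per topic, the fraction of polar posts that are positive."""
    tallies = defaultdict(Counter)
    for _day, topic, polarity in posts:
        tallies[topic][polarity] += 1
    shares = {}
    for topic, tally in tallies.items():
        polar = tally["positive"] + tally["negative"]
        # None marks topics with no positive or negative posts at all.
        shares[topic] = tally["positive"] / polar if polar else None
    return shares


if __name__ == "__main__":
    print(buzz_by_day(POSTS))
    print(sentiment_share(POSTS))
```

A rising daily count for a topic is one crude signal of buzz, and a falling positive share is one crude signal of souring sentiment; an early alerting rule could, under the same assumptions, be a threshold on either value.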