Post on 17-Jul-2020
Copyright © 2012, SAS Institute Inc. All rights reserved.
THE VALUE OF THE CUSTOMER'S WORD -
SAS TEXT MINING
rosanamac.lean@sas.com
AGENDA
• Unstructured Data
• Current Situation
• Applications
• SAS® Text Analytics
UNSTRUCTURED AND SEMISTRUCTURED DATA
THE DARK MATTER FOR IT
Structured data: relational databases,
structured data files, and system/application
data and logs that reside in a data store,
defined by a catalog (table definitions) or data
model and accessible via SQL or object
definitions.
This data is contextualized by its heading
(field name) and possibly defined in relation
to other "fields." It can also be processed
in a simple manner: summed, aggregated, etc.
Semistructured data combines structure
with freeform elements. An e-mail, for example,
has structure and context for specific
elements in the header, but freeform text
in the body. Semistructured data comes in
many forms.
Semistructured data is also formed when
unstructured data is combined with
metadata, making it accessible by search
engines via indexing schemas. This is the
ideal state for naturally unstructured data
within organizations.
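As a minimal sketch of the semistructured case, the Python snippet below parses an e-mail with the standard-library `email` module: the header fields come out as named, structured values, while the body remains freeform text. The sample message, its addresses and the helper name `split_email` are invented for illustration.

```python
from email import message_from_string

# Hypothetical sample message, invented for illustration.
RAW_EMAIL = """\
From: customer@example.com
To: support@example.com
Subject: Billing problem
Date: Mon, 02 Jul 2012 10:15:00 -0300

I was charged twice this month and nobody answers the phone.
"""


def split_email(raw: str) -> dict:
    """Separate the structured header fields from the freeform body."""
    msg = message_from_string(raw)
    return {
        # Header fields behave like named columns (the structured part).
        "headers": {name: value for name, value in msg.items()},
        # The body is plain freeform text (the unstructured part).
        "body": msg.get_payload().strip(),
    }


parsed = split_email(RAW_EMAIL)
print(parsed["headers"]["Subject"])
print(parsed["body"])
```

The header dictionary could be loaded into a table directly; the body is the part that needs text analytics.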
Unstructured data: most
of the information that resides
in organizations is
unstructured in nature: images,
content of Web documents, standard
documents, audio, video and
correspondence.
This type of information is
typically difficult to find
effectively if nothing has been
done to make the data
accessible, such as putting it
into a content management
system and tagging it with
metadata.
70%
25%
5%
CURRENT SITUATION:
COMMON QUESTIONS ABOUT TEXTUAL DATA SOURCES
How can I leverage our textual
data sources?
What value can it bring?
Are there hidden insights within text data
sources that can help my organization?
Such as call center notes, emails, news, online
forums, social media…
How can I leverage both
unstructured and structured
data sources?
Customer data + Customer
feedback?
Unable to get the most from text data?
Can I also use text data
to analyze and
predict the future?
To reduce churn, improve sales,
reduce costs…
WHAT IF YOU COULD…
Discover new insights from large text data sources
Extract key patterns from text data to predict the future
Customers
Discover current topics about your products from customer opinions
Find patterns within customer feedback that predict strong interest in upsell
opportunities
Fraud
Detect anomalies from usual topics described in text reports,
text applications or feedback
Find patterns in reports that appear to predict or relate to suspicious behavior
Public Opinion
Understand previously unknown issues/concerns from citizen discussions on
Twitter/forums
Extract key opinions from citizen feedback to forecast citizen sentiment
in the near future
TEXT ANALYTICS AND
NPS
• Know the key drivers of promoters vs. detractors
• Predict the key drivers across the entire customer base
• Measure and visualize the impact of changing
the key drivers of promoters vs. detractors
PROMOTERS
NEUTRALS
DETRACTORS
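For readers who want the arithmetic behind the three groups, here is a small Python sketch of the standard Net Promoter Score calculation. The slide does not spell it out, so the definition is stated as an assumption: on a 0-10 "likelihood to recommend" scale, 9-10 are promoters, 7-8 are neutrals (passives), 0-6 are detractors, and NPS is the percentage of promoters minus the percentage of detractors. The survey scores are made up.

```python
def nps(scores):
    """Return (nps, counts) for a list of 0-10 recommendation scores.

    Assumes the standard NPS cut-offs: 9-10 promoter, 7-8 neutral, 0-6 detractor.
    """
    if not scores:
        raise ValueError("at least one score is required")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    neutrals = len(scores) - promoters - detractors
    value = 100.0 * (promoters - detractors) / len(scores)
    counts = {"promoters": promoters, "neutrals": neutrals, "detractors": detractors}
    return value, counts


# Hypothetical survey responses, invented for illustration.
survey = [10, 9, 9, 8, 7, 6, 3, 10, 5, 9]
score, counts = nps(survey)
print(score, counts)
```

Text analytics enters when each score is paired with the respondent's free-text comment, so that the topics mentioned by promoters and detractors can be compared.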
TEXT ANALYTICS AND
CHURN
INSURANCE: CLAIMS FRAUD PREDICTION
                    Predicted: No Fraud    Predicted: Fraud
Actual: No Fraud    Correct Dismissal      False Detection
Actual: Fraud       False Dismissal        Correct Detection
20% improvement
60% improvement
TEXT ANALYTICS AND
FRAUD
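The four cells of the fraud matrix (correct dismissal, false detection, false dismissal, correct detection) can be turned into the usual evaluation measures. The sketch below is illustrative only: the function name and the claim counts are hypothetical and do not come from the presentation.

```python
def fraud_metrics(correct_dismissal, false_detection, false_dismissal, correct_detection):
    """Compute precision, recall and accuracy from the four matrix cells."""
    flagged = false_detection + correct_detection        # claims predicted as fraud
    actual_fraud = false_dismissal + correct_detection   # claims that really were fraud
    total = correct_dismissal + false_detection + false_dismissal + correct_detection
    return {
        # Share of flagged claims that were really fraudulent.
        "precision": correct_detection / flagged if flagged else 0.0,
        # Share of fraudulent claims that the model caught.
        "recall": correct_detection / actual_fraud if actual_fraud else 0.0,
        # Share of all claims classified correctly.
        "accuracy": (correct_dismissal + correct_detection) / total if total else 0.0,
    }


# Hypothetical counts, invented for illustration.
metrics = fraud_metrics(
    correct_dismissal=900, false_detection=40, false_dismissal=20, correct_detection=40
)
print(metrics)
```

Comparing these measures for a model with and without text-derived inputs is one way to quantify an improvement.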
SENTIMENT ANALYSIS
CATEGORIZE AND DETERMINE SENTIMENT
SENTIMENT ANALYSIS
GOVERNMENT
SAS® Text Analytics
Information Access & Organization
SAS Enterprise Content
Categorization
Predictive Modeling, Trend & Pattern
Discovery
SAS Text Miner
SAS Sentiment Analysis
Subject-Matter Expertise    Analytical Models
Natural Language Processing
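NATURAL LANGUAGE PROCESSING
Negation: No me gusta la nueva gaseosa. ("I don't like the new soda.")
Cause & Effect: Compré un nuevo teléfono y tengo mejor señal. ("I bought a new phone and I have better signal.")
Disambiguation: Paris Hilton está en el Hilton de Paris. ("Paris Hilton is at the Paris Hilton.")
La Casa Rosada. La casa es rosada. ("The Casa Rosada." vs. "The house is pink.")
Coreference: Alejandro sabe de TM. Él dijo que me ayudará. ("Alejandro knows about TM. He said he will help me.")
Spelling    Synonyms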
NATURAL LANGUAGE PROCESSING
Tokenization: identify words/expressions, or tokens
Stemming: identify variants such as plurals, genders, conjugations
Entity Extraction:
names of people, companies, products, places,
email addresses, phone numbers, dates.
Part-of-Speech Tagging:
Yo paseo mi mascota. ("I walk my pet.")
El paseo en lancha. ("The boat ride.") Same word: noun/verb
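Three of the steps named above (tokenization, stemming and entity extraction) can be illustrated with a few lines of Python. This is a deliberately naive sketch, not how SAS implements them: the suffix-stripping rule is a crude stand-in for a real Spanish stemmer, the regular expressions only cover simple email addresses and phone numbers, and the sample contact details are invented.

```python
import re

TOKEN_RE = re.compile(r"\w+")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{6,}\d")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


def naive_stem(token):
    """Crude suffix stripping for Spanish plural and gender endings (illustration only)."""
    stem = token
    if stem.endswith("es") and len(stem) > 4:
        stem = stem[:-2]
    elif stem.endswith("s") and len(stem) > 3:
        stem = stem[:-1]
    if stem.endswith(("a", "o")) and len(stem) > 3:
        stem = stem[:-1]
    return stem


def extract_entities(text):
    """Pull out simple pattern-based entities: email addresses and phone numbers."""
    return {"emails": EMAIL_RE.findall(text), "phones": PHONE_RE.findall(text)}


tokens = tokenize("Yo paseo mi mascota.")
stems = [naive_stem(t) for t in ["gatos", "gata", "señales"]]
# Hypothetical contact details, invented for illustration.
entities = extract_entities("Contacto: soporte@example.com, tel. +54 11 5555-0100.")
print(tokens, stems, entities)
```

Part-of-speech tagging and disambiguation need context ("paseo" as verb vs. noun), which is why they require a trained language model rather than pattern rules like these.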
HOW DOES TEXT MINING WORK?
EXPLORING & DISCOVERING INSIGHTS
1. Input text messages: e.g. Twitter data,
reports, email, news, forum messages
2. Parse & explore text data: break down text and explore relationships of key concepts
such as persons, places, organizations…
3. Discover topics: cluster documents of similar
content and describe them with important key words
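The parse and discover-topics steps above can be sketched in plain Python. Note the swapped-in technique: this uses bag-of-words counts and a simple greedy cosine-similarity grouping, which is an illustrative assumption and not the algorithm SAS Text Miner uses. The stop-word list, the threshold and the sample messages are all invented.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOPWORDS = {"the", "a", "is", "and", "was", "to", "of", "it", "in", "on"}


def term_counts(text):
    """Parse a message into a bag of words, dropping stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(docs, threshold=0.2):
    """Greedy single pass: join the most similar cluster or start a new one."""
    clusters = []
    for index, doc in enumerate(docs):
        vec = term_counts(doc)
        best = max(clusters, key=lambda c: cosine(vec, c["centroid"]), default=None)
        if best is not None and cosine(vec, best["centroid"]) >= threshold:
            best["members"].append(index)
            best["centroid"].update(vec)  # accumulate term counts for the cluster
        else:
            clusters.append({"centroid": Counter(vec), "members": [index]})
    return clusters


# Hypothetical messages, invented for illustration.
docs = [
    "The battery drains fast and the battery gets hot",
    "Delivery was late and the package arrived damaged",
    "Battery life is short, the phone gets hot",
    "Late delivery again, package missing",
]
clusters = cluster(docs)
for c in clusters:
    keywords = [term for term, _ in c["centroid"].most_common(3)]
    print(c["members"], keywords)
```

Each cluster is described by its most frequent terms, which mirrors the idea of describing topics "with important key words".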
HOW DOES
TEXT MINING
WORK?
DISCOVER PATTERNS FOR PREDICTIVE
MODELING
1. Input text messages with relevant
structured data, e.g. email, call center
notes, applications
Customer
data
2. Parse Text Data and Discover Topics – Break
down text into structured data, group messages of
similar content
3. Predictive Modeling with text data – text data input into
models may provide reliable info to predict outcome & behavior
This customer is likely to accept your offer…
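To make "break down text into structured data" concrete, the sketch below turns a call center note into 0/1 indicator columns and joins them with the customer's structured fields, producing one row that a predictive model could consume. The vocabulary, field names and sample record are hypothetical; a real project would derive the terms or topics from the parsing step rather than hard-coding them.

```python
import re


def text_features(note, vocabulary):
    """Encode a note as 0/1 indicators, one per vocabulary term."""
    tokens = set(re.findall(r"[a-z]+", note.lower()))
    return {f"has_{term}": int(term in tokens) for term in vocabulary}


def build_row(customer, note, vocabulary):
    """Combine structured customer fields with text-derived indicators."""
    row = dict(customer)
    row.update(text_features(note, vocabulary))
    return row


# Hypothetical vocabulary and customer record, invented for illustration.
VOCAB = ["cancel", "upgrade", "expensive"]
customer = {"customer_id": 101, "tenure_months": 18, "monthly_fee": 45.0}
note = "Customer says the plan is too expensive and wants to cancel."

row = build_row(customer, note, VOCAB)
print(row)
```

Rows like this one, built for every customer, form the training table for the predictive modeling step.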
SENTIMENT ANALYSIS
HOW DOES IT WORK?
1. Input text messages: e.g. Twitter data, reports,
email, news, forum messages
2. Parse messages through the sentiment taxonomy:
match and score messages, and their details, for
sentiment polarity (e.g. message is 80% positive)
3. Output results: each message/document, and the characteristics within the
document, are now associated with a sentiment polarity score
(e.g. "This is positive", "This is negative").
Results are indexed or fed into existing systems
for search & analysis
4. Sentiment reports: results are easily analyzed against time period and/or
product features, and can be drilled down to the exact message
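A polarity score such as "80% positive" can be illustrated with a minimal lexicon-based scorer in Python. This is only a sketch under stated assumptions: the word lists are tiny and invented, a negator flips only the word immediately after it, and the score is simply the share of positive hits among all sentiment-bearing words. A real sentiment taxonomy is far richer.

```python
import re

# Tiny hypothetical lexicons, invented for illustration.
POSITIVE = {"good", "great", "friendly", "fast", "love"}
NEGATIVE = {"bad", "slow", "rude", "broken", "hate"}
NEGATORS = {"not", "no", "never"}


def score_message(text):
    """Return the share of positive sentiment words (0.0-1.0), or None if none found."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = neg = 0
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        if tok in POSITIVE or tok in NEGATIVE:
            # A preceding negator flips the polarity of this word.
            is_positive = (tok in POSITIVE) != negate
            if is_positive:
                pos += 1
            else:
                neg += 1
        negate = False  # a negator only affects the very next token
    total = pos + neg
    return pos / total if total else None


review = "Great crew, friendly service, fast boarding, good food, but the seat was broken."
complaint = "The staff was not friendly."
print(score_message(review))     # 0.8
print(score_message(complaint))  # 0.0
```

Storing the score next to each message is what allows the results to be aggregated by time period or product feature later on.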
TEXT ANALYTICS
EXAMPLE
Data & Sampling | Text Analytics | Model Testing | Model Assessment & Scoring
SAS SENTIMENT
ANALYSIS
Once the taxonomy for the text documents has been
established, rules can be developed to determine the
sentiment of the document within that context. These
rules can be:
• derived through statistical means,
• taken from out-of-the-box sets of rules,
• written by the user, or
• a hybrid of the above.
Below is an example where a customer review of
an airline is found to be of a predominantly positive
sentiment.
SAS SENTIMENT
ANALYSIS
SAS® TEXT MINER
COMMONLY USED TEXT MINER NODES
• The Text Parsing node parses a document collection in order to quantify information
about the terms that are contained therein.
• The Text Filter node applies filters to reduce the number of terms or documents
included in a text analysis. The Text Filter node must be preceded by a Text Parsing
node and may be preceded by Text Filter and Text Topic nodes.
• The Text Topic node is used to create topics from a document collection. For each
topic it creates, it adds a variable to the training data table which the node exports.
The Text Topic node must be preceded by a Text Parsing node and may also be
preceded by one or more Text Filter nodes.
• The Text Cluster node performs a statistical cluster analysis of a document collection.
The Text Cluster node must be preceded by Text Parsing and Text Filter nodes and
may also be preceded by a Text Topic node.
SAS® TEXT MINER
OTHER TEXT MINER NODES
• The Text Import node extracts the text from documents contained in a directory and
creates a data set of the results. The Text Import node can also crawl the Internet,
beginning from a specified URL, and retrieve the Web pages which it finds.
• The Text Profile node is used to associate descriptive terms with different levels of a
target (dependent) variable in the data.
• The Text Rule Builder node creates Boolean rules to predict a categorical target
(dependent) variable. Each rule in the set is associated with a specific target
category. This node must be preceded by Text Parsing and Text Filter nodes.
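The "must be preceded by" requirements stated for the nodes can be written down as a small dependency table and checked mechanically. The Python sketch below encodes only the mandatory predecessors listed in the text and assumes a simple linear flow given as a list of node names. Real process flows are diagrams, so this checker is a hypothetical illustration, not a SAS API.

```python
# Mandatory predecessors, as stated in the node descriptions.
REQUIRED_PREDECESSORS = {
    "Text Parsing": set(),
    "Text Filter": {"Text Parsing"},
    "Text Topic": {"Text Parsing"},
    "Text Cluster": {"Text Parsing", "Text Filter"},
    "Text Rule Builder": {"Text Parsing", "Text Filter"},
}


def validate_flow(nodes):
    """Return a list of problems for a linear flow of node names (empty if valid)."""
    problems = []
    seen = set()
    for node in nodes:
        # Nodes without a stated requirement (e.g. Text Import) are accepted as-is.
        missing = REQUIRED_PREDECESSORS.get(node, set()) - seen
        if missing:
            problems.append(
                f"{node} is missing required predecessor(s): {', '.join(sorted(missing))}"
            )
        seen.add(node)
    return problems


good_flow = ["Text Import", "Text Parsing", "Text Filter", "Text Cluster"]
bad_flow = ["Text Parsing", "Text Cluster"]
print(validate_flow(good_flow))
print(validate_flow(bad_flow))
```

The optional "may be preceded by" relationships are left out on purpose, since they do not constrain whether a flow is valid.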
SAS TEXT MINER
1. From thousands of complaint
messages…
2. Text Miner breaks down what
is mentioned into granular
terms/phrases/concepts
3. Text Topics automatically discovers
key topics mentioned in the messages,
and lists key words that seem to
describe the topics uniquely
4. We can see the key topics
discovered from the complaint
messages, including the most common
and the rarest trends/topics
SAS TEXT MINER
QUESTIONS?
www.SAS.com
THANK YOU!
rosanamac.lean@sas.com