In TextContents[text,…], text can be a string, a file with plain text represented by File[…], a ContentObject expression or a list of these text objects.
TextContents[{text₁,text₂,…},…] gives cases for each textᵢ.
The identification type form can be any of the following:

	"type"	any text content type (e.g. "Noun", "City")
	Entity[…,…]	a specific entity of a text content type
	form₁|form₂|…	form matching any of the formᵢ
	Containing[outer,inner]	forms of type outer containing type inner
	Verbatim["string"]	a specific string to be matched exactly
	pattern	a string pattern to be matched
	Automatic	entities, dates, quantities and other content-related elements
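
As a hedged illustration of the form specifications listed above, the following sketch applies an alternative of two types and a Verbatim string to a short sentence. The sentence and the variable name sample are assumptions made for illustration. Results depend on the underlying models, so no output is shown.

```wolfram
(* sample text chosen for illustration only *)
sample = "Since 1861, the capital of Italy is Rome, in the Lazio region.";

(* match content of either type *)
TextContents[sample, "City" | "Location"]

(* match one exact string *)
TextContents[sample, Verbatim["Rome"]]
```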

Possible choices for the property prop include:

	"String"	string of the identified text (default)
	"Position"	start and end position of the string in text
	"Probability"	estimated probability that the identification is correct
	"Type"	type of content (entity type, …)
	"Interpretation"	standard interpretation of the identified string
	"Snippet"	a snippet around the identified string
	"HighlightedSnippet"	a snippet with the identified string highlighted
	All	all the preceding properties
	{prop₁,prop₂,…}	a list of property specifications
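
A minimal sketch of how the property argument might be used, assuming the same three-argument form TextContents[text, form, prop] shown in the examples on this page. The sentence and the variable name sample are illustrative assumptions, and no output is shown because it depends on the underlying models.

```wolfram
(* sample text chosen for illustration only *)
sample = "Boston, Worcester, and Springfield are the largest cities in Massachusetts.";

(* a single property: start and end positions of each identified city *)
TextContents[sample, "City", "Position"]

(* several properties at once, given as a list *)
TextContents[sample, "City", {"String", "Position", "Probability"}]
```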

The following options can be given:

AcceptanceThreshold	Automatic	minimum probability to accept identification
TargetDevice	"CPU"	whether CPU or GPU computation should be used for entity detection
VerifyInterpretation	False	whether interpretability should be verified
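
The options above can be combined in a single call. The following is a sketch only: the threshold value 0.8 and the variable name text are arbitrary choices for illustration, and the TargetDevice -> "GPU" call assumes that a supported GPU is available on the machine.

```wolfram
(* sample text chosen for illustration only *)
text = "We visited Toulouse and Auterive in Midi-Pyrénées in France.";

(* keep only identifications above the chosen threshold whose interpretation can be verified *)
TextContents[text, "Location", {"String", "Probability", "Interpretation"},
  AcceptanceThreshold -> 0.8, VerifyInterpretation -> True]

(* assumes a supported GPU; otherwise keep the default "CPU" *)
TextContents[text, "Location", TargetDevice -> "GPU"]
```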

Examples

Basic Examples (1)

Find entities in a text:

Wolfram Language code:

TextContents["The flag of Italy is green, white and red. Since 1861, the capital is Rome, which also serves as the capital of the Lazio region. With 2,872,800 residents in 1,285 km2 (496.1 sq mi)"]

Only get the results for locations:

Wolfram Language code:

TextContents["The flag of Italy is green, white and red. Since 1861, the capital is Rome, which also serves as the capital of the Lazio region. With 2,872,800 residents in 1,285 km2 (496.1 sq mi)", "Location"]

Only get the results for locations and quantities:

Wolfram Language code:

TextContents["The flag of Italy is green, white and red. Since 1861, the capital is Rome, which also serves as the capital of the Lazio region. With 2,872,800 residents in 1,285 km2 (496.1 sq mi)", {"Location", "Quantity"}]

Get all properties for locations and quantities:

Wolfram Language code:

TextContents["The flag of Italy is green, white and red. Since 1861, the capital is Rome, which also serves as the capital of the Lazio region. With 2,872,800 residents in 1,285 km2 (496.1 sq mi)", {"Location", "Quantity"}, All]

Get a specified set of properties for entities:

Wolfram Language code:

TextContents["The flag of Italy is green, white and red. Since 1861, the capital is Rome, which also serves as the capital of the Lazio region. With 2,872,800 residents in 1,285 km2 (496.1 sq mi)", Automatic, {"HighlightedSnippet", "Interpretation"}]

Options (2)

AcceptanceThreshold (1)

By default, all the detected entities have an estimated probability higher than 0.5:

Wolfram Language code: TextContents[ExampleData[{"Text", "JFKInaugural"}]]

Get only the entities that are highly probable to be correct by setting a high AcceptanceThreshold:

Wolfram Language code: TextContents[ExampleData[{"Text", "JFKInaugural"}], AcceptanceThreshold -> 0.9]

VerifyInterpretation (1)

By default, some entities cannot be interpreted, either because they are not correct or because they are not yet in the knowledgebase:

Wolfram Language code: TextContents["We visited Toulouse and Auterive in Midi-Pyrénées in France.", Automatic, {"String", "Interpretation"}]

Use VerifyInterpretation to filter out the entities that cannot be interpreted:

Wolfram Language code:

TextContents["We visited Toulouse and Auterive in Midi-Pyrénées in France.", Automatic, {"String", "Interpretation"}, VerifyInterpretation -> True]

Properties & Relations (1)

TextContents handles the same types as TextPosition and TextCases and always identifies the same substrings as these functions for a given type:

Wolfram Language code: TextContents["Boston, Worcester, and Springfield are the largest cities in Massachusetts.", "City"]

Wolfram Language code: TextCases["Boston, Worcester, and Springfield are the largest cities in Massachusetts.", "City"]

Wolfram Language code: TextPosition["Boston, Worcester, and Springfield are the largest cities in Massachusetts.", "City"]

A dataset that is similar to the output of TextContents can be obtained using TextCases:

Wolfram Language code: Dataset@TextCases["Boston, Worcester, and Springfield are the largest cities in Massachusetts.", "City" -> Identity]

Wolfram Language code:

Dataset@TextCases["Boston, Worcester, and Springfield are the largest cities in Massachusetts.", "City" -> Function[<|"City" -> Last[#Interpretation][[1]], "State" -> Last[#Interpretation][[2]], "Country" -> Last[#Interpretation][[3]], "Position" -> #Position, "HighlightedSnippet" -> #HighlightedSnippet|>]]

Neat Examples (1)

Load the text of a Wikipedia page about the Moon:

Wolfram Language code: moon = WikipediaData["Moon"];

Wolfram Language code: Snippet[moon, 5]

Extract notable text contents from the page:

Wolfram Language code: contents = TextContents[moon, VerifyInterpretation -> True]

Visualize the frequency of content types found on the page:

Wolfram Language code: counts = ReverseSort@CountsBy[contents, #Type&]

Wolfram Language code: WordCloud[counts]

Find potential notable persons identified on the page:

Wolfram Language code: Normal[Select[contents, #Type === "Person"&][[All, "String"]]]

Interpret these persons as entities:

Wolfram Language code: persons = Normal[Select[contents, #Type === "Person"&][[All, "Interpretation"]]]

Visualize occupations of these persons:

Wolfram Language code: WordCloud[Counts[Flatten@EntityValue[persons, EntityProperty["Person", "Occupation"]]]]
