TextWords
Details
- Characters in string that are not identified as being part of words are dropped by TextWords.
- TextWords[ContentObject[…]] gives words from the plain text contents of the ContentObject.
Examples
Basic Examples (3)
Segment a string into a list of words:
TextWords["first second third fourth"]
TextWords["apple bear cat dog elephant"]
TextWords separates words by punctuation as well as whitespace:
TextWords["First sentence. Second sentence. Third sentence."]
TextWords["A phrase, and yet another phrase."]
Get the first 10 words in a block of text:
TextWords["In the fell clutch of circumstance
I have not winced nor cried aloud.
Under the bludgeonings of chance
My head is bloody, but unbowed.", 10]
TextWords[ResourceData["The Secret Sharer"], 10]
Scope (3)
TextWords preserves hyphenation:
TextWords["The investment came from a London-based company."]
TextWords["He packed the meal in shrink-wrap."]
Titles, currencies and other syntactic units are segmented as separate words:
TextWords["50% of the school played a sport."]
TextWords["She spent $10 on lunch."]
TextWords["He spent 100€ on a trip to France."]
Get a list of words from a ContentObject:
file = Export[FileNameJoin[{$TemporaryDirectory, "hamlet.txt"}], ExampleData[{"Text", "ToBeOrNotToBe"}]];
doc = ContentObject[File[file]]
TextWords[doc]
Applications (1)
Make a WordCloud of words from a poem:
WordCloud[TextWords[ExampleData[{"Text", "TheRaven"}]]]
Properties & Relations (2)
TextWords is equivalent to TextCases[…,"Word"]:
TextCases["As a matter-of-fact, my mother-in-law is in N.Y.C.", "Word"] // TextElement
TextWords["As a matter-of-fact, my mother-in-law is in N.Y.C."] // TextElement
TextStructure splits texts into the same words:
TextCases["As a matter-of-fact, my mother-in-law is in N.Y.C.", "Word" | "Punctuation"] // TextElement
TextStructure["As a matter-of-fact, my mother-in-law is in N.Y.C.", "PartsOfSpeech"]
Possible Issues (1)
Words returned by TextWords are identified structurally, and may not be dictionary words:
TextWords["A li milk."]
DictionaryWordQ /@ %
See Also
WordCount TextSentences TextCases StringSplit WordCounts DeleteStopwords WordStem TextRecognize FeatureExtractor SequencePredict
Text Contents: Word
Function Repository: JapaneseTextTokenizer KeywordsGraph
Related Guides
Related Workflows
- Analyze the Text on a Webpage
Text
Wolfram Research (2015), TextWords, Wolfram Language function, https://reference.wolfram.com/language/ref/TextWords.html (updated 2016).