Text Mining Package


Documentation for package ‘tm’ version 0.7-11

Help Pages

A B C D E F G H I L M N O P R S T U V W X Z

-- A --

acq 50 Exemplary News Articles from the Reuters-21578 Data Set of Topic acq
as.DocumentTermMatrix Term-Document Matrix
as.TermDocumentMatrix Term-Document Matrix
as.VCorpus Volatile Corpora

-- B --

Boost_tokenizer Tokenizers

-- C --

c.TermDocumentMatrix Combine Corpora, Documents, Term-Document Matrices, and Term Frequency Vectors
c.term_frequency Combine Corpora, Documents, Term-Document Matrices, and Term Frequency Vectors
c.TextDocument Combine Corpora, Documents, Term-Document Matrices, and Term Frequency Vectors
c.VCorpus Combine Corpora, Documents, Term-Document Matrices, and Term Frequency Vectors
close.SimpleSource Sources
content_transformer Content Transformers
Corpus Corpora
crude 20 Exemplary News Articles from the Reuters-21578 Data Set of Topic crude

-- D --

DataframeSource Data Frame Source
DirSource Directory Source
Docs Access Document IDs and Terms
DocumentTermMatrix Term-Document Matrix
DublinCore Metadata Management
DublinCore<- Metadata Management

-- E --

eoi Sources
eoi.SimpleSource Sources

-- F --

findAssocs Find Associations in a Term-Document Matrix
findAssocs.DocumentTermMatrix Find Associations in a Term-Document Matrix
findAssocs.TermDocumentMatrix Find Associations in a Term-Document Matrix
findFreqTerms Find Frequent Terms
findMostFreqTerms Find Most Frequent Terms
findMostFreqTerms.DocumentTermMatrix Find Most Frequent Terms
findMostFreqTerms.TermDocumentMatrix Find Most Frequent Terms
findMostFreqTerms.term_frequency Find Most Frequent Terms
FunctionGenerator Readers

-- G --

getElem Sources
getElem.DataframeSource Sources
getElem.DirSource Sources
getElem.URISource Sources
getElem.VectorSource Sources
getElem.XMLSource Sources
getMeta Sources
getMeta.DataframeSource Sources
getReaders Readers
getSources Sources
getTokenizers Tokenizers
getTransformations Transformations

-- H --

Heaps_plot Explore Corpus Term Frequency Characteristics

-- I --

inspect Inspect Objects
inspect.PCorpus Inspect Objects
inspect.TermDocumentMatrix Inspect Objects
inspect.TextDocument Inspect Objects
inspect.VCorpus Inspect Objects

-- L --

length.SimpleSource Sources

-- M --

MC_tokenizer Tokenizers
meta Metadata Management
meta.PCorpus Metadata Management
meta.PlainTextDocument Metadata Management
meta.SimpleCorpus Metadata Management
meta.VCorpus Metadata Management
meta.XMLTextDocument Metadata Management
meta<-.PCorpus Metadata Management
meta<-.PlainTextDocument Metadata Management
meta<-.SimpleCorpus Metadata Management
meta<-.VCorpus Metadata Management
meta<-.XMLTextDocument Metadata Management

-- N --

nDocs Access Document IDs and Terms
nTerms Access Document IDs and Terms

-- O --

open.SimpleSource Sources

-- P --

PCorpus Permanent Corpora
pGetElem Sources
pGetElem.DataframeSource Sources
pGetElem.DirSource Sources
pGetElem.URISource Sources
pGetElem.VectorSource Sources
PlainTextDocument Plain Text Documents
plot.TermDocumentMatrix Visualize a Term-Document Matrix

-- R --

readDataframe Read In a Text Document from a Data Frame
readDOC Read In a MS Word Document
Reader Readers
reader Sources
reader.SimpleSource Sources
readPDF Read In a PDF Document
readPlain Read In a Text Document
readRCV1 Read In a Reuters Corpus Volume 1 Document
readRCV1asPlain Read In a Reuters Corpus Volume 1 Document
readReut21578XML Read In a Reuters-21578 XML Document
readReut21578XMLasPlain Read In a Reuters-21578 XML Document
readTagged Read In a POS-Tagged Word Text Document
readXML Read In an XML Document
read_dtm_Blei_et_al Read Document-Term Matrices
read_dtm_MC Read Document-Term Matrices
removeNumbers Remove Numbers from a Text Document
removeNumbers.character Remove Numbers from a Text Document
removeNumbers.PlainTextDocument Remove Numbers from a Text Document
removePunctuation Remove Punctuation Marks from a Text Document
removePunctuation.character Remove Punctuation Marks from a Text Document
removePunctuation.PlainTextDocument Remove Punctuation Marks from a Text Document
removeSparseTerms Remove Sparse Terms from a Term-Document Matrix
removeWords Remove Words from a Text Document
removeWords.character Remove Words from a Text Document
removeWords.PlainTextDocument Remove Words from a Text Document

-- S --

scan_tokenizer Tokenizers
SimpleCorpus Simple Corpora
SimpleSource Sources
Source Sources
stemCompletion Complete Stems
stemDocument Stem Words
stemDocument.character Stem Words
stemDocument.PlainTextDocument Stem Words
stepNext Sources
stepNext.SimpleSource Sources
stopwords Stopwords
stripWhitespace Strip Whitespace from a Text Document
stripWhitespace.PlainTextDocument Strip Whitespace from a Text Document

-- T --

TermDocumentMatrix Term-Document Matrix
termFreq Term Frequency Vector
Terms Access Document IDs and Terms
TextDocument Text Documents
tm_filter Filter and Index Functions on Corpora
tm_filter.PCorpus Filter and Index Functions on Corpora
tm_filter.SimpleCorpus Filter and Index Functions on Corpora
tm_filter.VCorpus Filter and Index Functions on Corpora
tm_index Filter and Index Functions on Corpora
tm_index.PCorpus Filter and Index Functions on Corpora
tm_index.SimpleCorpus Filter and Index Functions on Corpora
tm_index.VCorpus Filter and Index Functions on Corpora
tm_map Transformations on Corpora
tm_map.PCorpus Transformations on Corpora
tm_map.SimpleCorpus Transformations on Corpora
tm_map.VCorpus Transformations on Corpora
tm_parLapply Parallelized 'lapply'
tm_parLapply_engine Parallelized 'lapply'
tm_reduce Combine Transformations
tm_term_score Compute Score for Matching Terms
tm_term_score.DocumentTermMatrix Compute Score for Matching Terms
tm_term_score.PlainTextDocument Compute Score for Matching Terms
tm_term_score.TermDocumentMatrix Compute Score for Matching Terms
tm_term_score.term_frequency Compute Score for Matching Terms

-- U --

URISource Uniform Resource Identifier Source

-- V --

VCorpus Volatile Corpora
VectorSource Vector Source

-- W --

weightBin Weight Binary
WeightFunction Weighting Function
weightSMART SMART Weightings
weightTf Weight by Term Frequency
weightTfIdf Weight by Term Frequency - Inverse Document Frequency
writeCorpus Write a Corpus to Disk

-- X --

XMLSource XML Source
XMLTextDocument XML Text Documents

-- Z --

Zipf_plot Explore Corpus Term Frequency Characteristics
ZipSource ZIP File Source