Practical text mining with Perl


Bibliographic Details
Main Author: Bilisoly, Roger
Format: Printed Book
Published: New Jersey John Wiley 2008
Edition: 1st ed.
Subjects:
LEADER 03251nam a22001937a 4500
020 |a 9788126554218 
020 |a 9780470176436 
082 |a 005.741  |b P8 
100 |a Bilisoly, Roger 
245 |a Practical text mining with Perl 
250 |a 1st ed. 
260 |a New Jersey  |b John Wiley  |c 2008 
300 |a 320p.  |b Hard bound 
505 |a 2.1 Introduction. 2.2 Regular Expressions. 2.3 Finding Words in a Text. 2.4 Decomposing Poe's "The Tell-Tale Heart" into Words. 2.5 A Simple Concordance. 2.6 First Attempt at Extracting Sentences. 2.7 Regex Odds and Ends. 2.8 References. 3. Quantitative Text Summaries. 3.1 Introduction. 3.2 Scalars, Interpolation and Context in Perl. 3.3 Arrays and Context in Perl. 3.4 Word Lengths in Poe's "The Tell-Tale Heart". 3.5 Arrays and Functions. 3.6 Hashes. 3.7 Two Text Applications. 3.8 Complex Data Structures. 3.9 References. 3.10 First Transition. 4. Probability and Text Sampling. 4.1 Introduction. 4.2 Probability. 4.3 Conditional Probability. 4.4 Mean and Variance of Random Variables. 4.5 The Bag-of-Words Model for Poe's "The Black Cat". 4.6 The Effect of Sample Size. 4.7 References. 5. Applying Information Retrieval to Text Mining. 5.1 Introduction. 5.2 Counting Letters and Words. 5.3 Text Counts and Vectors. 5.4 The Term-Document Matrix Applied to Poe. 5.5 Matrix Multiplication. 5.6 Functions of Counts. 5.7 Document Similarity. 5.8 References. 6. Concordance Lines and Corpus Linguistics. 6.1 Introduction. 6.2 Sampling. 6.3 Corpus as Baseline. 6.4 Concordancing. 6.5 Collocations and Concordance Lines. 6.6 Applications with References. 6.7 Second Transition. 7. Multivariate Techniques with Text. 7.1 Introduction. 7.2 Basic Statistics. 7.3 Basic Linear Algebra. 7.4 Principal Component Matrices. 7.5 Text Applications. 7.6 Applications and References. 8. Text Clustering. 8.1 Introduction. 8.2 Clustering. 8.3 A Note on Classification. 8.4 References. 8.5 Last Transition. 9. A Sample of Additional Topics. 9.1 Introduction. 9.2 Perl Modules. 9.3 Other Languages: Analyzing Goethe in German. 9.4 Permutation Tests. 
520 |a This book is devoted to the fundamentals of text mining using Perl, an open-source programming tool that is freely available via the Internet (www.perl.org). It covers mining ideas from several perspectives--statistics, data mining, linguistics, and information retrieval--and provides readers with the means to successfully complete text mining tasks on their own. The book begins with an introduction to regular expressions, a text pattern methodology, and quantitative text summaries, all of which are fundamental tools for analyzing text. It then builds upon this foundation to explore: probability and texts, including the bag-of-words model; information retrieval techniques such as the TF-IDF similarity measure; concordance lines and corpus linguistics; multivariate techniques such as correlation, principal components analysis, and clustering; and Perl modules, German, and permutation tests. 
650 |a Text mining  |a Perl 
942 |c BK 
999 |c 85209  |d 85209 
952 |0 0  |1 0  |4 0  |6 005_741000000000000_P8  |7 0  |9 77554  |a SOCS  |b SOCS  |d 2015-07-15  |e Arunima Book Distributors  |g 4907.55  |l 1  |m 1  |o 005.741 P8  |p SOCS3364  |r 2015-07-15  |s 2015-07-15  |w 2015-07-15  |y BK