Machine learning for hackers

Bibliographic Details
Main Author: Conway, Drew
Other Authors: White, John Myles
Format: Printed Book
Language: English
Published: Sebastopol, CA : O'Reilly Media, 2012.
Edition: 1st ed.
Subjects:
Online Access: http://www.loc.gov/catdir/enhancements/fy1307/2012277057-b.html
http://www.loc.gov/catdir/enhancements/fy1307/2012277057-d.html
http://www.loc.gov/catdir/enhancements/fy1307/2012277057-t.html
Table of Contents:
  • 1. Using R
  • R for Machine Learning
  • Downloading and Installing R
  • IDEs and Text Editors
  • Loading and Installing R Packages
  • R Basics for Machine Learning
  • Further Reading on R
  • 2. Data Exploration
  • Exploration versus Confirmation
  • What Is Data?
  • Inferring the Types of Columns in Your Data
  • Inferring Meaning
  • Numeric Summaries
  • Means, Medians, and Modes
  • Quantiles
  • Standard Deviations and Variances
  • Exploratory Data Visualization
  • Visualizing the Relationships Between Columns
  • 3. Classification: Spam Filtering
  • This or That: Binary Classification
  • Moving Gently into Conditional Probability
  • Writing Our First Bayesian Spam Classifier
  • Defining the Classifier and Testing It with Hard Ham
  • Testing the Classifier Against All Email Types
  • Improving the Results
  • 4. Ranking: Priority Inbox
  • How Do You Sort Something When You Don't Know the Order?
  • Ordering Email Messages by Priority
  • Priority Features of Email
  • Writing a Priority Inbox
  • Functions for Extracting the Feature Set
  • Creating a Weighting Scheme for Ranking
  • Weighting from Email Thread Activity
  • Training and Testing the Ranker
  • 5. Regression: Predicting Page Views
  • Introducing Regression
  • The Baseline Model
  • Regression Using Dummy Variables
  • Linear Regression in a Nutshell
  • Predicting Web Traffic
  • Defining Correlation
  • 6. Regularization: Text Regression
  • Nonlinear Relationships Between Columns: Beyond Straight Lines
  • Introducing Polynomial Regression
  • Methods for Preventing Overfitting
  • Preventing Overfitting with Regularization
  • Text Regression
  • Logistic Regression to the Rescue
  • 7. Optimization: Breaking Codes
  • Introduction to Optimization
  • Ridge Regression
  • Code Breaking as Optimization
  • 8. PCA: Building a Market Index
  • Unsupervised Learning
  • 9. MDS: Visually Exploring US Senator Similarity
  • Clustering Based on Similarity
  • A Brief Introduction to Distance Metrics and Multidimensional Scaling
  • How Do US Senators Cluster?
  • Analyzing US Senator Roll Call Data (101st–111th Congresses)
  • 10. kNN: Recommendation Systems
  • The k-Nearest Neighbors Algorithm
  • R Package Installation Data
  • 11. Analyzing Social Graphs
  • Social Network Analysis
  • Thinking Graphically
  • Hacking Twitter Social Graph Data
  • Working with the Google SocialGraph API
  • Analyzing Twitter Networks
  • Local Community Structure
  • Visualizing the Clustered Twitter Network with Gephi
  • Building Your Own "Who to Follow" Engine
  • 12. Model Comparison
  • SVMs: The Support Vector Machine
  • Comparing Algorithms