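Content Overview
Social website data is like a "gold mine" buried deep underground. How can you use this data to discover who is connecting through social media? What are they talking about? Or where are they? The second edition of this book has been fully updated and revised from the previous edition, and it reveals methods and techniques for answering these questions. You will learn how to acquire, analyze, and summarize data scattered across the social web (including Facebook, Twitter, LinkedIn, Google+, GitHub, email, websites, and blogs), and how to use visualization to find what you have been looking for in the social world, as well as useful information you have never come across before.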
About the Author
Table of Contents
Preface
Part I. A Guided Tour of the Social Web
Prelude
1. Mining Twitter: Exploring Trending Topics, Discovering What People Are Talking About, and More
1.1 Overview
1.2 Why Is Twitter All the Rage?
1.3 Exploring Twitter's API
1.3.1 Fundamental Twitter Terminology
1.3.2 Creating a Twitter API Connection
1.3.3 Exploring Trending Topics
1.3.4 Searching for Tweets
1.4 Analyzing the 140 (or More) Characters
1.4.1 Extracting Tweet Entities
1.4.2 Analyzing Tweets and Tweet Entities with Frequency Analysis
1.4.3 Computing the Lexical Diversity of Tweets
1.4.4 Examining Patterns in Retweets
1.4.5 Visualizing Frequency Data with Histograms
1.5 Closing Remarks
1.6 Recommended Exercises
1.7 Online Resources
2. Mining Facebook: Analyzing Fan Pages, Examining Friendships, and More
2.1 Overview
2.2 Exploring Facebook's Graph API
2.2.1 Understanding the Graph API
2.2.2 Understanding the Open Graph Protocol
2.3 Analyzing Social Graph Connections
2.3.1 Analyzing Facebook Pages
2.3.2 Manipulating Data Using pandas
2.4 Closing Remarks
2.5 Recommended Exercises
2.6 Online Resources
3. Mining Instagram: Computer Vision, Neural Networks, Object Recognition, and Face Detection
3.1 Overview
3.2 Exploring the Instagram API
3.2.1 Making Instagram API Requests
3.2.2 Retrieving Your Own Instagram Feed
3.2.3 Retrieving Media by Hashtag
3.3 Anatomy of an Instagram Post
3.4 Crash Course on Artificial Neural Networks
3.4.1 Training a Neural Network to "Look" at Pictures
3.4.2 Recognizing Handwritten Digits
3.4.3 Object Recognition Within Photos Using Pretrained Neural Networks
3.5 Applying Neural Networks to Instagram Posts
3.5.1 Tagging the Contents of an Image
3.5.2 Detecting Faces in Images
3.6 Closing Remarks
3.7 Recommended Exercises
3.8 Online Resources
4. Mining LinkedIn: Faceting Job Titles, Clustering Colleagues, and More
4.1 Overview
4.2 Exploring the LinkedIn API
4.2.1 Making LinkedIn API Requests
4.2.2 Downloading LinkedIn Connections as a CSV File
4.3 Crash Course on Clustering Data
4.3.1 Normalizing Data to Enable Analysis
4.3.2 Measuring Similarity
4.3.3 Clustering Algorithms
4.4 Closing Remarks
4.5 Recommended Exercises
4.6 Online Resources
5. Mining Text Files: Computing Document Similarity, Extracting Collocations, and More
5.1 Overview
5.2 Text Files
5.3 A Whiz-Bang Introduction to TF-IDF
5.3.1 Term Frequency
5.3.2 Inverse Document Frequency
5.3.3 TF-IDF
5.4 Querying Human Language Data with TF-IDF
5.4.1 Introducing the Natural Language Toolkit
5.4.2 Applying TF-IDF to Human Language
5.4.3 Finding Similar Documents
5.4.4 Analyzing Bigrams in Human Language
5.4.5 Reflections on Analyzing Human Language Data
5.5 Closing Remarks
5.6 Recommended Exercises
5.7 Online Resources
6. Mining Web Pages: Using Natural Language Processing to Understand Human Language, Summarize Blog Posts, and More
6.1 Overview
6.2 Scraping, Parsing, and Crawling the Web
6.2.1 Breadth-First Search in Web Crawling
6.3 Discovering Semantics by Decoding Syntax
6.3.1 Natural Language Processing Illustrated Step-by-Step
6.3.2 Sentence Detection in Human Language Data
6.3.3 Document Summarization
6.4 Entity-Centric Analysis: A Paradigm Shift
6.4.1 Gisting Human Language Data
6.5 Quality of Analytics for Processing Human Language Data
6.6 Closing Remarks
6.7 Recommended Exercises
6.8 Online Resources
7. Mining Mailboxes: Analyzing Who's Talking to Whom About What, How Often, and More
7.1 Overview
7.2 Obtaining and Processing a Mail Corpus
7.2.1 A Primer on Unix Mailboxes
7.2.2 Getting the Enron Data
7.2.3 Converting a Mail Corpus to a Unix Mailbox
7.2.4 Converting Unix Mailboxes to pandas DataFrames
7.3 Analyzing the Enron Corpus
7.3.1 Querying by Date/Time Range
7.3.2 Analyzing Patterns in Sender/Recipient Communications
7.3.3 Searching Emails by Keywords
7.4 Analyzing Your Own Mail Data
7.4.1 Accessing Your Gmail with OAuth
7.4.2 Fetching and Parsing Email Messages
7.4.3 Visualizing Patterns in Email with Immersion
7.5 Closing Remarks
7.6 Recommended Exercises
7.7 Online Resources
8. Mining GitHub: Inspecting Software Collaboration Habits, Building Interest Graphs, and More
8.1 Overview
8.2 Exploring GitHub's API
8.2.1 Creating a GitHub API Connection
8.2.2 Making GitHub API Requests
8.3 Modeling Data with Property Graphs
8.4 Analyzing GitHub Interest Graphs
8.4.1 Seeding an Interest Graph
8.4.2 Computing Graph Centrality Measures
8.4.3 Extending the Interest Graph with "Follows" Edges for Users
8.4.4 Using Nodes as Pivots for More Efficient Queries
8.4.5 Visualizing Interest Graphs
8.5 Closing Remarks
8.6 Recommended Exercises
8.7 Online Resources
Part II. Twitter Cookbook
9. Twitter Cookbook
9.1 Accessing Twitter's API for Development Purposes
9.2 Doing the OAuth Dance to Access Twitter's API for Production Purposes
9.3 Discovering the Trending Topics
9.4 Searching for Tweets
9.5 Constructing Convenient Function Calls
9.6 Saving and Restoring JSON Data with Text Files
9.7 Saving and Accessing JSON Data with MongoDB
9.8 Sampling the Twitter Firehose with the Streaming API
9.9 Collecting Time-Series Data
9.10 Extracting Tweet Entities
9.11 Finding the Most Popular Tweets in a Collection of Tweets
9.12 Finding the Most Popular Tweet Entities in a Collection of Tweets
9.13 Tabulating Frequency Analysis
9.14 Finding Users Who Have Retweeted a Status
9.15 Extracting a Retweet's Attribution
9.16 Making Robust Twitter Requests
9.17 Resolving User Profile Information
9.18 Extracting Tweet Entities from Arbitrary Text
9.19 Getting All Friends or Followers for a User
9.20 Analyzing a User's Friends and Followers
9.21 Harvesting a User's Tweets
9.22 Crawling a Friendship Graph
9.23 Analyzing Tweet Content
9.24 Summarizing Link Targets
9.25 Analyzing a User's Favorite Tweets
9.26 Closing Remarks
9.27 Recommended Exercises
9.28 Online Resources
Part III. Appendixes
A. Information About This Book's Virtual Machine Experience
B. OAuth Primer
C. Python and Jupyter Notebook Tips and Tricks
Index