{"product_id":"social-media-data-mining-and-analytics-9781118824856","title":"Social Media Data Mining and Analytics","description":"\u003cb\u003eBook Synopsis\u003c\/b\u003e\u003cbr\u003eHarness the power of social media to predict customer behavior and improve sales. Social media is the biggest source of Big Data. Because of this, 90% of Fortune 500 companies are investing in Big Data initiatives that will help them predict consumer behavior to produce better sales results. Written by Dr.\u003cbr\u003e\u003cbr\u003e\u003cb\u003eTable of Contents\u003c\/b\u003e\u003cbr\u003e\u003cp\u003eIntroduction xvii\u003c\/p\u003e \u003cp\u003e\u003cb\u003eChapter 1 Users: The \u003ci\u003eWho\u003c\/i\u003e of Social Media 1\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003eMeasuring Variations in User Behavior in Wikipedia 2\u003c\/p\u003e \u003cp\u003eThe Diversity of User Activities 3\u003c\/p\u003e \u003cp\u003eThe Origin of the User Activity Distribution 12\u003c\/p\u003e \u003cp\u003eThe Consequences of the Power Law 20\u003c\/p\u003e \u003cp\u003eThe Long Tail in Human Activities 25\u003c\/p\u003e \u003cp\u003eLong Tails Everywhere: The 80\/20 Rule (\u003ci\u003ep\/q\u003c\/i\u003e Rule) 28\u003c\/p\u003e \u003cp\u003eOnline Behavior on Twitter 32\u003c\/p\u003e \u003cp\u003eRetrieving Tweets for Users 33\u003c\/p\u003e \u003cp\u003eLogarithmic Binning 36\u003c\/p\u003e \u003cp\u003eUser Activities on Twitter 37\u003c\/p\u003e \u003cp\u003eSummary 39\u003c\/p\u003e \u003cp\u003e\u003cb\u003eChapter 2 Networks: The \u003ci\u003eHow\u003c\/i\u003e of Social Media 41\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003eTypes and Properties of Social Networks 42\u003c\/p\u003e \u003cp\u003eWhen Users Create the Connections: Explicit Networks 43\u003c\/p\u003e \u003cp\u003eDirected Versus Undirected Graphs 45\u003c\/p\u003e \u003cp\u003eNode and Edge Properties 45\u003c\/p\u003e \u003cp\u003eWeighted Graphs 46\u003c\/p\u003e \u003cp\u003eCreating Graphs from Activities: Implicit Networks 48\u003c\/p\u003e 
\u003cp\u003eVisualizing Networks 51\u003c\/p\u003e \u003cp\u003eDegrees: The Winner Takes All 55\u003c\/p\u003e \u003cp\u003eCounting the Number of Connections 57\u003c\/p\u003e \u003cp\u003eThe Long Tail in User Connections 58\u003c\/p\u003e \u003cp\u003eBeyond the Idealized Network Model 62\u003c\/p\u003e \u003cp\u003eCapturing Correlations: Triangles, Clustering, and Assortativity 64\u003c\/p\u003e \u003cp\u003eLocal Triangles and Clustering 64\u003c\/p\u003e \u003cp\u003eAssortativity 70\u003c\/p\u003e \u003cp\u003eSummary 75\u003c\/p\u003e \u003cp\u003e\u003cb\u003eChapter 3 Temporal Processes: The \u003ci\u003eWhen\u003c\/i\u003e of Social Media 77\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003eWhat Traditional Models Tell You About Events in Time 77\u003c\/p\u003e \u003cp\u003eWhen Events Happen Uniformly in Time 79\u003c\/p\u003e \u003cp\u003eInter-Event Times 81\u003c\/p\u003e \u003cp\u003eComparing to a Memoryless Process 86\u003c\/p\u003e \u003cp\u003eAutocorrelations 89\u003c\/p\u003e \u003cp\u003eDeviations from Memorylessness 91\u003c\/p\u003e \u003cp\u003ePeriodicities in Time in User Activities 93\u003c\/p\u003e \u003cp\u003eBursty Activities of Individuals 99\u003c\/p\u003e \u003cp\u003eCorrelations and Bursts 105\u003c\/p\u003e \u003cp\u003eReservoir Sampling 106\u003c\/p\u003e \u003cp\u003eForecasting Metrics in Time 110\u003c\/p\u003e \u003cp\u003eFinding Trends 112\u003c\/p\u003e \u003cp\u003eFinding Seasonality 115\u003c\/p\u003e \u003cp\u003eForecasting Time Series with ARIMA 117\u003c\/p\u003e \u003cp\u003eThe Autoregressive Part (“AR”) 118\u003c\/p\u003e \u003cp\u003eThe Moving Average Part (“MA”) 119\u003c\/p\u003e \u003cp\u003eThe Full ARIMA(p, d, q) Model 119\u003c\/p\u003e \u003cp\u003eSummary 121\u003c\/p\u003e \u003cp\u003e\u003cb\u003eChapter 4 Content: The \u003ci\u003eWhat\u003c\/i\u003e of Social Media 123\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003eDefining Content: Focus on Text and Unstructured Data 123\u003c\/p\u003e 
\u003cp\u003eCreating Features from Text: The Basics of Natural Language Processing 125\u003c\/p\u003e \u003cp\u003eThe Basic Statistics of Term Occurrences in Text 128\u003c\/p\u003e \u003cp\u003eUsing Content Features to Identify Topics 129\u003c\/p\u003e \u003cp\u003eThe Popularity of Topics 138\u003c\/p\u003e \u003cp\u003eHow Diverse Are Individual Users’ Interests? 141\u003c\/p\u003e \u003cp\u003eExtracting Low-Dimensional Information from High-Dimensional Text 144\u003c\/p\u003e \u003cp\u003eTopic Modeling 145\u003c\/p\u003e \u003cp\u003eUnsupervised Topic Modeling 147\u003c\/p\u003e \u003cp\u003eSupervised Topic Modeling 155\u003c\/p\u003e \u003cp\u003eRelational Topic Modeling 162\u003c\/p\u003e \u003cp\u003eSummary 169\u003c\/p\u003e \u003cp\u003e\u003cb\u003eChapter 5 Processing Large Datasets 171\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003eMap Reduce: Structuring Parallel and Sequential Operations 172\u003c\/p\u003e \u003cp\u003eCounting Words 174\u003c\/p\u003e \u003cp\u003eSkew: The Curse of the Last Reducer 177\u003c\/p\u003e \u003cp\u003eMulti-Stage MapReduce Flows 179\u003c\/p\u003e \u003cp\u003eFan-Out 180\u003c\/p\u003e \u003cp\u003eMerging Data Streams 181\u003c\/p\u003e \u003cp\u003eJoining Two Data Sources 183\u003c\/p\u003e \u003cp\u003eJoining Against Small Datasets 186\u003c\/p\u003e \u003cp\u003eModels of Large-Scale MapReduce 187\u003c\/p\u003e \u003cp\u003ePatterns in MapReduce Programming 188\u003c\/p\u003e \u003cp\u003eStatic MapReduce Jobs 188\u003c\/p\u003e \u003cp\u003eIterative MapReduce Jobs 195\u003c\/p\u003e \u003cp\u003ePageRank for Ranking in Graphs 195\u003c\/p\u003e \u003cp\u003eK-means Clustering 199\u003c\/p\u003e \u003cp\u003eIncremental MapReduce Jobs 203\u003c\/p\u003e \u003cp\u003eTemporal MapReduce Jobs 204\u003c\/p\u003e \u003cp\u003eRollups and Data Cubing 205\u003c\/p\u003e \u003cp\u003eExpanding Rollup Jobs 211\u003c\/p\u003e \u003cp\u003eChallenges with Processing Long-Tailed Social Media Data 212\u003c\/p\u003e 
\u003cp\u003eSampling and Approximations: Getting Results with Less Computation 214\u003c\/p\u003e \u003cp\u003eHyperLogLog 217\u003c\/p\u003e \u003cp\u003eHyperLogLog Example 219\u003c\/p\u003e \u003cp\u003eHyperLogLog on the Stack Exchange Dataset 221\u003c\/p\u003e \u003cp\u003ePerformance of HLL on Large Datasets 222\u003c\/p\u003e \u003cp\u003eBloom Filters 223\u003c\/p\u003e \u003cp\u003eA Bloom Filter Example 226\u003c\/p\u003e \u003cp\u003eBloom Filter as Pre-Computed Membership Knowledge 228\u003c\/p\u003e \u003cp\u003eBloom Filters on Large Social Datasets 229\u003c\/p\u003e \u003cp\u003eCount-Min Sketch 231\u003c\/p\u003e \u003cp\u003eCount-Min Sketch—Heavy Hitters Example 233\u003c\/p\u003e \u003cp\u003eCount-Min Sketch—Top Percentage Example 235\u003c\/p\u003e \u003cp\u003eAggregating Approximate Data Structures 235\u003c\/p\u003e \u003cp\u003eSummary of Approximations 236\u003c\/p\u003e \u003cp\u003eExecuting on a Hadoop Cluster (Amazon EC2) 237\u003c\/p\u003e \u003cp\u003eInstalling a CDH Cluster on Amazon EC2 237\u003c\/p\u003e \u003cp\u003eProviding IAM Access to Collaborators 241\u003c\/p\u003e \u003cp\u003eAdding On-Demand Cluster Capabilities 242\u003c\/p\u003e \u003cp\u003eSummary 243\u003c\/p\u003e \u003cp\u003e\u003cb\u003eChapter 6 Learn, Map, and Recommend 245\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003eSocial Media Services Online 246\u003c\/p\u003e \u003cp\u003eSearch Engines 246\u003c\/p\u003e \u003cp\u003eContent Engagement 246\u003c\/p\u003e \u003cp\u003eInteractions with the Real World 248\u003c\/p\u003e \u003cp\u003eInteractions with People 249\u003c\/p\u003e \u003cp\u003eProblem Formulation 251\u003c\/p\u003e \u003cp\u003eLearning and Mapping 253\u003c\/p\u003e \u003cp\u003eMatrix Factorization 255\u003c\/p\u003e \u003cp\u003eLearning, Training 257\u003c\/p\u003e \u003cp\u003eUnder- and Overfitting 257\u003c\/p\u003e \u003cp\u003eRegularizing in Matrix Factorization 259\u003c\/p\u003e \u003cp\u003eNon-Negative Matrix Factorization 
and Sparsity 260\u003c\/p\u003e \u003cp\u003eDemonstration on Movie Ratings 261\u003c\/p\u003e \u003cp\u003eInterpreting the Learned Stereotypes 265\u003c\/p\u003e \u003cp\u003eExploratory Analysis 269\u003c\/p\u003e \u003cp\u003ePrediction and Recommendation 274\u003c\/p\u003e \u003cp\u003eEvaluation 277\u003c\/p\u003e \u003cp\u003eOverview of Methodologies 278\u003c\/p\u003e \u003cp\u003eNearest Neighbor-Based Approaches 278\u003c\/p\u003e \u003cp\u003eApproaches Based on Supervised Learning 280\u003c\/p\u003e \u003cp\u003ePredicting Movie Ratings with Logistic Regression 280\u003c\/p\u003e \u003cp\u003eCommon Issues with Features 288\u003c\/p\u003e \u003cp\u003eDomain-Specific Applications 289\u003c\/p\u003e \u003cp\u003eSummary 290\u003c\/p\u003e \u003cp\u003e\u003cb\u003eChapter 7 Conclusions 293\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003eThe Surprising Stability of Human Interaction Patterns 293\u003c\/p\u003e \u003cp\u003eAverages, Standard Deviations, and Sampling 296\u003c\/p\u003e \u003cp\u003eRemoving Outliers 303\u003c\/p\u003e \u003cp\u003eIndex 309\u003c\/p\u003e","brand":"John Wiley \u0026 Sons","offers":[{"title":"Default Title","offer_id":53186829844823,"sku":"9781118824856","price":999.99,"currency_code":"GBP","in_stock":false}],"url":"https:\/\/bookcurl.com\/products\/social-media-data-mining-and-analytics-9781118824856","provider":"Book Curl","version":"1.0","type":"link"}