Data warehousing Books

69 products


  • Data Pipelines Pocket Reference

    O'Reilly Media Data Pipelines Pocket Reference

    5 in stock

    Book Synopsis: Data pipelines are the foundation for success in data analytics. Moving data from numerous diverse sources and transforming it to provide context is the difference between having data and actually gaining value from it. This pocket reference defines data pipelines and explains how they work in today's modern data stack.

    £19.19

  • Fundamentals of Data Observability

    O'Reilly Media Fundamentals of Data Observability

    15 in stock

    £39.74

  • Pandas for Everyone

    Pearson Education (US) Pandas for Everyone

    15 in stock

    Book Synopsis: Daniel Chen is a graduate student in the Interdisciplinary PhD program in Genetics, Bioinformatics & Computational Biology (GBCB) at Virginia Polytechnic Institute and State University (Virginia Tech). He is involved with Software Carpentry as an instructor and Mentoring Committee Member, and currently serves as the Assessment Committee Chair. He completed his Master's in Public Health at Columbia University Mailman School of Public Health in Epidemiology with a certificate in Advanced Epidemiology, and is currently extending his Master's thesis work in the Social and Decision Analytics Laboratory under the Virginia Bioinformatics Institute on attitude diffusion in social networks.

    Table of Contents:
      Foreword by Anne M. Brown
      Foreword by Jared Lander
      Preface
      Changes in the Second Edition
      Part I: Introduction
      Chapter 1. Pandas DataFrame Basics
        Learning Objectives; 1.1 Introduction; 1.2 Load Your First Data Set; 1.3 Look at Columns, Rows, and Cells; 1.4 Grouped and Aggregated Calculations; 1.5 Basic Plot; Conclusion
      Chapter 2. Pandas Data Structures Basics
        Learning Objectives; 2.1 Create Your Own Data; 2.2 The Series; 2.3 The DataFrame; 2.4 Making Changes to Series and DataFrames; 2.5 Exporting and Importing Data; Conclusion
      Chapter 3. Plotting Basics
        Learning Objectives; 3.1 Why Visualize Data?; 3.2 Matplotlib Basics; 3.3 Statistical Graphics Using matplotlib; 3.4 Seaborn; 3.5 Pandas Plotting Method; Conclusion
      Chapter 4. Tidy Data
        Learning Objectives; Note About This Chapter; 4.1 Columns Contain Values, Not Variables; 4.2 Columns Contain Multiple Variables; 4.3 Variables in Both Rows and Columns; Conclusion
      Chapter 5. Apply Functions
        Learning Objectives; Note About This Chapter; 5.1 Primer on Functions; 5.2 Apply (Basics); 5.3 Vectorized Functions; 5.4 Lambda Functions (Anonymous Functions); Conclusion
      Part II: Data Processing
      Chapter 6. Data Assembly
        Learning Objectives; 6.1 Combine Data Sets; 6.2 Concatenation; 6.3 Observational Units Across Multiple Tables; 6.4 Merge Multiple Data Sets; Conclusion
      Chapter 7. Data Normalization
        Learning Objectives; 7.1 Multiple Observational Units in a Table (Normalization); Conclusion
      Chapter 8. Groupby Operations: Split-Apply-Combine
        Learning Objectives; 8.1 Aggregate; 8.2 Transform; 8.3 Filter; 8.4 The pandas.core.groupby.DataFrameGroupBy object; 8.5 Working with a MultiIndex; Conclusion
      Part III: Data Types
      Chapter 9. Missing Data
        Learning Objectives; 9.1 What Is a NaN Value?; 9.2 Where Do Missing Values Come From?; 9.3 Working with Missing Data; 9.4 Pandas Built-In NA Missing; Conclusion
      Chapter 10. Data Types
        Learning Objectives; 10.1 Data Types; 10.2 Converting Types; 10.3 Categorical Data; Conclusion
      Chapter 11. Strings and Text Data
        Introduction; Learning Objectives; 11.1 Strings; 11.2 String Methods; 11.3 More String Methods; 11.4 String Formatting (F-Strings); 11.5 Regular Expressions (RegEx); 11.6 The regex Library; Conclusion
      Chapter 12. Dates and Times
        Learning Objectives; 12.1 Python's datetime Object; 12.2 Converting to datetime; 12.3 Loading Data That Include Dates; 12.4 Extracting Date Components; 12.5 Date Calculations and Timedeltas; 12.6 Datetime Methods; 12.7 Getting Stock Data; 12.8 Subsetting Data Based on Dates; 12.9 Date Ranges; 12.10 Shifting Values; 12.11 Resampling; 12.12 Time Zones; 12.13 Arrow for Better Dates and Times; Conclusion
      Part IV: Data Modeling
      Chapter 13. Linear Regression (Continuous Outcome Variable)
        13.1 Simple Linear Regression; 13.2 Multiple Regression; 13.3 Models with Categorical Variables; 13.4 One-Hot Encoding in scikit-learn with Transformer Pipelines; Conclusion
      Chapter 14. Generalized Linear Models
        About This Chapter; 14.1 Logistic Regression (Binary Outcome Variable); 14.2 Poisson Regression (Count Outcome Variable); 14.3 More Generalized Linear Models; Conclusion
      Chapter 15. Survival Analysis
        15.1 Survival Data; 15.2 Kaplan Meier Curves; 15.3 Cox Proportional Hazard Model; Conclusion
      Chapter 16. Model Diagnostics
        16.1 Residuals; 16.2 Comparing Multiple Models; 16.3 k-Fold Cross-Validation; Conclusion
      Chapter 17. Regularization
        17.1 Why Regularize?; 17.2 LASSO Regression; 17.3 Ridge Regression; 17.4 Elastic Net; 17.5 Cross-Validation; Conclusion
      Chapter 18. Clustering
        18.1 k-Means; 18.2 Hierarchical Clustering; Conclusion
      Part V: Conclusion
      Chapter 19. Life Outside of Pandas
        19.1 The (Scientific) Computing Stack; 19.2 Performance; 19.3 Dask; 19.4 Siuba; 19.5 Ibis; 19.6 Polars; 19.7 PyJanitor; 19.8 Pandera; 19.9 Machine Learning; 19.10 Publishing; 19.11 Dashboards; Conclusion
      Chapter 20. It's Dangerous To Go Alone!
        20.1 Local Meetups; 20.2 Conferences; 20.3 The Carpentries; 20.4 Podcasts; 20.5 Other Resources; Conclusion
      Appendices
        A. Concept Maps; B. Installation and Setup; C. Command Line; D. Project Templates; E. Using Python; F. Working Directories; G. Environments; H. Install Packages; I. Importing Libraries; J. Code Style; K. Containers: Lists, Tuples, and Dictionaries; L. Slice Values; M. Loops; N. Comprehensions; O. Functions; P. Ranges and Generators; Q. Multiple Assignment; R. Numpy ndarray; S. Classes; T. SettingWithCopyWarning; U. Method Chaining; V. Timing Code; W. String Formatting; X. Conditionals (if-elif-else); Y. New York ACS Logistic Regression Example; Z. Replicating Results in R
      Index

    £34.19

  • Streaming Databases

    O'Reilly Media Streaming Databases

    15 in stock

    £47.99

  • Kimballs Data Warehouse Toolkit Classics 3 Volume

    John Wiley & Sons Inc Kimballs Data Warehouse Toolkit Classics 3 Volume

    15 in stock

    £99.00

  • Database Modeling and Design

    Elsevier Science Database Modeling and Design

    15 in stock

    Book Synopsis: How do you model and design your database application in consideration of new technology or new business needs? This title is loaded with design rules and case studies that are applicable to any SQL, UML, or XML-based system. It is useful to those tasked with the creation of data models for the integration of large-scale enterprise data.

    Trade Review:
      "Database Modeling and Design is one of the best books that I have seen for explaining how to build database applications. The book is informative, well-written, and concise." --Michael Blaha, DSc., Consultant, Modelsoft Consulting Corp
      "This book is by far the best book available on classic database design. Topics like normalization and many-to-many and n-ary association semantics are without peer in teaching you how to model real-world complexities. This latest edition extends the classic material with extensive discussion of modern tools and other aspects of logical database design. Every database architect should have this book at hand." --Bob Muller, Data Analyst, Poesys Associates
      "The book is not only good for beginners, but it also provides greater insight for experienced learners. Perhaps this is why it has evolved into its fifth edition. The book is generally well organized. It starts with the first step in the database life cycle, and progresses in a chronological order to more advanced concepts such as object relational design, Extensible Markup Language (XML), and Web databases. The writing style of the book is simple and straightforward, and the use of database terminology is very concise… In my opinion, the book could be used as a course text, with some help from other sources to cover SQL query-related concepts. However, I would have liked a chapter on SQL that covered simple and complex query design, as well as optimization." --Computing Reviews

    Table of Contents:
      1. Introduction
      2. The Entity-Relationship Model
      3. Unified Modeling Language (UML)
      4. Requirements Analysis and Conceptual Modeling
      5. Transforming the Conceptual Data Model to SQL
      6. Normalization
      7. An Example of Logical Database Design
      8. Object Relational Design
      9. XML and Web Databases
      10. Business Intelligence
      11. CASE Tools
      Appendix: The Basics of SQL

    £40.49

  • Foundational Python for Data Science

    Pearson Education (US) Foundational Python for Data Science

    15 in stock

    Book Synopsis: Kennedy Behrman is a veteran software and data engineer. He first used Python writing asset management systems in the visual effects industry. He then moved into the startup world, using Python at startups that applied machine learning to characterize videos and predict the social media power of athletes.

    Table of Contents:
      Preface
      I: Learning Python in a Notebook Environment
        1 Introduction to Notebooks
        2 Fundamentals of Python
        3 Sequences
        4 Other Data Structures
        5 Execution Control
        6 Functions
      II: Data Science Libraries
        7 NumPy
        8 SciPy
        9 Pandas
        10 Visualization Libraries
        11 Machine Learning Libraries
        12 Natural Language Toolkit
      III: Intermediate Python
        13 Functional Programming
        14 Object-Oriented Programming
        15 Other Topics
      A Answers to End-of-Chapter Questions
      Index

    £40.49

  • The Herschel Objects and How to Observe Them

    Springer-Verlag New York Inc. The Herschel Objects and How to Observe Them

    15 in stock

    Book Synopsis: Amateur astronomers are always on the lookout for new observing challenges. This is a practical guide to locating and viewing the most impressive of Herschel’s star clusters, nebulae and galaxies, cataloging more than 600 of the brightest objects, and offering detailed descriptions and images of 150 to 200 of the best.

    Trade Review:
      From the reviews:
      "Mullaney packs an incredible amount of information into this 166-page book. … All in all, The Herschel Objects, and how to observe them is engaging, challenging, well-written, and comprehensive. So, if you love deep-sky observing – and even if you’ve observed the Astronomical League’s Herschel 400 – Mullaney’s book offers a new list with several hundred additional objects you’ll enjoy." (Michael Bakich, Astronomy Magazine, October, 2007)
      "The Herschel Objects and How to Observe Them is a fine addition to the Springer series of observing guides. Mullaney has been observing the Herschel objects for many years and his passion for them clearly comes across. … Overall though, this is a book that will be a useful addition to any deep-sky observer’s library." (Paul Money, BBC Sky at Night, February, 2008)
      "Mullaney begins with a well-written biographical sketch of Herschel and his family, and explains the significance of the work of this great observational astronomer. … the objects are illustrated with excellent images obtained using a modern charge-coupled device (CCD) system. The book concludes with a list of 618 targets that would provide for a lifetime of study. The book will be of greatest interest to experienced observers who wish to push on to the most challenging deep sky objects. … Summing Up: Recommended. General readers." (D. E. Hogg, CHOICE, Vol. 45 (6), February, 2008)
      "The book opens with a few short chapters on Herschel himself together with a brief introduction to observing techniques … rounded out with some objects that the author regards as showpieces that were not discovered by Herschel. Any collection of these will of course be very subjective. … I found the book’s reproductions to be a cut above the usual Springer ones and the book does offers something sufficiently different … and the Astronomical League guides to make it worth adding to your collection." (Owen Brazell, The Observatory, Vol. 128 (1203), 2008)

    Table of Contents:
      William Herschel's Life, Telescopes and Catalogs
      Herschel's Telescopes
      Herschel's Catalogs and Classes
      Observing Techniques
      Exploring The Herschel Showpieces
      Showpieces of Class I
      Showpieces of Class IV
      Showpieces of Class V
      Showpieces of Class VI
      Showpieces of Class VII
      Showpieces of Class VIII
      Samples of Classes II & III
      Showpieces Missed by Herschel
      The “Missing” Herschel Objects
      Conclusion

    £23.74

  • The Microsoft Data Warehouse Toolkit With SQL

    John Wiley & Sons Inc The Microsoft Data Warehouse Toolkit With SQL

    15 in stock

    Book Synopsis: The techniques pioneered by the Kimball Group have become the industry standard for data warehouse design, development, and management. In this new edition of the Microsoft Data Warehouse Toolkit, the authors share best practices for using these techniques in SQL Server 2008 R2 and Office 2010.

    Table of Contents:
      Foreword
      Introduction
      Part 1 Requirements, Realities, and Architecture
      Chapter 1 Defining Business Requirements
        The Most Important Determinant of Long-Term Success; Adventure Works Cycles Introduction; Uncovering Business Value; Obtaining Sponsorship; Defining Enterprise-Level Business Requirements; Prioritizing the Business Requirements; Revisiting the Project Planning; Gathering Project-Level Requirements; Summary
      Chapter 2 Designing the Business Process Dimensional Model
        Dimensional Modeling Concepts and Terminology; Facts; Dimensions; Bringing Facts and Dimensions Together; The Bus Matrix, Conformed Dimensions, and Drill Across; Additional Design Concepts and Techniques; Surrogate Keys; Slowly Changing Dimensions; Dates; Degenerate Dimensions; Snowflaking; Many-to-Many or Multivalued Dimensions; Hierarchies; Aggregate Dimensions; Junk Dimensions; The Three Fact Table Types; Aggregates; The Dimensional Modeling Process; Preparation; Data Profiling and Research; Building Dimensional Models; Developing the Detailed Dimensional Model; Testing and Refining the Model; Reviewing and Validating the Model; Case Study: The Adventure Works Cycles Orders Dimensional Model; The Orders Fact Table; The Dimensions; Identifying Dimension Attributes and Facts for the Orders Business Process; The Final Draft of the Initial Orders Model; Detailed Orders Dimensional Model Development; Final Dimensional Model; Summary
      Chapter 3 The Toolset
        The Microsoft DW/BI Toolset; Why Use the Microsoft Toolset?; Architecture of a Microsoft DW/BI System; Why Analysis Services?; Why a Relational Store?; ETL Is Not Optional; The Role of Master Data Services; Delivering BI Applications; Overview of the Microsoft Tools; Which Products Do You Need?; SQL Server Development and Management Tools; Summary
      Chapter 4 System Setup
        System Sizing Considerations; Calculating Data Volumes; Determining Usage Complexity; Estimating Simultaneous Users; Assessing System Availability Requirements; How Big Will It Be?; System Configuration Considerations; Memory; Monolithic or Distributed?; Storage System Considerations; Processors; Setting Up for High Availability; Software Installation and Configuration; Development Environment Software Requirements; Test and Production Software Requirements; Operating Systems; SQL Server Relational Database Setup; Analysis Services Setup; Integration Services Setup; Reporting Services Setup; Summary
      Part 2 Building and Populating the Databases
      Chapter 5 Creating the Relational Data Warehouse
        Getting Started; Complete the Physical Design; Surrogate Keys; String Columns; To Null, or Not to Null?; Housekeeping Columns; Table and Column Extended Properties; Define Storage and Create Constraints and Supporting Objects; Create Files and Filegroups; Data Compression; Entity and Referential Integrity Constraints; Initial Indexing and Database Statistics; Aggregate Tables; Create Table Views; Insert an Unknown Member Row; Example CREATE TABLE Statement; Partitioned Tables; Finishing Up; Staging Tables; Metadata Setup; Summary
      Chapter 6 Master Data Management
        Managing Master Reference Data; Incomplete Attributes; Data Integration; Systems Integration; Master Data Management Systems and the Data Warehouse; Introducing SQL Server Master Data Services; Model Definition Features; Data Management Features; User Interface: Exploring and Managing the Master Data; Importing and Updating Data; Exporting Data; Full Versioning of All Attributes; Creating a Simple Application; Summary
      Chapter 7 Designing and Developing the ETL System
        Round Up the Requirements; Develop the ETL Plan; Introducing SQL Server Integration Services; Control Flow and Data Flow; SSIS Package Architecture; The Major Subsystems of ETL; Extracting Data; Subsystem 1: Data Profiling; Subsystem 2: Change Data Capture System; Subsystem 3: Extract System; Cleaning and Conforming Data; Subsystem 4: Data Cleaning System; Subsystem 5: Error Event Schema; Subsystem 6: Audit Dimension Assembler; Subsystem 7: Deduplication System; Subsystem 8: Conforming System; Delivering Data for Presentation; Subsystem 9: Slowly Changing Dimension Manager; Subsystem 10: Surrogate Key Generator; Subsystem 11: Hierarchy Manager; Subsystem 12: Special Dimensions Manager; Subsystem 13: Fact Table Builders; Subsystem 14: Surrogate Key Pipeline; Subsystem 15: Multi-Valued Dimension Bridge Table Builder; Subsystem 16: Late Arriving Data Handler; Subsystem 17: Dimension Manager; Subsystem 18: Fact Provider System; Subsystem 19: Aggregate Builder; Subsystem 20: OLAP Cube Builder; Subsystem 21: Data Propagation Manager; Managing the ETL Environment; Summary
      Chapter 8 The Core Analysis Services OLAP Database
        Overview of Analysis Services OLAP; Why Use Analysis Services?; Why Not Analysis Services?; Designing the OLAP Structure; Planning; Getting Started; Create a Project and a Data Source View; Dimension Designs; Creating and Editing Dimensions; Creating and Editing the Cube; Physical Design Considerations; Understanding Storage Modes; Developing the Partitioning Plan; Designing Performance Aggregations; Planning for Deployment; Processing the Full Cube; Developing the Incremental Processing Plan; Summary
      Chapter 9 Design Requirements for Real-Time BI
        Real-Time Triage; What Does Real-Time Mean?; Who Needs Real Time?; Real-Time Tradeoffs; Scenarios and Solutions; Executing Reports in Real Time; Serving Reports from a Cache; Creating an ODS with Mirrors and Snapshots; Creating an ODS with Replication; Building a BizTalk Application; Building a Real-Time Relational Partition; Querying Real-Time Data in the Relational Database; Using Analysis Services to Query Real-Time Data; Summary
      Part 3 Developing the BI Applications
      Chapter 10 Building BI Applications in Reporting Services
        A Brief Overview of BI Applications; Types of BI Applications; The Value of Business Intelligence Applications; A High-Level Architecture for Reporting; Reviewing Business Requirements for Reporting; Examining the Reporting Services Architecture; Using Reporting Services as a Standard Reporting Tool; Reporting Services Assessment; The Reporting System Design and Development Process; Reporting System Design; Reporting System Development; Building and Delivering Reports; Planning and Preparation; Creating Reports; Reporting Operations; Ad Hoc Reporting Options; The Report Model; Shared Datasets; Report Parts; Summary
      Chapter 11 PowerPivot and Excel
        Using Excel for Analysis and Reporting; The PowerPivot Architecture: Excel on Steroids; Creating and Using PowerPivot Databases; Getting Started; PowerPivot Table Design; Creating Analytics with PowerPivot; Observations and Guidelines on PowerPivot for Excel; PowerPivot for SharePoint; The PowerPivot SharePoint User Experience; Server-Level Resources; PowerPivot Monitoring and Management; PowerPivot’s Role in a Managed DW/BI Environment; Summary
      Chapter 12 The BI Portal and SharePoint
        The BI Portal; Planning the BI Portal; Impact on Design; Business Process Categories; Additional Functions; Building the BI Portal; Using SharePoint as the BI Portal; Architecture and Concepts; Setting Up SharePoint; Summary
      Chapter 13 Incorporating Data Mining
        Defining Data Mining; Basic Data Mining Terminology; Business Uses of Data Mining; Roles and Responsibilities; SQL Server Data Mining Architecture Overview; The Data Mining Design Environment; Build, Deploy, and Process; Accessing the Mining Models; Integration Services and Data Mining; Additional Features; Architecture Summary; Microsoft Data Mining Algorithms; Decision Trees; Naïve Bayes; Clustering; Sequence Clustering; Time Series; Association; Neural Network; The Data Mining Process; The Business Phase; The Data Mining Phase; The Operations Phase; Metadata; Data Mining Examples; Case Study: Categorizing Cities; Case Study: Product Recommendations; Summary
      Part 4 Deploying and Managing the DW/BI System
      Chapter 14 Designing and Implementing Security
        Identifying the Security Manager; Securing the Hardware and Operating System; Securing the Operating System; Using Windows Integrated Security; Securing the Development Environment; Securing the Data; Providing Open Access for Internal Users; Itemizing Sensitive Data; Securing Various Types of Data Access; Securing the Components of the DW/BI System; Reporting Services Security; Analysis Services Security; Relational DW Security; Integration Services Security; Usage Monitoring; Summary
      Chapter 15 Metadata Plan
        Metadata Basics; The Purpose of Metadata; Metadata Categories; The Metadata Repository; Metadata Standards; SQL Server 2008 R2 Metadata; Cross-Tool Components; Relational Engine Metadata; Analysis Services; Integration Services; Reporting Services; Master Data Services; SharePoint; External Metadata Sources; Looking to the Future; A Practical Metadata Approach; Creating the Metadata Strategy; Business Metadata Reporting; Process Metadata Reporting; Technical Metadata Reporting; Ongoing Metadata Management; Summary
      Chapter 16 Deployment
        Setting Up the Environments; Testing; Development Testing; System Testing; Data Quality Assurance Testing; Performance Testing; Usability Testing; Testing Summary; Deploying to Production; Relational Database Deployment; Integration Services Package Deployment; Analysis Services Database Deployment; Reporting Services Report Deployment; Master Data Services Deployment; Data Warehouse and BI Documentation; Core Descriptions; Additional Documentation; User Training; User Support; Desktop Readiness and Configuration; Summary
      Chapter 17 Operations and Maintenance
        Providing User Support; Maintaining the BI Portal; Extending the BI Applications; System Management; Governing the DW/BI System; Performance Monitoring; Usage Monitoring; Managing Disk Space; Service and Availability Management; Performance Tuning the DW/BI System; Backup and Recovery; Executing the ETL Packages; Summary
      Chapter 18 Present Imperatives and Future Outlook
        Growing the DW/BI System; Lifecycle Review with Common Problems; Phase I — Requirements, Realities, Plans, and Designs; Phase II — Developing the Databases; Phase III — Developing the BI Applications and Portal Environment; Phase IV — Deploying and Managing the DW/BI System; Iteration and Growth; What We Like in the Microsoft BI Toolset; Future Directions: Room for Improvement; Conclusion
      Index

    £36.09

  • Dark Data

    Princeton University Press Dark Data

    7 in stock

    Trade Review:
      "[A] penetrating study of missing (‘dark’) data and its impacts on decisions—skewing stats, enabling fraud, embedding inequity and triggering preventable catastrophes. Advocating ‘data science judo,’ Hand offers expert training, from recognizing when facts are being cherry-picked to designing randomized trials. A book illuminating shadowed corners in science, medicine and policy." --Barbara Kiser, Nature
      "A tour de force. . . . Hand is a good and able guide to take us through the many aspects of dark data that are potentially skewing our understanding of real world observations and potential scientific breakthroughs. He writes in an accessible and understandable way too." --Simon Cocking, Irish Tech News
      "Well-written and accessible." --Tim Harford, Undercover Economist
      "You need to read [Dark Data], and be convinced by David’s reasoning and his examples of cases in which unseen or unreported data play a critical and sometimes even a fatal role. You are likely to walk away with the feeling that the term dark data is indeed a very effective one to arouse both curiosity and suspicion, mixed with happiness that finally a great term was coined by a statistician—and sadness that the statistician is not you." --Xiao-Li Meng, IMS Bulletin
      "An exploration of a major problem in data analysis with an attempt of classification, analysing causes, mechanisms, and to some extent also suggest mitigations." --Adhemar Bultheel, European Mathematical Society
      "An excellent guide to the many reasons for caution in interpreting data." --Diane Coyle, Enlightened Economist

    £21.25

  • Dark Data

    Princeton University Press Dark Data

    15 in stock

    Trade Review:
      "[A] penetrating study of missing (‘dark’) data and its impacts on decisions—skewing stats, enabling fraud, embedding inequity and triggering preventable catastrophes. Advocating ‘data science judo,’ Hand offers expert training, from recognizing when facts are being cherry-picked to designing randomized trials. A book illuminating shadowed corners in science, medicine and policy." --Barbara Kiser, Nature
      "A tour de force. . . . Hand is a good and able guide to take us through the many aspects of dark data that are potentially skewing our understanding of real world observations and potential scientific breakthroughs. He writes in an accessible and understandable way too." --Simon Cocking, Irish Tech News
      "Well-written and accessible." --Tim Harford, Undercover Economist
      "You need to read [Dark Data], and be convinced by David’s reasoning and his examples of cases in which unseen or unreported data play a critical and sometimes even a fatal role. You are likely to walk away with the feeling that the term dark data is indeed a very effective one to arouse both curiosity and suspicion, mixed with happiness that finally a great term was coined by a statistician—and sadness that the statistician is not you." --Xiao-Li Meng, IMS Bulletin
      "An exploration of a major problem in data analysis with an attempt of classification, analysing causes, mechanisms, and to some extent also suggest mitigations." --Adhemar Bultheel, European Mathematical Society
      "An excellent guide to the many reasons for caution in interpreting data." --Diane Coyle, Enlightened Economist

    £15.29

  • Efficient MySQL Performance

    O'Reilly Media Efficient MySQL Performance

    2 in stock

    Book Synopsis: This practical book bridges the gap by teaching software engineers mid-level MySQL knowledge beyond the fundamentals, but well shy of the deep-level internals required by database administrators (DBAs).

    £39.74

  • Learning Google Analytics

    O'Reilly Media Learning Google Analytics

    15 in stock

    Book Synopsis: Author Mark Edmondson, Google Developer Expert for Google Analytics and Google Cloud, provides a concise yet comprehensive overview of GA4 and its cloud integrations.

    £39.74

  • The Cloud Data Lake

    O'Reilly Media The Cloud Data Lake

    7 in stock

    Book Synopsis: Author Rukmani Gopalan, a product management leader and data enthusiast, guides data architects and engineers through the major aspects of working with a cloud data lake, from design considerations and best practices to data format optimizations, performance optimization, cost management, and governance.

    £39.74

  • Amazon Redshift The Definitive Guide

    O'Reilly Media Amazon Redshift The Definitive Guide

    5 in stock

    Book Synopsis: This practical guide thoroughly examines this managed service and demonstrates how you can use it to extract value from your data immediately, rather than go through the heavy lifting required to run a typical data warehouse.

    £47.99

  • How To Make Things Faster

    O'Reilly Media How To Make Things Faster

    15 in stock

    Book Synopsis: This book explains in a clear and thoughtful voice why systems perform the way they do. It's for anybody who's curious about how computer programs and other processes use their time and about what you can do to improve them.

    £33.74

  • Data Smart

    John Wiley & Sons Inc Data Smart

    15 in stock

    Book SynopsisTable of Contents
    Introduction
    1 Everything You Ever Needed to Know About Spreadsheets but Were Too Afraid to Ask; Some Sample Data; Accessing Quick Descriptive Statistics; Excel Tables; Filtering and Sorting; Table Formatting; Structured References; Adding Table Columns; Lookup Formulas; VLOOKUP; INDEX/MATCH; XLOOKUP; PivotTables; Using Array Formulas; Solving Stuff with Solver
    2 Set It and Forget It: An Introduction to Power Query; What Is Power Query?; Sample Data; Starting Power Query; Filtering Rows; Removing Columns; Find & Replace; Close & Load to Table
    3 Naïve Bayes and the Incredible Lightness of Being an Idiot; The World's Fastest Intro to Probability Theory; Totaling Conditional Probabilities; Joint Probability, the Chain Rule, and Independence; What Happens in a Dependent Situation?; Bayes Rule; Separating the Signal and the Noise; Using the Bayes Rule to Create an AI Model; High-Level Class Probabilities Are Often Assumed to Be Equal; A Couple More Odds and Ends; Let's Get This Excel Party Started; Cleaning the Data with Power Query; Splitting on Spaces: Giving Each Word Its Due; Counting Tokens and Calculating Probabilities; We Have a Model! Let's Use It
    4 Cluster Analysis Part 1: Using K-Means to Segment Your Customer Base; Dances at Summer Camp; Getting Real: K-Means Clustering Subscribers in Email Marketing; The Initial Dataset; Determining What to Measure; Start with Four Clusters; Euclidean Distance: Measuring Distances as the Crow Flies; Solving for the Cluster Centers; Making Sense of the Results; Getting the Top Deals by Cluster; The Silhouette: A Good Way to Let Different K Values Duke It Out; How About Five Clusters?; Solving for Five Clusters; Getting the Top Deals for All Five Clusters; Computing the Silhouette for 5-Means Clustering; K-Medians Clustering and Asymmetric Distance Measurements; Using K-Medians Clustering; Getting a More Appropriate Distance Metric; Putting It All in Excel; The Top Deals for the 5-Medians Clusters
    5 Cluster Analysis Part II: Network Graphs and Community Detection; What Is a Network Graph?; Visualizing a Simple Graph; Beyond GiGraph and Adjacency Lists; Building a Graph from the Wholesale Wine Data; Creating a Cosine Similarity Matrix; Producing an R-Neighborhood Graph; Introduction to Gephi; Creating a Static Adjacency Matrix; Bringing in Your R-Neighborhood Adjacency Matrix into Gephi; Node Degree; Touching the Graph Data; How Much Is an Edge Worth? Points and Penalties in Graph Modularity; What's a Point, and What's a Penalty?; Setting Up the Score Sheet; Let's Get Clustering!; Split Number 1; Split 2: Electric Boogaloo; And... Split 3: Split with a Vengeance; Encoding and Analyzing the Communities; There and Back Again: A Gephi Tale
    6 Regression: The Granddaddy of Supervised Artificial Intelligence; Predicting Pregnant Customers at RetailMart Using Linear Regression; The Feature Set; Assembling the Training Data; Creating Dummy Variables; Let's Bake Our Own Linear Regression; Linear Regression Statistics: R-Squared, F-Tests, t-Tests; Making Predictions on Some New Data and Measuring Performance; Predicting Pregnant Customers at RetailMart Using Logistic Regression; First You Need a Link Function; Hooking Up the Logistic Function and Reoptimizing; Baking an Actual Logistic Regression
    7 Ensemble Models: A Whole Lot of Bad Pizza; Getting Started Using the Data from Chapter 6; Bagging: Randomize, Train, Repeat; Decision Stump is Another Name for a Weak Learner; Doesn't Seem So Weak to Me!; You Need More Power!; Let's Train It; Evaluating the Bagged Model; Boosting: If You Get It Wrong, Just Boost and Try Again; Training the Model: Every Feature Gets a Shot; Evaluating the Boosted Model
    8 Forecasting: Breathe Easy: You Can't Win; The Sword Trade Is Hopping; Getting Acquainted with Time-Series Data; Starting Slow with Simple Exponential Smoothing; Setting Up the Simple Exponential Smoothing Forecast; You Might Have a Trend; Holt's Trend-Corrected Exponential Smoothing; Setting Up Holt's Trend-Corrected Smoothing in a Spreadsheet; So Are You Done? Looking at Autocorrelations; Multiplicative Holt-Winters Exponential Smoothing; Setting the Initial Values for Level, Trend, and Seasonality; Getting Rolling on the Forecast; And... Optimize!; Putting a Prediction Interval Around the Forecast; Creating a Fan Chart for Effect; Forecast Sheets in Excel
    9 Optimization Modeling: Because That "Fresh-Squeezed" Orange Juice Ain't Gonna Blend Itself; Wait Is This Data Science?; Starting with a Simple Trade-Off; Representing the Problem as a Polytope; Solving by Sliding the Level Set; The Simplex Method: Rooting Around the Corners; Working in Excel; Fresh from the Grove to Your Glass with a Pit Stop Through a Blending Model; Let's Start with Some Specs; Coming Back to Consistency; Putting the Data into Excel; Setting Up the Problem in Solver; Lowering Your Standards; Dead Squirrel Removal: the Minimax Formulation; If-Then and the "Big M" Constraint; Multiplying Variables: Cranking Up the Volume to 11,000; Modeling Risk; Normally Distributed Data
    10 Outlier Detection: Just Because They're Odd Doesn't Mean They're Unimportant; Outliers Are (Bad?) People, Too; The Fascinating Case of Hadlum v Hadlum; Tukey's Fences; Applying Tukey's Fences in a Spreadsheet; The Limitations of This Simple Approach; Terrible at Nothing, Bad at Everything; Preparing Data for Graphing; Creating a Graph; Getting the k-Nearest Neighbors; Graph Outlier Detection Method 1: Just Use the Indegree; Graph Outlier Detection Method 2: Getting Nuanced with k-Distance; Graph Outlier Detection Method 3: Local Outlier Factors Are Where It's At
    11 Moving on From Spreadsheets; Getting Up and Running with R; A Crash Course in R-ing; Show Me the Numbers! Vector Math and Factoring; The Best Data Type of Them All: the Dataframe; How to Ask for Help in R; It Gets Even Better Beyond Base R; Doing Some Actual Data Science; Reading Data into R; Spherical K-Means on Wine Data in Just a Few Lines; Building AI Models on the Pregnancy Data; Forecasting in R; Looking at Outlier Detection
    12 Conclusion; Where Am I? What Just Happened?; Before You Go-Go; Get to Know the Problem; We Need More Translators; Beware the Three-Headed Geek-Monster: Tools, Performance, and Mathematical Perfection; You Are Not the Most Important Function of Your Organization; Get Creative and Keep in Touch!
    Index

    15 in stock

    £28.49

  • The Enterprise Big Data Framework

    Kogan Page Ltd The Enterprise Big Data Framework

    15 in stock

    Book SynopsisJan-Willem Middelburg is a Dutch entrepreneur and author with a passion for technology and innovation. He is the CEO and co-founder of Cybiant, a global technology company that helps to create a more sustainable world through analytics, big data and automation. He is also President and Chief Examiner of the Enterprise Big Data Framework, an independent organization dedicated to upskilling individuals with expertise in Big Data. In partnership with APMG-International, the Enterprise Big Data Framework offers vendor-neutral certifications for individuals.Trade Review"The Enterprise Big Data Framework is relevant for everybody within an organisation engaged in driving maximum benefits from data. There is something for everybody; from the board considering governance and ethical behaviour to individuals within the organisation knowing where they fit and the value they can get from better use of their organisation's data. If you are considering a transformation project, this is an excellent guide for your project team." * Richard Pharro, CEO, The APM Group Limited *"If you are looking for a good guide to empower your knowledge on big data and to find a framework to help you on your big data journey, then this book is for you. From learning what big data is to defining a big data strategy, Jan-Willem has built a book to empower the learner on the topic of big data." * Jordan Morrow, Chief Strategy & Transformation Officer, DataPrime and Author of Be Data Literate *"This book is a master piece for those who are familiar and those who discover the world of data. It provides an "a la carte framework" starting with a (big) data strategy and the supporting aspects such as big data functions, architecture and algorithms. It covers in depth data platforms architectures, its management as well as data governance, data catalogue and all the required security considerations associated to the various data classifications.
You will find details of data life cycle management, of various machine learning algorithms and an important chapter covering AI ethics when building and deploying sophisticated algorithms using data. The concepts covered in this book apply to on-premises and in the (public) cloud environments, making this book a must read." * Jean-Michel Coeur, APAC Technology Practice Lead, Data Analytics, Google Cloud *Table of Contents Section - ONE: Introduction to Big Data; Chapter - 01: Introduction to Big Data; Chapter - 02: The Big Data framework; Chapter - 03: Big Data strategy; Chapter - 04: Big Data architecture; Chapter - 05: Big Data algorithms; Chapter - 06: Big Data processes; Chapter - 07: Big Data functions; Chapter - 08: Artificial intelligence; Section - TWO: Enterprise Big Data analysis; Chapter - 09: Introduction to Big Data analysis; Chapter - 10: Defining the business objective; Chapter - 11: Data ingestion – importing and reading data sets; Chapter - 12: Data preparation – cleaning and wrangling data; Chapter - 13: Data analysis – model building; Chapter - 14: Data presentation; Section - THREE: Enterprise Big Data engineering; Chapter - 15: Introduction to Big Data engineering; Chapter - 16: Data modelling; Chapter - 17: Constructing the data lake; Chapter - 18: Building an enterprise Big Data warehouse; Chapter - 19: Design and structure of Big Data pipelines; Chapter - 20: Managing data pipelines; Chapter - 21: Cluster technology; Section - FOUR: enterprise Big Data algorithm design; Chapter - 22: Introduction to Big Data algorithm design; Chapter - 23: Algorithm design – fundamental concepts; Chapter - 24: Statistical machine learning algorithms; Chapter - 25: The data science roadmap; Chapter - 26: Programming languages 26 visualization and simple metrics; Chapter - 27: Advanced machine learning algorithms; Chapter - 28: Advanced machine learning classification algorithms; Chapter - 29: Technical communication and documentation; Section - FIVE: 
Enterprise Big Data architecture; Chapter - 30: Introduction to the Big Data architecture; Chapter - 31: Strength and resilience – the Big Data platform; Chapter - 32: Design principles for Big Data architecture; Chapter - 33: Big Data infrastructure; Chapter - 34: Big Data platforms; Chapter - 35: The Big Data application provider; Chapter - 36: System orchestration in Big Data

    15 in stock

    £148.50

  • Managing and Mining Graph Data 40 Advances in Database Systems

    Springer Us Managing and Mining Graph Data 40 Advances in Database Systems

    15 in stock

    Book SynopsisManaging and Mining Graph Data is a comprehensive survey book in graph management and mining. It contains extensive surveys on a variety of important graph topics such as graph languages, indexing, clustering, data generation, pattern mining, classification, keyword search, pattern matching, and privacy.Trade ReviewFrom the reviews:“This book provides a survey of some recent advances in graph mining. It contains chapters on graph languages, indexing, clustering, pattern mining, keyword search, and pattern matching. … The book is targeted at advanced undergraduate or graduate students, faculty members, and researchers from both industry and academia. … I highly recommend this book to someone who is starting to explore the field of graph mining or wants to delve deeper into this exciting field.” (Dimitrios Katsaros, ACM Computing Reviews, December, 2010)Table of ContentsAn Introduction to Graph Data.- Graph Data Management and Mining: A Survey of Algorithms and Applications.- Graph Mining: Laws and Generators.- Query Language and Access Methods for Graph Databases.- Graph Indexing.- Graph Reachability Queries: A Survey.- Exact and Inexact Graph Matching: Methodology and Applications.- A Survey of Algorithms for Keyword Search on Graph Data.- A Survey of Clustering Algorithms for Graph Data.- A Survey of Algorithms for Dense Subgraph Discovery.- Graph Classification.- Mining Graph Patterns.- A Survey on Streaming Algorithms for Massive Graphs.- A Survey of Privacy-Preservation of Graphs and Social Networks.- A Survey of Graph Mining for Web Applications.- Graph Mining Applications to Social Network Analysis.- Software-Bug Localization with Graph Mining.- A Survey of Graph Mining Techniques for Biological Datasets.- Trends in Chemical Graph Data Mining.

    15 in stock

    £189.99

  • Principles of Data Mining

    Springer Principles of Data Mining

    15 in stock

    Book SynopsisTable of ContentsIntroduction to Data Mining.- Data for Data Mining.- Introduction to Classification: Naïve Bayes and Nearest Neighbour.- Using Decision Trees for Classification.- Decision Tree Induction: Using Entropy for Attribute Selection.- Decision Tree Induction: Using Frequency Tables for Attribute Selection.- Estimating the Predictive Accuracy of a Classifier.- Continuous Attributes.- Avoiding Overfitting of Decision Trees.- More About Entropy.- Inducing Modular Rules for Classification.- Measuring the Performance of a Classifier.- Dealing with Large Volumes of Data.- Ensemble Classification.- Comparing Classifiers.- Association Rule Mining I.- Association Rule Mining II.- Association Rule Mining III.- Clustering.- Mining.- Classifying Streaming Data.- Classifying Streaming Data II: Time-dependent Data.- An Introduction to Neural Networks.- Appendix A – Essential Mathematics.- Appendix B – Datasets.- Appendix C – Sources of Further Information.- Appendix D – Glossary and Notation.- Appendix E – Solutions to Self-assessment Exercises.- Index.

    15 in stock

    £29.99

  • Oracle Database Upgrade and Migration Methods

    APress Oracle Database Upgrade and Migration Methods

    2 in stock

    Book Synopsis Learn all of the available upgrade and migration methods in detail to move to Oracle Database version 12c. You will become familiar with database upgrade best practices to complete the upgrade in an effective manner and understand the Oracle Database 12c patching process. So it's time to upgrade Oracle Database to version 12c and you need to choose the appropriate method while considering issues such as downtime. This book explains all of the available upgrade and migration methods so you can choose the one that suits your environment. You will be aware of the practical issues and proactive measures to take to upgrade successfully and reduce unexpected issues. With every release of Oracle Database there are new features and fixes to bugs identified in previous versions. As each release becomes obsolete, existing databases need to be upgraded. Oracle Database Upgrade and Migration Methods explains each method along Table of Contents
    PART I: packetC Background
    CHAPTER 1: Getting Started; Introduction to Database upgrade; Necessities of Database upgrade; Benefits of Database upgrade; Hurdles that affect Database upgrade decision; Types of Database upgrade; Things to consider before upgrade; Engineers involved in upgrade activity; Upgrade compatibility matrix; Best practices of Database upgrade; Database Migration; Situations demand Migration; Things to consider before migration; Summary
    PART II: Language Reference
    CHAPTER 2: Database Upgrade methods; DBUA; Manual/Command line upgrade; Export/Import; Datapump; Transportable Tablespace; Golden Gate; Create Table as Select (CTAS); Transient Logical Standby; Full Transportable Tablespace; Summary
    CHAPTER 3: Comparison between upgrade methods; Comparison between methods; Real Application testing (RAT); How to choose best Upgrade method; Summary
    CHAPTER 4: Upgrade using Database backup; Cold backup; Hot backup (User-Managed); Logical backup (expdp/impdp); RMAN backup (using duplicate option); Summary
    CHAPTER 5: Database Migration methods; Export/Import; Datapump; Transportable Tablespace (TTS); Golden Gate; Copy table as select (CTAS); Transport Database; Heterogeneous Standby database; Oracle Streams; Summary
    CHAPTER 6: Migration of Oracle database from Non-ASM to ASM; Introduction; Moving Datafiles Online from NON-ASM to ASM; Migrating Oracle 12c CDB with PDBs from NON ASM to ASM using EM Cloud Control 13c; Migrating Oracle 12c CDB with PDBs from NON ASM to ASM using RMAN; Summary
    CHAPTER 7: GI and DB upgrade in RAC environment; Introduction; CVU Pre-Upgrade Check tool; Execution Steps for ORAchk; Rolling upgrade for Oracle GI; Upgrading 11g RAC to 12c RAC using DBUA; Upgrading 11g RAC to 12c RAC Manual; Upgrading 11g RAC to 12c RAC using EM 13c; Summary
    PART III: Developing Applications
    CHAPTER 8: Database upgrade in DG environment; Summary
    CHAPTER 9: Database upgrade in EBS environment; Prerequisite steps; Preupgrade steps; Upgrade steps; Post upgrade steps; Summary
    CHAPTER 10: Database upgrade in 12c Multitenant environment; Migrate lower version database to Multitenant architecture; Container database upgrade; Pluggable database upgrade; Summary
    CHAPTER 11: Databases migrate in Multitenant environment; Pluggable database migrate; Need for Migrate; Migration steps; Summary
    CHAPTER 12: Oracle Database Patching Strategies; Patching Introduction; Opatch tool; Types of patches; Patch apply strategies (online and offline patching); PSU and SPU patching; Patch apply in RAC and DG environment; Datapatch; Queryable patch inventory; Summary
    CHAPTER 13: Database Downgrade; Introduction; Limitations of Oracle database downgrade; Database downgrade steps; Downgrade using database flashback; Summary
    CHAPTER 14: Database upgrade in 12.2; Preupgrade checks; Upgrade Emulation; DBUA; Manual Database upgrade; Pluggable database upgrade; Downgrade 12.2 database to earlier version; Summary
    APPENDIX A: Reference Tables
    APPENDIX B
    INDEX

    2 in stock

    £44.99

  • Learning to Love Data Science

    O'Reilly Media Learning to Love Data Science

    1 in stock

    Book SynopsisToday, big data is taken seriously, and data science is considered downright sexy. With this anthology of reports from award-winning journalist Mike Barlow, you'll appreciate how data science is fundamentally altering our world, for better and for worse.

    1 in stock

    £15.99

  • 97 Things Every Data Engineer Should Know

    O'Reilly Media 97 Things Every Data Engineer Should Know

    1 in stock

    Book SynopsisWith this in-depth book, current and aspiring engineers will learn powerful, real-world best practices for managing data big and small. Contributors from notable companies including Twitter, Google, StitchFix, Microsoft and Capital One share their experiences and lessons learned for overcoming a variety of specific and often nagging challenges.

    1 in stock

    £31.99

  • Cost-Effective Data Pipelines

    O'Reilly Media Cost-Effective Data Pipelines

    15 in stock

    Book SynopsisWith this practical guide, author Sev Leonard provides a holistic approach to designing scalable data pipelines in the cloud. Intermediate data engineers, software developers, and architects will learn how to navigate cost/performance trade-offs and how to choose and configure compute and storage.

    15 in stock

    £39.74

  • Data Storage: Systems, Management & Security

    Nova Science Publishers Inc Data Storage: Systems, Management & Security

    1 in stock

    Book Synopsis

    1 in stock

    £78.39

  • The Chief Data Officer Handbook for Data

    MC Press, LLC The Chief Data Officer Handbook for Data

    15 in stock

    Book SynopsisA practical guide for today’s chief data officers to define and manage data governance programs. The relatively new role of chief data officer (CDO) has been created to address the issue of managing a company’s data as a strategic asset, but the problem is that there is no universally accepted “playbook” for this role. Magnifying the challenge is the rapidly increasing volume and complexity of data, as well as regulatory compliance as it relates to data. In this book, Sunil Soares provides a practical guide for today’s chief data officers to manage data as an asset while delivering the trusted data required to power business initiatives, from the tactical to the transformative. The guide describes the relationship between the CDO and the data governance team, whose task is the formulation of policy to optimize, secure, and leverage information as an enterprise asset by aligning the objectives of multiple functions. Soares provides unique insight into the role of the CDO and presents a blueprint for implementing data governance successfully within the context of the position. With practical advice CDOs need, this book helps establish new data governance practices or mature existing practices.

    15 in stock

    £14.20

  • Graph Databases in Action

    Manning Publications Graph Databases in Action

    1 in stock

    Book SynopsisGraph Databases in Action teaches readers everything they need to know to begin building and running applications powered by graph databases. Right off the bat, seasoned graph database experts introduce readers to just enough graph theory, the graph database ecosystem, and a variety of datastores. They also explore modelling basics in action with real-world examples, then go hands-on with querying, coding traversals, parsing results, and other essential tasks as readers build their own graph-backed social network app complete with a recommendation engine! Key Features · Graph database fundamentals · An overview of the graph database ecosystem · Relational vs. graph database modelling · Querying graphs using Gremlin · Real-world common graph use cases For readers with basic Java and application development skills building in RDBMS systems such as Oracle, SQL Server, MySQL, and Postgres. No experience with graph databases is required. About the technology Graph databases store interconnected data in a more natural form, making them superior tools for representing data with rich relationships. Unlike in relational database management systems (RDBMS), where a more rigid view of data connections results in the loss of valuable insights, in graph databases, data connections are first priority. Dave Bechberger has extensive experience using graph databases as a product architect and a consultant. He’s spent his career leveraging cutting-edge technologies to build software in complex data domains such as bioinformatics, oil and gas, and supply chain management. He’s an active member of the graph community and has presented on a wide variety of graph-related topics at national and international conferences. Josh Perryman is a technologist with over two decades of diverse experience building and maintaining complex systems, including high performance computing (HPC) environments.
Since 2014 he has focused on graph databases, especially in distributed or big data environments, and he regularly blogs and speaks at conferences about graph databases.

    1 in stock

    £35.99

  • Algorithms and Data Structures for Massive

    Manning Publications Algorithms and Data Structures for Massive

    10 in stock

    Book SynopsisMassive modern datasets make traditional data structures and algorithms grind to a halt. This fun and practical guide introduces cutting-edge techniques that can reliably handle even the largest distributed datasets. In Algorithms and Data Structures for Massive Datasets you will learn: probabilistic sketching data structures for practical problems; choosing the right database engine for your application; evaluating and designing efficient on-disk data structures and algorithms; understanding the algorithmic trade-offs involved in massive-scale systems; deriving basic statistics from streaming data; correctly sampling streaming data; computing percentiles with limited space resources. Algorithms and Data Structures for Massive Datasets reveals a toolbox of new methods that are perfect for handling modern big data applications. You’ll explore the novel data structures and algorithms that underpin Google, Facebook, and other enterprise applications that work with truly massive amounts of data. These effective techniques can be applied to any discipline, from finance to text analysis. Graphics, illustrations, and hands-on industry examples make complex ideas practical to implement in your projects, and there are no mathematical proofs to puzzle over. Work through this one-of-a-kind guide, and you’ll find the sweet spot of saving space without sacrificing your data’s accuracy. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. About the technology: Standard algorithms and data structures may become slow, or fail altogether, when applied to large distributed datasets. Choosing algorithms designed for big data saves time, increases accuracy, and reduces processing cost. This unique book distills cutting-edge research papers into practical techniques for sketching, streaming, and organizing massive datasets on-disk and in the cloud.
    About the book: Algorithms and Data Structures for Massive Datasets introduces processing and analytics techniques for large distributed data. Packed with industry stories and entertaining illustrations, this friendly guide makes even complex concepts easy to understand. You’ll explore real-world examples as you learn to map powerful algorithms like Bloom filters, Count-min sketch, HyperLogLog, and LSM-trees to your own use cases. What's inside: probabilistic sketching data structures; choosing the right database engine; designing efficient on-disk data structures and algorithms; algorithmic tradeoffs in massive-scale systems; computing percentiles with limited space resources. About the reader: Examples in Python, R, and pseudocode. About the author: Dzejla Medjedovic earned her PhD in the Applied Algorithms Lab at Stony Brook University, New York. Emin Tahirovic earned his PhD in biostatistics from University of Pennsylvania. Illustrator Ines Dedovic earned her PhD at the Institute for Imaging and Computer Vision at RWTH Aachen University, Germany. Table of Contents: 1 Introduction; PART 1 HASH-BASED SKETCHES; 2 Review of hash tables and modern hashing; 3 Approximate membership: Bloom and quotient filters; 4 Frequency estimation and count-min sketch; 5 Cardinality estimation and HyperLogLog; PART 2 REAL-TIME ANALYTICS; 6 Streaming data: Bringing everything together; 7 Sampling from data streams; 8 Approximate quantiles on data streams; PART 3 DATA STRUCTURES FOR DATABASES AND EXTERNAL MEMORY ALGORITHMS; 9 Introducing the external memory model; 10 Data structures for databases: B-trees, Bε-trees, and LSM-trees; 11 External memory sorting

    10 in stock

    £45.39

  • Data Resource Guide: Managing the Data Resource

    Technics Publications LLC Data Resource Guide: Managing the Data Resource

    10 in stock

    Book SynopsisAre you struggling to find the data that you need to support your business activities? Are you concerned that people may be using the wrong data for their business activities? Are you having difficulty understanding the data that you do find in your data resource? Are you frustrated over documenting that understanding in a manner that is readily accessible to anyone in the organisation? If the answer to any of these questions is Yes, then you need to read "Data Resource Guide" to help identify, understand, access, and use the appropriate data. Most public and private sector organisations today have no formal, single location for the complete documentation of their data resource that is readily available to everyone in the organisation. Many organisations do not even have a concept of how to design, develop, or manage a single repository containing an understanding of all the data available to the organisation. Yet they are staking their business on those data. "Data Resource Data" provided the complete data resource model for an organisation's Data Resource Data. "Data Resource Understanding" provided a detailed description of how to thoroughly understand an organisation's data resource through those Data Resource Data. Now, "Data Resource Guide" provides the detailed specifications for developing a simple, inexpensive, and effective way to document the data resource understanding and make that understanding readily available to anyone in the organisation. Michael Brackett draws on over half a century of data management experience to complete two trilogies for formally managing an organisation's data as a critical resource. The Data Architecture Trilogy describes the development of a single organisation-wide data architecture for an organisation. The Data Understanding Trilogy describes the acquisition and documentation of understanding about all the data at an organisation's disposal.

    10 in stock

    £32.79

  • Data Lake Architecture: Designing the Data Lake

    Technics Publications LLC Data Lake Architecture: Designing the Data Lake

    15 in stock

    Book SynopsisOrganizations invest incredible amounts of time and money obtaining and then storing big data in data stores called data lakes. But how many of these organizations can actually get the data back out in a useable form? Very few can turn the data lake into an information gold mine. Most wind up with garbage dumps. Data Lake Architecture will explain how to build a useful data lake, where data scientists and data analysts can solve business challenges and identify new business opportunities. Learn how to structure data lakes as well as analog, application, and text-based data ponds to provide maximum business value. Understand the role of the raw data pond and when to use an archival data pond. Leverage the four key ingredients for data lake success: metadata, integration mapping, context, and metaprocess. Bill Inmon opened our eyes to the architecture and benefits of a data warehouse, and now he takes us to the next level of data lake architecture.

    15 in stock

    £22.09

  • Growing Business Intelligence: An Agile Approach

    Technics Publications LLC Growing Business Intelligence: An Agile Approach

    10 in stock

    Book SynopsisHow do we enable our organisations to enjoy the often significant benefits of BI and analytics, while at the same time minimising the cost and risk of failure? In this book, I am not going to try to be prescriptive; I won't tell you exactly how to build your BI environment. Instead, I am going to focus on a few core principles that will enable you to navigate the rocky shoals of BI architecture and arrive at a destination best suited for your particular organisation. Some of these core principles include: Have an overarching strategy, plan, and roadmap. Recognise and leverage your existing technology investments. Support both data discovery and data reuse. Keep data in motion, not at rest. Separate information delivery from data storage. Emphasise data transparency over data quality. Take an agile approach to BI development. This book will show you how to successfully navigate both the jungle of BI technology and the minefield of human nature. It will show you how to create a BI architecture and strategy that addresses the needs of all organisational stakeholders. It will show you how to maximise the value of your BI investments. It will show you how to manage the risk of disruptive technology. And it will show you how to use agile methodologies to deliver on the promise of BI and analytics quickly, succinctly, and iteratively. This book is about many things. But principally, it's about success. The goal of any enterprise initiative is to succeed and to derive benefit -- benefit that all stakeholders can share in. I want you to be successful. I want your organisation to be successful. This book will show you how. This book is for anyone who is currently or will someday be working on a BI, analytics, or Big Data project, and for organisations that want to get the maximum amount of value from both their data and their BI technology investment.
This includes all stakeholders in the BI effort -- not just the data people or the IT people, but also the business stakeholders who have the responsibility for the definition and use of data. There are six sections to this book: In Section I, What Kind of Garden Do You Want?, we will examine the benefits and risks of Business Intelligence, making the central point that BI is a business (not IT) process designed to manage data assets in pursuit of enterprise goals. We will show how data, when properly managed and used, can be a key enabler of several types of core business processes. The purpose of this section is to help you define the particular benefit(s) you want from BI. In Section II, Building the Bones, we will talk about how to design and build out the hardscape (infrastructure) of your BI environment. This stage of the process involves leveraging existing technology investments and iteratively moving toward your desired target state BI architecture. In Section III, From the Ground Up, we explore the more detailed aspects of implementing your BI operational environment. In Section IV, Weeds, Pests and Critters, we talk about the myriad of things that can go wrong on a BI project, and discuss ways of mitigating these risks. In Section V, The Sustainable Garden, we talk about how to create a BI infrastructure that is easy and inexpensive to maintain. Finally, Section VI presents a case study illustrating the principles of this book, as applied to a fictional manufacturing company (the Blue Moon Guitar Company).

    10 in stock

    £36.79

  • Analytics: How to Win with Intelligence

    Technics Publications LLC Analytics: How to Win with Intelligence

    15 in stock

    Book Synopsis: Learn how big data and other sources of information can be transformed into valuable knowledge -- knowledge that can create incredible competitive advantage to propel a business toward market leadership. Learn through examples and experience exactly how to pick projects and build analytics teams that deliver results. Know the ethical and privacy issues, and apply the three-part litmus test of context, permission, and accuracy. Without a doubt, data and analytics are the new source of competitive advantage, but how do executives go from hype to action? That's the objective of this book -- to assist executives in making the right investments in the right place and at the right time in order to reap the full benefits of data analytics.

    15 in stock

    £24.79

  • Enterprise Data Architecture: How to navigate its landscape

    15 in stock

    £22.99

  • Snowflake Cookbook: Techniques for building

    Packt Publishing Limited Snowflake Cookbook: Techniques for building

    1 in stock

    Book Synopsis: Develop modern solutions with Snowflake's unique architecture and integration capabilities; process bulk and real-time data into a data lake; and leverage time travel, cloning, and data-sharing features to optimize data operations. Key Features: Build and scale modern data solutions using the all-in-one Snowflake platform. Perform advanced cloud analytics for implementing big data and data science solutions. Make quicker and better-informed business decisions by uncovering key insights from your data. Book Description: Snowflake is a unique cloud-based data warehousing platform built from scratch to perform data management on the cloud. This book introduces you to Snowflake's unique architecture, which places it at the forefront of cloud data warehouses. You'll explore the compute model available with Snowflake, and find out how Snowflake allows extensive scaling through the virtual warehouses. You will then learn how to configure a virtual warehouse for optimizing cost and performance. Moving on, you'll get to grips with the data ecosystem and discover how Snowflake integrates with other technologies for staging and loading data. As you progress through the chapters, you will leverage Snowflake's capabilities to process a series of SQL statements using tasks to build data pipelines and find out how you can create modern data solutions and pipelines designed to provide high performance and scalability. 
You will also get to grips with creating role hierarchies, adding custom roles, and setting default roles for users before covering advanced topics such as data sharing, cloning, and performance optimization. By the end of this Snowflake book, you will be well-versed in Snowflake's architecture for building modern analytical solutions and understand best practices for solving commonly faced problems using practical recipes. What you will learn: Get to grips with data warehousing techniques aligned with Snowflake's cloud architecture; Broaden your skills as a data warehouse designer to cover the Snowflake ecosystem; Transfer skills from on-premise data warehousing to the Snowflake cloud analytics platform; Optimize performance and costs associated with a Snowflake solution; Stage data on object stores and load it into Snowflake; Secure data and share it efficiently for access; Manage transactions and extend Snowflake using stored procedures; Extend cloud data applications using Spark Connector. Who this book is for: This book is for data warehouse developers, data analysts, database administrators, and anyone involved in designing, implementing, and optimizing a Snowflake data warehouse. Knowledge of data warehousing and database and cloud concepts will be useful. Basic familiarity with Snowflake is beneficial, but not necessary. Table of Contents: Getting Started with Snowflake; Managing the Data Life Cycle; Loading and Extracting Data into and out of Snowflake; Building Data Pipelines in Snowflake; Data Protection and Security in Snowflake; Performance and Cost Optimization; Secure Data Sharing; Back to the Future with Time Travel; Advanced SQL Techniques; Extending Snowflake's Capabilities

    1 in stock

    £36.09

  • 3D Recording, Documentation and Management of

    Whittles Publishing 3D Recording, Documentation and Management of

    15 in stock

    Book Synopsis: Documentation of our cultural heritage is experiencing an explosion of innovation. New tools have appeared in recent decades including laser scanning, rapid prototyping, high dynamic range spherical and infrared imagery, drone photography, augmented and virtual reality and computer rendering in multiple dimensions. These give us visualisations and data that are at once interesting, intriguing and yet sometimes deceptive. This text provides an objective and integrated approach to the subject, bringing together the techniques of conservation with management, photographic methods, various modelling techniques and the use of unmanned aerial systems. This interdisciplinary approach addresses the need for knowledge about deploying advanced digital technologies and the materials and methods for the assessment, conservation, rehabilitation and maintenance of the sustainability of existing structures and designated historic buildings. Furthermore, this book actively provides the knowhow to facilitate the creation of heritage inventories, assessing risk, and addressing the need for sustainability. In so doing it becomes more feasible to mitigate the threats from inherent and external causes, not only for the built heritage but also for moveable objects and intangible heritage that suffer abandonment and negligence as well as looting and illegal trafficking. The book is written by a team of international experts based upon their practical experience and expertise. It therefore creates a unique book that encapsulates the knowledge of this discipline required by anyone working in this field. Trade Review: `...this new publication is a welcome addition, highlighting how these 3D techniques can be utilised... ...this well-illustrated volume represents a useful contribution for scholars wishing to gain a better understanding of the underpinnings of 3D recording and documentation’. Medieval Archaeology -------------------- `...I found this book very valuable. 
It can reach an eclectic audience in providing a broad spectrum of the subject. This book is of major importance for Cultural Heritage 3D recording and management and...an important resource handbook’. International Institute for Conservation of Historic and Artistic Works -------------------- '...this new, richly illustrated reference publication on recording and documenting cultural heritage. ... For anyone considering a digital camera for survey purposes ... this chapter [4] is essential reading, and is rightfully one of the best references currently available on the science behind imaging. ...manages to provide what is probably the most up-to-date reference book on 3D recording, documentation and management of cultural heritage. For any heritage professional, academic, student or interested individual considering applying, acquiring, undertaking or researching digital imaging, photogrammetry, Structure-from-Motion, laser scanning, GIS, BIM or RPAS/UAV within a conservation context, this book should be essential reading before embarking down any one of these rapidly developing technological routes'. Conservation and Management of Archaeological Sites -------------------- '...the images in this book, both in colour and high-resolution, play a critical role along with the text. This is a well produced book that is wonderful to read and view. ...I find this book exceptional for its publishing quality, content and production. It clearly includes cutting-edge knowledge, awareness and experience from many contributors involved in cultural heritage processes around the globe...would be very useful to anyone involved in cultural heritage, documentation of history and site preservation and conservation. It can readily serve as a course text in addition to being a reference text. ... I've nothing but positive things to say about this book - I think you will too'. 
3D Visualization WorldTable of ContentsIntroduction - current trends in cultural heritage and documentation; Conservation techniques in cultural heritage; Cultural heritage management tools: The role of GIS and BIM; Basics of photography for cultural heritage imaging; Basics of image-based modelling techniques in cultural heritage 3D recording; Basics of range-based modelling techniques in cultural heritage 3D recording; Cultural heritage documentation with RPAS/UAV

    15 in stock

    £76.50

  • Game Theory for Networks: 8th International EAI Conference, GameNets 2019, Paris, France, April 25–26, 2019, Proceedings

    Springer Nature Switzerland AG Game Theory for Networks: 8th International EAI Conference, GameNets 2019, Paris, France, April 25–26, 2019, Proceedings

    1 in stock

    Book Synopsis: This book constitutes the refereed proceedings of the 8th EAI International Conference on Game Theory for Networks, GameNets 2019, held in Paris, France, in April 2019. The 8 full and 3 short papers presented were carefully reviewed and selected from 17 submissions. They are organized in the following topical sections: Game Theory for Wireless Networks; Games for Economy and Resource Allocation; and Game Theory for Social Networks. Table of Contents: Game Theory for Wireless Networks.- Games for Economy and Resource Allocation.- Game Theory for Social Networks.

    1 in stock

    £34.19

  • Digital Libraries: The Era of Big Data and Data Science: 16th Italian Research Conference on Digital Libraries, IRCDL 2020, Bari, Italy, January 30–31, 2020, Proceedings

    Springer Nature Switzerland AG Digital Libraries: The Era of Big Data and Data Science: 16th Italian Research Conference on Digital Libraries, IRCDL 2020, Bari, Italy, January 30–31, 2020, Proceedings

    1 in stock

    Book Synopsis: This book constitutes the thoroughly refereed proceedings of the 16th Italian Research Conference on Digital Libraries, IRCDL 2020, held in Bari, Italy, in January 2020. The 12 full papers and 6 short papers presented were carefully selected from 26 submissions. The papers are organized in topical sections on information retrieval; big data and data science in DL; cultural heritage; and open science. Table of Contents: Information Retrieval.- Big Data and Data Science in DL.- Cultural Heritage.- Open Science.

    1 in stock

    £53.99

  • Information Retrieval: 27th China Conference, CCIR 2021, Dalian, China, October 29–31, 2021, Proceedings

    Springer Nature Switzerland AG Information Retrieval: 27th China Conference, CCIR 2021, Dalian, China, October 29–31, 2021, Proceedings

    1 in stock

    Book Synopsis: This book constitutes the refereed proceedings of the 27th China Conference on Information Retrieval, CCIR 2021, held in Dalian, China, in October 2021. The 15 full papers presented were carefully reviewed and selected from 124 submissions. The papers are organized in topical sections: search and recommendation, NLP for IR, IR in Education, and IR in Biomedicine. Table of Contents: Search and Recommendation.- NLP for IR.- IR in Education.- IR in Biomedicine.

    1 in stock

    £49.49

  • Cohesive Subgraph Search Over Large Heterogeneous

    Springer Nature Switzerland AG Cohesive Subgraph Search Over Large Heterogeneous

    3 in stock

    Book Synopsis: This SpringerBrief provides the first systematic review of the existing works of cohesive subgraph search (CSS) over large heterogeneous information networks (HINs). It also covers the research breakthroughs of this area, including models, algorithms and comparison studies in recent years. This SpringerBrief offers a list of promising future research directions of performing CSS over large HINs. The authors first classify the existing works of CSS over HINs according to the classic cohesiveness metrics such as core, truss, clique, connectivity, density, etc., and then extensively review the specific models and their corresponding search solutions in each group. Note that since the bipartite network is a special case of HINs, all the models developed for general HINs can be directly applied to bipartite networks, but the models customized for bipartite networks may not be easily extended for other general HINs due to their restricted settings. The authors also analyze and compare these cohesive subgraph models (CSMs) and solutions systematically. Specifically, the authors compare different groups of CSMs and analyze both their similarities and differences, from multiple perspectives such as cohesiveness constraints, shared properties, and computational efficiency. Then, for the CSMs in each group, the authors further analyze and compare their model properties and high-level algorithm ideas. This SpringerBrief targets researchers, professors, engineers and graduate students, who are working in the areas of graph data management and graph mining. Undergraduate students who are majoring in computer science, databases, data and knowledge engineering, and data science will also want to read this SpringerBrief. Table of Contents: 1. Introduction.- 2. Preliminaries.- 3. CSS on Bipartite Networks.- 4. CSS on Other General HINs.- 5. Comparison Analysis.- 6. Related Work on CSMs and solutions.- 7. Future Work and Conclusion

    3 in stock

    £35.99

  • The Semantic Web: 19th International Conference, ESWC 2022, Hersonissos, Crete, Greece, May 29 – June 2, 2022, Proceedings

    Springer International Publishing AG The Semantic Web: 19th International Conference, ESWC 2022, Hersonissos, Crete, Greece, May 29 – June 2, 2022, Proceedings

    1 in stock

    Book Synopsis: Chapters “No. 10 and No. 21” are available open access under a Creative Commons Attribution 4.0 International License via link.springer.com. Table of Contents: Research.- Resources.- In-Use Track.

    1 in stock

    £62.99

  • Automated Taxonomy Discovery and Exploration

    Springer International Publishing AG Automated Taxonomy Discovery and Exploration

    3 in stock

    Book Synopsis: This book provides a principled data-driven framework that progressively constructs, enriches, and applies taxonomies without leveraging massive human annotated data. Traditionally, people construct domain-specific taxonomies by extensive manual curation, which is time-consuming and costly. In today’s information era, people are inundated with vast amounts of text data. Despite the usefulness of taxonomies, people haven’t yet exploited their full power because of the heavy curation needed to create and maintain them. To bridge this gap, the authors discuss automated taxonomy discovery and exploration, with an emphasis on label-efficient machine learning methods and their real-world usages. A taxonomy organizes entities and concepts hierarchically. Taxonomies are ubiquitous in our daily life, ranging from product taxonomies used by online retailers and topic taxonomies deployed by news outlets and social media to scientific taxonomies deployed by digital libraries across various domains. When properly analyzed, these taxonomies can play a vital role for science, engineering, business intelligence, policy design, e-commerce, and more. Intuitive examples are used throughout, enabling readers to grasp concepts more easily. Table of Contents: Introduction.- Concept Set Expansion.- Taxonomy Construction.- Taxonomy Enrichment.- Taxonomy-Guided Classification.- Conclusions.

    3 in stock

    £44.99

  • The Semantic Web: ESWC 2022 Satellite Events: Hersonissos, Crete, Greece, May 29 – June 2, 2022, Proceedings

    Springer International Publishing AG The Semantic Web: ESWC 2022 Satellite Events: Hersonissos, Crete, Greece, May 29 – June 2, 2022, Proceedings

    1 in stock

    Book Synopsis: This book constitutes the proceedings of the satellite events held at the 19th Extended Semantic Web Conference, ESWC 2022, from May 29 to June 2, 2022, in Hersonissos, Greece. The included satellite events are: the poster and demo session; the PhD symposium; industry track; project networking; workshops and tutorials. During ESWC 2022, the following ten workshops took place: 10th Linked Data in Architecture and Construction Workshop (LDAC 2022); 5th International Workshop on Geospatial Linked Data (GeoLD 2022); 5th Workshop on Semantic Web solutions for large-scale biomedical data analytics (SeMWeBMeDA 2022); 7th Natural Language Interfaces for the Web of Data (NLIWOD+QALD 2022); International Workshop on Knowledge Graph Generation from Text (Text2KG 2022); 3rd International Workshop on Deep Learning meets Ontologies and Natural Language Processing (DeepOntoNLP 2022); 1st Workshop on Modular Knowledge (ModularK 2022); Third International Workshop On Knowledge Graph Construction (KGCW 2022); Third International Workshop On Semantic Digital Twins (SeDIT 2022); and the 1st International Workshop on Semantic Industrial Information Modelling (SemIIM 2022). 
Table of Contents Summary of Workshops and Tutorials at European Semantic Web Conference 2022.- Posters and Demos.- Towards UML-style Visual Queries over Wikidata.- Using the ODRL Profile for Access Control for Solid Pod Resource Governance.- Relation Canonicalization in Open Knowledge Graphs: A Quantitative Analysis.- Harmonizing and Using Numismatic Linked Data in Digital Humanities Research and Application Development: Case DigiNUMA.- Extending AgreementMakerLight to Perform Holistic Ontology Matching.- It’s all in the Name: Entity Typing using Multilingual Language Models.- The Supervised Semantic Similarity Toolkit.- Tab2Onto: Unsupervised Semantification with Knowledge Graph Embeddings.- DataSpecer: A Model-Driven Approach to Managing Data Specifications.- Towards Query Processing over Heterogeneous Federations of RDF Data sources.- SAND: A Tool for Creating Semantic Descriptions of Tabular Sources.- BLAST: Block Applications for Things.- Leibniz Data Manager – A Research Data Management System.- Towards Knowledge Graph-Agnostic SPARQL Query Validation for Improving Question Answering.- Towards Generalized Welding Ontology in line with ISO and Knowledge Graph Construction.- O’FAIRe: Ontology FAIRness Evaluator in the AgroPortal semantic resource repository.- domOS Common Ontology: Web of Things Discovery in Smart Buildings.- WeaKG-MF: a Knowledge Graph of Observational Weather Data.- DAGOBAH UI: A New Hope For Semantic Table Interpretation.- KartoGraphI: Drawing a Map of Linked Data.- WikidataComplete – An easy-to-use method for rapid validation of text-extracted new facts applied to the Wikidata knowledge graph.- Query-based Industrial Analytics over Knowledge Graphs with Ontology Reshaping.- Semantic Video Entity Linking.- Walk this Way! 
Entity Walks and Property Walks for RDF2vec.- Self-Verifying Web Resource Representations using Solid, RDF-star and Signed URIs.- From OWL to Graphol: importing ontologies into Eddy the editor.- Audio Ontologies for Intangible Cultural Heritage.- Ontology Matching Through Absolute Orientation of Embedding Spaces.- Semantic modeling and reconstruction of drones’ Trajectories.- How to Search and Contextualize Scenes inside Videos for Enriched Watching Experience: Case Stories of the Second World War Veterans.- PhD Symposium.- (Semi-) Automatic construction of knowledge graph Metadata.- Towards a Similarity Algorithm for Controlled Vocabularies within the Digital Humanities.- Causal Domain Adaptation for Information Extraction from Complex Conversations.- Knowledge Graph Population with Out-of-KG Entities.- Dynamic Knowledge Graph Embeddings via Local Embedding Reconstructions.- Leveraging Standards in Model-Centric Geospatial Knowledge Graph Creation.- Building Narrative Structures from Knowledge Graphs.- Using Referential Language Games for Task-oriented Ontology Alignment.- Balancing RDF generation from heterogeneous data sources.- Geological Information Capture with Sketches and Ontologies.- Industry.- The Data Value Quest: A Holistic Semantic Approach at Bosch.- Extracting Subontologies from SNOMED CT.- “Semantify” business and content to meet demands for expert solutions in professional markets.- Enhancing Knowledge Graph Generation with Ontology Reshaping – Bosch Case.- Semantic Data Integration for Monitoring Operators’ Ergonomics in an Automotive Manufacturing Setting.- Semantic Description of Equipment and its Controls in Building Automation Systems.

    1 in stock

    £58.49

  • Business Process Management: 20th International

    Springer International Publishing AG Business Process Management: 20th International

    3 in stock

    Book SynopsisThis book constitutes the refereed proceedings of the 20th International Conference on Business Process Management, BPM 2022, which took place in Münster, Germany, in September 2022. The 22 papers included in this book were carefully reviewed and selected from 98 submissions. They were organized in topical sections as follows: task mining; design methods; process mining; process mining practice; analytics; and systems. The book also includes one keynote talk in full-paper length and 5 tutorial papers. Table of ContentsKeynote.- Advancing Business Process Science via the Co-Evolution of Substantive and Methodological Knowledge.- Tutorials.- BPM in Digital Transformation: New Tools and Productivity Challenges.- Multi-Dimensional Process Analysis.- Theory and Practice - What, With What and How Is Business Process Management Taught at German Universities.-How to Leverage Process Mining in Organizations - Towards Process Mining Capabilities.- Mastering Robotic Process Automation with Process Mining.- Task Mining.- A Reference Data Model for Process-Related User Interaction Logs.- Analysing Variable Human Actions for Robotic Process Automation.- The SWORD is Mightier than the Interview: A Framework for Semi-automatic WORkaround Detection.- Design Methods.- Back to the Roots – Investigating the Theoretical Foundations of Business Process Maturity Models.- Applying Process Mining in Small and Medium sized IT Enterprises – Challenges and Guidelines.- A Process Mining Success Factors Model.- Process Mining.- No Time to Dice: Learning Execution Contexts from Event Logs for Resource-Oriented Process Mining.- A Purpose-Guided Log Generation Framework.- Conformance Checking with Uncertainty via SMT.- Process Mining Practice.- The Dark Side of Process Mining. How Identifiable Are Users Despite Technologically Anonymized Data? 
A Case Study From the Health Sector.- Analyzing How Process Mining Reports Answer Time Performance Questions.- Process Mining of Knowledge-Intensive Processes: An Action Design Research Study in Manufacturing.- Process Mining Practices: Evidence from Interviews.- Analytics.- Measuring Inconsistency in Declarative Process Specifications.- Understanding and Decomposing Control-Flow Loops in Business Process Models.- Reasoning on Labelled Petri Nets and their Dynamics in a Stochastic Setting.- Incentive Alignment through Secure Computations.- Business Process Simulation with Differentiated Resources: Does it Make a Difference.- Uncovering Object-centric Data in Classical Event Logs for the Automated Transformation from XES to OCEL.- Systems.- Why Companies Use RPA: A Critical Reflection of Goals.- A trustworthy decentralized change propagation mechanism for declarative choreographies.- Architecture of decentralized Process Management Systems.

    3 in stock

    £53.99

  • Proximity and Epidata: Attributes and Meaning

    Springer International Publishing AG Proximity and Epidata: Attributes and Meaning

    3 in stock

    Book Synopsis: This book provides a new model to explore discoverability and enhance the meaning of information. The authors have coined the term epidata, which includes items and circumstances that impact the expression of the data in a document, but are not part of the ordinary process of retrieval systems. Epidata affords pathways and points to details that cast light on proximities that might otherwise go unknown. In addition, epidata are clues to mis- and disinformation discernment. There are many ways to find needed information; however, finding the most useable information is not an easy task. The book explores the uses of proximity and the concept of epidata that increases the probability of finding functional information. The authors sketch a constellation of proximities, present examples of attempts to accomplish proximity, and provoke a discussion of the role of proximity in the field. In addition, the authors suggest that proximity is a thread between retrieval constructs based on known topics, predictable relations, and types of information seeking that lie outside constructs such as browsing, stumbling, encountering, detective work, art making, and translation. Table of Contents: Proximity and Clues.- More than Meets the Eye.- Epidata, Clues, Threads, and Webs.- Provocations and Invitations.

    3 in stock

    £31.49

  • Advances in Information Retrieval: 45th European

    Springer International Publishing AG Advances in Information Retrieval: 45th European

    1 in stock

    Book Synopsis: The three-volume set LNCS 13980, 13981 and 13982 constitutes the refereed proceedings of the 45th European Conference on IR Research, ECIR 2023, held in Dublin, Ireland, during April 2-6, 2023. The 65 full papers, 41 short papers, 19 demonstration papers, 12 reproducibility papers, 7 tutorial papers, and 10 doctoral consortium papers were carefully reviewed and selected from 489 submissions. The book also contains 8 workshop summaries and 13 CLEF Lab descriptions. The accepted papers cover the state of the art in information retrieval focusing on user aspects, system and foundational aspects, machine learning, applications, evaluation, new social and technical challenges, and other topics of direct or indirect relevance to search. Table of Contents: Full Papers.- Automatic Summarization of Financial Earnings Calls Transcript.- Parameter-Efficient Sparse Retrievers and Rerankers using Adapters.- Feature Differentiation and Fusion for Semantic Text Matching.- Multivariate Powered Dirichlet-Hawkes Process.- Fragmented Visual Attention in Web Browsing: Weibull Analysis of Item Visit Times.- Topic-Enhanced Personalized Retrieval-based Chatbot.- Improving the Generalizability of the Dense Passage Retriever Using Generated Datasets.- SegmentCodeList: Unsupervised Representation Learning for Human Skeleton Data Retrieval.- Knowing What and How: A Multi-modal Aspect-Based Framework for Complaint Detection.- What is your cause for concern? 
Towards Interpretable Complaint Cause Analysis.- DeCoDE: DEtection of COgnitive Distortion and Emotion cause extraction in clinical conversations.- Domain-aligned Data Augmentation for Low-resource and Imbalanced Text Classification.- Privacy-Preserving Fair Item Ranking.- Multimodal Geolocation Estimation of News Photos.- Topics in Contextualised Attention Embeddings.- New Metrics to Encourage Innovation and Diversity in Information Retrieval Approaches.- Probing BERT for Ranking Abilities.- Clustering of Bandit with Frequency-Dependent Information Sharing.- Contrastive Graph Learning with Positional Representation for Recommendation.- Domain Adaptation for Anomaly Detection on Heterogeneous Graphs in E-Commerce.- Short PapersImproving Neural Topic Models with Wasserstein Knowledge Distillation.- Towards Effective Paraphrasing for Information Disguise.- Generating Topic Pages for Scientific Concepts Using Scientific Publications.- Relevance Judgements for Fair Ranking.- A Study of Term-Topic Embeddings for Ranking.- Topic Refinement in Multi-Level Hate Speech Detection.- Is Cross-modal Information Retrieval Possible without Training?.- Adversarial Adaptation for French Named Entity Recognition.- Exploring Fake News Detection with Heterogeneous Social Media Context Graphs.- Justifying Multi-Label Text Classifications for Healthcare Applications.- Doc2Query–: When Less is More.- Towards Quantifying The Privacy Of Redacted Text. 
-Detecting Stance of Authorities towards Rumors in Arabic Tweets: A Preliminary Study.- Leveraging Comment Retrieval for Code Summarization.- CPR: Cross-domain Preference Ranking with User Transformation.- Colbert-FairPRF: Towards Fair Pseudo-Relevance Feedback in Dense Retrieval.- C2LIR: Continual Cross-lingual Transfer for Low-Resource Information Retrieval.- Joint Extraction and Classification of Danish Competences for Job Matching.- A Study on FGSM Adversarial Training for Neural Retrieval.- Dialogue-to-Video Retrieval.- Time-dependent next-basket recommendations.- Investigating the Impact of Query Representation on Medical Information Retrieval.- Where a Little Change Makes a Big Difference: A Preliminary Exploration of Children’s Queries.- Multi-document QA with GPT-3 and Neural Reranking .- Towards Detecting Interesting Ideas Expressed in Text.- Towards Linguistically Informed Multi-Objective Transformer Pre-Training for Natural Language Inference.- Dirichlet-Survival Process: Scalable Inference of Topic-Dependent Diffusion Networks.- Consumer Health Question Answering Using Off-the-shelf Components.- MOO-CMDS+NER: Named Entity Recognition-based Extractive Comment-oriented Multi-document Summarization.- Don’t Raise Your Voice, Improve Your Argument: Learning to Retrieve Convincing Arguments.- Learning Query-Space Document Representations for High-Recall Retrieval.- Investigating Conversational Search Behavior For Domain Exploration.- Evaluating Humorous Response Generation to Playful Shopping Requests.- Joint Span Segmentation and Rhetorical Role Labeling with Data Augmentation for Legal Documents.- Trigger or not Trigger: Dynamic Thresholding for Few Shot Event Detection.- The Impact of a Popularity Punishing Hyperparameter on ItemKNN Recommendation Performance.- Neural Ad hoc Retrieval Meets Information Extraction.- Augmenting Graph Convolutional Networks with Textual Data for Recommendations.- Utilising Twitter Metadata for Hate Classification.- Evolution 
of Filter Bubbles and Polarization in News Recommendation.- Capturing Cross-platform Interaction for Identifying Coordinated Accounts of Misinformation Campaigns.

    1 in stock

    £71.99

  • Advances in Information Retrieval: 45th European

    Springer International Publishing AG Advances in Information Retrieval: 45th European

    1 in stock

    Book Synopsis: The three-volume set LNCS 13980, 13981 and 13982 constitutes the refereed proceedings of the 45th European Conference on IR Research, ECIR 2023, held in Dublin, Ireland, during April 2-6, 2023. The 65 full papers, 41 short papers, 19 demonstration papers, 12 reproducibility papers, 7 tutorial papers, and 10 doctoral consortium papers were carefully reviewed and selected from 489 submissions. The book also contains 8 workshop summaries and 13 CLEF Lab descriptions. The accepted papers cover the state of the art in information retrieval focusing on user aspects, system and foundational aspects, machine learning, applications, evaluation, new social and technical challenges, and other topics of direct or indirect relevance to search. Table of Contents: Reproducibility Papers.- Knowledge is Power, Understanding is Impact: Utility and Beyond Goals, Explanation Quality, and Fairness in Path Reasoning Recommendation.- Stat-weight: Improving the Estimator of Interleaved Methods Outcomes with Statistical Hypothesis Testing.- A Reproducibility Study of Question Retrieval for Clarifying Questions.- The Impact of Cross-Lingual Adjustment of Contextual Word Representations on Zero-Shot Transfer.- Scene-centric vs. Object-centric Image-Text Cross-modal Retrieval: A Reproducibility Study.- Index-Based Batch Query Processing Revisited.- A Unified Framework for Learned Sparse Retrieval.- Entity Embeddings for Entity Ranking: A Replicability Study.- Do the Findings of Document and Passage Retrieval Generalize to the Retrieval of Responses for Dialogues?.- PyGaggle: A Gaggle of Resources for Open-Domain Question Answering.- Pre-Processing Matters! 
Improved Wikipedia Corpora for Open-Domain Question Answering.- From Baseline to Top Performer: A Reproducibility Study of Approaches at the TREC 2021 Conversational Assistance Track.- Demonstration Papers.- Exploring Tabular Data Through Networks.- InfEval: Application for Object Detection Analysis.- The System for Efficient Indexing and Search in the Large Archives of Scanned Historical Documents.- Public News Archive: A Searchable Sub-Archive to Portuguese Past News Articles.- TweetStream2Story: Narrative Extraction from Tweets in Real Time.- SimpleRad: patient-friendly Dutch radiology reports.- Automated Extraction of Fine-Grained Standardized Product Information from Unstructured Multilingual Web Data.- Continuous Integration for Reproducible Shared Tasks with TIRA.io.- Dynamic Exploratory Search for the Information Retrieval Anthology.- Text2Storyline: Generating Enriched Storylines From Text.- Uptrendz: API-Centric Real-Time Recommendations in Multi-Domain Settings.- Clustering Without Knowing How To: Application and Evaluation.- Enticing local governments to produce FAIR freedom of information act dossiers.- Which Country is this? 
Automatic Country Ranking of StreetView Images.- Automatic Videography Generation from Audio Tracks.- Ablesbarkeitsmesser: A System for Assessing the Readability of German Text.- FACADE: Fake Articles Classification And Decision Explanation.- PsyProf: A Platform for Assisted Screening of Depression in Social Media.- SOPalign: A Tool for Automatic Estimation of Compliance with Medical Guidelines.- Tutorials.- Understanding and Mitigating Gender Bias in Information Retrieval Systems.- Neuro-Symbolic Representations for Information Retrieval.- Legal IR and NLP: the History, Challenges, and State-of-the-Art.- Deep Learning Methods for Query Auto Completion.- Trends and Overview: The Potential of Conversational Agents in Digital Health.- Crowdsourcing for Information Retrieval.- Uncertainty Quantification for Text Classification.- Workshops.- Fourth International Workshop on Algorithmic Bias in Search and Recommendation (Bias 2023).- The 6th International Workshop on Narrative Extraction from Texts (Text2Story’23).- 2nd Workshop on Augmented Intelligence in Technology-Assisted Review Systems (ALTARS): Evaluation Metrics and Protocols for eDiscovery and Systematic Review Systems.- Workshop QPP++ 2023: Query Performance Prediction and Its Evaluation in New Tasks.- Bibliometric-enhanced Information Retrieval: 13th International BIR Workshop (BIR 2023).- Geographic information extraction from texts (GeoExT).- ROMCIR 2023: Overview of the 3rd Workshop on Reducing Online Misinformation through Credible Information Retrieval.- ECIR 2023 workshop proposal: Legal Information Retrieval.- Doctoral Consortium.- Building Safe and Reliable AI systems for Safety Critical Tasks with Vision-Language Processing.- Text Information Retrieval in Tetun.- Identifying and Representing Knowledge Delta in Scientific Literature.- Investigation of Bias in Web Search Queries.- Monitoring online discussions and responses to support the identification of misinformation.- User Privacy in Recommender 
Systems.- Conversational Search for Multimedia Archives.- Disinformation Detection: Knowledge Infusion with Transfer Learning and Visualizations.- A Comprehensive Overview of Consumer Conflicts on Social Media.- Designing useful conversational interfaces for information retrieval in career decision-making support.- CLEF Lab Descriptions.- iDPP@CLEF 2023: The Intelligent Disease Progression Prediction Challenge.- LongEval: Longitudinal Evaluation of Model Performance at CLEF 2023.- The CLEF-2023 CheckThat! Lab: Checkworthiness, Subjectivity, Political Bias, Factuality, and Authority.- Overview of PAN 2023: Authorship Verification, Multi-Author Writing Style Analysis, Profiling Cryptocurrency Influencers, and Trigger Detection.- Overview of Touché 2023: Argument and Causal Retrieval.- CLEF 2023 SimpleText Track: What Happens if General Users Search Scientific Texts?.- Science for Fun: The CLEF 2023 JOKER Track on Automatic Wordplay Analysis.- ImageCLEF 2023 Highlight: Multimedia Retrieval in Medical, Social Media and Content Recommendation Applications.- LifeCLEF 2023 teaser: Species Identification and Prediction Challenges.- BioASQ at CLEF2023: The eleventh edition of the Large-scale biomedical semantic indexing and question answering challenge.- eRisk 2023: Depression, Pathological Gambling, and Eating Disorder Challenges.- Overview of EXIST 2023: sEXism Identification in Social neTworks.- DocILE 2023 Teaser: Document Information Localization and Extraction.

    1 in stock

    £75.99

  • Keywords In and Out of Context

    Springer International Publishing AG Keywords In and Out of Context

    1 in stock

    Book Synopsis: This book explores the rich history of the keyword from its earliest manifestations (long before it appeared anywhere in Google Trends or library cataloging textbooks) in order to illustrate its implicit and explicit mediation of human cognition and communication processes. The author covers the concept of the keyword from its deictic origins in primate and proto-speech communities, through its development within oral traditions, to its initial appearances in numerous graphical forms and its workings over time within a variety of indexing traditions and technologies. The book follows the history all the way to its role in search engine optimization and social media strategies and its potential as an element in the slowly emerging semantic web, as well as in multiple voice search applications. The author synthesizes different perspectives on the significance of this often-invisible intermediary, both in and out of the library and information science context, helping readers to understand how it has come to be so embedded in our daily life. This book provides a thorough history of the keyword, from primate and proto-speech communities to current times; explains how the concept of the keyword relates to human cognition and communication processes; and highlights the applications of the keyword, both in and out of the library and information science context. Table of Contents: Chapter 1 - Representation, Reference, Relevance, and Retention.- Chapter 2 - Signals, Semiotics.- Chapter 3 - Proto-Words, Proto-Signs.- Chapter 4 - Philologies, Philosophies, Pragmatics.- Chapter 5 - Rites, Religions.- Chapter 6 - Writing, Indexing.- Chapter 7 - Progress, Public.- Chapter 8 - Discovery, Retrieval.- Chapter 9 - Databases, Search Engines.

    1 in stock

    £23.74

  • The Semantic Web: 20th International Conference,

    Springer International Publishing AG The Semantic Web: 20th International Conference,

    1 in stock

    Book Synopsis: This book constitutes the refereed proceedings of the 20th International Conference on The Semantic Web, ESWC 2023, held in Hersonissos, Crete, Greece, during May 28–June 1, 2023. The 41 full papers included in this book were carefully reviewed and selected from 167 submissions. They are organized in topical sections as follows: research, resource and in-use. Table of Contents: Research.- Explainable Phenotype-Centric Drug Repurposing via Deep Reinforcement Learning.- A Comparative Study of Stream Reasoning Engines.- Join Ordering of SPARQL Property Path Queries.- Refining Large Integrated Identity Graphs using the Unique Name Assumption.- Structural Bias in Knowledge Graphs for the Entity Alignment Task.- A Framework to Include and Exploit Probabilistic Information in SHACL Validation Reports.- Transformer based Semantic Relation Typing for Knowledge Graph Integration.- Entity Linking for KGQA Using AMR Graphs.- REGNUM: Generating Logical Rules with Numerical Predicates in Knowledge Graphs.- Classifying sequences by combining context-free grammars and OWL ontologies.- NASTyLinker: NIL-Aware Scalable Transformer-based Entity Linker.- iSummary: Workload-based, Personalized summaries for Knowledge Graphs.- Neural Class Expression Synthesis.- Evaluating Language Models for Knowledge Base Completion.- Subsumption Prediction on E-Commerce Taxonomies.- Two-view Graph Neural Networks for Knowledge Graph Completion.- GETT-QA: Graph Embedding based T2T Transformer for Knowledge Graph Question Answering.- Repairing EL ontologies using weakening and completing.- Activity Recommendation for Business Process Modeling with Sequence-to-Sequence Models.- Resource.- RELD: A Knowledge Graph of Relation Extraction Datasets.- The Internet Meme Knowledge Graph.- Describing and Organizing Semantic Web and Machine Learning Systems in the SWeMLS-KG.- A Concise Ontology to Support Research on Complex, Multimodal Clinical Reasoning.- LauNuts: A Knowledge Graph to identify and 
compare geographic regions in the European Union.- HHT: an approach for representing temporally-evolving historical territories.- An Upper Ontology for Modern Science Branches and Related Entities.- K-Hub: a modular ontology to support document retrieval and knowledge extraction in Industry 5.0.- pyRDF2Vec: A Python Implementation and Extension of RDF2Vec.- Boosting Knowledge Graph Generation from Tabular Data with RML Views.- A knowledge graph of contentious terminology for inclusive representation of cultural heritage.- LegalHTML: a Representation Language for Legal Acts.- Whyis 2: An Open Source Framework for Knowledge Graph Development and Research.- In-Use.- Prototyping an End-User User Interface for the Solid Application Interoperability Specification under GDPR.- SemReasoner - A high-performance Knowledge Graph Store and rule-based Reasoner.- LIS: A Knowledge Graph-based Line Information System.- Combining Semantic Web and Machine Learning for Auditable Legal Key Element Extraction.- Understanding Customer Requirements - an Enterprise Knowledge Graph Approach.- Investigating Ontology-based data access with GitHub.- Enabling Live SPARQL Queries Over ConceptNet Using Triple Pattern Fragments.- Evaluation of a Representative Selection of SPARQL Query Engines using Wikidata.- MOSAIK: An Agent-Based Decentralized Control System with Stigmergy For A Transportation Scenario.

    1 in stock

    £80.74

© 2025 Book Curl
