{"product_id":"approximate-dynamic-programmin-9780470604458","title":"Approximate Dynamic Programming","description":"\u003cb\u003eBook Synopsis\u003c\/b\u003e\u003cbr\u003e\u003cb\u003ePraise for the \u003ci\u003eFirst Edition\u003c\/i\u003e\u003c\/b\u003e  \u003cp\u003eFinally, a book devoted to dynamic programming and written using the language of operations research (OR)! This beautiful book fills a gap in the libraries of OR specialists and practitioners.\u003cbr\u003e \u003cb\u003e\u003ci\u003eComputing Reviews\u003c\/i\u003e\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e\u003cb\u003eThis new edition showcases a focus on modeling and computation for complex classes of approximate dynamic programming problems\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003eUnderstanding approximate dynamic programming (ADP) is vital in order to develop practical and high-quality solutions to complex industrial problems, particularly when those problems involve making decisions in the presence of uncertainty. 
\u003ci\u003eApproximate Dynamic Programming\u003c\/i\u003e, Second Edition uniquely integrates four distinct disciplines (Markov decision processes, mathematical programming, simulation, and statistics) to demonstrate how to successfully approach, model, and solve a wide range of real-life problems using ADP.\u003c\/p\u003e \u003cp\u003eThe book continues to bridge the gap bet\u003cbr\u003e\u003cbr\u003e\u003cb\u003eTable of Contents\u003c\/b\u003e\u003cbr\u003e\u003cb\u003ePreface to the Second Edition xi\u003c\/b\u003e  \u003c\/p\u003e\u003cp\u003e\u003cb\u003ePreface to the First Edition xv\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e\u003cb\u003eAcknowledgments xvii\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e\u003cb\u003e1 The Challenges of Dynamic Programming 1\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e1.1 A Dynamic Programming Example: A Shortest Path Problem, 2\u003c\/p\u003e \u003cp\u003e1.2 The Three Curses of Dimensionality, 3\u003c\/p\u003e \u003cp\u003e1.3 Some Real Applications, 6\u003c\/p\u003e \u003cp\u003e1.4 Problem Classes, 11\u003c\/p\u003e \u003cp\u003e1.5 The Many Dialects of Dynamic Programming, 15\u003c\/p\u003e \u003cp\u003e1.6 What Is New in This Book?, 17\u003c\/p\u003e \u003cp\u003e1.7 Pedagogy, 19\u003c\/p\u003e \u003cp\u003e1.8 Bibliographic Notes, 22\u003c\/p\u003e \u003cp\u003e\u003cb\u003e2 Some Illustrative Models 25\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e2.1 Deterministic Problems, 26\u003c\/p\u003e \u003cp\u003e2.2 Stochastic Problems, 31\u003c\/p\u003e \u003cp\u003e2.3 Information Acquisition Problems, 47\u003c\/p\u003e \u003cp\u003e2.4 A Simple Modeling Framework for Dynamic Programs, 50\u003c\/p\u003e \u003cp\u003e2.5 Bibliographic Notes, 54\u003c\/p\u003e \u003cp\u003eProblems, 54\u003c\/p\u003e \u003cp\u003e\u003cb\u003e3 Introduction to Markov Decision Processes 57\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e3.1 The Optimality Equations, 58\u003c\/p\u003e \u003cp\u003e3.2 Finite Horizon Problems, 65\u003c\/p\u003e 
\u003cp\u003e3.3 Infinite Horizon Problems, 66\u003c\/p\u003e \u003cp\u003e3.4 Value Iteration, 68\u003c\/p\u003e \u003cp\u003e3.5 Policy Iteration, 74\u003c\/p\u003e \u003cp\u003e3.6 Hybrid Value-Policy Iteration, 75\u003c\/p\u003e \u003cp\u003e3.7 Average Reward Dynamic Programming, 76\u003c\/p\u003e \u003cp\u003e3.8 The Linear Programming Method for Dynamic Programs, 77\u003c\/p\u003e \u003cp\u003e3.9 Monotone Policies*, 78\u003c\/p\u003e \u003cp\u003e3.10 Why Does It Work?**, 84\u003c\/p\u003e \u003cp\u003e3.11 Bibliographic Notes, 103\u003c\/p\u003e \u003cp\u003eProblems, 103\u003c\/p\u003e \u003cp\u003e\u003cb\u003e4 Introduction to Approximate Dynamic Programming 111\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e4.1 The Three Curses of Dimensionality (Revisited), 112\u003c\/p\u003e \u003cp\u003e4.2 The Basic Idea, 114\u003c\/p\u003e \u003cp\u003e4.3 \u003ci\u003eQ\u003c\/i\u003e-Learning and SARSA, 122\u003c\/p\u003e \u003cp\u003e4.4 Real-Time Dynamic Programming, 126\u003c\/p\u003e \u003cp\u003e4.5 Approximate Value Iteration, 127\u003c\/p\u003e \u003cp\u003e4.6 The Post-Decision State Variable, 129\u003c\/p\u003e \u003cp\u003e4.7 Low-Dimensional Representations of Value Functions, 144\u003c\/p\u003e \u003cp\u003e4.8 So Just What Is Approximate Dynamic Programming?, 146\u003c\/p\u003e \u003cp\u003e4.9 Experimental Issues, 149\u003c\/p\u003e \u003cp\u003e4.10 But Does It Work?, 155\u003c\/p\u003e \u003cp\u003e4.11 Bibliographic Notes, 156\u003c\/p\u003e \u003cp\u003eProblems, 158\u003c\/p\u003e \u003cp\u003e\u003cb\u003e5 Modeling Dynamic Programs 167\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e5.1 Notational Style, 169\u003c\/p\u003e \u003cp\u003e5.2 Modeling Time, 170\u003c\/p\u003e \u003cp\u003e5.3 Modeling Resources, 174\u003c\/p\u003e \u003cp\u003e5.4 The States of Our System, 178\u003c\/p\u003e \u003cp\u003e5.5 Modeling Decisions, 187\u003c\/p\u003e \u003cp\u003e5.6 The Exogenous Information Process, 189\u003c\/p\u003e \u003cp\u003e5.7 The Transition 
Function, 198\u003c\/p\u003e \u003cp\u003e5.8 The Objective Function, 206\u003c\/p\u003e \u003cp\u003e5.9 A Measure-Theoretic View of Information**, 211\u003c\/p\u003e \u003cp\u003e5.10 Bibliographic Notes, 213\u003c\/p\u003e \u003cp\u003eProblems, 214\u003c\/p\u003e \u003cp\u003e\u003cb\u003e6 Policies 221\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e6.1 Myopic Policies, 224\u003c\/p\u003e \u003cp\u003e6.2 Lookahead Policies, 224\u003c\/p\u003e \u003cp\u003e6.3 Policy Function Approximations, 232\u003c\/p\u003e \u003cp\u003e6.4 Value Function Approximations, 235\u003c\/p\u003e \u003cp\u003e6.5 Hybrid Strategies, 239\u003c\/p\u003e \u003cp\u003e6.6 Randomized Policies, 242\u003c\/p\u003e \u003cp\u003e6.7 How to Choose a Policy?, 244\u003c\/p\u003e \u003cp\u003e6.8 Bibliographic Notes, 247\u003c\/p\u003e \u003cp\u003eProblems, 247\u003c\/p\u003e \u003cp\u003e\u003cb\u003e7 Policy Search 249\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e7.1 Background, 250\u003c\/p\u003e \u003cp\u003e7.2 Gradient Search, 253\u003c\/p\u003e \u003cp\u003e7.3 Direct Policy Search for Finite Alternatives, 256\u003c\/p\u003e \u003cp\u003e7.4 The Knowledge Gradient Algorithm for Discrete Alternatives, 262\u003c\/p\u003e \u003cp\u003e7.5 Simulation Optimization, 270\u003c\/p\u003e \u003cp\u003e7.6 Why Does It Work?**, 274\u003c\/p\u003e \u003cp\u003e7.7 Bibliographic Notes, 285\u003c\/p\u003e \u003cp\u003eProblems, 286\u003c\/p\u003e \u003cp\u003e\u003cb\u003e8 Approximating Value Functions 289\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e8.1 Lookup Tables and Aggregation, 290\u003c\/p\u003e \u003cp\u003e8.2 Parametric Models, 304\u003c\/p\u003e \u003cp\u003e8.3 Regression Variations, 314\u003c\/p\u003e \u003cp\u003e8.4 Nonparametric Models, 316\u003c\/p\u003e \u003cp\u003e8.5 Approximations and the Curse of Dimensionality, 325\u003c\/p\u003e \u003cp\u003e8.6 Why Does It Work?**, 328\u003c\/p\u003e \u003cp\u003e8.7 Bibliographic Notes, 333\u003c\/p\u003e \u003cp\u003eProblems, 334\u003c\/p\u003e 
\u003cp\u003e\u003cb\u003e9 Learning Value Function Approximations 337\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e9.1 Sampling the Value of a Policy, 337\u003c\/p\u003e \u003cp\u003e9.2 Stochastic Approximation Methods, 347\u003c\/p\u003e \u003cp\u003e9.3 Recursive Least Squares for Linear Models, 349\u003c\/p\u003e \u003cp\u003e9.4 Temporal Difference Learning with a Linear Model, 356\u003c\/p\u003e \u003cp\u003e9.5 Bellman’s Equation Using a Linear Model, 358\u003c\/p\u003e \u003cp\u003e9.6 Analysis of TD(0), LSTD, and LSPE Using a Single State, 364\u003c\/p\u003e \u003cp\u003e9.7 Gradient-Based Methods for Approximate Value Iteration*, 366\u003c\/p\u003e \u003cp\u003e9.8 Least Squares Temporal Differencing with Kernel Regression*, 371\u003c\/p\u003e \u003cp\u003e9.9 Value Function Approximations Based on Bayesian Learning*, 373\u003c\/p\u003e \u003cp\u003e9.10 Why Does It Work*, 376\u003c\/p\u003e \u003cp\u003e9.11 Bibliographic Notes, 379\u003c\/p\u003e \u003cp\u003eProblems, 381\u003c\/p\u003e \u003cp\u003e\u003cb\u003e10 Optimizing While Learning 383\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e10.1 Overview of Algorithmic Strategies, 385\u003c\/p\u003e \u003cp\u003e10.2 Approximate Value Iteration and \u003ci\u003eQ\u003c\/i\u003e-Learning Using Lookup Tables, 386\u003c\/p\u003e \u003cp\u003e10.3 Statistical Bias in the Max Operator, 397\u003c\/p\u003e \u003cp\u003e10.4 Approximate Value Iteration and \u003ci\u003eQ\u003c\/i\u003e-Learning Using Linear Models, 400\u003c\/p\u003e \u003cp\u003e10.5 Approximate Policy Iteration, 402\u003c\/p\u003e \u003cp\u003e10.6 The Actor–Critic Paradigm, 408\u003c\/p\u003e \u003cp\u003e10.7 Policy Gradient Methods, 410\u003c\/p\u003e \u003cp\u003e10.8 The Linear Programming Method Using Basis Functions, 411\u003c\/p\u003e \u003cp\u003e10.9 Approximate Policy Iteration Using Kernel Regression*, 413\u003c\/p\u003e \u003cp\u003e10.10 Finite Horizon Approximations for Steady-State Applications, 415\u003c\/p\u003e 
\u003cp\u003e10.11 Bibliographic Notes, 416\u003c\/p\u003e \u003cp\u003eProblems, 418\u003c\/p\u003e \u003cp\u003e\u003cb\u003e11 Adaptive Estimation and Stepsizes 419\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e11.1 Learning Algorithms and Stepsizes, 420\u003c\/p\u003e \u003cp\u003e11.2 Deterministic Stepsize Recipes, 425\u003c\/p\u003e \u003cp\u003e11.3 Stochastic Stepsizes, 433\u003c\/p\u003e \u003cp\u003e11.4 Optimal Stepsizes for Nonstationary Time Series, 437\u003c\/p\u003e \u003cp\u003e11.5 Optimal Stepsizes for Approximate Value Iteration, 447\u003c\/p\u003e \u003cp\u003e11.6 Convergence, 449\u003c\/p\u003e \u003cp\u003e11.7 Guidelines for Choosing Stepsize Formulas, 451\u003c\/p\u003e \u003cp\u003e11.8 Bibliographic Notes, 452\u003c\/p\u003e \u003cp\u003eProblems, 453\u003c\/p\u003e \u003cp\u003e\u003cb\u003e12 Exploration Versus Exploitation 457\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e12.1 A Learning Exercise: The Nomadic Trucker, 457\u003c\/p\u003e \u003cp\u003e12.2 An Introduction to Learning, 460\u003c\/p\u003e \u003cp\u003e12.3 Heuristic Learning Policies, 464\u003c\/p\u003e \u003cp\u003e12.4 Gittins Indexes for Online Learning, 470\u003c\/p\u003e \u003cp\u003e12.5 The Knowledge Gradient Policy, 477\u003c\/p\u003e \u003cp\u003e12.6 Learning with a Physical State, 482\u003c\/p\u003e \u003cp\u003e12.7 Bibliographic Notes, 492\u003c\/p\u003e \u003cp\u003eProblems, 493\u003c\/p\u003e \u003cp\u003e\u003cb\u003e13 Value Function Approximations for Resource Allocation Problems 497\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e13.1 Value Functions versus Gradients, 498\u003c\/p\u003e \u003cp\u003e13.2 Linear Approximations, 499\u003c\/p\u003e \u003cp\u003e13.3 Piecewise-Linear Approximations, 501\u003c\/p\u003e \u003cp\u003e13.4 Solving a Resource Allocation Problem Using Piecewise-Linear Functions, 505\u003c\/p\u003e \u003cp\u003e13.5 The SHAPE Algorithm, 509\u003c\/p\u003e \u003cp\u003e13.6 Regression Methods, 513\u003c\/p\u003e \u003cp\u003e13.7 Cutting 
Planes*, 516\u003c\/p\u003e \u003cp\u003e13.8 Why Does It Work?**, 528\u003c\/p\u003e \u003cp\u003e13.9 Bibliographic Notes, 535\u003c\/p\u003e \u003cp\u003eProblems, 536\u003c\/p\u003e \u003cp\u003e\u003cb\u003e14 Dynamic Resource Allocation Problems 541\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e14.1 An Asset Acquisition Problem, 541\u003c\/p\u003e \u003cp\u003e14.2 The Blood Management Problem, 547\u003c\/p\u003e \u003cp\u003e14.3 A Portfolio Optimization Problem, 557\u003c\/p\u003e \u003cp\u003e14.4 A General Resource Allocation Problem, 560\u003c\/p\u003e \u003cp\u003e14.5 A Fleet Management Problem, 573\u003c\/p\u003e \u003cp\u003e14.6 A Driver Management Problem, 580\u003c\/p\u003e \u003cp\u003e14.7 Bibliographic Notes, 585\u003c\/p\u003e \u003cp\u003eProblems, 586\u003c\/p\u003e \u003cp\u003e\u003cb\u003e15 Implementation Challenges 593\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e15.1 Will ADP Work for Your Problem?, 593\u003c\/p\u003e \u003cp\u003e15.2 Designing an ADP Algorithm for Complex Problems, 594\u003c\/p\u003e \u003cp\u003e15.3 Debugging an ADP Algorithm, 596\u003c\/p\u003e \u003cp\u003e15.4 Practical Issues, 597\u003c\/p\u003e \u003cp\u003e15.5 Modeling Your Problem, 602\u003c\/p\u003e \u003cp\u003e15.6 Online versus Offline Models, 604\u003c\/p\u003e \u003cp\u003e15.7 If It Works, Patent It!, 606\u003c\/p\u003e \u003cp\u003e\u003cb\u003eBibliography 607\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e\u003cb\u003eIndex 623\u003c\/b\u003e\u003c\/p\u003e","brand":"John Wiley \u0026 Sons Inc","offers":[{"title":"Default Title","offer_id":49402377896279,"sku":"9780470604458","price":108.86,"currency_code":"GBP","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0817\/1739\/5799\/files\/9780470604458.jpg?v=1730480210","url":"https:\/\/bookcurl.com\/products\/approximate-dynamic-programmin-9780470604458","provider":"Book Curl","version":"1.0","type":"link"}