Search results for "Author Niall Richard Murphy"
O'Reilly Media The Site Reliability Workbook: Practical ways to implement SRE
In 2016, Google’s Site Reliability Engineering book ignited an industry discussion on what it means to run production services today, and why reliability considerations are fundamental to service design. Now, Google engineers who worked on that bestseller introduce The Site Reliability Workbook, a hands-on companion that uses concrete examples to show you how to put SRE principles and practices to work in your environment. This new workbook not only combines practical examples from Google’s experiences, but also provides case studies from Google’s Cloud Platform customers who underwent this journey. Evernote, The Home Depot, The New York Times, and other companies outline hard-won experiences of what worked for them and what didn’t. Dive into this workbook and learn how to flesh out your own SRE practice, no matter what size your company is. You’ll learn: how to run reliable services in environments you don’t completely control, like cloud; practical applications of how to create, monitor, and run your services via Service Level Objectives; how to convert existing ops teams to SRE, including how to dig out of operational overload; and methods for starting SRE from either greenfield or brownfield.
£43.19
O'Reilly Media Reliable Machine Learning: Applying SRE Principles to ML in Production
Whether you're part of a small startup or a planet-spanning megacorp, this practical book shows data scientists, SREs, and business owners how to run ML reliably, effectively, and accountably within your organization. You'll gain insight into everything from how to do model monitoring in production to how to run a well-tuned model development team in a product organization. By applying an SRE mindset to machine learning, authors and engineering professionals Cathy Chen, Kranti Parisa, Niall Richard Murphy, D. Sculley, Todd Underwood, and featured guests show you how to run an efficient ML system. Whether you want to increase revenue, optimize decision-making, solve problems, or understand and influence customer behavior, you'll learn how to perform day-to-day ML tasks while keeping the bigger picture in mind. You'll examine: what ML is, including how it functions and what it relies on; conceptual frameworks for understanding how ML "loops" work; effective "productionization," and how it can be made easily monitorable, deployable, and operable; why ML systems make production troubleshooting more difficult, and how to get around those difficulties; and how ML, product, and production teams can communicate effectively.
£57.59