Learning from failure: why a total site outage can be a good thing

"Although an outage is a terrifying prospect, you should embrace it as an opportunity. Failure can expand and improve your understanding of your systems. Three years ago, Indeed suffered one of the worst outages in its history. No single fault or failure caused this outage. Rather, it was a com...

Bibliographic Details
Classification: Electronic Book
Corporate Author: O'Reilly Velocity
Format: Electronic; Congresses, conferences; Video
Language: English
Published: [Place of publication not identified] : O'Reilly Media, 2019.
Subjects:
Online Access: Full text (requires prior registration with an institutional email address)

MARC

LEADER 00000cgm a2200000 i 4500
001 OR_on1131863256
003 OCoLC
005 20231017213018.0
006 m o c
007 cr cna||||||||
007 vz czazuu
008 191218s2019 xx 035 o vleng d
040 |a UMI  |b eng  |e rda  |e pn  |c UMI  |d OCLCF  |d TOH  |d OCLCO  |d OCLCQ  |d OCLCO 
035 |a (OCoLC)1131863256 
037 |a CL0501000085  |b Safari Books Online 
050 4 |a TK5105.888 
049 |a UAMI 
100 1 |a Elman, Alex,  |e on-screen presenter. 
245 1 0 |a Learning from failure :  |b why a total site outage can be a good thing /  |c Alex Elman. 
264 1 |a [Place of publication not identified] :  |b O'Reilly Media,  |c 2019. 
264 4 |c ©2019 
300 |a 1 online resource (1 streaming video file (34 min., 16 sec.)) 
336 |a two-dimensional moving image  |b tdi  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
337 |a video  |b v  |2 rdamedia 
338 |a online resource  |b cr  |2 rdacarrier 
511 0 |a Presenter, Alex Elman. 
500 |a Title from title screen (viewed December 12, 2019). 
520 |a "Although an outage is a terrifying prospect, you should embrace it as an opportunity. Failure can expand and improve your understanding of your systems. Three years ago, Indeed suffered one of the worst outages in its history. No single fault or failure caused this outage. Rather, it was a complex interaction of bugs, design decisions, capacity loss, and poor situational awareness during incident response. Indeed learned valuable lessons from this event. It identified ways to make the systems more resilient and improved the approach to the incident lifecycle within the engineering culture. Alex Elman uses the narrative of this incident to demonstrate how a site-wide outage can inform increased resilience and reduced operational complexity. Learning from failure is a feedback loop rather than a one-off process. He applies Indeed's outage as a practical example of what an iteration of this loop can look like. He shares with other SREs the success that has risen from this failure. Indeed hasn't had a global site outage in the three years since this event. Alex begins with a discussion of failure to set the stage for delivering the incident background, then discusses incident response and situational awareness. He explains conducting incident postmortems and learning from failure and designing for reliability, including resilience patterns such as circuit breaking and graceful degradation. Finally, he gets into resilience testing, running chaos tests, and closing the feedback loop, leaving some time for a question and answer session. This session was recorded at the 2019 O'Reilly Velocity Conference in San Jose."--Resource description page 
590 |a O'Reilly  |b O'Reilly Online Learning: Academic/Public Library Edition 
650 0 |a Web site development. 
650 0 |a Computer system failures. 
650 0 |a Computer software  |x Development. 
650 6 |a Sites Web  |x Développement. 
650 6 |a Pannes système (Informatique) 
650 7 |a Computer software  |x Development.  |2 fast  |0 (OCoLC)fst00872537 
650 7 |a Computer system failures.  |2 fast  |0 (OCoLC)fst00872650 
650 7 |a Web site development.  |2 fast  |0 (OCoLC)fst01173243 
655 4 |a Electronic videos. 
710 2 |a O'Reilly & Associates,  |e publisher. 
711 2 |a O'Reilly Velocity  |d (2019 :  |c San Jose, Calif.)  |j issuing body. 
856 4 0 |u https://learning.oreilly.com/videos/~/0636920338451/?ar  |z Full text (requires prior registration with an institutional email address) 
994 |a 92  |b IZTAP