Optimizing HPC applications with Intel® cluster tools /

Optimizing HPC Applications with Intel® Cluster Tools takes the reader on a tour of the fast-growing area of high performance computing and the optimization of hybrid programs. These programs typically combine distributed memory and shared memory programming models and use the Message Passing Interf...

Bibliographic Details
Classification:	Electronic Book
Main author:	Supalov, Alexander (Author)
Format:	Electronic eBook
Language:	English
Published:	Berkeley, CA : ApressOpen, 2014.
Series:	Expert's voice in software engineering.
Subjects:	High performance computing. Supercomputers. Superinformatique. Superordinateurs. Computer science. Computer science
Online access:	Full text (requires prior registration with institutional email)

MARC


LEADER	00000cam a2200000Ii 4500
001	OR_ocn893478338
003	OCoLC
005	20231017213018.0
006	m o d
007	cr cnu\|\|\|unuuu
008	141021s2014 caua ob 001 0 eng d
040			\|a GW5XE \|b eng \|e rda \|e pn \|c GW5XE \|d COO \|d B24X7 \|d OCLCO \|d UMI \|d DEBBG \|d E7B \|d UPM \|d OCLCF \|d EBLCP \|d OCL \|d OCLCQ \|d Z5A \|d LIV \|d ESU \|d OCLCQ \|d VT2 \|d IOG \|d CEF \|d UAB \|d DEHBZ \|d VTS \|d REB \|d OCLCQ \|d MERER \|d YDXCP \|d U3W \|d AU@ \|d WYU \|d YOU \|d OCLCQ \|d OAPEN \|d OCLCQ \|d LEAUB \|d CNCEN \|d UWK \|d OCLCQ \|d OCLCO \|d DCT \|d ERF \|d OCLCQ \|d ADU \|d UKKNU \|d BRF \|d OCLCQ \|d DIPCC \|d S2H \|d OCLCQ \|d AAA \|d OCLCO \|d OCLCQ
019			\|a 895116739 \|a 896861824 \|a 897432769 \|a 900883023 \|a 1005798414 \|a 1026466454 \|a 1048144882 \|a 1055343303 \|a 1056437276 \|a 1059037595 \|a 1066501737 \|a 1086529949 \|a 1086952905 \|a 1103281025 \|a 1105798456 \|a 1107345047 \|a 1110281888 \|a 1110846153 \|a 1112554158 \|a 1112592095 \|a 1119459551 \|a 1129340472 \|a 1153035476 \|a 1159386313 \|a 1162639450 \|a 1163814492 \|a 1166053629 \|a 1166379255 \|a 1179888057 \|a 1192336585 \|a 1224922934 \|a 1228530265 \|a 1235834693 \|a 1240531568
020			\|a 9781430264972 \|q (electronic bk.)
020			\|a 1430264977 \|q (electronic bk.)
020			\|a 1430264969 \|q (print)
020			\|a 9781430264965 \|q (print)
020			\|z 9781430264965
024	7		\|a 10.1007/978-1-4302-6497-2 \|2 doi
024	8		\|a 9781430264972
029	1		\|a AU@ \|b 000058380604
029	1		\|a AU@ \|b 000060583841
029	1		\|a DEBBG \|b BV042491002
029	1		\|a DEBSZ \|b 43484179X
029	1		\|a GBVCP \|b 882741381
035			\|a (OCoLC)893478338 \|z (OCoLC)895116739 \|z (OCoLC)896861824 \|z (OCoLC)897432769 \|z (OCoLC)900883023 \|z (OCoLC)1005798414 \|z (OCoLC)1026466454 \|z (OCoLC)1048144882 \|z (OCoLC)1055343303 \|z (OCoLC)1056437276 \|z (OCoLC)1059037595 \|z (OCoLC)1066501737 \|z (OCoLC)1086529949 \|z (OCoLC)1086952905 \|z (OCoLC)1103281025 \|z (OCoLC)1105798456 \|z (OCoLC)1107345047 \|z (OCoLC)1110281888 \|z (OCoLC)1110846153 \|z (OCoLC)1112554158 \|z (OCoLC)1112592095 \|z (OCoLC)1119459551 \|z (OCoLC)1129340472 \|z (OCoLC)1153035476 \|z (OCoLC)1159386313 \|z (OCoLC)1162639450 \|z (OCoLC)1163814492 \|z (OCoLC)1166053629 \|z (OCoLC)1166379255 \|z (OCoLC)1179888057 \|z (OCoLC)1192336585 \|z (OCoLC)1224922934 \|z (OCoLC)1228530265 \|z (OCoLC)1235834693 \|z (OCoLC)1240531568
037			\|a CL0500000540 \|b Safari Books Online
050		4	\|a QA76.88
072		7	\|a UY \|2 bicssc
072		7	\|a COM014000 \|2 bisacsh
082	0	4	\|a 004.1/1 \|2 23
049			\|a UAMI
245	0	0	\|a Optimizing HPC applications with Intel® cluster tools / \|c Alexander Supalov, Andrey Semin, Michael Klemm, Christopher Dahnken.
264		1	\|a Berkeley, CA : \|b ApressOpen, \|c 2014.
264		2	\|a New York, NY : \|b Distributed to the Book trade worldwide by Springer
264		4	\|c ©2014
300			\|a 1 online resource (xxiv, 265 pages) : \|b illustrations
336			\|a text \|b txt \|2 rdacontent
337			\|a computer \|b c \|2 rdamedia
338			\|a online resource \|b cr \|2 rdacarrier
347			\|a text file
347			\|b PDF
490	1		\|a The expert's voice in software engineering
500			\|a Includes index.
588	0		\|a Online resource; title from PDF title page (SpringerLink, viewed October 21, 2014).
520			\|a Optimizing HPC Applications with Intel® Cluster Tools takes the reader on a tour of the fast-growing area of high performance computing and the optimization of hybrid programs. These programs typically combine distributed memory and shared memory programming models and use the Message Passing Interface (MPI) and OpenMP for multi-threading to achieve the ultimate goal of high performance at low power consumption on enterprise-class workstations and compute clusters. The book focuses on optimization for clusters consisting of the Intel® Xeon processor, but the optimization methodologies also apply to the Intel® Xeon Phi™ coprocessor and heterogeneous clusters mixing both architectures. Besides the tutorial and reference content, the authors address and refute many myths and misconceptions surrounding the topic. The text is augmented and enriched by descriptions of real-life situations.
504			\|a Includes bibliographical references and index.
542			\|f Copyright © 2014 by Apress Media, LLC, all rights reserved \|g 2014
546			\|a English.
505	0		\|a Ch. 1 No Time to Read This Book? -- Using Intel MPI Library -- Using Intel Composer XE -- Tuning Intel MPI Library -- Gather Built-in Statistics -- Optimize Process Placement -- Optimize Thread Placement -- Tuning Intel Composer XE -- Analyze Optimization and Vectorization Reports -- Use Interprocedural Optimization -- Summary -- References -- ch. 2 Overview of Platform Architectures -- Performance Metrics and Targets -- Latency, Throughput, Energy, and Power -- Peak Performance as the Ultimate Limit -- Scalability and Maximum Parallel Speedup -- Bottlenecks and a Bit of Queuing Theory -- Roofline Model -- Performance Features of Computer Architectures -- Increasing Single-Threaded Performance: Where You Can and Cannot Help -- Process More Data with SIMD Parallelism -- Distributed and Shared Memory Systems -- HPC Hardware Architecture Overview -- A Multicore Workstation or a Server Compute Node -- Coprocessor for Highly Parallel Applications -- Group of Similar Nodes Form an HPC Cluster -- Other Important Components of HPC Systems -- Summary -- References -- ch. 3 Top-Down Software Optimization -- The Three Levels and Their Impact on Performance -- System Level -- Application Level -- Microarchitecture Level -- Closed-Loop Methodology -- Workload, Application, and Baseline -- Iterating the Optimization Process -- Summary -- References -- ch. 4 Addressing System Bottlenecks -- Classifying System-Level Bottlenecks -- Identifying Issues Related to System Condition -- Characterizing Problems Caused by System Configuration -- Understanding System-Level Performance Limits -- Checking General Compute Subsystem Performance -- Testing Memory Subsystem Performance -- Testing I/O Subsystem Performance -- Characterizing Application System-Level Issues -- Selecting Performance Characterization Tools -- Monitoring the I/O Utilization -- Analyzing Memory Bandwidth -- Summary -- References -- ch. 5 Addressing Application Bottlenecks: Distributed Memory -- Algorithm for Optimizing MPI Performance -- Comprehending the Underlying MPI Performance -- Recalling Some Benchmarking Basics -- Gauging Default Intranode Communication Performance -- Gauging Default Internode Communication Performance -- Discovering Default Process Layout and Pinning Details -- Gauging Physical Core Performance -- Doing Initial Performance Analysis -- Is It Worth the Trouble? -- Getting an Overview of Scalability and Performance -- Learning Application Behavior -- Choosing Representative Workload(s) -- Balancing Process and Thread Parallelism -- Doing a Scalability Review -- Analyzing the Details of the Application Behavior -- Choosing the Optimization Objective -- Detecting Load Imbalance -- Dealing with Load Imbalance -- Classifying Load Imbalance -- Addressing Load Imbalance -- Optimizing MPI Performance -- Classifying the MPI Performance Issues -- Addressing MPI Performance Issues -- Mapping Application onto the Platform -- Tuning the Intel MPI Library -- Optimizing Application for Intel MPI -- Using Advanced Analysis Techniques -- Automatically Checking MPI Program Correctness -- Comparing Application Traces -- Instrumenting Application Code -- Correlating MPI and Hardware Events -- Summary -- References -- ch. 6 Addressing Application Bottlenecks: Shared Memory -- Profiling Your Application -- Using VTune Amplifier XE for Hotspots Profiling -- Hotspots for the HPCG Benchmark -- Compiler-Assisted Loop/Function Profiling -- Sequential Code and Detecting Load Imbalances -- Thread Synchronization and Locking -- Dealing with Memory Locality and NUMA Effects -- Thread and Process Pinning -- Controlling OpenMP Thread Placement -- Thread Placement in Hybrid Applications -- Summary -- References -- ch. 7 Addressing Application Bottlenecks: Microarchitecture -- Overview of a Modern Processor Pipeline -- Pipelined Execution -- Out-of-order vs. In-order Execution -- Superscalar Pipelines -- SIMD Execution -- Speculative Execution: Branch Prediction -- Memory Subsystem -- Putting It All Together: A Final Look at the Sandy Bridge Pipeline -- A Top-down Method for Categorizing the Pipeline Performance -- Intel Composer XE Usage for Microarchitecture Optimizations -- Basic Compiler Usage and Optimization -- Using Optimization and Vectorization Reports to Read the Compiler's Mind -- Optimizing for Vectorization -- Dealing with Disambiguation -- Dealing with Branches -- When Optimization Leads to Wrong Results -- Analyzing Pipeline Performance with Intel VTune Amplifier XE -- Using a Standard Library Method -- Summary -- References -- ch. 8 Application Design Considerations -- Abstraction and Generalization of the Platform Architecture -- Types of Abstractions -- Levels of Abstraction and Complexities -- Raw Hardware vs. Virtualized Hardware in the Cloud -- Questions about Application Design -- Designing for Performance and Scaling -- Designing for Flexibility and Performance Portability -- Understanding Bounds and Projecting Bottlenecks -- Data Storage or Transfer vs. Recalculation -- Total Productivity Assessment -- Summary -- References.
590			\|a O'Reilly \|b O'Reilly Online Learning: Academic/Public Library Edition
650		0	\|a High performance computing.
650		0	\|a Supercomputers.
650		6	\|a Superinformatique.
650		6	\|a Superordinateurs.
650		7	\|a Computer science. \|2 bicssc
650		7	\|a High performance computing. \|2 fast \|0 (OCoLC)fst00956032
650		7	\|a Supercomputers. \|2 fast \|0 (OCoLC)fst01138790
653			\|a Computer science
700	1		\|a Supalov, Alexander, \|e author.
758			\|i Is found in: \|a Apress \|1 https://openresearchlibrary.org/module/8b6e954c-c94f-4241-bea2-12704534d0e6
776	0	8	\|i Printed edition: \|z 9781430264965
830		0	\|a Expert's voice in software engineering.
856	4	0	\|u https://learning.oreilly.com/library/view/~/9781430264972/?ar \|z Full text (requires prior registration with institutional email)
938			\|a Books 24x7 \|b B247 \|n bks00073674
938			\|a ProQuest Ebook Central \|b EBLB \|n EBL3091883
938			\|a ebrary \|b EBRY \|n ebr10952632
938			\|a Knowledge Unlatched \|b KNOW \|n 8efd21de-950b-4d01-ad23-c75fdb2e75de
938			\|a OAPEN Foundation \|b OPEN \|n 1001835
938			\|a YBP Library Services \|b YANK \|n 12143232
938			\|a DCS UAT TEST 8 \|b TEST \|n 1001835
994			\|a 92 \|b IZTAP
