Cargando…

Intel Xeon Phi processor high performance programming /

Detalles Bibliográficos
Clasificación:	Libro Electrónico
Autores principales:	Jeffers, Jim (Computer engineer) (Autor), Reinders, James (Autor), Sodani, Avinash (Autor)
Formato:	Electrónico eBook
Idioma:	Inglés
Publicado:	Cambridge, MA : Morgan Kaufmann is an imprint of Elsevier, [2016]
Edición:	Knights Landing edition.
Temas:	High performance processors. Computer programming. High performance computing. Processeurs �a hautes performances. Programmation (Informatique) Superinformatique. computer programming. COMPUTERS > Computer Literacy. COMPUTERS > Computer Science. COMPUTERS > Data Processing. COMPUTERS > Hardware > General. COMPUTERS > Information Technology. COMPUTERS > Machine Theory. COMPUTERS > Reference. COMPUTER SYSTEMS PERFORMANCE. COMPUTER PROGRAMMING.
Acceso en línea:	Texto completo

MARC


LEADER	00000cam a2200000 i 4500
001	SCIDIR_ocn951217526
003	OCoLC
005	20231120112113.0
006	m o d
007	cr cnu\|\|\|unuuu
008	160606t20162016mau ob 001 0 eng d
040			\|a N$T \|b eng \|e rda \|e pn \|c N$T \|d YDXCP \|d IDEBK \|d N$T \|d UIU \|d OPELS \|d EBLCP \|d OCLCF \|d COO \|d IDB \|d UPM \|d DEBSZ \|d OTZ \|d MERUC \|d OCLCQ \|d U3W \|d D6H \|d WRM \|d NLE \|d AU@ \|d OCLCQ \|d DCT \|d NAG \|d OCLCQ \|d S2H \|d OCLCO \|d VT2 \|d OCLCO \|d OCLCQ
019			\|a 951594142 \|a 1110641670 \|a 1235827802 \|a 1244445475 \|a 1262689167
020			\|a 9780128091951 \|q (electronic bk.)
020			\|a 0128091959 \|q (electronic bk.)
020			\|a 0128091940 \|q (paperback)
020			\|a 9780128091944
020			\|z 9780128091944 \|q (print)
035			\|a (OCoLC)951217526 \|z (OCoLC)951594142 \|z (OCoLC)1110641670 \|z (OCoLC)1235827802 \|z (OCoLC)1244445475 \|z (OCoLC)1262689167
050		4	\|a QA76.88
072		7	\|a COM \|x 013000 \|2 bisacsh
072		7	\|a COM \|x 014000 \|2 bisacsh
072		7	\|a COM \|x 018000 \|2 bisacsh
072		7	\|a COM \|x 067000 \|2 bisacsh
072		7	\|a COM \|x 032000 \|2 bisacsh
072		7	\|a COM \|x 037000 \|2 bisacsh
072		7	\|a COM \|x 052000 \|2 bisacsh
082	0	4	\|a 004.1/1 \|2 23
100	1		\|a Jeffers, Jim \|c (Computer engineer), \|e author.
245	1	0	\|a Intel Xeon Phi processor high performance programming / \|c by Jim Jeffers, James Reinders, Avinash Sodani.
250			\|a Knights Landing edition.
264		1	\|a Cambridge, MA : \|b Morgan Kaufmann is an imprint of Elsevier, \|c [2016]
264		4	\|c �2016
300			\|a 1 online resource
336			\|a text \|b txt \|2 rdacontent
337			\|a computer \|b c \|2 rdamedia
338			\|a online resource \|b cr \|2 rdacarrier
504			\|a Includes bibliographical references and index.
588	0		\|a Vendor-supplied metadata.
505	0		\|a Machine generated contents note: ch. 1 Introduction -- Introduction to Many-Core Programming -- Trend: More Parallelism -- Why Intel� Xeon Phi["! Processors Are Needed -- Processors Versus Coprocessor -- Measuring Readiness for Highly Parallel Execution -- What About GPUs? -- Enjoy the Lack of Porting Needed but Still Tune! -- Transformation for Performance -- Hyper-Threading Versus Multithreading -- Programming Models -- Why We Could Skip To Section II Now -- For More Information -- ch. 2 Knights Landing Overview -- Overview -- Instruction Set -- Architecture Overview -- Motivation: Our Vision and Purpose -- Summary -- For More Information -- ch. 3 Programming MCDRAM and Cluster Modes -- Programming for Cluster Modes -- Programming for Memory Modes -- Query Memory Mode and MCDRAM Available -- SNC Performance Implications of Allocation and Threading -- How to Not Hard Code the NUMA Node Numbers -- Approaches to Determining What to Put in MCDRAM.
505	0		\|a Note continued: Why Rebooting Is Required to Change Modes -- BIOS -- Summary -- For More Information -- ch. 4 Knights Landing Architecture -- Tile Architecture -- Cluster Modes -- Memory Interleaving -- Memory Modes -- Interactions of Cluster and Memory Modes -- Summary -- For More Information -- ch. 5 Intel Omni-Path Fabric -- Overview -- Performance and Scalability -- Transport Layer APIs -- Quality of Service -- Virtual Fabrics -- Unicast Address Resolution -- Multicast Address Resolution -- Summary -- For More Information -- ch. 6 [�]arch Optimization Advice -- Best Performance From 1, 2, or 4 Threads Per Core, Rarely 3 -- Memory Subsystem -- [�]arch Nuances (Tile) -- Direct Mapped MCDRAM Cache -- Advice: Use AVX-512 -- Summary -- For More Information -- ch. 7 Programming Overview for Knights Landing -- To Refactor, or Not to Refactor, That Is the Question -- Evolutionary Optimization of Applications -- Revolutionary Optimization of Applications.
505	0		\|a Note continued: Know When to Hold'em and When to Fold'em -- For More Information -- ch. 8 Tasks and Threads -- OpenMP -- Fortran 2008 -- Intel TBB -- hStreams -- Summary -- For More Information -- ch. 9 Vectorization -- Why Vectorize? -- How to Vectorize -- Three Approaches to Achieving Vectorization -- Six-Step Vectorization Methodology -- Streaming Through Caches: Data Layout, Alignment, Prefetching, and so on -- Compiler Tips -- Compiler Options -- Compiler Directives -- Use Array Sections to Encourage Vectorization -- Look at What the Compiler Created: Assembly Code Inspection -- Numerical Result Variations with Vectorization -- Summary -- For More Information -- ch. 10 Vectorization Advisor -- Getting Started with Intel Advisor for Knights Landing -- Enabling and Improving AVX-512 Code with the Survey Report -- Memory Access Pattern Report -- AVX-512 Gather/Scatter Profiler -- Mask Utilization and FLOPS Profiler -- Advisor Roofline Report.
505	0		\|a Note continued: Explore AVX-512 Code Characteristics Without AVX-512 Hardware -- Example -- Analysis of a Computational Chemistry Code -- Summary -- For More Information -- ch. 11 Vectorization with SDLT -- What Is SDLT? -- Getting Started -- SDLT Basics -- Example Normalizing 3d Points with SIMD -- What Is Wrong with AOS Memory Layout and SIMD? -- SIMD Prefers Unit-Stride Memory Accesses -- Alpha-Blended Overlay Reference -- Alpha-Blended Overlay With SDLT -- Additional Features -- Summary -- For More Information -- ch. 12 Vectorization with AVX-512 Intrinsics -- What Are Intrinsics? -- AVX-512 Overview -- Migrating From Knights Corner -- AVX-512 Detection -- Learning AVX-512 Instructions -- Learning AVX-512 Intrinsics -- Step-by-Step Example Using AVX-512 Intrinsics -- Results Using Our Intrinsics Code -- For More Information -- ch. 13 Performance Libraries -- Intel Performance Library Overview -- Intel Math Kernel Library Overview.
505	0		\|a Note continued: Intel Data Analytics Library Overview -- Together: MKL and DAAL -- Intel Integrated Performance Primitives Library Overview -- Intel Performance Libraries and Intel Compilers -- Native (Direct) Library Usage -- Offloading to Knights Landing While Using a Library -- Precision Choices and Variations -- Performance Tip for Faster Dynamic Libraries -- For More Information -- ch. 14 Profiling and Timing -- Introduction to Knight Landing Tuning -- Event-Monitoring Registers -- Efficiency Metrics -- Potential Performance Issues -- Intel VTune Amplifier XE Product -- Performance Application Programming Interface -- MPI Analysis: ITAC -- HPCToolkit -- Tuning and Analysis Utilities -- Timing -- Summary -- For More Information -- ch. 15 MPI -- Internode Parallelism -- MPI on Knights Landing -- MPI Overview -- How to Run MPI Applications -- Analyzing MPI Application Runs -- Tuning of MPI Applications -- Heterogeneous Clusters -- Recent Trends in MPI Coding.
505	0		\|a Note continued: Putting it all Together -- Summary -- For More Information -- ch. 16 PGAS Programming Models -- To Share or not to Share -- Why Use PGAS on Knights Landing? -- Programming with PGAS -- Performance Evaluation -- Beyond PGAS -- Summary -- For More Information -- ch. 17 Software-Defined Visualization -- Motivation for Software-Defined Visualization -- Software-Defined Visualization Architecture -- OpenSWR: OpenGL Raster-Graphics Software Rendering -- Embree: High-Performance Ray Tracing Kernel Library -- OSPRay: Scalable Ray Tracing Framework -- Summary -- Image Attributions -- For More Information -- ch. 18 Offload to Knights Landing -- Offload Programming Model-Using with Knights Landing -- Processors Versus Coprocessor -- Offload Model Considerations -- OpenMP Target Directives -- Concurrent Host and Target Execution -- Offload Over Fabric -- Summary -- For More Information -- ch. 19 Power Analysis -- Power Demand Gates Exascale -- Power 101.
505	0		\|a Note continued: Hardware-Based Power Analysis Techniques -- Software-Based Knights Landing Power Analyzer -- ManyCore Platform Software Package Power Tools -- Running Average Power Limit -- Performance Profiling on Knights Landing -- Intel Remote Management Module -- Summary -- For More Information -- ch. 20 Optimizing Classical Molecular Dynamics in LAMMPS -- Molecular Dynamics -- LAMMPS -- Knights Landing Processors -- LAMMPS Optimizations -- Data Alignment -- Data Types and Layout -- Vectorization -- Neighbor List -- Long-Range Electrostatics -- MPI and OpenMP Parallelization -- Performance Results -- System, Build, and Run Configurations -- Workloads -- Organic Photovoltaic Molecules -- Hydrocarbon Mixtures -- Rhodopsin Protein in Solvated Lipid Bilayer -- Coarse Grain Liquid Crystal Simulation -- Coarse-Grain Water Simulation -- Summary -- Acknowledgment -- For More Information -- ch. 21 High Performance Seismic Simulations -- High-Order Seismic Simulations.
505	0		\|a Note continued: Numerical Background -- Application Characteristics -- Intel Architecture as Compute Engine -- Highly-Efficient Small Matrix Kernels -- Sparse Matrix Kernel Generation and Sparse/Dense Kernel Selection -- Dense Matrix Kernel Generation: AVX2 -- Dense Matrix Kernel Generation: AVX-512 -- Kernel Performance Benchmarking -- Incorporating Knights Landing's Different Memory Subsystems -- Performance Evaluation -- Mount Merapi -- 1992 Landers -- Summary and Take-Aways -- For More Information -- ch. 22 Weather Research and Forecasting (WRF) -- WRF Overview -- WRF Execution Profile: Relatively Flat -- History of WRF on Intel Many-Core (Intel Xeon Phi Product Line) -- Our Early Experiences with WRF on Knights Landing -- Compiling WRF for Intel Xeon and Intel Xeon Phi Systems -- WRF CONUS12km Benchmark Performance -- MCDRAM Bandwidth -- Vectorization: Boost of AVX-512 Over AVX2 -- Core Scaling -- Summary -- For More Information -- ch. 23 N-Body simulation.
505	0		\|a Note continued: Parallel Programming for Noncomputer Scientists -- Step-by-Step Improvements -- N-Body Simulation -- Optimization -- Initial Implementation (Optimization Step 0) -- Thread Parallelism (Optimization Step 1) -- Scalar Performance Tuning (Optimization Step 2) -- Vectorization with SOA (Optimization Step 3) -- Memory Traffic (Optimization Step 4) -- Impact of MCDRAM on Performance -- Summary -- For More Information -- ch. 24 Machine Learning -- Convolutional Neural Networks -- OverFeat-FAST Results -- For More Information -- ch. 25 Trinity Workloads -- Out of the Box Performance -- Optimizing MiniGhost OpenMP Performance -- Summary -- For More Information -- ch. 26 Quantum Chromodynamics -- LQCD -- The QPhiX Library and Code Generator -- Wilson-Dslash Operator -- Configuring the QPhiX Code Generator -- The Experimental Setup -- Results -- Conclusion -- For More Information.
650		0	\|a High performance processors.
650		0	\|a Computer programming.
650		0	\|a High performance computing.
650		6	\|a Processeurs �a hautes performances. \|0 (CaQQLa)201-0331596
650		6	\|a Programmation (Informatique) \|0 (CaQQLa)201-0002014
650		6	\|a Superinformatique. \|0 (CaQQLa)201-0297410
650		7	\|a computer programming. \|2 aat \|0 (CStmoGRI)aat300054641
650		7	\|a COMPUTERS \|x Computer Literacy. \|2 bisacsh
650		7	\|a COMPUTERS \|x Computer Science. \|2 bisacsh
650		7	\|a COMPUTERS \|x Data Processing. \|2 bisacsh
650		7	\|a COMPUTERS \|x Hardware \|x General. \|2 bisacsh
650		7	\|a COMPUTERS \|x Information Technology. \|2 bisacsh
650		7	\|a COMPUTERS \|x Machine Theory. \|2 bisacsh
650		7	\|a COMPUTERS \|x Reference. \|2 bisacsh
650		7	\|a Computer programming. \|2 fast \|0 (OCoLC)fst00872390
650		7	\|a High performance computing. \|2 fast \|0 (OCoLC)fst00956032
650		7	\|a High performance processors. \|2 fast \|0 (OCoLC)fst00956040
650		7	\|a COMPUTER SYSTEMS PERFORMANCE. \|2 nasat
650		7	\|a COMPUTER PROGRAMMING. \|2 nasat
700	1		\|a Reinders, James, \|e author.
700	1		\|a Sodani, Avinash, \|e author.
776	0	8	\|i Print version: \|a Jeffers, James. \|t Intel Xeon Phi Processor High Performance Programming : Knights Landing Edition. \|d : Elsevier Science, �2016 \|z 9780128091944
856	4	0	\|u https://sciencedirect.uam.elogim.com/science/book/9780128091944 \|z Texto completo

Intel Xeon Phi processor high performance programming /

MARC

Ejemplares similares