Heterogeneous system architecture: a new compute platform infrastructure

This book presents a next-generation hardware platform, and associated software, that allows processors of different types to work efficiently and cooperatively in shared memory from a single source program. HSA also defines a virtual ISA for parallel routines or kernels, which is vendor and ISA ind...

Bibliographic Details
Classification: Electronic Book
Main Author: Hwu, Wen-mei (Author)
Format: Electronic eBook
Language: English
Published: Waltham, MA : Morgan Kaufmann, [2016]
Edition: First edition.
Online Access: Full text (requires prior registration with an institutional email address)
Table of Contents:
  • Machine generated contents note: ch. 1 Introduction
  • ch. 2 HSA Overview
  • 2.1. A Short History of GPU Computing: The Problems That Are Solved by HSA
  • 2.2. The Pillars of HSA
  • 2.2.1. HSA Memory Model
  • 2.2.2. HSA Queuing Model
  • 2.2.3. HSAIL Virtual ISA
  • 2.2.4. HSA Context Switching
  • 2.3. The HSA Specifications
  • 2.3.1. HSA Platform System Architecture Specification
  • 2.3.2. HSA Runtime Specification
  • 2.3.3. HSA Programmer's Reference Manual, a.k.a. "HSAIL Spec"
  • 2.4. HSA Software
  • 2.5. The HSA Foundation
  • 2.6. Summary
  • ch. 3 HSAIL: Virtual Parallel ISA
  • 3.1. Introduction
  • 3.2. Sample Compilation Flow
  • 3.3. HSAIL Execution Model
  • 3.4. A Tour of the HSAIL Instruction Set
  • 3.4.1. Atomic Operations
  • 3.4.2. Registers
  • 3.4.3. Segments
  • 3.4.4. Wavefronts and Lanes
  • 3.5. HSAIL Machine Models and Profiles
  • 3.6. HSAIL Compilation Flow
  • 3.7. HSAIL Compilation Tools
  • 3.7.1. Compiler Frameworks
  • 3.7.2. CL Offline Compilation (CLOC)
  • 3.7.3. HSAIL Assembler/Disassembler
  • 3.7.4. ISA and Machine Code Assembler/Disassembler
  • 3.8. Conclusion
  • ch. 4 HSA Runtime
  • 4.1. Introduction
  • 4.2. The HSA Core Runtime API
  • 4.2.1. Runtime Initialization and Shutdown
  • 4.2.2. Runtime Notifications
  • 4.2.3. System and HSA Agent Information
  • 4.2.4. Signals
  • 4.2.5. Queues
  • 4.2.6. Architected Queuing Language
  • 4.2.7. Memory
  • 4.2.8. Code Objects and Executables
  • 4.3. HSA Runtime Extensions
  • 4.3.1. HSAIL Finalization
  • 4.3.2. Images and Samplers
  • 4.4. Conclusion
  • References
  • ch. 5 HSA Memory Model
  • 5.1. Introduction
  • 5.2. HSA Memory Structure
  • 5.2.1. Segments
  • 5.2.2. Flat Addressing
  • 5.2.3. Shared Virtual Addressing
  • 5.2.4. Ownership
  • 5.2.5. Image Memory
  • 5.3. HSA Memory Consistency Basics
  • 5.3.1. Background: Sequential Consistency
  • 5.3.2. Background: Conflicts and Races
  • 5.3.3. The HSA Memory Model for a Single Memory Scope
  • 5.3.4. HSA Memory Model Using Memory Scopes
  • 5.3.5. Memory Segments
  • 5.3.6. Putting It All Together: HSA Race Freedom
  • 5.3.7. Additional Observations and Considerations
  • 5.4. Advanced Consistency in the HSA Memory Model
  • 5.4.1. Relaxed Atomics
  • 5.4.2. Ownership and Scope Bounding
  • 5.5. Conclusions
  • References
  • ch. 6 HSA Queuing Model
  • 6.1. Introduction
  • 6.2. User Mode Queues
  • 6.3. Architected Queuing Language
  • 6.3.1. Packet Types
  • 6.3.2. Building Packets
  • 6.4. Packet Submission and Scheduling
  • 6.5. Conclusions
  • References
  • ch. 7 Compiler Technology
  • 7.1. Introduction
  • 7.2. A Brief Introduction to C++ AMP
  • 7.2.1. C++ AMP array_view
  • 7.2.2. C++ AMP parallel_for_each, or Kernel Invocation
  • 7.3. HSA as a Compiler Target
  • 7.4. Mapping Key C++ AMP Constructs to HSA
  • 7.5. C++ AMP Compilation Flow
  • 7.6. Compiled C++ AMP Code
  • 7.7. Compiler Support for Tiling in C++ AMP
  • 7.7.1. Dividing Compute Domain
  • 7.7.2. Specifying Address Space and Barriers
  • 7.8. Memory Segment Annotation
  • 7.9. Towards Generic C++ for HSA
  • 7.10. Compiler Support for Platform Atomics
  • 7.10.1. One Simple Example of Platform Atomics
  • 7.11. Compiler Support for New/Delete Operators
  • 7.11.1. Implementing New/Delete Operators with Platform Atomics
  • 7.11.2. Promoting New/Delete Returned Address to Global Memory Segment
  • 7.11.3. Improve New/Delete Operators Based on Wait API/Signal HSAIL Instruction
  • 7.12. Conclusion
  • References
  • ch. 8 Application Use Cases: Platform Atomics
  • 8.1. Introduction
  • 8.2. Atomics in HSA
  • 8.3. Task Queue System
  • 8.3.1. Static Execution
  • 8.3.2. Dynamic Execution
  • 8.3.3. HSA Task Queue System
  • 8.3.4. Evaluation
  • 8.4. Breadth-First Search
  • 8.4.1. Legacy Implementation
  • 8.4.2. HSA Implementation
  • 8.4.3. Evaluation
  • 8.5. Data Layout Conversion
  • 8.5.1. In-place SoA-ASTA Conversion with PTTWAC Algorithm
  • 8.5.2. An HSA Implementation of PTTWAC
  • 8.5.3. Evaluation
  • 8.6. Conclusions
  • Acknowledgment
  • References
  • ch. 9 HSA Simulators
  • 9.1. Simulating HSA in Multi2Sim
  • 9.1.1. Introduction
  • 9.1.2. Multi2Sim-HSA
  • 9.1.3. HSAIL Host HSA
  • 9.1.4. HSA Runtime
  • 9.1.5. Emulator Design
  • 9.1.6. Logging and Debugging
  • 9.1.7. Multi2Sim-HSA Road Map
  • 9.1.8. Installation and Support
  • 9.2. Emulating HSA with HSAemu
  • 9.2.1. Introduction
  • 9.2.2. Modeled HSA Components
  • 9.2.3. Design of HSAemu
  • 9.2.4. Multithreaded HSA GPU Emulator
  • 9.2.5. Profiling, Debugging and Performance Models
  • 9.3. SoftHSA Simulator
  • 9.3.1. Introduction
  • 9.3.2. High-Level Design
  • 9.3.3. Building and Testing the Simulator
  • 9.3.4. Debugging with the LLVM HSA Simulator.