Heterogeneous system architecture: a new compute platform infrastructure
This book presents a next-generation hardware platform, and associated software, that allows processors of different types to work efficiently and cooperatively in shared memory from a single source program. HSA also defines a virtual ISA for parallel routines or kernels, which is vendor and ISA ind...
Classification: | Electronic Book |
---|---|
Main author: | |
Format: | Electronic eBook |
Language: | English |
Published: | Waltham, MA : Morgan Kaufmann, [2016] |
Edition: | First edition. |
Subjects: | |
Online access: | Full text (requires prior registration with institutional email) |
Table of Contents:
- ch. 1 Introduction
- ch. 2 HSA Overview
- 2.1. A Short History of GPU Computing: The Problems That Are Solved by HSA
- 2.2. The Pillars of HSA
- 2.2.1. HSA Memory Model
- 2.2.2. HSA Queuing Model
- 2.2.3. HSAIL Virtual ISA
- 2.2.4. HSA Context Switching
- 2.3. The HSA Specifications
- 2.3.1. HSA Platform System Architecture Specification
- 2.3.2. HSA Runtime Specification
- 2.3.3. HSA Programmer's Reference Manual, a.k.a. "HSAIL Spec"
- 2.4. HSA Software
- 2.5. The HSA Foundation
- 2.6. Summary
- ch. 3 HSAIL: Virtual Parallel ISA
- 3.1. Introduction
- 3.2. Sample Compilation Flow
- 3.3. HSAIL Execution Model
- 3.4. A Tour of the HSAIL Instruction Set
- 3.4.1. Atomic Operations
- 3.4.2. Registers
- 3.4.3. Segments
- 3.4.4. Wavefronts and Lanes
- 3.5. HSAIL Machine Models and Profiles
- 3.6. HSAIL Compilation Flow
- 3.7. HSAIL Compilation Tools
- 3.7.1. Compiler Frameworks
- 3.7.2. CL Offline Compilation (CLOC)
- 3.7.3. HSAIL Assembler/Disassembler
- 3.7.4. ISA and Machine Code Assembler/Disassembler
- 3.8. Conclusion
- ch. 4 HSA Runtime
- 4.1. Introduction
- 4.2. The HSA Core Runtime API
- 4.2.1. Runtime Initialization and Shutdown
- 4.2.2. Runtime Notifications
- 4.2.3. System and HSA Agent Information
- 4.2.4. Signals
- 4.2.5. Queues
- 4.2.6. Architected Queuing Language
- 4.2.7. Memory
- 4.2.8. Code Objects and Executables
- 4.3. HSA Runtime Extensions
- 4.3.1. HSAIL Finalization
- 4.3.2. Images and Samplers
- 4.4. Conclusion
- References
- ch. 5 HSA Memory Model
- 5.1. Introduction
- 5.2. HSA Memory Structure
- 5.2.1. Segments
- 5.2.2. Flat Addressing
- 5.2.3. Shared Virtual Addressing
- 5.2.4. Ownership
- 5.2.5. Image Memory
- 5.3. HSA Memory Consistency Basics
- 5.3.1. Background: Sequential Consistency
- 5.3.2. Background: Conflicts and Races
- 5.3.3. The HSA Memory Model for a Single Memory Scope
- 5.3.4. HSA Memory Model Using Memory Scopes
- 5.3.5. Memory Segments
- 5.3.6. Putting It All Together: HSA Race Freedom
- 5.3.7. Additional Observations and Considerations
- 5.4. Advanced Consistency in the HSA Memory Model
- 5.4.1. Relaxed Atomics
- 5.4.2. Ownership and Scope Bounding
- 5.5. Conclusions
- References
- ch. 6 HSA Queuing Model
- 6.1. Introduction
- 6.2. User Mode Queues
- 6.3. Architected Queuing Language
- 6.3.1. Packet Types
- 6.3.2. Building Packets
- 6.4. Packet Submission and Scheduling
- 6.5. Conclusions
- References
- ch. 7 Compiler Technology
- 7.1. Introduction
- 7.2. A Brief Introduction to C++ AMP
- 7.2.1. C++ AMP array_view
- 7.2.2. C++ AMP parallel_for_each, or Kernel Invocation
- 7.3. HSA as a Compiler Target
- 7.4. Mapping Key C++ AMP Constructs to HSA
- 7.5. C++ AMP Compilation Flow
- 7.6. Compiled C++ AMP Code
- 7.7. Compiler Support for Tiling in C++ AMP
- 7.7.1. Dividing Compute Domain
- 7.7.2. Specifying Address Space and Barriers
- 7.8. Memory Segment Annotation
- 7.9. Towards Generic C++ for HSA
- 7.10. Compiler Support for Platform Atomics
- 7.10.1. One Simple Example of Platform Atomics
- 7.11. Compiler Support for New/Delete Operators
- 7.11.1. Implementing New/Delete Operators with Platform Atomics
- 7.11.2. Promoting New/Delete Returned Address to Global Memory Segment
- 7.11.3. Improve New/Delete Operators Based on Wait API/Signal HSAIL Instruction
- 7.12. Conclusion
- References
- ch. 8 Application Use Cases: Platform Atomics
- 8.1. Introduction
- 8.2. Atomics in HSA
- 8.3. Task Queue System
- 8.3.1. Static Execution
- 8.3.2. Dynamic Execution
- 8.3.3. HSA Task Queue System
- 8.3.4. Evaluation
- 8.4. Breadth-First Search
- 8.4.1. Legacy Implementation
- 8.4.2. HSA Implementation
- 8.4.3. Evaluation
- 8.5. Data Layout Conversion
- 8.5.1. In-place SoA-ASTA Conversion with PTTWAC Algorithm
- 8.5.2. An HSA Implementation of PTTWAC
- 8.5.3. Evaluation
- 8.6. Conclusions
- Acknowledgment
- References
- ch. 9 HSA Simulators
- 9.1. Simulating HSA in Multi2Sim
- 9.1.1. Introduction
- 9.1.2. Multi2Sim-HSA
- 9.1.3. HSAIL Host HSA
- 9.1.4. HSA Runtime
- 9.1.5. Emulator Design
- 9.1.6. Logging and Debugging
- 9.1.7. Multi2Sim-HSA Road Map
- 9.1.8. Installation and Support
- 9.2. Emulating HSA with HSAemu
- 9.2.1. Introduction
- 9.2.2. Modeled HSA Components
- 9.2.3. Design of HSAemu
- 9.2.4. Multithreaded HSA GPU Emulator
- 9.2.5. Profiling, Debugging and Performance Models
- 9.3. SoftHSA Simulator
- 9.3.1. Introduction
- 9.3.2. High-Level Design
- 9.3.3. Building and Testing the Simulator
- 9.3.4. Debugging with the LLVM HSA Simulator