An experience with PyCUDA: Refactoring an existing implementation of a ray-surface intersection algorithm
This article is a sequel to "GPU implementation of a ray-surface intersection algorithm in CUDA" (arXiv:2209.02878) [1]. Its main focus is PyCUDA which represents a Python scripting approach to GPU run-time code generation in the Compute Unified Device Architecture (CUDA) framework. It accompanies the open-source code distributed in GitHub which provides a PyCUDA implementation of a GPU-based line-segment, surface-triangle intersection test. The objective is to share a PyCUDA learning experience with people who are new to PyCUDA. Using the existing CUDA code and foundation from [1] as the starting point, we document the key changes made to facilitate a transition to PyCUDA. As the CUDA source for the ray-surface intersection test contains both host and device code and uses multiple kernel functions, these notes offer a substantive example and real-world perspective of what it is like to utilize PyCUDA. It delves into custom data structures such as binary radix tree and highlights some possible pitfalls. The case studies present a debugging strategy which may be used to examine complex C structures in device memory using standard Python tools without the CUDA-GDB debugger.
READ FULL TEXT