EPIC: Support TMA descriptor #199

Initializing a TMA descriptor through the driver APIs (https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__TENSOR__MEMORY.html) is tedious and error-prone. We need a way to abstract it, which aligns well with the mission of `cuda.core`. It would also make it easier for JIT compilers to consume TMA descriptors and incorporate them into their compilation pipelines.

As I understand it, there are two (implicit?) requirements for this to be useful:
1. Creating/initializing a TMA object on the host
2. Passing the object to the `cuda.core.launch()` API as a kernel argument
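
To make the first requirement more concrete, here is a minimal sketch of the kind of host-side checking such an abstraction could do before calling the driver. Everything in it is hypothetical: `TiledTensorMapSpec` and its field names are made up for illustration and are not part of `cuda.core`, and the checks cover only a subset of the constraints that, as I read the driver documentation, `cuTensorMapEncodeTiled` places on its arguments. It does not call the driver or touch a GPU.

```python
from dataclasses import dataclass
from typing import Tuple

# The driver's CUtensorMap is an opaque blob of 16 64-bit words.
TENSOR_MAP_NBYTES = 16 * 8


@dataclass(frozen=True)
class TiledTensorMapSpec:
    """Hypothetical host-side description of a tiled TMA descriptor.

    Dimensions are listed innermost first, matching the driver's convention.
    """

    global_address: int           # device pointer, as an integer
    itemsize: int                 # element size in bytes
    global_dim: Tuple[int, ...]   # tensor extents, in elements
    box_dim: Tuple[int, ...]      # tile (box) extents, in elements

    def __post_init__(self):
        rank = len(self.global_dim)
        if not 1 <= rank <= 5:
            raise ValueError(f"rank must be between 1 and 5, got {rank}")
        if len(self.box_dim) != rank:
            raise ValueError("box_dim must have one entry per tensor dimension")
        if self.global_address % 16:
            raise ValueError("global_address must be 16-byte aligned")
        if any(not 1 <= d <= 2**32 for d in self.global_dim):
            raise ValueError("each global_dim entry must be in [1, 2**32]")
        if any(not 1 <= b <= 256 for b in self.box_dim):
            raise ValueError("each box_dim entry must be in [1, 256]")
        if any(s % 16 for s in self.global_strides):
            raise ValueError("byte strides must be multiples of 16")

    @property
    def global_strides(self) -> Tuple[int, ...]:
        """Byte strides of dimensions 1..rank-1, assuming a densely packed tensor."""
        strides = []
        step = self.itemsize
        for extent in self.global_dim[:-1]:
            step *= extent
            strides.append(step)
        return tuple(strides)


# A 64 x 128 float32 tensor (innermost extent 128), tiled into 8 x 32 boxes.
spec = TiledTensorMapSpec(
    global_address=0x1000, itemsize=4, global_dim=(128, 64), box_dim=(32, 8)
)
```

For the second requirement, the encoded descriptor is a fixed-size blob (`TENSOR_MAP_NBYTES` above), so one option would be for `cuda.core.launch()` to accept the object and copy those bytes in as a by-value kernel argument. How that should look in the API is exactly what this issue needs to settle.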
