[BUG]: cublasmp dependencies are not reflected in supported_nvidia_libs.py #1116
Closed
Labels: bug, cuda.pathfinder
Description
Discovered by chance while testing on a workstation that did not have the CUDA driver installed:
- `libcublasmp.so.0` is the only supported lib that requires `libcuda.so.1`, which led to a `test_load_nvidia_dynamic_lib.py` failure when the driver was not installed.
- To double-check, I ran `ldd`, which made it obvious that we are missing all of its dependencies: `cublas`, `cublasLt`, `nvshmem_host`, `nccl`.
The dependency on libcuda.so.1 is unusual: not sure if we want to do something about it. But the other dependencies should be added to supported_nvidia_libs.py.
mgx-c2g2-pvt-66.cl1u1.colossus.nvidia.com:/wrk/forked/cuda-python/cuda_pathfinder $ ldd /wrk/forked/cuda-python/cuda_pathfinder/.pixi/envs/cu12-linux-aarch64/lib/python3.14/site-packages/nvidia/cublasmp/cu12/lib/libcublasmp.so.0
linux-vdso.so.1 (0x0000fb3f18b6a000)
libcuda.so.1 => /lib/aarch64-linux-gnu/libcuda.so.1 (0x0000fb3f11a00000)
libcublas.so.12 => /wrk/forked/cuda-python/cuda_pathfinder/.pixi/envs/cu12-linux-aarch64/lib/python3.14/site-packages/nvidia/cublasmp/cu12/lib/../../../cublas/lib/libcublas.so.12 (0x0000fb3f0b600000)
libcublasLt.so.12 => /wrk/forked/cuda-python/cuda_pathfinder/.pixi/envs/cu12-linux-aarch64/lib/python3.14/site-packages/nvidia/cublasmp/cu12/lib/../../../cublas/lib/libcublasLt.so.12 (0x0000fb3edb800000)
libnvshmem_host.so.3 => /wrk/forked/cuda-python/cuda_pathfinder/.pixi/envs/cu12-linux-aarch64/lib/python3.14/site-packages/nvidia/cublasmp/cu12/lib/../../../nvshmem/lib/libnvshmem_host.so.3 (0x0000fb3ed2000000)
libnccl.so.2 => /wrk/forked/cuda-python/cuda_pathfinder/.pixi/envs/cu12-linux-aarch64/lib/python3.14/site-packages/nvidia/cublasmp/cu12/lib/../../../nccl/lib/libnccl.so.2 (0x0000fb3ebb800000)
librt.so.1 => /lib/aarch64-linux-gnu/librt.so.1 (0x0000fb3f18af0000)
libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000fb3f18ac0000)
libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000fb3f18a90000)
libstdc++.so.6 => /lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000fb3ebb400000)
libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000fb3f17750000)
libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000fb3f18a50000)
libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000fb3ebb240000)
/lib/ld-linux-aarch64.so.1 (0x0000fb3f18b2d000)
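
The comparison done by eye on the `ldd` output above could be automated roughly as follows. This is only a sketch: the helper names (`nvidia_deps_from_ldd`, `missing_deps`), the system-library list, and the empty `declared_for_cublasmp` tuple are hypothetical and do not reflect the actual layout of `supported_nvidia_libs.py`.

```python
import re

# Matches ldd lines such as "libnccl.so.2 => /path/to/libnccl.so.2 (0x...)"
# and captures the name up to ".so" (e.g. "libnccl").
_LDD_LINE = re.compile(r"^\s*(lib[\w+.-]+?)\.so(?:\.\d+)*\s+=>")

# Common system libraries to ignore (illustrative, not exhaustive).
_SYSTEM_LIBS = {"libc", "libm", "libdl", "librt", "libpthread", "libstdc++", "libgcc_s"}


def nvidia_deps_from_ldd(ldd_output: str) -> list[str]:
    """Return short names (e.g. "nccl") of the non-system libs listed by ldd."""
    deps = []
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if match and match.group(1) not in _SYSTEM_LIBS:
            deps.append(match.group(1).removeprefix("lib"))
    return deps


def missing_deps(found: list[str], declared: tuple[str, ...]) -> list[str]:
    """Return the libs reported by ldd that are absent from the declared tuple."""
    return [name for name in found if name not in declared]


# A few lines copied from the ldd output in this issue.
sample = """\
linux-vdso.so.1 (0x0000fb3f18b6a000)
libcuda.so.1 => /lib/aarch64-linux-gnu/libcuda.so.1 (0x0000fb3f11a00000)
librt.so.1 => /lib/aarch64-linux-gnu/librt.so.1 (0x0000fb3f18af0000)
"""

# Hypothetical stand-in for whatever is currently declared for cublasmp.
declared_for_cublasmp: tuple[str, ...] = ()

print(missing_deps(nvidia_deps_from_ldd(sample), declared_for_cublasmp))  # ['cuda']
```

Run against the full output, this would also report `cublas`, `cublasLt`, `nvshmem_host`, and `nccl`. Whether `cuda` (the driver library) belongs in the declared dependencies is the open question raised above.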