Migrate pynvml to cuda.core.system #146

Open
mdboom wants to merge 2 commits into rapidsai:main from mdboom:pynvml-to-cuda.core.system

@mdboom mdboom commented Apr 28, 2026

Migrates from pynvml.py to the new Cython/cybind-based cuda.core.system API.

copy-pr-bot Bot commented Apr 28, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

mdboom force-pushed the pynvml-to-cuda.core.system branch from f6a44df to d88499b May 19, 2026 15:47
mdboom marked this pull request as ready for review May 19, 2026 15:47
mdboom requested review from a team as code owners May 19, 2026 15:47
mdboom requested a review from KyleFromNVIDIA May 19, 2026 15:47

ncclementi commented May 21, 2026

Contributor

I was trying to install this in Colab, for example, to give it a try and see whether it works as expected, and I ran into an installation error:

ERROR: pip's dependency resolver does not currently take into account all the 
packages that are installed. This behaviour is the source of the following dependency conflicts.
torch 2.10.0+cu128 requires cuda-bindings==12.9.4; platform_system == "Linux", but you 
have cuda-bindings 13.2.0 which is incompatible.
cuda-python 12.9.4 requires cuda-bindings~=12.9.4, but you have cuda-bindings 13.2.0 which is incompatible.
numba-cuda 0.22.2 requires cuda-core<1.0.0,>=0.3.2, but you have cuda-core 1.0.1 which is incompatible.
dask-cuda 26.2.0 requires cuda-core==0.3.*, but you have cuda-core 1.0.1 which is incompatible.

It seems to be updating cuda-bindings to the latest version, which is too aggressive for most CSPs.

I understand that the numba-cuda version in Colab is very old, so we can report that back and see whether we can get them to upgrade.

All that being said, we can run the script. At the moment we run the script to check the environment, but with these changes, installing rapids-cli modifies that environment.

I'd like to know what @jacobtomlinson, @jayavenkatesh19 and @mmccarty think here.
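
To make the four conflicts in the log above concrete, here is a minimal sketch that re-checks them with a hand-rolled matcher. The helper names (`parse`, `satisfies`, `find_conflicts`) are hypothetical, and the matcher covers only the operators that appear in the log (`==`, `==X.*`, `~=`, `>=`, `<=`, `<`, `>`); it is not a substitute for pip's resolver or the `packaging` library.

```python
def parse(version):
    """Turn '12.9.4' or '2.10.0+cu128' into a tuple of ints (local label dropped)."""
    return tuple(int(part) for part in version.split("+")[0].split("."))


def satisfies(version, spec):
    """Return True if `version` meets every comma-separated clause in `spec`."""
    v = parse(version)
    for clause in (c.strip() for c in spec.split(",")):
        if clause.startswith("~="):
            base = parse(clause[2:])
            # Compatible release: at least `base`, same prefix minus the last part.
            ok = v >= base and v[: len(base) - 1] == base[:-1]
        elif clause.startswith("=="):
            target = clause[2:]
            if target.endswith(".*"):
                prefix = parse(target[:-2])
                ok = v[: len(prefix)] == prefix
            else:
                ok = v == parse(target)
        elif clause.startswith(">="):
            ok = v >= parse(clause[2:])
        elif clause.startswith("<="):
            ok = v <= parse(clause[2:])
        elif clause.startswith("<"):
            ok = v < parse(clause[1:])
        elif clause.startswith(">"):
            ok = v > parse(clause[1:])
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
        if not ok:
            return False
    return True


def find_conflicts(installed, requirements):
    """List (requirer, package, spec, installed_version) for every unmet pin."""
    return [
        (requirer, package, spec, installed[package])
        for requirer, package, spec in requirements
        if not satisfies(installed[package], spec)
    ]


# Versions and pins copied from the pip error message in this thread.
installed = {"cuda-bindings": "13.2.0", "cuda-core": "1.0.1"}
requirements = [
    ("torch 2.10.0+cu128", "cuda-bindings", "==12.9.4"),
    ("cuda-python 12.9.4", "cuda-bindings", "~=12.9.4"),
    ("numba-cuda 0.22.2", "cuda-core", "<1.0.0,>=0.3.2"),
    ("dask-cuda 26.2.0", "cuda-core", "==0.3.*"),
]

conflicts = find_conflicts(installed, requirements)
for requirer, package, spec, have in conflicts:
    print(f"{requirer} requires {package}{spec}, but {have} is installed")
```

All four pins fail against the upgraded versions, which matches the four lines pip printed.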

jayavenkatesh19 added a commit that referenced this pull request Jun 1, 2026
Following up on
#146 (comment),
`cuda-core` pins a specific `cuda-bindings` version, and that version
differs between CUDA 12 and CUDA 13.

While `cuda-bindings` itself is backward compatible, this pin causes
issues in other environments where tools like `pytorch` have their own
versioning requirements for `cuda-bindings`.

By switching back to `pynvml` for GPU information, we can ensure
that `rapids-cli` is supported in diverse environments without the need
for specialized wheels.

---------

Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>
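
For readers unfamiliar with the `pynvml` route this commit returns to, the following is a minimal sketch of collecting GPU information through it. It is an illustration, not the code in `rapids-cli`: the function name `gpu_summary` and the returned dictionary keys are hypothetical, and the sketch assumes that an empty list is an acceptable result when `pynvml` or the NVIDIA driver is unavailable.

```python
def gpu_summary():
    """Return one dict per visible GPU, or [] if NVML cannot be used."""
    try:
        import pynvml  # imported lazily so the function works without it
    except ImportError:
        return []

    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        # No driver or no NVML shared library on this machine.
        return []

    try:
        gpus = []
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            name = pynvml.nvmlDeviceGetName(handle)
            if isinstance(name, bytes):  # older bindings return bytes
                name = name.decode()
            memory = pynvml.nvmlDeviceGetMemoryInfo(handle)
            gpus.append(
                {
                    "index": index,
                    "name": name,
                    "memory_total_bytes": memory.total,
                }
            )
        return gpus
    finally:
        pynvml.nvmlShutdown()


summary = gpu_summary()
for gpu in summary:
    print(gpu)
```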

mdboom commented Jun 10, 2026

Author

> I was trying to install this in Colab, for example, to give it a try and see whether it works as expected, and I ran into an installation error:
>
> ERROR: pip's dependency resolver does not currently take into account all the 
> packages that are installed. This behaviour is the source of the following dependency conflicts.
> torch 2.10.0+cu128 requires cuda-bindings==12.9.4; platform_system == "Linux", but you 
> have cuda-bindings 13.2.0 which is incompatible.
> cuda-python 12.9.4 requires cuda-bindings~=12.9.4, but you have cuda-bindings 13.2.0 which is incompatible.
> numba-cuda 0.22.2 requires cuda-core<1.0.0,>=0.3.2, but you have cuda-core 1.0.1 which is incompatible.
> dask-cuda 26.2.0 requires cuda-core==0.3.*, but you have cuda-core 1.0.1 which is incompatible.
>
> It seems to be updating cuda-bindings to the latest version, which is too aggressive for most CSPs.
>
> I understand that the numba-cuda version in Colab is very old, so we can report that back and see whether we can get them to upgrade.
>
> All that being said, we can run the script. At the moment we run the script to check the environment, but with these changes, installing rapids-cli modifies that environment.
>
> I'd like to know what @jacobtomlinson, @jayavenkatesh19 and @mmccarty think here.

Can you move to cuda-python 12.9.6? That should hopefully be enough to satisfy the dependencies here. Unfortunately, 12.9.4 doesn't have the NVML functionality at all.

@ncclementi

Contributor

> Can you move to cuda-python 12.9.6? That should hopefully be enough to satisfy the dependencies here. Unfortunately, 12.9.4 doesn't have the NVML functionality at all.

@mdboom The problem is that rapids-cli uses this library, among other things, to run rapids debug, which assesses the state of the environment, such as which packages are installed. This varies from CSP to CSP, and what works for Colab now might not work for another CSP. We also do not control how things get installed: this is not a fresh environment, it is the provider's built-in environment.

It's also a bit problematic for us that installing rapids-cli would upgrade a package in the very environment we are trying to collect information on.
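
One way to see whether an install step changed the environment being inspected is to snapshot installed package versions before and after it. The sketch below uses only `importlib.metadata` from the standard library; `snapshot` and `diff_snapshots` are hypothetical helper names, and the example versions are the ones from the pip error earlier in this thread.

```python
from importlib.metadata import distributions


def snapshot():
    """Map each installed distribution name (lowercased) to its version."""
    packages = {}
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken metadata
            packages[name.lower()] = dist.version
    return packages


def diff_snapshots(before, after):
    """Return {name: (old_version, new_version)} for every package that changed.

    A missing package is reported with None on the corresponding side.
    """
    changes = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


# Illustration with hand-written snapshots instead of a real install step.
before = {"cuda-bindings": "12.9.4", "cuda-core": "0.3.2", "torch": "2.10.0+cu128"}
after = {"cuda-bindings": "13.2.0", "cuda-core": "1.0.1", "torch": "2.10.0+cu128"}
changes = diff_snapshots(before, after)
print(changes)
```

In real use, `snapshot()` would be called once before and once after the install, and a non-empty diff for packages other than the one being installed would indicate that the environment under inspection was modified.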

We talked about this with the team and decided to hold off on the migration until cuda.core.system is more mature.

cc: @jayavenkatesh19

leofang commented Jun 11, 2026

Member

> We talked about this with the team and decided to hold off on the migration until cuda.core.system is more mature.

I think "more mature" is an ambiguous term. This is ultimately yet another packaging (aka dependency hell) problem. By "more mature" we don't mean that there is anything actionable for cuda.core; we mean waiting until at least some older dependencies like CUDA 12 are uniformly dropped across RAPIDS customers, which is fine with me. Just wanna make sure we're on the same page 🙂

3 participants