
Fix nvbugpro 5348750 #725

Merged
leofang merged 5 commits into NVIDIA:main from oleksandr-pavlyk:fix-nvbugpro-5348750 on Jun 26, 2025

Conversation

@oleksandr-pavlyk

Contributor

Description

This builds on top of #724 and changes cuda.core.experimental._module to set _loader["paraminfo"] only when the driver version is >= 12.4. If "paraminfo" is not found in _loader at run time, a NotImplementedError is raised.
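The gating described above could look roughly like the sketch below. This is an illustration only, not the code in this PR: `build_loader`, `get_param_info`, and the lambda stand-ins are hypothetical names, and the sketch assumes the driver version is encoded as `1000 * major + 10 * minor` (so 12.4 is `12040`), which matches the `[12000, 12040)` range mentioned in the commit messages.

```python
# Hypothetical sketch of version-gated registration in a loader dictionary.
# The lambdas stand in for real driver entry points and are NOT actual bindings.

def build_loader(driver_version: int) -> dict:
    """Populate the loader with only the entries this driver is assumed to support."""
    loader = {}
    if driver_version >= 12000:
        # Assumption: entries available on any 12.x driver.
        loader["kernel"] = lambda name: f"kernel:{name}"
    if driver_version >= 12040:
        # Registered only for driver 12.4 and newer.
        loader["paraminfo"] = lambda kernel, index: (index * 8, 8)
    return loader


def get_param_info(loader: dict, kernel, index: int):
    """Look up parameter info, failing loudly when the driver is too old."""
    if "paraminfo" not in loader:
        raise NotImplementedError(
            "Parameter info requires driver version 12.4 or newer."
        )
    return loader["paraminfo"](kernel, index)
```

Checking for the key at call time keeps the module importable on older drivers and defers the failure to the one feature that actually needs the newer driver.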

closes

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@copy-pr-bot

copy-pr-bot Bot commented Jun 25, 2025

Contributor

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@leofang

leofang commented Jun 25, 2025

Member

LGTM, is it still a draft?

@oleksandr-pavlyk

Contributor Author

/ok to test

@oleksandr-pavlyk

Contributor Author

@leofang I opened it as a draft. Will transition to 'ready-for-review' once CI green-lights the change.

@leofang leofang added bug Something isn't working P0 High priority - Must do! cuda.core Everything related to the cuda.core module labels Jun 25, 2025
@leofang leofang added this to the cuda.core beta 5 milestone Jun 25, 2025
@oleksandr-pavlyk

Contributor Author

/ok to test

For drivers in version range [12000, 12040), do not add
"paraminfo" to the _loader dictionary. At runtime, raise
NotImplementedError if "paraminfo" is not in the dictionary.

Only modify _loader["new"] for python version >=12
…lParamInfo

Except for one test where we check that NotImplementedError is raised.
@oleksandr-pavlyk

Contributor Author

/ok to test

Older driver, specifically 535.247.01, returns error code 400
for cluster-size related occupancy queries for devices with compute
capability less than (9, 0)

It works fine with newer drivers, provided the actual requested
cluster size is zero.
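One way to encode the behavior reported in this commit message is a small predicate that a test could consult before issuing a cluster-size occupancy query. This is a sketch under stated assumptions, not the change made in this PR: the function name and the `legacy_driver` flag are hypothetical, the message does not say where "older" ends, and devices with compute capability (9, 0) or higher are assumed to be unaffected.

```python
def cluster_query_known_problematic(compute_capability, cluster_size, legacy_driver):
    """Return True when the commit message suggests the query is expected to fail.

    legacy_driver is a hypothetical flag for drivers that behave like
    535.247.01; the exact version cutoff is not stated in the source.
    """
    if tuple(compute_capability) >= (9, 0):
        # Assumption: not covered by the report, treated as working.
        return False
    if legacy_driver:
        # Reported: older drivers return error code 400 for these queries.
        return True
    # Reported: newer drivers work only when the requested cluster size is zero.
    return cluster_size != 0
```

A test could skip, or expect an error, whenever the predicate returns True instead of hard-coding driver versions in each test body.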
@oleksandr-pavlyk

Contributor Author

/ok to test

@oleksandr-pavlyk oleksandr-pavlyk marked this pull request as ready for review June 26, 2025 14:48
@copy-pr-bot

copy-pr-bot Bot commented Jun 26, 2025

Contributor

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@leofang leofang left a comment

Member

LGTM, thanks Sasha!

@github-project-automation github-project-automation Bot moved this from Todo to In Review in CCCL Jun 26, 2025
@leofang leofang merged commit 0edec40 into NVIDIA:main Jun 26, 2025
53 checks passed
@github-project-automation github-project-automation Bot moved this from In Review to Done in CCCL Jun 26, 2025
@oleksandr-pavlyk oleksandr-pavlyk deleted the fix-nvbugpro-5348750 branch June 26, 2025 14:55
@github-actions

Doc Preview CI
Preview removed because the pull request was closed or merged.

Labels

bug Something isn't working cuda.core Everything related to the cuda.core module P0 High priority - Must do!

Projects

Archived in project

2 participants