
docs(getting_started): standardize on CUDA thread/kernel terminology #397

Open
samlaf wants to merge 1 commit into Rust-GPU:main from samlaf:docs/getting-started-terminology

samlaf commented Jun 26, 2026


Human Note

I found the getting_started guide a bit confusing as a GPU learner, because it uses nomenclature that doesn't match CUDA's docs. I'm wondering whether this was intended to keep these docs consistent with the wording of the broader rust-gpu effort, but given that this is rust-cuda, I figured staying closer to CUDA's usage of the words would make more sense?

LLM generated

The getting started guide used "invocation" / "multiple invocations of a kernel" to describe parallel execution. That vocabulary comes from the GLSL/SPIR-V/Vulkan compute world. It is natural for the rust-gpu org, whose flagship project targets SPIR-V, where "invocation" is the official term for a single execution instance of a shader (exactly what CUDA calls a thread). The doc was even internally consistent about it.

The problem is that "invocation" means something different in CUDA-native usage: a "kernel invocation" is the `<<<>>>` launch, that is, one invocation per launch, which then spawns many threads. So a reader coming from CUDA C++ parses "multiple invocations running in parallel" as multiple grid launches on streams, which is a separate concept. Since this is rust-cuda, that collision is a real cost.

Standardize on CUDA's own model (kernel = the `__global__` function, thread = the unit of parallel execution, launch = the `<<<>>>` call) and drop "invocation" entirely. This is unambiguous in a CUDA context and easier to follow for beginners.
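
For reference, the three terms map onto CUDA C++ roughly as in the sketch below. This is an illustration only and is not taken from the guide or from this PR's diff: the names `add_one` and `run_add_one` are hypothetical, and the block size of 256 is an arbitrary choice.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Kernel: the __global__ function. There is exactly one kernel here,
// and every thread created by a launch executes this same body.
__global__ void add_one(float* data, int n) {
    // Thread: the unit of parallel execution. Each thread derives its own
    // global index from its block index and its index within the block.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // threads past the end of the array do nothing
        data[i] += 1.0f;
    }
}

// Hypothetical host helper: runs one launch over n zero-initialized
// elements and reports whether every element ended up equal to 1.0f.
bool run_add_one(int n) {
    std::vector<float> host(n, 0.0f);
    float* dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(float)) != cudaSuccess) return false;
    cudaMemcpy(dev, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    int threads_per_block = 256;  // arbitrary choice for this sketch
    int blocks = (n + threads_per_block - 1) / threads_per_block;

    // Launch: a single <<<>>> call. This is what CUDA-native usage would
    // call one "kernel invocation"; it creates blocks * threads_per_block threads.
    add_one<<<blocks, threads_per_block>>>(dev, n);
    cudaError_t err = cudaDeviceSynchronize();

    cudaMemcpy(host.data(), dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return false;

    for (float v : host) {
        if (v != 1.0f) return false;
    }
    return true;
}

int main() {
    bool ok = run_add_one(1000);
    std::printf("%s\n", ok ? "all elements incremented" : "failed");
    return ok ? 0 : 1;
}
```

Under this reading, "multiple invocations" would mean calling `add_one<<<...>>>` several times, whereas the parallelism the guide describes comes from the many threads inside one launch.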

Also fixes the outright error "mutable state shared by multiple kernels executing in parallel" (there is one kernel; threads execute it), and two incidental typos in the Blocks bullet ("that it execute" → "that execute together", "blocks index avaiable" → "block's index available").
