fix: make keyword_token safe by validating UTF-8 input #2093

Closed
kmr-ankitt wants to merge 1 commit into tursodatabase:main from kmr-ankitt:fix/safe-keyword-token

Conversation

@kmr-ankitt

This PR fixes an unsound usage of unsafe { str::from_utf8_unchecked(word) } in the public function keyword_token in mod.rs.

The function now uses std::str::from_utf8(word).ok()? to safely handle invalid UTF-8, eliminating the unsoundness.
No logic or API changes.
Code compiles and tests pass (where possible).
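
For illustration, here is a minimal sketch of the kind of change described above. It is not the actual code from mod.rs: the `TokenType` enum and the three-entry keyword table are hypothetical stand-ins, and only the `std::str::from_utf8(word).ok()?` conversion is taken from the description.

```rust
/// Hypothetical stand-in for the parser's token type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenType {
    Select,
    From,
    Where,
}

/// Hypothetical keyword table; the real parser's table is assumed to be larger.
const KEYWORDS: [(&str, TokenType); 3] = [
    ("SELECT", TokenType::Select),
    ("FROM", TokenType::From),
    ("WHERE", TokenType::Where),
];

/// Looks up a keyword from raw bytes without any `unsafe` code.
pub fn keyword_token(word: &[u8]) -> Option<TokenType> {
    // Checked conversion: invalid UTF-8 makes the function return `None`
    // instead of creating an invalid `&str`, which would be undefined behavior.
    let s = std::str::from_utf8(word).ok()?;

    // Case-insensitive comparison against the (hypothetical) keyword table.
    KEYWORDS
        .iter()
        .find(|(name, _)| name.eq_ignore_ascii_case(s))
        .map(|(_, token)| *token)
}
```

With this shape, a caller passing arbitrary bytes such as `&[0xff, 0xfe]` gets `None` back, the same result as for a word that is simply not a keyword.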

Closes: #1859

@penberg
Collaborator

penberg commented Jun 6, 2025

Hey @kmr-ankitt, there are other places that call from_utf8_unchecked() too, almost certainly for performance reasons. As far as I can tell, it's probably better to attempt to fix this in the upstream repository and then merge back, instead of fixing parts of it in libsql.git. Happy to also discuss fixing this in https://github.com/tursodatabase/limbo where we hard forked the parser.
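
As a rough sketch of how an unchecked conversion can be kept while staying sound, one option is to guard it with an ASCII check, since every ASCII byte sequence is valid UTF-8. The helper name `ascii_word_as_str` is hypothetical and does not come from the parser. Whether this is measurably faster than `std::str::from_utf8` is an assumption that would need benchmarking.

```rust
/// Hypothetical helper: returns the bytes as `&str` only if they are all ASCII.
pub fn ascii_word_as_str(word: &[u8]) -> Option<&str> {
    // Reject anything containing a non-ASCII byte up front.
    if !word.is_ascii() {
        return None;
    }
    // SAFETY: the slice was just checked to be pure ASCII, and every
    // ASCII byte sequence is valid UTF-8.
    Some(unsafe { std::str::from_utf8_unchecked(word) })
}
```

This only suits callers whose expected inputs are ASCII (as SQL keywords are); valid non-ASCII UTF-8 is rejected along with invalid bytes.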

@penberg closed this on Jun 6, 2025

Development

Successfully merging this pull request may close these issues.

Unsound Use of str::from_utf8_unchecked in keyword_token
