node-bencode can produce dictionary entries with duplicate keys. #146

@issuefiler

Bug

node-bencode can produce dictionary entries with duplicate keys.


node-bencode assumes that distinct JavaScript string keys always produce distinct binary string keys once encoded as UTF-8, which is false.

encode.string = function (buffers, data) {
buffers.push(text2arr(text2arr(data).byteLength + ':' + data))
}

https://github.com/ThaUnknown/uint8-util/blob/149c44c010b3ad17a7904c4266545bbca1fd4403/_node.js#L13

export const text2arr = str => new Uint8Array(Buffer.from(str, 'utf8'))

Proof-of-concept

For example, have node-bencode encode {"\uD800": 1, "\uDFFF": 2}. Both keys are serialized as the same byte string, "3:\xEF\xBF\xBD", so the resulting dictionary contains two entries with a duplicate key.
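The collision can be checked without installing the library by reusing the two one-liners quoted above. This is only a sketch: `encodeKey` is a hypothetical name standing in for the body of `encode.string`, not part of node-bencode's API, and `Buffer` is assumed to be the Node.js global.

```javascript
// Copied from the uint8-util snippet quoted in this issue.
const text2arr = str => new Uint8Array(Buffer.from(str, 'utf8'))

// Hypothetical stand-in for the body of encode.string:
// "<byte length>:<data>", converted to UTF-8 bytes.
const encodeKey = key => text2arr(text2arr(key).byteLength + ':' + key)

// Two distinct JavaScript keys, each a lone surrogate.
const dict = { '\uD800': 1, '\uDFFF': 2 }

// Hex dump of the bytes each key is serialized to.
const encodedKeys = Object.keys(dict).map(
  key => Buffer.from(encodeKey(key)).toString('hex')
)
console.log(encodedKeys) // [ '333aefbfbd', '333aefbfbd' ]

// Number of distinct serialized keys: 1, although dict has 2 keys.
const distinct = new Set(encodedKeys).size
console.log(distinct) // 1
```

`333aefbfbd` is `3:` followed by `EF BF BD`, the UTF-8 encoding of U+FFFD.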

const lone_surrogates = "\uD800\uDFFF";
// Lone (“unmatched”) UTF-16 surrogates. Invalid in UTF-16.

const a = Buffer.from(lone_surrogates[0], "UTF-8");
const b = Buffer.from(lone_surrogates[1], "UTF-8");
// Decoding the JavaScript strings from UTF-16 and encoding them into UTF-8.

console.log(a, a.toString(), b, b.toString());
// Since those JavaScript strings are ill-formed UTF-16,
// each lone surrogate is decoded
// into a `REPLACEMENT CHARACTER` (U+FFFD)
// and subsequently encoded as `<Buffer ef bf bd>`.
// Meaning,

console.log(a.equals(b));
// is true, even though (lone_surrogates[0] === lone_surrogates[1]) is false.
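One possible way to detect such keys before encoding is sketched below. Both helpers, `isLosslessKey` and `findCollidingKeys`, are hypothetical names introduced here for illustration; neither exists in node-bencode, and this is not a proposed patch, only a demonstration that the collision is detectable from the UTF-8 bytes.

```javascript
// Hypothetical helper: true if the key survives a UTF-8 round trip,
// i.e. it contains no lone surrogates that would become U+FFFD.
const isLosslessKey = key =>
  Buffer.from(key, 'utf8').toString('utf8') === key

// Hypothetical helper: returns pairs of distinct JavaScript keys
// whose UTF-8 encodings are byte-for-byte identical.
function findCollidingKeys (obj) {
  const seen = new Map() // hex of UTF-8 bytes -> first key seen
  const collisions = []
  for (const key of Object.keys(obj)) {
    const hex = Buffer.from(key, 'utf8').toString('hex')
    if (seen.has(hex)) collisions.push([seen.get(hex), key])
    else seen.set(hex, key)
  }
  return collisions
}

console.log(isLosslessKey('\uD800'))       // false
console.log(isLosslessKey('\uD83D\uDE00')) // true (a valid surrogate pair)
console.log(findCollidingKeys({ '\uD800': 1, '\uDFFF': 2 }).length) // 1
```

Whether an encoder should reject such keys or handle them some other way is a design decision left to the maintainers.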
