added an optional parameter to specify the character encoding of a source #78
Thanks for the PR, much appreciated! For now, I pulled it into a new branch to get the API right. It's not a huge problem to parse the encoding from the resource definition and hand it down to the Table factory method. Reading Resources from files should be easy, and the API for Tables and Datapackages would match. What's giving me headaches is whether to set an encoding on a URL-based Table in the first place. Usually, the web server should give us that. In the context of Table, it should be fine to trust the web server and therefore not have an Essentially, it comes down to:
@roll How do other implementations handle this, or what's your opinion?
Hi, in Python we infer the encoding from a byte sample (buffer) if it's not provided. We tried to use the encoding from the HTTP headers, but it's often misleading, so we stopped using it.
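For illustration, here is a rough Java sketch of what inferring the encoding from a byte sample could look like. It is not the Python implementation and not part of this PR: the class name `EncodingSniffer` and its `detect` method are hypothetical, and the heuristic (UTF-8 byte order mark, then a strict UTF-8 decode, then a caller-supplied fallback) is a deliberately simple stand-in for a real charset detector.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class EncodingSniffer {

    private EncodingSniffer() {
    }

    // Guess the charset of a byte sample taken from the start of a source.
    // Limitation: a sample that is cut off in the middle of a multi-byte
    // UTF-8 sequence is treated as "not UTF-8" and gets the fallback.
    static Charset detect(byte[] sample, Charset fallback) {
        // A UTF-8 byte order mark (EF BB BF) settles the question immediately.
        if (sample.length >= 3
                && (sample[0] & 0xFF) == 0xEF
                && (sample[1] & 0xFF) == 0xBB
                && (sample[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        try {
            // Strict decode: malformed input raises an exception instead of
            // being silently replaced with U+FFFD.
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(sample));
            return StandardCharsets.UTF_8;
        } catch (CharacterCodingException e) {
            // Not valid UTF-8, so use whatever the caller considers the default.
            return fallback;
        }
    }

    public static void main(String[] args) {
        byte[] latin1 = "M\u00fcller".getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = "M\u00fcller".getBytes(StandardCharsets.UTF_8);
        System.out.println(detect(latin1, StandardCharsets.ISO_8859_1));
        System.out.println(detect(utf8, StandardCharsets.ISO_8859_1));
    }
}
```

An explicitly provided encoding would still take precedence over this guess; the sniffing would only apply when none is given.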
Overview
I have added an optional parameter to specify the character encoding of a source. The test TableEncodingTests::createTableFromIso8859() passes.
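To show why the parameter matters, here is a small JDK-only sketch, independent of the library code in this PR. The class `CharsetReadDemo` and its `readFirstLine` method are made-up names for illustration; the point is only that the same ISO-8859-1 bytes read back correctly with the matching charset and come out garbled when decoded as UTF-8.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the library.
final class CharsetReadDemo {

    private CharsetReadDemo() {
    }

    // Read the first line of a stream using an explicitly chosen charset.
    static String readFirstLine(InputStream in, Charset charset) {
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(in, charset))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "city,K\u00f6ln";
        byte[] latin1Bytes = original.getBytes(StandardCharsets.ISO_8859_1);

        // Matching charset: the umlaut survives the round trip.
        String good = readFirstLine(
                new ByteArrayInputStream(latin1Bytes), StandardCharsets.ISO_8859_1);
        // Wrong charset: the byte 0xF6 is not valid UTF-8 and is replaced by U+FFFD.
        String bad = readFirstLine(
                new ByteArrayInputStream(latin1Bytes), StandardCharsets.UTF_8);

        System.out.println(good.equals(original));
        System.out.println(bad.equals(original));
    }
}
```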
However, I'm not at all sure that my approach makes sense in the context of the rest of the code/framework. Somehow we would have to transfer the encoding property from the Tabular Data Resource to the Table.fromSource method invocation.

Closes #77
Please preserve this line to notify @iSnow (lead of this repository)