Shortcuts¶
Functions for dealing with byte strings of unknown encoding.
- chardetng_py.shortcuts.detect(byte_str, *, allow_utf8=False, tld=None)¶
Detect the encoding of
byte_str.- Parameters:
byte_str (
bytesorbytearray) – Input buffer to decode.allow_utf8 (
bool) – If set toFalse, the return value of this method won’t be"UTF-8". When performing detection on"text/html"on non-file:URLs, Web browsers must passFalse, unless the user has taken a specific contextual action to request an override. This way, Web developers cannot start depending on UTF-8 detection. Such reliance would make the Web Platform more brittle.tld (
bytesorbytearrayorNone) – Iftldcontains non-ASCII, period, or upper-case letters. The exception condition is intentionally limited to signs of failing to extract the label correctly, failing to provide it in its Punycode form, and failure to lower-case it. Full DNS label validation is intentionally not performed to avoid panics when the reality doesn’t match the specs.
- Returns:
strThe encoding of
byte_str, suitable for use withstr.decode.
- Return type:
str