Is there a native way to encode or decode HTML entities using JavaScript or ES6? For example, < would be encoded as &lt;. There are libraries like html-entities for Node.js, but it feels like there should be something built into JavaScript that already handles this common need.
Roll Your Own <sup>(caveat - use HE instead for most use cases)</sup>
Without a library, you can encode and decode HTML entities in plain JavaScript like this:
const encode = str => {
  // Replace every UTF-16 code unit with a decimal numeric character reference
  const buf = [];
  for (let i = str.length - 1; i >= 0; i--) {
    buf.unshift(['&#', str[i].charCodeAt(0), ';'].join(''));
  }
  return buf.join('');
};

const decode = str => {
  // Turn decimal references such as &#62; back into characters
  return str.replace(/&#(\d+);/g, (match, dec) => String.fromCharCode(dec));
};
Usage:
encode("Hello > © <") // "&#72;&#101;&#108;&#108;&#111;&#32;&#62;&#32;&#169;&#32;&#60;"
decode("Hello &gt; &copy; &#169; &lt;") // "Hello &gt; &copy; © &lt;"
However, this approach has a couple of shortcomings:
- It encodes even safe characters: H → &#72;
- It can decode numeric codes (not in the astral plane), but it knows nothing about the full list of HTML entities / named character references supported by browsers, such as &gt;
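As a rough illustration of how both shortcomings could be narrowed without a library, here is a minimal sketch. The names encodeUnsafe and decodeNumeric are hypothetical helpers, not part of JavaScript or of any package, and the set of "unsafe" characters is an assumption that suits text placed in HTML content or quoted attribute values. It still does not handle named references such as &copy;.

```javascript
// Hypothetical helpers for illustration only; not a built-in or library API.
// Assumption: these five characters are the ones worth escaping.
const UNSAFE = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

// Encode only characters that are significant in HTML markup,
// leaving safe characters such as "H" or "©" untouched.
const encodeUnsafe = str => str.replace(/[&<>"']/g, ch => UNSAFE[ch]);

// Decode decimal (&#169;) and hexadecimal (&#xA9;) numeric references.
// String.fromCodePoint handles astral code points, unlike String.fromCharCode.
const decodeNumeric = str =>
  str.replace(/&#(?:x([0-9a-f]+)|(\d+));/gi, (match, hex, dec) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave out-of-range references as they are instead of throwing.
    return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : match;
  });

encodeUnsafe('Hello > © <');                   // "Hello &gt; © &lt;"
decodeNumeric('&#169; &#xA9; &#x1D306; &gt;'); // "© © 𝌆 &gt;"
```

Named references are passed through unchanged by decodeNumeric, which is the gap a full entity table fills.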
Use the HE Library (Html Entities)
- Support for all standardized named character references
- Support for Unicode
- Works with ambiguous ampersands
- Written by Mathias Bynens
Usage:
he.encode('foo © bar ≠ baz 𝌆 qux');
// Output : 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'
he.decode('foo &copy; bar &ne; baz &#x1D306; qux');
// Output : 'foo © bar ≠ baz 𝌆 qux'