HTML Entity Decode [duplicate] – Dev


QUESTION:

How do I encode and decode HTML entities using JavaScript or jQuery?

var varTitle = "Chris&apos; corner";

I want it to be:

var varTitle = "Chris' corner";

ANSWER:

You could try something like:

var Title = $('<textarea />').html("Chris&apos; corner").text();
console.log(Title);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>


A more interactive version:

$('form').submit(function() {
  var theString = $('#string').val();
  var varTitle = $('<textarea />').html(theString).text();
  $('#output').text(varTitle);
  return false;
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<form action="#" method="post">
  <fieldset>
    <label for="string">Enter an HTML-encoded string to decode</label>
    <input type="text" name="string" id="string" />
  </fieldset>
  <fieldset>
    <input type="submit" value="decode" />
  </fieldset>
</form>

<div id="output"></div>


ANSWER:

I recommend against using the jQuery code from the accepted answer. Although it does not insert the string to decode into the page, it still causes scripts and HTML elements in that string to be created. That is far more work than decoding entities requires. Instead, I suggest using a safer, leaner function.

var decodeEntities = (function() {
  // this prevents any overhead from creating the object each time
  var element = document.createElement('div');

  function decodeHTMLEntities (str) {
    if(str && typeof str === 'string') {
      // strip script/html tags
      str = str.replace(/<script[^>]*>([\S\s]*?)<\/script>/gmi, '');
      str = str.replace(/<\/?\w(?:[^"'>]|"[^"]*"|'[^']*')*>/gmi, '');
      element.innerHTML = str;
      str = element.textContent;
      element.textContent="";
    }

    return str;
  }

  return decodeHTMLEntities;
})();

http://jsfiddle.net/LYteC/4/

To use this function, call decodeEntities("&amp;"). It relies on the same underlying technique as the jQuery version, but without jQuery’s overhead and only after stripping HTML tags from the input. See Mike Samuel’s comment on the accepted answer for how to filter out HTML tags.
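The tag stripping happens before the string ever reaches innerHTML, so it can be looked at on its own. The following is a minimal sketch of just that step, reusing the two regular expressions from the function above; stripTags is a hypothetical helper name introduced here for illustration and is not part of the original answer.

```javascript
// Hypothetical helper: only the tag-stripping step of decodeEntities,
// using the same two patterns. It does NOT decode entities.
function stripTags(str) {
  // Remove <script> elements together with their contents
  str = str.replace(/<script[^>]*>([\S\s]*?)<\/script>/gmi, '');
  // Remove any remaining opening or closing tags, allowing ">" inside quoted attributes
  str = str.replace(/<\/?\w(?:[^"'>]|"[^"]*"|'[^']*')*>/gmi, '');
  return str;
}

console.log(stripTags('Hi<script>alert(1)</script> there'));   // "Hi there"
console.log(stripTags('<b title="a>b">bold</b> &amp; more'));  // "bold &amp; more"
```

Note that entities such as &amp; pass through untouched at this stage; in the full function they are decoded afterwards by the innerHTML/textContent round trip. Regex-based stripping like this is a best-effort filter, not a complete HTML sanitizer.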

This function can easily be used as a jQuery plugin by adding the following line to your project.

jQuery.decodeEntities = decodeEntities;

ANSWER:

Original author answer here.

This is my favourite way of decoding HTML characters. The advantage of using this code is that tags are also preserved.

function decodeHtml(html) {
    var txt = document.createElement("textarea");
    txt.innerHTML = html;
    return txt.value;
}

Example: http://jsfiddle.net/k65s3/

Input:

Entity:&nbsp;Bad attempt at XSS:<script>alert('new\nline?')</script><br>

Output:

Entity: Bad attempt at XSS:<script>alert('new\nline?')</script><br>

ANSWER:

As Robert K said, don’t use jQuery.html().text() to decode HTML entities: it is unsafe, because user input should never have access to the DOM. Read about XSS to see why.

Instead, try the Underscore.js utility-belt library, which comes with escape and unescape methods:

_.escape(string)

Escapes a string for insertion into HTML, replacing &, <, >, ", `, and ' characters.

_.escape('Curly, Larry & Moe');
=> "Curly, Larry &amp; Moe"
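For readers who would rather not add the dependency, the replacement described above can be sketched in a few lines. This is an illustration and not Underscore's actual source; escapeHtml and escapeMap are hypothetical names, and the entity spellings simply follow the list quoted in this answer.

```javascript
// Hypothetical lookup table: each special character and the entity that replaces it
var escapeMap = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;',
  '`': '&#96;'
};

// Replace every special character in a single pass, so that the "&"
// of an entity we just produced is never escaped a second time.
function escapeHtml(str) {
  return String(str).replace(/[&<>"'`]/g, function (ch) {
    return escapeMap[ch];
  });
}

console.log(escapeHtml('Curly, Larry & Moe')); // "Curly, Larry &amp; Moe"
console.log(escapeHtml("Chris' corner"));      // "Chris&#x27; corner"
```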

_.unescape(string)

The opposite of escape, replaces &amp;, &lt;, &gt;, &quot;, &#96; and &#x27; with their unescaped counterparts.

_.unescape('Curly, Larry &amp; Moe');
=> "Curly, Larry & Moe"

To support decoding more characters, just copy the Underscore unescape method and add more characters to the map.
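As a rough sketch of what such an extended map could look like (this is not Underscore's code; entityMap, entityPattern and unescapeExtended are hypothetical names, and the three extra entities at the end are just examples of additions):

```javascript
// Hypothetical entity map. The first six entries are the entities listed above;
// the last three are illustrative additions.
var entityMap = {
  '&amp;': '&',
  '&lt;': '<',
  '&gt;': '>',
  '&quot;': '"',
  '&#96;': '`',
  '&#x27;': "'",
  '&apos;': "'",
  '&nbsp;': '\u00a0',
  '&copy;': '\u00a9'
};

// Build one alternation pattern from the map keys
// (none of the keys contain regex metacharacters).
var entityPattern = new RegExp('(?:' + Object.keys(entityMap).join('|') + ')', 'g');

// Decode in a single pass so that "&amp;lt;" becomes "&lt;" and not "<"
function unescapeExtended(str) {
  return String(str).replace(entityPattern, function (match) {
    return entityMap[match];
  });
}

console.log(unescapeExtended('Chris&apos; corner'));     // "Chris' corner"
console.log(unescapeExtended('Curly, Larry &amp; Moe')); // "Curly, Larry & Moe"
```

Because this approach never touches the DOM, untrusted input is only ever treated as a string. The trade-off is that any entity missing from the map is left as-is.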