Require charset meta tag with the value of utf-8 (meta-charset-utf-8)

meta-charset-utf-8 warns when a page does not declare its character encoding as utf-8 inline, using the charset meta tag.

Why is this important?

The character encoding should be specified for every HTML page, by using the charset parameter on the Content-Type HTTP response header (e.g. Content-Type: text/html; charset=utf-8), the charset meta tag in the file, or both.
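As a rough illustration of the header form, the following Python sketch reads the charset parameter from a Content-Type value using the standard library's email.message.Message. The function name charset_from_content_type is hypothetical and is not part of the rule or of any tool mentioned here.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lowercases the value and returns None when the
    # header carries no charset parameter.
    return msg.get_content_charset()
```

A value of None here corresponds to the case where the server does not send the charset parameter, which is when an inline declaration matters most.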

Sending just the Content-Type HTTP header is generally fine, but it’s usually a good idea to also add the charset meta tag because:

  • Server configurations might change (or servers might not send the charset parameter on the Content-Type HTTP response header).
  • The page might be saved locally, in which case the HTTP header will not be present when viewing the page.

One should always choose utf-8 as the encoding, and convert any content in legacy encodings to utf-8.
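Converting legacy content can be sketched in a few lines of Python. This is a minimal example that assumes the legacy encoding is already known (here windows-1252, chosen only for illustration); the helper name to_utf8 is hypothetical. Any charset declaration inside the converted file still has to be updated separately.

```python
def to_utf8(raw: bytes, legacy_encoding: str) -> bytes:
    """Re-encode bytes from a known legacy encoding into utf-8."""
    # Decode with the legacy encoding first; decoding raises
    # UnicodeDecodeError if the bytes are not valid in that encoding.
    text = raw.decode(legacy_encoding)
    return text.encode("utf-8")


# Assumed sample input: "café" stored as windows-1252 bytes.
legacy_bytes = "café".encode("windows-1252")
utf8_bytes = to_utf8(legacy_bytes, "windows-1252")
```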

For the charset meta tag, <meta charset="utf-8"> should be used.

<meta charset="utf-8">:

  • Is backwards compatible and works in all known browsers, so it should always be used over the old <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">.

  • Should use the charset value utf-8, not variants such as utf8. Using utf8 is a common mistake: although it is valid nowadays, because the specifications and browsers alias utf8 to utf-8, that wasn’t the case in the past, so things might break in some older browsers. The same may be true for other agents (non-browsers) that scan or fetch the content and may not have the alias.

  • Must be inside the <head> element and within the first 1024 bytes of the HTML, as some browsers only look at those bytes before choosing an encoding.

    Moreover, it is recommended that the meta tag be the first thing in the <head>. This ensures it comes before any content that could be controlled by an attacker, such as a <title> element, thus avoiding potential encoding-related security issues (such as the one in old IE).
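The 1024-byte constraint can be checked on the raw bytes of a page. The Python sketch below is a simplification rather than a real HTML parser: it assumes charset is the first attribute of the meta tag, and the name charset_meta_within_limit is hypothetical.

```python
import re

# Simplified pattern: a <meta> tag whose first attribute is charset.
META_CHARSET = re.compile(rb"<meta\s+charset\s*=[^>]*>", re.IGNORECASE)


def charset_meta_within_limit(raw: bytes, limit: int = 1024) -> bool:
    """Return True if a charset meta tag ends within the first `limit` bytes."""
    match = META_CHARSET.search(raw)
    return match is not None and match.end() <= limit
```

Working on bytes rather than decoded text mirrors the situation described above, where the encoding has not been chosen yet.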

What does the rule check?

The rule checks if <meta charset="utf-8"> is specified as the first thing in the <head>.
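A check along these lines can be sketched with Python's standard html.parser module. This is not the rule's actual implementation: the names HeadScanner and passes_meta_charset_utf8 are hypothetical, and two behaviors are assumptions made for the sketch (comments and text before the first element are ignored, and the charset value is compared case-insensitively).

```python
from html.parser import HTMLParser


class HeadScanner(HTMLParser):
    """Records the first element that appears inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.first_in_head = None  # (tag, attrs dict) once seen

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif self.in_head and self.first_in_head is None:
            self.first_in_head = (tag, dict(attrs))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def passes_meta_charset_utf8(html: str) -> bool:
    """Return True if the first element in <head> is <meta charset="utf-8">."""
    scanner = HeadScanner()
    scanner.feed(html)
    scanner.close()
    if scanner.first_in_head is None:
        return False
    tag, attrs = scanner.first_in_head
    # Assumption: compare case-insensitively; a bare `charset` attribute
    # (value None) is treated as missing.
    return tag == "meta" and (attrs.get("charset") or "").lower() == "utf-8"
```

Under these assumptions, a page that uses the http-equiv form, the utf8 value, or a <title> before the meta tag would be reported as failing.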

Examples that trigger the rule

The character encoding is not specified in the HTML:

<!doctype html>
<html lang="en">
<head>
<title>example</title>
...
</head>
<body>...</body>
</html>

The character encoding is specified using the meta http-equiv:

<!doctype html>
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>example</title>
...
</head>
<body>...</body>
</html>

The charset value is not utf-8:

<!doctype html>
<html lang="en">
<head>
<meta charset="utf8">
<title>example</title>
...
</head>
<body>...</body>
</html>

The meta charset is not the first thing in <head>:

<!doctype html>
<html lang="en">
<head>
<title>example</title>
<meta charset="utf-8">
...
</head>
<body>...</body>
</html>

Examples that pass the rule

<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>example</title>
...
</head>
<body>...</body>
</html>

Further Reading