Is it possible to target elements that have no language set or inherited, i.e. whose language is unspecified ("unknown")?

Trivia

The language of an HTML document or element can be set using the HTML lang attribute, e.g.:

<html lang="en">
<h1>Dictionary</h1>
<dl>
<dt><abbr lang="en-Mors">-... - .--</abbr>
<dd><i lang="fr-Latn">à propos</i>
</dl>

or using code(s) in the HTTP Content-language header:

HTTP/2 200 OK
[other headers]
Content-language: en,en-Brai,fr-Latn

<html>
<h1>Dictionary</h1>
[rest of document]

or its long-deprecated yet still working <meta http-equiv> counterpart:

<html>
 <head>
  <meta http-equiv="content-language" content="en,en-Brai,fr-Latn">
 </head>
<h1>Dictionary</h1>
[rest of document]

In each case, the :lang(en) CSS selector matches the main heading from the examples, as well as every other element whose language is not overridden by a lang attribute with a value that is neither "en" nor starts with "en-".
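
The resolution order behind these examples can be sketched in JavaScript. This is only a rough illustration, assuming that the nearest lang attribute on the element or one of its ancestors wins and that the header/meta value acts as a document-wide fallback. The name resolveLanguage and its documentDefault parameter are hypothetical, and xml:lang is ignored.

```javascript
// Rough sketch: returns the language an element ends up with.
// documentDefault is a hypothetical stand-in for the value coming from the
// Content-Language header or its <meta> counterpart ("" means none was sent).
function resolveLanguage(element, documentDefault = "") {
  // closest() checks the element itself first, then walks up its ancestors.
  const holder = element.closest("[lang]");
  // An explicit lang="" yields "", i.e. the element opts out of any language.
  return holder ? holder.getAttribute("lang") : documentDefault;
}
```

An empty string result corresponds to the "unknown" language this question is about.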

Goal

If the document is sent without a Content-language HTTP header or <meta> element and without a lang attribute, is it possible to match the elements that inevitably fall into the "unknown" language?

Additionally, in a document or DOM fragment whose language is set by any of the aforementioned means, is it possible to use the :lang() CSS selector to match elements with an empty lang="" attribute, which effectively 'opts out' of having a language?

HTTP/2 200 OK
[no content-language header nor meta present]

<html>
<p>I Want to select this. <span>And this.</span></p>
<p lang="">And this.</p>
<p lang="en">Not this. <span lang="">But this again.</span></p>

What does not work

None of :lang(), :lang(unknown), :lang('') or :not(:lang(*)) works for this purpose. Selectors derived from :not([lang]) and [lang=''] would, by their nature, give false matches when an HTTP Content-language header or <meta> is present.

Answer requirements

I am seeking an answer that either gives a solution without false matches or confirms that this is not possible, with references to the specs (or their absence) and an explanation of why that is so.


Notes:

When an empty lang="" attribute is present, targeting it with the [lang=""] attribute selector works, but it feels odd considering there is a dedicated :lang() pseudo-class for language-related matching.

vitaliis
myf
  • You can't get an HTTP header with CSS. You'll have to do it with JavaScript: https://stackoverflow.com/questions/220231/accessing-the-web-pages-http-headers-in-javascript – code Dec 14 '21 at 19:41
  • Sure, not "directly", but since e.g. `:lang(xx)` CSS selector matches all elements without other explicit lang set via attribute in document delivered with `Content-language: xx` HTTP header, it's quite safe to say you can "indirectly". – myf Dec 14 '21 at 19:59

2 Answers

Edit 2021: This has been accepted as a bug https://bugs.chromium.org/p/chromium/issues/detail?id=1281157


The :lang() pseudo-class takes language ranges, which are matched against language tags. The spec mentions support for asterisks in language ranges:

Language ranges containing asterisks, for example, must be either correctly escaped or quoted as strings, e.g. :lang(\*-Latn) or :lang("*-Latn") ref

And in old 2013 draft:

Each language range in :lang() must be a valid CSS identifier [CSS21] or consist of an asterisk (* U+002A) immediately followed by an identifier beginning with an ASCII hyphen (U+002D) for the selector to be valid. ref

But I can't get p:lang(\*-US) to work in Chrome or Firefox on Windows. The rule p:lang(en\002DUS) works, though, while p:lang(en\002D\002A) does not. I am not sure about the status of browser support for the special range "*". Also, Matching of Language Tags does not mention whether the special range "*" matches an undefined language.
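
For reference, the extended filtering described in RFC 4647 (Matching of Language Tags), section 3.3.2, can be sketched roughly like this. The function name extendedFilter is made up for illustration, and treating an empty (unknown) language tag as matching nothing is an assumption, not something the RFC spells out.

```javascript
// Rough sketch of RFC 4647 extended filtering (section 3.3.2).
function extendedFilter(range, tag) {
  // Assumption: an empty tag stands for "unknown" and matches no range.
  if (tag === "") return false;
  const r = range.toLowerCase().split("-");
  const t = tag.toLowerCase().split("-");
  // The first subtags must match, unless the range starts with a wildcard.
  if (r[0] !== "*" && r[0] !== t[0]) return false;
  let ri = 1;
  let ti = 1;
  while (ri < r.length) {
    if (r[ri] === "*") { ri++; continue; }             // wildcard: skip it
    if (ti >= t.length) return false;                  // tag ran out of subtags
    if (r[ri] === t[ti]) { ri++; ti++; continue; }     // subtags match
    if (t[ti].length === 1) return false;              // singleton blocks skipping
    ti++;                                              // skip one tag subtag
  }
  return true;
}

extendedFilter("*-Latn", "fr-Latn");   // true
extendedFilter("en-US", "en-Latn-US"); // true
extendedFilter("*", "");               // false (by the assumption above)
```

Under that assumption, a negated "*" range would select exactly the elements whose language is unknown.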


However, p:lang(\*) and p:not(:lang(\*)) work on iPadOS in both Safari and Chrome. Open this jsfiddle on an iPad.
[Screenshot: works on iOS]
I think Chromium doesn't support the full :lang() feature.


Workaround: if a little bit of JavaScript is acceptable, you can try the following solution:

document.addEventListener('DOMContentLoaded', init);

function init() {
  if (!document.documentElement.lang) {
    fetchSamePageHeaders(checkHeaderLanguage);
  }
}

//make a lightweight request to the same page to get headers
function fetchSamePageHeaders(callback) {
  var request = new XMLHttpRequest();
  request.onreadystatechange = function() {
    if (request.readyState === XMLHttpRequest.DONE) {
      if (callback && typeof callback === 'function') {
        callback(request.getAllResponseHeaders());
      }
    }
  };

  // The HEAD method asks for a response identical to that
  // of a GET request, but without the response body.
  // You can also use the 'GET' or 'POST' method, depending on the situation.
  request.open('HEAD', document.location, true);
  request.send(null);
}

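function checkHeaderLanguage(headers) {
  // Parse the raw header block into a { name: value } map. Header lines are
  // separated by CRLF; names are lowercased so the lookup is case-insensitive.
  headers = headers.split(/\r?\n/).filter(Boolean).reduce((ac, line) => {
    const i = line.indexOf(':');
    if (i > 0) ac[line.slice(0, i).trim().toLowerCase()] = line.slice(i + 1).trim();
    return ac;
  }, {});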

  if (!headers['content-language']) {
    console.log('No language in response header. Marking the html tag.');
    let html = document.querySelector('html');
    html.lang = 'dummyLang';
  } else {
    console.log('The response header has language:' + headers['content-language']);
  }
}

p {
  margin: 0;
}

p[lang=""],
p:lang(dummyLang) {
  color: darkgreen;
  font-size: 2em;
}

p:lang(en\2dus)::after {
  content: '<= english';
  font-size: 0.5em;
  color: rebeccapurple;
}

<p>I Want to select this.</p>
<p lang="">And this.</p>
<p lang="en-us">Not this.</p>
<span lang='en-us'>
    <p>Also, not this.</p>
    <p lang="">But, this too.</p>
</span>


Here we use JavaScript to determine whether a language has been set on the html tag or in the response headers. If neither is the case, the html tag is assigned the dummyLang language. You may also want to check meta tags.
For a detailed explanation of getting HTTP headers in JavaScript and the pros and cons of this technique, refer to this SO discussion.
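
The meta tag check mentioned above could look roughly like the sketch below. The helper name metaContentLanguage is hypothetical, and it assumes the language is declared through a <meta http-equiv="content-language"> element as in the question.

```javascript
// Returns the language list declared via <meta http-equiv="content-language">,
// or "" when there is no such element or its content attribute is empty.
function metaContentLanguage(doc) {
  // The trailing "i" makes the attribute value match case-insensitively.
  const meta = doc.querySelector('meta[http-equiv="content-language" i]');
  const content = meta ? meta.getAttribute("content") : null;
  return content ? content.trim() : "";
}

// Possible use inside init(), before falling back to the extra request:
//   if (!document.documentElement.lang && !metaContentLanguage(document)) { ... }
```

This keeps the extra HEAD request for the case where neither the lang attribute nor the meta element declares a language.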

myf
the Hutt
  • Beside the pros/cons discussion you mention, the `.no-lang-header p:not([lang])` rule would not work correctly if the `p` was nested in another element that had a `lang=".."` attribute which would cause the `p` to inherit it, but the css rule would still select it. – Gabriele Petrioli Dec 19 '21 at 10:36
  • Good catch! I've updated the answer. – the Hutt Dec 19 '21 at 11:40
  • Thanks for excellent pointers, analysis and bugreport. To wrap it up: IIUC, the only browser core natively supporting `:lang("*")` at this moment (2021-12) is Safari's, right? I assume all browsers on "iPadOs" are technically "skins" of WebKit, so it explains you have seen it working in Chrome in there (?) – myf Dec 26 '21 at 01:51
  • And second loosely related opinion-based question if I may: how do you feel about (working/valid) `:not(:lang("*"))` versus (not-working/invalid) `:lang("")` used for the same purpose? I plan raising an issue at csswg-drafts asking for confirmation of validity of the former and suggesting possible introduction of the later. Does it make sense / is there any blatant obstacle for `:lang("")` matching `` I've missed? – myf Dec 26 '21 at 02:01
  • Yes, Apple requires every browser to run on WebKit. Even Google Chrome is forced to use WebKit.[ref](https://9to5google.com/2021/05/03/ios-browsers-underpowered-apple/). On other platforms Chromium uses [Blink](http://www.chromium.org/blink) – the Hutt Dec 26 '21 at 02:02
  • As per the specs `:lang(en)` and `:lang("en")` are same. That makes `:lang()` and `:lang("")` same. But `` and `` are not same. So, there will be confusion whether to treat `:lang()` as `` or ``. – the Hutt Dec 26 '21 at 02:13
  • Good point, yes, it really seems a bit strange. Luckily `:lang()` is still invalid and `:lang("")` not supported, so if there are no hacks in past code-bases abusing it for deliberate rule invalidation, it might be fine after all. Moreover, this analogy could be rephrased in favor of it maybe, because technically `` is equivalent of `` (since empty string attribute value is implied), not `` with no attribute. So `:lang()` for `` could make some sense after all. (But I admit those are quite obscure arguments.) – myf Dec 26 '21 at 02:52
  • (Like `el[attr=""]` already *matches* both `` and `` - since they are the same). (Learnt from https://stackoverflow.com/a/67952059/540955) – myf Dec 26 '21 at 03:11
  • Makes sense. One more point is how to select elements with lang other than inherited lang. Suppose I want to make other lang, non native, content distinct and stand out. In other words., how to specifically select inherited lang? Also using `:lang(var(--content-lang))` can be made possible. – the Hutt Dec 26 '21 at 03:18
  • FYI, issue filed for CSSWG: https://github.com/w3c/csswg-drafts/issues/6915 As for addressing "any" language change in DOM cascade it is also interesting question. For "sane" markup where there will be no nested elements of the same `lang` value , one could rely simply on `[lang]` selector, but yes, not bulletproof. – myf Dec 28 '21 at 14:43
  • The spec says that the matching is done as per [RFC4647 Matching of Language Tags - section 3.3.2](https://www.rfc-editor.org/rfc/rfc4647#section-3.3.2). And the RFC says, "*The special range `*` in a language priority list matches any tag.*" So `:lang("*")` needs to be a valid selector. – the Hutt Dec 28 '21 at 15:42

I have managed to come up with a workaround: first run some JS that sets the lang attribute of every element with no lang attribute to "xyz", then select that using CSS.

document.querySelectorAll("p").forEach(e => {
  // e.lang is "" both when the attribute is missing and when it is empty
  if (e.lang === "") e.lang = "xyz";
});

p:lang(xyz) {
  color: red;
}

<p>I Want to select this.</p>
<p lang="">And this.</p>
<p lang="en">Not this.</p>
Someone
  • This would give false match in the "document came with Content-language HTTP header" scenario, and give false matches for nested elements. This is effectively the same as using plain `*:not([lang]), *[lang=""] { color: red }` CSS rule from the question. – myf Dec 19 '21 at 04:02