
Possible Duplicate:
Why does HTML think “chucknorris” is a color?

How is the bgcolor property calculated?

When I use the following HTML code...

<body bgcolor="#Deine Mutter hat eine Farbe und die ist grün."></body>

...what I get is the following color.

#Deine Mutter hat eine Farbe und die ist grün.

By the way: when I try to use it in CSS, it won't work and the default color is applied instead:

body{
    color: #IchwillGOLD;
}

Why?


2 Answers


My first attempt was a bit of trial and error, and while I found some interesting properties of the system, it wasn't quite enough to form an answer. Next, I turned to the standard. I expected this behavior to be standardized because I tested it in three different browsers and they all did the same thing. From the standard I found out what happens:

  1. A leading # is removed, and all characters that aren't hexadecimal digits are replaced by zeroes (so only 0-9 and a-f remain)
  2. The string is padded with zeroes at the end until its length is a multiple of three
  3. The string is then divided into three equal parts, one each for red, green and blue
  4. If the resulting strings are longer than 8 characters, only the last 8 characters of each string are kept
  5. As long as every one of the strings starts with a zero and is longer than two characters, the first character is removed from each string (this does not happen for this particular string, since after step 4 the green part starts with F)
  6. The first two characters of each string are taken and converted from hexadecimal to a number, which is used as one component of the color

For Deine Mutter hat eine Farbe und die ist grün. the three parts after step 4 are 000e000a, Fa0be000 and 00000000, so you get 00FA00.
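
The steps above can be sketched in a few lines of Python. This is a simplified illustration, not the standard's full algorithm: the function name `parse_legacy_color` is made up here, and the early checks (empty string, `transparent`, named colors, `#rgb` shorthand) are assumed to have been handled already.

```python
import string

HEX_DIGITS = set(string.hexdigits)  # 0-9, a-f, A-F


def parse_legacy_color(value: str) -> str:
    """Simplified sketch of the legacy color rules; returns 'rrggbb'.

    Hypothetical helper: it skips the early steps (empty string,
    "transparent", named colors, #rgb shorthand).
    """
    # Note: str.strip() removes a few more whitespace kinds than HTML does.
    text = value.strip()
    # Characters outside the BMP count as two zeroes.
    text = "".join("00" if ord(ch) > 0xFFFF else ch for ch in text)
    # Only the first 128 characters are considered.
    text = text[:128]
    if text.startswith("#"):
        text = text[1:]
    # Step 1: every non-hex character becomes "0".
    text = "".join(ch if ch in HEX_DIGITS else "0" for ch in text)
    # Step 2: pad with zeroes to a non-zero multiple of three.
    while len(text) == 0 or len(text) % 3 != 0:
        text += "0"
    # Step 3: split into red, green and blue parts.
    size = len(text) // 3
    parts = [text[i * size:(i + 1) * size] for i in range(3)]
    # Step 4: keep only the last 8 characters of each part.
    parts = [p[-8:] for p in parts]
    # Step 5: drop leading zeroes while all parts start with one.
    while len(parts[0]) > 2 and all(p[0] == "0" for p in parts):
        parts = [p[1:] for p in parts]
    # Step 6: the first two characters of each part form the component.
    return "".join(f"{int(p[:2], 16):02x}" for p in parts)
```

Calling `parse_legacy_color("#Deine Mutter hat eine Farbe und die ist grün.")` gives `00fa00`, and `parse_legacy_color("chucknorris")` gives `c00000`.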

The HTML5 standard describes the process more precisely and covers a couple more cases under the "rules for parsing a legacy color value": http://www.w3.org/TR/html5/common-microsyntaxes.html#colors

Jasper
  • Trial and error did seem to highlight the point of breaking into 3 parts for me... – loxxy Nov 14 '12 at 15:44
  • @Ioxxy: I got 1 through trial and error and I was on my way to 2 (I was able to delete 2 zeroes but not three from the end of the string) when I decided to just look it up. – Jasper Nov 14 '12 at 15:46
  • Well, besides that, the very first step would be to match against keywords such as red, blue, etc. – loxxy Nov 14 '12 at 15:51
  • @Ioxxy: No. The first step would be to fail if it's an empty string, the second to strip whitespace from the beginning and end of the string, the third to fail if it says transparent and only the fourth step is to match against CSS 1 color words. Once that is done, the fifth step is to parse a three letter Hex string. I purposely left out a lot of things for clarity and provided a link to the standard where those things can be found – Jasper Nov 14 '12 at 15:56
  • I wonder why this isn't the same for the CSS background property... – danijar Nov 14 '12 at 15:57
  • @sharethis Because this was made so that existing websites, from back in the day when browsers didn't adhere to standards, are still rendered correctly. CSS was invented in the days of standards, so it never suffered such abuse. – Jasper Nov 14 '12 at 16:13
  • In addition, these presentational attributes are eventually translated to their corresponding CSS rules with zero author-level specificity, as per http://www.w3.org/TR/CSS21/cascade.html#preshint This is why WebKit, for example, passes the resulting color value to a `CssColor` struct. – BoltClock Nov 14 '12 at 17:11
  • @Jasper: Can you give examples for values that are non-standard and that are the reasons? I do not see much of backwards compatibility here. For example non-standard color names are destroyed by what you describe in your answer. So I wonder what kind of backwards compatibility is the reason to process the way you describe in your answer. Also following the HTML 5 standard you relate to in your answer, there are many reasons this must give an error, however it does not. You don't explain why in your answer. – hakre Nov 16 '12 at 08:20
  • @hakre: I don't see any reasons why this might give an error. If you'll point me at those reasons for errors, I'll gladly discuss them. – Jasper Nov 16 '12 at 22:22
  • @Jasper: The HTML5 specs [you linked](http://www.w3.org/TR/html5/common-microsyntaxes.html#colors) state this should give an error at multiple places in the numbered list; for example (but not limited to): *"2. If input is not exactly seven characters long, then **return an error**."* [Bold by me] – hakre Nov 17 '12 at 09:58
  • @hakre: you're reading the **rules for parsing simple color values**, but as I said in my original answer, you should be reading the **rules for parsing legacy color values** – Jasper Nov 17 '12 at 11:59
  • @Jasper: Thanks for the clarification, but I still have a question. Where is it written that `bgcolor` should be parsed that way? Where is it written that it is such a legacy color value to which those parsing rules apply? – hakre Nov 17 '12 at 13:41
  • @Jasper: I turned that into a question of its own so that it is of more use: http://stackoverflow.com/questions/13439800/status-of-attributes-depricated-or-obsolete – hakre Nov 18 '12 at 11:50

As I stated in the comments, the HTML parser adds it as a CSS property, and, as Jasper already answered, this follows the specification.

Implementation

WebKit parses the HTML in HTMLParser.cpp, and if the parser is inBody, it adds the bgcolor attribute as a CssColor in HTMLBodyElement.cpp:

// Color parsing that matches HTML's "rules for parsing a legacy color value"
void HTMLElement::addHTMLColorToStyle(StylePropertySet* style, CSSPropertyID propertyID, const String& attributeValue)
{
    // An empty string doesn't apply a color. (One containing only whitespace does, which is why this check occurs before stripping.)
    if (attributeValue.isEmpty())
        return;

    String colorString = attributeValue.stripWhiteSpace();

    // "transparent" doesn't apply a color either.
    if (equalIgnoringCase(colorString, "transparent"))
        return;

    // If the string is a named CSS color or a 3/6-digit hex color, use that.
    Color parsedColor(colorString);
    if (!parsedColor.isValid())
        parsedColor.setRGB(parseColorStringWithCrazyLegacyRules(colorString));

    style->setProperty(propertyID, cssValuePool().createColorValue(parsedColor.rgb()));
}

There is a good chance that you end up in this method:

static RGBA32 parseColorStringWithCrazyLegacyRules(const String& colorString)

I think it exists to support legacy colors written like this: body bgcolor=ff0000 (Mozilla Gecko test).

  1. Skip a leading #.
  2. Take the first 128 characters, replacing non-hex characters with 0.
  3. Non-BMP characters are replaced with "00", because they appear as two "characters" in the String.
  4. If there are no digits, return black.
  5. Split the digits into three components, then look at the last 8 digits of each component.

Code of Webkit/HTMLElement.cpp:parseColorStringWithCrazyLegacyRules:

static RGBA32 parseColorStringWithCrazyLegacyRules(const String& colorString)
{
    // Per spec, only look at the first 128 digits of the string.
    const size_t maxColorLength = 128;
    // We'll pad the buffer with two extra 0s later, so reserve two more than the max.
    Vector<char, maxColorLength+2> digitBuffer;
    size_t i = 0;
    // Skip a leading #.
    if (colorString[0] == '#')
        i = 1;

    // Grab the first 128 characters, replacing non-hex characters with 0.
    // Non-BMP characters are replaced with "00" due to them appearing as two "characters" in the String.
    for (; i < colorString.length() && digitBuffer.size() < maxColorLength; i++) {
        if (!isASCIIHexDigit(colorString[i]))
            digitBuffer.append('0');
        else
            digitBuffer.append(colorString[i]);
    }

    if (!digitBuffer.size())
        return Color::black;

    // Pad the buffer out to at least the next multiple of three in size.
    digitBuffer.append('0');
    digitBuffer.append('0');

    if (digitBuffer.size() < 6)
        return makeRGB(toASCIIHexValue(digitBuffer[0]), toASCIIHexValue(digitBuffer[1]), toASCIIHexValue(digitBuffer[2]));

    // Split the digits into three components, then search the last 8 digits of each component.
    ASSERT(digitBuffer.size() >= 6);
    size_t componentLength = digitBuffer.size() / 3;
    size_t componentSearchWindowLength = min<size_t>(componentLength, 8);
    size_t redIndex = componentLength - componentSearchWindowLength;
    size_t greenIndex = componentLength * 2 - componentSearchWindowLength;
    size_t blueIndex = componentLength * 3 - componentSearchWindowLength;
    // Skip digits until one of them is non-zero, 
    // or we've only got two digits left in the component.
    while (digitBuffer[redIndex] == '0' && digitBuffer[greenIndex] == '0' 
        && digitBuffer[blueIndex] == '0' && (componentLength - redIndex) > 2) {
        redIndex++;
        greenIndex++;
        blueIndex++;
    }
    ASSERT(redIndex + 1 < componentLength);
    ASSERT(greenIndex >= componentLength);
    ASSERT(greenIndex + 1 < componentLength * 2);
    ASSERT(blueIndex >= componentLength * 2);
    ASSERT(blueIndex + 1 < digitBuffer.size());

    int redValue = toASCIIHexValue(digitBuffer[redIndex], digitBuffer[redIndex + 1]);
    int greenValue = toASCIIHexValue(digitBuffer[greenIndex], digitBuffer[greenIndex + 1]);
    int blueValue = toASCIIHexValue(digitBuffer[blueIndex], digitBuffer[blueIndex + 1]);
    return makeRGB(redValue, greenValue, blueValue);
}
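
One detail that can look odd next to the standard is that the code always appends two zeroes instead of padding to an exact multiple of three. A small check, sketched here in Python with hypothetical helper names, suggests why that works: with integer division both approaches give the same component length, and any leftover digit past three full components is simply never read.

```python
def spec_component_length(digit_count: int) -> int:
    """Pad to a non-zero multiple of three, then divide by three."""
    padded = digit_count
    while padded == 0 or padded % 3 != 0:
        padded += 1
    return padded // 3


def webkit_component_length(digit_count: int) -> int:
    """Append two zeroes unconditionally, then use integer division."""
    return (digit_count + 2) // 3


# The empty case is excluded: the WebKit code returns black before padding.
assert all(
    spec_component_length(n) == webkit_component_length(n)
    for n in range(1, 129)
)
```

For the 45 digits in the question's string, both helpers return a component length of 15.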
pce
  • Only for clarification, @Soluter should accept your answer. – pce Nov 14 '12 at 16:37
  • I didn't mean to quarrel about points or anything. Just remarking that it wasn't any different. But it is in fact a (possibly) useful clarification, so I removed the comment. – Jasper Nov 14 '12 at 16:40