I am trying to beautify CSS code using JavaScript.

A minified CSS code looks like this:

str = 'body{margin:0;padding:0;}section,article,.class{font-size:2em;}'

So far I could beautify the code by using multiple replaces:

str.replace(/{/g, " {\n")
    .replace(/}/g, "}\n")
    .replace(/;/g,";\n")
    .replace(/,/g, ",\n")

This works, but I want to improve it:

  • How can I add a tab before each property?
  • Is it possible to aggregate all replace calls in one RegEx?
  • Is it possible to detect very last properties that don't have semicolon? (that is valid CSS)
Mohsen
  • Why not use a pre-built css minifier? http://stackoverflow.com/questions/787789/any-recommendations-for-a-css-minifier – beatgammit Nov 21 '12 at 20:28
  • @tjameson He wants to UN-minify, not minify. – ean5533 Nov 21 '12 at 20:29
  • @tjameson He's looking for a beautifier, not a minifier. – Brian Campbell Nov 21 '12 at 20:29
  • There are plenty of beautifier scripts available online. Most likely some in JavaScript as well. Using RegEx only is kind of limited for this kind of task unless you accept/embrace its limitations (it can't manage the entire grammar and syntax of the CSS language). – Mihai Stancu Nov 21 '12 at 20:32
  • @MihaiStancu what you said is true when parsing programming languages. I'm not sure it's valid for CSS too - it'd be interesting if someone good at CS could prove it – Raffaele Nov 21 '12 at 20:36
  • @Raffaele I don't know whether there are some crazy new CSS3 features, but I am pretty sure CSS has neither nesting nor any requirement of a matching repetition. Hence, it should be regular. – Martin Ender Nov 21 '12 at 20:49
  • @m.buettner - Not that non-regular languages can't be matched by today's "regular" expressions, e.g. matching "a string of a's followed by the same number of b's" shouldn't be possible with a "truly" regular expression, but with backreferencing we can. – Andrew Cheong Nov 21 '12 at 21:04
  • @acheong87 no I don't think backreferencing can do that (specific example). Recursion and balancing groups can. But yes, of course, most regex engines can match more than regular expressions. But often, it's not all that advisable to do so, because (especially with programming/markup languages), you **do** overlook the odd syntax exception. – Martin Ender Nov 21 '12 at 21:10
  • I believe CSS is regular and it should fit in one single expression. – Mohsen Nov 21 '12 at 21:15
  • @Mohsen it **cannot** go into a single expression, because you have different replacements for each match. of course you could match "all spots that need treatment", then use a callback for the replacement, and have the callback analyze those spots again. but that only defers the list of cases ;) – Martin Ender Nov 21 '12 at 21:25
  • @m.buettner - You're right. I'm not sure what the example I had in mind was, now. But there was one ;) – Andrew Cheong Nov 21 '12 at 21:36
  • @acheong87 `a{n}ba{n}` for arbitrary `n` is not regular, but possible with backreferencing ;) – Martin Ender Nov 21 '12 at 21:37
  • @m.buettner - Ah, you nailed it. I kept thinking a "b" was in the example somewhere; forgot it might just be a separator. – Andrew Cheong Nov 21 '12 at 21:41

2 Answers

I think it's hard to reduce the number of regular expressions, since sometimes you need only a line break, sometimes you need a tab, too. Sometimes you need to write back one and sometimes two characters. But here is a list of replacements that makes the CSS look quite nice:

str.replace(/\{/g, " {\n\t")        // Line-break and tab after opening {
   .replace(/;([^}])/g, ";\n\t$1")  // Line-break and tab after every ; except
                                    // for the last one
   .replace(/;\}/g, ";\n}\n\n")     // Line-break only after the last ; then two
                                    // line-breaks after the }
   .replace(/([^\n])\}/g, "$1;\n}\n\n") // Restore the omitted ; with a
                                    // line-break before and two after any }
                                    // that has not been affected yet
   .replace(/,/g, ",\n")            // line break after comma
   .trim()                          // remove leading and trailing whitespace

Makes this:

 str = 'body{margin:0;padding:0}section,article,.class{font-size:2em;}'

Look like this:

body {
    margin:0;
    padding:0;
}

section,
article,
.class {
    font-size:2em;
}

If you don't care about those omitted semicolons being put back in place, you can shorten this a bit though, by changing the order:

str.replace(/\{/g, " {\n\t")
   .replace(/\}/g, "\n}\n\n")    // 1 \n before and 2 \n after each }
   .replace(/;(?!\n)/g, ";\n\t") // \n\t after each ; that was not affected
   .replace(/,/g, ",\n")
   .trim()
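
As suggested in the comments on the question, the same rules can be folded into a single replace call by matching every character that needs treatment and letting a callback pick the replacement. The following is only a sketch of that idea, not a more capable beautifier; the function name beautifyCss is made up for illustration, and it makes the same assumptions as the shorter chain above (minified input, omitted semicolons are not put back):

```javascript
// Hypothetical helper; the name is an assumption for this sketch.
// One regex finds every character that needs treatment, and the
// callback decides what to write back for each of them.
function beautifyCss(css) {
  return css
    .replace(/[{};,]/g, function (ch, offset, whole) {
      switch (ch) {
        case "{":
          return " {\n\t";   // line-break and tab after opening {
        case "}":
          return "\n}\n\n";  // 1 \n before and 2 \n after each }
        case ";":
          // The } rule already adds the line-break after a final ;
          return whole[offset + 1] === "}" ? ";" : ";\n\t";
        case ",":
          return ",\n";      // line-break after comma
        default:
          return ch;         // not reachable with the regex above
      }
    })
    .trim();
}

console.log(
  beautifyCss("body{margin:0;padding:0}section,article,.class{font-size:2em;}")
);
```

This only moves the list of cases from the chain into the callback, so it is not shorter, but the input is scanned once and the order of the rules no longer matters.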
Martin Ender
  • To me this is a very good starting point. I personally prefer when the colon is followed by a single space. Also, consider what happens when setting `background-image: url('/crazy{u;rl}');` - I know maybe the OP didn't ask for a rock solid algorithm :) – Raffaele Nov 21 '12 at 21:30
  • @Raffaele right, now we're getting into the awkward territory due to which regular expressions are not recommended for language parsing ;). With some odd lookaheads, this will still be possible, but ridiculously clutter up the regex. – Martin Ender Nov 21 '12 at 21:32
  • Sure, that's why if I used RegEx, I wouldn't use that simple approach :) However, as I said in a previous comment, I feel that CSS can be matched with RegEX (and your statement that it's a regular language seems to confirm this - but I'm not trained in CS so can't tell) and the usual answer when someone asks a similar question for XML or code (*"You can't do it"*) shouldn't be fired here :) – Raffaele Nov 21 '12 at 21:56
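
For the url('/crazy{u;rl}') case raised in these comments, one possible way around the lookahead clutter is to let the regex match quoted strings and url(...) tokens first and write them back unchanged, so that the punctuation inside them is never treated. This is a rough sketch under stated assumptions: the function name is hypothetical, CSS comments are not handled, and an unquoted ")" ends a url(...) token early.

```javascript
// Hypothetical helper; the name is an assumption for this sketch.
// Alternatives are tried left to right, so strings and url(...) are
// consumed as a whole before a lone { } ; or , inside them can match.
function beautifyCssKeepingLiterals(css) {
  var token = /"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'|url\([^)]*\)|[{};,]/g;
  return css
    .replace(token, function (match, offset, whole) {
      if (match.length > 1) {
        return match;        // a string or url(...): leave untouched
      }
      switch (match) {
        case "{":
          return " {\n\t";
        case "}":
          return "\n}\n\n";
        case ";":
          return whole[offset + 1] === "}" ? ";" : ";\n\t";
        default:
          return ",\n";      // the only remaining single character
      }
    })
    .trim();
}

console.log(
  beautifyCssKeepingLiterals("a{background-image:url('/crazy{u;rl}');color:red}")
);
```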

I don't know if CSS is a regular language (my guess is yes), but this should be doable with regex regardless.

There's no need to match a last property, whether it contains a semicolon or not. First match all closing curly braces, like you've done, except add a newline both before and after each:

.replace(/}/g, "\n}\n")

Then match all semicolons except those that come before a newline (which were inserted by the regex above) and add a newline and tab using the \t character after each:

.replace(/;([^\n])/g, ";\n\t$1")
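
Put together, the two rules above can be tried out as follows. This is a minimal sketch; the function name breakBlocks is made up for illustration. Note that these two rules only deal with the ends of blocks, so the first property after each { stays on the selector's line until a rule for opening braces is added.

```javascript
// Hypothetical helper; the name is an assumption for this sketch.
function breakBlocks(css) {
  return css
    .replace(/}/g, "\n}\n")            // newline before and after each }
    .replace(/;([^\n])/g, ";\n\t$1");  // newline + tab after each ; that is
                                       // not already followed by a newline
}

console.log(breakBlocks("body{margin:0;padding:0}section{font-size:2em;}"));
```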


This is just the tip of the iceberg, unfortunately. Don't forget to look for all the different types of selectors, such as those containing : or >, if you plan to add spaces around those. There's probably lots of other stuff you'll need to consider, too.
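
As one example of such a consideration, a space after the colon of each property can be added without touching colons in selectors (such as a:hover) by only rewriting the text between { and }. The sketch below is an assumption-laden illustration, not a complete solution: the function name is hypothetical, and it only handles flat rules (no nested blocks such as @media, no } inside strings).

```javascript
// Hypothetical helper; the name is an assumption for this sketch.
function spaceAfterColons(css) {
  // Only look inside { ... } so pseudo-classes in selectors are left alone.
  return css.replace(/\{([^}]*)\}/g, function (block, body) {
    // A property name starts the block or follows a ; (optionally after
    // whitespace, which is written back unchanged).
    var spaced = body.replace(/(^|;)(\s*)([\w-]+)\s*:\s*/g, "$1$2$3: ");
    return "{" + spaced + "}";
  });
}

console.log(spaceAfterColons("a:hover{color:red;margin:0}"));
```

Because the whitespace before each property name is preserved, this can be run either before or after the line-break rules.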

user428517
  • oops, fixed, thanks. i don't believe anything more complicated than these two rules is needed to handle ends of blocks of css. – user428517 Nov 21 '12 at 21:16