1

The following script colorizes SQL commands buried in log files, and a couple of other "tags":

red="\x1b[31m"
green="\x1b[32m"
yellow="\x1b[33m"
blue="\x1b[34m"
white="\x1b[37m"

BLACK="\x1b[30;1m"
RED="\x1b[31;1m"
GREEN="\x1b[32;1m"
YELLOW="\x1b[33;1m"
BLUE="\x1b[34;1m"
CYAN="\x1b[36;1m"
WHITE="\x1b[37;1m"

onred="\x1b[41m"

lblack="\x1b[90m"
lred="\x1b[91m"
lgreen="\x1b[92m"
lyellow="\x1b[93m"
lblue="\x1b[94m"
lmagenta="\x1b[95m"
lcyan="\x1b[96m"
lwhite="\x1b[97m"

reset_color="\x1b[0m"

sed -r "s/'[^']*'/${CYAN}&${reset_color}/g;
        s/[a-z_]*_id/${white}&${reset_color}/g;

        s/(.*\[)(AbstractApplicationContext)(\].*)/${BLACK}\\1${reset_color}${yellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(ActionService)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(Authenticated)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(CascadeHandlerImpl)(\].*)/${BLACK}\\1${reset_color}${lred}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(ClasspathHacker)(\].*)/${BLACK}\\1${reset_color}${lred}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(ConfigManagerImpl|ConfigManagerLoader)(\].*)/${BLACK}\\1${reset_color}${lred}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(ContextFilter|ContextImpl|ContextLoader)(\].*)/${BLACK}\\1${reset_color}${lred}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(DataSourceRestrictionConverter)(\].*)/${BLACK}\\1${reset_color}${lgreen}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(DatabaseLoader)(\].*)/${BLACK}\\1${reset_color}${lgreen}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(DefaultListableBeanFactory)(\].*)/${BLACK}\\1${reset_color}${lgreen}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(DispatchFilter)(\].*)/${BLACK}\\1${reset_color}${lgreen}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(FileHelper|FileIndex)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(LicenseManagerImpl)(\].*)/${BLACK}\\1${reset_color}${lblue}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(LocalizedStringsLoader)(\].*)/${BLACK}\\1${reset_color}${lblue}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(LoggingPropertyPlaceholderConfigurer)(\].*)/${BLACK}\\1${reset_color}${lblue}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(PooledDbDriverImpl)(\].*)/${BLACK}\\1${reset_color}${lcyan}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(ProjectLoader)(\].*)/${BLACK}\\1${reset_color}${lcyan}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(PropertiesLoaderSupport)(\].*)/${BLACK}\\1${reset_color}${lcyan}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(RenderingFilter)(\].*)/${BLACK}\\1${reset_color}${lwhite}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(SecurityControllerImpl|SecurityServiceImpl)(\].*)/${BLACK}\\1${reset_color}${lblue}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(WorkflowRulesContainerImpl|WorkflowRulesContainerLoader)(\].*)/${BLACK}\\1${reset_color}${lred}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(XmlBeanDefinitionReader)(\].*)/${BLACK}\\1${reset_color}${lgreen}\\2${reset_color}${BLACK}\\3${reset_color}/g;

        s/(SELECT|select)(.*)((FROM|from) ([^ )]*))([^\]]*)(WHERE|where)?/${yellow}\\1${reset_color}\\2${yellow}\\4${reset_color} ${YELLOW}\\5${reset_color}\\6${yellow}\\7${reset_color}/g;
        s/LEFT OUTER JOIN [^ ]* ON/${yellow}&${reset_color}/g;
        s/ORDER BY/${yellow}&${reset_color}/g;
        s/(ASC|DESC)/${yellow}&${reset_color}/g;
        s/GROUP BY/${yellow}&${reset_color}/g;
        s/(INSERT INTO) ([^ ]*)(.*)(VALUES)/${green}\\1${reset_color} ${GREEN}\\2${reset_color}\\3${green}\\4${reset_color}/g;
        s/(INSERT INTO) ([^ ]*)(.*)/${green}\\1${reset_color} ${GREEN}\\2${reset_color}\\3${reset_color}/g;
        s/(UPDATE) *([^ ]*) (SET|set)/${blue}\\1${reset_color} ${BLUE}\\2${reset_color} ${blue}\\3${reset_color}/g;
        s/DELETE FROM *[^ ]* WHERE/${RED}&${reset_color}/g;

        s/\*\*\*ROLLBACK\*\*\*/${white}${onred}&${reset_color}/g;
        s/\[ERROR\]/${WHITE}${onred}&${reset_color}/g;
        s/SQLServerException:/${WHITE}${onred}&${reset_color}/g"

Though, when run on a 1.9 MB log file, it takes 1:41 (1 minute, 41 seconds) to execute... while a simple cat is run in less than 0:01...

Same results (under 1 sec) if I remove the block between "AbstractApplicationContext" and "XmlBeanDefinitionReader".

I don't understand why that particular block makes such a difference!?

Is there a way to optimize such colorizing script?

Sample file extract (to duplicate for making it become a large file):

[INFO ][2016-05-20 16:17:51,346][ContextLoader] - [Root WebApplicationContext: initialization started]
[INFO ][2016-05-20 16:17:51,505][XmlBeanDefinitionReader] - [Loading XML bean definitions from ServletContext resource [/WEB-INF/config/context/appContext.xml]]
[INFO ][2016-05-20 16:17:52,986][PropertiesLoaderSupport] - [Loading properties file from class path resource [config/mail.properties]]
[INFO ][2016-05-20 16:17:55,900][ConfigManagerLoader] - [Reading XML config]
[INFO ][2016-05-20 16:17:55,991][ConfigManagerLoader] - [Reading XML config: OK]
[WARN ][2016-05-20 16:17:56,384][ConfigManagerLoader] - [Low max memory=477102080. Java max memory=1000 MB is recommended for production use, as a minimum.]
[INFO ][2016-05-20 16:17:58,309][LocalizedStringsLoader] - [Loading localized strings for locale=[fr_FR]]
[INFO ][2016-05-20 16:17:58,337][LocalizedStringsLoader] - [Loading localized strings for locale=[fr_FR]: OK, strings:759]
[INFO ][2016-05-20 16:17:58,641][LocalizedStringsLoader] - [Loading localized strings for locale=[fr_FR]]
[INFO ][2016-05-20 16:17:58,768][LocalizedStringsLoader] - [Loading localized strings for locale=[fr_FR]: OK, strings:46436]
[INFO ][2016-05-20 16:17:58,830][LocalizedStringsLoader] - [Loading localized strings for locale=[nl_NL]]
[INFO ][2016-05-20 16:17:58,946][LocalizedStringsLoader] - [Loading localized strings for locale=[nl_NL]: OK, strings:46436]
[INFO ][2016-05-20 16:17:59,434][PropertiesLoaderSupport] - [Loading properties file from class path resource [config/mail.properties]]
[INFO ][2016-05-20 16:18:00,476][XmlBeanDefinitionReader] - [Loading XML bean definitions from class path resource [project-child-context.xml]]
[DEBUG][2016-05-20 16:18:01,259][DbConnectionImpl] - [SET CONCAT_NULL_YIELDS_NULL OFF]
[DEBUG][2016-05-20 16:18:01,340][DbConnectionImpl] - [Updated: -1 records]
[INFO ][2016-05-20 16:18:01,363][DatabaseImpl] - [Database loaded: data]
[INFO ][2016-05-20 16:18:01,379][DatabaseLoader] - [Loading Database [data]: OK]
[INFO ][2016-05-20 16:18:01,393][DatabaseLoader] - [Loading Database [schema]]
[DEBUG][2016-05-20 16:18:01,865][DbConnectionImpl] - [SELECT column FROM table WHERE table_name = 'sample']
[DEBUG][2016-05-20 16:18:01,894][DbConnectionImpl] - [SET CONCAT_NULL_YIELDS_NULL OFF]
[DEBUG][2016-05-20 16:18:01,898][DbConnectionImpl] - [Updated: -1 records]
[INFO ][2016-05-20 16:18:06,241][WorkflowRulesContainerLoader] - [Loading Workflow Rule, ruleId=[checkRequestDuplicates]]
[INFO ][2016-05-20 16:18:06,384][WorkflowRulesContainerLoader] - [Loading Workflow Rule, ruleId=[getStatistic]]
[INFO ][2016-05-20 16:18:06,971][WorkflowRulesContainerLoader] - [Loading Workflow Rule, ruleId=[saveRecord]]
[INFO ][2016-05-20 16:18:07,126][WorkflowRulesContainerLoader] - [Loading Workflow Rule, ruleId=[EmailService]]
[INFO ][2016-05-20 16:18:07,542][WorkflowRulesContainerLoader] - [Loading Workflow Rule, ruleId=[LocalizationRead]]
[INFO ][2016-05-20 16:18:09,578][FileIndex$1] - [File index loading started]
[DEBUG][2016-05-20 16:18:19,406][DbConnectionImpl] - [SET CONCAT_NULL_YIELDS_NULL OFF]
[DEBUG][2016-05-20 16:18:19,410][DbConnectionImpl] - [Updated: -1 records]
[INFO ][2016-05-20 16:18:22,201][LicenseManagerImpl] - [Checkout concurrent license=[Main]]
[DEBUG][2016-05-20 16:18:22,209][SecurityControllerImpl] - [Determine next request]
[DEBUG][2016-05-20 16:18:22,239][RenderingFilter] - [Rendering mode]
[DEBUG][2016-05-20 16:18:22,253][Authenticated] - [User is authenticated]
[DEBUG][2016-05-20 16:18:22,306][FileHelper] - [Find file=[hovertip.js]]
[DEBUG][2016-05-20 16:18:22,399][FileHelper] - [Find file=[hovertip.css]]
[DEBUG][2016-05-20 16:22:18,263][DbConnectionImpl] - [INSERT INTO notifications_log (columns, columns) VALUES ('2016-05-20', 'ERROR')]
[DEBUG][2016-05-20 16:22:18,334][DbConnectionImpl] - [Updated: 1 records]
[DEBUG][2016-05-20 16:22:18,393][DbConnectionImpl] - [***COMMIT***]
[DEBUG][2016-05-20 16:22:18,549][DbConnectionImpl] - [***ROLLBACK***]
[DEBUG][2016-05-20 16:23:37,659][DbConnectionImpl] - [SET CONCAT_NULL_YIELDS_NULL OFF]
[DEBUG][2016-05-20 16:23:37,662][DbConnectionImpl] - [Updated: -1 records]
[DEBUG][2016-05-20 16:23:37,886][DataSourceImpl] - [SELECT col_id FROM table_1 LEFT OUTER JOIN table_2 ON table_1.col_id=table_2.col_id]
[DEBUG][2016-05-20 16:23:37,926][DbConnectionImpl] - [***COMMIT***]
[DEBUG][2016-05-20 16:23:37,930][ContextFilter] - [---------- Request: processing finished]
[DEBUG][2016-05-20 16:38:38,033][DbConnectionImpl] - [UPDATE users SET pwd = NULL WHERE user_name = 'me']
[DEBUG][2016-05-20 16:38:38,051][DbConnectionImpl] - [Updated: 1 records]
[DEBUG][2016-05-20 16:38:38,058][DbConnectionImpl] - [SET CONCAT_NULL_YIELDS_NULL OFF]
[DEBUG][2016-05-20 16:38:38,063][DbConnectionImpl] - [Updated: -1 records]
[DEBUG][2016-05-20 17:43:25,087][DbConnectionImpl] - [***COMMIT***]
[DEBUG][2016-05-20 17:43:25,096][ContextFilter] - [---------- Request: processing finished]
user3341592
  • 1,419
  • 1
  • 17
  • 36

2 Answers2

1

Reduce similar lines

s/(.*\[)(ActionService)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(Authenticated)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;

and so forth

by one line

s/(.*\[)(ActionService|Authenticated)(\].*)/${BLACK}\\1${reset_color}${yellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;

to get more performance.

Cyrus
  • 84,225
  • 14
  • 89
  • 153
1

It's taking a very long time because you're comparing every input line against every one of your 37 regexps! Use awk and just do the test once, e.g.:

$ cat file
[INFO ][ts 1][ContextLoader] - [rest 1]
[INFO ][ts 2][XmlBeanDefinitionReader] - [rest 2]
[INFO ][ts 3][PropertiesLoaderSupport] - [rest 3]
$
$ cat tst.awk
BEGIN {
    red    = "<red>"            # "\x1b[31m"
    green  = "<green>"          # "\x1b[32m"
    yellow = "<yellow>"         # "\x1b[33m"
    black  = "<black>"          # "\x1b[30;1m"
    reset  = "<reset>"          # "\x1b[0m"

    color["ContextLoader"] = red
    color["XmlBeanDefinitionReader"] = green
    color["PropertiesLoaderSupport"] = yellow
}
match($0,/((\[[^]]+\]){2}\[)([^]]+)(.*)/,a) {
    print black a[1] reset color[a[3]] a[3] reset black a[4] reset
}

$ awk -f tst.awk file
<black>[INFO ][ts 1][<reset><red>ContextLoader<reset><black>] - [rest 1]<reset>
<black>[INFO ][ts 2][<reset><green>XmlBeanDefinitionReader<reset><black>] - [rest 2]<reset>
<black>[INFO ][ts 3][<reset><yellow>PropertiesLoaderSupport<reset><black>] - [rest 3]<reset>

The above isolates whatever string occurs in the section of each line you care about and then looks up the color of that string in a hash for printing. It uses GNU awk for the 3rd arg to match(), it's a simple tweak for other awks.

I used color names like <red> just so you could see the output. In your real system you'd just write red = "\x1b[31m" instead of red = "<red>" # "\x1b[31m" etc.

To update the above to handle the SQL statements too, emulating the logic from your sed script:

s/(SELECT|select)(.*)((FROM|from) ([^ )]*))([^\]]*)(WHERE|where)?/${yellow}\\1${reset_color}\\2${yellow}\\4${reset_color} ${YELLOW}\\5${reset_color}\\6${yellow}\\7${reset_color}/g;
s/LEFT OUTER JOIN [^ ]* ON/${yellow}&${reset_color}/g;

would be something like (first block from the existing awk script in this answer):

....
match($0,/((\[[^]]+\]){2}\[)([^]]+)(.*)/,a) {
    print black a[1] reset color[a[3]] a[3] reset black a[4] reset
    next
}
match($0,/(SELECT|select)(.*)((FROM|from) ([^ )]*))([^\]]*)(WHERE|where)?/,a) {
    print yellow a[1] reset a[2] yellow a[4] reset yellow a[5] reset a[6] yellow reset
    next
}
match($0,/LEFT OUTER JOIN [^ ]* ON/,a) {
    print yellow a[0] reset
    next
}
....

Note the "next" statement at the end of every block - that tells awk to stop processing the current line of input and go back to the start of it's implicit while read line work loop. We use that for efficiency to stop awk from analyzing the line again after it's already succeeded in matching a regexp against the line.

Ed Morton
  • 188,023
  • 17
  • 78
  • 185
  • FYI, doing this takes between 5 and 6 seconds to display a big file -- which takes 4 s without highlighting (just cat). So, that becomes bearable... – user3341592 May 26 '16 at 05:35
  • Should I put everything in AWK, then? Or keep, for example, the lines highlighting in AWK (as you did) and have the SQL keywords highlighting in Sed? And pipe both? – user3341592 May 26 '16 at 05:35
  • Do it all in awk. You don't need sed when you're using awk. When you have to do simple substitutions on individual lines you use sed, for anything more complicated than that you use awk - you never use both. If you had included some SQL statements in your sample input I'd have shown you how to do that too but you can probably figure it out (a separate `match() { action }` block). If you need a reference for awk, get the book Effective Awk Programming, 4th Edition, by Arnold Robbins (all other awk books are obsolete). – Ed Morton May 26 '16 at 11:55
  • I added some of the code to handle the SQL statements at the bottom of the answer. As you can see it's pretty obvious given your existing sed script. – Ed Morton May 26 '16 at 12:07
  • Thanks for improving your example. Though, it's not really functioning because of your `next` command. I need to color SQL keywords, and SQL column names ending with `_id` for example. It can't be done in one pass, as `INSERT INTO` may be followed by `SELECT`, as `SELECT` may include one, two or even more `JOIN`, as many `_id` columns can be specified everywhere, as there can be sub-SELECT, etc. I've done a sample file which includes all complex cases. And I'd like the script to be no more than 20% slower than a normal `cat`, so that I can really use it on the fly, piping a `tail -f` command. – user3341592 May 26 '16 at 21:29
  • Note that my first trial was using `awk` for coloring full lines, as you did, and `sed` for coloring parts of lines. The 2 scripts were called `colorlog` and `colorsql`. I don't understand why, but when both were piped in sequence after a `tail -f`command, there was no output at all!? Piped alone after `tail -f`, they both worked OK. So, there is an interaction when piping both of them which totally escapes me... Any idea? – user3341592 May 26 '16 at 21:32
  • Then remove the `next` and change `print`s to `$0 = ` in the relevant SQL parts and if that doesn't do it then post a new question wrt what you need to do with your SQL, complete with **concise**, testable sample input and expected output. Is all of this log stuff and the SQL all in one file? Seems like you'd want 2 different tools to parse 2 different formats. wrt the `tail | awk | sed` it's probably a buffering issue similar to that described at http://stackoverflow.com/questions/5427483/why-no-output-is-shown-when-using-grep-twice. – Ed Morton May 26 '16 at 21:48
  • New question posted on http://stackoverflow.com/questions/37513885/colorize-log-files. Thanks! – user3341592 May 29 '16 at 19:14