Here's my take on this. Tested on FreeBSD, so I'd expect it to work just about anywhere...
#!/usr/bin/awk -f
BEGIN {
    depth=1;
}
$1 ~ /^#(\.#)*\)$/ {
    thisdepth=split($1, _, ".");
    if (thisdepth < depth) {
        # end of subsection, back out to current depth by deleting array values
        for (; depth>thisdepth; depth--) {
            delete value[depth];
        }
    }
    depth=thisdepth;
    # Increment value of last member
    value[depth]++;
    # And substitute it into the current line.
    for (i=1; i<=depth; i++) {
        sub(/#/, value[i], $0);
    }
}
1
The basic idea is that we maintain an array (value[]) of our nested chapter values. After updating the array as required, we step through the values, substituting the first occurrence of the octothorpe (#) each time with the current value for that position of the array.
This will handle any level of nesting, and as I mentioned above, it should work both in GNU (Linux) and non-GNU (FreeBSD, OSX, etc) versions of awk.
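To make the behaviour concrete, here's a quick demonstration against some made-up input (the heading titles are invented for illustration; the awk program is the one above, just inlined at the shell):

```shell
# Sample headings are hypothetical; the program is the script shown above.
printf '%s\n' '#) Intro' '#.#) Setup' '#.#) Usage' '#.#.#) Flags' '#) Outro' |
awk 'BEGIN { depth=1 }
$1 ~ /^#(\.#)*\)$/ {
    thisdepth = split($1, _, ".")
    if (thisdepth < depth)
        for (; depth > thisdepth; depth--)
            delete value[depth]
    depth = thisdepth
    value[depth]++
    for (i = 1; i <= depth; i++)
        sub(/#/, value[i], $0)
} 1'
# prints:
# 1) Intro
# 1.1) Setup
# 1.2) Usage
# 1.2.1) Flags
# 2) Outro
```

Note how dropping back from `#.#.#)` to `#)` deletes the deeper counters, so the next `#.#)` would start counting from 1 again.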
And of course, if one-liners are your thing, this can be compacted:
awk -vd=1 '$1~/^#(\.#)*\)$/{t=split($1,_,".");if(t<d)for(;d>t;d--)delete v[d];d=t;v[d]++;for(i=1;i<=d;i++)sub(/#/,v[i],$0)}1'
which could also be expressed, for easier reading, like this:
awk -vd=1 '$1~/^#(\.#)*\)$/{ # match only the lines we care about
t=split($1,_,"."); # this line has 't' levels
if (t<d) for(;d>t;d--) delete v[d]; # if levels decrease, trim the array
d=t; v[d]++; # reset our depth, increment last number
for (i=1;i<=d;i++) sub(/#/,v[i],$0) # replace hash characters one by one
} 1' # and print.
UPDATE
And after thinking about this for a bit, I realized that it can be shrunk further. The for loop contains its own condition, so there's no need to wrap it in an if:
awk '{
    t=split($1,_,".");                  # get current depth
    v[t]++;                             # increment counter for this depth
    for(;d>t;d--) delete v[d];          # delete records for previous deeper counters
    d=t;                                # record current depth for next round
    for (i=1;i<=d;i++) sub(/#/,v[i],$0) # replace hashes as required.
} 1'
Which of course minifies into a one liner like this:
awk '{t=split($1,_,".");v[t]++;for(;d>t;d--)delete v[d];d=t;for(i=1;i<=d;i++)sub(/#/,v[i],$0)}1' file
Obviously, you can add the initial match condition back if you require it, so that you only process lines that look like titles. Without it, the first word of every text line still bumps the level-1 counter (split() returns at least 1 for any non-empty $1), and a blank line (where split() returns 0) deletes every counter and restarts the numbering.
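Here's a sketch of the one-liner with that guard restored, run against some invented input that mixes prose and headings; the prose line passes through untouched and does not disturb the counters:

```shell
# The input lines are made up for illustration; the awk program is the
# condensed one-liner from above, with the title-matching guard added back.
printf '%s\n' 'Some intro prose' '#) One' '#.#) One.One' '#) Two' |
awk '$1~/^#(\.#)*\)$/{t=split($1,_,".");v[t]++;for(;d>t;d--)delete v[d];d=t;for(i=1;i<=d;i++)sub(/#/,v[i],$0)}1'
# prints:
# Some intro prose
# 1) One
# 1.1) One.One
# 2) Two
```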
Despite being a few characters longer, I believe this version runs ever so slightly faster than karakfa's similar solution, probably because it avoids the extra if test on each iteration of the for loop.
UPDATE #2
I include this because I found it fun and interesting: you can do this in bash alone, no need for awk, and it's not much longer in terms of code.
#!/usr/bin/env bash
while read word line; do
    if [[ $word =~ ^[#](\.#)*\)$ ]]; then
        IFS=. read -ra a <<<"$word"
        t=${#a[@]}
        (( v[t]++ ))
        for (( ; d > t; d-- )); do unset 'v[d]'; done
        d=$t
        for (( i = 1; i <= t; i++ )); do
            word=${word/[#]/${v[i]}}
        done
    fi
    echo "$word $line"
done < input.txt
This follows the same logic as the awk script above, but works entirely in bash, using Parameter Expansion to replace the # characters. One flaw it suffers from is that it does not maintain the whitespace around the first word on each line, so you'd lose any indents. With a bit of work, that could be mitigated too.
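For what it's worth, here's one possible sketch of that mitigation: capture the leading whitespace with a regex, number the marker as before, then glue the indent back on. The function name and sample input are my own inventions, not part of anything above.

```shell
#!/usr/bin/env bash
# Hypothetical indentation-preserving variant; number_sections is a made-up name.
number_sections() {
    local re='^([[:space:]]*)(#(\.#)*\))(.*)$'
    local line indent word rest t i d=0
    local -a a v
    while IFS= read -r line; do
        if [[ $line =~ $re ]]; then
            indent=${BASH_REMATCH[1]}   # leading whitespace, preserved
            word=${BASH_REMATCH[2]}     # the #.#...) marker
            rest=${BASH_REMATCH[4]}     # everything after the marker
            IFS=. read -ra a <<<"$word"
            t=${#a[@]}
            (( v[t]++ ))
            for (( ; d > t; d-- )); do unset 'v[d]'; done
            d=$t
            for (( i = 1; i <= t; i++ )); do
                word=${word/[#]/${v[i]}}
            done
            line=$indent$word$rest      # reattach the captured indent
        fi
        printf '%s\n' "$line"
    done
}

number_sections <<'EOF'
#) Top
  #.#) Indented sub
#) Next
EOF
# prints:
# 1) Top
#   1.1) Indented sub
# 2) Next
```

The counters are declared local, so each call to the function starts numbering from scratch.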
Enjoy.