The following will transform your file into tab-delimited form, where sort
or other standard tools can handle it trivially:
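# Reads the quoted, space-separated file on stdin
while read -r line; do
  # xargs parses the quoting; printf then emits each field followed by a tab
  printf '%s\n' "$line" | xargs printf '%s\t'
  echo  # terminate the output line
done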
This works because xargs parses quotes and whitespace, breaking each line into its individual elements, and then passes each element to printf '%s\t', which prints each element followed by a tab; the echo then ends each line of output with a newline.
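To see the splitting step in isolation, here is a minimal sketch. The function name to_tsv is hypothetical, introduced only so the loop can be called on sample data; the loop body is the same as above. The tabs are replaced with '|' to make them visible.

```shell
# to_tsv is a hypothetical wrapper name, not part of the original script.
to_tsv() {
  local line
  while read -r line; do
    # xargs strips the quotes and passes each field as a separate argument
    printf '%s\n' "$line" | xargs printf '%s\t'
    echo
  done
}

# Replace tabs with '|' so the field boundaries are visible
printf '%s\n' '"Street A" "City A" 1' 'Street City 7' | to_tsv | tr '\t' '|'
# prints:
#   Street A|City A|1|
#   Street|City|7|
```

Note that every field, including the last, is followed by a tab, so each output line ends with a trailing delimiter.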
The output can then be fed into something like the following:
sort -t $'\t' -k2,2 -k1,1
...which will sort the tab-delimited columns, first on the second column (city, in your example), then on the first (street name, in your example).
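As a small illustration of the key options, the sketch below sorts three hand-written tab-delimited rows. The helper name sort_city_street is hypothetical, and LC_ALL=C is an assumption added here so the ordering is byte-wise and reproducible. The tab is built with printf so the sketch also runs in plain sh; in bash, $'\t' is equivalent.

```shell
# sort_city_street is a hypothetical helper name used only for this sketch.
sort_city_street() {
  tab=$(printf '\t')   # same value as $'\t' in bash
  # -t "$tab" : fields are separated by a literal tab
  # -k2,2     : primary key is field 2 only (city)
  # -k1,1     : ties are broken on field 1 only (street)
  LC_ALL=C sort -t "$tab" -k2,2 -k1,1
}

printf '%s\t%s\t%s\n' \
  'Street B' 'City B' 2 \
  'Street A' 'City A' 1 \
  'A Street' 'City A' 3 | sort_city_street
# prints (tab-separated):
#   A Street  City A  3
#   Street A  City A  1
#   Street B  City B  2
```

Writing -k2,2 rather than -k2 matters: a key given as -k2 runs from field 2 to the end of the line, so later columns would take part in the comparison.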
Take the input file below, which makes the behavior clearer than the original proposal did:
"Street A" "City A" 1
"Street B" "City B" 2
"A Street" "City A" 3
"B Street" "City B" 4
"Street A" "A City" 5
"Street B" "B City" 6
Street City 7
Run it through the loop above, followed by LANG=C sort -s -t$'\t' -k2,2 -k1,1 | expand -t16 (thus sorting first by city, then by street, then printing with 16-column tab stops), and the output is as follows:
Street A        A City          5
Street B        B City          6
Street          City            7
A Street        City A          3
Street A        City A          1
B Street        City B          4
Street B        City B          2
By contrast, use LANG=C sort -s -t$'\t' -k1,1 -k2,2 | expand -t16 to sort first by street and then by city (and print with 16-column tab stops), and you get the following:
A Street        City A          3
B Street        City B          4
Street          City            7
Street A        A City          5
Street A        City A          1
Street B        B City          6
Street B        City B          2
If you want to go back from the tab-delimited format to the quoted format, this is feasible too:
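#!/bin/bash
# ^^^^- Important, not /bin/sh (arrays and [[ ]] are bash features)
# Split each line on tabs into the array "cols"
while IFS=$'\t' read -r -a cols; do
  for col in "${cols[@]}"; do
    # Re-quote only the fields that contain whitespace
    if [[ $col = *[[:space:]]* ]]; then
      printf '"%s" ' "$col"
    else
      printf '%s ' "$col"
    fi
  done
  printf '\n'
done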
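The two loops can be chained with sort in between. The sketch below wraps them in functions so the whole round trip fits in one pipeline; the names to_tsv and from_tsv are hypothetical, and LC_ALL=C is an assumption added for a reproducible byte-wise order. It requires bash, like the second script.

```shell
#!/bin/bash
# to_tsv and from_tsv are hypothetical names wrapping the two loops above.
to_tsv() {
  local line
  while read -r line; do
    printf '%s\n' "$line" | xargs printf '%s\t'
    echo
  done
}

from_tsv() {
  local cols col
  while IFS=$'\t' read -r -a cols; do
    for col in "${cols[@]}"; do
      if [[ $col = *[[:space:]]* ]]; then
        printf '"%s" ' "$col"   # quote fields containing whitespace
      else
        printf '%s ' "$col"
      fi
    done
    printf '\n'
  done
}

# Convert, sort by street then city, and convert back
printf '%s\n' '"Street B" "City B" 2' 'Street City 7' '"A Street" "City A" 3' |
  to_tsv | LC_ALL=C sort -t $'\t' -k1,1 -k2,2 | from_tsv
```

Each output line ends with a trailing space, because the loop prints a space after every field, including the last.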
Taking your original input and running it through the first script (to convert to tab-delimited form), then sort -t$'\t' -k1,1 -k2,2 (to sort in that form), then this second script (to convert back to whitespace separators with quotes), yields the following:
"A Street" "City A" 3
"B Street" "City B" 4
Street City 7
"Street A" "A City" 5
"Street A" "City A" 1
"Street B" "B City" 6
"Street B" "City B" 2