How to sort a file by a column that mixes numeric, alphabetic and punctuation characters?

I have a text file of the form:

b   SN:2
d   SN:5
f   SN:10
g   SN:11
h   SN:15
i   SA:3
j   SN:1
k   SN:4

I want to sort by the second column, specifically by the numerical value in that column. I’ve tried:

$ sort -n -k2,2 file
$ sort -k2.4,2.5n file

but nothing seems to work.

Solutions:


Solution 1

Because you use neither the -t option nor -b, character positions within a field are counted from the start of the blanks that precede it. POSIX defines a field in the EXTENDED DESCRIPTION of sort -k as:

A field comprises a maximal sequence of non-separating characters and, in 
the absence of option -t, any preceding field separator

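In the sample data the three blanks plus SN: occupy characters 1 to 6 of the second field, so the number starts at character 7 and you must use: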

$ sort -nk2.7 file
j   SN:1
b   SN:2
i   SA:3
k   SN:4
d   SN:5
f   SN:10
g   SN:11
h   SN:15

Alternatively, use : as the field separator and sort numerically on the second field:

$ sort -t':' -nk2 file
j   SN:1
b   SN:2
i   SA:3
k   SN:4
d   SN:5
f   SN:10
g   SN:11
h   SN:15
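
As a quick check, the two commands above can be compared on a copy of the sample data. This is a minimal sketch, assuming a POSIX shell and sort; the temporary file and the variable names (tmp, by_offset, by_colon) are illustrative and not part of the original answer.

```shell
# Recreate the sample file; three spaces separate the columns, as in the question.
tmp=$(mktemp)
printf '%s\n' 'b   SN:2' 'd   SN:5' 'f   SN:10' 'g   SN:11' \
              'h   SN:15' 'i   SA:3' 'j   SN:1' 'k   SN:4' > "$tmp"

# Key starts at character 7 of field 2 (the preceding blanks are counted).
by_offset=$(sort -nk2.7 "$tmp")

# Key is everything after the colon.
by_colon=$(sort -t':' -nk2 "$tmp")

# Both approaches should agree on this data.
if [ "$by_offset" = "$by_colon" ]; then
    echo "same order"
else
    echo "orders differ"
fi

rm -f "$tmp"
```

The colon form does not depend on how many blanks separate the columns, which may make it the safer choice if the spacing varies from line to line.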

Solution 2

For your case, man sort says:

If neither -t nor -b is in effect, characters in a field are counted
from the beginning of the preceding whitespace.

So

sort -k2.7n file

will do the job.
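
The man page excerpt above also implies the converse: when -b is in effect for the key, counting starts at the first non-blank character, so the number sits at offset 4 rather than 7. The following is a minimal sketch of that idea, assuming a POSIX-conforming sort that accepts the b modifier on the key start; the temporary file and the variable names are illustrative.

```shell
# Recreate the sample file; three spaces separate the columns, as in the question.
tmp=$(mktemp)
printf '%s\n' 'b   SN:2' 'd   SN:5' 'f   SN:10' 'g   SN:11' \
              'h   SN:15' 'i   SA:3' 'j   SN:1' 'k   SN:4' > "$tmp"

# Without b: the blanks before the field are counted, so the digits start at offset 7.
with_blanks=$(sort -k2.7n "$tmp")

# With the b modifier on the key start: leading blanks are skipped first,
# so "SN:" occupies offsets 1-3 and the digits start at offset 4.
skip_blanks=$(sort -k2.4bn "$tmp")

printf '%s\n' "$skip_blanks"

rm -f "$tmp"
```

The b modifier is attached to the key itself here because, in GNU sort at least, a key that carries its own ordering letters (such as n) does not inherit global options like a separate -b.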

HINT!

If you’d like to count from the beginning of the line, you can pass -t a character that does not occur in the file, so that the whole line is treated as one field:

sort -t% -k1.8n file
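
The hint only works if the chosen separator really is absent from the data. A small sketch of how that could be guarded, assuming a POSIX shell, grep and sort; the variable names (tmp, sep, whole_line) are hypothetical:

```shell
# Recreate the sample file; three spaces separate the columns, as in the question.
tmp=$(mktemp)
printf '%s\n' 'b   SN:2' 'd   SN:5' 'f   SN:10' 'g   SN:11' \
              'h   SN:15' 'i   SA:3' 'j   SN:1' 'k   SN:4' > "$tmp"

sep='%'   # assumed not to occur anywhere in the data

if grep -q -F -- "$sep" "$tmp"; then
    echo "separator '$sep' occurs in the data; pick another one" >&2
    whole_line=''
else
    # The line is a single field, so 1.8 is the 8th character of the line.
    whole_line=$(sort -t "$sep" -k1.8n "$tmp")
    printf '%s\n' "$whole_line"
fi

rm -f "$tmp"
```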


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
