Grep word boundaries

According to the GNU documentation:

‘\<’ Match the empty string at the beginning of word.
‘\>’ Match the empty string at the end of word.

My /etc/fstab looks like this:

/dev/sdb1       /media/fresh      ext2   defaults     0 0

I want grep to return TRUE/FALSE for the existence of /media/fresh. I tried to use \< and \> but it didn’t work. Why?

egrep '\</media/fresh\>' /etc/fstab

Workaround:

egrep '[[:blank:]]/media/fresh[[:blank:]]' /etc/fstab

But it looks uglier.

My grep version is 2.5.1.

Solutions:

Solution 1

\< and \> match the empty string at the beginning and end of a word, respectively, and the only word-constituent characters are:

[[:alnum:]_]

From man grep:

Word-constituent characters are letters, digits, and the underscore.

So your regex fails because / is not a word-constituent character: \< can only match immediately before a word character, and in your pattern it is followed by /.
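To make the failure concrete, here is a minimal sketch, assuming GNU grep. It feeds the fstab line from the question through two patterns; the variable names are only for illustration, and the second pattern is there to show where \< can match, not as a recommended fix.

```shell
line='/dev/sdb1       /media/fresh      ext2   defaults     0 0'

# \< needs a word character right after it, but the next character is '/',
# so this pattern cannot match anywhere on the line.
if printf '%s\n' "$line" | grep -q '\</media/fresh\>'; then
    anchored_before_slash='match'
else
    anchored_before_slash='no match'
fi

# Moving \< past the slash puts it directly in front of 'm', a word character.
if printf '%s\n' "$line" | grep -q '/\<media/fresh\>'; then
    anchored_after_slash='match'
else
    anchored_after_slash='no match'
fi

printf '%s: %s\n' '\</media/fresh\>' "$anchored_before_slash"
printf '%s: %s\n' '/\<media/fresh\>' "$anchored_after_slash"
```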

Instead, since the path is surrounded by whitespace, you can use grep's -w option to match it as a whole word:

grep -wo '/media/fresh' /etc/fstab

Example:

$ grep -wo '/media/fresh' <<< '/dev/sdb1       /media/fresh      ext2   defaults     0 0'
/media/fresh
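Since the question asks for a TRUE/FALSE answer rather than the matched text, one possible approach is to rely on grep's exit status with -q, which suppresses output. The sketch below assumes a temporary sample file standing in for /etc/fstab; the file, the variable names, and the /media/stale path are hypothetical.

```shell
# Hypothetical sample file standing in for /etc/fstab.
fstab_sample=$(mktemp)
printf '%s\n' '/dev/sdb1       /media/fresh      ext2   defaults     0 0' > "$fstab_sample"

# -q prints nothing; grep exits with 0 when a line matches and 1 when none does.
if grep -qw '/media/fresh' "$fstab_sample"; then
    found=TRUE
else
    found=FALSE
fi

# A mount point that is not in the sample file.
if grep -qw '/media/stale' "$fstab_sample"; then
    missing=TRUE
else
    missing=FALSE
fi

echo "/media/fresh: $found"
echo "/media/stale: $missing"

rm -f "$fstab_sample"
```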

Solution 2

This problem with \< (and also \b) applies not only to / but to every non-word character (that is, any character other than [[:alnum:]] and _).

The problem is that \< only matches at a position where the next character is a word character, so the regex engine skips over a non-word character like / when looking for the next \< anchor.
That is why you should not put a non-word character like / right after \<.
If you do, by construction, nothing will match.

An alternative to grep's -w option would be something like this:

egrep '(^|\W)/media/fresh($|\W)' /etc/fstab
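One caveat worth checking: / itself is a non-word character, so both -w and \W accept it as a boundary, and a longer path that merely starts with /media/fresh also matches. The sketch below, assuming GNU grep, compares those two forms with a stricter [[:blank:]] variant of the workaround from the question; the /dev/sdc1 line and the check helper are hypothetical.

```shell
# Hypothetical fstab line whose mount point only starts with /media/fresh.
longer='/dev/sdc1       /media/fresh/backup      ext2   defaults     0 0'

# Run grep quietly on the sample line with the given options and pattern.
check() {
    if printf '%s\n' "$longer" | grep -q "$@"; then
        echo 'match'
    else
        echo 'no match'
    fi
}

with_w=$(check -w '/media/fresh')
with_W=$(check -E '(^|\W)/media/fresh($|\W)')
with_blank=$(check -E '(^|[[:blank:]])/media/fresh($|[[:blank:]])')

echo "-w:          $with_w"
echo "\\W:          $with_W"
echo "[[:blank:]]: $with_blank"
```

If only the exact mount point should count, the [[:blank:]] form is the safer choice of the three.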

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.