Why does my grep expression need to use $'string' to match tab characters?

If you take this code:

echo -e '\t\t\tString' | grep '^[\t]*String'

the result is blank because it doesn’t match, yet this:

echo -e '\t\t\tString' | grep $'^[\t]*String'

works. I swear that I must have used the first line’s code a hundred times in my scripts and in the terminal, without ever using the “$” character like that, and it’s always seemed to work. Has there been some recent change? Why does it need the “$” character? Or am I doing something wrong?

Solutions:


Solution 1

ANSI-C Quoting

According to the Bash manual, this is called ANSI-C quoting. The manual says:

Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard.

In practice, this means that '\t' is passed along as two literal characters, a backslash and a t, while $'\t' is expanded by the shell into a single tab character. The result is equivalent to what echo -e produces, but it can be used anywhere you'd use a string, without requiring command substitution.

Utilities like GNU sed perform their own expansion of escape characters, but GNU grep doesn’t. The Bash shell, not grep, expands escaped characters within ANSI-C quoted strings. Without the ANSI-C quoting, the regular expression you posted contains no tab characters to match the input.
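To make the difference concrete, here is a minimal Bash sketch that compares the two quoting styles and counts how many input lines each pattern matches. The variable names (plain, ansi, input, plain_matches, ansi_matches) are illustrative only, and the counts assume a grep that follows POSIX bracket-expression rules, as GNU grep does.

```shell
#!/usr/bin/env bash
# Plain single quotes: the shell passes a backslash and a "t" through unchanged.
plain='\t'
# ANSI-C quoting: Bash replaces \t with a real tab before grep ever runs.
ansi=$'\t'

echo "plain length: ${#plain}"   # 2 characters: backslash, t
echo "ansi length:  ${#ansi}"    # 1 character: a tab

# One input line that starts with three real tabs.
input=$'\t\t\tString'

# grep -c prints the number of matching lines; "|| true" only hides grep's
# non-zero exit status when nothing matches.
plain_matches=$(printf '%s\n' "$input" | grep -c '^[\t]*String' || true)
ansi_matches=$(printf '%s\n' "$input" | grep -c $'^[\t]*String' || true)

echo "single-quoted pattern matched $plain_matches line(s)"
echo "ANSI-C quoted pattern matched $ansi_matches line(s)"
```

The single-quoted pattern reaches grep with a literal backslash and t inside the brackets, so it cannot consume the leading tabs; the ANSI-C quoted pattern reaches grep with an actual tab there.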

Solution 2

You should be aware that there is no single type of regular expression. There are at least basic regular expressions or BRE (sometimes just RE), extended regular expressions or ERE, and Perl-compatible regular expressions or PCRE. Each of these uses slightly different syntax. Current versions of GNU grep support all three, with BRE as the default. Use the -E option for ERE and the -P option for PCRE. Your example will work only with -P, because in basic and extended regular expressions the backslash loses its special meaning inside a bracket expression, so [\t] matches either a backslash or the character t. You were probably using that pattern in some other language that supports PCRE by default, which makes sense since PCRE is the most powerful variant. Or perhaps you had alias grep='grep -P' somewhere.
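The three modes can be compared directly with a small sketch like the one below. It assumes GNU grep built with PCRE support; count_matches is a hypothetical helper name, and -G is GNU grep's explicit flag for the default BRE mode.

```shell
# Hypothetical helper: run the question's pattern in a given grep mode
# and print the number of matching lines.
count_matches() {
  # $1 is the mode flag: -G (BRE), -E (ERE) or -P (PCRE).
  # printf turns \t in its format string into real tabs for the input.
  printf '\t\t\tString\n' | grep -c "$1" '^[\t]*String' || true
}

bre=$(count_matches -G)
ere=$(count_matches -E)
# If this grep was built without PCRE support, -P fails with an error
# and the variable stays empty.
pcre=$(count_matches -P)

echo "BRE: $bre  ERE: $ere  PCRE: $pcre"
```

Under those assumptions, only the -P run should report a matching line, since PCRE is the only one of the three that treats \t inside brackets as a tab.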

Solution 3

The first line works if you leave out the ^, because [\t]* can then match zero characters and grep simply finds String after the tabs. Maybe it worked before, but not the way you assumed? I doubt that grep's behaviour has changed on such an important point.

echo does not translate escape sequences by default; you need -e for that. The shell is similar: you need $'...' to make it interpret escape sequences.
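A quick way to see what the unanchored pattern really matches is grep's -o option, which prints only the matched text. This is a sketch assuming Bash and a grep that supports -o (GNU and BSD grep both do); the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Without ^, the bracket expression is free to match nothing, so grep
# finds "String" after the tabs; the tabs are not part of the match.
unanchored=$(printf '\t\t\tString\n' | grep -o '[\t]*String')
echo "unanchored match is ${#unanchored} characters long"

# With ANSI-C quoting the pattern contains a real tab, so the anchored
# match covers the three leading tabs as well as "String".
anchored=$(printf '\t\t\tString\n' | grep -o $'^[\t]*String')
echo "anchored ANSI-C match is ${#anchored} characters long"
```

The first match is just the six letters of String, which suggests the unanchored version only appeared to handle the tabs.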


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
