Regex question mark

I want to match part of a string shaped like this:

-TEXT-someMore-String

To get -TEXT-, I found that this works:

/-(.+?)-/ // -TEXT-

As far as I know, ? makes the preceding token optional, as in:

colou?r matches both colour and color

I initially wrote the regex to get the -TEXT- part like this:

/-(.+)-/

But it gave -TEXT-someMore-.

How does adding ? make the regex stop at the -TEXT- part? I thought ? only made the preceding token optional, as in the colou?r example above, rather than making the match stop at a certain point.

Here are the solutions:

Solution 1

As you say, ? sometimes means “zero or one”, but in your regex +? is a single unit meaning “one or more — and preferably as few as possible”. (This is in contrast to bare +, which means “one or more — and preferably as many as possible”.)
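
The difference is easy to see by running both patterns against the string from the question. The following is a minimal sketch in Python, which is an assumption since the question does not name a language; Python's standard re module treats + and +? the same way for this input.

```python
import re

subject = "-TEXT-someMore-String"

# Greedy: .+ takes as much as it can, then backtracks to the LAST "-".
greedy = re.search(r"-(.+)-", subject)

# Lazy: .+? takes as little as it can, so it stops at the FIRST following "-".
lazy = re.search(r"-(.+?)-", subject)

print(greedy.group(0))  # -TEXT-someMore-
print(lazy.group(0))    # -TEXT-
print(lazy.group(1))    # TEXT
```

Both patterns start matching at the same place; only the preferred length of the repeated part changes.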

As the documentation puts it:

However, if a quantifier is followed by a question mark,
then it becomes lazy, and instead matches the minimum
number of times possible, so the pattern /\*.*?\*/
does the right thing with the C comments. The meaning of the
various quantifiers is not otherwise changed, just the preferred
number of matches. Do not confuse this use of
question mark with its use as a quantifier in its own right.
Because it has two uses, it can sometimes appear doubled, as
in \d??\d which matches one digit by preference, but can match two if
that is the only way the rest of the pattern matches.
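
Both examples from the quoted documentation can be tried directly. This is a small sketch in Python (an assumed language; the sample strings are made up for illustration), whose re module follows the same lazy-quantifier rules for these patterns.

```python
import re

# Hypothetical sample input containing two C comments.
source = "int a; /* first */ int b; /* second */"

# Greedy .* runs to the last "*/", swallowing the code between the comments.
greedy_comments = re.findall(r"/\*.*\*/", source)

# Lazy .*? stops at the nearest "*/", so each comment is matched separately.
lazy_comments = re.findall(r"/\*.*?\*/", source)

# \d??\d: the first \d is optional (first ?) and lazy (second ?),
# so the pattern prefers to match a single digit.
one_digit = re.match(r"\d??\d", "12").group(0)

# Anchoring the end leaves no choice: the optional digit must be used.
two_digits = re.match(r"\d??\d$", "12").group(0)

print(greedy_comments)  # ['/* first */ int b; /* second */']
print(lazy_comments)    # ['/* first */', '/* second */']
print(one_digit)        # 1
print(two_digits)       # 12
```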

Solution 2

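Alternatively, in PCRE (for example, PHP's preg functions) you can use the U (ungreedy) modifier. It inverts the default greediness of every quantifier in the pattern, so each one prefers the shortest possible match unless it is followed by ?: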

/-(.+)-/U

Solution 3

? after a token is shorthand for {0,1}, which means zero or one occurrence of the preceding token.

But + is not a token; it is a quantifier, shorthand for {1,}: one or more occurrences, with no upper limit.

A ? after a quantifier switches that quantifier to non-greedy mode. In greedy mode, it matches as much of the string as possible. In non-greedy mode, it matches as little as possible.
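
The shorthand equivalences can be checked with a short sketch. Python is assumed here for illustration, and the word list is made up; the brace forms behave the same as ? and + in its re module.

```python
import re

# ? is shorthand for {0,1}: both patterns accept zero or one "u".
words = ["color", "colour", "colouur"]
with_mark = [bool(re.fullmatch(r"colou?r", w)) for w in words]
with_braces = [bool(re.fullmatch(r"colou{0,1}r", w)) for w in words]

subject = "-TEXT-someMore-String"

# + is shorthand for {1,}; a trailing ? makes either form non-greedy.
greedy_braces = re.search(r"-(.{1,})-", subject).group(0)
lazy_braces = re.search(r"-(.{1,}?)-", subject).group(0)

print(with_mark)      # [True, True, False]
print(with_braces)    # [True, True, False]
print(greedy_braces)  # -TEXT-someMore-
print(lazy_braces)    # -TEXT-
```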

Solution 4

Another point, and perhaps the underlying error in your regex, is that you try to match a run of arbitrary characters via .+?. What you really want is probably "any character except -". You can get that via [^-]+. In this case, it doesn't matter whether the match is greedy or not: the repeated part cannot cross a "-", so it always ends at the second "-" in your string.
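
A brief sketch of the negated character class, again assuming Python purely for illustration, shows that greedy and lazy forms give the same result:

```python
import re

subject = "-TEXT-someMore-String"

# [^-]+ can never consume a "-", so the match must end at the next "-".
greedy_class = re.search(r"-([^-]+)-", subject)
lazy_class = re.search(r"-([^-]+?)-", subject)

print(greedy_class.group(0))  # -TEXT-
print(lazy_class.group(0))    # -TEXT-
print(greedy_class.group(1))  # TEXT
```

Because the class itself excludes the delimiter, the pattern states the intent directly and does not rely on the quantifier's greediness.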

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
