Regular expression to match a line that doesn’t contain a word?

Asked on December 14, 2018 in Regex.


  • 3 Answer(s)

    Here is a solution:

    The notion that regex doesn’t support inverse matching is not entirely true. You can mimic this behavior by using negative look-arounds:

    ^((?!hede).)*$
    

    The regex above will match any string, or any line without a line break, that does not contain the (sub)string ‘hede’. As mentioned, this is not something regex is “good” at (or should do), but it is still possible.
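
As a rough illustration of the pattern above, here is a minimal Python sketch. The sample lines are hypothetical, and Python's re module is only one of many engines that support negative look-ahead:

```python
import re

# Pattern from the answer: no character in the line may begin the substring "hede".
NO_HEDE = re.compile(r"^((?!hede).)*$")

# Hypothetical sample lines, chosen only for illustration.
lines = ["hoho", "hihi", "haha", "hede", "ABhedeCD", ""]

# Keep only the lines the pattern accepts.
kept = [line for line in lines if NO_HEDE.match(line)]
print(kept)
```

Note that the empty line is accepted too, since the group may repeat zero times.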

    If you also need to match line break chars, use the DOT-ALL modifier (the trailing s in the pattern below):

    /^((?!hede).)*$/s
    

    or use it inline:

    /(?s)^((?!hede).)*$/
    

    where the /…/ are the regex delimiters, i.e., not part of the pattern.

    If the DOT-ALL modifier is not available, you can mimic the same behavior with the character class [\s\S]:

    /^((?!hede)[\s\S])*$/
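
The three variants can be compared in a short sketch. It assumes Python, where the re.DOTALL flag plays the role of the trailing s; the two input strings are made up for the example:

```python
import re

text = "first line\nsecond line"      # hypothetical input, no "hede" anywhere
bad = "first line\nsecond hede line"  # hypothetical input, "hede" on the second line

plain = re.compile(r"^((?!hede).)*$")              # dot stops at the line break
dotall = re.compile(r"^((?!hede).)*$", re.DOTALL)  # counterpart of the trailing s
inline = re.compile(r"(?s)^((?!hede).)*$")         # inline modifier
klass = re.compile(r"^((?!hede)[\s\S])*$")         # character-class fallback

# For each variant: (accepts text, accepts bad)
results = {
    "plain": (bool(plain.match(text)), bool(plain.match(bad))),
    "dotall": (bool(dotall.match(text)), bool(dotall.match(bad))),
    "inline": (bool(inline.match(text)), bool(inline.match(bad))),
    "klass": (bool(klass.match(text)), bool(klass.match(bad))),
}
print(results)
```

Without DOT-ALL the pattern rejects both inputs, because the dot cannot cross the line break and the end anchor is never reached. The other three variants accept the first input and reject the second.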
    

    Explanation:
    A string is just a list of n characters. Before, and after each character, there’s an empty string. So a list of n characters will have n+1 empty strings. Consider the string “ABhedeCD”.

        ┌──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┐
    S = │e1│ A │e2│ B │e3│ h │e4│ e │e5│ d │e6│ e │e7│ C │e8│ D │e9│
        └──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┘
    index    0      1      2      3      4      5      6      7
    

    Here, the e’s are the empty strings. The regex (?!hede). looks ahead to see if there’s no substring “hede” to be seen, and if that is the case (so something else is seen), then the . (dot) will match any character except a line break. Look-arounds are also called zero-width assertions because they do not consume any characters.

    So, every empty string is first validated to see if there’s no “hede” up ahead, before a character is consumed by the . (dot). The regex (?!hede). will do that only once, so it is wrapped in a group and repeated zero or more times: ((?!hede).)*. Finally, the start and end of input are anchored to make sure the entire input is consumed: ^((?!hede).)*$

    The input “ABhedeCD” will fail because on e3, the regex (?!hede) fails (there is “hede” up ahead!).
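
The walk-through can be reproduced with a small sketch, assuming Python. It tests the bare look-ahead at each position in front of a character (e1 through e8 in the diagram) and reports where it fails:

```python
import re

s = "ABhedeCD"
lookahead = re.compile(r"(?!hede)")

# checks[i] is True when the look-ahead succeeds at the empty string before s[i].
checks = [lookahead.match(s, i) is not None for i in range(len(s))]

for i, ok in enumerate(checks):
    print(f"e{i + 1} (before {s[i]!r}): {'ok' if ok else 'fails'}")

# Index of the first position where "hede" is seen up ahead.
first_failure = checks.index(False)
```

The only failing position is index 2, which corresponds to e3. Since the repeated group cannot get past it, the anchored pattern rejects the whole input.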

    Answered on December 14, 2018.

    Sample code:

    Note that the solution to “does not start with hede”:

    ^(?!hede).*$
    

    is generally much more efficient than the solution to “does not contain hede”:

    ^((?!hede).)*$
    

    The former checks for “hede” only at the input string’s first position, rather than at every position.
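
The two patterns also answer different questions, which a brief sketch can make concrete. It assumes Python and uses made-up sample strings; it compares behavior only and does not measure speed:

```python
import re

starts_with = re.compile(r"^(?!hede).*$")  # "does not start with hede"
contains = re.compile(r"^((?!hede).)*$")   # "does not contain hede"

samples = ["hedeAB", "ABhedeCD", "ABCD"]   # hypothetical inputs

# For each sample: (accepted by starts_with, accepted by contains)
table = {s: (bool(starts_with.match(s)), bool(contains.match(s))) for s in samples}
print(table)
```

Only “ABhedeCD” separates the two: it does not start with “hede”, so the first pattern accepts it, while the second rejects it.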

    Answered on December 14, 2018.

    Try this:

    If you’re just using it for grep, you can use grep -v hede to get all lines which do not contain hede.

    ETA: Oh, rereading the question, grep -v is probably what you meant by “tools options”.
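
The same idea, inverting the test outside the regex, can be sketched in Python. The function name lines_without is hypothetical, and unlike grep, which treats its argument as a pattern, this sketch treats the word as a literal substring:

```python
def lines_without(lines, word):
    """Return the lines that do not contain `word` as a literal substring."""
    return [line for line in lines if word not in line]


# Hypothetical input lines, for illustration only.
sample = ["hoho", "hede", "ABhedeCD", "haha"]
filtered = lines_without(sample, "hede")
print(filtered)
```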

    Answered on December 14, 2018.

