How to match two consecutive words with an egrep regex?

I would like a regular expression that finds a word repeated twice in a row,
for example:

cancan
bonbon
chichi

You can try the following regular expression to match a word repeated twice in a row with no space in between:

grep -E '\b(\w+)\1\b' filename

To match two consecutive identical words separated by whitespace:

grep -E '\b(\w+)\s+\1\b' filename

How does it work?

The regular expressions above match any two consecutive words that are the same.

  • \b : Matches a word boundary.
  • (\w+) : Captures one word (one or more word characters).
  • \s+ : Matches one or more whitespace characters.
  • \1 : Refers back to the first captured word.
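
Here is a small sketch you can paste into a shell to see both patterns in action. It assumes GNU grep, because \b, \w, \s and back-references such as \1 under -E are GNU extensions and are not guaranteed by POSIX ERE. The sample lines and the variable names (sample, adjacent, spaced) are made up for illustration.

```shell
# Made-up sample input: some lines contain a doubled word, some do not.
sample=$(printf '%s\n' 'cancan' 'bonbon' 'banana' 'this is is a test' 'nothing repeated here')

# Word doubled with no space between the two copies.
adjacent=$(printf '%s\n' "$sample" | grep -E '\b(\w+)\1\b')

# Word doubled with whitespace between the two copies.
spaced=$(printf '%s\n' "$sample" | grep -E '\b(\w+)\s+\1\b')

printf 'adjacent:\n%s\n' "$adjacent"
printf 'spaced:\n%s\n' "$spaced"
```

With this input the first pattern should select cancan and bonbon, and the second should select only the line containing "is is". The leading \b matters in the second pattern: without it, the "is" at the end of "this" could pair with the following "is".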

See also

https://www.cyberciti.biz/faq/howto-use-grep-command-in-linux-unix/
And
https://www.cyberciti.biz/faq/grep-regular-expressions/

Where can I learn about \s+, \1 and \w+? They are not mentioned in the linked tutorials.

Will update the page.

Could I do egrep "(\w+)(\w+)" instead of egrep "(\w+)\1"?
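
Probably not: the two patterns mean different things. In (\w+)(\w+) each group captures independently, so it matches any run of two or more word characters, while \1 requires the same text that the first group captured to appear again. A quick way to compare them, again assuming GNU grep (the sample words and variable names are only for illustration):

```shell
# Made-up sample words, one per line.
words=$(printf '%s\n' 'cancan' 'banana' 'ab' 'a')

# Two independent groups: any line with at least two word characters matches.
two_groups=$(printf '%s\n' "$words" | grep -E '(\w+)(\w+)')

# Back-reference: the second half must repeat the first half exactly.
backref=$(printf '%s\n' "$words" | grep -E '\b(\w+)\1\b')

printf 'two groups:\n%s\n' "$two_groups"
printf 'back-reference:\n%s\n' "$backref"
```

Here the two-group pattern should select cancan, banana and ab, whereas the back-reference pattern should select only cancan.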