Problem matching word boundaries (\b) at the end of a string #385

Description

@nsmmrs

Describe the bug
When using regex matching, \b doesn't seem to match a word boundary that falls at the very end of the string being matched.

Reproducing
See example 5 here: https://bit.ly/4hpMYXb

Example 6 demonstrates that word-boundaries at the beginning of a string match just fine.

Removing the second word-boundary pattern fixes example 5, but breaks example 6:
https://bit.ly/41OGBqG

Expected behavior
I've tried a few different engines, and they all seem to treat the end of the string as a word boundary when the last character is a word character. For example, Ruby:

"abc 123 xyz".scan(/\b\w+\b/) #=> ["abc", "123", "xyz"]

Also, not sure if this is analogous, but given a file with no trailing newline:

IO.read("regex-test.txt") #=> "abc 123 xyz"

rg gives this output:

rg '\b\w+\b' regex-test.txt -o
1:abc
1:123
1:xyz
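
To make the expectation explicit, here is a minimal Ruby sketch that checks the end-of-string case directly. The variable names (`text`, `words`, `ends_on_boundary`, `last_word`) are only illustrative, and this shows Ruby's behavior, not this project's engine:

```ruby
# Same sample string as above, with no trailing newline.
text = "abc 123 xyz"

# The pattern from the report: every word delimited by \b on both sides.
words = text.scan(/\b\w+\b/)

# \z anchors at the very end of the string, so this asks whether
# a word boundary exists exactly there.
ends_on_boundary = text.match?(/\b\z/)

# The last word should still match when the closing \b coincides with the end.
last_word = text.match(/\b\w+\b\z/)

p words                                  # ["abc", "123", "xyz"]
p ends_on_boundary                       # true
p last_word[0]                           # "xyz"
p last_word.end(0) == text.length        # true
```

If \b were not honored at the end of the string, the last two checks would fail and "xyz" would be missing from `words`.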
