
Regex - Remove Space Between Two Punctuation Marks But Not Between Punctuation Mark And Letter

I have the following regex for removing spaces between punctuation marks: re.sub(r'\s*(\W)\s*', r'\1', s). It works fine in almost all of my test cases, except for this one: Thi

Solution 1:

This should work:

import re

text = 'This is! ? a test! ?'  # named `text` so the built-in `str` is not shadowed
res = re.sub(r'(?<=[?!])\s+(?=[?!])', '', text)
print(res)

Output:

This is!? a test!?

Explanation:

(?<=[?!])   # positive lookbehind: a punctuation mark must come right before (add any other marks you want to check to the class)
\s+         # 1 or more whitespace characters
(?=[?!])    # positive lookahead: a punctuation mark must come right after
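
If you would rather not list the marks by hand, one possible way is to build the character class from string.punctuation. This is only a sketch: the names PUNCT, PUNCT_GAP and squeeze_punctuation are made up for illustration, and it covers ASCII punctuation only.

```python
import re
import string

# Character class with every ASCII punctuation mark; re.escape keeps
# characters such as ], ^, - and \ literal inside the class.
PUNCT = "[" + re.escape(string.punctuation) + "]"

# Whitespace that has a punctuation mark on both sides.
PUNCT_GAP = re.compile(r"(?<=" + PUNCT + r")\s+(?=" + PUNCT + r")")


def squeeze_punctuation(text):
    """Remove whitespace between two punctuation marks (hypothetical helper)."""
    return PUNCT_GAP.sub("", text)


print(squeeze_punctuation("This is! ? a test! ?"))  # This is!? a test!?
print(squeeze_punctuation("Hello , world ; ) !"))   # Hello , world ;)!
```

The space before the comma is kept in the second call because it has a letter on its left, which is the behaviour the question asks for.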

Solution 2:

Try this:

import re

string = "This is! ? a test! ?"
string = re.sub(r"(\W)\s*(\W)", r"\1\2", string)
print(string)

Output:

This is!? a test!?
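
One caveat with this pattern: \W also matches whitespace, and both marks are consumed by the match. That can change text outside the intended case. The comparison below is a small illustration only; consuming_sub and lookaround_sub are hypothetical names, and the second function reuses the lookaround pattern from Solution 1.

```python
import re


def consuming_sub(text):
    # Pattern from this solution: both neighbours are captured and consumed.
    return re.sub(r"(\W)\s*(\W)", r"\1\2", text)


def lookaround_sub(text):
    # Pattern from Solution 1: only the whitespace itself is matched.
    return re.sub(r"(?<=[?!])\s+(?=[?!])", "", text)


# The second \W can match a space, so a double space after "!" is collapsed.
print(consuming_sub("Wow!  ok"))   # Wow! ok
print(lookaround_sub("Wow!  ok"))  # Wow!  ok

# The "?" is consumed by the first match, so the gap after it is not removed.
print(consuming_sub("! ? !"))      # !? !
print(lookaround_sub("! ? !"))     # !?!
```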

Solution 3:

To match a punctuation char with a regex in Python, you may use the (?:[^\w\s]|_) pattern. It matches any char other than a letter, digit or whitespace; the |_ branch is needed because \w also matches the underscore, so [^\w\s] alone would skip it.

So, you need to match one or more whitespace chars (\s+) that are immediately preceded by a punctuation char ((?<=[^\w\s]|_)) and immediately followed by such a char ((?=[^\w\s]|_)):

(?<=[^\w\s]|_)\s+(?=[^\w\s]|_)


Python demo:

import re
text = "This is! ? a test! ?"
print(re.sub(r"(?<=[^\w\s]|_)\s+(?=[^\w\s]|_)", "", text))
# => This is!? a test!?
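
To see what the |_ branch changes, here is a small check on a few extra inputs. The sample strings and the names PUNCT_GAP and results are made up for illustration; treat it as a sketch rather than a full test suite.

```python
import re

# Same pattern as above, compiled once for reuse.
PUNCT_GAP = re.compile(r"(?<=[^\w\s]|_)\s+(?=[^\w\s]|_)")

samples = [
    "This is! ? a test! ?",  # gaps between punctuation marks are removed
    "snake_ ! case",         # underscore counts as punctuation here
    "Hi! there",             # punctuation followed by a letter: untouched
]

results = [PUNCT_GAP.sub("", s) for s in samples]
for before, after in zip(samples, results):
    print(repr(before), "->", repr(after))
```

The second sample becomes 'snake_! case': the space after the underscore is removed, while the space before the letter c stays.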

Solution 4:

Another option is to use the PyPI regex module, which supports \p{Punct}. Put it inside positive lookarounds to match the punctuation marks.


For example

import regex

pattern = r"(?<=\p{Punct})\s+(?=\p{Punct})"
s = 'This is! ? a test! ?'
print(regex.sub(pattern, '', s))

Output

This is!? a test!?

Note that \s could also match a newline. You could also use [^\S\r\n] to match a whitespace char except newlines.
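
The difference shows up when a line break sits between two marks. The sketch below uses the standard re module with the [?!] class from Solution 1 so it runs without extra installs; the function names and the sample text are assumptions for illustration.

```python
import re


def strip_any_whitespace(text):
    # \s+ also matches \n, so a line break between two marks is removed.
    return re.sub(r"(?<=[?!])\s+(?=[?!])", "", text)


def strip_inline_whitespace(text):
    # [^\S\r\n] matches whitespace except \r and \n, so line breaks survive.
    return re.sub(r"(?<=[?!])[^\S\r\n]+(?=[?!])", "", text)


sample = "one!\n? two! ?"
print(repr(strip_any_whitespace(sample)))     # 'one!? two!?'
print(repr(strip_inline_whitespace(sample)))  # 'one!\n? two!?'
```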
