Regex - Remove Space Between Two Punctuation Marks But Not Between Punctuation Mark And Letter
I have the following regex for removing spaces between punctuation marks: re.sub(r'\s*(\W)\s*', r'\1', s). It works fine in almost all of my test cases, except for this one: Thi
Solution 1:
This should work:
import re
# named "text" rather than "str" so the built-in str type is not shadowed
text = 'This is! ? a test! ?'
res = re.sub(r'(?<=[?!])\s+(?=[?!])', '', text)
print(res)
Output:
This is!? a test!?
Explanation:
(?<=[?!]) # positive lookbehind: a punctuation mark must come immediately before (add any other marks you want to check to the class)
\s+ # 1 or more whitespace characters
(?=[?!]) # positive lookahead: a punctuation mark must come immediately after
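As a minimal sketch of "adding all the punctuation you want to check", the character class can be built from Python's string.punctuation. This assumes that "punctuation" means the ASCII punctuation set; the helper name squeeze_punctuation is hypothetical and not part of the original answer.

```python
import re
import string

# Assumption: "punctuation" means the ASCII characters in string.punctuation.
# re.escape makes characters such as ], \ and ^ safe inside the class.
punct_class = "[" + re.escape(string.punctuation) + "]"

# Same lookbehind/lookahead structure as above, with a wider character class.
pattern = re.compile(rf"(?<={punct_class})\s+(?={punct_class})")

def squeeze_punctuation(text):
    """Remove whitespace that sits directly between two punctuation marks."""
    return pattern.sub("", text)

print(squeeze_punctuation("This is! ? a test! ?"))  # This is!? a test!?
print(squeeze_punctuation("Wait , what ? !"))       # Wait , what ?!
```

Spaces between a letter and a punctuation mark are left alone because one of the two lookarounds fails there.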
Solution 2:
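Try this (note that \W also matches whitespace, so this pattern can also collapse a double space that follows a punctuation mark):
import re
string = "This is! ? a test! ?"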
string = re.sub(r"(\W)\s*(\W)", r"\1\2", string)
print(string)
Output:
This is!? a test!?
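The capture-group pattern consumes both punctuation marks, while a lookaround pattern consumes only the whitespace. The sketch below compares the two on a made-up sample string (an assumption, not one of the asker's test cases) that has three marks in a row and a double space after a period.

```python
import re

# Hypothetical sample: three punctuation marks in a row, then ".  " (two spaces).
sample = "Really! ? ! Yes.  Next"

# Capture-group version: each match consumes both neighbouring characters,
# so the "?" cannot be reused as the left side of the next match.
consuming = re.sub(r"(\W)\s*(\W)", r"\1\2", sample)

# Lookaround version: only the whitespace itself is consumed.
lookaround = re.sub(r"(?<=[?!])\s+(?=[?!])", "", sample)

print(consuming)   # Really!? ! Yes. Next
print(lookaround)  # Really!?! Yes.  Next
```

The first result keeps a space before the last "!" and shortens the double space, because \W matches a space as well as a punctuation mark.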
Solution 3:
In order to match a punctuation char with a regex in Python, you may use the (?:[^\w\s]|_) pattern. It matches any char except a letter, digit, or whitespace.
So, you need to match one or more whitespace chars (\s+) that are immediately preceded by a punctuation char ((?<=[^\w\s]|_)) and immediately followed by such a char ((?=[^\w\s]|_)):
(?<=[^\w\s]|_)\s+(?=[^\w\s]|_)
import re
text = "This is! ? a test! ?"
print(re.sub(r"(?<=[^\w\s]|_)\s+(?=[^\w\s]|_)", "", text))
# => This is!? a test!?
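The |_ alternative is there because \w includes the underscore, so [^\w\s] on its own never matches it. A small sketch of the difference, using a made-up sample string (an assumption for illustration):

```python
import re

# [^\w\s] alone: the underscore is a word character, so it is not "punctuation".
without_underscore = re.compile(r"(?<=[^\w\s])\s+(?=[^\w\s])")

# With |_ the underscore is treated as punctuation as well. Both alternatives
# are one character wide, which Python's re requires inside a lookbehind.
with_underscore = re.compile(r"(?<=[^\w\s]|_)\s+(?=[^\w\s]|_)")

sample = "snake _ ! case"  # hypothetical input

print(without_underscore.sub("", sample))  # snake _ ! case
print(with_underscore.sub("", sample))     # snake _! case
```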
Solution 4:
Another option is to use the PyPI regex module, with \p{Punct} inside positive lookarounds to match the punctuation marks.
For example:
import regex
pattern = r"(?<=\p{Punct})\s+(?=\p{Punct})"
s = 'This is! ? a test! ?'
print(regex.sub(pattern, '', s))
Output
This is!? a test!?
Note that \s could also match a newline. You could use [^\S\r\n] instead to match a whitespace char except newlines.
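To illustrate the newline point, here is a rough sketch using the standard re module and a [?!] class instead of regex and \p{Punct}, so that it runs without extra installs. The sample text is made up for the example.

```python
import re

# Hypothetical input: the first "!" and "?" are separated by a space and a newline.
text = "Really! \n? Yes! ?"

# \s+ also matches the newline, so the line break between "!" and "?" is removed.
any_ws = re.sub(r"(?<=[?!])\s+(?=[?!])", "", text)

# [^\S\r\n]+ matches whitespace except \r and \n, so the line break is kept.
same_line = re.sub(r"(?<=[?!])[^\S\r\n]+(?=[?!])", "", text)

print(repr(any_ws))     # 'Really!? Yes!?'
print(repr(same_line))  # 'Really! \n? Yes!?'
```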