Replace With Multi Line Regex
Given the following text, I want to remove everything in data_augmentation_options{random_horizontal_flip {..}} (... means other text in the following) i.e., input is : ... bat
Solution 1:
You are only matching a single line instead of all the lines.
You can repeat the lines that have the format keypoint_flip_permutation: \d+ and then match the 2 closing curly braces.
Note that you don't need re.S, as there is no dot in the pattern.
data_augmentation_options {\s+random_horizontal_flip\s+{(?:\s+keypoint_flip_permutation: \d+)+\s*}\s*}\s*
Explanation
- data_augmentation_options {
  Match literally
- \s+random_horizontal_flip\s+
  Match the starting line
- {
  Match literally
- (?:
  Non-capture group
- \s+keypoint_flip_permutation: \d+
  Match the string followed by 1+ digits
- )+
  Close the group and repeat it 1+ times
- \s*}
  Match optional whitespace chars and }
- \s*}
  Match optional whitespace chars and }
- \s*
  Match optional whitespace chars
If you want to remove only the trailing newline, you can match \r?\n at the end instead of \s*.
For example:
print(re.sub(r"data_augmentation_options {\s+random_horizontal_flip\s+{(?:\s+keypoint_flip_permutation: \d+)+\s*}\s*}\s*", "", s))
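To see the difference between ending the pattern with \s* and with \r?\n, here is a minimal sketch. The question's full input is elided, so SAMPLE below is a made-up config written only for illustration, and the names BODY, GREEDY_TAIL and ONE_NEWLINE are hypothetical helpers, not part of the original answer.

```python
import re

# Hypothetical input: the real config from the question is elided,
# so this text is an assumption. Note the blank line after the first block.
SAMPLE = """batch_size: 4
num_steps: 30
data_augmentation_options {
  random_horizontal_flip {
    keypoint_flip_permutation: 1
    keypoint_flip_permutation: 0
    keypoint_flip_permutation: 12
  }
}

data_augmentation_options {
  random_crop_image {
    min_aspect_ratio: 0.5
  }
}
"""

# The block itself, up to and including the second closing brace.
BODY = (
    r"data_augmentation_options {\s+random_horizontal_flip\s+{"
    r"(?:\s+keypoint_flip_permutation: \d+)+\s*}\s*}"
)

# Variant from the answer: also swallow all whitespace after the block.
GREEDY_TAIL = re.compile(BODY + r"\s*")
# Alternative from the answer: swallow only one trailing newline.
ONE_NEWLINE = re.compile(BODY + r"\r?\n")

greedy_result = GREEDY_TAIL.sub("", SAMPLE)
one_newline_result = ONE_NEWLINE.sub("", SAMPLE)

if __name__ == "__main__":
    print(greedy_result)
    print("-----")
    print(one_newline_result)
```

With this sample, the \s* variant also removes the blank line that follows the block, while the \r?\n variant leaves it in place.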
Solution 2:
A few modifications to your regex make it work:
data_augmentation_options {\s+random_horizontal_flip\s+{(\s+keypoint_flip_permutation:\s\d+\s)+\s+}\s+}
- replace [\s] by just \s, which is equivalent
- put the \s+ inside the capture group ()
- replace \d by \d+ to match multi-digit numbers
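Two of these points can be checked in isolation. The snippet below is only a sketch: the variable line is a hypothetical single config line with a two-digit value, chosen to show why \d alone is not enough.

```python
import re

# Hypothetical line with a two-digit value (an assumption for illustration).
line = "keypoint_flip_permutation: 12"

# \d matches exactly one digit, so the whole line cannot be matched.
single_digit = re.fullmatch(r"keypoint_flip_permutation:\s\d", line)
# \d+ matches one or more digits, so the whole line matches.
multi_digit = re.fullmatch(r"keypoint_flip_permutation:\s\d+", line)

# [\s] is a character class containing only \s, so it matches
# the same characters as a bare \s.
text = "a b\tc\n"
bracketed = re.findall(r"[\s]", text)
plain = re.findall(r"\s", text)

if __name__ == "__main__":
    print(single_digit)             # None
    print(multi_digit is not None)  # True
    print(bracketed == plain)       # True
```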
Solution 3:
You can use a lookahead to stop the deletion at the second pattern:
>>> print(re.sub(r'^[ \t]*data_augmentation_options[\s\S]+?(?=^[ \t]*data_augmentation_options)', '\n\n', s, flags=re.M))
batch_size: 4
num_steps: 30
data_augmentation_options {
random_crop_image {
min_aspect_ratio: 0.5
max_aspect_ratio: 1.7
random_coef: 0.25
}
}
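The (?=^[ \t]*data_augmentation_options) is the lookahead. It stops the lazy [\s\S]+? at the start of the next data_augmentation_options line without consuming that line.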
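A self-contained sketch of this approach follows. The question's full input is elided, so SAMPLE is a made-up config used only for illustration; the names SAMPLE, PATTERN and result are hypothetical.

```python
import re

# Hypothetical input: the real config from the question is elided,
# so this text is an assumption.
SAMPLE = """batch_size: 4
num_steps: 30
data_augmentation_options {
  random_horizontal_flip {
    keypoint_flip_permutation: 1
    keypoint_flip_permutation: 0
  }
}
data_augmentation_options {
  random_crop_image {
    min_aspect_ratio: 0.5
    max_aspect_ratio: 1.7
    random_coef: 0.25
  }
}
"""

# With re.M, ^ matches at the start of every line. The lazy [\s\S]+?
# grows one character at a time until the lookahead sees the start of
# the next data_augmentation_options line.
PATTERN = r'^[ \t]*data_augmentation_options[\s\S]+?(?=^[ \t]*data_augmentation_options)'

result = re.sub(PATTERN, '\n\n', SAMPLE, flags=re.M)

if __name__ == "__main__":
    print(result)
```

Note that this pattern never looks at the block's contents: it removes any data_augmentation_options block that is followed by another one, so it relies on random_horizontal_flip being in a block that comes before another block. The last block is left alone because no lookahead target follows it.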