Replace With Multi Line Regex
Given the following text, I want to remove everything in data_augmentation_options { random_horizontal_flip {..} } (where ... stands for other text), i.e., the input is: ... bat
Solution 1:
You are only matching a single line instead of all the lines.
You can repeat the lines of the form keypoint_flip_permutation: \d+ and then match the two closing curly braces.
Note that you don't need re.S as there is no dot in the pattern.
data_augmentation_options {\s+random_horizontal_flip\s+{(?:\s+keypoint_flip_permutation: \d+)+\s*}\s*}\s*
Explanation
- `data_augmentation_options {` Match literally
- `\s+random_horizontal_flip\s+` Match the starting line, surrounded by 1+ whitespace chars
- `{` Match literally
- `(?:` Non-capturing group
- `\s+keypoint_flip_permutation: \d+` Match 1+ whitespace chars, then the string followed by 1+ digits
- `)+` Close the group and repeat it 1+ times
- `\s*}` Match optional whitespace chars and }
- `\s*}` Match optional whitespace chars and }
- `\s*` Match optional whitespace chars
If you want to remove only the trailing newline, you can match \r?\n at the end instead of \s*.
For example:
print(re.sub(r"data_augmentation_options {\s+random_horizontal_flip\s+{(?:\s+keypoint_flip_permutation: \d+)+\s*}\s*}\s*", "", s))
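To make the difference between the two endings concrete, here is a minimal self-contained sketch. The sample string s is hypothetical (the original input is not fully shown in the question), and the keypoint values are made up; only the pattern comes from the answer above.

```python
import re

# Hypothetical sample input; the keypoint values are made up for illustration.
s = (
    "batch_size: 4\n"
    "data_augmentation_options {\n"
    "  random_horizontal_flip {\n"
    "    keypoint_flip_permutation: 0\n"
    "    keypoint_flip_permutation: 2\n"
    "    keypoint_flip_permutation: 1\n"
    "  }\n"
    "}\n"
    "\n"
    "num_steps: 30\n"
)

# The pattern from the answer, without its trailing part.
block = (
    r"data_augmentation_options {\s+random_horizontal_flip\s+{"
    r"(?:\s+keypoint_flip_permutation: \d+)+\s*}\s*}"
)

# Trailing \s* also swallows the blank line that follows the block.
greedy = re.sub(block + r"\s*", "", s)

# Trailing \r?\n removes only the newline that ends the block.
newline_only = re.sub(block + r"\r?\n", "", s)

print(repr(greedy))
print(repr(newline_only))
```

With this input, the \s* version joins batch_size and num_steps on consecutive lines, while the \r?\n version keeps the blank line between them.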
Solution 2:
A few modifications to your regex give:
data_augmentation_options {\s+random_horizontal_flip\s+{(\s+keypoint_flip_permutation:\s\d+\s)+\s+}\s+}
- replace `[\s]` by just `\s`, which is equivalent
- put the `\s+` inside the capture group `()`
- replace `\d` by `\d+` to match multi-digit numbers
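A short sketch of how this pattern behaves, using a hypothetical input (the keypoint values are made up). One thing worth checking on your own data: because the group both starts with \s+ and ends with a mandatory \s, the pattern seems to need at least two whitespace characters between consecutive lines, so it matches indented text but not text with the indentation stripped.

```python
import re

pattern = re.compile(
    r"data_augmentation_options {\s+random_horizontal_flip\s+{"
    r"(\s+keypoint_flip_permutation:\s\d+\s)+\s+}\s+}"
)

# Hypothetical indented input, as a config file would normally look.
indented = (
    "data_augmentation_options {\n"
    "  random_horizontal_flip {\n"
    "    keypoint_flip_permutation: 0\n"
    "    keypoint_flip_permutation: 12\n"
    "  }\n"
    "}\n"
)

# Same content with the leading indentation removed from every line.
flat = "\n".join(line.strip() for line in indented.splitlines()) + "\n"

# The indented block is removed; the final newline is not part of the match.
after_indented = pattern.sub("", indented)

# Without indentation there is only one whitespace char between lines,
# so the pattern does not match at all.
flat_match = pattern.search(flat)

print(repr(after_indented))
print(flat_match)
```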
Solution 3:
You can use a lookahead to stop the deletion at the second pattern:
>>> print(re.sub(r'^[ \t]*data_augmentation_options[\s\S]+?(?=^[ \t]*data_augmentation_options)', '\n\n', s, flags=re.M))
batch_size: 4
num_steps: 30
data_augmentation_options {
random_crop_image {
min_aspect_ratio: 0.5
max_aspect_ratio: 1.7
random_coef: 0.25
}
}
The (?=^[ \t]*data_augmentation_options) is the lookahead.
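The lookahead approach can be tried with the sketch below. The input is hypothetical, and the replacement is an empty string rather than the '\n\n' used above, purely to keep the result easy to compare. Note that this pattern removes any data_augmentation_options block that is followed by another one, whatever it contains, and leaves a block alone when no further block follows it.

```python
import re

pattern = re.compile(
    r"^[ \t]*data_augmentation_options[\s\S]+?"
    r"(?=^[ \t]*data_augmentation_options)",
    re.M,
)

# Hypothetical blocks; the values are made up for illustration.
flip_block = (
    "data_augmentation_options {\n"
    "  random_horizontal_flip {\n"
    "    keypoint_flip_permutation: 0\n"
    "  }\n"
    "}\n"
)
crop_block = (
    "data_augmentation_options {\n"
    "  random_crop_image {\n"
    "    min_aspect_ratio: 0.5\n"
    "  }\n"
    "}\n"
)
header = "batch_size: 4\nnum_steps: 30\n"

# The flip block is followed by another block, so the lookahead succeeds.
removed = pattern.sub("", header + flip_block + crop_block)

# Here the flip block is last: the lookahead never matches, nothing changes.
unchanged = pattern.sub("", header + flip_block)

print(removed)
```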