Finding Gappy Sublists Within A Larger List
Solution 1:
If all that you care about is that the words appear in order somehwere in the array, you can use a collections.deque
and popleft
to iterate through the list, and if the deque
is emptied, you have found a valid match:
from collections import deque
deffind_gappy(arr, m):
dq = deque(m)
for word in arr:
if word == dq[0]:
dq.popleft()
ifnot dq:
returnTruereturnFalse
By comparing each word
in arr
with the first element of dq
, we know that when we find a match, it has been found in the correct order, and then we popleft
, so we now are comparing with the next element in the deque
.
To filter your initial list, you can use a simple list comprehension that filters based on the result of find_gappy
:
matches= ['she', 'is', 'student']
x = [i for i in x if find_gappy(i, matches)]
# [['she', 'is', 'a', 'student'], ['she', 'is', 'a', 'great', 'student'], ['she', 'is', 'a', 'very', 'very', 'exceptionally', 'good', 'student']]
Solution 2:
You can compare two lists, with a function like this one. The way it works is it loops through your shorter list, and every time it finds the next word in the long list, cuts off the first part of the longer list at that point. If it can't find the word it returns false.
defis_sub_sequence(long_list, short_list):
for word in short_list:
if word in long_list:
i = long_list.index(word)
long_list = long_list[i+1:]
else:
returnFalsereturnTrue
Now you have a function to tell you if the list is the desired type, you can filter out all the lists you need from the 'list of lists' using a list comprehension like the following:
a = [['she', 'is', 'a', 'student'],
['she', 'is', 'a', 'lawer'],
['she', 'is', 'a', 'great', 'student'],
['i', 'am', 'a', 'teacher'],
['she', 'is', 'a', 'very', 'very', 'exceptionally', 'good', 'student']]
b = ['she', 'is', 'student']
filtered = [x for x in a if is_sub_sequence(x,b)]
The list filtered
will include only the lists of the desired type.
Post a Comment for "Finding Gappy Sublists Within A Larger List"