Split String From Beautifulsoup Output In A List
I have the following output from my code Code: text = soup.get_text() Output: Article Title Some text: Text blurb. More blurb. Even more blurb. Some more blurb. Seco
Solution 1:
If you take a look at your text, you want to split on the repeated newlines (\n) in text:
>>'Article Title\n\n Some text: Text blurb.\n\nMore blurb.\n\nEven more blurb. \n\nSome more blurb. \n\n\n\n\n\nSecond Article Title\n\nSome text: Text blurb.\n\nMore blurb.\n\nEven more blurb. \n\nSome more blurb. '
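You can then pass an explicit separator to text.split('\n\n\n\n\n'). If you don't pass a separator, Python splits on any run of whitespace. After this first split, you can split each resulting element on \n\n. Note that the gap between the two articles here is six newlines, so splitting on five leaves a leading \n on the second title: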
[i.split('\n\n') for i in text.split('\n\n\n\n\n')]
>>[['Article Title',
' Some text: Text blurb.',
'More blurb.',
'Even more blurb. ',
'Some more blurb. '],
['\nSecond Article Title',
'Some text: Text blurb.',
'More blurb.',
'Even more blurb. ',
'Some more blurb. ']]
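If the number of blank lines between articles is not always the same, a regular expression may be more forgiving than a fixed separator. The following is a minimal sketch, assuming that three or more consecutive newlines always mark an article boundary and exactly two separate the paragraphs inside one article. The helper name split_articles is hypothetical and not part of the original answer; the sample text is the string shown above.

```python
import re

# Sample text copied from the output shown above.
text = ('Article Title\n\n Some text: Text blurb.\n\nMore blurb.\n\n'
        'Even more blurb. \n\nSome more blurb. \n\n\n\n\n\n'
        'Second Article Title\n\nSome text: Text blurb.\n\nMore blurb.\n\n'
        'Even more blurb. \n\nSome more blurb. ')


def split_articles(raw):
    """Hypothetical helper: split raw text into a list of articles,
    each article being a list of stripped, non-empty paragraphs."""
    articles = []
    # Assumption: a run of 3+ newlines separates two articles.
    for block in re.split(r'\n{3,}', raw.strip()):
        # Inside an article, paragraphs are separated by a blank line.
        parts = [p.strip() for p in re.split(r'\n{2}', block)]
        articles.append([p for p in parts if p])
    return articles


result = split_articles(text)
for article in result:
    print(article)
```

Compared with the fixed five-newline split, this removes the leading \n on the second title and the stray spaces around each paragraph, at the cost of relying on the boundary assumption stated above.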