Beautiful Soup Can't Find The First Tag (xml)

I am using BeautifulSoup 4 (with the lxml parser) to parse an XML file from the MLB API. The API generates a scoreboard of the current games for a particular day, and I'm havin

Solution 1:

It appears I misunderstood the find function. A tag returned by find (or find_all) can be indexed by attribute name to read that attribute from the tag itself. So, essentially, I should have been doing the following:

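from bs4 import BeautifulSoup

soup = BeautifulSoup(webpage, 'xml')  # webpage is the XML file for today's games
tags = soup.find_all('game', {'home_file_code': 'tor'})
for games in tags:
    # find() returns the child <status> tag; indexing it reads its 'status' attribute
    print(games.find('status')['status'])
    # home_file_code is an attribute of <game> itself, so index the tag directly
    print(games['home_file_code'])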

Now print(games['home_file_code']) finds home_file_code as expected, because that attribute already exists on the tag we looked up.

I'm sure someone can give a more thorough answer, but that was the fundamental misunderstanding I was having.
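
To make the distinction between attributes and child tags concrete, here is a minimal sketch. It assumes bs4 and lxml are installed, and the SAMPLE_XML string is invented for illustration (only the tag and attribute names mentioned in this thread are taken from the real feed; the rest is hypothetical). It shows that the first tag, games, can be found like any other, and that its attributes are read by indexing the tag rather than by calling find on it.

```python
from bs4 import BeautifulSoup

# Hypothetical sample data, loosely modeled on the scoreboard described above.
SAMPLE_XML = """<games year="2017" month="04" day="16">
  <game id="g1" home_file_code="tor" away_file_code="bal">
    <status status="Final" inning="9"/>
  </game>
  <game id="g2" home_file_code="nyy" away_file_code="stl">
    <status status="In Progress" inning="5"/>
  </game>
</games>
"""

soup = BeautifulSoup(SAMPLE_XML, "xml")

# Attributes of a tag are read by indexing the tag itself.
root = soup.find("games")
root_date = (root["year"], root["month"], root["day"])

# Child elements are located with find(); their attributes are indexed the same way.
results = []
for game in soup.find_all("game", {"home_file_code": "tor"}):
    results.append((game["home_file_code"], game.find("status")["status"]))

# .get() returns None instead of raising KeyError when an attribute is absent.
missing = root.get("home_file_code")

print(root_date, results, missing)
```

Using .get() rather than square brackets is a reasonable choice when an attribute may not be present on every tag, since square-bracket indexing raises KeyError for a missing attribute.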

Solution 2:

I'm not the greatest of programmers, but I'd first check that the XML is well-formed. An element that contains anything must have both an opening and a closing tag, and its attributes belong inside the opening tag, like this: <games year="2017" month="04" day="16"> ... </games>. An opening tag such as <games year="2017" month="04" day="16"> is valid in itself, but it does need a matching </games> further down. If that closing tag is missing, fix the XML first and take it from there.
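
One way to test whether the markup is really the problem is to run a small sample through a strict parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the two sample strings and the is_well_formed helper are hypothetical and not taken from the MLB feed. It suggests that attributes inside an opening tag parse without complaint, and that a missing closing tag is what a strict parser rejects.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: attributes sit inside the opening tag, and the element is closed.
well_formed = '<games year="2017" month="04" day="16"><game home_file_code="tor"/></games>'
# Same opening tag, but the closing </games> is missing.
unclosed = '<games year="2017" month="04" day="16"><game home_file_code="tor"/>'


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The attributes of the root element are available as a plain dict.
root = ET.fromstring(well_formed)
root_attrs = root.attrib

print(is_well_formed(well_formed), is_well_formed(unclosed), root_attrs)
```

If the real file passes a check like this, the markup is probably not the cause, and the lookup code is the next place to look.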
