
Web Scraping With Python And Beautiful Soup

I am practicing building web scrapers. One that I am working on now involves going to a site, scraping links for the various cities on that site, then taking all of the links for e

Solution 1:

city_tags is a bs4.element.ResultSet (essentially a list) of tags, and you are calling find_all on the list itself. You probably want to call find_all on every element of the ResultSet or, in this specific case, just retrieve each tag's href attribute:

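import requests
from urllib.parse import urljoin
from bs4 import BeautifulSoup

main_url = "http://www.chapter-living.com/"

# Getting individual cities url
response = requests.get(main_url)  # named "response" to avoid shadowing the stdlib re module
soup = BeautifulSoup(response.text, "html.parser")
city_tags = soup.find_all('a', class_="nav-title")  # Bottom of the page is not loaded dynamically
# urljoin handles both relative and already-absolute hrefs
cities_links = [urljoin(main_url, tag["href"]) for tag in city_tags]  # Links to cities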
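To make the failure mode concrete, here is a minimal offline sketch. The HTML string and its city names are made up for illustration and are not the real page markup; only the nav-title class comes from the code above. It shows that calling find_all on the ResultSet raises AttributeError, while reading href from each tag works.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (assumption, not the site's HTML)
html = """
<nav>
  <a class="nav-title" href="/london/">London</a>
  <a class="nav-title" href="/manchester/">Manchester</a>
</nav>
"""

soup = BeautifulSoup(html, "html.parser")
city_tags = soup.find_all("a", class_="nav-title")  # a ResultSet, i.e. a list of Tag objects

# Calling find_all on the list itself fails: only individual tags have that method
try:
    city_tags.find_all("a")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Working per element: read the href attribute from each tag
hrefs = [tag["href"] for tag in city_tags]
print(error_message)
print(hrefs)
```

The same pattern applies to any attribute or method that belongs to a single tag rather than to the collection.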

Solution 2:

As the error says, city_tags is a ResultSet, which is a list of nodes and has no find_all method. You either have to loop through the set and apply find_all to each individual node or, in your case, I think you can simply extract the href attribute from each node:

[tag['href'] for tag in city_tags]

# ['https://www.chapter-living.com/blog/',
#  'https://www.chapter-living.com/testimonials/',
#  'https://www.chapter-living.com/events/']
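For the other option, looping through the set and calling find_all on each node, a rough sketch might look like the following. The markup, the city class, and the paths are hypothetical and exist only to show the shape of the loop; on the real site you would substitute whatever container elements hold the links you need.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: each "city" block contains its own links (assumption)
html = """
<div class="city"><a href="/london/a/">A</a><a href="/london/b/">B</a></div>
<div class="city"><a href="/leeds/c/">C</a></div>
"""

soup = BeautifulSoup(html, "html.parser")
city_blocks = soup.find_all("div", class_="city")  # ResultSet of container tags

links = []
for block in city_blocks:
    # find_all is valid here because block is a single Tag, not the ResultSet
    for anchor in block.find_all("a", href=True):
        links.append(anchor["href"])

print(links)
```

Passing href=True skips anchors that have no href attribute, so the lookup inside the loop cannot raise a KeyError.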
