Web Scraping With Python And Beautiful Soup
I am practicing building web scrapers. One that I am working on now involves going to a site, scraping links for the various cities on that site, then taking all of the links for e
Solution 1:
city_tags is a bs4.element.ResultSet (essentially a list) of tags, and you are calling find_all on it. You probably want to call find_all on every element of the ResultSet or, in this specific case, just retrieve each tag's href attribute:
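import requests
from urllib.parse import urljoin
from bs4 import BeautifulSoup
main_url = "http://www.chapter-living.com/"
# Get the URL of each individual city
response = requests.get(main_url)  # named "response" so it does not shadow the standard re module
soup = BeautifulSoup(response.text, "html.parser")
city_tags = soup.find_all("a", class_="nav-title")  # bottom of the page is not loaded dynamically
# Links to cities; urljoin resolves relative hrefs against main_url and leaves absolute ones unchanged
cities_links = [urljoin(main_url, tag["href"]) for tag in city_tags]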
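The other option mentioned above, calling find_all on every element of the ResultSet, can be sketched as follows. This is only an illustration: the HTML string and the "city" class are made up so the example runs without a network request, and they are not the real page's markup.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only; not taken from the real site.
html = """
<ul>
  <li class="city"><a href="/london/">London</a><a href="/london/rooms/">Rooms</a></li>
  <li class="city"><a href="/leeds/">Leeds</a></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")
city_tags = soup.find_all("li", class_="city")  # a ResultSet of Tag objects

# find_all is a method of each Tag, not of the ResultSet,
# so iterate over the set and search inside every element.
links = [a["href"] for tag in city_tags for a in tag.find_all("a")]
print(links)  # ['/london/', '/london/rooms/', '/leeds/']
```

The nested comprehension flattens the per-tag results into a single list; a plain for loop works just as well if each city's links need to stay grouped.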
Solution 2:
As the error says, city_tags is a ResultSet, which is a list of nodes and does not have a find_all method. You either have to loop through the set and apply find_all to each individual node or, in your case, I think you can simply extract the href attribute from each node:
[tag['href'] for tag in city_tags]
# ['https://www.chapter-living.com/blog/',
#  'https://www.chapter-living.com/testimonials/',
#  'https://www.chapter-living.com/events/']
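Whether an href needs the site root prepended depends on whether it is relative or absolute. A minimal sketch using the standard library's urllib.parse.urljoin, which copes with both cases; the base URL is the one from the question, and the hrefs are illustrative sample values rather than scraped output:

```python
from urllib.parse import urljoin

main_url = "http://www.chapter-living.com/"

# Sample hrefs for illustration: root-relative, relative, and absolute.
hrefs = [
    "/blog/",
    "testimonials/",
    "https://www.chapter-living.com/events/",
]

# urljoin resolves relative hrefs against main_url and
# returns absolute hrefs unchanged.
absolute_links = [urljoin(main_url, href) for href in hrefs]
for link in absolute_links:
    print(link)
```

This avoids producing a doubled prefix when an href is already a full URL.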