Selenium Can Not Scrape Shopee E-commerce Site Using Python
I am not able to pull the price of products on Shopee (an e-commerce site). I have taken a look at the problem solved by @dmitrybelyakov (link: Scraping AJAX e-commerce site usin
Solution 1:
You can use requests with the site's search API:
import requests
headers = {
'User-Agent': 'Mozilla/5',
'Referer': 'https://shopee.com.my/search?keyword=h370m'
}
url = 'https://shopee.com.my/api/v2/search_items/?by=relevancy&keyword=h370m&limit=50&newest=0&order=desc&page_type=search'
r = requests.get(url, headers=headers).json()
for item in r['items']:
    print(item['name'], ' ', item['price'])
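The query string in the request above can also be assembled from a dictionary, which makes it easier to change the keyword or page size. This is only a minimal sketch: it assumes the endpoint still accepts the same parameters as the URL above, the helper name build_search_url is hypothetical, and treating newest as a result offset is a guess rather than documented behavior.

```python
from urllib.parse import urlencode

BASE_URL = 'https://shopee.com.my/api/v2/search_items/'

def build_search_url(keyword, limit=50, offset=0):
    """Hypothetical helper: rebuild the search URL used above from its parts."""
    params = {
        'by': 'relevancy',
        'keyword': keyword,
        'limit': limit,
        'newest': offset,  # assumption: 'newest' acts as the result offset
        'order': 'desc',
        'page_type': 'search',
    }
    # urlencode escapes the values and joins them with '&'.
    return BASE_URL + '?' + urlencode(params)

print(build_search_url('h370m'))
```

With the default arguments this reproduces the URL from the answer, so the result can be passed to requests.get together with the same headers.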
To show prices on roughly the same scale as the site displays them:
for item in r['items']:
    print(item['name'], ' ', 'RM' + str(item['price'] / 100000))
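Dividing by 100000 as above yields a float, so a price may print as 430.0 rather than 430.00. A small sketch of one way to get two fixed decimals, using Decimal instead of float; the helper name format_price and the sample items are made up for illustration, and the divisor is taken from the loop above.

```python
from decimal import Decimal

PRICE_SCALE = Decimal(100000)  # same divisor as in the loop above

def format_price(raw_price, currency='RM'):
    """Hypothetical helper: turn a raw API price into a display string."""
    # quantize() pins the result to exactly two decimal places.
    amount = (Decimal(raw_price) / PRICE_SCALE).quantize(Decimal('0.01'))
    return f'{currency}{amount}'

# Made-up sample items shaped like the entries in r['items'].
sample_items = [
    {'name': 'Example board A', 'price': 43000000},
    {'name': 'Example board B', 'price': 40320000},
]
for item in sample_items:
    print(item['name'], format_price(item['price']))
```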
Solution 2:
To extract the price of products on Shopee using Selenium and Python you can use the following solution:
Code Block:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('start-maximized')
options.add_argument('disable-infobars')
options.add_argument('--disable-extensions')
# Selenium 3 style arguments; recent Selenium 4 releases expect options= and a Service object instead.
browserdriver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
browserdriver.get('https://shopee.com.my/search?keyword=h370m')
# Dismiss the language-selection modal by clicking "English".
WebDriverWait(browserdriver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='shopee-modal__container']//button[text()='English']"))).click()
# Collect the text of the span that follows each 'RM' currency label.
print([my_element.text for my_element in WebDriverWait(browserdriver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//span[text()='RM']//following::span[1]")))])
print("Program Ended")
Console Output:
['430.00', '385.00', '435.00', '409.00', '479.00', '439.00', '479.00', '439.00', '439.00', '403.20', '369.00', '420.00', '479.00', '465.00', '465.00']
Program Ended
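The list printed above holds strings, so any comparison or arithmetic needs a conversion first. A small sketch using a few of the values from the console output; summarize_prices is a hypothetical name, and the comma handling is an assumption about how larger prices might be rendered.

```python
from decimal import Decimal

def summarize_prices(price_texts):
    """Hypothetical helper: convert scraped price strings and report the range."""
    # Assumption: larger values may contain thousands separators like '1,299.00'.
    prices = [Decimal(text.replace(',', '')) for text in price_texts]
    return min(prices), max(prices), sum(prices) / len(prices)

lowest, highest, average = summarize_prices(['430.00', '385.00', '435.00', '409.00'])
print(lowest, highest, average)
```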
Solution 3:
When I visit the website, I get this popup: https://gyazo.com/0a9cd82e2c9879a1c834a82cb15020bd. My guess is that Selenium cannot find the XPath you are looking for because this popup is blocking the element.
Right after starting the Selenium session, try this:
popup=browserdriver.find_element_by_xpath('//*[@id="modal"]/div[1]/div[1]/div/div[3]/button[1]')
popup.click()