Parsing HTML with BeautifulSoup: cannot extract the 'href' as a single string
I'm trying to parse HTML to get the 'href' link. My code splits the 'href' value into separate strings, but I want it as one complete string.
Here is my code:
import requests
from bs4 import BeautifulSoup

# user_agent is defined elsewhere in my script
data = requests.get("https://www.chewy.com/b/food_c332_p2",
                    auth=('user', 'pass'),
                    headers={'User-Agent': user_agent})

# Save the page locally, then read it back for parsing
with open("dogfoodpage/dg2.html", "w+") as f:
    f.write(data.text)

with open("dogfoodpage/dg2.html") as f:
    page = f.read()

soup = BeautifulSoup(page, "html.parser")
test = soup.find('a', class_="kib-product-title")

# Collect the product links
productlink = []
for items in test:
    for link in items.get("href"):
        productlink.append(link)
Here is my output:
Here is the html structure for test:
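A likely cause, judging only from the code shown: `get("href")` returns one string, and looping over a string with `for link in ...` yields one character at a time, so each character is appended separately. In addition, `soup.find` returns only the first matching tag, while `soup.find_all` returns all of them. The sketch below shows one way to collect each link as a whole string. The markup in `sample_html` and the helper name `extract_product_links` are made up for illustration; the real page structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the saved page; the real structure may differ.
sample_html = """
<div>
  <a class="kib-product-title" href="/brand-a-dog-food/dp/111">Brand A</a>
  <a class="kib-product-title" href="/brand-b-dog-food/dp/222">Brand B</a>
  <a class="other-link" href="/help">Help</a>
</div>
"""


def extract_product_links(html):
    """Return the href of every <a class="kib-product-title"> as a whole string."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # find_all returns every matching tag; find would return only the first one.
    for anchor in soup.find_all("a", class_="kib-product-title"):
        href = anchor.get("href")  # a single str, or None if the attribute is missing
        if href:
            links.append(href)  # append the whole string instead of looping over it
    return links


product_links = extract_product_links(sample_html)
print(product_links)
# ['/brand-a-dog-food/dp/111', '/brand-b-dog-food/dp/222']
```

If the links on the real page are relative like the ones in this sample, `urllib.parse.urljoin("https://www.chewy.com", href)` from the standard library can turn each one into an absolute URL.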