Parsing HTML data with BeautifulSoup - cannot extract the 'href' out in one string

By Ritesh Sahu - February 14, 2023

I'm trying to parse the HTML to extract the 'href' link. My code splits the 'href' value into separate strings, but I want to get it as one complete string.

Here is my code:

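import requests
from bs4 import BeautifulSoup

# Placeholder User-Agent (assumption; the original value is not shown)
user_agent = "Mozilla/5.0"

data = requests.get("https://www.chewy.com/b/food_c332_p2",
                    auth=('user', 'pass'),
                    headers={'User-Agent': user_agent})

# Save the page locally, then parse the saved copy
with open("dogfoodpage/dg2.html", "w+") as f:
    f.write(data.text)

with open("dogfoodpage/dg2.html") as f:
    page = f.read()
    soup = BeautifulSoup(page, "html.parser")

test = soup.find('a', class_="kib-product-title")

productlink = []

# This loop is where the link ends up split into separate strings
for items in test:
    for link in items.get("href"):
        productlink.append(link)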

Here is my output:

Here is the html structure for test:
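
A possible fix, offered as a minimal sketch rather than a confirmed answer: `items.get("href")` returns a string, and looping over a string yields one character at a time, which would explain the link being split. In addition, `find` returns only the first matching tag, whereas `find_all` returns every match. The sample HTML and the helper name `extract_product_links` below are hypothetical, since the actual page structure is not shown here.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page structure is not shown above.
SAMPLE_HTML = """
<div>
  <a class="kib-product-title" href="/brand-a/dp/111">Brand A Dog Food</a>
  <a class="kib-product-title" href="/brand-b/dp/222">Brand B Dog Food</a>
  <a class="other" href="/ignored">Not a product</a>
</div>
"""


def extract_product_links(html):
    """Return the href of every <a class="kib-product-title"> as a whole string."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # find_all() returns every matching tag; find() returns only the first one.
    for anchor in soup.find_all("a", class_="kib-product-title"):
        href = anchor.get("href")  # a str, or None if the attribute is missing
        if href:
            # Append the whole string; do not iterate over it.
            links.append(href)
    return links


product_links = extract_product_links(SAMPLE_HTML)
print(product_links)
```

With the sample markup, each entry in `product_links` is one complete link. On the real page the same function could be called with the contents of the saved HTML file, assuming the product anchors carry the `kib-product-title` class as in the question.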
