python - BeautifulSoup get only the "general" text in a td tag, and nothing in nested tags
Say the HTML looks like this:

<td>potato1 <span somestuff...>potato2</span></td> ... <td>potato9 <span somestuff...>potato10</span></td>

I have BeautifulSoup doing this:

for tag in soup.find_all("td"):
    print tag.text

and I get:

potato1 potato2 .... potato9 potato10

Would it be possible to get only the text that is directly inside the td tag, and not the text nested inside the span tag?
You can use .contents:

>>> for tag in soup.find_all("td"):
...     print tag.contents[0]
...
potato1
potato9

What does it do?

A tag's children are available as a list through .contents:

>>> for tag in soup.find_all("td"):
...     print tag.contents
...
[u'potato1 ', <span somestuff...="">potato2</span>]
[u'potato9 ', <span somestuff...="">potato10</span>]

Since we are only interested in the first element, we go for

print tag.contents[0]
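
Taking contents[0] relies on the wanted text being the first child of the td. If the text might come after the span, or be split around it, one possible variation is to keep every direct child that is a NavigableString and skip the nested tags. The following is only a sketch in Python 3 syntax: the helper name direct_text and the sample HTML (with the elided span attributes left out) are assumptions made for illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup, NavigableString

# Sample markup modelled on the question; the span attributes are omitted.
html = (
    "<table><tr>"
    "<td>potato1 <span>potato2</span></td>"
    "<td>potato9 <span>potato10</span></td>"
    "</tr></table>"
)


def direct_text(tag):
    """Join only the text nodes that are immediate children of `tag`.

    Hypothetical helper: nested tags such as <span> are Tag objects,
    not NavigableString, so their text is left out.
    """
    return "".join(
        child for child in tag.contents if isinstance(child, NavigableString)
    ).strip()


soup = BeautifulSoup(html, "html.parser")
results = [direct_text(td) for td in soup.find_all("td")]
print(results)  # ['potato1', 'potato9']
```

Note that HTML comments are also NavigableString objects in Beautiful Soup, so a td containing comments would need an extra check to exclude them.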