BeautifulSoup - Extract Yahoo Finance balance sheet with Python


I am learning to use BeautifulSoup and Python to extract an HTML table. I tried using the following code to extract the balance sheet for Google from Yahoo Finance. However, I can't seem to get the rows scraped correctly.

I can't manage to omit the spacer rows, and I can't manage to extract the rows of totals (e.g. Total Assets).

Any advice? Advice on simplifying the code would also be valuable.

from bs4 import BeautifulSoup
import requests


def bs_extract(stock_ticker):
    # Build the balance sheet URL for the given ticker and download the page
    url = 'https://finance.yahoo.com/q/bs?s=' + str(stock_ticker) + '&annual'
    source_code = requests.get(url)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text, "html.parser")

    c1 = ""
    c2 = ""
    c3 = ""
    c4 = ""
    c5 = ""

    table = soup.find("table", {"class": "yfnc_tabledata1"})
    # print(table)
    for row in table.findAll("tr"):
        cells = row.findAll("td")
        # Pick the columns according to how many cells the row has
        if len(cells) == 5:
            c1 = cells[0].find(text=True)
            c2 = cells[1].find(text=True)
            c3 = cells[2].find(text=True)
            c4 = cells[3].find(text=True)
            c5 = cells[4].find(text=True)
        elif len(cells) == 6:
            c1 = cells[1].find(text=True)
            c2 = cells[2].find(text=True)
            c3 = cells[3].find(text=True)
            c4 = cells[4].find(text=True)
            c5 = cells[5].find(text=True)
        elif len(cells) == 1:
            c1 = cells[0].find(text=True)
            c2 = ""
            c3 = ""
            c4 = ""
            c5 = ""
        else:
            pass
        print(c1, c2, c3, c4, c5)


bs_extract('goog')

You might find it easier to get the data already structured, through YQL. See http://goo.gl/qkewxw
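If you would rather stay with BeautifulSoup, here is a minimal sketch of one way to handle the two problems in the question. It is not tested against the live Yahoo Finance page: the SAMPLE_HTML markup and the extract_rows helper are hypothetical stand-ins, and the real page layout may differ (for example, it may use nested tables). It assumes the totals are wrapped in a nested tag such as <strong>. In that case find(text=True) can return a leading whitespace node instead of the label, whereas get_text(strip=True) collects the text of all nested tags. Spacer rows are skipped by dropping any row whose cells contain no text.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
SAMPLE_HTML = """
<table class="yfnc_tabledata1">
  <tr><td colspan="2">Period Ending</td><td>Year 1</td><td>Year 2</td></tr>
  <tr><td colspan="4"></td></tr>
  <tr><td></td><td>Cash</td><td>100</td><td>90</td></tr>
  <tr><td colspan="2">
    <strong>Total Assets</strong></td><td><strong>500</strong></td><td><strong>450</strong></td></tr>
</table>
"""


def extract_rows(html, table_class="yfnc_tabledata1"):
    """Return the non-empty rows of the table as lists of cell strings."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": table_class})
    if table is None:
        return []
    rows = []
    for tr in table.find_all("tr"):
        # get_text gathers text from nested tags (e.g. <strong>) as well
        texts = [td.get_text(" ", strip=True) for td in tr.find_all("td")]
        # Drop empty padding cells so columns line up across row types
        texts = [t for t in texts if t]
        # A spacer row has no text in any cell, so it is skipped here
        if texts:
            rows.append(texts)
    return rows


if __name__ == "__main__":
    for row in extract_rows(SAMPLE_HTML):
        print(row)
```

Returning a list of rows instead of printing inside the loop also removes the need for the c1 to c5 variables and the separate branches per cell count.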

