How to get the offset of a text within an HTML tag, and all of its parent tags up to the root tag, in PHP?
I extract data from an article (for example the publication year, title, and authors) like this:

    $aut = $xpath->query("//table[@cellpadding='6']//b[1]");
    $authors = array();
    foreach ($aut as $node) {
        $authors[] = $node->nodeValue;
    }
    $title = $doc->getElementsByTagName('h3')->item(1);
    $publicationyear = $xpath->query("//p[1]//text()[(following::br)]")->item(0)->nodeValue;
    $aux = $xpath->query("//p[2]//text()[(preceding::br)]");
    $doi = substr($aux->item($aux->length - 1)->nodeValue, 4);

For each of these strings (the full name, year, title) I need the tags that come before it, for example:
form1_table3_tbody1_tr1_td1_table5_tbody1_tr1_td2_p2
and the position of the string within that tag, for example start: 163, end: 190. I know the information is grouped in tags, so I need the index of each tag when it has siblings; that is why the example has table3, the third child of form1. Is there a way of doing this in PHP, or at least in JavaScript?
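A minimal sketch of how such a path could be built with PHP's DOM extension: walk from the node up through `parentNode` and, for each element, count the preceding siblings that have the same tag name. The helper name `ancestorPath` and the sample markup are hypothetical. Counting only same-name siblings is an assumption, based on the `td1_p1` value expected in the update, where the `p` follows an `h3`.

```php
<?php
// Hypothetical helper: returns a path such as "form1_table3_tr1_td1_p2",
// listing every ancestor element from the root down to $node.
function ancestorPath(DOMNode $node): string
{
    $parts = [];
    // A text node is described by the element that contains it.
    $el = ($node instanceof DOMElement) ? $node : $node->parentNode;
    while ($el instanceof DOMElement) {
        // 1-based index among siblings with the same tag name (assumption).
        $index = 1;
        for ($sib = $el->previousSibling; $sib !== null; $sib = $sib->previousSibling) {
            if ($sib instanceof DOMElement && $sib->nodeName === $el->nodeName) {
                $index++;
            }
        }
        array_unshift($parts, $el->nodeName . $index);
        // The parent of the root element is the DOMDocument, which ends the loop.
        $el = $el->parentNode;
    }
    return implode('_', $parts);
}

// Made-up sample markup; a real page would be loaded with $doc->loadHTML(...).
$doc = new DOMDocument();
$doc->loadXML('<form><table/><table/><table><tr><td><p>a</p><p>b</p></td></tr></table></form>');
$xpath = new DOMXPath($doc);
$p = $xpath->query('//p[2]')->item(0);

$path = ancestorPath($p);
echo $path, "\n"; // form1_table3_tr1_td1_p2
```

With `loadHTML` the path would also start with the `html` and `body` elements. The `tbody` entries in the path above are the kind that browsers insert; as far as I know DOMDocument does not add them, so a path computed in PHP may differ from one computed in JavaScript.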
Update: in the article I have:

    ...
    <td valign="top">
    <h3 class="blue-space">d-lib magazine</h3>
    <p class="blue">november/december 2014<br> volume 20, number 11/12<br><a href="http://www.dlib.org/dlib/november14/brook/../11contents.html" target="_blank">table of contents</a> </p>
    ...

and $publicationyear from the first code has the value 2014. The first code works fine. I need to create three other variables: $fathers = ...td1_p1, $start = 18, $end = 22.
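One possible way to get `$start` and `$end`, sketched under the assumption that the offsets are measured in the text content of the enclosing tag (not in its raw HTML) and that `$end` is exclusive. The helper name `textOffsets` is hypothetical, and the sample is a shortened, well-formed version of the markup above.

```php
<?php
// Hypothetical helper: finds $needle in the text content of $element.
// Returns [start, end] with end exclusive, or null when it is not found.
// strpos/strlen count bytes; for multi-byte text mb_strpos/mb_strlen
// could be swapped in to get character offsets instead.
function textOffsets(DOMElement $element, string $needle): ?array
{
    $start = strpos($element->textContent, $needle);
    if ($start === false) {
        return null;
    }
    return [$start, $start + strlen($needle)];
}

// Shortened sample based on the snippet in the question.
$doc = new DOMDocument();
$doc->loadXML(
    '<td valign="top"><h3 class="blue-space">d-lib magazine</h3>'
    . '<p class="blue">november/december 2014<br/> volume 20, number 11/12<br/></p></td>'
);
$xpath = new DOMXPath($doc);
$p = $xpath->query('//p[1]')->item(0);

[$start, $end] = textOffsets($p, '2014');
echo "start: $start, end: $end\n"; // start: 18, end: 22
```

The element passed in can be the `parentNode` of the text node returned by the XPath query from the first code, so the same node can feed both the path and the offsets.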