python - Retaining split characters

July 15, 2011

I have the following data:

<http://dbpedia.org/data/plasmodium_hegneri.xml> <http://code.google.com/p/ldspider/ns#headerinfo> _:header16125770191335188966549 <http://dbpedia.org/data/plasmodium_hegneri.xml> .
_:header16125770191335188966549 <http://www.w3.org/2006/http#responsecode> "200"^^<http://www.w3.org/2001/xmlschema#integer> <http://dbpedia.org/data/plasmodium_hegneri.xml> .
_:header16125770191335188966549 <http://www.w3.org/2006/http#date> "mon, 23 apr 2012 13:49:27 gmt" <http://dbpedia.org/data/plasmodium_hegneri.xml> .
_:header16125770191335188966549 <http://www.w3.org/2006/http#content-type> "application/rdf+xml; charset=utf-8" <http://dbpedia.org/data/plasmodium_hegneri.xml> .

Now I want to transform this data into the following form, such that the last string enclosed in < > is moved onto its own line, prefixed with #@, before the line in which it appears:

#@ <http://dbpedia.org/data/plasmodium_hegneri.xml>
<http://dbpedia.org/data/plasmodium_hegneri.xml> <http://code.google.com/p/ldspider/ns#headerinfo> _:header16125770191335188966549 .
#@ <http://dbpedia.org/data/plasmodium_hegneri.xml>
_:header16125770191335188966549 <http://www.w3.org/2006/http#responsecode> "200"^^<http://www.w3.org/2001/xmlschema#integer> .
#@ <http://dbpedia.org/data/plasmodium_hegneri.xml>
_:header16125770191335188966549 <http://www.w3.org/2006/http#date> "mon, 23 apr 2012 13:49:27 gmt" .
#@ <http://dbpedia.org/data/plasmodium_hegneri.xml>
_:header16125770191335188966549 <http://www.w3.org/2006/http#content-type> "application/rdf+xml; charset=utf-8" .

I wrote the following Python code to do this:

infile = open('testnq.nq', 'r')
outfile = open('outfile.ttl', 'w')
while True:
    infileline1 = infile.readline()
    if not infileline1:
        break  # EOF
    splitstring = infileline1.split(' ')
    line1 = "#@ " + splitstring[len(splitstring)-2]
    outfile.write(line1)
    line2 = ""
    for num in range(0, len(splitstring)-2):
        line2 = line2 + splitstring[num]
    outfile.write(line2)

outfile.close()

But I am not able to get the spaces in the desired places. Can anyone please suggest how I can do this in Python or using Linux commands?

At the risk of using a regular expression and complicating things, this may work:

import re

# before: everything up to the last <...>; match: that last <...>; after: the rest
pattern = r'^(?P<before>.*)(?P<match>\<[^>]+\>)(?P<after>[^<]*)$'
# emit "#@ <last>" on its own line, then the line with that <...> removed
replacement = r'#@ \g<match>\n\g<before>\g<after>'

line = """<http://dbpedia.org/data/plasmodium_hegneri.xml> <http://code.google.com/p/ldspider/ns#headerinfo> _:header16125770191335188966549 <http://dbpedia.org/data/plasmodium_hegneri.xml> ."""
print(re.sub(pattern, replacement, line))

line = """_:header16125770191335188966549 <http://www.w3.org/2006/http#responsecode> "200"^^<http://www.w3.org/2001/xmlschema#integer> <http://dbpedia.org/data/plasmodium_hegneri.xml> ."""
print(re.sub(pattern, replacement, line))

Which outputs:

#@ <http://dbpedia.org/data/plasmodium_hegneri.xml>
<http://dbpedia.org/data/plasmodium_hegneri.xml> <http://code.google.com/p/ldspider/ns#headerinfo> _:header16125770191335188966549  .
#@ <http://dbpedia.org/data/plasmodium_hegneri.xml>
_:header16125770191335188966549 <http://www.w3.org/2006/http#responsecode> "200"^^<http://www.w3.org/2001/xmlschema#integer>  .
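The split-based attempt in the question can probably be made to work as well. It loses the spaces because split(' ') discards the separators and the pieces are then concatenated without them, so one option is to put them back with ' '.join(). The following is only a minimal sketch, assuming every input line ends with ` <context> .` exactly as in the sample data; the names move_context_first and convert are made up for illustration, and in-memory streams stand in for testnq.nq and outfile.ttl.

```python
import io


def move_context_first(line):
    """Return a '#@ <context>' marker line followed by the remaining statement."""
    # split(' ') drops the separators, so they are restored with ' '.join()
    # when the pieces are reassembled.
    parts = line.rstrip("\n").split(" ")
    # Assumption: the line ends with '<context> .', so the context is the
    # second-to-last field and the last field is the terminating dot.
    context = parts[-2]
    statement = " ".join(parts[:-2] + parts[-1:])
    return "#@ " + context + "\n" + statement + "\n"


def convert(infile, outfile):
    """Rewrite every non-blank line read from infile into outfile."""
    for line in infile:
        if line.strip():
            outfile.write(move_context_first(line))


# Hypothetical in-memory stand-ins for the input and output files.
sample = io.StringIO(
    '_:header16125770191335188966549 <http://www.w3.org/2006/http#date> '
    '"mon, 23 apr 2012 13:49:27 gmt" '
    '<http://dbpedia.org/data/plasmodium_hegneri.xml> .\n'
)
result = io.StringIO()
convert(sample, result)
print(result.getvalue())
```

Because the fields are rejoined with single spaces, the spaces inside the quoted date survive, and there is no doubled space before the final dot. With real files, convert can be called with the objects returned by open() instead of the StringIO stand-ins.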
