performance - Improve GenBank feature addition
I am trying to add more than 70,000 new features to a GenBank file using Biopython.
I have this code:
    from Bio import SeqIO
    from Bio.SeqFeature import SeqFeature, FeatureLocation

    fi = "myoriginal.gbk"
    fo = "mynewfile.gbk"

    for result in results:
        start = 0
        end = 0
        result = result.split("\t")
        start = int(result[0])
        end = int(result[1])
        # The input file is re-parsed for every single feature
        for record in SeqIO.parse(fi, "gb"):
            record.features.append(SeqFeature(FeatureLocation(start, end), type = "misc_feat"))
            SeqIO.write(record, fo, "gb")

results is just a list of lists containing the start and end of each one of the features I need to add to the original gbk file.
This solution is extremely costly for my computer, and I do not know how to improve the performance. Any ideas?
You should parse the GenBank file only once. Leaving aside what results contains (I do not know exactly, because some pieces of code are missing from your example), I would guess that modifying your code like this improves performance:
    from Bio import SeqIO
    from Bio.SeqFeature import SeqFeature, FeatureLocation

    fi = "myoriginal.gbk"
    fo = "mynewfile.gbk"

    # Parse the input file once and keep the records in memory
    original_records = list(SeqIO.parse(fi, "gb"))

    for result in results:
        result = result.split("\t")
        start = int(result[0])
        end = int(result[1])
        for record in original_records:
            record.features.append(SeqFeature(FeatureLocation(start, end), type = "misc_feat"))
            SeqIO.write(record, fo, "gb")
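A further cost in the code above is that SeqIO.write is still called inside the inner loop, so the output file is rewritten once per feature (and, for a multi-record file, only the last record written survives). Below is a minimal sketch of a variant that parses once and writes once. It assumes that results is a list of tab-separated "start end" strings, as the split("\t") call suggests. The helper names parse_coordinates and add_features are hypothetical, and the default feature type "misc_feature" is the standard GenBank key rather than the "misc_feat" used in the question.

```python
def parse_coordinates(results):
    """Convert tab-separated 'start end' strings into (start, end) integer pairs."""
    coords = []
    for result in results:
        fields = result.split("\t")
        coords.append((int(fields[0]), int(fields[1])))
    return coords


def add_features(fi, fo, results, feature_type="misc_feature"):
    """Append one feature per coordinate pair to every record, then write once."""
    # Biopython is imported here so parse_coordinates stays usable without it.
    from Bio import SeqIO
    from Bio.SeqFeature import SeqFeature, FeatureLocation

    coords = parse_coordinates(results)      # convert the strings a single time
    records = list(SeqIO.parse(fi, "gb"))    # parse the input file a single time

    for record in records:
        record.features.extend(
            SeqFeature(FeatureLocation(start, end), type=feature_type)
            for start, end in coords
        )

    # Write all records in one call; returns the number of records written.
    return SeqIO.write(records, fo, "gb")
```

It could be called as add_features("myoriginal.gbk", "mynewfile.gbk", results). If results really is a list of lists rather than strings, the split step in parse_coordinates would need to be dropped.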