sqlite - 256000 Bytes Limit of Python's SQLite3 Databases


I have written a web crawler with online pre-processing that stores its data in a database. The database layout is quite simple; let me outline it:

  • The table dispatched lists items that were dispatched at a particular time, which is one of the columns. There are 15 other columns that describe the items: one of these is of type text and the rest are of type int.
  • The table missed lists the time periods that the crawler failed to watch, e.g. because of network problems. It has 4 columns of type int and 2 columns of type text.

I left it running for many hours, twice. Both times, the resulting database file had a total size of 256000 bytes, at least according to ls -l. The recorded data shows that 1-3 items were regularly recorded per minute, but starting at a particular time, no new items are listed anymore.

To me, this sounds as if some limitation is being hit. Given that the resulting database file size was 1000 * 2^8 bytes both times, I suspect a limitation on the maximum database file size, but the documentation doesn't mention one.
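One way to test the size-limit theory is to ask SQLite itself how large the file is allowed to grow. The following is a minimal sketch, assuming the database file from one of the runs is still available; `size_report` is a hypothetical helper name and not part of the crawler. It relies only on the standard `page_size`, `page_count` and `max_page_count` pragmas and runs under both Python 2.7 and Python 3.

```python
import sqlite3


def size_report(path):
    """Return the page-level size settings of the SQLite database at `path`."""
    conn = sqlite3.connect(path)
    try:
        # Size of one database page in bytes.
        page_size = conn.execute('PRAGMA page_size').fetchone()[0]
        # Number of pages currently in the database.
        page_count = conn.execute('PRAGMA page_count').fetchone()[0]
        # Largest number of pages this connection will let the database grow to.
        max_page_count = conn.execute('PRAGMA max_page_count').fetchone()[0]
    finally:
        conn.close()
    return {
        'page_size': page_size,
        'file_bytes': page_size * page_count,
        'max_bytes': page_size * max_page_count,
    }
```

If `max_bytes` comes out far above 256000, a cap on the file size is unlikely to be the explanation. It may also be worth noting that 256000 = 250 * 1024, so with a page size of 1024 bytes the file would consist of exactly 250 pages.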

At the moment SQLite stopped appending new rows to the database, there were

  • 5187 rows in dispatched and 3 rows in missed during the first run, and
  • 5212 rows in dispatched and 2 rows in missed during the second run.
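Row counts like these can be read back from the file with a short query. This is only a sketch: `count_rows` and `KNOWN_TABLES` are hypothetical names, and the table names are taken from the layout described above. Because SQL parameters cannot stand in for table names, the name is checked against a fixed list before it is formatted into the statement.

```python
import sqlite3

# Tables described in the question; anything else is rejected.
KNOWN_TABLES = ('dispatched', 'missed')


def count_rows(conn, table):
    """Return the number of rows in one of the known tables."""
    if table not in KNOWN_TABLES:
        raise ValueError('unexpected table name: %r' % (table,))
    return conn.execute('SELECT COUNT(*) FROM %s' % table).fetchone()[0]
```

Running this against the file while the crawler is still active would show whether the counts really stop growing or whether new rows are merely not visible yet.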

I'm using the sqlite3 module of Python 2.7. I would appreciate it if anyone could point out what is going on, why SQLite stopped appending new rows after hitting 256000 bytes, and how to fix this.


Update: I've decided to add some snippets of the code.

The application consists of tasks that are scheduled and executed by a class named gatherer. First, let's have a look at the task of interest here:

class controltask(task):

    def __init__(self, player, scheduled_start):
        task.__init__(self, scheduled_start)
        self.player = player

    def __str__(self):
        return u'control: player %d' % self.player.player_id

    def execute(self, gt):
        # [...]
        gt.dispatch(self.player)
        # [...]

The gatherer calls execute on each task when its time has come, providing a reference to itself via the gt argument:

class gatherer:

    def process_next_task(self):
        task = self.tasks.get()
        while True:

I can still see these messages after SQLite stops appending rows:

            print task.__str__() 

But I do not see either of the following error messages. Furthermore, note that the loop won't exit until the task finishes without exceptions:

            try:
                task.execute(self)
                break
            except engine.networkexception:
                print 'network problem detected. reconnecting.'
                self.reconnect()
            except:
                print 'unknown exception occured. reconnecting.'
                self.reconnect()

    def reconnect(self):
        # [...]
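The retry loop above can be sketched in a self-contained form that also records what went wrong, which may help when a task silently stops producing rows. This is not the crawler's code: `run_until_success`, `action`, `recover`, `log` and `max_attempts` are hypothetical names. It differs from the original in two deliberate ways: it catches `Exception` rather than using a bare `except`, so that `KeyboardInterrupt` still propagates, and it gives up after a bounded number of attempts instead of looping forever.

```python
import traceback


def run_until_success(action, recover, log, max_attempts=5):
    """Call `action` until it returns without raising, logging every failure."""
    for _ in range(max_attempts):
        try:
            return action()
        except Exception:
            # Keep the full traceback instead of a generic message,
            # so the real cause of the failure is not lost.
            log(traceback.format_exc())
            recover()
    raise RuntimeError('gave up after %d attempts' % max_attempts)
```

In the crawler, `action` would correspond to `task.execute(self)` and `recover` to `self.reconnect()`; `log` could be as simple as `sys.stderr.write`.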

The controltask invokes dispatch on the gatherer, which, in turn, writes the data the task computed to the database:

    def dispatch(self, player):
        self.db_cursor.execute('''insert into dispatched values
            (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)''', (\
                None, player.deadline, player.age, player.tsi, \
                player.speciality, player.skills[0], player.skills[1], \
                player.skills[2], player.skills[3] , player.skills[4], \
                player.skills[5], player.skills[6], player.skills[7], \
                player.skills[8], player.skills[9], player.skills[10]))
        self.db.commit()
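For testing the write path in isolation, the insert can be reduced to a small function that takes a connection and a tuple of values. This is a sketch under assumptions: `dispatch_row` is a hypothetical name, and it uses the connection as a context manager (supported by the sqlite3 module) so that the transaction is committed on success and rolled back if the statement raises.

```python
import sqlite3


def dispatch_row(conn, values):
    """Insert one row into `dispatched` and commit it."""
    # One "?" placeholder per supplied value, e.g. "?, ?, ?" for three values.
    placeholders = ', '.join('?' * len(values))
    with conn:  # commits on success, rolls back if execute() raises
        conn.execute(
            'insert into dispatched values (%s)' % placeholders, values)
```

Calling this in a loop against a scratch database is a cheap way to check whether inserts keep succeeding well past the row counts seen in the two runs.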

The gatherer creates the database when it is instantiated and closes the database connection when it is garbage collected:

    def __init__(self):
        self.tasks = Queue.PriorityQueue()

        self.db = sqlite3.connect('%s.db' % time.strftime('%d.%m.%y %H:%M'))
        self.db_cursor = self.db.cursor()

        self.db_cursor.execute('''create table if not exists dispatched (
            amount int, deadline int, age int, tsi int, speciality text,
            s1 int, s2 int, s3 int, s4 int, s5 int, s6 int,
            s7 int, s8 int, s9 int, s10 int, s11 int)''')

        self.db_cursor.execute('''create table if not exists missed (
            missed_from int, missed_until int,
            min_age int, max_age int,
            min_lvl int, max_lvl int)''')

    def __del__(self):
        self.db.close()
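Closing the connection in `__del__` depends on when the object is actually collected. As an alternative, here is a sketch that closes the connection explicitly with `contextlib.closing`; `open_store` and `table_names` are hypothetical names, and the schema is copied from the snippet above.

```python
import contextlib
import sqlite3


def open_store(path):
    """Open the database at `path` and make sure both tables exist."""
    conn = sqlite3.connect(path)
    conn.execute('''create table if not exists dispatched (
        amount int, deadline int, age int, tsi int, speciality text,
        s1 int, s2 int, s3 int, s4 int, s5 int, s6 int,
        s7 int, s8 int, s9 int, s10 int, s11 int)''')
    conn.execute('''create table if not exists missed (
        missed_from int, missed_until int,
        min_age int, max_age int,
        min_lvl int, max_lvl int)''')
    conn.commit()
    return conn


def table_names(path):
    """List the tables in the store, closing the connection afterwards."""
    # closing() calls conn.close() when the block exits, even on error.
    with contextlib.closing(open_store(path)) as conn:
        rows = conn.execute(
            "select name from sqlite_master "
            "where type = 'table' order by name").fetchall()
    return [row[0] for row in rows]
```

The same `with contextlib.closing(...)` pattern could wrap the main loop, so the connection is closed at a known point regardless of how the loop exits.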

And finally, here is the actual program:

if __name__ == '__main__':

    gt = gatherer()

    tasks_processed = 0
    while True:
        print 'starting %d. task while %d enqueued:' % \
            (tasks_processed+1, gt.tasks.qsize())
        try:
            gt.process_next_task()
        except KeyboardInterrupt:
            break
        except:
            del gt
            raise
        tasks_processed = tasks_processed + 1

