python 3.x - Python34 word2vec.Word2Vec OverflowError


I'm studying word2vec. When I use word2vec to train on text data, an OverflowError occurs in numpy.

The message is:

model.vocab[w].sample_int > model.random.randint(2**32)]
Warning (from warnings module):
  File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 636
    warnings.warn("C extension not loaded for Word2Vec, training will be slow. "
UserWarning: C extension not loaded for Word2Vec, training will be slow. Install a C compiler and reinstall gensim for fast training.
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Python34\lib\threading.py", line 920, in _bootstrap_inner
    self.run()
  File "C:\Python34\lib\threading.py", line 868, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 675, in worker_loop
    if not worker_one_job(job, init):
  File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 666, in worker_one_job
    job_words = self._do_train_job(items, alpha, inits)
  File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 623, in _do_train_job
    tally += train_sentence_sg(self, sentence, alpha, work)
  File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 112, in train_sentence_sg
    word_vocabs = [model.vocab[w] for w in sentence if w in model.vocab and
  File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 113, in <listcomp>
    model.vocab[w].sample_int > model.random.randint(2**32)]
  File "mtrand.pyx", line 935, in mtrand.RandomState.randint (numpy\random\mtrand\mtrand.c:9520)
OverflowError: Python int too large to convert to C long

Can anyone tell me what causes this?

My machine is x64, the OS is Windows 7, and Python 3.4 is 32-bit. numpy and scipy are 32-bit as well.

I ran into this as well. It looks like gensim has a potential workaround in its dev branch.

https://github.com/piskvorky/gensim/commit/726102df66000f2afcea82d95634b055e6521dc8

This doesn't solve the core issue of navigating between different hardware and install int sizes, but I think it should alleviate the issue with this particular line.

The necessary change involves switching out

model.vocab[w].sample_int > model.random.randint(2**32)

for

model.vocab[w].sample_int > model.random.rand() * 2**32

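This avoids the 64-bit / 32-bit int issue created in randint. As the traceback shows, randint tries to convert 2**32 to a C long, which is only 32 bits wide on this setup, so the value does not fit. rand() returns a float in [0, 1), and the multiplication by 2**32 happens in Python, so no such conversion takes place.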

Update: I manually incorporated this change into my gensim install, and it prevents the error.
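
For illustration, here is a minimal standalone sketch of the replacement comparison outside gensim. The function name downsample and the list of thresholds are made up for this example and merely stand in for the sample_int values on gensim's vocab entries; only np.random.RandomState and its rand() method are real numpy API.

```python
import numpy as np


def downsample(sample_ints, rng):
    """Return the indices whose threshold exceeds a random draw in [0, 2**32).

    sample_ints is a hypothetical stand-in for the sample_int values
    that gensim stores on its vocab entries.
    """
    # rng.rand() returns a float in [0.0, 1.0). The scaling by 2**32 is done
    # in Python arithmetic, so no integer is ever converted to a C long.
    return [i for i, s in enumerate(sample_ints) if s > rng.rand() * 2**32]


rng = np.random.RandomState(1)
# A threshold of 2**32 is always kept, 0 is never kept,
# and 2**31 is kept roughly half of the time.
thresholds = [2**32, 0, 2**31]
kept = downsample(thresholds, rng)
print(kept)
```

Because the draw is a float rather than an integer, the comparison is no longer uniform over exact integers, but for a sampling threshold that difference should not matter in practice.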

