How to use Spark Classification for unseen instances?

June 15, 2011

It looks as though the training and test sets must both be present at the time the classification model is created in Apache Spark. What if we have unseen instances that arrive later and did not exist when we were creating the model? Do we have to rebuild the model every time we receive an unseen instance? Doesn't that make classification impractical in real scenarios?

It looks as though the training and test sets must both be present at the time the classification model is created in Apache Spark?

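Test instances can be loaded separately from the training instances, as you can see in the naive Bayes example. There the model is trained on the training split only, and the test split is used afterwards purely for prediction.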

    from pyspark.mllib.classification import NaiveBayes
    from pyspark.mllib.linalg import Vectors
    from pyspark.mllib.regression import LabeledPoint

    def parseLine(line):
        # Each line looks like "label,f1 f2 f3".
        parts = line.split(',')
        label = float(parts[0])
        features = Vectors.dense([float(x) for x in parts[1].split(' ')])
        return LabeledPoint(label, features)

    # sc is an existing SparkContext (for example, the one the pyspark shell provides).
    data = sc.textFile('data/mllib/sample_naive_bayes_data.txt').map(parseLine)

    # Split the data approximately into training (60%) and test (40%) sets.
    training, test = data.randomSplit([0.6, 0.4], seed=0)

    # Train a naive Bayes model on the training set only.
    model = NaiveBayes.train(training, 1.0)

    # Make predictions and test accuracy.
    predictionAndLabel = test.map(lambda p: (model.predict(p.features), p.label))
    # Index into the pair: tuple-unpacking lambdas are not valid in Python 3.
    accuracy = 1.0 * predictionAndLabel.filter(lambda pl: pl[0] == pl[1]).count() / test.count()

What if we have unseen instances that arrive later and did not exist when we were creating the model?

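This scenario is the same as in scikit-learn and other machine learning tools: the model is trained once, and its predict method can then be applied to any instance that arrives later, so there is no need to rebuild it for each new instance. In addition, Spark offers some algorithms that can process streams, for example StreamingLogisticRegressionWithSGD in pyspark.mllib.classification.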
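To make the "train once, predict later" point concrete without needing a cluster, here is a minimal pure-Python sketch of multinomial naive Bayes. It is not Spark's implementation; the function names train_naive_bayes and predict and the toy data are hypothetical and chosen only for illustration. What it shows is that training produces nothing but a set of parameters, and prediction needs only a feature vector, so instances that did not exist at training time can be classified afterwards.

```python
import math
from collections import defaultdict


def train_naive_bayes(samples, smoothing=1.0):
    """Fit multinomial naive Bayes on (label, feature_counts) pairs.

    Returns a dict mapping each label to (log_prior, log_likelihoods).
    Nothing about the training data is kept beyond these numbers.
    """
    n_features = len(samples[0][1])
    class_counts = defaultdict(int)
    feature_sums = defaultdict(lambda: [0.0] * n_features)
    for label, features in samples:
        class_counts[label] += 1
        for i, value in enumerate(features):
            feature_sums[label][i] += value

    n_samples = len(samples)
    n_classes = len(class_counts)
    model = {}
    for label, count in class_counts.items():
        # Additive (Laplace) smoothing on both priors and likelihoods.
        log_prior = math.log((count + smoothing) / (n_samples + smoothing * n_classes))
        total = sum(feature_sums[label]) + smoothing * n_features
        log_likelihoods = [math.log((s + smoothing) / total) for s in feature_sums[label]]
        model[label] = (log_prior, log_likelihoods)
    return model


def predict(model, features):
    """Score one feature vector against every class and return the best label."""
    def score(label):
        log_prior, log_likelihoods = model[label]
        return log_prior + sum(x * w for x, w in zip(features, log_likelihoods))
    return max(model, key=score)


# Hypothetical toy data: a label followed by three non-negative feature counts.
training = [
    (0.0, [3.0, 0.0, 0.0]),
    (0.0, [2.0, 1.0, 0.0]),
    (1.0, [0.0, 3.0, 0.0]),
    (1.0, [0.0, 2.0, 1.0]),
    (2.0, [0.0, 0.0, 3.0]),
    (2.0, [1.0, 0.0, 2.0]),
]

model = train_naive_bayes(training)  # done once

# These instances did not exist when the model was built.
unseen = [[4.0, 0.0, 1.0], [0.0, 5.0, 0.0], [0.0, 1.0, 6.0]]
predictions = [predict(model, x) for x in unseen]
print(predictions)
```

In Spark the corresponding step is calling model.predict(features) on the NaiveBayesModel returned by NaiveBayes.train, either with a single vector or mapped over an RDD of new instances. If the new data arrives in a later session, the pyspark.mllib model can be persisted with model.save(sc, path) and restored with NaiveBayesModel.load(sc, path) rather than retrained.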

JVParth
