groovy - Apache Spark difference between two RDDs


Say I have this example job (in Groovy, using the Java API):

def set1 = []
def set2 = []
0.upto(10) { set1 << it }   // set1 holds 0..10
8.upto(20) { set2 << it }   // set2 holds 8..20
def rdd1 = context.parallelize(set1)
def rdd2 = context.parallelize(set2)
// what next?

How do I get the set delta between the two? I know union can create an RDD that has all of the data in both RDDs, but how do I do the opposite of that?

If you want set subtraction, subtract is the answer. If you want the "outer" collection (the elements that appear in only one of the two RDDs), try:

rdd1.subtract(rdd2).union(rdd2.subtract(rdd1)) 
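To see what that expression should return for the 0..10 and 8..20 inputs from the question, here is a minimal sketch that models the two operations with plain Java sets. It does not use Spark at all; the class name RddDeltaSketch and its helper methods are made up for illustration, and an RDD is not a set, so duplicates and ordering can behave differently on a real cluster.

```java
import java.util.Set;
import java.util.TreeSet;

// Local model of the RDD operations using plain Java sets (no Spark involved).
// All names here are hypothetical helpers, not part of any Spark API.
class RddDeltaSketch {

    // Inclusive integer range, mirroring Groovy's 0.upto(10) { ... }.
    static Set<Integer> range(int from, int to) {
        Set<Integer> out = new TreeSet<>();
        for (int i = from; i <= to; i++) {
            out.add(i);
        }
        return out;
    }

    // Models a.subtract(b): the elements of a that are not in b.
    static Set<Integer> subtract(Set<Integer> a, Set<Integer> b) {
        Set<Integer> out = new TreeSet<>(a);
        out.removeAll(b);
        return out;
    }

    // Models a.subtract(b).union(b.subtract(a)): elements in exactly one input.
    static Set<Integer> symmetricDifference(Set<Integer> a, Set<Integer> b) {
        Set<Integer> out = subtract(a, b);
        out.addAll(subtract(b, a));
        return out;
    }

    public static void main(String[] args) {
        Set<Integer> set1 = range(0, 10);
        Set<Integer> set2 = range(8, 20);

        // prints [0, 1, 2, 3, 4, 5, 6, 7]
        System.out.println(subtract(set1, set2));

        // prints [0, 1, 2, 3, 4, 5, 6, 7, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
        System.out.println(symmetricDifference(set1, set2));
    }
}
```

The shared values 8, 9 and 10 drop out of both halves, which is why the combined result is the "outer" part of the two collections.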
