How can I persist cached tables in memory after the program ends (Apache Spark)?
I'm new to Apache Spark and have a simple question about DataFrame caching.

When I cache a DataFrame in memory using df.cache() in Python, I found that the data is removed after the program terminates.

Can I keep the cached data in memory so that I can access it on the next run without calling df.cache() again?
The cache created by cache() is tied to the current Spark context; its purpose is to avoid recalculating intermediate results multiple times within the current application. Once the context is closed, the cache is gone. Nor can you share a cache between different running Spark contexts.

To reuse the data in a different context, you have to save it to a file system. If you would prefer the results to be in memory (or at least to have a chance of being in memory when you try to reload them), you can do that using Tachyon (now Alluxio).