What is the difference between Hadoop and Spark?
As Spark is growing in the market nowadays, I can see Spark's major use cases on top of Hadoop, such as:
- Iterative algorithms in machine learning
- Interactive data mining and data processing
- Spark offers an Apache Hive-compatible data warehousing system that can run up to 100x faster than Hive.
- Stream processing: log processing and fraud detection in live streams for alerts, aggregates, and analysis
- Sensor data processing: where data is fetched and joined from multiple sources, in-memory datasets are helpful because they are easy and fast to process.
My questions are:
- Is Spark going to replace Hadoop in the upcoming days?
- Hadoop works concurrently, while Spark runs in parallel? (Is this true?)
Spark differs from Hadoop in the sense that it lets you integrate data ingestion, processing, and real-time analytics in one tool. Spark's MapReduce framework also differs from standard Hadoop MapReduce because in Spark the intermediate map and reduce results are cached, and an RDD (an abstraction for a distributed collection that is fault tolerant) can be kept in memory if there is a need to reuse the same results (iterative algorithms, group by, etc.).
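To make the caching point concrete, here is a minimal sketch in plain Python. It is not Spark code and does not use the Spark API: `ToyDataset`, `count_loads`, and `load_source` are hypothetical names made up for this illustration. It only mimics the idea that a lazily computed dataset is re-derived from its source on every use unless it has been marked as cached in memory.

```python
from typing import Callable, Iterable, List, Optional


class ToyDataset:
    """Hypothetical stand-in for an RDD: a lazily computed collection.

    This is an illustration only, not the Spark API.
    """

    def __init__(self, compute: Callable[[], Iterable[int]]):
        self._compute = compute
        self._cached: Optional[List[int]] = None
        self._use_cache = False

    def map(self, fn: Callable[[int], int]) -> "ToyDataset":
        # Lazy: nothing runs until collect() is called on the result.
        return ToyDataset(lambda: (fn(x) for x in self.collect()))

    def cache(self) -> "ToyDataset":
        # Mark this dataset so its first computed result is kept in memory.
        self._use_cache = True
        return self

    def collect(self) -> List[int]:
        if self._use_cache:
            if self._cached is None:
                self._cached = list(self._compute())
            return self._cached
        # Not cached: recompute from the parent every time.
        return list(self._compute())


def count_loads(use_cache: bool, iterations: int = 3) -> int:
    """Run a toy iterative job and return how often the source was read."""
    loads = 0

    def load_source() -> List[int]:
        nonlocal loads
        loads += 1  # stands in for an expensive read from disk
        return [1, 2, 3, 4]

    squared = ToyDataset(load_source).map(lambda x: x * x)
    if use_cache:
        squared.cache()

    # Each iteration reuses the same intermediate result.
    for i in range(iterations):
        sum(squared.map(lambda x, i=i: x + i).collect())
    return loads


if __name__ == "__main__":
    print("source reads without cache:", count_loads(use_cache=False))
    print("source reads with cache:", count_loads(use_cache=True))
```

With three iterations, the uncached run reads the source three times, while the cached run reads it once. That is roughly why iterative algorithms benefit from keeping intermediate results in memory.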
My answer is superficial and does not answer your question completely; it only points out some of the main differences (there are many more in reality). Spark and Databricks are well documented on their official sites, and your question is answered there: