Apache Spark
Features
- Real-time processing: it computes in memory, which avoids repeated disk reads and makes it fast.
- Generality: combines SQL, streaming, and complex analytics.
- Speed: advertised as up to 100x faster than Hadoop MapReduce for in-memory workloads.
- Deployment: runs in standalone mode or on cluster managers such as Hadoop YARN, Apache Mesos, or Kubernetes.
- Ease of use: applications can be written in Java, Scala, Python, R, and SQL.
- Powerful caching: datasets can be cached in memory (or on disk) and reused across computations.
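To make the in-memory and caching points concrete, here is a rough pure-Python sketch of the idea. It is a toy model only: `ToyDataset` and its `evaluations` counter are hypothetical names made up for these notes, not part of Spark's API. The method names `map`, `cache`, and `collect` are chosen to echo Spark's, but this sketch is not a Spark implementation. It only shows why keeping a computed result in memory avoids recomputation.

```python
class ToyDataset:
    """Toy stand-in for a lazily evaluated dataset (hypothetical, not Spark's API)."""

    def __init__(self, compute):
        self._compute = compute   # function that produces the rows on demand
        self._cached = None       # rows kept in memory once materialized
        self._use_cache = False
        self.evaluations = 0      # how many times the rows were (re)computed

    def map(self, fn):
        # Lazy: nothing runs until collect() is called on the result.
        return ToyDataset(lambda: [fn(x) for x in self.collect()])

    def cache(self):
        # Ask for the rows to be kept in memory after the first evaluation.
        self._use_cache = True
        return self

    def collect(self):
        # Serve from memory when a cached copy exists.
        if self._use_cache and self._cached is not None:
            return self._cached
        self.evaluations += 1
        rows = self._compute()
        if self._use_cache:
            self._cached = rows
        return rows


source = ToyDataset(lambda: list(range(5)))
squares = source.map(lambda x: x * x).cache()

first = squares.collect()    # computes the rows and stores them
second = squares.collect()   # served from memory, no recomputation

uncached = source.map(lambda x: x + 1)
uncached.collect()
uncached.collect()           # recomputed, because cache() was never called
```

After this runs, `squares.evaluations` is 1 while `uncached.evaluations` is 2. That is the effect the caching bullet describes, at toy scale.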
Questions:
- How are Spark and Hadoop connected? I thought they were completely different big data platforms. When downloading Spark, the page asks you to choose a package type, for example "Pre-built for Apache Hadoop 2.7". What does this mean?