RDDs are datasets that consist of records.
RDDs are resilient, meaning that if a node performing an operation in Spark is lost, the dataset can be reconstructed.
RDDs are immutable, meaning that after they are instantiated and populated with data, they cannot be updated.
DDs are distributed, meaning the data in RDDs is divided into one or many partitions and distributed as in-memory collections of objects across worker nodes in the cluster.
这道题你会答吗?花几分钟告诉大家答案吧!