Spark2, PySpark and Jupyter installation and configuration

Steps to enable Spark 2, PySpark and Jupyter on Cloudera clusters.

1. Install Oracle JDK on all nodes

Download and install Java. It should be JDK 1.8 or later.

# cd /usr/java/
# wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/8u144-b15/jdk-8u144-linux-x64.tar.gz"
# tar xzf jdk-8u144-linux-x64.tar.gz

2. Install Java with alternatives

# cd /usr/java
# alternatives …
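
Since step 1 requires JDK 1.8 or later on every node, it may help to confirm the version after installing. The following is only a minimal sketch: `java_version_ok` is a hypothetical helper name, not part of any Cloudera or Oracle tooling, and it assumes the usual `java -version` banner format with the version in double quotes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if a `java -version` banner reports 1.8 or newer.
java_version_ok() {
    local banner="$1" version major
    # Pull out the quoted version token, e.g. 1.8.0_144 or 11.0.2.
    version=$(printf '%s\n' "$banner" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    [ -n "$version" ] || return 1
    case "$version" in
        1.*) major=${version#1.}; major=${major%%.*} ;;  # legacy 1.x numbering
        *)   major=${version%%.*} ;;                      # 9 and later
    esac
    major=${major%%[!0-9]*}   # drop any non-numeric suffix
    [ -n "$major" ] && [ "$major" -ge 8 ]
}

# Check the local node if java is on the PATH (the banner goes to stderr).
if command -v java >/dev/null 2>&1; then
    if java_version_ok "$(java -version 2>&1)"; then
        echo "JDK is 1.8 or later"
    else
        echo "JDK is older than 1.8"
    fi
fi
```

Running this on each node after the install gives a quick pass/fail before moving on to the remaining steps.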
