How to Install and Set Up Apache Spark on Ubuntu/Debian


Apache Spark is an open-source distributed computing framework created to deliver faster computational results. It is an in-memory computing engine, meaning the data is processed in memory.

Spark supports various APIs for streaming, graph processing, SQL, and MLlib. It also supports Java, Python, Scala, and R as its preferred languages. Spark is mostly installed in Hadoop clusters, but you can also install and configure Spark in standalone mode.

In this article, we will see how to install Apache Spark on Debian and Ubuntu-based distributions.

Install Java and Scala on Ubuntu

To install Apache Spark on Ubuntu, you need Java and Scala installed on your machine. Java may already be installed on your system, and you can check for it using the following command.

$ java -version

If the command is not found, you can install Java by following our article on how to install Java on Ubuntu, or simply run the following commands to install Java on Ubuntu and Debian-based distributions.

$ sudo apt update
$ sudo apt install default-jre
$ java -version

Next, you can install Scala from the apt repository by running the following commands to search for the package and install it.

$ sudo apt search scala    # search for the package
$ sudo apt install scala   # install the package

To verify the Scala installation, run the following command.

$ scala -version 

Scala code runner version 2.11.12 -- Copyright 2002-2017, LAMP/EPFL
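
Before moving on, it can help to confirm that both prerequisites are reachable on the PATH. The following is a minimal sketch rather than part of the official setup: have_cmd is a hypothetical helper name introduced here for illustration, and the script only reports what it finds without installing anything.

```shell
#!/bin/sh
# have_cmd NAME: succeed if NAME resolves to a command on the PATH.
# (hypothetical helper, not part of Java, Scala, or apt)
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report the status of each prerequisite.
for tool in java scala; do
    if have_cmd "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: missing"
    fi
done
```

If either tool is reported as missing, repeat the matching apt install step before continuing.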

Install Apache Spark on Ubuntu

Now use the wget command to download the official Apache Spark release archive directly in the terminal.

$ wget https://apachemirror.wuchna.com/spark/spark-3.1.1/spark-3.1.1-bin-hadoop2.7.tgz

Now open your terminal, switch to the directory where the downloaded file is located, and run the following command to extract the Apache Spark tar file.

$ tar -xvzf spark-3.1.1-bin-hadoop2.7.tgz

Finally, move the extracted Spark directory to /opt/spark.

$ sudo mv spark-3.1.1-bin-hadoop2.7 /opt/spark
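
The archive name and the extracted directory name in these steps follow the same pattern, so both can be derived from the version. The following is a minimal sketch, assuming the 3.1.1 build for hadoop2.7 used in the commands above. The variable names are illustrative only and are not read by Spark itself.

```shell
#!/bin/sh
# Values taken from the download command in this article.
SPARK_VERSION="3.1.1"
HADOOP_PROFILE="hadoop2.7"

# Derived names (illustrative variables, not used by Spark).
SPARK_DIR="spark-${SPARK_VERSION}-bin-${HADOOP_PROFILE}"
SPARK_TGZ="${SPARK_DIR}.tgz"

echo "archive:   ${SPARK_TGZ}"
echo "directory: ${SPARK_DIR}"

# If the archive is in the current directory, show its first entry
# without extracting anything.
if [ -f "$SPARK_TGZ" ]; then
    tar -tzf "$SPARK_TGZ" | head -n 1
fi
```

Keeping the version in one place makes it easier to repeat the steps for a different release, provided the file naming stays the same.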

Configure Environment Variables for Spark

Now you have to set a few environment variables in your .profile file before starting Spark.

$ echo "export SPARK_HOME=/opt/spark" >> ~/.profile
$ echo 'export PATH=$PATH:/opt/spark/bin:/opt/spark/sbin' >> ~/.profile
$ echo "export PYSPARK_PYTHON=/usr/bin/python3" >> ~/.profile

To make sure these new environment variables are reachable within the shell and available to Apache Spark, you also need to run the following command to bring the recent changes into effect.

$ source ~/.profile
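
Because the echo commands append to ~/.profile every time they run, repeating them leaves duplicate lines behind. One possible way to avoid that is a small helper that appends a line only when it is not already present. This is a sketch under stated assumptions: append_once and PROFILE_FILE are hypothetical names, and the example writes to a scratch file rather than your real profile.

```shell
#!/bin/sh
# append_once FILE LINE: add LINE to FILE unless an identical line exists.
# (hypothetical helper, not a standard command)
append_once() {
    file=$1
    line=$2
    touch "$file"
    # -x matches whole lines, -F treats the pattern as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstrated on a scratch file; set PROFILE_FILE to ~/.profile for real use.
PROFILE_FILE=$(mktemp)
append_once "$PROFILE_FILE" 'export SPARK_HOME=/opt/spark'
append_once "$PROFILE_FILE" 'export SPARK_HOME=/opt/spark'
append_once "$PROFILE_FILE" 'export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin'

# The duplicate call above adds nothing, so two lines are printed.
cat "$PROFILE_FILE"
```

The single quotes keep $PATH and $SPARK_HOME unexpanded, so they are resolved when the profile is loaded rather than when the line is written.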

All the Spark-related binaries for starting and stopping the services are under the sbin folder.

$ ls -l /opt/spark

Start Apache Spark in Ubuntu

Run the following commands to start the Spark master service and the worker service.

$ start-master.sh
$ start-workers.sh spark://localhost:7077

Once the services have started, open a browser and go to one of the following URLs to access the Spark web UI. From that page, you can see that the master and worker services have started.

http://localhost:8080/
OR
http://127.0.0.1:8080

You can also check whether spark-shell works properly by launching the spark-shell command.

$ spark-shell

That's it for this article. We will be back with another interesting article soon.