Install Hadoop Multinode Cluster using CDH4 in RHEL/CentOS 6.5


Hadoop is an open source programming framework developed by Apache to process big data. It uses HDFS (Hadoop Distributed File System) to store data across all the datanodes in the cluster in a distributed manner, and the mapreduce model to process that data.

Namenode (NN) is the master daemon that controls HDFS, and Jobtracker (JT) is the master daemon for the mapreduce engine.

In this tutorial I am using two CentOS 6.3 VMs, 'master' and 'node' (master and node are my hostnames). The 'master' IP is 172.21.17.175 and the 'node' IP is 172.21.17.188. The following instructions also work on RHEL/CentOS 6.x versions.

## on master ##

 hostname

master
 ifconfig|grep 'inet addr'|head -1

inet addr:172.21.17.175  Bcast:172.21.19.255  Mask:255.255.252.0

## on node ##

 hostname

node
 ifconfig|grep 'inet addr'|head -1

inet addr:172.21.17.188  Bcast:172.21.19.255  Mask:255.255.252.0

First make sure that all the cluster hosts are listed in the '/etc/hosts' file (on each node), if you do not have DNS set up.

 cat /etc/hosts

172.21.17.175 master
172.21.17.188 node
 cat /etc/hosts

172.21.17.197 qabox
172.21.17.176 ansible-ground
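The check above can also be scripted by looking up each cluster hostname in the hosts file. The sketch below is only an illustration: the function name `hosts_has_entry` is hypothetical, and it assumes the usual hosts-file layout (address first, then one or more names on the same line).

```shell
# Hypothetical helper: succeed if hostname $2 appears in hosts file $1.
# Assumes the standard layout: IP address first, then one or more names.
hosts_has_entry() (
  file=$1
  name=$2
  awk -v name="$name" '
    $1 ~ /^#/ { next }                  # skip comment lines
    {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break            # ignore trailing comments
        if ($i == name) found = 1
      }
    }
    END { exit (found ? 0 : 1) }
  ' "$file"
)

# Example (not run here): report any cluster host missing from /etc/hosts.
# for h in master node; do
#   hosts_has_entry /etc/hosts "$h" || echo "missing: $h"
# done
```

Running the commented loop on every machine should print nothing once both 'master' and 'node' resolve locally.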

Installing Hadoop Multinode Cluster in CentOS

We use the official CDH repository to install CDH4 on all the hosts (Master and Node) in the cluster.

Go to the official CDH download page and grab the CDH4 (i.e. 4.6) version, or use the following wget commands to download the repository package and install it.

## on 32-bit System ##

# wget http://archive.cloudera.com/cdh4/one-click-install/redhat/6/i386/cloudera-cdh-4-0.i386.rpm
# yum --nogpgcheck localinstall cloudera-cdh-4-0.i386.rpm
## on 64-bit System ##

# wget http://archive.cloudera.com/cdh4/one-click-install/redhat/6/x86_64/cloudera-cdh-4-0.x86_64.rpm
# yum --nogpgcheck localinstall cloudera-cdh-4-0.x86_64.rpm

Before installing the Hadoop Multinode Cluster, add the Cloudera Public GPG Key to your repository by running one of the following commands, according to your system architecture.

## on 32-bit System ##

# rpm --import http://archive.cloudera.com/cdh4/redhat/6/i386/cdh/RPM-GPG-KEY-cloudera
## on 64-bit System ##

# rpm --import http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
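Since only the architecture part of these URLs changes, the choice between the 32-bit and 64-bit commands can be derived from `uname -m`. The following is a rough sketch, not part of the official procedure: the function names are hypothetical, and the URLs are simply the ones listed above with the architecture substituted.

```shell
# Hypothetical helper: map a machine type (as printed by `uname -m`) to the
# architecture directory used in the repository URLs above.
cdh4_arch() {
  case "$1" in
    x86_64)              echo x86_64 ;;
    i386|i486|i586|i686) echo i386 ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# URL of the one-click-install repository package for the given machine type.
cdh4_repo_rpm_url() (
  arch=$(cdh4_arch "$1") || exit 1
  echo "http://archive.cloudera.com/cdh4/one-click-install/redhat/6/$arch/cloudera-cdh-4-0.$arch.rpm"
)

# URL of the Cloudera public GPG key for the given machine type.
cdh4_gpg_key_url() (
  arch=$(cdh4_arch "$1") || exit 1
  echo "http://archive.cloudera.com/cdh4/redhat/6/$arch/cdh/RPM-GPG-KEY-cloudera"
)

# Example (not run here):
# wget "$(cdh4_repo_rpm_url "$(uname -m)")"
# rpm --import "$(cdh4_gpg_key_url "$(uname -m)")"
```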

Next, run the following commands to install and set up the JobTracker and NameNode on the Master server.

 yum clean all 
 yum install hadoop-0.20-mapreduce-jobtracker
 yum clean all
 yum install hadoop-hdfs-namenode

Again, run the following commands on the Master server to set up the secondary name node.

 yum clean all 
 yum install hadoop-hdfs-secondarynamenode

Next, set up the tasktracker & datanode on all cluster hosts (Node) except the JobTracker, NameNode, and Secondary (or Standby) NameNode hosts (on 'node' in this case).

 yum clean all
 yum install hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode

You can install the Hadoop client on a separate machine (in this case I have installed it on the datanode, but you can install it on any machine).

 yum install hadoop-client

Now that we are done with the above steps, let's move on to deploying hdfs (to be done on all the nodes).

Copy the default configuration to a new directory under /etc/hadoop (on each node in the cluster).

 cp -r /etc/hadoop/conf.dist /etc/hadoop/conf.my_cluster

Use the alternatives command to set your custom directory, as follows (on each node in the cluster).

 alternatives --verbose --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.my_cluster 50
reading /var/lib/alternatives/hadoop-conf

 alternatives --set hadoop-conf /etc/hadoop/conf.my_cluster

Now open the 'core-site.xml' file and update "fs.defaultFS" on each node in the cluster.

 cat /etc/hadoop/conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
 <name>fs.defaultFS</name>
 <value>hdfs://master/</value>
</property>
</configuration>

Next, update "dfs.permissions.superusergroup" in hdfs-site.xml on each node in the cluster.

 cat /etc/hadoop/conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
     <name>dfs.name.dir</name>
     <value>/var/lib/hadoop-hdfs/cache/hdfs/dfs/name</value>
  </property>
  <property>
     <name>dfs.permissions.superusergroup</name>
     <value>hadoop</value>
  </property>
</configuration>

Note: Please make sure that the above configuration is present on all the nodes (do it on one node and run scp to copy it to the rest of the nodes).
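One way to confirm that a copied configuration really matches is to read a single property back out of the file on each node and compare the values. The helper below is a minimal sketch, not a Hadoop tool: `hadoop_conf_get` is a hypothetical name, and it assumes each `<name>` and `<value>` element sits on its own line, as in the listings above.

```shell
# Hypothetical helper: print the value of property $2 from Hadoop XML file $1.
# Assumes <name>...</name> and <value>...</value> each fit on a single line.
hadoop_conf_get() {
  awk -v want="$2" '
    /<name>/ {
      name = $0
      sub(/.*<name>/, "", name)
      sub(/<\/name>.*/, "", name)
    }
    /<value>/ && name == want {
      value = $0
      sub(/.*<value>/, "", value)
      sub(/<\/value>.*/, "", value)
      print value
      exit
    }
  ' "$1"
}

# Example (not run here): compare the value on both machines.
# hadoop_conf_get /etc/hadoop/conf/core-site.xml fs.defaultFS
# ssh node cat /etc/hadoop/conf/core-site.xml > /tmp/node-core-site.xml
# hadoop_conf_get /tmp/node-core-site.xml fs.defaultFS
```

If the property is missing, the function prints nothing, which makes a forgotten file easy to spot.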

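Update "dfs.name.dir or dfs.namenode.name.dir" in 'hdfs-site.xml' on the NameNode (Master), and "dfs.datanode.data.dir" on the DataNode (Node). Please change the values as highlighted; the first listing below is from the Master and the second is from the Node.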

 cat /etc/hadoop/conf/hdfs-site.xml
<property>
 <name>dfs.namenode.name.dir</name>
 <value>file:///data/1/dfs/nn,/nfsmount/dfs/nn</value>
</property>
 cat /etc/hadoop/conf/hdfs-site.xml
<property>
 <name>dfs.datanode.data.dir</name>
 <value>file:///data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn</value>
</property>

Run the commands below to create the directory structure & manage user permissions on the Namenode (Master) and Datanode (Node) machines.

 mkdir -p /data/1/dfs/nn /nfsmount/dfs/nn
 chmod 700 /data/1/dfs/nn /nfsmount/dfs/nn
  mkdir -p /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
  chown -R hdfs:hdfs /data/1/dfs/nn /nfsmount/dfs/nn /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
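The directories created here have to match the comma-separated values configured in 'hdfs-site.xml'. As a small sketch of how the two can be kept in step, the hypothetical function `dfs_dirs_from_value` below turns such a value into one local path per line; it assumes only a leading `file://` scheme needs to be stripped.

```shell
# Hypothetical helper: split a comma-separated dfs.*.dir value into local
# paths, one per line, dropping a leading "file://" scheme if present.
dfs_dirs_from_value() {
  printf '%s\n' "$1" | tr ',' '\n' | sed 's|^file://||'
}

# Example (not run here): create every configured datanode directory.
# dfs_dirs_from_value 'file:///data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn' |
#   while read -r dir; do
#     mkdir -p "$dir" && chown -R hdfs:hdfs "$dir"
#   done
```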

Format the Namenode (on Master) by issuing the following command.

 sudo -u hdfs hdfs namenode -format

Add the following property to the hdfs-site.xml file on the Master and replace the value as shown.

<property>
  <name>dfs.namenode.http-address</name>
  <value>172.21.17.175:50070</value>
  <description>
    The address and port on which the NameNode UI will listen.
  </description>
</property>

Note: In our case the value should be the IP address of the master VM.

Now let's deploy MRv1 (Map-reduce version 1). Open the 'mapred-site.xml' file and set the following values as shown.

 cp hdfs-site.xml mapred-site.xml
 vi mapred-site.xml
 cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<property>
 <name>mapred.job.tracker</name>
 <value>master:8021</value>
</property>
</configuration>

Next, copy the 'mapred-site.xml' file to the node machine using the following scp command.

 scp /etc/hadoop/conf/mapred-site.xml node:/etc/hadoop/conf/
mapred-site.xml                                                                      100%  200     0.2KB/s   00:00

Now configure the local storage directories to be used by the MRv1 daemons. Again open the 'mapred-site.xml' file and make the changes shown below for each TaskTracker.

<property>
 <name>mapred.local.dir</name>
 <value>/data/1/mapred/local,/data/2/mapred/local,/data/3/mapred/local</value>
</property>

After specifying these directories in the 'mapred-site.xml' file, you must create the directories and assign the correct file permissions to them on each node in your cluster.

mkdir -p /data/1/mapred/local /data/2/mapred/local /data/3/mapred/local /data/4/mapred/local
chown -R mapred:hadoop /data/1/mapred/local /data/2/mapred/local /data/3/mapred/local /data/4/mapred/local

Now run the following command to start HDFS on every node in the cluster.

 for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x start ; done
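The loop above starts whichever HDFS init scripts are installed on the machine it runs on. A slightly more defensive variant, sketched below under the assumption that the scripts live in /etc/init.d as shown, lists them with a glob rather than parsing `ls`; the function name `list_hdfs_services` is hypothetical.

```shell
# Hypothetical helper: print the names of the hadoop-hdfs-* init scripts
# found in directory $1 (defaults to /etc/init.d), one per line.
list_hdfs_services() (
  dir=${1:-/etc/init.d}
  for path in "$dir"/hadoop-hdfs-*; do
    [ -e "$path" ] || continue    # the glob matched nothing
    basename "$path"
  done
)

# Example (not run here): start each service and report failures.
# for svc in $(list_hdfs_services); do
#   sudo service "$svc" start || echo "failed to start $svc" >&2
# done
```

On the Master this should list the namenode and secondary namenode scripts, and on the Node the datanode script, depending on which packages were installed.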

You need to create /tmp in HDFS with the proper permissions, as shown below.

 sudo -u hdfs hadoop fs -mkdir /tmp
 sudo -u hdfs hadoop fs -chmod -R 1777 /tmp
 sudo -u hdfs hadoop fs -mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
 sudo -u hdfs hadoop fs -chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
 sudo -u hdfs hadoop fs -chown -R mapred /var/lib/hadoop-hdfs/cache/mapred

Now verify the HDFS file structure.

 sudo -u hdfs hadoop fs -ls -R /

drwxrwxrwt   - hdfs   hadoop          0 2014-05-29 09:58 /tmp
drwxr-xr-x   - hdfs   hadoop          0 2014-05-29 09:59 /var
drwxr-xr-x   - hdfs   hadoop          0 2014-05-29 09:59 /var/lib
drwxr-xr-x   - hdfs   hadoop          0 2014-05-29 09:59 /var/lib/hadoop-hdfs
drwxr-xr-x   - hdfs   hadoop          0 2014-05-29 09:59 /var/lib/hadoop-hdfs/cache
drwxr-xr-x   - mapred hadoop          0 2014-05-29 09:59 /var/lib/hadoop-hdfs/cache/mapred
drwxr-xr-x   - mapred hadoop          0 2014-05-29 09:59 /var/lib/hadoop-hdfs/cache/mapred/mapred
drwxrwxrwt   - mapred hadoop          0 2014-05-29 09:59 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
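Rather than checking this listing by eye, the mode and owner of a given path can be tested from the same output. The sketch below assumes the column order shown above (mode, replication, owner, group, size, date, time, path); `hdfs_entry_matches` is a hypothetical name, not a Hadoop command.

```shell
# Hypothetical helper: read `hadoop fs -ls -R` output on stdin and succeed
# only if path $1 is listed with mode string $2 and owner $3.
hdfs_entry_matches() {
  awk -v path="$1" -v mode="$2" -v owner="$3" '
    $NF == path { ok = ($1 == mode && $3 == owner) }
    END { exit (ok ? 0 : 1) }
  '
}

# Example (not run here):
# sudo -u hdfs hadoop fs -ls -R / |
#   hdfs_entry_matches /tmp drwxrwxrwt hdfs || echo "/tmp is not set up correctly"
```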

After you start HDFS and create '/tmp', but before you start the JobTracker, please create the HDFS directory specified by the 'mapred.system.dir' parameter (by default ${hadoop.tmp.dir}/mapred/system) and change its owner to mapred.

 sudo -u hdfs hadoop fs -mkdir -p /tmp/mapred/system
 sudo -u hdfs hadoop fs -chown mapred:hadoop /tmp/mapred/system

To start MapReduce, please start the TT and JT services.

 service hadoop-0.20-mapreduce-tasktracker start

Starting Tasktracker:                               [  OK  ]
starting tasktracker, logging to /var/log/hadoop-0.20-mapreduce/hadoop-hadoop-tasktracker-node.out
 service hadoop-0.20-mapreduce-jobtracker start

Starting Jobtracker:                                [  OK  ]

starting jobtracker, logging to /var/log/hadoop-0.20-mapreduce/hadoop-hadoop-jobtracker-master.out

Next, create a home directory for each hadoop user. It is recommended that you do this on the NameNode; for example:

 sudo -u hdfs hadoop fs -mkdir  /user/<user>
 sudo -u hdfs hadoop fs -chown <user> /user/<user>

Note: where <user> is the Linux username of each user.

Alternatively, you can create the home directory as follows.

 sudo -u hdfs hadoop fs -mkdir /user/$USER
 sudo -u hdfs hadoop fs -chown $USER /user/$USER

Open your browser and type the url http://ip_address_of_namenode:50070 to access the Namenode.

Open another tab in your browser and type the url http://ip_address_of_jobtracker:50030 to access the JobTracker.
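For scripted checks, the two web UI addresses can be built from the daemon name and host. This is just a sketch using the ports mentioned above (50070 for the Namenode, 50030 for the JobTracker); the function name `hadoop_ui_url` is hypothetical.

```shell
# Hypothetical helper: print the web UI URL for daemon $1 ("namenode" or
# "jobtracker") running on host or IP address $2.
hadoop_ui_url() (
  case "$1" in
    namenode)   port=50070 ;;
    jobtracker) port=50030 ;;
    *) echo "unknown daemon: $1" >&2; exit 1 ;;
  esac
  echo "http://$2:$port/"
)

# Example (not run here): fetch only the response headers of the Namenode UI.
# curl -sI "$(hadoop_ui_url namenode 172.21.17.175)"
```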

This procedure has been successfully tested on RHEL/CentOS 5.X/6.X. Please comment below if you face any issues with the installation, and I will help you out with solutions.