Installing Hadoop 2.0.0-alpha on Red Hat Linux 5
I. Installation environment and preparation
Hardware: 4 virtual machines: master1: 192.168.1.220, master2: 192.168.1.221, slave1: 192.168.1.222, slave2: 192.168.1.223
http://code.google.com/p/hdfs-fuse/downloads/list fuse-hdfs
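OS: Red Hat Linux 5
Hadoop version: hadoop-2.0.0-alpha (the latest release at the time of writing); the package is hadoop-2.0.0-alpha.tar.gz
Download URL: http://apache.etoak.com/hadoop/common/hadoop-2.0.0-alpha/
JDK version: jdk-6u6-linux-i586.bin (JDK 1.6 is the minimum requirement)
Installing the virtual machines and Linux itself is not covered here; a web search turns up plenty of guides.
Create the required directories: mkdir /usr/hadoop (Hadoop installation directory) and mkdir /usr/java (JDK installation directory)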
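II. Install the JDK (same on every node)
1. Upload the downloaded jdk-6u6-linux-i586.bin to /usr/java over SSH
2. Go to the JDK installation directory with cd /usr/java and run chmod +x jdk-6u6-linux-i586.bin
3. Run ./jdk-6u6-linux-i586.bin (press Enter throughout and answer yes to every yes/no prompt; the installer prints Done when it succeeds)
4. Configure the environment variables: run cd /etc, then vi profile, and append the lines below at the end of the file. JAVA_HOME must match the directory the installer actually created under /usr/java (jdk-6u6 normally unpacks to jdk1.6.0_06, so adjust the path if yours differs)
export JAVA_HOME=/usr/java/jdk1.6.0_27
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
export PATH=$JAVA_HOME/bin:$PATH
5. Run chmod +x profile to make it executable
6. Run source profile so the settings take effect immediately
7. Run java -version to check that the installation succeeded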
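Before relying on java -version, it can help to confirm that JAVA_HOME points at a directory that really contains the JDK. The sketch below is a hypothetical helper (check_java_home is not part of the JDK); it only tests whether bin/java exists and is executable under the given directory.

```shell
#!/bin/sh
# check_java_home DIR: hypothetical helper, not shipped with the JDK.
# Succeeds only when DIR/bin/java exists and is executable.
check_java_home() {
    dir="$1"
    if [ -x "$dir/bin/java" ]; then
        echo "OK: $dir"
    else
        echo "MISSING: $dir/bin/java" >&2
        return 1
    fi
}
```

After source profile, a typical call would be check_java_home "$JAVA_HOME" && java -version.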
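III. Change the hostname (configure every node the same way, using its own name)
1. Connect to the master node 192.168.1.220 and edit network: run cd /etc/sysconfig, then vi network, and set HOSTNAME=master1
2. Edit the hosts file: run cd /etc, then vi hosts, and append at the end of the file: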
192.168.1.220 master1
192.168.1.221 master2
192.168.1.222 slave1
192.168.1.223 slave2
3. Run hostname master1
4. Run exit and reconnect; the prompt now shows the new hostname
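Since the same four hosts entries have to be added on every node, the edit can be scripted. This is a minimal sketch, assuming a hypothetical function name add_cluster_hosts; it appends each entry only when that exact line is not already in the file, so running it twice does no harm.

```shell
#!/bin/sh
# add_cluster_hosts FILE: append the four cluster entries to FILE,
# skipping any line that is already present (safe to run twice).
# The function name is illustrative; pass /etc/hosts on a real node.
add_cluster_hosts() {
    hosts_file="$1"
    while read -r ip name; do
        entry="$ip $name"
        # -x: match the whole line, -F: fixed string, -q: quiet
        grep -qxF "$entry" "$hosts_file" 2>/dev/null || echo "$entry" >> "$hosts_file"
    done <<EOF
192.168.1.220 master1
192.168.1.221 master2
192.168.1.222 slave1
192.168.1.223 slave2
EOF
}
```

On a node this would be run as root with add_cluster_hosts /etc/hosts.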
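IV. Configure passwordless SSH login
1. How passwordless SSH works, in brief: first, a key pair (one public key and one private key) is generated on the master, and the public key is copied to every slave.
Then, when the master connects to a slave over SSH, the slave generates a random number, encrypts it with the master's public key, and sends it to the master.
Finally, the master decrypts it with its private key and sends the result back to the slave; once the slave confirms the decrypted value is correct, it lets the master connect without a password.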
2. Steps:
(1) Run ssh-keygen -t rsa and press Enter at every prompt. To look at the newly generated passphrase-less key pair, run cd ~/.ssh and then ll
(2) Append id_rsa.pub to the authorized keys: cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
(3) Fix the permissions: chmod 600 ~/.ssh/authorized_keys
(4) Make sure cat /etc/ssh/sshd_config shows the following settings
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
If you had to change any of them, restart the SSH service so the change takes effect: service sshd restart
(5) Copy the public key to every slave machine: scp ~/.ssh/id_rsa.pub 192.168.1.222:~/ then answer yes and enter the slave's password
(6) On the slave, create the .ssh directory: mkdir ~/.ssh and then chmod 700 ~/.ssh (skip the mkdir if the directory already exists)
(7) Append the key to the authorized_keys file: cat ~/id_rsa.pub >> ~/.ssh/authorized_keys and then chmod 600 ~/.ssh/authorized_keys
(8) Repeat step (4) on the slave
(9) Verify: on the master run ssh 192.168.1.222; if the hostname in the prompt changes from master1 to slave1, it worked. Finally delete the copied id_rsa.pub on the slave: rm ~/id_rsa.pub
3. Follow the steps above for master1, master2, slave1 and slave2, so that passwordless login works between every master and every slave
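The slave-side steps (create .ssh, append the key, fix permissions) have to be repeated for every master/slave pair, so they are worth wrapping in a function. The sketch below assumes a hypothetical helper name, install_pubkey, and an optional second argument for the home directory so it can be tried on a scratch directory first. It appends the key only if that exact line is not already present.

```shell
#!/bin/sh
# install_pubkey PUBKEY_FILE [HOME_DIR]: hypothetical helper for the slave side.
# Creates HOME_DIR/.ssh if needed, appends the key once, and sets the
# permissions used in the steps above (700 on .ssh, 600 on authorized_keys).
install_pubkey() {
    pubkey_file="$1"
    home_dir="${2:-$HOME}"
    ssh_dir="$home_dir/.ssh"
    auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$auth_file"
    # Append only if this exact key line is not already authorized.
    grep -qxF -f "$pubkey_file" "$auth_file" || cat "$pubkey_file" >> "$auth_file"
    chmod 600 "$auth_file"
}
```

On a slave, after the scp in step (5), this would be called as install_pubkey ~/id_rsa.pub.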
V. Install Hadoop (same on every node)
1. Upload hadoop-2.0.0-alpha.tar.gz to the Hadoop installation directory /usr/hadoop
2. Unpack it: tar -zxvf hadoop-2.0.0-alpha.tar.gz
3. Create the tmp directory: mkdir /usr/hadoop/tmp
4. Configure the environment variables: vi /etc/profile
export HADOOP_DEV_HOME=/usr/hadoop/hadoop-2.0.0-alpha
export PATH=$PATH:$HADOOP_DEV_HOME/bin
export PATH=$PATH:$HADOOP_DEV_HOME/sbin
export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}
export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}
export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}
export YARN_HOME=${HADOOP_DEV_HOME}
export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
export HDFS_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
5. Configure Hadoop
The configuration files are in /usr/hadoop/hadoop-2.0.0-alpha/etc/hadoop
(1) Create and configure hadoop-env.sh
vi /usr/hadoop/hadoop-2.0.0-alpha/etc/hadoop/hadoop-env.sh and append export JAVA_HOME=/usr/java/jdk1.6.0_27 at the end (the same JDK directory used in /etc/profile)
(2) Configure core-site.xml: add the following properties inside the <configuration> element
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/hadoop/tmp</value>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>
(3) Create and configure slaves: vi slaves and add the following lines
192.168.1.222
192.168.1.223
(4) Configure hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/hadoop/hdfs/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.federation.nameservice.id</name>
    <value>ns1</value>
  </property>
  <property>
    <name>dfs.namenode.backup.address.ns1</name>
    <value>192.168.1.223:50100</value>
  </property>
  <property>
    <name>dfs.namenode.backup.http-address.ns1</name>
    <value>192.168.1.223:50105</value>
  </property>
  <property>
    <name>dfs.federation.nameservices</name>
    <value>ns1,ns2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1</name>
    <value>192.168.1.220:9000</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns2</name>
    <value>192.168.1.221:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1</name>
    <value>192.168.1.220:23001</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns2</name>
    <value>192.168.1.221:13001</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/hadoop/hdfs/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address.ns1</name>
    <value>192.168.1.220:23002</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address.ns2</name>
    <value>192.168.1.221:23002</value>
  </property>
  <!-- Note: the next two keys are already set above (port 23002); the later value normally takes precedence -->
  <property>
    <name>dfs.namenode.secondary.http-address.ns1</name>
    <value>192.168.1.220:23003</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address.ns2</name>
    <value>192.168.1.221:23003</value>
  </property>
</configuration>
(5) Configure yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>192.168.1.220:18040</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>192.168.1.220:18030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>192.168.1.220:18088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>192.168.1.220:18025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>192.168.1.220:18141</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
</configuration>
VI. Start the Hadoop cluster and test WordCount
1. Format the NameNodes: on each of the two masters run: hadoop namenode -format -clusterid eric
2. Start Hadoop: on master1 run start-all.sh, or run start-dfs.sh followed by start-yarn.sh
3. Run jps on every node; output like the following means the daemons started successfully:
[root@master1 hadoop]# jps
1956 Bootstrap
4183 Jps
3938 ResourceManager
3845 SecondaryNameNode
3652 NameNode
[root@master2 ~]# jps
3778 Jps
1981 Bootstrap
3736 SecondaryNameNode
3633 NameNode
[root@slave1 ~]# jps
3766 Jps
3675 NodeManager
3551 DataNode
[root@slave2 ~]# jps
3675 NodeManager
3775 Jps
3551 DataNode
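Comparing the jps listings by eye on four nodes is easy to get wrong, so the check can be scripted. This is a minimal sketch with a hypothetical helper, missing_daemons, which reads jps output on standard input and prints each expected daemon name that is absent. The daemon names are taken from the listings above.

```shell
#!/bin/sh
# missing_daemons NAME...: read `jps` output on stdin and print every
# expected daemon name that does not appear. Prints nothing when all are up.
# Hypothetical helper; the expected names come from the listings above.
missing_daemons() {
    jps_output=$(cat)
    for name in "$@"; do
        # jps prints "<pid> <MainClass>"; compare the second field exactly
        # so that NameNode does not match SecondaryNameNode.
        echo "$jps_output" | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' \
            || echo "$name"
    done
}
```

On master1 this could be run as jps | missing_daemons NameNode SecondaryNameNode ResourceManager, and on each slave as jps | missing_daemons DataNode NodeManager. Empty output means nothing is missing.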
4. On master1, create the input directory: hadoop fs -mkdir hdfs://192.168.1.220:9000/input
5. Copy all the .txt files in /usr/hadoop/hadoop-2.0.0-alpha/ into that directory on HDFS with the following command
hadoop fs -put /usr/hadoop/hadoop-2.0.0-alpha/*.txt hdfs://192.168.1.220:9000/input
6. On master1, run the wordcount example that ships with Hadoop:
cd /usr/hadoop/hadoop-2.0.0-alpha/share/hadoop/mapreduce
hadoop jar hadoop-mapreduce-examples-2.0.0-alpha.jar wordcount hdfs://192.168.1.220:9000/input hdfs://192.168.1.220:9000/output
7. On master1, inspect the results with the following commands:
[root@master1 hadoop]# hadoop fs -ls hdfs://192.168.1.220:9000/output
Found 2 items
-rw-r--r-- 2 root supergroup 0 2012-06-29 22:59 hdfs://192.168.1.220:9000/output/_SUCCESS
-rw-r--r-- 2 root supergroup 8739 2012-06-29 22:59 hdfs://192.168.1.220:9000/output/part-r-00000
[root@master1 hadoop]# hadoop fs -ls hdfs://192.168.1.220:9000/input
Found 3 items
-rw-r--r-- 2 root supergroup 15164 2012-06-29 22:55 hdfs://192.168.1.220:9000/input/LICENSE.txt
-rw-r--r-- 2 root supergroup 101 2012-06-29 22:55 hdfs://192.168.1.220:9000/input/NOTICE.txt
-rw-r--r-- 2 root supergroup 1366 2012-06-29 22:55 hdfs://192.168.1.220:9000/input/README.txt
[root@master1 hadoop]# hadoop fs -cat hdfs://192.168.1.220:9000/output/part-r-00000
This last command prints the count for each word.
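The wordcount output is plain text with one word and its count per line, separated by a tab, so ordinary shell tools can rank it. The sketch below assumes a hypothetical helper name, top_words, and shows only the most frequent words instead of the whole file.

```shell
#!/bin/sh
# top_words [N]: read wordcount output ("word<TAB>count" per line) on stdin
# and print the N most frequent words (default 10), highest count first.
top_words() {
    n="${1:-10}"
    # Sort numerically on the count column, descending; break ties by word.
    sort -t "$(printf '\t')" -k2,2nr -k1,1 | head -n "$n"
}
```

For example: hadoop fs -cat hdfs://192.168.1.220:9000/output/part-r-00000 | top_words 20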
8. The NameNode status page can be opened in a browser such as IE: http://192.168.1.220:23001/dfshealth.jsp
That completes the whole process.
References: http://www.cnblogs.com/xia520pi/archive/2012/05/16/2503949.html
http://blog.csdn.net/azhao_dn/article/details/7480201
http://www.cnblogs.com/MGGOON/archive/2012/03/14/2396481.html
http://www.haogongju.net/art/763686 and the official website