Hadoop Cluster Setup
First, the environment: three machines.
- 192.168.30.149 hadoop149: namenode and jobtracker ### 149 is the slightly better machine
- 192.168.30.150 hadoop150: datanode and tasktracker
- 192.168.30.148 hadoop148: datanode and tasktracker
Configure passwordless SSH login:
- $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
- $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
- My master is on 149, so copy 149's .pub file to 150 and 148, then run the same append command on each of them.
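The key flow can be rehearsed locally before touching the cluster. A minimal sketch, assuming OpenSSH's ssh-keygen is installed; it uses an RSA key in a throwaway directory (recent OpenSSH releases reject DSA keys, so `-t rsa` is shown here where the article uses `-t dsa`):

```shell
# Rehearse the key setup in a temp dir; on the real cluster you would use
# ~/.ssh on hadoop149 and append the .pub file on hadoop150 and hadoop148.
tmp=$(mktemp -d)
ssh-keygen -t rsa -N '' -f "$tmp/id_rsa" -q      # non-interactive, empty passphrase
cat "$tmp/id_rsa.pub" >> "$tmp/authorized_keys"  # the same append is run on each slave
ls "$tmp"
```

On the slaves, the appended `authorized_keys` entry is what lets the master log in without a password.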
The Hadoop version I use is hadoop-0.20.2. Download link: none yet; search Google for it. In a few days I will put everything on a network drive and post the link here.
After downloading, edit the following files:
In /root/hadoop-0.20.2/conf (note that the Hadoop install path must be identical on every machine), add the following entries:
[root@localhost conf]# vim core-site.xml
- fs.default.name hdfs://192.168.30.149:9000 ### the exact meaning is explained later
[root@localhost conf]# vim mapred-site.xml
- mapred.job.tracker hdfs://192.168.30.149:9004
[root@localhost conf]# vim hdfs-site.xml
- dfs.replication 2
[root@localhost conf]# vim masters
- hadoop149
[root@localhost conf]# vim slaves
- hadoop150
- hadoop148
Five files were edited in total; what each of them means will be covered later.
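Each name/value pair above sits inside Hadoop's standard XML configuration skeleton. For reference, a full core-site.xml for this setup would look roughly like this (same value as above; the surrounding layout is assumed from the stock 0.20.2 template):

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.30.149:9000</value>
  </property>
</configuration>
```

mapred-site.xml and hdfs-site.xml use the same skeleton with their respective properties.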
Note that the /etc/hosts file must also be configured, as follows (on 192.168.30.149):
[root@localhost conf]# vim /etc/hosts
- # Do not remove the following line, or various programs
- # that require network functionality will fail.
- 127.0.0.1 localhost.localdomain localhost
- ::1 localhost6.localdomain6 localhost6
- 192.168.30.149 hadoop149
- 192.168.30.150 hadoop150
- 192.168.30.148 hadoop148
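A quick way to sanity-check such a hosts table is to parse it with awk. A small sketch that works on a local copy so it can be run anywhere; on a real node you would point awk at /etc/hosts itself:

```shell
# Write a local copy of the three cluster entries, then check that every
# line maps exactly one IP to one hostname.
cat > /tmp/hosts.sample <<'EOF'
192.168.30.149 hadoop149
192.168.30.150 hadoop150
192.168.30.148 hadoop148
EOF
awk 'NF == 2 {print $2}' /tmp/hosts.sample   # prints the three hostnames
```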
4. Start Hadoop:
Start it with the following simple commands.
A. Format the filesystem:
- # bin/hadoop namenode -format
B. Start Hadoop:
- # bin/start-all.sh
C. Use the example bundled with Hadoop to test whether the startup succeeded:
- # bin/hadoop fs -mkdir input ### create the input directory in the filesystem
- # bin/hadoop fs -put README.txt input ### upload the local README.txt into input
- # bin/hadoop fs -lsr ### list every file in the filesystem
- If the file is listed with a non-zero size, the Hadoop filesystem is up and running.
- # bin/hadoop jar hadoop-0.20.2-examples.jar wordcount input/README.txt output
- ### the results are written to output
- # bin/hadoop jar hadoop-0.20.2-examples.jar wordcount input/1.txt output
11/12/02 17:47:14 INFO input.FileInputFormat: Total input paths to process : 1
11/12/02 17:47:14 INFO mapred.JobClient: Running job: job_201112021743_0001
11/12/02 17:47:15 INFO mapred.JobClient:  map 0% reduce 0%
11/12/02 17:47:22 INFO mapred.JobClient:  map 100% reduce 0%
11/12/02 17:47:34 INFO mapred.JobClient:  map 100% reduce 100%
11/12/02 17:47:36 INFO mapred.JobClient: Job complete: job_201112021743_0001
11/12/02 17:47:36 INFO mapred.JobClient: Counters: 17
11/12/02 17:47:36 INFO mapred.JobClient:   Job Counters
11/12/02 17:47:36 INFO mapred.JobClient:     Launched reduce tasks=1
11/12/02 17:47:36 INFO mapred.JobClient:     Launched map tasks=1
11/12/02 17:47:36 INFO mapred.JobClient:     Data-local map tasks=1
11/12/02 17:47:36 INFO mapred.JobClient:   FileSystemCounters
11/12/02 17:47:36 INFO mapred.JobClient:     FILE_BYTES_READ=32523
11/12/02 17:47:36 INFO mapred.JobClient:     HDFS_BYTES_READ=44253
11/12/02 17:47:36 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=65078
11/12/02 17:47:36 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=23148
11/12/02 17:47:36 INFO mapred.JobClient:   Map-Reduce Framework
11/12/02 17:47:36 INFO mapred.JobClient:     Reduce input groups=2367
11/12/02 17:47:36 INFO mapred.JobClient:     Combine output records=2367
11/12/02 17:47:36 INFO mapred.JobClient:     Map input records=734
11/12/02 17:47:36 INFO mapred.JobClient:     Reduce shuffle bytes=32523
11/12/02 17:47:36 INFO mapred.JobClient:     Reduce output records=2367
11/12/02 17:47:36 INFO mapred.JobClient:     Spilled Records=4734
11/12/02 17:47:36 INFO mapred.JobClient:     Map output bytes=73334
11/12/02 17:47:36 INFO mapred.JobClient:     Combine input records=7508
11/12/02 17:47:36 INFO mapred.JobClient:     Map output records=7508
11/12/02 17:47:36 INFO mapred.JobClient:     Reduce input records=2367
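For intuition about what the WordCount job just did, the same map (split into words) and reduce (count duplicates) steps can be reproduced in miniature with plain Unix tools on a local sample instead of HDFS:

```shell
# Local mimic of the WordCount example: split on whitespace, then count
# duplicate words; MapReduce does the same split and count at cluster scale.
printf 'hello hadoop\nhello world\n' > /tmp/wc_sample.txt
tr -s ' ' '\n' < /tmp/wc_sample.txt | sort | uniq -c | sort -rn
# "hello" is counted twice; "hadoop" and "world" once each
```

The counters in the log above (map input records, reduce output records, and so on) are the cluster-scale equivalents of these line and word counts.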
The status can also be checked from a local browser on port 50070 (the NameNode/HDFS web UI) and port 50030 (the JobTracker web UI); note that the local C:\Windows\System32\drivers\etc\hosts file must be configured first:
- 192.168.30.150 hadoop150
- 192.168.30.149 hadoop149
- 192.168.30.148 hadoop148