
Setting Up a Distributed Hadoop Environment



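1. Installing and configuring Java

Hadoop is built on Java, so the first step is to install and configure a Java environment. Download the JDK from the official site; I used version 1.8. On a Mac you can copy it to the Linux virtual machine from the terminal with scp.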

danieldu@daniels-MacBook-Pro-857 ~/Downloads scp jdk-8u121-linux-x64.tar.gz root@hadoop100:/opt/software
root@hadoop100's password:
danieldu@daniels-MacBook-Pro-857 ~/Downloads

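I actually use a handy tool on the Mac, ForkLift. It connects to the remote Linux machine over SFTP, so you can drag files across just as you would on the local machine. It also seems that configuration files can be opened remotely in Sublime on the Mac and edited there, instead of using vi on Linux :)

Then, on the Linux virtual machine (logged in over ssh), extract the archive into /opt/modules: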

[root@hadoop100 include]# tar -zxvf /opt/software/jdk-8u121-linux-x64.tar.gz -C /opt/modules/

Next, set the environment variables: open /etc/profile, add JAVA_HOME, and extend PATH. You can use vi, or a remote-editing tool such as ForkLift if you have one installed, which is more convenient.

vi /etc/profile

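Press Shift+G to jump to the end of the file, press i to enter insert mode, and add the following. Make sure the path is correct.

#JAVA_HOME
export JAVA_HOME=/opt/modules/jdk1.8.0_121
export PATH=$PATH:$JAVA_HOME/bin

Press Esc, then type :wq to save and quit.

Run the following command to apply the change: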

[root@hadoop100 include]# source /etc/profile

Check that Java was installed correctly. If the version information is printed, the installation succeeded.

[root@hadoop100 include]# java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
[root@hadoop100 include]#
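The /etc/profile edit above can also be made without opening an editor. The following is a minimal sketch, not part of the original walkthrough: `append_once` is a hypothetical helper name, and the only thing it does is append a line when that exact line is not already in the file, so running the setup twice does not duplicate the exports.

```shell
#!/usr/bin/env bash
# append_once LINE FILE
# Hypothetical helper (not from the original article): append LINE to FILE
# only if that exact line is not already present.
append_once() {
  local line="$1" file="$2"
  # -x: match the whole line, -F: treat LINE as a fixed string, -q: quiet
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (as root), with the paths used in this article:
#   append_once 'export JAVA_HOME=/opt/modules/jdk1.8.0_121' /etc/profile
#   append_once 'export PATH=$PATH:$JAVA_HOME/bin' /etc/profile
#   source /etc/profile
```

The single quotes in the usage lines matter: they keep $PATH and $JAVA_HOME unexpanded so that the literal text lands in /etc/profile.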

2. Installing and configuring Hadoop

Installing Hadoop is much the same: copy the Hadoop tarball to Linux, extract it, and set the environment variables. Then use the xsync script prepared earlier to push the changes to the other machines in the cluster. If you are not sure how xcall and xsync are written, see my earlier article. After that, every machine in the cluster is set up.

[root@hadoop100 include]# tar -zxvf /opt/software/hadoop-2.7.3.tar.gz -C /opt/modules/
[root@hadoop100 include]# vi /etc/profile

Add HADOOP_HOME below the existing entries:

#JAVA_HOME
export JAVA_HOME=/opt/modules/jdk1.8.0_121
export PATH=$PATH:$JAVA_HOME/bin
#HADOOP_HOME
export HADOOP_HOME=/opt/modules/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

[root@hadoop100 include]# source /etc/profile

Sync the changes to the other machines in the cluster:

[root@hadoop100 include]# xsync /etc/profile
[root@hadoop100 include]# xcall source /etc/profile
[root@hadoop100 include]# xsync hadoop-2.7.3/
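The xsync script itself is not shown in this article. For readers without it, here is a rough stand-in built on rsync. Everything in it is an assumption: the name `xsync_sketch`, the `HOSTS` list, the `root` login, and the `DRY_RUN` switch are illustrative and may differ from the original script. It copies one file or directory to the same absolute location on each host.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for xsync (the original script is not shown here).
# HOSTS and DRY_RUN are assumptions of this sketch.
HOSTS="${HOSTS:-hadoop101 hadoop102 hadoop103 hadoop104}"

xsync_sketch() {
  local src="$1" abs_dir name host
  # Resolve the parent directory to an absolute path so the copy lands
  # in the same place on every remote host.
  abs_dir="$(cd "$(dirname "$src")" && pwd)" || return 1
  name="$(basename "$src")"
  for host in $HOSTS; do
    if [ -n "${DRY_RUN:-}" ]; then
      # Only print what would be executed.
      echo "rsync -a $abs_dir/$name root@$host:$abs_dir/"
    else
      rsync -a "$abs_dir/$name" "root@$host:$abs_dir/"
    fi
  done
}

# Usage:
#   DRY_RUN=1 xsync_sketch /etc/profile    # preview the rsync commands
#   xsync_sketch /opt/modules/hadoop-2.7.3 # actually copy
```

Running it once with DRY_RUN=1 is a cheap way to check the target paths before anything is copied.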

3. Configuring the distributed Hadoop cluster

Next, configure the Hadoop cluster. The roles are assigned to the machines as follows; hadoop100 admittedly carries a rather heavy load.

[Figure: cluster role assignment]

Edit mapred-env.sh, yarn-env.sh, and hadoop-env.sh under /opt/modules/hadoop-2.7.3/etc/hadoop/ and set JAVA_HOME in each of these shell files to the real absolute path.

export JAVA_HOME=/opt/modules/jdk1.8.0_121
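Editing three files by hand is easy to get wrong, so the same change can be scripted. This is a sketch under assumptions: `set_java_home` is a hypothetical helper, and it assumes each file either already has an `export JAVA_HOME=` line (possibly commented out) or has none at all.

```shell
#!/usr/bin/env bash
# set_java_home FILE JDK_PATH
# Hypothetical helper: rewrite an existing (possibly commented-out)
# "export JAVA_HOME=" line in FILE, or append one if there is none.
set_java_home() {
  local file="$1" jdk="$2" tmp
  tmp="$(mktemp)" || return 1
  if grep -qE '^[[:space:]]*#?[[:space:]]*export JAVA_HOME=' "$file"; then
    sed -E "s|^[[:space:]]*#?[[:space:]]*export JAVA_HOME=.*|export JAVA_HOME=$jdk|" "$file" > "$tmp"
  else
    { cat "$file"; printf 'export JAVA_HOME=%s\n' "$jdk"; } > "$tmp"
  fi
  # Write back through cat so the file keeps its permissions.
  cat "$tmp" > "$file" && rm -f "$tmp"
}

# Usage, with the paths from this article:
#   for f in mapred-env.sh yarn-env.sh hadoop-env.sh; do
#     set_java_home "/opt/modules/hadoop-2.7.3/etc/hadoop/$f" /opt/modules/jdk1.8.0_121
#   done
```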

Open /opt/modules/hadoop-2.7.3/etc/hadoop/core-site.xml and edit it so that it contains the following:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop100:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/modules/hadoop-2.7.3/data/tmp</value>
  </property>
</configuration>

Edit /opt/modules/hadoop-2.7.3/etc/hadoop/hdfs-site.xml and set the HDFS replication factor to 5, because my cluster consists of 5 virtual machines and every machine acts as a DataNode. For now the secondary NameNode also runs on hadoop100. That is not ideal; it would be better to put it on a different machine from the primary NameNode.

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>5</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop100:50090</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
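HDFS keeps at most one replica of a block per DataNode, so a replication factor larger than the number of DataNodes leaves blocks under-replicated. A quick sanity check can be sketched as below; `check_replication` is a hypothetical helper, and it assumes the worker list is a plain text file with one host per line.

```shell
#!/usr/bin/env bash
# check_replication REPLICATION HOSTS_FILE
# Hypothetical helper: compare the replication factor with the number of
# worker hosts listed (one per line) in HOSTS_FILE.
check_replication() {
  local replication="$1" hosts_file="$2" nodes
  # Count lines that contain at least one non-blank character.
  nodes="$(grep -c '[^[:space:]]' "$hosts_file")"
  if [ "$replication" -gt "$nodes" ]; then
    echo "replication=$replication exceeds $nodes DataNode(s)"
    return 1
  fi
  echo "replication=$replication fits $nodes DataNode(s)"
}

# Usage on a configured node (hdfs getconf reads the effective value):
#   check_replication "$(hdfs getconf -confKey dfs.replication)" \
#     /opt/modules/hadoop-2.7.3/etc/hadoop/slaves
```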

YARN is Hadoop's central resource management service, and it runs on hadoop100. Edit /opt/modules/hadoop-2.7.3/etc/hadoop/yarn-site.xml:

<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop100</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
  </property>
</configuration>

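So that the whole cluster can be started in one go, edit the slaves file (/opt/modules/hadoop-2.7.3/etc/hadoop/slaves) and list every machine in the cluster, one per line.

hadoop100
hadoop101
hadoop102
hadoop103
hadoop104

Finally, once all the configuration changes are done on hadoop100, sync them to the other machines in the cluster: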
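Because the hostnames here follow a numeric pattern, the list can be generated rather than typed. A small sketch, assuming that naming scheme; `gen_hosts` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# gen_hosts PREFIX FIRST LAST
# Hypothetical helper: print PREFIX followed by each number from FIRST
# to LAST, one hostname per line.
gen_hosts() {
  local prefix="$1" i="$2" last="$3"
  while [ "$i" -le "$last" ]; do
    printf '%s%s\n' "$prefix" "$i"
    i=$((i + 1))
  done
}

# Usage, with the path from this article:
#   gen_hosts hadoop 100 104 > /opt/modules/hadoop-2.7.3/etc/hadoop/slaves
```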

xsync hadoop-2.7.3/

Before starting Hadoop for the first time, format the NameNode:

hdfs namenode -format

Now Hadoop can be started. The output below shows all 5 machines, hadoop100 through hadoop104, coming up. Use jps to list the running Java processes.

[root@hadoop100 sbin]# ./start-dfs.sh
Starting namenodes on [hadoop100]
hadoop100: starting namenode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-root-namenode-hadoop100.out
hadoop101: starting datanode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-root-datanode-hadoop101.out
hadoop102: starting datanode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-root-datanode-hadoop102.out
hadoop100: starting datanode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-root-datanode-hadoop100.out
hadoop103: starting datanode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-root-datanode-hadoop103.out
hadoop104: starting datanode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-root-datanode-hadoop104.out
Starting secondary namenodes [hadoop100]
hadoop100: starting secondarynamenode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-root-secondarynamenode-hadoop100.out
[root@hadoop100 sbin]# jps
2945 NameNode
3187 SecondaryNameNode
3047 DataNode
3351 Jps
[root@hadoop100 sbin]# ./start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/modules/hadoop-2.7.3/logs/yarn-root-resourcemanager-hadoop100.out
hadoop103: starting nodemanager, logging to /opt/modules/hadoop-2.7.3/logs/yarn-root-nodemanager-hadoop103.out
hadoop102: starting nodemanager, logging to /opt/modules/hadoop-2.7.3/logs/yarn-root-nodemanager-hadoop102.out
hadoop104: starting nodemanager, logging to /opt/modules/hadoop-2.7.3/logs/yarn-root-nodemanager-hadoop104.out
hadoop101: starting nodemanager, logging to /opt/modules/hadoop-2.7.3/logs/yarn-root-nodemanager-hadoop101.out
hadoop100: starting nodemanager, logging to /opt/modules/hadoop-2.7.3/logs/yarn-root-nodemanager-hadoop100.out
[root@hadoop100 sbin]# jps
3408 ResourceManager
2945 NameNode
3187 SecondaryNameNode
3669 Jps
3047 DataNode
3519 NodeManager
[root@hadoop100 sbin]#
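Reading jps output by eye works for one node; with several nodes a small check is handy. The sketch below reads jps output on standard input and prints every expected daemon that is missing. `missing_daemons` is a hypothetical helper, and it assumes the usual two-column jps format of PID and class name.

```shell
#!/usr/bin/env bash
# missing_daemons NAME...
# Hypothetical helper: read `jps` output from stdin and print each
# expected daemon name that does not appear in it.
missing_daemons() {
  local jps_output expected
  jps_output="$(cat)"
  for expected in "$@"; do
    # Compare the second column exactly, so that "NameNode" does not
    # match "SecondaryNameNode".
    printf '%s\n' "$jps_output" \
      | awk -v name="$expected" '$2 == name { found = 1 } END { exit !found }' \
      || echo "$expected"
  done
}

# Usage on hadoop100 (empty output means everything is running):
#   jps | missing_daemons NameNode SecondaryNameNode DataNode ResourceManager NodeManager
```

On the other four machines only DataNode and NodeManager would be expected, given the role layout described in this article.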

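4. Using Hadoop

Hadoop can be used through its API, but here we start with the command line to confirm that the environment is running properly.

There was one small hiccup along the way: when I ran the following command to list files on HDFS, the connection failed.

[root@hadoop100 ~]# hadoop fs -ls
ls: Call From hadoop100/192.168.56.100 to hadoop100:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

It turned out that I had changed the XML configuration files mentioned above partway through and had not formatted the NameNode again; reformatting fixed it. Keep in mind that formatting wipes the existing HDFS metadata, so this suits a fresh test cluster like this one rather than a cluster that already holds data.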

hdfs namenode -format

HDFS file operations

[root@hadoop100 sbin]# hadoop fs -ls /
[root@hadoop100 sbin]# hadoop fs -put ~/anaconda-ks.cfg /
[root@hadoop100 sbin]# hadoop fs -ls /
Found 1 items
-rw-r--r-- 5 root supergroup  1233 2019-09-16 16:31 /anaconda-ks.cfg
[root@hadoop100 sbin]# hadoop fs -cat /anaconda-ks.cfg

(file contents)

[root@hadoop100 ~]# mkdir tmp
[root@hadoop100 ~]# hadoop fs -get /anaconda-ks.cfg ./tmp/
[root@hadoop100 ~]# ll tmp/
total 4
-rw-r--r--. 1 root root 1233 Sep 16 16:34 anaconda-ks.cfg

Running a MapReduce program

Run wordcount, the example MapReduce program that ships with Hadoop, which counts how many times each word occurs in a file. I tried it on anaconda-ks.cfg, which happened to be at hand:

[root@hadoop100 ~]# hadoop jar /opt/modules/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /anaconda-ks.cfg ~/tmp
19/09/16 16:43:28 INFO client.RMProxy: Connecting to ResourceManager at hadoop100/192.168.56.100:8032
19/09/16 16:43:29 INFO input.FileInputFormat: Total input paths to process : 1
19/09/16 16:43:29 INFO mapreduce.JobSubmitter: number of splits:1
19/09/16 16:43:30 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1568622576365_0001
19/09/16 16:43:30 INFO impl.YarnClientImpl: Submitted application application_1568622576365_0001
19/09/16 16:43:31 INFO mapreduce.Job: The url to track the job: http://hadoop100:8088/proxy/application_1568622576365_0001/
19/09/16 16:43:31 INFO mapreduce.Job: Running job: job_1568622576365_0001
19/09/16 16:43:49 INFO mapreduce.Job: Job job_1568622576365_0001 running in uber mode : false
19/09/16 16:43:49 INFO mapreduce.Job: map 0% reduce 0%
19/09/16 16:43:58 INFO mapreduce.Job: map 100% reduce 0%
19/09/16 16:44:10 INFO mapreduce.Job: map 100% reduce 100%
19/09/16 16:44:11 INFO mapreduce.Job: Job job_1568622576365_0001 completed successfully
19/09/16 16:44:12 INFO mapreduce.Job: Counters: 49
  File System Counters
    FILE: Number of bytes read=1470
    FILE: Number of bytes written=240535
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=1335
    HDFS: Number of bytes written=1129
    HDFS: Number of read operations=6
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=2
  Job Counters
    Launched map tasks=1
    Launched reduce tasks=1
    Rack-local map tasks=1
    Total time spent by all maps in occupied slots (ms)=6932
    Total time spent by all reduces in occupied slots (ms)=7991
    Total time spent by all map tasks (ms)=6932
    Total time spent by all reduce tasks (ms)=7991
    Total vcore-milliseconds taken by all map tasks=6932
    Total vcore-milliseconds taken by all reduce tasks=7991
    Total megabyte-milliseconds taken by all map tasks=7098368
    Total megabyte-milliseconds taken by all reduce tasks=8182784
  Map-Reduce Framework
    Map input records=46
    Map output records=120
    Map output bytes=1704
    Map output materialized bytes=1470
    Input split bytes=102
    Combine input records=120
    Combine output records=84
    Reduce input groups=84
    Reduce shuffle bytes=1470
    Reduce input records=84
    Reduce output records=84
    Spilled Records=168
    Shuffled Maps =1
    Failed Shuffles=0
    Merged Map outputs=1
    GC time elapsed (ms)=169
    CPU time spent (ms)=1440
    Physical memory (bytes) snapshot=300003328
    Virtual memory (bytes) snapshot=4159303680
    Total committed heap usage (bytes)=141471744
  Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
  File Input Format Counters
    Bytes Read=1233
  File Output Format Counters
    Bytes Written=1129
[root@hadoop100 ~]#
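To make what the job computes concrete, the same count can be approximated locally with standard Unix tools. This is only a sketch for cross-checking small files, not how MapReduce executes it: `local_wordcount` is a hypothetical name, and it assumes tokens are separated by whitespace, as in the wordcount example.

```shell
#!/usr/bin/env bash
# local_wordcount
# Hypothetical helper: read text on stdin, split it on whitespace, and
# print "token<TAB>count" sorted by token.
local_wordcount() {
  tr -s '[:space:]' '\n' \
    | grep -v '^$' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $2 "\t" $1 }'
}

# Usage:
#   local_wordcount < ~/anaconda-ks.cfg
```

The stages loosely mirror the job: tr emits one token per line (map), sort groups equal tokens together (shuffle), and uniq -c counts each group (reduce).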

The corresponding application also appears in the YARN web UI; the job log above gives the tracking URL under http://hadoop100:8088.

In the results, "#" is the most frequent token with 12 occurrences. No surprise, since much of the file is comments.

[root@hadoop100 tmp]# hadoop fs -ls /root/tmp
Found 2 items
-rw-r--r-- 5 root supergroup   0 2019-09-16 16:44 /root/tmp/_SUCCESS
-rw-r--r-- 5 root supergroup  1129 2019-09-16 16:44 /root/tmp/part-r-00000
[root@hadoop100 tmp]# hadoop fs -cat /root/tmp/part-r-0000
cat: `/root/tmp/part-r-0000': No such file or directory
[root@hadoop100 tmp]# hadoop fs -cat /root/tmp/part-r-00000
# 12
#version=DEVEL 1
$6$JBLRSbsT070BPmiq$Of51A9N3Zjn/gZ23mLMlVs8vSEFL6ybkfJ1K1uJLAwumtkt1PaLcko1SSszN87FLlCRZsk143gLSV22Rv0zDr/ 1
%addon 1
%anaconda 1
%end 3
%packages 1
--addsupport=zh_CN.UTF-8 1
--boot-drive=sda 1
--bootproto=dhcp 1
--device=enp0s3 1
--disable 1
--disabled="chronyd" 1
--emptyok 1
...

The files in HDFS can also be browsed through the web UI: http://192.168.56.100:50070/explorer.html#
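The result file is sorted by token, not by count, so finding the most frequent tokens means sorting it again. A minimal sketch, assuming each line ends with the count as its last whitespace-separated field; `top_words` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# top_words [N]
# Hypothetical helper: read "token count" lines on stdin and print the N
# lines with the highest counts (default 5), largest first.
top_words() {
  local n="${1:-5}"
  # Prefix each line with its count, sort numerically on that prefix
  # (stable, descending), then strip the prefix again.
  awk '{ print $NF "\t" $0 }' \
    | sort -s -k1,1nr \
    | head -n "$n" \
    | cut -f2-
}

# Usage:
#   hadoop fs -cat /root/tmp/part-r-00000 | top_words 3
```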
