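This article walks through deploying a highly available cluster with hadoop 2.7.3, HDFS HA, YARN and zookeeper. It covers the configuration files, the zookeeper setup, the first-time startup sequence, and the processes you should see on each node afterwards.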
Software | Version |
JDK | 1.8.0_111-b14 |
hadoop | hadoop-2.7.3 |
zookeeper | zookeeper-3.5.2 |
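Installing the JDK and preparing the cluster's base environment are not covered here.
The hadoop configuration mainly involves four files: hdfs-site.xml, core-site.xml, mapred-site.xml and yarn-site.xml. Each is described in detail below.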
core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://cluster1</value>
    <description>Logical name of the HDFS NameNode service (the NameNode HA nameservice); it must match dfs.nameservices in hdfs-site.xml</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop/tmp</value>
    <description>Default base path for NameNode and DataNode data in HDFS; the two locations can also be set separately in hdfs-site.xml</description>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>master:2181,salve1:2181,salve2:2181</value>
    <description>Addresses and ports of the zookeeper ensemble; the ensemble should have an odd number of nodes</description>
  </property>
</configuration>
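Because fs.defaultFS has to name the same nameservice as dfs.nameservices, a quick consistency check can catch a mismatch before the cluster is started. The following is only a sketch: the helper names `nameservice_of` and `check_default_fs` are made up for illustration, and the values are hard-coded from this article rather than read from a live cluster.

```shell
#!/bin/sh
# Extract the authority (the nameservice) from an hdfs:// URI.
# Hypothetical helper, not part of hadoop.
nameservice_of() {
    fs_uri=$1
    fs_rest=${fs_uri#hdfs://}          # strip the scheme
    printf '%s\n' "${fs_rest%%/*}"     # drop any trailing path
}

# Succeed only if the fs.defaultFS URI ($1) points at the nameservice ($2).
check_default_fs() {
    [ "$(nameservice_of "$1")" = "$2" ]
}

# Values copied from the configuration in this article.
if check_default_fs "hdfs://cluster1" "cluster1"; then
    echo "fs.defaultFS matches dfs.nameservices"
else
    echo "fs.defaultFS and dfs.nameservices disagree" >&2
fi
```

On a configured node the two inputs could instead be read with `hdfs getconf -confKey fs.defaultFS` and `hdfs getconf -confKey dfs.nameservices`.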
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/hadoop/hdfs/name</value>
    <description>Directory where the NameNode stores its data (deprecated name of dfs.namenode.name.dir)</description>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/hadoop/hdfs/data</value>
    <description>Directory where the DataNode stores its data (deprecated name of dfs.datanode.data.dir)</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>4</value>
    <description>Number of replicas per block; the default is 3. This cluster runs only three DataNodes, so a value of 4 leaves blocks under-replicated</description>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>cluster1</value>
    <description>Logical name of the HDFS NameNode service (the NameNode HA nameservice)</description>
  </property>
  <property>
    <name>dfs.ha.namenodes.cluster1</name>
    <value>ns1,ns2</value>
    <description>Logical names of the NameNodes that belong to the nameservice</description>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster1.ns1</name>
    <value>master:9000</value>
    <description>RPC address and port of NameNode ns1</description>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster1.ns1</name>
    <value>master:50070</value>
    <description>Web address and port of NameNode ns1</description>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster1.ns2</name>
    <value>salve1:9000</value>
    <description>RPC address and port of NameNode ns2</description>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster1.ns2</name>
    <value>salve1:50070</value>
    <description>Web address and port of NameNode ns2</description>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://master:8485;salve1:8485;salve2:8485/cluster1</value>
    <description>URI of the JournalNode (JN) group that the NameNodes read and write. The active NN writes the edit log to these JournalNodes, and the standby NameNode reads the edit log and applies it to its in-memory directory tree</description>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/usr/hadoop/journal</value>
    <description>Directory on each JournalNode host that stores the edit log and other state</description>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
    <description>Enables automatic failover. Automatic failover relies on the zookeeper ensemble and on ZKFailoverController (ZKFC), a zookeeper client that monitors the state of the NN. Every node that runs a NN must also run a zkfc</description>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.cluster1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    <description>Java class that HDFS clients use to connect to the Active NameNode</description>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
    <description>Guards against split-brain in the HA cluster (two masters serving at the same time, leaving the system in an inconsistent state). In HDFS HA the JournalNodes allow only one NameNode to write, so two active NameNodes cannot both write edits. During a failover, however, the previous active NameNode may still be handling client RPC requests, so a fencing mechanism is needed to kill it. The commonly used method is sshfence, which requires the ssh key (dfs.ha.fencing.ssh.private-key-files) and a connect timeout</description>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
    <description>Private key used for the ssh connection</description>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
    <description>ssh connect timeout, in milliseconds</description>
  </property>
</configuration>
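The JournalNode group only accepts an edit once a majority of its members have written it, which is why three JournalNodes are listed. As a rough worked example (the helper names `journal_count` and `journal_quorum` are hypothetical, and the URI is the one from dfs.namenode.shared.edits.dir in this article), the quorum size and the number of JournalNode failures the group can survive can be derived like this:

```shell
#!/bin/sh
# Count the JournalNodes listed in a qjournal:// URI.
journal_count() {
    jn_hosts=${1#qjournal://}     # strip the scheme
    jn_hosts=${jn_hosts%%/*}      # strip the journal ID (/cluster1)
    # Hosts are separated by ';', so the field count is the node count.
    printf '%s\n' "$jn_hosts" | awk -F';' '{ print NF }'
}

# Majority needed for a write to be accepted: floor(n / 2) + 1.
journal_quorum() {
    echo $(( $1 / 2 + 1 ))
}

uri='qjournal://master:8485;salve1:8485;salve2:8485/cluster1'
n=$(journal_count "$uri")
q=$(journal_quorum "$n")
echo "JournalNodes: $n, quorum: $q, tolerated failures: $(( n - q ))"
```

With three JournalNodes the quorum is two, so one JournalNode can be down without blocking the active NameNode. A fourth node would raise the quorum to three and still tolerate only one failure, which is why odd counts are the usual choice.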
mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>Runs MapReduce on yarn, a fundamental difference from hadoop1</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
    <description>IPC address (host:port) of the MR JobHistory Server</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
    <description>Web address of the history server, for viewing records of MapReduce jobs that have finished; the service has to be started first</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.done-dir</name>
    <value>/data/hadoop/done</value>
    <description>Directory where the MR JobHistory Server keeps the history files it manages; default: ${yarn.app.mapreduce.am.staging-dir}/history/done</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.intermediate-done-dir</name>
    <value>hdfs://cluster1/mapred/tmp</value>
    <description>Directory where MapReduce jobs write their history files; the nameservice in the URI must be the one defined in dfs.nameservices; default: ${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate</description>
  </property>
</configuration>
yarn-site.xml:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    <description>Auxiliary service the NodeManager runs to serve the MapReduce shuffle</description>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>1024</value>
    <description>If this value is below 1024 (the default of yarn.scheduler.minimum-allocation-mb), the NM cannot start and reports: NodeManager from slavenode2 doesn't satisfy minimum allocations, Sending SHUTDOWN signal to the NodeManager.</description>
  </property>
</configuration>
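The SHUTDOWN error mentioned for yarn.nodemanager.resource.memory-mb comes from a simple comparison: the ResourceManager turns away a NodeManager whose advertised memory is below the scheduler's minimum allocation. A small sketch of that rule follows, assuming the default minimum of 1024 MB; `nm_can_register` is an illustrative helper, not a yarn command.

```shell
#!/bin/sh
# Succeed if a NodeManager advertising $1 MB may register with a
# ResourceManager whose minimum allocation is $2 MB.
# The 1024 fallback is the assumed default of yarn.scheduler.minimum-allocation-mb.
nm_can_register() {
    nm_mb=$1
    min_alloc_mb=${2:-1024}
    [ "$nm_mb" -ge "$min_alloc_mb" ]
}

for mb in 512 1024 2048; do
    if nm_can_register "$mb"; then
        echo "$mb MB: accepted"
    else
        echo "$mb MB: rejected (below minimum allocation)"
    fi
done
```

On a machine with less memory, lowering yarn.scheduler.minimum-allocation-mb is the other way to satisfy the same check.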
The zookeeper configuration mainly involves two files: zoo.cfg and myid.
cp zoo_sample.cfg zoo.cfg
dataDir: where zookeeper stores its data. dataLogDir: where zookeeper stores its logs.
initLimit=10
syncLimit=5
clientPort=2181
tickTime=2000
dataDir=/usr/zookeeper/tmp/data
dataLogDir=/usr/zookeeper/tmp/log
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
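initLimit and syncLimit are counted in ticks, not milliseconds, so their real duration depends on tickTime. A short worked calculation with the values from this zoo.cfg (the function name `ticks_to_ms` is only illustrative):

```shell
#!/bin/sh
# Convert a limit expressed in ticks ($1) to milliseconds, given tickTime ($2).
ticks_to_ms() {
    echo $(( $1 * $2 ))
}

tickTime=2000
initLimit=10
syncLimit=5

# Time a follower has to connect to the leader and sync its state.
echo "initLimit: $(ticks_to_ms "$initLimit" "$tickTime") ms"
# Maximum time a follower may lag behind the leader before it is dropped.
echo "syncLimit: $(ticks_to_ms "$syncLimit" "$tickTime") ms"
```

With tickTime=2000 this gives 20 seconds for the initial sync and 10 seconds for ongoing synchronization.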
Create a myid file in the dataDir directory on each node:
vi myid
On master, enter: 1
On slave1, enter: 2
On slave2, enter: 3
For example:
[hadoop@master data]$ vi myid
1
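Since each myid has to equal the N of that host's server.N line in zoo.cfg, the value can be derived from the file instead of typed by hand. This is a sketch under the assumption that zoo.cfg uses the server.N=host:port:port form shown in this article; `myid_for_host` is a hypothetical helper.

```shell
#!/bin/sh
# Print the id that zoo.cfg ($1) assigns to a host ($2),
# i.e. the N in "server.N=host:2888:3888". Prints nothing if the host is absent.
myid_for_host() {
    zk_cfg=$1
    zk_host=$2
    awk -F'=' -v h="$zk_host" '
        $1 ~ /^server\.[0-9]+$/ {
            split($2, addr, ":")            # addr[1] is the hostname
            if (addr[1] == h) {
                sub(/^server\./, "", $1)    # keep only the numeric id
                print $1
            }
        }' "$zk_cfg"
}

# Demo against a copy of the server lines used in this article.
demo_cfg=$(mktemp)
cat > "$demo_cfg" <<'EOF'
dataDir=/usr/zookeeper/tmp/data
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
EOF
for node in master slave1 slave2; do
    echo "$node -> $(myid_for_host "$demo_cfg" "$node")"
done
rm -f "$demo_cfg"
```

On a real node the result would be redirected into the myid file under dataDir, for example `myid_for_host conf/zoo.cfg "$(hostname)" > /usr/zookeeper/tmp/data/myid`, provided the hostname matches the name used in zoo.cfg.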
bin/zkServer.sh start
[hadoop@master hadoop-2.7.3]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.5.2-alpha/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: follower

[hadoop@slave1 root]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.5.2-alpha/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: leader

[hadoop@slave2 root]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.5.2-alpha/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: follower
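A healthy three-node ensemble reports exactly one leader and two followers, as in the status output above. The check can be scripted roughly as follows; `zk_mode` and `one_leader` are illustrative helpers, and the input here is the three Mode lines from this article rather than live output.

```shell
#!/bin/sh
# Extract the role from `zkServer.sh status` output read on stdin.
zk_mode() {
    sed -n 's/^Mode: //p'
}

# Given one role per line on stdin, succeed only if exactly one is "leader".
one_leader() {
    [ "$(grep -c '^leader$')" -eq 1 ]
}

# Roles reported by master, slave1 and slave2.
if printf 'Mode: follower\nMode: leader\nMode: follower\n' | zk_mode | one_leader; then
    echo "ensemble looks healthy: exactly one leader"
else
    echo "unexpected number of leaders" >&2
fi
```

Against a running ensemble, the input would come from running `zkServer.sh status` on each node and concatenating the results.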
[hadoop@slave1 root]$ zkCli.sh
Connecting to localhost:2181
2016-12-18 02:05:03,115 [myid:] - INFO [main:Environment@109] - Client environment:zookeeper.version=3.5.2-alpha-1750793, built on 06/30/2016 13:15 GMT
2016-12-18 02:05:03,118 [myid:] - INFO [main:Environment@109] - Client environment:host.name=salve1
2016-12-18 02:05:03,118 [myid:] - INFO [main:Environment@109] - Client environment:java.version=1.8.0_111
2016-12-18 02:05:03,120 [myid:] - INFO [main:Environment@109] - Client environment:java.vendor=Oracle Corporation
2016-12-18 02:05:03,120 [myid:] - INFO [main:Environment@109] - Client environment:java.home=/usr/local/jdk1.8.0_111/jre
2016-12-18 02:05:03,120 [myid:] - INFO [main:Environment@109] - Client environment:java.class.path=/usr/local/zookeeper-3.5.2-alpha/bin/../build/classes:/usr/local/zookeeper-3.5.2-alpha/bin/../build/lib/*.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/slf4j-log4j12-1.7.5.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/slf4j-api-1.7.5.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/servlet-api-2.5-20081211.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/netty-3.10.5.Final.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/log4j-1.2.17.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/jline-2.11.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/jetty-util-6.1.26.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/jetty-6.1.26.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/javacc.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/jackson-mapper-asl-1.9.11.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/jackson-core-asl-1.9.11.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/commons-cli-1.2.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../zookeeper-3.5.2-alpha.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../src/java/lib/*.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../conf:.:/usr/local/jdk1.8.0_111/lib/dt.jar:/usr/local/jdk1.8.0_111/lib/tools.jar:/usr/local/zookeeper-3.5.2-alpha/bin:/usr/local/hadoop-2.7.3/bin
2016-12-18 02:05:03,120 [myid:] - INFO [main:Environment@109] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2016-12-18 02:05:03,121 [myid:] - INFO [main:Environment@109] - Client environment:java.io.tmpdir=/tmp
2016-12-18 02:05:03,121 [myid:] - INFO [main:Environment@109] - Client environment:java.compiler=<NA>
2016-12-18 02:05:03,121 [myid:] - INFO [main:Environment@109] - Client environment:os.name=Linux
2016-12-18 02:05:03,121 [myid:] - INFO [main:Environment@109] - Client environment:os.arch=amd64
2016-12-18 02:05:03,121 [myid:] - INFO [main:Environment@109] - Client environment:os.version=3.10.0-327.22.2.el7.x86_64
2016-12-18 02:05:03,121 [myid:] - INFO [main:Environment@109] - Client environment:user.name=hadoop
2016-12-18 02:05:03,121 [myid:] - INFO [main:Environment@109] - Client environment:user.home=/home/hadoop
2016-12-18 02:05:03,121 [myid:] - INFO [main:Environment@109] - Client environment:user.dir=/tmp/hsperfdata_hadoop
2016-12-18 02:05:03,121 [myid:] - INFO [main:Environment@109] - Client environment:os.memory.free=52MB
2016-12-18 02:05:03,123 [myid:] - INFO [main:Environment@109] - Client environment:os.memory.max=228MB
2016-12-18 02:05:03,123 [myid:] - INFO [main:Environment@109] - Client environment:os.memory.total=57MB
2016-12-18 02:05:03,146 [myid:] - INFO [main:ZooKeeper@855] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@593634ad
Welcome to ZooKeeper!
2016-12-18 02:05:03,171 [myid:localhost:2181] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1113] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2016-12-18 02:05:03,243 [myid:localhost:2181] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@948] - Socket connection established, initiating session, client: /127.0.0.1:56184, server: localhost/127.0.0.1:2181
2016-12-18 02:05:03,252 [myid:localhost:2181] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1381] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x200220f5fe30060, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0]
1.1 Start the JournalNode daemons on all three nodes, then run jps and confirm that a JournalNode process appears.
sbin/hadoop-daemon.sh start journalnode
jps
JournalNode
1.2 Format the namenode on master (either namenode will do), then start the namenode on that node.
bin/hdfs namenode -format
sbin/hadoop-daemon.sh start namenode
1.3 On the other namenode node, slave1, sync the metadata from master.
bin/hdfs namenode -bootstrapStandby
1.4 Stop all hdfs services.
sbin/stop-dfs.sh
1.5 Initialize zkfc.
bin/hdfs zkfc -formatZK
1.6 Start hdfs.
sbin/start-dfs.sh
1.7 Start yarn.
sbin/start-yarn.sh
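The order of steps 1.1 to 1.7 matters, and they run on different hosts, so it can help to write the sequence down as a script. The sketch below only prints the plan by default. It rests on several assumptions that are not stated in the article: hadoop is installed under /usr/local/hadoop-2.7.3 (inferred from the zkCli.sh log), passwordless ssh works from the machine running the script, and the `on` and `first_start` helpers are made up for illustration. The format commands may also ask for confirmation if a previous format exists.

```shell
#!/bin/sh
# Print (DRY_RUN=1, the default) or execute the first-time startup sequence.
HADOOP_HOME=${HADOOP_HOME:-/usr/local/hadoop-2.7.3}   # assumed install path
DRY_RUN=${DRY_RUN:-1}

# Run a command on a host over ssh, or just print it in dry-run mode.
on() {
    target=$1
    shift
    if [ "$DRY_RUN" = 1 ]; then
        echo "[$target] $*"
    else
        ssh "$target" "cd $HADOOP_HOME && $*" || exit 1
    fi
}

first_start() {
    # 1.1 JournalNodes must be up before the namenode is formatted.
    for jn in master slave1 slave2; do
        on "$jn" sbin/hadoop-daemon.sh start journalnode
    done
    # 1.2 Format and start the first namenode.
    on master bin/hdfs namenode -format
    on master sbin/hadoop-daemon.sh start namenode
    # 1.3 Copy the metadata to the second namenode.
    on slave1 bin/hdfs namenode -bootstrapStandby
    # 1.4 and 1.5 Stop hdfs, then create the HA znode in zookeeper.
    on master sbin/stop-dfs.sh
    on master bin/hdfs zkfc -formatZK
    # 1.6 and 1.7 Bring everything up.
    on master sbin/start-dfs.sh
    on master sbin/start-yarn.sh
}

first_start
```

Running the script as is prints one line per step in the form `[host] command`; setting DRY_RUN=0 would execute the same commands over ssh.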
2.1 On later startups no initialization is needed: simply start hdfs, and the namenode, datanode, journalnode and DFSZKFailoverController processes all start automatically.
sbin/start-dfs.sh
2.2 Start yarn.
sbin/start-yarn.sh
[hadoop@master hadoop-2.7.3]$ jps
26544 QuorumPeerMain
25509 JournalNode
25704 DFSZKFailoverController
26360 Jps
25306 DataNode
25195 NameNode
25886 ResourceManager
25999 NodeManager

[hadoop@slave1 root]$ jps
2289 DFSZKFailoverController
9400 QuorumPeerMain
2601 Jps
2060 DataNode
2413 NodeManager
2159 JournalNode
1983 NameNode

[hadoop@slave2 root]$ jps
11984 DataNode
12370 Jps
2514 QuorumPeerMain
12083 JournalNode
12188 NodeManager
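Comparing jps output against the daemons each node is supposed to run is easy to automate. The following is a minimal sketch; `missing_daemons` is a hypothetical helper, and the sample input is the slave2 listing from this article.

```shell
#!/bin/sh
# Print every expected daemon ($1, space-separated) that is absent
# from the jps output read on stdin.
missing_daemons() {
    expected=$1
    running=$(awk '{ print $2 }')    # second column of jps is the process name
    for d in $expected; do
        printf '%s\n' "$running" | grep -qx "$d" || echo "$d"
    done
}

# jps output from slave2, as shown in this article.
slave2_jps='11984 DataNode
12370 Jps
2514 QuorumPeerMain
12083 JournalNode
12188 NodeManager'

missing=$(printf '%s\n' "$slave2_jps" |
    missing_daemons "QuorumPeerMain JournalNode DataNode NodeManager")
if [ -z "$missing" ]; then
    echo "slave2: all expected daemons are running"
else
    echo "slave2: missing $missing" >&2
fi
```

On a live node the input would be `jps` itself, and the two namenode hosts would add NameNode and DFSZKFailoverController to the expected list. Which namenode is currently active can be queried with `bin/hdfs haadmin -getServiceState ns1` and `bin/hdfs haadmin -getServiceState ns2`.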
That concludes the walkthrough of deploying a highly available hadoop 2.7.3 + HA + YARN + zookeeper cluster.