Hadoop HA (High Availability)

This setup has been verified to work.

一、Cluster Planning

ZooKeeper cluster

192.168.116.121 192.168.116.122 192.168.116.123
hsiehchou121 hsiehchou122 hsiehchou123

Hadoop cluster

192.168.116.121    192.168.116.122    192.168.116.123    192.168.116.124
hsiehchou121       hsiehchou122       hsiehchou123       hsiehchou124
NameNode1          NameNode2          DataNode1          DataNode2
ResourceManager1   ResourceManager2   NodeManager1       NodeManager2
JournalNode        JournalNode

二、Preparation

1、Install the JDK
2、Configure environment variables
3、Configure passwordless SSH login
4、Configure hostnames
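
Step 3 above (passwordless login) can be sketched as follows. This assumes the root user and the four hostnames from the cluster plan; adjust to your environment:

```shell
# Run once on hsiehchou121: generate a key pair (no passphrase),
# then push the public key to every node, including hsiehchou121 itself,
# so that scripts such as start-all.sh can ssh without prompting.
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
for host in hsiehchou121 hsiehchou122 hsiehchou123 hsiehchou124; do
    ssh-copy-id root@"$host"
done
```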

三、Configure ZooKeeper (install on 192.168.116.121)

Configure ZooKeeper on the master node (hsiehchou121)

1、Edit /root/hd/zookeeper-3.4.10/conf/zoo.cfg

dataDir=/root/hd/zookeeper-3.4.10/zkData
server.1=hsiehchou121:2888:3888
server.2=hsiehchou122:2888:3888
server.3=hsiehchou123:2888:3888
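
The fragment above shows only dataDir and the server list. The Hadoop configs later connect to client port 2181, so a complete zoo.cfg also needs the standard client and timing settings; a minimal sketch with common default values (the timing values are assumptions, not from the original):

```
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/root/hd/zookeeper-3.4.10/zkData
clientPort=2181
server.1=hsiehchou121:2888:3888
server.2=hsiehchou122:2888:3888
server.3=hsiehchou123:2888:3888
```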

2、Create an empty myid file in the configured dataDir (/root/hd/zookeeper-3.4.10/zkData)

echo 1 > /root/hd/zookeeper-3.4.10/zkData/myid

3、Copy the configured ZooKeeper to the other nodes, then set each node's myid accordingly

scp -r /root/hd/zookeeper-3.4.10/ hsiehchou122:/root/hd
scp -r /root/hd/zookeeper-3.4.10/ hsiehchou123:/root/hd
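
ZooKeeper reads myid from the configured dataDir, and the number must match that node's server.N line in zoo.cfg. After copying, the two new nodes can be updated remotely, for example:

```shell
# hsiehchou122 is server.2, hsiehchou123 is server.3
ssh hsiehchou122 'echo 2 > /root/hd/zookeeper-3.4.10/zkData/myid'
ssh hsiehchou123 'echo 3 > /root/hd/zookeeper-3.4.10/zkData/myid'
```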

四、Install the Hadoop Cluster (on hsiehchou121)

1、Edit hadoop-env.sh

export JAVA_HOME=/root/hd/jdk1.8.0_192

2、Edit core-site.xml

<configuration>
    <!-- Set the HDFS nameservice to mycluster -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>

    <!-- Hadoop temporary directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/root/hd/hadoop-2.8.4/tmp</value>
    </property>

    <!-- ZooKeeper quorum addresses -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hsiehchou121:2181,hsiehchou122:2181,hsiehchou123:2181</value>
    </property>
</configuration>

3、Edit hdfs-site.xml (defines how many NameNodes the nameservice has)

<configuration> 
    <!-- The HDFS nameservice, mycluster; must match core-site.xml -->
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>

    <!-- mycluster has two NameNodes: nn1 and nn2 -->
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>

    <!-- RPC address of nn1 -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>hsiehchou121:9000</value>
    </property>

    <!-- HTTP address of nn1 -->
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>hsiehchou121:50070</value>
    </property>

    <!-- RPC address of nn2 -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>hsiehchou122:9000</value>
    </property>

    <!-- HTTP address of nn2 -->
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>hsiehchou122:50070</value>
    </property>

    <!-- Where the NameNode edit log is stored on the JournalNodes -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hsiehchou121:8485;hsiehchou122:8485/mycluster</value>
    </property>

    <!-- Local directory where each JournalNode stores its data -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/root/hd/hadoop-2.8.4/journal</value>
    </property>

    <!-- Enable automatic NameNode failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>

    <!-- Failover proxy provider used by HDFS clients -->
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

    <!-- Fencing methods; separate multiple methods with newlines, one per line -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
            sshfence
            shell(/bin/true)
        </value>
    </property>

    <!-- The sshfence method requires passwordless SSH -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>

    <!-- SSH connect timeout for sshfence, in milliseconds -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
</configuration>
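
After hdfs-site.xml is in place, `hdfs getconf` offers a quick sanity check that the HA keys resolve as intended; with the config above the expected values are:

```shell
# Each command prints the resolved value of one configuration key
hdfs getconf -confKey dfs.nameservices                        # mycluster
hdfs getconf -confKey dfs.ha.namenodes.mycluster              # nn1,nn2
hdfs getconf -confKey dfs.namenode.rpc-address.mycluster.nn1  # hsiehchou121:9000
```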

4、Edit mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

5、Edit yarn-site.xml

<configuration>
    <!-- Enable ResourceManager HA -->
    <property>
       <name>yarn.resourcemanager.ha.enabled</name>
       <value>true</value>
    </property>

    <!-- RM cluster id -->
    <property>
       <name>yarn.resourcemanager.cluster-id</name>
       <value>yarncluster</value>
    </property>

    <!-- Logical ids of the two RMs -->
    <property>
       <name>yarn.resourcemanager.ha.rm-ids</name>
       <value>rm1,rm2</value>
    </property>

    <!-- Hostname of each RM -->
    <property>
       <name>yarn.resourcemanager.hostname.rm1</name>
       <value>hsiehchou121</value>
    </property>
    <property>
       <name>yarn.resourcemanager.hostname.rm2</name>
       <value>hsiehchou122</value>
    </property>

    <!-- ZooKeeper quorum for RM state storage and leader election -->
    <property>
       <name>yarn.resourcemanager.zk-address</name>
       <value>hsiehchou121:2181,hsiehchou122:2181,hsiehchou123:2181</value>
    </property>

    <property>
       <name>yarn.nodemanager.aux-services</name>
       <value>mapreduce_shuffle</value>
    </property>

    <property>
         <name>yarn.resourcemanager.recovery.enabled</name>
         <value>true</value>
    </property>
    <property>
         <name>yarn.resourcemanager.store.class</name>
         <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
    <property>
          <name>yarn.scheduler.maximum-allocation-mb</name>
          <value>32768</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>32768</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>4096</value>
    </property>
    <property>
          <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>24</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/tmp/yarn-logs</value>
    </property>
</configuration>
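
The memory settings above also imply how many containers one NodeManager can host, since each container request is rounded up to a multiple of yarn.scheduler.minimum-allocation-mb. A quick back-of-the-envelope check:

```shell
# With 32768 MB of memory per NodeManager (yarn.nodemanager.resource.memory-mb)
# and a 4096 MB minimum allocation (yarn.scheduler.minimum-allocation-mb),
# one node can host at most 32768 / 4096 = 8 minimum-size containers.
node_mem_mb=32768
min_alloc_mb=4096
max_containers=$(( node_mem_mb / min_alloc_mb ))
echo "$max_containers"    # 8
```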

6、Edit slaves

hsiehchou123
hsiehchou124

7、Copy the configured Hadoop to the other nodes

scp -r /root/hd/hadoop-2.8.4/ root@hsiehchou122:/root/hd/
scp -r /root/hd/hadoop-2.8.4/ root@hsiehchou123:/root/hd/
scp -r /root/hd/hadoop-2.8.4/ root@hsiehchou124:/root/hd/

五、Start the ZooKeeper Cluster

1、Start ZooKeeper on all three nodes

[root@hsiehchou121 hadoop-2.8.4]# zkServer.sh start
[root@hsiehchou122 hadoop-2.8.4]# zkServer.sh start
[root@hsiehchou123 hadoop-2.8.4]# zkServer.sh start
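
Once all three are up, `zkServer.sh status` on each node confirms the ensemble formed: one node should report Mode: leader and the other two Mode: follower. A sketch using the install path from section 三:

```shell
# Query each ZooKeeper node for its role in the ensemble
for host in hsiehchou121 hsiehchou122 hsiehchou123; do
    ssh "$host" '/root/hd/zookeeper-3.4.10/bin/zkServer.sh status'
done
```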

六、Start the JournalNodes on hsiehchou121 and hsiehchou122

hadoop-daemon.sh start journalnode

七、Format HDFS (run on hsiehchou121)

1、Format ZooKeeper

[root@hsiehchou121 hadoop-2.8.4]# bin/hdfs zkfc -formatZK

2、Start HDFS

1) On each JournalNode host, start the journalnode service (skip if already started in step 六)
[root@hsiehchou121 hadoop-2.8.4]# sbin/hadoop-daemon.sh start journalnode

2) On [nn1], format the NameNode and start it
[root@hsiehchou121 hadoop-2.8.4]# bin/hdfs namenode -format
[root@hsiehchou121 hadoop-2.8.4]# sbin/hadoop-daemon.sh start namenode

3) On [nn2], sync the metadata from nn1
[root@hsiehchou122 hadoop-2.8.4]# bin/hdfs namenode -bootstrapStandby

八、Start the Hadoop Cluster on hsiehchou121

[root@hsiehchou121 hadoop-2.8.4]# start-all.sh

Log output:
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [hsiehchou121 hsiehchou122]
hsiehchou121: starting namenode, logging to /root/hd/hadoop-2.8.4-hsiehchou121.out
hsiehchou122: starting namenode, logging to /root/hd/hadoop-2.8.4-hsiehchou122.out
hsiehchou124: starting datanode, logging to /root/hd/hadoop-2.8.4-hsiehchou124.out
hsiehchou123: starting datanode, logging to /root/hd/hadoop-2.8.4-hsiehchou123.out
Starting journal nodes [hsiehchou121 hsiehchou122 ]
hsiehchou121: starting journalnode, logging to /root/hd/hadoop-2.alnode-hsiehchou121.out
hsiehchou122: starting journalnode, logging to /root/hd/hadoop-2.alnode-hsiehchou122.out
Starting ZK Failover Controllers on NN hosts [hsiehchou121 hsiehchou122]
hsiehchou121: starting zkfc, logging to /root/hd/hadoop-2.8.4/logou121.out
hsiehchou122: starting zkfc, logging to /root/hd/hadoop-2.8.4/logou122.out
starting yarn daemons
starting resourcemanager, logging to /root/hd/hadoop-2.8.4/logs/ysiehchou121.out
hsiehchou123: starting nodemanager, logging to /root/hd/hadoop-2.ager-hsiehchou123.out
hsiehchou124: starting nodemanager, logging to /root/hd/hadoop-2.ager-hsiehchou124.out

The ResourceManager on hsiehchou122 must be started separately:

[root@hsiehchou122 hadoop-2.8.4]# ./sbin/yarn-daemon.sh start resourcemanager
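
With everything running, the HA pairs can be verified and exercised. `hdfs haadmin` and `yarn rmadmin` report which peer is currently active; killing the active NameNode should make ZKFC promote the standby. Which of nn1/nn2 starts as active depends on startup order, so treat this as an illustrative drill:

```shell
# Report the current state (active/standby) of each NameNode and ResourceManager
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2

# Failover drill: on the host of the active NameNode, kill its process,
# then confirm the standby transitions to active within a few seconds
jps | grep NameNode                # note the NameNode pid
kill -9 <namenode-pid>             # <namenode-pid> is a placeholder
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
```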


Author: 谢舟
Copyright: Unless otherwise noted, all posts on this blog are licensed under CC BY 4.0. Please credit 谢舟 when reposting!