2.快速开始-单机hbase
手册描述了运行在本地文件系统的单机版hbase的例子,这个配置并不适用于hbase生产环境,但是可以让你可以用hbase进行试验,这一小节教你如何使用hbaseshell客户端在hbase上创建一张表,并在表中插入一列,进行put/scan操作,使表有效、无效,开启或停止hbase 在本地文件系统上使用hbase并不能保证持久性。如果文件没有正常关闭,HDFS本地文件系统将失去编辑。当你在练习新的软件、开启或关闭服务时,这很有可能发生,你需要在hdfs上启动hbase以确保所有写入都被保存。在本地上运行是让你熟悉基本系统运行的捷径。
2.2. 开始hbase
procedure: 下载、配置并开启hbase
1.从apache download mirrors下载镜像
2.解压
$ tar xzvf hbase-2.0.0-SNAPSHOT-bin.tar.gz
$ cd hbase-2.0.0-SNAPSHOT/
3.HBase 0.98.5之后,在开始hbase之前,你需要设置环境变量JAVA_HOME,hbase提供了一个central mechanism 编辑conf/hbase-env.sh,去掉JAVA_HOME这行之前的批注#,根据你的系统设定好,bin/java
4.conf/hbase-site.xml是hbase的主要配置文件,当前你只要指定HBASE和zk写数据的本地目录。默认情况下,会在/tmp下创建一个新的目录 Many servers are configured to delete the contents of /tmp upon reboot, so you should store the data elsewhere.
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:///home/testuser/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/testuser/zookeeper</value>
</property>
</configuration>
不需要创建hbase数据目录,hbase会去创建,如果你创建了,hbase会去进行一个迁移、复制,而你并不想
5.bin/start-hbase.sh 提供了启动hbase的方便的方法,执行它,jps来验证你有个hmaster的进程。在单机模式下hbase的所有服务都运行在一个jvm上 hmaster hregionserver zk
Procedure: Use HBase For the First Time
1.连接
使用hbase shell连接你正在运行的hbase
2.hbase help
注意:表名(table name)、rows columns 都在引号中
3.创建表
使用create命令来创建表,你必须指定表名和ColumnFamily名 create ‘test’, ‘cf’
4.list命令
list ‘test’
5.向表中传入数据 put
hbase(main):003:0> put ‘test’, ‘row1’, ‘cf:a’, ‘value1’
这里我们插入3个值,第一个插入在row1, column cf:a,值value1.hbase中的列是由一个列族前缀cf跟着一个冒号,然后是column qualifier后缀a
6.scan
7.获取单独一行数据get ‘test’, ‘row1’
8.disable a table
9.删掉drop
10.退出
Procedure: Stop HBase
bin/stop-hbase.sh
2.3.伪分布式本地安装
伪分布式是指HBase依然在一个主机上运行,但hbase服务(Hmaster/hreginserver/zk)run as a separate process
1.关闭hbase
2.配置
hbase-site.xml
hbase.cluster.distributed true 这项配置代表hbase在distributed mode上运行,下一步,更改hbase.rootdir从本地文件系统到你的hdfs上,使用hdfs://// URI符号 You do not need to create the directory in HDFS. HBase will do this for you. If you create the directory, HBase will attempt to do a migration, which is not what you want.
3.开启hbase
bin/start-hbase.sh 如果配置正确,jps显示 HMaster and HRegionServer processes running
4.检查hdfs中hbase目录
hadoop fs -ls /hbase
5.创建表并用数据填充(populate)
6.启动停止一个备份的hbase主节点。
HMaster server controls the HBase cluster你可以启动9个备份hmaster节点,这样算上主的,总共有10个hmasters To start a backup HMaster, use the local-master-backup.sh 对每个你想要的备份主节点,给他增加一个代表端口偏移的参数。每个HMaster使用3个端口(默认:16010,16020,16030),所以下面的命令启动3个备份节点使用端口16012/16022/16032, 16013/16023/16033, and 16015/16025/16035
local-master-backup.sh 2 3 5 要kill一个备份master而不关闭整个集群,首先要找到进程id(PID) ,在/tmp/hbase-USER-X-master.pid中,
cat /tmp/hbase-testuser-1-master.pid |xargs kill -9
7.Start and stop additional RegionServers
HRegionServer管理数据,一般来讲一个HRegionSerever管理集群中的一个节点,运行多个HRS在同一个系统中,可以用来测试伪分布式 local-regionservers.sh It works in a similar way to the local-master-backup.sh command, in that each parameter you provide represents the port offset for an instance.每个RS需要两个端口,默认端口是16020和16030,然而增加的RegionServer基本端口不是默认端口,因为默认端口被HMASTER使用,这个端口自hbase1.0开始也作为RegionServer,基本的端口用16200和16300代替。我们可以在一台服务器上运行99个额外的RegionServers,且可以这台服务器不是HMaster或备份HMaster。下面的命令在连续的端口(16202/16302)上启动四个增加的RS
$ .bin/local-regionservers.sh start 2 3 4 5
8.stop hbase
2.4. Advanced - Fully Distributed
现实中,我们需要一个分布式的配置去完整的测试HBase并在真实的场景中使用它。在分布式配置中,集群包含多个节点,每个运行一个或多个HBase服务。包含主备MASTER,多个ZK节点,多个RegionSever节点。本文增加两个额外的节点到集群中,具体结构
Procedure: Configure Passwordless SSH Access
1.node-a:$ ssh-keygen -t rsa 2.on node-b/c:create a .ssh/ directory in the user’s home directory 3.Copy the public key to the other nodes. On each of the other nodes, create a new file called .ssh/authorized_keys if it does not already exist, and append the contents of the id_rsa.pub file to the end of it. Note that you also need to do this for node-a itself.
$ cat id_rsa.pub >> ~/.ssh/authorized_keys
- Test password-less login.
Procedure: Prepare node-a
node-a将运行primary master和zk,但没有RS,所以停止node-a上的RS
1.Edit conf/regionservers and remove the line which contains localhost. Add lines with the hostnames or IP addresses for node-b and node-c.
即使你特别想运行RS在node-a,你应该用hostname来包含它,其他服务器将用来与它通信
2.配置HBase在node-b上做为backup master
Create a new file in conf/ called backup-masters, and add a new line to it with the hostname for node-b. In this demonstration, the hostname is node-b.example.com.
3.配置ZK
现实中,一定要认真考虑ZK的配置。这个配置将direct(指导?)hbase启动和管理一个ZK实例
On node-a, edit conf/hbase-site.xml and add the following properties.
hbase.zookeeper.quorum node-a.example.com,node-b.example.com,node-c.example.com hbase.zookeeper.property.dataDir /usr/local/zookeeper
Procedure: Prepare node-b and node-c
node-b will run a backup master server and a ZooKeeper instance.
1.Download and unpack HBase.
Download and unpack HBase to node-b, just as you did for the standalone and pseudo-distributed quickstarts.
2.Copy the configuration files from node-a to node-b.and node-c.
Each node of your cluster needs to have the same configuration information. Copy the contents of the conf/ directory to the conf/ directory on node-b and node-c.
Procedure: Start and Test Your Cluster
1.Be sure HBase is not running on any node.
2.Start the cluster.
On node-a, issue the start-hbase.sh command. Your output will be similar to that below. ZK先起来,然后是master,Rs最后是backup masters
ZooKeeper Process Name The HQuorumPeer process is a ZooKeeper instance which is controlled and started by HBase.
Web UI Port Changes hbase0.98x以后版本,HTTP端口(hbase web ui)从60010(master)60030(each RS)变为 16010(master)16030(RS)