Wednesday, July 11, 2012

Bulkload data from Hadoop dfs to HBase

To bulkload data from HDFS into HBase, the following steps need to be completed before running the import.

1. Create the target table in HBase (c1 is the column family)
     create 't1', 'c1'
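
     (Optional) To confirm the table exists, the list and describe commands in the HBase shell will show it:
     ./hbase shell
     list
     describe 't1'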

2. Make a sample directory in HDFS
     hadoop dfs -mkdir sampledir

3. Put the sample data into HDFS
     hadoop dfs -put input.data sampledir
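
     By default importtsv reads tab-separated values, one row per line. A minimal input.data for this example might look like the lines below (placeholder values, fields separated by a single tab); the first field becomes the row key and the second goes into column family c1, matching the column mapping in step 4:
     row1	valueA
     row2	valueB
     row3	valueC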

4. Import the data from HDFS into HBase with importtsv
    ./hadoop jar /opt/hbase/hbase-0.90.3.jar importtsv -Dimporttsv.columns=HBASE_ROW_KEY,c1 t1 sampledir
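
    Once the MapReduce job finishes, a quick sanity check from the HBase shell (not part of the import itself) should show the loaded rows:
    ./hbase shell
    scan 't1'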

If you encounter the following error:
 Error: java.lang.ClassNotFoundException: org.apache.zookeeper.KeeperException


copy the ZooKeeper jar from HBase's lib directory into Hadoop's lib directory, so the hadoop command can find the ZooKeeper classes:
cp zookeeper-3.3.3.jar /hadoop/lib
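
With full paths, assuming Hadoop lives under /opt/hadoop-0.20.2 (as in the installation post below) and HBase under /opt/hbase, the copy looks like:
cp /opt/hbase/lib/zookeeper-3.3.3.jar /opt/hadoop-0.20.2/lib/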

Tuesday, July 10, 2012

HBase (Version 0.90.3) Installation

     The installation of ZooKeeper + Hadoop + HBase is quite complicated. The manual and online resources explain several ways to install the packages, but sometimes tragedy still happens. Here is my solution for installing HBase 0.90.3 against Hadoop 0.20.2.


1. Make sure the versions of HBase and Hadoop match.
     If your Hadoop version is 0.20.2, do not install an HBase release built against a different Hadoop version.
     Otherwise an error like this shows up:
     FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
     java.io.IOException: Call to node1:9000 failed on local exception: java.io.EOFException
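
     To double-check, hadoop version prints the version of the running cluster, and the hadoop-core jar bundled in HBase's lib directory shows what HBase was built against (paths assumed from the later steps):
     /opt/hadoop-0.20.2/bin/hadoop version
     ls /opt/hbase/lib/ | grep hadoop-core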


2. Untar the file hbase-0.90.3.tar.gz with tar -zxvf hbase-0.90.3.tar.gz


3. Move the extracted directory to /opt/hbase
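
     Concretely, steps 2 and 3 amount to something like this (assuming the tarball is in the current directory):
     tar -zxvf hbase-0.90.3.tar.gz
     mv hbase-0.90.3 /opt/hbase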


4. Configure regionservers, hbase-site.xml, and hbase-env.sh


   regionservers: list the hostnames of the Hadoop master and slaves here, one per line (see the sample file after the hbase-env.sh settings below)
   hbase-site.xml: (if .META. is corrupted, be sure to change hbase.rootdir from the original path, e.g. hdfs://node1:9000/hbase, to a new one such as hdfs://node1:9000/hbase1)

<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://node1:9000/hbase</value>
    <description>The directory shared by region servers.</description>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>node1</value>
  </property>
</configuration>

  hbase-env.sh:
 export HBASE_OPTS="-ea -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"
 export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers
 export HBASE_LOG_DIR=${HBASE_HOME}/logs
 export HBASE_PID_DIR=/var/hadoop/hbase-pids
 export HBASE_MANAGES_ZK=false
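
   Since HBASE_MANAGES_ZK=false, HBase expects a separately managed ZooKeeper (the QuorumPeerMain process in step 7 below). For reference, a sample conf/regionservers file, with node1/node2/node3 as placeholder hostnames for your own master and slaves:
   node1
   node2
   node3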

5. Replace the jar file (IMPORTANT!)
     HBase ships with its own Hadoop core jar; it must match the jar used by the running Hadoop cluster, so swap it out:
     rm /opt/hbase/lib/hadoop-core-0.20-append-r1056497.jar
     cp /opt/hadoop-0.20.2/hadoop-0.20.2-core.jar /opt/hbase/lib/

6. Start HBase:
     sh /opt/hbase/bin/start-hbase.sh

7. Use jps to check that the expected processes are running:

      10700 TaskTracker        -- from Hadoop
      10494 SecondaryNameNode  -- from Hadoop
      2676  RunJar             -- from Hadoop
      10232 NameNode           -- from Hadoop
      10575 JobTracker         -- from Hadoop
      10361 DataNode           -- from Hadoop
      3541  HMaster            -- from HBase
      3686  HRegionServer      -- from HBase
      31336 QuorumPeerMain     -- from ZooKeeper

8. ./hbase shell (time to start using HBase!)
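
      Once the shell is up, a few basic commands serve as a quick smoke test (the test table and values here are arbitrary examples):
      status
      create 'test', 'cf'
      put 'test', 'row1', 'cf:a', 'value1'
      scan 'test'
      disable 'test'
      drop 'test'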

9. If you encounter the error java.io.IOException: HRegionInfo was null or empty in -ROOT-,
your .META. is corrupted; apply the note from step 4 (point hbase.rootdir in hbase-site.xml at a fresh directory) to resolve the issue.

GOOD LUCK