HBase Tutorials for Beginners
Create, Insert, Read Tables in HBase
HBase is a column-oriented NoSQL database for storing large amounts of data on top of the Hadoop ecosystem. Handling tables is central to working with HBase, because all important functionality, such as data operations, data enhancements, and data modeling, is performed through tables.
Handling tables involves the following operations
- Creating tables with column families and rows
- Inserting values into tables
- Retrieving values from tables
In HBase, we can perform table operations in two ways
- Shell commands
- Java API
We have already seen how to perform shell commands and operations in HBase. In this tutorial, we perform some of these operations in Java code through the Java API.
Through the Java API, we can create tables in HBase and also load data into those tables.
In this tutorial, we will learn how to create a table, insert values into it, and read them back.
HBase create table with Rows and Column names
In this section, we create tables with column families and rows by
- Establishing a connection with HBase through the Java API
- Using Eclipse for Java coding, debugging, and testing
Establishing connection through Java API:
The following steps guide us in developing Java code to connect to HBase through the Java API.
Step 1) In this step, we create a Java project in Eclipse for the HBase connection.
Create a new project named "HbaseConnection" in Eclipse.
For Java project setup or program creation, refer to /java-tutorial.html
In the screenshot above:
- Enter the project name in this box. In our case, the project name is "HbaseConnection"
- Check this box to save the project in the default location. Here the path is /home/hduser/work/HbaseConnection
- Select the Java environment here. In this case, JavaSE-1.7 is the Java edition
- Choose where you want to save the files. In our case, we selected the second option, "Create separate folder for sources and class files"
- Click the Finish button.
- When you click Finish, Eclipse creates the "HbaseConnection" project
- Eclipse then returns directly to its home page.
Step 2) On the Eclipse home page, follow these steps
Right click on project -> Select Build Path -> Configure Build Path
From the screenshot above
- Right click on the project
- Select Build Path
- Select Configure Build Path
After you click Configure Build Path, another window opens, as shown in the screenshot below
In this step, we add the relevant HBase jars to the Java project, as shown in the screenshot.
- Important jars to be added: hbase-0.94.8.jar, hadoop-core-1.1.2.jar
- Click the Finish button
- Go to the Libraries tab
- Click Add External JARs
- Select the required jars
- Click the Finish button to add these files to the Java project under Libraries
After adding these jars, they appear under the project's "src" location. All the jar files in the project are now ready for use with the Hadoop ecosystem.
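Before moving on, it can help to confirm that the jars added to the build path are actually visible to the program. The following is a minimal sketch using only the standard Java library. The class name ClasspathCheck and the helper isOnClasspath are hypothetical names introduced for illustration. The two class names it probes are the ones imported by the code later in this tutorial.

```java
// Hypothetical helper (not part of HBase): checks that classes from the
// jars added to the build path can be located on the classpath.
class ClasspathCheck {

    // Returns true if the named class can be found on the classpath.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "org.apache.hadoop.conf.Configuration",      // expected in the hadoop-core jar
            "org.apache.hadoop.hbase.client.HBaseAdmin"  // expected in the hbase jar
        };
        for (String name : required) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

If either line reports MISSING when run from inside the project, the corresponding jar was probably not added under Libraries.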
Step 3) In this step, we use HBaseConnection.java to establish the HBase connection through Java code
- Select Run
- Select Run As -> Java Application
- This code establishes a connection with HBase through the Java API
- After running this code, the 'guru99' table is created in HBase with two column families named "education" and "projects". At this point, only the empty schema exists in HBase.
In the screenshot above, we perform the following functions.
- Using HTableDescriptor, we describe the "guru99" table to be created in HBase
- Using the addFamily method, we add "education" and "projects" as column families to the "guru99" table.
The code below
- Establishes a connection with HBase and
- Creates the "guru99" table with two column families
Code placed in the HBaseConnection_Java document
// Place this code inside HBaseConnection.java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class HBaseConnection {
    public static void main(String[] args) throws IOException {
        // Reads hbase-site.xml and hbase-default.xml from the CLASSPATH
        Configuration hc = HBaseConfiguration.create();

        // Describe the table and its two column families
        HTableDescriptor ht = new HTableDescriptor("guru99");
        ht.addFamily(new HColumnDescriptor("education"));
        ht.addFamily(new HColumnDescriptor("projects"));

        System.out.println("connecting");
        HBaseAdmin hba = new HBaseAdmin(hc);

        System.out.println("Creating Table");
        hba.createTable(ht);
        hba.close(); // release the admin connection

        System.out.println("Done......");
    }
}
This is the code you have to place in HBaseConnection.java and run as a Java program.
When run, the program establishes a connection with HBase and creates a table with the given column families.
- Table name is "guru99"
- Column families are "education" and "projects"
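To make the relationship between a table, its column families, and its cells more concrete, here is a small in-memory model written with the standard Java collections. It is only an illustration and does not use the HBase API. The class TableModel and its methods are hypothetical names. It mirrors two properties described in this tutorial: a newly created table has a schema but no rows, and a value can only be stored under a column family that already exists in the schema, while the qualifier can be anything.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// In-memory illustration only; this is NOT the HBase client API.
class TableModel {
    // Column families are fixed when the table is created.
    private final Set<String> families;
    // row key -> "family:qualifier" -> value, both levels kept sorted
    private final Map<String, Map<String, String>> rows =
            new TreeMap<String, Map<String, String>>();

    TableModel(String... families) {
        this.families = new HashSet<String>(Arrays.asList(families));
    }

    // Stores a cell; the family must exist in the schema, the qualifier is free-form.
    void put(String row, String family, String qualifier, String value) {
        if (!families.contains(family)) {
            throw new IllegalArgumentException("Unknown column family: " + family);
        }
        Map<String, String> cells = rows.get(row);
        if (cells == null) {
            cells = new TreeMap<String, String>();
            rows.put(row, cells);
        }
        cells.put(family + ":" + qualifier, value);
    }

    // Returns the stored value, or null if the cell does not exist.
    String get(String row, String family, String qualifier) {
        Map<String, String> cells = rows.get(row);
        return cells == null ? null : cells.get(family + ":" + qualifier);
    }

    // True while the table holds a schema but no rows.
    boolean isEmpty() {
        return rows.isEmpty();
    }

    public static void main(String[] args) {
        TableModel guru99 = new TableModel("education", "projects");
        System.out.println("empty after creation: " + guru99.isEmpty()); // prints true
        guru99.put("row1", "education", "col1", "BigData");
        System.out.println(guru99.get("row1", "education", "col1"));     // prints BigData
    }
}
```

In this model, calling put with a family such as "skills" throws an exception, which is the model's stand-in for the rule that the column family must already exist in the table schema.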
Step 4) We can check whether the "guru99" table was created in HBase by running the "list" command in the HBase shell.
The "list" command gives information about all the tables that have been created in HBase.
Refer to the "HBase Shell and General Commands" article for more information on the "list" command.
In this screen, we
- Verify the result of the code in the HBase shell by executing the "list" command.
- Running the "list" command displays the tables created in HBase. In our case, we can see that the table "guru99" has been created
Placing values into tables and retrieving values from table:
In this section, we are going to
- Write data to HBase table and
- Read Data from HBase table
For example, we will
- Insert values into the table "guru99" that was created in Step 3 of the section "HBase create table with Rows and Column names"
- And then retrieve these values from the table "guru99"
Here is the Java code to be placed in HBaseLoading.java for both writing and retrieving data.
Code placed in the HBaseLoading_Java document
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseLoading {
    public static void main(String[] args) throws IOException {
        // When you create a HBaseConfiguration, it reads in whatever you've set in
        // hbase-site.xml and hbase-default.xml, as long as these can be found on the CLASSPATH.
        Configuration config = HBaseConfiguration.create();

        // This instantiates an HTable object that connects you to the "guru99" table.
        HTable table = new HTable(config, "guru99");
        try {
            // To add to a row, use Put. A Put constructor takes the name of the row
            // you want to insert into as a byte array.
            Put p = new Put(Bytes.toBytes("row1"));

            // To set a value in row 'row1', specify the column family, column qualifier,
            // and value of the table cell. The column family must already exist in your
            // table schema. The qualifier can be anything.
            p.add(Bytes.toBytes("education"), Bytes.toBytes("col1"), Bytes.toBytes("BigData"));
            p.add(Bytes.toBytes("projects"), Bytes.toBytes("col2"), Bytes.toBytes("HBaseTutorials"));

            // Once the Put instance holds all the updates you want to make, commit it.
            table.put(p);

            // Now, retrieve the data we just wrote.
            Get g = new Get(Bytes.toBytes("row1"));
            Result r = table.get(g);
            byte[] value = r.getValue(Bytes.toBytes("education"), Bytes.toBytes("col1"));
            byte[] value1 = r.getValue(Bytes.toBytes("projects"), Bytes.toBytes("col2"));
            String valueStr = Bytes.toString(value);
            String valueStr1 = Bytes.toString(value1);
            System.out.println("GET: education: " + valueStr + ", projects: " + valueStr1);

            // Scan the same two columns across the table.
            Scan s = new Scan();
            s.addColumn(Bytes.toBytes("education"), Bytes.toBytes("col1"));
            s.addColumn(Bytes.toBytes("projects"), Bytes.toBytes("col2"));
            ResultScanner scanner = table.getScanner(s);
            try {
                for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
                    System.out.println("Found row : " + rr);
                }
            } finally {
                // Make sure you close your scanners when you are done!
                scanner.close();
            }
        } finally {
            // Release the table's resources.
            table.close();
        }
    }
}
First, we will see how to write data, and then how to read data from an HBase table.
Step 1) In this step, we write data into the HBase table "guru99"
First, we write the code to insert and retrieve values from HBase in the HBaseLoading.java program.
To insert values into a table at the column level, you have to write code like the following.
From the screenshot above
- When we create the HBase configuration, it picks up whatever settings we made in the hbase-site.xml and hbase-default.xml files during HBase installation
- Connecting to the existing table "guru99" using the HTable class
- Adding row1 to table "guru99"
- Specifying the column families "education" and "projects" and inserting values into them for row1. The values inserted here are "BigData" and "HBaseTutorials".
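Every row key, column family, qualifier, and value in the code above passes through Bytes.toBytes, because HBase stores these as byte arrays. The following sketch shows the same round trip using only the standard Java library. It assumes UTF-8 encoding, which is what Bytes.toBytes(String) and Bytes.toString use. The class ByteRoundTrip and its encode and decode helpers are hypothetical stand-ins, not HBase methods.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Stand-in for the String <-> byte[] conversion done by HBase's Bytes utility.
// Assumption: UTF-8 encoding in both directions.
class ByteRoundTrip {

    // Converts a string to the byte array that would be stored.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Converts stored bytes back to a string; a missing cell (null) stays null.
    static String decode(byte[] bytes) {
        return bytes == null ? null : new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] stored = encode("BigData");
        System.out.println(Arrays.toString(stored)); // the raw bytes that would be stored
        System.out.println(decode(stored));          // prints BigData
        System.out.println(decode(null));            // prints null
    }
}
```

The null case matters when reading: if a requested cell does not exist, there are no bytes to decode, so it is worth checking for null before using the result.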
Step 2) Here we fetch and display the values that we placed in the HBase table in Step 1.
To retrieve the results stored in "guru99":
The screenshot above shows the data being read from the HBase table 'guru99'
- Here we fetch the values stored in the column families "education" and "projects"
- Using Get, we fetch the values stored in row1 of the HBase table
- Using Scan, we scan the table. The values stored in row1 are displayed on the console.
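The difference between Get and Scan is easier to see with more than one row. A Get reads a single row by its key, while a Scan walks over rows in sorted row-key order. HBase orders row keys lexicographically by their bytes. The sketch below models that ordering with a standard TreeMap of strings, which behaves the same way for plain ASCII keys. The class ScanOrder and the extra row keys are hypothetical and are not part of the tutorial's table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// In-memory model of scan ordering; this is NOT the HBase client API.
class ScanOrder {

    // Returns the row keys in the order a full scan would visit them in this model.
    static List<String> scanKeys(String... rowKeys) {
        TreeMap<String, String> rows = new TreeMap<String, String>();
        for (String key : rowKeys) {
            rows.put(key, "value-of-" + key);
        }
        return new ArrayList<String>(rows.keySet());
    }

    public static void main(String[] args) {
        // Keys are compared character by character, not as numbers.
        System.out.println(scanKeys("row2", "row10", "row1")); // prints [row1, row10, row2]
    }
}
```

Because "row10" sorts before "row2", numeric parts of row keys are often zero-padded (for example "row02", "row10") when numeric order is wanted.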
Once the code is written, run the Java application like this
- Right click on HBaseLoading.java -> Run As -> Java Application
- After running "HBaseLoading.java", the values are inserted into each column of "guru99" in HBase, and the same program also retrieves them.
Retrieving Inserted Values in HBase shell mode
In this section, we will check the following
- Values that are inserted into HBase table "guru99"
- Column names with values present in HBase Table guru99
From the screenshot above, we observe these points
- If we run the "scan" command in the HBase shell, it displays the values inserted into "guru99"
- The HBase shell displays the values inserted by our code, along with the column and row names
- Here we can see that the column families written to are "education" and "projects"
- The values inserted into these columns are "BigData" and "HBaseTutorials"
Summary:
As discussed in this article, we are now familiar with how to create tables, load data into tables, and retrieve data from tables using the Java API. The same kinds of operations that the shell commands provide can be performed through this Java API, which gives client programs a direct way to communicate with the HBase environment.
In our next article, we will look at troubleshooting HBase problems.