HBase Tutorials for Beginners

HBase is an open source, distributed database developed by the Apache Software Foundation.

It is modeled on Google's Bigtable design and is written primarily in Java.

HBase can store massive amounts of data from terabytes to petabytes.

HBase Unique Features

HBase is built for low-latency operations
HBase is used extensively for random read and write operations
HBase stores large amounts of data in tables
Provides linear and modular scalability across a cluster environment
Strictly consistent read and write operations
Automatic and configurable sharding of tables
Automatic failover support between Region Servers
Convenient base classes for backing Hadoop MapReduce jobs with HBase tables
Easy-to-use Java API for client access
Block cache and Bloom filters for real-time queries
Query predicate push-down via server-side filters
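
To make the sharding item above more concrete, here is a minimal Python sketch of how a table's sorted row-key space can be split into contiguous ranges (regions) and how a row key is routed to the range that owns it. This is a toy in-memory model, not HBase client code. The split points, the server names, and the helpers `locate_region` and `locate_server` are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical split points: each region owns the keys from its start key
# (inclusive) up to the next region's start key (exclusive).
REGION_START_KEYS = ["", "g", "n", "t"]        # "" means "from the first key"
REGION_SERVERS = ["rs1", "rs2", "rs3", "rs4"]  # illustrative server names


def locate_region(row_key: str) -> int:
    """Return the index of the region whose key range contains row_key."""
    # bisect_right finds the first start key greater than row_key;
    # the region just before that position owns the key.
    return bisect_right(REGION_START_KEYS, row_key) - 1


def locate_server(row_key: str) -> str:
    """Return the (illustrative) server hosting the region for row_key."""
    return REGION_SERVERS[locate_region(row_key)]


if __name__ == "__main__":
    for key in ["alice", "george", "nina", "zoe"]:
        print(key, "->", locate_server(key))
```

Because the ranges are contiguous and sorted, finding the owner of a key is a binary search over the start keys. In this sketch, splitting a region would amount to inserting a new start key into the list.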

Here is what we cover in the Tutorial series

Tutorial	HBase Architecture, Data Flow, and Use Cases
Tutorial	How to Download & Install HBase
Tutorial	HBase Shell and General Commands
Tutorial	Create, Insert, Read Tables in HBase
Tutorial	HBase: Limitations, Advantages & Problems
Tutorial	HBase Troubleshooting
Tutorial	HBase Vs Hive
Tutorial	HBase Interview Questions & Answers

Why Choose HBase?

A table for a popular web application may consist of billions of rows. To find a particular row in that much data, HBase is a good choice because lookups by row key are fast. Many online analytics applications use HBase.

Traditional relational data models often fail to meet the performance requirements of very large databases. HBase can overcome these performance and processing limitations.

Importance of NoSQL Databases in Hadoop

In big data analytics, Hadoop plays a vital role in solving typical business problems by managing large data sets, and it is widely used in the analytics domain.

In the Hadoop ecosystem, each component plays its own role in:

Data processing
Data validation
Data storing

Relational databases are less useful for storing and retrieving unstructured and semi-structured data. Fetching results by querying huge data sets held in Hadoop storage is also a challenging task. NoSQL storage technologies are well suited to faster querying on huge data sets.

Other NoSQL Databases

Some of the NoSQL databases available are Cassandra, MongoDB, and CouchDB. Each of them uses a different storage mechanism.

For example, MongoDB is an open source, document-oriented database from the NoSQL family, written in C++. Compared to traditional databases, it emphasizes performance, availability and scalability.

Cassandra is also an open source distributed database from Apache, designed to handle huge amounts of data stored across commodity servers. Cassandra provides high availability with no single point of failure.

CouchDB is a document-oriented database in which each document's fields are stored as key-value maps.

How HBase Differs from Other NoSQL Models

The HBase storage model differs from the other NoSQL models discussed above in the following ways:

HBase stores data as key/value pairs in a column-oriented model. In this model, columns are grouped together into column families
HBase provides a flexible data model and low-latency access to small amounts of data stored in large data sets
Running HBase on top of Hadoop increases the throughput and performance of a distributed cluster setup. In turn, it provides faster random read and write operations
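
The key/value-within-column-families idea can be sketched with a small in-memory model. The Python below is a toy illustration under stated assumptions, not the HBase API: the class `ToyTable`, its methods, and the sample row keys and values are hypothetical. It models each cell as row key, column family, column qualifier, and timestamp mapping to a value, and it keeps rows in sorted order for range scans.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for one table; not the HBase API."""

    def __init__(self, families):
        # Column families are declared up front; qualifiers are free-form.
        self.families = set(families)
        # row key -> (family, qualifier) -> {timestamp: value}
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, family, qualifier, value, ts):
        """Store one versioned cell value."""
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row][(family, qualifier)][ts] = value

    def get(self, row, family, qualifier):
        """Return the newest value of one cell, or None if it is absent."""
        if row not in self.rows:
            return None
        versions = self.rows[row].get((family, qualifier))
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self, start_row, stop_row):
        """Yield row keys in sorted order, start inclusive, stop exclusive."""
        for row in sorted(self.rows):
            if start_row <= row < stop_row:
                yield row


if __name__ == "__main__":
    table = ToyTable(families=["info", "stats"])
    table.put("user#001", "info", "name", "Asha", ts=1)
    table.put("user#001", "info", "name", "Asha K.", ts=2)
    table.put("user#002", "stats", "logins", "17", ts=1)
    print(table.get("user#001", "info", "name"))       # newest version wins
    print(list(table.scan("user#001", "user#002")))    # only user#001 is in range
```

Rows in this model do not need to share the same qualifiers, which is one way to picture the flexible data model mentioned above. The flat dictionary is a simplification that ignores on-disk layout entirely.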

Which NoSQL Database to Choose?

MongoDB, CouchDB, and Cassandra are NoSQL databases with different feature sets, and each is chosen according to business needs. The table below lists NoSQL database types by use case.

Database Type	Example Databases	Use Case (When to Use)
Key/Value	Redis, MemcacheDB	Caching, queuing, distributing information
Column Oriented	Cassandra, HBase	Scaling, keeping unstructured, non-volatile data
Document Oriented	MongoDB, Couchbase	Nested information, JavaScript friendly
Graph Based	OrientDB, Neo4j	Handling complex relational information; modeling and handling classification

Where is HBase used?

Telecom Industry

Problem Statement:

Storing billions of CDR (call detail record) log records generated in the telecom domain
Providing real-time access to CDR logs and billing information of customers
Providing a cost-effective solution compared to traditional database systems

Solution:

HBase is used to store billions of rows of call detail records. If 20 TB of data is added per month to an existing RDBMS database, performance will deteriorate. HBase is well suited to handling this volume of data, since it can query and display records quickly.
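
One way to picture real-time access to a customer's call records is through row-key design. The sketch below is an assumption for illustration, not a layout given in this article: it builds a hypothetical row key from a subscriber number plus a reversed timestamp, so that each subscriber's records sit next to each other in sorted order with the newest first, and a prefix scan returns them. The names `cdr_row_key`, `prefix_scan`, and `MAX_TS`, and the sample numbers, are all made up.

```python
from bisect import bisect_left

# Assumed upper bound for epoch-millisecond timestamps (13 digits).
MAX_TS = 10**13 - 1


def cdr_row_key(subscriber: str, ts_millis: int) -> str:
    """Build '<subscriber>#<reversed timestamp>' so newer calls sort first."""
    reversed_ts = MAX_TS - ts_millis
    # Zero-pad so that string order matches numeric order.
    return f"{subscriber}#{reversed_ts:013d}"


def prefix_scan(sorted_keys, prefix):
    """Return all keys that start with prefix from a sorted list of keys."""
    start = bisect_left(sorted_keys, prefix)  # first key >= prefix
    result = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break                             # matching keys are contiguous
        result.append(key)
    return result


if __name__ == "__main__":
    calls = [
        ("5550000001", 1_700_000_000_000),
        ("5550000001", 1_700_000_600_000),
        ("5550000002", 1_700_000_300_000),
    ]
    keys = sorted(cdr_row_key(sub, ts) for sub, ts in calls)
    # Prints the two keys for subscriber 5550000001, newest call first.
    print(prefix_scan(keys, "5550000001#"))
```

The sorted Python list stands in for the sorted row-key order of a table. A real deployment would also need to consider hot-spotting and key length, which this sketch ignores.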

Banking Industry

Problem Statement:

The banking industry generates millions of records on a daily basis. In addition, the banking industry needs analytics solutions that can detect fraud in money transactions.

Solution:

To store, process and update huge volumes of data and perform analytics, an ideal solution is HBase integrated with several Hadoop ecosystem components.

Apart from that, HBase can be used:

Whenever there is a need for write-heavy applications.
For online log analytics and generating compliance reports.

Summary

HBase provides distinctive features that address typical industrial use cases. As column-oriented storage, it provides fast querying and fetching of results, and it can store large amounts of data.
