Hadoop Tutorial: Master BigData
Sqoop vs Flume vs HDFS in Hadoop
Sqoop | Flume | HDFS
Sqoop is used for importing data from structured data sources such as an RDBMS. | Flume is used for moving bulk streaming data into HDFS. | HDFS is the distributed file system the Hadoop ecosystem uses to store data.
Sqoop has a connector-based architecture; connectors know how to connect to the respective data source and fetch the data. | Flume has an agent-based architecture; an agent is a configured process that takes care of fetching the data. | HDFS has a distributed architecture in which data is spread across multiple DataNodes.
HDFS is the destination for a Sqoop data import. | Flume data flows to HDFS through one or more channels. | HDFS is the ultimate destination for data storage.
Sqoop data loads are not event-driven. | Flume data loads can be event-driven. | HDFS simply stores whatever data is delivered to it, by whatever means.
To import data from a structured data source, use Sqoop; its connectors know how to interact with structured sources and fetch data from them. | To load streaming data, such as tweets generated on Twitter or the log files of a web server, use Flume; its agents are built for fetching streaming data. | HDFS has its own built-in shell commands for storing data, but it cannot ingest streaming data on its own.

Minimal sketches of each tool in action follow.
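To make the Sqoop row concrete, here is a minimal sketch of a connector-based import from an RDBMS into HDFS. The hostname, database, table, credentials, and paths are hypothetical placeholders, not values from this tutorial.

```
# Minimal sketch: import one RDBMS table into HDFS with Sqoop.
# Hostname, database, table, and all paths below are hypothetical placeholders.
sqoop import \
  --connect jdbc:mysql://dbserver:3306/salesdb \
  --username dbuser \
  --password-file /user/hadoop/.db-password \
  --table customers \
  --target-dir /data/salesdb/customers \
  --num-mappers 4
# --connect:     JDBC URL of the source RDBMS; Sqoop picks the matching connector
# --target-dir:  HDFS directory that receives the imported files
# --num-mappers: number of parallel map tasks used for the transfer
```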
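The Flume row can be illustrated with an agent definition: a source emits an event per new log line, a channel buffers the events, and an HDFS sink drains them. The agent name (a1), component names, and paths are illustrative assumptions.

```
# Minimal sketch: an event-driven Flume agent that tails a web-server log into HDFS.
# Agent/source/channel/sink names and all paths are hypothetical placeholders.
cat > weblog.conf <<'EOF'
a1.sources  = tailsrc
a1.channels = memch
a1.sinks    = hdfssink

# Source: emit an event for every new line appended to the access log
a1.sources.tailsrc.type     = exec
a1.sources.tailsrc.command  = tail -F /var/log/httpd/access_log
a1.sources.tailsrc.channels = memch

# Channel: buffer events in memory between source and sink
a1.channels.memch.type     = memory
a1.channels.memch.capacity = 10000

# Sink: drain the channel into date-partitioned HDFS directories
a1.sinks.hdfssink.type                   = hdfs
a1.sinks.hdfssink.channel                = memch
a1.sinks.hdfssink.hdfs.path              = /data/weblogs/%Y-%m-%d
a1.sinks.hdfssink.hdfs.fileType          = DataStream
a1.sinks.hdfssink.hdfs.useLocalTimeStamp = true
EOF

# Start the agent; Flume begins moving events into HDFS as they arrive.
flume-ng agent --conf ./conf --conf-file weblog.conf --name a1
```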
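Finally, the HDFS column mentions its built-in shell commands for storing data. A minimal sketch of that workflow follows; the local and HDFS paths are hypothetical placeholders.

```
# Minimal sketch: store and read back a file using the HDFS shell.
# Local and HDFS paths are hypothetical placeholders.
hdfs dfs -mkdir -p /data/archive               # create a directory in HDFS
hdfs dfs -put /tmp/report.csv /data/archive/   # copy a local file into HDFS
hdfs dfs -ls /data/archive                     # list the stored files
hdfs dfs -cat /data/archive/report.csv | head  # read the stored file back
```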