Users Online
· Guests Online: 14
· Members Online: 0
· Total Members: 188
· Newest Member: meenachowdary055
· Members Online: 0
· Total Members: 188
· Newest Member: meenachowdary055
Forum Threads
Newest Threads
No Threads created
Hottest Threads
No Threads created
Latest Articles
Articles Hierarchy
Articles: Hadoop
04. Hadoop - Introduction
Hadoop is an Apache open source framework written in java that allows distributed processing of large datasets across clusters of computers using simple programming models. A Hadoop frame-worked application works in an environment that provides distributed storage and computation across clusters of computers. Hadoop is designed to scale up from single server to thousands of machines, each offering local computation and storage.
Hadoop is an Apache open source framework written in java that allows distributed processing of large datasets across clusters of computers using simple programming models. A Hadoop frame-worked application works in an environment that provides distributed storage and computation across clusters of computers. Hadoop is designed to scale up from single server to thousands of machines, each offering local computation and storage.
05. Hadoop - Environment Setup
Hadoop is supported by GNU/Linux platform and its flavors. Therefore, we have to install a Linux operating system for setting up Hadoop environment. In case you have an OS other than Linux, you can install a Virtualbox software in it and have Linux inside the Virtualbox.
Hadoop is supported by GNU/Linux platform and its flavors. Therefore, we have to install a Linux operating system for setting up Hadoop environment. In case you have an OS other than Linux, you can install a Virtualbox software in it and have Linux inside the Virtualbox.
06. Hadoop - HDFS Overview
Hadoop File System was developed using distributed file system design. It is run on commodity hardware. Unlike other distributed systems, HDFS is highly faulttolerant and designed using low-cost hardware.
Hadoop File System was developed using distributed file system design. It is run on commodity hardware. Unlike other distributed systems, HDFS is highly faulttolerant and designed using low-cost hardware.
08. Hadoop - Command Reference
There are many more commands in "$HADOOP_HOME/bin/hadoop fs" than are demonstrated here, although these basic operations will get you started. Running ./bin/hadoop dfs with no additional arguments will list all the commands that can be run with the FsShell system. Furthermore, $HADOOP_HOME/bin/hadoop fs -help commandName will display a short usage summary for the operation in question, if you are stuck.
There are many more commands in "$HADOOP_HOME/bin/hadoop fs" than are demonstrated here, although these basic operations will get you started. Running ./bin/hadoop dfs with no additional arguments will list all the commands that can be run with the FsShell system. Furthermore, $HADOOP_HOME/bin/hadoop fs -help commandName will display a short usage summary for the operation in question, if you are stuck.
09. MapReduce
MapReduce is a framework using which we can write applications to process huge amounts of data, in parallel, on large clusters of commodity hardware in a reliable manner.
MapReduce is a framework using which we can write applications to process huge amounts of data, in parallel, on large clusters of commodity hardware in a reliable manner.
10. Hadoop - Streaming
Hadoop streaming is a utility that comes with the Hadoop distribution. This utility allows you to create and run Map/Reduce jobs with any executable or script as the mapper and/or the reducer.
Hadoop streaming is a utility that comes with the Hadoop distribution. This utility allows you to create and run Map/Reduce jobs with any executable or script as the mapper and/or the reducer.
11. Hadoop Multi Node Cluster
This chapter explains the setup of the Hadoop Multi-Node cluster on a distributed environment. As the whole cluster cannot be demonstrated, we are explaining the Hadoop cluster environment using three systems (one master and two slaves); given below are their IP addresses. Hadoop Master: 192.168.1.15 (hadoop-master) Hadoop Slave: 192.168.1.16 (hadoop-slave-1) Hadoop Slave: 192.168.1.17 (hadoop-slave-2) Follow the steps given below to have Hadoop Multi-Node cluster setup.
This chapter explains the setup of the Hadoop Multi-Node cluster on a distributed environment. As the whole cluster cannot be demonstrated, we are explaining the Hadoop cluster environment using three systems (one master and two slaves); given below are their IP addresses. Hadoop Master: 192.168.1.15 (hadoop-master) Hadoop Slave: 192.168.1.16 (hadoop-slave-1) Hadoop Slave: 192.168.1.17 (hadoop-slave-2) Follow the steps given below to have Hadoop Multi-Node cluster setup.
12. Hadoop Interview Questions and Answers
Dear readers, these Hadoop Interview Questions have been designed specially to get you acquainted with the nature of questions you may encounter during your interview for the subject of Hadoop. As per my experience good interviewers hardly plan to ask any particular question during your interview, normally questions start with some basic concept of the subject and later they continue based on further discussion and what you answer −
Dear readers, these Hadoop Interview Questions have been designed specially to get you acquainted with the nature of questions you may encounter during your interview for the subject of Hadoop. As per my experience good interviewers hardly plan to ask any particular question during your interview, normally questions start with some basic concept of the subject and later they continue based on further discussion and what you answer −
Extending Hadoop for Data Science: Streaming, Spark, Storm, and Kafka
Extending Hadoop for Data Science: Streaming, Spark, Storm, and Kafka
Hadoop Questions and Answers Data Flow
Hadoop Questions and Answers – Data Flow This set of Hadoop Multiple Choice Questions & Answers (MCQs) focuses on “Data Flow”.
Hadoop Questions and Answers – Data Flow This set of Hadoop Multiple Choice Questions & Answers (MCQs) focuses on “Data Flow”.
Page 1 of 2: 12