Mahout is a set of Java machine learning libraries for tasks such as classification, evaluation, clustering, and pattern mining. Many other good frameworks are more user-friendly and ship with more algorithms for these tasks. The R community, for example, is much bigger, and in the Java world RapidMiner and Weka have been on the scene for many years. So why should we use Mahout instead of those frameworks? The truth is that none of them was designed for very large datasets. By very large we mean datasets, whatever their format, whose size is on the order of a hundred million records. The power of Mahout lies in the fact that its algorithms are meant to run in a Hadoop environment. Hadoop is a general framework that lets an algorithm run in parallel on multiple machines (called nodes) using the distributed computing paradigm.
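To make the idea of running an algorithm in parallel on multiple nodes more concrete, here is a minimal sketch in plain Java. It does not use the Hadoop or Mahout APIs: the nodes are only simulated with a parallel stream on a single machine, and the names `PartitionedMean`, `Partial`, `summarize`, and `mean` are hypothetical, chosen for this illustration. Each simulated node computes a partial result on its own slice of the data, and the partial results are then merged into the final answer, which is roughly the pattern a distributed framework applies across real machines.

```java
import java.util.stream.IntStream;

// Illustration only: simulates "nodes" with a parallel stream on one machine.
// This is NOT Hadoop or Mahout code; all names here are hypothetical.
final class PartitionedMean {

    /** Partial result produced by one simulated node: a sum and a record count. */
    static final class Partial {
        final double sum;
        final long count;

        Partial(double sum, long count) {
            this.sum = sum;
            this.count = count;
        }

        /** Combines two partial results into one. */
        Partial merge(Partial other) {
            return new Partial(sum + other.sum, count + other.count);
        }
    }

    /** Work done independently on one slice of the data, [from, to). */
    static Partial summarize(double[] data, int from, int to) {
        double sum = 0.0;
        for (int i = from; i < to; i++) {
            sum += data[i];
        }
        return new Partial(sum, to - from);
    }

    /** Splits the data into one slice per simulated node and merges the partial results. */
    static double mean(double[] data, int nodes) {
        if (data.length == 0 || nodes <= 0) {
            throw new IllegalArgumentException("need non-empty data and at least one node");
        }
        int chunk = (data.length + nodes - 1) / nodes; // ceiling division
        Partial total = IntStream.range(0, nodes)
                .parallel() // slices may be processed concurrently
                .mapToObj(n -> {
                    int from = Math.min(n * chunk, data.length);
                    int to = Math.min(from + chunk, data.length);
                    return summarize(data, from, to);
                })
                .reduce(new Partial(0.0, 0), Partial::merge);
        return total.sum / total.count;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 6, 7, 8};
        System.out.println(mean(data, 3)); // prints 4.5
    }
}
```

The point of the sketch is that no slice needs to see the others: only the small partial results travel to the merge step, which is what makes this style of computation practical when the records do not fit on one machine.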