Hadoop 2 (YARN): How to set up a single node on Ubuntu (Tutorial)

**Draft only, but public :)**

Tutorial Requirements:

  • Hadoop 2.7.1
  • Java 8
  • Ubuntu (ubuntu-14.04.3-desktop-amd64.iso)
  • VirtualBox

As a convention:

  • We will place our development tools/products under the /opt/dev directory.
  • Each product gets its own directory, with one subdirectory per product version.
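As a minimal sketch of this convention, the snippet below builds the versioned path for Hadoop. The product directory name `hadoop` and the `DEV_ROOT` variable are assumptions made for illustration; `DEV_ROOT` defaults to a scratch directory so the snippet can be tried without root.

```shell
#!/bin/sh
# Sketch of the layout convention: <root>/<product>/<version>.
# DEV_ROOT defaults to a scratch directory so this can be tried without root;
# for the real layout, run it with DEV_ROOT=/opt/dev (probably via sudo).
DEV_ROOT="${DEV_ROOT:-./opt-dev-sketch}"
PRODUCT="hadoop"   # assumed product directory name
VERSION="2.7.1"    # Hadoop version from the requirements list

HADOOP_HOME="$DEV_ROOT/$PRODUCT/$VERSION"
mkdir -p "$HADOOP_HOME"
echo "HADOOP_HOME=$HADOOP_HOME"
```

With `DEV_ROOT=/opt/dev`, this would give /opt/dev/hadoop/2.7.1 as the Hadoop installation directory.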




3. Add Hadoop users

Choose a password for each of them.
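A possible way to create the accounts is sketched below. The shared group name `hadoop` and the `mapred` account are assumptions; this tutorial itself only refers to the hdfs and yarn users. The `run` helper and the `DRY_RUN` switch are hypothetical names: by default the commands are only printed.

```shell
#!/bin/sh
# Sketch: create a shared group plus one account per Hadoop role.
# Group "hadoop" and user "mapred" are assumptions, not taken from the tutorial.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

create_hadoop_users() {
    run sudo groupadd hadoop
    for user in yarn hdfs mapred; do
        run sudo useradd -m -g hadoop "$user"   # -m: create a home directory
        run sudo passwd "$user"                 # prompts for the new password
    done
}

create_hadoop_users
```

Review the printed commands first, then rerun with `DRY_RUN=0` on the Ubuntu VM.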


4. Data and log directories

nn: NameNode
snn: SecondaryNameNode (checkpoint)
dn: DataNode
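The directories could be created along these lines. Only /var/data/hadoop/hdfs/nn is named in this tutorial; the sibling snn and dn paths, the log location /var/log/hadoop/yarn, and the group name `hadoop` are assumptions. `DATA_ROOT` and `LOG_ROOT` are hypothetical variables that default to scratch paths so the sketch runs without root.

```shell
#!/bin/sh
# Sketch: create the HDFS data directories (nn, snn, dn) and a YARN log directory.
# Defaults are scratch paths; on the real machine use
#   DATA_ROOT=/var/data/hadoop/hdfs LOG_ROOT=/var/log/hadoop/yarn (assumed log path)
DATA_ROOT="${DATA_ROOT:-./var-sketch/data/hadoop/hdfs}"
LOG_ROOT="${LOG_ROOT:-./var-sketch/log/hadoop/yarn}"

for dir in nn snn dn; do
    mkdir -p "$DATA_ROOT/$dir"
done
mkdir -p "$LOG_ROOT"

# On the real system, hand the directories to the service accounts
# (group name "hadoop" is an assumption):
#   sudo chown -R hdfs:hadoop /var/data/hadoop/hdfs
#   sudo chown -R yarn:hadoop /var/log/hadoop/yarn
ls "$DATA_ROOT"
```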


5. The yarn user should own the Hadoop installation directory


6. Configure core-site.xml
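A minimal core-site.xml for a single node might look like the one generated below. The port 9000 is an assumed choice, and `CONF_DIR` is a hypothetical draft directory: the file is written there for review and then copied to $HADOOP_HOME/etc/hadoop/.

```shell
#!/bin/sh
# Sketch: write a minimal core-site.xml into a draft directory for review.
# CONF_DIR is a hypothetical variable; port 9000 is an assumed choice.
CONF_DIR="${CONF_DIR:-./hadoop-conf-draft}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- URI of the default file system: the NameNode on this machine -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

echo "wrote $CONF_DIR/core-site.xml"
```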




7. Configure hdfs-site.xml

NameNode: the metadata server
DataNode: where the actual data is stored
SecondaryNameNode: keeps checkpoints of the NameNode metadata
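As a sketch, the three roles above map to three storage properties in hdfs-site.xml. The nn path matches the one used when formatting HDFS later in this tutorial; the snn and dn paths are assumed by analogy, and a replication factor of 1 is a common single-node choice rather than something this tutorial states. `CONF_DIR` is a hypothetical draft directory.

```shell
#!/bin/sh
# Sketch: write a single-node hdfs-site.xml into a draft directory for review.
# CONF_DIR is hypothetical; snn/dn paths are assumed by analogy with nn.
CONF_DIR="${CONF_DIR:-./hadoop-conf-draft}"
DATA_ROOT="/var/data/hadoop/hdfs"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/hdfs-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- single node: keep one copy of each block -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- NameNode metadata -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://$DATA_ROOT/nn</value>
  </property>
  <!-- SecondaryNameNode checkpoints -->
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file://$DATA_ROOT/snn</value>
  </property>
  <!-- DataNode block storage -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://$DATA_ROOT/dn</value>
  </property>
</configuration>
EOF

echo "wrote $CONF_DIR/hdfs-site.xml"
```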




8. Configure mapred-site.xml

The file mapred-site.xml doesn’t exist initially, but it can be copied from mapred-site.xml.template.
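Hadoop 2.7.1 ships the template as `mapred-site.xml.template` under etc/hadoop. The sketch below shows the copy as a comment and writes the one property usually needed to run MapReduce on YARN. `CONF_DIR` is a hypothetical draft directory, used so the file can be reviewed before it replaces the copied template.

```shell
#!/bin/sh
# Sketch: produce a mapred-site.xml that runs MapReduce jobs on YARN.
# On a real install the file usually starts as a copy of the template:
#   cp "$HADOOP_HOME/etc/hadoop/mapred-site.xml.template" \
#      "$HADOOP_HOME/etc/hadoop/mapred-site.xml"
# Here it is written from scratch into a hypothetical draft directory.
CONF_DIR="${CONF_DIR:-./hadoop-conf-draft}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- submit MapReduce jobs to YARN instead of the local runner -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF

echo "wrote $CONF_DIR/mapred-site.xml"
```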



9. Configure yarn-site.xml
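A minimal yarn-site.xml typically enables the MapReduce shuffle as a NodeManager auxiliary service, as sketched below. Note that Hadoop 2.2 and later only accept aux-service names made of letters, digits, and underscores, hence `mapreduce_shuffle`. `CONF_DIR` is a hypothetical draft directory.

```shell
#!/bin/sh
# Sketch: write a minimal yarn-site.xml into a draft directory for review.
# CONF_DIR is a hypothetical variable.
CONF_DIR="${CONF_DIR:-./hadoop-conf-draft}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/yarn-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- auxiliary service the NodeManager runs for the MapReduce shuffle -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- class implementing that service -->
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
EOF

echo "wrote $CONF_DIR/yarn-site.xml"
```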




10. Format HDFS

The hdfs user, which owns the NameNode directory /var/data/hadoop/hdfs/nn, must format this directory to set up a new file system.
You will find /var/data/hadoop/hdfs/nn as the value of “dfs.namenode.name.dir” in $HADOOP_HOME/etc/hadoop/hdfs-site.xml
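The format step could be run roughly as follows. The default `HADOOP_HOME` is an assumed install path following the /opt/dev convention, and `run`/`DRY_RUN` are hypothetical helpers: because formatting erases any existing HDFS metadata, the sketch only prints the command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: format the NameNode directory as the hdfs user.
# HADOOP_HOME default is an assumed install path.
# DRY_RUN=1 (default) only prints the command; set DRY_RUN=0 to really run it.
HADOOP_HOME="${HADOOP_HOME:-/opt/dev/hadoop/2.7.1}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

format_namenode() {
    run sudo -u hdfs "$HADOOP_HOME/bin/hdfs" namenode -format
}

format_namenode
```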


Check that the format succeeded by looking for this log line:

11. Start HDFS services
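A possible way to start the three HDFS daemons one by one is the `hadoop-daemon.sh` script shipped in sbin. The default `HADOOP_HOME` is an assumed install path, and `run`/`DRY_RUN` are hypothetical helpers that print the commands by default.

```shell
#!/bin/sh
# Sketch: start the HDFS daemons as the hdfs user.
# HADOOP_HOME default is an assumed install path.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to run them.
HADOOP_HOME="${HADOOP_HOME:-/opt/dev/hadoop/2.7.1}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

start_hdfs() {
    for daemon in namenode secondarynamenode datanode; do
        run sudo -u hdfs "$HADOOP_HOME/sbin/hadoop-daemon.sh" start "$daemon"
    done
}

start_hdfs
```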










Check that each service is running by confirming it has a PID.
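One way to check for a PID is `jps`, which ships with the JDK and prints one `PID ClassName` line per running JVM. The helper below is a sketch: `has_daemon` is a hypothetical name, and the two input lines in the demo are made-up sample data, not real output.

```shell
#!/bin/sh
# Sketch: decide from jps-style output whether a given daemon is running.
# has_daemon NAME reads "PID ClassName" lines on stdin and succeeds only if
# some line has exactly NAME as its class name.
has_daemon() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Real usage on the VM (as the user that started the daemon):
#   sudo -u hdfs jps | has_daemon NameNode && echo "NameNode is up"

# Demo with made-up sample lines in jps format:
printf '4242 NameNode\n4311 DataNode\n' | has_daemon NameNode && echo "NameNode listed"
```

The exact match on the second field matters: a plain `grep NameNode` would also match `SecondaryNameNode`.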




12. Start YARN services
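The YARN daemons can be started in the same style with `yarn-daemon.sh` from sbin, run here as the yarn user. The default `HADOOP_HOME` is an assumed install path, and `run`/`DRY_RUN` are hypothetical helpers that print the commands by default.

```shell
#!/bin/sh
# Sketch: start the YARN daemons as the yarn user.
# HADOOP_HOME default is an assumed install path.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to run them.
HADOOP_HOME="${HADOOP_HOME:-/opt/dev/hadoop/2.7.1}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

start_yarn() {
    for daemon in resourcemanager nodemanager; do
        run sudo -u yarn "$HADOOP_HOME/sbin/yarn-daemon.sh" start "$daemon"
    done
}

start_yarn
```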










Remark: Always check that each service you started actually has a PID.
For example, the command that starts the nodemanager printed no error message.
However, I could not find a nodemanager PID afterwards.
So I checked the log using:


And I found this error:


So I fixed it by correcting yarn-site.xml:

> From:


> To:

13. HDFS Dashboard GUI


To check the logs:


14. YARN (ResourceManager) Dashboard GUI
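Both dashboards can also be probed from the command line before opening a browser. The sketch below assumes the Hadoop 2.x default web ports (50070 for the NameNode, 8088 for the ResourceManager) and that `curl` is installed; `check_ui` and `UI_HOST` are hypothetical names.

```shell
#!/bin/sh
# Sketch: probe the two web dashboards with curl.
# Ports are the Hadoop 2.x defaults; adjust them if your *-site.xml files
# override the web addresses.
UI_HOST="${UI_HOST:-localhost}"

check_ui() {
    # $1 = label, $2 = URL
    if command -v curl >/dev/null 2>&1 &&
       curl -s -o /dev/null --max-time 5 "$2"; then
        echo "$1 UI reachable at $2"
    else
        echo "$1 UI not reachable at $2"
    fi
}

check_ui "NameNode" "http://$UI_HOST:50070"
check_ui "ResourceManager" "http://$UI_HOST:8088"
```

If a dashboard is reported as not reachable, check that the corresponding daemon has a PID before looking at the browser.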


15. Testing MapReduce
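A common smoke test is the `pi` job from the examples jar bundled with Hadoop 2.7.1. This is a sketch, not necessarily the job used in this tutorial: the default `HADOOP_HOME`, the choice of the hdfs user (the HDFS superuser), and the arguments 16 maps and 1000 samples per map are assumptions. `run`/`DRY_RUN` are hypothetical helpers that print the command by default.

```shell
#!/bin/sh
# Sketch: submit the bundled "pi" example job to YARN.
# HADOOP_HOME default, the hdfs user, and the job arguments are assumptions.
# DRY_RUN=1 (default) only prints the command; set DRY_RUN=0 to run it.
HADOOP_HOME="${HADOOP_HOME:-/opt/dev/hadoop/2.7.1}"
DRY_RUN="${DRY_RUN:-1}"
EXAMPLES_JAR="$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar"

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

run_pi_example() {
    # pi <number of maps> <samples per map>
    run sudo -u hdfs "$HADOOP_HOME/bin/yarn" jar "$EXAMPLES_JAR" pi 16 1000
}

run_pi_example
```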


Result: