After installing Java, I wanted to install Hadoop on this Ubuntu 12.04 Server. There are a million tutorials, but I went with this one: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/. I wanted to be able to run Hadoop as any user, so I set JAVA_HOME inside of /etc/profile. That way, every user's login shell gets that environment variable.
I got down to the part where you format the HDFS filesystem through the following command:
/usr/local/hadoop/bin/hadoop namenode -format
And everything went just fine... NOT.
user@linux01:~$ sudo $HADOOP_INSTALL/bin/hadoop namenode -format
Error: JAVA_HOME is not set.
I was indignant! JAVA_HOME is totally set!
user@linux01:~$ tail -n 4
export JDK_HOME=$JAVA_HOME
export PATH=$PATH:/usr/local/jdk1.6.0_32/bin
user@linux01:~$ echo $JAVA_HOME
/usr/local/jdk1.6.0_32/bin
user@linux01:~$ ls $JAVA_HOME
appletviewer extcheck jar javac and so forth...
Some geeks on StackOverflow pointed me in the right direction. By default, sudo resets the environment before running a command, so variables like JAVA_HOME that are set in your own shell don't carry over. I su'd as my hadoop user and ran the command again. Success!
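Here is a rough way to see the effect without touching Hadoop at all. It's only a sketch: `env -i` starts a command with an empty environment, which I'm using as a stand-in for what sudo's environment reset does, and the JDK path is just an example value.

```shell
#!/bin/sh
# Example value only; pretend this came from /etc/profile.
export JAVA_HOME=/usr/local/jdk1.6.0_32

# A normal child process inherits the exported variable.
inherited=$(/bin/sh -c 'echo "${JAVA_HOME:-not set}"')

# env -i runs the child with an empty environment. This is an
# approximation of sudo's reset, not the same mechanism.
scrubbed=$(env -i /bin/sh -c 'echo "${JAVA_HOME:-not set}"')

# Handing the variable over explicitly makes it visible again.
passed=$(env -i JAVA_HOME="$JAVA_HOME" /bin/sh -c 'echo "${JAVA_HOME:-not set}"')

echo "inherited: $inherited"
echo "scrubbed:  $scrubbed"
echo "passed:    $passed"
```

With real sudo, the equivalent of the last case is usually `sudo JAVA_HOME="$JAVA_HOME" some-command` or `sudo -E some-command`, but whether either is allowed depends on your sudoers policy, so treat that as something to try rather than a guarantee.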
Of course, that's not the end of it. I went to start Hadoop using:
I can't remember the exact error I got. Namenode and Jobtracker started right up, but Datanode, SecondaryNamenode and Tasktracker didn't. I did some digging, and the ones that worked are started directly on the local machine by hadoop-daemon.sh. The ones that didn't are started by hadoop-daemons.sh, which launches them over SSH, and a non-interactive SSH session typically doesn't read /etc/profile. The processes that were not starting all had error logs complaining about, guess what, JAVA_HOME not being set. Finally, I bit the bullet and hard-coded JAVA_HOME in conf/hadoop-env.sh.
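For the record, the fix is a single export line. The sketch below writes it to a temporary file standing in for conf/hadoop-env.sh (so it doesn't touch a real install), then sources that file from a shell with an empty environment to show the variable no longer depends on /etc/profile. The JDK path is an assumed example; use wherever your JDK actually lives.

```shell
#!/bin/sh
# Temporary stand-in for $HADOOP_INSTALL/conf/hadoop-env.sh.
env_file=$(mktemp)

# The one line that matters. The path is an example, not a recommendation.
echo 'export JAVA_HOME=/usr/local/jdk1.6.0_32' >> "$env_file"

# Source the file from a shell with an empty environment, imitating a
# daemon that was launched without /etc/profile ever being read.
seen=$(env -i /bin/sh -c '. "$1"; echo "${JAVA_HOME:-not set}"' sh "$env_file")

echo "JAVA_HOME seen by the scrubbed shell: $seen"
rm -f "$env_file"
```

In the real thing you would just edit conf/hadoop-env.sh by hand. The Hadoop start scripts source that file themselves, which is why the value survives both sudo and SSH.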
Too long, didn't read
The moral of the story: hard-code JAVA_HOME in conf/hadoop-env.sh.