Some problems I came across when deploying Hadoop.
Java:
1) Install the JDK package:
openjdk-7-jdk
2) Find the installation path of Java:
Use "which" to locate the launcher, then follow the symbolic links with the "file" command until you reach a regular file.
e.g.
which javac --> /usr/bin/javac
file /usr/bin/javac --> /etc/alternatives/javac
file /etc/alternatives/javac --> /usr/lib/jvm/java-7-openjdk-amd64/bin/javac
file /usr/lib/jvm/java-7-openjdk-amd64/bin/javac --> Done (not a symbolic link)
So the JDK directory is /usr/lib/jvm/java-7-openjdk-amd64/
3) Set the Java environment variables in /etc/profile (JAVA_HOME must be the JDK directory found in step 2 on that machine)
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-i386
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
source /etc/profile # apply the changes to the current shell
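The symlink chase in step 2 can probably be done in one call. A minimal sketch, assuming GNU readlink (the -f option) is available; java_home_of is a hypothetical helper name, not part of any JDK tooling:

```shell
#!/bin/sh
# java_home_of (hypothetical helper): print the JDK directory behind a
# launcher such as /usr/bin/javac.
java_home_of() {
    # readlink -f follows every symbolic link in the chain and prints
    # the final target (assumes GNU coreutils readlink).
    real=$(readlink -f "$1") || return 1
    # Strip the trailing /bin/<tool> to get the JDK root directory.
    dirname "$(dirname "$real")"
}
```

Usage would look like: java_home_of "$(which javac)". The printed directory is a candidate for JAVA_HOME in step 3.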
SSH:
1) Install OpenSSH Server
2) Configure password-less login
ssh-keygen -t rsa
Append the public key to the authorized_keys file (both are in ~/.ssh); appending keeps any keys that are already there
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
3) Restart the SSH service
service sshd restart
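The key setup in step 2 can be made repeatable. A minimal sketch, assuming the default key location used by ssh-keygen; authorize_key is a hypothetical helper name:

```shell
#!/bin/sh
# authorize_key (hypothetical helper): append a public key to an
# authorized_keys file only if that exact line is not already present.
# Arguments: public key file, authorized_keys file.
authorize_key() {
    pub=$1
    auth=$2
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    # -F: fixed string, -x: match the whole line, -q: no output.
    if ! grep -qxF "$(cat "$pub")" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    # Keep the directory and file private; sshd with StrictModes enabled
    # may refuse key files that other users can write to.
    chmod 700 "$(dirname "$auth")"
    chmod 600 "$auth"
}
```

Usage would look like: authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys. Running it twice leaves a single copy of the key. To log in to other hosts without a password, the same public key has to end up in authorized_keys on each of them, for example with ssh-copy-id.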
Hostname:
Copy the file that maps IP addresses to hostnames into /etc/hosts on every host
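Instead of copying the whole file, the entries can be added one by one. A minimal sketch; add_host_entry is a hypothetical helper name, and the addresses and hostnames in the usage line are placeholders, not values from these notes:

```shell
#!/bin/sh
# add_host_entry (hypothetical helper): add "IP<TAB>hostname" to a hosts
# file unless the hostname already has an entry.
# Arguments: hosts file, IP address, hostname.
add_host_entry() {
    file=$1
    ip=$2
    name=$3
    touch "$file"
    # Search every non-comment line for the hostname as a whole field
    # after the address column; awk exits 0 only when it is found.
    if ! awk -v n="$name" '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$file"; then
        printf '%s\t%s\n' "$ip" "$name" >> "$file"
    fi
}
```

Usage would look like: add_host_entry /etc/hosts 192.168.1.10 master (run as root, once per node, on every host).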
Download the stable version of Hadoop: 1.2.1
1) Edit the configuration files in the conf directory, e.g. hadoop-env.sh, mapred-site.xml, core-site.xml, hdfs-site.xml
Note: you may need to set the JobTracker web UI address explicitly instead of relying on the default:
<property>
<name>mapred.job.tracker.http.address</name>
<value>hostname:50030</value>
</property>
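For the site files named in step 1, a minimal sketch of generating one-property versions from the shell. The function names are hypothetical, and the hostname "master", the ports 9000/9001 and the replication factor 1 are placeholder values, not settings taken from these notes. Note that it overwrites any existing files of the same name:

```shell
#!/bin/sh
# write_site_file (hypothetical helper): write a Hadoop site file that
# contains a single property.
# Arguments: output file, property name, property value.
write_site_file() {
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

# write_minimal_conf (hypothetical helper): create the three site files
# in the given directory, using Hadoop 1.x property names.
write_minimal_conf() {
    mkdir -p "$1"
    # "master", 9000 and 9001 are placeholders; use your own values.
    write_site_file "$1/core-site.xml"   fs.default.name    hdfs://master:9000
    write_site_file "$1/mapred-site.xml" mapred.job.tracker master:9001
    write_site_file "$1/hdfs-site.xml"   dfs.replication    1
}
```

Usage would look like: write_minimal_conf /home/hadoop/hadoop-dir/conf. Further properties, such as the one above, still have to be added by hand.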
Add the Hadoop directory to the environment variables (in the .bashrc file):
export HADOOP_PREFIX=/home/hadoop/hadoop-dir
export PATH=$PATH:$HADOOP_PREFIX/bin
Some problems when configuring Hadoop:
1) To customize the Hadoop plug-in for Eclipse, refer to the following URL:
https://docs.google.com/document/d/1yuZ4IjlquPkmC1zXtCeL4GUNKT1uY1xnS_SCBJHps6A/edit?pli=1
2) Plug-in error:
When creating a new map/reduce project, the following error message appears:
"The selected wizard could not be started. Plug-in org.apache.hadoop.eclipse was unable to load class org.apache.hadoop.eclipse.NewMapReduceProjectWizard. org/apache/hadoop/eclipse/NewMapReduceProjectWizard : Unsupported major.minor version 51.0"
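This is caused by the Java version: class version 51.0 corresponds to Java 7, so Eclipse is running on an older Java than the one the plug-in was compiled for. Switching to the right Java version, e.g. with update-alternatives, fixes it.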
3) iptables
The following error appears in the log file:
java.io.IOException: File /root/hadoop-data/tmp/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
Solution: stop iptables on all hosts
service iptables stop
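For the "Unsupported major.minor version" error in problem 2, it can help to check which Java version a class was compiled for. A minimal sketch that reads the major version from a .class file (50 is Java 6, 51 is Java 7); class_major_version is a hypothetical helper name:

```shell
#!/bin/sh
# class_major_version (hypothetical helper): print the major version
# stored in a compiled .class file. The file starts with the 4-byte
# magic number and a 2-byte minor version, so the major version is the
# big-endian 16-bit value in bytes 6 and 7.
class_major_version() {
    # od prints the two bytes as unsigned decimal numbers.
    set -- $(od -A n -t u1 -j 6 -N 2 "$1")
    echo $(( $1 * 256 + $2 ))
}
```

Usage would look like: extract NewMapReduceProjectWizard.class from the plug-in jar (for example with unzip), run class_major_version on it, and compare the result with the Java version reported by java -version on the machine running Eclipse.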