Install hadoop in a single Cent-OS node
- Install Oracle Java. Steps can be learned from Install oracle java in Cent-OS
- Create user account and password for hadoop using:
- sudo /sbin/useradd hadoop
- sudo /usr/bin/passwd hadoop
- Configure key based login from hadoop to hadoop itself using:
- sudo su - hadoop
- ssh-keygen
- cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
- chmod 0600 ~/.ssh/authorized_keys
- # To test the configuration; should echo hadoop
- ssh hadoop@localhost 'echo $USER'
- exit
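Note that sshd silently ignores keys when authorized_keys is group- or world-writable, which is why the chmod step above matters. A small sketch to confirm the mode (the helper name is hypothetical; GNU `stat -c` is assumed):

```shell
# check_mode: print "ok" when a file's octal mode is the strict 600
# expected for authorized_keys, else report the offending mode.
# Hypothetical helper; assumes GNU coreutils stat.
check_mode() {
    mode=$(stat -c %a "$1")
    if [ "$mode" = "600" ]; then
        echo "ok"
    else
        echo "too permissive: $mode"
    fi
}

# Usage on the real file:
# check_mode ~/.ssh/authorized_keys
```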
- Download the hadoop source from one of the mirrors linked at https://www.apache.org/dyn/closer.cgi/hadoop/common/. Download the latest stable .tar.gz release from the stable folder (e.g., hadoop-1.2.1.tar.gz).
- Extract hadoop sources in /opt/hadoop and make hadoop:hadoop its owner using:
- sudo mkdir /opt/hadoop
- cd /opt/hadoop/
- sudo tar xzf <path-to-hadoop-source>
- sudo mv hadoop-1.2.1 hadoop
- sudo chown -R hadoop:hadoop .
- Configure hadoop for single node setup using:
- Login as user hadoop and change the working directory to /opt/hadoop/hadoop using:
- sudo su - hadoop
- cd /opt/hadoop/hadoop
- Edit conf/core-site.xml and insert the following within the configuration tag:
- <property>
- <name>fs.default.name</name>
- <value>hdfs://localhost:9000/</value>
- </property>
- <property>
- <name>dfs.permissions</name>
- <value>false</value>
- </property>
- Edit conf/hdfs-site.xml and insert the following within the configuration tag:
- <property>
- <name>dfs.data.dir</name>
- <value>/opt/hadoop/hadoop/dfs/name/data</value>
- <final>true</final>
- </property>
- <property>
- <name>dfs.name.dir</name>
- <value>/opt/hadoop/hadoop/dfs/name</value>
- <final>true</final>
- </property>
- <property>
- <name>dfs.replication</name>
- <value>1</value>
- </property>
- Edit conf/mapred-site.xml and insert the following within the configuration tag:
- <property>
- <name>mapred.job.tracker</name>
- <value>localhost:9001</value>
- </property>
- Edit conf/hadoop-env.sh and make the following changes:
- Uncomment JAVA_HOME and set it to export JAVA_HOME=/opt/jdk1.7.0_40 or an appropriate value based on the installed java
- Uncomment HADOOP_OPTS and set it to
- export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
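The XML edits above can also be scripted rather than done by hand. A minimal sketch that writes the core-site.xml properties as a complete file into a scratch directory (the scratch path is illustrative; on the real node the target would be /opt/hadoop/hadoop/conf/core-site.xml):

```shell
# Sketch: generate core-site.xml with the two properties listed above.
# Writes to a temporary directory, not the live conf/ tree.
conf_dir=$(mktemp -d)
cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000/</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
EOF
```

The same heredoc pattern covers hdfs-site.xml and mapred-site.xml.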
- As user hadoop, from /opt/hadoop/hadoop, format the namenode using:
- ./bin/hadoop namenode -format
- Start all services using:
- ./bin/start-all.sh
- Verify that all services started using the 'jps' command, whose output should be similar to:
- 26049 SecondaryNameNode
- 25929 DataNode
- 26399 Jps
- 26129 JobTracker
- 26249 TaskTracker
- 25807 NameNode
- with different process IDs.
- Try accessing the different services at:
- http://localhost:50030/ for the JobTracker
- http://localhost:50070/ for the NameNode
- http://localhost:50060/ for the TaskTracker
- To stop all services use:
- ./bin/stop-all.sh
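A quick way to check the running daemons against the expected jps listing above is to test for each name. A sketch (the function name is hypothetical) that takes jps output as its argument:

```shell
# check_daemons: report which Hadoop 1.x daemons are missing from
# jps-style output. Hypothetical helper; on a live node call it as:
#   check_daemons "$(jps)"
check_daemons() {
    missing=""
    for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        printf '%s\n' "$1" | grep -qw "$d" || missing="$missing $d"
    done
    if [ -z "$missing" ]; then
        echo "all daemons running"
    else
        echo "missing:$missing"
    fi
}
```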
Steps learned from http://tecadmin.net/steps-to-install-hadoop-on-centosrhel-6/