Tuesday, April 29, 2014

Starting HDP Services

Start all the Hadoop services in the following order:
  • HDFS
  • MapReduce
  • ZooKeeper
  • HBase
  • Hive Metastore
  • HiveServer2
  • WebHCat
  • Oozie
  • Ganglia
  • Nagios
Instructions
  1. Start HDFS.
    1. Execute this command on the NameNode host machine:
       su -l hdfs -c "/usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start namenode"
    2. Execute this command on the Secondary NameNode host machine:
       su -l hdfs -c "/usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start secondarynamenode"
    3. Execute this command on all DataNodes:
       su -l hdfs -c "/usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode"
  2. Start MapReduce.
    1. Execute this command on the JobTracker host machine:
       su -l mapred -c "/usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start jobtracker; sleep 25"
    2. Execute this command on the JobTracker host machine:
       su -l mapred -c "/usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start historyserver"
    3. Execute this command on all TaskTrackers:
       su -l mapred -c "/usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start tasktracker"
  3. Start ZooKeeper. On the ZooKeeper host machine, execute:
       su - zookeeper -c "export ZOOCFGDIR=/etc/zookeeper/conf ; export ZOOCFG=zoo.cfg ; source /etc/zookeeper/conf/zookeeper-env.sh ; /usr/lib/zookeeper/bin/zkServer.sh start"
  4. Start HBase.
    1. Execute this command on the HBase Master host machine:
       su -l hbase -c "/usr/lib/hbase/bin/hbase-daemon.sh --config /etc/hbase/conf start master"
    2. Execute this command on all RegionServers:
       su -l hbase -c "/usr/lib/hbase/bin/hbase-daemon.sh --config /etc/hbase/conf start regionserver"
  5. Start the Hive Metastore. On the Hive Metastore host machine, execute:
       su -l hive -c "nohup hive --service metastore > $HIVE_LOG_DIR/hive.out 2> $HIVE_LOG_DIR/hive.log &"
     where $HIVE_LOG_DIR is the directory where Hive server logs are stored (for example: /var/log/hive).
  6. Start HiveServer2. On the HiveServer2 host machine, execute:
       sudo su hive -c "nohup /usr/lib/hive/bin/hiveserver2 -hiveconf hive.metastore.uris=\" \" > $HIVE_LOG_DIR/hiveServer2.out 2> $HIVE_LOG_DIR/hiveServer2.log &"
     where $HIVE_LOG_DIR is the directory where Hive server logs are stored (for example: /var/log/hive).
  7. Start WebHCat. On the WebHCat host machine, execute:
       su -l hcat -c "/usr/lib/hcatalog/sbin/webhcat_server.sh start"
  8. Start Oozie. On the Oozie server host machine, execute:
       sudo su -l oozie -c "cd $OOZIE_LOG_DIR/log; /usr/lib/oozie/bin/oozie-start.sh"
     where $OOZIE_LOG_DIR is the directory where Oozie log files are stored (for example: /var/log/oozie).
  9. Start Ganglia.
    1. Execute this command on the Ganglia server host machine:
       /etc/init.d/hdp-gmetad start
    2. Execute this command on all the nodes in your Hadoop cluster:
       /etc/init.d/hdp-gmond start
  10. Start Nagios. On the Nagios host machine, execute:
       service nagios start
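For review purposes, the HDFS and MapReduce steps above can be sketched as a small helper. This is a sketch, not part of the HDP distribution: it only builds and prints the hadoop-daemon start commands in dependency order, so you can inspect them (or feed them to ssh) before running each one on the appropriate host.

```shell
#!/bin/sh
# Sketch only: print the HDFS and MapReduce start commands from the steps
# above, in order. Nothing is executed here; each printed command must still
# be run on the matching host (NameNode, DataNodes, JobTracker, TaskTrackers).

hadoop_daemon_cmd() {
  # $1 = service user (hdfs or mapred), $2 = daemon to start
  echo "su -l $1 -c \"/usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start $2\""
}

hadoop_daemon_cmd hdfs namenode            # on the NameNode host
hadoop_daemon_cmd hdfs secondarynamenode   # on the Secondary NameNode host
hadoop_daemon_cmd hdfs datanode            # on every DataNode
hadoop_daemon_cmd mapred jobtracker        # on the JobTracker host
hadoop_daemon_cmd mapred historyserver     # on the JobTracker host
hadoop_daemon_cmd mapred tasktracker       # on every TaskTracker
```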

Hadoop Ecosystem Default Ports

1. HDFS Ports

The following table lists the default ports used by the various HDFS services.
Table 2.1. HDFS Ports
NameNode WebUI (Master Nodes: the NameNode and any back-up NameNodes)
  • 50070 (http): Web UI to view the current status of HDFS and explore the file system. End-user access: Yes (typically admins, dev/support teams). Configuration: dfs.http.address
  • 50470 (https): Secure http service. Configuration: dfs.https.address

NameNode metadata service (Master Nodes: the NameNode and any back-up NameNodes)
  • 8020/9000 (IPC): File system metadata operations. End-user access: Yes (all clients that interact directly with HDFS). Configuration: embedded in the URI specified by fs.default.name

DataNode (All Slave Nodes)
  • 50075 (http): DataNode web UI to access status, logs, etc. End-user access: Yes (typically admins, dev/support teams). Configuration: dfs.datanode.http.address
  • 50475 (https): Secure http service. Configuration: dfs.datanode.https.address
  • 50010: Data transfer. Configuration: dfs.datanode.address
  • 50020 (IPC): Metadata operations. End-user access: No. Configuration: dfs.datanode.ipc.address

Secondary NameNode (the Secondary NameNode and any backup Secondary NameNodes)
  • 50090 (http): Checkpoint for NameNode metadata. End-user access: No. Configuration: dfs.secondary.http.address
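A quick way to confirm the HDFS daemons came up is to probe the ports above. The sketch below uses bash's /dev/tcp device; 127.0.0.1 is only a placeholder for your NameNode host.

```shell
# Sketch: probe the default NameNode ports from Table 2.1.
# Replace 127.0.0.1 with the NameNode hostname; requires bash (/dev/tcp).
port_open() {
  # return 0 if a TCP connection to host $1, port $2 succeeds
  ( exec 3<>"/dev/tcp/$1/$2" ) 2>/dev/null
}

host=127.0.0.1   # placeholder NameNode host
for port in 50070 50470 8020; do
  if port_open "$host" "$port"; then
    echo "$host:$port reachable"
  else
    echo "$host:$port not reachable"
  fi
done
```

The same port_open helper works for the DataNode and Secondary NameNode ports listed above.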

2. MapReduce Ports

The following table lists the default ports used by the various MapReduce services.
Table 2.2. MapReduce Ports
JobTracker WebUI (Master Nodes: the JobTracker node and any back-up JobTracker node)
  • 50030 (http): Web UI for the JobTracker. End-user access: Yes. Configuration: mapred.job.tracker.http.address

JobTracker (Master Nodes: the JobTracker node)
  • 8021 (IPC): For job submissions. End-user access: Yes (all clients that submit MapReduce jobs, including Hive, Hive server, and Pig). Configuration: embedded in the URI specified by mapred.job.tracker

TaskTracker Web UI and Shuffle (All Slave Nodes)
  • 50060 (http): TaskTracker web UI to access status, logs, etc. End-user access: Yes (typically admins, dev/support teams). Configuration: mapred.task.tracker.http.address

History Server WebUI
  • 51111 (http): Web UI for job history. End-user access: Yes. Configuration: mapreduce.history.server.http.address
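When scripting health checks against these services, a small lookup for the defaults in Table 2.2 saves hard-coding ports throughout. The service labels below are informal names invented for this sketch, not HDP configuration keys.

```shell
# Sketch: map an informal MapReduce service label to its default port
# (per Table 2.2). The labels are this sketch's own naming.
mapred_default_port() {
  case "$1" in
    jobtracker-webui)    echo 50030 ;;
    jobtracker-ipc)      echo 8021  ;;
    tasktracker-webui)   echo 50060 ;;
    historyserver-webui) echo 51111 ;;
    *) echo "unknown service: $1" >&2; return 1 ;;
  esac
}
```

For example, `curl -s "http://jobtracker-host:$(mapred_default_port jobtracker-webui)/"` fetches the JobTracker web UI front page.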

3. Hive Ports

The following table lists the default ports used by the various Hive services.
Note: Neither of these services is used in a standard HDP installation.
Table 2.3. Hive Ports
HiveServer2 (Hive Server machine, usually a utility machine)
  • 10000 (thrift): Service for programmatically (Thrift/JDBC) connecting to Hive. End-user access: Yes (clients that need to connect to Hive programmatically, or through SQL UI tools that use JDBC). Configuration: ENV variable HIVE_PORT

Hive Metastore
  • 9083 (thrift): Hive metastore service. End-user access: Yes (clients that run Hive, Pig, and potentially M/R jobs that use HCatalog). Configuration: hive.metastore.uris
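Clients reach these two services with well-known URI shapes: a hive2 JDBC URL for HiveServer2 and a thrift URI for the metastore. The helpers below just compose those strings from a hostname; hive.example.com is a placeholder.

```shell
# Sketch: compose the client connection strings implied by Table 2.3.
# hive.example.com is a placeholder hostname; "default" is Hive's default DB.
hive_jdbc_url() { echo "jdbc:hive2://$1:${2:-10000}/default"; }
hive_metastore_uri() { echo "thrift://$1:${2:-9083}"; }

hive_jdbc_url hive.example.com        # jdbc:hive2://hive.example.com:10000/default
hive_metastore_uri hive.example.com   # thrift://hive.example.com:9083
```

The metastore URI is exactly the form that hive.metastore.uris expects.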

4. HBase Ports

The following table lists the default ports used by the various HBase services.
Table 2.4. HBase Ports
HMaster (Master Nodes: the HBase Master node and any back-up HBase Master node)
  • 60000: End-user access: Yes. Configuration: hbase.master.port

HMaster Info Web UI (Master Nodes: the HBase Master node and back-up HBase Master node, if any)
  • 60010 (http): The port for the HBase Master web UI. Set to -1 if you do not want the info server to run. End-user access: Yes. Configuration: hbase.master.info.port

Region Server (All Slave Nodes)
  • 60020: End-user access: Yes (typically admins, dev/support teams). Configuration: hbase.regionserver.port
  • 60030 (http): Region Server web UI. End-user access: Yes (typically admins, dev/support teams). Configuration: hbase.regionserver.info.port

ZooKeeper (All ZooKeeper Nodes)
  • 2888: Port used by ZooKeeper peers to talk to each other. End-user access: No. Configuration: hbase.zookeeper.peerport
  • 3888: Port used by ZooKeeper peers to talk to each other (leader election). Configuration: hbase.zookeeper.leaderport
  • 2181: Property from ZooKeeper's zoo.cfg; the port at which clients connect. Configuration: hbase.zookeeper.property.clientPort
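The ZooKeeper client port (2181) is easy to smoke-test: ZooKeeper replies "imok" to the four-letter command ruok. A bash sketch, with a placeholder host:

```shell
# Sketch: send ZooKeeper's "ruok" four-letter command to the client port
# (2181 by default, per Table 2.4). Requires bash (/dev/tcp).
zk_ruok() {
  ( exec 3<>"/dev/tcp/$1/${2:-2181}" &&
    printf 'ruok' >&3 &&
    cat <&3 ) 2>/dev/null
}

# usage on a cluster node: zk_ruok zk.example.com   (prints "imok" when healthy)
```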

5. WebHCat Port

The following table lists the default ports used by the WebHCat service.
Table 2.5. WebHCat Port
WebHCat Server (any utility machine)
  • 50111 (http): Web API on top of HCatalog and other Hadoop services. End-user access: Yes. Configuration: templeton.port
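WebHCat (Templeton) exposes a REST status endpoint on templeton.port that makes a convenient liveness probe. The helper below only builds the URL; webhcat.example.com is a placeholder.

```shell
# Sketch: build the WebHCat (Templeton) status URL for the default port
# 50111 from Table 2.5.
webhcat_status_url() { echo "http://$1:${2:-50111}/templeton/v1/status"; }

# usage: curl -s "$(webhcat_status_url webhcat.example.com)"
```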

6. Ganglia Ports

The following table lists the default ports used by the various Ganglia services.

Table 2.6. Ganglia Ports
Ganglia server
  • 8660/8661/8662/8663: For gmond collectors
  • 8651: For gmetad
All Slave Nodes
  • 8660: For gmond agents

7. MySQL Ports

The following table lists the default ports used by the various MySQL services.
Table 2.7. MySQL Ports
MySQL (MySQL database server)
  • 3306