Sunday, 4 January 2015

How to add new nodes to a Hadoop cluster?

Hadoop does not require any downtime to add new nodes to a cluster. The following steps explain the procedure for adding new nodes to an existing Hadoop cluster.
  • Prepare the machine with suitable hardware and software compatible with the rest of the cluster (operating system, Java, and the same Hadoop version).
  • Install the DataNode and TaskTracker (MRv1) or NodeManager (YARN) on the new machine.
  • Configure the data storage directories and the temporary storage directories.
  • Copy the same configuration files that the existing nodes in the cluster use.
  • Connect the new machine to the cluster network and make sure it can resolve and reach the NameNode and the JobTracker/ResourceManager by hostname.
  • If the cluster is configured with an allowed hosts (include) file, add the hostnames of the new machines to that file and refresh the node list on the NameNode and the JobTracker/ResourceManager. Otherwise the new nodes will not be admitted to the cluster.
  • Start the DataNode and TaskTracker/NodeManager services on the new machines.
  • The new machines register with the cluster automatically. You can verify this in the NameNode and JobTracker/ResourceManager web UIs.
  • After the nodes are added, run the HDFS balancer if required. Existing blocks are not moved to the new nodes on their own, so the new DataNodes stay mostly empty until new data is written or the balancer runs.
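
The allowed hosts step can be scripted. The following is a minimal sketch in Python, not part of the original procedure: it appends new hostnames to an include file without creating duplicates, and lists the commands you would typically run afterwards on a YARN cluster. The function name `add_hosts`, the file path, and the hostnames are hypothetical; the include file is assumed to be the one your cluster points to through `dfs.hosts` (HDFS) and `yarn.resourcemanager.nodes.include-path` (YARN), with one hostname per line.

```python
from pathlib import Path


def add_hosts(include_file, new_hosts):
    """Append hostnames that are not yet listed in the include file.

    Returns the list of hostnames that were actually added.
    Assumes the file holds whitespace-separated hostnames (one per line).
    """
    path = Path(include_file)
    text = path.read_text() if path.exists() else ""
    seen = set(text.split())

    added = []
    for host in new_hosts:
        host = host.strip()
        if host and host not in seen:
            seen.add(host)
            added.append(host)

    if added:
        # Keep the existing content intact; only make sure it ends with a newline.
        if text and not text.endswith("\n"):
            text += "\n"
        path.write_text(text + "\n".join(added) + "\n")
    return added


# Commands usually run after editing the include file (YARN-based cluster).
# On an MRv1 cluster the second one would be `hadoop mradmin -refreshNodes`.
FOLLOW_UP_COMMANDS = [
    ["hdfs", "dfsadmin", "-refreshNodes"],     # NameNode re-reads the hosts files
    ["yarn", "rmadmin", "-refreshNodes"],      # ResourceManager re-reads its hosts files
    ["hdfs", "dfsadmin", "-report"],           # confirm the new DataNodes are listed
    ["hdfs", "balancer", "-threshold", "10"],  # optional: spread existing blocks
]
```

For example, `add_hosts("/etc/hadoop/conf/dfs.include", ["worker05", "worker06"])` (hypothetical path and hostnames) updates the file, after which the commands in `FOLLOW_UP_COMMANDS` can be run by hand or passed one by one to `subprocess.run`. The balancer threshold of 10 percent is only an illustrative value.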

