A Hadoop administrator works closely with the database, network, BI and application teams to make sure that all Big Data applications are highly available and performing as expected. The Hadoop admin is responsible for capacity planning and for estimating the requirements for scaling the Hadoop cluster up or down.
The Hadoop admin is also responsible for deciding the size of the Hadoop cluster based on the data to be stored in HDFS. Day-to-day duties include keeping the Hadoop cluster up and running at all times, monitoring cluster connectivity and performance, managing and reviewing Hadoop log files, handling backup and recovery, managing resources and security, and troubleshooting application errors so that they do not occur again.
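Sizing a cluster from the data to be stored in HDFS is largely arithmetic, and a rough sketch can make the reasoning concrete. The function below is only an illustration: the name `estimate_datanodes`, the 25% headroom for temporary and non-HDFS data, and the 48 TB of disk per node are assumptions chosen for the example, not recommendations. The replication factor of 3 matches the HDFS default (`dfs.replication`).

```python
import math


def estimate_datanodes(data_tb, replication=3, overhead=0.25,
                       disk_per_node_tb=48.0, annual_growth=0.0, years=0):
    """Rough estimate of the number of DataNodes needed.

    data_tb          -- logical data to store in HDFS today, in TB
    replication      -- HDFS replication factor (HDFS default is 3)
    overhead         -- fraction of each disk kept free for temporary and
                        non-HDFS data (assumed value, tune per cluster)
    disk_per_node_tb -- raw disk per DataNode in TB (assumed value)
    annual_growth    -- expected yearly data growth, e.g. 0.3 for 30%
    years            -- planning horizon in years
    """
    if not 0 <= overhead < 1:
        raise ValueError("overhead must be in [0, 1)")
    # Project the logical data volume over the planning horizon.
    projected_tb = data_tb * (1 + annual_growth) ** years
    # Every block is stored `replication` times; only (1 - overhead)
    # of the raw disk is available for HDFS blocks.
    raw_tb = projected_tb * replication / (1 - overhead)
    return math.ceil(raw_tb / disk_per_node_tb)


if __name__ == "__main__":
    # 100 TB of data, default assumptions: 300 TB replicated,
    # 400 TB raw after headroom, 400 / 48 rounds up to 9 nodes.
    print(estimate_datanodes(100))
```

A real sizing exercise would also account for compression, CPU and memory requirements, and workload patterns, which this sketch deliberately ignores.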
A Hadoop admin should not settle for a quick fix to a problem, but should be curious enough to find its root cause and solve it in a way that prevents further issues.
• Deploying and maintaining a Hadoop cluster, adding and removing nodes using cluster monitoring tools like Ganglia, Nagios or Cloudera Manager, configuring NameNode high availability and keeping track of all the running Hadoop jobs.
• Implementing, managing and administering the overall Hadoop infrastructure.
• Taking care of the day-to-day running of Hadoop clusters, around the clock when necessary.
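Reviewing log files is one of the routine tasks that lends itself to a small script. The sketch below counts WARN, ERROR and FATAL entries per logging class so that recurring problems stand out. It assumes the daemon logs follow the common log4j layout `%d{ISO8601} %p %c: %m%n`; if a cluster uses a different pattern, the regular expression needs adjusting. The function name `summarize_log` and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# 2024-01-01 10:00:05,120 ERROR org.apache.hadoop.ipc.Server: message
# (assumed log4j layout: %d{ISO8601} %p %c: %m%n)
LOG_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} "
    r"(?P<level>[A-Z]+) (?P<source>\S+): (?P<message>.*)$"
)


def summarize_log(lines, levels=("WARN", "ERROR", "FATAL")):
    """Count log entries per (level, logging class) for the given levels."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Lines that do not match (e.g. stack trace continuations) are skipped.
        if match and match.group("level") in levels:
            counts[(match.group("level"), match.group("source"))] += 1
    return counts


if __name__ == "__main__":
    # Illustrative sample lines, not taken from a real cluster.
    sample = [
        "2024-01-01 10:00:00,001 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting",
        "2024-01-01 10:00:05,120 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Disk failure",
        "2024-01-01 10:00:06,300 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Disk failure",
        "2024-01-01 10:00:07,450 WARN org.apache.hadoop.ipc.Server: Slow call",
    ]
    for (level, source), count in summarize_log(sample).most_common():
        print(f"{count:5d}  {level:5s}  {source}")
```

In practice the same function could be fed an open log file, since iterating over a file object yields its lines one at a time.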
Other important skills required:
• Excellent knowledge of UNIX/Linux, because Hadoop runs on Linux.
• A high degree of knowledge of configuration management and automation tools.
• Knowledge of core Java is a plus for a Hadoop admin but not mandatory.
• Good understanding of OS concepts, process management and resource scheduling.
• Basics of networking, CPU, memory and storage.
• Good command of shell and Python scripting.
• Familiarity with the components of the Hadoop ecosystem, such as Impala, Apache Pig, Apache Hive, Apache HBase and Spark.