
Hadoop Engineer Resume


West Chester, PA

PROFESSIONAL SUMMARY:

  • 7+ years of professional experience in the IT field, including 3+ years maintaining, monitoring, deploying and upgrading Apache Hadoop clusters, primarily on the Cloudera distribution.
  • Experience deploying both MRv1 and MRv2 (YARN).
  • Strong knowledge of configuring High Availability and NameNode Federation in a cluster.
  • Hands-on experience with Hadoop ecosystem components such as MapReduce, HDFS, ZooKeeper, Oozie, Hive, Cassandra, Sqoop, Pig and Flume.
  • Experience using scripting languages such as Pig Latin to manipulate data.
  • Strong understanding of Hadoop architecture and the MapReduce framework.
  • Experience in designing, implementing and managing secure authentication for Hadoop clusters with Kerberos.
  • Strong knowledge of setting up automatic and manual failover control using ZooKeeper and Quorum Journal Nodes (see the failover sketch after this list).
  • Experience in upgrading existing Hadoop clusters to the latest releases.
  • Experience working with Flume to load log data from multiple sources directly into HDFS.
  • Hands-on experience in installing, configuring and deploying Hadoop distributions in cloud environments (Amazon Web Services).
  • Experience in performing joins with MapReduce and Hive.
  • Experience in configuring Hadoop monitoring tools: Nagios and Ganglia.
  • Experience in transferring data between HDFS and relational databases with Sqoop (see the Sqoop sketch after this list).
  • Good understanding of distributed systems and parallel processing architectures.
  • Hands-on experience in upgrading and applying patches to the Cloudera distribution.
  • Good knowledge of Cassandra, MongoDB, Netezza and Vertica.
  • Developed administrative scripts, such as Kickstart scripts, in shell.
  • Experience using crontab to back up data and schedule jobs (see the crontab sketch after this list).
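
A minimal sketch of the HDFS failover controls referenced above, assuming a Hadoop 2.x cluster with two NameNodes; the NameNode IDs nn1/nn2 are hypothetical placeholders that would be defined in hdfs-site.xml.

    # Check which NameNode is currently active (nn1/nn2 are
    # hypothetical NameNode IDs defined in hdfs-site.xml)
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2

    # Manual failover: make nn2 active and nn1 standby
    hdfs haadmin -failover nn1 nn2

    # Automatic failover: initialize the ZooKeeper znode once,
    # then run a ZKFC daemon alongside each NameNode
    hdfs zkfc -formatZK
    hadoop-daemon.sh start zkfc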
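
The Sqoop transfers mentioned above generally reduce to import/export commands like the following sketch; the JDBC URL, credentials, table and directory names are illustrative only.

    # Import a MySQL table into HDFS (host, database, table and
    # credentials are placeholders)
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4

    # Export analyzed results from HDFS back to the database
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table order_summary \
      --export-dir /data/out/order_summary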
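
The crontab-driven backups would look roughly like this entry (added via crontab -e); the DistCp source, destination and log path are assumptions, not actual production values.

    # Nightly 01:00 DistCp copy of a critical HDFS directory to a
    # backup cluster (hosts and paths are placeholders)
    0 1 * * * hadoop distcp hdfs://prod-nn:8020/data/critical hdfs://backup-nn:8020/backups/critical >> /var/log/hdfs_backup.log 2>&1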

TECHNICAL SKILLS:

Hadoop Ecosystem Development: HDFS, Hive, Pig, Flume, Oozie, ZooKeeper, HBase and Sqoop.

Operating Systems: Linux, Windows XP, Windows Server 2008.

Databases: MySQL, Oracle, SQL Server

NoSQL: Cassandra

Languages: C, C++, Java, Python, Pig Latin, shell scripting, HiveQL.

Network Administration: TCP/IP fundamentals, LAN and WAN.

Cluster Management Tools: Cloudera Manager, Ambari

Security: Kerberos

PROFESSIONAL EXPERIENCE:

Confidential, West Chester, PA

Hadoop Engineer

Responsibilities:

  • Major tasks included upgrading, maintaining and monitoring the cluster and its clients.
  • Gathered requirements from the Engineering and Reporting teams to design solutions on the Hadoop ecosystem.
  • Installed and configured Flume, Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
  • Deployed a cluster using CDH, integrated with Nagios and Ganglia for monitoring.
  • Deployed a remote Hive Metastore using MySQL.
  • Implemented HDFS High Availability using the Quorum Journal Manager (QJM) to avoid a single point of failure (SPOF).
  • Imported weblogs from the web servers into HDFS using Flume (see the Flume sketch after this list).
  • Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
  • Maintained, audited and built new clusters for testing purposes using Cloudera Manager.
  • Worked on analyzing data with Hive and Pig.
  • Wrote custom monitoring scripts for Nagios to monitor the daemons and the cluster status (see the Nagios sketch after this list).
  • Developed Pig scripts to process raw threat-analysis data for analysis.
  • Performed a POC on Sqoop imports from heterogeneous data sources to HDFS.
  • Exported the analyzed data to relational databases using Sqoop for visualization and report generation.
  • Automated job actions such as start, stop, suspend, resume and rerun using Oozie (see the Oozie sketch after this list).
  • Wrote custom shell scripts to automate repetitive tasks on the cluster.
  • Debugged and resolved major Cloudera Manager issues by working directly with Cloudera support.
  • Worked with the Cassandra Query Language (CQL) for Apache Cassandra.
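
A sketch of the kind of Flume agent used to ship weblogs into HDFS, assuming an exec source tailing an Apache access log; the agent, source, channel and sink names, paths and schedule shown are illustrative, not actual production values.

    # Hypothetical Flume agent that tails an access log into HDFS
    cat > /etc/flume-ng/conf/weblog.properties <<'EOF'
    agent1.sources  = tail-src
    agent1.channels = mem-ch
    agent1.sinks    = hdfs-sink

    agent1.sources.tail-src.type = exec
    agent1.sources.tail-src.command = tail -F /var/log/httpd/access_log
    agent1.sources.tail-src.channels = mem-ch

    agent1.channels.mem-ch.type = memory

    agent1.sinks.hdfs-sink.type = hdfs
    agent1.sinks.hdfs-sink.hdfs.path = /data/weblogs/%Y-%m-%d
    agent1.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
    agent1.sinks.hdfs-sink.channel = mem-ch
    EOF

    flume-ng agent -n agent1 -f /etc/flume-ng/conf/weblog.properties \
      -c /etc/flume-ng/conf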
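
The custom Nagios scripts followed the standard plugin convention (exit 0/1/2/3 for OK/WARNING/CRITICAL/UNKNOWN). A hypothetical check for dead DataNodes might look like the following; the report-parsing pattern depends on the Hadoop version.

    #!/bin/bash
    # check_hdfs_datanodes.sh - hypothetical Nagios plugin
    # Nagios exit codes: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
    report=$(hdfs dfsadmin -report 2>/dev/null) || { echo "UNKNOWN: dfsadmin failed"; exit 3; }

    # "Dead datanodes (N):" appears in the report; the exact wording
    # varies across Hadoop versions, so adjust the pattern as needed
    dead=$(echo "$report" | sed -n 's/^Dead datanodes (\([0-9]\+\)).*/\1/p' | head -1)

    if [ -z "$dead" ] || [ "$dead" -eq 0 ]; then
        echo "OK: no dead DataNodes"; exit 0
    elif [ "$dead" -le 2 ]; then
        echo "WARNING: $dead dead DataNode(s)"; exit 1
    else
        echo "CRITICAL: $dead dead DataNode(s)"; exit 2
    fi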
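
The Oozie automation boils down to CLI calls like these; the server URL and workflow ID are placeholders, and oozie.wf.rerun.failnodes=true is assumed to be set in job.properties for the rerun.

    # Oozie CLI calls used to drive the workflow lifecycle
    export OOZIE_URL=http://oozie-host:11000/oozie

    oozie job -config job.properties -run    # start; prints a job ID
    oozie job -suspend 0000001-140101000000000-oozie-W
    oozie job -resume  0000001-140101000000000-oozie-W
    oozie job -kill    0000001-140101000000000-oozie-W
    # Rerun failed nodes only (job.properties sets
    # oozie.wf.rerun.failnodes=true)
    oozie job -rerun   0000001-140101000000000-oozie-W -config job.properties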

Confidential, Louisville, KY

Hadoop Engineer

Responsibilities:

  • Installed and configured various components of Hadoop ecosystem and maintained their integrity.
  • Commissioned DataNodes as data grew and decommissioned them when the hardware degraded.
  • Transferred data between RDBMS and HDFS using Sqoop.
  • Installed and configured Ganglia and Nagios to monitor Hadoop clusters.
  • Managed Hadoop clusters using Cloudera Manager. Performed a major upgrade in Cloudera distribution for Hadoop.
  • Implemented security for the Hadoop cluster using Kerberos authentication (see the Kerberos sketch after this list).
  • Wrote MapReduce jobs and custom UDFs in Hive and Pig over HDFS data to analyze customer behavior.
  • Performed joins with MapReduce and Hive.
  • Used Pig Latin to analyze large amounts of data.
  • Handled XML data and parsed it with MapReduce jobs.
  • Developed event-driven ELT applications that support data flows from RDBMS result sets and log files into HDFS.
  • Managed day-to-day operations of the cluster, including backup, support and maintenance.
  • Converted existing SQL queries to Hive queries (illustrated in the load-and-query sketch after this list).
  • Configured passwordless SSH across the nodes (see the SSH sketch after this list).
  • Applied Mellanox-provided patches (jar files) to the Apache Hadoop source package.
  • Loaded data from the UNIX file system into HDFS (see the load-and-query sketch after this list).
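
Securing the cluster with Kerberos involves creating service principals and keytabs along these lines; the realm EXAMPLE.COM, hostnames and keytab paths are hypothetical.

    # Run on the KDC: create a service principal and export its keytab
    kadmin.local -q "addprinc -randkey hdfs/node01.example.com@EXAMPLE.COM"
    kadmin.local -q "xst -k /etc/security/keytabs/hdfs.keytab hdfs/node01.example.com@EXAMPLE.COM"

    # On the cluster node: obtain a ticket from the keytab, then verify
    kinit -kt /etc/security/keytabs/hdfs.keytab hdfs/node01.example.com@EXAMPLE.COM
    klist
    hdfs dfs -ls /    # fails without a valid Kerberos ticket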
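
Passwordless SSH setup is a one-time key distribution, sketched below; the hadoop user and the dn01..dn03 hostnames are assumptions for illustration.

    # Generate a key pair once, then push the public key to each node
    ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
    for node in dn01 dn02 dn03; do
      ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@$node
    done
    ssh hadoop@dn01 hostname    # should log in without a password prompt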
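
Loading UNIX file-system data into HDFS and querying it with a Hive translation of an existing SQL report might look like the following; the clicks table, paths and columns are invented for the example.

    # Copy local log files into HDFS
    hdfs dfs -mkdir -p /data/staging/clicks
    hdfs dfs -put /var/data/clicks/*.log /data/staging/clicks/

    # Define an external Hive table over that directory, then run
    # the converted query
    hive -e "
      CREATE EXTERNAL TABLE IF NOT EXISTS clicks (user_id STRING, url STRING, ts STRING)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
      LOCATION '/data/staging/clicks';

      SELECT user_id, COUNT(*) AS visits
      FROM clicks
      GROUP BY user_id
      ORDER BY visits DESC
      LIMIT 10;"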

Confidential

Network Administrator

Responsibilities:

  • Worked closely with infrastructure, network, database and application teams to ensure business applications were highly available and performing within service levels.
  • Installed, configured, upgraded and administered Red Hat and other Linux operating systems.
  • Managed patching, system performance and network communication monitoring, backups, risk mitigation, troubleshooting, application enhancements, software upgrades and modifications of the Linux servers.
  • Reviewed existing software/hardware architecture to identify areas for improvement in scalability, maintainability and performance.
  • Created and maintained user account information and set up security policies for users.
  • Verified that peripherals were working properly.
  • Developed administrative scripts, such as Kickstart scripts, in shell.
  • Performed backups, file replication and script management for servers.
  • Conducted root cause analysis and resolved production problems and data issues.
  • Built various servers, including web, FTP, NFS, mail and NTP servers.
  • Performed various configurations, including networking and iptables setup, hostname resolution and SSH key-based login (see the sketch after this list).
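
A representative sketch of the networking, iptables and SSH configuration steps on a RHEL-style host; the interfaces, addresses and hostnames are placeholders.

    # Allow established traffic and SSH; drop other inbound traffic
    iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    iptables -A INPUT -p tcp --dport 22 -j ACCEPT
    iptables -A INPUT -j DROP
    service iptables save

    # Local hostname resolution for cluster nodes
    echo "192.168.1.10  node01.example.com node01" >> /etc/hosts

    # Key-based SSH login (mirrors the SSH sketch in the previous role)
    ssh-copy-id -i ~/.ssh/id_rsa.pub admin@node01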
