Hadoop Developer Resume

OBJECTIVE:

Seeking a challenging role in the IT industry as a Technical Architect/Developer/Quality Assurance Analyst, contributing to organizational success while growing professionally.

SUMMARY:

  • Over 8 years of diversified IT experience, including 2 years implementing Big Data solutions using the Cloudera Apache Hadoop distribution.
  • Worked on a 34-node Cloudera Hadoop cluster (CDH 5.2) for the SCPP EVO LRI EQ project.
  • Used Spark for real-time application queries and end-of-day (EOD) batches.
  • Expertise in Hadoop architecture and its various components - Hadoop Distributed File System (HDFS), MapReduce, NameNode, DataNode, JobTracker, TaskTracker, and Secondary NameNode.
  • Good understanding of Hadoop MapReduce programming paradigm.
  • Good Knowledge on Hadoop Cluster architecture and monitoring.
  • Experience writing Hive and Pig queries through the command-line shell.
  • Experience in managing and reviewing Hadoop log files.
  • Strong understanding of Hadoop ecosystem components such as HDFS, MapReduce, Sqoop, Flume, Oozie, Pig, Hive, HBase, and Zookeeper.
  • Proficiency in Java, Hadoop MapReduce, Pig, Hive, HBase, Sqoop, Flume, Scala, Spark, Kafka, Storm, Oozie, and Impala.
  • Experience developing MapReduce programs using Apache Hadoop to analyze Big Data as per requirements.
  • Working knowledge of the major Hadoop ecosystem tools Pig, Hive, and HBase, as well as Cloudera Manager.
  • Experience developing Pig Latin scripts and using Hive Query Language.
  • Experience working with NoSQL databases, including Cassandra and HBase.
  • Experience developing Pig Latin and HiveQL scripts for data analysis and ETL, and extending their default functionality with User Defined Functions (UDFs) for data-specific processing.
  • Experience migrating data between RDBMS/unstructured sources and HDFS using Sqoop and Flume.
  • Hands-on experience developing workflows that execute MapReduce, Sqoop, Flume, Hive and Pig scripts using Oozie.
  • Well-versed database development knowledge using SQL data types, Indexing, Joins, Views, Transactions, Large Objects and Performance tuning.
  • Good knowledge of data warehousing concepts, ETL, and Teradata.
  • Experience writing Shell scripts in Linux OS and integrating them with other solutions.
  • Fluent in core Java concepts such as I/O, multi-threading, exceptions, RegEx, collections, data structures, and serialization.
  • Expertise in using automation testing tools like HP Quick Test Professional (QTP), and Load Runner.
  • Strong knowledge of database programming using RDBMS products such as SQL Server 7.0/2000/2005, Oracle 7.0/8/8i/9i, and MS Access. Expertise in writing PL/SQL queries, stored procedures, triggers, packages, cursors, etc.
  • Good knowledge in Quality Assurance Life Cycle (QALC), Software Development Life Cycle (SDLC), Software Test Life Cycle (STLC), Object Oriented Analysis and Design (OOAD).
  • Excellent analytical skills for understanding business processes, functionality, and requirements, and translating them into system requirement specifications.
  • Experience preparing test plans and test data and executing test cases to ensure application functionality meets user requirements.
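
As an illustration of the HiveQL scripting experience above, a date-partitioned daily load is a typical pattern. This is a sketch only: the table, column, and partition names below are invented for the example, not taken from any project described here.

```shell
#!/usr/bin/env bash
# Sketch only: table, column, and partition names are invented for illustration.
# Creates a date-partitioned Hive table and loads one day's data into it.

load_daily_partition() {
    local day=$1
    hive -e "
        CREATE TABLE IF NOT EXISTS web_logs (ip STRING, url STRING, status INT)
        PARTITIONED BY (log_date STRING)
        STORED AS ORC;

        INSERT OVERWRITE TABLE web_logs PARTITION (log_date='${day}')
        SELECT ip, url, status FROM staging_logs WHERE dt='${day}';
    "
}

# Example invocation (requires a configured Hive client):
# load_daily_partition 2015-06-01
```

Partitioning by date keeps each day's load idempotent (INSERT OVERWRITE replaces only that partition) and lets queries prune to the partitions they need.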

SKILL:

Hadoop Ecosystem: HDFS, MapReduce, Sqoop, Flume, Hive, Pig, HBase, YARN, Oozie, Impala, Zookeeper, Kafka, Cloudera Manager, Spark

Hadoop Distributions: Apache Hadoop, CDH3, CDH4, Hortonworks.

Programming Languages: Core Java, C, HTML, Visual Basic, ASP.NET, .NET, ADO.NET, XML, Scala

Scripting Languages: Unix/Linux Shell Scripting, JavaScript, VBScript, Python

Automation/ETL Tools: HP Quick Test Pro, iMacros, Selenium, Ab Initio, Excel Macro.

IDE/Tools/Utilities: Eclipse IDE, MS Visual Studio 2010, Control M, Tivoli.

Methodologies: UML, OOP and Agile-Scrum.

Database Technologies: Oracle 10g/11g, MS SQL Server, Teradata, and data warehousing

NoSQL Databases: HBase and Cassandra

Application/Web Servers: Apache, Tomcat, MS IIS, Splunk

Version Control Tools: Tortoise CVS Client, SVN, MS Team Foundation Server (TFS).

Defect Tracking Tools: Test Director, HP Quality Center, Jira, HP ALM.

Operating Systems: LINUX/UNIX, Windows 7, Windows Server 2003/2008

EXPERIENCE:

Hadoop Developer

Confidential

Responsibilities:

  • Planned, installed, configured, maintained, and monitored Hadoop clusters using Apache Cloudera (CDH4, CDH5) distributions.
  • Worked on Cloudera Hadoop upgrades and patches and on installation of ecosystem products through Cloudera Manager, along with Cloudera Manager upgrades.
  • Set up data ingestion tools such as Flume, Sqoop, SFTP, and NDM.
  • Installed and set up HBase.
  • Developed automated scripts using Unix Shell for running the Balancer, file system health checks, schema creation in Hive, and user/group creation on HDFS.
  • Developed applications and provided solutions to business requirements.
  • Added and decommissioned Hadoop cluster nodes, including balancing HDFS block data.
  • Set up quotas on HDFS and implemented rack topology scripts.
  • Managed and reviewed Hadoop log files; performed file system management and monitoring and Hadoop cluster capacity planning.
  • Configured Sqoop and exported/imported data into HDFS.
  • Performed cluster maintenance, including adding and removing cluster nodes, cluster monitoring, and troubleshooting.
  • Involved in log file management: Hadoop logs older than 7 days were removed from the log folder and loaded into HDFS, where they were retained for 2 years for audit purposes.
  • Worked on the Sqoop API and created a customized version of Sqoop for the CDS distribution with many added features.
  • Collaborate with cross-functional teams to ensure that applications are properly tested, configured, and deployed.
  • Integrated Hadoop connectors for various databases into the existing Sqoop setup.
  • Provided solutions for streamlining data.
  • Used compression and encryption technologies to process data before storing it in HDFS.
  • Successfully moved data between databases by landing files in HDFS.
  • Worked with GPFS, Hive, Exacta, MS SQL Server, and Teradata.
  • Used Oozie for scheduling jobs on HDFS.
  • Wrote multiple MapReduce jobs for various requirements and purposes.
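
The log-management duty described above (moving Hadoop logs older than 7 days into HDFS for long-term audit retention) could be sketched as a small shell routine. The paths and retention values below are assumptions for illustration, not values from the project.

```shell
#!/usr/bin/env bash
# Sketch only: paths and retention windows are assumptions, not project values.
# Moves local Hadoop log files older than N days into an HDFS audit directory.

archive_old_logs() {
    local log_dir=$1 retention_days=$2 hdfs_dest=$3
    # Find files older than the retention window, land each in HDFS,
    # then remove the local copy only after the upload succeeds.
    find "$log_dir" -type f -mtime +"$retention_days" | while read -r f; do
        hdfs dfs -mkdir -p "$hdfs_dest"
        hdfs dfs -put -f "$f" "$hdfs_dest/" && rm -f "$f"
    done
}

# Example (assumed paths): archive logs older than 7 days for audit storage.
# archive_old_logs /var/log/hadoop 7 /audit/hadoop-logs
```

A cron entry or Oozie coordinator would typically drive this daily; HDFS-side cleanup after the 2-year audit window would be a separate job.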

Environment: Java, RESTful services, Hadoop, MapReduce, Hive, HBase, Sqoop, JUnit, Oracle, Teradata, Greenplum, TDCH, Ab Initio, Control-M, Oozie, Oracle Hadoop Connectors, and Tableau

Confidential

Hadoop Developer

Responsibilities:

  • Responsible for architecting Hadoop clusters with CDH3.
  • Involved in the installation of CDH3 and the upgrade from CDH3 to CDH4.
  • Installed the cluster and worked on commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
  • Developed automated scripts using Unix Shell for running the Balancer, file system health checks, schema creation in Hive, and user/group creation on HDFS.
  • Developed applications and provided solutions to business requirements.
  • Added and decommissioned Hadoop cluster nodes, including balancing HDFS block data.
  • Set up quotas on HDFS and implemented rack topology scripts.
  • Managed and reviewed Hadoop log files; performed file system management and monitoring and Hadoop cluster capacity planning.
  • Involved in log file management: Hadoop logs older than 7 days were removed from the log folder and loaded into HDFS, where they were retained for 2 years for audit purposes.
  • Created various MapReduce jobs to perform ETL transformations on transactional and application-specific data sources.
  • Configured Flume to ingest trade data into the HBase database from various JMS sources (MQ).
  • Responsible for designing and managing the Sqoop jobs that moved data between Oracle and HDFS/Hive.
  • Performed joins, group-by, and other operations in MapReduce using Java and Pig.
  • Processed and formatted the output from Pig and Hive before sending it to the Hadoop output file.
  • Reviewed HDFS usage and system design for future scalability and fault tolerance.
  • Set up and benchmarked Hadoop/HBase clusters for internal use.
  • Wrote and executed Pig scripts using the Grunt shell.
  • Installed and configured Hadoop, MapReduce, and HDFS.
  • Used HiveQL to analyze the data and identify correlations.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Developed UDFs in Java as needed for use in Pig and Hive queries.
  • Used Flume to collect log data with error messages across the cluster.
  • Designed and maintained Oozie workflows to manage the flow of jobs in the cluster.
  • Applied a good understanding of partitioning and bucketing concepts in Hive to optimize performance.
  • Developed and scheduled Autosys jobs for the EOD process.
  • Managed Hadoop clusters, including adding and removing cluster nodes for maintenance and capacity needs.
  • Monitored and managed the Hadoop cluster using Cloudera Manager.
  • Actively updated upper management with daily progress reports, including the classification levels achieved on the data.
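
The Oracle-to-Hive Sqoop jobs described above are typically single parallel `sqoop import` invocations. Every identifier below (host, service name, user, table, split column) is a placeholder for illustration, not a value from the project.

```shell
#!/usr/bin/env bash
# Sketch only: connection string, table, and column names are placeholders.
# Launches a parallel Sqoop import from Oracle straight into a Hive table.

import_trades_to_hive() {
    sqoop import \
        --connect "jdbc:oracle:thin:@//db-host:1521/ORCL" \
        --username etl_user -P \
        --table TRADES \
        --split-by TRADE_ID \
        --num-mappers 4 \
        --hive-import \
        --hive-table default.trades \
        --target-dir /user/etl/trades
}
```

The --split-by column determines how rows are ranged across the 4 mappers; a well-distributed numeric key avoids skewed map tasks. The reverse direction (Hive/HDFS back to Oracle) uses `sqoop export` with an --export-dir instead.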

Environment: Hadoop, MapReduce, Java, Flume, Sqoop, HBase, Hive, Pig, Autosys Scheduler, Oracle, Shell Scripting, NoSQL, XML, Cloudera Manager