Hadoop Developer Resume
OBJECTIVE:
Seeking a challenging role in the IT industry as a Technical Architect/Developer/Quality Assurance Analyst, contributing to organizational success while advancing professionally.
SUMMARY:
- Over 8 years of diversified IT experience, including 2 years implementing Big Data solutions using the Cloudera distribution of Apache Hadoop.
- Worked on a 34-node Cloudera Hadoop cluster (CDH 5.2) for the SCPP EVO LRI EQ project.
- Used Spark for real-time application queries and end-of-day (EOD) batches.
- Expertise in Hadoop architecture and its components: Hadoop Distributed File System (HDFS), MapReduce, NameNode, DataNode, JobTracker, TaskTracker, and Secondary NameNode.
- Good understanding of Hadoop MapReduce programming paradigm.
- Good Knowledge on Hadoop Cluster architecture and monitoring.
- Experience writing Hive and Pig queries through the command-line shell.
- Experience in managing and reviewing Hadoop log files.
- Strong understanding of Hadoop ecosystem components such as HDFS, MapReduce, Sqoop, Flume, Oozie, Pig, Hive, HBase, and Zookeeper.
- Proficiency in Java, Hadoop MapReduce, Pig, Hive, HBase, Sqoop, Flume, Scala, Spark, Kafka, Storm, Oozie, and Impala.
- Experience developing MapReduce programs using Apache Hadoop for analyzing Big Data as per requirements.
- Working knowledge of major Hadoop ecosystem components: Pig, Hive, HBase, and Cloudera Manager.
- Experience developing Pig Latin scripts and using Hive Query Language (HiveQL).
- Experience working with NoSQL databases, including Cassandra and HBase.
- Experience developing Pig Latin and HiveQL scripts for data analysis and ETL, extending their default functionality with User Defined Functions (UDFs) for data-specific processing.
- Experience migrating data to and from RDBMSs and unstructured sources into HDFS using Sqoop and Flume.
- Hands-on experience developing workflows that execute MapReduce, Sqoop, Flume, Hive and Pig scripts using Oozie.
- Well-versed database development knowledge using SQL data types, Indexing, Joins, Views, Transactions, Large Objects and Performance tuning.
- Good knowledge of data warehousing concepts, ETL, and Teradata.
- Experience writing Shell scripts in Linux OS and integrating them with other solutions.
- Fluent with the core Java concepts like I/O, Multi-threading, Exceptions, RegEx, Collections, Data-structures and Serialization.
- Expertise in using automation testing tools such as HP QuickTest Professional (QTP) and LoadRunner.
- Strong knowledge of database programming using RDBMSs such as SQL Server 7.0/2000/2005, Oracle 7.0/8/8i/9i, and MS Access. Expertise in writing PL/SQL queries, stored procedures, triggers, packages, and cursors.
- Good knowledge in Quality Assurance Life Cycle (QALC), Software Development Life Cycle (SDLC), Software Test Life Cycle (STLC), Object Oriented Analysis and Design (OOAD).
- Excellent analytical skills to understand business processes, functionality, and requirements, and to translate them into system requirement specifications.
- Experience preparing test plans and test data and executing test cases to ensure application functionality meets user requirements.
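The Sqoop-based RDBMS-to-HDFS migration noted above can be sketched as a small shell wrapper. The connection string, table, and target directory are hypothetical examples; `SQOOP` defaults to `echo sqoop` so the sketch prints the command rather than executing it on a cluster:

```shell
#!/usr/bin/env sh
# Hypothetical sketch of a Sqoop import from an RDBMS into HDFS.
# SQOOP defaults to `echo sqoop` so the command is printed, not run;
# set SQOOP=sqoop on a host with a configured Hadoop/Sqoop installation.
SQOOP="${SQOOP:-echo sqoop}"

import_orders() {
    $SQOOP import \
        --connect jdbc:oracle:thin:@db-host:1521/ORCL \
        --username etl_user -P \
        --table ORDERS \
        --target-dir /data/raw/orders \
        --num-mappers 4  # four parallel map tasks split the import
}

import_orders
```

On a real cluster the same wrapper could be driven from Oozie or cron; only the `SQOOP` variable changes.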
SKILL:
Hadoop Ecosystem: HDFS, MapReduce, Sqoop, Flume, Hive, Pig, HBase, YARN, Oozie, Impala, Zookeeper, Kafka, Cloudera Manager, Spark
Hadoop Distributions: Apache Hadoop, CDH3, CDH4, Hortonworks.
Programming Languages: Core Java, C, HTML, Visual Basic, ASP.NET, .NET, ADO.NET, XML, Scala
Scripting Languages: Unix/Linux Shell Scripting, JavaScript, VBScript, Python
Automation/ETL Tools: HP Quick Test Pro, iMacros, Selenium, Ab Initio, Excel Macro.
IDE/Tools/Utilities: Eclipse IDE, MS Visual Studio 2010, Control M, Tivoli.
Methodologies: UML, OOP and Agile-Scrum.
Database Technologies: Oracle 10g/11g, MS SQL Server, Teradata, and data warehouses
NoSQL Databases: HBase and Cassandra
Application/Web Servers: Apache, Tomcat, MSIIS, Splunk
Version Control Tools: Tortoise CVS Client, SVN, MS Team Foundation Server (TFS).
Defect Tracking Tools: Test Director, HP Quality Center, Jira, HP ALM.
Operating Systems: LINUX/UNIX, Windows 7, Windows Server 2003/2008
EXPERIENCE:
Hadoop Developer
Confidential
Responsibilities:
- Planned, installed, configured, maintained, and monitored Hadoop clusters using Apache Cloudera (CDH4, CDH5) distributions.
- Worked on Cloudera Hadoop upgrades and patches, and installed ecosystem products through Cloudera Manager, along with Cloudera Manager upgrades.
- Set up data ingestion tools such as Flume, Sqoop, SFTP, and NDM.
- Installed and set up HBase.
- Developed automated Unix shell scripts for running the balancer, file system health checks, schema creation in Hive, and user/group creation on HDFS.
- Developed applications and provided solutions to business requirements.
- Added and decommissioned Hadoop cluster nodes, including balancing HDFS block data.
- Set up quotas on HDFS and implemented rack topology scripts.
- Managed and reviewed Hadoop log files; performed file system management and monitoring and Hadoop cluster capacity planning.
- Configured Sqoop and exported/imported data into HDFS.
- Performed cluster maintenance, including adding and removing cluster nodes, monitoring, and troubleshooting.
- Managed log files: Hadoop logs older than 7 days were removed from the log folder, loaded into HDFS, and retained for 2 years for audit purposes.
- Worked on the Sqoop API and created a customized Sqoop build for the CDS distribution with many added features.
- Collaborated with cross-functional teams to ensure that applications were properly tested, configured, and deployed.
- Integrated Hadoop connectors for various databases into the existing Sqoop setup.
- Provided solutions for streamlining data.
- Used compression and encryption technologies to process data before storing it in HDFS.
- Successfully moved data from one database to another by landing files in HDFS.
- Worked with GPFS, Hive, Exacta, MS SQL Server, and Teradata.
- Used Oozie for scheduling jobs on HDFS.
- Wrote multiple MapReduce jobs for various requirements and purposes.
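The 7-day log-retention process described above might be automated with a shell script along these lines. The paths are hypothetical, and `HDFS_PUT` defaults to a printing stand-in so the sketch runs without a cluster; on a real installation it would be `hdfs dfs -put`:

```shell
#!/usr/bin/env sh
# Hypothetical sketch of the log-retention job: local Hadoop logs older than
# 7 days are archived to HDFS (for the 2-year audit window) and then removed
# from the local log folder.
LOG_DIR="${LOG_DIR:-./hadoop-logs}"                 # local log folder (assumed)
HDFS_ARCHIVE="${HDFS_ARCHIVE:-/audit/hadoop-logs}"  # HDFS archive path (assumed)
HDFS_PUT="${HDFS_PUT:-echo hdfs dfs -put}"          # use 'hdfs dfs -put' on a real cluster

archive_old_logs() {
    # select regular .log files not modified in the last 7 days
    find "$LOG_DIR" -type f -name '*.log' -mtime +7 | while read -r f; do
        # copy to the archive, then delete locally only if the copy succeeded
        $HDFS_PUT "$f" "$HDFS_ARCHIVE/" && rm -f "$f"
    done
}

if [ -d "$LOG_DIR" ]; then
    archive_old_logs
fi
```

A script like this would typically run nightly from cron or an Oozie coordinator.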
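Job scheduling through Oozie, as mentioned above, is usually driven by a short submission wrapper. The Oozie server URL and properties path here are hypothetical, and `OOZIE` defaults to `echo oozie` so the sketch prints the command instead of contacting a server:

```shell
#!/usr/bin/env sh
# Hypothetical sketch of submitting an Oozie workflow job.
# OOZIE defaults to `echo oozie` so the command is printed, not executed.
OOZIE="${OOZIE:-echo oozie}"

run_workflow() {
    $OOZIE job \
        -oozie http://oozie-host:11000/oozie \
        -config /home/etl/workflows/job.properties \
        -run
}

run_workflow
```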
Environment: Java, RESTful services, Hadoop, MapReduce, Hive, HBase, Sqoop, JUnit, Oracle, Teradata, Greenplum, TDCH, Ab Initio, Control-M, Oozie, Oracle Hadoop connectors, and Tableau
Hadoop Developer
Confidential
Responsibilities:
- Responsible for architecting Hadoop clusters with CDH3.
- Involved in the installation of CDH3 and the upgrade from CDH3 to CDH4.
- Installed the cluster; worked on commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration.
- Developed automated Unix shell scripts for running the balancer, file system health checks, schema creation in Hive, and user/group creation on HDFS.
- Developed applications and provided solutions to business requirements.
- Added and decommissioned Hadoop cluster nodes, including balancing HDFS block data.
- Set up quotas on HDFS and implemented rack topology scripts.
- Managed and reviewed Hadoop log files; performed file system management and monitoring and Hadoop cluster capacity planning.
- Managed log files: Hadoop logs older than 7 days were removed from the log folder, loaded into HDFS, and retained for 2 years for audit purposes.
- Created various MapReduce jobs to perform ETL transformations on transactional and application-specific data sources.
- Configured Flume to ingest trade data into HBase from various JMS sources (MQ).
- Designed and managed Sqoop jobs that uploaded data from Oracle to HDFS and Hive, and vice versa.
- Performed joins, group-by, and other operations in MapReduce using Java and Pig.
- Processed and formatted the output from Pig and Hive before writing it to the Hadoop output file.
- Reviewed HDFS usage and system design for future scalability and fault tolerance.
- Set up and benchmarked Hadoop/HBase clusters for internal use.
- Wrote and executed Pig scripts using the Grunt shell.
- Installed and configured Hadoop, MapReduce, and HDFS.
- Used HiveQL to analyze the data and identify correlations.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Installed and configured Pig and wrote Pig Latin scripts.
- Developed UDFs in Java as needed for use in Pig and Hive queries.
- Used Flume to collect log data with error messages across the cluster.
- Designed and maintained Oozie workflows to manage the flow of jobs in the cluster.
- Applied a good understanding of partitioning and bucketing concepts in Hive to optimize performance.
- Developed and scheduled Autosys jobs for the EOD process.
- Managed Hadoop clusters, including adding and removing cluster nodes for maintenance and capacity needs.
- Monitored and managed the Hadoop cluster using Cloudera Manager.
- Actively updated upper management with daily progress reports, including the classification levels achieved on the data.
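The Hive partitioning and bucketing used above for performance could be declared roughly as follows. The table and column names are hypothetical, and `HIVE_CMD` defaults to `echo` so the DDL is printed rather than executed; set `HIVE_CMD='hive -e'` on a real installation:

```shell
#!/usr/bin/env sh
# Hypothetical sketch of a partitioned, bucketed Hive table declaration.
HIVE_CMD="${HIVE_CMD:-echo}"

create_trades_table() {
    $HIVE_CMD "
    CREATE TABLE IF NOT EXISTS trades (
        trade_id BIGINT,
        symbol   STRING,
        price    DOUBLE
    )
    PARTITIONED BY (trade_date STRING)    -- prunes scans to the queried days
    CLUSTERED BY (symbol) INTO 32 BUCKETS -- helps sampling and bucketed joins
    STORED AS ORC;"
}

create_trades_table
```

Partitioning by date keeps daily EOD queries from scanning the whole table, and bucketing by a join key supports efficient sampled and bucketed joins.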
Environment: Hadoop, MapReduce, Java, Flume, Sqoop, HBase, Hive, Pig, Autosys Scheduler, Oracle, shell scripting, NoSQL, XML, Cloudera Manager
