Hadoop Developer Resume

Hoffman Estates, IL

SUMMARY

  • Around 7 years of professional experience in the field of Information Technology, including analysis, design, development, and testing of complex applications.
  • Working knowledge of all phases of the Software Development Life Cycle (SDLC). Ability to track projects from inception to deployment.
  • Excellent understanding of Hadoop architecture and various components such as HDFS, Job Tracker, Task Tracker, Name Node, DataNode and MapReduce programming paradigm.
  • Experience in Hadoop cluster performance tuning by gathering and analyzing the existing infrastructure.
  • Experience in designing and developing applications in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle (a minimal sketch appears at the end of this summary).
  • Expert in Hadoop distributions such as Cloudera and Hortonworks.
  • Expert in migrating data between HDFS and relational database systems using Sqoop, and vice versa.
  • Experience in importing and exporting data using stream-processing platforms such as Flume and Kafka.
  • Experience in working with Amazon Web Services (AWS) using EC2 for computing and S3 as storage mechanism.
  • Experience with the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control flows.
  • Worked on Extending Hive and Pig core functionality by writing custom UDFs.
  • Experience working with technologies such as Teradata, Apache Tomcat, Apache Solr, and Elasticsearch.
  • Experience in writing MapReduce Programs and using Apache Hadoop API for analyzing the logs.
  • Experience in designing both time driven and data driven automated workflows using Oozie.
  • Expertise in creating Hadoop clusters using AWS services such as Amazon EMR, Amazon EC2, and Amazon S3.
  • Experience with Hadoop shell commands, writing MapReduce programs, and verifying, managing, and reviewing Hadoop log files.
  • Excellent understanding of Amazon Web Services (AWS) offerings such as EC2, S3, EBS, RDS, and VPC.
  • Experience in working with Flume to load the log data from multiple sources directly into HDFS.
  • Experience in writing SQL and PL/SQL scripts & stored procedures for databases like Oracle 9i.
  • Ability to quickly ramp up and start producing results with any given tool or technology.
  • An individual with excellent communication skills and strong business acumen.
  • Team player with creative problem solving skills, technical competency and leadership skills.
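
As a rough illustration of the Spark-versus-Hive comparison work noted above, the minimal Scala sketch below times a single aggregation issued through Spark SQL against a Hive table. It is a sketch only, not code from an actual engagement: the table, columns, and output path (sales_staging, store_id, sale_amount, /tmp/benchmark/...) are assumed placeholders, and it presumes a Spark build with Hive support. The equivalent HiveQL query can be timed from the Hive CLI or Beeline for comparison.

import org.apache.spark.sql.SparkSession

object SparkVsHiveBenchmark {
  def main(args: Array[String]): Unit = {
    // SparkSession with Hive support so the table queried from Hive is visible to Spark SQL
    val spark = SparkSession.builder()
      .appName("spark-vs-hive-benchmark")
      .enableHiveSupport()
      .getOrCreate()

    val start = System.nanoTime()

    // The same aggregation that would otherwise run as a HiveQL or Oracle SQL query
    val totals = spark.sql(
      """SELECT store_id, SUM(sale_amount) AS total_sales
        |FROM sales_staging
        |GROUP BY store_id""".stripMargin)

    // Force execution and persist the result so the timing covers real work
    totals.write.mode("overwrite").parquet("/tmp/benchmark/spark_totals")

    val elapsedSec = (System.nanoTime() - start) / 1e9
    println(f"Spark SQL aggregation completed in $elapsedSec%.1f s")

    spark.stop()
  }
}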

TECHNICAL SKILLS

Hadoop Ecosystem: MapReduce, HDFS, Hive, Pig, Sqoop, ZooKeeper, Oozie, Flume, HBase, Spark, Hue, Kafka

Languages: Python, R, Scala, SQL, PL/SQL

Framework: Spring, Hibernate, Struts, MVC

Methodologies: Agile, Scrum, Waterfall

Databases: Oracle 9i, MS SQL Server, MySQL, HBase

Application/Web servers: Apache Tomcat, Apache Solr, Elasticsearch, AWS

ETL Tool: Pentaho

Version Control: SVN, CVS, Visual SourceSafe (VSS)

Operating System: Windows 98/NT/2000/2003/XP/7, Linux

PROFESSIONAL EXPERIENCE

Confidential, Hoffman Estates, IL

Hadoop Developer

Responsibilities:

  • Involved in the full development cycle: planning, analysis, design, development, testing, and implementation
  • Launched and set up a Hadoop/HBase cluster, including configuring the different components of the Hadoop and HBase cluster on Linux
  • Loaded data from the UNIX local file system into HDFS
  • Developed MapReduce programs in Java for parsing the raw data and populating staging tables
  • Created Hive queries to compare the raw data with EDW reference tables and perform aggregations
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data
  • Developed a data pipeline on Amazon AWS to extract data from weblogs and store it in HDFS
  • Installed and configured Flume, Hive, Pig, Sqoop and Oozie on the Hadoop cluster
  • Worked on job management using the Fair Scheduler and developed job-processing scripts using Oozie workflows
  • Performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files
  • Consumed data from Kafka using Apache Spark (see the sketch after this section)
  • Developed Scala scripts and UDFs using both DataFrames/SQL/Datasets and RDDs in Spark for data aggregation, queries, and exporting data through Sqoop
  • Installed Hive, created Hive tables, and performed data manipulations using HiveQL
  • Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive
  • Installed and configured ZooKeeper to coordinate and monitor cluster resources
  • Worked on POCs with Apache Spark using Scala to introduce Spark into the project
  • Collected log data from web servers and integrated it into HDFS using Flume

Environment: Core Java, Hadoop, Linux, Unix, Hive, HBase, HDFS, Flume, Sqoop, MapReduce programming, Oozie, Spark, Zookeeper, Kafka
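
As a companion to the Kafka ingestion bullet above, here is a minimal Scala sketch of consuming weblog events from Kafka with Spark Structured Streaming and landing per-minute counts in HDFS. It is illustrative only: the broker address, topic name, and paths are assumed placeholders, it presumes the spark-sql-kafka connector is on the classpath, and the original work may equally have used the RDD/DStream API.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object KafkaLogConsumer {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-log-consumer")
      .getOrCreate()
    import spark.implicits._

    // Read raw log events from Kafka (broker list and topic are placeholders)
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "weblogs")
      .option("startingOffsets", "latest")
      .load()

    // Kafka delivers key/value as binary; cast the value to a string log line
    val lines = raw.selectExpr("CAST(value AS STRING) AS line", "timestamp")

    // Simple aggregation: count events per one-minute window before landing them in HDFS
    val counts = lines
      .withWatermark("timestamp", "5 minutes")
      .groupBy(window($"timestamp", "1 minute"))
      .count()

    val query = counts.writeStream
      .outputMode("append")
      .format("parquet")
      .option("path", "/data/weblogs/minute_counts")        // HDFS output path (placeholder)
      .option("checkpointLocation", "/checkpoints/weblogs") // required for fault tolerance
      .start()

    query.awaitTermination()
  }
}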

Confidential - Norwalk, CT

Hadoop/AWS Developer

Responsibilities:

  • Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs
  • Designed, developed, and maintained data integration programs in a Hadoop and RDBMS environment, with both RDBMS and NoSQL data stores, for data access and analysis
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing
  • Designed and developed Pentaho jobs and transformations to load data into dimensions and facts
  • Created Hive tables, loaded the data, and performed data manipulations using Hive queries in MapReduce execution mode
  • Executed Hive queries on Parquet tables stored in Hive to perform data analysis to meet the business requirements
  • Implemented partitioning, dynamic partitions, and bucketing in Hive (see the sketch after this section)
  • Consumed data from Kafka using Apache Spark
  • Configured, deployed, and maintained multi-node Dev and Test Kafka clusters
  • Handled data manipulation using Python scripts
  • Created Pig Latin scripts to sort, group, join, and filter enterprise-wide data
  • Extended Hive and Pig core functionality with custom User Defined Functions (UDFs), User Defined Table-Generating Functions (UDTFs), and User Defined Aggregating Functions (UDAFs) written in Python
  • Developed a data pipeline on Amazon AWS to extract data from weblogs and store it in HDFS
  • Worked with Apache Solr and Elasticsearch
  • Implemented test scripts to support test driven development and continuous integration
  • Used reporting tools such as Tableau to connect to Hive and generate daily data reports

Environment: Hadoop, Spark, Kafka, Apache Solr, Elasticsearch, AWS, Python, MapReduce, HDFS, Hive, Java (JDK 1.6), Oozie, Pentaho.
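
To make the Hive partitioning and dynamic-partition work above concrete, the sketch below creates a partitioned Parquet table and loads it with a dynamic-partition insert, issued through Spark SQL with Hive support (bucketing is omitted here, since bucketed Hive-format tables are awkward to create from Spark). This is a minimal sketch: the table and column names (sales_by_day, sales_staging, order_date, and so on) are assumed placeholders, not taken from the original project.

import org.apache.spark.sql.SparkSession

object HiveDynamicPartitionLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-dynamic-partition-load")
      .enableHiveSupport()
      .getOrCreate()

    // Allow the partition value to come from the data itself (dynamic partitioning)
    spark.sql("SET hive.exec.dynamic.partition = true")
    spark.sql("SET hive.exec.dynamic.partition.mode = nonstrict")

    // Partitioned Parquet table; one partition per order_date value
    spark.sql(
      """CREATE TABLE IF NOT EXISTS sales_by_day (
        |  order_id    BIGINT,
        |  customer_id BIGINT,
        |  amount      DOUBLE)
        |PARTITIONED BY (order_date STRING)
        |STORED AS PARQUET""".stripMargin)

    // Dynamic-partition insert: the partition column must be last in the SELECT list
    spark.sql(
      """INSERT OVERWRITE TABLE sales_by_day PARTITION (order_date)
        |SELECT order_id, customer_id, amount, order_date
        |FROM sales_staging""".stripMargin)

    spark.stop()
  }
}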

Confidential

SQL Developer

Responsibilities:

  • Involved in the Software Development Life Cycle (SDLC) and created UML diagrams such as use case, class, and sequence diagrams to capture detailed design during the design phase
  • Created new tables, views, indexes, and user-defined functions.
  • Performed daily database backups and restorations and monitored the performance of the database server.
  • Designed database changes to speed up certain daily jobs and stored procedures.
  • Optimized query performance by creating indexes.
  • Developed stored procedures and views to supply data for all reports; complex formulas were used to show derived fields and to format data based on specific conditions.
  • Involved in SQL Server administration by creating users and login IDs with appropriate roles and granting privileges to users and roles. Worked on authentication modules to provide controlled access for users across various modules
  • Created joins and sub-queries for complex queries involving multiple tables.
  • Developed stored procedures and triggers using PL/SQL in order to calculate and update tables to implement business logic.
  • Responsible for report generation using SQL Server Reporting Services (SSRS) and Crystal Reports based on business requirements.
  • Developed complex SQL queries to perform efficient data retrieval operations including stored procedures, triggers etc.
  • Designed and Implemented tables and indexes using SQL Server.

Environment: Eclipse, Java/J2EE, Oracle, HTML, PL/SQL, XML, SQL

Confidential

Programmer Analyst

Responsibilities:

  • Developed SQL scripts to perform different joins, subqueries, nested queries, and insert, update, and delete operations on MS SQL database tables.
  • Applied data modeling principles, database design, and programming, creating E-R diagrams and data relationships to design the database.
  • Wrote PL/SQL and developed and implemented stored procedures, packages, and triggers.
  • Responsible for designing advanced SQL queries, cursors, and triggers.
  • Built data connections to the database using MS SQL Server.
  • Worked on a project to extract data from XML files into SQL tables and generate file-based reports using SQL Server 2008.
  • Utilized the Tomcat web server for development purposes.
  • Involved in creation of test cases and performing unit testing.

Environment: PL/SQL, MySQL, SQL Server 2008 (SSRS & SSIS), Visual Studio 2000/2005, MS Excel

Confidential

Java/J2EE Developer

Responsibilities:

  • Involved in the requirements gathering, analysis, design, and testing phases.
  • As part of the Design phase, designed state, class, and sequence diagrams using Astah Professional.
  • Worked with the Scrum methodology.
  • Coded Struts Action classes and Model classes.
  • Developed DAO classes using JDBC API and wrote SQL queries to interact with Oracle Database.
  • Handled all bug fixes and enhancements.
  • Hands-on experience with the JUnit framework and EasyMock.
  • Utilized Log4j for logging and the PuTTY tool to check server logs.
  • Used the SoapUI tool to invoke web services.
  • Used Apache Ant as the build tool, developed the build file, and used CVS as the source repository.

Environment: Java 1.5, J2EE, Struts 1.2, JavaScript, JDBC, Log4j, SOAP, JUnit, WebSphere.
