Hadoop Developer Resume
Hoffman Estates, IL
SUMMARY
- Around 7 years of professional experience in the field of Information Technology, including analysis, design, development, and testing of complex applications.
- Working knowledge of all phases of the Software Development Life Cycle (SDLC). Ability to track projects from inception to deployment.
- Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Experience in Hadoop cluster performance tuning based on gathering and analyzing metrics from the existing infrastructure.
- Experience in designing and developing applications in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle (see the sketch following this summary).
- Expert in Hadoop distributions such as Cloudera and Hortonworks.
- Expert in migrating data between HDFS and relational database systems using Sqoop.
- Experience in importing and exporting data using stream-processing platforms such as Flume and Kafka.
- Experience in working with Amazon Web Services (AWS), using EC2 for compute and S3 for storage.
- Experience with the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control flows.
- Worked on extending Hive and Pig core functionality by writing custom UDFs.
- Experience working with technologies such as Teradata, Apache Tomcat, Apache Solr, and Elasticsearch.
- Experience in writing MapReduce programs and using the Apache Hadoop API to analyze logs.
- Experience in designing both time-driven and data-driven automated workflows using Oozie.
- Expertise in creating Hadoop clusters on AWS using Amazon EMR, Amazon EC2, and Amazon S3 buckets.
- Experience with Hadoop shell commands, writing MapReduce programs, and verifying, managing, and reviewing Hadoop log files.
- Excellent understanding of Amazon Web Services (AWS) offerings such as EC2, S3, EBS, RDS, and VPC.
- Experience in working with Flume to load the log data from multiple sources directly into HDFS.
- Experience in writing SQL and PL/SQL scripts & stored procedures for databases like Oracle 9i.
- Ability to quickly ramp up and start producing results with any given tool or technology.
- An individual with excellent communication skills and strong business acumen.
- Team player with creative problem-solving skills, technical competency, and leadership skills.
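For illustration, a minimal Spark-with-Scala sketch of the Spark-versus-Hive comparison claimed above: the same aggregation is run through Spark SQL and through the DataFrame API against a Hive-managed table. The table and column names (sales, category, amount) are hypothetical, and the timing is a rough wall-clock measure, not a benchmark.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.sum

    object SparkVsHiveComparison {
      def main(args: Array[String]): Unit = {
        // Hive support so Spark reads the same metastore tables Hive queries.
        val spark = SparkSession.builder()
          .appName("spark-vs-hive-comparison")
          .enableHiveSupport()
          .getOrCreate()

        // The same aggregation expressed two ways against a Hive-managed table.
        val viaSql = spark.sql(
          "SELECT category, SUM(amount) AS total FROM sales GROUP BY category")
        val viaDataFrame = spark.table("sales")
          .groupBy("category")
          .agg(sum("amount").alias("total"))

        // Force execution of each plan so the wall-clock times are comparable.
        val t0 = System.nanoTime(); viaSql.count()
        val t1 = System.nanoTime(); viaDataFrame.count()
        val t2 = System.nanoTime()

        println(f"SQL path:       ${(t1 - t0) / 1e9}%.2f s")
        println(f"DataFrame path: ${(t2 - t1) / 1e9}%.2f s")
        spark.stop()
      }
    }
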
TECHNICAL SKILLS
Hadoop Ecosystem: MapReduce, HDFS, Hive, Pig, Sqoop, ZooKeeper, Oozie, Flume, HBase, Spark, Hue, Kafka
Language: Python, R, Scala, SQL, PL/SQL
Framework: Spring, Hibernate, Struts, MVC
Methodologies: Agile, Scrum, Waterfall
Databases: Oracle 9i, MS SQL Server, MySQL, HBase
Application/Web servers: Apache Tomcat, Apache Solr, Elasticsearch, AWS
ETL Tool: Pentaho
Version Controls: SVN, CVS, Visual SourceSafe (VSS)
Operating System: Windows 98/NT/2000/2003/XP/7, Linux
PROFESSIONAL EXPERIENCE
Confidential, Hoffman Estates, IL
Hadoop Developer
Responsibilities:
- Involved in the full development cycle of planning, analysis, design, development, testing, and implementation
- Launched and set up a Hadoop/HBase cluster, which included configuring the different Hadoop and HBase components on Linux
- Loaded data from the UNIX local file system into HDFS
- Developed MapReduce programs in Java to parse raw data and populate staging tables
- Created Hive queries to compare raw data with EDW reference tables and perform aggregations
- Loaded and transformed large sets of structured, semi-structured, and unstructured data
- Responsible for developing a data pipeline on Amazon AWS to extract data from weblogs and store it in HDFS
- Installed and configured Flume, Hive, Pig, Sqoop and Oozie on the Hadoop cluster
- Worked on job management using the Fair Scheduler and developed job-processing scripts using Oozie workflows
- Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files
- Consumed data from Kafka using Apache Spark (see the sketch following this section)
- Developed Scala scripts and UDFs using both DataFrames/SQL/Datasets and RDD/MapReduce in Spark for data aggregation, queries, and writing data out through Sqoop
- Installed Hive, created Hive tables, and performed data manipulation using HiveQL
- Used the Spark API on Cloudera Hadoop YARN to perform analytics on data in Hive
- Installed and configured ZooKeeper to coordinate and monitor cluster resources
- Worked on POCs with Apache Spark using Scala to introduce Spark into the project
- Collected log data from web servers and integrated it into HDFS using Flume
Environment: Core Java, Hadoop, Linux, Unix, Hive, HBase, HDFS, Flume, Sqoop, MapReduce, Oozie, Spark, ZooKeeper, Kafka
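A minimal sketch of the Kafka consumption with Spark noted above. The resume does not state the Spark version, so this assumes Spark Structured Streaming with the spark-sql-kafka connector available; the broker address, topic name, and HDFS paths are hypothetical.

    import org.apache.spark.sql.SparkSession

    object KafkaToHdfs {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("kafka-to-hdfs").getOrCreate()

        // Subscribe to the (hypothetical) weblogs topic; Kafka records arrive
        // as binary key/value pairs, so cast them to strings.
        val events = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "weblogs")
          .load()
          .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

        // Land each micro-batch in HDFS as Parquet; the checkpoint directory
        // tracks the consumed Kafka offsets so the job can resume after failure.
        val query = events.writeStream
          .format("parquet")
          .option("path", "hdfs:///data/weblogs")
          .option("checkpointLocation", "hdfs:///checkpoints/weblogs")
          .start()

        query.awaitTermination()
      }
    }
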
Confidential - Norwalk, CT
Hadoop/AWS Developer
Responsibilities:
- Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs
- Designed, developed, and maintained data integration programs in a Hadoop and RDBMS environment, using both RDBMS and NoSQL data stores for data access and analysis
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing
- Designed and developed Pentaho jobs and transformations to load data into dimension and fact tables
- Created Hive tables, loaded the data, and performed data manipulation using Hive queries in MapReduce execution mode
- Executed Hive queries on Parquet tables to perform data analysis and meet business requirements
- Implemented partitioning, dynamic partitions, and bucketing in Hive (see the sketch following this section)
- Consumed data from Kafka using Apache Spark
- Configured, deployed, and maintained multi-node Dev and Test Kafka clusters
- Handled data manipulation using Python scripts
- Created Pig Latin scripts to sort, group, join, and filter enterprise-wide data
- Extended Hive and Pig core functionality with custom User-Defined Functions (UDFs), User-Defined Table-Generating Functions (UDTFs), and User-Defined Aggregate Functions (UDAFs) written in Python
- Responsible for developing a data pipeline on Amazon AWS to extract data from weblogs and store it in HDFS
- Worked with Apache Solr and Elasticsearch
- Implemented test scripts to support test driven development and continuous integration
- Used reporting tools such as Tableau, connected to Hive, to generate daily data reports
Environment: Hadoop, Spark, Kafka, Apache Solr, Elasticsearch, AWS, Python, MapReduce, HDFS, Hive, Java (JDK 1.6), Oozie, Pentaho.
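A minimal sketch of the Hive partitioning and dynamic-partition work noted above, issued here through Spark SQL with Hive support; table and column names are hypothetical. The bucketed (CLUSTERED BY ... INTO n BUCKETS) variant is omitted because Spark does not write Hive-compatible buckets, so bucketed tables would typically be loaded from Hive itself.

    import org.apache.spark.sql.SparkSession

    object HiveDynamicPartitionLoad {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("hive-dynamic-partition-load")
          .enableHiveSupport()
          .getOrCreate()

        // Partition by load date so queries can prune to only the days they need.
        spark.sql("""
          CREATE TABLE IF NOT EXISTS orders_part (
            order_id BIGINT,
            customer_id BIGINT,
            amount DOUBLE)
          PARTITIONED BY (load_date STRING)
          STORED AS PARQUET""")

        // Dynamic partitioning: Hive derives load_date from the last SELECT column.
        spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
        spark.sql("""
          INSERT OVERWRITE TABLE orders_part PARTITION (load_date)
          SELECT order_id, customer_id, amount, load_date
          FROM orders_staging""")

        spark.stop()
      }
    }
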
Confidential
SQL Developer
Responsibilities:
- Involved in the Software Development Life Cycle (SDLC) and created UML diagrams such as Use Case, Class, and Sequence diagrams to represent detail in the design phase.
- Created new tables, views, indexes and user defined functions.
- Performed daily database backups and restorations and monitored database server performance.
- Actively designed the database to speed up certain daily jobs and stored procedures.
- Optimized query performance by creating indexes.
- Developed stored procedures and views to supply data for all reports; used complex formulas to show derived fields and to format data based on specific conditions.
- Involved in SQL Server administration: created users and login IDs with appropriate roles and granted privileges to users and roles. Worked on authentication modules to provide controlled access to users across various modules.
- Created joins and sub-queries for complex queries involving multiple tables.
- Developed stored procedures and triggers using PL/SQL to calculate values and update tables implementing business logic.
- Responsible for report generation using SQL Server Reporting Services (SSRS) and Crystal Reports based on business requirements.
- Developed complex SQL queries, including stored procedures and triggers, to perform efficient data retrieval operations.
- Designed and Implemented tables and indexes using SQL Server.
Environment: Eclipse, Java/J2EE, Oracle, HTML, PL/SQL, XML, SQL
Confidential
Programmer Analyst
Responsibilities:
- Developed SQL scripts to perform various joins, sub-queries, and nested queries, and to insert, update, and delete data in MS SQL database tables.
- Applied modeling principles, database design, and programming, creating E-R diagrams and data relationships to design the database.
- Wrote PL/SQL and developed and implemented stored procedures, packages, and triggers.
- Responsible for designing advanced SQL queries, cursors, and triggers.
- Built data connections to the database using MS SQL Server.
- Worked on a project to extract data from XML files into SQL tables and generate file-based reports using SQL Server 2008.
- Utilized the Tomcat web server for development purposes.
- Involved in creation of test cases and performing unit testing.
Environment: PL/SQL, MySQL, SQL Server 2008 (SSRS & SSIS), Visual Studio 2000/2005, MS Excel
Confidential
Java/J2EE Developer
Responsibilities:
- Involved in the requirements gathering, analysis, design, and testing phases.
- As part of the Design phase, designed state, class, and sequence diagrams using Astah Professional.
- Worked with Scrum methodologies.
- Coded Struts Action classes and Model classes.
- Developed DAO classes using JDBC API and wrote SQL queries to interact with Oracle Database.
- Handled all bug fixes and enhancements.
- Hands-on experience with the JUnit framework and EasyMock.
- Utilized Log4j for logging and PuTTY to check server logs.
- Used SoapUI tool to invoke the Web services.
- Used Apache Ant as the build tool, developed the build file, and used CVS as the repository.
Environment: Java 1.5, J2EE, Struts 1.2, JavaScript, JDBC, Log4j, SOAP, JUnit, WebSphere.