Hadoop Developer Resume Austin, TX - Hire IT People

SUMMARY

7+ years of IT experience in systems development, databases & analytics, with 2+ years of strong experience as a Hadoop Developer.
Expertise in Big Data, Hadoop, NoSQL and various components such as HDFS, MR2, YARN, Spark, Pig, Hive, Sqoop, HBase, Cloudera Manager, ZooKeeper, Oozie, Kafka, Hue, CDH5, & HDP 2.x. Expertise in writing Hadoop jobs for analyzing data using MapReduce, Hive, & Pig.
Working experience with the Cloudera and Hortonworks Hadoop distributions.
Extensive experience in developing Pig Latin scripts and using Hive Query Language for data analytics.
Experience in writing UDFs in Hive and Pig.
Experience in Hive Partitioning and Bucketing.
Experience in using Sqoop to import data from relational database systems into HDFS and to export it back.
Experience in designing both time driven and data driven automated workflows using Oozie.
Good exposure to NoSQL databases such as HBase.
Experience in AWS - S3, EC2, Redshift.
Experience in different Hadoop distributions such as Cloudera, Hortonworks Data Platform (HDP) and Amazon Elastic MapReduce (EMR).
Used HiveQL to analyze data and identify correlations.
Strong experience in J2EE, JSP, Servlets, Struts, Spring, Hibernate, JDBC, SOAP, WSDL, JSON, jQuery, JavaScript, CSS and HTML, Java multithreading and exception handling.
Developed applications using the Spring Framework and implemented Spring modules such as the core container module, application context module, aspect-oriented programming (AOP) module, JDBC module, ORM module and web module.
Experience in developing custom UDFs for Pig and Hive to incorporate Python methods and functionality into Pig Latin and HiveQL.
Developed MapReduce programs in Python with the Hadoop streaming API.
Good Knowledge of Data Profiling using Informatica Data Explorer.
Extensive experience in ETL Design and Development.
Good knowledge of project management knowledge areas and process groups.
Experience working in an iterative, agile software lifecycle (SDLC) with strong ability to estimate/scope the development of projects.
Well versed in OLTP Data Modeling and Strong knowledge of Entity-Relationship concepts.
Experience in data cleaning and data preprocessing using Python scripting.
Strong in RDBMS Databases, PL/SQL programming.
Strong experience in Oracle SQL queries, PL/SQL Stored Procedures, Functions, Packages, Triggers and Cursors with Query optimizations as part of ETL Development process.
Knowledge of running Hive queries through Spark SQL, which integrates Hive with the Spark environment.
Hands-on experience with UNIX and shell scripting to automate tasks.
Good experience with FileZilla and WinSCP for transferring files to UNIX environments.
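
For illustration only, a MapReduce program written in Python for the Hadoop streaming API, as mentioned in the summary above, might look like the minimal sketch below. The word-count task, the function names and the "map"/"reduce" command-line argument are assumptions made for this example and are not taken from any project listed here.

```python
import sys
from itertools import groupby


def mapper(lines):
    """Emit one tab-separated (word, 1) pair per whitespace-delimited token."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_pairs):
    """Sum the counts for each key.

    Input must already be sorted by key, which is how Hadoop streaming
    delivers mapper output to a reducer.
    """
    parsed = (pair.rstrip("\n").split("\t", 1) for pair in sorted_pairs)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Hypothetical convention: run the script with "map" or "reduce" as its argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    stage = mapper if sys.argv[1] == "map" else reducer
    for out in stage(sys.stdin):
        print(out)
```

With Hadoop streaming, a script like this would typically be supplied through the streaming jar's -mapper and -reducer options; both stages read records from standard input and write tab-separated key/value pairs to standard output.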

TECHNICAL SKILLS

Big Data Technologies: Hadoop 2.7.x/2.5.x/2.4.x/1.x.y, HDFS, MapReduce, Sqoop 1.4.x, Oozie, Pig 0.15/0.14/0.11, Hive 1.2.1/0.14/0.13/0.10, ZooKeeper, Impala, Hue, Flume, HBase, Spark.

Programming: Python 3.x/2.x, Java 1.7/1.6, C and PL/SQL

ETL/BI Tools: Informatica PowerCenter 9.x/8.6, OBIEE

Script/Markup: JavaScript, XML, HTML, JSON and Unix Scripting

IDE: Eclipse, Rational Web Application Developer, NetBeans

App/Web Servers: Apache Tomcat Server, Apache / IBM HTTP Server, WebSphere Application Server 6.1/7.0

Messaging & Web Services: SOAP, REST, WSDL, UDDI, JMS and XML

Databases: Teradata 15/14/13, Oracle 9i/10g, MySQL 5.0, MS SQL Server

Methodologies: Agile, Waterfall, Spiral model, Full lifecycle SDLC

Operating Systems: Windows, Linux and Unix

PROFESSIONAL EXPERIENCE

Confidential, Austin, TX

Hadoop Developer

Responsibilities:

Configured Apache Hadoop clusters for application development, along with the Hadoop tools Hive, Pig, HBase, ZooKeeper and Sqoop.
Developed shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
Collected and aggregated large amounts of log data using Apache Flume and staged the data in HBase/HDFS for further analysis.
Collected log data from web servers and loaded it into HBase using Flume.
Installed Oozie workflow engine to run multiple Hive and Pig Jobs.
Used Sqoop to transfer data between HDFS and RDBMS in both directions.
Created Hive tables, loaded data and developed Hive UDFs.
Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
Gained knowledge of building Apache Spark applications using Scala.
Worked on importing and exporting data between Oracle/DB2 and HDFS/Hive using Sqoop.
Created Hive external tables on the existing HDFS file systems.
Developed shell scripts to automate rolling day-to-day processes.
Developed a proof of concept (POC) for Apache Kafka.
Automated workflows using shell scripts that pull data from various databases into Hadoop.
Developed scalable, Hadoop-based data processing algorithms using MapReduce, Pig, Hive, HBase and the Hadoop ecosystem.
Deployed Hadoop clusters in fully distributed and pseudo-distributed modes.
Set up the QA environment and updated configurations for implementing scripts with Pig, Hive and Sqoop.
Transformed massive amounts of raw data into actionable analytics.
Developed scripts to automate the process and generate reports.
Installed, optimized and configured new servers and application upgrades in the existing network environment to meet the requirements.
Provided user training and support.

Environment: Hadoop, MapReduce, Spark, Java, Hive, HDFS, Pig, Sqoop, Kafka, Oozie, Flume, HBase, ZooKeeper, CDH4 & CDH5, Oracle, Perl, PL/SQL, Python, Linux.

Confidential, Dayton, OH

Hadoop Developer

Responsibilities:

Developed POCs for Hadoop implementation.
Configured Hadoop clusters on AWS.
Developed shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
Used Sqoop to import data from Teradata to HDFS.
Used HDFS commands to move data from local system to HDFS.
Developed MapReduce Programs for parsing the raw data and populating staging tables using Java.
Used Pig & Python scripting for preprocessing the data.
Created staging tables for data transfer from Hive.
Developed and executed Hive queries for denormalizing the data.
Used Spark API over Hadoop YARN to perform analytics on data in Hive.
Created Hive External tables on the existing HDFS file systems.
Installed and configured Hive, and wrote Hive UDFs and queries.
Created Hive queries to compare the raw data with EDW reference tables and perform aggregations.
Created Partitions and Buckets on Hive tables.
Used Python for pattern matching in build logs to format errors and warnings.
Managed and reviewed Hadoop log files.
Developed a workflow in Oozie to automate loading data into HDFS and pre-processing it with Pig Latin scripts.
Performed joins, group by and other operations in MapReduce.
Developed shell scripts to automate rolling day-to-day processes.
Automated workflows using shell scripts that pull data from various databases into Hadoop.
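
As a hedged illustration of the build-log pattern matching listed above, a minimal Python sketch could look like the following. The "[LEVEL] message" line shape, the pattern and the function name are assumptions for this example; the actual log format used on the project is not stated in this resume.

```python
import re

# Assumed log shape: "[LEVEL] message", e.g. "[ERROR] Compilation failed".
LOG_PATTERN = re.compile(r"^\[(?P<level>ERROR|WARNING)\]\s*(?P<message>.+)$")


def format_issues(log_lines):
    """Return normalized 'LEVEL: message' strings for error and warning lines."""
    issues = []
    for line in log_lines:
        match = LOG_PATTERN.match(line.strip())
        if match:
            issues.append(f"{match['level']}: {match['message']}")
    return issues
```

Lines that do not carry an ERROR or WARNING tag are skipped, so the result contains only the entries a reviewer would need to act on.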

Environment: AWS (EC2 & S3), Redshift, Hortonworks, Teradata, Informatica PowerCenter 9.1, PowerCenter Big Data Edition, Python, HDFS, Hive, Pig, Sqoop, Oozie, Impala, ZooKeeper, Maven, Whirr, XML, Linux.

Confidential

Java Developer

Responsibilities:

Developed the UI and presentation layer using the JSF Framework, HTML5, jQuery, JavaScript, Ext JS and CSS.
Used JDBC to communicate with the Oracle 10g database.
Extensively used Hibernate in developing the data access layer. Developed SQL queries, views and stored procedures using PL/SQL.
Implemented Service Oriented Architecture by developing Java web services using WSDL, UDDI and SOAP.
Performance-tuned the Sybase database; created DB tables, stored procedures and indexes.
Used crontab (scheduler) to run the batch jobs.
Used ClearCase for concurrent development in the team and as the code repository.
Led daily work transfer between onshore and offshore teams.
Worked closely with the demanding client base to ensure that the solutions met the requirements.
Created dynamic HTML pages, used jQuery for client-side validations, and AJAX to create an interactive frontend GUI.
Developed a reporting application using Core Java, JSP, Servlets, Spring Framework, SOAP, XML, JavaScript and Tomcat.
Used Maven as the build tool for managing the project's build, reporting and documentation from a central piece of information.
Used SVN version control to track and maintain the different versions of the project.
Took part in the requirements gathering, analysis, design, development, testing and maintenance phases of the R&D Redesign.

Environment: JSP, Struts, Hibernate, Sybase ASE 12.5, Oracle 9i, Oracle 10g, PL/SQL, crontab, MongoDB, JUnit, ASP, Eclipse, JavaScript, XML, HTML, WSDL, SOAP.

Confidential

Java Developer

Responsibilities:

Performed analysis and design of the application.
Prepared the detailed design document to meet the requirements.
Developed the application using J2EE architecture.
Developed JSP forms.
Designed and developed web pages using HTML and JSP.
Designed and developed Servlets to communicate between presentation and business layer.
Used EJB as a middleware in developing a three-tier distributed application.
Developed session beans and entity beans for business and data processing.
Used JMS in the project for sending and receiving messages on the queue.
Developed the Servlets for processing the data on the server.
Transferred the processed data to the database through entity beans.
Used JDBC for database connectivity with MySQL Server.
Used CVS for version control.
Performed unit testing using JUnit.

Environment: Core Java, J2EE, JSP, Servlets, XML, XSLT, EJB, JDBC, JavaScript, JMS, HTML, CSS, MySQL Server, CVS, Windows 2000
