Sr. Hadoop Developer Resume
Ashburn, VA
SUMMARY
- 8+ years of experience in software product development, including 3+ years developing large-scale applications using Hadoop and Big Data tools.
- Hands-on experience in designing, testing and deploying MapReduce applications in the Hadoop ecosystem.
- Experienced with Hadoop ecosystem components such as Hadoop MapReduce, Cloudera, Hortonworks, HBase, Spark, Oozie, Hive, Sqoop, Pig, Flume, Tez and Cassandra.
- Good programming experience with SQL and PL/SQL database technologies and relational databases including Oracle, Confidential and MS-SQL.
- Experienced with NoSQL databases - HBase, MongoDB and Cassandra.
- Good proficiency in languages commonly used for Big Data work, such as Python and Scala.
- Worked on Spark SQL extensively and used Hive as one of the data sources.
- Good experience with and working knowledge of AWS cloud services, including but not limited to EC2, S3, Redshift, DynamoDB, EMR, IAM, SQS, SES, Lambda, VPC, CloudWatch and CloudFront.
- Experience in using different Hadoop distributions such as Hortonworks and Cloudera, as well as working in the AWS cloud environment.
- Familiar with data visualization using R and SAS, with working knowledge of AWS services.
- Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS. Knowledge of working on Confidential and its Big Data tools.
- Worked with visualization and BI tools such as Tableau and QlikView.
- Experience in importing/exporting terabytes of data using Sqoop from HDFS to RDBMS and vice versa.
- Good knowledge of Hadoop MRv1 and MRv2 (YARN) architecture.
- Experience across the software development life cycle (SDLC), including analysis, design, development, integration and testing, in diverse areas of client-server/enterprise applications using Java and J2EE technologies.
- Worked on Talend for ETL and data profiling tasks.
- Strong experience with ETL tools on Oracle, DB2 and SQL Server databases.
- Hands-on experience working with the Java build tools Apache Maven and Ant.
- Extensive experience in development and production support in Linux environments.
- Good knowledge of integrating various data sources such as RDBMS, spreadsheets, text files, JSON and XML files.
- Well-developed communication skills; able to work well independently and as part of a team, develop effective client relations, and provide superior client service and satisfaction.
TECHNICAL SKILLS
Big Data Ecosystems: Hadoop, HDFS, YARN, MapReduce, Hive, Pig, HBase, ZooKeeper, Sqoop, Oozie, Flume, Parquet, Apache Impala, Spark
Frameworks: JPA, J2EE, JSP, Servlets, Struts, Hibernate, .NET Framework 4.5
Methodology: Agile software development
Languages: Java, HiveQL, Pig Latin, R, Regex, Advanced PL/SQL, SQL, VBA, C++, C, Shell
Scripting Languages: HTML, CSS, JavaScript, DHTML, XML, jQuery
Web Technologies: Java, J2EE, Servlets, JSP, JDBC, XML, AJAX, SOAP, RESTful, AngularJS
Architectures: SOA, Cloud Computing (AWS, EC2)
Application Server: Apache Tomcat, GlassFish 4.0, WebLogic
Database Systems: Netezza, Oracle 11g/10g/9i, DB2, MS-SQL Server, MySQL, MS-Access
Development Tools (IDEs): JIRA, ClearCase, Tableau, Splunk, RStudio, Eclipse/NetBeans, Toad, SQL Developer, AWK
Platforms: Windows 7/8/10, Ubuntu (Linux), RedHat, SUSE, CentOS
PROFESSIONAL EXPERIENCE
Confidential, Ashburn, VA
Sr. Hadoop Developer
Responsibilities:
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive and Sqoop, as well as system-specific jobs.
- Used Sqoop to efficiently transfer data between databases and HDFS and used Flume to stream the log data from servers.
- Used Talend for ETL and data profiling.
- Created Hive tables per requirements, as either internal or external tables, defined with appropriate static and dynamic partitions for efficiency.
- Used Spark SQL for loading and processing data from Hive data sources.
- Imported data from various data sources and performed transformations using Hive and MapReduce.
- Responsible for loading data into HDFS; extracted the processed data from MySQL into HDFS using Sqoop.
- Wrote numerous Python scripts for automation.
- Used the Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
- Imported the data from different sources such as HDFS/HBase into Spark RDDs.
- Loaded the data into Spark RDDs and performed in-memory computation to generate the output response.
- Developed complex MapReduce programs in Java for Data Analysis on different data formats.
- Implemented a log producer in Scala that watches for application logs, transforms incremental logs and sends them to a Kafka- and ZooKeeper-based log collection platform.
- Implemented static and dynamic partitioning in Hive.
- Extensively used Sqoop to import/export data between RDBMS and Hive tables, including incremental imports, and created Sqoop jobs that track the last saved value.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Created Hive queries to compare the raw data with EDW reference tables and perform aggregations.
- Managed and scheduled jobs on a Hadoop cluster.
- Used Pig as an ETL tool for various data joins.
- Developed simple and complex MapReduce jobs using Hive.
- Analyzed the data by running Hive queries and Pig scripts to understand user behavior.
- Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms.
- Implemented Partitioning and bucketing in Hive.
- Managed and reviewed Hadoop log files.
- Expertise in using ORC and Parquet file formats in Hive.
- Created a customized BI tool for the management team that performs query analytics using HiveQL.
- Configured Flume to extract the data from the web server output files and load it into HDFS.
- Developed Pig UDFs to pre-process the data for analysis.
- Developed workflows in Oozie to automate the tasks of loading the data into HDFS and pre-processing it with Pig.
- Involved in story-driven Agile development methodology and actively participated in daily scrum meetings.
Environment: Hadoop, Hive, ZooKeeper, MapReduce, Sqoop, Cloudera, HDFS, Java (JDK 1.7), Pig, Linux, XML, HBase, Amazon Web Services (AWS), Tableau, JIRA, Maven, Eclipse.
Confidential, Parsippany, NJ
Hadoop Developer
Responsibilities:
- Designed and implemented MapReduce jobs for distributed processing of large data sets on a Hadoop cluster, as required by business use cases.
- Used Apache Storm for handling real-time analytics.
- Involved in IRichBolt interface development for the project.
- Handled importing of data from various data sources, performed transformations using Pig and MapReduce, loaded data into HDFS and extracted the data from MySQL into HDFS using Sqoop.
- Experienced in the data analysis, design, development and MRUnit testing of Hadoop cluster structure using Java.
- Managed connectivity and security of nodes on the Hadoop cluster.
- Implemented Hive generic UDFs for business logic.
- Implemented a six-node CDH4 Hadoop cluster on CentOS.
- Used the partitioning pattern in MapReduce to move records into categories.
- Installed and configured Hive and wrote HiveQL scripts.
- Accessed Hive tables from Java applications using JDBC to perform analytics.
- Automated some of the manual test cases using Python and Bash scripts.
- Ran batch processes using Pig scripts and developed Pig UDFs for data manipulation according to business requirements.
- Wrote advanced MapReduce code for joins and grouping.
- Created Hive external tables on the MapReduce output, then applied partitioning and bucketing on top of them.
- Performed unit testing of MapReduce jobs on cluster using MRUnit.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
- Wrote Unix shell scripts to manage Hadoop operations.
- Wrote Puppet code for installation and configuration of Cloudera Hadoop CDH3u1.
- Developed MapReduce jobs using Python and performed unit testing using MRUnit.
- Aggregated data points from nearly 45 external and internal sources in order to view, interrogate and analyze large data sets and determine the best data to use in various analytical solutions.
- Developed data transformations based on the requirements from the source system owners.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted the data from MySQL into HDFS using Sqoop.
- Used pattern matching algorithms to recognize the customer across different sources, built risk profiles for each customer using Hive and stored the results in HBase.
Environment: Hadoop, Hive, ZooKeeper, MapReduce, Sqoop, Pig 0.10 and 0.11, JDK 1.6, HBase, Hue, Talend, Oozie, Spark, Storm, Kafka, Redis, Flume, JUnit and Oracle/Informix, Cassandra, AWS (Amazon Web Services), HDFS, DB2 and Hortonworks, Tableau.
Confidential, Austin, TX
Java/ J2EE Developer
Responsibilities:
- Involved in the design, coding and testing phases of software development.
- As a programmer, involved in designing and implementing the MVC pattern.
- Used XML extensively: process details were stored as XML in the database and retrieved whenever needed.
- Part of the core team that developed the process engine.
- Developed Action classes and validation using the Struts framework.
- Created project-related documentation such as role-based user guides.
- Implemented modules such as Client Management and Vendor Management.
- Attended various client meetings.
- Implemented an access control mechanism to provide various access levels to the user.
- Designed and developed the application using J2EE, JSP, XML, Struts, Hibernate and Spring technologies.
- Coded DAO and Hibernate implementation classes for data access.
- Coded Spring service classes and transfer objects to pass the data between layers.
- Designed the database for Jeevica in MS-SQL Server 2008.
- Implemented web services using Axis.
- Used different features of Struts like MVC, Validation framework and tag library.
- Created detailed design documents, use cases and class diagrams using UML.
- Wrote Ant scripts to build JAR, WAR and EAR files.
- Developed a standalone Java component that interacts with Crystal Reports on Crystal Enterprise Server to view and schedule reports, store data as XML and send data to consumers using SOAP.
- Deployed and tested the application on WebSphere Application Servers.
- Developed JavaScript for client-side validations in JSP.
- Developed JSPs with Struts taglibs for the presentation layer.
- Coordinated with the onsite, offshore and QA teams to facilitate quality delivery from offshore on schedule.
Environment: Java 1.5, Spring, Spring WebService, JSP, JavaScript, Hibernate, SOAP, CSS, Struts, WebSphere, MQ Series, JUnit, Apache, Windows XP and Linux
Confidential
Java Developer
Responsibilities:
- Involved in the design, coding and testing phases of software development.
- Designed use cases, sequence diagrams and class diagrams for business requirements.
- Wrote feature, design and test specifications.
- Performed requirements analysis, performance analysis and problem analysis.
- Gave technical guidance and mentoring to the team and enhanced competitive advantage for the Confidential database.
- Built and deployed Java applications into multiple UNIX-based environments and produced both unit and functional test results along with release notes.
- Worked with the test team to understand outstanding issues and close them in reasonable time.
- Applied OO design patterns to improve existing Java-based tools.
- Learned Confidential tools and improved their APIs so they could interact with each other.
- Wrote SQL queries to fetch data from Confidential and tested them with internal tools.
- Built and maintained automation test scripts that could not be developed by the test team.
Environment: Java EE6, Confidential, J2EE, JSP, JavaScript, Hibernate, Spring, OO design patterns, FastLoad, MultiLoad, HTML5, XML, ClearCase and JSON
Confidential
Java Developer
Responsibilities:
- Involved in the design, coding and testing phases of software development.
- Collaborated with team members in the analysis, design and implementation phases of the software development lifecycle (SDLC) for various software modules of the web application.
- Implemented MVC design pattern using JPA Framework.
- Used JSP for the presentation layer and developed a high-performance object/relational persistence and query service for the entire application using Hibernate.
- Developed the XML Schema and web services for the data maintenance and structures.
- Developed the application using Java Beans, Servlets and EJBs.
- Actively involved in code review and bug fixing to improve performance.
- Created connection through JDBC and used JDBC statements to call stored procedures.
- Used WebSphere Application Server with RAD and Eclipse to develop and deploy the application.
- Designed the database, created tables, and wrote complex SQL queries and stored procedures as per the requirements.
- Wrote JUnit test cases and used Ant to build the application.
- Developed the application using J2EE (JDBC, JSF, EJB, RichFaces, JSTL and XML).
- Worked extensively on Dynamic SQL, Bulk Collections, Materialized Views, Ref Cursor, Query Re-Write, Collections etc.
- Improved efficiency by 30% and reduced the ticket count from 50 to 5.
- Designed use cases, sequence diagrams and class diagrams for business requirements.
- Wrote feature, design and test specifications.
Environment: Java/J2EE, Oracle 11g/10g, SQL, PL/SQL, JSP, JSF, EJB, JPA, Hibernate, WebLogic 8.0, HTML, AJAX, JavaScript, JDBC, XML, JMS, XSLT, UML, JUnit, log4j, MyEclipse 6.0
