Hadoop Developer Resume
Baltimore, MD
PROFESSIONAL SUMMARY
- 8+ years of professional IT experience, including 4+ years in Big Data ecosystem technologies. Expertise in Big Data technologies as a consultant, with proven capability in project-based teamwork as well as individual development, and good communication skills.
- Excellent understanding / knowledge of Hadoop architecture and its various components, such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and the MapReduce programming paradigm.
- Experience working with Hadoop clusters using the AWS EMR, Cloudera, Pivotal, and Hortonworks distributions.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as Hadoop MapReduce (MR), HDFS, HBase, Oozie, Hive, Sqoop, Pig, and Flume.
- Hands-on development and implementation experience on a Big Data Management Platform (BMP) using HDFS, MapReduce, Hive, Pig, Oozie, and other Hadoop-related ecosystem components as data storage and retrieval systems.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experience in managing and reviewing Hadoop log files.
- Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
- Extended Hive and Pig core functionality by writing UDFs.
- Good experience installing, configuring, and testing Hadoop ecosystem components.
- Well versed in the Mapper, Reducer, Combiner, and Partitioner stages and the shuffle-and-sort process, along with custom partitioning for efficient bucketing (a partitioner sketch follows this summary).
- Good experience in writing Pig and Hive UDFs according to requirements.
- Experience in designing both time driven and data driven automated workflows using Oozie.
- Hands on experience in Agile and Scrum methodologies.
- Extensive experience working with customers to gather the information required to analyze and provide data fixes or code fixes for technical problems, and providing technical solution documents for the users.
- Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
- Worked on multiple stages of Software Development Life Cycle including Development, Component Integration, Performance Testing, Deployment and Support Maintenance.
- Quick to adapt to new software applications and products; a self-starter with excellent communication skills and a good understanding of business workflow.
- Expertise in object-oriented analysis and design (OOAD), UML, and the use of various design patterns.
- Working knowledge of SQL, PL/SQL, stored procedures, functions, packages, DB triggers, and indexes.
- Good experience in designing jobs and transformations and loading the data sequentially and in parallel for initial and incremental loads.
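The custom partitioning mentioned above can be illustrated with a minimal sketch against the Hadoop MapReduce API; the class name, key/value types, and keying scheme are illustrative assumptions, not code from any of the projects below.

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Hypothetical custom partitioner: routes records to reducers by key hash
// so related records land in the same output bucket.
public class RegionPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numReduceTasks) {
        if (numReduceTasks == 0) {
            return 0; // nothing to partition for map-only jobs
        }
        return (key.toString().hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
}
```

A partitioner like this would be wired into a job with job.setPartitionerClass(RegionPartitioner.class), so the shuffle-and-sort phase distributes map output to reducers according to the custom scheme.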
TECHNICAL SKILLS
Big Data: Hadoop, MapReduce, HDFS, Hive with Tez, Pig, Sqoop, Oozie, ZooKeeper, Flume, MongoDB, and Cassandra
Languages: C, C++, Java, SQL, PL/SQL, UML, and ABAP
Databases: Oracle 8i/9i/10g/11g, SQL Server 7.0 /2000, DB2, MS Access
Technologies: Java 5, Java 6, AJAX, Log4j, Java Help, Java API, JDBC 2.0, and Java Beans
Methodologies: CMMI, Agile software development, Six Sigma, Quantitative Project Management, UML, Design Patterns
Frameworks: AJAX, Struts 2.0, JUnit, Log4j 1.2, mock objects, Hibernate
Application Servers: Apache Tomcat 5.x/6.0, JBoss 4.0
Tools: HTML, JavaScript, XML
Testing Tools: NetBeans, Eclipse, WSAD, RAD
Operating Systems: UNIX, Mac OS X, Windows Hyper-V
Version Control Tools: CVS, TortoiseSVN
Others: MS Office
PROFESSIONAL EXPERIENCE
Confidential, Baltimore, MD
HADOOP DEVELOPER
Responsibilities:
- Experience with professional software engineering practices and best practices for the full software development life cycle, including coding standards, code reviews, source control management, and build processes.
- Analyzed the available NoSQL databases (mainly Cassandra, MongoDB, and HBase) to determine which one best suited the applications being rewritten.
- Implemented the DAO layer of the rewritten applications with the MongoDB NoSQL database.
- Implemented sharding and replication on multi-node MongoDB database servers.
- Created indexes and aggregations, and performed basic MongoDB monitoring and administration (see the MongoDB sketch following this section).
- Evaluated business requirements and prepared detailed specifications, following project guidelines, required to develop the programs.
- Responsible for building scalable distributed data solutions using Hadoop.
- Worked closely with various levels of individuals to coordinate and prioritize multiple projects; estimated scope and schedule and tracked projects throughout the SDLC.
- Worked on the BI team in the area of Big Data Hadoop cluster implementation and data integration, developing large-scale system software.
- Worked with Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and processing (see the mapper sketch following this section).
- Worked extensively on creating MapReduce jobs to power data for search and aggregation.
- Experienced in managing and reviewing Hadoop log files.
- Experienced in running Hadoop streaming jobs to process terabytes of data.
- Designed a data warehouse using Hive.
- Handled structured, semi-structured, and unstructured data.
- Worked extensively with Sqoop for importing and exporting data from HDFS to relational database systems and vice versa.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Extensively used Pig for data cleansing.
- Created partitioned tables in Hive.
- Managed and reviewed Hadoop log files.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Developed Pig Latin scripts to extract data from the web server output files and load it into HDFS.
- Responsible for managing data coming from different sources.
- Developed Pig UDFs to pre-process the data for analysis.
- Developed Hive queries for the analysts.
- Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig.
- Mentored analysts and the test team in writing Hive queries.
- Involved in database migrations to transfer data from one database to another, and in the complete virtualization of many client applications.
- Supported and assisted QA engineers in understanding, testing, and troubleshooting.
- Wrote build scripts using Ant/Maven and participated in the deployment of one or more production systems.
- Provided production rollout support, including monitoring the solution post go-live and resolving any issues discovered by the client and client services teams.
- Designed and documented operational problems following standards and procedures, using the issue-tracking tool JIRA.
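Two hedged sketches of the kind of work described above. First, the MongoDB indexing and aggregation: a minimal example against the MongoDB Java driver, in which the connection string, database, collection, and field names are illustrative assumptions.

```java
import java.util.Arrays;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Accumulators;
import com.mongodb.client.model.Aggregates;
import com.mongodb.client.model.Indexes;
import org.bson.Document;

public class OrderStats {
    public static void main(String[] args) {
        // Hypothetical connection string, database, and collection names.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> orders =
                    client.getDatabase("appdb").getCollection("orders");

            // Secondary index to support lookups by customer.
            orders.createIndex(Indexes.ascending("customerId"));

            // Aggregation: total order amount per customer.
            for (Document doc : orders.aggregate(Arrays.asList(
                    Aggregates.group("$customerId", Accumulators.sum("total", "$amount"))))) {
                System.out.println(doc.toJson());
            }
        }
    }
}
```

Second, a minimal data-cleaning Mapper of the kind used in the MapReduce jobs above; the pipe-delimited input layout and expected field count are assumptions.

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical cleaning step: drop empty or malformed records, pass the rest through.
public class CleanRecordMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
    private static final int EXPECTED_FIELDS = 5; // assumed record width

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString().trim();
        String[] fields = line.split("\\|", -1);
        if (line.isEmpty() || fields.length != EXPECTED_FIELDS) {
            context.getCounter("cleaning", "dropped").increment(1);
            return;
        }
        context.write(NullWritable.get(), new Text(line));
    }
}
```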
Technologies: Hadoop, MapReduce, HDFS, Hive, HBase, Sqoop, Pig, Flume, Java (JDK 1.6), Oracle 11g/10g, DB2, Teradata, MySQL, Eclipse, PL/SQL, Linux, Shell Scripting, SQL Developer, Toad, PuTTY, XML/HTML, JIRA
Confidential, Houston, TX
HADOOP DEVELOPER
Roles and Responsibilities:
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Designed and developed Oozie workflows for automating jobs.
- Mainly worked on Big Data analytics and the Hadoop/MapReduce infrastructure.
- Gained good experience with NoSQL databases.
- Developed and ran MapReduce programs on the cluster.
- Installed and configured Hive and wrote Hive UDFs (see the UDF sketch following this section).
- Created HBase tables to store data in varying formats coming from different portfolios.
- Implemented best income logic using Pig scripts.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the Business Intelligence (BI) team.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
- Wrote Hadoop MR programs to collect the logs and feed them into Cassandra for analytics purposes.
- Moved data from Oracle to HDFS and vice versa using Sqoop.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
- Worked with different file formats and compression techniques to determine standards.
- Developed Hive queries and UDFs to analyze/transform the data in HDFS.
- Developed Hive scripts for implementing control tables logic in HDFS.
- Designed and Implemented Partitioning (Static, Dynamic), Buckets in HIVE.
- Developed Pig scripts and UDFs as per the business logic.
- Imported log files into HDFS using Flume and loaded them into Hive tables to query the data.
- Developed Pig scripts to convert the data from Avro to text file format.
- Developed Sqoop commands to pull the data from Teradata.
- Analyzing/Transforming data with Hive and Pig.
- Developed Oozie workflows, scheduled through a scheduler on a monthly basis.
- Designed and developed read lock capability in HDFS.
- Involved in End-to-End implementation of ETL logic.
- Coordinated effectively with the offshore team and managed project deliverables on time.
- Worked on QA support activities, test data creation and Unit testing activities.
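A minimal sketch of a Hive UDF of the kind mentioned above, using the classic org.apache.hadoop.hive.ql.exec.UDF API; the function name and normalization logic are illustrative assumptions rather than the actual project UDFs.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF: normalizes free-text columns before analysis.
public final class NormalizeText extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        // Trim, lower-case, and collapse internal whitespace.
        return new Text(input.toString().trim().toLowerCase().replaceAll("\\s+", " "));
    }
}
```

A UDF like this would be packaged in a JAR, added to the Hive session, registered with CREATE TEMPORARY FUNCTION (using the fully qualified class name), and then used like any built-in function in HiveQL.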
Technologies: JDK, Red Hat Linux, HDFS, Mahout, MapReduce, Apache Crunch, Hive, Pig, Sqoop, Flume, ZooKeeper, Oozie, DB2, and HBase.
Confidential
JAVA/J2EE DEVELOPER
Roles and Responsibilities:
- Involved in various phases of the Software Development Life Cycle (SDLC) of the application, including requirement gathering, design, analysis, and code development.
- Prepared Use Cases, sequence diagrams, class diagrams and deployment diagrams based on UML to enforce Rational Unified Process using Rational Rose.
- Extensively worked on the user interface for a few modules using HTML, JSPs, and JavaScript.
- Generated business logic using servlets and session beans and deployed them on the WebLogic server.
- Created complex SQL queries and stored procedures.
- Used the Hibernate ORM framework with the Spring framework for data persistence and transaction management.
- Wrote test cases in JUnit for unit testing of classes.
- Provided technical support for production environments: resolving issues, analyzing defects, and providing and implementing solutions for the defects.
- Built and deployed the Java application into multiple Unix-based environments and produced both unit and functional test results along with release notes.
- Analyzed the banking and existing system requirements and validated them to suit the J2EE architecture.
- Designed the process flow between front-end and server-side components.
- Developed and implemented the MVC architectural pattern using the Struts framework, including JSPs, servlets, EJBs, Form Bean and Action classes.
- Developed the web-based presentation using JSP and AJAX with servlet technologies, implemented using the Struts framework.
- Designed and developed backend Java components residing on different machines to exchange information and data using JMS.
- Involved in creating the Hibernate POJO objects and mapping them using Hibernate annotations (see the POJO sketch following this section).
- Used JavaScript for client-side validation and Struts Validator Framework for form validations.
- Implemented Java/J2EE Design patterns like Business Delegate and Data Transfer Object (DTO), Data Access Object.
- Wrote JUnit test cases for unit testing.
- Integrated Spring DAO for data access using Hibernate, used HQL and SQL for querying databases.
- Worked with QA team for testing and resolving defects.
- Used Ant automated build scripts to compile and package the application.
- Used JIRA for bug tracking and project management.
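A minimal sketch of a Hibernate-annotated POJO of the kind described above; the entity, table, and column names are illustrative assumptions rather than actual project classes.

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Table;

// Hypothetical entity mapped with Hibernate/JPA annotations.
@Entity
@Table(name = "ACCOUNT")
public class Account {

    @Id
    @GeneratedValue
    @Column(name = "ACCOUNT_ID")
    private Long id;

    @Column(name = "ACCOUNT_NAME", nullable = false)
    private String name;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}
```

An entity like this would then be queried through the Spring-managed Hibernate DAOs using HQL, for example "from Account a where a.name = :name".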
Technologies: J2EE, JSP, JDBC, Spring Core, Struts, Hibernate, Design Patterns, XML, WebLogic, Apache Axis, Ant, ClearCase, JUnit, JavaScript, Web Services, SOAP, XSLT, JIRA, Oracle, PL/SQL Developer, and Windows