Hadoop Engineer Resume
OBJECTIVE:
Seeking a position as a Hadoop Engineer in a growing, environmentally conscious company that will utilize my skills and knowledge of Apache Spark, Hadoop MapReduce, the Big Data ecosystem, Data Warehousing, and Business Intelligence, while enhancing my interests and technical skills.
SUMMARY:
- 6 years of software development experience in ETL and technologies such as Java, SQL, and Unix, and 2 years of experience with Big Data technologies such as Spark, Hadoop, Hive, Sqoop, and other BI tools.
- Experience in designing and implementing ETL projects using ETL tools such as Informatica and DataStage, and Big Data projects using Hadoop ecosystem tools such as Spark, Hive, MapReduce, and Sqoop.
- Skilled in the implementation and development of various business applications.
- Well versed in the software development lifecycle, including analysis, design, development, testing, implementation, and support.
- Strong knowledge on Spark framework, Hadoop HDFS architecture and MapReduce framework.
- Strong technical knowledge of Java, SQL, the ETL (Extract, Transform and Load) process, and scripting (Shell/Python).
- Used Spark SQL module for creating and transforming data frames based on business logic.
- Worked on developing ETL processes to load data from multiple data sources into HDFS using Sqoop, performing structural modifications using Spark and Hive, and analyzing data using visualization/reporting tools.
- Experience in working with the Hive data warehouse system, extending the Hive library with custom UDFs to query data in non-standard formats.
- Hands-on experience in developing Sqoop jobs to import data from RDBMS sources into HDFS as well as export data from HDFS into RDBMS tables.
- Hands-on experience in implementing complex business logic and optimizing queries using Spark SQL and HiveQL, controlling data distribution through partitioning to enhance performance.
- Experience in planning, designing, and developing applications spanning the full software development lifecycle, from writing functional specifications through design, implementation, documentation, unit testing, and support.
- Exposure to Maven and Git, along with shell scripting, for the build, deployment, and versioning process.
- Hands on experience in developing ETL jobs to import/export data from various sources including RDBMS and to implement various business logics.
- Extensively involved in design and implementation of various ETL projects.
- Led several ETL development teams, ensuring that development was in line with customer requirements and met the timelines in the project plan.
- Provided value additions to clients through various performance tuning initiatives.
- Achieved a Customer Satisfaction Index above 95% across various project assignments.
- Worked with one of the general insurance divisions of Confidential (American International Group, Inc) - Confidential JAPAN.
- Strong knowledge of the insurance domain - especially Property and Casualty insurance concepts.
- Maintained excellent customer relationships, having worked with Japanese clients for the past 5 years.
- Interacted directly with clients while working at the client location - Confidential Japan, Tokyo - for 2 years.
- Made major contributions to projects by interacting directly with Data Analysts, Business Users, and SMEs (Subject Matter Experts), thereby coordinating project development activities.
- Have played the role of a Developer, Coordinator & Knowledge Management Anchor.
- Good communication and interpersonal skills, a committed team player and a quick learner.
- Experience as a project coordinator between onsite and offshore teams which involved chairing status calls, delegating open tickets and updating progress to project managers.
- Strong analytical and troubleshooting skills.
- Involved in Learning & Development Activities. Conducted various training sessions.
- Good Leadership Skills.
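The summary above mentions extending the Hive library with custom UDFs to query data in non-standard formats. As an illustrative sketch only (the actual project UDFs are not shown here, and all names are hypothetical), the core of such a UDF is a plain function that normalizes an irregular input value; in Spark it would then be registered with `spark.udf.register`:

```python
from datetime import datetime

def normalize_policy_date(raw: str) -> str:
    """Normalize dates arriving in mixed, non-standard formats
    (e.g. '20230115' or '15/01/2023') into ISO 'YYYY-MM-DD'.
    This is the kind of logic typically wrapped in a Hive/Spark UDF."""
    for fmt in ("%Y%m%d", "%d/%m/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return ""  # unparseable values surface as empty for downstream quality checks

# In a real Spark job this function would be registered as a UDF, e.g.:
# spark.udf.register("normalize_policy_date", normalize_policy_date)
```

Returning an empty string rather than raising keeps a single bad record from failing a whole query; the empty values can then be counted in a data quality report.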
TECHNICAL SKILLS:
Hadoop Ecosystems : HDFS, Spark, MapReduce, Hive, Sqoop
ETL Tools : Informatica, DataStage
Reporting Tools : B.O (Business Objects), Cognos
Programming Languages : Java, SQL, PL/SQL
Scripting: Shell, Python
Web Technologies : HTML, JavaScript, XML, CSS
RDBMS: Oracle, MySQL, SQL Server, Sybase
IDE Tools : Eclipse, Microsoft Visual Studio, Microsoft Visio
Servers : Apache Tomcat server
Versioning systems : CVS, PVCS, Git
Operating Systems : Windows (XP,7), Linux
PROFESSIONAL EXPERIENCE:
Confidential
Hadoop Engineer
Responsibilities:
- Worked on a migration project to replace a business-critical data warehousing system with Hadoop.
- Developed various transformation logics using Spark SQL and Hive as part of the migration project.
- Developed Oozie workflows to source the legacy data into Hadoop and to transform the data as per the downstream specifications.
- Worked on the design and development of a data movement framework from disparate sources such as SQL Server, Teradata, and MySQL into Hadoop.
- Developed parameterized shell scripts for generating Sqoop scripts, Hive tables, and partitions.
- Developed various Hive UDFs and migrated them to Spark UDFs for data lookups.
- Worked with various data formats such as CSV, text files, and Avro.
- Performed benchmarking of Sqoop data ingestion and performance tuning to adhere to SLAs.
- Developed Data Quality framework to check for data discrepancies using shell scripting and hive.
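The parameterized script generation described above can be sketched as follows. This is an illustrative sketch under assumed conventions, not the project's actual framework; every table, database, column, and connection name here is made up:

```python
def build_sqoop_import(jdbc_url: str, table: str, target_dir: str,
                       split_col: str, mappers: int = 4) -> str:
    """Generate a Sqoop import command from parameters, the way a
    parameterized generator script would."""
    return (
        f"sqoop import --connect {jdbc_url} --table {table} "
        f"--target-dir {target_dir} --split-by {split_col} "
        f"--num-mappers {mappers} --as-avrodatafile"
    )

def build_hive_ddl(db: str, table: str, columns: dict, partition_col: str) -> str:
    """Generate DDL for a partitioned external Hive table; partitioning by a
    column such as the load date controls data distribution and lets queries
    prune irrelevant partitions."""
    cols = ",\n  ".join(f"{name} {ctype}" for name, ctype in columns.items())
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {db}.{table} (\n  {cols}\n)\n"
        f"PARTITIONED BY ({partition_col} STRING)\n"
        f"STORED AS AVRO"
    )

cmd = build_sqoop_import("jdbc:mysql://dbhost/claims", "policy",
                         "/data/raw/policy", "policy_id")
ddl = build_hive_ddl("staging", "policy",
                     {"policy_id": "BIGINT", "holder": "STRING"}, "load_dt")
```

Generating the commands from one parameter set keeps the Sqoop target directory and the Hive table definition consistent with each other, which is the main point of such a framework.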
Confidential
I.T.Analyst
Environments: Hadoop, Hive, Sqoop, Oozie, Datastage V9.1 Client, RedHat Linux, Unix Solaris 9, Oracle 11g, PEGA.
Responsibilities:
- Sourced policy, claim, and agent data from different source systems using Sqoop.
- Developed data integrity scripts using shell and Hive.
- Implemented transformations based on business logic to derive new fields using Hive.
- Generated reports based on the transformed data.
- Worked as Technical Lead in the Project.
- Involved in the Project Start Up activities and Project Plan.
- Provided Low Level Technical Design for the ETL module in the Project.
- Created key project artifacts such as effort estimations, audit documents, and the project plan.
- Led the development team in the design and development of ETL jobs and routines and their interaction with the various back-end and front-end applications.
- Developed parameterized shell scripts to invoke the sequence of ETL jobs.
- Developed various routines to deal with data formatting issues especially to handle the Japanese Character sets.
- Involved in the development of Autosys scripts for scheduling the batch jobs.
- Troubleshot issues during batch executions involving interaction with the various back-end and front-end applications.
- Documented the ETL job flow and depicted it in the form of Job Flow Diagrams.
- Played a key role in guiding the team to meet project deliverable timelines as per the plan.
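The parameterized shell scripts above invoked a sequence of ETL jobs. A minimal sketch of that sequencing logic is shown below, assuming (hypothetically) that each job reports a shell-style exit code; the real scripts would shell out to the ETL tool's job runner instead of calling Python callables:

```python
from typing import Callable, List, Tuple

def run_sequence(jobs: List[Tuple[str, Callable[[], int]]]) -> List[str]:
    """Run ETL jobs in order, stopping at the first failure so that a
    downstream job never consumes a partially loaded dataset.
    Each job returns a shell-style exit code (0 = success)."""
    completed: List[str] = []
    for name, job in jobs:
        if job() != 0:
            raise RuntimeError(
                f"job '{name}' failed; halting sequence after {completed}")
        completed.append(name)
    return completed

# In real scripts each callable would wrap a subprocess call to the
# scheduler/ETL runner; the job names here are made up.
done = run_sequence([("extract_policy", lambda: 0), ("load_hive", lambda: 0)])
```

Failing fast with the list of completed jobs makes restarts straightforward: the batch can be resumed from the first job that did not finish.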
Confidential
Systems Engineer
Environments: Datastage V 8.5 Client, RedHat Linux, Unix Solaris 9, Oracle 11g.
Responsibilities:
- Extensively involved in various Key Project Activities from the very initial phase of the Project.
- Involved in mapping an RDBMS-based marine system called JAMP to the global Confidential requirements.
- Played a key role in understanding the marine business and contacting the related SMEs to implement the marine mapping.
- Acquired a good knowledge and understanding of the Playbook and Mapping Guideline documents, which conveyed the key requirements for the ODSJI project.
- Involved in project design, guideline and mappings.
- Involved in the creation of key documents like ETL Development Guideline, Input file Requirements document, Unit Test Results Template.
- Handled the infrastructure setup activities in the project and coordinated the resolution of many server issues.
- Developed a POC (Proof of Concept) for the project, which served as a key prototype for the actual development.
- Led ETL development involving 42 source systems by providing the development team with input on sound architecture and design principles.
- Reviewed design and architecture documents to ensure that they are created as per requirement.
- Involved in Code Review and Test Results Review.
- Documented Key Challenges faced during Project Setup and Development Phase.
- Played a crucial role in Ensuring that Development is done in line with the project requirements.
Confidential
Assistant Systems Engineer
Environments: Java, Datastage 7.1 Client, Datastage 7.5 Client, Windows XP, Unix Solaris 9, Oracle 11g, Micro Focus COBOL
Responsibilities:
- Requirement analysis, preparing design documents.
- Coding, code reviews and test plan reviews.
- Prepared the project status report and shared it with the client on a weekly basis.
- Participated in issue resolution activities and tracked the status of issues.
- Knowledge Management Activities.
- Gained substantial knowledge about the concepts, design advantages, traps and pitfalls of successful object-relational mapping.
- Extensively involved in leading and coordinating the team to handle the project requirements and deliverables.
- Good Knowledge in project design, guideline and mappings.
- Involved in the creation of Design documents like Analysis documents and Job Flow Diagrams for the project.
- Developed and maintained a framework-based Java module to store the transaction details of a policy and perform insurance commission calculations.
- Played a key role in the maintenance for interaction of this Java module with Datastage to populate the required monetary values.
- Worked on development and maintenance activities of ETL (Datastage) jobs, sequences, user defined routines and unit test case documents for the project.
- Worked on Unix shell scripts that are used to execute the Datastage jobs.
- Developed SQL/PLSQL queries in Oracle to extract data from and load data into the JODS (Japan Operational DataStore) database.
- Handled the database activities in the project and resolved many server issues.
- Worked on Debugging and maintenance of Commission Calculation module which is coded in COBOL.
- Involved in preparation of project related documents like Effort Estimations, Client Metrics, UPP (Unified Project Plan), Induction Manuals, Audit Documents etc.
- Involved in taking up the code reviews, Audits and Project Management reviews for the project.
- Maintained the KEDB (Known Error Database) document, which records the various issues in the project along with their solutions.
- Extensively involved in unit testing and user acceptance testing activities.
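The SQL/PLSQL extract-and-load work above follows a standard pattern: query a source table, aggregate or transform the rows, and load the result into a target table. A self-contained sketch of that pattern is below, using Python's built-in SQLite in place of the project's Oracle database; the table and column names are invented for illustration:

```python
import sqlite3

# An in-memory SQLite database stands in for the Oracle source/target.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE policy_txn (policy_id INTEGER, premium REAL)")
cur.executemany("INSERT INTO policy_txn VALUES (?, ?)",
                [(1, 100.0), (1, 50.0), (2, 200.0)])

# Extract + transform: aggregate transaction amounts per policy,
# then load the result into a summary (JODS-style) target table.
cur.execute("CREATE TABLE policy_summary (policy_id INTEGER, total_premium REAL)")
cur.execute(
    "INSERT INTO policy_summary "
    "SELECT policy_id, SUM(premium) FROM policy_txn GROUP BY policy_id"
)
rows = cur.execute("SELECT * FROM policy_summary ORDER BY policy_id").fetchall()
# rows -> [(1, 150.0), (2, 200.0)]
```

Doing the aggregation inside a single `INSERT ... SELECT` keeps the transformation set-based, which is the same reason PL/SQL batch loads favor SQL over row-by-row processing.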