Hadoop Developer Resume
Bellevue, WA
SUMMARY
- Overall 10 years of professional experience as a Lead and Senior Analyst.
- Good knowledge of Hadoop (HDFS, MapReduce), HBase, Spark, Pig, Hive, Hue, Phoenix, Oozie, ZooKeeper, Sqoop and the Hortonworks Data Platform (HDP)
- Experience working on a large Hadoop cluster of 200 data nodes
- Experience in importing and exporting data from different RDBMS like Teradata, MySQL and Oracle into HDFS and Hive using Sqoop
- Very good experience with Hadoop, Pig, Hive, Sqoop and YARN, designing and implementing Hive jobs to support distributed processing of large data sets on the Hadoop cluster
- Experience in the design, development and maintenance of NoSQL databases such as HBase.
- Experience in banking, capital markets and communications applications, including customer information and weblog processing.
- Diverse experience utilizing Java tools in business, Web, and client-server environments
- Strong management skills, handling meetings, data analysis, documentation and reporting
- Working experience with IBM MQ Series
- Worked extensively on critical change requests and UAT validations.
- Conceptual and hands-on experience with data conversions
- Ability to quickly adapt to new tools and techniques
- Worked with business and data analysts to clarify ambiguities
- Good understanding of the restrictions, standards and database architecture governing all database object activities.
- Strong communication, presentation and problem-solving skills, as well as very good interpersonal and business skills, coupled with many years of technical expertise
- Independent thinker, fully capable of making assessments of problem areas and finding effective solutions.
TECHNICAL SKILLS
Big Data: Apache Hadoop (HDFS, MapReduce), Hive, Spark, HBase, Phoenix, Sqoop, Pig, Flume, Oozie, ZooKeeper, Hortonworks Data Platform (HDP).
Programming Languages: C, Java, SQL, PL/SQL, COBOL
Scripting Languages: Shell, Python
RDBMS: Teradata, MySQL, Oracle, DB2
Operating Systems: MS Windows (95, NT, 2000, XP), MS DOS, LINUX, z/OS
Testing Tools: Quality Center (QC)
Database Tools: TOAD, AQT, Teradata SQL Assistant and IBM Data Studio
Tools: Git, Eclipse, RAD, Maven, MS Office, FileZilla, MQ Series, AccuRev
PROFESSIONAL EXPERIENCE
Confidential, Bellevue, WA
Hadoop Developer
Responsibilities:
- Worked on the Hortonworks Data Platform (HDP 2.2.4.2)
- Participated in data mapping sessions with source system analysts and data analysts to prepare the source-to-target mapping documents
- Gained an understanding of the communications logical data model
- Ingested data using different approaches such as flat files, mounted locations, database views and direct database connections
- Cleansed the data using Pig scripts before loading it into the data lake.
- Transferred the last 3 years of historical data from Teradata into Hive using Sqoop jobs run during off-peak hours (see the Sqoop sketch after this list)
- Tuned Sqoop parameters for the Teradata imports to improve performance
- Gained an understanding of Oracle GoldenGate replication
- Used the ORC file format with Snappy compression for performance improvements (see the ORC sketch after this list)
- Responsible for creating the preparation job that loads data into the data model
- Created Pig scripts for data transformation
- Used Oozie workflows to execute the different actions required to load the formatted data into the Hadoop layout.
- Created a Python streaming script for Pig for a sequential data processing use case
- Responsible for creating the dispatch job that loads data into the Teradata layout
- Worked on Teradata BTEQ scripts to load data into Teradata
- Used the YARN ResourceManager REST APIs to monitor application/job status and debug mapper and reducer errors (see the curl sketch after this list)
- Understanding of Kerberos authentication in Oozie workflows for Hive and HBase
- Used Phoenix tables on top of HBase tables to improve query performance for non-row-key query patterns
- Worked with business teams and created Hive queries and Pig scripts for ad hoc analysis.
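Sqoop sketch: a minimal example of the kind of off-peak Sqoop import from Teradata into Hive described above; the connection string, credentials, database and table names are all hypothetical placeholders, not the actual project values.

    #!/bin/bash
    # Illustrative sketch only -- host, database, table and password-file paths are hypothetical.
    # Imports one slice of historical data from Teradata into a Hive staging table.
    sqoop import \
      --connect "jdbc:teradata://td-host/DATABASE=sales_db" \
      --driver com.teradata.jdbc.TeraDriver \
      --username "$TD_USER" \
      --password-file /user/etl/td.password \
      --table CUSTOMER_TXN \
      --where "txn_date >= DATE '2012-01-01' AND txn_date < DATE '2013-01-01'" \
      --hive-import --hive-table staging.customer_txn \
      --num-mappers 8 \
      --fields-terminated-by '\001'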
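ORC sketch: a minimal example of creating a Hive table stored as ORC with Snappy compression and populating it from a staging table, as referenced above; the schema, database and table names are hypothetical.

    #!/bin/bash
    # Illustrative sketch only -- database, table and column names are hypothetical.
    # Creates an ORC table with Snappy compression and loads it from a staging table.
    hive -e "
      CREATE TABLE IF NOT EXISTS curated.customer_txn_orc (
        customer_id BIGINT,
        txn_date    STRING,
        amount      DOUBLE
      )
      STORED AS ORC
      TBLPROPERTIES ('orc.compress'='SNAPPY');

      INSERT OVERWRITE TABLE curated.customer_txn_orc
      SELECT customer_id, txn_date, amount FROM staging.customer_txn;
    "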
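curl sketch: a minimal example of checking application status through the YARN ResourceManager REST API, as referenced above; the ResourceManager host and application ID are placeholders.

    #!/bin/bash
    # Illustrative sketch only -- the ResourceManager host and application ID are placeholders.
    # Lists running YARN applications, then fetches the status of one application.
    RM="http://rm-host:8088"

    # All currently running applications on the cluster
    curl -s "$RM/ws/v1/cluster/apps?states=RUNNING" | python -m json.tool

    # Detailed status of a single application, e.g. application_1430000000000_0042
    APP_ID="$1"
    curl -s "$RM/ws/v1/cluster/apps/$APP_ID" | python -m json.tool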
Environment: HDP 2.2.4.2, Apache Hadoop, MapReduce, YARN, Hive, Spark, Pig, Sqoop, HBase, Phoenix, Oozie, Hue, Linux, shell scripting, Teradata, Oracle 10g, Quality Center
Confidential, Westlake, TX
Hadoop Developer
Responsibilities:
- Participated in requirement analysis and creation of data solution using Hadoop.
- Involved in analyzing system specifications, designing and developing test plans.
- Worked on the ingestion process for weblog data into the Hadoop platform
- Created processes for weblog data enrichment, page fixing, sessionization and session flagging.
- Responsible for creating Hive tables and partitions, loading data and writing Hive queries (see the Hive sketch after this list).
- Migrated existing inbound processes from the legacy system to Hadoop
- Transferred the last 7 years of historical data from Oracle into Hive using Sqoop jobs run during off-peak hours
- Worked with business teams and created Hive queries for ad hoc analysis.
- Participated in the coding of application programs for the outbound feeds
- Built a common purge process to remove old tables (see the purge sketch after this list)
- Used the existing Pig DataFu Sessionize UDF to create session IDs for processing
- Strong problem solving, reasoning, leadership and analytical skills
- Ability to work constructively with developers, the QA team, project managers and management towards a common goal
- Participated in assigned user conferences, user group meetings, internal meetings, prioritization and production work-list calls
- Well conversant with software testing methodologies, including developing design documents, test plans, test scenarios, test cases and documentation.
- Prepared troubleshooting documents.
- Created and executed SQL queries on an Oracle database to validate and test data
- Performed functional, regression, system, interface, integration and acceptance testing.
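Hive sketch: a minimal example of the kind of partitioned Hive table creation and data load referenced above; the database, table, columns and HDFS path are hypothetical.

    #!/bin/bash
    # Illustrative sketch only -- database, table, column and path names are hypothetical.
    # Creates a date-partitioned weblog table and loads one day of data into it.
    hive -e "
      CREATE TABLE IF NOT EXISTS weblogs.page_views (
        session_id STRING,
        user_id    STRING,
        url        STRING,
        ts         STRING
      )
      PARTITIONED BY (log_date STRING)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

      LOAD DATA INPATH '/data/weblogs/incoming/2014-06-01'
      INTO TABLE weblogs.page_views PARTITION (log_date='2014-06-01');
    "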
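Purge sketch: a minimal example of a purge process like the one referenced above, assuming a hypothetical convention where temporary tables carry a trailing _YYYYMMDD date suffix; the database name and retention window are placeholders.

    #!/bin/bash
    # Illustrative sketch only -- database name, table prefix and naming convention are hypothetical.
    # Drops temporary tables whose _YYYYMMDD suffix is older than the retention window.
    RETENTION_DAYS=30
    CUTOFF=$(date -d "-${RETENTION_DAYS} days" +%Y%m%d)   # GNU date

    hive -e "USE staging; SHOW TABLES LIKE 'tmp_*';" | while read -r tbl; do
      suffix="${tbl##*_}"                                 # trailing date portion of the name
      if [[ "$suffix" =~ ^[0-9]{8}$ && "$suffix" -lt "$CUTOFF" ]]; then
        echo "Dropping old table: staging.$tbl"
        hive -e "DROP TABLE IF EXISTS staging.$tbl;"
      fi
    done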
Environment: Apache Hadoop, MapReduce, Hive, Pig, Sqoop, HBase, Oozie, Hue, Linux, Oracle 10g, SQL, PL/SQL, AQT, Quality Center
Confidential, Charlotte, NC
Technical Lead
Responsibilities:
- Worked collaboratively and effectively with remote teams and internal customers, and worked closely with the support teams on problem analysis and timely resolution
- Worked on the conversion of 5 MM customer profiles and customer-to-account linkages from the Premier customer hub to the Consumer customer hub.
- Worked on the conversion of 20 K trust-only customer profiles and account linkage information from the Trust customer hub to the Consumer customer hub.
- Converted 40 MM account linkage details from Cardlytics to set up the existing Targeted Offer Services
- Built a sync-up process between the Consumer, Premier and Trust hubs.
- Created monthly reporting process for the data between WCC, CED and GWM.
- Built a process to extract customer profile delta changes using a master process that uses MQ to store the delta publish messages from the APIs.
- Built an interface to provide WCC profile, account linkage, and combine/separate details to the Enterprise Client Data Management (ECDM) group for reporting
- Built the capability to store WCC XML publish messages in DB2 so that they can be republished to ECDM over MQ after any unplanned outage of the ECDM server.
- Built a process to remove customer privacy choices data from WCC for compliance
- Modified the outbound processes to remove the privacy choices information
- Modified the internal processes to remove the privacy choices.
- Modified the screens to remove the privacy choices from the interactions.
- Supported the MDM upgrade via a DDL upgrade: one DDL script incorporating vendor-provided upgrade scripts from MDM v8.5 -> MDM v9.0 -> MDM Server v10.0
- Performed impact analysis of existing functionality and customizations against MDM Server.
- Modified and tested impacted batch flows and regression-tested critical non-impacted batch flows.
- Modified interfaces to support mobile wallet application.
- Created a sync process between WCC and other systems.
- Supported a new customer-level service for Targeted Offers (TGO) to identify customers enrolled in Targeted Offers.
- Modified existing interfaces to expose new service attributes for mobile show/hide inline offer p.
- Created interfaces with Cardlytics for the create-service and sync-up services.
- Proficient with SYNCSORT and the point of contact for SYNCSORT-related queries in the group.
Environment: COBOL, DB2, CICS, Easytrieve, SYNCSORT, SQL, PL/SQL, Java, XML, RAD 7.0, SOAP, WebSphere 6.0, UNIX, Rational ClearCase, MDM database, Data Studio, CA Viewprod, ChangeMan, CA7 scheduling, Rumba, IBM utilities, Windows NT/XP, FTP, MS Office and MS Visio.
Confidential
Software Engineer
Responsibilities:
- Upgraded the existing Confidential return processing to enhance the handling of Confidential returns, delivering distributions to the customer as soon as possible with a targeted 15-minute turnaround after receipt of the returns files.
- Built a faster matching process, dishonor process and contested dishonor process for all originated returns by eliminating the need to run returns through two PEP+ windows.
- Once resolved, the returns (including contested dishonors) are sent directly to the Returns Distribution System, enabling timelier reporting to the customer.
- Upgraded the PEP+ system from VSAM files (E9* files) to DB2 tables; the VSAM files were converted to their corresponding tables based on the Segment ID as part of the PEP+ upgrade.
- Developed APS COBOL programs using MFEEE, an IDE tool for APS COBOL development, and served as the point of contact for the offshore team
- Received the Rising Star award for contributions to the Confidential return enhancements
Environment: COBOL, DB2, CICS, Easytrieve, SYNCSORT, SAR, SQL, Changeman, MFEEE, CA7 scheduling and IBM utilities.