Hadoop Developer/Lead Resume
Tucson, AZ
SUMMARY
- Experienced Hadoop developer with a strong background in distributed file systems across the big data landscape, including extensive experience in Data Warehousing, ETL, and Oracle.
- 9 years of IT experience in the analysis, design, development, testing, and implementation of business application systems for Big Data and Enterprise Data Warehouses.
- Around 3 years of experience with Hadoop ecosystem architecture and its various components.
- Strong experience in Big Data, HDFS, Sqoop, HBase, Pig and Hive.
- Hands-on experience with the Hadoop Distributed File System (HDFS) and the Hadoop MapReduce framework.
- Extensive knowledge of Sqoop for extracting data from traditional RDBMS.
- Good experience building Pig scripts to extract and transform raw data from several data sources into HDFS.
- Strong experience writing HiveQL to meet end-user requirements for ad hoc reporting and analysis.
- Strong knowledge of HBase, a distributed column-oriented database built on top of the Hadoop File System.
- Extensive knowledge of developing UDFs and UDTFs in Python as needed to extend Pig functionality.
- Good knowledge of the workflow schedulers Control-M and Oozie for managing Hadoop jobs.
- Strong knowledge of processing large sets of structured, semi-structured, and unstructured data, and of supporting systems application architecture.
- Extensive experience with Data Warehousing technology, including Oracle 11gR2 SQL, PL/SQL, and Informatica 9.1.
- Strong knowledge of Cloudera Impala for massively parallel query processing.
- Hands-on experience with the UNIX operating system and shell scripting.
- Good knowledge of fetching JSON data using Pig and Hive.
- Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance (a minimal DDL sketch follows this summary).
- Good knowledge of Amazon Web Services components such as EC2, EMR, and S3.
- Resolved performance issues in Hive and Pig scripts with an understanding of joins, grouping, and aggregation, and of how they translate into MapReduce jobs.
- Strong experience in extracting, transforming, and loading (ETL) data from various legacy sources into the Enterprise Data Warehouse and Data Marts using Informatica PowerCenter on an Oracle database.
- Extensive experience developing ad hoc reports using SQL and creating stored procedures, functions, views, triggers, and complex SQL queries.
- Work with system architects and DBAs to ensure decisions meet long-term enterprise growth and reusability needs.
- Expertise in working in every phase of the Software Development Lifecycle, both Waterfall and Agile.
- Excellent oral and written communication skills.
- 3.5 years of onsite work experience with clients, including a team lead role.
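A minimal HiveQL sketch of the partitioned and bucketed table design referenced above; the table names, columns, and HDFS path are hypothetical placeholders, not taken from any client system.

```sql
-- Hypothetical external table over raw landed data, partitioned by load date,
-- plus a managed ORC table bucketed by customer_id for faster joins and sampling.
CREATE EXTERNAL TABLE IF NOT EXISTS sales_raw (
  order_id     BIGINT,
  customer_id  BIGINT,
  order_amount DECIMAL(12,2)
)
PARTITIONED BY (load_dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/raw/sales';   -- illustrative HDFS location

CREATE TABLE IF NOT EXISTS sales_curated (
  order_id     BIGINT,
  customer_id  BIGINT,
  order_amount DECIMAL(12,2)
)
PARTITIONED BY (load_dt STRING)
CLUSTERED BY (customer_id) INTO 32 BUCKETS
STORED AS ORC;
```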
TECHNICAL SKILLS
Big Data Ecosystems: Hadoop, MapReduce, HDFS, Sqoop, Pig, Hive, Impala, HBase and Oozie
ETL Tools: Informatica Power Center v9.x, OWB, Oracle
Databases: Oracle 11gR2, NoSQL
Programming Languages: C, C++, SQL, PL/SQL, Shell scripting
Reporting: OBIEE, BI Publisher, Crystal Reports & Oracle Discoverer Reports
Scheduling Tools: BMC Control-M, Oozie, UC4 Automic
Operating Systems: Windows 2000/NT/XP/7/10, Red Hat Linux, Solaris
ITSM Tools: Service Now, BMC Remedy Action Request System
GUI Tools: SQL-Developer, TOAD 12.5, SQL*Plus, SQL Loader, XML Publisher, Putty, WinSCP
Enterprise Content Management Tools: EMC Documentum, Webxtra
PROFESSIONAL EXPERIENCE
Confidential, Tucson, AZ
Hadoop Developer/Lead
Responsibilities:
- Work with the business to gather new requirements, write transformation tasks using Pig, and design the corresponding data ingestion process using Sqoop.
- Track and maintain tasks/projects so they are completed on time and within the given scope across onsite and offshore teams.
- Good understanding of Sqoop jobs with incremental load to populate Hive external tables.
- Extract data from flat files into HDFS using UNIX scripts.
- Extensive experience creating Pig scripts to transform raw data from several data sources into baseline data.
- Developed HiveQL scripts that enabled end users to perform ad hoc analysis and data mining (see the query sketch after this list).
- Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance.
- Resolved performance issues in Hive and Pig scripts with an understanding of joins, grouping, and aggregation, and of how they translate into MapReduce jobs.
- Developed UDFs in Python as needed for use in Pig and HiveQL.
- Good understanding of SequenceFile, RCFile, Avro, and HAR file formats.
- Good understanding of Oozie workflows for scheduling and orchestrating the ETL process.
- Schedule meetings with different IT groups; coordinate deployment activities, plan and prepare project cutover activities, and prepare and execute the deployment plan.
- Work with system architects and DBAs to ensure decisions meet long-term enterprise growth and reusability needs.
- Implemented multiple solutions that improved performance of batch jobs and reports. This helped in reducing support overhead as well as preserving SLAs.
- Responsible for the data transformation tasks, written as Pig scripts that run every night.
- Performed a lead role in bringing a struggling project back on track by leading the team and coordinating with testing, business, external vendors, DBAs, and infrastructure groups; flexible in taking on additional responsibilities when needed.
- Estimation and planning of timelines and resources for projects I lead, creation of solution approach documents, etc.
- Meet with the Vice President and CFO to gather report requirements and generate reports for audit compliance and forecasting.
- Day-to-day activities, weekly status updates, technical upgrade activities for various applications, and process and performance improvements.
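An illustrative HiveQL query of the kind used for such ad hoc analysis; the table, columns, and date range are hypothetical. The partition filter lets Hive prune partitions, and the grouping/aggregation is what compiles into MapReduce jobs.

```sql
-- Hypothetical ad hoc report: daily order counts and totals for one month,
-- restricted by the partition column so only matching partitions are scanned.
SELECT load_dt,
       COUNT(*)          AS order_cnt,
       SUM(order_amount) AS total_amount
FROM   sales_curated
WHERE  load_dt BETWEEN '2016-01-01' AND '2016-01-31'
GROUP  BY load_dt
ORDER  BY load_dt;
```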
Confidential, Tucson, AZ
ETL Lead
Responsibilities:
- Strong Data Warehousing ETL experience using Informatica 9.1 PowerCenter client tools (Mapping Designer, Repository Manager, Workflow Manager/Monitor) and server tools (Informatica Server, Repository Server Manager).
- Extensive support for data warehouse loads and bug fixes; troubleshoot and debug issues, and identify and implement solutions.
- Involved in a POC to evaluate the feasibility of a data lake process with Hadoop, Pig, and Hive, including the exploration phase.
- Study and analyze requirements from the given BRD and provide technical documentation.
- Generate ad hoc reports based on business requirements; build and implement reports using BI Publisher, Crystal Reports, and Discoverer Reports.
- Troubleshoot and apply query and ETL optimization techniques.
- Maintain coding and process standards per data governance rules, and adhere to best practices and security guidelines.
- Expertise in Data Warehouse/Data Mart, ODS, OLTP, and OLAP implementations, combined with project scoping, analysis, requirements gathering, effort estimation, ETL design, development, system testing, implementation, and production support.
- Well experienced with ETL data models and change data capture Types 1, 2, and 3 (a minimal Type 1 merge sketch follows this list).
- Day-to-day support activities, weekly status updates, technical upgrade activities for various applications, and process and performance improvements.
- Support and maintenance processes managed and audited by tracking activities through Request Items, Incidents, and Change Requests.
- Track and maintain tasks/projects so they are completed on time and within the given scope across onsite and offshore teams.
- Working as onsite lead and single point of contact for Data Warehouse and Reporting solutions in the Customer Applications area.
- Gather business requirements and distribute tasks to the offshore team, track and plan offshore assignments, and provide 24/7 production support for ETL and Reporting while maintaining SLAs.
- Meet with Big Data Center of Excellence teams to understand and assess the scope of work involved in the big data ecosystem.
- Strong experience in extracting, transforming, and loading (ETL) data from various sources into Data Warehouses and Data Marts using Oracle Warehouse Builder on an Oracle database.
- Extensive experience developing ad hoc reports using SQL and creating stored procedures, functions, views, triggers, and complex SQL queries with Oracle PL/SQL.
- Resolved ongoing maintenance issues and bug fixes; monitored ETL session data loads and daily, weekly, and monthly operational reports, and performance-tuned report queries, mappings, and sessions.
- Configure batch jobs in the development environment using automation scheduling tools such as UC4 Automic and work with the system admin team to move jobs to production.
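A minimal Oracle SQL sketch of the Type 1 (overwrite-in-place) change handling mentioned above, expressed as a MERGE from a staging table into a dimension table; the table and column names are hypothetical.

```sql
-- Hypothetical Type 1 change-capture load: overwrite changed attributes,
-- insert rows that do not yet exist in the dimension.
MERGE INTO dim_customer d
USING stg_customer s
   ON (d.customer_id = s.customer_id)
WHEN MATCHED THEN
  UPDATE SET d.customer_name = s.customer_name,
             d.city          = s.city,
             d.updated_on    = SYSDATE
WHEN NOT MATCHED THEN
  INSERT (customer_id, customer_name, city, updated_on)
  VALUES (s.customer_id, s.customer_name, s.city, SYSDATE);
```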
Confidential, Rye, NY
Sr. ETL Developer
Responsibilities:
- Study and analyze the requirements from the given high-level BRD and provide technical documentation.
- Develop interfaces using Oracle objects such as packages, procedures, functions, and database triggers per business requirements.
- Hold weekly status meetings with the client on project/assignment status and planned/completed activities.
- Meet with business analysts to discuss requirements, assign tasks to offshore, and collect, consolidate, and review the offshore deliverables.
- Work with the business area to obtain the necessary approvals and sign-offs for change requests, and plan the production move by coordinating with the corresponding IT groups.
- Generate ad hoc reports and reconcile data elements between Master Data Management and downstream systems.
- Developed various transformations, mappings, sessions, and workflows for loading data from source systems to targets as part of the ETL process.
- Monitor the ETL load and reconciliation process, and fix data issues to adhere to the SLA.
- Provide audit/error reporting and data fixes based on business approvals.
- Meet with key business stakeholders to ensure data governance policies are met.
- Fine-tune mappings by filtering in the Source Qualifier instead of a Filter transformation, using sorted input for aggregations, using operators in expressions instead of functions, and increasing cache sizes and the commit interval.
- Completed fundamental big data courses such as Hadoop, Pig, and Hive in the internal academy.
- Extensive experience troubleshooting and applying query and ETL optimization techniques.
- Involved in meetings with big data technical groups to understand the architecture and help develop Hive queries by applying SQL knowledge.
- Extensively used an FTP tool to upload and download files to servers as part of external table creation during development.
- Performed data validation at the transaction level using database triggers and cursors.
- Very good interaction with the customer's business and technical teams.
- Generate flat files from the database using Oracle utilities such as UTL_FILE.
- Involved in Production data fixes based on Change requests.
- Create external tables for loading data from flat files into target systems (a minimal DDL sketch follows).
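An illustrative Oracle external-table definition of the kind used to load flat files; the directory object, file name, and columns are hypothetical placeholders.

```sql
-- Hypothetical external table over a comma-delimited flat file.
CREATE TABLE ext_customer_feed (
  customer_id   NUMBER,
  customer_name VARCHAR2(100),
  city          VARCHAR2(60)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY data_dir          -- pre-created DIRECTORY object
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
    MISSING FIELD VALUES ARE NULL
  )
  LOCATION ('customer_feed.csv')
)
REJECT LIMIT UNLIMITED;
```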
Confidential
Sr. ETL Developer
Responsibilities:
- Study and analyze the requirements from the given high-level BRD and provide technical documentation.
- Develop ETL interfaces to extract data from various source systems, validate and transform the data per MDM rules, and load it into downstream systems.
- Provide audit/error reporting.
- Extensively developed Oracle objects (packages, procedures, functions, and database triggers) per business rules.
- Extensively created Oracle user-defined functions for data validation.
- Perform unit testing and participate in UAT to help business users.
- Performed data validation at the transaction level using database triggers and cursors.
- Created flat files from the database using the UTL_FILE utility (a minimal PL/SQL sketch follows this list).
- Involved in Production fixes based on Change requests.
- Creating External tables for loading data from flat files.
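A minimal PL/SQL sketch of writing a delimited flat file with UTL_FILE; the directory object, file name, and query are hypothetical.

```sql
-- Hypothetical extract: write customer rows to a CSV file via UTL_FILE.
DECLARE
  l_file UTL_FILE.FILE_TYPE;
BEGIN
  l_file := UTL_FILE.FOPEN('DATA_DIR', 'customer_extract.csv', 'w', 32767);
  FOR rec IN (SELECT customer_id, customer_name FROM customers) LOOP
    UTL_FILE.PUT_LINE(l_file, rec.customer_id || ',' || rec.customer_name);
  END LOOP;
  UTL_FILE.FCLOSE(l_file);
EXCEPTION
  WHEN OTHERS THEN
    IF UTL_FILE.IS_OPEN(l_file) THEN
      UTL_FILE.FCLOSE(l_file);
    END IF;
    RAISE;
END;
/
```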
Confidential
Sr. Software Engineer
Responsibilities:
- Study and analyze the requirements from the given high-level BRD and provide technical documentation.
- Developed Oracle objects such as packages, procedures, functions, and database triggers per business rules.
- Develop interfaces to load data from various sources as either full loads or incremental loads based on the transaction ticket number (a minimal SQL sketch follows this list).
- Analyze problem areas and provide better solutions for transforming source data while adhering to master data standards.
- Fix the data based on audit/error reporting that runs four times a day.
- Perform unit testing and bug fixes.
- Performed data validation at the transaction level using database functions, triggers, and cursors.
- Extensively used Toad and its FTP tools.
- Involved in Production fixes based on Change requests.
- Creating External tables for loading data from flat files.
- Good interaction with the customer team.
- Extensive experience with data warehousing, change data capture, data modeling, and OLAP.
- Good experience scheduling jobs through batch scheduling.
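An illustrative SQL sketch of the ticket-number-driven incremental load described above; the table names and the load-control mechanism are hypothetical.

```sql
-- Hypothetical incremental load: pull only rows with a ticket number greater
-- than the last one recorded in a load-control table, then advance the marker.
INSERT INTO target_transactions (ticket_no, account_id, amount, txn_date)
SELECT s.ticket_no, s.account_id, s.amount, s.txn_date
FROM   source_transactions s
WHERE  s.ticket_no > (SELECT last_ticket_no
                      FROM   load_control
                      WHERE  interface_name = 'TRANSACTIONS');

UPDATE load_control
SET    last_ticket_no = (SELECT MAX(ticket_no) FROM target_transactions)
WHERE  interface_name = 'TRANSACTIONS';
```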