Hadoop Developer Resume
San Jose, CA
PROFESSIONAL SUMMARY:
- 14+ years of IT experience with emphasis on business requirements analysis, application design, data modeling, development, implementation, testing and project coordination of OLTP and data warehouse applications.
- Well versed in Big Data technologies such as Hadoop and HDFS, as well as Oracle, Informatica and UNIX.
- Good experience implementing Hadoop data pipelines to identify customer usage patterns, perform trend analysis and benchmarking.
- Good exposure to Hadoop technologies such as HDFS, MapReduce, Hive, Impala, Sqoop, Oozie and Pig.
- Good knowledge of multiple databases such as Oracle, MySQL, Sybase and Hive.
- Well versed in Hive performance tuning using partitioned tables, indexing and bucketing.
- Extensive knowledge of Oracle database objects such as tables, views, materialized views, indexes, synonyms, sequences, database links, constraints and triggers.
- Broad experience in query writing and tuning, including complex SQL queries and PL/SQL programming (cursors, exceptions, bulk collect, stored procedures, functions, triggers and packages).
- Extensive knowledge of performance tuning with SQL Trace, TKPROF, Explain Plan, indexes, hints, partitioning and compression.
- Highly skilled in planning, designing, developing and deploying data warehouses and data marts.
- Extensive ETL experience supporting data extraction, transformation and loading using Informatica (Power Center, Workflow Manager, Workflow Monitor …).
- Experienced in implementing Slowly Changing Dimensions (Type 1, 2 and 3), with knowledge of de-normalization, data cleansing, data quality, aggregation, performance optimization, auditing, etc.
- Worked with various Informatica client tools such as Source Analyzer, Warehouse Designer, Mapping Designer, Mapplet Designer, Transformation Developer, Repository Manager and Workflow Manager.
- Demonstrated experience in dimensional data modeling using Erwin.
- Good working experience in UNIX, developing shell scripts and scheduling them with crontab or other scheduling tools.
- Experienced in analyzing business requirements and translating requirements into functional and technical design.
- Well versed in project development methodologies such as Agile and Waterfall.
- Excellent communication, presentation and project management skills; a team player and self-starter with the ability to work independently and as part of a team.
TECHNICAL SKILLS:
Big Data Technologies: Hive, Impala, Pig, MapReduce, HDFS, HBase, Sqoop, Oozie and Spark.
ETL Tools: Informatica (Power Center 9.x, 8.x), SQL*Loader
RDBMS: Oracle 12c/11g/10g/9i/8i, MS SQL Server, Sybase
BI Tools: Business Objects, Crystal Reports
Data Modeling: Erwin Data Modeler, Toad Data Modeler
Database Tools: TOAD, SQL Developer, SQL Navigator, PL/SQL Developer
Unix Tools: WinSCP, PuTTY, FileZilla, Reflection
Languages: SQL, PL/SQL, Unix Shell scripting, Java, VB5
Scheduler: TES, Dollar U ($U), UC4, Autosys
Version Control: SVN, VSS, CM Synergy
Other Tools: Microsoft Project, Remedy, Jira, HP Quality Center.
PROFESSIONAL EXPERIENCE:
Confidential, San Jose, CA
Hadoop Developer
Responsibilities:
- Designed and implemented a scalable platform for large-scale data ingestion, aggregation and analytics in Hadoop, using Hive, Impala, Sqoop, HBase and Oozie.
- Leveraged new and emerging practices in Hadoop data platforms using Sqoop, Splunk, Hive, Oozie, Pig and MapReduce.
- Converted the existing Pentaho mappings and manual data load jobs into automated jobs.
- Identified performance bottlenecks in the existing data load jobs and converted them into loads 10x faster than the previous process.
- Ensured best practices were followed in delivering big data management and integration solutions.
- Generated dashboards in Tableau providing metrics at various granularities that are key to shaping business strategy and improving revenue.
- Implemented continuous integration, test-driven development and code analysis within the software development life cycle.
- Created Oracle database objects such as tables, indexes and partitions, and developed PL/SQL procedures using collections and bulk collect.
- Scheduled the data loading jobs in Tidal Enterprise Scheduler and Oozie by developing automated workflows of Sqoop, MapReduce and Hive jobs.
- Gathered business requirements from business partners and subject matter experts and translated them into technical specifications.
- Exported the analyzed data into relational databases using Sqoop for visualization and reporting by the Business Intelligence team (see the illustrative sketch after this list).
- Tuned the performance of the existing system through design changes to handle data more efficiently.
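Below is a minimal sketch of the kind of Sqoop export used to publish analyzed Hive output from HDFS into an Oracle reporting table; the connect string, schema, table name and HDFS paths are hypothetical placeholders, not the project's actual values.

#!/bin/bash
# Illustrative sketch only: host, credentials, table and HDFS paths are hypothetical.
# Export the aggregated Hive output (tab-delimited files in HDFS) into an
# Oracle reporting table read by the BI/Tableau layer.
sqoop export \
  --connect jdbc:oracle:thin:@//dbhost.example.com:1521/REPDB \
  --username rep_user \
  --password-file /user/etl/.oracle_pwd \
  --table USAGE_METRICS_DAILY \
  --export-dir /warehouse/analytics/usage_metrics_daily \
  --input-fields-terminated-by '\t' \
  --num-mappers 4

A command of this kind would typically be wrapped in an Oozie Sqoop action or a Tidal Enterprise Scheduler job so the export runs automatically once the upstream Hive aggregation completes.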
Environment: Hadoop and big data technologies, Oracle 12c, UNIX.
Confidential, Boston, MA
Sr. ETL Developer
Responsibilities:
- Involved in requirement gathering and analysis for the data marts, focusing on data analysis and data mapping between data sources, staging tables and data warehouses/data marts.
- Implemented the Hadoop data pipeline to load data from Hadoop into the Oracle relational database.
- Loaded data into Hive partitioned tables.
- Exported and imported data into HDFS, HBase and Hive using Sqoop (see the illustrative sketch after this list).
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Created reports for the BI team by using Sqoop to export data into HDFS and Hive.
- Converted the existing Informatica mappings into Oozie jobs using Hive, Sqoop and Oozie.
- Created mappings involving Slowly Changing Dimensions Type 1 and Type 2 to implement business logic and capture records deleted in the source systems.
- Set up new standard extracts and modified existing standard extracts using custom events, Autosys events and events based on upstream systems.
- Involved in performance tuning of the marts to improve data loading and retrieval times.
- Transformed business requirements into mart designs and developed marts on the Enterprise Service Platform (ESP).
- Worked with the business analysts and the QA team, providing SQL queries for validation and verification of the development.
- Created PL/SQL objects such as procedures and functions to implement complex business logic, and developed triggers for data auditing where applicable per the EDW requirements.
- Developed views in the Oracle database using analytical functions, regular expression functions, data pivoting and hierarchical queries.
- Chose and implemented appropriate table partitioning and indexes in Oracle to optimize data retrieval times and improve the performance of the Informatica sessions.
- Tuned SQL queries for better performance using the profiler, Explain Plan and table statistics.
- Created Teradata stored procedures, functions and cursors.
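An illustrative sketch of the Sqoop import and Hive partitioned-table load pattern referenced above; the JDBC URL, source table, Hive table and partition column are assumptions made for the example, not the project's actual objects.

#!/bin/bash
# Illustrative sketch only: connection details, tables and partition column are hypothetical.
LOAD_DATE=$(date +%Y-%m-%d)

# 1. Pull the day's rows from Oracle into a staging directory in HDFS.
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost.example.com:1521/SRCDB \
  --username etl_user \
  --password-file /user/etl/.oracle_pwd \
  --table ORDERS \
  --where "TRUNC(ORDER_DT) = TO_DATE('${LOAD_DATE}', 'YYYY-MM-DD')" \
  --target-dir /staging/orders/${LOAD_DATE} \
  --fields-terminated-by ',' \
  --num-mappers 4

# 2. Load the staged files into the matching Hive partition.
hive -e "
  ALTER TABLE sales.orders ADD IF NOT EXISTS PARTITION (load_date='${LOAD_DATE}');
  LOAD DATA INPATH '/staging/orders/${LOAD_DATE}'
  INTO TABLE sales.orders PARTITION (load_date='${LOAD_DATE}');
"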
Environment: Informatica Power Center 9.5.1, UNIX, Oracle 11g, Teradata.
Confidential
Sr. ETL Developer
Responsibilities:
- Extracted data from various heterogeneous sources like Oracle, SQL Server, DB2, Sybase, MS Access, and Flat Files.
- Extensively involved in Data Extraction, Data cleansing, Data Transformation and Loading (ETL process) from Source to target systems using Informatica Power Center.
- Designed the ETL process to load data from various sources to the staging environment, data warehouse and target files.
- Used transformations such as Expression, Stored Procedure, Sorter, Update Strategy, Lookup and Joiner to load the data into the target tables.
- Defined Source and Target Definitions in Informatica using Source Analyzer and Warehouse Designer.
- Extensively worked with various lookup caches like Static Cache, Dynamic Cache, and Persistent Cache.
- Used all the Informatica client components, including Designer, Workflow Manager, Workflow Monitor, Repository Manager and Repository Admin Console.
- Responsible for creating and modifying PL/SQL code, Java classes, triggers, procedures, functions and packages according to the business requirements.
- Created and optimized the required tables, views, indexes, synonyms, sequences and partitions to support the data warehouse needs.
- Developed views using SQL features such as PIVOT, UNPIVOT and hierarchical queries to generate reports for the Financial Pyramid View (FPV).
- Developed PL/SQL procedures and functions to implement the business logic per the requirements.
- Reviewed and validated the ETL design proposed by the architect team and proposed areas of improvement.
- Responsible for creating indexes and partitions to increase performance and for writing complex SQL queries joining multiple tables.
- Defined the target load plan for mappings.
- Used Workflow Manager to create, schedule, execute, and monitor sessions and batches that perform source to target data loads.
- Extensively used Informatica client tools such as Source Analyzer, Mapping Designer, Workflow Manager and Workflow Monitor.
- Created UNIX shell scripts to FTP files from the staging UNIX machine to the target UNIX machine (see the illustrative sketch after this list).
- Created shell scripts for developing batch jobs.
- Performed code reviews to ensure the code and mappings developed met the project coding standards and guidelines.
- Implemented and periodically revised the project coding standards to meet new challenges and project requirements.
- Led a team of six members, providing technical and functional assistance.
- Set up reports in Business Objects so the business teams could generate ad hoc reports.
- Updated the universe using BO Universe Designer to reflect the latest structures of the database tables and views.
- Participated in the daily scrum/stand-up call per Agile methodology and updated story status in Jira.
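A minimal sketch of the file-transfer shell script pattern referenced above; the hostnames, user, credential handling and directory names are placeholders for illustration.

#!/bin/bash
# Illustrative sketch only: hosts, user, password handling and paths are hypothetical.
TARGET_HOST=target.example.com
TARGET_USER=etluser
SRC_DIR=/data/staging/outbound
TGT_DIR=/data/incoming

cd "${SRC_DIR}" || exit 1

# Push every extract file produced by the nightly batch to the target machine.
ftp -inv "${TARGET_HOST}" <<EOF
user ${TARGET_USER} $(cat ~/.ftp_pwd)
binary
cd ${TGT_DIR}
mput extract_*.dat
bye
EOF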
Environment: Informatica Power Center 9.5.1, UNIX, Oracle 11g, Teradata.
Confidential
Hadoop Developer
Responsibilities:
- Moved all data flat files generated from various application logs to HDFS for further processing.
- Wrote Pig Latin scripts to process the HDFS data (see the illustrative sketch after this list).
- Created Hive tables to store processed results in a tabular format.
- Developed Sqoop scripts to move data between the RDBMS and HDFS.
- Fully involved in the requirement analysis phase.
- Troubleshot MapReduce jobs, Pig scripts and Hive queries.
- Developed Hadoop scripts, wrote MapReduce programs and verified the Hadoop log files.
- Interacted closely with business users, providing end-to-end support.
- Created Technical design documents based on business process requirements.
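An illustrative sketch of the log-ingestion and Pig processing flow described above; the log locations, field layout and output path are assumptions made for the example.

#!/bin/bash
# Illustrative sketch only: paths and the log field layout are hypothetical.
RUN_DATE=$(date +%Y-%m-%d)

# Land the application log flat files in HDFS for processing.
hdfs dfs -mkdir -p /data/app_logs/${RUN_DATE}
hdfs dfs -put /var/log/app/*.log /data/app_logs/${RUN_DATE}/

# Summarize error counts per application with a small Pig Latin script.
cat > /tmp/log_summary.pig <<'PIG'
logs    = LOAD '$IN' USING PigStorage('\t')
          AS (log_ts:chararray, app:chararray, level:chararray, msg:chararray);
errors  = FILTER logs BY level == 'ERROR';
grouped = GROUP errors BY app;
summary = FOREACH grouped GENERATE group AS app, COUNT(errors) AS error_cnt;
STORE summary INTO '$OUT' USING PigStorage('\t');
PIG

pig -param IN=/data/app_logs/${RUN_DATE} \
    -param OUT=/data/app_log_summary/${RUN_DATE} \
    -f /tmp/log_summary.pig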
Environment: Hadoop, HDFS, MapReduce, Hive, Sqoop, Pig, Oracle 11g.
Confidential
Sr. Oracle Developer
Responsibilities:
- Worked on major discretionary items, developing the required stored procedures, functions and other database objects.
- Prepared PL/SQL procedures to generate the replication metrics.
- Prepared PL/SQL scripts that automated various processes as part of support-related activities.
- Fine-tuned various SQL queries and PL/SQL procedures for better performance.
- Responsible for all change and release management during database/application rollouts of completed projects and maintenance releases into production.
- Partitioned some of the large tables and was extensively involved in performance tuning of all SQL and PL/SQL code.
- Responsible for all version control of code and documentation.
- Assisted QA team to create backend testing scripts and documentation.
- Made the data-fix scripts generic so that the same scripts are reusable with minor modifications.
- Coordinated the disaster recovery activities.
- Led the offshore team and coordinated between the onshore and offshore teams.
- Prepared the required SOPs for quick troubleshooting of various production issues.
- Trained and mentored new team members and offshore resources.
- Provided senior management with metrics on SLA timelines and achievements.
Environment: Centura SQL, UNIX, Oracle 10g.
Confidential
Sr. Oracle Developer
Responsibilities:
- Developed the required stored procedures, functions and other database objects.
- Prepared PL/SQL scripts that automated various processes and avoided manual monitoring.
- Performed requirement study and analysis for the modules.
- Worked with business users of the application to identify various areas of improvements for the application, thereby making the process more robust.
- Assisted and coordinated in resolving the day-to-day support tickets.
- Created shell scripts for developing batch jobs.
- Responsible for creating and modifying PL/SQL code, Java classes, triggers, procedures, functions and packages according to the business requirements.
- Fine-tuned various SQL queries and PL/SQL procedures for better performance.
- Troubleshot production support issues and provided both immediate and long-term fixes.
- Assisted the BO team in preparing SQL queries for scheduled report generation.
- Responsible for daily extraction, transformation and loading of data into flat files per the client's requirements (see the illustrative sketch after this list).
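A minimal sketch of the daily flat-file extract pattern referenced above; the connect string, query, columns and file layout are placeholders rather than the client's actual definitions.

#!/bin/bash
# Illustrative sketch only: credentials, table, columns and file layout are hypothetical;
# in practice a wallet or secured credential store would supply the password.
RUN_DATE=$(date +%Y%m%d)
OUT_FILE=/data/extracts/daily_orders_${RUN_DATE}.dat

sqlplus -s etl_user/"$(cat ~/.ora_pwd)"@SRCDB <<EOF
SET PAGESIZE 0 LINESIZE 500 FEEDBACK OFF HEADING OFF TRIMSPOOL ON
SPOOL ${OUT_FILE}
SELECT order_id || '|' || customer_id || '|' ||
       TO_CHAR(order_dt, 'YYYY-MM-DD') || '|' || order_amt
  FROM orders
 WHERE order_dt >= TRUNC(SYSDATE) - 1
   AND order_dt <  TRUNC(SYSDATE);
SPOOL OFF
EXIT
EOF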
Environment: Oracle 9i, 10g, UNIX, SQL Server, Sybase, Crystal Reports.
Confidential
Oracle Developer
Responsibilities:
- Gathered requirements from the client and translated the business details into a technical design.
- Created new database objects such as tables, sequences, procedures, functions, packages, triggers, indexes and views.
- Involved in writing several queries and procedures.
- Fine-tuned SQL queries using hints for maximum efficiency and performance.
- Created procedures, functions and packages based on the requirements.
- Prepared the user manual and technical support manual.
- Developed the required stored procedures, functions and other database objects.
- Responsible for all change and release management during database/application rollouts of completed projects and maintenance releases into production.
- Assisted QA team to create backend testing scripts and documentation.
- Made the data-fix scripts generic so that the same scripts are reusable with minor modifications.
- Worked on the day-to-day support activities.
- Created UNIX shell scripts to execute PL/SQL scripts that generate data extracts for external applications.
- Involved in resolving production problems for the applications and ensured all support service-level agreements were met.
Environment: Oracle.