Big Data Hadoop Developer Resume
SUMMARY
- 11 years of experience working with large volumes of data in data warehouse environments
- 4+ years of experience with the Hadoop ecosystem
- Hands-on experience designing and implementing complete end-to-end Hadoop infrastructure, including Pig, Hive and Sqoop
- Experienced in loading and transforming large sets of structured, semi-structured and unstructured data.
- Ability to utilize data load tools such as Flume and Sqoop
- Experience in data migration from relational databases to Hadoop HDFS
- Experience analyzing data stored in Hadoop using Pig and Hive
- Working knowledge of Impala
- Working knowledge of Talend
- Good understanding of the Parquet file format
- Worked on NoSQL databases such as HBase, MongoDB and Cassandra.
- Expertise in job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
- Working knowledge of UNIX/LINUX environments
- Extensive experience in Extraction, Transformation and Loading (ETL) of data using Informatica PowerCenter
- Strong experience creating complex mappings using transformations such as Source Qualifier, Expression, Filter, Joiner, Router, Union, Lookup and Update Strategy
- Strong understanding of Data Warehouse Concepts & Dimensional Data Modeling (Star schema, Snowflake, SCD, Fact & Dimension Tables), DAC.
- Expertise and experience with Software Development Life Cycle processes and methodologies
- Experience with agile development methodologies
- In-depth understanding and prior work experience in finance industry
- In-depth knowledge of databases such as Oracle 8i/9i/10g, SQL Server 2000 and MySQL; extensive experience writing SQL queries, Stored Procedures, Triggers, Cursors, Functions and Packages
- Extensively worked on Siebel CRM and Siebel Enterprise Integration Manager
- In-depth knowledge of scheduling tools such as AutoSys and Control-M and of the process for scheduling data load jobs
- Excellent leadership skills and a deep understanding of application architecture.
- Ability to quickly ramp up and start producing results with any given tool or technology.
- Excellent communication, analytical, interpersonal and presentation skills.
TECHNICAL SKILLS
Big Data Ecosystem: Hadoop, MapReduce, HDFS, HBase, ZooKeeper, Hive, Pig
Utilities: Oozie, Sqoop, HBase, Flume
Languages: Java 1.4/1.5/1.6, PL/SQL
Web Technologies: HTML, DHTML, JavaScript, VBScript, XML
RDBMS: Oracle, SQL Server 2000/2005/2008
CRM: Siebel
Operating Systems: Windows XP/Vista/7, UNIX, Linux.
PROFESSIONAL EXPERIENCE
Big Data Hadoop Developer
Confidential
Responsibilities:
- Designed and deployed a Hadoop cluster and big data tools including Pig, Hive, Impala, HBase, Sqoop and Flume on the Cloudera distribution.
- Prepared detailed design documentation for analysis, design and testing.
- Performed data migration from legacy RDBMS to HDFS using Sqoop.
- Created Hive external tables, loaded data into them and queried the data using HQL (a minimal sketch follows this job entry).
- Implemented partitioning and bucketing in Hive.
- Used Pig extensively for data cleansing.
- Managed and scheduled jobs on the Hadoop cluster.
- Gained working knowledge of creating tables in HBase and importing data into them.
Environment: MapReduce, HDFS, Hive, Pig, Linux, Java, Flume, Sqoop, MySQL
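Illustrative Hive DDL for the partitioned and bucketed external-table pattern described above; the table, columns, HDFS path and bucket count are hypothetical, not taken from the actual project:

    # Illustrative sketch only: table, columns and paths are hypothetical
    hive -e "
    CREATE EXTERNAL TABLE IF NOT EXISTS sales_txn (
      txn_id  BIGINT,
      cust_id BIGINT,
      amount  DOUBLE
    )
    PARTITIONED BY (txn_date STRING)
    CLUSTERED BY (cust_id) INTO 32 BUCKETS
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION '/data/raw/sales_txn';

    ALTER TABLE sales_txn ADD IF NOT EXISTS PARTITION (txn_date='2014-01-01');

    -- query the partitioned data with HQL
    SELECT txn_date, COUNT(*) AS txn_count, SUM(amount) AS total_amount
    FROM sales_txn
    WHERE txn_date = '2014-01-01'
    GROUP BY txn_date;
    "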
Big Data Hadoop Developer
Confidential
Responsibilities:
- Designed and deployed a Hadoop cluster and big data tools including Pig, Hive, HBase, Sqoop and Flume on the Cloudera distribution.
- Created MapReduce jobs using Pig Latin and Hive queries.
- Used Sqoop to load data from RDBMS into HDFS (a minimal command sketch follows this job entry).
- Implemented partitioning and bucketing in Hive.
- Created unit test plans, test cases and test reports for validating the data loads.
- Managed and scheduled jobs on the Hadoop cluster.
Environment: MapReduce, HDFS, Hive, Pig, Linux, Java, Flume, Sqoop, MySQL
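A minimal sketch of the kind of Sqoop import used for such RDBMS-to-HDFS loads; the host, database, credentials, table and target directory are hypothetical:

    # Illustrative sketch only: host, schema, table and target directory are hypothetical
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/salesdb \
      --username etl_user -P \
      --table customer \
      --target-dir /data/raw/customer \
      --fields-terminated-by ',' \
      --num-mappers 4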
Data Migration Lead
Confidential
Responsibilities:
- Gathered data migration requirements.
- Performed data mapping between Siebel and the various source systems.
- Created/extended new columns as part of the data mapping.
- Prepared the Data Migration Design Document and entity-wise data mapping spreadsheets.
- Built Informatica mappings between the source systems and Siebel EIM tables.
- Built Informatica workflows and scheduled jobs for the daily data loads.
- Created IFB files for loading the various entities identified.
- Built and tested SQL scripts to insert data for the one-time data loads.
- Built shell scripts to automate batch jobs (a minimal wrapper sketch follows this job entry).
Environment: Siebel CRM, Informatica, Oracle 11g, Linux
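A minimal sketch of the kind of shell wrapper used to automate a one-time load; the paths, schema, TNS alias, SQL script name and password variable are all hypothetical:

    #!/bin/sh
    # Illustrative batch-load wrapper: paths, schema and script names are hypothetical
    LOG=/var/load/logs/onetime_load_$(date +%Y%m%d).log

    # Run the one-time insert script against the staging schema and capture the output
    sqlplus -s "stg_user/${STG_PWD}@SBLDB" @/var/load/sql/onetime_account_load.sql > "$LOG" 2>&1

    # Fail the batch job if sqlplus reported an Oracle error
    if grep -q "ORA-" "$LOG"; then
      echo "One-time load failed; see $LOG" >&2
      exit 1
    fi
    echo "One-time load completed; log written to $LOG"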
Data Migration Lead
Confidential
Responsibilities:
- Gathered data migration requirements.
- Performed data mapping between Siebel and the various source systems.
- Interacted with the business on the different acquisitions.
- Prepared the Data Migration Design Document and entity-wise data mapping spreadsheets.
- Built Informatica mappings between the source systems and Siebel EIM tables.
- Developed mappings, transformations and mapplets using the Mapping Designer, Transformation Developer and Mapplet Designer in Informatica PowerCenter.
- Imported source/target tables from the respective databases, created reusable transformations (Joiner, Router, Lookup, Rank, Filter, Expression and Aggregator) inside a mapplet, and created new mappings using the Designer module of Informatica.
- Created/modified DTS packages for the data loads.
- Created IFB files for loading the various entities identified.
- Built and tested SQL scripts to insert data for the one-time data loads.
- Built AutoSys jobs for job scheduling.
Environment: Siebel CRM, Informatica, MS SQL Server, Windows, AutoSys
Data Migration Consultant
Confidential
Responsibilities:
- Interacted with the client on data-related developments and issues.
- Developed and unit tested Siebel EIM programs.
- Performance-tuned and revamped existing EIM scripts.
- Worked closely with business analysts/power users during the unit and system testing phases.
- As part of production support, was responsible for planning and executing application downtime for enhancements and regular maintenance activities.
- Upon failures/outages, was responsible for bringing all related groups onto a bridge call, analyzing the causes of failure, taking appropriate corrective action and keeping management informed.
- Attended weekly Change Control Meetings to discuss changes planned in any of the source systems, analyze their impact at the report/universe level and define action plans to mitigate it.
- Documented outages, identified their causes and provided short-term/long-term solutions for the problems that occurred.
- Developed and unit tested Extract, Transformation and Load programs using T-SQL.
- Coordinated with the data warehouse team and other data source providers to ensure up-to-date data, resulting in close to real-time analytical reports.
Environment: Siebel CRM, MS SQL Server, Windows