Data Analyst Resume
SUMMARY
- Over 8 years of IT experience in the Banking, Insurance, Utility, and Retail domains.
- 2 years of recent experience with Hadoop tools, including Podium and Hue, and hands-on experience with Hadoop ecosystem components such as MapReduce, YARN, Hive, Pig, Sqoop, and Oozie.
- Certified Talend Data Integration Developer v7 and Informatica PowerCenter Data Integration 9.x Developer.
- 4 years of experience as a Data/Business Analyst and 4 years of experience as a Data Modeler.
- Expertise in project management principles, systems design, ETL process flow, development, SDLC, systems analysis, Agile development, bug fixing, documentation, implementation, maintenance, and integration of various data sources such as DB2, Oracle, MS SQL Server, fixed-width/delimited flat files, COBOL, and XML using Informatica PowerCenter 9.x/8.x/7.x on Windows and UNIX platforms.
- 2 years of recent experience designing Sqoop import jobs, designing Oozie workflows for Sqoop jobs, and deploying them to SIT using Jenkins.
- Experience using Git commands to commit code through the MobaXterm tool.
- Experience working with Solution Architects to determine project feasibility and obtain approval.
- Hands-on experience developing and maintaining HDFS files, and loading and transforming data from HDFS sources into Oracle Exadata targets (and vice versa) using Informatica BDE.
- 2 years of experience performing data integration, data profiling, and complex data parsing on Hadoop using Informatica Big Data Edition.
- 4+ years of professional experience in Production Application Support, with a practical understanding of the Finance/Investments and Utility businesses. Built large DWH solutions on Netezza, Oracle Exadata, and SQL Server.
- 2 years of experience building an Oracle data warehouse.
- 2 years of experience creating business and technical documents such as BRDs and SSRS.
- Proficient in the development of ETL (Extract, Transform, Load) processes, with a very good understanding of source-to-target data mapping and the ability to define and capture metadata and business rules. Expertise in error handling and debugging.
- Experienced in Star Schema and Snowflake data modeling, normalization, data profiling, data cleansing, logical and physical data models, and fact and dimension data modeling using ERWIN.
- Designed and developed mappings using diverse transformations such as Unconnected/Connected Static/Dynamic Lookup, Expression, Router, Rank, Joiner, Sorter, Aggregator, Stored Procedure, Normalizer, Transaction Control, SQL, XML, and Source Qualifier.
- Experienced in developing complex business rules through Informatica transformations, Workflows/Worklets, Mappings/Mapplets, Test Plans, Test Strategies and Test Cases for Data Warehousing projects.
- Sound knowledge of RDBMS concepts, with hands-on exposure to developing relational database environments using SQL, PL/SQL, stored procedures, and UNIX shell scripting.
- Performance tuning, with extensive experience in Imports/Exports and Oracle utilities such as UTL_FILE, SQL*Loader, and PL/SQL packages.
- Experienced in job scheduling using Autosys, crontab/Windows Scheduler, and the Informatica Scheduler on multiple platforms, including Windows NT/2000 and Linux. Coordination skills in handling on-site and offshore models.
- Created and modified jobs through the GUI (scheduler console) and created JIL script files for job scheduling.
- Self-motivated, dynamic team player with exceptional oral and written communication skills.
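As context for the Sqoop import work listed above, the following is a minimal sketch of a Sqoop import wrapper script; the JDBC URL, table name, credentials, and HDFS path are all hypothetical placeholders, not details from any actual project:

```shell
#!/bin/sh
# Minimal sketch of a Sqoop import wrapper of the kind described above.
# The JDBC URL, table name, credentials, and HDFS path are hypothetical.

JDBC_URL="jdbc:oracle:thin:@dbhost:1521/ORCL"   # hypothetical source DB
TABLE="CUSTOMER"                                # hypothetical table
TARGET_DIR="/data/lake/raw/customer"            # hypothetical landing zone

# Compose the sqoop import command as a string and echo it, so the
# wrapper can be dry-run on a machine without a Hadoop client.
CMD="sqoop import \
  --connect $JDBC_URL \
  --username etl_user --password-file /user/etl/.pwd \
  --table $TABLE \
  --target-dir $TARGET_DIR \
  --num-mappers 4 \
  --as-textfile"

echo "$CMD"
```

In practice the echo would be replaced by executing the composed command, and a wrapper like this would be invoked from an Oozie workflow action.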
TECHNICAL SKILLS
Languages: HQL, Pig, Python, SQL, PL/SQL, C, Java
Scripting: UNIX shell scripting, DOS Batch scripting
Databases: Hadoop, Netezza, Exadata, DB2, Oracle 11g/10g/9i, SQL Server 2012/2008
ETL Tools: Sqoop, Podium 3.0, Informatica PowerCenter v9.x/8.x/7.x, Informatica BDE, SSIS
BI Tools: Cognos Report/Query Studio, Framework Manager
Data Modeling Tools: Erwin Data Modeller, MS Visio
Other Tools: Jira, SharePoint, TIBCO, IDQ, Autosys, SQL*Plus, MS Word, MS Excel, SQL*Loader, Toad, SQL Developer, MobaXterm.
Operating Systems: AIX-UNIX, Solaris-Linux, Windows.
Methodologies: ER Modeling, Dimensional Modeling, Star Schema, Snowflake Schema, Fact and Dimension Tables, Physical and Logical Data Modeling, Scrum, Agile, Waterfall.
PROFESSIONAL EXPERIENCE
Confidential
Data Analyst
Responsibilities:
- Gathering business requirements from Lines of Business, such as malcodes, SLAs, business logic, file formats, and database connection strings.
- Working with Solution Architects and BAs to analyze data and discuss technology feasibility prior to documenting the BRD and SSRS; performing data analytics on (capital markets) loans and cash flows.
- Scheduling meetings between the internal team and the LoB team to resolve issues regarding scope, format, and technology choices.
- Conducting ad-hoc meetings with stakeholders to discuss data gaps (for example, null values appearing in downstream data), finding their root causes, and addressing them promptly.
- Writing the SSRS document based on the requirements and obtaining sign-offs.
- Designing jobs for ingestion into the Enterprise Data Lake from various sources using the Podium ETL tool.
- Extracting data using HQL and Pig and exporting it to the target databases.
- Designing Sqoop scripts for ingestion along with the related Podium jobs.
- Writing Python scripts to sequence the jobs appropriately.
- Using Oozie to schedule the ETL workflows as per the sequence.
- Deploying the code to SIT using Jenkins tool.
- Extensive use of Git commands to commit code through the MobaXterm tool.
- Using Confluence to document technical information.
- Analyzing the logic behind each process mapping, examining the data feeds produced by that logic, and creating business documents such as the SSRS based on BRDs.
- Creating extracts based on business logic and exporting files to target systems via Sqoop export to the target databases.
- Creating extracts using the Big Data Podium Prepare extraction methodology for projects such as Branch Score Card.
- Supporting the SIT team during the SIT phase of the project with defects raised via Jira.
- Supporting the Production team during the PAT phase of the project and handling PCRs effectively.
- Supporting the Project Manager on the technical aspects of the project and helping to set realistic timelines.
- Worked on projects requiring data modeling based on business requirements, putting referential integrity in place (primary key/foreign key relationships) and establishing relations between fact and dimension tables using the ERWIN tool.
- Good working knowledge of the Talend tool (Talend was introduced to InfoEx at that time).
- Worked on high-profile projects including Branch Score Card (Extraction), MBNA North, Enterprise Fraud Platform, IEM, MDM Gold and Silver, Phone Channel (Sqoop), and TDI (Sqoop).
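The HQL extraction step mentioned above can be sketched as a small shell wrapper around beeline; the HiveServer2 URL, query, table, and output directory below are hypothetical placeholders:

```shell
#!/bin/sh
# Sketch of an HQL extract step such as the one described above.
# The HiveServer2 URL, query, table, and output directory are hypothetical.

BEELINE_URL="jdbc:hive2://hiveserver:10000/default"  # hypothetical server
OUT_DIR="/data/lake/extracts/branch_scorecard"       # hypothetical export dir

HQL="INSERT OVERWRITE DIRECTORY '$OUT_DIR'
SELECT branch_id, score, load_dt
FROM curated.branch_scorecard
WHERE load_dt = CURRENT_DATE;"

# Echo the command instead of executing it, so the script can be
# reviewed on a machine without a Hive client installed.
echo beeline -u "$BEELINE_URL" -e "$HQL"
```

A step like this would typically run as one action in the Oozie-scheduled sequence described in the bullets above.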
Environment: Hadoop 2.0.3 alpha, Podium 3.0, Hue, DB2, Oracle 11g, SQL Server 2008, flat files/XML, EBCDIC files, SQL, Sqoop, Oozie, HQL, Pig, Python 2.6.6, MS Word, MS Excel, Confluence, MobaXterm 10.5, Talend v6, Erwin, MS Access.
Confidential
Data Consultant
Responsibilities:
- Create/update ETL Informatica jobs to load data from external systems into the data warehouse; this process runs daily. Create warehouse tables and extend warehouse columns to accommodate new source system changes. Built large data warehouse solutions on Oracle Exadata and SQL Server.
- Validate data in the warehouse against the source systems, following data warehouse best practices when supporting jobs in maintenance. Investigate and reconcile the commercial customer payment/bidding system.
- Create new JIL files to include the new Informatica jobs and build the execution plan.
- Use of ETL processes to accommodate changes in source systems and new business user requirements.
- Designed and implemented SCD Type 1 and Type 2 (date-range) methods for tracking current history and full history of transactions, respectively.
- Designed and implemented file lists for dynamically creating files, with the file name supplied by a port value.
- Resolved problems by reviewing design changes made to production systems and making corresponding changes to the Repository.
- Programming experience in UNIX shell, developing scripts to run Informatica workflows and micro-jobs for near-real-time data.
- Designed and developed mappings using diverse transformations like Unconnected/Connected Static/Dynamic Lookup, Expression, Router, Rank, Joiner, Sorter, Aggregator, Stored Procedure, Normalizer, Transaction control, SQL, XML, Source Qualifier transformations.
- Performance tuning, including identifying bottlenecks at the source, transformation, and target levels, pinpointing issues wherever complex transformations such as Aggregator are used in a mapping, and making the necessary changes to improve performance.
- Support infrastructure planned maintenance and outages. Refresh the Dev/SIT/UAT Informatica environments on a regular basis as per project needs. Provide production support for ETL applications on a rotation basis.
- Extracted HDFS files from Hadoop, performed data profiling, data quality checks, and complex data parsing using the Informatica Big Data Edition tool, and loaded the results into the target.
- Participated in defining standards and best practices for administration. Created and maintained relational and application connections, as well as existing Repository Manager, Reporting Service, and ODBC connections.
- Assisted the team in creating SQL Server Integration Services (SSIS) packages for initial loads.
- Worked with shell scripts and pmcmd to interact with the Informatica server from the command line.
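Command-line interaction with the Informatica server via pmcmd, as mentioned above, can be sketched as follows; the service, domain, folder, and workflow names are hypothetical placeholders:

```shell
#!/bin/sh
# Sketch of a pmcmd wrapper for starting an Informatica workflow.
# The service, domain, folder, and workflow names are hypothetical.

SERVICE="IS_PROD"          # hypothetical Integration Service
DOMAIN="Domain_ETL"        # hypothetical domain
FOLDER="DW_LOADS"          # hypothetical repository folder
WORKFLOW="wf_daily_load"   # hypothetical workflow

# -uv/-pv read the user and password from environment variables so
# credentials are not hard-coded in the script.
CMD="pmcmd startworkflow -sv $SERVICE -d $DOMAIN -uv PM_USER -pv PM_PASS -f $FOLDER -wait $WORKFLOW"

# Echo so the wrapper can be dry-run without an Informatica client;
# run the composed command directly on a host where pmcmd is installed.
echo "$CMD"
```

The `-wait` flag makes the call block until the workflow finishes, which is what lets a scheduler treat the workflow's exit status as the job's success or failure.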
Environment: Informatica PowerCenter 9.6/9.5, Informatica Big Data Edition, IDQ, DB2, Oracle 11g Exadata DW,SQL Server 2008, flat files/XML, hdfs files, Erwin R9, SQL, PL/SQL, SQL*PLUS, Shell Scripting, Autosys, Unix-AIX, Cognos 10.1, SQL Developer, SQL*Loader, UTL File, TOAD.
Confidential
ETL Developer
Responsibilities:
- Worked with stakeholders and Business Analysts on business requirements, functional specifications, and enhancements; based on the business needs, created technical design and functional specification documents.
- Developed and assisted the data modeler in designing logical and physical data models that capture current-state/future-state data elements, data flows, and workflows using Erwin and MS Visio.
- Developed complex data mappings in Informatica using PowerCenter transformations (Lookup, Joiner, Rank, Source Qualifier, Sorter, Aggregator, Filter, Router, Expression, Union, Normalizer, and Sequence Generator), mapping parameters, mapping variables, mapplets, and parameter files.
- Scheduled the Workflows to run on a daily and weekly basis using Autosys Scheduling tool.
- Maintained workflow logs, target databases, and code backups, and performed operational support and maintenance of ETL bug fixes and defects.
- Involved in assigning tickets to the Informatica production support team based on the error, and worked on resolving the issues pertaining to each ticket.
- Supported migration of code between testing and production environments and maintained code backups.
- Performed tuning of Informatica mappings using dynamic lookup caches, round-robin partitioning, and PL/SQL scripts.
- Actively coordinated with the testing team in writing and executing test cases, and helped the team understand the dependency chain of the whole project.
- Provided Knowledge Transfer and created extensive documentation on the design, development, and implementation of daily loads and workflows for the designed mappings.
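The Autosys scheduling mentioned above is typically driven by a JIL definition; the following is a minimal sketch in which the job name, machine, script path, times, and log paths are all hypothetical placeholders:

```
/* Hypothetical Autosys JIL definition for a daily ETL workflow job */
insert_job: dw_daily_load   job_type: cmd
machine: etl_host01
command: /apps/etl/scripts/run_wf_daily_load.sh
days_of_week: mo,tu,we,th,fr
start_times: "02:00"
std_out_file: /apps/etl/logs/dw_daily_load.out
std_err_file: /apps/etl/logs/dw_daily_load.err
alarm_if_fail: 1
```

Loading such a file through the `jil` utility (or the scheduler console) registers the job, after which `alarm_if_fail` ensures failures raise an alarm for the support rotation.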
Environment: Informatica PowerCenter 8.6, Netezza, SQL Server 2008, flat files/XML, Erwin R7, SQL, PL/SQL, SQL*PLUS, Shell Scripting, Autosys, Unix System, MicroStrategy.
Confidential
Database Analyst
Responsibilities:
- Assisted the Business Analyst on the project during the Requirements Elicitation process to develop a detailed Software Requirement Specification (SRS).
- Worked with the joint application development teams to maintain the updates and produce the Technical Design Documents (TDD).
- Involved in communicating business needs to all stakeholders regarding new products and initiatives.
- Involved in designing logical and physical databases using ERWIN 4.x, with Star Schema dimensional modeling of facts and dimensions.
- Reverse-engineered the physical model with Erwin to generate the logical model for the as-is state of the project documentation. Performed unit testing and documented the results.
- Created complex queries for ad-hoc reporting, including in the OLAP environment. Designed and developed simple and complex logical and materialized views to simplify report requirements and support data governance.
- Developed PL/SQL Stored Procedures, Functions and UNIX scripts to implement business logic.
- Provided valuable contribution in all phases of SDLC including production support.
- Underwent corporate training on the Informatica tool set.
- Programming experience in UNIX shell, developing scripts to run Informatica workflows.
Environment: Informatica Power Center 7.1, Oracle 10g/9i, Toad, Windows NT, SQL*Plus, PL/SQL, Flat files, MS SQL Server.