- 8 years of experience in Planning, Analysis, Design, Development, Testing, Implementation and Support for ETL Data Warehousing, Data Analytics and Business Intelligence domains in various industry verticals.
- Expertise in ETL Solution Designing, Data Analysis and Development in Banking & Financial Services Industry
- Techno-functional expert who understands various aspects of Back Office Accounting and Data Warehousing
- Sound understanding of tools required for performing Business Intelligence, Data Analytics and Machine Learning
- Worked across all phases of the SDLC, with exposure to Agile methodology
- Extensive experience implementing large-scale multi-dimensional data warehousing, analytics and reporting projects using IBM Infosphere DataStage, Python, Unix, Oracle and DB2 databases
- Worked as Data Warehousing and Analytics Consultant to guide on application integration and provide viable solutions to interfacing systems
- Expertise in Data Architecture, Modeling and ETL development
- Self-motivated and enthusiastic in Big Data and Data Science technologies with knowledge and hands on experience in R, Python, Hadoop, Impala, Hive and Sqoop
- Experience working with big data technologies to ingest, process and store structured and unstructured data at rest and in motion using Hadoop technologies
- Experience in analyzing large data sets and using quantitative and qualitative analysis to draw meaningful and valid insights
- Experience in Machine Learning using Microsoft Azure and Weka
- Expertise in Advanced Analytics and Statistical techniques using R and various Python data analysis libraries such as NumPy and Pandas
- Experience in data visualization using Microsoft Power BI, R and Matplotlib
- Extensively worked towards Data Quality Optimization
- Explored IBM DataStage Real Time Integration Architecture (RTIA) / SOA solutions featuring event-driven, message-based (MQ) and trigger-based infrastructure for the cost-effective integration of multiple applications
- Demonstrated strong creative expertise in designing and implementing various Automation and Runtime Reporting Utilities using application command line tools and web technologies to extract metadata and runtime information
- Experience in delivering data organizing and storage solutions in traditional and Big Data environments.
- Development of Pig Scripts and Hive Queries
- Performed successful migrations from various technologies to DataStage
- Expertise in Eagle and Confidential projects (DataStage 8.0 → 8.5, 8.5 → 9.1 → 11.3; Eagle 9x → 12x, 12x → 15x (Eagle Access 1.0))
- Experience in Dimensional Data Modeling, Star/Snowflake schema, FACT & dimensions tables
- Strong design experience to independently turn Physical Data Model into DataStage job design
- Expert level knowledge of relational databases (especially DB2 and Oracle PL/SQL programming) and worked on Flat, XML files
- Leveraged QualityStage and Information Analyzer to enrich data to meet business objectives and data quality management standards
- Expert level experience for performance troubleshooting and enhancement for large volumes of data
- Strong Unix shell scripting skill
- Strong implementation skills to complete implementation within tight project schedule
- Knowledge of statistical and predictive modeling techniques, such as machine learning, decision trees, probability networks, association rules, clustering, regression, classification and neural networks, and their application to business decisions
- Knowledge of and research on Machine Learning algorithms (clustering, linear/logistic regression, decision trees, predictive analysis/learning, sentiment analysis and text mining using R)
- Currently evaluating model performance and iterating on optimization
- Involved in effective effort estimation, forecasting and project sizing
- Coordinated the full team, allocating tasks among team members and holding frequent discussions to monitor project progress
- Experience working in an onshore-offshore business model and handling client deliveries for entire modules
- Strong interpersonal, written and verbal communication skills. Multilingual, highly organized, detail-oriented professional with strong technical skills.
Industries: Finance, Banking
Programming Skills: Python, R, Unix Shell Scripting, PL/SQL, XML, XSLT, HTML
DBMS Package: IBM DB2, Oracle
Operating Systems: Windows, Unix, Mac
Tools / Applications: DataStage 8.0, 8.1, 8.5, 11.3; DataStage Information Analyzer; IBM Infosphere Suite (Metadata Workbench, Business Glossary); QualityStage; IBM Infosphere Server Manager; Microsoft Azure; R Studio; Sqoop; Microsoft TFS; PVCS; Control-M; MQ 7.5; SFT; MS Visio; HP Quality Center
Data Visualization: Microsoft Power BI
Libraries/Packages: NumPy, Pandas, SciPy, Matplotlib, Scikit-learn
Technical Lead / Data Engineer
- Delivered organization, client profitability, trust asset reporting, planning and forecasting data marts with performance efficient design which ensured on-time delivery of financial reports to senior management.
- Implemented Business-As-Usual (BAU) ad-hoc data load process with fully automated execution mechanism using Unix shell scripts and DataStage dsjob utility.
- Reduced time spent on quality assurance and decreased defects by enhancing development procedures
- Standardized data storage and normalized table/entity relationships to a certain extent to eliminate data redundancy and manipulation anomalies, helping automation test data users analyze input test data easily with simple SQL queries
- Led the team effort to migrate IBM DataStage from version 8.5 to 11.3; this was the first team in the organization to migrate to DS 11.3
- Involved in designing and building centralized DWH and shared access layers for NTGI platform to ensure optimal performance.
- Identify, recommend and implement ETL processes and architecture improvements.
- Implemented project through SDLC process using the Agile methodology.
- Manage build phase and quality assure code to ensure fulfilling requirements and adhering to ETL architecture.
- Mentor team members on domain, methodologies and technical skills.
- Coordinated with business users for timely resolution of queries encountered during UAT
- Knowledge Transition to Production Support Team prior to release and implementation.
- Integrating DataStage with web services and MQs for processing cashflows and intra-day/EOD trades.
- Designed and customized data models for the data warehouse, supporting data from multiple sources in real time
- Design and Develop Generic Extraction and Upload Framework for NTGI-BB Upload Process.
- Development of Pig Scripts, Hive Queries, Common Security Based Resolution and Error Message Shared Containers.
- Responsible for Extracting the data from Bloomberg Back Office Files, Data cleansing, applying Business Rules, loading data into Data Warehouse, creating marts for reporting and downstream extracts.
- Involved in designing of LDM/PDM and Mapping documents.
- Conduct impact assessment and determine size of effort based on requirements.
- Developed ETL jobs by using IBM Datastage 11.3 to load the DataMart in batch processing.
- Developed job sequencer with proper job dependencies, job control stages, triggers, and used notification activity to send email alerts.
- Documented ETL test plans, test cases, test scripts, and validations based on design specifications for unit testing, system testing, functional testing, prepared test data for testing, error handling and analysis.
- Prepared control M job schedules.
- Managed production deployments of ETL code and Database objects.
- Actively participated in test case preparations and executions.
- Designed and implemented Modular and Reusable ETL processes and associated data structures for Data Staging, Data Balancing & Reconciliation, Exception Handling & Reject Recycling, Surrogate Key Generation & Assignment, Data Lineage Tracking, etc.
- Extensively used stages like XML input, FTP, Modify, Difference, Oracle enterprise, Change Capture, sort, merge, join, lookup, aggregator, pivot, surrogate key generator, funnel, filter, Universe etc.
- Actively participated in production planning, deployment and verification activities; analyzed data after the first run to verify data quality and integrity
- Participated in Daily Scrum Call, Sprint Planning, Product Backlog Refinement (PBR), Sprint review and Sprint retrospective meetings.
- Created an automated Generic Load Utility to extract, transform and load all 30 Mainframe files into tables
- Improved Data Quality and standardized definitions for metrics and data elements using common rules engine using DataStage routines
- Streamlined data processing for downstream applications and security using global security Model
- Spearheaded a development team of over 15 members distributed across onsite and offshore locations
- Designed and implemented multi-dimensional security for operational reports to ensure highly secured information governance using PL/SQL procedures
- Automated and streamlined execution notification processes for uninterrupted schedule flow using technologies like IBM InfoSphere DataStage Universe
- Documented all business requirements (BRD) for logical data model preparation for future references
- Extensively interacted with end-users and involved in requirement analysis and implementation
- End-to-end knowledge of system, involved in Analyzing Business Requirements
- Performance tuning of ETL code by incorporating proper partition mechanism and optimizing DB2 queries
- Developed DataStage jobs and scripts to handle balancing and controlling (count and financial validations) and resolve data-centric issues
- Used QualityStage to enrich data to meet business objectives and data quality management standards