- Over 8 years of professional experience in the design, development and improvement of Data Engineering, ETL frameworks and Business Intelligence products.
- Expertise in database technology, with a solid understanding of and hands-on skills in Oracle PL/SQL, MSBI, Unix and Linux, Datastage PX 8.0 / 8.5, SSIS, Informatica, Microsoft SSRS, Dundas data visualization tool, MicroStrategy, Oracle 11G, MS SQL Server 2008, Teradata 13.0.
- Working experience with modern NoSQL and big data stores such as MongoDB and Hadoop, using programming languages such as Python.
- Exposure to Big Data ecosystems and Data Lake infrastructures such as Hadoop, MapReduce, HDFS, Pig and Hive.
- Experience in the design and development of distributed computing systems and in scaling infrastructure to handle larger datasets.
- Solid experience in building business intelligence solutions (data warehouse, ETL, Reports) and ability to understand instrumentation and devise technical solutions.
- Heavily involved in the performance tuning of long-running Datastage jobs in production.
- Shifted the heavy lifting of data processing from the ETL tool to database-driven code. Also involved in designing and implementing a modern, timestamp-driven integration model.
- Developed an automated framework for executing test cases that validate data loads and ETL mappings, in support of test-driven development.
- Proficient in writing stored procedures, materialized views, triggers, views and dynamic programming.
- Hands-on experience in multidimensional data modeling, with extensive experience designing slowly changing dimensions and star and snowflake schemas.
Operating Systems: Unix, Linux (Ubuntu and Red Hat), Windows 2000, XP, 7, Mac OS
BI Tools: OBIEE, Microsoft SSRS, Dundas data visualization tool, MicroStrategy.
ETL Tools: Datastage 11.2, Informatica 8.0, Microsoft SSIS.
Database (SQL and No-SQL): MongoDB, Hadoop, IBM DB2, Oracle 11.0, MS SQL Server, Teradata, MySQL.
Internet Programming: REST API, HTML, XML, CSS, AJAX, Web Services (SOAP/REST)
Software Tools: IBM Infosphere, SQL Developer, SQL Assistant, WinSQL, WinSCP, Pycharm, MongoChef.
Oracle 9i: PL/SQL Developer
IBM Certified Solution Developer V8.5
Confidential, Phoenix, AZ
Senior Data Engineer
- Designed Datastage parallel jobs to read data from DB2 systems and load it into Teradata 14.10, configuring the connector so that sessions did not exceed the maximum limit on the Teradata server.
- Developed backend jobs (using Python, MongoDB and the Hadoop ecosystem) to generate a daily report showing the marketing manager sales trends at both the department and product levels.
- Transformed and moved many legacy Datastage jobs to the Hadoop ecosystem, which improved existing performance by 40 percent.
- Developed a Python application to perform CRUD operations on pricing and promotion offers so that this data can be consumed by third-party applications for real-time analytics.
- Processed sales files from the POS systems using Python in the Hadoop ecosystem.
- Wrote BTEQ scripts to load the data into Teradata.
- Designed shell scripts to execute the Datastage Sequencer jobs from the command line and scheduled them to run through the ESP scheduler.
- Involved in writing the stored procedure at the heart of the application, which streamlined the entire process of conditional checking and sending out logical notifications.
- Involved in writing shell scripts that emailed developers about failed jobs and also sent text messages to their cellphones, depending on their service providers.
- Created a generic framework in Oracle PL/SQL to control the loading of data fed to a set of materialized views across multiple dashboards.
Environment: Python, MongoDB, Hadoop, Pig, Oracle 11G, Datastage 11.2, Teradata 14.10, Unix, Linux, JSON, XML, GitHub, Pycharm, MongoChef
- Developed data layer to define campaign and strategy for Terminal One app.
- Designed Datastage parallel jobs for processing the XML messages received from the clients.
- Involved in analyzing the data generated by the business process, defining the granularity and source-to-target mapping of the data elements, and creating indexes and aggregate tables for data warehouse design and development.
- Versed in the incorporation of various data sources such as Oracle, MS SQL Server, XML and flat files into the staging area.
- Knowledge of mapping server/parallel Jobs in DataStage to populate tables in Data warehouse and Data marts.
- Worked directly with database administrators in implementing Teradata protection features for effective table design and index selection, and in table implementation, maintenance and backup; problem support; workload monitoring and control; policies, procedures and guidelines that govern the Teradata environment; SQL code review; developer and user support and training; capacity planning; system software testing and benchmarking; and support and coordination during hardware upgrades.
- Involved in installing the Datastage 9.1.2 client edition and its various packs.
- Interacted with application developers in creating SQL queries for the Functional module.
- Converted MS Excel worksheets into an MS SQL Server 2000 database: created the database and tables/views, set up relationships among tables, and wrote stored procedures and triggers.
Environment: Datastage, Teradata, SSRS, MSTR, Dundas Data Visualization Tool, MS SQL Server 2008, SVN, Python.