- Over 20 years of IT experience as Architect/Modeler/Admin/Developer
- Hands-on experience in analysis, design, development, testing, and implementation of DB/ETL applications
- Solid experience with Big Data Technologies (Hadoop), Netezza Appliance (IBM PureData), Oracle
- Performed data mining on large structured and unstructured data sets; predictive modeling and data visualization
- Solid data integration experience (Informatica, Talend, and building customized ETL/ELT processes)
- Hands-on experience with data cleansing/quality, performance tuning and troubleshooting techniques
- Good knowledge of Data warehousing concepts and techniques (Star/Snowflake schema)
- Sound knowledge of scripting, OLAP/BI tools, application-level DBA tasks, and deployment processes
- Excellent analytical, communication, and teamwork skills
Big Data Technologies: Apache Hadoop distributions (Hortonworks) - Hive, Spark, Ambari, Hue, Pig, Sqoop, Storm, Falcon; S3, AWS (Amazon Web Services)
Analytics Appliance: Netezza Appliance (IBM PureData)
Databases: Oracle 11g/10gR2/9i, PL/SQL, MySQL, SQL Server 2005, T-SQL, MS-Access
ETLs: Informatica PowerCenter 9.0/8.6, Talend Big Data/DQ 6.2, Customized ETL Processes
BI Tools: WebFOCUS, BOXI, Oracle Discoverer, SQL Server Reporting Services, Crystal Reports
Data Modeling: ER/Studio Enterprise, ERWin Data Modeler, Visio, TOAD, Aginity
Database Utilities: Export/Import, SQL*Loader, Statspack, DBMS_XPLAN, TKPROF, Explain Plan
Others: SourceSafe, SVN, CVS, shell/Python scripting, C#, ADO.NET, ActiveX, SSIS, SAS, Office, Control-M, AutoSys, DCOM/MTS, ODBC, ClearQuest/ClearCase, VersionOne (Scrum)
Confidential, Long Island, NY
Data Modeler/Big Data Architect
- Served as a Big Data Engineer in all aspects of architecture, design, development, and capacity planning
- Participated in logical/physical database design (Star and Snowflake schemas) using an ERD tool
- Involved in database architecture reviews; created objects following standards and procedures
- Solid SQL skills (Netezza, MySQL, SQL Server, etc.); well versed in Python and Unix shell scripting
- Performed Data Mining with large data sets of Structured and Unstructured Data, Data Validation, Predictive Modeling, Data Visualization and Software Development.
- Secondary DBA for 2 Netezza Appliances and handled Users/Groups/Resource allocation/notification.
- Administered Backup, Groom, and Stats in a high-availability environment with no maintenance window
- Involved in maintaining jobs on 7 job servers (~200 jobs) at daily/hourly/15-minute frequencies (~900 iterations)
- Performed tuning/optimization and monitoring using pg_log/NzAdmin and worked with IBM to resolve issues
- Created and tested stored procedures for ETL/ELT processes; created views for other groups to access
- Created detailed-level design documents and mapping documents for the ETL process
- Developed and maintained ETL jobs for ODS and DW projects; performed admin activities using TAC
- Created ETL jobs to build Facts/Dimensions and Aggregated Facts; built Joblets/Sub-Jobs to accommodate processes
- Played a key role in the Big Data/Hadoop platform decision, working with IBM/Hortonworks/Cloudera
- Prepared comparison charts for Hadoop implementation (hardware/data nodes) and was involved in RFP preparation
- Participated in Hadoop POC preparation for the DOCSIS process and worked with vendors to set up the environment
Environment: Hadoop (Hortonworks) - Hive, Spark, Ambari, Sqoop, AWS, Netezza 7.2, Talend TIS 6.2, Oracle 11gR2, MySQL, ERwin Data Modeler, Aginity, PuTTY, shell/Python scripting, SVN, and UNIX/Linux
Confidential, NYC, NY.
- Collected requirements through gathering sessions and reviewed them with end users
- Designed/created logical/physical relational database models (Star and Snowflake schemas) using an ERD tool
- Participated in database architecture reviews and created designs following standards and procedures
- Created database objects, PL/SQL packages, and stored procedures; tested and debugged using Oracle 11g and TOAD
- Created detailed-level design documents and mapping documents for development purposes
- Performed application development and maintenance; design and architecture reviews
- Created Informatica mappings using transformations such as Filter, Aggregator, and Normalizer
- Created Mapplets/Sessions/Workflows and automated shell scripts to execute workflows, scheduled via AutoSys
- Performed tuning and optimization and the data validation process with the business community
- Reviewed the detailed design and construction of the project; documented findings and gaps
- Coordinated with onshore and offshore development and QA teams
Environment: Informatica 9.0.1, Oracle 11gR2, PL/SQL, TOAD for Oracle 10.6, ERwin Data Modeler, OLAP, SharePoint, VersionOne, CVS, TortoiseCVS, WinSCP/PuTTY, and Windows 7/UNIX
Confidential, NYC, NY
ETL/Data Integration Consultant
- Implemented ETL to integrate spend data for opportunities/awards/contracts/Grants using Talend/scripts.
- Designed/created dimensionally modeled databases using both Star and Snowflake schemas
- Developed PL/SQL packages and stored procedures; tested using Oracle 11g and TOAD
- Performed data cleansing/profiling/stewardship activities; loaded conformed dimensional data for facts and measures by extracting and transforming data, debugging and analyzing issues, and reconciling to validate
- Used common/advanced components like tMysqlOutputBulk/Exec, tOracleSCD
- Designed and implemented tracking, logging/error handling mechanism and email notifications
- Improved performance using parallel processing, partitioning (including database partitioning), and sorting
- Created deployments and scheduled jobs using shell scripts/cron; published data for various web apps/portals
- Maintained users/rules/repositories using the Admin Console; monitored jobs and set up notifications
Environment: Oracle 11g, Oracle AWM, PL/SQL, TOAD, MySQL 5.5, MySQL Workbench, Talend Open Studio 5.1/Integration Suite 5.1, ERwin Data Modeler, SVN (Subversion), ToolKit/PuTTY, Java, Windows/UNIX/Linux