- Over 13 years of IT experience in architecting, designing, developing, and delivering high-performance enterprise software and applications.
- Expert in data modeling on Apache Cassandra; experienced with DataStax Enterprise Edition.
- Hands-on experience with high-performance distributed systems and Big Data technologies such as Hadoop, Kafka, Apache Spark, and NoSQL stores (Cassandra, HBase).
- Expertise in cloud, on-premise, and hybrid deployments of Big Data solutions.
- Specialization in data modeling for traditional data warehousing: Star/Snowflake schema design, data marts, relational and dimensional data modeling, fact and dimension tables.
- Experience manipulating and analyzing large datasets and finding patterns and insights within structured and unstructured data.
- Hands-on experience installing, configuring, and monitoring Hadoop clusters (on-premise and AWS).
- Partnered with BI teams, data integration developers, analysts, and DBAs to deliver a well-designed, scalable information management ecosystem.
- Experience working with ITIL and TOGAF frameworks and with Agile/Scrum/DevOps product development teams.
- Team player with excellent communication and problem-solving skills.
Distributed Computing/Big Data: Pig, Hive, HDFS, MapReduce, Sqoop, Storm, Spark, Kafka, Yarn, Oozie, Zookeeper
Languages: Python, R, Scala, Java, JSON, SQL, PL/SQL, Pig Latin, HiveQL, CQL
RDBMS: Oracle, MS SQL Server, MySQL
NoSQL: DataStax Distribution of Apache Cassandra, Amazon DynamoDB
Cloud: Amazon AWS
Business Intelligence Tools: Tableau, Business Objects XI, OBIEE
ETL Tools: Informatica Power Center, Informatica Data Quality, Talend Big Data Edition
Reporting Tools: Tableau, Business Objects XI, Cognos Impromptu Administrator, Cognos Upfront
Scheduling Tools: Automic, Autosys, Tidal
Confidential, Houston, TX
BI Solution Architect
- Worked with senior leadership to define the scope of the Big Data platform and identified and selected the initial use cases that would drive the Big Data initiative.
- Architected Big Data solutions while working directly with business partners; led the team hands-on through the implementation phase.
- Implemented the solution platform using HDP (Hortonworks Data Platform), Kafka, Cassandra, Spark Streaming, and Tableau, deployed both in the on-premise data center and on AWS.
- Installed and managed a multi-node Cassandra cluster; worked on sizing, benchmarking, system tuning, and cassandra-stress runs to test the throughput of our model.
- Troubleshot and performance-tuned the Cassandra platform.
- Collaborated with the Technology Group and KBR plant personnel to build a time-series database storing events from sensor data generated by KBR-designed ammonia and fertilizer plants; deployed a front-end dashboard showing real-time alerts, allowing the KBR team to provide managed services.
- Developed a Kafka-based system for pub-sub data delivery.
- Modeled a Cassandra keyspace for time-series data.
- Implemented a Spark Streaming solution to capture, validate, and process data and store it in Cassandra.
- Provided a near-real-time analytics system for IPMS (Integrated Project Management System) that integrated data from 200+ Oracle databases into Hive.
- Worked with the Oracle GoldenGate Big Data adapter for Hive to ingest data from Oracle into HDFS; loaded the data into HBase and created external tables in Hive for use by business analysts.
- Performed cleansing and ETL and migrated data to Amazon Redshift.
- Used Sqoop to import metadata from Oracle and perform the initial data load into Hive; created Hive tables and loaded and analyzed data using Hive queries.
- Developed Hive queries to process the data and generate data cubes for visualization.
- Designed and modeled databases (RDBMS, NoSQL, Star and Snowflake schemas) to accommodate data solutions for operational and Business Intelligence reporting, and provided data sets to the data scientists.
- Designed and implemented database migration from on-premise relational databases to the cloud.
- Created demand management plans and a responsibility matrix (RACI), estimated efforts, and scheduled resources.
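The time-bucketing at the heart of the Cassandra time-series model above can be sketched in a few lines. This is a minimal illustration under assumed conventions, not the production schema; the table name (`sensor_events`) and sensor identifier are hypothetical.

```python
from datetime import datetime, timezone

def partition_key(sensor_id: str, ts: datetime, bucket: str = "day"):
    """Compute a composite partition key for a time-series table.

    Folding a time bucket into the partition key (sensor_id, bucket)
    keeps Cassandra partitions bounded in size; rows within a partition
    are then clustered by event timestamp.
    """
    if bucket == "day":
        return (sensor_id, ts.strftime("%Y-%m-%d"))
    if bucket == "hour":
        return (sensor_id, ts.strftime("%Y-%m-%d %H:00"))
    raise ValueError(f"unsupported bucket: {bucket}")

# A hypothetical CQL schema this key targets:
# CREATE TABLE sensor_events (
#     sensor_id text, bucket text, event_time timestamp, value double,
#     PRIMARY KEY ((sensor_id, bucket), event_time)
# ) WITH CLUSTERING ORDER BY (event_time DESC);

ts = datetime(2016, 3, 14, 9, 26, 53, tzinfo=timezone.utc)
print(partition_key("plant7/nh3-temp", ts))  # ('plant7/nh3-temp', '2016-03-14')
```

With daily buckets, a sensor emitting one reading per second stays under ~90k rows per partition, well inside comfortable partition sizes.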
Environment: Hadoop YARN, Spark Core, Spark Streaming, Spark SQL, Scala, Python, Kafka, Hive, Sqoop, Amazon AWS EC2, S3, Cassandra, Tableau, Oracle, Linux, Informatica Power Center, Automic Scheduler

Confidential, Miami, FL
- Led the Data Integration and Enterprise Data Warehouse teams.
- Responsible for migrating legacy and historical data from AS400 to Oracle and the existing data warehouse.
- Conceptualized and architected the data models for the enterprise data warehouse.
- Performed Informatica installation, upgrades, and administration.
- Created mappings using Informatica Power Center 9.1 and transferred data to and from multiple source/target systems: RDBMS, Salesforce, web services, LDAP, flat files (text and CSV), XML, and JSON.
- Built data profiling and data quality mapplets using Informatica Data Quality.
- Performance-tuned existing Informatica mappings and PL/SQL scripts.
- Created and managed schedules using UC4 (Automic) Administration.
- Participated in formulating estimates and timelines for project activities and tasks.
- Analyzed, designed and prepared technical specifications.
- Acted as a mentor and provided Knowledge transfer to the developers.
- Conducted code reviews and assisted the users in the User acceptance testing.
- Assisted in round-the-clock support, reported incidents, and followed up on issue/incident resolution.
- Provided weekly status reports to the project Managers.
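The column-profiling step mentioned above can be illustrated with a minimal Python analog of what a data-quality tool computes per column (null and distinct counts); the sample rows and column names are invented for illustration.

```python
def profile(rows, columns):
    """Compute simple per-column profile statistics (null count and
    distinct count), analogous to column profiling in a DQ tool."""
    stats = {c: {"nulls": 0, "values": set()} for c in columns}
    for row in rows:
        for c in columns:
            v = row.get(c)
            if v in (None, ""):          # treat missing/empty as null
                stats[c]["nulls"] += 1
            else:
                stats[c]["values"].add(v)
    return {c: {"nulls": s["nulls"], "distinct": len(s["values"])}
            for c, s in stats.items()}

rows = [
    {"id": 1, "email": "a@x.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "a@x.com"},
]
print(profile(rows, ["id", "email"]))
# {'id': {'nulls': 0, 'distinct': 3}, 'email': {'nulls': 1, 'distinct': 1}}
```

Profiles like this drive the downstream cleansing rules: columns with unexpected null ratios or low distinct counts are the first candidates for standardization mapplets.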
Environment: Oracle, AS400, Informatica Power Center, Informatica Power Exchange, Informatica Data Quality, Business Objects XI

Confidential, Houston, TX
Lead ETL Developer
- Analyzed and studied the data and their relationships for the two areas - PPM and CBA as they had to be consolidated into a single Universe.
- Worked extensively with the Data Architect in designing the data model that would combine and consolidate the two application databases, CBA and PPM, and provide reporting solutions.
- Created logical mappings for individual data fields from the Source to the new warehouse as well as the mapping for the old reporting fields to the new database fields.
- Worked closely with the users to identify which data should be tracked as Type-2 (full history) and which could be maintained as Type-1 (overwrite in place).
- Created Informatica mappings using Informatica Designer to load data from source to the data mart in three phases; the complete project comprised more than 100 mappings.
- Developed Oracle packages with procedures and functions to supplement the data load, ranging from complex data massaging to generic utility functions such as deleting data sets, truncating tables, and maintaining audit trails.
- Created Source and Target Connections, Informatica Sessions, Worklets and Workflows using Informatica Workflow Manager.
- Developed on-demand UNIX scripts to automate the load process.
- Created Schedules in Tidal by setting the correct dependencies and ensuring that most independent jobs run in parallel to minimize load time.
- Worked closely with the BO developer in designing the Universe using Business Objects Designer and provided necessary inputs for creating the standard reports.
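The Type-1/Type-2 split above can be sketched as follows. This is a simplified illustration of the versioning logic, not the actual Update Strategy mappings; the column names and dates are hypothetical.

```python
from datetime import date

def apply_scd(dim, key, changes, scd2_cols, today):
    """Apply one source record to a dimension table (list of dict rows).

    A change to any Type-2 column expires the current row and inserts a
    new version; all other (Type-1) changes overwrite in place.
    """
    current = next(r for r in dim if r[key] == changes[key] and r["is_current"])
    if any(changes[c] != current[c] for c in scd2_cols if c in changes):
        current["is_current"] = False
        current["end_date"] = today
        dim.append({**current, **changes, "is_current": True,
                    "start_date": today, "end_date": None})
    else:
        current.update(changes)
    return dim

dim = [{"cust_id": 1, "address": "12 Elm St", "phone": "555-0100",
        "is_current": True, "start_date": date(2015, 1, 1), "end_date": None}]

apply_scd(dim, "cust_id", {"cust_id": 1, "address": "9 Oak Ave"},
          scd2_cols=["address"], today=date(2016, 6, 1))  # Type-2: new version
apply_scd(dim, "cust_id", {"cust_id": 1, "phone": "555-0199"},
          scd2_cols=["address"], today=date(2016, 7, 1))  # Type-1: overwrite
```

After both updates the dimension holds two rows: the expired Elm St version and a current Oak Ave version carrying the new phone number.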
Environment: Informatica Power Center 8.6/9.0/9.5, HP-UX, Oracle 11g, SQL, PL/SQL, Toad, Business Objects XI, Tidal

Confidential, Raleigh, NC
Lead Database Developer
- Worked with Confidential and Itron to execute the high-volume benchmark of Itron's data management solution on Oracle 9i, Microsoft SQL Server 2005, and SQL Server 2008, culminating in independent verification of the product's scalability.
- The test comprised 4.5 million meters with half-hourly data, a volume equivalent to hourly data for 9 million meters. Over 200 million intervals of data were imported in each test. The goal was to ensure IEE MDM performance not only meets the current needs of large utility companies, but can also scale as Itron customers deploy increasingly sophisticated large-scale advanced metering infrastructure (AMI) solutions.
- Extensively analyzed the high-volume reading adapters that were to be benchmarked and provided inputs for performance tuning.
- Created the necessary Test Scenarios required for the Benchmark.
- Developed Oracle PL/SQL scripts to simulate the test data load that was used for the Benchmarking.
- Identified bottlenecks during Test Execution and tuned the time-consuming queries.
- Executed Tests and monitored Performance of the hardware using Perfmon and the database using OEM.
- Documented the benchmark results that were published.
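The volume equivalence stated above (4.5 million half-hourly meters ≈ 9 million hourly meters) can be checked with simple arithmetic:

```python
def daily_intervals(meters: int, minutes_per_interval: int) -> int:
    """Total interval readings generated per day for a meter population."""
    per_meter_per_day = 24 * 60 // minutes_per_interval
    return meters * per_meter_per_day

half_hourly = daily_intervals(4_500_000, 30)  # 48 intervals/meter/day
hourly      = daily_intervals(9_000_000, 60)  # 24 intervals/meter/day
print(half_hourly, hourly)  # 216000000 216000000
```

Both configurations generate 216 million intervals per day, consistent with the "over 200 million intervals imported in each test" figure.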
Environment: HP Integrity Database Server, HP BladeSystem with Confidential Windows Server 2003, Oracle 10g R2, SQL Server 2008, OEM, SQL Loader, PL/SQL Developer, TOAD, Perfmon, Microsoft Test Suite, HP-UX 11 v2, SQL, PL/SQL, HTML, JavaScript

Confidential, Cupertino, CA
Sr Informatica Developer/Team Lead
- Worked closely with the Business analyst to understand the various source data
- Used Informatica Power Center 7.x for migrating data from various OLTP databases to the data mart
- Worked with different sources like Oracle, flat files, XML files
- Extracted data from the Sales department to flat files and loaded the data into the target database.
- Extracted data from fixed-width and delimited flat files, transformed it according to the business requirements, and loaded it into the Oracle database.
- Created various Informatica mappings and mapplets to load the data mart. The mappings involved extensive use of transformations like Aggregator, Filter, Router, Expression, Joiner, Sequence generator
- Configured the mappings to handle the updates to preserve the existing records using Update Strategy Transformation (SCD Type-2)
- Performed performance tuning of targets, sources, mappings, and sessions, and implemented pipeline partitioning.
- Used Debugger to identify the errors in the mappings and fix them
- Performed unit testing to validate mappings and populate the database
- Used TOAD and SQL Plus to write queries and interact with Oracle database
- Implemented several different tasks (Session, Assignment, Command, Decision, Timer, Email, Event-Raise, Event-Wait, Control) in the workflow
- Involved in writing UNIX scripts and used them to automate the scheduling process
- Project Management and Tracking.
- Maintenance and Support.
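The workflow task wiring described above amounts to a dependency graph executed in topological order; a small sketch with hypothetical session and command task names:

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to its upstream dependencies.
deps = {
    "s_load_stage": set(),
    "s_load_dims":  {"s_load_stage"},
    "s_load_facts": {"s_load_dims"},
    "cmd_archive":  {"s_load_stage"},
    "email_notify": {"s_load_facts", "cmd_archive"},
}

# static_order() yields one valid execution order; tasks whose
# dependencies are all satisfied could also run as parallel batches.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

The same principle governs scheduler setups: letting independent branches (here `s_load_dims` and `cmd_archive`) run in parallel minimizes total load time.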
Environment: Informatica Power Center 7.x (Informatica Server, Repository Server, Repository Manager, Designer, Workflow Manager and Workflow Monitor), Oracle 9i, TOAD, UNIX, PL/SQL, SQL Loader, Windows 2000/XP, AIX 5.3, Mac OS X

Confidential, Norwood, MA
- Involved in the design, development and implementation of the Enterprise Data Warehouse (EDW)
- Worked on ETL Migration process from legacy scripts to Informatica Power Center
- Involved in the creation of logical and physical designs using ERwin data modeler to develop Database schemas in dimensional modeling
- Worked on Informatica Power Center Tools-Source analyzer, Warehouse Designer, Mapping Designer and Transformation developer
- Used various Informatica transformations to recreate data in the data warehouse
- Involved in Data cleansing before loading it on Oracle data warehouse
- Designed and developed complex Aggregate, Join, Lookup transformation rules (business rules) to generate consolidated (fact/summary) data identified by dimensions using Informatica ETL tool
- Used Update Strategy and Lookup transformation for insert, delete, update or reject the records based on business requirements
- Worked on change data capture for slowly changing dimensions (SCD-2) to update a target table to keep full history
- Extensively used Mapplets and Reusable Transformations to prevent redundancy of transformation usage and maintainability
- Used the Repository manager to give permissions to users, created new users and repositories
- Set up automated scheduling of sessions at regular intervals using the Informatica Scheduler
- Identified bottlenecks in sources, targets, and mappings to improve performance
- Extensively used debugger to troubleshoot logical errors
- Used workflow manager for session management, database connection management and scheduling of jobs
- Monitored workflows and session using Power Center workflows monitor
- Worked with TOAD to write SQL queries and generate the results
- Created ad hoc reports using Cognos Impromptu Administrator.
- Designed Physical Dimension Maps
- Built cubes using Cognos Transformer.
- Project Management and Tracking.
- Worked on ad hoc client requests.
- Technical Support.
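The fact/summary aggregation described above (rolling transactional rows up by dimension keys, as an Aggregator transformation does) can be sketched briefly; the dimension and measure names are invented for illustration.

```python
from collections import defaultdict

def build_fact(rows, dims, measure):
    """Roll transactional rows up into summary fact rows keyed by the
    given dimension columns, summing the measure per key."""
    totals = defaultdict(float)
    for r in rows:
        totals[tuple(r[d] for d in dims)] += r[measure]
    return [{**dict(zip(dims, key)), measure: total}
            for key, total in sorted(totals.items())]

sales = [
    {"region": "East", "year": 2003, "amount": 100.0},
    {"region": "East", "year": 2003, "amount": 50.0},
    {"region": "West", "year": 2003, "amount": 75.0},
]
fact = build_fact(sales, ["region", "year"], "amount")
print(fact)
```

In the warehouse this collapses detail rows into one summary row per dimension combination, which is what the consolidated fact tables store.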
Environment: Informatica Power Center 7.1 (Informatica Server, Repository Server, Repository Manager, Designer, Workflow Manager and Workflow Monitor), Windows XP, UNIX, Solaris 9, Oracle 9i, MS SQL Server 2000, XML, flat files, ERwin, TOAD, Cognos Impromptu Administrator 7.3, Cognos Server Administrator 7.3, SQL, PL/SQL, HTML, JavaScript