- My career can be characterized as a combination of broad and deep experience covering the full spectrum of Information Technology.
- I have always been attracted to the inherent novelty of technology, which makes the process of learning new things enjoyable.
- I started working with Big Data technology in 2012.
- I follow best practices and proven methodologies for Infrastructure as Code, Continuous Integration, Continuous Deployment, Big Data, and Business Intelligence.
AWS Architecture: Virtual Private Cloud, Availability Zones, EC2 instances, Subnets, Routing Tables, Network Access Control List (NACL), NAT and NAT Gateway, Route 53 DNS, Elastic Beanstalk, Security Groups, Users, Groups, Roles, Policies, Auto Scaling, Elastic Load Balancer (ELB), CloudFront, CloudWatch, Simple Storage Service (S3), Elastic Block Store (EBS), Glacier, SNS, SQS, Lambda, Containers.
AWS DevOps: AWS CLI, Terraform, CloudFormation, JSON, YAML, Boto3, Chef and Hosted Chef, Docker, Vagrant, Packer, GitHub, Jenkins, Node.js, OpsWorks, CodeDeploy, CodePipeline, CodeCommit, CodeBuild, Blue/Green deployment and A/B testing.
Big Data: Cloudera Hadoop, Cassandra Cluster, MongoDB, AWS Elastic MapReduce (EMR), HDFS, Hue, Cloudera Manager, Hive, Pig, Impala, Sqoop, Flume, Ambari, HBase, Storm, Kafka, Oozie, Mesos, YARN, Kibana, Alooma, Spark, Node.js.
Spark Tools: Jupyter notebook, Databricks, Zeppelin notebook.
Packaging Tools: npm, pip, pip3, Maven, Chocolatey.
Linux flavors: Red Hat, Ubuntu, CentOS.
Serverless Technology: AWS Elastic Beanstalk, API Gateway, Lambda.
Real-time Streaming Analytics: Spark, Kafka, Kinesis, Flink, Storm, ElasticSearch, Kibana, Splunk.
Operating Systems: Unix, Linux, Windows, VMware.
BI Tools: Tableau, Power BI, Informatica, Excel.
Query Engines: Hue, Presto, Zeppelin, Ambari.
Database: HANA, Hive, PostgreSQL, Oracle, MySQL, SQL Server, MongoDB, HBase, Aurora RDS, Cassandra, DynamoDB.
Database Tools: MongoBooster, VisualOps, NoSQL Manager for Cassandra, SQL Workbench/J, ElephantSQL, Navicat, MySQL Workbench.
Testing Tools: Artillery, Ghost Inspector, Selenium.
Web Data Sources: Stack Exchange, Kaggle.
Networks: WAN, LAN, TCP/IP, Firewalls, Gateways, Switches, Routers, NAT, Subnets.
IDEs & Editors: Vi, Atom, Eclipse, VS Code, Sublime, Brackets, Google Developer Tools.
Other: Erwin 7.1, Visual Studio, Visio 2013, Excel 2013, Confluence, Jira, ER Studio Data Architect 2016, PowerPoint, PyCharm, Cloud9, Adobe InDesign, Illustrator, Photoshop.
AWS DevOps/Big Data Engineer
- Created a real-time Spark Streaming job on Cloudera Hadoop that read Twitter feeds and fed the data to Splunk, Tableau, and HANA.
- Utilizing Splunk Hunk, extracted and analyzed Hadoop HDFS log files.
- Created a distributed online Analytics/Search platform by utilizing Cassandra, ElasticSearch, Kafka, Flink, Kibana, and Alooma.
- Installed and configured MongoDB on Red Hat.
- Installed MySQL on Cassandra and enabled MongoDB to read from MySQL tables.
- Wrote PySpark and Spark SQL programs.
- Created/Configured AWS Virtual Private Cloud, Availability Zones, EC2 instances, Subnets, routing tables, Network Access Control List (NACL), NAT and NAT Gateway, Route 53 DNS, users, Security Groups, Roles, Policies, Auto Scaling, Elastic Load Balancer (ELB), CloudFront, CloudWatch, Simple Storage Service (S3), Elastic Block Store (EBS), and Glacier.
- Installed Hadoop clusters and configured the environment.
- Worked with the developers and laid out an end-to-end, streamlined, and automated process by which Continuous Integration and Continuous Deployment were successfully implemented.
- Configured the environment for the developers to run their applications through Elastic Beanstalk, thereby eliminating the hardware/software layers that are required to house the applications on premises.
- Utilized Terraform and AWS CLI for provisioning AWS resources.
- Utilized Chef, JSON, and YAML for AWS Configuration Management.
- Automated the application deployment workflow by utilizing AWS containers, Docker, Vagrant, Node.js, CloudFormation, OpsWorks, CodeDeploy, CodePipeline, CodeBuild, Chef, Jenkins, and GitHub.
- Created an automated event driven notification service utilizing SNS, SQS, Lambda, and CloudWatch.
- Created and configured an AWS Hadoop Elastic MapReduce cluster.
- Utilized Pig for running ETL jobs on Hadoop.
- Transferred Data between Hadoop and the Target databases using Sqoop.
- Utilized Hive, Impala and Tez to query the Hadoop Database.
- Created Tableau Visualizations by connecting to AWS Hadoop Elastic MapReduce.
- Migrated a MySQL database to Amazon Aurora.
- Ran queries against the Aurora Data Warehouse using SQL Workbench/J.
- Transferred data to AWS by using AWS Database Migration Service.
- Generated Jaspersoft reports by connecting to Amazon RDS MySQL.
- Utilized CloudWatch to produce metrics that provide information about the state of the domains.
- Utilized Amazon Route 53 combined with DNS failover and health checks to provide a fault-tolerant architecture for user websites. Amazon Route 53 Traffic Flow helped us understand how our end users were routed to the web applications.
- Configured CloudWatch, Lambda, SQS, and SNS to send alert notifications.
- Configured AWS Lambda functions to log the changes in AWS resources.
- Configured the Trusted Advisor to help reduce costs.
- Created Users, Groups and implemented IAM policies.
- Created and launched instances of EC2 Red Hat Linux with EBS storage.
- Implemented the process of taking snapshots of the infrastructure according to Recovery Time Objective and Recovery Point Objective guidelines.
- Configured Visual Studio to work with AWS, providing a suitable environment for writing code.
- Developed interactive dashboards by leveraging Tableau's workflow capabilities.
- Developed forecasting and trending reports by utilizing Analytics, Table calculations, Parameters, Maps, Trend Lines, Groups, Hierarchies & Sets.
- Installed and configured Tableau Server.
- Published Dashboards on Tableau Server.
Data Warehouse Architect
- Created a Cloudera Hadoop cluster, populated its tables with web data using Python Scrapy and Beautiful Soup, and used Sqoop to populate MySQL tables.
- Wrote Ad-Hoc queries on the Hadoop cluster using Impala.
- Utilized Hue to query the Cloudera data.
- Configured HDFS File System.
- Created an ETL job using Pig to automate source-to-target data transfer between Hadoop and MySQL.
- Configured Tableau to read data from Hadoop and created Dashboards.
- Created an AWS private Cloud and configured an EC2 instance to house an Oracle RDS Database.
- Created AWS Subnets, Elastic Block Store (EBS), Availability Zones, Auto-Scaling, Network Access Control List (NACL), Route 53, and Simple Storage Service (S3).
- Migrated applications to AWS.
- Designed the ETL processes.
- Designed the Data Marts.
- Created Star schemas and Fact tables.
- Developed forecasting and various trend reports by extensively using Tableau advanced analytics features such as Reference Lines, Trend Lines, and Bands.
- Automated the process of data extracts for Tableau.
- Developed Tableau dashboards.
- Designed the staging-area tables for extracting the data from disparate sources.
- Created the logical and physical data models of the Data Marts in accordance with the requirements as outlined in the functional specifications document.
- Created the Star schemas and Fact tables for each subject area.
- Designed the ETL architecture and data integration aspects of the Data Warehouse.
- Created ETL workflows using Informatica Power Center.
- Wrote Informatica Mappings, Worklets, and Workflows.
- Cleansed the data as they were copied to the staging area as part of implementing data quality control.
- Transferred and transformed the data from the sources to target Data Marts using Power Center.
- Monitored the ETL jobs and improved throughput by tuning the process and optimizing the queries.
- Proposed solutions and alternatives with respect to scalability, flexibility, ease of maintenance, time to delivery, and phased approach to release schedules.
- Worked closely with the project manager, product owner, and scrum master to tailor our development efforts around the Scrum agile methodology: creating and grooming the product backlog, establishing sprints, and going through the sprint review process.
Data Warehouse DBA
- Configured ESXi hosts and created VMware 5.1 virtual machines.
- Installed Oracle 12c on a Red Hat Linux VM on VMware.
- Managed VMware 5.1 servers (VM guests) that host the on-premises databases.
- Worked in a DBA capacity for the project, in charge of the Oracle and SQL Server source and target databases.
- Wrote Oracle PL/SQL and SQL Server T-SQL stored procedures to clean up the source data and improve data quality.
- Designed a star schema comprising two fact tables and twelve dimension tables.
- Developed Informatica mapplets, mappings, worklets, and workflows. Merged the data from the staging area into the star schemas. Scheduled Informatica jobs.
- Managing 12 Oracle databases (8i, 9i, 10g, and 11g), one of which is a Data Warehouse storing two terabytes of data; upgraded several Oracle databases from 8i and 9i to 10g.
- Designed and implemented the ETL and data integration aspects of the Data Warehouse.
- Worked closely with our BI analyst and designed the Data warehouse star schema.
- Developing Informatica mappings, worklets, and workflows.
- Informatica Administrator; upgraded Power Center 7.1 to 8.1.
- Developed Tableau dashboards.
- Created action filters, parameters, and calculations for preparing dashboards and worksheets in Tableau.
- Hands-on development assisting users in creating and modifying Tableau worksheets and data visualization dashboards.
- Defined best practices for Tableau report development.
- Responsible for the creation of users, groups, projects, workbooks and the appropriate permission sets for Tableau.
- Managing NetApp 7.2 storage; creating aggregates, volumes, LUNs, and initiators.
- Managing 10 SQL Server databases (2005 and 2008); upgraded from 2000 and 2005 to SQL Server 2008 R2.
- Installed a Windows 2008 R2 cluster on VMware and installed a SQL Server 2008 R2 cluster.
- Managing VMware 4.1 servers (VM guests) that host the databases; upgraded vCenter 3.5 to 4.1, created clones and templates, and monitored the ESX servers.
- Remedy AR Server Administrator; upgraded Remedy 5.1 to 7.6.
- SharePoint foundation 2010 Administrator; upgraded SharePoint Services 3 to foundation 2010.
- Developing BASE SAS procs.
- SAS Administration
- Developing SharePoint and WordPress Wiki sites.
- Supporting the developers with their database requests.
- Working closely with the Senior Analyst and the IT director.
- Providing 24/7 support for the applications and databases.
- Monitoring VMware and NetApp.
- Applied Oracle database critical patches and SQL Server service packs.
- Devised backup and recovery procedures for Oracle and SQL Server databases.
- SQL Server Reporting Services.
- Installed and configured SQL Server Analysis Services.
- Installed and configured SQL Server Integration Services for creating ETL packages.
- Created and monitored Database auditing jobs.
- Performance troubleshooting and optimization is my niche; optimized our nightly ETL jobs in order to meet SLA.
- I was the project’s DBA and the Data Warehouse Administrator. I also participated in designing the data model as well as developing Informatica workflows and testing the application.
- Managing day-to-day activities of BI databases (source, target, and MicroStrategy), and providing 24/7 support.
- Developed Oracle packages, stored procedures, and triggers.
- Developed Power Center mapplets, mappings, worklets, and workflows for membership data integration.
- Monitor and tune different components of the Database Servers such as shared memory, disk I/O, Users, processes, logs, disk space, locks, and latches.
- Automated database monitoring and reporting functions utilizing Unix shell scripts and Perl scripts.
- Implemented Oracle Data Guard for disaster recovery.
- Setup/configured Oracle RMAN for database backup and recovery.
- Cloned an Oracle database for the purpose of testing.
- Upgraded Oracle 9i to Oracle 10g.
- Upgraded Informatica Power Center 7.1.1 to 8.1.
- Utilizing STATSPACK, ANALYZE, EXPLAIN PLAN, SQL TRACE in order to identify/eliminate bottlenecks, and to improve the execution of long-running queries.
- Serve as ETL expert in all phases of development, implementation, and production support.
- Identified source systems and data to be extracted from various sources based on user requirements.
- Worked with Legacy application teams to determine best approach of procuring source data.
- Performed data mappings from source systems to new target structures on the database platform.
- Built the infrastructure and the star schema for the membership Data Mart.
- Improve Data Warehouse daily refresh execution time by monitoring, measuring, and adjusting ETL components.
- Work with the System Administration and Network teams to eliminate the events that increase load time or cause job failures.
- Perform code reviews and follow project methodologies.
- Perform DBA/Development tasks on SQL Server databases.
- Mentoring junior staff members.
- Configured and maintained Informatica Power Exchange.
- Participated in Informatica 8.5 Beta Testing program.
- Work with MicroStrategy/OLAP developers to ensure that they have the database objects that they need.
- Administrator for the team's Web 2.0 site.
- Attended MicroStrategy classes in order to have a complete understanding of our Business Intelligence environment.
- Formed a Technology group; we deliver presentations on a monthly basis.
Technologies: ORACLE 9i 2.1.0 & 10g 10.1.2, Informatica Power Center 7.1.2 & 8.1, HP 9000, AIX 5.2, PL/SQL, Unix Shell, Perl, TOAD, MicroStrategy 8.1.
- Participated in creating the Data Warehouse architecture for E-trade.
- Establishing procedures pertaining to the database design, security, and maintenance.
- Built the infrastructure and the star schema for the Accounting Data Mart.
- Implemented the Logical/Physical Data Model.
- Defined and built the Logical and Physical stages for Data Transformation.
- Designed and created the summary tables.
- Developed ETL mappings in Power Center 7.1.2.
- Performed data mappings from source systems such as Siebel to new target structures on the database platform.
- Designed, developed and maintained Power Center workflow load processes to data warehouse.
- Designed, wrote, and implemented procedures that controlled Data Warehouse refresh strategies.
- Wrote shell programs to automate Data Warehouse AutoSys job control.
- Designed and implemented utilities, which automated several DBA functions and maintenance activities including setting up alerts and collection of important database health indicators.
- Generated web reports using Shell scripts and HTML.
- Monitored activities related to data growth, performance and availability.
- Modified and Maintained existing Java programs.
- Migrated a Sybase database to an Oracle database.
Technologies: ORACLE 9i 2.1.0 & 10g 1.0.2, Informatica Power Center 7.1.2, Sun Solaris 8.1, Red Hat Linux AS release 3, PL/SQL, Unix Shell, TOAD, Erwin 4.5, SQL Server 2000.
Data Warehouse DBA
- Belonged to Confidential's Data Warehousing group, responsible for maintaining, improving, and monitoring the system.
- Hand-coded ETL procedures and transformations in Oracle PL/SQL.
- Elicited and documented business requirements from the user community.
- Modified existing data models to accommodate new business requirements.
- Planned and implemented Oracle 9i software installations, migration, upgrades, and patches.
- Merged several databases into one database.
- Performed system testing and assisted users in User Acceptance Testing.
- Created test plans.
- Implemented methods to ensure data integrity.
- Performed database sizing and capacity planning.
- Performed DBA tasks such as disk layout architecture, performance tuning, and backups.
- Assisted developers with issue resolution and patch/fix identification and implementation.
- Helped deploy major software releases, feature enhancements, and bug fixes.
- Implemented ORACLE Data Guard (Standby Database) between the production and batch reporting systems.
- Implemented data replication and Materialized Views between the production and Real-time reporting systems.
Technologies: Informatica 7.0.1, ORACLE 9i 2.1.0, Oracle OEM, Sun Solaris 8, HP 9000, Windows 2000, Apache Web Server, TOAD, Erwin 4.5, PL/SQL, Unix Shell.
Data Warehouse Architect
- We built a Data Warehouse for an HMO health organization. I led a team of developers and also worked as the DW administrator; working closely with our principal, the two of us built the DW architecture.
- Met with the team on a weekly basis and discussed progress, risks, and potential pitfalls.
- Supervised junior-level DBAs and performed quality reviews of their work to ensure the success of all operations.
- Managed all aspects of the technical implementation.
- Worked closely with development and support staff, customers, business subject matter experts and business partners to gather requirements and translate them into technical designs.
- Assisted in translating business requirements to report specifications.
- Took the lead in instilling best practices into our development environment.
- Established procedures and guidelines for the design, development, and administration of the warehouse.
- Integrated IDX and MUMPS data with other medical data.
- Wrote most of the code for the initial Data Warehouse population, and the monthly incremental refreshes and updates.
- Collaborated in the review, selection, procurement, usage and maintenance of internally developed applications as well as purchased applications, particularly focusing on data and database administration functions as they relate to the associated application/project.
- Ensured that IT business solutions meet the IT Strategy (i.e., making sure that general architectural issues are satisfied).
- Mentored junior team members and enabled them to gain proficiency in Database Administration and Development.
- Set standards and common conventions for the other Database Administrators in the team.
Technologies: SQL Server 2000, Windows 2000, Erwin, Perl, MS SQL DTS, Transact-SQL.
- As an Oracle consultant, I participated in implementing a solution for a local Hospital. The overarching objective was to automate the hospital’s patient monitoring system by utilizing operations research methods.
- We provided the client with optimum operating solutions by developing a scientific model of the system incorporating measurement of factors such as chance and risk to predict and compare the outcomes of alternative decisions, strategies or controls.
- The second phase of the project involved creating an automated alert notification system. This entailed utilizing machine learning by adopting Neural Networks technology. The system intelligently routed alerts according to the probability of the type of the urgent problem that necessitated immediate response.
- We used SAS Analytics and the Oracle TimesTen In-Memory Database, as well as Neural Network programming in C.
- Facilitated the transfer of knowledge to customers regarding Oracle database servers and tools through on-site and classroom training.
- Conveyed course materials about the use of specific products in a classroom environment.
- I taught the following classes - Database Administration, Performance tuning, Backup & Recovery, PL/SQL.
- Determined efficiency of database design by reviewing the query paths used to obtain needed information and recommended Indexing strategy based on results.
- Reviewed database integrity by checking for the existence of referential constraints, default values, unique constraints, and constraint checking.
- Reviewed use of transactions, stored procedures and triggers in database applications and constraints. Reviewed data security plan used. Recommended table/column level privilege restrictions as well as the use of views for information hiding and/or protection.
- Redesigned the Logical and Physical Data Model in order to reduce Data Redundancy and ensure data integrity.
- I taught the following classes:
- Dynamic Server Internal Architecture (V5.x, V7.x), Database Administration, Performance Tuning, Triggers & Stored Procedures, Structured Query Language, Logical and Physical Data Modeling. SAS Analytics, Oracle TimesTen In-Memory Database, Neural Networks, C.