Programming Languages: C++, Python, Scala, SQL
Databases: Oracle 11g, PostgreSQL
Operating Systems: Windows 10, UNIX/Linux, Microsoft SharePoint 2016, Microsoft Dynamics CRM 2016
ETL Tool: Informatica PowerCenter 9.6
ELT Tool: Fivetran, Matillion
Big Data Ecosystems: Apache Spark, Hadoop, MapReduce, HDFS, HBase, ZooKeeper, Hive, Pig, Sqoop, Oozie, Flume, Apache Kafka
Cloud: AWS Redshift, Snowflake Computing.
Reporting Tool: Tableau, KPMG leasing Tool.
Confidential, Milpitas, CA
Cloud Engineer/Big Data Hadoop
- Migrating applications to the Snowflake cloud warehouse and working with engineering teams to complete testing and pilot migrations.
- Migrating data from S3 buckets to the Snowflake cloud.
- Used Kafka to handle data pipelines for high-speed filtering and pattern matching.
- Used Kafka for operational monitoring data pipelines, which involves aggregating statistics from distributed applications to produce centralized feeds of operational data.
- Leads the design and maintenance of logical and physical data models (relational & dimensional), the data dictionary, and database volumetrics.
- Hands-on Linux experience.
- Conducted design reviews with business analysts, the enterprise data architect, and the solution lead to create proofs of concept for the reports.
- Working on storage, load balancers, virtualization, web, database and messaging services with the ability to dive deep into any of these areas when necessary.
- Working on Business Domain Cybersecurity applications like IRSM, ITC.
- Generating dashboard reports for BI users with the Tableau reporting tool.
- Conducted and participated in JAD sessions with business owners and application development teams to understand and analyze business and reporting requirements.
- Responsible for Big data initiatives and engagement including analysis, brainstorming, POC, and architecture.
- Designed and developed architecture for data services ecosystem spanning Relational, NoSQL and Big Data technologies.
- Implemented Agile Methodology for building Integrated Data Warehouse, involved in multiple sprints for various tracks throughout the project lifecycle.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data concepts.
- Migrated existing applications and developed new applications using AWS cloud services.
- Participated in data model reviews as data architect with business analysts and business users, explaining the data model to make sure it is in line with business requirements.
- Suggested implementing multitasking for the existing Hive architecture in Hadoop and also suggested UI customization in Hadoop.
- Developed MapReduce programs to cleanse the data in HDFS obtained from heterogeneous data sources.
- Created and deployed DDLs based on the physical data model in the development database.
- Heavily involved in the data architect role, reviewing business requirements and composing source-to-target data mapping documents.
- Designed both 3NF data models for ODS and OLTP systems and dimensional data models using star and snowflake schemas.
- Worked on the Metadata Repository (MRM) to keep the definitions and mapping rules up to the mark.
- Applied data naming standards, created the data dictionary and documented data model translation decisions and also maintained DW metadata.
- Used the SAS/ACCESS Interface to Teradata to extract data from Teradata and also used the SAS SQL pass-through facility.
- Designed and developed a real-time stream processing application using Spark, Kafka, Scala, and Hive to perform streaming ETL and apply machine learning.
- Developed and implemented data cleansing, data security, data profiling and data monitoring processes.
- Created process flow diagrams using MS Visio and maintained the design document.
- Specifies overall Data Architecture for all areas and domains of the enterprise, including Data Acquisition, ODS, MDM, Data Warehouse, Data Provisioning, ETL, and BI.
- Advises on and enforces data governance to improve the quality/integrity of data and oversight of the collection and management of operational data.
- Extracted data from SQL Server to create automated visualization reports and dashboards in Tableau.
- Worked with data investigation, discovery and mapping tools to scan every single data record from many sources.
- Designed and developed Oracle SQL and shell scripts, data import/export, data conversions, and data cleansing.
Environment: ER/Studio 9.7, Oracle 12c, Hive, Amazon Redshift, AWS, Snowflake, MapReduce, Hadoop, HBase, Spark, MDM, Agile, NoSQL, SQL, OLAP, OLTP, HDFS.
Confidential, Mountain View, CA
- Involved in transferring files from the OLTP server to the Hadoop file system. Actively participated in writing Hive and Impala queries to load and process data in the Hadoop file system.
- Involved in writing queries with HiveQL and Pig Latin.
- Involved in setting up database connections using Sqoop.
- Imported and exported data between MySQL/Oracle and Hive.
- Imported and exported data between Hive and HDFS.
Confidential, San Jose, CA
Hadoop Developer/Cloud Engineer
- Involved in analyzing the system and business requirements.
- Involved in importing data from MySQL to HDFS using Sqoop.
- Involved in writing Hive queries to load and process data in Hadoop File System.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Worked with Impala for the data retrieval process.
- Exported data from Impala to the Tableau reporting tool and created dashboards on a live connection.
- Performed sentiment analysis on product reviews from the client’s website.
- Exported the resulting sentiment analysis data to Tableau for creating dashboards.
Environment: Cloudera CDH4.3, Hadoop, MapReduce, HDFS, Hive, Impala, Tableau, Microsoft Azure
Confidential, Kansas City, MO
- Created modules for both the client and server.
- Developed several DataStage jobs for data integration of the source systems: Teradata, flat files, Oracle, and SQL Server.
- Used the Informatica ETL tool to load data from source to target.
- Used Jira tools to meet project responsibilities.
- Used SQL to support client-related requests.
- Stored and merged code changes using GitHub.