Big Data Principal Architect Resume
SUMMARY
- Elsayed is an accomplished Big Data & Analytics Principal Architect with over 18 years of hands-on experience architecting and delivering enterprise data platforms and complex IT solutions in Big Data, Analytics, and Cloud Computing.
- His expertise lies in building scalable big data, analytics, and real-time data processing solutions that help businesses become data-driven, using data creatively to develop insights and improve operational efficiency and customer experience.
- He has led the architecture and operation of various cloud-based and on-premises, fully secured, highly available enterprise data platforms using open-source and commercial products.
TECHNICAL SKILLS
- Tableau, QlikView, SAP SEM, Excel;
- SSRS, Oracle Reports, and Crystal Reports;
- SAP BI/BO, SSRS, TIBCO Jaspersoft, Power BI;
- HDP, HDF, Amazon EMR, Cloudera CDH;
- Hadoop, HDFS, HBase, Hive, Pig, Spark, Kafka, NiFi, Storm, Sqoop, Oozie;
- R, Spark MLlib, Zeppelin, Mahout;
- S3, EC2, EMR, Redshift, RDS, Data Pipeline, Kinesis;
- BLOB Storage, Data Lake Storage, VMs, SQL Database, HDInsight;
- SAP BW, SAP HANA, Teradata, Amazon Redshift;
- Oracle, SQL Server, MySQL, PostgreSQL, MS Access;
PROFESSIONAL EXPERIENCE
Confidential
Big Data Principal Architect
Responsibilities:
- Design and architect the Enterprise Data Lake (EDL) solution in Azure to meet data and analytics demands.
- Configure Microsoft Azure cloud infrastructure for Hadoop clusters and ETL processes.
- Design and architect the Big Data store using Azure Data Lake Store (ADLS).
- Design and architect a real-time data streaming solution using TIBCO messaging, HBase, and Spark.
- Design and configure the Hive data warehouse on top of ADLS using Azure HDInsight.
- Design and architect the ETL and data ingestion solution using Talend for Big Data Streaming.
- Lead the architecture, installation, and configuration of the Talend real-time streaming platform.
- Lead the design and implementation of a change data capture (CDC) solution using Talend streaming.
- Design and implement a CI/CD solution using Bitbucket, Jenkins, and Nexus.
- Provide best practices for Data Lake design, ETL design, and real-time data streaming.
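As a rough illustration of the change data capture pattern described above (the production pipeline used Talend streaming; the event format and field names here are assumptions for the sketch), a minimal apply-changes loop could look like:

```python
# Minimal sketch of applying CDC events to a keyed target table.
# The {"op": ..., "key": ..., "row": ...} event shape is hypothetical,
# not the actual Talend message format.

def apply_cdc_events(target: dict, events: list) -> dict:
    """Apply insert/update/delete events keyed by primary key."""
    for event in events:
        op, key = event["op"], event["key"]
        if op in ("insert", "update"):
            target[key] = event["row"]   # upsert the new image of the row
        elif op == "delete":
            target.pop(key, None)        # remove the row if present
    return target

events = [
    {"op": "insert", "key": 1, "row": {"name": "alice"}},
    {"op": "update", "key": 1, "row": {"name": "alicia"}},
    {"op": "insert", "key": 2, "row": {"name": "bob"}},
    {"op": "delete", "key": 2},
]
print(apply_cdc_events({}, events))  # {1: {'name': 'alicia'}}
```

The same upsert-or-delete logic applies whether the target is an HBase table or a warehouse staging table; only the sink API changes.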
Environment: Azure, Azure Data Lake Store, Azure HDInsight, Azure SQL Data Warehouse, Spark, HBase, Hive, Talend for Big Data Streaming, TIBCO Messaging, Siebel, SQL Server, Oracle, DB2, Java, Scala
Confidential
Big Data Principal Architect
Responsibilities:
- Designed and architected the Enterprise Data Lake (EDL) solution to meet data and analytics demands.
- Configured Microsoft Azure cloud infrastructure for Hadoop clusters.
- Led the architecture, installation, and configuration of an HDP 2.6 multi-tenant Hadoop cluster.
- Led the architecture, installation, and configuration of an HDF 2.1.2 data flow cluster.
- Integrated SAS BI and Visual Analytics with the Big Data platform for reporting and analysis.
- Configured highly available production cluster components: NameNode, ResourceManager, Hive, and HBase.
- Configured security components: Ranger, Ranger KMS, Atlas, and Knox.
- Installed and configured a local KDC and enabled Kerberos security for the HDP and HDF clusters.
- Configured a one-way trust between the KDC and the corporate Active Directory (AD).
- Configured Ambari, Ranger, NiFi, and Atlas for Active Directory authentication.
- Configured SSL and data encryption in motion and at rest for all cluster components.
- Provided best practices for user onboarding, authorization, and Hive data warehouse design.
- Provided best practices and support for integrating external systems such as SAS, BI tools, and Splunk.
- Designed and developed the Hive data warehouse on top of HDFS using the Avro and ORC data formats.
- Led the development of CDC and real-time data streaming from source systems using Apache NiFi.
- Developed Sqoop data ingestion processes to ingest data from Oracle and Teradata into the Data Lake.
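To give a concrete flavor of the Sqoop ingestion step above, a parameterized import command might be assembled like this (connection details and paths are placeholders, not the actual environment):

```python
# Sketch of parameterizing a 'sqoop import' from an RDBMS into HDFS.
# The JDBC URL, schema/table, and target directory below are invented
# examples; --connect, --table, --target-dir, --num-mappers, and
# --as-avrodatafile are standard Sqoop import options.

def build_sqoop_import(jdbc_url: str, table: str, target_dir: str,
                       mappers: int = 4) -> list:
    """Assemble a 'sqoop import' command line as an argument list."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),
        "--as-avrodatafile",  # land raw data in Avro, matching the Hive DW formats
    ]

cmd = build_sqoop_import("jdbc:oracle:thin:@//dbhost:1521/ORCL",
                         "SALES.ORDERS", "/data/lake/raw/orders")
print(" ".join(cmd))
```

Building the command as a list (rather than a shell string) keeps it safe to hand to a scheduler such as Oozie or a subprocess call.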
Environment: Azure, Hortonworks HDP 2.6, HDF 3.0, NiFi, Spark, HDFS, HBase, Hive, Kafka, Sqoop, SQL Server, Oracle, DB2, SAS, Splunk
Confidential
Big Data & Analytics Architect
Responsibilities:
- Developed the Data Analytics & BI strategy and led the design and implementation of the transition plan.
- Designed, architected, and configured a cloud-based Hadoop ecosystem using Amazon Elastic MapReduce (EMR), EC2, S3, Kafka, HDFS, Cassandra, Hive, Hue, Pig, Oozie, Spark, and Apache Solr;
- Led the Talend Big Data ETL solution design and development effort, including environment setup, version control and change management using GitLab, Continuous Integration and Delivery (CI/CD) using Jenkins, and artifact repository publishing using Nexus;
- Designed and modeled the Enterprise Data Warehouse using Amazon Redshift; developed and automated Talend ETL jobs to extract data from source systems, perform data cleansing and transformation, and load data into Amazon Redshift for reporting and analysis;
- Drove Talend ETL job development to ingest data into the Hadoop Data Lake from web server logs, Google Analytics, BigQuery, LiveChat, Salesforce, MongoDB, retail banking systems, and RDBMSs;
- Developed a real-time data streaming and change data capture solution using Scala, Kafka, and Spark Streaming to stream data changes from operational systems to Amazon Redshift;
- Designed and developed performance and operational dashboards for the Marketing, Sales, and Operations departments using Tableau; tasks involved defining KPIs, identifying data sources, developing ETL programs to extract data from source systems, and writing SQL and programs to calculate the KPIs;
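The Kafka-to-Redshift CDC path above can be sketched in miniature: each change message is translated into a delete-then-insert upsert, since Redshift has no native upsert. The message schema and table names here are assumptions for illustration (the production solution was written in Scala with Spark Streaming):

```python
import json

# Illustrative only: converting a CDC message (as it might arrive on a
# Kafka topic) into a Redshift-style DELETE+INSERT upsert statement.
# The {"op": ..., "row": ...} schema is hypothetical.

def cdc_to_upsert_sql(message: str, table: str, key: str) -> str:
    """Render one CDC message as a delete-then-insert upsert."""
    rec = json.loads(message)
    cols = ", ".join(rec["row"].keys())
    vals = ", ".join(repr(v) for v in rec["row"].values())
    return (f"DELETE FROM {table} WHERE {key} = {rec['row'][key]!r}; "
            f"INSERT INTO {table} ({cols}) VALUES ({vals});")

msg = json.dumps({"op": "update", "row": {"id": 7, "status": "shipped"}})
print(cdc_to_upsert_sql(msg, "orders", "id"))
# DELETE FROM orders WHERE id = 7; INSERT INTO orders (id, status) VALUES (7, 'shipped');
```

In practice such statements are batched into a staging table and applied in one transaction per micro-batch; real code would also use bind parameters rather than `repr` for quoting.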
Environment: AWS, Redshift, EMR, S3, Solr, Kafka, HDFS, HBase, Hive, Spark, Talend, PostgreSQL, MongoDB, Cassandra, SQL Server, Tableau, TIBCO Jaspersoft, Java, Scala, Google Analytics.
Confidential
Big Data & Analytics Architect
Responsibilities:
- Led the build-out of scalable infrastructure and Big Data platforms to collect, store, and process supply chain network data powering the Confidential Sales & Marketing analytics solution; the solution enables strategic decision making by actively revealing and presenting emerging trends and opportunities across the entire supply chain network and customer lifecycle;
- Designed and developed ETL processes for data extraction, transformation, and loading to pull data from SAP, the Forecast & Planning system, and SAP HANA and ingest it into the Big Data lake.
- Built Hive and Spark data models that apply machine learning algorithms to perform root cause analysis for aging stock and predictive analysis for material lead time and safety stock;
- Drove the design and development of the BI and data visualization solution, building monthly dashboards (Sales & Customer Service Level, Supply & Vendor Service Level, and Planning & Forecast) with Tableau Server and Tableau Desktop;
- Collected data and ran reports to monitor and proactively analyze excess inventory; coordinated with the Sales & Marketing departments to recommend corrective actions for clearing aging stock; led the "Material Distribution" project to understand customer ordering behavior and identify the most costly customers based on cost-to-serve calculations, which involved identifying cost drivers for sales order processing, customer service, claims handling, logistics execution, and transportation, extracting data from SAP ERP systems, and visualizing the data and presenting recommendations;
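For context on the safety stock analysis mentioned above, one common textbook formulation sets safety stock to a service-level factor times demand variability scaled by lead time; the production models were built in Hive/Spark, so this is only an illustrative calculation with invented numbers:

```python
import math

# One standard safety-stock formula: SS = z * sigma_demand * sqrt(lead_time),
# where z is the service-level factor (z ≈ 1.65 for ~95% service).
# This is a generic textbook model, not the actual Spark/Hive implementation.

def safety_stock(z: float, demand_std: float, lead_time_periods: float) -> float:
    """Safety stock for demand variability over a given lead time."""
    return z * demand_std * math.sqrt(lead_time_periods)

# e.g. 95% service level, demand std dev 40 units/week, 4-week lead time
print(round(safety_stock(1.65, 40.0, 4.0), 1))  # 132.0
```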
Environment: Cloudera CDH, Hadoop, HDFS, Hive, Spark, SAP ERP, SAP BW/BI, SAP HANA, Tableau, Talend, SQL, PL/SQL, Oracle DB, SQL Server, Java, Scala;
Confidential
Senior Data Architect
Responsibilities:
- Created design specifications for data migration requirements. Prepared data extraction queries and developed data loading programs. Resolved all issues related to data migration and integration points;
- Managed a team of 4 consultants to determine requirements, translate them into technical design specifications, and design the interface data model; developed the interface solution to send and receive data between the Maintenance Management and GIS systems and to create and update assets, maintenance notifications, and maintenance orders in the GIS system;
- Coordinated resources, developed the project schedule and final report, and provided functional and technical training to end users as required;
Environment: SAP ECC, ESRI ArcGIS, Talend DI, Oracle DB, SQL, Python, Java;
Confidential
BI & DW Solution Architect
Responsibilities:
- Subject matter expert (SME) in dashboard & scorecard design and development, data warehouse & database design, and data extraction;
- Interacted with senior managers and business users to gather and analyze data analysis, BI, data visualization, and reporting requirements;
- Developed the corporate Business Intelligence and performance monitoring strategy; drove the design and implementation of performance monitoring and Business Intelligence solutions;
- Designed and developed balanced scorecards for the quality management department and performance and operational dashboards for the human resources and IT client service departments; tasks involved identifying KPIs, identifying the data sources and data requirements needed to calculate and measure them, and developing data extraction and transformation programs to pull data from source systems and calculate the KPIs;
- Designed and developed an automatic performance reporting solution that regularly emails performance monitoring reports to business users as attachments;
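The report-emailing idea above can be sketched with Python's standard library; the addresses, file names, and delivery mechanism (e.g. an SMTP relay) are placeholders, not details of the original solution:

```python
from email.message import EmailMessage

# Sketch of building a performance-report email with a file attachment.
# Sender/recipient addresses and the report file are invented examples;
# sending would be handled separately (e.g. smtplib on a schedule).

def build_report_email(sender: str, recipients: list, report_name: str,
                       report_bytes: bytes) -> EmailMessage:
    """Construct an email with the report attached as a PDF."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = f"Performance report: {report_name}"
    msg.set_content("Please find the latest performance report attached.")
    msg.add_attachment(report_bytes, maintype="application",
                       subtype="pdf", filename=report_name)
    return msg

mail = build_report_email("bi@example.com", ["manager@example.com"],
                          "kpi_summary.pdf", b"%PDF-1.4 ...")
print(mail["Subject"])  # Performance report: kpi_summary.pdf
```

A scheduler (cron, Oozie, or similar) would generate the report file, call this builder, and hand the message to an SMTP relay.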
Environment: SAP BW, SAP ERP, SAP SEM, Oracle Weblogic, Oracle, SQL Server, Teradata, Cassandra, Informatica Power Center, Informatica Power Exchange, Erwin, Tableau, SAP Crystal Report, Oracle OBIEE, Oracle Discoverer, PL/SQL, Java, J2EE, C++