
HANA / SAP Data Services Architect, Big Data Resume


SUMMARY

  • HANA Architecture/Development - Architect and develop attribute views, analytic views, and graphical and scripted calculation views; ETL to HANA from various sources; data transformation inside HANA for data marts using SQL programming
  • ETL - BODS (Data Services), MuleSoft, Informatica, Python - Data transformations from big data Hadoop/Hive, Kafka, HANA, Salesforce, NetSuite, SAP BW, SAP ECC, Microsoft SQL Server, MySQL
  • Experience working with Python for data wrangling (NumPy, Pandas), visualization (Matplotlib, Bokeh) and machine learning (scikit-learn); a short example follows this list
  • Experience working on AWS EC2 instances for HANA, Hadoop
  • Experience working with MapR and Cloudera Hadoop implementations
  • Big Data - Hive, Impala, Kudu, Sqoop, HBase, HDFS, Hadoop, Kafka, Flume, Spark.
  • Analytics/Front End development - Tableau, QlikView, BOBJ (WEBI, Explorer, IDT, Relational Universe).
  • Worked on implementations, upgrades, and performance enhancements in the HANA, BI/BW, and analytics space.
  • Extensive experience in data analysis and root-cause analysis, with proven problem-solving and analytical-thinking capabilities.
  • Self-motivated team player with a passion for learning new technologies. A steady performer with efficient multi-tasking in high-stress or fast-paced environments.
  • Ability to prioritize workload and work with minimal supervision.
  • Excellent communication skills and well versed in gathering requirements from end users.
  • Other experience includes a research and university teaching career in physics (Associate Professor).
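
As a minimal illustration of the Python workflow mentioned above (file name and column names are placeholders, not from any engagement), a small wrangle-and-model sketch with Pandas, NumPy, and scikit-learn:

    # Illustrative sketch only: clean a small dataset and fit a simple model.
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("sample_data.csv")                     # placeholder file
    df = df.dropna(subset=["feature_a", "feature_b", "label"])
    df["feature_ratio"] = df["feature_a"] / np.maximum(df["feature_b"], 1)

    X = df[["feature_a", "feature_b", "feature_ratio"]]
    y = df["label"]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print("Hold-out accuracy:", model.score(X_test, y_test))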

TECHNICAL SKILLS

HANA: HANA Information Modeling on SAP HANA 2, SPS 12 revision 122, 10.9.0, 7.0, 6.0

Data Warehousing: SAP HANA Standalone Warehouse, SAP BW 7 - 7.3 Migration to HANA, SAP BI (7.3, 7.0, 3.5), Microsoft BI

Big Data: MapR and Cloudera Hadoop implementations, Hive, Impala, Kudu, Sqoop (ETL), Kafka, Spark

ETL: BODS (Data Services), Python Pandas, MuleSoft, SDA (Smart Data Access) in HANA, SLT, Stored procedures for data transformation in HANA, Informatica, BW data flows

Data Science: Machine Learning Modeling with Scikit-learn

Cloud/Apps: AWS, Salesforce, NetSuite

Databases: Big Data Hive, Impala, HBase, SAP HANA 2.0, Sybase IQ, SQL Anywhere, MySQL, Microsoft SQL Server (2014, 2012, 2010), Teradata, Oracle 12, 11i, 10g, 9i, 8i, Pervasive, Btrieve

Programming: SQL, SQL Stored procedures, HANA SQL Script, Python, Triggers, SAP ABAP, C#, VB.NET, VB, Java, COM, XML, ASP, JSP, VBScript, HTML, CSS

Analytics: Tableau, BI 4.0 (WEBI, Explorer, Dashboard, Crystal Reports), LUMIRA, BEx Analyzer, BEx Query Designer, Microsoft SQL Server Reporting (SSRS)

SAP Releases: CRM, SAP BW 7.3, SAP ECC 7.0, ECC 6.0, ECC 5.0, R/3 4.7, 4.6c, SAP Solution Manager

Functional: SAP BI/BW, Data Flow and DB tables of CRM, PM, MM, SD and FI Modules, NetSuite Functional/Tables

Platforms: Windows (10, Vista, XP, 2003, 2000, NT, 95/98, Me), Ubuntu, RHEL

Version Control: HANA Repository, Git, Visual SourceSafe, Rational ClearCase, CVS, Subversion (TortoiseSVN)

PROFESSIONAL EXPERIENCE

Confidential

HANA / SAP Data Services Architect, Big Data

Responsibilities:

  • Built complex models in HANA using SQL stored procedures and SQLScript in HANA calculation views to derive the Installed Product Foundation, replacing the processing that existed in the legacy Teradata system
  • Worked primarily with Install Base and Service Agreements data to derive metrics for cross-sell and upsell opportunities for Cisco products. Replaced a legacy system built in Teradata with a new data warehouse in HANA
  • Record counts for some of the transaction data are in the billions. Used hash partitioning of tables in HANA, since record counts exceed 2 billion for most of the transaction tables, and designed SQL stored procedures to process the data in partitioned chunks (a sketch of this pattern follows this list)
  • The ERP system of record for Confidential, Inc. is Oracle ERP. Transaction history is archived in Hadoop (MapR distribution) in the Hive database. The HDFS footprint per transaction table averages about 3-4 terabytes, with some tables holding more than 10 billion records
  • The major challenge was to bring this big data into HANA for fast analytics using SAP BODS. Because the Hive connector delivered with BODS was new and limited in functionality, importing data from Hadoop Hive posed many limitations and challenges
  • Architected and developed a combination of parallelization techniques in Data Services, partitioning in Hadoop, hash partitioning in HANA, and SQL transforms to query data from the Hive source in order to move the data at a fast transfer rate. Scaled up memory on both Data Services and Hadoop to support the parallelization, achieving a transfer rate of 18 million records per minute
  • Used Python to validate record counts and distinct values per table column between the Hive source and HANA target after the data import (see the validation sketch below)
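
A minimal sketch of the hash-partitioning and chunked-processing pattern described above, assuming the hdbcli Python client and placeholder connection details, table, and procedure names (the production logic lived in HANA SQL stored procedures):

    # Illustrative sketch only: hash-partition a large HANA table and process it in key-range chunks.
    from hdbcli import dbapi  # SAP HANA Python client

    conn = dbapi.connect(address="hana-host", port=30015, user="USER", password="PASS")  # placeholders
    cur = conn.cursor()

    # Hash partitioning keeps each partition below HANA's ~2 billion rows-per-partition limit.
    cur.execute("""
        CREATE COLUMN TABLE SALES_TXN (
            TXN_ID     BIGINT,
            PRODUCT_ID BIGINT,
            AMOUNT     DECIMAL(15,2)
        ) PARTITION BY HASH (TXN_ID) PARTITIONS 16
    """)

    # Process the data in bounded key ranges rather than one massive statement,
    # mirroring the chunked stored-procedure design described above.
    cur.execute("SELECT MAX(TXN_ID) FROM SALES_TXN")
    max_id = cur.fetchone()[0] or 0
    chunk_size = 100_000_000
    low = 0
    while low <= max_id:
        # DERIVE_INSTALL_BASE_CHUNK is a hypothetical procedure standing in for the real transformation.
        cur.execute("CALL DERIVE_INSTALL_BASE_CHUNK(?, ?)", (low, low + chunk_size - 1))
        conn.commit()
        low += chunk_size

    cur.close()
    conn.close()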
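
The validation step called out in the last bullet could look roughly like the following, assuming the pyhive and hdbcli drivers and placeholder hosts, credentials, and table/column lists:

    # Illustrative sketch only: compare row counts and per-column distinct counts between Hive and HANA.
    from pyhive import hive
    from hdbcli import dbapi

    hive_conn = hive.Connection(host="hive-host", port=10000)               # placeholder
    hana_conn = dbapi.connect(address="hana-host", port=30015,
                              user="USER", password="PASS")                 # placeholders

    tables = {"sales_txn": ["txn_id", "product_id"]}   # table -> columns to spot-check (hypothetical)

    def scalar(conn, sql):
        """Run a single-value query and return the result."""
        cur = conn.cursor()
        cur.execute(sql)
        value = cur.fetchone()[0]
        cur.close()
        return value

    for table, columns in tables.items():
        src = scalar(hive_conn, f"SELECT COUNT(*) FROM {table}")
        tgt = scalar(hana_conn, f"SELECT COUNT(*) FROM {table}")
        print(f"{table}: rows hive={src} hana={tgt} match={src == tgt}")
        for col in columns:
            src = scalar(hive_conn, f"SELECT COUNT(DISTINCT {col}) FROM {table}")
            tgt = scalar(hana_conn, f"SELECT COUNT(DISTINCT {col}) FROM {table}")
            print(f"{table}.{col}: distinct hive={src} hana={tgt} match={src == tgt}")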

Confidential

HANA Architect/Developer, Big Data

Responsibilities:

  • The ERP system of record for Confidential is NetSuite. The existing legacy system used ‘saved searches’ (compiled queries in NetSuite) created by power users of the Sales and Distribution division. These saved searches generated flat files that were then imported into Microsoft SQL Server using SSIS, and SSAS cubes built on the SQL Server data fed Excel reporting.
  • The objective of the project was to create a new data warehouse in HANA to replace the legacy system described above.
  • The ‘saved searches’ are not transparent, so a high degree of reverse engineering was necessary to identify the underlying tables, columns, and correct joins that form the base of the standalone warehouse in HANA
  • Architected and developed the HANA standalone data warehouse from scratch for the Sales and Distribution division.
  • Used Informatica (existing) and BODS (new implementation) to design and implement the ETL: data is imported from NetSuite (ERP data), SQL Server (a subset of master data), and Zyme (channel data management software) flat files from Confidential distributors covering inventories procured and sold, into the target data warehouse on HANA
  • Architected and developed the complex transformations on this base data in HANA using SQL, stored procedures, and calculation views to derive the metrics for the Sales and Distribution division (an illustrative aggregation follows this list)
  • The derived metrics are then consumed by BOBJ and Tableau reports
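
The production transformations ran in Informatica, BODS, and HANA SQL; the following Pandas sketch, with hypothetical file and column names, only illustrates the kind of channel-inventory aggregation derived from the distributor flat files described above:

    # Illustrative sketch only: aggregate distributor sell-in/sell-through from channel flat files.
    import pandas as pd

    # Hypothetical Zyme-style extract: one row per distributor, product, and week.
    channel = pd.read_csv("channel_inventory.csv", parse_dates=["week_ending"])   # placeholder file

    metrics = (
        channel
        .groupby(["distributor_id", "product_id", pd.Grouper(key="week_ending", freq="W")])
        .agg(units_procured=("units_procured", "sum"),
             units_sold=("units_sold", "sum"))
        .reset_index()
    )
    metrics["inventory_delta"] = metrics["units_procured"] - metrics["units_sold"]
    print(metrics.head())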

Confidential

HANA Architect/Developer, Big Data

Responsibilities:

  • Worked primarily for the GCS (Global Customer Support) department, which relies heavily on metrics developed in HANA using CRM data from Salesforce, SAP, and MySQL databases
  • Solely responsible for architecting, developing, and deploying CRM data marts for all sub-departments within GCS
  • ETL - imported CRM data from Salesforce, NetSuite (legacy), MySQL, SAP, Sybase IQ, and flat files into HANA using both BODS and MuleSoft
  • Used SDA (Smart Data Access) to integrate data from remote sources into HANA models
  • Extensive use of stored procedures (both read-only and read-write) and HANA views (attribute, analytic, and calculation views)
  • Developed dashboards for GCS external partner data, with HANA views as the backend
  • Although the Salesforce backend stores date-times in GMT, the BODS driver for Salesforce automatically converts them to PST on import. The GCS department needed its metrics in GMT, and the BODS driver had no settable parameter to deliver GMT. Converting PST back to GMT produces erroneous data around the daylight-saving transitions (illustrated in the sketch after this list). So, used MuleSoft to connect to Salesforce and bring in the data in GMT, with PST derived in the data transformations, since there is no conversion loss going from GMT to PST. This effort required dropping and reloading all Salesforce tables in HANA.
  • Retrofitted the entire existing metrics code base (HANA stored procedures and HANA views) from PST to GMT
  • Developed the GCS metrics on the big data platform, since the new direction from management was to move from HANA to big data; built the POC for bringing the CRM data into Hadoop
  • GCS metrics in Hadoop: imported data from Salesforce, HANA, and MySQL into HDFS/Hive (data warehousing for big data) using Sqoop, exposed the data in Impala (fast analytics for big data), and redeveloped the GCS data marts using SQL in Impala.
  • Built analytics on the Impala data in Tableau (and QlikView)
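
A short Python illustration of why the PST-to-GMT back-conversion described above is lossy around daylight-saving transitions (the actual fix was re-extracting GMT data through MuleSoft, not this script):

    # Illustrative sketch only: two distinct UTC instants collapse to the same Pacific wall-clock time
    # when the clocks fall back, so a local-to-UTC conversion cannot recover the original instant.
    from datetime import datetime, timezone
    from zoneinfo import ZoneInfo

    pacific = ZoneInfo("America/Los_Angeles")

    # On 2021-11-07 the clocks fall back at 02:00 PDT to 01:00 PST.
    utc_a = datetime(2021, 11, 7, 8, 30, tzinfo=timezone.utc)   # corresponds to 01:30 PDT
    utc_b = datetime(2021, 11, 7, 9, 30, tzinfo=timezone.utc)   # corresponds to 01:30 PST

    local_a = utc_a.astimezone(pacific)
    local_b = utc_b.astimezone(pacific)
    print(local_a.strftime("%Y-%m-%d %H:%M"), local_b.strftime("%Y-%m-%d %H:%M"))  # both print 01:30

    # Stripped of offsets (what a driver-converted column effectively stores), the two timestamps
    # are identical, so converting back to GMT mislabels one hour of data every fall transition.
    print(local_a.replace(tzinfo=None) == local_b.replace(tzinfo=None))  # True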
