Product Architect - Senior Hadoop/Big Data-Cloud Resume

PROFESSIONAL SUMMARY:

  • Certified Senior Solution Architect / Senior Technical Manager / Technical Delivery Manager / Product Architect with over 16 years of IT experience spanning Cloud (AWS, Azure, GCP), Hadoop, Spark, Big Data analytics, DevOps, EDW, EAI, ETL, SAP HANA/BI/BW/BODS, and Hadoop integration with SAP VORA.
  • Architected, designed, developed, and managed multiple go-live implementations on Enterprise Data Warehouse, Analytics, ERP, Business Intelligence, and Big Data/Hadoop ecosystems, leveraging environment scalability, open-source tools and libraries, NoSQL databases, and cloud (SaaS/PaaS/IaaS/FaaS) features to handle massive data in structured, unstructured, streaming, IoT, batch, and blockchain formats.
  • Blueprinting enterprise data warehousing, cloud deployment, and analytics solutions: vision, data modeling, integration, and data governance in line with Ralph Kimball/Inmon approaches, Hadoop data lake design (capacity, modeling), Lambda Architecture, TOGAF, SOA, DevOps, and SAP's LSA++ methodologies. Hands-on coding and management of enterprise data warehouses, data marts, data quality, data lineage, data retention, data audit control and maintenance, master data management, cross-platform integration with ERP systems, solution optimization, admin activities, end-to-end process automation/scheduling, business intelligence reporting, dashboarding, analytics, machine learning, and AI.
  • In-depth experience in technical solution blueprinting, architecture, design, development, and administration, along with delivery, support, and DevOps practices that ensure seamless development and automated deployments; leading large, diverse teams under Agile/Scrum and ITIL methodologies; ensuring planned solutions integrate effectively with all domains in both business and technical environments.
  • Extensive expertise in designing and handling ETL for EDW/EDI with BI, Data Services, large SAP ERP Business Warehouses, OLAP cubes, HANA, data modeling, and reporting integrated with non-ERP, OLTP, and OLAP systems; analyzing legacy systems for data migration, gap analysis, and application migration; SAP integration with Hadoop on the VORA platform.
  • Working with business and technical teams on requirement assessment and on submission or evaluation of RFIs/RFQs/RFPs for new products; identifying opportunities and providing recommendations on process integration across lines of business; proposal writing and choosing the optimal technology stack for business use cases and requirements. Good exposure to pre-sales.
  • Designed and built ingestion, transformation, and consumption mechanisms on data lakes both on-premises and in the cloud. Defined and implemented the data journey across all layers, establishing phases of data persistence and data consumption patterns while ensuring data security for Personally Identifiable Information (PII), non-PII, and other sensitive data in the lake and the cloud. Worked closely with vendors and consultants to keep solutions heading in the right direction.
  • Good exposure to security architecture, design, and admin activities in setting up suitable platforms for solutions across the organization; aligning and adjusting to the needs of each business solution while ensuring CI/CD processes.
  • In summary: started my professional journey as an enterprise ETL-BI developer and progressed to designing and architecting enterprise data warehouses and BI, near-real-time, ERP, Big Data, data lake, and cloud application platforms with cross-platform integration; technical solutioning and managing projects end to end from project management, release stabilization, and delivery standpoints. Synchronizing business and technology expectations is my core strength.

TECHNICAL SKILLS:

HADOOP Eco Systems: Cloudera, Hortonworks, IBM BigInsights

Cloud Platforms: AWS, Azure, Google Cloud Platform, IBM Bluemix

BIGDATA Tools & Data Formats: HDFS, Apache Spark, Hive, Apache Sqoop, Apache Pig, Apache Storm, Kafka, Flume, Spark Streaming, Apache NiFi, Samza, MapReduce, Tez, BigSQL, Impala, Apache Parquet, Apache Avro, ORC, Oozie, Apache Airflow, Cloudera Data Steward Studio, Diyotta ETL

NoSQL DBs: HBase, Cassandra, Apache Phoenix, MongoDB, Redis, DynamoDB, DocumentDB

Databases & Appliances: SAP HANA, HP Vertica, Oracle, Teradata, Sybase, MS SQL Server, DB2, Netezza, IBM InfoSphere MDM

Enterprise ETL Tools: DataStage 7.5.x/8.x/9.x/11.x PX Edition (on Hadoop edge node), Informatica, SAS, SSIS/SSAS, Pentaho Data Integration, Talend, IBM InfoSphere Data Quality, Data Steward, IGC (Information Governance Catalog - data governance and lineage), Snowflake (cloud-based data warehouse)

Enterprise Reporting Tools: MicroStrategy, Cognos 10/11.x, Cognos Analytics, Tableau, QlikView, SSRS, Power BI

Enterprise Modeling Tools: UML, CA ERwin, SAP HANA Modeler, SAP BOBJ IDT, BPMN 2.0

Data Lineage: IBM InfoSphere IGC (Information Governance Catalog), Apache Atlas (Hadoop side), Collibra

SAP BW Modeling & ETL: SAP BI/BW 7.x-7.4, SAP BW/4HANA, SAP HANA, SAP BODS 4.x, Information Steward, SAP VORA-Hadoop integration, SAP Smart Data Access, SAP ECC 6.0, S/4HANA, SAP Fiori

Operating Systems: UNIX, Linux, Microsoft Windows

DevOps-Virtualization: Docker, Kubernetes, Vagrant, Jenkins, GIT

Programming Languages: PL/SQL, UNIX Shell Scripting, Core Java, Scala, Python

Server Side Technologies: HTML, DHTML, Core Java.

Scheduling Tools: Apache Airflow, AUTOSYS, Control-M, TIDAL, Unix-Crontab, AQUA, Oozie

Testing Exposure & Tools: Big Data testing, ETL testing, SAP testing, SoapUI, REST API testing, HP Quality Center, manual testing, API testing, database testing

Build & Versioning Tools: SBT, Eclipse, SCCS, Tortoise SVN, HERMES, TFS, GitHub, Bitbucket

Teradata & Oracle Utilities: Export, Import, SQL*Loader, FastLoad, MultiLoad, FastExport, TPump, BTEQ

Project Management Tools & Methodologies: MS Project, Visio, JIRA, IBM InfoSphere Data Architect, Scrum, Lean-Agile

PROFESSIONAL EXPERIENCE:

Confidential

Product Architect - Senior Hadoop/BigData-Cloud

Responsibilities:

  • Architecting the restructuring of the current product base, running on SQL Server and C++, onto a Hadoop data lake leveraging NoSQL databases.
  • Blueprinting the high-level approach for the transition to the Big Data world; deriving impact points and brainstorming with the existing customer base to project future data needs and processing capacity.
  • Weighing decision factors and priorities, and technically designing a flexible, feasible solution achievable in a time-bound, agile way.
  • Choosing the right tools for the requirements and designing common components for enterprise-wide use.
  • Architecting, defining, and designing the solution flow, from greenfield implementation, migration strategy, and integration aspects through automation and go-live.
  • Defining the migration strategy from the existing data warehousing ETL platform to a Hadoop lake for large customers and a scalable NoSQL DB for mid-range customers.
  • Architecting data zoning, access patterns, security, consumer data points, DR, and hybrid-cloud implementations (Azure HDInsight, Azure Data Lake, AWS, AWS DevOps, Google Cloud) to keep customer deployments of the product up and running.
  • Security, production planning, solution automation, DevOps, release automation, and implementing a scheduler at the enterprise level (see the scheduler sketch after this section's environment list).
  • Part of the core architecture group responsible for the design and architecture of the new product line on Hadoop and cloud; currently rolling out Consolidated Audit Trail (CAT) reporting for orders, routes, and trades across clients. The streaming ingestion sketch below illustrates the pattern.
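
A minimal PySpark Structured Streaming sketch of the Kafka-to-raw-zone ingestion pattern described above; the topic, broker, and lake paths are illustrative assumptions, and the spark-sql-kafka connector is assumed to be on the classpath:

```python
# Hypothetical sketch: stream CAT order events from Kafka into the raw zone.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("cat-order-ingest").getOrCreate()

# Read order/route/trade events as they arrive on Kafka.
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "cat.orders")
          .load())

# Land the payload untouched in the raw zone; downstream zones apply
# schema and business rules, preserving the data-zoning design.
query = (events.select(col("key").cast("string"),
                       col("value").cast("string"),
                       col("timestamp"))
         .writeStream
         .format("parquet")
         .option("path", "/lake/raw/cat_orders")
         .option("checkpointLocation", "/lake/_checkpoints/cat_orders")
         .trigger(processingTime="1 minute")
         .start())
query.awaitTermination()
```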

Environment: Hadoop - Hortonworks (HDF), C#, SQL Server, MySQL, RabbitMQ, JIRA, Agile/Scrum, Kafka, Apache Spark, Cassandra, MongoDB, Redis, Apache NiFi, Kylo ETL, Apache Phoenix, Druid, Azure HDInsight, Apache Airflow, Google Cloud Platform (2 POCs), Cloudera Data Steward Studio, DevOps, Cloud Foundry, cloud security, CI/CD, AWS, AWS DevOps, QlikView
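
For the enterprise-level scheduler mentioned above, a minimal Apache Airflow sketch; the DAG id, schedule, and job commands are assumptions, and Airflow 2.x import paths are used:

```python
# Hypothetical nightly pipeline: ingest to the raw zone, then conform.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="lake_nightly_loads",            # assumed DAG name
    start_date=datetime(2021, 1, 1),
    schedule_interval="0 2 * * *",          # nightly at 02:00
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=10)},
) as dag:
    ingest = BashOperator(
        task_id="ingest_raw",
        bash_command="spark-submit /jobs/ingest_raw.py",   # assumed job path
    )
    conform = BashOperator(
        task_id="conform_zone",
        bash_command="spark-submit /jobs/conform.py",
    )
    ingest >> conform   # conform runs only after ingestion succeeds
```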

Confidential

Senior DWH-ETL-BI-BIG DATA Solution Architect - HADOOP

Responsibilities:

  • Led and designed Hadoop ingestion patterns for the GTB data initiative.
  • Performed two POCs with two of Confidential's distinct Hadoop lakes (Enterprise Data Lake - EDL, and TenX), weighing options to move forward with one of the lakes to hold the data.
  • Designed and implemented ingestion patterns with agreement from all stakeholders, handling data on multiple fronts.
  • Tracking back to the true source to get raw data, defining effective extract patterns, and feeding the ingestion mechanism while ensuring transparency in data lineage and business lineage.
  • Providing technical road maps, technology feasibility assessments, and required technical expertise; clearing ambiguities around implementation, results, and outcomes.
  • Laying down data zoning, data journey, lineage, transformation, and business intelligence best practices.
  • Producing reusable designs and code for teams to replicate, and assisting them in clearing roadblocks with the technologies and tools above (see the ingestion sketch after this list).
  • Hands-on with code and design; handling release-management activities, including code migration and automation.
  • Providing recommendations for process integration across lines of business or business capabilities.
  • Collaborating with the EA team on enterprise architecture best practices and business solutions.
  • Taking on new technology areas with upfront POCs to provide technical feasibility analysis and benchmarking.
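
A hypothetical sketch of the reusable ingestion pattern described above, in PySpark: every feed lands through a single function that stamps audit and lineage columns. Function, column, connection, and path names are illustrative, not the project's actual code:

```python
# Reusable landing function: one ingestion path for every source feed.
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql.functions import current_timestamp, lit

spark = SparkSession.builder.appName("gtb-ingest").getOrCreate()

def ingest_feed(df: DataFrame, source_system: str, zone_path: str) -> None:
    """Land a source extract in the lake with audit/lineage columns."""
    (df.withColumn("src_system", lit(source_system))   # business lineage tag
       .withColumn("load_ts", current_timestamp())     # audit timestamp
       .write.mode("append")
       .partitionBy("src_system")
       .parquet(zone_path))

# Example: land a true-source extract pulled over JDBC.
raw = (spark.read.format("jdbc")
       .option("url", "jdbc:postgresql://src-host:5432/gtb")  # assumed source
       .option("dbtable", "payments")
       .option("user", "etl").option("password", "***")
       .load())
ingest_feed(raw, "payments_core", "/lake/raw/payments")
```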

Environment: Hortonworks Hadoop, Apache Spark, Scala, Apache Tez, IBM InfoSphere DataStage 11.x PX, IBM InfoSphere Information Governance Catalog (data governance), IBM InfoSphere Data Architect, Cognos Analytics, Hive, Sqoop, Kafka, Docker, Apache Atlas, PostgreSQL, Apache Airflow, Cassandra, Tableau, Lean-Agile, Scrum, Diyotta ETL, Snowflake cloud-based EDW (POC)

Confidential

Snr DWH-ETL-BI-BIG DATA Solution Architect /Snr Manager - HADOOP

Responsibilities:

  • Architected, designed, and coded a Hadoop end-to-end implementation in two phases, taking a POC through to a production solution following the Lambda Architecture for both streaming and batch modes.
  • Modeled and designed the data warehouse using Hive/BigSQL; used Sqoop to transfer data from existing legacy systems into HDFS in the specific formats required, based on volumes. Wrote Pig scripts to transform data for certain user requirements.
  • Wrote Hive UDFs in Core Java for certain user requirements, and integrated Spark with Hive to operate Spark in the Hive context.
  • Used Flume to capture network-traffic streaming data into HDFS, and created partitioned Hive/BigSQL tables for access by downstream systems.
  • Aligned incoming streams through Kafka and passed them to Flume as topic-specific feeds for better grouping of data.
  • Defined and managed Hadoop external partitioned tables created on Parquet files landed in HDFS, and ensured the latest partitions are recognized by using Hive functions to refresh the metastore (see the sketch after this list).
  • Managed the data partitions and the data lake with hot and cold data for data analytics and data scientists.
  • Used Datameer and Tableau for dashboarding and visualizations.
  • Set up a plain-vanilla sandbox for testing sample data and performed a POC comparing processing times with Spark to better demonstrate the processing difference; used Scala in Spark coding to reduce code lines and complexity.
  • Worked with Hadoop admins to optimize the environment based on requirements and fine-tune the components.
  • Fine-tuned system performance parameters and was involved in integrating Hive with Spark to have the Hive context in Spark.
  • Used DataStage ETL for other diverse sources, with master data cleansed and landed into HDFS for BI reporting purposes.
  • Sample testing using Hue; managed teams across platforms and geos following Scrum and Agile methodologies.
  • Created and updated micro and macro documents, performed code reviews, and led a team of 17 members.
  • Facilitated the RFP process: liaising with SMEs to write the key business and technical portions of the RFP, evaluating RFP responses, and communicating with procurement staff to facilitate timely resourcing and project advancement.
  • Worked on a POC setting up AWS, Spark Streaming, Storm, and Samza for better message channeling, storing in Cassandra with Spark processing the data-access requests for faster accessibility.
  • Designed Oozie workflows to automate and schedule jobs in Hadoop.
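
As referenced in the external-partitioned-tables bullet above, a minimal PySpark sketch of that pattern; the table name, schema, and HDFS location are assumptions:

```python
# Hypothetical external table over Parquet files landed by Flume/Kafka.
from pyspark.sql import SparkSession

# enableHiveSupport() gives Spark the Hive context mentioned above.
spark = (SparkSession.builder
         .appName("traffic-partitions")
         .enableHiveSupport()
         .getOrCreate())

spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS net_traffic (
        src_ip STRING,
        dst_ip STRING,
        bytes  BIGINT
    )
    PARTITIONED BY (load_dt STRING)
    STORED AS PARQUET
    LOCATION '/lake/raw/net_traffic'
""")

# After new files land under load_dt=... directories in HDFS, refresh the
# metastore so queries recognize the latest partitions.
spark.sql("MSCK REPAIR TABLE net_traffic")
```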

Environment: Hadoop (IBM BigInsights v4.1), BigSQL, IBM InfoSphere Streams, Redis, Lambda Architecture, IBM InfoSphere DataStage ETL, Cognos 10, data modeling, SAS, UNIX shell scripting, Oozie scheduler, Pentaho, Apache Parquet, Cloudera, Hive, Apache Pig, Sqoop, Flume, Kafka, Spark, Scala, Datameer, Tableau, Cassandra, MongoDB, AWS, Agile/Scrum, JIRA

Confidential

EDW-ETL-Big Data Solution Architect

Responsibilities:

  • Architected, designed, coded, and led geographically distributed teams on a Hadoop end-to-end implementation with ETL-BI aspects.
  • Started with a POC and obtained approval for productionization.
  • Responsible for implementing DWH and BI systems using open-source Hadoop technologies to carve out operations on huge data volumes from a variety of sources, such as social media and proprietary DB systems.
  • Cleansed master data relating to address data using the Address Verification Interface (AVI) in DataStage, providing clean and reliable data to downstream systems.
  • Used Spark and Tez to process data in the context of user requirements.
  • Used HBase (NoSQL) to store legacy DB documents in key/value format (see the sketch after this list).
  • Performed admin activities; worked with testing teams to ensure smooth delivery and integration.
  • Worked on a POC setting up a simple AWS Hadoop cluster.
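
A hypothetical sketch of the HBase key/value document storage described above, using the happybase Thrift client; the host, table, and column-family names are assumptions:

```python
# Store and fetch legacy documents in HBase as key/value pairs.
import happybase

connection = happybase.Connection("hbase-thrift-host")  # assumed Thrift host
table = connection.table("legacy_docs")                 # assumed table

# Row key = legacy document id; the body and source system share one
# column family.
table.put(b"DOC-000123", {
    b"d:body": b"<original document bytes>",
    b"d:src":  b"legacy_db",
})

# Point lookup by document id.
row = table.row(b"DOC-000123")
print(row[b"d:src"])
```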

Environment: Hadoop (Cloudera), HDFS, Hive, Pig, Spark, Sqoop, Apache Tez, Tableau, UNIX, shell scripting, HBase (NoSQL), AWS, Oozie, IBM InfoSphere DataStage, AVI master data quality, SoapUI testing, IBM InfoSphere MDM, Scrum/Agile, DevOps, SAP BODS, SAP ECC, SAP BOBJ

Confidential

EDW-ETL-Big Data Solution Architect

Responsibilities:

  • End-to-end project road mapping/blueprinting, data modeling, and data acquisition strategies: source analysis/validation (structured and unstructured data), transformation strategies, staging, target, and downstream dependencies; coding complex designs and reusable/common components; leading teams across geos and guiding them on the how-tos.
  • Ensure that planned solutions integrate effectively with all domains in both business and technical environments.
  • Involved in designing BI report and dashboard requirements.
  • Involved in leveraging Big Data open-source libraries like Hive, Spark, and Sqoop to operate on HDFS in the Hadoop environment.
  • Involved in providing required designs to BI teams for end-user reporting.
  • Worked on integration aspects of various systems and managed diverse systems to operate in sync to yield the required results.
  • Propose, develop, and review EA and business architecture documents.
  • Making formal presentations to senior management; organizing and conducting meetings.

Environment: IBM InfoSphere DataStage ETL, AVI, SQL Server, SSRS, IBM InfoSphere MDM, UNIX, shell scripting, JIRA (Scrum/Agile), Hadoop (Cloudera), Hive, Impala, Hue, Spark, Sqoop, Tableau, AWS (POC), RESTful API, SoapUI, DevOps

Confidential

Integration Architect

Responsibilities:

  • SAP BW: data modeling; activating business content as per requirements; defining basic structures (InfoObjects and targets: DSOs, InfoCubes, MultiProviders, InfoSets) to load modeled data through DTPs and transformations; writing ABAP routines in transformations (start, end, field, and expert routines), master-data look-ups, filtering data from the source package, and using function modules for various conversions; defining report criteria and required variables; ensuring data is extracted from ECC, modeled per business requirements, and sent to reporting, SAP HANA, and external systems using Open Hub destinations.
  • Used CompositeProviders, TransientProviders, and HANA-powered DSOs/cubes in SAP BW 7.3/7.4 on HANA for an agile way of reporting.
  • SAP HANA data modeling using attribute views, analytical views, calculation views, CE functions, HANA SQL, and procedures (see the sketch after this list).
  • SAP BOBJ: Design Studio, WebI, UDT/IDT, Crystal Reports.
  • Used SAP BusinessObjects Data Services (SAP BODS ETL) to migrate data from legacy systems to SAP, applying Data Services transforms and quality transforms on data migration projects.
  • Third-party data from external systems is cleansed by IBM ETL and fed to SAP BW systems, automated using process chains and DataStage sequencers.
  • Used MicroStrategy as the BI tool for third-party external vendor reports.
  • Used IBM DataStage ETL to migrate the existing legacy application running on server jobs to the parallel version and integrate it with the SAP system.
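
A hedged illustration of consuming one of the HANA calculation views mentioned above from Python, using SAP's hdbcli client; the view name, package path, columns, and credentials are assumptions:

```python
# Query a HANA calculation view the way a downstream consumer might.
from hdbcli import dbapi

conn = dbapi.connect(address="hana-host", port=30015,
                     user="REPORT_USER", password="***")  # assumed login
cur = conn.cursor()

# Activated calculation views are exposed under _SYS_BIC and queried
# like tables.
cur.execute("""
    SELECT region, SUM(net_sales)
    FROM "_SYS_BIC"."sales.models/CV_SALES"
    GROUP BY region
""")
for region, net_sales in cur.fetchall():
    print(region, net_sales)
conn.close()
```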

Environment: SAP BW 7.x, SAP HANA DB, SAP BOBJ 4.x, IBM InfoSphere DataStage 9.2 PX ETL, InfoSphere MDM, QlikView, SQL Server, SAP BODS, Crystal Reports

Confidential

Senior Technical SAP BW-HANA-BODS-BOBJ-EDW Project Lead/Manager

Responsibilities:

  • Blueprinting/architecting DWH solutions, data modeling, and physicalization of designs.
  • Micro and macro designs; creating specs using Information Analyzer and providing them to the teams for development.
  • Managed a team of 40 across geos with end-to-end responsibilities from data acquisition through reporting, including integration and automation of all project components.
  • Ensured environment stability by working with admins, and handled release-management activities for all code migrated to UAT and PROD environments.
  • Generated project phases and status reports using MS-project.

Environment: SAP HANA, SAP BW, SAP BODS, SAP BOBJ, QlikView, DataStage 8.5 ETL (DataStage, IBM InfoSphere FastTrack, Information Analyzer), UNIX shell scripting, Netezza, MS Project, Excel, IBM InfoSphere MDM

Confidential

Senior Technical SAP BW-HANA-BODS-BOBJ-EDW Project Lead/Manager

Responsibilities:

  • Integrated SAP BW and BOBJ.
  • Led SAP BEx, BOBJ, and Crystal Reports work for financial end clients.
  • Used DataStage to transform heterogeneous sources and load into Oracle for reporting.
  • Fixed master data quality issues and loaded to end systems for data consistency using DataStage.
  • Automated the flow end to end in sequencers and process chains.
  • Fixed issues based on JIRA tickets and assigned some tickets to offshore teams.

Environment: SAP BW 7.x, BEx, WebI 3.1 reports, IBM DataStage 8.x PX ETL, Oracle, UNIX, shell scripting, JIRA, AutoSys, Cognos

Confidential

Offshore Project Lead - SAP BW-BI-BODS

Responsibilities:

  • Responsible for migrating a legacy system from an older version to a newer version of SAP BW, with the underlying database migrated to SAP HANA.
  • Responsible for data migration using Data Services and SAP LT (SLT) replication.
  • Handled teams, following the LSA++ architecture for project implementation.
  • Involved in designing reports in BusinessObjects and SAP HANA Studio.

Environment: SAP BI/BW 7.3, BEx reporting, SAP BODS ETL, Crystal Reports, SAP ECC 6.0, SAP HANA, BusinessObjects

Confidential

Sr Data Warehousing ETL-BI Technical Lead - Release/Configuration Manager

Responsibilities:

  • Designed and coded ETL jobs and BI reports, maintained existing system jobs on AutoSys, and fine-tuned them for optimal performance.
  • Served as release manager across the platforms, managing branches and trunks and deploying code to different environments based on schedules.
  • Coded SSIS workflows and SSRS reports for a separate module.
  • Tested and integrated with the existing flow, while leading freshers and training them on job monitoring, escalation, and initial problem identification and resolution mechanisms.
  • Acted as the bridging lead between core development, support, and testing teams, coordinating across all stakeholders for releases and application enhancements.

Environment: DataStage ETL (PX/EE), MicroStrategy, UNIX, shell scripting, Sybase, SQL Server, AutoSys, SSIS, SSRS, Tortoise SVN, builds & deployments

Confidential

Sr Enterprise EDW Lead Analyst

Responsibilities:

  • Involved in Data modeling, logical designs
  • Design ETL jobs and Cognos reports
  • Wrote complex UNIX scripts for integration design aspects.
  • Wrote SQL to filter data at the source level and optimize performance at the ETL layer.
  • Defined the landing and staging criteria and extrapolated targets for business requirements.
  • Implemented change data capture on dimensions (see the sketch after this list).
  • Designed data marts for subject-area reporting and analysis.
  • Designed low-level, high-level, and gap-analysis documents with a continuous deploy-and-improve framework mechanism.
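
A minimal sketch of the change-data-capture step referenced above, expressed as an SCD Type 2 update in PySpark for brevity (the project itself used DataStage ETL; the dimension name, tracked column, and paths are assumptions):

```python
# Expire changed dimension rows and open new current versions (SCD2).
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, current_date, lit

spark = SparkSession.builder.appName("dim-cdc").getOrCreate()

dim = spark.read.parquet("/dw/dim_customer")          # current dimension
src = spark.read.parquet("/stage/customer_extract")   # today's extract

current = dim.filter(col("is_current") == lit(True)).alias("d")
incoming = src.alias("s")

# Rows whose tracked attribute changed since the current version.
changed = (current.join(incoming,
                        col("d.customer_id") == col("s.customer_id"))
           .filter(col("d.address") != col("s.address")))

# Expire the superseded version of each changed row...
expired = (changed.select("d.*")
           .withColumn("is_current", lit(False))
           .withColumn("end_date", current_date()))

# ...and open a new current version from the incoming attributes.
new_rows = (changed.select("s.customer_id", "s.address")
            .withColumn("is_current", lit(True))
            .withColumn("start_date", current_date())
            .withColumn("end_date", lit(None).cast("date")))

# Untouched history plus `expired` and `new_rows` is then written back.
```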

Environment: ETL-DataStage (PX / EE), Oracle, UNIX, Shell Scripting, Cognos, ERWIN Data modeling, SQL SERVER, BOBJ

Confidential

Sr. IT Consultant-ETL-BI

Responsibilities:

  • Fine tune ETL code for optimal performance
  • Analyzed existing code and developed the equivalent in DataStage ETL using a variety of stages.
  • Tuning at Source, transformation and target level to achieve max throughput and parallelism.
  • Re-engineered legacy PRO*C code and designed equivalent DataStage code to meet the client requirement.
  • Used Teradata utilities for data migration, and worked on Crystal Reports and MicroStrategy reports for separate divisions.

Environment: DataStage 7.5 PX (ETL-BI), Teradata, UNIX, shell scripting, Pro*C, Crystal Reports, MicroStrategy

Confidential

Software Engineer

Responsibilities:

  • Coded Export, Import, SQL*Loader, FastLoad, MultiLoad, FastExport, TPump, and BTEQ scripts for mass migration of data to Teradata.
  • Used DataStage ETL for transformations and automated the process flow using sequencers.
  • Developed code per the mapping sheets and specifications provided, unit tested, documented the unit-test results for review, and passed the code on to further testing.
  • Supported three applications' jobs alongside development tasks.

Environment: DataStage 7.5.x (parallel), UNIX, shell scripting, MicroStrategy, Teradata

Confidential

ETL BI- Developer - Analyst

Responsibilities:

  • Code as per specification and mapping sheet.
  • Discuss with team lead and get the directions relating to code and testing.
  • Wrote PL/SQL procedures, functions, triggers, and complex SQL scripts.
  • Developed DataStage jobs using all the transformational stages, performing testing at the source, transformation, and target levels.
  • Interacted with senior leads on BI report specifications and designed BusinessObjects reports from targets provided by DataStage ETL.

Environment: DataStage 7.5.x, Oracle 9i, PL/SQL, UNIX, shell scripting, BusinessObjects
