
Big Data Architect Resume


Dallas, TX

SUMMARY:

  • Professional with 20+ years of experience in Consulting and System Integration
  • 16+ years of data and information management experience, having successfully accomplished:
  • Architecting multiple Big Data, Business Intelligence, Data Analytics, Data Integration, Data Modeling, Data Rationalization, Visualization, and Data Virtualization solutions for top Fortune 500 customers
  • Designed and built numerous Enterprise Data Warehouses, Operational Data Stores, Near Real-time Data Stores, and Data Marts for the financial services, K-12, higher education, and consumer services industries
  • Designed and developed multiple batch and real-time Enterprise Data Lakes using Apache Hadoop tools, Data Warehouse Optimization using Amazon Redshift and Hadoop tools, Customer 360-degree views using relational, Big Data, and NoSQL technologies, and in-memory computing and Data Analytics using cloud technologies
  • Designed multiple Data Management initiatives including Enterprise Data Quality, Data Governance, Metadata Management, Master Data Management, Data Masking, Test Data Management, Data Retention/Archiving, and other Information Lifecycle Management programs
  • Competent in envisioning and designing industry-specific solutions and future-state architectures in response to a business problem, opportunity, RFP, or RFI, and in building roadmaps to transition from the current state to the future state
  • Experienced in implementing complex system integration solutions in an onshore/offshore delivery model

TECHNICAL SKILLS:

Databases: Oracle, SQL Server, MySQL, Greenplum, Teradata, Redshift

ETL: Informatica, DataStage, Talend, Ab Initio, SSIS, BODS

Big Data: Hadoop, Sqoop, Hive, Pig, Spark, Tez, Kafka, Flink, Storm, NiFi

NoSQL: MongoDB, HBase, Cassandra, Kudu, Ignite, Riak, Redis

SQL on Hadoop: Hue, HAWQ, Impala, Drill, Kylin, Phoenix, Lens, Spark SQL

DQ/ILM/Metadata: Informatica IDA/IDQ/ILM, IBM Info Analyzer, Adaptive

Analytics: Tableau, QlikView, SpotFire, MSTR, BO, Crystal, R

Platforms: Linux, UNIX, Windows, Cloud, Virtualization, SOA

Programming: Python, JSP, ASP, JavaScript, MuleSoft, Java-Spark, Shell script

Modeling / Architecture: ER, Dimensional Modeling, ER/Studio, ERwin, TOGAF

PROFESSIONAL EXPERIENCE:

Confidential, Dallas TX

Big Data Architect

Roles and Responsibilities:

  • Employee retention is a key objective of Tyson, and employee turnover is its single most prevalent HR metric. The project involved building a new analytics platform using big data technologies to generate business value by providing deep insights into employee turnover. To achieve true insight into what causes turnover in different parts of the business, and to provide real predictive potential, leave and access data are planned to be ingested into the new analytics platform in addition to the SAP data and employee performance reviews ingested in phase 1.
  • As part of the HR Analytics solution, my responsibilities included:
  • Design, build, and automate SAP data extraction using Sqoop
  • Design and build Pig routines for the initial preprocessing, cleansing, and generation of surrogate keys for the target dimensional model
  • Design and build data architecture and dimensional models for the new analytics platform
  • Design, build, and orchestrate Spark scripts in Python to load data from the staging area to the target EDW on Hive (see the sketch after this list)
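
A minimal PySpark sketch of the staging-to-Hive EDW load described above; the database, table, and column names (hr_staging.sap_employee, hr_edw.dim_employee, and their columns) are hypothetical placeholders, not the actual project schema.

    # Minimal PySpark sketch: load cleansed staging data into a Hive dimension table.
    # Database, table, and column names are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (SparkSession.builder
             .appName("hr-staging-to-edw")
             .enableHiveSupport()
             .getOrCreate())

    # Read the preprocessed staging table produced by the Pig cleansing routines
    # (surrogate keys are assumed to have been generated there).
    staged = spark.table("hr_staging.sap_employee")

    # Add a load timestamp and keep only the columns expected by the dimension.
    dim_employee = (staged
                    .withColumn("load_ts", F.current_timestamp())
                    .select("employee_sk", "employee_id", "dept_code", "hire_date", "load_ts"))

    # Append into the target Hive EDW dimension table.
    (dim_employee.write
     .mode("append")
     .format("orc")
     .saveAsTable("hr_edw.dim_employee"))
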
Confidential, Dallas TX

Big Data Lead Engineer

Roles and Responsibilities:

  • The Customer Advocacy and Relationship Enablement (CARE) program is an overhaul of the existing Liberty Mutual Data Platform to build a new state-of-the-art Customer Information (CI) Platform that enables the delivery of a personalized, real-time, end-to-end customer experience. The current platform lacks an end-to-end customer view, makes little use of real-time customer information, misses contextual information during customer interactions, underuses data stored in multi-channel systems, and offers no 360-degree view of the customer and their relationship history.
  • To accomplish the CARE solution, my responsibilities on the big data side were to:
  • Design and build Python scripts that query Adobe REST APIs to extract reference data to enrich Liberty's clickstream/weblogs (an illustrative API sketch follows this list)
  • Design and build a Spark application in Java to process 20+ GB of clickstream data per day
  • Design and build a Spark application in Java to process 40-50 GB of Krux logs per day
  • Design and build a combination of Pig/Sqoop/Spark/BDRM applications to enable the processing and integration of 250 million Acxiom Confidential consumer records, Adobe data, and Krux data with Liberty's customer data
  • Design and build an Identity Management application in Spark and Java to provide unique IDs for all prospective consumers and Liberty customers that act as their digital IDs across LoBs within Liberty
  • Process policy/claim email communications both in batch and in real time using Kafka and Flink, parse and model JSON messages for text mining, and build data models on top of Hive to query and analyze email interactions (see the consumer sketch after this list)
  • Build PoCs on Spark SQL for downstream processing
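
An illustrative Python sketch of the Adobe REST API lookups used to enrich clickstream/weblog data; the endpoint URL, query parameters, and response fields below are assumptions for illustration, not the actual Adobe Analytics API contract.

    # Hypothetical sketch of pulling reference data from an Adobe REST endpoint
    # to enrich clickstream records; URL, parameters, and auth are placeholders.
    import requests

    ADOBE_API_URL = "https://analytics.example.com/api/v1/dimensions"  # placeholder endpoint
    ACCESS_TOKEN = "..."  # obtained via the Adobe OAuth flow (not shown)

    def fetch_reference_data(report_suite_id):
        """Return a dict keyed by page id, used to enrich weblog records."""
        response = requests.get(
            ADOBE_API_URL,
            headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
            params={"rsid": report_suite_id},
            timeout=30,
        )
        response.raise_for_status()
        # Assumed response shape: {"content": [{"id": ..., ...}, ...]}
        return {row["id"]: row for row in response.json().get("content", [])}

    def enrich_clickstream(records, reference):
        """Attach reference attributes to each weblog record when a match exists."""
        for record in records:
            record["page_meta"] = reference.get(record.get("page_id"), {})
        return records
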
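A minimal sketch of the real-time email-message consumption, written here with the kafka-python client purely for illustration (the production path used Kafka with Flink); the topic name, brokers, and JSON fields are assumptions.

    # Minimal sketch of consuming policy/claim email messages as JSON;
    # topic name, brokers, and message fields are illustrative assumptions.
    import json
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "policy-claim-emails",                 # hypothetical topic
        bootstrap_servers=["broker1:9092"],
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
        auto_offset_reset="earliest",
    )

    for message in consumer:
        email = message.value
        # Keep only the fields the text-mining model needs before landing in Hive.
        record = {
            "email_id": email.get("id"),
            "policy_number": email.get("policy_number"),
            "subject": email.get("subject"),
            "body_text": email.get("body"),
            "received_at": email.get("received_at"),
        }
        # In the real pipeline this record would be written to a Hive-backed staging table.
        print(record)
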
Confidential, Capgemini

Analytics and Big Data Solution Architect

Roles and Responsibilities:

  • Head of engineering for the strategic Customer Data Solution, responsible for leading an engineering team. The role required creating and designing an effective strategy and future state for the Customer Master Data Management and Reference Data programs. As part of the solution, I was responsible for accomplishing the following:
  • Created a customer data lake by offloading data from the two large global aggregators
  • Created Hive structures for IBM Info Analyzer to perform the DQ operations on the lake
  • Modeled data for a 360-degree view of global customers on top of MongoDB (see the document sketch after this list)
  • Designed and created REST APIs using MuleSoft to query and capture extended customer data on MongoDB
  • Redesigned the HSBC mobile app's DB2 backend to MongoDB to support 9-15 million transactions per week
  • Created Cassandra and Neo4j clusters to demonstrate the technology platform for data streaming and data persistence use cases, and for visualization of global customer linkage
  • Designed DQ dashboards, Data Management, and ICX4 solutions across the globe
  • Authored the Data Management and Data Governance strategy for T. Rowe Price, which consists of:
  • Data classification, data policies, and a data security framework for customer and financially sensitive data
  • Authored ETL standards and best practices for the Informatica and SSIS ETL tools; a Metadata Management framework, standards, and best practices using Adaptive; and ILM (Data Retention, Test Data Management) processes, frameworks, standards, and best practices
  • Built Enterprise Data Quality framework, standards and best practices using Informatica IDQ/IDA
  • Designed Historical Data Analysis and Data Archival/Disaster Recovery solutions using Amazon Redshift and Amazon S3 storage
  • Member of the Technical Design Authority (TDA) at Unilever, responsible for maintaining the program's vision and objectives, creating technology PoCs, and creating and maintaining the end-to-end BI architecture for Unilever's "Connect" program, an 80-terabyte global enterprise data warehouse built on top of Teradata, Tableau, BODS, SSAS, and SSRS
  • Designed and built Data Warehouse Optimization solution PoC using Hortonworks Enterprise Data Hub
  • Offloaded data from Teradata and ingested it into the Data Lake using Sqoop
  • Experimented with ORC and Avro file formats for the persistent staging tables of the EDW (see the staging sketch after this list)
  • Created internal Hive data models in ORC for analytics and data exploration needs
  • Built ETL/transformations using a combination of Hive and Pig
  • Built HBase tables to store and randomly access key business KPIs aggregated from Hive
  • Used Tableau for data visualization on top of Hive and HBase
  • Member of Capgemini's global Architecture and Advisory (A&A) group, responsible for undertaking customers' strategic consulting and advisory initiatives, solution-architecting high-value RFP and RFI responses, shaping deals, and crafting future solutions for cloud and big data initiatives using the TOGAF framework
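
A hedged PyMongo sketch of the kind of customer-360 document model described above; the connection string, database, collection, and field names are illustrative assumptions, not the actual schema.

    # Illustrative PyMongo sketch of a customer-360 document upsert;
    # connection string, database, collection, and fields are assumptions.
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
    customers = client["customer360"]["customers"]

    def upsert_customer_view(customer_id, profile, accounts, interactions):
        """Merge core profile, account, and interaction data into one 360-degree document."""
        customers.update_one(
            {"_id": customer_id},
            {"$set": {
                "profile": profile,             # name, segment, KYC attributes
                "accounts": accounts,           # list of linked account summaries
                "interactions": interactions,   # recent multi-channel touchpoints
            }},
            upsert=True,
        )

    # The MuleSoft-built REST APIs would call a read path like this to serve the extended view.
    def get_customer_view(customer_id):
        return customers.find_one({"_id": customer_id})
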
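A minimal PySpark sketch of the ORC/Avro persistent-staging experiment from the Data Warehouse Optimization PoC; the paths, database, and table names are hypothetical, and the Spark Avro package is assumed to be available on the cluster.

    # Minimal PySpark sketch: read Sqoop-landed Avro files and persist them as
    # ORC-backed Hive tables for downstream Hive/Pig transformations and Tableau.
    # Paths, database, and table names are hypothetical placeholders.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("dwh-optimization-staging")
             .enableHiveSupport()
             .getOrCreate())

    # Avro files landed in the data lake by the Sqoop offload from Teradata.
    sales = spark.read.format("avro").load("/data/lake/staging/sales_avro")

    # Persist as an ORC Hive table for analytics and data exploration.
    (sales.write
     .mode("overwrite")
     .format("orc")
     .saveAsTable("edw_staging.sales_orc"))
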
Confidential

Chief Technology Officer

Roles and Responsibilities:

  • Developed and launched a scalable, SEO-optimized, intuitive auto and health insurance aggregator for Switzerland using open-source web technologies, SOA architecture, Java, and data analytics platforms
Confidential, Morgan Stanley

Solution Architect and Delivery Lead

Roles and Responsibilities:

  • Implemented Allay Bank's initiative to develop a new technology platform to automate back-office business processes of the Confidential and Canadian businesses
  • Delivered North America's first SAP Core Banking program, consisting of building the SAP payment module, integrating all SAP modules with existing or new channels, data migration, SIT, UAT, and test automation
  • Managed Europe's largest system integration program, resulting from the Lloyds TSB and HBOS merger; responsible for SIT, UAT, performance testing, and test automation
  • Implemented SunTrust's BI program to make it Basel II compliant
  • Delivered Morgan Stanley's IT re-platforming, integration, and data migration initiatives resulting from its merger with Smith Barney
  • Developed Farmers' Business Intelligence initiative to enable organic growth
  • Authored a multi-year Enterprise Business Intelligence roadmap for Safeco Insurance and successfully implemented Safeco's Enterprise BI initiative by integrating various business functions, including Distributors/Agents, Q&I, Policy, and Claims
Confidential, Birmingham

Senior Consultant

Roles and Responsibilities:

  • Design solution framework, technical architecture, technology standards and best practices
  • Design ETL, data quality, data acquisition, CDC, data modeling, ODS, EDW and DMs
  • Implement metadata management, SSO, disaster recovery methods for the EDW
  • Architected and developed KPMG's 'K-Frame' solution, targeted at the Confidential higher education industry, which allowed campuses to go from 'no BI' to 'BI ready' in months
  • Lead architect of the BearingPoint 'NCLB' solution targeted at Confidential states to comply with the 'No Child Left Behind' legislation
  • Perfected programming skills in web development, client-server, and N-tier technologies using VB, MTS, COM, DCOM, PowerBuilder, Oracle Forms, VBScript, JavaScript, JSP, and ASP
