
Big Data Architect Resume


Atlanta, GA

SUMMARY:

  • 15+ years of extensive IT experience in all phases of the software development life cycle, leading, architecting and designing applications.
  • Designed / implemented / maintained multiple big data platforms with over 3 PB of data.
  • 5 years of consulting / client facing experience architecting and/or implementing large, distributed data warehousing, Big Data, Analytics and Business Intelligence intensive applications.
  • Excellent client, prospect and partner presentation skills.
  • Experience with various cloud platforms such as Microsoft Azure, AWS and Google Cloud.
  • Experience using and developing solutions utilizing the Hadoop ecosystem, such as Hadoop, Spark, MapReduce, Hive, Pig, Sqoop, Oozie, Ambari, HCatalog, ZooKeeper, Flume, Storm and Kafka, and NoSQL databases such as HBase and Cassandra
  • Experience with Data Integration, Data Visualization, Dashboards and Reporting tools such as Tableau Server, Tableau Desktop, QlikView, Actuate BIRT, Power Query, Power View, Power Map, Power BI
  • Experience with Enterprise Data Warehouse Platforms such as Teradata, Pentaho, Talend, Oracle, Microsoft SQL Server, DB2, etc.
  • Good communication, client service, and relationship building skills
  • Able to document, explain, and present complex architectures for the client technical teams
  • Strong leadership, mentoring and interpersonal skills; believe in leading by doing.
  • Constantly learning and leveraging emerging technologies
  • Worked in a global delivery model following Agile methodology

TECHNICAL SKILLS:

Big Data: Hadoop Ecosystem (HDFS, MapReduce, Pig, Hive, Sqoop, Ranger) 4+ years, Cassandra by DataStax, DataStax Enterprise Graph DB 3+ years, Teradata V.2.R.4 and V.2.R.5 4+ years, Spark 1+ year, Mesos and Docker 1+ year, Elasticsearch, Solr, Memcached, Redis, Splunk, ELK 1+ year, Cloud & PaaS (Platform as a Service), Microsoft Azure and AWS 2+ years, Pivotal Cloud Foundry 1+ year

ETL / ELT: Microsoft DTS (SSIS) 2+ years, Informatica (PowerCenter, PowerExchange) 7+ years, Oracle Warehouse Builder 1+ year, PL/SQL 6+ years, Talend 1+ year

Master Data Management: MSFT Master Data Service + Profisee Maestro MDM 1+ year, Informatica MDM 1+ year

RDBMS: Oracle (version 7 - 11g RAC) 10+ years, SQL Server ( ) 6+ years, Oracle GoldenGate 2+ years, Data Modeling (Erwin, ER Studio, Model Mart) 10+ years

Visualization: Composite Data virtualization 1+ year, SQL Analysis Services (SSAS) 1+ year, Business Objects Suite (Reporting & Data Services) 5+ years, OBIEE (Siebel Analytics) 6+ years, Tableau 2+ years, TIBCO Spotfire 1+ year

Enterprise Architecture Framework: TOGAF 4+ years, ITIL 2+ years

Other programming languages: Python (Anaconda, PySpark, etc.), Core Java, Spring framework (Spring Boot), Scala, Spark SQL, CQL, Gremlin for DSE Graph DB, UNIX Shell script

DevOps tools: Maven, Jenkins, Git, Ansible, AWS CloudFormation templates, Chef, Puppet

PROFESSIONAL EXPERIENCE:

Confidential, Atlanta, GA

Big Data Architect

Responsibilities:

  • Design and develop data services (batch and streaming) for time series data ingestion from edge to cloud (Apache Apex, Kafka, Spark and Python for data validation, normalization, transformation and data quality)
  • Implemented and performance tuned storage layer (HDP and Cassandra) for scalability and elasticity
  • Implement Cloud Foundry (PaaS) in a VPC
  • Design and implement Hive and Cassandra Schema (Titan graph database for hierarchical asset data)
  • Containerize deployment of infrastructure and micro-service applications using Docker and Mesos
  • Collaborate with sales teams to develop and execute solutions for commercial customers

Confidential, Atlanta, GA

Senior Java Architect

Responsibilities:

  • Solution design, planning and implementation of the HDP upgrade from version 1.x to 2.x
  • Resolve scalability issues with big data refinery (Ab Initio, Talend, Java batch grid and Hadoop)
  • Enable Kerberos on the HDP Hadoop environment
  • Design a POS eReceipt application leveraging Cassandra db

Confidential, Atlanta, GA

Senior Big Data Solution Architect

Responsibilities:

  • Design and implement a data lake modern data architecture solution using Hortonworks Data Platform (HDP) with Hadoop HDFS, YARN, Oozie, Hive and HBase. Deliverables are as follows:
  • Solution Design: Develop solution design that includes functional components like Data Extraction, Data Transformation (Talend for Big Data, MapReduce running on Tez), Data Processing and Storage (HDFS), Data Presentation (Spotfire), Knox and Ranger for Security, Metadata Capture, Business Process Orchestration
  • System Design: Perform infrastructure design for Hadoop cluster deployment with 200+ nodes to be deployed in Microsoft Azure, Amazon AWS and 3 global data centers
  • Data transformation conversion: convert existing SSIS scripts to MapReduce jobs and Hive / Pig scripts via Talend for Big Data

Confidential, Atlanta, GA

Senior Solution Architect

Responsibilities:

  • Perform data conversion / migration from Oracle EBS to the new Confidential system, including complete mapping of the data attributes from legacy Confidential to new Confidential, and complete configuration and design documents for new Confidential.
  • Develop ETL scripts for data migration.
  • Develop integration interfaces for data exchange with external partners
  • Develop web services to expose data from Confidential system for external portals (landlord & tenant portals)
  • Develop custom reports and dashboards for Confidential
  • Resolve data integrity issues within Confidential and improve data quality

Environment: SQL Server 2012, SSIS, SSRS, Crystal Reports and Confidential custom reporting and ETL tools

Confidential, Atlanta, GA

Enterprise Architect

Responsibilities:

  • Created data flow diagrams for all application components and a system flow that includes all the platform components in scope
  • Designed the data model for Cassandra NoSQL database (keyspaces and column families) integrated with Hadoop HDFS. Sized the Cassandra db servers. Used CQL and CLI to generate queries.
  • Installed and set up 20-node clusters of DSE (DataStax Enterprise) and OpsCenter for monitoring. Used nodetool to rebalance the ring. Scheduled db repair jobs.
  • Develop roadmap and logical reference architecture for implementing SOA at Confidential
  • Developed use cases, requirements and integration scenarios for evaluating SOA vendors in the following capability areas: ESB / MOM, SOA governance, BPM, Data Virtualization
  • Build out lab environments for data virtualization and SOA governance, and execute end-to-end use cases

Confidential, Atlanta, GA

Enterprise Information Architect

Responsibilities:

  • Designed and deployed enterprise data integration framework with Oracle ESB (WebLogic application server) as the main foundation platform for real-time data integration, data warehouse for batch processing and MDM
  • Developed Master Data Management (MDM) strategy and implemented multi-domain MDM solution for Member, Provider, Product, Agents and Sales Hierarchy.
  • Completed MDM vendor evaluation and selection
  • Designed master data model for Provider, Member, Product, Agents & Sales Hierarchy Domains to support Provider Contracting, Provider Network Delivery, Credentialing, Claims processing, Enrollment, Co-ordination of Benefits, Medical Management, Sales Commission reporting
  • Mapped and loaded data from 7 different transaction source systems to the MDM repository
  • Designed and configured standardization, address cleansing, matching and survivorship rules
  • Implemented MDM data steward console and conducted training
  • Created cross reference reports, source delta changes reports and provided flat file feeds to data warehouse
  • Designed and built an Enterprise Data Warehouse foundation star schema with the following data marts:
      • Sales and Enrollment
      • Claims lifecycle
      • Profitability (including Premium, Commission, GL subject areas)
      • Medical Management
  • Led requirements gathering sessions with business stakeholders
  • Designed logical, physical model and data lineage for the data marts using ERwin
  • Designed and developed source to target data mapping, ETL design document, created column metadata and transformation, data quality control rules using Informatica
  • Used OSB (Oracle Service Bus) to develop web services for data exchange with external trading partners conforming to EDI standards
  • Responsible for the enterprise data governance program; championed the data management strategy with C-level executives and application owners

Environment: Oracle 11g, Oracle ESB, Informatica, Erwin, Business Objects (XI R3), MSFT MDS

Confidential, Atlanta, GA

Data Architect

Responsibilities:

  • Responsible for an integrated enterprise-level data strategy, which encompasses storing and processing of real-time event data, near-real-time incident analysis, online client portal reporting and advanced analytics, with total data volume at petabyte scale. Defined, recommended and advocated best practices and design principles for enterprise data structure and data retention.
  • Responsible for design of an integrated OLTP system that consolidates multiple existing siloed systems and workflows and provides data layer support for the SOA.
  • Responsible for design of data mart and data warehouse that can support advanced analytics like forecasting models, predictive models and pattern recognition.
  • Responsible for performance of systems with near real time replication amongst 3 sites. Collaborate with DBAs to maintain performance metrics and troubleshoot bottlenecks.
  • Provides technical oversight and guidance to application developers for optimal leverage of database design. Assists in defining SQL best practice guidelines, and develops code samples and solution fragments for efficient usage of database resources. Reviews SQL code, analyzes query plans/costs and optimizes SQL code, for access path efficiency, improved response time and throughput, and reduced cost of maintenance and total cost of ownership.
  • Completed the logical and physical model design to support key web services on integrated platform
  • Completed prototype of Hadoop as the VLDB platform (set up a multi-node HDFS cluster, ran MapReduce jobs to process 1 TB of log files, built data structures in Hive to query data and maximized query performance) and built business cases (ROI) / recommendation / migration roadmap for Hadoop.
  • Completed the ETL mapping and development to synchronize data between legacy platform and new consolidated platform
  • Completed MySQL partition design to support archival of the data
  • Completed data cleansing stored procedures for data quality control.
  • Identified systems of record for enterprise metadata. Implemented a metadata strategy and set up an enterprise metadata repository

Environment: MySQL, Oracle 10g, Hadoop for VLDB, Erwin

Confidential, Atlanta, GA

BI Architect

Responsibilities:

  • Designed and built an enterprise data warehouse foundation star schema to support the following strategic initiatives:
      • Claim life cycle analysis
      • Physician/Health care provider revenue cycle management
      • Industry Best Practice Benchmark reporting
      • Operational efficiency reporting
      • Customer and product profitability analysis
  • Completed the logical data model and physical database design
  • Completed the ETL mapping/process flow development for initial and incremental loading
  • Designed and built Oracle OLAP cubes for multi-dimension analysis
  • Resolved data quality issues by implementing data cleansing and profiling of OLTP data.
  • Set up the dependencies and schedule for all ETL jobs with dynamic load balancing across nodes
  • Set up and administered the metadata repository for lineage and impact analysis
  • Designed and developed OBIEE repository and web catalog
  • Developed key OBIEE dashboard reports
  • Conducted user training of OBIEE ad-hoc query tool

Environment: Oracle 11g RAC, Oracle Warehouse Builder and ODI for ETL design, OBIEE for reporting, Erwin
