Sr Big Data Architect Resume

Austin, TX

SUMMARY

  • Enterprise information management professional with expertise in leading enterprise data analytics initiatives and building data capabilities.
  • Proficient in defining and architecting data requirements for a wide variety of data domains across the enterprise.
  • Demonstrated ability to synthesize and translate complex business requirements into accurate, flexible solutions and data products, and to communicate results effectively to a diverse audience.

TECHNICAL SKILLS

Cloud: AWS big data technologies (S3, Glue, Data Pipeline, Athena, Redshift, Kinesis, etc.), Databricks, Snowflake, ADF

Big data: Spark, Hadoop

ETL Tools: Informatica, AB Initio

Metadata: ER/Studio, Erwin

Reporting: Tableau, Excel, Informatica Analyst & IDQ, AB Initio Data Quality

Database: Oracle, DB2, Teradata, Snowflake

Programming: Python, Core Java

PROFESSIONAL EXPERIENCE

Confidential, Austin, TX

Sr Big Data Architect

Responsibilities:

  • Contributed to Enterprise Information Management (EIM) roadmap and strategy initiative discussions.
  • Helped the product management team execute enterprise data analytics projects.
  • Architected and solutioned Minimum Viable Products (MVPs) in an advisory capacity with the Analytics Product Owner.
  • Developed POCs for EIM initiatives.
  • Conducted the Confidential current-state information management assessment.
  • Defined enterprise data quality and data stewardship models for EIM projects.
  • Responsible for the design and implementation of an enterprise data lake for effective reporting and analytics.
  • Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
  • Developed Spark Core and Spark SQL scripts in Scala for faster data processing.
  • Experienced in implementing Spark RDD transformations and actions for business analysis.
  • Worked closely with the Elasticsearch team to eliminate data ingestion issues by adopting Parquet files and enabling broadcast joins to fulfill the requirement (see the sketch after this list).
  • Migrated HiveQL queries on structured data to Spark SQL to improve performance.
  • Developed predictive analytics using the Apache Spark Scala APIs.
  • Created Azure Data Factory (ADF) pipelines for transforming, enriching, and managing enterprise data.
  • Led data quality and metadata management programs for different data projects across Confidential.
  • Analyzed and prepared a gap analysis between Confidential's current-state and future-state systems architecture for a better data analytics solution.
  • Used Amazon Elastic Compute Cloud (EC2) for computational tasks and Simple Storage Service (S3) for storage.
  • Led data engineering teams for effective delivery of enterprise data products.
  • Created a Spark and Python data management platform for data quality and analytics on Databricks.
  • Worked on master data and reference data intermediate- and future-state solutions from the discovery phase through implementation.
  • Revised and modernized different data management platforms to align with the enterprise data strategy.
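
The Parquet-plus-broadcast-join pattern and the HiveQL-to-Spark-SQL migration above can be illustrated with a minimal PySpark sketch; the S3 paths, table names, and columns below are hypothetical, not taken from the actual project.

# Minimal sketch: read Parquet, broadcast a small dimension, run migrated SQL.
# All paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("ingestion-sketch").getOrCreate()

# Read the large fact data from Parquet (columnar, splittable, compressed).
events = spark.read.parquet("s3://bucket/events/")            # hypothetical path
dim_product = spark.read.parquet("s3://bucket/dim_product/")  # small dimension

# Broadcast the small dimension so the join avoids a full shuffle.
joined = events.join(broadcast(dim_product), on="product_id", how="left")

# A HiveQL-style aggregation migrated to Spark SQL.
joined.createOrReplaceTempView("enriched_events")
daily = spark.sql("""
    SELECT product_name, to_date(event_ts) AS event_date, COUNT(*) AS cnt
    FROM enriched_events
    GROUP BY product_name, to_date(event_ts)
""")
daily.write.mode("overwrite").parquet("s3://bucket/curated/daily_counts/")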

Environment: Spark, Amazon Web Services (AWS), Informatica, Talend, Hive, Oracle, Python, Tableau, Visio, Linux

Confidential, Baltimore, MD

Big Data Architect

Responsibilities:

  • Co-authored the enterprise data architecture strategy.
  • Performed deep requirements gathering and analysis with data owners and integration teams.
  • Identified KPIs and designed/modeled the corresponding data requirements.
  • Developed metadata standards for the assigned project.
  • Led data quality assessments for better metadata management using Spark and Python.
  • Utilized Spark SQL to extract and process data, parsing it with Datasets or RDDs in HiveContext and applying transformations and actions (map, flatMap, filter, reduce, reduceByKey); see the sketch after this list.
  • Worked on solutioning and engineering enterprise data management using big data tools.
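
A minimal PySpark sketch of the RDD transformations and actions listed above (map, flatMap, filter, reduceByKey); the input path and record layout are hypothetical, and SparkSession is used here in place of the older HiveContext.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdd-sketch").getOrCreate()
sc = spark.sparkContext

# Hypothetical raw lines of the form "user_id,action,amount".
lines = sc.textFile("hdfs:///data/raw/activity.csv")

rows = (lines
        .map(lambda line: line.split(","))       # map: parse each record
        .filter(lambda f: len(f) == 3)           # filter: drop malformed rows
        .map(lambda f: (f[0], float(f[2]))))     # key by user, keep amount

# reduceByKey: total amount per user; flatMap shown on the raw tokens.
totals = rows.reduceByKey(lambda a, b: a + b)
tokens = lines.flatMap(lambda line: line.split(","))

print(totals.take(5))   # action: materialize a sample of the results
print(tokens.count())   # action: count all tokens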

Environments & Toolsets: Informatica Suite, Teradata, AWS, Spark, Python, Tableau

Confidential, Atlanta GA

Big Data

Responsibilities:

  • Responsible for designing big data management and analytics platforms for enterprise data.
  • Responsible for designing ETL data pipelines for effective ingestion from existing data management platforms into the enterprise data lake.
  • Responsible for bringing data into the data lake from various heterogeneous sources using ETL utilities.
  • Involved in designing different data zones within the lake to enable various enterprise data capabilities for effective data management and analytics.
  • Developed the strategy and roadmap for an efficient big data quality program.
  • Developed the roadmap for analyzing structured, semi-structured, and unstructured data requirements for enterprise data ingestion.
  • Involved in data lineage and data reconciliation for the movement of data across different zones within the lake.
  • Involved in the design and implementation of change data capture (CDC) and technical data quality (TDQ) solutions for the data lake (see the CDC sketch after this list).
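
A minimal sketch of snapshot-based change data capture in PySpark, assuming extracts are compared by primary key; the paths, key column, and hashing approach are illustrative assumptions, not the project's actual CDC/TDQ design.

from pyspark.sql import SparkSession
from pyspark.sql.functions import sha2, concat_ws, col

spark = SparkSession.builder.appName("cdc-sketch").getOrCreate()

prev = spark.read.parquet("/lake/raw/customers/prev/")   # yesterday's snapshot
curr = spark.read.parquet("/lake/raw/customers/curr/")   # today's extract

# Hash all non-key columns so changed rows can be detected in one comparison.
def with_hash(df):
    non_key = [c for c in df.columns if c != "customer_id"]
    return df.withColumn("row_hash", sha2(concat_ws("||", *non_key), 256))

prev_h, curr_h = with_hash(prev), with_hash(curr)

# New rows: keys present today but not yesterday.
inserts = curr_h.join(prev_h.select("customer_id"), "customer_id", "left_anti")

# Changed rows: same key, different hash.
updates = (curr_h.alias("c")
           .join(prev_h.alias("p"), "customer_id")
           .where(col("c.row_hash") != col("p.row_hash"))
           .select("c.*"))

inserts.union(updates).write.mode("append").parquet("/lake/staging/customers_cdc/")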

Environment: Hadoop, Ab Initio GDE, Tableau, Hive, Linux.

Confidential, Franklin Lakes, NJ

Data Architect

Responsibilities:

  • Responsible for data migration (loading business data into production) and streamlining the process using ETL utilities.
  • Responsible for running the everyday incremental and delta loads of critical business data into the enterprise data hub (see the incremental-load sketch after this list).
  • Led the production support team in resolving data issues and anomalies after every incremental load.
  • Responsible for delivering data load reports to the business owners.
  • Designed and streamlined process methodologies for different data anomaly categories, from data issue identification to resolution, using an in-house ETL tool.
  • Involved in enhancing the existing ETL data pipeline for better data migration with fewer data issues.
  • Responsible for reverse engineering source data using metadata management tools for better data migration and integration.
  • Applied SQL tuning and database performance optimization techniques for improved query performance.
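
A minimal PySpark sketch of a watermark-driven incremental (delta) load of the kind described above; the control table, column names, and paths are hypothetical.

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, max as spark_max

spark = SparkSession.builder.appName("incremental-load-sketch").getOrCreate()

# High watermark from the last successful run (hypothetical control table).
watermark = (spark.read.parquet("/hub/control/load_watermarks/")
             .where(col("table_name") == "orders")
             .agg(spark_max("loaded_through")).first()[0])

source = spark.read.parquet("/staging/orders/")

# Delta = only rows modified since the previous load.
delta = source.where(col("last_modified_ts") > watermark)

delta.write.mode("append").parquet("/hub/orders/")

# Report the load so business owners can verify the run.
print(f"orders: loaded {delta.count()} new/changed rows since {watermark}")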

Environment: Ab Initio GDE, DB2, Oracle, Teradata, UNIX.

Confidential, Irving, TX

Sr MDM Data Analyst

Responsibilities:

  • Worked on the Citigroup enterprise Master Data Management (MDM) solution (Customer Management Repository), a 360-degree view of Citi customer demographics data.
  • Responsible for identifying data domains for effective MDM technology implementation.
  • Responsible for defining data standards and mapping data elements from multiple heterogeneous data sources into MDM.
  • Responsible for analyzing data from the different lines of business to fit into an enterprise master data repository.
  • Responsible for data profiling of in-house and third-party data sourced into the master data hub (see the profiling sketch after this list).
  • Worked with ETL and DBA teams on effective data migration and resolution of production data issues.
  • Designed conceptual, logical, and physical data models for effective metadata management.
  • Responsible for developing and tuning existing ETL and SQL code to meet new data requirements.
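
A minimal PySpark sketch of column-level data profiling of the sort run before sourcing data into a master data hub (null and distinct counts per column); the input path is hypothetical, and the actual profiling used Ab Initio Data Profiler rather than this code.

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, countDistinct, sum as spark_sum

spark = SparkSession.builder.appName("profiling-sketch").getOrCreate()

df = spark.read.parquet("/mdm/landing/customer_demographics/")  # hypothetical

total = df.count()
for c in df.columns:
    stats = df.agg(
        spark_sum(col(c).isNull().cast("int")).alias("nulls"),
        countDistinct(col(c)).alias("distinct"),
    ).first()
    # Sparse or near-constant columns are poor candidates for match/merge keys.
    print(f"{c}: {stats['nulls'] / total:.1%} null, {stats['distinct']} distinct")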

Environment: Oracle, Teradata, Ab Initio Data Profiler, Ab Initio GDE, IBM InfoSphere, Erwin

Confidential, NJ

Data Analyst

Responsibilities:

  • Responsible for gathering business data requirements for data management projects.
  • Responsible for conducting design walkthroughs with data management teams.
  • Responsible for database design and implementation for project requirements.
  • Developed complex SQL queries to fetch data from different tables across databases.

Environment: Microsoft Office suite, SQL
