Big Data ETL Cloud Solution Architect Resume

SUMMARY:

High-performing Big Data ETL Architect with 10+ years of experience across diverse industries including Health Care, Banking and Retail. He has worked with multiple technologies, including Big Data components such as Spark, Hive, MapReduce, HDFS, Oozie, Sqoop, Impala and Kafka; programming languages such as Python, Java and Unix shell scripting; ETL tools such as Informatica Cloud/PowerCenter and DataStage; databases such as Oracle, SQL Server and Teradata; and cloud technologies such as Azure, AWS, Google Cloud and Salesforce. He is an expert in leading teams, establishing client relationships, production support roles, and cross-functional integration. He has strong problem-solving, analytical, interpersonal and communication skills, is a team player, and is committed to excellence. He is currently working as a Big Data solution architect on Azure Cloud, using shell scripts, Spark and Python to ingest data from sources such as FTP servers, APIs and Google Cloud into Azure Data Lake. He has also managed and delivered a DevOps implementation using a TDD and CI/CD approach for a Health Care client.

PROFESSIONAL EXPERIENCE:

Confidential

Big Data ETL Cloud Solution Architect

  • Performed multifaceted roles at Accenture, including Big Data Architect, Informatica ETL Developer (across versions 8.x, 9.x and 10.x), Informatica Cloud Support Analyst, Teradata Developer, Hadoop & Python Development Lead, Operations/Release Manager and Business Value Creator.
  • Architected, designed and developed the Big Data Solutions practice, including setting up the Big Data roadmap, building supporting infrastructure and engineering big data transformations of structured and unstructured data.
  • Exposure to all major cloud solution providers: Azure, Google Cloud and AWS.
  • Developed complex ETL transformation scripts using UNIX Shell Scripting, Python and PySpark.
  • Developed ETL Mappings/Workflows using Informatica Cloud/Informatica BDM and real time ICRT jobs including development of Processes, Guides, Service Connectors, and Connections.
  • Experience in building complex fact and dimension tables for data warehouse projects. Maintained the code in production after the post-implementation guarantee period.
  • Performed the role of DevOps Lead and handled the end-to-end DevOps cycle.
  • Database experience includes Oracle, SQL Server, DB2, Netezza, and Teradata.
  • Tuned Teradata and Informatica jobs to improve performance.
  • Actively participated in Data Modelling using Erwin Data Modeler.
  • Experience in application design and implementation of business data per business requirements in the Health Care (including Facets), Banking and Financial Services domains.
  • Implemented solutions for ingesting data from various sources into Azure Data Lake and processing the data at rest using Big Data technologies such as Hadoop, MapReduce frameworks, Spark, Hive, Bash scripting and Python.
  • Implemented a Big Data ecosystem (Hortonworks, Microsoft Azure, HDFS, YARN, Spark, Hive, Pig, Oozie, Tez, Apache Ranger, Sqoop, ZooKeeper, Splunk, Ambari) with cloud architecture.
  • Created Hive external tables, loaded data into them and queried the data using HQL.
  • Imported millions of structured records for processing with Spark and stored the data in HDFS in CSV format.
  • Used Spark SQL to process large volumes of structured data.
  • Implemented SCD Type 2 logic on Product data using Hive and Spark data frames.
  • Extracted data from multiple APIs using Shell scripts and loaded data into CSV files to perform Data Science activities.
  • Used Python data frames to work with Google BigQuery and extracted data tables on a daily basis.
  • Extracted files from AWS S3 using the Python Boto3 package and loaded them into Azure Data Lake.
  • Used Google Cloud SDK tools such as the gsutil, gcloud and bq command-line tools to extract data from BigQuery tables and load it into Google Cloud Storage.
  • Performed the role of BI ETL Lead, which included guiding the team in converting Informatica ETL components into Python-based scripts hosted on private cloud servers.
  • Developed complex queries for data extraction from Hive tables.
  • Hands on experience in big data tools like Hive, Impala, Sqoop.
  • Optimized Teradata BTEQ scripts for better performance.
  • Worked on production issues and delivered bug fixes with quick turnaround.
  • Handled the responsibility of Code movement across environments (Dev, INT, UAT, and PROD).
  • Involved in Continuous Integration and Continuous Deployment using DevOps tools.
  • Hands-on experience with TortoiseGit and IntelliJ for versioning code.
  • Used Jenkins to build code packages from Git and promoted code to higher environments using uDeploy.
  • Understanding of existing system & technical components in Amazon Web Services.
  • Managed AWS servers and infrastructure using EC2, S3 and CloudFormation.
  • Served as Teradata first-line support engineer, reviewing peer team code and designs.
  • Interacted with the Change Control Board and the scheduling team to promote objects to UAT and Production.
  • Supported the production team during the initial three-month warranty period after implementation.
  • Led a 12-member team performing application development and support by developing and maintaining complex ICS and ICRT jobs using Informatica Cloud.
  • Involved in development of Data Synchronization, Mapping Configuration and Data Replication tasks.
  • Involved in real-time data processing, including development of Processes, Guides, Service Connectors and Connections, and monitoring of Informatica Cloud processes.
  • Involved in performance tuning of Informatica Cloud Jobs.
  • Responsible for analysis, design and development of production fix components in Informatica Cloud & Salesforce.
  • Tuned ICS and ICRT jobs for long-running task flows.
  • Responsible for understanding use case development in the Salesforce application and data loads to and from Salesforce using Informatica Cloud.
  • Led end-to-end development of a DevOps system using Bitbucket, Jenkins, New Relic and other technologies.

Confidential, New York

Informatica ETL Production Support Lead and Release Manager

  • Worked as Migration Lead and Production Support Lead for Informatica jobs running in production.
  • Interacted with clients and SMEs to understand business requirements and create design documents for quick fixes in production.
  • Handled the responsibility of Code movement across environments (Development, Stage, UAT, PROD).
  • Provided support after production deployment during the guarantee period, including resolving production issues within SLAs.
  • Guided the team in using reusable components created in a similar previous migration project and led the team in completing the project on time.
  • Effectively handled client change requests by proposing and supporting the design, development and testing of solutions, including ensuring that change request entrance criteria and documentation were complete, conducting impact analysis and following through on execution of changes.
  • Created and maintained documents such as the Production Issue tracker, Production Job status tracker and Run book to reduce KT time and ticket resolution time.
  • Set up and led the complete Support & Maintenance team for one of the world's leading banking data warehouse applications, a critical and strategic project for the organization, gaining the client's confidence and appreciation.

Confidential

Informatica ETL Developer

  • Played a lead role in understanding and collecting requirements and translating them into technical requirements, including preparation of the Technical Design document and ETL Source-to-Target Mappings.
  • Implemented Data Quality Management (DQM) for Trading Distribution data with upfront quality assurance and a built-in proactive alerting mechanism.
  • Prepared Estimation for Development efforts using Microsoft Project Plan.
  • Responsible for Code movement across environments (Development, Stage, UAT).
  • Managed resources technically and operationally. Responsible for project management activities.
  • Attended status meetings with the client's Project Management team.
  • Managed day-to-day business operations and monitored compliance with Accenture processes.
  • Administered HR policies regarding performance management of direct-report consultants.
  • Ideated a Six Sigma process to help the business reduce operating costs for support and maintenance of the application.

Confidential

Informatica ETL Developer

  • Involved in creation of ETL mappings and transformations reflecting business rules, using Informatica PowerCenter to move data from multiple sources into the target area.
  • End-to-end implementation of claim data for Health Care clients involving facts and dimensions.
  • Worked with PL/SQL stored procedures and packages as part of project deliverables.
  • Prepared test cases for the developed logic and shared them with users to confirm that the gathered requirements were converted into ETL logic.
  • Compared production data against UAT test data.
  • Experience in handling XML Sources and XML targets.
  • Actively pursued and recommended areas of improvement for data profiling, data loading & Data Cleansing techniques.
  • Experience in Autosys scheduler to schedule Informatica jobs. Experience with UNIX Shell Scripting.

TECHNICAL SKILLS:

  • Hadoop ecosystem including HDFS, YARN, Spark, Hive, Impala, Pig, Sqoop, Oozie, Tez, Apache Ranger, ZooKeeper, Splunk, Ambari, Kafka, Zeppelin and Jupyter. Hadoop Lambda Architecture.
  • Programming languages include Python and Java.
  • Informatica PowerCenter 8.x, 9.x, Informatica Cloud (ICS & ICRT, IICS).
  • Teradata, Oracle, SQL Server, Netezza and DB2.
  • Cloud Experience includes Salesforce CRM, Microsoft Azure, Google Cloud and AWS.
  • Quality Center, Rational Quality Manager, Test Rail.
  • SQL, PL/SQL, UNIX Shell Scripting, Teradata BTEQ and Windows Batch Scripting.
  • Experience in scheduling tools like Autosys, Tidal and CA Workstation (ESP Creation).
  • Exposure to DevOps technologies like Bitbucket, Jenkins, AWS Cloud, New Relic and Sumo Logic.
  • Experience in Hadoop on Cloudera/Hortonworks distribution platforms. Exposure to NoSQL.
  • Exposure to Informatica BDM (Native/Hadoop environment).
