Sr. GCP Data Engineer Resume

Charlotte, NC

SUMMARY

  • A Google Certified Professional Data Engineer with 8+ years of experience in IT and data analytics.
  • Deep understanding of big data and reporting-layer architectures and the tools around them, including Hadoop with SAS Visual Analytics, BigQuery with a Tableau reporting layer, and Azure Synapse Analytics with Power BI.
  • Knowledge of various other fully integrated BI tools such as Qlik, Mode, Qubole, and Superset.
  • Delivered migration projects from Qubole to on-premises Hadoop using the Presto and Spark engines for SQL, with reporting in Mode, saving millions in licensing fees.
  • Highly experienced in developing data marts to requirements and designing warehouses using distributed SQL in both Hadoop and Google Cloud environments.
  • Hands-on experience with the main GCP components: Dataflow with the Python SDK, Dataproc, BigQuery, Composer (Airflow), G Suite for service account impersonation, Cloud IAM, Cloud Pub/Sub, Cloud Functions for functions-as-a-service requests, Cloud Data Fusion, Cloud Storage (GCS), and Cloud Data Catalog.
  • Relied heavily on Google-native components for security and big data applications.
  • Hands-on experience with programming languages such as Python and Scala.
  • Keen on keeping up with the newer technologies that Google Cloud Platform (GCP) adds to its stack.
  • Knowledge of Kubernetes platforms such as GKE and OpenShift for deploying applications.
  • Deep understanding of CI/CD processes, various Git-flow architectures, and writing test cases for code reliability.
  • Converted a large amount of Hive SQL into Spark SQL and PySpark code depending on the requirement (a minimal conversion sketch follows this list).
  • Converted PL/SQL-style code both to a BigQuery-Python architecture and to Azure Databricks and PySpark on Dataproc.
  • Experience with Jira and Azure DevOps, and able to work in both Kanban and two-week sprint models.
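
A minimal sketch of the kind of Hive SQL to PySpark conversion mentioned above. The table and column names (retail.daily_sales, store_id, sales_amt) are illustrative placeholders, not actual project objects.

    # Minimal sketch: a Hive aggregation rewritten with the PySpark DataFrame API.
    # Table and column names are illustrative placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (SparkSession.builder
             .appName("hive_to_pyspark_example")
             .enableHiveSupport()
             .getOrCreate())

    # Original Hive SQL, still runnable through Spark SQL:
    sql_df = spark.sql("""
        SELECT store_id, SUM(sales_amt) AS total_sales
        FROM retail.daily_sales
        GROUP BY store_id
    """)

    # Equivalent PySpark DataFrame version:
    df = (spark.table("retail.daily_sales")
          .groupBy("store_id")
          .agg(F.sum("sales_amt").alias("total_sales")))

    df.write.mode("overwrite").saveAsTable("retail.store_sales_summary")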

TECHNICAL SKILLS

RDBMS: MySQL, MS SQL Server, T-SQL, Oracle, PL/SQL.

Google Cloud Platform: Cloud Storage, BigQuery, Composer, Cloud Dataproc, Cloud SQL, Cloud Functions, Cloud Pub/Sub, Dataflow, etc.

Big Data: Apache Beam, Spark, Hadoop, Google big data stack, Azure big data stack

ETL/Reporting: Power BI, Data Studio, Tableau

Python Modules: Pandas, SciPy, Matplotlib.

Programming: Shell/Bash, C#, R, Go, Python.

PROFESSIONAL EXPERIENCE

Confidential

Sr. GCP Data Engineer

Responsibilities:

  • Worked with product teams to create various store-level metrics and supported data pipelines written on GCP's big data stack.
  • Deep understanding of moving data into GCP using the Sqoop process, custom hooks for MySQL, and Cloud Data Fusion for moving data from Teradata to GCS.
  • Good knowledge of building data pipelines in Airflow as a service (Composer) using various operators (see the DAG sketch after this list).
  • Built a program in Python and Apache Beam, executed on Cloud Dataflow, to run data validation jobs between raw source files and BigQuery tables (see the Beam sketch after this list).
  • Extensive use of the Cloud Shell SDK in GCP to configure and deploy services such as Dataproc, Cloud Storage, and BigQuery.
  • Involved in loading and transforming large structured and semi-structured datasets and analyzing them by running Hive queries.
  • Wrote Hive SQL scripts to create complex tables with performance features such as partitioning, clustering, and skew handling.
  • Designed advanced analytical models and coordinated with the data science team to implement them on the Hadoop cluster over large datasets.
  • Built custom Python code for tagging tables and columns using Cloud Data Catalog, and built an application for user provisioning.
  • Hands-on experience coding in Python and calling GCP's REST APIs to integrate data.
  • Migrated an entire Oracle database, along with the reports built in OBIEE, to BigQuery and Tableau.
  • Lift-and-shift experience moving on-premises Hadoop jobs to Google Dataproc.
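
A minimal Composer (Airflow 2.x) DAG sketch of the operator-based pipelines described above. The DAG id, project, dataset, and query are hypothetical placeholders, not the actual pipeline.

    # Minimal Composer (Airflow 2.x) DAG sketch; names and query are placeholders.
    from datetime import datetime

    from airflow import DAG
    from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

    with DAG(
        dag_id="store_metrics_daily",
        start_date=datetime(2021, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        load_metrics = BigQueryInsertJobOperator(
            task_id="load_store_metrics",
            configuration={
                "query": {
                    "query": "SELECT store_id, SUM(sales_amt) AS total_sales "
                             "FROM `my-project.retail.daily_sales` GROUP BY store_id",
                    "destinationTable": {
                        "projectId": "my-project",
                        "datasetId": "retail",
                        "tableId": "store_sales_summary",
                    },
                    "writeDisposition": "WRITE_TRUNCATE",
                    "useLegacySql": False,
                }
            },
        )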
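
A minimal Apache Beam sketch of the row-count validation between a raw GCS file and its BigQuery table. The bucket, project, dataset, and table names are assumptions for illustration.

    # Minimal Apache Beam sketch: compare row counts between a raw GCS file
    # and the matching BigQuery table. Paths and table names are placeholders.
    import logging

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def run():
        # Pass --runner=DataflowRunner, --project, --region, etc. on the command line.
        options = PipelineOptions()
        with beam.Pipeline(options=options) as p:
            raw_count = (p
                         | "ReadRawFile" >> beam.io.ReadFromText(
                             "gs://my-bucket/raw/daily_sales.csv", skip_header_lines=1)
                         | "CountRaw" >> beam.combiners.Count.Globally())
            bq_count = (p
                        | "ReadBigQuery" >> beam.io.ReadFromBigQuery(
                            query="SELECT 1 FROM `my-project.retail.daily_sales`",
                            use_standard_sql=True)
                        | "CountBQ" >> beam.combiners.Count.Globally())
            # Compare the two counts, passing the BigQuery count as a side input.
            (raw_count
             | "Compare" >> beam.Map(
                 lambda raw, bq: logging.info("raw=%s bq=%s match=%s", raw, bq, raw == bq),
                 bq=beam.pvalue.AsSingleton(bq_count)))

    if __name__ == "__main__":
        run()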

Confidential, Charlotte, NC

Sr. Hadoop Engineer

Responsibilities:

  • Built reporting in Power BI after building the ETL in the on-premises Hadoop cluster.
  • Built various data pipelines using Hive SQL and Spark RDDs, with Oozie for scheduling.
  • Converted MySQL queries to Hive SQL on the Tez engine, and migrated Hive jobs previously written for MapReduce to either Tez or Spark based on the requirement.
  • Converted previously written SAS programs into Python to save on license fees, and moved SAS analytics reports to Power BI.
  • Built supply chain data marts with the product team and exposed the data through APIs written in Java for external health systems.
  • Migrated Hadoop to Dataproc and moved reporting from Power BI to Tableau and Data Studio.
  • Read the logical plans produced by Spark and improved the data pipeline processes for both efficiency and cost control in Dataproc.
  • Developed a custom Python program, including CI/CD rules, for metadata management with Google Cloud Data Catalog (see the tagging sketch after this list).
  • Worked with Google Data Catalog and other Google Cloud APIs for monitoring, query, and billing analysis of BigQuery usage.
  • Experience moving data between GCP and Azure using Azure Data Factory.
  • Monitored BigQuery, Dataproc, and Cloud Dataflow jobs via Stackdriver across all environments.
  • Coordinated with the team and developed a framework to generate daily ad-hoc reports and extracts from enterprise data in BigQuery.
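
A minimal sketch of tagging a BigQuery table through the Cloud Data Catalog Python client. The project, dataset, table, and tag template names are hypothetical, and the calls follow the documented google-cloud-datacatalog pattern rather than the actual program.

    # Minimal sketch: attach an existing tag template to a BigQuery table
    # via Cloud Data Catalog. Resource and template names are placeholders.
    from google.cloud import datacatalog_v1

    client = datacatalog_v1.DataCatalogClient()

    # Look up the catalog entry for the BigQuery table.
    resource = ("//bigquery.googleapis.com/projects/my-project"
                "/datasets/retail/tables/daily_sales")
    entry = client.lookup_entry(request={"linked_resource": resource})

    # Build a tag from an existing template and attach it to the entry.
    tag = datacatalog_v1.types.Tag()
    tag.template = ("projects/my-project/locations/us-central1"
                    "/tagTemplates/data_governance")
    tag.fields["data_owner"] = datacatalog_v1.types.TagField()
    tag.fields["data_owner"].string_value = "retail-analytics"
    client.create_tag(parent=entry.name, tag=tag)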

Trisan Info Private Limited, India, May 2012 - Nov 2015

Data Analyst

Responsibilities:

  • Prepared test plans and test cases according to the source-to-target mapping document.
  • Involved in logical and physical design and in transforming logical models into physical implementations.
  • Developed Python programs that run end-to-end data migration and transformation, loading data into sinks such as Oracle and MySQL.
  • Experienced in using Python with SQLite for DML, along with Python features such as datetime conversions, while and for loops, and custom classes that take user-defined arguments (see the sketch after this list).
  • Gained extensive experience with Agile methodologies in software projects, participated in Scrum meetings, followed biweekly sprint schedules, and tracked progress in Jira.
  • Involved in requirement gathering and data analysis, and interacted with business users to understand the reporting requirements.
  • Extensively used PL/SQL to build Oracle Reports 10g and views for processing data, enforcing referential integrity, and applying required business rules.
  • Worked on creating DDL and DML scripts for the data models.
  • Created interactive dashboards and visualizations of claims reports, competitor analysis, and improved statistical data using Tableau.
  • Designed and built data marts using star and snowflake schemas.
  • Tested the database to check field-size validation, check constraints, and stored procedures, and cross-verified the field sizes defined within the application against the metadata.
  • Performed data management projects and fulfilled ad-hoc requests according to user specifications using data management tools such as Toad, MS Access, Excel, and SQL.
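
A minimal Python/SQLite sketch of the kind of DML, datetime handling, and custom-class work described above. The database, table, and file names are illustrative placeholders.

    # Minimal sketch: a small custom class plus SQLite DML with datetime handling.
    # Database, table, and file names are illustrative placeholders.
    import sqlite3
    from datetime import datetime

    class LoadBatch:
        """Tracks one source file loaded during a migration run."""
        def __init__(self, source_file, row_count):
            self.source_file = source_file
            self.row_count = row_count
            self.loaded_at = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

    conn = sqlite3.connect("staging.db")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS load_audit "
        "(source_file TEXT, row_count INTEGER, loaded_at TEXT)")

    batches = [LoadBatch("orders.csv", 1200), LoadBatch("customers.csv", 300)]
    for batch in batches:
        conn.execute(
            "INSERT INTO load_audit VALUES (?, ?, ?)",
            (batch.source_file, batch.row_count, batch.loaded_at))

    conn.commit()
    conn.close()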
