Senior Cloud/Data Engineer Resume
SUMMARY
- A Google Certified Associate Cloud Engineer and big data developer with 12+ years of IT experience designing, implementing and supporting Cloud, Big Data and data warehouse applications.
- 8+ years of Hadoop experience in the design and development of Big Data applications, including developing Spark/Scala jobs to process large volumes of data.
- Implemented Spark jobs in Scala to process large volumes of data in daily batch jobs.
- More than a year of experience implementing cloud solutions using BigQuery, Cloud Composer (Airflow), Cloud SQL, Cloud Storage, Cloud Functions and Stackdriver.
- Rich experience with Apache Hadoop MapReduce, YARN, Spark, Pig, Sqoop, Hue, Flume, Kafka and Oozie.
- Experience in handling huge volumes of streaming messages from Flume/Kafka.
- Excellent understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, YARN and the MapReduce framework.
- Solid knowledge of the Software Development Life Cycle (SDLC), OOP, Agile methodology and the Scrum process, data warehouse concepts and database management practices.
- Extensive experience extracting, transforming and loading data from various sources, including RDBMS, flat files and XML, into data warehouses and data marts.
- Implemented solutions to move on-prem data into predefined GCP storage locations.
- Wrote data from Spark DataFrames to temporary tables in BigQuery.
- Implemented fact and dimension tables in BigQuery.
- Involved in designing and implementing the Spark framework code.
- Created Spark SQL queries and DataFrames with the Scala API to read Parquet data and create and load the RSP tables in Impala.
- Created and updated Pentaho jobs to extract, transform and load data from Hive tables into Impala tables in Snappy-compressed Parquet format for BA reporting.
- Created web services to transfer internal claims to Guidewire PolicyCenter.
- Good interpersonal skills with the ability to handle multiple tasks and priorities; self-motivated and quick to understand and apply new concepts.
- Ability to work independently with minimal supervision and to manage multiple projects in a fast-paced environment to meet deadlines, and an excellent team player.
TECHNICAL SKILLS
Insurance LOB: Personal Auto, Personal Property, Commercial Auto, Commercial Property.
ETL Tools: Informatica PowerCenter, Pentaho 7.1
Languages: Shell Scripting, SQL, Java, Scala and Python.
Big Data Technologies: MapReduce, Hive, Impala, Spark, Sqoop, Kafka and Flume.
Cloud Technologies: BigQuery, Cloud Storage, Cloud Functions, Dataflow and Pub/Sub.
RDBMS: Oracle 11g, DB2, MySQL.
OS: Windows, UNIX, Linux.
Version Control/BTS: CVS, Visual SourceSafe, TortoiseSVN, JIRA, Bitbucket, Sourcetree, GitHub, Confluence.
PROFESSIONAL EXPERIENCE
Confidential
Senior Cloud/Data Engineer
Responsibilities:
- Involved in user interactions, requirement analysis and design for the interfaces.
- Created Cloud Functions triggered by the GCS bucket finalize event when a file is uploaded to Cloud Storage, storing the file information in a technical metastore.
- Involved in implementing the data collection and batch ingestion frameworks using Python.
- Implemented solutions to move on-prem data into predefined GCP storage locations.
- Wrote data from Spark DataFrames to temporary tables in BigQuery (see the sketch below).
- Implemented fact and dimension tables in BigQuery.
- Worked with business users to create the pricing analytical model in BigQuery.
Environment: Cloud Storage, Confluent Kafka, Cloud Functions, Composer, Airflow, BigQuery, Python and GitHub.
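As an illustration of the DataFrame-to-BigQuery staging step described above, the following is a minimal Scala sketch assuming the open-source spark-bigquery connector is on the classpath; the bucket, dataset and table names are hypothetical placeholders, not the project's actual names.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object StageToBigQuery {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("stage-to-bigquery")
      .getOrCreate()

    // Read extracts that have already landed in a GCS bucket (path is hypothetical).
    val claims = spark.read.parquet("gs://landing-bucket/claims/dt=2020-01-01/")

    // Write the DataFrame to a temporary/staging BigQuery table; the connector
    // stages the data in the named GCS bucket before running the load job.
    claims.write
      .format("bigquery")
      .option("temporaryGcsBucket", "staging-bucket")
      .mode(SaveMode.Overwrite)
      .save("analytics_stage.claims_tmp")

    spark.stop()
  }
}
```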
Confidential
Big Data/Spark Developer
Responsibilities:
- Prepared the high-level and low-level design Confluence pages.
- Performed code reviews and supported the technical team on various activities.
- Involved in designing and implementing the Spark framework code.
- Created Spark SQL queries and DataFrames with the Scala API to read Parquet data and create and load the RSP tables in Impala (see the sketch below).
- Implemented Spark jobs in Scala to process large volumes of data in daily batch jobs.
Environment: HDFS, Spark, DataFrames, Scala, Impala, Jira, GitHub, Confluence and UNIX.
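The sketch below illustrates the read-Parquet/publish-for-Impala pattern described above in minimal Scala; the source path, column names and target table are hypothetical, and Impala would still need an INVALIDATE METADATA or REFRESH after the load to see the new table data.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object LoadRspTables {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("load-rsp-tables")
      .enableHiveSupport()      // share the metastore that Impala also reads
      .getOrCreate()

    // Source path and table/column names are hypothetical placeholders.
    val policies = spark.read.parquet("/data/raw/policies/")
    policies.createOrReplaceTempView("policies_raw")

    // Shape the raw feed with Spark SQL before publishing it for reporting.
    val rsp = spark.sql(
      """SELECT policy_id, line_of_business, premium_amount, effective_date
        |FROM policies_raw
        |WHERE effective_date >= '2019-01-01'""".stripMargin)

    // Write as a Parquet-backed table in the shared metastore so that both
    // Spark and Impala can query it without duplicating the data.
    rsp.write
      .mode(SaveMode.Overwrite)
      .format("parquet")
      .saveAsTable("reporting.rsp_policies")

    spark.stop()
  }
}
```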
Confidential
Big Data/Spark Developer
Responsibilities:
- Provided production support and monitored batch jobs in Control-M.
- Created and updated data correction jobs to fix production data issues.
- Implemented solutions to improve Impala/Hive query performance.
- Supported the implementation and drove it to a stable state in production.
- Troubleshot production data issues in financial reports and implemented solutions to keep audit checks healthy.
- Implemented changes to Impala/Hive queries and Sqoop jobs to support the Cloudera upgrade.
Environment: HDFS, Hive, Spark, Impala, Control-M, Maven, Jira, GitHub and ServiceNow.
Confidential
Pentaho/ETL Developer
Responsibilities:
- Prepared the high-level and low-level design documents. Performed code reviews and supported the technical team on various activities.
- Designed, developed and executed Pentaho MapReduce jobs to parse the various input XMLs into CSV files.
- Created and updated Pentaho jobs to extract, transform and load data from Hive tables into Impala tables in Snappy-compressed Parquet format for BA reporting.
- Created web services to transfer internal claims to Guidewire PolicyCenter.
Environment: MapReduce, HDFS, Pentaho 7.1/8.3, Hive, Impala, Jira, GitHub, Confluence.
Confidential
Hadoop Developer
Responsibilities:
- Collected a variety of large datasets across business systems and applications and imported them into Hive and HDFS using the data ingestion tools Sqoop and Flume.
- Designed, developed and executed Java MapReduce programs to merge small files and split big files according to the HDFS block size.
- Designed, developed and executed Java MapReduce programs to parse the various input files into CSV files.
- Created and updated mappings to extract, transform and load historical data into Hive tables.
- Implemented business logic by writing Pig UDFs and Hive generic UDFs in Java, and used various UDFs from Piggybank and other sources (a simplified sketch follows this role).
- Participated in peer design and code reviews.
- Responsible for DD as Scrum Master, handling the sprint roadmap, sprint plans, daily scrums, sprint demos, sprint retrospectives and sprint execution.
Environment: Java, MapReduce, HDFS, Sqoop, Flume, Pig, Hive, XML, JSON, MySQL, Linux and Oozie.
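The UDFs described in this role were written in Java against Hive's generic UDF API; purely to illustrate the pattern, here is a minimal sketch using Hive's simpler UDF base class, written in Scala (any JVM language works) with a hypothetical normalization rule.

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Hypothetical cleanup rule: strip non-alphanumerics from a policy number
// and upper-case it. Hive discovers the evaluate method via reflection.
class NormalizePolicyNumber extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) null
    else new Text(input.toString.replaceAll("[^A-Za-z0-9]", "").toUpperCase)
  }
}
```

After packaging the class into a jar, it would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries.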
Confidential
ETL/Java Developer
Responsibilities:
- Implemented ETL jobs using a UNIX scheduler.
- Developed and executed mappings, workflows, worklets and sessions using Informatica.
- Coordinated with the business group and QA team to identify issues.
- Created and executed the unit test plans.
- Participated in system design and technical solutions.
- Implemented requirements and enhancements, including coding, testing and debugging.
- Fixed bugs reported by the QA team and the business group.
Environment: Informatica, Windows XP, DB2, Sybase, Oracle 10g, Java, Eclipse, Swing, XML, Quality Center.