
Senior Cloud/Data Engineer Resume


SUMMARY

  • A Google Certified Associate Cloud Engineer and skilled big data developer with 12+ years of IT experience in designing, implementing, and supporting Cloud, Big Data, and data warehouse applications.
  • 8+ years of Hadoop experience in the design and development of Big Data applications, including developing Spark/Scala jobs for processing large volumes of data.
  • Implemented Spark jobs in Scala to process large volumes of data for daily batch jobs.
  • More than a year of experience implementing cloud solutions using BigQuery, Composer, Airflow, Cloud SQL, Cloud Storage, Cloud Functions, and Stackdriver.
  • Rich experience in Apache Hadoop MapReduce, YARN, Spark, Pig, Sqoop, Hue, Flume, Kafka, and Oozie.
  • Experience in handling huge volumes of streaming messages from Flume/Kafka.
  • Excellent understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, YARN, and the MapReduce framework.
  • Sound knowledge of the Software Development Life Cycle (SDLC), OOP, Agile methodology and the Scrum process, data warehouse concepts, and database management practices.
  • Extensive experience extracting, transforming, and loading data from various sources, including RDBMS, flat files, and XML, into data warehouses and data marts.
  • Implemented solutions to move on-prem data into predefined GCP storage locations.
  • Wrote data from DataFrames to temporary tables in BigQuery (a minimal sketch follows this summary).
  • Implemented fact and dimension tables in BigQuery.
  • Involved in designing and implementing Spark framework code.
  • Created Spark SQL queries and DataFrames to read Parquet data and to create and load the RSP tables in Impala using the Scala API.
  • Created and updated Pentaho jobs to extract, transform, and load data from Hive tables into Impala tables with Snappy-compressed Parquet formatting for BA reporting.
  • Created web services to transfer internal claims to Guidewire PolicyCenter.
  • Good interpersonal skills with the ability to handle multiple tasks and priorities; self-motivated and quick to understand and apply new concepts.
  • Ability to work independently with minimal supervision and manage multiple projects in a fast-paced environment to meet deadlines; an excellent team player.
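
A minimal sketch of the DataFrame-to-temporary-BigQuery-table load mentioned above, assuming pandas DataFrames and the google-cloud-bigquery client; the project, dataset, and table names are placeholders, not the original ones.

    # Minimal sketch: write a DataFrame to a temporary staging table in BigQuery.
    # Project, dataset, and table names are placeholders.
    import pandas as pd
    from google.cloud import bigquery

    def load_to_temp_table(df: pd.DataFrame,
                           table_id: str = "my-project.staging.orders_tmp") -> None:
        client = bigquery.Client()
        job_config = bigquery.LoadJobConfig(
            # Replace the temporary table on each run.
            write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
        )
        load_job = client.load_table_from_dataframe(df, table_id, job_config=job_config)
        load_job.result()  # wait for the load job to finish

    # Example usage:
    # load_to_temp_table(pd.DataFrame({"id": [1, 2], "amount": [10.5, 20.0]}))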

TECHNICAL SKILLS

Insurance LOB: Personal Auto, Personal Property, Commercial Auto, Commercial Property.

ETL Tools: Informatica PowerCenter, Pentaho 7.1

Languages: Shell Scripting, SQL, Java, Scala and Python.

Big Data Technologies: MapReduce, Hive, Impala, Spark, Sqoop, Kafka and Flume.

Cloud Technologies: BigQuery, Cloud Storage, Cloud Functions, Dataflow and Pub/Sub.

RDBMS: Oracle 11g, DB2, MySQL.

OS: Windows, UNIX, Linux.

Version Control/BTS: CVS, Visual SourceSafe, TortoiseSVN, JIRA, Bitbucket, SourceTree, GitHub, Confluence.

PROFESSIONAL EXPERIENCE

Confidential

Senior Cloud/Data Engineer

Responsibilities:

  • Involved in user interactions, requirement analysis, and design for the interfaces.
  • Created Cloud Functions triggered by the GCS bucket finalize event after a file is uploaded to Cloud Storage, storing the file information in a technical metastore (see the sketch after this role).
  • Involved in implementing the data collection framework and batch ingestion frameworks using Python.
  • Implemented solutions to move on-prem data into predefined GCP storage locations.
  • Wrote data from DataFrames to temporary tables in BigQuery.
  • Implemented fact and dimension tables in BigQuery.
  • Worked with business users to create the pricing analytical model using BigQuery.

Environment: Cloud Storage, Confluent Kafka, Cloud Functions, Composer, Airflow, BigQuery, Python and GitHub.
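
A minimal sketch of the GCS-triggered Cloud Function described above, assuming a 1st-gen Python background function on the google.storage.object.finalize event; the metastore write is a placeholder for whatever technical metastore the project actually used.

    # Minimal sketch: background Cloud Function (Python, 1st gen) triggered when an
    # object is finalized (fully uploaded) in a GCS bucket.
    # record_file_metadata() is a placeholder for the project's technical metastore write.

    def record_file_metadata(entry: dict) -> None:
        # Placeholder: persist the entry to the technical metastore (e.g. a database table).
        print(f"metastore entry: {entry}")

    def handle_gcs_finalize(event: dict, context) -> None:
        """Entry point: called once per finalized object in the bucket."""
        record_file_metadata({
            "bucket": event.get("bucket"),
            "name": event.get("name"),
            "size": event.get("size"),
            "content_type": event.get("contentType"),
            "created": event.get("timeCreated"),
        })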

Confidential

Big Data/Spark Developer

Responsibilities:

  • Prepared the High-Level & Low-Level Design Confluence pages.
  • Performed code reviews and supported the technical team on various activities.
  • Involved in designing and implementing Spark framework code.
  • Created Spark SQL queries and DataFrames to read Parquet data and to create and load the RSP tables in Impala using the Scala API (see the sketch after this role).
  • Implemented Spark jobs using Scala for processing large volumes of data for daily batch jobs.

Environment: HDFS, Spark, DataFrames, Scala, Impala, Jira, GitHub, Confluence and Unix.
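
A minimal PySpark sketch of the read-Parquet / load-table pattern described above; the actual jobs used the Scala DataFrame API, and the paths, columns, and table names here are placeholders.

    # Minimal sketch: read Parquet, transform with Spark SQL, and write a Parquet-backed
    # table that Impala can query (after INVALIDATE METADATA / REFRESH on the Impala side).
    # Paths, columns, and table names are placeholders; the original jobs were in Scala.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("rsp-load-sketch")
        .enableHiveSupport()
        .getOrCreate()
    )

    claims = spark.read.parquet("hdfs:///data/claims/parquet/")
    claims.createOrReplaceTempView("claims_raw")

    rsp = spark.sql("""
        SELECT policy_id, claim_id, claim_amount, claim_date
        FROM claims_raw
        WHERE claim_amount > 0
    """)

    # Write as a Parquet table registered in the metastore for Impala/BA reporting.
    rsp.write.mode("overwrite").format("parquet").saveAsTable("analytics.rsp_claims")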

Confidential

Big Data/Spark Developer

Responsibilities:

  • Provided production support and monitored the batch jobs in Control-M.
  • Created and updated data correction jobs to fix production data issues.
  • Implemented solutions to improve performance of Impala/Hive queries.
  • Supported the implementation and drove it to a stable state in production.
  • Troubleshot production data issues for financial reports and implemented solutions to maintain healthy audit checks.
  • Implemented solutions for Impala/Hive queries and Sqoop jobs to support the Cloudera upgrade.

Environment: HDFS, Hive, Spark, Impala, Control-M, Maven, Jira, GitHub and ServiceNow.

Confidential

Pentaho/ETL Developer

Responsibilities:

  • Prepared the High-Level & Low-Level Design documents. Performed code reviews and supported the technical team on various activities.
  • Designed, developed, and executed Pentaho MapReduce jobs to parse various input XMLs into CSV files (see the sketch after this role).
  • Created and updated Pentaho jobs to extract, transform, and load data from Hive tables into Impala tables with Snappy-compressed Parquet formatting for BA reporting.
  • Created web services to transfer internal claims to Guidewire PolicyCenter.

Environment: MapReduce, HDFS, Pentaho 7.1/8.3, Hive, Impala, Jira, GitHub, Confluence.
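
A minimal Python sketch of the XML-to-CSV parsing performed by the Pentaho MapReduce jobs above; the element and field names are assumed placeholders, and the real parsing ran inside Pentaho Data Integration rather than in standalone Python.

    # Minimal sketch of XML-to-CSV parsing; element and field names are placeholders.
    import csv
    import xml.etree.ElementTree as ET

    def xml_to_csv(xml_path: str, csv_path: str) -> None:
        root = ET.parse(xml_path).getroot()
        with open(csv_path, "w", newline="") as out:
            writer = csv.writer(out)
            writer.writerow(["claim_id", "policy_id", "amount"])  # header row
            for claim in root.iter("claim"):  # assumed element name
                writer.writerow([
                    claim.findtext("id", default=""),
                    claim.findtext("policyId", default=""),
                    claim.findtext("amount", default=""),
                ])

    # Example usage:
    # xml_to_csv("claims.xml", "claims.csv")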

Confidential

Hadoop Developer

Responsibilities:

  • Collected a variety of large data sets across the business systems and applications and imported them into Hive and HDFS using the data ingestion tools Sqoop and Flume.
  • Designed, developed, and executed Java MapReduce programs to merge small files and split big files according to the HDFS block size (see the sketch after this role).
  • Designed, developed, and executed Java MapReduce programs to parse various input files into CSV files.
  • Created and updated the mappings to extract, transform, and load the historical data into the Hive tables.
  • Implemented business logic by writing Pig UDFs and Hive generic UDFs in Java, and used various UDFs from Piggybank and other sources.
  • Participated in peer design and code reviews.
  • Responsible for DD as Scrum Master, handling the sprint roadmap, sprint plans, daily scrums, sprint demos, sprint retrospectives, and sprint execution.

Environment: Java, MapReduce, HDFS, Sqoop, Flume, Pig, Hive, XML, JSON, MySQL, Linux and Oozie.
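
A plain-Python illustration of the small-file merge logic mentioned above, packing many small files into outputs close to a block size; the original implementation was a Java MapReduce program on HDFS, and the paths and 128 MB block size here are assumptions.

    # Illustration only: pack many small files into outputs close to the HDFS block size.
    # The original was a Java MapReduce job; paths and block size are placeholders.
    import glob
    import os

    BLOCK_SIZE = 128 * 1024 * 1024  # assumed HDFS block size

    def merge_small_files(input_glob: str, output_dir: str) -> None:
        os.makedirs(output_dir, exist_ok=True)
        part, written = 0, 0
        out = open(os.path.join(output_dir, f"part-{part:05d}"), "wb")
        for path in sorted(glob.glob(input_glob)):
            size = os.path.getsize(path)
            # Roll over to a new output file once the current one would exceed a block.
            if written and written + size > BLOCK_SIZE:
                out.close()
                part, written = part + 1, 0
                out = open(os.path.join(output_dir, f"part-{part:05d}"), "wb")
            with open(path, "rb") as src:
                out.write(src.read())
            written += size
        out.close()

    # Example usage:
    # merge_small_files("/data/landing/*.dat", "/data/merged")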

Confidential

ETL/Java Developer

Responsibilities:

  • Implemented ETL jobs using the UNIX scheduler.
  • Developed and executed mappings, workflows, worklets, and sessions using Informatica.
  • Coordinated with the business group and QA team to identify issues.
  • Created and executed unit test plans.
  • Participated in system design and technical solutions.
  • Implemented requirements and enhancements, including coding, testing, and debugging.
  • Fixed bugs reported by the QA team and the business group.

Environment: Informatica, Windows XP, DB2, Sybase, Oracle 10g, Java, Eclipse, Swing, XML, Quality Center.
