ETL Developer Resume

OBJECTIVE:

To work as a consultant on application design, development, maintenance, enhancement, and support, in a position that lets me apply my existing knowledge while developing new skills across different technologies.

SUMMARY

  • 7+ years of experience in the IT industry in Big Data/Hadoop development, Ab Initio, and Unix.
  • Experience in data ingestion from RDBMS to HDFS and Hive using Apache Sqoop (see the sketch after this list).
  • Experience in integrating Apache Hive with HBase and Elasticsearch.
  • Good SQL experience and exposure to MS SQL Server, MySQL, and Oracle databases.
  • Basic knowledge of SAP BO, Microsoft Power BI, QlikView, and Qlik Sense business intelligence and data visualization products.
  • Experience with Hadoop and its ecosystem components such as Hive, Pig, Sqoop, Impala, Greenplum, and Oozie.
  • Good knowledge of Apache Kafka and Spark (PySpark).
  • Experience in setup, installation, and configuration of multi-node Hadoop clusters using the Hortonworks and Cloudera distributions.
  • Explored deploying containers using Docker on Kubernetes.
  • Experience in methodologies such as Agile and Scrum.
  • Good problem-solving skills, communication skills, and willingness to learn.
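
As a concrete illustration of the Sqoop-based ingestion noted above, the sketch below wraps a sqoop import call in Python so it can be scheduled; the JDBC URL, credentials, and table names are hypothetical placeholders, not details from any real engagement.

```python
import subprocess

# Hypothetical connection details -- placeholders, not real systems.
JDBC_URL = "jdbc:mysql://db-host:3306/sales"
PASSWORD_FILE = "/user/etl/.db_password"  # password kept in HDFS, not on the command line

def sqoop_import_to_hive(table: str, hive_db: str, mappers: int = 4) -> None:
    """Run a Sqoop import that lands an RDBMS table directly into a Hive table."""
    cmd = [
        "sqoop", "import",
        "--connect", JDBC_URL,
        "--username", "etl_user",
        "--password-file", PASSWORD_FILE,
        "--table", table,
        "--hive-import",               # create/load the Hive table as part of the import
        "--hive-database", hive_db,
        "--hive-table", table.lower(),
        "--num-mappers", str(mappers),
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    sqoop_import_to_hive("CUSTOMERS", "staging")
```
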
TECHNICAL SKILLS

Primary skill: Hadoop

Sub skills: MapReduce, Sqoop, HiveQL, Impala, Unix shell scripting, Pig scripting, HBase, Oozie scheduling, Kafka, and Spark.

Secondary skills: Ab Initio, Core Java, Scala, Python, Unix

Sub skills: Core Java, Ab Initio ETL

PROFESSIONAL EXPERIENCE

Confidential

ETL Developer

Responsibilities:

  • Participated in client calls to gather and analyze the requirements.
  • Developed Ab Initio graphs for data extraction and scrubbing across various transformation processes.
  • Worked on change requests in the Ab Initio Graphical Development Environment (GDE) according to the specifications.
  • Built graphs to incorporate the transformation logic as per the business requirements.

Confidential

Admin

Responsibilities:

  • User provisioning (creation and deletion) on the Prod and Dev clusters according to client requests (see the sketch after this list).
  • Started learning about Hadoop and its components, including installation and configuration of a Cloudera cluster.
  • Monitored the alerts set on the Prod and Dev clusters for issues related to capacity usage and the services running on them.
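
A minimal sketch of the user-provisioning step described above, assuming each user gets an HDFS home directory; the group name and permissions are illustrative assumptions, not values from the actual clusters.

```python
import subprocess

def provision_hdfs_user(username: str, group: str = "hadoop-users") -> None:
    """Create an HDFS home directory for a new user and hand ownership to them."""
    home = f"/user/{username}"
    # Create the home directory, set ownership, and restrict permissions.
    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", home], check=True)
    subprocess.run(["hdfs", "dfs", "-chown", f"{username}:{group}", home], check=True)
    subprocess.run(["hdfs", "dfs", "-chmod", "750", home], check=True)

def deprovision_hdfs_user(username: str) -> None:
    """Remove a departed user's HDFS home directory (skipTrash for immediate cleanup)."""
    subprocess.run(["hdfs", "dfs", "-rm", "-r", "-skipTrash", f"/user/{username}"], check=True)

if __name__ == "__main__":
    provision_hdfs_user("jdoe")
```
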

Confidential

Hadoop Developer

Responsibilities:

  • Extracted DuckCreek XML from a SQL Server database using Sqoop.
  • Converted the XML data into JSON using MapReduce.
  • Inferred schema changes from the JSON using MapReduce.
  • Created Hive external tables on top of the JSON files.
  • Loaded the JSON data into Hive ORC tables (see the sketch after this list).
  • Performed incremental loads of tables into Confidential.
  • Installed HDP and Ambari to set up a 5-node cluster.
  • Loaded data from MS SQL Server to HDFS using Sqoop.
  • Loaded the Hive ORC tables' data into Confidential using Fluid Query.
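
A minimal sketch of the JSON-to-ORC loading described above, expressed as HiveQL issued through a Hive-enabled Spark session (the same statements could be run in Beeline); the database names, columns, and HDFS path are hypothetical, and the JsonSerDe shown assumes the hive-hcatalog-core jar is available on the cluster.

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("json-to-orc")
         .enableHiveSupport()  # needed so the tables land in the Hive metastore
         .getOrCreate())

# External table over the raw JSON files produced by the XML-to-JSON MapReduce step.
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS staging.policy_json (
        policy_id STRING,
        holder_name STRING,
        premium DOUBLE
    )
    ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
    STORED AS TEXTFILE
    LOCATION '/data/duckcreek/json/'
""")

# Managed ORC table that downstream queries and the Fluid Query export read from.
spark.sql("""
    CREATE TABLE IF NOT EXISTS warehouse.policy_orc (
        policy_id STRING,
        holder_name STRING,
        premium DOUBLE
    )
    STORED AS ORC
""")

# Reload the ORC table from the external JSON table.
spark.sql("INSERT OVERWRITE TABLE warehouse.policy_orc SELECT * FROM staging.policy_json")
```
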

Confidential

Hadoop Developer

Responsibilities:

  • Wrote Sqoop scripts to establish connections between SQL Server, Oracle, DB2, and Hadoop for loading data into Hive.
  • Wrote validation scripts to compare the data loaded into Hive against the source data (see the sketch after this list).
  • Connected SAP BO (Universe) and Power BI to Impala and Hive through ODBC connections.
  • Wrote several transformation scripts, such as updating Hive tables with flat-file data arriving on a weekly basis, loading DB2 data using hex functions, and loading CSV file data after cleansing.
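
A minimal sketch of the kind of load validation mentioned above: comparing row counts between the RDBMS source and the loaded Hive table with PySpark. The JDBC URL, credentials, driver class, and table names are placeholders, and the matching JDBC driver is assumed to be on the Spark classpath.

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("load-validation")
         .enableHiveSupport()
         .getOrCreate())

# Hypothetical source connection -- replace with the real JDBC URL and credentials.
SOURCE_URL = "jdbc:sqlserver://db-host:1433;databaseName=claims"
SOURCE_PROPS = {
    "user": "etl_user",
    "password": "change_me",
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
}

def validate_row_counts(source_table: str, hive_table: str) -> bool:
    """Compare row counts between the RDBMS source and the loaded Hive table."""
    source_count = (spark.read
                    .jdbc(SOURCE_URL, table=source_table, properties=SOURCE_PROPS)
                    .count())
    hive_count = spark.table(hive_table).count()
    print(f"{source_table}: source={source_count}, hive={hive_count}")
    return source_count == hive_count

if __name__ == "__main__":
    ok = validate_row_counts("dbo.claims", "staging.claims")
    print("validation passed" if ok else "validation FAILED")
```
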

Confidential

Developer

Responsibilities:

  • Understood and developed data pipelines with Kafka and processed the data in Spark (see the sketch after this list).
  • Worked with the business team to understand their requirements.
  • Programmed in Scala and Python to process data using Spark.
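
A minimal PySpark Structured Streaming sketch of the Kafka-to-Spark pipeline described above; the broker address, topic, and output paths are hypothetical, and the spark-sql-kafka connector package is assumed to be available.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-pipeline").getOrCreate()

# Read the raw event stream from Kafka (broker and topic are placeholders).
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "events")
          .option("startingOffsets", "latest")
          .load())

# Kafka delivers key/value as binary; cast to strings before downstream processing.
parsed = events.select(
    col("key").cast("string").alias("event_key"),
    col("value").cast("string").alias("payload"),
    col("timestamp"),
)

# Land the processed stream as Parquet, with checkpointing for fault-tolerant output.
query = (parsed.writeStream
         .format("parquet")
         .option("path", "/data/streams/events")
         .option("checkpointLocation", "/checkpoints/events")
         .outputMode("append")
         .start())

query.awaitTermination()
```
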

Confidential

Data Engineer

Responsibilities:

  • Automated loading of data from different database sources to HDFS using Sqoop jobs scheduled through Oozie coordinators.
  • Created Greenplum external tables on top of HDFS data using UTF-8 encoding and loaded them into internal tables after performing any required transformations in Pig.
  • Automated both full and incremental loads to Greenplum and checked data integrity.
  • To improve performance, began migrating transformations from Pig to PySpark (see the sketch below).
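
A minimal sketch of the Pig-to-PySpark migration mentioned above: a typical LOAD / FILTER / GROUP BY Pig flow rewritten as a DataFrame job whose output a Greenplum external table can pick up; the paths, schema, and delimiter are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("pig-to-pyspark").getOrCreate()

# Equivalent of Pig's LOAD ... USING PigStorage('|') with a declared schema.
orders = (spark.read
          .option("sep", "|")
          .option("header", "false")
          .schema("order_id STRING, region STRING, amount DOUBLE, order_date STRING")
          .csv("/data/raw/orders/"))

# FILTER + GROUP BY + SUM, the bread and butter of the old Pig scripts.
daily_totals = (orders
                .filter(F.col("amount") > 0)
                .groupBy("region", "order_date")
                .agg(F.sum("amount").alias("total_amount")))

# Write back to HDFS; a Greenplum external table (or gpfdist load) reads it from here.
(daily_totals.write
 .mode("overwrite")
 .option("sep", "|")
 .csv("/data/curated/daily_totals/"))
```
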
