Senior Hadoop Developer Resume

FL

SUMMARY

  • 7 years of overall IT experience as a Hadoop Developer/Data Engineer with a strong emphasis on the Hadoop ecosystem, Azure, Spark, Informatica, SQL, and Tableau. Involved in various analysis, development, and implementation projects.
  • Hands-on experience with Hadoop and its components, including HDFS, Spark, Scala, MapReduce, Hive, Impala, Sqoop, Pig, Oozie, Kafka, MongoDB, PostgreSQL, and SQL Server.
  • Hands-on experience in Tableau
  • Hands-on experience writing SQL queries for data analysis.
  • Experienced in analyzing large data sets and deriving insights, creating visually compelling, actionable interactive reports and dashboards.
  • Proficient in designing and developing dashboards and reports using Tableau visualizations such as dual axes, bar graphs, scatter plots, heat maps, bubble charts, tree maps, waterfall charts, and geographic visualizations, making use of actions and other local and global filters according to end-user requirements.
  • Hands-on experience with database concepts such as data modeling, physical and logical schema design, creating triggers, indexes, and views, snowflake and star schemas, dimension and fact tables, slowly changing dimensions, and normalization.
  • Performed data quality checks on small data sets using Excel (VLOOKUP, pivot tables).
  • Involved in ingesting data from RDBMS to Hadoop and vice-versa using Sqoop.
  • Hands-on experience performing ETL using Hive scripts and loading data back into HDFS/Hive.
  • Strong knowledge of partitioning data in Hive using static and dynamic partitioning, and of bucketing in Hive (see the first sketch after this list).
  • Strong understanding of the Hive Metastore; used Impala for ad hoc and additional analysis.
  • Hands-on experience writing PySpark data frames in Jupyter notebooks and writing the output back to HDFS.
  • Automated, scheduled, and coordinated workflows in Hadoop using Oozie XML workflow definitions.
  • Experience using text, CSV, Excel, and JSON file formats in the Hadoop ecosystem.
  • Hands-on experience with Informatica and Teradata; supported multiple projects through the full life cycle.
  • Handled small-file issues in HDFS and resolved NameNode issues.
  • Performed Spark performance optimizations using broadcast joins, coalesce, and repartitioning (see the second sketch after this list).
  • Good knowledge of UNIX commands and writing shell scripts for QA purposes.
  • Worked across the Software Development Life Cycle (analysis, design, development, testing, implementation, deployment, support) using Waterfall, SAFe, Scrum, Agile, and Kanban methodologies.
  • Experience with the Azure cloud (ADF, Azure DevOps, Azure Data Lake, Databricks) and NoSQL databases such as MongoDB and HBase.
  • Coordinated work in an onsite-offshore delivery model.
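
As a first sketch of the Hive partitioning and bucketing mentioned above: a minimal PySpark example, where the table, path, and column names (events, event_date, user_id) are hypothetical rather than taken from any project described here.

    from pyspark.sql import SparkSession

    # enableHiveSupport lets saveAsTable register the table in the Hive Metastore.
    spark = (SparkSession.builder
             .appName("hive-partitioning-demo")
             .enableHiveSupport()
             .getOrCreate())

    df = spark.read.parquet("/data/raw/events")  # hypothetical source path

    # Dynamic partitioning: one HDFS subdirectory per distinct event_date value.
    # Bucketing: rows hashed into 8 buckets on user_id to speed up joins and sampling.
    (df.write
       .mode("overwrite")
       .partitionBy("event_date")
       .bucketBy(8, "user_id")
       .sortBy("user_id")
       .saveAsTable("analytics.events_partitioned"))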
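
And a second sketch of the Spark tuning techniques listed above (broadcast joins, coalesce, repartitioning); the data frame names, paths, and partition counts are illustrative assumptions.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import broadcast

    spark = SparkSession.builder.appName("spark-tuning-demo").getOrCreate()

    orders = spark.read.parquet("/data/orders")        # large fact data (hypothetical path)
    countries = spark.read.parquet("/data/countries")  # small dimension table

    # Broadcast join: ship the small table to every executor instead of shuffling the large one.
    enriched = orders.join(broadcast(countries), on="country_code", how="left")

    # Repartition on a key to balance data across tasks ahead of heavy shuffles.
    enriched = enriched.repartition(200, "customer_id")

    # Coalesce before writing to avoid producing thousands of tiny output files.
    enriched.coalesce(10).write.mode("overwrite").parquet("/data/orders_enriched")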

TECHNICAL SKILLS

Big Data Ecosystems: HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Spark, Kafka, YARN, NiFi

NoSQL Databases: MongoDB, HBase

Scripting Languages: Python, SQL, HiveQL, Unix shell scripts

Operating Systems: Windows, Linux, Unix

Databases: Oracle, Teradata, SQL Server

Tools and IDEs: Eclipse, IntelliJ

Version control/ Build Tools: GitHub

Methodologies: Waterfall, Agile

Cloud: Azure

Visualization tools: Tableau, Excel (VLOOKUP, pivot tables)

ETL Tool: Informatica

PROFESSIONAL EXPERIENCE

Confidential, FL

Senior Hadoop Developer

Responsibilities:

  • Designed and developed data flows from source through staging to Hive.
  • Supported enterprise NiFi data pipelines ingesting data from various sources into HDFS.
  • Wrote Sqoop scripts to import and export data between Teradata/SQL Server and HDFS.
  • Developed Spark applications using PySpark.
  • Developed Spark scripts using data frames and RDDs.
  • Extensively worked with Parquet and Avro file formats.
  • Built SCD-type dimension and fact tables using Spark applications.
  • Consumed messages from Kafka using Spark (see the streaming sketch after this section).
  • Extensively used Hive partitioned tables, map joins, and bucketing, gained a good understanding of dynamic partitioning, and worked with Impala for ad hoc analysis.
  • Experience in designing both time driven and data driven automated workflows using Oozie.
  • Gained good exposure to the Hue interface for monitoring job status, managing HDFS files, tracking scheduled jobs, and managing Oozie workflows.
  • Performed optimizations and performance tuning in Spark and Hive.
  • Developed UNIX scripts to automate data loads into HDFS.
  • Experienced in managing and reviewing Hadoop log files.
  • Good knowledge of Git commands, version tagging, and pull requests.
  • Performed unit, integration, and regression testing after development and participated in code reviews.
  • Deprecated on-prem processes by migrating them to Azure using Databricks and ADF pipelines.
  • Used Azure DevOps for build pipelines and automated deployment.
  • Interacted with business analysts to understand business requirements and translate them into technical requirements.
  • Collaborated with technical experts, architects, and developers on the design and implementation of technical requirements.
  • Documented business requirements, technical specifications, and process flows.

Environment: Hadoop (Cloudera), Spark, Hive, Impala, Sqoop, Oozie, Azure, Git, UNIX shell scripting, Teradata, SQL
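
A minimal sketch of consuming Kafka messages with Spark Structured Streaming, as referenced in the responsibilities above; the broker addresses, topic name, and HDFS paths are assumptions, and the job additionally needs the spark-sql-kafka connector on the classpath.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = SparkSession.builder.appName("kafka-consumer-demo").getOrCreate()

    # Subscribe to a Kafka topic (broker and topic names are hypothetical).
    stream = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
              .option("subscribe", "orders-topic")
              .option("startingOffsets", "latest")
              .load())

    # Kafka delivers key/value as binary; cast the payload to strings for parsing.
    messages = stream.select(col("key").cast("string"), col("value").cast("string"))

    # Land the messages on HDFS as Parquet; the checkpoint tracks consumed offsets.
    query = (messages.writeStream
             .format("parquet")
             .option("path", "/data/streams/orders")
             .option("checkpointLocation", "/checkpoints/orders")
             .start())
    query.awaitTermination()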

Confidential

Hadoop Developer/ ETL Developer

Responsibilities:

  • Analyzed data sets from different sources and domains using SQL and Excel.
  • Hands-on experience writing SQL queries.
  • Built sample aggregated reports using Excel pivot tables and VLOOKUPs.
  • Analyzed and supported ETL workflows in Informatica.
  • Developed code using ETL tools and documented the artifacts needed to deploy it.
  • Used Informatica PowerCenter 8.6 and 9.5 for extraction, transformation, and loading (ETL) of data.
  • Extensively used transformations such as Router, Aggregator, Normalizer, Joiner, Expression, Lookup, Update Strategy, Sequence Generator, Stored Procedure, XML, and SQL.
  • Scheduled jobs using the AutoSys scheduler.
  • Wrote Hive queries for data analysis and loaded data to meet business requirements.
  • Developed Spark ETL jobs to apply business rules and transform data (see the sketch after this section).
  • Analyzed performance improvement areas and implemented design approaches to increase performance.
  • Created Hive tables and loaded them using Spark.
  • Extracted data from various RDBMSs into HDFS using Sqoop.
  • Worked on several enhancements to meet client requirements and delivered them with no defects while meeting deadlines.
  • Produced high- and low-level designs, provided pseudocode, implemented prototypes, and conducted design reviews.
  • Responded to assigned issues, conducted analysis, and suggested or implemented workarounds.
  • Performed root-cause analysis of issues arising post-implementation and worked on fixes.

Environment: UNIX shell scripting, Informatica (ETL), Teradata, Agile development, SQL
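
A minimal sketch of a Spark ETL job of the kind described above (apply a business rule, transform, load into Hive); the paths, column names, and the rule itself are illustrative assumptions.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, upper, to_date

    spark = (SparkSession.builder
             .appName("spark-etl-demo")
             .enableHiveSupport()
             .getOrCreate())

    # Extract: read raw records staged from an RDBMS (hypothetical path).
    raw = spark.read.option("header", True).csv("/data/staging/customers")

    # Transform: apply business rules and normalize columns.
    clean = (raw
             .filter(col("status") == "ACTIVE")                # keep active records only
             .withColumn("country", upper(col("country")))     # normalize case
             .withColumn("signup_date", to_date(col("signup_date"), "yyyy-MM-dd")))

    # Load: persist into a managed Hive table for downstream analysis.
    clean.write.mode("overwrite").saveAsTable("staging.customers_clean")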

Confidential

Hadoop Developer

Responsibilities:

  • Interacted with business analysts to understand business requirements and translate them into technical requirements.
  • Collaborated with technical experts, architects, and developers on the design and implementation of technical requirements.
  • Built workflows using Informatica to load data into Teradata.
  • Designed and developed Data Ingestion scripts using Sqoop from Teradata to Hadoop/HDFS.
  • Consumed raw data and performed data cleaning to create a clean data source using Hive and Spark scripts.
  • Implemented business logic using Spark data frames (adding new columns, joining multiple data frames) and loaded data into Hive tables using Hive scripts (see the sketch after this section).
  • Performed QA checks across various environments using Hadoop Tech stack and Shell scripting.
  • Knowledge of SCD-type dimension and fact tables (star schema).
  • Extensively used Hive partitioned tables, map-side joins, and bucketing, and gained a good understanding of dynamic partitioning.
  • Used Hive and Impala to perform data queries and analysis as part of QA.
  • Wrote Sqoop scripts to import and export data between Teradata and HDFS.
  • Created reports for the product and client teams using Tableau.
  • Performed optimizations and performance tuning in Spark.
  • Knowledge of Git commands, version tagging, and pull requests.
  • Practical experience with developing applications in IntelliJ and Jupyter Notebooks.
  • Experience in handling offshore and on-site Change Approval calls.
  • Documented business requirements, technical specifications, and process flows.

Environment: Hadoop (Cloudera), Spark, Apache Pig, Apache Hive, HDFS, Sqoop, Oozie, IntelliJ, Git, UNIX shell scripting, Informatica, Teradata, Linux, Tableau, Agile development
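
A sketch of the data frame work referenced above (adding new columns and joining multiple data frames before loading Hive tables); the table and column names are hypothetical.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, round as sql_round

    spark = (SparkSession.builder
             .appName("dataframe-joins-demo")
             .enableHiveSupport()
             .getOrCreate())

    # Hypothetical cleaned sources produced by the upstream ingestion steps.
    sales = spark.table("staging.sales_clean")
    products = spark.table("staging.products_clean")

    # Join the data frames and derive a new column for reporting.
    report = (sales.join(products, on="product_id", how="inner")
              .withColumn("revenue", sql_round(col("quantity") * col("unit_price"), 2)))

    # Load the result into a partitioned Hive table.
    (report.write
       .mode("overwrite")
       .partitionBy("sale_date")
       .saveAsTable("reporting.sales_by_product"))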
