Senior Hadoop Developer Resume
SUMMARY
- Motivated IT professional with 14 years of experience in data warehousing technologies as Tech Lead, Software Developer and Designer, and 3 years of hands-on experience with Big Data Hadoop (Cloudera distribution CDH 4 and 5) technologies such as Hive, Sqoop, Impala, HDFS, Apache Spark and Flume.
- Experience in Team Leading, Software Development & Design.
- Diverse domain expertise including Manufacturing, Health Care, Life Sciences, Banking and Retail.
- Experience in Data Warehousing technologies such as Informatica PowerCenter.
- Good experience in Oracle 9i, Oracle 10g and Oracle 11g.
- Experience in all areas of project life cycle using both proprietary methodologies and Agile Techniques.
- Solid expertise in the workings of Hadoop internals, architecture and supporting ecosystem components like Hive, Spark, Sqoop, Pig, Impala and Flume.
- Adept at HiveQL, with good experience in time-based partitioning, dynamic partitioning and bucketing to optimize Hive queries. Also used Hive's MapJoin to speed up queries where possible.
- Used Hive to create tables in both delimited text storage format and binary storage format.
- Excellent working experience with two popular Hadoop binary storage formats: Avro data files and SequenceFiles.
- Experience developing Hive UDAFs to apply custom aggregation logic.
- Good working experience using Sqoop to import data into HDFS from RDBMS and vice versa. Also have good experience using Sqoop direct mode with external tables to perform very fast data loads.
- Used Airflow DAGs to create workflow and coordinator jobs that schedule and execute various Hadoop jobs such as Spark, Hive, Pig and Sqoop operations.
- Good knowledge of Spark, Python and Scala.
- Good conceptual understanding and experience in cloud computing applications using Amazon EC2, S3, EMR.
- Proven ability to work under pressure, prioritize and meet deadlines. Comfortable in dynamic work environments and able to work collaboratively with business analysts, testers, developers and other team members to improve overall product quality.
- Strong business acumen, strategic thinking, communication, interpersonal and presentation skills, adept at resolving conflicts.
TECHNICAL SKILLS
Hadoop Ecosystem Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Spark, Impala, Flume, Oozie, Airflow DAG
Programming Languages: SQL and PL/SQL, Python, Scala
Operating Systems: Windows 98/XP/2000/NT/VISTA, Unix
RDBMS Databases: Oracle 9i, Oracle 10g, Oracle 11g, DB2, Sybase, SQL Server, Netezza
Scripting Language: Shell Scripting
Tools: TOAD, SQL Developer, ANT, Maven, Visio, Informatica PowerCenter, SVN, Bitbucket, Control-M, Autosys, Workload Automation
PROFESSIONAL EXPERIENCE
Confidential
Senior Hadoop Developer
Responsibilities:
- Building a Data Quality framework, which consists of a common set of model components and patterns that can be extended to implement complex process controls and data quality measurements using Hadoop.
- Created and populated bucketed tables in Hive to enable faster map-side joins, more efficient jobs and more efficient sampling. Also partitioned data to optimize Hive queries.
- Worked extensively with Sqoop to move data from Netezza and Oracle to HDFS.
- Scheduled Oozie workflows to run multiple Sqoop and Hive jobs that trigger independently based on time and data availability.
- Used Spark SQL functions to move data from staging Hive tables to fact and dimension tables in HDFS, implementing change data capture (CDC) logic, and performed interactive querying.
- Managing and scheduling jobs on a Hadoop cluster using Oozie.
- Implemented dynamic partitioning in Hive tables and used appropriate file formats and compression techniques to improve the performance of MapReduce jobs.
- Experience in managing and reviewing Hadoop Log files generated through YARN.
- Work with Data Engineering Platform team to plan and deploy new Hadoop Environments and expand existing Hadoop clusters.
- Monitor Autosys jobs and resolve issues in case of failure.
Confidential
Senior Hadoop Developer
Responsibilities:
- Hands-on experience in loading data from the UNIX file system to HDFS. Also performed parallel transfer of data from the landing zone to HDFS using DistCp.
- Loaded and transformed large sets of structured and semi-structured data using Sqoop and placed the results in HDFS for further processing.
- Designed appropriate partitioning and bucketing schemes in Hive to allow faster data retrieval during analysis.
- Involved in processing the data in Hive tables using high-performance, low-latency HiveQL queries.
- Exported the analyzed data from HDFS to relational databases using Sqoop, enabling the BI team to visualize analytics.
- Developed custom aggregate functions using Spark SQL and performed interactive querying.
- Managing and scheduling jobs on a Hadoop cluster using Airflow DAGs.
- Involved in creating Hive tables, loading data and running Hive queries on that data.
- Extensive working knowledge of partitioned tables, UDFs, performance tuning and compression-related properties in Hive.
- Work with Data Engineering Platform team to plan and deploy new Hadoop Environments and expand existing Hadoop clusters.
- Monitor Autosys jobs and resolve issues in case of failure.
- Deploy Informatica objects in production repository.
- Monitor and debug Informatica components in case of failure or performance issues.
Confidential
Hadoop Developer
Responsibilities:
- Moving data from Oracle to HDFS and vice versa using Sqoop.
- Collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
- Worked with different file formats and compression techniques to determine standards.
- Developed Hive queries to analyze/transform the data in HDFS.
- Designed and implemented multi-level partitioning and bucketing in Hive.
- Analyzing and transforming data with Impala and Hive.
- Gathering client requirements and writing techno-functional requirement documents.
- Reviewing design and development artifacts to ensure quality in the products being developed.
- Performing analysis for various enhancements, including impact analysis to identify the systems and programs that could be affected by proposed changes.
- Promoting the code through various stages (SIT, UAT and PROD).
- Estimating effort for change requests.
- Responsible for providing recommendations and technical solutions for improving processes.
- Interacting with the customer to provide regular status updates.
- Coordinated effectively with the offshore team and delivered project deliverables on time.
