Hadoop Engineer Resume
SUMMARY
- Overall 8 years of IT experience spanning Hadoop ecosystem components and Java development.
- Experience in application development with Java, Hadoop, and its ecosystem: HDFS, MapReduce, Hive, Sqoop, Spark, Impala, Sentry, Flume, YARN, Oozie, and ZooKeeper.
- Experience working with the Cloudera (CDH5) and Hortonworks Hadoop distributions.
- Imported and exported data between RDBMS and HDFS using Sqoop on a regular basis.
- Good experience writing Hive and Impala queries for generating reports.
- Implemented utility UDFs in Hive, integrated with Sentry, used by business analysts to generate reports.
- Designed Flume agent configurations and deployed them in the cluster to ingest data from external sources into HDFS.
- Developed Kafka producers and consumers in a Kerberos-enabled Kafka cluster.
- Defined workflows using Oozie and troubleshot issues in production.
- Implemented business functionality as Spark SQL jobs in Scala (a sketch follows this summary).
- Implemented workflows for ETL processes and tested and troubleshot them.
- Good understanding of security implementations (MIT Kerberos, Apache Sentry) and their integration with Hadoop components.
- Experience troubleshooting issues in a multi-tenant cluster by analyzing diagnostic logs.
- Hands-on experience planning discovery, analytical, and compute-only clusters.
- Hands-on experience developing MapReduce jobs according to business use cases.
- Developed ingestion jobs that load flat files and RDBMS data into the Hadoop data lake using the custom tools Data Movement Framework and BDRE (Wipro).
- Evaluated frameworks such as Spark, Spark Streaming, Kafka, and Flume to empower tenants of HaaS clusters.
- Defined standards and guidelines for Hadoop and Hive.
- Evaluated multiple projects and consulted for various internal teams onboarding to the HaaS cluster.
- Developed a data mover utility in shell script that helps HaaS tenants move data between Prod, UAT, and Dev.
- Tuned existing Hive queries, MapReduce jobs, and Impala queries by analyzing query profiles.
- Participated in in-room and telephonic Scrum meetings to gather and analyze requirements and track development.
- Experience with UNIX, Linux, and Windows operating systems.
- Good exposure to databases such as Oracle, MSSQL, and MySQL.
- Worked on an HDFS encryption POC to support PCI guidelines on the cluster.
- Developed the DMF data ingestion tool using the Java Play Framework.
- Developed a custom InputFormat in Java for reading XML files in MapReduce.
- Write a blog on Hadoop ecosystem components and core Java (Confidential).
- Working experience on a template-based platform (Open Architecture Framework) built on J2EE architecture and standards.
- Strong troubleshooting and production support skills, with good end-user interaction.
- Skilled in problem solving, analysis, implementation, installation, and configuration.
- Good interpersonal skills; committed, result-oriented, and hard-working, with zeal to learn new technologies and take on challenging tasks. Excellent team member with strong communication skills, capable of meeting set deadlines.
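Illustrative sketch (not from a specific engagement): a minimal Spark SQL job in Scala of the kind referenced in the summary above. The table names (sales.transactions, reports.daily_sales_summary) and columns are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession

object ReportJobSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("daily-report")
      .enableHiveSupport()   // read Hive tables registered in the metastore
      .getOrCreate()

    // Hypothetical source table and columns; real jobs follow the same shape
    val daily = spark.sql(
      """SELECT region, COUNT(*) AS txn_count, SUM(amount) AS total_amount
        |FROM sales.transactions
        |WHERE txn_date = current_date()
        |GROUP BY region""".stripMargin)

    // Persist the results to a Hive table used for reporting
    daily.write.mode("overwrite").saveAsTable("reports.daily_sales_summary")
    spark.stop()
  }
}
```

A job of this shape would typically be packaged with Maven and submitted with spark-submit on YARN.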
TECHNICAL SKILLS
Programming Skills: Core Java (OOP and collections), J2EE Framework, Linux shell scripting, JDBC, Scala
Big Data: HDFS, YARN, Spark, MapReduce, Sqoop, Hive, Pig, HBase, Oozie, ZooKeeper, Kafka, Impala
Databases: Oracle 10g (SQL/PL-SQL), MSSQL, MySQL
Design Patterns: Singleton, Factory, MVC.
Secondary Skills: Ant, Maven
Version Control Systems: SVN, Git, ADE
Scripting Languages: Shell scripting
IDEs: Eclipse/MyEclipse, JDeveloper
Operating Systems: Windows, Linux, UNIX
Domains: Banking, SCE (Supply Chain Execution), Oracle Fusion GRC, Telecom
PROFESSIONAL EXPERIENCE
Confidential
Hadoop Engineer
Responsibilities:
- Developed a data mover utility in shell script that helps HaaS tenants move data between Prod, UAT, and Dev.
- Resolved issues with Flume and advised on Flume agent configuration changes.
- Implemented business functionality using Spark Core and Spark SQL.
- Transformed Hive jobs into Spark SQL.
- Imported and exported data between RDBMS and HDFS using Sqoop on a regular basis.
- Implemented Spark jobs in Scala as part of tool acceptance testing and worked with Cloudera to resolve issues in Spark.
- Developed new jobs using Spark and evaluated their performance.
- Wrote Hive and Impala queries.
- Implemented UDFs in Hive integrated with Sentry.
- Tuned existing Hive and Impala queries by analyzing query profiles.
- Developed Kafka producers and consumers in a Kerberos-enabled Kafka cluster (see the producer sketch after this list).
- Implemented a snapshot policy for HDFS files and replicated data between clusters using BDR (Backup and Disaster Recovery).
- Troubleshot existing jobs while migrating them to a higher version.
- Good understanding of the MIT Kerberos and Apache Sentry security implementations.
- Actively debugged application teams' P1 and P2 issues on the clusters.
- Developed utility scripts used for application development.
- Evaluated multiple projects and consulted for various internal teams on implementing big data projects.
- Developed scripts for Kafka metrics by leveraging the JMX API.
- Collaborated with application owners and other engineers to define requirements to design, build and tune complex solutions.
- Tuned existing Hive queries, MapReduce jobs, and Impala queries by analyzing query profiles.
- Troubleshot issues in the multi-tenant cluster by analyzing diagnostic logs.
- Worked with Cloudera on recommendations for future enhancements based on issues faced by HaaS tenants.
- Analyzed the performance of queries running on Impala.
- Defined standards and guidelines for Hadoop and Hive.
- Evaluated multiple projects and consulted for various internal teams onboarding to the HaaS cluster.
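Illustrative sketch: a minimal Kerberos-aware Kafka producer in Scala, consistent with the Kerberos-enabled Kafka work noted above. The broker list, topic name, keytab path, and principal are hypothetical placeholders; the configuration assumes the GSSAPI (Kerberos) SASL mechanism.

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object SecureProducerSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    // Broker list and topic are placeholders
    props.put("bootstrap.servers", "broker1:9093,broker2:9093")
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    // Kerberos (GSSAPI) settings for a secured cluster
    props.put("security.protocol", "SASL_PLAINTEXT")
    props.put("sasl.kerberos.service.name", "kafka")
    // Inline JAAS config; keytab path and principal are placeholders
    props.put("sasl.jaas.config",
      "com.sun.security.auth.module.Krb5LoginModule required " +
      "useKeyTab=true keyTab=\"/etc/security/keytabs/app.keytab\" " +
      "principal=\"app@EXAMPLE.COM\";")

    val producer = new KafkaProducer[String, String](props)
    try {
      producer.send(new ProducerRecord[String, String]("events", "key-1", "hello"))
      producer.flush()
    } finally {
      producer.close()
    }
  }
}
```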
Confidential
Hadoop developer
Responsibilities:
- Developed ingestion jobs that load flat files and RDBMS data into the Hadoop data lake using the custom ingestion tools Data Movement Framework and BDRE (Wipro).
- Developed MapReduce jobs according to business use cases.
- Managed and reviewed Hadoop log files.
- Exported data from HDFS to Teradata using Sqoop on a regular basis.
- Developed Hive queries and Hive UDFs according to business requirements.
- Developed Pig Latin scripts for analyzing semi-structured data and debugged Pig scripts.
- Designed data models for Hive and Cassandra tables.
- Converted MapReduce and Pig jobs into Spark jobs written in Scala.
- Transformed Hive jobs into Spark SQL.
- Developed new jobs using Spark and evaluated their performance.
- Troubleshot existing jobs while migrating them to a higher version.
- Developed Pig scripts to preprocess data before moving it into final tables.
- Developed Pig scripts to process data from different datasets and generate aggregated results.
- Designed Oozie jobs to import data into Hive tables using Sqoop.
- Implemented slowly changing dimensions (Type 1) using Pig scripts (a Spark rendering of the same pattern is sketched after this list).
- Worked closely with the data warehouse architect and business intelligence analysts to develop solutions.
- Implemented slowly changing dimensions (Type 2) using MapReduce.
- Tuned existing Hive queries, MapReduce jobs, and Pig scripts.
- Responsible for managing data coming from different sources.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Defined job flows for business use cases.
- Created workflows to run multiple Hive and Pig jobs that run independently based on time and data availability.
- Designed and developed the Hive data model, loaded data, and wrote Java UDFs for Hive.
- Developed Pig Latin scripts and used Pig as an ETL tool for transformations, event joins, and filters.
- Responsible for performing peer code reviews
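Illustrative sketch: the slowly-changing-dimension Type 1 pattern mentioned above, re-expressed with Spark DataFrames in Scala (the original implementation used Pig). The tables (dw.customer_dim, staging.customer_updates) and columns are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{coalesce, col}

object ScdType1Sketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("scd-type1")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical dimension and staging tables; key and attribute names are placeholders
    val dim   = spark.table("dw.customer_dim").as("d")
    val stage = spark.table("staging.customer_updates").as("s")

    // Type 1: incoming values simply overwrite the current row for the same key,
    // so a full outer join keeps new keys and retains untouched existing rows
    val merged = dim.join(stage, col("d.customer_id") === col("s.customer_id"), "full_outer")
      .select(
        coalesce(col("s.customer_id"), col("d.customer_id")).as("customer_id"),
        coalesce(col("s.name"),        col("d.name")).as("name"),
        coalesce(col("s.email"),       col("d.email")).as("email"))

    // Write to a staging table first; Spark cannot overwrite a table it is reading from
    merged.write.mode("overwrite").saveAsTable("dw.customer_dim_stage")
    spark.stop()
  }
}
```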
Environment: MapReduce, HDFS, Hive, Pig, Sqoop, Scala, HTML, XML, SQL, MySQL, J2EE, Eclipse, Spark Core, Spark SQL, Oozie, Cassandra, Oracle, MSSQL, Core Java, Play Framework, Hortonworks HDP 2.2, ORC, Snappy, Parquet, Avro, Cloudera
Confidential
Hadoop developer
Responsibilities:
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Imported and exported data into HDFS and Hive using Sqoop jobs.
- Extracted data from Oracle through Sqoop, placed it in HDFS, and processed it.
- Developed Hive queries and Hive UDFs according to business requirements.
- Developed Pig Latin scripts for analyzing semi-structured data and debugged Pig scripts.
- Designed data models for Hive and Cassandra tables.
- Troubleshot existing jobs while migrating them to a higher version.
- Developed Pig scripts to preprocess data before moving it into final tables.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Supported MapReduce programs running on the cluster.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions.
- Loaded data from the UNIX file system to HDFS, configured Hive, and wrote Hive UDFs.
- Used Java and MySQL day to day to debug and fix issues with client processes.
- Managed and reviewed log files
- Implemented partitioning, dynamic partitions, and buckets in Hive (a partition-load sketch follows this list).
- Coordinated effectively with the offshore team and managed project deliverables on time.
- Worked on QA support, test data creation, and unit testing activities.
- Responsible for creating Hive tables, loading the structured data produced by MapReduce jobs into those tables, and writing Hive queries to further analyze the logs and identify issues and behavioral patterns.
- Ran real-time processing using Storm.
- Used Hive to analyze data ingested into HBase via Hive-HBase integration and computed various metrics for dashboard reporting.
- Developed a data pipeline using Kafka to store data in HDFS.
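Illustrative sketch: Hive partitioning with a dynamic-partition insert, as mentioned above, issued here through a SparkSession so the example stays in Scala like the others. Database, table, and column names are hypothetical, and bucketing (also used) is omitted for brevity.

```scala
import org.apache.spark.sql.SparkSession

object PartitionedLoadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partitioned-load")
      .enableHiveSupport()
      .getOrCreate()

    // Enable dynamic-partition inserts (the same settings apply in the Hive CLI)
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    // Hypothetical target table, partitioned by event date and stored as ORC
    spark.sql(
      """CREATE TABLE IF NOT EXISTS logs.events_part (
        |  user_id STRING, action STRING, payload STRING)
        |PARTITIONED BY (event_date STRING)
        |STORED AS ORC""".stripMargin)

    // Dynamic-partition insert: the partition value comes from the data itself
    spark.sql(
      """INSERT OVERWRITE TABLE logs.events_part PARTITION (event_date)
        |SELECT user_id, action, payload, event_date
        |FROM staging.raw_events""".stripMargin)

    spark.stop()
  }
}
```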
Environment: JDK 1.6, Hive, Pig, MapReduce, Flume, Cassandra, Oracle 9i, YARN, Hadoop, HBase, Spark Core, Spark SQL, HDFS, Sqoop, Play Framework, XML, SQL, MySQL, J2EE, Eclipse, CDH5 (Cloudera distribution)
Confidential
Java developer
Responsibilities:
- Enhanced, implemented, and supported the product using Java, ADF, JSF, and shell script.
- Developed new UI screens for the TCG module using ADF, adhering to Fusion standards.
- Developed new UI screens using the ADF framework (JSFF, JSP, and task flows).
- Developed a shell script to deploy the application to the server.
- Integrated UI screens with ADF business components.
- Followed the best practice of tracking defects via Quality Center.
- Worked on code development in Java; involved in analysis, coding, and enhancement of the application.
- Performed analysis and bug fixing.
- Tracked defects in QC and interacted effectively with the QA team.
- Worked in an Agile development environment.
Confidential
Software Engineer
Responsibilities:
- Involved in product development and maintenance and fixed issues.
- Implemented a new functional module using J2EE and the customized framework (OA).
- Developed new screens using JSP and Servlets.
- Customized the business module using EJB and Java.
- Participated in Scrum meetings, developed features, and fixed issues.
- Developed Framework Manager models and analyzed those models in Analysis Studio.
- Fixed standard issues and client-reported issues.
- Involved in maintaining and developing the metadata model using Framework Manager.
- Installed and configured applications.
Environment: Java, Open Architecture, Linux, Core Java (OOP and collections), J2EE Framework, JSP, Servlets, Ant, Maven, Git, JavaScript, shell scripting