Hadoop Engineer Resume
SUMMARY
- Overall 8 years of IT experience spanning Hadoop ecosystem components and Java development.
- Experience in application development with Java, Hadoop, and its ecosystem: HDFS, MapReduce, Hive, Sqoop, Spark, Impala, Sentry, Flume, YARN, Oozie, and ZooKeeper.
- Experience working with the Cloudera (CDH5) and Hortonworks Hadoop distributions.
- Imported and exported data between RDBMS and HDFS using Sqoop on a regular basis.
- Good experience writing Hive and Impala queries for generating reports.
- Implemented utility UDFs in Hive, integrated with Sentry, used by business analysts to generate reports.
- Designed Flume agent configurations and deployed them in the cluster to ingest data from external sources into HDFS.
- Developed Kafka producers and consumers in a Kerberos-enabled Kafka cluster.
- Defined workflows using Oozie and troubleshot issues in production.
- Implemented business functionality as Spark SQL jobs in Scala (a sketch follows this summary).
- Implemented workflows for ETL processes and tested and troubleshot them.
- Good understanding of security implementations (MIT Kerberos, Apache Sentry) and their integration with Hadoop components.
- Experience troubleshooting issues in a multi-tenant cluster by analyzing diagnostic logs.
- Hands-on experience planning discovery, analytical, and compute-only clusters.
- Hands-on experience developing MapReduce jobs according to business use cases.
- Developed ingestion jobs that load flat files and RDBMS data into the Hadoop data lake using the custom tools Data Movement Framework and BDRE (Wipro).
- Evaluated frameworks such as Spark, Spark Streaming, Kafka, and Flume to empower tenants of HaaS clusters.
- Defined standards and guidelines for Hadoop and Hive.
- Evaluated multiple projects and consulted for various internal teams onboarding to the HaaS cluster.
- Developed a data mover utility in shell script that helps HaaS tenants move data between Prod, UAT, and Dev.
- Tuned existing Hive queries, MapReduce jobs, and Impala queries by analyzing query profiles.
- Participated in in-room and telephonic Scrum meetings to gather and analyze requirements and track development.
- Experience with UNIX, Linux, and Windows operating systems.
- Good exposure to databases such as Oracle, MSSQL, and MySQL.
- Worked on an HDFS encryption POC to support PCI guidelines on the cluster.
- Developed the DMF data ingestion tool using the Java Play Framework.
- Developed a custom InputFormat in Java for reading XML files in MapReduce.
- Write a blog on Hadoop ecosystem components and core Java (Confidential).
- Working experience on a template-based platform (Open Architecture Framework) built on J2EE architecture and standards.
- Strong troubleshooting and production support skills, with good end-user interaction.
- Skilled in problem solving, analysis, implementation, installation, and configuration.
- Good interpersonal skills; committed, result-oriented, and hard-working, with zeal to learn new technologies and take on challenging tasks. Excellent team member with strong communication skills, capable of meeting set deadlines.
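Illustrative sketch (not from a specific engagement): a minimal Spark SQL job in Scala of the kind referenced in the summary above. The table names (sales.transactions, reports.daily_sales_summary) and columns are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession

object ReportJobSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("daily-report")
      .enableHiveSupport()   // read Hive tables registered in the metastore
      .getOrCreate()

    // Hypothetical source table and columns; real jobs follow the same shape
    val daily = spark.sql(
      """SELECT region, COUNT(*) AS txn_count, SUM(amount) AS total_amount
        |FROM sales.transactions
        |WHERE txn_date = current_date()
        |GROUP BY region""".stripMargin)

    // Persist the results to a Hive table used for reporting
    daily.write.mode("overwrite").saveAsTable("reports.daily_sales_summary")
    spark.stop()
  }
}
```

A job of this shape would typically be packaged with Maven and submitted with spark-submit on YARN.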
TECHNICAL SKILLS
Programming Skills: Core Java (OOP and collections), J2EE Framework, Linux shell scripting, JDBC, Scala
Big Data: HDFS, YARN, Spark, MapReduce, Sqoop, Hive, Pig, HBase, Oozie, ZooKeeper, Kafka, Impala
Databases: Oracle 10g (SQL/PL-SQL), MSSQL, MySQL
Design Patterns: Singleton, Factory, MVC.
Secondary Skills: Ant, Maven
Version Control Systems: SVN, Git, ADE
Scripting Languages: Shell scripting
IDEs: Eclipse/MyEclipse, JDeveloper
Operating Systems: Windows, Linux, UNIX
Domains: Banking, SCE (Supply Chain Execution), Oracle Fusion GRC, Telecom
PROFESSIONAL EXPERIENCE
Confidential
Hadoop Engineer
Responsibilities:
- Developed a data mover utility in shell script that helps HaaS tenants move data between Prod, UAT, and Dev.
- Resolved issues with Flume and advised on Flume agent configuration changes.
- Implemented business functionality using Spark Core and Spark SQL.
- Transformed Hive jobs into Spark SQL.
- Imported and exported data between RDBMS and HDFS using Sqoop on a regular basis.
- Implemented Spark jobs in Scala as part of tool acceptance testing and worked with Cloudera to resolve issues in Spark.
- Developed new jobs using Spark and evaluated their performance.
- Wrote Hive and Impala queries.
- Implemented UDFs in Hive integrated with Sentry.
- Tuned existing Hive and Impala queries by analyzing query profiles.
- Developed Kafka producers and consumers in a Kerberos-enabled Kafka cluster (see the producer sketch after this list).
- Implemented a snapshot policy for HDFS files and replicated data between clusters using BDR (Backup and Disaster Recovery).
- Troubleshot existing jobs while migrating them to a higher version.
- Good understanding of the MIT Kerberos and Apache Sentry security implementations.
- Actively debugged application teams' P1 and P2 issues on the clusters.
- Developed utility scripts used for application development.
- Evaluated multiple projects and consulted for various internal teams on implementing big data projects.
- Developed scripts for Kafka metrics by leveraging the JMX API.
- Collaborated with application owners and other engineers to define requirements to design, build and tune complex solutions.
- Tuned existing Hive queries, MapReduce jobs, and Impala queries by analyzing query profiles.
- Troubleshot issues in the multi-tenant cluster by analyzing diagnostic logs.
- Worked with Cloudera on recommendations for future enhancements based on issues faced by HaaS tenants.
- Analyzed the performance of queries running on Impala.
- Defined standards and guidelines for Hadoop and Hive.
- Evaluated multiple projects and consulted for various internal teams onboarding to the HaaS cluster.
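Illustrative sketch: a minimal Kerberos-aware Kafka producer in Scala, consistent with the Kerberos-enabled Kafka work noted above. The broker list, topic name, keytab path, and principal are hypothetical placeholders; the configuration assumes the GSSAPI (Kerberos) SASL mechanism.

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object SecureProducerSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    // Broker list and topic are placeholders
    props.put("bootstrap.servers", "broker1:9093,broker2:9093")
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    // Kerberos (GSSAPI) settings for a secured cluster
    props.put("security.protocol", "SASL_PLAINTEXT")
    props.put("sasl.kerberos.service.name", "kafka")
    // Inline JAAS config; keytab path and principal are placeholders
    props.put("sasl.jaas.config",
      "com.sun.security.auth.module.Krb5LoginModule required " +
      "useKeyTab=true keyTab=\"/etc/security/keytabs/app.keytab\" " +
      "principal=\"app@EXAMPLE.COM\";")

    val producer = new KafkaProducer[String, String](props)
    try {
      producer.send(new ProducerRecord[String, String]("events", "key-1", "hello"))
      producer.flush()
    } finally {
      producer.close()
    }
  }
}
```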
Confidential
Hadoop developer
Responsibilities:
- Developed ingestion jobs that load flat files and RDBMS data into the Hadoop data lake using the custom ingestion tools Data Movement Framework and BDRE (Wipro).
- Developed MapReduce jobs according to business use cases.
- Managed and reviewed Hadoop log files.
- Exported data from HDFS to Teradata using Sqoop on a regular basis.
- Developed Hive queries and Hive UDFs according to business requirements.
- Developed Pig Latin scripts for analyzing semi-structured data and debugged Pig scripts.
- Designed data models for Hive and Cassandra tables.
- Converted MapReduce and Pig jobs into Spark jobs written in Scala.
- Transformed Hive jobs into Spark SQL.
- Developed new jobs using Spark and evaluated their performance.
- Troubleshot existing jobs while migrating them to a higher version.
- Developed Pig scripts to preprocess data before moving it into final tables.
- Developed Pig scripts to process data from different datasets and generate aggregated results.
- Designed Oozie jobs to import data into Hive tables using Sqoop.
- Implemented slowly changing dimensions (Type 1) using Pig scripts (a Spark rendering of the same pattern is sketched after this list).
- Worked closely with the data warehouse architect and business intelligence analysts to develop solutions.
- Implemented slowly changing dimensions (Type 2) using MapReduce.
- Tuned existing Hive queries, MapReduce jobs, and Pig scripts.
- Responsible for managing data coming from different sources.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Defined job flows for business use cases.
- Created workflows to run multiple Hive and Pig jobs that run independently based on time and data availability.
- Designed and developed the Hive data model, loaded data, and wrote Java UDFs for Hive.
- Developed Pig Latin scripts and used Pig as an ETL tool for transformations, event joins, and filters.
- Responsible for performing peer code reviews
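Illustrative sketch: the slowly-changing-dimension Type 1 pattern mentioned above, re-expressed with Spark DataFrames in Scala (the original implementation used Pig). The tables (dw.customer_dim, staging.customer_updates) and columns are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{coalesce, col}

object ScdType1Sketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("scd-type1")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical dimension and staging tables; key and attribute names are placeholders
    val dim   = spark.table("dw.customer_dim").as("d")
    val stage = spark.table("staging.customer_updates").as("s")

    // Type 1: incoming values simply overwrite the current row for the same key,
    // so a full outer join keeps new keys and retains untouched existing rows
    val merged = dim.join(stage, col("d.customer_id") === col("s.customer_id"), "full_outer")
      .select(
        coalesce(col("s.customer_id"), col("d.customer_id")).as("customer_id"),
        coalesce(col("s.name"),        col("d.name")).as("name"),
        coalesce(col("s.email"),       col("d.email")).as("email"))

    // Write to a staging table first; Spark cannot overwrite a table it is reading from
    merged.write.mode("overwrite").saveAsTable("dw.customer_dim_stage")
    spark.stop()
  }
}
```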
Environment: MapReduce, HDFS, Hive, Pig, Sqoop, Scala, HTML, XML, SQL, MySQL, J2EE, Eclipse, Spark Core, Spark SQL, Oozie, Cassandra, Oracle, MSSQL, Core Java, Play Framework, Hortonworks HDP 2.2, ORC, Snappy, Parquet, Avro, Cloudera
Confidential
Hadoop developer
Responsibilities:
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Imported and exported data into HDFS and Hive using Sqoop jobs.
- Extracted data from Oracle through Sqoop, placed it in HDFS, and processed it.
- Developed Hive queries and Hive UDFs according to business requirements.
- Developed Pig Latin scripts for analyzing semi-structured data and debugged Pig scripts.
- Designed data models for Hive and Cassandra tables.
- Troubleshot existing jobs while migrating them to a higher version.
- Developed Pig scripts to preprocess data before moving it into final tables.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Supported MapReduce programs running on the cluster.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions.
- Loaded data from the UNIX file system to HDFS, configured Hive, and wrote Hive UDFs.
- Used Java and MySQL day to day to debug and fix issues with client processes.
- Managed and reviewed log files
- Implemented partitioning, dynamic partitions, and buckets in Hive (a partition-load sketch follows this list).
- Coordinated effectively with the offshore team and managed project deliverables on time.
- Worked on QA support, test data creation, and unit testing activities.
- Responsible for creating Hive tables, loading the structured data produced by MapReduce jobs into those tables, and writing Hive queries to further analyze the logs and identify issues and behavioral patterns.
- Ran real-time processing using Storm.
- Used Hive to analyze data ingested into HBase via Hive-HBase integration and computed various metrics for dashboard reporting.
- Developed a data pipeline using Kafka to store data in HDFS.
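Illustrative sketch: Hive partitioning with a dynamic-partition insert, as mentioned above, issued here through a SparkSession so the example stays in Scala like the others. Database, table, and column names are hypothetical, and bucketing (also used) is omitted for brevity.

```scala
import org.apache.spark.sql.SparkSession

object PartitionedLoadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partitioned-load")
      .enableHiveSupport()
      .getOrCreate()

    // Enable dynamic-partition inserts (the same settings apply in the Hive CLI)
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    // Hypothetical target table, partitioned by event date and stored as ORC
    spark.sql(
      """CREATE TABLE IF NOT EXISTS logs.events_part (
        |  user_id STRING, action STRING, payload STRING)
        |PARTITIONED BY (event_date STRING)
        |STORED AS ORC""".stripMargin)

    // Dynamic-partition insert: the partition value comes from the data itself
    spark.sql(
      """INSERT OVERWRITE TABLE logs.events_part PARTITION (event_date)
        |SELECT user_id, action, payload, event_date
        |FROM staging.raw_events""".stripMargin)

    spark.stop()
  }
}
```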
Environment: JDK 1.6, Hive, Pig, MapReduce, Flume, Cassandra, Oracle 9i, YARN, Hadoop, HBase, Spark Core, Spark SQL, HDFS, Sqoop, Play Framework, XML, SQL, MySQL, J2EE, Eclipse, CDH5 (Cloudera distribution)
Confidential
Java developer
Responsibilities:
- Enhanced, implemented, and supported the product using Java, ADF, JSF, and shell script.
- Developed new UI screens for the TCG module using ADF, adhering to Fusion standards.
- Developed new UI screens using the ADF framework (JSFF, JSP, and task flows).
- Developed a shell script to deploy the application to the server.
- Integrated UI screens with ADF business components.
- Followed the best practice of tracking defects via Quality Center.
- Worked on code development in Java; involved in analysis, coding, and enhancement of the application.
- Performed analysis and bug fixing.
- Tracked defects in QC and interacted effectively with the QA team.
- Worked in an Agile development environment.
Confidential
Software Engineer
Responsibilities:
- Involved in product development and maintenance and fixed issues.
- Implemented a new functional module using J2EE and the customized framework (OA).
- Developed new screens using JSP and Servlets.
- Customized the business module using EJB and Java.
- Participated in Scrum meetings, developed features, and fixed issues.
- Developed Framework Manager models and analyzed those models in Analysis Studio.
- Fixed standard issues and client-reported issues.
- Involved in maintaining and developing the metadata model using Framework Manager.
- Installed and configured applications.
Environment: Java, Open Architecture, Linux, Core Java (OOP and collections), J2EE Framework, JSP, Servlets, Ant, Maven, Git, JavaScript, shell scripting