
Sr. Hadoop Developer Resume


San Jose, CA

SUMMARY

  • IT professional with 9+ years of experience in analysis, design, development, integration, testing and maintenance of various applications using Java/J2EE technologies, including 6+ years of Big Data/Hadoop experience.
  • Expertise in big data architecture with the Hadoop Distributed File System and its ecosystem tools: MapReduce, HBase, Hive, Pig, ZooKeeper, Oozie, Flume, Avro, Impala and Apache Spark.
  • Strong experience in developing, debugging and tuning MapReduce jobs in the Hadoop environment (a minimal MapReduce sketch follows this summary).
  • Experienced in installing, configuring, deploying and managing Hadoop clusters using Apache, Cloudera, MapR and Hortonworks distributions.
  • Excellent understanding of Hadoop architecture and its components, such as HDFS, MapReduce, JobTracker, TaskTracker, NameNode, Secondary NameNode, Backup Node, Standby Node, DataNode, ResourceManager and NodeManager.
  • Hands-on experience in installing, configuring and using Hadoop ecosystem components such as MapReduce, HDFS, JobTracker, TaskTracker, YARN, Pig, Hive, Sqoop, Flume, Oozie and ZooKeeper.
  • Experienced in Hadoop cluster setup, monitoring and management using Cloudera Manager and AWS EMR.
  • Experienced in working with Hadoop clusters using Cloudera and Hortonworks distributions.
  • Expertise in developing Pig and Hive scripts for data analysis.
  • Hands-on experience in data mining, implementing complex business logic, optimizing HiveQL queries, and controlling data distribution through partitioning and bucketing to enhance performance.
  • Experience working with Flume to handle large volumes of streaming data.
  • Good working knowledge of Hue and the Hadoop ecosystem.
  • Extensive experience in migrating ETL operations into HDFS using Pig scripts.
  • Good knowledge of evaluating big data analytics libraries (MLlib) and using Spark SQL for exploratory data analysis.
  • Hands-on experience in using relational databases like Oracle, MySQL, PostgreSQL and MS-SQL Server.
  • Hands-on development experience with RDBMSs, including writing SQL queries, PL/SQL, views, stored procedures and triggers.
  • Participated in all Business Intelligence activities related to data warehousing, ETL and report development methodology.
  • Expertise in Waterfall and Agile software development models and in project planning using Microsoft Project Planner.
  • Highly motivated, dynamic, self-starter with keen interest in emerging technologies.
  • Experience in importing and exporting data with Sqoop between HDFS and relational database systems.
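
Illustrative sketch: a minimal word-count style MapReduce job in Java of the kind referenced above, using the org.apache.hadoop.mapreduce API. This is not code from any project below; the class name and input/output paths are assumptions for illustration only.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Mapper: emits (word, 1) for every token in the input line
        public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                for (String token : value.toString().split("\\s+")) {
                    if (!token.isEmpty()) {
                        word.set(token);
                        context.write(word, ONE);
                    }
                }
            }
        }

        // Reducer: sums the counts emitted for each word
        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));    // e.g. /data/input (illustrative)
            FileOutputFormat.setOutputPath(job, new Path(args[1]));  // e.g. /data/output (illustrative)
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }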

TECHNICAL SKILLS

Big Data Technologies: HDFS, MapReduce, Hive, HCatalog, Pig, Sqoop, Flume, Oozie, Avro, Hadoop Streaming, ZooKeeper, Kafka, Impala, Apache Spark

Hadoop Distributions: Cloudera (CDH4/CDH5), Hortonworks

Languages: Java, C, SQL, Python, PL/SQL, Pig Latin, HiveQL

IDE Tools: Eclipse, NetBeans, RAD

Frameworks: Hibernate, Spring, Struts, JUnit

Web Technologies: HTML5, CSS3, JavaScript, jQuery, AJAX, Servlets, JSP, JSON, XML, XHTML, JSF, AngularJS

Web Services: SOAP, REST, WSDL, JAXB, JAXP

Operating Systems: Windows (XP/7/8/10), UNIX, Linux, Ubuntu, CentOS

Application Servers: JBoss, Tomcat, WebLogic, WebSphere

Reporting/ETL Tools: Tableau, Power View for Microsoft Excel, Informatica

Databases: Oracle 8i/9i/10g/11g, MySQL, DB2, Derby, PostgreSQL; NoSQL: HBase, Cassandra

Methodologies: Agile, Scrum, Waterfall, Iterative, Spiral

PROFESSIONAL EXPERIENCE

Confidential - San Jose, CA

Sr. Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Wrote multiple MapReduce programs in Java for data analysis.
  • Wrote MapReduce jobs using Pig Latin and the Java API.
  • Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
  • Developed Pig scripts for analyzing large data sets in HDFS.
  • Collected logs from the physical machines and the OpenStack controller and integrated them into HDFS using Flume.
  • Designed and presented a plan for a POC on Impala.
  • Migrated HiveQL queries to Impala to minimize query response time.
  • Handled Hive queries using Spark SQL integrated with the Spark environment.
  • Implemented Avro and Parquet data formats for Apache Hive computations to handle custom business requirements.
  • Responsible for creating Hive tables, loading the structured data resulting from MapReduce jobs into those tables, and writing Hive queries to further analyze the logs to identify issues and behavioral patterns.
  • Worked on SequenceFiles, RCFiles, map-side joins, bucketing and partitioning for Hive performance enhancement and storage improvement.
  • Performed extensive data mining using Hive.
  • Implemented daily cron jobs that automated parallel data loads into HDFS using Autosys and Oozie coordinator jobs.
  • Analyzed and optimized RDDs by controlling partitions for the given data.
  • Good understanding of the DAG execution cycle for the entire Spark application flow via the Spark application Web UI.
  • Wrote real-time processing jobs using Spark Streaming with Kafka (see the streaming sketch after this list).
  • Developed custom mappers in Python and Hive UDFs/UDAFs based on the given requirements (a Hive UDF sketch follows the Environment line below).
  • Used HiveQL to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Queried data using Spark SQL on top of the Spark engine.
  • Managed and monitored the Hadoop cluster using Cloudera Manager.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig, Hive and Sqoop.
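
Illustrative sketch of a Spark Streaming job consuming from Kafka, assuming Spark 2.x with the spark-streaming-kafka-0-10 integration and Java 8. The broker address, topic name and group id are assumptions; the job simply counts records per 10-second batch rather than reproducing any project logic.

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;

    public class StreamingRecordCount {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("kafka-streaming-sketch");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "broker1:9092");   // illustrative broker
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "log-consumers");           // illustrative group id
            kafkaParams.put("auto.offset.reset", "latest");

            JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                    jssc,
                    LocationStrategies.PreferConsistent(),
                    ConsumerStrategies.<String, String>Subscribe(
                        Arrays.asList("app-logs"), kafkaParams));   // illustrative topic

            // Count records in each micro-batch and print the result to the driver log
            stream.map(ConsumerRecord::value)
                  .count()
                  .print();

            jssc.start();
            jssc.awaitTermination();
        }
    }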

Environment: CDH, Java (JDK 1.7), Hadoop, MapReduce, HDFS, Hive, Sqoop, Flume, HBase, Cassandra, Pig, Oozie, Kerberos, Scala, Spark, Spark SQL, Spark Streaming, Kafka, Linux, AWS, Shell Scripting, MySQL, Oracle 11g, PL/SQL, SQL*Plus
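
Illustrative sketch of a simple Hive UDF in Java of the kind mentioned above. The class name and masking behavior (hiding all but the last four characters of a string) are assumptions for illustration, not the actual UDFs developed on the project.

    import org.apache.hadoop.hive.ql.exec.Description;
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Illustrative UDF: masks all but the last four characters of a string,
    // e.g. for hiding account numbers in reporting tables.
    @Description(name = "mask_tail",
                 value = "_FUNC_(str) - masks all but the last 4 characters of str")
    public class MaskTailUDF extends UDF {
        private final Text result = new Text();

        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            String s = input.toString();
            int keep = Math.min(4, s.length());
            StringBuilder masked = new StringBuilder();
            for (int i = 0; i < s.length() - keep; i++) {
                masked.append('*');
            }
            masked.append(s.substring(s.length() - keep));
            result.set(masked.toString());
            return result;
        }
    }

Once packaged into a jar, a UDF like this is registered in Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION mask_tail AS 'MaskTailUDF', after which it can be called inside HiveQL queries.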

Confidential - Harrisburg, PA

Hadoop Developer

Responsibilities:

  • Created Hive tables and loaded retail transactional data from Teradata using Sqoop.
  • Loaded home mortgage data from the existing DWH tables (SQL Server) into HDFS using Sqoop.
  • Wrote Hive queries to provide a consolidated view of the mortgage and retail data; also developed a Solr bolt to write and index documents for fast search.
  • Loaded data back into Teradata for reporting and for business users to analyze and visualize with Datameer.
  • Orchestrated hundreds of Sqoop scripts, Pig scripts and Hive queries using Oozie workflows and sub-workflows.
  • Loaded load-ready files from mainframes into Hadoop, converting the files to ASCII format.
  • Configured HiveServer2 (HS2) to enable analytical tools such as Tableau, QlikView and SAS to interact with Hive tables (see the JDBC client sketch after this list).
  • Developed Pig scripts to replace the existing legacy home-loans process on Hadoop, with the data fed back to legacy retail mainframe systems.
  • Developed MapReduce programs to write data with headers and footers, and shell scripts to convert the data to a fixed-length format suitable for mainframe CICS consumption (a fixed-width formatting sketch follows the Environment line below).
  • Used Hive to find correlations between customers' browsing logs across different sites and analyzed them to build risk profiles.
  • Performed end-to-end performance tuning of Hadoop clusters and Hadoop MapReduce routines against very large data sets.
  • Exposure to burn-up/burn-down charts, dashboards and velocity reporting of sprint and release progress.
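
Illustrative sketch of how a client connects to HiveServer2 over JDBC once HS2 is configured as above. The host, port, credentials, table and query are assumptions for illustration only.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveServer2Client {
        public static void main(String[] args) throws Exception {
            // Hive JDBC driver; host, port, database and table names are illustrative
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            String url = "jdbc:hive2://hs2-host:10000/default";

            try (Connection conn = DriverManager.getConnection(url, "hive_user", "");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(
                     "SELECT loan_type, COUNT(*) AS cnt FROM mortgage_accounts GROUP BY loan_type")) {
                while (rs.next()) {
                    System.out.println(rs.getString("loan_type") + "\t" + rs.getLong("cnt"));
                }
            }
        }
    }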

Environment: Hive, Pig, HDFS, Java MapReduce, Solr, Core Java, UNIX, Eclipse, Oozie, Sqoop, Flume
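
Illustrative sketch of the kind of fixed-length record conversion mentioned above, written in Java rather than shell for consistency with the other sketches. The delimiter and field widths are assumptions, not the actual mainframe CICS record layout.

    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.IOException;

    // Converts pipe-delimited records into fixed-length records
    // (illustrative layout: 10-char account id, 30-char name, 12-char amount).
    public class FixedWidthConverter {

        private static String pad(String value, int width) {
            if (value.length() >= width) {
                return value.substring(0, width);   // truncate overlong fields
            }
            StringBuilder sb = new StringBuilder(value);
            while (sb.length() < width) {
                sb.append(' ');                     // right-pad with spaces
            }
            return sb.toString();
        }

        public static void main(String[] args) throws IOException {
            try (BufferedReader in = new BufferedReader(new FileReader(args[0]));
                 BufferedWriter out = new BufferedWriter(new FileWriter(args[1]))) {
                String line;
                while ((line = in.readLine()) != null) {
                    String[] f = line.split("\\|", -1);
                    out.write(pad(f[0], 10) + pad(f[1], 30) + pad(f[2], 12));
                    out.newLine();
                }
            }
        }
    }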

Confidential

Hadoop Developer

Responsibilities:

  • Started with a POC on Cloudera Hadoop, converting small, medium and complex legacy systems into Hadoop.
  • Installed various Hadoop ecosystem components and Hadoop daemons.
  • Loaded data from the UNIX file system into HDFS (see the HDFS client sketch after this list).
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Integrated data from various sources into Hadoop and moved data from Hadoop to other databases using Sqoop import and export.
  • Used Cloudera Manager to pull metrics on various cluster features such as JVM usage and running map and reduce tasks.
  • Backed up configuration and performed recovery from a NameNode failure.
  • Commissioned and decommissioned nodes on a running Hadoop cluster.
  • Wrote SQL queries to load the required data into HDFS.
  • Managed and reviewed Hadoop log files.
  • Created user accounts and granted users access to the Hadoop cluster.
  • Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
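
Illustrative sketch of loading a local (UNIX file system) file into HDFS with the Hadoop FileSystem Java API; the paths are assumptions, and the same task can equally be done from the command line with hdfs dfs -put.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsLoader {
        public static void main(String[] args) throws Exception {
            // Reads fs.defaultFS from core-site.xml on the classpath
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            Path local = new Path(args[0]);   // e.g. /data/exports/sales.csv (illustrative)
            Path remote = new Path(args[1]);  // e.g. /user/etl/landing/sales.csv (illustrative)

            fs.copyFromLocalFile(local, remote);   // upload the local file into HDFS
            System.out.println("Copied " + local + " to " + remote);
            fs.close();
        }
    }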

Environment: Cloudera Distribution, CDH4, Flume, HBase, HDFS, Pig, R, MapReduce, Hive, Oozie and ZooKeeper.

Confidential

Java Developer

Responsibilities:

  • Worked on requirement analysis and gathered all possible business requirements from end users and business analysts.
  • Involved in creating UML diagrams such as class, activity and sequence diagrams using IBM Rational Rose modeling tools.
  • Worked extensively with core Java code using interfaces and multi-threading techniques.
  • Involved in production support and documented the application to provide training and knowledge transfer to users.
  • Used Log4j as the logging mechanism and developed wrapper classes to configure the logs.
  • Used JUnit test cases to test the application modules.
  • Developed and configured the Java beans using the Spring MVC framework.
  • Developed the application using Rational Team Concert and worked in an Agile environment.
  • Developed SQL stored procedures and prepared statements for updating and accessing data in the database (see the JDBC sketch after this list).
  • Also used C++ to create some libraries used in the application.
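
Illustrative sketch of a JDBC prepared statement of the kind mentioned above, assuming an Oracle thin-driver connection; the connection details, table and column names are assumptions for illustration only.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class CustomerQuery {
        public static void main(String[] args) throws Exception {
            // Oracle thin-driver URL; host, SID, credentials, table and columns are illustrative
            Class.forName("oracle.jdbc.OracleDriver");
            String url = "jdbc:oracle:thin:@db-host:1521:ORCL";

            try (Connection conn = DriverManager.getConnection(url, "app_user", "app_pass");
                 PreparedStatement ps = conn.prepareStatement(
                     "SELECT customer_id, status FROM customers WHERE region = ?")) {
                ps.setString(1, "NE");   // bind the parameter instead of concatenating SQL
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getLong("customer_id") + " " + rs.getString("status"));
                    }
                }
            }
        }
    }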

Environment: C++, Java, JDBC, Servlets, JSP, Struts, Eclipse, Oracle 9i, Apache Tomcat, CVS, JavaScript, Log4j
