Senior Hadoop Developer Resume

San Jose, CA

SUMMARY:

  • Over 8 years of experience in all phases of the Software Development Life Cycle, including Requirements Gathering/Analysis, Design, Development, Integration, Documentation, Testing, Build and Deployment of Web and Enterprise applications, and implementation of Big Data solutions using Hadoop.
  • 4 years of experience in building solutions for Big Data problems using HDFS, MapReduce, Pig, Hive, Sqoop, ZooKeeper, Flume and Oozie.
  • Experience in using various Hadoop components such as MapReduce, Pig, Hive, ZooKeeper, HBase, Sqoop, Oozie, Flume and Storm for data storage and streaming analysis.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components like MapReduce, HDFS, HBase, Oozie, Tez, Hive, Sqoop, Pig, ZooKeeper and Flume.
  • Experience in importing and exporting data using Sqoop from HDFS/Hive/HBase to Relational Database Systems and vice versa.
  • Proficient in HiveQL, Pig and SQL scripting and query optimization.
  • Experience in using Kafka as a distributed publisher-subscriber messaging system.
  • Strong experience working with real-time streaming applications and batch-style, large-scale distributed computing applications using tools like Spark Streaming, Flume and Hive.
  • Development experience with Big Data/NoSQL platforms, such as MongoDB and Cassandra.
  • Migrated RDBMS databases to different NoSQL databases.
  • Hands-on data warehousing experience with Extraction, Transformation and Loading (ETL) processes using Talend Open Studio for Data Integration.
  • Hands-on experience in performing data cleaning and preprocessing using Java and the Talend Data Preparation tool.
  • Good knowledge of job workflow scheduling and monitoring tools like Oozie and ZooKeeper.
  • Worked on implementing enterprise applications built on top of search engines like Solr and Elasticsearch.
  • Excellent Programming skills at a higher level of abstraction using Scala and Spark.
  • Good working experience in PySpark and Spark SQL.
  • Experience with databases like DB2, Oracle 10g, MySQL and SQL Server.
  • Proficient in using various IDEs like Eclipse and NetBeans.
  • Expertise in design and development of Web Applications involving J2EE technologies with Java, Spring, EJB, AJAX, Hibernate, JSP, Struts, PL/SQL, Web Services, XML, JMS and JDBC.
  • Familiar with data architecture, including data ingestion pipeline design, data modeling, data mining and advanced data processing.
  • Extensive experience in solving analytical problems with quantitative approaches and machine learning methods in R.
  • Excellent problem-solving skills and the ability to rapidly absorb new skills and adapt to new organizational contexts.

TECHNICAL SKILLS:

Big Data Ecosystems: Hadoop, MapReduce, HDFS, HBase, ZooKeeper, Tez, Hive, Pig, Sqoop, Oozie, Flume, Cassandra, Spark, Kafka, Apache Mahout, Solr

Programming Languages: Core Java, Python, Scala, SQL, PL/SQL, HiveQL, XML, R, C++

Databases: SQL Server, Oracle, DB2, SQL, MongoDB, Teradata

Tools: Eclipse, NetBeans, Tableau, Rational Rose, QMF, Talend, Endevor, Toad

Operating Systems: Unix, Linux, Windows, MVS, OS/390, Z/OS

Methodologies: Agile, Waterfall

J2EE Technologies: Spring, Struts, Hibernate, JMS, JNDI, Web Services, Servlet 2.0 and JAXB


PROFESSIONAL EXPERIENCE:

Confidential, San Jose, CA

Senior Hadoop Developer

Responsibilities:

  • Automated the extraction of data from warehouses and weblogs by developing workflows and coordinator jobs in Oozie.
  • Developed complex Hive queries and UDFs.
  • Involved in reading multiple data formats on HDFS using PySpark.
  • Converted Hive/SQL queries into Spark transformations using Spark RDDs and Python.
  • Developed multiple POCs using PySpark, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
  • Analyzed SQL scripts and designed solutions to implement them in PySpark.
  • Involved in loading data from the UNIX file system to HDFS.
  • Involved in extracting data from Teradata into HDFS using Sqoop.
  • Worked on migrating MapReduce programs into Spark transformations using Spark and Scala.
  • Handled importing of data from various data sources, performed transformations using Hive, MapReduce, Spark and loaded data into HDFS.
  • Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
  • Responsible for the analysis, design and testing phases, and for documenting technical specifications.
  • Worked on Talend Administrator Console (TAC) for scheduling jobs and data integration.
  • Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive.
  • Used HCatalog to access Hive table metadata from Pig code.
  • Good knowledge of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
  • Developed regression models in R for statistical analysis.
  • Involved in moving data from other databases to Cassandra, applying basic knowledge of Cassandra data modeling.
  • Worked on the core and Spark SQL modules of Spark extensively.
  • Ran Hadoop streaming jobs to process terabytes of data.
  • Imported real-time data into Hadoop using Kafka and implemented the Oozie job.

Confidential, Bellevue, WA

Hadoop Developer

Responsibilities:

  • Gathered requirements from the client and coordinated with onsite, offshore and client teams.
  • Experience in Hortonworks Data Platform (HDP) 2.2, MapReduce, Pig, Hive, Sqoop, Control-M, HBase and Storm.
  • Worked with large data sets on a large cluster.
  • Strong knowledge of data mining and data warehousing.
  • Used RabbitMQ as the messaging system.
  • Prepared and processed data to be loaded into HBase.
  • Loaded data into Hive and retrieved data from Hive tables using HiveQL.
  • Worked on loading the raw data extracts into Hive tables.
  • Worked on creating external and managed tables in Hive.
  • Designed HBase Schema, created HBase tables and loaded the historical data into HBase tables.
  • Loaded data into HBase tables using the HBase Put API and HBase bulk loading.
  • Updated the HBase tables daily using Oozie.
  • Worked on HBase and Hive integration and loaded the data into HBase tables.
  • Built dashboards in Tableau to present the data to higher levels of management.
  • Worked on project related documentation in Confluence.
  • Experience in offshore and onsite coordination.

Confidential, Boston, MA

Hadoop Developer

Responsibilities:

  • Involved in requirement gathering and business analysis, and translated business requirements into technical designs in Hadoop and Big Data.
  • Imported data from databases into HDFS and exported it back using Sqoop.
  • Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
  • Wrote MapReduce code to process and parse data from various sources and store the parsed data in HBase and Hive using HBase-Hive integration.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Involved in creating workflows to run multiple Hive and Pig jobs, which run independently based on time and data availability.
  • Developed shell scripts and automated end-to-end data management and integration work.
  • Used Pig as an ETL tool to perform transformations, event joins and some pre-aggregations before storing data in HDFS.
  • Developed a MapReduce program for parsing information and loading it into HDFS.
  • Built reusable Hive UDF libraries for business requirements, which enabled users to use these UDFs in Hive queries.
  • Automated and scheduled Sqoop jobs using Unix shell scripts.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig, and Sqoop.
  • Experienced in using ZooKeeper and Oozie operational services for coordinating the cluster and scheduling workflows.
  • Used HBase to store the majority of the data, which needs to be divided by region.
  • Developed MapReduce programs for data analysis and data cleaning.

Confidential

Java/JEE Developer

Responsibilities:

  • Involved in various phases of Software Development Life Cycle.
  • Interacted with all modules of the project, gathered the batch-related requirements and designed accordingly.
  • Used Eclipse as IDE for application development.
  • Created and maintained the configuration of the Spring Application Framework.
  • Involved in writing Spring configuration XML files that contain object declarations and declarations of dependent objects.
  • Designed and developed GUI using JSP, HTML, DHTML and CSS.
  • Worked with JMS for messaging interface.
  • Developed the UI using Java and used Oracle 10g as the backend, accessed through TOAD.
  • Used log4j extensively for logging.
  • Used Subversion as the version control system.
  • Responsible for understanding the scope of the project and requirement gathering.
  • Used Tomcat web server for development purpose.
  • Involved in creation of Test Cases for JUnit Testing.
  • Used Oracle as the database and Toad for query execution; also wrote SQL scripts and PL/SQL code for procedures and functions.
  • Used CVS as configuration management tool for code versioning and release.
  • Developed the application using Eclipse and used Maven as the build and deployment tool.
  • Used Log4j to print debug, warning and info messages to the server console.
  • Performed unit testing using JUnit.
  • Involved in scheduling all the batch tasks to run in different environments.
  • Used JMS to send and receive messages in XML format.
  • Configured the data source in the application server to access the Oracle database using the JDBC provider for Oracle.
  • Involved in the maintenance and production support.
