
Hadoop Developer Resume


New York, NY

PROFESSIONAL SUMMARY:

  • 9 years of experience in Java/J2EE, Big Data and cloud-based applications spanning multiple technologies and business domains.
  • 4+ years of design and development experience in Big Data on AWS - EMR, Spark, HDFS & YARN, Redshift.
  • Experience working with HBase and knowledge of Cassandra and MongoDB NoSQL databases.
  • Experience with ecosystem components: Spark Streaming, Spark SQL, Flume, Scala, MapReduce, Hive, Impala, Pig, Sqoop, Apache Drill and Oozie.
  • Experience in data analytics, writing real-time data processing applications in Scala and Python using Spark Streaming with Kafka, Spark SQL and Flume.
  • Extensive experience working with different file formats and compression techniques, setting up batch jobs and ETL/ELT pipelines, and automating them with Chef on AWS.
  • Good experience creating ETL data pipelines using MapReduce, Hive/Impala, Pig, Sqoop and UDFs (Hive, Pig) written in Java and Python.
  • Oracle expertise includes an in-depth understanding of database architecture and ERDs, Oracle programming, evaluation of application products, and database schema design and development.
  • Expertise in writing SQL and PL/SQL to integrate complex OLTP and OLAP database models and data marts; worked extensively on Oracle, SQL Server, and DB2.
  • Experience in ETL tools such as Informatica PowerCenter, Oracle PL/SQL, and Talend.
  • Experience in the full SDLC under both Waterfall and Agile (XP and Scrum) models.
  • Experience developing Shell, Perl and Python scripts (Linux & UNIX) to automate data extraction, data integration and report generation jobs.
  • Experience with version control systems such as CVS, SVN and Git, and with build tools for automated continuous integration such as Ant, Maven and Jenkins.
  • Hands-on experience with Business Intelligence tools for data visualization; generated BI analytic reports, specializing in Tableau.
  • Articulate in written and verbal communication along with strong interpersonal, analytical, and organizational skills.
  • Highly motivated team player with the ability to work independently and adapt quickly to new and emerging technologies.
  • Creatively communicate and present models to business customers and executives, utilizing a variety of formats and visualization methodologies.

TECHNICAL SKILLS:

Hadoop Ecosystem/Big Data: HDFS, MapReduce, Mahout, HBase, Pig, Hive, Sqoop, Flume, Oozie, Zookeeper, Apache Spark, Splunk, YARN, Avro, Impala.

Frameworks in Hadoop: Spark, Kafka, Cloudera CDH, Hortonworks HDP, Hadoop 1.0, Hadoop 2.0

Databases, Application Servers & NoSQL Databases: Oracle 11g, PL/SQL, MySQL, Microsoft SQL Server 2000, PostgreSQL, Cassandra, MongoDB, HBase.

Java & J2EE Technologies: Core Java, Hibernate, Spring Framework, JSP, Servlets, Java Beans, JDBC, EJB 3.0, Java Sockets, JavaScript & jQuery.

Amazon Web Services (AWS): Elastic MapReduce (EMR), Amazon EC2, Amazon S3.

Languages: C, Java, J2EE, PL/SQL, Pig Latin, HiveQL, UNIX shell scripting, JavaScript, XML, HTML, XHTML, HTML5, CSS, Python, Scala, AJAX, jQuery, AngularJS.

Source Code Control: GitHub, CVS, SVN.

IDE & Build Tools: Eclipse, NetBeans, Spring Tool Suite, Hue (Cloudera-specific), Toad, Maven.

Web/Application Servers: Apache Tomcat, WebLogic, IBM WebSphere

Analysis/Reporting: Ganglia, custom shell scripts, Tableau, ETL (Informatica)

Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP

Operating Systems: Sun Solaris, HP-UX, Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8

SDLC: Agile, Scrum, Waterfall models.

PROFESSIONAL EXPERIENCE:

Confidential, New York, NY

Hadoop Developer

Responsibilities:

  • Migrated MapReduce jobs to Spark jobs and used Spark SQL and the DataFrames API to load structured and semi-structured data into Spark clusters; hands-on expertise running Spark and Spark SQL on Amazon Elastic MapReduce (EMR) (see the Spark SQL sketch after this list).
  • Implemented Spark batch jobs on AWS instances, reading from and writing to Amazon Simple Storage Service (Amazon S3).
  • Used Spark Streaming in Scala to construct a learner data model from sensor data using MLlib.
  • Developed tasks and set up the required environment on AWS for running Hadoop in the cloud on various instance types.
  • Involved in the requirements and design phases to implement a streaming Lambda architecture for real-time processing using Spark and Kafka.
  • Implemented Parquet columnar storage, which serializes and stores data by column so that searches across large data sets and reads of large column sets are highly optimized.
  • Extracted data from Teradata into HDFS, databases and dashboards using Spark Streaming.
  • Imported data from RDBMS into HDFS using Sqoop queries.
  • Worked on the Spark engine, creating batch jobs with incremental loads through Kafka, Flume, HDFS/S3, sockets, etc.
  • Implemented DStreams on resilient distributed datasets (RDDs) over various windows while simultaneously updating log files for the streams (see the windowed-stream sketch after this list).
  • Extensive experience with Spark Streaming (version 2.0.0) through the core Spark API, running Scala, Java and Python scripts to transform raw data from several sources into baseline data.
  • Implemented various MapReduce jobs in custom environments and updated Hive tables by generating Hive queries.
  • Good hands-on experience writing HQL statements per user requirements.
  • Implemented Cassandra connections with resilient distributed datasets (local and cloud).
  • Solved performance issues in Hive and Pig scripts with an understanding of joins, grouping and aggregation, and how they translate to the underlying MapReduce jobs on EMR.
  • Developed UDFs in Scala, Java and Python as needed for use in Pig and Hive queries.
  • Experience using SequenceFile, RCFile and Avro file formats when developing UDFs.
  • Developed Oozie workflows for scheduling and orchestrating the ETL process.
  • Implemented authentication using Kerberos and authorization using Apache Sentry.
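
For illustration, a minimal sketch of the Spark SQL / DataFrames loading pattern described above, in Java against the Spark 2.0 API. The S3 paths, view names and columns (order_id, amount, event_type) are hypothetical:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class LoadToSpark {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("LoadStructuredAndSemiStructured")
                    .getOrCreate();

            // Structured data: CSV files with a header row (paths are hypothetical)
            Dataset<Row> orders = spark.read()
                    .option("header", "true")
                    .option("inferSchema", "true")
                    .csv("s3://example-bucket/orders/");

            // Semi-structured data: line-delimited JSON events
            Dataset<Row> events = spark.read().json("s3://example-bucket/events/");

            // Register temp views and join them with Spark SQL
            orders.createOrReplaceTempView("orders");
            events.createOrReplaceTempView("events");
            Dataset<Row> joined = spark.sql(
                    "SELECT o.order_id, o.amount, e.event_type "
                  + "FROM orders o JOIN events e ON o.order_id = e.order_id");

            // Write back to S3 as Parquet for optimized columnar reads
            joined.write().mode("overwrite").parquet("s3://example-bucket/joined/");
            spark.stop();
        }
    }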
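
Likewise, a sketch of the windowed DStream pattern over Kafka, using the spark-streaming-kafka-0-8 direct stream API; the broker address, topic name and comma-separated record layout are assumptions for the example:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;
    import kafka.serializer.StringDecoder;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaPairDStream;
    import org.apache.spark.streaming.api.java.JavaPairInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka.KafkaUtils;
    import scala.Tuple2;

    public class SensorStream {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("SensorStream");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, String> kafkaParams = new HashMap<>();
            kafkaParams.put("metadata.broker.list", "broker1:9092"); // hypothetical broker
            Set<String> topics = Collections.singleton("sensor-events"); // hypothetical topic

            JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
                    jssc, String.class, String.class,
                    StringDecoder.class, StringDecoder.class,
                    kafkaParams, topics);

            // Count events per sensor over a 60s window that slides every 10s,
            // keying each comma-separated record on its first field (sensor id)
            JavaPairDStream<String, Long> counts = stream
                    .mapToPair(record -> new Tuple2<>(record._2().split(",")[0], 1L))
                    .reduceByKeyAndWindow((a, b) -> a + b,
                            Durations.seconds(60), Durations.seconds(10));

            counts.print();
            jssc.start();
            jssc.awaitTermination();
        }
    }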

Environment: CDH 5.8.2 (installed on top of AWS), Hive 1.0, MapReduce 2.6.0 (YARN), Flume 1.6.0, Pig 0.16.0, Spark 2.0.0, Impala 2.2.0, AWS, Oozie 4.1.0, Kafka, Cloudera Manager.

Confidential, Dublin, OH

Hadoop Developer

Responsibilities:

  • Worked closely with business analysts and senior-level architects to design and configure master/slave architecture in testing, development and production using the Puppet configuration tool and the Hortonworks distribution.
  • Ingested large data sets from RDBMS into Hadoop edge nodes (and vice versa) using MapReduce, Sqoop and FTP shell scripts.
  • Created Hive tables with periodic backups and wrote complex Hive/Impala queries to run on Impala.
  • Implemented partitioning and bucketing in Hive, using file formats and compression techniques with optimizations.
  • Created Impala views on top of Hive tables for faster access when analyzing data.
  • Developed MapReduce programs in Java for filtering out unstructured data, and wrote custom input and output format classes on the HDFS storage layer (see the map-only filter sketch after this list).
  • Extensively involved in loading, filtering, transforming and combining data in Pig using custom UDFs, generic UDFs, and loader and storage classes from Piggybank.
  • Involved in resolving performance issues in Pig and Hive with an understanding of MapReduce physical plan execution, using debugging commands to run code in an optimized way.
  • Involved in unit testing by analyzing Informatica mappings and comparing them with Pig scripts that perform the same transformations per business user requirements.
  • Developed scripts using the Maven build tool in Jenkins to move source code from one environment to another, and was responsible for managing code in SVN version control and the associated access control strategies.
  • Involved in regular commissioning and decommissioning of nodes to balance the Hadoop cluster, along with NameNode archiving.
  • Involved in creating user-interactive sheets and reports per business requirements, and wrote SQL scripts to load data into the QlikView application.
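
As a sketch of the Java MapReduce filtering described above: a map-only job that keeps well-formed records and drops the rest. The five-field tab-delimited record rule is a hypothetical stand-in for the real validation logic:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class FilterJob {
        public static class FilterMapper
                extends Mapper<LongWritable, Text, Text, NullWritable> {
            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                // Keep only well-formed records; the field count is a hypothetical rule
                if (value.toString().split("\t").length == 5) {
                    ctx.write(value, NullWritable.get());
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "filter-unstructured");
            job.setJarByClass(FilterJob.class);
            job.setMapperClass(FilterMapper.class);
            job.setNumReduceTasks(0); // map-only filter, no shuffle needed
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(NullWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }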

Environment: Hortonworks, HDFS, Core Java, MapReduce 2.0.0 (YARN), Hive 0.13, Informatica PowerCenter, HQL, Pig 0.14.0, Flume 1.4.0, Perl scripting, Python 2.7, UNIX, Oozie, Shell Scripting, Maven, Teradata, QlikView.

Confidential, Nashville, TN

Hadoop Developer

Responsibilities:

  • Set up the cluster, handled configuration and maintenance, and installed components of the Hadoop ecosystem.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Stored data from HDFS in the respective Hive tables for business analysts to conduct further analysis in identifying data trends.
  • Developed Hive ad-hoc queries and filtered data to increase the effectiveness of process execution, using constructs such as joins, GROUP BY and HAVING.
  • Improved HiveQL execution time by partitioning the data, and reduced it further by applying compression techniques such as Snappy to MapReduce jobs (see the partitioned-table sketch after this list).
  • Created Hive partitions for storing data for different trends under different partitions.
  • Connected the Hive tables to data analysis tools like Tableau for graphical representation of the trends.
  • Assisted the project manager in troubleshooting Hadoop technologies for data integration between platforms, such as Hive-to-Sqoop and Sqoop-to-Hive.
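
A minimal sketch of combining Hive partitioning with Snappy compression, here issued over the HiveServer2 JDBC driver; the host, table names and columns are hypothetical:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class HivePartitionDemo {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:hive2://hive-server:10000/default");
                 Statement stmt = conn.createStatement()) {

                // Partition by trend date; Snappy-compressed ORC shrinks scans
                stmt.execute("CREATE TABLE IF NOT EXISTS trends ("
                        + "item STRING, score DOUBLE) "
                        + "PARTITIONED BY (trend_date STRING) "
                        + "STORED AS ORC TBLPROPERTIES ('orc.compress'='SNAPPY')");

                // Enable dynamic partitioning, then load from a staging table
                stmt.execute("SET hive.exec.dynamic.partition=true");
                stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");
                stmt.execute("INSERT OVERWRITE TABLE trends PARTITION (trend_date) "
                        + "SELECT item, score, trend_date FROM staging_trends");
            }
        }
    }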

Environment: Hortonworks, Java 7, HBase, HDFS, MapReduce, Hadoop 2.0, Hive, Pig, Eclipse, Linux, Sqoop, MySQL, Agile, Kafka, Cognos.

Confidential, Champaign, IL

Java/Hadoop Developer

Responsibilities:

  • Designed and created GUI screens using JSP, Servlets and HTML based on the Struts MVC framework.
  • Used JDBC to access the database.
  • Used JavaScript for client-side validation.
  • Performed validations using the Struts Validation Framework.
  • Provided commit and rollback methods for transaction processing (see the JDBC transaction sketch after this list).
  • Designed and developed action form beans and action classes, and implemented MVC using the Struts framework.
  • Wrote Oracle SQL stored procedures, functions and triggers.
  • Developed both session and entity beans representing different types of business logic abstractions.
  • Maintained the server log document.
  • Performed unit/integration testing for the test cases.
  • Designed and implemented the user interface for a web-based customer application.
  • Understood business needs, analyzed functional specifications and mapped them to the design and development of MapReduce programs and algorithms.
  • Wrote Pig and Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data; hands-on experience with Pig and Hive user-defined functions (UDFs) (see the Hive UDF sketch after this list).
  • Executed Hadoop ecosystem applications through Apache Hue.
  • Optimized Hadoop MapReduce code and Hive/Pig scripts for better scalability, reliability and performance.
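
A minimal sketch of the commit/rollback pattern mentioned above, using plain JDBC; the connection URL, credentials and accounts table are hypothetical:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class TransferDao {
        public void transfer(long fromId, long toId, double amount) throws SQLException {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/app", "user", "secret")) {
                conn.setAutoCommit(false); // begin the transaction
                try (PreparedStatement debit = conn.prepareStatement(
                             "UPDATE accounts SET balance = balance - ? WHERE id = ?");
                     PreparedStatement credit = conn.prepareStatement(
                             "UPDATE accounts SET balance = balance + ? WHERE id = ?")) {
                    debit.setDouble(1, amount);
                    debit.setLong(2, fromId);
                    debit.executeUpdate();
                    credit.setDouble(1, amount);
                    credit.setLong(2, toId);
                    credit.executeUpdate();
                    conn.commit(); // both updates succeed or fail together
                } catch (SQLException e) {
                    conn.rollback(); // undo any partial work on failure
                    throw e;
                }
            }
        }
    }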
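
And a sketch of a Hive user-defined function of the kind mentioned above, using the classic org.apache.hadoop.hive.ql.exec.UDF base class; the function itself (text normalization) is an invented example:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Normalizes free-text fields; register in Hive with, e.g.:
    //   ADD JAR normalize-udf.jar;
    //   CREATE TEMPORARY FUNCTION normalize_text AS 'NormalizeText';
    public final class NormalizeText extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }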

Environment: Java, JSP, HTML, CSS, JavaScript, jQuery, Struts 2.0, MySQL, Oracle, Hibernate, JDBC, Eclipse, SQL Stored Procedures, Tomcat, Hive, Pig, Sqoop, Flume and Cloudera.

Confidential

Associate Java Developer

Responsibilities:

  • Responsible for the requirements analysis and design of Smart Systems Pro (SSP).
  • Involved in Object-Oriented Design (OOD) and Analysis (OOA).
  • Analyzed and designed object models using Java/J2EE design patterns in various tiers of the application.
  • Worked with RESTful web services and WSDL.
  • Used the Maven build tool to build the project.
  • Wrote JavaScript code for UI validation and worked on Struts validation frameworks.
  • Analyzed client requirements and designed the specification document based on those requirements.
  • Implemented directives and scope values using AngularJS for an existing webpage.
  • Familiar with state-of-the-art standards and design processes for creating optimal UIs using Web 2.0 technologies such as Ajax, JavaScript, CSS, and XSLT.
  • Involved in the preparation of the program specification and unit test case documents.
  • Designed the prototype according to the business requirements.
  • Developed the web tier using JSP and Struts MVC to show account details and summaries.
  • Used the Struts Tiles framework in the presentation tier.
  • Designed and developed the UI using Struts view components, JSP, HTML, CSS and JavaScript.
  • Used AJAX for asynchronous communication with the server.
  • Used Hibernate for object/relational mapping to achieve transparent persistence onto the SQL Server database (see the entity-mapping sketch after this list).
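
A minimal sketch of a Hibernate-mapped entity for the transparent persistence mentioned above, using JPA annotations with field access; the entity and its columns are hypothetical:

    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.Id;
    import javax.persistence.Table;

    @Entity
    @Table(name = "accounts")
    public class Account {
        @Id
        @GeneratedValue
        private Long id;          // surrogate key generated on insert

        private String ownerName; // maps to a column of the same name by default
        private double balance;

        protected Account() { }   // Hibernate requires a no-arg constructor

        public Account(String ownerName, double balance) {
            this.ownerName = ownerName;
            this.balance = balance;
        }
    }

Persisting then reduces to session.save(new Account(...)) inside a Hibernate transaction, with no hand-written SQL.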

Environment: Java, J2EE, Servlets, JSP, JUnit, AJAX, XML, JSON, CSS, JavaScript, Spring, Struts, Hibernate, Eclipse, Apache Tomcat, and Oracle.

Confidential

Junior JAVA Developer

Responsibilities:

  • Worked on the design, development and support phases of the Software Development Life Cycle (SDLC).
  • Gathered requirements for data and use case development.
  • Reviewed the functional, design, source code and test specifications.
  • Involved in developing the complete front end using JavaScript.
  • Used JDBC for accessing the database, and used the Data Transfer Object (DTO) pattern (see the DTO sketch after this list).
  • Produced documentation meeting the required standards, and monitored end-to-end testing activities.
  • Worked with different IDEs such as Eclipse and NetBeans.
  • Good knowledge of database connectivity (JDBC) for databases such as Oracle, MySQL and Microsoft SQL Server.
  • Worked in various phases of the SDLC, such as requirements gathering, analysis and development.
  • Created user-friendly GUI interfaces and web pages using HTML and JSP.
  • Designed and developed a web-based client using JSP, JavaScript, HTML and XML.
  • Wrote SQL for JDBC prepared statements to retrieve data from the database.
  • Worked on and supported the creation of database schema objects in databases such as Oracle and SQL Server.
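
A minimal sketch of the JDBC prepared statement plus Data Transfer Object pattern from the bullets above; the customers table and its columns are hypothetical:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    public class CustomerDao {
        // Simple Data Transfer Object carrying query results between layers
        public static class CustomerDto {
            public long id;
            public String name;
        }

        public List<CustomerDto> findByCity(Connection conn, String city)
                throws SQLException {
            List<CustomerDto> result = new ArrayList<>();
            String sql = "SELECT id, name FROM customers WHERE city = ?";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, city); // parameter binding avoids SQL injection
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        CustomerDto dto = new CustomerDto();
                        dto.id = rs.getLong("id");
                        dto.name = rs.getString("name");
                        result.add(dto);
                    }
                }
            }
            return result;
        }
    }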

Environment: Java, JavaScript, J2EE, HTML, JSP, SQL, JDBC, MySQL, Oracle, XML.
