Sr. Hadoop Developer Resume

San Mateo, CA

SUMMARY

  • Overall 8 years of experience in software product development, including 3+ years developing large-scale applications using Hadoop and Big Data tools.
  • Hands on experience in designing, testing and deploying Map Reduce applications in a Hadoop Ecosystem.
  • Experienced in the Hadoop ecosystem components like Hadoop Map Reduce, Cloudera, Hortonworks, HBase, Spark, Oozie, Hive, Sqoop, Pig, Flume, Tez and Cassandra.
  • Good programming experience with SQL and PL/SQL database technologies on relational databases including Oracle, Teradata and MS SQL Server.
  • Experienced with NoSQL databases - HBase, MongoDB and Cassandra.
  • Good proficiency in Big Data-centric languages such as Python and Scala.
  • Good experience and working knowledge of AWS Cloud services including, but not limited to, EC2, S3, Redshift, DynamoDB, EMR, IAM, SQS, SES, Lambda, VPC, CloudWatch and CloudFront.
  • Experience with different Hadoop distributions such as Hortonworks and Cloudera, as well as working in the AWS cloud environment.
  • Familiar with data visualization using R and SAS, with knowledge of AWS services.
  • Configured Spark Streaming to receive real-time data from Kafka and store the streamed data to HDFS (a sketch follows this list). Working knowledge of Teradata and its Big Data tools.
  • Worked on visualization and BI tools such as Tableau and QlikView.
  • Experience in importing/exporting terabytes of data using Sqoop from HDFS to RDBMS and vice-versa.
  • Good knowledge on Hadoop MRV1 and MRV2 (YARN) Architecture.
  • Experience across the full SDLC (Analysis, Design, Development, Integration and Testing) in diversified areas of Client-Server/Enterprise applications using Java and J2EE technologies.
  • Proficient in Java, J2EE, JDBC, Collections, Servlets, JSP, Struts, Spring, Hibernate, JAXB, JSON, XML, XSLT, XSD, JMS, WSDL, WADL, REST, SOAP Web services, CXF, Groovy, Grails, Jersey, Gradle and Eclipse Link.
  • Strong experience in ETL tools on Oracle, DB2 and SQL Server Databases.
  • Hands-on experience working with the Java project build managers Apache Maven and Ant.
  • Extensive experience in Development and Production support on Linux environment.
  • Good knowledge in integration of various data sources like RDBMS, Spreadsheets, Text files, JSON and XML files.
  • Well-developed communication skills, ability to work well independently and as part of a team, developing effective client relations, providing superior client service and satisfaction.
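
A minimal Java sketch of the Spark Streaming pattern described above (Kafka to HDFS), assuming the Kafka 0.10 direct-stream integration; the broker address, topic name and HDFS path are hypothetical placeholders, not values from the original projects.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToHdfs {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("KafkaToHdfs");
        JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(30));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092"); // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "log-ingest");
        kafkaParams.put("auto.offset.reset", "latest");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                ssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                    Collections.singletonList("app-logs"), kafkaParams)); // hypothetical topic

        // Persist each non-empty micro-batch to HDFS as plain text files
        stream.map(ConsumerRecord::value).foreachRDD((rdd, time) -> {
            if (!rdd.isEmpty()) {
                rdd.saveAsTextFile("hdfs:///data/raw/logs/batch-" + time.milliseconds());
            }
        });

        ssc.start();
        ssc.awaitTermination();
    }
}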

TECHNICAL SKILLS

Big Data Ecosystems: Hadoop, HDFS, YARN, Map Reduce, Hive, Pig, HBase, Zookeeper, Sqoop, Oozie, Flume, Parquet, Apache Impala, Spark

Frameworks: JPA, J2EE, JSP, Servlets, Struts, Hibernate, .NET Framework 4.5

Methodology: Agile software development

Languages: Java, HiveQL, PigLatin, R, Regex, Advanced PL/SQL, SQL, VBA, C++, C, Shell

Scripting Languages: HTML, CSS, JavaScript, DHTML, XML, JQuery

Web Technologies: Java, J2EE, Servlets, JSP, JDBC, XML, AJAX, SOAP, Restful, Angular JS

Architectures: SOA, Cloud Computing (AWS, EC2)

Application Server: Apache Tomcat, GlassFish 4.0, WebLogic

Database Systems: Netezza, Oracle 11g/10g/9i, DB2, MS-SQL Server, MySQL, MS-Access

Development Tools: JIRA, ClearCase, Tableau, Splunk, RStudio, Eclipse, NetBeans, Toad, SQL Developer, AWK

Platforms: Windows 7/8/10, Ubuntu (Linux), RedHat, SUSE, CentOS

PROFESSIONAL EXPERIENCE

Confidential, San Mateo, CA

Sr. Hadoop Developer

Responsibilities:

  • Used Oozie workflow engine for managing interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java Map-reduce, Hive and Sqoop as well as System specific jobs.
  • Used Sqoop to efficiently transfer data between databases and HDFS and used Flume to stream the log data from servers.
  • Developed Map Reduce programs to cleanse the data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into Hive schema for analysis.
  • Created Hive tables (internal or external, as the requirements dictated), defined with appropriate static and dynamic partitions for query efficiency.
  • Imported data from various data sources, performed transformations using Hive, MapReduce.
  • Responsible for loading data into HDFS, extracted the processed data from MySQL into HDFS using Sqoop.
  • Used the Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
  • Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
  • Imported data from different sources such as HDFS and HBase into Spark RDDs.
  • Loaded the data into Spark RDDs and performed in-memory computation to generate the output response.
  • Developed complex MapReduce programs in Java for Data Analysis on different data formats.
  • Implemented a log producer in Scala that watches application logs, transforms incremental log entries and sends them to a Kafka and ZooKeeper based log collection platform.
  • Implemented static and dynamic partitioning in Hive.
  • Extensively used Sqoop to import/export data between RDBMS and Hive tables, including incremental imports with Sqoop jobs keyed on the last saved value.
  • Exported the analyzed data to the Relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Created Hive queries to compare the raw data with EDW reference tables and performing aggregates.
  • Managing and scheduling jobs on a Hadoop cluster.
  • Used Pig as ETL tool for various data joins.
  • Developed Simple and complex MapReduce Jobs using Hive.
  • Analyzed the data by performing Hive queries and running Pig scripts to know user behavior.
  • Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Implemented Partitioning and bucketing in Hive.
  • Experienced in managing and reviewing Hadoop Log Files.
  • Expertise in using ORC and Parquet file formats in Hive.
  • Created a customized BI tool for the management team that performs query analytics using HiveQL.
  • Configured Flume to extract the data from the web server output files to load into HDFS.
  • Developed Pig UDFs to pre-process the data for analysis (a sketch follows this list).
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Involved in story-driven Agile development methodology and actively participated in daily scrum meetings.
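
A minimal sketch of the kind of Pig eval UDF used for pre-processing; the class name and normalization logic are illustrative assumptions, not the project's actual code.

import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical pre-processing UDF: trims and lower-cases a chararray field
// before the cleansed data is handed to downstream analysis.
public class NormalizeField extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        return input.get(0).toString().trim().toLowerCase();
    }
}

Once packaged in a jar, such a UDF is registered in a Pig script with REGISTER and invoked like any built-in function in a FOREACH ... GENERATE statement.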

Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, HBase, ZooKeeper, Cloudera, Java (JDK 1.7), Linux, XML, Amazon Web Services (AWS), Tableau, JIRA, Maven, Eclipse.

Confidential, Parsippany, NJ

Hadoop Developer

Responsibilities:

  • Designed and implemented MapReduce jobs to support distributed processing of large data sets on the Hadoop cluster, as required by business use cases.
  • Handled importing of data from various data sources, performed transformations using Pig and MapReduce, loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
  • Experienced in data analysis, design, development and MRUnit testing of the Hadoop cluster structure using Java.
  • Managed nodes on Hadoop cluster connectivity and security.
  • Implemented Hive Generic UDF's to implement business logic.
  • Implemented six nodes CDH4 Hadoop Cluster on CentOS.
  • Used the partitioning pattern in MapReduce to move records into categories.
  • Installed and configured Hive and wrote HiveQL scripts.
  • Accessed Hive tables to perform analytics from Java applications using JDBC (a sketch follows this list).
  • Ran batch processes using Pig scripts and developed Pig UDFs for data manipulation according to business requirements.
  • Wrote advanced MapReduce code for joins and grouping.
  • Created Hive external tables on the MapReduce output, with partitioning and bucketing applied on top of them.
  • Performed unit testing of MapReduce jobs on cluster using MRUnit.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
  • Wrote Unix scripts to manage Hadoop operations.
  • Wrote Puppet manifests for installation and configuration of Cloudera Hadoop CDH3u1.
  • Developed MR jobs using Python and performed unit testing using MRUnit.
  • Worked in aggregating data points from nearly 45 external and internal sources in order to view, interrogate and analyze large sets of data to determine best data to use in various analytical solutions.
  • Developed data transformations based on the requirements from the source system owners.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Used Pattern matching algorithms to recognize the customer across different sources and built risk profiles for each customer using Hive and stored the results in HBase.
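
A minimal Java sketch of querying a Hive table over JDBC (HiveServer2), as referenced above; the host, credentials, table and query are hypothetical placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Hypothetical read of a Hive table over JDBC for downstream analytics.
public class HiveJdbcQuery {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://hiveserver:10000/default", "hive", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                 "SELECT category, COUNT(*) AS cnt FROM events GROUP BY category")) {
            while (rs.next()) {
                System.out.println(rs.getString("category") + "\t" + rs.getLong("cnt"));
            }
        }
    }
}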

Environment: Hadoop, Hive, ZooKeeper, MapReduce, Sqoop, Pig 0.10 and 0.11, JDK 1.6, HBase, Hue, Talend, Oozie, Spark, Storm, Kafka, Redis, Flume, JUnit, Oracle/Informix, Cassandra, AWS (Amazon Web Services), HDFS, DB2, Hortonworks, Tableau.

Confidential, Austin, TX

Hadoop Developer

Responsibilities:

  • Worked with the Teradata analysis team to gather the business requirements.
  • Worked extensively on importing data using Sqoop and Flume.
  • Responsible for creating complex tables using Hive and developing Hive queries for the analysts.
  • Created partitioned tables in Hive for best performance and faster querying.
  • Transported data to HBase using Pig.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Experience with professional software engineering practices and best practices for the full software development life cycle including coding standards, code reviews, source control management and build processes.
  • Worked collaboratively with all levels of business stakeholders to architect, implement and test Big Data based analytical solutions from disparate sources.
  • Involved in source system analysis, data analysis and data modeling for ETL.
  • Wrote multiple MapReduce programs for extraction, transformation and aggregation of data from multiple file formats including XML, JSON, CSV and other compressed formats (a sketch follows this list).
  • Handled structured and unstructured data and applied ETL processes.
  • Developed Pig Latin scripts to extract the data from the web server output files and load it into HDFS.
  • Developed the Pig UDFs to pre-process the data for analysis.
  • Prepared developer (unit) test cases and executed developer testing.
  • Created and modified shell scripts for scheduling various data cleansing scripts and the ETL loading process.
  • Supported and assisted QA engineers in understanding, testing and troubleshooting.
  • Wrote build scripts using Ant and participated in the deployment of one or more production systems.
  • Provided production rollout support, which included monitoring the solution post go-live and resolving any issues discovered by the client and client services teams.
  • Documented operational problems by following standards and procedures, using JIRA as the reporting tool.
  • Involved in loading and transforming large sets of structured, semi-structured and unstructured data and analyzed them by running Hive queries and Pig scripts.
  • Assisted in cluster maintenance, monitoring and troubleshooting, and managed and reviewed data backups and log files.
  • Participated in requirement gathering from the experts and business partners and converted the requirements into technical specifications.
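
A minimal Java sketch of one such MapReduce aggregation over a CSV feed; the record layout (category, amount) and class names are assumptions for illustration, not the project's actual code.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical aggregation job: sums the amount column of a CSV feed per category.
public class CsvAggregation {

    public static class ParseMapper extends Mapper<LongWritable, Text, Text, DoubleWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumed record layout: category,amount,...
            String[] fields = value.toString().split(",");
            if (fields.length < 2) {
                return; // skip malformed records
            }
            try {
                context.write(new Text(fields[0].trim()),
                              new DoubleWritable(Double.parseDouble(fields[1].trim())));
            } catch (NumberFormatException e) {
                // skip records with a non-numeric amount
            }
        }
    }

    public static class SumReducer extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
        @Override
        protected void reduce(Text key, Iterable<DoubleWritable> values, Context context)
                throws IOException, InterruptedException {
            double total = 0;
            for (DoubleWritable v : values) {
                total += v.get();
            }
            context.write(key, new DoubleWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "csv-aggregation");
        job.setJarByClass(CsvAggregation.class);
        job.setMapperClass(ParseMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(DoubleWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}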

Environment: Hadoop, Hive, ZooKeeper, MapReduce, Sqoop, Pig 0.10 and 0.11, JDK 1.6, HBase, Hue, Talend, Oozie, Spark, Storm, Kafka, Redis, Flume, JUnit, Oracle/Informix, Cassandra, AWS (Amazon Web Services), HDFS, DB2, Hortonworks, Tableau.

Confidential, Atlanta, GA

Java/ J2EE Developer

Responsibilities:

  • Involved in the design, coding and testing phases of software development.
  • As a programmer, involved in the design and implementation of the MVC pattern.
  • Extensively used XML where in process details are stored in the database and used the stored XML whenever needed.
  • Part of core team to develop process engine.
  • Developed Action classes and validations using the Struts framework.
  • Created project related documentations like user guides based on role.
  • Implemented modules like Client Management, Vendor Management.
  • Attended various Client meetings.
  • Implemented Access Control Mechanism to provide various access levels to the user.
  • Designed and developed the application using J2EE, JSP, XML, Struts, Hibernate, Spring technologies
  • Coded DAO and Hibernate implementation classes for data access (a sketch follows this list).
  • Coded Spring service classes and transfer objects to pass data between layers.
  • Designed the database for Jeevica in MS SQL Server 2008.
  • Implemented Web Services using Axis
  • Used different features of Struts like MVC, Validation framework and tag library.
  • Created detail design document, Use cases, and Class Diagrams using UML
  • Written ANT scripts to build JAR, WAR and EAR files.
  • Developed a standalone Java component that interacts with Crystal Reports on the Crystal Enterprise Server to view and schedule reports, store data as XML and send data to consumers using SOAP.
  • Deployed the application and tested on Websphere Application Servers.
  • Developed JavaScript for client-side validations in JSP.
  • Developed JSPs with Struts taglibs for the presentation layer.
  • Coordinated with the onsite, offshore and QA team to facilitate the quality delivery from offshore on schedule.
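
A minimal Java sketch of the DAO/Hibernate pattern referenced above; the Client entity, its fields and the session handling shown are illustrative assumptions rather than the project's actual classes.

import javax.persistence.Entity;
import javax.persistence.Id;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

// Hypothetical mapped entity; the fields are illustrative only.
@Entity
class Client {
    @Id
    private Long id;
    private String name;
}

// Hypothetical DAO showing the Hibernate session/transaction pattern used for data access.
public class ClientDao {
    private final SessionFactory sessionFactory;

    public ClientDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    public void save(Client client) {
        Session session = sessionFactory.openSession();
        Transaction tx = session.beginTransaction();
        try {
            session.saveOrUpdate(client);
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        } finally {
            session.close();
        }
    }

    public Client findById(Long id) {
        Session session = sessionFactory.openSession();
        try {
            return (Client) session.get(Client.class, id);
        } finally {
            session.close();
        }
    }
}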

Environment: Java 1.5, Spring, Spring Web Services, JSP, JavaScript, Hibernate, SOAP, CSS, Struts, WebSphere, MQ Series, JUnit, Apache, Windows XP and Linux

Confidential

Java Developer

Responsibilities:

  • Involved in the design, coding and testing phases of software development.
  • Designed use cases, sequence diagrams and class diagrams for business requirements.
  • Wrote feature, design and test specifications.
  • Performed requirements analysis, performance analysis and problem analysis.
  • Gave technical guidance and mentoring to the team to enhance the competitive advantage of the Teradata database.
  • Built and deployed Java applications into multiple UNIX-based environments and produced both unit and functional test results along with release notes.
  • Worked with the test team to understand outstanding issues and close them in reasonable time.
  • Applied OO design patterns to improve existing Java-based tools.
  • Learned Teradata tools and improved APIs to interact with each other.
  • Wrote SQL queries to fetch data from Teradata and tested them with internal tools.
  • Built automation test scripts that could not be developed by the test team and maintained them.

Environment: Java EE6, Teradata, J2EE, JSP, JavaScript, Hibernate, Spring, JavaScript, OO design patterns, FastLoad, MultiLoad, HTML5, XML, Clearcase and JSON

Confidential

Java Developer

Responsibilities:

  • Involved in the design, coding and testing phases of software development.
  • Collaborate with team members and involved in analysis, design and implementation phases of the software development lifecycle (SDLC) for various software modules of the web application.
  • Implemented MVC design pattern using JPA Framework.
  • Used JSP for presentation layer, developed high performance object/relational persistence and query service for entire application utilizing Hibernate.
  • Developed the XML Schema and Web services for the data maintenance and structures.
  • Developed the application using Java Beans, Servlets and EJBs.
  • Actively involved in code review and bug fixing for improving the performance.
  • Created connections through JDBC and used JDBC statements to call stored procedures (a sketch follows this list).
  • Used WebSphere Application Server and RAD, Eclipse to develop and deploy the application.
  • Designed database and created tables, written the complex SQL Queries and stored procedures as per the requirements.
  • Involved in coding JUnit test cases and used Ant for building the application.
  • Developed application Using J2EE (JDBC, JSF, EJB, Rich faces, JSTL and XML).
  • Worked extensively on Dynamic SQL, Bulk Collections, Materialized Views, Ref Cursor, Query Re-Write, Collections etc.
  • Demonstrated excellence in achieving improved efficiency of 30% by bringing the ticket count down from 50 to 5.
  • Designed use cases, sequence diagrams and class diagrams for business requirements.
  • Wrote feature, design and test specifications.
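
A minimal Java sketch of calling a stored procedure over JDBC, as referenced above; the connection URL, credentials and procedure name are hypothetical placeholders against the Oracle database listed in the environment.

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

// Hypothetical stored-procedure call: looks up an order status by id.
public class OrderStatusClient {
    public static void main(String[] args) throws Exception {
        Class.forName("oracle.jdbc.OracleDriver");
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:oracle:thin:@dbhost:1521:orcl", "app_user", "app_password");
             CallableStatement cs = conn.prepareCall("{call get_order_status(?, ?)}")) {
            cs.setLong(1, 1001L);                      // IN: order id
            cs.registerOutParameter(2, Types.VARCHAR); // OUT: status
            cs.execute();
            System.out.println("Order status: " + cs.getString(2));
        }
    }
}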

Environment: Java/J2EE, Oracle 11g/10g, SQL, PL/SQL, JSP, JSF, EJB, JPA, Hibernate, WebLogic 8.0, HTML, AJAX, JavaScript, JDBC, XML, JMS, XSLT, UML, JUnit, log4j, MyEclipse 6.0
