Hadoop Developer Resume

Bellevue

PROFESSIONAL SUMMARY:

  • Over 9 years of professional IT experience in all phases of the Software Development Life Cycle, including hands-on experience in Java/J2EE technologies and Big Data Analytics.
  • 3+ years of work experience in ingestion, storage, querying, processing and analysis of Big Data with hands-on experience in Hadoop Ecosystem development including MapReduce, HDFS, Hive, Pig, Cloudera Navigator, Mahout, HBase, ZooKeeper, Sqoop, Flume, Oozie, AWS and Azkaban.
  • Experience with distributed systems, large-scale non-relational data stores, MapReduce systems, data modeling, and big data systems.
  • Experience in handling various tools for Big Data analysis using Pig, Hive and understanding of Sqoop and Puppet.
  • Experience with testing MapReduce programs using MRUnit, JUnit, ANT, Maven and EasyMock.
  • Extensive experience in middle-tier development using J2EE technologies like JDBC, JNDI, JSP, Servlets, JSF, Struts, Spring, Hibernate and EJB.
  • Experience with web-based UI development using jQuery UI, jQuery, ExtJS, CSS, HTML, HTML5, XHTML and JavaScript.
  • Experience in importing streaming logs and aggregating the data to HDFS through Flume.
  • Experience in developing customized UDFs in Java to extend Hive and Pig Latin functionality.
  • Experience in writing MRUnit tests to verify the correctness of MapReduce programs.
  • Expertise in writing Shell-Scripts, Cron Automation and Regular Expressions.
  • Hands-on experience in dealing with Compression Codecs like Snappy, BZIP2.
  • Supported MapReduce programs running on the cluster and wrote custom MapReduce scripts for data processing in Java.
  • Hands-on experience through hackathons in Scala, Clojure, Python, Perl, R, Ruby, Groovy & Grails.
  • Continuous monitoring and managing of the Hadoop cluster using Cloudera Manager.
  • Excellent OOAD skills with design & development in Java, SOAP and REST Web Services.
  • Hands-on experience with various databases such as Oracle, MySQL and IBM DB2.
  • Expertise in developing MapReduce jobs in Java.
  • Experience in working with Spark and Storm.
  • Extensive experience with SQL, PL/SQL and database concepts; developed stored procedures and queries using PL/SQL.
  • Good knowledge of executing Spark SQL queries against data in Hive.
  • Knowledge of job workflow scheduling and monitoring tools like Oozie and ZooKeeper.
  • Knowledge of NoSQL databases such as HBase and MongoDB.
  • Performed POC for Splunk, spearheading the effort for adoption resulting in higher productivity and new real time insight for compliance mandates.
  • Analyze data, interpret results and convey findings in a concise and professional manner
  • Partner with Data Infrastructure team and business owners to implement new data sources and ensure consistent definitions are used in reporting and analytics
  • Promote full cycle approach including request analysis, creating/pulling dataset, report creation and implementation and providing final analysis to the requestor
  • Good knowledge of and hands-on experience with ETL.
  • Worked on Agile methodology, SOA for many of the applications.
  • Experience using XML, XSD and XSLT.
  • Good knowledge of Log4j for error logging.
  • Expertise in RDBMS like Oracle, MS SQL Server, MySQL and DB2.
  • Team player with excellent communication, presentation and interpersonal skills.
  • Highly motivated team player with zeal to learn new technologies.

TECHNICAL SKILLS:

Languages/Tools: Java, C, C++, VB, XML, HTML/XHTML, HDML, DHTML, Scala 2.0.

Big Data: Hadoop, Map Reduce, Hive, Pig, Storm, Sqoop, Oozie and MRUnit

J2EE Standards: JDBC, JNDI, JMS, Java Mail & XML Deployment Descriptors.

Web/Distributed Technologies: J2EE, Servlets 2.1/2.2, JSP 2.0, Struts 1.1, Hibernate 3.0, JSF, JSTL 1.1, EJB 1.1/2.0, RMI, JNI, XML, JAXP, XSL, XSLT, UML, MVC, Spring 2.0, CORBA, Java Threads.

Operating System: Windows 95/98/NT/2000/XP, MS-DOS, UNIX, Linux 6.2

Databases: Oracle 8i/9i, MS SQL Server 2000, DB2, MS Access & MySQL.

Browser Languages: HTML, XHTML, CSS, XML, XSL, XSD, XSLT.

Browser Scripting: JavaScript, HTML DOM, DHTML, AJAX.

App/Web Servers: IBM WebSphere 5.1.2/5.0/4.0/3.5, BEA WebLogic 5.1/7.0, JDeveloper, Apache Tomcat, JBoss.

GUI Environment: Swing, AWT.

Messaging & Web Services Technology: SOAP, WSDL, UDDI, XML, SOA, JAX-RPC, IBM WebSphere MQ v5.3, JMS.

Networking Protocols: HTTP, HTTPS, FTP, UDP, TCP/IP, SNMP, SMTP, POP3.

Testing & CASE Tools: JUnit, Log4j, Rational ClearCase, CVS, ANT, JBuilder.

PROFESSIONAL EXPERIENCE:

Confidential, Bellevue

Hadoop Developer

Responsibilities:

  • Extensive scripting in Perl and Python.
  • Worked on scheduling jobs using the Tidal and Control-M schedulers.
  • Ingested data into the Hadoop data lake from multiple data sources.
  • Analyzed data sets from various sources for ingestion.
  • Design and Develop Parsers for different file formats (CSV, XML, Binary, ASCII, Text, etc.).
  • Extensive usage of Cloudera Hadoop distribution.
  • Executed parameterized Pig, Hive, Impala, and UNIX batches in Production.
  • Big Data management in Hive and Impala (Table, Partitioning, ETL, etc.).
  • Design and Develop File Based data collections in Perl.
  • Extensive Usage of Hue and other Cloudera tools.
  • Used JUnit for unit testing MapReduce code.
  • Extensive usage of NoSQL (HBase) database.
  • Maintained System integrity of all sub-components (primarily HDFS, MR, HBase, Cassandra and Hive).
  • Design and Develop Dashboards in ZoomData and Write Complex Queries.
  • Worked on Shell Programming and CronTab automation.
  • Monitored system health and logs and responded accordingly to any warning or failure conditions.
  • Extensively worked in Unix environment.

Environment: Apache Hadoop, HDFS, Perl, Python, Pig, Hive, Java, Sqoop, Cloudera CDH5, Oracle, MySQL, Tableau, Talend, Elasticsearch, Storm, Data governance implementation.

Confidential, San Jose, CA

Hadoop/Spark Developer

Responsibilities:

  • Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala/Python.
  • Developed and executed shell scripts to automate the jobs
  • Wrote complex Hive queries and UDFs.
  • Worked on reading multiple data formats on HDFS using PySpark
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python and Scala.
  • Developed multiple POCs using PySpark, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
  • Analyzed the SQL scripts and designed the solution to implement using PySpark
  • Involved in loading data from UNIX file system to HDFS
  • Extracted the data from Teradata into HDFS using Sqoop
  • Handled importing of data from various data sources, performed transformations using Hive, Map Reduce, Spark and loaded data into HDFS.
  • Managed and reviewed Hadoop log files.
  • Involved in analysis, design, testing phases and responsible for documenting technical specifications
  • Developed Kafka producer and consumers, HBase clients, Spark and Hadoop MapReduce jobs along with components on HDFS, Hive.
  • Very good understanding of partitioning and bucketing concepts in Hive; designed both Managed and External tables in Hive to optimize performance.
  • Worked on the core and Spark SQL modules of Spark extensively.
  • Handled structured and unstructured data and applied ETL processes.
  • Experienced in managing and reviewing Hadoop log files.
  • Experienced in running Hadoop streaming jobs to process terabytes of data.
  • Involved in importing real-time data to Hadoop using Kafka and implemented the Oozie job for daily imports.

Environment: Hadoop, HDFS, Hive, Python, Scala, Spark, SQL, Teradata, UNIX Shell Scripting

Confidential, Los Angeles, CA

Hadoop Developer

Responsibilities:

  • Implemented Struts and Spring frameworks.
  • Responsible for creating HIVE external tables on the finalized data in HDFS and partitioning and bucketing the data
  • Wrote complex Hive queries and UDFs.
  • Involved in converting MapReduce programs into Spark transformations using Spark RDDs and Scala.
  • Performed Job Scheduling and Testing using Azkaban
  • Developed Shell and Python scripts to automate the jobs on Azkaban
  • Worked on importing data from multiple data sources to Google Docs, then to S3/AWS, then to the data lake.
  • Worked on Amazon AWS EMR cluster
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Developed Spark scripts using Scala shell commands as per the requirements.
  • Provided technical support for production environments: resolving issues, analyzing defects, and providing and implementing solutions.
  • Developed a data pipeline using Kafka and Storm to store data into HDFS.
  • Involved in loading data from UNIX file system to HDFS.
  • Installed and configured Hadoop and Hadoop stack on a 16 node cluster.
  • Extracted feeds from social media sites using Python scripts.
  • Developed data pipeline programs with Spark Scala APIs, performed data aggregations with Hive, formatted data (JSON) for visualization, and generated charts using JavaScript (e.g. Highcharts: outlier, data distribution, correlation/comparison, and two-dimensional charts).
  • Involved in creating Hive tables, loading the data using Hive and in writing Hive queries to analyze the data.
  • Gained very good business knowledge of different product categories and their designs.

Environment: AWS EMR, Hadoop, HDFS, Hive, Azkaban, Linux, UNIX Shell Scripting, Kafka, Python, HBase, ZooKeeper, MapReduce, Java, Sqoop, Parquet, Spark, YARN, Crunch and Avro

Confidential, Jacksonville, FL

Java/J2EE/Hadoop Developer

Responsibilities:

  • Developed the application using the Struts Framework, which leverages the classical Model-View-Controller (MVC) architecture. UML diagrams such as use case, class, interaction, and activity diagrams were used.
  • Participated in requirement gathering and converting the requirements into technical specifications
  • Worked extensively on the User Interface for a few modules using JSPs, JavaScript and Ajax.
  • Created Business Logic using Servlets and Session beans and deployed them on the WebLogic server.
  • Developed the XML Schema and Web services for the data maintenance and structures
  • Implemented the Web Service client for the login authentication, credit reports and applicant information using Apache Axis 2 Web Service.
  • Successfully integrated Hive tables and MongoDB collections and developed a web service that queries a MongoDB collection and returns the required data to the web UI.
  • Developed workflows using custom MapReduce, Pig, Hive and Sqoop.
  • Built reusable Hive UDF libraries for business requirements, which enabled users to use these UDFs in Hive queries.
  • Developed a data pipeline using Kafka and Storm to store data into HDFS.
  • Maintain Hadoop, Hadoop ecosystems, third party software, and database(s) with updates/upgrades, performance tuning and monitoring
  • Extracted feeds from social media sites such as Facebook and Twitter using Python scripts.
  • Responsible for modifying API packages.
  • Managed and scheduled jobs on a Hadoop cluster.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Created UDFs to calculate the pending payment for a given Residential or Small Business customer, and used them in Pig and Hive scripts.
  • Responsible for managing data coming from different sources.
  • Developed Shell, Perl and Python scripts to automate and provide control flow to Pig scripts.
  • Gained good experience with NoSQL databases.
  • Experience in managing and reviewing Hadoop log files.
  • Used the Hibernate ORM framework with the Spring framework for data persistence and transaction management.
  • Participated in development/implementation of the Cloudera Hadoop environment.
  • Wrote test cases in JUnit for unit testing of classes.
  • Involved in templates and screens in HTML and JavaScript
  • Involved in integrating Web Services using WSDL and UDDI
  • Built and deployed Java applications into multiple Unix based environments and produced both unit and functional test results along with release notes

Environment: Hadoop, HDFS, Pig, Cloudera, JDK 1.5, J2EE 1.4, Struts 1.3, Kafka, Storm, JSP, Servlets 2.5, WebSphere 6.1, HTML, XML, ANT 1.6, Perl, Python, JavaScript, JUnit 3.8

Confidential, Matawan, NJ

Java/Big Data Analyst

Responsibilities:

  • Involved in analysis and design of the system architecture.
  • Actively participated in the complete Software development life cycle starting from design phase to the implementation phase.
  • Involved in preparing use-case diagrams, sequence diagrams and class diagrams using Rational Rose, UML.
  • Analyzed, designed and developed Login Module based on privileges.
  • Used JavaScript for the framework needed to manage the content and workflow solutions.
  • Used Scrum methodology in developing the product; attended regular scrum meetings for feedback and design changes.
  • Used Hibernate tools to interact with the database.
  • Developed the Add/Edit Securities component using Struts Action, ActionForm and ActionErrors and the Tiles Framework. This is a set of wizard pages that allows an admin to create new instrument master entries or modify existing entries.
  • Implemented object-oriented JavaScript, and the persistence layer using the Hibernate framework.
  • Installed and configured Hadoop and Hadoop stack on a 16 node cluster.
  • Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables.
  • Involved in data ingestion into HDFS using Sqoop from a variety of sources, using connectors such as JDBC and import parameters.
  • Analyzed large and critical datasets of the Global Risk Investment and Treasury Technology (GRITT) domain using Cloudera, HDFS, HBase, MapReduce, Hive, Hive UDF, Pig, Sqoop, ZooKeeper, & Mahout.
  • Worked with the NoSQL database HBase to create tables and store data.
  • Designed and implemented MapReduce-based large-scale parallel relation-learning system.
  • Worked with NoSQL databases like HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
  • Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Exported the data from Avro files and indexed the documents in sequence file format.
  • Installed, configured, and operated data integration and analytic tools (e.g. Informatica, Chorus, SQLFire, & GemFire XD) for business needs.
  • Developed scripts to automate routine DBA tasks (e.g. refresh, backups, vacuuming, etc.).
  • Installed and configured Hive and wrote Hive UDFs that helped spot market trends.
  • Used Hadoop streaming to process terabytes of data in XML format.
  • Involved in loading data from UNIX file system to HDFS.
  • Implemented Fair schedulers on the JobTracker with appropriate parameters to share the resources of the cluster among the MapReduce jobs submitted by users.
  • Involved in creating Hive tables, loading data into them and writing Hive queries to analyze the data.

Environment: MyEclipse 6.0, Struts 1.2, Spring Framework, Hibernate, WebSphere 6.0, Object Oriented JavaScript, JUnit, XPath, Oracle 9i, Rational Rose, MQ Series and Maven, CDH4 with Hadoop 1.x, HDFS, Pig, Cloudera, Hive, HBase, ZooKeeper, MapReduce, Java, Sqoop, Oozie, Linux, UNIX Shell Scripting and Big Data.

Confidential

Java Developer

Responsibilities:

  • Extensively involved in the design and development of JSP screens to suit specific modules.
  • Converted the application’s console printing of process information to proper logging technology using log4j.
  • Developed the business components (in core Java) used in the JSP screens.
  • Involved in the implementation of logical and physical database design by creating suitable tables, views and triggers.
  • Developed related procedures and functions used by JDBC calls in the above components.
  • Extensively involved in performance tuning of Oracle queries.
  • Created components to extract application messages stored in XML files.
  • Executed UNIX shell scripts for command line administrative access to the Oracle database and for scheduling backup jobs.
  • Created WAR files and deployed them to the web server.
  • Performed source and version control using VSS.
  • Involved in maintenance support.

Environment: JDK, HTML, JavaScript, XML, JSP, Servlets, JDBC, Oracle 9i, Eclipse, Toad, UNIX Shell Scripting, MS Visual SourceSafe, Windows 2000.

Confidential

Junior JAVA Developer

Responsibilities:

  • Involved in the analysis, design, implementation, and testing of the project
  • Implemented the presentation layer with HTML, XHTML and JavaScript
  • Developed web components using JSP, Servlets and JDBC
  • Wrote complex SQL queries and stored procedures
  • Involved in fixing bugs and unit testing with test cases using JUnit
  • Actively involved in the system testing
  • Involved in implementing service layer using Spring IOC module
  • Prepared the Installation guide, Customer guide and Configuration document, which were delivered to the customer along with the product.

Environment: Java, JSP, Servlets, JDBC, JavaScript, MySQL, JUnit, Eclipse IDE.