Senior Hadoop Developer Resume

Pittsburgh, PA

SUMMARY:

  • 9+ years of experience in consulting, analysis, and implementation of Java and Big Data solutions across various project assignments.
  • Lead contributor to the Big Data Centre of Excellence in emerging areas such as Big Data, text analytics, and NoSQL databases.
  • Focused on designing and delivering optimal, business-critical solutions on Big Data technologies.
  • Keen on building knowledge of emerging technologies in Analytics, Information Management, Big Data, Data Science, and related areas, and on providing the best business solutions.
  • Experienced in all phases of the project lifecycle, with emphasis on planning, design, and coding.
  • Experienced with Hadoop ecosystem technologies and platforms such as Pig, Hive, and Sqoop.
  • Consulted on Java and Big Data technologies; evaluated new technologies in the Big Data, analytics, and NoSQL space.
  • Extensive experience with Hadoop and its components, including HDFS, MapReduce, Apache Pig, Hive, Sqoop, HBase, Cassandra, and Oozie.
  • Extensive experience in setting up Hadoop clusters.
  • Good working knowledge of MapReduce and Apache Pig.
  • Wrote Pig scripts to reduce job execution time.
  • Experience processing real-time streamed data using Storm and Spark Streaming.
  • Experience with configuration of Hadoop ecosystem components: MapReduce, Hive, HBase, Pig, Sqoop, Oozie, Flume, Storm, Spark, YARN, and Tez.
  • Executed projects using Java/J2EE technologies such as Core Java, Servlets, JSP, JDBC, Ext JS, and Struts.
  • Experience with application development frameworks such as Spring and Hibernate, as well as validation plug-ins such as the Validator framework.
  • Strong experience with version control tools such as Subversion, ClearCase, and CVS.
  • Experienced in developing J2EE applications with IDEs such as Eclipse and NetBeans.
  • Expertise in build scripts such as Ant and Maven, and in build automation.
  • Strong experience working with databases such as Oracle, SQL Server, and EDB, and proficiency in writing complex SQL and PL/SQL for creating tables, views, indexes, stored procedures, and functions.
  • Strong experience with Hadoop distributions such as Cloudera, Hortonworks, and MapR; explored Spark to improve performance and optimize existing Hadoop algorithms using Spark Context, Spark SQL, DataFrames, pair RDDs, Storm, and Spark on YARN.
  • Experience with Kafka as a real-time messaging system.
  • Developed Pig Latin scripts to analyze data and wrote UDFs in Java to extend the functionality of Pig (a minimal sketch follows this list).
  • Experienced in transporting and processing real-time event streams using Kafka and Storm.
  • Experienced in real-time data ingestion into HBase and Hive using Storm.
  • Experience in web design using HTML, Bootstrap, XML, CSS, AngularJS, NodeJS, AJAX, JavaScript, Ext JS, and jQuery.
  • Extensively worked on implementing web services (SOAP, REST, JSON, XML/XSD, WSDL, and XML parsers).
  • Good working knowledge of SOA and RESTful web services with JSON using the Jackson API.
  • Experience with all stages of the SDLC and the Agile development model, from requirement gathering through deployment and production support.
  • Experience in understanding existing systems and providing maintenance and production support on technologies such as Java, J2EE, and various databases (Oracle, SQL Server).
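
For illustration, a minimal sketch of the kind of Java UDF used to extend Pig, as referenced above; the class name and the normalization it performs are hypothetical:

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Hypothetical Pig UDF: trims and upper-cases a field before analysis.
    public class NormalizeField extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            return input.get(0).toString().trim().toUpperCase();
        }
    }

Such a UDF would be packaged into a JAR, registered in the Pig script with REGISTER, and invoked inside a FOREACH ... GENERATE statement.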

TECHNICAL SKILLS:

Big Data Technologies: Apache Hadoop, MapReduce, HDFS, Pig, Hive, HBase, ZooKeeper, Sqoop, Flume, Oozie, YARN, Impala

Languages: Core Java, J2EE, SQL, PL/SQL, Unix Shell Scripting

Web Technologies: JSP, EJB 2.0, JNDI, JMS, JDBC, HTML, JavaScript

Web/Application servers: Apache Tomcat 6.0/5.0/4.0, JBoss 5.1.0, WebLogic, Jetty

Databases: Oracle 11G/10G, SQL Server, DB2, Sybase, Teradata

Operating Systems: MS-DOS, Windows XP, Windows 7, UNIX and Linux

IDE: IntelliJ IDEA 7.2, EditPlus 3, Eclipse 3.5, NetBeans 6.5, TOAD, PL/SQL Developer, Teradata

Frameworks: Hadoop MapReduce, MVC, Struts 2.x/1.x

Version Control: VSS (Visual SourceSafe), Subversion, CVS

Testing Technologies: JUnit 4/3.8

Office Packages: MS-Office 2010, 2007, 2003 and Visio

Process/Methodologies: SDLC, Waterfall, OOAD, UML, Agile/SCRUM, SOA, EAI

Business Intelligence: Business Object XI 3.1, Cognos 8.4

PROFESSIONAL EXPERIENCE:

Confidential, Pittsburgh, PA

Senior Hadoop Developer

Responsibilities:

  • Developed various workflows using custom MapReduce, Pig, Hive and scheduled them using Oozie.
  • Created Hive tables and loaded retail transactional data from Teradata using Sqoop.
  • Used Sqoop to move the analyzed data to the Oracle DB for report generation.
  • Auto-populated HBase tables with data coming from a Kafka sink.
  • Developed a suite of unit test cases for Mapper, Reducer, and Driver classes using MRUnit (a minimal sketch follows this list).
  • Developed MapReduce Java programs with custom Writables to load web server logs into HBase using Flume.
  • Loaded home mortgage data from the existing DWH tables (SQL Server) to HDFS using Sqoop.
  • Wrote Hive queries to provide a consolidated view of the mortgage and retail data.
  • Orchestrated hundreds of Sqoop scripts, Pig scripts, and Hive queries using Oozie workflows and sub-workflows.
  • Loaded load-ready files from mainframes into Hadoop, converting the files to ASCII.
  • Developed Pig scripts to replace the existing legacy home loans process with Hadoop, with data fed back to the legacy retail mainframe systems.
  • Agile methodology with XP practices was used for development.
  • Participated in daily Scrum meetings and iterative development.
  • Exposure to burn-up and burn-down charts, dashboards, and velocity reporting of sprint and release progress.
  • Handled import jobs to process various data sources, performed transformations using Hive and MapReduce, and loaded the data into HDFS.
  • Extracted data from Teradata into HDFS using Sqoop.
  • Used Apache Kafka and Apache Storm to gather log data and feed it into HDFS.
  • Experience migrating data between RDBMS and unstructured sources and HDFS using Sqoop and Flume.
  • Developed Pig Latin scripts to extract data from the web server output files to load into HDFS.
  • Set up Linux shell scripts and Oozie workflows to periodically import incremental data, clean it, and add it to HDFS.
  • Exported the analyzed patterns back to Teradata using Sqoop.
  • Performed data migration from legacy RDBMS databases to HDFS using Sqoop.
  • Responsible for running Hadoop streaming jobs to process terabytes of CSV data.
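
For illustration, a minimal MRUnit sketch of the kind of unit test mentioned above; the mapper under test (TransactionCountMapper) and the record layout are hypothetical:

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.junit.Before;
    import org.junit.Test;

    public class TransactionCountMapperTest {
        // Hypothetical mapper: emits (store_id, 1) for each retail transaction line.
        private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

        @Before
        public void setUp() {
            mapDriver = MapDriver.newMapDriver(new TransactionCountMapper());
        }

        @Test
        public void emitsStoreIdWithCountOfOne() throws Exception {
            mapDriver.withInput(new LongWritable(0), new Text("store_42,2017-01-01,19.99"))
                     .withOutput(new Text("store_42"), new IntWritable(1))
                     .runTest();
        }
    }

ReduceDriver and MapReduceDriver follow the same pattern for testing the Reducer and the full job.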

Environment: Cloudera (CDH5), MapReduce, Kafka, Storm, Spark, YARN 2.0, HBase, Hive, Java, Pig, Oozie, Apache Solr, Sqoop, Tableau, PL/SQL, Data masking, Data Modeling, Scala, Python.

Confidential, Coopersburg, PA

Senior Hadoop Developer/Administrator

Responsibilities:

  • Responsible for cluster maintenance, including adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
  • Played a key role, along with other teams in the company, in deciding the hardware configuration for the cluster.
  • Resolved submitted tickets and P1 issues; troubleshot, documented, and resolved errors.
  • Added new DataNodes when needed and ran the balancer.
  • Responsible for building scalable, distributed data solutions using Hadoop.
  • Continuously monitored and managed the Hadoop cluster through Ganglia and Nagios.
  • Performed major and minor upgrades to the Hadoop cluster.
  • Performed stress and performance testing and benchmarking for the cluster.
  • Worked closely with both internal and external cyber security customers.
  • Contributed to a research effort to tightly integrate Hadoop and HPC systems.
  • Compared Hadoop to commercial big-data appliances from Netezza, XtremeData, and LexisNexis; published and presented the results.
  • Deployed and administered a 70-node Hadoop cluster as well as two smaller clusters.
  • Developed Linux shell scripts for job automation.
  • Developed machine-learning capabilities using Apache Mahout (a minimal sketch follows this list).
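
As an illustration of the Mahout work noted above, a minimal sketch of a user-based collaborative-filtering recommender using Mahout's Taste API; the input file, the choice of recommender, and the parameters are assumptions:

    import java.io.File;
    import java.util.List;
    import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
    import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
    import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
    import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
    import org.apache.mahout.cf.taste.model.DataModel;
    import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
    import org.apache.mahout.cf.taste.recommender.RecommendedItem;
    import org.apache.mahout.cf.taste.recommender.Recommender;
    import org.apache.mahout.cf.taste.similarity.UserSimilarity;

    public class RecommenderSketch {
        public static void main(String[] args) throws Exception {
            // ratings.csv: userID,itemID,preference (hypothetical input file)
            DataModel model = new FileDataModel(new File("ratings.csv"));
            UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
            UserNeighborhood neighborhood = new NearestNUserNeighborhood(10, similarity, model);
            Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);

            // Top 3 recommendations for user 1.
            List<RecommendedItem> items = recommender.recommend(1L, 3);
            for (RecommendedItem item : items) {
                System.out.println(item.getItemID() + " -> " + item.getValue());
            }
        }
    }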

Environment: Cloudera, Apache Hadoop 1.0.1, MapReduce, HDFS, CentOS 6.4, Red Hat Linux, HBase, Hive, Pig, Oozie, Flume, Java (JDK 1.6), Eclipse, Tableau, PL/SQL, Spark, Scala, Python.

Confidential, Edison, NJ

Hadoop Developer

Responsibilities:

  • Worked on reading multiple data formats on HDFS using Scala
  • Experience with Hadoop ecosystem components HDFS, MapReduce, Hive, Pig, Sqoop, and HBase.
  • Expertise with web-based GUI architecture and development using HTML, CSS, AJAX, jQuery, AngularJS, and JavaScript.
  • Involved in converting Cassandra/Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
  • Developed multiple POCs using Scala and deployed them on the YARN cluster; compared the performance of Spark against Cassandra and SQL.
  • Involved in loading data from UNIX file system to HDFS.
  • Extracted data from databases into HDFS using Sqoop.
  • Handled importing of data from various data sources, performed transformations using Hive and Spark, and loaded the data into HDFS.
  • Managed and reviewed Hadoop log files; implemented a lambda architecture solution.
  • Involved in analysis, design, testing phases and responsible for documenting technical specifications.
  • Very good understanding of partitioning, bucketing, and managed vs. external tables in Hive to optimize performance.
  • Involved in migrating Hive queries into Spark transformations using DataFrames, Spark SQL, SQLContext, and Scala (a minimal sketch follows this list).
  • Developed Hadoop Streaming MapReduce jobs using Python.
  • Worked on the core and Spark SQL modules of Spark extensively.
  • Experienced in running Hadoop streaming jobs to process terabytes of data in Hive.
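
For illustration, a minimal sketch of migrating a HiveQL aggregation to Spark, as referenced above; the sketch uses the Spark 2.x Java API (the project work itself was in Scala), and the table and column names are hypothetical:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class HiveToSparkMigration {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("hive-query-migration")
                    .enableHiveSupport()
                    .getOrCreate();

            // Original HiveQL, run unchanged through Spark SQL.
            Dataset<Row> byRegionSql = spark.sql(
                    "SELECT region, COUNT(*) AS cnt FROM sales GROUP BY region");

            // The same aggregation expressed as a DataFrame transformation.
            Dataset<Row> byRegionDf = spark.table("sales").groupBy("region").count();

            byRegionSql.show();
            byRegionDf.show();
            spark.stop();
        }
    }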

Environment: CDH5, Apache Hadoop, HDFS, Java MapReduce, YARN, Hive, Pig, Oozie, Sqoop, SQL, Oracle 11g, Linux, Shell scripting, Java, Spark, Scala, SBT, Storm, Kafka, Eclipse, Amazon S3, JD Edwards EnterpriseOne, JIRA, Git, Stash

Confidential, Washington, DC

Hadoop Developer

Responsibilities:

  • Working on a Cloudera Hadoop platform to implement Big data solutions using Hive, MapReduce, shell scripting, and Java technologies.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Configured Hive, Pig, Oozie, and Sqoop on Hadoop cluster.
  • Developed bash shell scripts that invoke Hive HQL scripts and create the appropriate dependencies.
  • Scheduled batch jobs using Contabo and Oozie workflows.
  • Handled the importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MS SQL Server into HDFS using Sqoop.
  • Analyzed the data by running Hive queries (a minimal sketch follows this list).
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Configured Tableau and used it to generate reports and dashboards.
  • Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data.
  • Worked with Linux systems and the MySQL database on a regular basis, as well as with NoSQL databases such as Cassandra.
  • Worked towards continuous performance enhancements of Hive queries.
  • Experience in managing and reviewing Hadoop log files.
  • Worked on various POCs aimed at better big data solutions.
  • Very good experience with both MapReduce 1 (JobTracker) and MapReduce 2 (YARN) setups.
  • Reviewed and analyzed logs for YARN MapReduce jobs.
  • Worked on the projects to implement Spark and Scala.
  • Created database objects such as tables, views, materialized views, procedures, and packages using Oracle tools like TOAD, PL/SQL Developer, and SQL*Plus.
  • Partitioned the fact tables and materialized views to enhance the performance.
  • Extensively used bulk collection in PL/SQL objects to improve performance.
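
For illustration, a minimal sketch of running a Hive query from Java over the HiveServer2 JDBC driver, in the spirit of the Hive analysis above; the host, credentials, and table are hypothetical:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveQueryRunner {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:hive2://hive-server:10000/default", "etl_user", "");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(
                         "SELECT customer_id, SUM(amount) AS total "
                       + "FROM transactions GROUP BY customer_id")) {
                while (rs.next()) {
                    System.out.println(rs.getString("customer_id") + "\t" + rs.getDouble("total"));
                }
            }
        }
    }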

Environment: Hadoop, MapReduce, MapR-FS, Hive, Pig, ZooKeeper, HBase, Sqoop, Ubuntu, JDBC, Java, Web services, SOAP UI, WSDL, MapR Distribution, Eclipse Mars IDE.

Confidential, Plano TX

Java Developer/Hadoop Developer

Responsibilities:

  • Developed graphical user interfaces using HTML and JSPs for user interaction.
  • Created dynamic HTML pages, used JavaScript for client-side validations, and performed server-side validations using AJAX.
  • Developed server side applications using Servlets and JDBC.
  • Experience using various Java, J2EE and open source frameworks - Servlets, JSP, JDBC, JMS, Java Mail, Apache CXF, REST, and XML.
  • Developed and used SOAP web services for providing services to other platforms.
  • Implemented the persistence layer using JPA and wrote SQL queries based on the JPA Criteria API.
  • Used Hibernate for data persistence.
  • Extensively involved in Design phase and delivered Design documents.
  • Developed MapReduce Java programs with custom Writables to load web server logs into HBase using Flume.
  • Processed and analyzed log data stored in HBase and imported it into the Hive warehouse, enabling business analysts to write HiveQL queries.
  • Built reusable Hive UDF libraries that business analysts could use in their Hive queries (a minimal sketch follows this list).
  • Developed various workflows using custom MapReduce, Pig, Hive and scheduled them using Oozie.
  • Extensive knowledge in troubleshooting code related issues.
  • Developed a suite of unit test cases for Mapper, Reducer, and Driver classes using MRUnit.
  • Used Apache Kafka and Apache Storm to gather log data and feed it into HDFS.
  • Developed job workflows in Oozie to automate the tasks of loading the data into HDFS.
  • Moved all RDBMS data into flat files generated from various channels to HDFS for further processing.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
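
For illustration, a minimal sketch of a reusable Hive UDF in Java of the kind referenced above; the masking behavior and class name are hypothetical:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical reusable UDF: masks all but the last four characters of a value.
    public class MaskValue extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            String s = input.toString();
            int keep = Math.min(4, s.length());
            StringBuilder masked = new StringBuilder();
            for (int i = 0; i < s.length() - keep; i++) {
                masked.append('*');
            }
            masked.append(s.substring(s.length() - keep));
            return new Text(masked.toString());
        }
    }

Analysts would then call such a function after ADD JAR and CREATE TEMPORARY FUNCTION, e.g. SELECT mask_value(account_no) FROM customers.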

Environment: Java, J2EE, Spring 1.2, AJAX, Hibernate, EJB 3.0, WebSphere Application Server, Windows XP, Oracle 10g, Hadoop, MapReduce, MapR-FS, Hive, PIG, Zookeeper, HBase, Sqoop
