
Hadoop Developer Resume


Minneapolis, MN

SUMMARY

  • Overall 9+ years of experience in the IT industry, including 4 years of Big Data and Hadoop along with reporting tools, across all phases of the SDLC: requirements gathering, system design, development, enhancement, maintenance, testing, deployment, production support and system documentation.
  • Hands-on development and implementation experience on a Big Data Management Platform using HDFS, MapReduce, Hive, Pig, Oozie, Talend, Apache Kite and other Hadoop ecosystem components as data storage and retrieval systems.
  • Well versed in installing, configuring, supporting and managing Hadoop clusters and their underlying Big Data infrastructure.
  • Experience in managing and reviewing Hadoop Log files.
  • Expert in working with the Hive data warehouse tool: creating tables, distributing data by implementing partitioning and bucketing, and writing and optimizing HiveQL queries.
  • Experience with the Oozie workflow engine in running workflow jobs with actions that launch Hadoop MapReduce and Pig jobs.
  • Experience in importing and exporting data using Sqoop between HDFS and relational database systems, including mainframes.
  • Worked on writing MapReduce programs to perform data processing and analysis.
  • Work experience with cloud infrastructure like Amazon Web Services (AWS).
  • Experienced in integrating various data sources such as Java applications, RDBMS, shell scripts, spreadsheets and text files.
  • In-depth understanding of data structures and algorithms.
  • Experience in Web Services using XML, HTML and SOAP.
  • Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume.
  • Knowledge of NoSQL databases such as MongoDB, HBase and Cassandra.
  • Worked on Windows and UNIX/Linux platforms with technologies such as SQL, PL/SQL, XML, HTML, CSS, JavaScript, Core Java, etc.
  • Experience in Hadoop administration activities such as installation and configuration of clusters using Apache and Cloudera.
  • Experience in using IDEs like Eclipse and NetBeans.
  • Experience in Apache Spark, Spark Streaming, Spark SQL and NoSQL databases like Cassandra and HBase.
  • Automated the process of pulling data from source systems into Hadoop and exporting it as JSON files to a specified location.
  • Participated in the design, development and system migration of a high-performance, metadata-driven data pipeline with Kafka and Hive/Presto on Qubole, providing data export capability through an API and UI.
  • Involved in migrating MapReduce jobs to Spark jobs and used the Spark SQL and DataFrames API to load structured and semi-structured data into Spark clusters (a minimal sketch follows this list).
  • Experience on Puppet and Chef.
  • Developed UML Diagrams for Object Oriented Design: Use Cases, Sequence Diagrams and Class Diagrams.
  • Working knowledge of databases such as Oracle 10g.
  • Experience in writing Pig Latin scripts.
  • Worked on developing ETL processes to load data from multiple data sources to HDFS using Flume and Sqoop, perform structural modifications using MapReduce and Hive, and analyze data using visualization/reporting tools.
  • Hands on experience in configuring and working with Flume to load the data from multiple sources directly into HDFS.
  • Clear knowledge of rack awareness topology in the Hadoop cluster.
  • Experience in use of Shell scripting to perform tasks.
  • Familiar with Core Java, with a strong understanding and working knowledge of object-oriented concepts such as collections, multithreading, data structures, algorithms, exception handling and polymorphism.
  • Involved in the CI/CD process using Git, Nexus, Jenkins job creation, Maven builds, Docker image creation and deployment in an AWS environment.
  • Basic knowledge of application design using the Unified Modeling Language (UML), sequence diagrams, use case diagrams, Entity Relationship Diagrams (ERD) and Data Flow Diagrams (DFD).
  • Extensive programming experience in developing web-based applications using Core Java, J2EE, JSP and JDBC.
  • Comprehensive knowledge of Software Development Life Cycle coupled with excellent communication skills.
  • Expertise in relational databases like Oracle and MySQL.
  • Extensive experience in working with customers to gather the information needed to analyze, clarify and provide data or code fixes for technical problems; building service patches for each version release; performing unit, integration, User Acceptance and system testing; and providing Technical Solution documents for users.
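
A minimal Java sketch of the kind of Spark SQL / DataFrames migration described above, assuming a Spark 2.x-style SparkSession; the application name, HDFS paths and column names are hypothetical placeholders rather than details from any of the projects below.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SaveMode;
    import org.apache.spark.sql.SparkSession;

    public class OrdersToSpark {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("orders-aggregation")
                    .getOrCreate();

            // Load semi-structured JSON from HDFS into a DataFrame.
            Dataset<Row> orders = spark.read().json("hdfs:///data/raw/orders/");

            // Express the old MapReduce aggregation as Spark SQL.
            orders.createOrReplaceTempView("orders");
            Dataset<Row> dailyTotals = spark.sql(
                    "SELECT order_date, SUM(amount) AS total FROM orders GROUP BY order_date");

            // Write the result back to HDFS for downstream consumers.
            dailyTotals.write().mode(SaveMode.Overwrite).parquet("hdfs:///data/curated/daily_totals/");
            spark.stop();
        }
    }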

TECHNICAL SKILLS

Hadoop/Big Data Technologies: HDFS, MapReduce, Hive, Pig, Impala, Sqoop, Flume, Kafka, Storm, HBase, Spark, Cassandra, Oozie, Zookeeper, YARN, Talend.

Programming Languages: Java (JDK 7/JDK 8), C/C++, MATLAB, Python, R, HTML, SQL, PL/SQL

Frameworks: Hibernate, Spring, Struts and JPA

Web Services: WSDL, SOAP, Apache CXF/XFire, Apache Axis, REST, Jersey

Client Technologies: jQuery, JavaScript, AJAX, CSS, HTML, XML, XHTML

Operating Systems: UNIX, Windows, LINUX

Application Servers: IBM WebSphere, Tomcat, WebLogic

Web technologies: JSP, Servlets, Socket Programming, JNDI, JDBC, Java Beans, JavaScript, Web Services (JAX-WS)

Databases: NoSQL, Oracle 10g, Microsoft SQL Server, DB2 & MySQL

Java IDE: Eclipse, IBM WebSphere Application Developer, IBM RAD 7.0

Tools: TOAD, SQL Developer, SOAP UI, ANT, Maven, Visio, Rational Rose

Reporting Tools: Talend, Tableau

PROFESSIONAL EXPERIENCE

Confidential, Minneapolis, MN

Hadoop Developer

Responsibilities:

  • Worked on analyzing the Hadoop stack and different big data analytic tools, including Pig, Hive, the HBase database and Sqoop.
  • Designed high level ETL architecture for overall data transfer from the OLTP to OLAP.
  • Created various documents such as the Source-to-Confidential Data Mapping document, Unit Test Cases and the Data Migration document.
  • Developed Pig Latin scripts to extract the data from the web server output files to load in HDFS.
  • Collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
  • Created mappings using the transformations like Source Qualifier, Aggregator, Expression, Lookup, Router, Normalizer, Filter, Update Strategy and Joiner transformations.
  • Worked on Hive to expose data for further analysis and to generate and transform files from different analytical formats into text files.
  • Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Migrated high-volume OLTP transactions from Oracle to Cassandra in order to reduce the Oracle licensing footprint.
  • Implemented best income logic using Pig scripts and UDFs.
  • Designed and implemented Spark test bench application to evaluate quality of recommendations made by the engine.
  • Importing and exporting data into HDFS and Hive using Sqoop.
  • Worked on the Hive JSON SerDe to parse spatial/JSON data formats.
  • Handled streaming and complex analytics processing using Spark.
  • Implemented test scripts to support test driven development and continuous integration.
  • Worked on tuning the performance of Hive and Pig queries.
  • Eased jobs by building applications on top of the NoSQL database Cassandra.
  • Used Sqoop to connect to DB2 and move the pivoted data into Hive tables.
  • Executed a custom workflow scheduler service to manage multiple independent workflows.
  • Implemented a web application that uses the Oozie REST API to schedule jobs (see the sketch after this list).
  • Created and implemented an Apache Solr system to detect duplicate vendors.
  • Migrated MongoDB to HBase and worked on developing the HBase schema for the existing database.
  • Worked closely with the admin team in deploying, monitoring and maintaining the Amazon AWS cloud infrastructure, consisting of multiple EC2 nodes and VMware VMs, as required in this environment.
  • Gained knowledge in AWS data migration between different database platforms through RDS tool.
  • Loading the data into Hive for end user analytics.
  • Unit tested and tuned SQLs and ETL Code for better performance.
  • Monitored the performance and identified performance bottlenecks in ETL code.
  • Extracted files through Talend, placed them in HDFS and processed them.
  • Analyzed the requirement and framed the business logic for the ETL process using Talend.
  • Gained knowledge in formulating procedures for integration of R programming plans with data sources and delivery systems.
  • Provided technical assistance in development and execution of test plans using R as per client requirements.
  • Worked on data utilizing a Hadoop, ZooKeeper and Accumulo stack, aiding in the development of specialized indexes for performant queries on big data implementations.
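
The Oozie-based job scheduling mentioned above can be sketched with the Oozie Java client, which wraps the REST API. This is a minimal, hypothetical example: the Oozie URL, workflow application path and property values are placeholders, not the project's actual configuration.

    import java.util.Properties;
    import org.apache.oozie.client.OozieClient;
    import org.apache.oozie.client.OozieClientException;

    public class WorkflowSubmitter {
        public static void main(String[] args) throws OozieClientException {
            // Client for the Oozie server's REST endpoint (placeholder URL).
            OozieClient oozie = new OozieClient("http://oozie-host:11000/oozie");

            // Workflow properties that the web application would normally supply.
            Properties conf = oozie.createConfiguration();
            conf.setProperty(OozieClient.APP_PATH, "hdfs:///user/etl/workflows/daily-load");
            conf.setProperty("nameNode", "hdfs://namenode:8020");
            conf.setProperty("jobTracker", "resourcemanager:8032");

            // Submit and start the workflow, then report its id and status.
            String jobId = oozie.run(conf);
            System.out.println("Submitted workflow " + jobId + " status: "
                    + oozie.getJobInfo(jobId).getStatus());
        }
    }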

Environment: Hadoop, HBase, Hive, Pig, Sqoop, Spark SQL, Spark Context, Spark Streaming, Linux/UNIX shell scripting, Java, GitHub, Talend.

Confidential - Milwaukee, WI

Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Experience in installing, configuring and using Hadoop Ecosystem components.
  • Experience in Importing and exporting data into HDFS and Hive using Sqoop.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Involved in loading data into HBase using the HBase shell, HBase Client API, Pig and Sqoop.
  • Worked on different file formats like Sequence files, XML files and Map files using MapReduce programs.
  • Responsible for managing data coming from different sources.
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
  • Strong expertise on the MapReduce programming model with XML, JSON and CSV file formats.
  • Gained good experience with NoSQL databases.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs (see the sketch after this list).
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Developed custom aggregate functions using Spark SQL and performed interactive querying, on a POC level.
  • Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume.
  • Experience in managing and reviewing Hadoop log files.
  • Involved in loading data from LINUX file system to HDFS.
  • Implemented test scripts to support test driven development and continuous integration.
  • Created Pig Latin scripts to sort, group, join and filter the enterprise-wide data.
  • Worked on tuning the performance of Pig queries.
  • Mentored the analyst and test teams in writing Hive queries.
  • Installed Oozie workflow engine to run multiple MapReduce jobs.
  • Worked with different data sources using multiple input formats with Generic and Object Writables.
  • Provided cluster coordination services through ZooKeeper.
  • Extensive working knowledge of partitioned tables, UDFs, performance tuning, compression-related properties and the Thrift server in Hive.
  • Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
  • Worked with the Data Science team to gather requirements for various data mining projects.
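
The Hive table creation, loading and querying described above can be sketched from Java through the HiveServer2 JDBC driver. This is only an illustration of the approach: the connection URL, credentials, table layout and HDFS path are assumptions, not project details.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveTableExample {
        public static void main(String[] args) throws Exception {
            // HiveServer2 JDBC connection (placeholder host, database and user).
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection con = DriverManager.getConnection(
                    "jdbc:hive2://hive-host:10000/default", "etl_user", "");
                 Statement stmt = con.createStatement()) {

                // Partitioned table for a hypothetical daily feed.
                stmt.execute("CREATE TABLE IF NOT EXISTS web_logs (ip STRING, url STRING, bytes BIGINT) "
                        + "PARTITIONED BY (dt STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','");

                // Load one day's files from HDFS into the matching partition.
                stmt.execute("LOAD DATA INPATH '/data/staging/web_logs/2016-01-01' "
                        + "INTO TABLE web_logs PARTITION (dt='2016-01-01')");

                // An aggregate query like this runs internally as a MapReduce job.
                ResultSet rs = stmt.executeQuery(
                        "SELECT url, COUNT(*) AS hits FROM web_logs WHERE dt='2016-01-01' GROUP BY url");
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
                }
            }
        }
    }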

Environment: Cloudera CDH 6, HDFS, Hadoop 2.2.0 (YARN), JSON, Spark, Flume 1.5.2, Eclipse, MapReduce, Hive, Pig Latin 0.14.0, Java, SQL, Sqoop 1.4.6, CentOS, Zookeeper 3.5.0 and NoSQL databases.

Confidential - Houston, TX

Hadoop Developer

Responsibilities:

  • Involved in building a multi-node Hadoop cluster.
  • Imported data into HDFS using Sqoop.
  • Experience in retrieving data from databases like MySQL and Oracle into HDFS using Sqoop and ingesting it into HBase.
  • Developed Hive Queries to analyze the data in HDFS to identify issues and behavioral patterns.
  • Worked on shell scripting to automate jobs.
  • Used Pig Latin to analyze datasets and perform transformation according to business requirements.
  • Experience in using Sqoop to import data into Cassandra tables from different relational databases, and in importing data from various sources into the Cassandra cluster using Java APIs.
  • Worked on implementing Flume to import streaming log data and aggregate it into HDFS.
  • Implemented MapReduce programs to perform joins using secondary sorting and the distributed cache (see the sketch after this list).
  • Generated daily and weekly Status Reports to the team manager and participated in weekly status meeting with Team members, Business analysts and Development team.
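
A minimal sketch of the distributed-cache half of the join work mentioned above (the secondary-sort variant with a composite key, custom partitioner and grouping comparator is omitted for brevity). The file layouts, delimiters and cache path are hypothetical.

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.URI;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Map-side join: a small dimension file is shipped to every mapper via the
    // distributed cache (driver side: job.addCacheFile(new URI("hdfs:///dim/products.csv"))).
    public class CacheJoinMapper extends Mapper<LongWritable, Text, Text, Text> {
        private final Map<String, String> lookup = new HashMap<>();

        @Override
        protected void setup(Context context) throws IOException {
            for (URI cacheFile : context.getCacheFiles()) {
                Path path = new Path(cacheFile.getPath());
                try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                        path.getFileSystem(context.getConfiguration()).open(path)))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        String[] parts = line.split(",", 2);      // assumed layout: key,description
                        if (parts.length == 2) {
                            lookup.put(parts[0], parts[1]);
                        }
                    }
                }
            }
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",", 2);     // assumed layout: key,fact columns
            if (fields.length < 2) {
                return;
            }
            String dim = lookup.get(fields[0]);
            if (dim != null) {                                     // emit only joined rows
                context.write(new Text(fields[0]), new Text(fields[1] + "," + dim));
            }
        }
    }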

Environment: Hadoop, MapReduce, HDFS, HBase, Hive, MySQL, Oracle, Sqoop, Pig, Flume, UNIX, Java, JavaScript, Maven, Eclipse.

Confidential - MA

Java Developer

Responsibilities:

  • Responsible for requirements capturing & preparing software requirements specification.
  • Designed, built, tested and deployed software product features; selected the best approach and developed functionality using best practices.
  • Developed a rich user interface with Ext JS 4.2, CSS3 and AJAX.
  • Developed a new card layout and created XTemplate, grid layout and action grid functionality.
  • Developed Apache Solr based faceted search functionality.
  • Developed new REST-based web services (a minimal sketch follows this list).
  • Created migration scripts to support nightly migration.
  • Developed and enhanced database objects, including queries, procedures and views.
  • Tested new code with behavior-driven JavaScript tests in Jasmine and wrote automated Selenium-based functional tests for new features.
  • Provided guidance to cross-functional/organizational team members.
  • Worked as a team member in a fast-paced, agile environment.
  • Worked with the Quality Assurance team in fixing the defects.
  • Used Eclipse as the IDE to develop the application and JIRA for bug and issue tracking.
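
The REST services mentioned above are of the kind sketched below using JAX-RS annotations (Jersey appears in the skills section). The resource path, payload and lookup logic are hypothetical placeholders, not the application's real API.

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.MediaType;
    import javax.ws.rs.core.Response;

    // Hypothetical read-only resource; a real implementation would delegate to a
    // service/DAO layer instead of building the payload inline.
    @Path("/vendors")
    public class VendorResource {

        @GET
        @Path("/{id}")
        @Produces(MediaType.APPLICATION_JSON)
        public Response getVendor(@PathParam("id") String id) {
            // Placeholder lookup; return 404 when the vendor is unknown.
            String vendorJson = findVendorJson(id);
            if (vendorJson == null) {
                return Response.status(Response.Status.NOT_FOUND).build();
            }
            return Response.ok(vendorJson).build();
        }

        private String findVendorJson(String id) {
            // Stubbed data access for the sketch.
            return "{\"id\":\"" + id + "\",\"name\":\"Example Vendor\"}";
        }
    }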

Environment: Java (JDK 1.7), Ext JS 4.2, ColdFusion 8, Oracle Database 11g, Hibernate ORM 4.2, Apache Tomcat 7.0, Apache Solr, Git repository, REST web services, Jasmine 1.2, Selenium WebDriver, TestNG, SQL Developer, Maven, CSS3, JavaScript, HTML, JIRA.

Confidential

Java/ J2EE Developer

Responsibilities:

  • Involved in preparing the Test Plans for testing to be carried out effectively.
  • Developed the core modules for the services using an n-tier architecture.
  • Analyzed the GAP documents to create Test Scenarios and Test Cases.
  • Focused more on the Functional behavior of the system.
  • Integration tested the Transfers Module completely.
  • Involved in testing the manual creation of transactions like Funds Transfer and Standing Order.
  • Tested transactions created electronically through message injection using JMS.
  • Used JavaScript for client-side validations.
  • Worked on single transactions as well as bulk transactions such as Payroll Processing using a custom MVC framework.
  • Used JDBC to connect to the DB2 database (see the sketch after this list).
  • Effective execution of the prepared Test Cases.
  • Involved in writing SQL queries and PL/SQL stored procedures and functions.
  • Used separate rules to do business validation.
  • Took active participation in the discussions with the Client on several issues.
  • Tested Transactions involving Foreign exchange and tested various scenarios involving FOREX.
  • Used the IBM Optim tool for the database UI.
  • Involved in peer level reviews.
  • Built and deployed the code using Ant.
  • Involved in fixing QA, UAT and production defects and tracked them using QC.
  • Involved in unit testing with JUnit and in integration testing.
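
The JDBC access to DB2 noted above could look roughly like the sketch below; the driver class is the standard IBM type-4 driver, while the connection URL, credentials, table and columns are assumed placeholders.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class TransferDao {
        public static void main(String[] args) throws Exception {
            // IBM DB2 JDBC (type 4) driver; URL and credentials are placeholders.
            Class.forName("com.ibm.db2.jcc.DB2Driver");
            Connection con = DriverManager.getConnection(
                    "jdbc:db2://db2-host:50000/BANKDB", "app_user", "secret");
            try {
                // Parameterised query to read transfers for one account.
                PreparedStatement ps = con.prepareStatement(
                        "SELECT TXN_ID, AMOUNT, STATUS FROM FUNDS_TRANSFER WHERE ACCOUNT_ID = ?");
                ps.setString(1, "ACC-1001");
                ResultSet rs = ps.executeQuery();
                while (rs.next()) {
                    System.out.println(rs.getString("TXN_ID") + " "
                            + rs.getBigDecimal("AMOUNT") + " " + rs.getString("STATUS"));
                }
                rs.close();
                ps.close();
            } finally {
                con.close();
            }
        }
    }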

Environment: Java 1.4, JSP, Servlets, custom MVC framework, tag libraries, JavaScript, CSS, JDBC, JNDI, Oracle, JavaBeans, Windows/UNIX, Ant, JUnit, IBM ClearCase, QC, EditPlus, WebSphere, IBM Optim tool.

Confidential

Java/J2EE Developer

Responsibilities:

  • Used WebSphere for developing use cases, sequence diagrams and preliminary class diagrams for the system in UML.
  • Extensively used WebSphere Studio Application Developer for building, testing and deploying applications.
  • Used the Spring Framework based on the Model View Controller (MVC) pattern and designed GUI screens using HTML and JSP.
  • Developed the presentation layer and GUI framework in HTML and JSP, and performed client-side validations.
  • Involved in Java code that generated XML documents, which in turn used XSLT to translate the content into HTML for presentation in the GUI.
  • Implemented XQuery and XPath for querying and node selection on client input XML files to create Java objects (a minimal sketch follows this list).
  • Used WebSphere to develop the Entity Beans where transactional persistence was required, and JDBC to connect to the MySQL database.
  • Developed the user interface using the JSP pages and DHTML to design the dynamic HTML pages.
  • Developed Session Beans on WebSphere for the transactions in the application.
  • Utilized WSAD to create JSP, Servlets, and EJB that pulled information from a DB2 database and sent to a front end GUI for end users.
  • In the database end, responsibilities included creation of tables, triggers, stored procedures, sub-queries, joins, integrity constraints and views.
  • Worked on MQSeries with J2EE technologies (EJB, JavaMail, JMS, etc.) on the WebSphere server.
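
A minimal sketch of XPath-based node selection over a client XML file, as mentioned above; the file name, element names and the shape of the resulting Java objects are hypothetical.

    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    public class CustomerXmlReader {
        public static void main(String[] args) throws Exception {
            // Parse the client-supplied XML file (placeholder path).
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse("customers.xml");

            // Select the nodes of interest with an XPath expression.
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList names = (NodeList) xpath.evaluate(
                    "/customers/customer/name", doc, XPathConstants.NODESET);

            // Build simple Java values (here just strings) from the selected nodes.
            for (int i = 0; i < names.getLength(); i++) {
                Node name = names.item(i);
                System.out.println("Customer: " + name.getTextContent());
            }
        }
    }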

Environment: Java, EJB, IBM WebSphere Application Server, Spring, JSP, Servlets, JUnit, JDBC, XML, XSLT, CSS, DOM, HTML, MySQL, JavaScript, Oracle, UML, ClearCase, Ant.
