Hadoop Developer Resume
Richmond, VA
SUMMARY:
- Around 8 years of overall experience across a variety of industries, including 4+ years in Big Data technologies (the Apache Hadoop stack and Apache Spark) and 4+ years in Java technologies
- Hands on experience with Cloudera and Hortonworks.
- Hands on experience in Hadoop Ecosystem components such as Hive, Pig, Sqoop, Flume, Impala, Oozie, Zookeeper, HBase.
- Strong knowledge of Hadoop daemons such as Job Tracker, Task Tracker, Name Node and Data Node, as well as HDFS and Map Reduce concepts.
- Hands on experience in writing Map Reduce programs using Java to handle different data sets using Map and Reduce tasks.
- Hands on experience with various Apache Hadoop ecosystem components such as Hadoop, Spark, HDFS, MapReduce, YARN, Tez, HBase, Pig, Hive, Sqoop, Flume, Oozie, and Kafka
- Hands on experience in writing MapReduce jobs in Java, Pig, and Python
- Experience in dealing with SQL in Hadoop with Apache Hive
- Hands on experience in writing Apache Spark SQL and Spark Streaming programs with Scala and Python.
- Developed multiple Map Reduce jobs to perform data cleaning and preprocessing.
- Involved in designing the data model in Hive for migrating the ETL process into Hadoop and wrote Pig Scripts to load data into Hadoop environment
- Designed HIVE queries & Pig scripts to perform data analysis, data transfer and table design.
- Expertise in writing Hive UDFs and Generic UDFs to incorporate complex business logic into Hive queries (see the UDF sketch following this summary).
- Experienced in optimizing Hive queries by tuning configuration parameters.
- Implemented SQOOP for large dataset transfer between Hadoop and RDBMS.
- Extensively used Apache Flume to collect the logs and error messages across the cluster.
- Experience in implementing real-time streaming and analytics using Spark Streaming and Kafka
- Experience in data ingestion using Sqoop from RDBMS to HDFS and Hive and vice-versa
- Proficient in Java/J2EE technologies - Core Java, JSP, Java Beans, Java Servlets, Ajax, JDBC, ODBC, Web Services, Swing, Hibernate, Spring, Struts, XML and XSLT
- Performed data analysis using MySQL, SQL Server Management Studio and Oracle
- Experience with ETL tools including Informatica, Talend and SSIS
- Experience in working with Cloudera (CDH3, CDH4 & CDH5) and Hortonworks Hadoop distributions.
- Hands on experience on AWS infrastructure services Amazon Simple Storage Service (Amazon S3) and Amazon Elastic Compute Cloud (Amazon EC2).
- Worked with Oozie and Zookeeper to manage the flow of jobs and coordination in the cluster
- Experience in performance tuning and monitoring of the Hadoop cluster by gathering and analyzing the existing infrastructure using Cloudera Manager.
- Experience with configuration of Hadoop Ecosystem components: Map Reduce, Hive, HBase, Pig, Sqoop, Oozie, Zookeeper, Flume, Storm, Spark, Yarn, Tez.
- Experience with RESTful services and Amazon Web Services
- Hands on Experience on Amazon’s EC2, EMR and S3
- Conversant with Web/Application Servers - Tomcat, WebSphere, WebLogic and IIS
- Experience in writing Maven and SBT scripts to build and deploy Java and Scala Applications
- Around 2 years of experience with Spark and Scala
- Implemented unit testing with JUnit and MRUnit
- Expertise in Web Application Development with JSP, HTML, CSS, JavaScript, ASP.NET, C#.NET and jQuery
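As a small illustration of the Hive UDF work mentioned above, the sketch below shows what such a function can look like in Java. It is a minimal example rather than code from any of the projects listed here: the function name, the rating-code column and the normalization rule are all hypothetical.

```java
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF: normalizes a vendor-supplied rating code to upper case
// and maps null or empty values to "UNKNOWN".
@Description(name = "normalize_rating",
             value = "_FUNC_(code) - returns a cleaned-up rating code")
public final class NormalizeRating extends UDF {
  public Text evaluate(Text code) {
    if (code == null || code.toString().trim().isEmpty()) {
      return new Text("UNKNOWN");
    }
    return new Text(code.toString().trim().toUpperCase());
  }
}
```

Once packaged into a JAR, a UDF like this is registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION and then called from HiveQL like any built-in function.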
TECHNICAL SKILLS:
Big data Technologies: Hadoop, Map Reduce, HDFS, Hive, Pig, Zookeeper, Sqoop, Oozie, Flume, Impala, HBase, Kafka, Storm
Big Data Frameworks: HDFS, YARN, Spark
Hadoop Distributions: Cloudera (CDH3, CDH4, CDH5), Hortonworks, Amazon EMR
Programming Languages: Java, C, C++, Shell scripting, Scala
Databases: MySQL, Oracle, Microsoft SQL Server, Teradata, DB2, PL/SQL, Cassandra, MongoDB
IDE and Tools: Eclipse, NetBeans, Tableau
Operating System: Windows XP/Vista/7, Linux/Unix
Frameworks: Spring, Hibernate, JSF, EJB, JMS
Scripting Languages: JSP & Servlets, JavaScript, XML, HTML, Python
Application Servers: Apache Tomcat, WebSphere, WebLogic, JBoss
Methodologies: Agile, SDLC, Waterfall
Web Services: RESTful, SOAP
ETL Tools: Talend, Informatica
Others: Solr, Elasticsearch
PROFESSIONAL EXPERIENCE:
Confidential, Richmond, VA
Hadoop Developer
Responsibilities:
- Imported retail and commercial data from various vendors into HDFS using the EDE process and Sqoop.
- Designed the Cascading flow setup from the edge node to HDFS (the data lake)
- Created the Cascading code to perform several types of data transformations as required by the DA (see the Cascading sketch after this list)
- Used Hue to create external Hive tables on both the imported data and the transformed data
- Developed the code for removing or replacing erroneous fields in the data using Cascading
- Created custom functions for several datatype conversions and for handling errors in the data provided by the vendor
- Monitored the Cascading flows using the Driven component to ensure the desired results were obtained
- Optimized a Confidential tool, Docs, for importing the data and converting it into the Parquet file format after validation.
- Involved in testing Spark for exporting data from HDFS to an external database in a POC
- Developed shell scripts for automating the Cascading jobs for the Control-M schedule.
- Involved in testing AWS Redshift connectivity with a SQL database for storing data in a POC
- Developed Hive queries to analyze the data by customer rating ID for several projects
- Converted raw files (CSV, TSV) to other file formats such as Parquet and Avro, with datatype conversion, using Cascading
- Wrote test cases for the Cascading jobs using the Plunger framework.
- Set up the Cascading environment and troubleshot environment issues related to Cascading.
- Assisted in creating and maintaining technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts
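As a rough illustration of the Cascading flows described above, the sketch below reads a delimited vendor file from an edge-node landing path, drops records with an empty key field, and writes the result to the HDFS data lake. It assumes Cascading 2.x on Hadoop; the paths, the comma-delimited layout and the rating_id field are hypothetical placeholders for the real vendor feeds and transformation rules.

```java
import java.util.Properties;
import cascading.flow.Flow;
import cascading.flow.FlowDef;
import cascading.flow.hadoop.HadoopFlowConnector;
import cascading.operation.regex.RegexFilter;
import cascading.pipe.Each;
import cascading.pipe.Pipe;
import cascading.property.AppProps;
import cascading.scheme.hadoop.TextDelimited;
import cascading.tap.Tap;
import cascading.tap.hadoop.Hfs;
import cascading.tuple.Fields;

public class VendorCleanFlow {
  public static void main(String[] args) {
    String inPath = args[0];    // e.g. the edge-node landing directory
    String outPath = args[1];   // e.g. the data-lake target directory

    Properties props = new Properties();
    AppProps.setApplicationJarClass(props, VendorCleanFlow.class);

    // Source and sink taps; the feed is assumed to be comma-delimited with a header row.
    Tap source = new Hfs(new TextDelimited(true, ","), inPath);
    Tap sink = new Hfs(new TextDelimited(true, ","), outPath);

    // Keep only records whose (hypothetical) rating_id field is non-empty.
    Pipe pipe = new Pipe("clean-vendor-feed");
    pipe = new Each(pipe, new Fields("rating_id"), new RegexFilter("^.+$"));

    FlowDef flowDef = FlowDef.flowDef()
        .addSource(pipe, source)
        .addTailSink(pipe, sink);

    Flow flow = new HadoopFlowConnector(props).connect(flowDef);
    flow.complete();
  }
}
```

Flows of this shape are what the Driven monitoring and the Plunger test cases mentioned above operate on.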
Environment: MapReduce, HDFS, Sqoop, Cascading, Linux, Shell, Hadoop, Spark, Hive, AWS Redshift, Hadoop cluster
Confidential, New York, NY
Sr. Hadoop Developer
Responsibilities:
- Worked on analyzing data in the Hadoop cluster using different big data analytics tools including Pig, Hive, and MapReduce
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis
- Used Pig as ETL tool to do transformations, event joins, filtering and some pre-aggregations before storing the data onto HDFS.
- Hands on experience in writing and executing Pig scripts.
- Hands on experience in writing Pig UDFs.
- Configured Oozie workflows to automate data flow, preprocessing and cleaning tasks using Hadoop actions.
- Performed daily monitoring of cluster status and health, including Data Node, Job Tracker, Task Tracker, and Name Node.
- Configured Hadoop ecosystem components: Map Reduce, Hive, HBase, Pig, Sqoop, Oozie, Zookeeper, Flume, Storm, Spark, Yarn, Tez.
- Used the CDH distribution and Cloudera Manager to manage and monitor Hadoop clusters.
- Rendered and delivered reports in the desired formats using reporting tools such as Tableau.
- Worked on debugging and performance tuning of Hive & Pig jobs
- Worked on tuning the performance of Pig queries
- Gained experience in managing and reviewing Hadoop log files
- Created HBase tables to store various data formats coming from different applications (see the HBase sketch after this list)
- Developed ETL Scripts for Data acquisition and Transformation using Talend
- Extensive experience with Talend source and connection configuration, credentials management, and context management
- Implemented and assisted with Talend installations and Talend server setup, including the MDM server
- Implemented a proof of concept to analyze the streaming data using Apache Spark with Scala and Python; used Maven and SBT to build and deploy the Spark programs (see the Spark Streaming sketch after this list)
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs
- Developed simple to complex MapReduce jobs using Java, Pig and Hive
- Developed the application using Eclipse and used Maven as the build and deployment tool
- Exported the analyzed data to the relational databases using Sqoop for visualization
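As a minimal illustration of the HBase table creation mentioned above, the sketch below uses the HBase Java client API (assuming the HBase 1.x client on CDH); the table name and column families are hypothetical, since the real tables were driven by the formats arriving from each application.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CreateEventTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();   // picks up hbase-site.xml
    try (Connection connection = ConnectionFactory.createConnection(conf);
         Admin admin = connection.getAdmin()) {
      // Hypothetical table with two column families: raw payload and parsed metadata.
      HTableDescriptor table = new HTableDescriptor(TableName.valueOf("app_events"));
      table.addFamily(new HColumnDescriptor("d"));   // raw payload
      table.addFamily(new HColumnDescriptor("m"));   // parsed metadata
      if (!admin.tableExists(table.getTableName())) {
        admin.createTable(table);
      }
    }
  }
}
```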
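The streaming proof of concept above was built with Scala and Python; to keep the examples in this document in one language, here is a comparable sketch using Spark Streaming's Java API with the Kafka direct stream (spark-streaming-kafka-0-10). The broker address, topic name and consumer group are placeholders, and the real analysis logic is omitted.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;
import scala.Tuple2;

public class ClickStreamCounts {
  public static void main(String[] args) throws InterruptedException {
    SparkConf conf = new SparkConf().setAppName("ClickStreamCounts");
    JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

    Map<String, Object> kafkaParams = new HashMap<>();
    kafkaParams.put("bootstrap.servers", "broker1:9092");    // placeholder broker
    kafkaParams.put("key.deserializer", StringDeserializer.class);
    kafkaParams.put("value.deserializer", StringDeserializer.class);
    kafkaParams.put("group.id", "streaming-poc");             // hypothetical group id
    kafkaParams.put("auto.offset.reset", "latest");

    JavaInputDStream<ConsumerRecord<String, String>> stream =
        KafkaUtils.createDirectStream(
            jssc,
            LocationStrategies.PreferConsistent(),
            ConsumerStrategies.<String, String>Subscribe(
                Collections.singletonList("clicks"), kafkaParams));

    // Count occurrences of each message value in every 30-second batch
    // and print a sample of the counts to the driver log.
    stream.mapToPair(record -> new Tuple2<>(record.value(), 1L))
          .reduceByKey((a, b) -> a + b)
          .print();

    jssc.start();
    jssc.awaitTermination();
  }
}
```

The same flow in Scala or Python is structurally identical; only the language bindings change.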
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Java, Oracle 10g, MySQL, SQL Server, Ubuntu, Agile, YARN, Spark, Hortonworks, Teradata, Talend, UNIX Shell Scripting, Oozie, Maven, Eclipse
Confidential, NY
Hadoop Developer
Responsibilities:
- Used Sqoop to extract data from Oracle SQL and MySQL databases into HDFS
- Developed workflows in Oozie to extract the data using Sqoop per business requirements
- Developed MapReduce (YARN) jobs for cleaning, accessing and validating the data
- Used Hive and Impala to query the data in HBase
- Wrote multiple MapReduce programs in Java for data extraction, transformation and aggregation from multiple file formats including XML, JSON, CSV and other compressed file formats (see the sketch after this list)
- Wrote Hive scripts in HiveQL to de-normalize and aggregate the data
- Used Solr for querying and searching the HBase database
- Optimized the existing Hive and Pig Scripts
- Created external tables using Hive to perform analysis in HDFS
- Involved in loading data from UNIX file system to HDFS
- Designed workflows by scheduling Hive processes for log file data, which is streamed into HDFS using Flume
- Implemented a query in Neo4j to search for clients by their respective fields
- Developed schemas to handle reporting requirements using Tableau
- Worked with a team on NoSQL databases such as MongoDB for a POC (proof of concept) storing documents using GridFS.
- Developed a deep and thorough understanding of ETL tools and how they can be applied in a Big Data environment
- Worked with application teams to install operating systems, Hadoop updates, patches, version upgrades as required.
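As an illustration of the Java MapReduce programs described above, the sketch below parses a simple CSV feed and aggregates an amount per client. The column layout and field meanings are hypothetical; the XML and JSON variants mentioned above would swap in the appropriate parsing inside the mapper.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CsvAggregateJob {

  // Emits (client_id, amount) from each CSV line; the column positions are hypothetical.
  public static class ParseMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String[] cols = value.toString().split(",");
      if (cols.length < 3 || !cols[2].matches("\\d+")) {
        return;   // skip malformed records
      }
      context.write(new Text(cols[0]), new LongWritable(Long.parseLong(cols[2])));
    }
  }

  // Sums the amounts per client id.
  public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
        throws IOException, InterruptedException {
      long total = 0;
      for (LongWritable v : values) {
        total += v.get();
      }
      context.write(key, new LongWritable(total));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "csv-aggregate");
    job.setJarByClass(CsvAggregateJob.class);
    job.setMapperClass(ParseMapper.class);
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(LongWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Because the aggregation is a plain sum, the reducer could also be registered as a combiner with job.setCombinerClass(SumReducer.class) to cut shuffle volume.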
Environment: Hadoop, Map Reduce, HiveQL, Hive, HBase, Sqoop, Solr, Flume, Tableau, Impala, Oozie, MySQL, Oracle SQL, Java, Unix Shell, YARN, Pig Latin.
Confidential, Great Neck, NY
Hadoop and Java Developer
Responsibilities:
- Worked as a senior developer for the project
- Used Enterprise Java Beans as a middleware in developing a three-tier distributed application
- Developed Session Beans and Entity Beans for business and data processing
- Implemented Web Services with REST
- Developed user interface using HTML, CSS, JSPs and AJAX
- Implemented client-side validation using JavaScript and jQuery
- Applied server-side validation to the web pages in addition to the client-side checks.
- Used JIRA for bug tracking of the web application.
- Wrote Spring Core and Spring MVC configuration to associate DAOs with the business layer (see the controller sketch after this list).
- Worked with HTML, DHTML, CSS, and JavaScript in UI pages.
- Wrote Web Services using SOAP for sending and getting data from the external interface.
- Extensively worked with JUnit framework to write JUnit test cases to perform unit testing of the application
- Implemented JDBC modules in Java beans to access the database.
- Designed the tables for the back-end Oracle database.
- Hosted the application under WebLogic and developed it using the Eclipse IDE.
- Used XSL/XSLT for transforming and displaying reports. Developed Schemas for XML.
- Involved in writing the ANT scripts to build and deploy the application.
- Developed a web-based reporting for monitoring system with HTML and Tiles using Struts framework.
- Implemented field level validations with AngularJS, JavaScript and jQuery
- Prepared unit test scenarios and unit test cases
- Branding the site with CSS
- Code review and unit testing the code
- Involved in unit testing using JUnit
- Implemented Log4J to trace logs and to track information
- Involved in project discussions with clients and analyzed complex project requirements as well as prepared design documents
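As a small illustration of the Spring MVC and RESTful web service work above, the sketch below shows a controller method that returns JSON, assuming Spring MVC 3.x with Jackson on the classpath. The URL, field names and hard-coded response are placeholders for the real DAO-backed business layer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.ResponseBody;

// Hypothetical REST endpoint; names and URL layout are illustrative only.
@Controller
@RequestMapping("/api/clients")
public class ClientLookupController {

  @RequestMapping(value = "/{id}", method = RequestMethod.GET,
                  produces = "application/json")
  @ResponseBody
  public Map<String, Object> getClient(@PathVariable("id") long id) {
    // In the real application this delegated to the business layer and DAOs.
    Map<String, Object> client = new LinkedHashMap<String, Object>();
    client.put("id", id);
    client.put("status", "ACTIVE");
    return client;
  }
}
```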
Environment: Hive, Pig, HBase, Zookeeper, Sqoop, Cloudera, Java, JDBC, JNDI, Struts, Maven, Trac, Subversion, JUnit, SQL, Spring, Hibernate, Oracle, XML, Altova XMLSpy, PuTTY and Eclipse.
Confidential
Java Developer
Responsibilities:
- Coded using Java, JSP, and HTML.
- Developed front end validations using JavaScript and developed design and layouts of JSPs and custom taglibs for all JSPs.
- Participated in planning and development of UML diagrams like Use Case Diagrams, Object Diagrams, Class Diagrams and Sequence Diagrams to represent the detail design phase.
- Implemented several test cases using JUnit.
- Implemented the Apache Log4j logging component in the application (see the sketch after this list).
- Made builds and deployed them onto the common development test environment, a WebSphere Application Server environment, to verify functional requirements.
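A minimal sketch of the Log4j usage mentioned above; the class name and messages are hypothetical, and the appenders and log thresholds would be configured in a separate log4j.properties file.

```java
import org.apache.log4j.Logger;

// Hypothetical service class showing the tracing and error-logging pattern.
public class ReportUploadService {
  private static final Logger LOG = Logger.getLogger(ReportUploadService.class);

  public void upload(String fileName) {
    LOG.info("Starting upload for " + fileName);
    try {
      // ... actual upload logic would live here ...
      LOG.debug("Upload completed for " + fileName);
    } catch (RuntimeException e) {
      LOG.error("Upload failed for " + fileName, e);
      throw e;
    }
  }
}
```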
Environment: Java, J2EE, Tomcat, JSP and Struts Framework, Eclipse, SQL and Oracle.