Hadoop Developer Resume Jersey City, NJ - Hire IT People

SUMMARY

Over 8 years of experience with an emphasis on Big Data technologies and the development and design of Java-based enterprise applications
Expertise in the creation of on-prem and cloud data lakes
Experience working with Cloudera, Hortonworks and Pivotal Distributions of Hadoop
Expertise in HDFS, MapReduce, Spark, Hive, Impala, Pig, Sqoop, HBase, Oozie, Flume, Kafka and various other ecosystem components
Expertise in the Spark framework for batch and real-time data processing
Experience in working with BI teams and translating big data requirements into Hadoop-centric technologies.
Experience in performance tuning the Hadoop cluster by gathering information on and analyzing the existing infrastructure.
Working experience in designing and implementing complete end-to-end Hadoop infrastructure including Pig, Hive, Sqoop, Oozie, Flume and ZooKeeper.
Experience in converting MapReduce applications to Spark.
Experience in handling messaging services using Apache Kafka.
Experience in working with Flume to load the log data from multiple sources directly into HDFS
Experience in data migration from existing data stores and mainframe NDM (Network Data Mover) to Hadoop
Good knowledge of NoSQL databases: Cassandra, MongoDB and HBase.
Experience in handling multiple relational databases: MySQL, SQL Server, PostgreSQL and Oracle.
Experience in supporting data analysis projects using Elastic MapReduce on the Amazon Web Services (AWS) cloud, including exporting and importing data to and from S3.
Experience in designing both time-driven and data-driven automated workflows using Oozie.
Experience in supporting analysts by administering and configuring Hive.
Experience in running Pig and Hive scripts.
Experience in fine-tuning MapReduce jobs for better scalability and performance.
Developed various MapReduce applications to perform ETL workloads on terabytes of data.
Imported and exported data into and out of HDFS and Hive using Sqoop.
Experience in writing shell scripts to dump the sharded data from landing zones to HDFS.
Worked on predictive modeling techniques like Neural Networks, Decision Trees and Regression Analysis.
Experience in data mining and Business Intelligence tools such as Tableau, SAS Enterprise Miner, JMP and Enterprise Guide, IBM SPSS Modeler and MicroStrategy.

TECHNICAL SKILLS

Hadoop Ecosystem Development: HDFS, MapReduce, Spark, Hive, Pig, Flume, Oozie, ZooKeeper, HBase, Cassandra, Kafka, Solr, HCatalog, Sqoop.

Operating Systems: Linux, Windows XP, Windows Server 2003, Windows Server 2008.

Databases: MySQL, Oracle, MS SQL Server, PostgreSQL, MS Access

Languages: C, Java, Python, SQL, Pig Latin, UNIX shell scripting

PROFESSIONAL EXPERIENCE

Confidential, Jersey City, NJ

Hadoop Developer

Responsibilities:

Worked on the creation of business rules in Pig
Imported data from legacy systems to Hadoop using Sqoop and Apache Camel
Used Pig for data transformation
Used Apache Spark for real-time and batch processing
Used Apache Kafka to handle log messages that are processed by multiple systems
Used shell scripting extensively for data munging
Worked on HCatalog, which allows Pig and MapReduce to take advantage of the SerDe data format transformation definitions already written for Hive
Worked on DevOps tools like Chef, Artifactory and Jenkins to configure and maintain the production environment
Used Pig to transform data into various formats
Stored processed tables from HDFS in Cassandra so applications can access the data in real time
Used Solr on Cassandra for implementation of near real-time search
Worked on writing UDFs in Java for Pig
Created ORCFile tables from the existing non-ORCFile Hive tables

Environment: Hortonworks Data Platform 2.2, Pig, Hive, Spark, Kafka, Cassandra, Sqoop, Apache Camel, Apache Crunch, HCatalog, Chef, Jenkins, Artifactory, Avro, IBM Data Studio

Confidential, Piscataway, NJ

Hadoop Developer

Responsibilities:

Worked on the creation of on-premise and cloud data lakes from scratch with the Pivotal distribution
Imported data from various relational data stores to HDFS using Sqoop
Collected user activity data and log data using Kafka for real-time analytics
Implemented batch processing using Spark
Converted Hive tables to HAWQ for higher query performance
Responsible for loading unstructured and semi-structured data into the Hadoop cluster from different data sources using Flume
Used the Hive data warehouse tool to analyze the data in HDFS and developed Hive queries
Used the RegEx, JSON, Parquet and Avro SerDes packaged with Hive for serialization and deserialization to parse the contents of streamed log data
Implemented custom Hive and Pig UDFs to achieve comprehensive data analysis
Used Pig to develop ad-hoc queries
Exported the business-required information to RDBMS using Sqoop to make the data available for the BI team to generate reports
Implemented daily workflow for extraction, processing and analysis of data with Oozie
Responsible for troubleshooting Spark/MapReduce jobs by reviewing the log files
Used Tableau for visualization and report generation

Environment: Pivotal HD 2.0, Gemfire XD, MapReduce, Spark, Pig, Hive, Kafka, Sqoop, HBase, Cassandra, Flume, Oozie, Tableau, Aspera, AWS, HCatalog

Confidential, Minneapolis, Minnesota

Hadoop Developer

Responsibilities:

Imported data from our relational data stores to Hadoop using Sqoop.
Created various MapReduce jobs for performing ETL transformations on the transactional and application-specific data sources.
Wrote Pig scripts and executed them using the Grunt shell.
Performed big data analysis using Pig and user-defined functions (UDFs).
Worked on loading tables to Impala for faster retrieval using different file formats.
Tuned the performance of queries in Impala for faster retrieval.
The system was initially developed using Java. The Java filtering program was restructured to have the business rule engine in a JAR that can be called from both Java and Hadoop.
Created Reports and Dashboards using structured and unstructured data.
Upgraded the operating system and/or Hadoop distribution using Puppet as new versions were released.
Performed joins, group-bys and other operations in MapReduce using Java and Pig.
Worked on Amazon Web Services (AWS), a complete set of infrastructure and application services that can run virtually everything in the cloud, from enterprise applications to big data projects.
Processed the output from Pig and Hive and formatted it before sending it to the Hadoop output file.
Used Hive definitions to map the output file to tables.
Set up and benchmarked Hadoop/HBase clusters for internal use
Wrote data ingesters and MapReduce programs
Reviewed the HDFS usage and system design for future scalability and fault tolerance
Wrote MapReduce/HBase jobs
Worked with the HBase NoSQL database.

Environment: Hadoop, Java 1.5, UNIX, Shell Scripting, XML, HDFS, HBase, NoSQL, MapReduce, Hive, Impala, Pig.

Confidential, Bluebell, PA

Hadoop Consultant

Responsibilities:

Responsible for installing and configuring Hadoop MapReduce and HDFS; also developed various MapReduce jobs for data cleaning
Installed and configured Hive to create tables for the unstructured data in HDFS
Strong expertise in major components of the Hadoop ecosystem including Hive, Pig, HBase, HBase-Hive integration, Sqoop and Flume.
Involved in loading data from UNIX file system to HDFS
Responsible for managing and scheduling jobs on Hadoop Cluster
Responsible for importing and exporting data into HDFS and Hive using Sqoop
Experienced in running Hadoop streaming jobs to process terabytes of XML-format data
Experienced in managing Hadoop log files
Worked on managing data coming from different sources
Wrote HQL queries to create tables and loaded data from HDFS to make it structured
Loaded and transformed large sets of structured, semi-structured and unstructured data
Worked extensively with Hive to transform files from different analytical formats into text (.txt) files so the data could be viewed for further analysis
Created Hive tables, loaded them with data and wrote Hive queries that run internally as MapReduce jobs
Wrote and modified stored procedures to load and modify data according to the project requirements
Responsible for developing Pig Latin scripts to extract data from the web server output files and load it into HDFS
Extensively used Flume to collect the log files from the web servers and then integrated these files into HDFS
Responsible for implementing schedulers on the JobTracker to make effective use of the resources available in the cluster for any given MapReduce job.
Constantly worked on tuning the performance of the queries in Hive and Pig, making the queries process and retrieve the data more efficiently
Supported MapReduce programs running on the cluster
Created external tables in Hive and loaded the data into these tables
Hands-on experience in database performance tuning and data modeling
Monitored the cluster coordination using ZooKeeper

Environment: Hadoop, HDFS, MapReduce, Hortonworks, Hive, Java (JDK 1.6), DataStax, Flat files, UNIX Shell Scripting, Oracle 11g/10g, PL/SQL, SQL*Plus, Toad 9.6, Windows NT.

Confidential, Pittsburgh, PA

Sr. Java Developer

Responsibilities:

Developed a detailed design document based on design discussions.
Involved in designing the database tables and Java classes used in the application.
Involved in development, unit testing and system integration testing of the travel network builder side of the application.
Involved in the design, development and building of the travel network file system to be stored on NAS drives.
Set up the Linux environment to interact with the Route-smart library (.so) file and perform NAS drive file operations using JNI.
Implemented and configured Hudson as the continuous integration server and Sonar for maintaining code and removing redundant code.
Worked with Route-smart C++ code to interact with the Java application using SWIG and the Java Native Interface.
Developed the user interface for requesting a travel network build using JSP and Servlets.
Built business logic so users can specify which version of the travel network files is to be used for the solve process.
Used Spring Data Access Object to access the data with the data source.
Built an independent property sub-system to ensure that the request always picks the latest set of properties.
Implemented a thread monitor system to monitor threads. Used JUnit for unit testing around the developed modules.
Wrote SQL queries and procedures for the application and interacted with third-party ESRI functions to retrieve map data.
Built and deployed JAR, WAR and EAR files on dev and QA servers.
Provided bug fixing (using Log4j for logging) and testing support after development.
Prepared requirements and researched moving the map data to the Hadoop framework for future use.

Environment: Java 1.6.21, J2EE, Oracle 10g, Log4J 1.17, Windows 7 and Red Hat Linux, Subversion, Spring 3.1.0, ICEfaces 3, ESRI, WebLogic 10.3.5, Eclipse Juno, JUnit 4.8.2, Maven 3.0.3, Hudson 3.0.0 and Sonar 3.0.0

Confidential

Java Developer

Responsibilities:

Involved in Requirements gathering, Requirement analysis, Design, Development, Integration and Deployment.
Involved in Order Placement / Order Processing module.
Responsible for the design and development of the customizations framework
Designed and developed UIs using JSP following the MVC architecture.
Developed the application using the Struts framework. The views are programmed using JSP pages with the Struts tag library, the model is a combination of EJBs and Java classes, and the web implementation controllers are Servlets.
Used EJB as a middleware in designing and developing a three-tier distributed application.
Used the Java Message Service (JMS) API to allow application components to create, send, receive, and read messages.
Used JUnit for unit testing of the system and Log4J for logging.
Created and maintained data using Oracle database and used JDBC for database connectivity.
Created and implemented Oracle stored procedures and triggers.
Installed WebLogic Server for handling HTTP requests and responses. The request and response from the client are controlled using session tracking in JSP.
Worked on front-end technologies like HTML, JavaScript, CSS and JSP pages using JSTL tags.
Reported daily on the team's progress to the Project Manager and Team Lead.

Environment: Core Java, J2EE 1.3, JSP 1.2, Servlets 2.3, EJB 2.0, Struts 1.1, JNDI 1.2, JDBC 2.1, Oracle 8i, UML, DAO, JMS, XML, WebLogic 7.0, MVC Design Pattern, Eclipse 2.1, Log4j and JUnit.
