Hadoop Developer/Java Developer Resume
Princeton, NJ
PROFESSIONAL SUMMARY:
- Around 7 years of professional experience in core and enterprise software development using Big Data, Java/J2EE and open source technologies.
- 3+ years of hands-on experience with Hadoop ecosystem components such as MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, ZooKeeper, Flume, NiFi and Kafka, including their installation and configuration.
- Experience with the AWS, Hortonworks and Cloudera Hadoop distributions.
- Worked with the case team (client, implementation consultants, system administrators and project manager) to gather business requirements for data migration and to identify, define, document and communicate those requirements.
- Assisted with designing, planning and managing the data migration process.
- Migrated large data sets for both front-office and back-office systems (SaaS and enterprise clients).
- In-depth knowledge of Hadoop architecture and components such as HDFS, JobTracker, NameNode, DataNode, MapReduce and YARN.
- Experience in writing Hive queries to analyze data in the Hive warehouse using Hive Query Language (HQL).
- Handled and processed both schema-oriented and non-schema-oriented data using Pig.
- Designed and developed Sqoop scripts to transfer data sets between Hadoop and RDBMS.
- Experience in extending Hive and Pig core functionality by writing custom UDFs, UDAFs and UDTFs (a minimal Hive UDF sketch follows this summary).
- Experienced in data modeling with Hive.
- Experience using NiFi processors and process groups for data flow management.
- Implemented data warehousing methodologies for ETL using Informatica Designer, Repository Manager, Workflow Manager, Workflow Monitor and Repository Server Administration Console.
- Knowledge of job workflow scheduling and coordination tools such as Oozie and ZooKeeper.
- Experience in using Flume to collect weblogs.
- Actively involved in a successful migration project, with thorough testing to ensure Ab Initio application data and processing were unaffected.
- Developed MapReduce jobs to automate data transfer from HBase.
- Handled different file formats including Parquet, Protocol Buffers, Avro, SequenceFile, JSON, XML and flat files.
- Experience working with Kafka clusters as well as Spark and Spark Streaming.
- Good knowledge of creating event-processing data pipelines using Kafka and Spark Streaming.
- Configured and maintained different topologies in a Storm cluster and deployed them on a regular basis.
- Experienced in Apache Spark for implementing advanced procedures such as text analytics and processing, using its in-memory computing capabilities with Scala.
- Experience using HCatalog with Hive and Pig.
- Involved in the ETL process using the Ab Initio tool to set up data extraction from several databases.
- Wrote a Python module to connect and view the status of an Apache Cassandra instance.
- Worked on Apache Spark, writing Python applications to convert TXT and XLS files and parse the data into JSON format.
- Loaded data into Elasticsearch from the data lake using Spark/Hive.
- Experienced in NoSQL databases such as HBase, MongoDB and Cassandra.
- Involved in deploying applications in AWS; proficient in Unix/Linux shell commands.
- Hands-on experience with AWS services such as Redshift clusters and Route 53 domain configuration.
- Involved in loading data from the local (Linux) file system to HDFS.
- Extracted large data sets, mostly from flat files and Excel sources with a minimum of 250,000 rows, to be loaded to the server; handled files with the maximum allowed 1,024 columns.
- Experienced in developing Shell scripts and Python scripts for system management.
- Involved in data modeling, sharding and replication strategies in MongoDB.
- Experience in creating custom Lucene/Solr Query components.
- Utilized Kafka for loading streaming data and performed initial processing and real-time analysis using Storm.
- Experience in developing distributed web and enterprise applications using Java/J2EE technologies (Core Java, JDK 6+).
- Experience working with Scala on Spark.
- Excellent programming skills with experience in Java, C, SQL and Python.
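The Hive UDF bullet above references the following minimal sketch. It shows the classic single-argument UDF pattern in Java; the class name CleanTextUDF and the trim/upper-case rule are illustrative assumptions, not code from any of the projects below.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Minimal Hive UDF sketch: trims and upper-cases a string column.
// The class name and the cleanup rule are illustrative assumptions.
public final class CleanTextUDF extends UDF {
    public Text evaluate(final Text input) {
        if (input == null) {
            return null; // pass NULLs through unchanged
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}
```

Once packaged into a JAR, the function would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION and then used like any built-in function in HQL.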
TECHNICAL SKILLS:
Hadoop Core Services: HDFS, MapReduce, Spark, YARN
Hadoop Distribution: Hortonworks, Cloudera
NoSQL Databases: HBase, Cassandra, MongoDB
Hadoop Data Services: Hive, Pig, Impala, Sqoop, Flume, NiFi, Kafka, Storm, Solr
Hadoop Operational Services: Zookeeper, Oozie
Programming Languages and Frameworks: Core Java, Servlets, Hibernate, Spring, Struts, Scala, Python
Databases: Oracle, MySQL, SQL Server
Application Servers: WebLogic, WebSphere, JBoss, Tomcat
Operating Systems: UNIX, Windows, LINUX
Development Tools: Microsoft SQL Studio, Eclipse, NetBeans, IntelliJ
PROFESSIONAL EXPERIENCE:
Confidential, Princeton, NJ
Hadoop Developer/Java Developer
Responsibilities:
- Developed simple and complex MapReduce programs in Java for Data Analysis on different data formats.
- Involved in data migration activities using Sqoop with JDBC drivers for MySQL.
- Developed MapReduce programs that filter out bad and unnecessary records and identify unique records based on different criteria.
- Developed a secondary-sort implementation to receive sorted values on the reduce side and improve MapReduce performance.
- Implemented custom Writables, InputFormats, RecordReaders, OutputFormats and RecordWriters for MapReduce computations to handle custom business requirements.
- Implemented MapReduce programs to classify data records into different categories based on record type.
- Responsible for managing data coming from different sources.
- Created a NiFi flow to ingest data in real time from MySQL to Salesforce using REST APIs.
- Created fan-in and fan-out multiplexing flows with Flume.
- Created ETL jobs to load JSON and server data into MongoDB and transformed MongoDB data for the data warehouse.
- Designed and developed POCs in Scala, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
- Created Ab Initio graphs that transfer data from various sources like Oracle, flat files and CSV files to the Teradata database and flat files.
- Worked with SequenceFiles, RCFiles, map-side joins, bucketing and partitioning for Hive performance enhancement and storage improvement.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
- Implemented daily Oozie coordinator jobs that automate the parallel tasks of loading data into HDFS and pre-processing it with Pig.
- Performed advanced procedures like text analytics and processing, using the in-memory computing capabilities of Spark using Scala.
- Responsible for performing extensive data summarization using Hive.
- Imported data into Spark from a Kafka consumer group using the Spark Streaming APIs (a minimal sketch follows this list).
- Developed Pig UDFs in Java and Python to pre-process the data for analysis.
- Worked with Sqoop import and export to handle large data set transfers between the Oracle database and HDFS.
- Derived and modeled facts, dimensions and aggregated facts in Ab Initio from the data warehouse star schema for billing.
- Involved in writing curl scripts and background batch and on-demand processes for indexing to Solr using the SolrJ API.
- Involved in migrating Hive queries into Spark transformations using DataFrames, Spark SQL, SQLContext and Scala.
- Developed a data pipeline using Kafka and Storm to store data into HDFS.
- Involved in deploying the applications in AWS; proficient in Unix/Linux shell commands.
- Worked with JSON for data exchange between client and server.
- Extensively used the Spring and Hibernate frameworks and implemented MVC architecture.
- Worked with Spring for RESTful services and dependency injection.
- Stored and retrieved NoSQL data in MongoDB using DAOs.
- Implemented the business logic layer for MongoDB services.
- Implemented test scripts to support test driven development and continuous integration.
- Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
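As referenced in the Kafka/Spark Streaming bullet above, the following is a minimal sketch of consuming a Kafka topic from Spark Streaming through the Java API of the spark-streaming-kafka-0-10 integration. The broker address, consumer group and topic name are placeholders, and the per-batch record count stands in for the real processing logic.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToSparkStreaming {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("kafka-to-spark-streaming");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092"); // placeholder broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "demo-consumer-group");   // hypothetical group id
        kafkaParams.put("auto.offset.reset", "latest");

        // Direct stream over a hypothetical "events" topic.
        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Arrays.asList("events"), kafkaParams));

        // Stand-in processing: count the records in each micro-batch.
        stream.map(ConsumerRecord::value)
              .count()
              .print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```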
Environment: Hadoop, CDH4, MapReduce, HDFS, Pig, Hive, Impala, Oozie, Java, Kafka, Storm, Linux, Maven, Oracle 11g/10g, SVN, MongoDB, Informatica.
Confidential, Waltham, MA
Hadoop Developer
Responsibilities:
- Involved in the complete software development life cycle (SDLC) of the application.
- Worked on analyzing the Hadoop cluster and different big data analytic tools including Pig, HBase and Sqoop.
- Involved in loading data from the Linux file system to HDFS.
- Developed custom, indexed search results using Apache Solr.
- Involved in creating the test plan and test modules for all Java code in these projects, and in load testing the Solr server and search service.
- Developed MapReduce jobs in Python for data cleaning and processing.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Integrated Spark with NoSQL databases such as HBase and Cassandra and with the Kafka message broker on Cloudera.
- Implemented test scripts to support test driven development and continuous integration.
- Performed performance tuning for Spark Streaming, e.g. setting the right batch interval, choosing the correct level of parallelism, selecting the right serialization and tuning memory.
- Developed multiple MapReduce jobs in Java for data cleansing and preprocessing (see the sketch after this list).
- Developed Spark jobs using Scala in test environment for faster data processing and used Spark SQL for querying.
- Created Pig Latin scripts to sort, group, join and filter enterprise-wide data to produce transformed data sets.
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Worked on tuning the performance of Pig scripts.
- Mentored the analyst and test teams in writing Hive queries.
- Installed the Oozie workflow engine to run multiple MapReduce jobs.
- Migrated an existing on-premises application to AWS. Used AWS services like EC2 and S3 for small data sets.
- Hands-on experience with AWS services such as Redshift clusters and Route 53 domain configuration.
- Used CloudWatch Logs to move application logs to S3 and created alarms based on exceptions raised by applications.
- Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
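As referenced in the data-cleansing bullet above, a minimal sketch of a map-only MapReduce job in Java that drops malformed delimited records; the comma delimiter, expected field count and counter names are assumptions for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Map-only cleansing job: keep records with the expected column count, count the rest.
public class CleanseJob {

    public static class CleanseMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        private static final int EXPECTED_FIELDS = 12; // hypothetical record layout

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",", -1);
            if (fields.length == EXPECTED_FIELDS) {
                context.write(NullWritable.get(), value);                    // keep good record
            } else {
                context.getCounter("cleansing", "bad_records").increment(1); // track rejects
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "data-cleansing");
        job.setJarByClass(CleanseJob.class);
        job.setMapperClass(CleanseMapper.class);
        job.setNumReduceTasks(0); // map-only
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```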
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Linux, Java, Oozie, Cassandra.
Confidential, San Jose, CA
Hadoop Developer/Java Developer
Responsibilities:
- Extensively involved in the installation and configuration of the Cloudera distribution, NameNode, Secondary NameNode, JobTracker, TaskTrackers and DataNodes.
- Installed and configured Hadoop ecosystem components such as HBase, Flume, Pig and Sqoop.
- Involved in Hadoop cluster tasks such as adding and removing nodes without affecting running jobs or data.
- Managed and reviewed Hadoop Log files.
- Loaded log data into HDFS using Flume and worked extensively on creating MapReduce jobs to power data for search and aggregation.
- Worked extensively with Sqoop for importing metadata from Oracle.
- Designed a data warehouse using Hive.
- Created partitioned tables in Hive.
- Mentored the analyst and test teams in writing Hive queries.
- Extensively used Pig for data cleansing.
- Developed Pig Latin scripts to extract data from the web server output files and load it into HDFS.
- Developed Pig UDFs to pre-process the data for analysis (a minimal sketch follows this list).
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Good exposure to and knowledge of coordination services through ZooKeeper.
- Expertise in using the Spring, JSF, EJB, Hibernate and Struts frameworks.
- Expertise in using development tools such as Eclipse, MyEclipse and NetBeans.
- Excellent back-end SQL programming skills using MS SQL Server and Oracle with PL/SQL.
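As referenced in the Pig UDF bullet above, a minimal sketch of a Java EvalFunc that normalizes a text field before analysis; the class name NormalizeField and the normalization rule are illustrative assumptions.

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Minimal Pig UDF sketch: lower-cases and trims the first field of the input tuple.
public class NormalizeField extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null; // pass empty/NULL tuples through
        }
        return input.get(0).toString().trim().toLowerCase();
    }
}
```

In a Pig script the packaged JAR would be REGISTERed and the function applied inside a FOREACH ... GENERATE statement.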
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, ZooKeeper, Oozie, Core Java, Spring MVC, Hibernate, UNIX Shell Scripting.
Confidential
Java Developer
Responsibilities:
- Developed documentation for new and existing programs and designed specific enhancements to the application.
- Implemented web layer using JSF.
- Implemented business layer using Spring MVC.
- Implemented report retrieval based on start date using SQL.
- Implemented session management using SessionFactory in Hibernate (see the DAO sketch after this list).
- Developed the DOs and DAOs using Hibernate.
- Hands-on experience consuming data from RESTful web services using JSON.
- RESTful web service development using Hibernate.
- Implemented a SOAP web service to validate zip codes using Apache Axis.
- Built SOAP and RESTful services.
- Wrote complex queries, PL/SQL Stored Procedures, Functions and Packages to implement Business Rules.
- Wrote a PL/SQL program to send email to a group from the back end.
- Developed scripts triggered monthly to produce the current monthly analysis.
- Scheduled Jobs to be triggered on a specific day and time.
- Modified SQL statements to increase the overall performance as a part of basic performance tuning and exception handling.
- Used cursors, arrays, tables and BULK COLLECT concepts.
- Extensively used Log4j for application logging.
- Performed UNIT testing in all the environments.
- Used Subversion as the version control system.
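As referenced in the Hibernate session-management bullet above, a minimal DAO sketch built around a single shared SessionFactory in the classic Hibernate style; it assumes a hibernate.cfg.xml with entity mappings on the classpath, and the class name GenericDao is illustrative.

```java
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;
import org.hibernate.cfg.Configuration;

// Minimal DAO sketch sharing one SessionFactory across calls (classic Hibernate style).
// Assumes hibernate.cfg.xml and entity mappings are available on the classpath.
public class GenericDao {

    private static final SessionFactory SESSION_FACTORY =
            new Configuration().configure().buildSessionFactory();

    public void save(Object entity) {
        Session session = SESSION_FACTORY.openSession();
        Transaction tx = session.beginTransaction();
        try {
            session.save(entity); // persist any mapped entity
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();        // roll back on failure
            throw e;
        } finally {
            session.close();
        }
    }
}
```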
Environment: Java 1.4.2, Spring MVC, JMS, Java Mail API 1.3, Hibernate, HTML, CSS, JSF, JavaScript, JUnit, RAD, Web Services, UNIX
Confidential
Java/J2ee Developer
Responsibilities:
- Involved in all the phases of the life cycle of the project from requirements gathering to quality assurance testing.
- Developed Class diagrams, Sequence diagrams using Rational Rose.
- Responsible for developing rich web interface modules with Struts tags, JSP, JSTL, CSS, JavaScript, Ajax and GWT.
- Developed the presentation layer using the Struts framework and performed validations using the Struts Validator plugin.
- Created SQL scripts for the Oracle database.
- Implemented the business logic using Java, Spring transactions and Spring AOP.
- Implemented persistence layer using Spring JDBC to store and update data in database.
- Produced a web service using the WSDL/SOAP standard.
- Implemented J2EE design patterns like Singleton Pattern with Factory Pattern.
- Extensively involved in the creation of the Session Beans and MDB, using EJB 3.0.
- Used Hibernate framework for Persistence layer.
- Extensively involved in writing Stored Procedures for data retrieval and data storage and updates in Oracle database using Hibernate.
- Deployed and built the application using Maven.
- Performed unit testing using JUnit.
- Used JIRA to track bugs.
- Extensively used Log4j for logging throughout the application.
- Produced a RESTful web service with the Jersey implementation to provide customer information (a minimal sketch follows this list).
- Used SVN for source code versioning and code repository.
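As referenced in the Jersey bullet above, a minimal sketch of a JAX-RS resource that returns customer information as JSON. The resource path, the Customer fields and the stubbed lookup are illustrative assumptions, and a JSON provider (e.g. Jackson) is assumed to be registered with Jersey.

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

// Minimal JAX-RS resource sketch served by Jersey; the lookup is stubbed for illustration.
@Path("/customers")
public class CustomerResource {

    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public Customer getCustomer(@PathParam("id") String id) {
        // A real implementation would delegate to a DAO; a placeholder is returned here.
        return new Customer(id, "PLACEHOLDER_NAME");
    }

    // Simple bean serialized to JSON by the configured provider.
    public static class Customer {
        private String id;
        private String name;

        public Customer() {
        }

        public Customer(String id, String name) {
            this.id = id;
            this.name = name;
        }

        public String getId() {
            return id;
        }

        public String getName() {
            return name;
        }
    }
}
```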
Environment: Java (JDK 1.5), J2EE, Eclipse, JSP, JavaScript, JSTL, Ajax, GWT, Log4j, CSS, XML, Spring, EJB, MDB, Hibernate, WebLogic, REST, Rational Rose, JUnit, Maven, JIRA, SVN.