Hadoop Developer/Java Developer Resume

Princeton, NJ

PROFESSIONAL SUMMARY:

  • Around 7 years of professional experience in core and enterprise software development using Big Data, Java/J2EE and open source technologies.
  • 3+ years of hands-on experience with Hadoop ecosystem components such as MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, ZooKeeper, Flume, NiFi and Kafka, including their installation and configuration.
  • Experience in AWS, Hortonworks and Cloudera Hadoop distributions.
  • Worked with the case team (client, implementation consultants, system administrators and project manager) to gather business requirements for data migration needs and to identify, define, document and communicate data migration requirements.
  • Assisted with designing, planning and managing the data migration process.
  • Migrated large data sets for both front-office and back-office systems (SaaS and enterprise clients).
  • In-depth knowledge of Hadoop architecture and components such as HDFS, JobTracker, NameNode, DataNode, MapReduce and YARN.
  • Experience writing Hive queries to analyze data in the Hive warehouse using Hive Query Language (HQL).
  • Handled and processed both schema-oriented and schema-less data using Pig.
  • Designed and developed Sqoop scripts to transfer datasets between Hadoop and RDBMS.
  • Experience in extending Hive and Pig core functionality by writing custom UDFs.
  • Hands-on experience extending the core functionality of Hive with custom UDF, UDAF and UDTF implementations (see the UDF sketch after this list).
  • Experienced in data modeling with Hive.
  • Experience using NiFi processors and process groups for data flow management.
  • Implemented data warehousing methodologies for ETL using Informatica Designer, Repository Manager, Workflow Manager, Workflow Monitor and the Repository Server Administration Console.
  • Knowledge of job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
  • Experience using Flume to collect web logs.
  • Actively involved in a successful migration project, completed with thorough testing and without affecting Ab Initio application data or processing.
  • Developed MapReduce jobs to automate data transfer from HBase.
  • Handled different file formats including Parquet, Protocol Buffers, Avro, SequenceFile, JSON, XML and flat files.
  • Experience working with Kafka clusters, as well as with Spark and Spark Streaming.
  • Good knowledge of building event-processing data pipelines using Kafka and Spark Streaming.
  • Configured and maintained different topologies in a Storm cluster and deployed them on a regular basis.
  • Experienced in Apache Spark, implementing advanced procedures such as text analytics and processing with its in-memory computing capabilities, written in Scala.
  • Experience using HCatalog with Hive and Pig.
  • Involved in the ETL process using the Ab Initio tool to set up data extraction from several databases.
  • Wrote a Python module to connect to and view the status of an Apache Cassandra instance.
  • Worked on Apache Spark, writing Python applications to convert TXT and XLS files and parse the data into JSON format.
  • Loaded data into Elasticsearch from the data lake using Spark/Hive.
  • Experienced with NoSQL databases such as HBase, MongoDB and Cassandra.
  • Involved in deploying applications on AWS; proficient with Unix/Linux shell commands.
  • Hands-on experience with various AWS services such as Redshift clusters and Route 53 domain configuration.
  • Involved in loading data from the local (Linux) file system into HDFS.
  • Extracted large data sets, mostly from flat files and Excel sources with a minimum of 250,000 rows, to be loaded to the server; handled files at the maximum of 1,024 columns allowed.
  • Experienced in developing Shell scripts and Python scripts for system management.
  • Involved in data modeling and in sharding and replication strategies in MongoDB.
  • Experience in creating custom Lucene/Solr Query components.
  • Utilized Kafka to load streaming data and performed initial processing and real-time analysis using Storm.
  • Experience developing distributed web and enterprise applications using Java/J2EE technologies (Core Java, JDK 6+).
  • Experience working with Scala on Spark.
  • Excellent programming skills with experience in Java, C, SQL and Python.
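
A minimal sketch of the kind of custom Hive UDF referred to above, assuming the classic org.apache.hadoop.hive.ql.exec.UDF API; the class name, column handling and registration statements are illustrative only, not the original project code:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Illustrative Hive UDF: normalizes a string column to trimmed lower case.
    // Registered in Hive along the lines of:
    //   ADD JAR normalize-udf.jar;
    //   CREATE TEMPORARY FUNCTION normalize_str AS 'com.example.hive.NormalizeUDF';
    public class NormalizeUDF extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;            // preserve NULLs, as Hive expects
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }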

TECHNICAL SKILLS:

Hadoop Core Services: HDFS, MapReduce, Spark, YARN

Hadoop Distribution: Hortonworks, Cloudera

NoSQL Databases: HBase, Cassandra, MongoDB

Hadoop Data Services: Hive, Pig, Impala, Sqoop, Flume, NiFi, Kafka, Storm, Solr

Hadoop Operational Services: Zookeeper, Oozie

Programming Languages & Frameworks: Core Java, Servlets, Hibernate, Spring, Struts, Scala, Python

Databases: Oracle, MySQL, SQL Server

Application Servers: WebLogic, WebSphere, JBoss, Tomcat

Operating Systems: UNIX, Linux, Windows

Development Tools: Microsoft SQL Studio, Eclipse, NetBeans, IntelliJ

PROFESSIONAL EXPERIENCE:

Confidential, Princeton, NJ

Hadoop Developer/Java Developer

Responsibilities:

  • Developed simple and complex MapReduce programs in Java for Data Analysis on different data formats.
  • Involved in data migration activities using Sqoop with JDBC drivers for MySQL.
  • Developed MapReduce programs that filter out bad and unnecessary records and identify unique records based on different criteria.
  • Developed a secondary-sort implementation to receive values already sorted on the reduce side and improve MapReduce performance (see the secondary-sort sketch after this list).
  • Implemented custom Writables, InputFormat, RecordReader, OutputFormat and RecordWriter for MapReduce computations to handle custom business requirements.
  • Implemented MapReduce programs to classify data records into different categories based on record type.
  • Responsible for managing data coming from different sources.
  • Created a NiFi flow to ingest data in real time from MySQL to Salesforce using REST APIs.
  • Created fan-in and fan-out multiplexing flows with Flume.
  • Experience creating ETL jobs to load JSON and server data into MongoDB and to transform MongoDB data into the data warehouse.
  • Experience designing and developing POCs in Scala, deploying them on the YARN cluster and comparing the performance of Spark with Hive and SQL/Teradata.
  • Created Ab Initio graphs that transfer data from various sources like Oracle, flat files and CSV files to the Teradata database and flat files.
  • Worked on SequenceFiles, RCFiles, map-side joins, bucketing and partitioning for Hive performance enhancement and storage improvement.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
  • Implemented daily Oozie coordinator jobs that automate the parallel tasks of loading data into HDFS and pre-processing it with Pig.
  • Performed advanced procedures like text analytics and processing, using the in-memory computing capabilities of Spark using Scala.
  • Responsible for performing extensive data summarization using Hive.
  • Imported data into Spark from a Kafka consumer group using the Spark Streaming APIs (see the Kafka ingestion sketch after this list).
  • Developed Pig UDFs in Java and Python to pre-process the data for analysis.
  • Worked with Sqoop import and export functionality to handle large data set transfers between the Oracle database and HDFS.
  • Derived and modeled the facts, dimensions and aggregated facts in Ab Initio from the data warehouse star schema for billing.
  • Involved in writing curl scripts and background batch and on-demand processes for indexing to Solr using the SolrJ API.
  • Involved in migrating Hive queries into Spark transformations using DataFrames, Spark SQL, SQLContext and Scala.
  • Developed a data pipeline using Kafka and Storm to store data into HDFS.
  • Involved in deploying the applications on AWS; proficient with Unix/Linux shell commands.
  • Worked with JSON for data exchange between client and server.
  • Extensively used Spring & Hibernate Frameworks and implemented MVC architecture.
  • Worked with Spring for RESTful services and dependency injection.
  • Stored and retrieved NoSQL data in MongoDB using DAOs.
  • Implemented the business logic layer for MongoDB services.
  • Implemented test scripts to support test driven development and continuous integration.
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
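
A minimal sketch of the secondary-sort pattern referenced in the bullets above, assuming the standard org.apache.hadoop.mapreduce API; the composite key, value types and class names are illustrative rather than the production implementation:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparable;
    import org.apache.hadoop.io.WritableComparator;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Partitioner;

    // Composite key: natural key (e.g. a customer id) plus the value we want sorted.
    public class CompositeKey implements WritableComparable<CompositeKey> {
        private final Text naturalKey = new Text();
        private final IntWritable secondary = new IntWritable();

        public void set(String key, int value) {
            naturalKey.set(key);
            secondary.set(value);
        }

        @Override public void write(DataOutput out) throws IOException {
            naturalKey.write(out);
            secondary.write(out);
        }

        @Override public void readFields(DataInput in) throws IOException {
            naturalKey.readFields(in);
            secondary.readFields(in);
        }

        // Sort by natural key first, then by the secondary value.
        @Override public int compareTo(CompositeKey other) {
            int cmp = naturalKey.compareTo(other.naturalKey);
            return (cmp != 0) ? cmp : secondary.compareTo(other.secondary);
        }

        public Text getNaturalKey() { return naturalKey; }

        // Partition on the natural key only, so all records for a key reach one reducer.
        public static class NaturalKeyPartitioner extends Partitioner<CompositeKey, Text> {
            @Override public int getPartition(CompositeKey key, Text value, int numPartitions) {
                return (key.getNaturalKey().hashCode() & Integer.MAX_VALUE) % numPartitions;
            }
        }

        // Group on the natural key only, so one reduce() call sees all (already sorted) values.
        public static class NaturalKeyGroupingComparator extends WritableComparator {
            public NaturalKeyGroupingComparator() { super(CompositeKey.class, true); }
            @Override public int compare(WritableComparable a, WritableComparable b) {
                return ((CompositeKey) a).getNaturalKey().compareTo(((CompositeKey) b).getNaturalKey());
            }
        }

        // Driver-side wiring of the pieces into a job.
        public static void configure(Job job) {
            job.setMapOutputKeyClass(CompositeKey.class);
            job.setPartitionerClass(NaturalKeyPartitioner.class);
            job.setGroupingComparatorClass(NaturalKeyGroupingComparator.class);
        }
    }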
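A minimal sketch of ingesting a Kafka topic through the Spark Streaming APIs, assuming the spark-streaming-kafka-0-10 direct-stream integration; the broker address, consumer group, topic name and HDFS output path are placeholders:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;

    public class KafkaSparkStreamingSketch {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("kafka-ingest");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            // Consumer-group settings; broker list, group id and topic are placeholders.
            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "broker1:9092");
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "example-consumer-group");
            kafkaParams.put("auto.offset.reset", "latest");

            JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                    jssc,
                    LocationStrategies.PreferConsistent(),
                    ConsumerStrategies.<String, String>Subscribe(
                        Collections.singletonList("events"), kafkaParams));

            // Pull out the message payloads and persist each micro-batch to HDFS.
            stream.map(ConsumerRecord::value)
                  .foreachRDD(rdd -> rdd.saveAsTextFile(
                      "hdfs:///data/events/" + System.currentTimeMillis()));

            jssc.start();
            jssc.awaitTermination();
        }
    }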

Environment: Hadoop, CDH4, MapReduce, HDFS, Pig, Hive, Impala, Oozie, Java, Kafka, Storm, Linux, Maven, Oracle 11g/10g, SVN, MongoDB, Informatica.

Confidential, Waltham, MA

Hadoop Developer

Responsibilities:

  • Involved in the complete software development life cycle (SDLC) to develop the application.
  • Worked on analyzing the Hadoop cluster and different big data analytics tools including Pig, HBase and Sqoop.
  • Involved in loading data from LINUX file system to HDFS.
  • Developed custom and indexed search results using Apache Solr.
  • Involved in creating the test plan and testing module for all the Java code in these projects, as well as load testing of the Solr server and search service.
  • Developed MapReduce jobs in Python for data cleaning and data processing.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Integrated Spark with NoSQL databases such as HBase and Cassandra, and with Kafka message brokering, on Cloudera.
  • Implemented test scripts to support test driven development and continuous integration.
  • Performed performance tuning for Spark Streaming, e.g. setting the right batch interval, the correct level of parallelism, the right serialization and memory tuning (see the tuning sketch after this list).
  • Developed multiple MapReduce jobs in java for data cleansing and preprocessing.
  • Developed Spark jobs using Scala in test environment for faster data processing and used Spark SQL for querying.
  • Created Pig Latin scripts to sort, group, join and filter the enterprise-wide data to get transformed data sets.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Worked on tuning the performance of Pig scripts.
  • Mentored the analyst and test teams in writing Hive queries.
  • Installed the Oozie workflow engine to run multiple MapReduce jobs.
  • Migrated an existing on-premises application to AWS. Used AWS services like EC2 and S3 for small data sets.
  • Hands-on experience with various AWS services such as Redshift clusters and Route 53 domain configuration.
  • Used CloudWatch Logs to move application logs to S3 and created alarms based on exceptions raised by applications.
  • Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
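
A minimal sketch of the kind of Spark Streaming tuning described above, assuming Kryo serialization and explicit batch-interval and parallelism settings; the property values shown are illustrative, not the actual production numbers:

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    public class StreamingTuningSketch {
        public static JavaStreamingContext buildContext() {
            SparkConf conf = new SparkConf()
                .setAppName("tuned-streaming-job")
                // Kryo is usually faster and more compact than Java serialization.
                .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
                // Shuffle parallelism, sized to roughly 2-3x the total executor cores.
                .set("spark.default.parallelism", "48")
                // Cap the per-partition Kafka ingest rate so batches finish within the interval.
                .set("spark.streaming.kafka.maxRatePerPartition", "1000")
                // Trade some cache memory for execution memory if batches spill.
                .set("spark.memory.fraction", "0.6");

            // Batch interval chosen so that processing time stays below it.
            return new JavaStreamingContext(conf, Durations.seconds(5));
        }
    }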

Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Linux, Java, Oozie, Cassandra.

Confidential, San Jose, CA

Hadoop Developer/Java Developer

Responsibilities:

  • Extensively involved in the installation and configuration of the Cloudera distribution: NameNode, Secondary NameNode, JobTracker, TaskTrackers and DataNodes.
  • Installed and configured Hadoop ecosystem components such as HBase, Flume, Pig and Sqoop.
  • Involved in Hadoop cluster tasks such as adding and removing nodes without any effect on running jobs or data.
  • Managed and reviewed Hadoop Log files.
  • Loaded log data into HDFS using Flume. Worked extensively on creating MapReduce jobs to power data for search and aggregation.
  • Worked extensively with Sqoop for importing metadata from Oracle.
  • Designed a data warehouse using Hive.
  • Created partitioned tables in Hive.
  • Mentored the analyst and test teams in writing Hive queries.
  • Extensively used Pig for data cleansing.
  • Developed Pig Latin scripts to extract data from the web server output files and load it into HDFS.
  • Developed Pig UDFs to pre-process the data for analysis (see the UDF sketch after this list).
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Good exposure to and knowledge of coordination services through ZooKeeper.
  • Expertise in using the Spring, JSF, EJB, Hibernate and Struts frameworks.
  • Expertise in using development tools such as Eclipse, MyEclipse and NetBeans.
  • Excellent back-end SQL programming skills using Oracle with PL/SQL and Microsoft SQL Server.
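
A minimal sketch of a Pig UDF of the kind referenced above, assuming the org.apache.pig.EvalFunc API; the class name, field handling and registration statements are illustrative only:

    import java.io.IOException;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Illustrative Pig UDF: trims and lower-cases a chararray field before analysis.
    // Registered in Pig along the lines of:
    //   REGISTER clean-udf.jar;
    //   cleaned = FOREACH raw GENERATE com.example.pig.CleanField(line);
    public class CleanField extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;                    // pass NULLs through untouched
            }
            return input.get(0).toString().trim().toLowerCase();
        }
    }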

Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, ZooKeeper, Oozie, Core Java, Spring MVC, Hibernate, UNIX Shell Scripting.

Confidential

Java Developer

Responsibilities:

  • Developed documentation for new and existing programs and designed specific enhancements to the application.
  • Implemented web layer using JSF.
  • Implemented business layer using Spring MVC.
  • Implemented report retrieval based on start date using SQL.
  • Implemented session management using the SessionFactory in Hibernate (see the sketch after this list).
  • Developed the DOs and DAOs using Hibernate.
  • Hands-on experience consuming data from RESTful web services using JSON.
  • Developed RESTful web services backed by Hibernate.
  • Implemented a SOAP web service to validate ZIP codes using Apache Axis.
  • Built SOAP and RESTful services.
  • Wrote complex queries, PL/SQL Stored Procedures, Functions and Packages to implement Business Rules.
  • Wrote a PL/SQL program to send email to a group from the back end.
  • Developed scripts triggered monthly to produce the current monthly analysis.
  • Scheduled Jobs to be triggered on a specific day and time.
  • Modified SQL statements to increase the overall performance as a part of basic performance tuning and exception handling.
  • Used cursors, arrays, tables and BULK COLLECT concepts.
  • Extensively used Log4j for logging.
  • Performed UNIT testing in all the environments.
  • Used Subversion as the version control system.
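
A minimal sketch of SessionFactory-based session management of the kind referenced above, assuming a classic Hibernate 3-style bootstrap from hibernate.cfg.xml; the helper class and method names are illustrative only:

    import org.hibernate.Session;
    import org.hibernate.SessionFactory;
    import org.hibernate.Transaction;
    import org.hibernate.cfg.Configuration;

    // Illustrative session management around a single shared SessionFactory.
    public final class HibernateUtil {
        // Built once from hibernate.cfg.xml and reused across the application.
        private static final SessionFactory SESSION_FACTORY =
            new Configuration().configure().buildSessionFactory();

        private HibernateUtil() { }

        public static void saveInTransaction(Object entity) {
            Session session = SESSION_FACTORY.openSession();
            Transaction tx = null;
            try {
                tx = session.beginTransaction();
                session.save(entity);          // DAOs delegate to helpers like this one
                tx.commit();
            } catch (RuntimeException e) {
                if (tx != null) tx.rollback(); // never leave a transaction dangling
                throw e;
            } finally {
                session.close();
            }
        }
    }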

Environment: Java 1.4.2, Spring MVC, JMS, Java Mail API 1.3, Hibernate, HTML, CSS, JSF, JavaScript, JUnit, RAD, Web services, UNIX

Confidential

Java/J2EE Developer

Responsibilities:

  • Involved in all the phases of the life cycle of the project from requirements gathering to quality assurance testing.
  • Developed Class diagrams, Sequence diagrams using Rational Rose.
  • Responsible for developing rich web interface modules with Struts tags, JSP, JSTL, CSS, JavaScript, Ajax and GWT.
  • Developed the presentation layer using the Struts framework and performed validations using the Struts Validator plugin.
  • Created SQL scripts for the Oracle database.
  • Implemented the business logic using Java, Spring Transactions and Spring AOP.
  • Implemented persistence layer using Spring JDBC to store and update data in database.
  • Produced a web service using the WSDL/SOAP standard.
  • Implemented J2EE design patterns such as the Singleton and Factory patterns.
  • Extensively involved in the creation of session beans and MDBs using EJB 3.0.
  • Used Hibernate framework for Persistence layer.
  • Extensively involved in writing Stored Procedures for data retrieval and data storage and updates in Oracle database using Hibernate.
  • Deployed and built the application using Maven.
  • Performed unit testing with JUnit.
  • Used JIRA to track bugs.
  • Extensively used Log4j for logging throughout the application.
  • Produced a RESTful web service using the Jersey implementation to provide customer information (see the sketch after this list).
  • Used SVN for source code versioning and code repository.
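
A minimal sketch of a Jersey-backed RESTful resource of the kind referenced above, assuming the standard JAX-RS annotations; the resource path, JSON payload and class name are illustrative only:

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.MediaType;
    import javax.ws.rs.core.Response;

    // Illustrative JAX-RS resource exposing customer information, e.g. GET /customers/42
    @Path("/customers")
    public class CustomerResource {

        @GET
        @Path("/{id}")
        @Produces(MediaType.APPLICATION_JSON)
        public Response getCustomer(@PathParam("id") String id) {
            // In the real service this would come from the persistence layer (Hibernate DAO).
            String json = "{\"id\": \"" + id + "\", \"name\": \"placeholder\"}";
            return Response.ok(json).build();
        }
    }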

Environment: Java (JDK 1.5), J2EE, Eclipse, JSP, JavaScript, JSTL, Ajax, GWT, Log4j, CSS, XML, Spring, EJB, MDB, Hibernate, WebLogic, REST, Rational Rose, JUnit, Maven, JIRA, SVN.
