Senior Hadoop Developer Resume
Indianapolis, IN
SUMMARY:
- Around 7 years of professional experience, including analysis, design, development, integration, deployment and maintenance of quality software applications using Java/J2EE technologies and Big Data Hadoop technologies.
- Working experience in data analysis and data mining using the Big Data stack.
- Proficiency in Java, Hadoop MapReduce, Pig, Hive, Oozie, Sqoop, Flume, HBase, Scala, Spark, Kafka, Storm, Impala and NoSQL databases.
- High exposure to Big Data technologies and the Hadoop ecosystem, with an in-depth understanding of MapReduce and the Hadoop infrastructure.
- Excellent knowledge of Hadoop architecture and ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, YARN and the MapReduce programming paradigm.
- Good exposure to column-oriented NoSQL databases such as HBase and Cassandra.
- Extensive experience working with semi-structured and unstructured data by implementing complex MapReduce programs using design patterns.
- Extensive experience writing custom MapReduce programs for data processing and UDFs for both Hive and Pig in Java (see the Hive UDF sketch after this list).
- Strong experience analyzing large data sets by writing Pig scripts and Hive queries.
- Extensive experience working with structured data using HiveQL and join operations, writing custom UDFs and optimizing Hive queries.
- Experience importing and exporting data between HDFS and relational databases using Sqoop.
- Experienced in job workflow scheduling and monitoring tools like Oozie.
- Experience with Apache Flume for collecting, aggregating and moving large volumes of data from various sources such as web servers and telnet sources.
- Hands-on experience with major Big Data components: Apache Kafka, Apache Spark, ZooKeeper and Avro.
- Experienced in implementing unified data platforms using Kafka producers/consumers and implementing pre-processing with Storm topologies.
- Experienced in migrating MapReduce programs to Spark RDD transformations and actions to improve performance.
- Experience using Big Data with ETL (Extract, Transform and Load) tools: Talend Open Studio and Informatica.
- Strong experience architecting real-time streaming applications and batch-style large-scale distributed computing applications using tools like Spark Streaming, Spark SQL, Kafka, Flume, MapReduce and Hive.
- Experience using various Hadoop distributions (Cloudera, Hortonworks, MapR etc.) to fully implement and leverage new Hadoop features.
- Worked on custom Pig Loaders and Storage classes to work with a variety of data formats such as JSON and compressed CSV.
- Good knowledge of Amazon AWS concepts like EMR and EC2 web services, which provide fast and efficient processing of Big Data.
- Experienced in working with different scripting technologies like Python and Unix shell scripts.
- Experience on Source control repositories like SVN, CVS and GIT.
- Strong experience working in UNIX/Linux environments and writing shell scripts.
- Skilled at building and deploying multi-module applications using Maven and Ant, and CI servers like Jenkins.
- Adequate knowledge and working experience in Agile & Waterfall methodologies.
- Excellent problem-solving and analytical skills.
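The summary above references custom Hive UDFs written in Java; below is a minimal sketch of that pattern using the classic org.apache.hadoop.hive.ql.exec.UDF base class. The class name, the normalization rule and the column it would be applied to are hypothetical, chosen only for illustration.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF: normalizes free-text state values ("indiana ", "IN.")
// to a canonical two-letter code before downstream joins.
public final class NormalizeState extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // preserve NULLs, as Hive expects
        }
        String s = input.toString().trim().toUpperCase();
        if (s.startsWith("INDIANA") || s.equals("IN.")) {
            s = "IN";
        }
        return new Text(s);
    }
}
```

Packaged into a JAR, a UDF like this would typically be registered in HiveQL with ADD JAR followed by CREATE TEMPORARY FUNCTION normalize_state AS 'NormalizeState'.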
TECHNICAL SKILLS:
Hadoop/Big Data Technologies: HDFS, MapReduce, Sqoop, Flume, Pig, Hive, Oozie, Impala, Apache NiFi, ZooKeeper, Cloudera Manager, Ambari.
NoSQL Database: MongoDB, Cassandra
Real Time/Stream processing: Apache Storm, Apache Spark
Distributed message broker: Apache Kafka
Monitoring and Reporting: Tableau, Custom shell scripts.
Hadoop Distribution: Hortonworks, Cloudera, MapR.
Build Tools: Maven, SQL Developer
Programming & Scripting: Java, C, SQL, Shell Scripting, Python
Databases: Oracle, MySQL, MS SQL Server
Web Dev. Technologies: HTML, XML, JSON, CSS, jQuery, JavaScript, AngularJS
Tools & Utilities: Eclipse, MQ Explorer, RFHUtil, SSRS, Aqua Data Studio, XMLSpy, ETL (Talend)
Operating Systems: Linux, Unix, Mac OS-X, Windows 10, Windows 8, Windows 7, Windows Server 2008/2003
PROFESSIONAL EXPERIENCE:
Senior Hadoop Developer
Confidential, Indianapolis, IN
Responsibilities:
- Provided a solution using Hive and Sqoop (to export/import data) for faster data loads by replacing the traditional ETL process with HDFS for loading data into target tables.
- Maintained and monitored the Hive data warehouse: creating tables, distributing data by implementing partitioning and bucketing, and writing and optimizing HiveQL queries.
- Designed Pig Latin scripts to sort, group, join and filter the data as part of data transformation as per the business requirements.
- Merged data files and loaded them into HDFS using Java code; tracking history related to the merged files was maintained in HBase.
- Collaborated with the Data Warehouse team to design and develop required ETL processes and to performance-tune ETL programs/scripts.
- Created Hive tables and worked on them using HiveQL.
- Wrote Apache Pig scripts to process HDFS data.
- Created Java UDFs in Pig and Hive.
- Involved in the analysis of the specifications from the client and actively participated in SRS Documentation.
- Knowledge of handling Hive queries through Spark SQL integrated with the Spark environment.
- Developed scripts and scheduled Autosys jobs to filter the data.
- Implemented a near-real-time data pipeline using a framework based on Kafka, Spark and MemSQL (see the sketch after this list).
- Monitored Autosys file-watcher jobs, tested data for each transaction and verified whether each job ran properly.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Implemented Object-Relational mapping in the persistence layer using Hibernate Framework in conjunction with Spring Functionality.
- Involved in the planning process of iterations under the Agile Scrum methodology.
- Involved in writing PL/SQL, SQL queries.
- Involved in testing the Business Logic layer and Data Access layer using JUnit.
- Used Scala to test DataFrame transformations and debug data issues.
- Used Oracle DB for writing SQL scripts, PL/SQL code for procedures and functions.
- Wrote JUnit test cases to test the functionality of each method in the DAO layer. Configured and deployed the WebSphere Application Server.
- Prepared technical reports and documentation manuals for efficient program development.
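The near-real-time pipeline bullet above names Kafka, Spark and MemSQL; the following is a minimal sketch of that pattern using the spark-streaming-kafka-0-10 Java API. The broker address, topic name and group id are assumptions, and the MemSQL write is stubbed as a comment rather than a real JDBC sink.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class NearRealTimePipeline {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("kafka-to-memsql");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092"); // assumed broker address
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "pipeline-group");          // assumed group id

        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Collections.singletonList("events"), kafkaParams));

        // Extract values, drop empties, and hand each partition to a sink.
        stream.map(ConsumerRecord::value)
              .filter(v -> !v.isEmpty())
              .foreachRDD(rdd -> rdd.foreachPartition(records ->
                      records.forEachRemaining(r -> {
                          // Stub: batch-insert into MemSQL over JDBC here.
                      })));

        jssc.start();
        jssc.awaitTermination();
    }
}
```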
Environment: Java, HDP 2.2 YARN cluster, HDFS, MapReduce, Apache Hive, Apache Pig, HBase, Sqoop, XML, Oracle 8i, UNIX, ETL, Spark, Scala.
Senior Hadoop Developer
Fire trucks, Henderson, NV
Responsibilities:
- Installing and configuring fully distributed Hadoop Cluster.
- Installing Hadoop Eco-system Components (Pig, Hive and HBase).
- Involved in Hadoop Cluster environment administration that includes cluster capacity planning, performance tuning, cluster Monitoring and Troubleshooting.
- Creating and configuring Hadoop cluster in Cloudera.
- Installed Hadoop, MapReduce, HDFS and AWS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
- Consulted on the Hadoop ecosystem: Hadoop administration, MapReduce, HBase, Sqoop and Amazon Elastic MapReduce (EMR).
- Coordinating and managing relations with vendors, IT developers and end users.
- Managed work streams and processes and coordinated team members and their activities to ensure that the technology solutions were in line with the overall vision and goals.
- Analyzed the web log data using HiveQL.
- Developed custom aggregate functions using Spark SQL and performed interactive querying.
- Worked on analyzing the Hadoop cluster and different big data analytic tools including Pig, the HBase NoSQL database and Sqoop.
- Integrated the Cassandra Query Language (CQL) for Apache Cassandra.
- Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Developed workflows using custom MapReduce, Pig, Hive, Sqoop.
- Importing and exporting data into HDFS and Hive using Sqoop.
- Wrote Apache Pig scripts to process HDFS data and send it to HBase.
- Used Kafka to load data into HDFS and move data into NoSQL databases.
- Responsible for building scalable distributed data solutions using Hadoop.
- Built, deployed, configured and maintained multiple real-time clusters consisting of Apache products: Flume, Storm, Mesos, Spark and Kafka.
- Experience working with Spark tools like RDD transformations and Spark SQL.
- Worked on the Apache Spark component in Scala to perform transformations like normalization and standardization and to convert raw data to Apache Parquet (see the sketch after this list).
- Consolidated small files across large data sets using Spark with Scala to create tables on the data.
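A minimal sketch of the normalize-and-convert-to-Parquet step described above. The original work was done in Scala; this version uses the Spark Java API for consistency with the other sketches, and the input path, column name and min-max normalization rule are assumptions.

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.max;
import static org.apache.spark.sql.functions.min;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class RawToParquet {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("raw-to-parquet")
                .getOrCreate();

        // Hypothetical raw CSV input with a numeric "amount" column.
        Dataset<Row> raw = spark.read()
                .option("header", "true")
                .option("inferSchema", "true")
                .csv("hdfs:///data/raw/events") // assumed input path
                .withColumn("amount", col("amount").cast("double"));

        // Min-max normalization of "amount" to [0, 1] as one example transformation.
        Row bounds = raw.agg(min("amount"), max("amount")).first();
        double lo = bounds.getDouble(0);
        double hi = bounds.getDouble(1);
        Dataset<Row> normalized = raw.withColumn(
                "amount_norm", col("amount").minus(lo).divide(hi - lo));

        // Persist as columnar Parquet for efficient downstream queries.
        normalized.write().mode("overwrite").parquet("hdfs:///data/curated/events");

        spark.stop();
    }
}
```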
Environment: Cassandra, MapReduce, HDFS, Hive, Flume, Cloudera Manager, Sqoop, MySQL, UNIX Shell Scripting, ZooKeeper, Tableau, Git, Spark, Kafka, Elasticsearch, Docker, Ruby on Rails.
Hadoop Developer
Confidential, Winston Salem, NC
Responsibilities:
- Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs (see the mapper sketch after this list).
- Developed Pig Latin scripts to analyze large data sets in areas where extensive hand-written code needed to be reduced.
- Installed and configured Hive and implemented various business requirements by writing Hive UDFs.
- Used Sqoop tool to extract data from a relational database into Hadoop.
- Experience in pulling data from Amazon S3 cloud to HDFS.
- Worked closely with data warehouse architect and business intelligence analyst to develop solutions.
- Responsible for performing peer code reviews, troubleshooting issues and maintaining status report.
- Involved in creating Hive tables, loading them with data and writing Hive queries, which invoke and run MapReduce jobs in the backend.
- Installed and configured Hadoop cluster in DEV, QA and Production environments.
- Strongly recommended bringing in Elasticsearch and was responsible for its installation, configuration and administration.
- Performed upgrade to the existing Hadoop clusters.
- Enabled Kerberos authentication for the Hadoop cluster and integrated it with Active Directory for managing users and application groups.
- Implemented commissioning and decommissioning of new nodes in the existing cluster.
- Worked with systems engineering team for planning new Hadoop environment deployments, expansion of existing Hadoop clusters.
- Responsible for data ingestion using Talend.
- Worked on ETL processes using Hive with different execution engines: MR, Tez and Spark.
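A minimal sketch of the kind of MapReduce cleaning code referenced in the first bullet above: a map-only job that drops malformed rows and trims fields. The pipe delimiter, minimum field count and counter names are assumptions.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Map-only cleaning step: reject malformed rows and trim whitespace.
// Assumes pipe-delimited records with at least three fields.
public class CleanRecordsMapper
        extends Mapper<LongWritable, Text, Text, NullWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\\|", -1);
        if (fields.length < 3 || fields[0].trim().isEmpty()) {
            context.getCounter("clean", "dropped").increment(1); // track rejects
            return;
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) out.append('|');
            out.append(fields[i].trim());
        }
        context.write(new Text(out.toString()), NullWritable.get());
    }
}
```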
Environment: Hadoop, HDFS, MapReduce, Hive, Flume, Sqoop, Cloudera CDH4, HBase, Oozie, Pig, AWS EC2 cloud, Eclipse, Talend.
Java Developer
Confidential, Erie, PA
Responsibilities:
- Involved in developing the application using the Java/J2EE platform. Implemented the Model View Controller (MVC) structure using Struts.
- Responsible for enhancing the portal UI using HTML, JavaScript, XML, JSP, Java and CSS as per the requirements, and for providing client-side JavaScript validations and server-side Bean Validation (JSR 303).
- Used Spring Core Annotations for Dependency Injection.
- Used Hibernate as the persistence framework, mapping ORM objects to tables using Hibernate annotations (see the entity sketch after this list).
- Responsible for writing the various service classes and utility APIs used across the framework.
- Used Axis to implement Web Services for integration of different systems.
- Developed Web services component using XML, WSDL and SOAP with DOM parser to transfer and transform data between applications.
- Exposed various capabilities as Web Services using SOAP/WSDL.
- Used SoapUI for testing the web services by sending SOAP requests.
- Used AJAX framework for server communication and seamless user experience.
- Created a test framework on Selenium and executed web testing in Chrome, IE and Mozilla Firefox through WebDriver.
- Used client-side JavaScript with jQuery for designing tabs and dialog boxes.
- Created UNIX shell scripts to automate the build process, to perform regular jobs like file transfers between different hosts.
- Used Log4j for logging output to files.
- Used JUnit/Eclipse for the unit testing of various modules.
- Involved in production support, monitoring server and error logs, foreseeing potential issues and escalating them to higher levels.
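A minimal sketch of the Hibernate annotation mapping mentioned above. The entity, table and column names are hypothetical, for illustration only.

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;

// Hypothetical entity illustrating annotation-based ORM mapping.
@Entity
@Table(name = "CUSTOMER")
public class Customer {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "CUSTOMER_ID")
    private Long id;

    @Column(name = "FULL_NAME", nullable = false, length = 100)
    private String fullName;

    public Long getId() { return id; }
    public String getFullName() { return fullName; }
    public void setFullName(String fullName) { this.fullName = fullName; }
}
```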
Environment: Java, J2EE, JSP, Servlets, Spring, Custom Tags, Java Beans, JMS, Hibernate, IBM MQ Series, AJAX, JUnit, Log4j, JNDI, Oracle, XML, SAX, Rational Rose, UML.
Java Developer
Apollo Munich
Responsibilities:
- Active in the analysis, design, implementation and deployment phases across the full Software Development Lifecycle (SDLC) of the project.
- Designed and developed user interface using JSP, HTML and JavaScript.
- Developed Struts action classes and action forms, performed action mapping using the Struts framework and performed data validation in form beans and action classes (see the action-class sketch after this list).
- Defined the search criteria and pulled the customer's record from the database, made the required changes and saved the updated record back to the database.
- Validated the fields of user registration screen and login screen by writing JavaScript validations.
- Used DAO and JDBC for database access.
- Designed and developed XML processing components for dynamic menus in the application.
- Involved in post-production support and maintenance of the application.
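A minimal sketch of a Struts 1 action class of the kind described above. The form-bean property and the forward names are assumptions that would be defined in struts-config.xml.

```java
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;
import org.apache.struts.action.DynaActionForm;

// Hypothetical login action: reads a field from the form bean and
// forwards to a logical outcome named in struts-config.xml.
public class LoginAction extends Action {

    @Override
    public ActionForward execute(ActionMapping mapping, ActionForm form,
                                 HttpServletRequest request,
                                 HttpServletResponse response) throws Exception {
        DynaActionForm loginForm = (DynaActionForm) form; // assumed DynaActionForm setup
        String username = (String) loginForm.get("username"); // assumed form property
        boolean ok = username != null && !username.trim().isEmpty();
        return mapping.findForward(ok ? "success" : "failure");
    }
}
```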
Environment: Java 1.5, Struts, Servlets, HTML, XML, SQL, J2EE, JUnit, Tomcat 6.
Java Developer
Confidential
Responsibilities:
- Designed and implemented administration screens with MVC architecture using Struts.
- Extensively worked with Swing (JFC) and developed the UI using Java Swing (see the sketch after this list).
- Coding involved writing action classes, form beans and JSPs.
- Developed Struts model components to access business logic components.
- Involved in the development of plug-in classes; all the plug-in classes are instances of a plug-in servlet that acts as a controller.
- Installed and configured Tomcat 5.0 in the development and testing environments.
- Involved in configuring web.xml and struts-config.xml according to the Struts framework.
- Deployed necessary stored procedures.
- Involved in unit testing, system integration testing and User acceptance testing.
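A minimal sketch of a Swing UI of the sort described above; the window contents and behavior are hypothetical.

```java
import java.awt.BorderLayout;

import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.SwingUtilities;

// Hypothetical admin window: a status label plus a button wired to a listener.
public class AdminFrame {
    public static void main(String[] args) {
        // Build the UI on the Event Dispatch Thread, as Swing requires.
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("Administration");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);

            JLabel status = new JLabel("Ready");
            JButton refresh = new JButton("Refresh");
            refresh.addActionListener(e -> status.setText("Refreshed"));

            frame.add(status, BorderLayout.CENTER);
            frame.add(refresh, BorderLayout.SOUTH);
            frame.setSize(320, 120);
            frame.setVisible(true);
        });
    }
}
```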
Environment: Java, XML, JSP, Servlets, JavaScript, JDBC, Java Mail, HTML, CSS, Tomcat 4, Oracle 8i
