Hadoop Developer Resume
Minneapolis
SUMMARY
- 7+ years of overall experience in the IT industry, including hands-on software development in Java and 3+ years of comprehensive experience with Big Data.
- Good understanding of the software development life cycle (SDLC)
- Exposed to the Agile (Scrum) method of software development
- Hands-on experience with core Java and strong client-facing communication skills for requirement gathering
- Good hands-on knowledge of the Hadoop ecosystem and its components such as MapReduce and HDFS.
- Good understanding of the Hadoop daemon processes: JobTracker, TaskTracker, NameNode, and DataNode.
- Good experience working with the Hortonworks and Cloudera distributions.
- Experienced in working with Hadoop and Big Data on the Amazon Web Services (AWS) cloud and Microsoft Azure using SSH and PuTTY.
- Worked on installing, configuring, and administering Hadoop clusters for the Cloudera and Hortonworks distributions.
- Good understanding of NoSQL databases such as HBase and MongoDB.
- Worked on RESTful web services with Spring and JSON.
- Well experienced in designing and developing both server-side and client-side applications.
- In-depth knowledge of Hadoop ecosystem components such as Pig, YARN, Hive, Sqoop, HBase, Oozie, ZooKeeper, Hue, Cloudera Manager, Flume, Spark, and Scala.
- Hands-on experience writing MapReduce Jobs
- Experienced in managing and reviewing Hadoop logs.
- Hands-on experience using Sqoop to import and export data between HDFS and relational database systems.
- Hands-on experience with RDBMS and writing complex Linux shell scripts
- Experience in extending Hive and Pig core functionality by writing custom UDFs (a minimal sketch follows this summary)
- Experienced in analyzing data using the Hive query language, Pig, and MapReduce.
- Knowledge and experience with job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
- Knowledge of data warehousing and ETL tools like Informatica.
- Used IDEs such as Eclipse and NetBeans for development.
- Explored Spark, Kafka, and Storm alongside other open-source projects to create a real-time analytics framework.
- Experienced in various RDBMS like MS SQL Server, MySQL, and Oracle.
- Experienced in using the MVC architecture, Struts, and Hibernate for developing web applications with Java, JSP, JavaScript, HTML, jQuery, AJAX, XML, and JSON.
- Expertise in Java development using J2EE, Spring, J2SE, Servlets, and JSP
- Experienced in core Java and object-oriented design, with a strong understanding of collections, multithreading, and exception handling.
- Experienced in the Agile (Scrum) software methodology.
- Knowledgeable in database concepts, writing finely tuned queries, and performance tuning.
- Strong knowledge in writing advanced Shell Scripts in Linux/Unix
- Skilled in establishing strong relationships among project teams and team members.
- Delivery assurance: quality focused and process oriented
- Ability to work in high-pressure environments, delivering to and managing stakeholder expectations
- Application of structured methods to project scoping and planning, risks, issues, schedules, and deliverables.
- Strong analytical and problem-solving skills.
- Good interpersonal skills and the ability to work as part of a team; exceptional ability to learn and master new technologies and to deliver outputs on short deadlines
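For illustration, below is a minimal sketch of the kind of custom Hive UDF referenced in this summary, assuming the classic org.apache.hadoop.hive.ql.exec.UDF API; the class name and the masking rule are placeholders, not taken from any project listed here.

```java
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/**
 * Hypothetical Hive UDF that masks all but the last four characters of a value,
 * e.g. mask_tail('4111111111111111') -> 'XXXXXXXXXXXX1111'.
 */
@Description(name = "mask_tail", value = "_FUNC_(str) - masks all but the last 4 characters")
public class MaskTailUDF extends UDF {

  public Text evaluate(Text input) {
    if (input == null) {
      return null;                       // pass NULLs through unchanged
    }
    String value = input.toString();
    if (value.length() <= 4) {
      return new Text(value);
    }
    StringBuilder masked = new StringBuilder();
    for (int i = 0; i < value.length() - 4; i++) {
      masked.append('X');
    }
    masked.append(value.substring(value.length() - 4));
    return new Text(masked.toString());
  }
}
```

Such a UDF would be registered in a Hive session with something like `ADD JAR masking-udf.jar;` followed by `CREATE TEMPORARY FUNCTION mask_tail AS 'MaskTailUDF';` (the jar and function names are placeholders).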
TECHNICAL SKILLS
Hadoop Ecosystem: MapReduce, HDFS, Hive, Pig, HBase, Zookeeper, Sqoop, Oozie, Flume, Spark, Kafka, Storm
Java Technologies: Java, J2EE, Servlets, JSP, XML, AJAX, SOAP, WSDL
SDLC Methodologies: Agile, UML, Design Patterns (Core Java and J2EE)
Enterprise Frameworks: Ajax, MVC, Struts 2/1, Hibernate 3, Spring 3/2.5/2
Version and Source Control: CVS, SVN, GIT, Synergy
Programming Languages: C, C++, Java, XML, Unix Shell Scripting, SQL, PL/SQL
Web Technologies: HTML, DHTML, XML, XSLT, JavaScript, CSS
Modeling Tools: UML on Rational Rose 4.0
IDE Tools: Eclipse, NetBeans
Application Servers: WebLogic, WebSphere, JBoss
Databases: Oracle, DB2, MS SQL Server, MySQL, MS Access, Apache Cassandra
Frameworks: MVC, Struts, Log4j, JUnit, Maven, Web Services
Operating Systems: Windows client and server editions (2000, 2008, 2012), UNIX, Linux
PROFESSIONAL EXPERIENCE
Confidential, Wichita, KS
Hadoop Developer
Responsibilities:
- Worked on a Hadoop environment with MapReduce, Kafka, Sqoop, Oozie, Flume, HBase, Pig, Hive, and Impala in a multi-node cloud environment
- Configured the Hadoop environment in the cloud through Amazon Web Services (AWS) to provide a scalable, distributed data solution
- Developed Kafka producers that compress and bind many small files into larger Avro and Sequence files before writing to HDFS, to make the best use of the Hadoop block size (see the file-packing sketch after this section).
- Implemented MapReduce jobs to parse raw weblogs into delimited records and handled files in various formats such as JSON, XML, and plain text.
- Improved MapReduce job performance by using combiners, custom partitioning, and the distributed cache (see the tuning sketch after this section).
- Gained exposure to Spark iterative processing.
- Created partitioned tables in Hive for best performance and faster querying.
- Utilized Sqoop scripts to incrementally import customer transaction data, keyed by date, from various database sources into HBase
- Utilized Flume to move log files generated from various sources into Amazon S3 for processing
- Performed extensive data analysis using Hive and Pig.
- Created simple and complex queries in Hive, and improved performance and reduced query time by creating partitioned tables.
- Created Oozie workflows to automate loading data into Amazon S3 and preprocessing it with Pig, and utilized Oozie for data scrubbing and processing
- Developed and deployed scripts to pre-process the data before moving it to HDFS.
- Worked on a proof of concept with Impala.
- Used Synergy for version control and ClearQuest for creating and tracking logs of defects and assigned tasks.
Environment: Hive, MapReduce, Pig, Impala, Tableau, HDFS, Oozie, and AWS.
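The small-file bullet above is sketched below on the HDFS side only: a hypothetical packer that rolls a directory of small files into one SequenceFile, assuming Hadoop's SequenceFile writer API. The Kafka producer and the Avro variant are omitted, and the paths and class name are illustrative.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

/** Hypothetical packer that rolls a directory of small files into one SequenceFile on HDFS. */
public class SmallFilePacker {

  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path inputDir = new Path(args[0]);    // staging directory holding the small files
    Path output = new Path(args[1]);      // consolidated SequenceFile on HDFS

    SequenceFile.Writer writer = SequenceFile.createWriter(conf,
        SequenceFile.Writer.file(output),
        SequenceFile.Writer.keyClass(Text.class),
        SequenceFile.Writer.valueClass(BytesWritable.class));
    try {
      for (FileStatus status : fs.listStatus(inputDir)) {
        byte[] content = new byte[(int) status.getLen()];
        try (FSDataInputStream in = fs.open(status.getPath())) {
          in.readFully(content);          // read the entire small file
        }
        // key = original file name, value = raw file bytes
        writer.append(new Text(status.getPath().getName()), new BytesWritable(content));
      }
    } finally {
      writer.close();
    }
  }
}
```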
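The tuning bullet above (combiners, partitioning, distributed cache) is illustrated by the minimal, hypothetical job below, assuming the org.apache.hadoop.mapreduce API. The tab-delimited weblog layout, the exclusion-list file, and all class names are assumptions for illustration, not details from the project.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;
import java.util.HashSet;
import java.util.Set;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Partitioner;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/** Hypothetical weblog hit-count job showing a combiner, a custom partitioner, and the distributed cache. */
public class TunedLogCount {

  public static class UrlMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Set<String> excluded = new HashSet<>();
    private final Text url = new Text();

    @Override
    protected void setup(Context context) throws IOException {
      // The exclusion list was shipped with job.addCacheFile() and localized under the symlink name.
      try (BufferedReader reader = new BufferedReader(new FileReader("excluded_urls.txt"))) {
        String line;
        while ((line = reader.readLine()) != null) {
          excluded.add(line.trim());
        }
      }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String[] fields = value.toString().split("\t");   // assumes tab-delimited weblog records, URL in field 1
      if (fields.length > 1 && !excluded.contains(fields[1])) {
        url.set(fields[1]);
        context.write(url, ONE);
      }
    }
  }

  public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }

  /** Sends all URLs with the same leading path segment to the same reducer (illustrative only). */
  public static class PathPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
      String path = key.toString();
      int slash = path.indexOf('/', 1);
      String prefix = slash > 0 ? path.substring(0, slash) : path;
      return (prefix.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "tuned-log-count");
    job.setJarByClass(TunedLogCount.class);
    job.setMapperClass(UrlMapper.class);
    job.setCombinerClass(SumReducer.class);            // combiner cuts shuffle volume for the associative sum
    job.setReducerClass(SumReducer.class);
    job.setPartitionerClass(PathPartitioner.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.addCacheFile(new URI(args[2] + "#excluded_urls.txt"));  // distributed cache entry with a symlink name
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```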
Confidential, Minneapolis
Hadoop Developer
Responsibilities:
- Configured and managed Hadoop components such as Pig, Hive, and Sqoop.
- Used Flume to load unstructured and semi-structured data, such as website logs and streaming data, from various sources into the cluster
- Implemented UDFs to provide custom Pig and Hive capabilities
- Worked on designing NoSQL Schemas on HBase
- Worked on configuring and managing disaster recovery and backup for Cassandra data.
- Performed file system management and monitoring of Hadoop log files.
- Utilized Oozie workflows to run Pig and Hive jobs
- Developed custom classes for serialization and deserialization in Hadoop (a Writable sketch follows this section)
- Optimized MapReduce jobs for effective HDFS usage through compression techniques.
- Developed Shell, Perl and Python scripts to automate and provide Control flow to Pig scripts.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Responsible for managing data coming from different sources.
- Worked on data serialization formats (Avro, Parquet, JSON, CSV) for converting complex objects into byte sequences.
- Converted Avro data into Parquet format in Impala for faster query processing.
- Involved in migrating data from existing RDBMS (Oracle and SQL Server) to Hadoop using Sqoop for processing.
- Performed analysis of data using Hive queries and Pig scripts.
Environment: Hadoop framework, MapReduce, Hive, Sqoop, Pig, HBase, Flume, Oozie, Java (JDK 1.6), UNIX shell scripting, Oracle 11g/12c, Windows NT, IBM DataStage 8.1, TOAD 9.6, Teradata.
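The custom serialization bullet above refers to Hadoop's Writable mechanism; the sketch below is a minimal, hypothetical example assuming a record with a transaction id and an amount (the field names are illustrative).

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

/** Hypothetical custom Writable holding a transaction id and an amount. */
public class TransactionWritable implements Writable {
  private String transactionId = "";
  private double amount;

  public TransactionWritable() {}                       // no-arg constructor required by Hadoop

  public TransactionWritable(String transactionId, double amount) {
    this.transactionId = transactionId;
    this.amount = amount;
  }

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeUTF(transactionId);                        // serialize fields in a fixed order
    out.writeDouble(amount);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    transactionId = in.readUTF();                       // deserialize in exactly the same order
    amount = in.readDouble();
  }

  public String getTransactionId() { return transactionId; }
  public double getAmount() { return amount; }

  @Override
  public String toString() { return transactionId + "\t" + amount; }
}
```

Such a class can serve as a map output value type via job.setMapOutputValueClass(TransactionWritable.class); implementing WritableComparable instead would also allow its use as a key.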
Confidential, Dallas, TX
Hadoop Developer
Responsibilities:
- Communicated effectively with business customers to gather the required information for the project.
- Worked extensively on the Cloudera distribution.
- Worked on YARN (MapReduce v2) and its ResourceManager and ApplicationMaster.
- Involved in loading data into HDFS from Teradata using Sqoop
- Experienced in moving huge amounts of log file data from different servers
- Worked on implementing transformer/mapping MapReduce pipelines
- Generated structured data through MapReduce jobs, stored it in Hive tables, and analyzed the results with Hive queries based on the requirements.
- Improved performance by implementing dynamic partitioning and bucketing in Hive and by designing managed and external tables.
- Developed Pig Latin scripts and used ETL tools such as Informatica for some pre-aggregations
- Developed MapReduce programs to cleanse and pre-process data from various sources.
- Worked with Sequence file and ORC formats in MapReduce programs.
- Created Hive generic UDFs to implement business logic and worked on incremental imports into Hive tables.
- Worked on data analysis using external tables with HBase.
- Worked on DataStax OpsCenter, a web-based tool for monitoring and simplifying administration tasks.
- Used SVN for Version Control.
- Used ZooKeeper for various centralized configurations (a client sketch follows this section).
- Involved in increasing the cluster size from 25 nodes to 40 nodes.
- Monitored system status and log files and diligently provided solutions for failure conditions.
Environment: Apache Hadoop, MapReduce, HDFS, Hive, Pig, ZooKeeper, Cassandra, Java (JDK 1.6), SQL, flat files, Oracle 10g/11g, MySQL, Windows NT, UNIX, Sqoop, Oozie, HBase.
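The centralized-configuration bullet above can be pictured with the hypothetical reader below, which fetches a shared setting from a znode using the plain ZooKeeper Java client; the connection string, the znode path, and the setting are placeholders.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

/** Hypothetical reader of a centralized configuration value stored in a znode. */
public class ConfigReader {

  public static void main(String[] args) throws Exception {
    CountDownLatch connected = new CountDownLatch(1);

    // Placeholder connection string; a real cluster lists the ZooKeeper ensemble hosts.
    ZooKeeper zk = new ZooKeeper("zk-host:2181", 30000, new Watcher() {
      @Override
      public void process(WatchedEvent event) {
        if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
          connected.countDown();                 // session is established
        }
      }
    });

    connected.await();                           // wait until the client is connected

    // Placeholder znode path holding, e.g., a batch-size setting shared by all jobs.
    byte[] data = zk.getData("/config/ingest/batch-size", false, null);
    System.out.println("batch-size = " + new String(data, StandardCharsets.UTF_8));

    zk.close();
  }
}
```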
Confidential, Charlotte, NC
Java Developer
Responsibilities:
- Enabled multi-screen capability by implementing EJB and BO classes on the Bridge-Tier framework.
- Implemented validations for data collection modules using web services, JavaScript, and EJB beans
- Used collections and exceptions from the core Java APIs to develop business logic layers.
- Fetched required data from the Oracle database using SQL queries through DAO classes (a DAO sketch follows this section).
- Implemented a web service client for fetching information from distributed systems, tested with SoapUI
- Designed the front end using JSP, HTML, CSS, JSTL, JavaScript, AJAX, and jQuery.
- Worked on bug fixes for production releases as well as QAT, UAT, and deployment support as required.
- Exposure to ClearCase and Subversion for managing source code versioning and control.
- Participated in code reviews and provided valuable suggestions
- Used the Tier framework to implement the exception-handling mechanism and logged entries using Log4j.
Environment: Java, J2EE, Servlets, EJB, JNDI, JMS, Oracle 11g, SQL, JavaScript, AJAX, jQuery, XML, SOAP, JUnit, Bridge-Tier Framework, WebSphere 7.1, RAD 8.1, JSP, JSTL, HTML, IBM ClearQuest and ClearCase, SVN, Agile, TDD.
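The Oracle data-access bullet above is the kind of work a small DAO class captures. The sketch below is a minimal, hypothetical JDBC DAO; the table, columns, and the injected DataSource are assumptions, and in the described stack the connection would typically come from a container-managed pool rather than a hand-built one.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;

/** Hypothetical DAO that reads account names for a customer from an Oracle table. */
public class AccountDao {

  private final DataSource dataSource;          // injected, e.g. a container-managed connection pool

  public AccountDao(DataSource dataSource) {
    this.dataSource = dataSource;
  }

  public List<String> findAccountNames(long customerId) throws SQLException {
    String sql = "SELECT account_name FROM accounts WHERE customer_id = ?";  // placeholder schema
    List<String> names = new ArrayList<>();
    try (Connection conn = dataSource.getConnection();
         PreparedStatement stmt = conn.prepareStatement(sql)) {
      stmt.setLong(1, customerId);              // bind the parameter instead of concatenating SQL
      try (ResultSet rs = stmt.executeQuery()) {
        while (rs.next()) {
          names.add(rs.getString("account_name"));
        }
      }
    }
    return names;
  }
}
```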
Confidential
Java Developer
Responsibilities:
- Implemented a transactional model for handling multiple requests using Spring AOP and JSP.
- Used UML for use-case and object modeling and for generating class and sequence diagrams.
- Utilized asynchronous AJAX calls to validate input and populate results from the server.
- Implemented backing beans for handling UI components and storing their state in scope.
- Implemented EJB stateless session beans for communicating with the controller.
- Implemented database integration using Hibernate and utilized Spring with Hibernate for mapping to the Oracle database.
- Used Hibernate to map ORM objects to tables and to run HQL queries (a sketch follows this section).
- Worked on Oracle PL/SQL queries to select, update, and delete data.
- Worked with Maven for build automation.
- Used GIT for version control.
Environment: Java 1.6, JSF 1.2, Spring 2.5, Hibernate 3.0, UML, XML, HTML, JavaScript, CSS, XSL, Oracle 10g, SQL, PL/SQL, EJB 3.0, JMS, AJAX, Web Services, IBM WebSphere Application Server 8.0, JBoss, Java Beans, Apache Maven, Git, TFS, JIRA, Remedy (Incident Management Tool).
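The Hibernate bullets above mention ORM mapping and HQL. The sketch below runs a parameterized HQL query through a Hibernate 3-style SessionFactory; the Customer entity, its fields, and the hibernate.cfg.xml mappings are assumptions for illustration.

```java
import java.util.List;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

/** Hypothetical lookup of active customers via an HQL query. */
public class CustomerLookup {

  public static void main(String[] args) {
    // Builds the factory from hibernate.cfg.xml on the classpath (entity mappings assumed there).
    SessionFactory sessionFactory = new Configuration().configure().buildSessionFactory();

    Session session = sessionFactory.openSession();
    try {
      // HQL queries the mapped Customer entity rather than the underlying table directly.
      List<?> customers = session
          .createQuery("from Customer c where c.status = :status order by c.name")
          .setParameter("status", "ACTIVE")
          .list();
      for (Object customer : customers) {
        System.out.println(customer);
      }
    } finally {
      session.close();
      sessionFactory.close();
    }
  }
}
```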