Hadoop Developer Resume
Northbrook, IL
SUMMARY
- Over 7 years of experience across multiple domains, working through the full software development lifecycle using Agile and Scrum methodologies.
- Worked on systems programming, requirements gathering, technical documentation, and the design and development of Big Data solutions for enterprise application systems across domains.
- Experience in full project development including Software Design, Analysis, Coding, Development, Testing, Implementation, Maintenance and Support using Java and J2EE technologies.
- Sound knowledge of Java, J2EE, SQL & related technologies.
- Excellent understanding of Object Oriented Programming and Core Java concepts such as multi-threading, exception handling, generics, annotations, collections, serialization and I/O.
- In-depth understanding of Data Structures and Design and Analysis of Algorithms.
- Over 4 years of experience working with Big Data using Hadoop ecosystem components (HDFS, MapReduce (MRv1, YARN), Pig, Hive, HBase, Sqoop, Flume, Impala, Oozie, ZooKeeper, Storm, Solr and Tez).
- Worked with data from different sources like Databases, Log files, Flat files and XML files.
- Hands on experience in data cleaning, transformation using Talend.
- Hands on experience in using Hive and Pig scripts to perform data analysis, data cleaning and data transformation.
- Hands on experience in capturing and importing data using Sqoop from existing relational databases (Oracle, MySQL, SQL Server and Teradata) with the help of connectors and fast loaders.
- Good working knowledge of data visualization using Tableau.
- Solid experience in Storage, Querying, Processing and Analysis of Big Data using Hadoop framework.
- Good Experience in managing and reviewing Hadoop log files.
- Experienced in developing MapReduce programs using the Hadoop Java API.
- Developed simple to complex MapReduce jobs using Hive scripts and Pig Latin scripts to handle files in multiple formats (JSON, text, XML, etc.).
- Improved the performance of MapReduce jobs using Combiners, Partitioners and the Distributed Cache.
- Experienced in handling Avro data files in MapReduce programs using Avro data serialization system.
- Experience with Oozie Workflow Engine in running workflow jobs with actions that run Hadoop MapReduce and Pig jobs.
- Experienced in loading unstructured data into HDFS using Flume/Kafka.
- Good hands on experience in NoSQL databases such as HBase, Cassandra, Couchbase and MongoDB.
- Experienced in developing custom UDFs for Pig and Hive using Java by extending core functionality (see the UDF sketch at the end of this summary).
- Expertise in real-time data ingestion into HBase and Hive using Storm.
- Experienced in performing analytics on time-series data using HBase and its Java API.
- Sound knowledge in programming using Scala and Spark.
- Good knowledge of Spark components like Spark SQL, MLlib, Spark Streaming and GraphX.
- Experience with Spark performance tuning options.
- Hands on experience in dealing with compression codecs such as Snappy and Gzip.
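The following is a minimal sketch of the kind of custom Hive UDF referenced above, assuming a simple string-cleanup use case; the class name and the cleanup rule are hypothetical.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical custom Hive UDF: trims and upper-cases a string column.
public class CleanUpperUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}
```

Such a UDF would be packaged as a JAR, added to the Hive session, and registered with CREATE TEMPORARY FUNCTION before being used in queries.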
TECHNICAL SKILLS
Hadoop Ecosystem: HDFS, MapReduce, Hive, Pig, Pentaho, HBase, ZooKeeper, Sqoop, Oozie, Cassandra, Couchbase, Flume, Solr and Avro.
Web Technologies: Core Java, J2EE, Servlets, JSP, JDBC, XML, AJAX, SOAP, WSDL
Methodologies: Agile, UML, Design Patterns (Core Java and J2EE)
Frameworks: MVC, Struts 2/1, Hibernate 3, Spring 3/2.5/2
Programming Languages: Java, XML, Unix Shell scripting, HTML.
Databases: Oracle 11g/10g, DB2, MS-SQL Server, MySQL, MS-Access, Neo4J
Application Servers: WebLogic, WebSphere, Apache Tomcat
Monitoring & Reporting tools: Ganglia, Nagios, Custom Shell scripts
PROFESSIONAL EXPERIENCE
Hadoop Developer
Confidential, Northbrook, IL
Responsibilities:
- Worked on a scalable distributed data system using the Hadoop ecosystem.
- Developed simple to complex MapReduce and streaming jobs in Java, along with implementations in Hive and Pig.
- Used various compression mechanisms to optimize MapReduce jobs and use HDFS efficiently.
- Transformed the imported data using Hive and MapReduce.
- Used Sqoop to extract data from MySQL and load it into HDFS.
- Wrote Hive queries and Pig scripts to study customer behavior by analyzing the data.
- Used Impala to read, write and query the data in HDFS.
- Used Apache Storm for processing real-time streaming data, with Storm spouts for reading data from external sources and Storm bolts for processing the data (see the bolt sketch after this list).
- Used Cloudera Manager for continuous monitoring and management of the Hadoop cluster.
- Implemented Spark POCs using Spark with Scala.
- Developed Oozie workflow for Spark jobs.
- Installed Oozie workflow engine to run multiple Hive and Pig jobs.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Experienced in loading and transforming large sets of structured, semi-structured and unstructured data.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
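A minimal sketch of the kind of Storm bolt used in the real-time pipeline above, targeting the pre-1.0 backtype.storm API; the field names and the normalization step are assumptions for illustration.

```java
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

// Hypothetical bolt that normalizes incoming events before downstream bolts
// write them to HBase/Hive; the "event" field name is an assumption.
public class NormalizeEventBolt extends BaseBasicBolt {

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String rawEvent = input.getStringByField("event");
        // Trim and lower-case the raw payload before emitting it downstream.
        collector.emit(new Values(rawEvent.trim().toLowerCase()));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("normalizedEvent"));
    }
}
```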
Environment: Hadoop 0.20.2, Pig, Hive, Apache Sqoop, Spark, Python, Oozie, HBase, ZooKeeper, Cloudera Manager, 30-node cluster with Linux (Ubuntu).
Hadoop Developer
Confidential, Detroit, MI
Responsibilities:
- Evaluated business requirements and prepared detailed specifications, following project guidelines, for the programs to be developed.
- Responsible for building scalable distributed data solutions using Hadoop.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Installed and configured Hadoop using the Cloudera distribution.
- Configured ecosystem components such as Hive, Sqoop, Flume, Pig and Oozie.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the mapper sketch after this list).
- Developed JUnit tests for MapReduce jobs and performed testing using small sample data sets.
- Created MapReduce jobs which were used in processing survey data and log data stored in HDFS.
- Used Pig scripts and MapReduce jobs for data cleaning and data preprocessing.
- Transformed and aggregated data for analysis by implementing workflow management of Sqoop, Hive and Pig scripts.
- Used Pig to create dataflow models to process data.
- Developed Unit Test Cases for Mapper and Reducer classes.
- Involved in creating Hive tables, loading them with data and writing Hive queries.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
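A minimal sketch of the kind of data-cleaning mapper referenced above; the tab delimiter and the expected field count are assumptions for illustration.

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper that drops malformed survey/log records; the expected
// field count (5) and the tab delimiter are assumptions for illustration.
public class CleanRecordsMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

    private static final int EXPECTED_FIELDS = 5;

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\t");
        // Keep only records with the expected number of fields; skip the rest.
        if (fields.length == EXPECTED_FIELDS) {
            context.write(value, NullWritable.get());
        }
    }
}
```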
Environment: HDFS, Eclipse, Pig, Hive, Oozie, Sqoop, Tableau Desktop.
Hadoop Developer
Confidential, Detroit, MI
Responsibilities:
- Gathered business requirements from the business partners and subject matter experts.
- Utilized the Agile Scrum methodology to help manage and organize a team of 3 developers, with regular code review sessions.
- Handled data coming from different sources and of different formats.
- Involved in the loading of structured and unstructured data into HDFS.
- Loaded data from MySQL to HDFS on a regular basis using Sqoop.
- Wrote MapReduce jobs using the Java API.
- Involved in managing and reviewing Hadoop log files.
- Scheduled various Hadoop tasks to run using scripts and batch jobs.
- Performed data analysis to meet the business requirements by creating Hive tables with HiveQL and querying them in Hive.
- Used JUnit for unit testing (see the mapper test sketch after this list).
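A minimal sketch of a JUnit-style test for a word-count mapper; MRUnit is an assumption here (the resume names only JUnit), and the mapper is defined inline so the example is self-contained.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Before;
import org.junit.Test;

// Hypothetical JUnit test for a word-count style mapper using MRUnit's MapDriver.
public class TokenMapperTest {

    // Minimal mapper under test: emits (word, 1) for each whitespace-separated token.
    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    context.write(new Text(token), ONE);
                }
            }
        }
    }

    private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

    @Before
    public void setUp() {
        mapDriver = MapDriver.newMapDriver(new TokenMapper());
    }

    @Test
    public void emitsOneCountPerToken() throws IOException {
        mapDriver.withInput(new LongWritable(0), new Text("hadoop hive"))
                 .withOutput(new Text("hadoop"), new IntWritable(1))
                 .withOutput(new Text("hive"), new IntWritable(1))
                 .runTest();
    }
}
```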
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, MySQL, Java (JDK 1.6) and JUnit
Sr. Java Developer / Hadoop Developer
Confidential, Chattanooga, TN
Responsibilities:
- Used JSP, Servlets and JDBC to develop web components.
- Created templates and screens in HTML and JavaScript.
- Worked with EJBs to code reusable components and business logic.
- Handled data coming from different sources.
- Installed and configured HDFS and developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Used MapReduce to perform analytics on data present in Cassandra.
- Used the Thrift API for real-time analytics on Cassandra data.
- Used Hibernate with Spring framework for data persistence.
- Worked in loading data from UNIX file system to HDFS.
- Loaded and transformed large datasets into HDFS using hadoop fs commands (see the sketch after this list).
- Supported setting up and updating configurations for implementing scripts with Pig and Sqoop.
- Migrated existing SQL queries to HiveQL queries to move to big data analytical platform.
- Wrote JUnit test cases for unit testing.
- Worked on capacity planning, node recovery and slots configuration.
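A minimal Java sketch of the HDFS load step above, using the FileSystem API as the programmatic equivalent of a hadoop fs copy; the local and HDFS paths are placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical equivalent of "hadoop fs -put" via the FileSystem Java API;
// the paths below are placeholders for illustration.
public class HdfsLoader {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Copy a local file into HDFS, keeping the local copy in place.
        fs.copyFromLocalFile(new Path("/data/local/events.log"),
                             new Path("/user/hadoop/raw/events.log"));
        fs.close();
    }
}
```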
Environment: JDK, J2EE, UML, Servlets, JSP, JDBC, Struts, XHTML, JavaScript, MVC, XML, XML Schema, Tomcat, Eclipse, CDH, Hadoop, HDFS, Pig, MySQL and MapReduce
Java Developer
Confidential, Pittsburgh, PA
Responsibilities:
- Responsible for gathering Business Requirements and User Specifications from Business Analyst.
- Involved in developing, testing and deployment of the application using Web Services.
- Worked on Load Builder Module for Region services using SOAP Web services.
- Worked with Servlets, JSP and Ajax to design the user interface.
- Used JSP, JavaScript, HTML5, and CSS for validating and customizing error messages in the user interface.
- Used the Eclipse 3.5 IDE for code development and deployed to WebLogic Server.
- Worked on the Struts 2.0 MVC framework and used Spring dependency injection for application customization and upgrades (see the action sketch after this list).
- Used UML to create use case and sequence diagrams.
- Implemented Hibernate in the data access object layer to access and update information in the Database.
- Wrote PL/SQL queries, stored procedures, and triggers to perform back-end database operations.
- Used multiple Action Controllers to control the page flow.
- Used Interceptors for client validations.
- Used Subversion for version control and log4j for logging errors.
- Developed ANT scripts to build and deploy the application.
- Wrote test cases in JUnit for unit testing.
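A minimal sketch of a Struts 2 action of the kind used to control page flow above; the class name, the property, and the result mapping in struts.xml are assumptions for illustration.

```java
import com.opensymphony.xwork2.ActionSupport;

// Hypothetical Struts 2 action controlling a single page flow; the "regionId"
// property and the mapped "success" result are placeholders.
public class LoadBuilderAction extends ActionSupport {

    private String regionId;

    @Override
    public String execute() {
        // Region service calls (injected via Spring) would be invoked here.
        return SUCCESS;
    }

    public String getRegionId() {
        return regionId;
    }

    public void setRegionId(String regionId) {
        this.regionId = regionId;
    }
}
```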
Environment: Java 6, J2EE 5, Struts 2.0, Hibernate 3.0, MVC, WebLogic Application Server 10.3, UML, JSP, Servlets, JavaScript, HTML5, CSS, Ajax, Web Services, Oracle 10g, Eclipse 3.5 IDE, PL/SQL, ANT, JUnit, XML/XSL, log4j 1.2.15.
Java Developer
Confidential
Responsibilities:
- Prepared the installation guide, customer guide and configuration document that were delivered to the customer along with the product.
- Worked on JSP, Servlets and JDBC in creating web components.
- Responsible for creating a working model using HTML and JavaScript to understand the flow of the web application, and created class diagrams.
- Participated in daily stand-up Scrum meetings as part of the Agile process to report day-to-day progress on the work done.
- Used J2EE to develop the application based on the MVC architecture.
- Used HTML, XHTML, JavaScript, jQuery, DHTML and Ajax to improve the interactive front end.
- Used EJB entity and session beans to implement business logic and session handling and transactions.
- Designed, Implemented, Tested and Deployed Enterprise Java Beans using WebLogic as Application Server.
- Designed the database tables and indexes used for the project.
- Developed stored procedures, packages and database triggers to enforce data integrity.
- Used the JDBC API with Statements and PreparedStatements to interact with the database using SQL (see the sketch after this list).
- Used SAX and DOM XML parsers for data retrieval.
- Performed data analysis and created reports for user requirements.
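A minimal sketch of the kind of JDBC PreparedStatement usage referenced above; the JDBC URL, credentials and table/column names are placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Hypothetical JDBC lookup using a PreparedStatement; connection details and
// the orders table are placeholders for illustration.
public class OrderDao {

    public String findCustomerName(int orderId) throws Exception {
        String url = "jdbc:oracle:thin:@localhost:1521:orcl";
        try (Connection conn = DriverManager.getConnection(url, "app_user", "secret");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT customer_name FROM orders WHERE order_id = ?")) {
            ps.setInt(1, orderId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("customer_name") : null;
            }
        }
    }
}
```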
Environment: Windows NT/2000/2003, XP, Windows 7/8, C, Java, JSP, Servlets, JDBC, EJB, DOM, XML, SAX