Hadoop Developer Resume
Johnston, IA
SUMMARY
- Around 7 years of professional IT experience, including three years in the Big Data ecosystem and data engineering technologies.
- Excellent understanding/knowledge of Hadoop architecture and components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Experience in designing, developing, and monitoring large clusters involving both relational and NoSQL databases.
- Experience in building data pipelines and defining data flow across large systems.
- Deep understanding of importing and exporting data between relational databases and a Hadoop cluster.
- Experience in handling data load from Flume to HDFS.
- Experience in handling data import from NoSQL solutions like MongoDB to HDFS.
- Experience in data extraction and transformation using MapReduce jobs.
- Experience in Big Data analytics using Cassandra, MapReduce, and relational databases.
- Hands on experience in installing, configuring, and using Hadoop ecosystem components like Hadoop MapReduce, HDFS, HBase, Zookeeper, Oozie, Hive, Sqoop, Pig, and Flume.
- Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java; extended Hive and Pig core functionality by writing custom UDFs (a minimal Hive UDF sketch follows this list).
- Experience in data management and implementation of Big Data applications using Hadoop frameworks.
- Experience in fast-paced Agile development environments, including Test-Driven Development (TDD) and Scrum.
- Experience in designing, developing and implementing connectivity products that allow efficient exchange of data between our core database engine and the Hadoop ecosystem.
- Experience in importing and exporting data using Sqoop from HDFS to relational database systems and vice versa.
- Extensive experience designing and developing software applications with JDK 1.6/1.5/1.4/1.3, J2EE 1.4/1.1, Java, JSP, Servlets, Web Services, JDBC, RMI, XML, JavaScript, jQuery, CSS, and SOAP.
- Responsible for designing the presentation tier (web pages) using concepts such as themes, skins, HTML, XML, CSS, JavaScript, and jQuery with AJAX.
- Good working experience with Spring Framework modules such as the Core Container, Application Context, Spring MVC, AOP, and ORM modules.
- Extensive experience writing test cases using the JUnit framework, with JProbe integration.
- Worked on IDEs such as Eclipse/MyEclipse, WSAD/RAD, and JBuilder for developing, deploying, and debugging applications.
- Good working knowledge of persisting Java objects using Hibernate, which simplifies data storage and retrieval against the underlying database.
- Implemented various application layers and services using Spring Framework (versions 2 and 3), EJB3, Struts 2, and ORM tools like Hibernate 3.0 for fast-paced, efficient development.
- Experience working with databases such as Oracle 8.x/9i/10g, MS SQL Server 2008/2005, and MySQL, and using PL/SQL to write stored procedures, functions, and triggers for different data models.
- Experience in database related work primarily creating complex stored procedures, Views, Triggers, Functions, using PL/SQL.
- Hands-on experience with the Crystal Reports reporting tool, creating reports in various formats and tuning their performance.
- Experience in configuration management, setting up company versioning policies and build schedules using ClearCase, SVN, CVS, and VSS.
- Expertise working on application servers and web servers like WebLogic 8.x/9.x/10.x and Apache Tomcat 5.x/6.x/7.x.
- Excellent written and verbal communication and analytical skills with a customer-service-oriented attitude; worked with the offshore team as onsite coordinator to provide daily updates.
- Extensive experience developing applications using Java and related technologies under Waterfall and Agile Scrum methodologies.
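As an illustration of the custom-UDF work noted above, here is a minimal Hive UDF sketch in Java. The class name and the trim-and-lowercase behavior are assumptions for illustration, not a UDF from these projects; it assumes the hive-exec library on the classpath.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF: normalizes a string column. Hive resolves evaluate() by reflection.
public final class TrimAndLower extends UDF {
    private final Text result = new Text();

    public Text evaluate(Text input) {
        if (input == null) {
            return null; // null in, null out, as Hive expects
        }
        result.set(input.toString().trim().toLowerCase());
        return result;
    }
}
```

Once packaged into a JAR, such a UDF is registered in a Hive session with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL.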
TECHNICAL SKILLS
Hadoop/Big Data: HDFS, MapReduce, Hive, Pig, ZooKeeper, Sqoop, Flume, Oozie, HBase, Cassandra
Web Technologies: Java, J2EE, J2EE Design Patterns, EJB, Struts 1.0/2.0, Hibernate, Spring 3.0, Spring MVC, Servlets, JSP, SOAP Web Services, AJAX, LDAP, XML, DOM, SAX, DTD, HTML5, DHTML, JSTL, JavaScript, CSS, Swing, AWT
Web Servers: iPlanet Web Server 4.1, Java Web Server 2.0, Apache Web Server, Ant, Tomcat 6.0, proxy server, TCP/IP, BOS (Business Object Server), Sun ONE Web Server 6.1
Cloud Technologies: Google App Engine, Amazon EC2, and OpenStack ecosystem
Languages: Java, SQL, HTML, DHTML, JavaScript, XML, C/C++.
Databases: Oracle 8.x/9i/10g Enterprise Edition, MS SQL Server 2008/2005, DB2, Informix
Tools: Rational Rose 2000, JBuilder 3.5/5.0, Visual Cafe 4.0, VisualAge for Java 3.5, Eclipse 3.x, MS Office, FrontPage, UltraEdit-32, ClearCase, iReport 1.2.5, OEP, WID, Ant, SVN, WinCVS 1.2, TOAD 5.0/8.0, Erwin, XMLSpy, Code Check, JTest, JProbe Suite 5.1 (memory debugger, profiler), SQuirreL SQL Client, Maven 1.1/2.0, MyEclipse 5.1, Canoo testing tool
Operating Systems: Unix (Sun Solaris 2.6/2.8), HP-UX, Linux 3.1, Unix shell scripting, Windows NT 4.0, Windows 95/98/2000, Fedora
Reporting Tool: Crystal Reports 9/10/2008, Tableau
Modeling Tools: Visio
PROFESSIONAL EXPERIENCE
Confidential, Johnston IA
Hadoop Developer
Responsibilities:
- Good understanding of and hands-on experience with Hadoop stack internals, Hive, Pig, and MapReduce.
- Deep understanding of schedulers, workload management, availability, scalability and distributed data platforms.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Involved in loading data from UNIX file system to HDFS.
- Wrote MapReduce jobs to discover trends in data usage by users.
- Involved in managing and reviewing Hadoop log files.
- Involved in running Hadoop streaming jobs to process terabytes of text data.
- Load and transform large sets of structured, semi structured and unstructured data.
- Wrote Pig UDFs (see the sketch after this list).
- Developed Hive queries for the analysts.
- Implemented partitioning, dynamic partitions, and buckets in Hive.
- Exported result sets from Hive to MySQL using shell scripts.
- Used ZooKeeper for various types of centralized configuration.
- Involved in maintaining various Unix Shell scripts.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Automated all jobs using Sqoop, from pulling data from sources such as MySQL to pushing the result sets into HDFS.
- Used SVN for version control.
- Helped the team expand the cluster from 25 nodes to 40 nodes.
- Maintained system integrity of all subcomponents (primarily HDFS, MapReduce, HBase, and Flume).
- Monitored system health and logs and responded to any warning or failure conditions.
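A minimal sketch of what a Pig UDF from this engagement could look like; the class name and the URL-normalization logic are illustrative assumptions, and it expects the pig library on the classpath.

```java
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical EvalFunc: normalizes a chararray URL so equivalent URLs compare equal.
public class NormalizeUrl extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        String url = ((String) input.get(0)).trim().toLowerCase();
        // Strip one trailing slash, if present.
        return url.endsWith("/") ? url.substring(0, url.length() - 1) : url;
    }
}
```

In a Pig script the JAR is loaded with REGISTER, after which the function can be called inside a FOREACH ... GENERATE.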
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Java 1.6, UNIX shell scripting
Confidential, Dover NH
Hadoop Developer
Responsibilities:
- Developed solutions to process data into HDFS, analyzed the data using MapReduce, Pig, and Hive, and produced summary results from Hadoop for downstream systems.
- Used Sqoop extensively to import data from various sources (such as MySQL) into HDFS.
- Applied Hive queries to perform data analysis on HBase using the storage handler in order to meet the business requirements.
- Created components like Hive UDFs to supply functionality missing from Hive for analytics.
- Hands-on experience with NoSQL databases like HBase and Cassandra for a POC (proof of concept) storing URLs and images.
- Developed scripts and batch jobs to schedule a bundle (a group of coordinators) consisting of various Hadoop programs using Oozie.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Worked with cloud services like Amazon web services (AWS)
- Involved in ETL, Data Integration and Migration
- Used different file formats like Text files, Sequence Files, Avro
- Provided cluster coordination services through ZooKeeper.
- Assisted in creating and maintaining technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Assisted in cluster maintenance, cluster monitoring, adding and removing cluster nodes, and troubleshooting.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing (a minimal sketch follows this list).
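A minimal sketch of a map-only cleaning job of the kind described above, using the Hadoop 1.x-era mapreduce API to match the Java 1.6 stack. The tab delimiter and expected field count are assumptions for illustration; the real jobs' validation rules are not shown here.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CleanRecordsJob {

    public static class CleanMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        private static final int EXPECTED_FIELDS = 8; // assumed schema width

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Keep well-formed rows unchanged; count and drop malformed ones.
            String[] fields = value.toString().split("\t", -1);
            if (fields.length == EXPECTED_FIELDS) {
                context.write(value, NullWritable.get());
            } else {
                context.getCounter("clean", "malformed").increment(1);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "clean-records");
        job.setJarByClass(CleanRecordsJob.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0); // map-only: cleaned rows go straight back to HDFS
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```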
Environment: HDFS, Sqoop, Flume, Linux, Oozie, Hadoop, Pig, Hive, HBase, Cassandra, Hadoop cluster, Amazon Web Services
Confidential - New York, NY
Hadoop Developer
Responsibilities:
- Worked on a Hadoop cluster that ranged from 4 to 8 nodes during the pre-production stage and was sometimes extended up to 24 nodes during production.
- Used Sqoop to import data from RDBMS into the Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop components.
- Built custom MapReduce programs to analyze data and used Pig Latin to clean unwanted data.
- Performed various performance optimizations, such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins (see the sketch after this list).
- Involved in creating Hive tables, then applied HiveQL on those tables for data validation.
- Moved the data from Hive tables into MongoDB collections.
- Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts
- Participated in requirements gathering from subject-matter experts and business partners and converted the requirements into technical specifications.
- Used ZooKeeper to manage coordination among the clusters.
- Analyzed MongoDB and compared it with other open-source NoSQL databases to find which one best suited the current requirements.
- Assisted in exporting the analyzed data to an RDBMS using Sqoop.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs that run independently based on time and data availability.
- Assisted application teams in installing Hadoop updates, operating system patches, and version upgrades when required.
- Assisted in cluster maintenance, cluster monitoring, and troubleshooting; managed and reviewed data backups and log files.
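A minimal sketch of the map-side join optimization mentioned in the list above: the small dataset is shipped to every mapper through the distributed cache and loaded into memory in setup(). The lookup file name, tab-delimited two-column format, and join key position are assumptions for illustration.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MapSideJoinMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

    private final Map<String, String> lookup = new HashMap<String, String>();

    @Override
    protected void setup(Context context) throws IOException {
        // The cached file appears in the task's working directory under its original name.
        BufferedReader reader = new BufferedReader(new FileReader("lookup.txt"));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split("\t", 2); // key<TAB>value
                if (parts.length == 2) {
                    lookup.put(parts[0], parts[1]);
                }
            }
        } finally {
            reader.close();
        }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\t", -1);
        String joined = lookup.get(fields[0]); // join on the first column
        if (joined != null) {
            // Emit the big-side row with the small-side value appended; no reducer needed.
            context.write(new Text(value + "\t" + joined), NullWritable.get());
        }
    }
}
```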
Confidential, Dallas, TX
Java Developer
Responsibilities:
- Developed the system by following the agile methodology.
- Involved in the implementation of design using vital phases of the Software development life cycle (SDLC) that includes Development, Testing, Implementation and Maintenance Support.
- Applied OOAD principles for the analysis and design of the system.
- Used WebSphere Application Server to deploy the build.
- Developed front-end screens using JSP, HTML, jQuery, JavaScript, and CSS.
- Used Spring Framework for developing business objects.
- Used Eclipse for the Development, Testing and Debugging of the application.
- Used the Log4j framework for logging debug, info, and error data.
- Used Oracle 10g Database for data persistence.
- SQL Developer was used as a database client.
- Used WinSCP to transfer files from the local system to other systems.
- Performed Test Driven Development (TDD) using JUnit.
- Used Ant script for build automation.
- Used Rational ClearQuest for defect logging and issue tracking.
- Prepared design documents using object oriented technologies.
- Involved in analyzing the requirements, drafted use cases and created UML class and sequence diagrams.
- Used the Java SQL and Swing package APIs extensively.
- Involved in developing dynamic front-end screens using Swing, JSP, JavaScript, HTML, DHTML, and CSS.
- Used Java and JDBC APIs for database access and wrote SQL queries.
- Developed and deployed Servlets and JSPs on the Tomcat web server (see the sketch after this list).
- Configured web.xml to map incoming requests to servlets.
- Developed an N-tier email message center application with features to filter and read emails, archive emails in the message center, and compose new emails.
- Worked on an interface for the accounts department to track their publishers' and advertisers' statistics and to receive and make payments to and from their clients.
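A minimal sketch of a servlet like those described above; the class name, URL parameter, and HTML output are hypothetical, not the application's actual code.

```java
import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical servlet: renders a simple account summary page.
public class AccountSummaryServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String publisher = request.getParameter("publisher"); // query parameter, may be null
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<html><body><h1>Account summary for "
                + (publisher == null ? "all publishers" : publisher)
                + "</h1></body></html>");
    }
}
```

Such a servlet is then wired up in web.xml with matching servlet and servlet-mapping entries, as the bullet on web.xml configuration describes.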
Environment: Windows XP, Unix, Java 5.0, Design Patterns, WebSphere, Apache Ant, J2EE (Servlets, JSP), HTML, JSON, JavaScript, CSS, Eclipse, SQL Developer, JUnit
Confidential
Jr. Java Developer
Responsibilities:
- Involved in requirements study, software development specification, software development, unit testing, and integration testing.
- Responsible for coding Java, J2EE, and Oracle PL/SQL packages (see the sketch below).
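A minimal sketch of invoking an Oracle package procedure over JDBC, matching the Java/JDBC/Oracle work listed above. The package and procedure names, JDBC URL, and credentials are hypothetical.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

public class PackageCaller {
    public static void main(String[] args) throws Exception {
        // Pre-JDBC-4 driver registration, in keeping with the era.
        Class.forName("oracle.jdbc.driver.OracleDriver");
        Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@dbhost:1521:ORCL", "user", "password");
        try {
            // Hypothetical procedure inside a PL/SQL package: PKG_ORDERS.GET_ORDER_COUNT
            CallableStatement call =
                    conn.prepareCall("{ call PKG_ORDERS.GET_ORDER_COUNT(?, ?) }");
            call.setString(1, "OPEN");                    // IN: order status
            call.registerOutParameter(2, Types.INTEGER);  // OUT: matching count
            call.execute();
            System.out.println("Open orders: " + call.getInt(2));
            call.close();
        } finally {
            conn.close();
        }
    }
}
```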
Environment: Java, JDBC, JSP, Servlets, and JavaScript