Sr. Hadoop Developer Resume Profile
Boston, MA
SUMMARY
- Over 7 years of professional IT experience, including 3+ years with Big Data ecosystem technologies such as Hadoop, Pig, Hive, Sqoop, HBase, and Cassandra
- Experience developing MapReduce jobs to process large data sets
- Working experience designing, building, and configuring large-scale Hadoop environments
- Excellent understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as Hadoop MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Falcon, Pig, Storm, Kafka, ZooKeeper, YARN, and Lucene
- Experienced in monitoring Hadoop cluster environments using Ganglia
- Very good experience across the complete project life cycle (design, development, testing, and implementation) of client-server and web applications
- Involved in generating Falcon and Oozie YAML files through automated Ruby scripts
- Experience in Object-Oriented Analysis and Design (OOAD) and software development using UML; good knowledge of J2EE and core Java design patterns
- Experienced in Software Quality Assurance (SQA), including manual and automated testing with tools such as Selenium RC/IDE/WebDriver/Grid, JUnit, LoadRunner, JProfiler, and Rational Functional Tester (RFT)
- Proficient in deploying applications on J2EE application servers such as WebSphere, WebLogic, GlassFish, Tuxedo, and JBoss, and on the Apache Tomcat web server
- Expertise in developing applications using J2EE architectures and frameworks such as Struts, the Spring Framework, and the SDP (Qwest Communications) framework
- Excellent Java development skills using J2EE, J2SE, Servlets, JSP, EJB, JDBC
- Experience with NoSQL data stores: HBase, Accumulo, Cassandra, and MongoDB
- Implemented POCs using Amazon cloud components: S3, EC2, Elastic Beanstalk, and SimpleDB
- Experience in database design using PL/SQL to write stored procedures, functions, and triggers; strong experience writing complex queries for Oracle 8i/9i/10g
- Extensive experience with multiple languages/technologies (Java/J2EE, C/C++, Perl, PHP, Ruby) and environments (JVM, Linux, Mac, various UNIX flavors, Windows)
- Ability to adapt to evolving technology, strong sense of responsibility and accomplishment
- Ability to meet deadlines and handle multiple tasks, flexible in work schedules and possess good communication skills
TECHNICAL SKILLS
- Big Data Technologies: Hadoop (HDFS, MapReduce), Pig, Hive, HBase, Mahout, Falcon, Oozie, Accumulo, ZooKeeper, YARN, Lucene
- J2EE Technologies: JSP, JavaBeans, Servlets, JDBC, JPA 1.0, EJB 3.0, JNDI, JOLT, Amazon Cloud (S3, EC2, Elastic Beanstalk, RDS)
- Languages: C, C++, PL/SQL, Java
- Frameworks: Struts 1.x, Spring 3.x
- Web Technologies: XHTML, HTML, JavaScript, AngularJS, AJAX, XML, XSLT, XPath, CSS, DOM, WSDL, SOA, Web Services, GWT, jQuery, Perl, VBScript
- Application Servers: WebLogic 8.1/9.1/10.x, WebSphere 5.x/6.x/7.x, Tuxedo Server 7.x/9.x, GlassFish Server 2.x, JBoss 4.x/5.x
- Web Servers: Apache Tomcat 4.0/5.5, Java Web Server 2.0
- Operating Systems: Windows XP/NT/2000/9x, MS-DOS, UNIX, Linux, Solaris, AIX
- Databases: SQL, PL/SQL, Oracle 9i/10g, MySQL, Microsoft Access, SQL Server, HBase, Cassandra
- IDEs: Eclipse 3.x, MyEclipse 8.x, RAD 7.x, JDeveloper 10.x
- Tools: Adobe, SQL Developer, Flume, Sqoop, Storm
- Version Control: WinCVS, VSS, PVCS, Subversion, Git
EXPERIENCE
Confidential
Role: Sr. Hadoop Developer
Description: Confidential is the second largest pharmacy in its market and the seventh largest in the world. Its scientists produce huge volumes of research data, and the project builds a search engine over that data. The search engine connects to different databases, indexes the data through Hadoop tools, and displays on a GUI the relevant data that scientists need.
Responsibilities:
- Installed and configured Apache Hadoop, Hive and Pig environment on AWS
- Used Sqoop to connect to SQL Server and move the pivoted data into Hive tables stored as Avro files
- Configured MySQL database to store Hive metadata
- Wrote custom Hive UDFs as well as custom input and output formats (illustrative sketch below)
- Scheduled Hive jobs using Oozie and Falcon process files
- Involved in the design and architecture of a custom Lucene storage handler
- Configured and maintained different topologies in the Storm cluster and deployed them on a regular basis
- Implemented a proof-of-concept Spark cluster on AWS
- Developed an understanding of the Ruby scripts used to generate YAML files
- Maintained the test mini-cluster using Vagrant and VMware Fusion
- Involved in GUI development using JavaScript, AngularJS, and Guice
- Developed unit test cases using the JMockit framework and automated the scripts
- Worked in an Agile environment that used Jira to track story points and followed the Kanban model
- Involved in brainstorming JAD sessions to design the GUI
- Maintained builds in Bamboo and resolved Bamboo build failures
Environment: Hadoop, Big Data, Hive, HBase, Sqoop, Accumulo, Oozie, Falcon, HDFS, MapReduce, Jira, Bitbucket, Maven, Bamboo, J2EE, Guice, AngularJS, JMockit, Lucene, Storm, Ruby, UNIX, SQL, AWS (Amazon Web Services)
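As an illustration of the custom Hive UDF work listed above, here is a minimal sketch of an old-style (org.apache.hadoop.hive.ql.exec.UDF) function; the package name, class name, and normalization logic are assumptions made for the example, not project code:

```java
package com.example.hive.udf; // illustrative package name

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/**
 * Minimal custom Hive UDF sketch: normalizes an identifier column
 * (trims whitespace, upper-cases) so rows pulled from different source
 * databases can be joined and indexed consistently.
 */
public class NormalizeIdUDF extends UDF {

    // Hive calls evaluate() once per row; returning null propagates NULLs.
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}
```

Packaged into a JAR, a UDF like this is registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before it can be used in queries.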
Confidential
Role: Hadoop Developer
Description: The goal of the project is to offer reward points to loyal customers of Sears Holdings by tracking customers' spending habits and using that data to award points that can be redeemed for merchandise at Sears. The reward programs essentially aim at retaining loyal customers by offering more discounts, thus generating more revenue for the company.
Responsibilities:
- Involved in the design and development phases of the Software Development Life Cycle (SDLC) using the Scrum methodology
- Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer profiles and purchase histories into HDFS for analysis (a MapReduce sketch follows this section)
- Developed job flows in Oozie to automate the workflow for extracting data from warehouses
- Used Pig as an ETL tool to perform transformations, event joins, bot-traffic filtering, and some pre-aggregations before storing the data on HDFS
Environment: JDK1.6, RHEL, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Mahout, HBase
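A minimal sketch of the Java MapReduce portion of such a pipeline, assuming tab-separated purchase records with a customer id and an amount already landed in HDFS; the field layout and class names are illustrative assumptions:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class PurchaseTotals {

    // Mapper: emits (customerId, purchaseAmount) for each input record.
    public static class PurchaseMapper
            extends Mapper<LongWritable, Text, Text, DoubleWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length < 2) {
                return; // skip malformed lines
            }
            try {
                context.write(new Text(fields[0]),
                        new DoubleWritable(Double.parseDouble(fields[1])));
            } catch (NumberFormatException e) {
                // skip records with a non-numeric amount
            }
        }
    }

    // Reducer: sums purchase amounts per customer.
    public static class TotalReducer
            extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
        @Override
        protected void reduce(Text key, Iterable<DoubleWritable> values, Context context)
                throws IOException, InterruptedException {
            double total = 0.0;
            for (DoubleWritable v : values) {
                total += v.get();
            }
            context.write(key, new DoubleWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "purchase-totals");
        job.setJarByClass(PurchaseTotals.class);
        job.setMapperClass(PurchaseMapper.class);
        job.setReducerClass(TotalReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(DoubleWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```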
Confidential
Project: Cloud Platform Group
Role: Hadoop Developer
Description: The project aims at building a scalable pipeline to store and process user data from mobile phones and tablets in order to build intelligent user interfaces and applications. The pipeline is built on the Hadoop stack and deployed on a 100-node cluster. Services such as user profiles and taste profiles are built on top of this platform.
Responsibilities:
- Built a data-flow pipeline using Flume, Java MapReduce, and Pig
- Used Flume to capture streaming mobile sensor data and load it into HDFS
- Used Java MapReduce and Pig scripts to process the data and store it on HDFS
- Used Hive scripts to compute aggregates and store them in HBase for low-latency applications (an HBase client sketch follows this section)
- Analyzed the Cassandra database and compared it with other open-source NoSQL databases to determine which best suited the current requirement
- Integrated Cassandra as a distributed, persistent metadata store to provide metadata resolution for network entities
- Used Oozie to orchestrate the scheduling of MapReduce jobs and Pig scripts
- Involved in the design and development phases of the Software Development Life Cycle (SDLC) using the Scrum methodology
Environment: JDK1.6, Red Hat Linux, Big Data, Hive, Pig, Sqoop, Flume, Zookeeper, Oozie, DB2, HBase and Cassandra
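To illustrate how a low-latency application could read the aggregates that the Hive jobs write into HBase, here is a small client-side sketch using the era-appropriate HTable API; the table name, column family, and qualifier are assumptions, not project names:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class UserAggregateReader {

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // picks up hbase-site.xml

        // Row key is the user id; the "stats" family holds pre-computed aggregates.
        HTable table = new HTable(conf, "user_aggregates");
        try {
            Get get = new Get(Bytes.toBytes(args[0]));
            get.addColumn(Bytes.toBytes("stats"), Bytes.toBytes("daily_events"));

            Result result = table.get(get);
            byte[] value = result.getValue(Bytes.toBytes("stats"),
                                           Bytes.toBytes("daily_events"));
            if (value != null) {
                System.out.println("daily_events = " + Bytes.toString(value));
            }
        } finally {
            table.close();
        }
    }
}
```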
Confidential
Role: Java Developer
Responsibilities:
- Responsible for the design and development of the framework. The system is designed using J2EE technologies based on MVC architecture
- Developed Session Beans using J2EE Design Patterns
- Implemented J2EE design patterns such as Data Access Object and Business Object, and core Java design patterns such as Singleton (a Singleton sketch follows this section)
- Extensively used MQSeries
- Made extensive use of the Struts framework
- Used JSP, Servlets, and EJBs on the server side
- Implemented the Home interface, Remote interface, and Bean implementation class
- Implemented server-side business logic using session beans
- Wrote PL/SQL queries to access data from the Oracle database
Environment: Java 1.4, Struts, JSP, Servlets API, HTML, JDBC, WebSphere 5.1, MQSeries, MS SQL Server, XSLT, XML, EJB, EditPlus, JUnit, CSS, JMS, Hibernate, Eclipse, and WSAD
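A minimal sketch of the Singleton pattern referenced in the bullets above, applied to a hypothetical shared configuration holder; the class name and property are illustrative assumptions:

```java
import java.util.Properties;

// Eager-initialization Singleton: the JVM guarantees thread-safe class
// initialization, so no explicit locking is required for the instance.
public final class AppConfig {

    private static final AppConfig INSTANCE = new AppConfig();

    private final Properties properties = new Properties();

    // Private constructor prevents instantiation from outside the class.
    private AppConfig() {
        properties.setProperty("service.timeout.ms", "5000");
    }

    public static AppConfig getInstance() {
        return INSTANCE;
    }

    public String get(String key) {
        return properties.getProperty(key);
    }
}
```

Callers obtain the single shared instance via AppConfig.getInstance() rather than constructing their own copies.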
Confidential
Role: Java/J2EE Developer
Responsibilities:
- Used the Hibernate ORM tool as the persistence layer, using database and configuration data to provide persistence services and persistent objects to the application
- Responsible for developing the DAO layer using Spring MVC and Hibernate configuration XMLs, and for managing CRUD operations (insert, update, and delete); a DAO sketch follows this section
- Implemented dependency injection using the Spring framework
- Developed reusable services using BPEL to transfer data
- Created JUnit test cases and developed JUnit classes
- Configured log4j to enable/disable logging in the application
- Developed a rich user interface using HTML, JSP, AJAX, JSTL, JavaScript, jQuery, and CSS
- Implemented PL/SQL queries and procedures to perform database operations
- Wrote UNIX shell scripts and used the UNIX environment to deploy the EAR and read the logs
- Implemented Log4j for logging purposes in the application
Environment: Java, Jest, SOA Suite 10g (BPEL), Struts, Spring, Hibernate, Web Services (JAX-WS), JMS, EJB, WebLogic 10.1 Server, JDeveloper, SQL Developer, HTML, LDAP, Maven, XML, CSS, JavaScript, JSON, SQL, PL/SQL, Oracle, JUnit, CVS, and UNIX/Linux
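A hedged sketch of the kind of DAO described above, shown here with Spring annotations for brevity even though the project configured Hibernate through XML; the Customer entity, bean name, and query are illustrative assumptions:

```java
import java.util.List;

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

import org.hibernate.SessionFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Repository;
import org.springframework.transaction.annotation.Transactional;

// DAO that delegates CRUD calls to an injected Hibernate SessionFactory;
// assumes a HibernateTransactionManager backs the @Transactional annotations.
@Repository
public class CustomerDao {

    private final SessionFactory sessionFactory;

    @Autowired
    public CustomerDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    @Transactional
    public void save(Customer customer) {
        sessionFactory.getCurrentSession().save(customer);
    }

    @Transactional
    public void delete(Customer customer) {
        sessionFactory.getCurrentSession().delete(customer);
    }

    @Transactional(readOnly = true)
    @SuppressWarnings("unchecked")
    public List<Customer> findAll() {
        return sessionFactory.getCurrentSession()
                .createQuery("from Customer")
                .list();
    }
}

// Minimal mapped entity used by the DAO above (illustrative).
@Entity
class Customer {

    @Id
    @GeneratedValue
    private Long id;

    private String name;

    public Long getId() { return id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}
```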
Confidential
Role: SQL Server Developer
Responsibilities:
- Created new database objects such as procedures, functions, packages, triggers, indexes, and views using T-SQL in the development and production SQL Server environments
- Developed database triggers to enforce data integrity and additional referential integrity
- Developed SQL queries to fetch complex data from tables in remote databases using joins and database links, formatted the results into reports, and kept logs
- Involved in performance tuning and monitoring of both T-SQL and PL/SQL blocks
- Wrote T-SQL procedures to generate DML scripts that modified database objects dynamically based on user input
Environment: SQL Server 7.0, Oracle 8i, Windows NT, C++, HTML, T-SQL, PL/SQL, SQL*Loader