Senior Hadoop Developer/Architect Resume
Charlotte, NC
SUMMARY
- 8+ years of overall experience with a strong emphasis on the design, development, implementation, testing, and deployment of software applications in Hadoop, HDFS, MapReduce, the Hadoop ecosystem, ETL, and RDBMS, plus extensive development experience using Java, J2EE, JSP, and Servlets.
- Hadoop Developer with 6+ years of experience designing and implementing complete end-to-end Hadoop infrastructure using MapReduce, Pig, and Hive.
- Java programmer with 2+ years of extensive experience developing web-based and client-server applications using Java and J2EE.
- Experience installing, configuring, and testing Hadoop ecosystem components.
- Good knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
- Experience developing MapReduce programs on Hadoop for working with Big Data.
- Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
- Experience in importing and exporting data using Sqoop from Relational Database Systems to HDFS and vice-versa.
- Collected and aggregated large amounts of log data using Apache Flume and stored it in HDFS for further analysis.
- Experience with job/workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
- Experience in designing both time-driven and data-driven automated workflows using Oozie.
- Worked through the complete Software Development Life Cycle (analysis, design, development, testing, implementation, and support) using Agile methodologies.
- Experience automating Hadoop installation, configuration, and cluster maintenance using tools such as Puppet.
- Good knowledge of build tools such as Ant and Maven 2.2/3.0.
- Experience setting up monitoring infrastructure for Hadoop clusters using Nagios and Ganglia.
- Experience with Hadoop clusters on the major Hadoop distributions: Cloudera (CDH3, CDH4), Hortonworks (HDP), and MapR (M3 v3.0).
- Experience across the different layers of the Hadoop framework: storage (HDFS), analysis (Pig and Hive), and engineering (jobs and workflows).
- Experienced in using integrated development environments and editors such as Eclipse, NetBeans, Kate, and gedit.
- Migrated data from various databases (e.g., Oracle, DB2, MySQL, MongoDB) to Hadoop.
- Developed various dashboards in Tableau, using context filters and sets while dealing with large volumes of data.
- Prior experience working as a Software Developer in Java/J2EE and related technologies.
- Experience designing and coding web applications using Core Java and J2EE technologies: JSP, Servlets, and JDBC.
- Excellent knowledge of Java and SQL for application development and deployment.
- Hands-on experience creating database objects such as tables, stored procedures, functions, and triggers using SQL and PL/SQL on DB2.
- Excellent technical, communication, analytical, problem-solving, and troubleshooting skills, with the ability to work well with people from cross-cultural backgrounds.
TECHNICAL SKILLS
Big Data Ecosystems: Hadoop, MapReduce, HDFS, HBase, ZooKeeper, Hive, Pig, Sqoop, Cassandra, Oozie, Flume, Chukwa, Pentaho Kettle, Spark, Storm, Knox, Nagios, Ambari, Ranger, Hue, StackIQ, Falcon and Talend
Architecture & Frameworks: Client-Server, MVC, J2EE, Struts, Spring, Hibernate.
Programming Languages: Java, C/C++, eVB, Assembly Language (8085/8086)
Scripting Languages: JSP & Servlets, PHP, JavaScript, XML, HTML, Python, XSL, XSD, XSLT and Bash
Databases: NoSQL, Oracle, PL/SQL Developer, PostgreSQL, MS SQL Server 2000, DB2, MS Access, Cassandra.
UNIX Tools: Apache, Yum, RPM
Tools: Eclipse, JDeveloper, JProbe, CVS, Ant, MS Visual Studio, BEA WebLogic 8.1, JBoss, IBM WebSphere Application Server 6.1, JUnit 4.0, Log4j, Mercury Quality Center, Rational ClearQuest, Maven, SVN, Toad
Platforms: Windows (2000/XP), Linux, Solaris, AIX, HP-UX
Application Servers: Apache Tomcat 5.x/6.0, JBoss 4.0
IDEs: NetBeans, Eclipse, WSAD, RAD
Methodologies: Agile, UML, Design Patterns
PROFESSIONAL EXPERIENCE
Senior Hadoop Developer/Architect
Confidential - Charlotte, NC
Responsibilities:
- Loaded customer data such as service installations, technical help-line calls, and interactions from the DISH Network website into HDFS using Flume.
- Implemented a 26-node CDH4 Hadoop cluster on Red Hat Linux using Cloudera Manager.
- Imported data from Teradata and Oracle into HDFS using Sqoop.
- Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms (see the compression sketch following this section).
- Handled importing of data from various data sources, performed transformations using simple to complex MapReduce jobs, and loaded the data into HBase.
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Set up Flume agents on the application servers to capture the log data.
- Set up Flume agents on the web servers to capture the server logs.
- Used Avro sources and sinks to move the files into HDFS.
- Analyzed the top errors occurring after an integrated release using Hive and Pig.
- Used Sqoop to move the analyzed data to the MySQL database for report generation.
- Set up Amazon Web Services (AWS) to evaluate whether Hadoop was a feasible solution.
- Set up a Hadoop cluster on EC2 using Amazon EMR (Elastic MapReduce), the managed Hadoop framework.
- Used Maven extensively for building MapReduce JAR files and deployed them to Amazon Web Services (AWS) on EC2 virtual servers in the cloud.
- Used S3 buckets to store the JARs and input datasets, and DynamoDB to store the processed output.
- Set up MongoDB to store the ever-growing application configuration entries in JSON format.
- Used MongoDB as a contingency database for the existing Oracle clusters.
- Set up a MongoDB cluster with 12 shards.
- Set up a Hadoop cluster using CDH 5.0.
- Used Sqoop and mongodump to move data between MongoDB and HDFS.
Environment: CDH4/5, Cloudera Manager, MapReduce, HDFS, Hive, Pig, HBase, Flume, Java, MySQL, Sqoop, Oozie, AWS, MongoDB.
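The MapReduce compression tuning referenced above can be illustrated with a short, self-contained driver. This is a minimal sketch rather than project code: the class name, word-count logic, and input/output paths are hypothetical, and it assumes the standard Hadoop 2.x `org.apache.hadoop.mapreduce` API with the Snappy and Gzip codecs available on the cluster. Compressing the intermediate map output cuts shuffle I/O, while compressing the final output reduces HDFS storage for downstream Hive/Pig jobs.

```java
// Hypothetical sketch: a word-count job with map-output (Snappy) and
// job-output (Gzip) compression enabled via the standard Hadoop 2.x API.
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CompressedWordCount {

    public static class TokenMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Compress intermediate map output to reduce shuffle I/O.
        conf.setBoolean("mapreduce.map.output.compress", true);
        conf.setClass("mapreduce.map.output.compress.codec",
                SnappyCodec.class, CompressionCodec.class);

        Job job = Job.getInstance(conf, "compressed-word-count");
        job.setJarByClass(CompressedWordCount.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Compress the final output files written to HDFS.
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```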
Sr. Hadoop Developer
Confidential, Reston, VA
Responsibilities:
- Responsible for architecting the ingestion of structured, time-series, and unstructured data onto the on-premises data lake and into Amazon AWS S3.
- Installed and configured Talend, Falcon, and Sqoop on the Pivotal HD 2.2 Hadoop distribution.
- Developed Oozie workflows for data ingestion onto the data lake.
- Ingested structured data from RDBMS sources onto the data lake using Sqoop jobs scheduled with Oozie workflows.
- Ingested streaming (time-series) and unstructured data into the data lake using Flume from the streaming and unstructured data sources.
- Developed batch modules for an end-to-end data pipeline using Spring XD, an extensible Java framework; these batch jobs were used to ingest, egress, monitor, and audit the data.
- Used the Aspera client installed on an Amazon EC2 instance to connect to HDFS and store data in the Amazon S3 cloud.
- Developed MapReduce programs in Java to analyze data at rest on the on-premises data lake.
- Monitored the Hadoop cluster and AWS infrastructure using Pivotal Command Center (PCC) and Amazon CloudWatch.
- Ingested structured data into the data lake in ORC and Avro formats, streaming data as message bursts, and unstructured data as flat files.
- Used Solr, an enterprise search platform, for advanced search operations such as full-text search, cluster-based loading, and near-real-time indexing.
- Developed Spark programs for business use cases that needed faster data processing than standard MapReduce (see the Spark sketch following this section).
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Used HAWQ for advanced querying of the data stored on the on-premises data lake.
- Used GemFire XD to analyze real-time streaming data from the data sources.
- Used Solr to navigate datasets in HDFS storage.
- Secured the Hadoop cluster by integrating it with the Kerberos protocol and Apache Knox.
- Created ACLs and user groups to control access to the data based on its security level.
- Secured data at rest on AWS using an HSM in which all encryption keys were stored.
- Used Falcon to automate the end-to-end orchestration of data ingestion from the data sources to the data lake and into Amazon S3 cloud storage.
- Developed MapReduce jobs in Java and Python, along with Hive and Pig scripts, for querying the data.
Environment: Hadoop, MapReduce, HDFS, Hive, Spark, Pig, HAWQ, GemFire XD, Spring XD, Falcon, Java (JDK 1.6), SQL, Cloudera Manager, Sqoop, Storm, Solr, Mahout, Flume, Oozie, Eclipse
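The Spark work mentioned above can be sketched with the RDD-based Java API of that era, using anonymous Function classes rather than Java 8 lambdas since the environment lists JDK 1.6. The class name, log format, and error-keying logic are hypothetical; the sketch only illustrates the filter/mapToPair/reduceByKey pattern that typically replaced multi-stage MapReduce jobs.

```java
// Hypothetical sketch: count error lines per error code from log files on HDFS
// using the Spark Java RDD API (input path = args[0], output path = args[1]).
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
import scala.Tuple2;

public class ErrorCountJob {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("error-count");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Read raw log lines from HDFS.
        JavaRDD<String> lines = sc.textFile(args[0]);

        // Keep only lines flagged as errors.
        JavaRDD<String> errors = lines.filter(new Function<String, Boolean>() {
            public Boolean call(String line) {
                return line.contains("ERROR");
            }
        });

        // Key each error line by its first token and count occurrences.
        JavaPairRDD<String, Integer> counts = errors
                .mapToPair(new PairFunction<String, String, Integer>() {
                    public Tuple2<String, Integer> call(String line) {
                        return new Tuple2<String, Integer>(line.split(" ")[0], 1);
                    }
                })
                .reduceByKey(new Function2<Integer, Integer, Integer>() {
                    public Integer call(Integer a, Integer b) {
                        return a + b;
                    }
                });

        // Write the aggregated counts back to HDFS.
        counts.saveAsTextFile(args[1]);
        sc.stop();
    }
}
```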
Hadoop Developer
Confidential, Hoffman Estates, IL
Responsibilities:
- Involved in the end-to-end process of Hadoop cluster installation, configuration, and monitoring.
- Responsible for building scalable distributed data solutions using Hadoop.
- Installed and configured Hive, Pig, Sqoop, and Oozie on the Hadoop cluster.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Oracle into HDFS using Sqoop.
- Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms.
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
- Installed Oozie workflow engine to run multiple Hive and Pig jobs.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Responsible for managing and reviewing Hadoop log files.
- Designed and developed data management system using MySQL.
- Developed entire frontend and backend modules using Python on the Django web framework.
- Wrote Python scripts to parse XML documents and load the data into the database.
- Developed dynamic web pages using HTML, CSS, and JavaScript.
- Worked with CSV files when pulling input from the MySQL database.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Wrote scripts to automate application deployments and configurations; performed Hadoop cluster performance tuning and monitoring.
- Worked on NoSQL databases including HBase and Elasticsearch (see the HBase client sketch following this section).
- Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.
Environment: HDFS, Hive, Pig, UNIX, SQL, Java MapReduce, Hadoop cluster, HBase, Sqoop, Oozie, Linux, Cloudera Hadoop Distribution, Python, MySQL, Git, RSpec, Elasticsearch
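A minimal sketch of the HBase client usage mentioned above, using the classic `HTable`/`Put`/`Get` API that shipped with CDH-era HBase (0.94/0.98). The table name, row key scheme, and column family are hypothetical; it only shows the basic write/read round trip.

```java
// Hypothetical sketch: write one event row to HBase and read it back
// with the classic (pre-1.0) HBase client API.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class CustomerEventStore {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "customer_events"); // hypothetical table
        try {
            // Write one event row keyed by customerId + timestamp.
            Put put = new Put(Bytes.toBytes("cust42#20140101T1200"));
            put.add(Bytes.toBytes("d"), Bytes.toBytes("event"), Bytes.toBytes("install"));
            table.put(put);

            // Read the same row back and print the stored value.
            Get get = new Get(Bytes.toBytes("cust42#20140101T1200"));
            Result result = table.get(get);
            byte[] value = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("event"));
            System.out.println("event = " + Bytes.toString(value));
        } finally {
            table.close();
        }
    }
}
```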
Java/Hadoop Developer
Confidential, Richmond, VA
Responsibilities:
- Analyzed business requirements and identified the mapping documents required for system and functional testing efforts across all test scenarios.
- Used HTML, DHTML, JavaScript, AJAX, Ext JS, jQuery, JSP, and tag libraries to develop view pages.
- Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
- Prepared the design (TSD) document with sequence diagrams and class diagrams using Microsoft Visio.
- Followed Agile methodology and participated in Scrum meetings.
- Responsible for upgrading the Crash applications to the latest Java version.
- Developed SOAP based web services using JAX-WS for UMM application and used SOAP UI for testing.
- Involved in design, development and enhancement of the applications using agile methodologies with a test driven approach.
- Created REST-based web services using JAX-RS (see the resource sketch following this section).
- Implemented the DBCRs by developing PL/SQL scripts and stored procedures.
- Implemented reports for various screens in the application using JasperReports (iReport).
- Developed payment flow using AJAX partial page refresh, validation and dynamic drop down list.
- Implemented web services to send order details to downstream systems using RESTful and SOAP interfaces.
- Expertise in Object-Oriented Analysis and Design (OOAD) concepts and various J2EE design patterns, with excellent logical and analytical skills.
- Extensive design framework experience using MVC, Struts, Spring, Ajax, and Hibernate.
- Extensively used JPA for Object Relational Mapping for data persistence.
- Used Hibernate for Object-Relational Mapping and for database operations in Oracle database.
- Used JUnit for testing the application, and Ant and Maven for building projects.
- Involved in configuring JMS and JNDI in Rational Application Developer (RAD).
- Used JProbe, JMeter for performance testing.
Environment: Java 1.6, J2EE, Servlets 2.4, EJB 2.0, JDBC 2.0, JAXB, Spring (IoC/DI), AOP, MVC, JSF components, REST API, DAO, HTML, JavaScript, XML, CSS, Ajax, Ext JS, WebSphere Application Server 8.0, Oracle 10g, Log4j, Eclipse 3.1, CVS, Dojo, Ant 1.5, SOA, Mule ESB, SOAP, DB2, PL/SQL, SQL, Web Services (WSDL, SOAP, UDDI), SOAP UI, Windows XP, Hadoop, HDFS, Hive.
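The JAX-RS work mentioned above can be illustrated with a small resource class. This is a hypothetical sketch: the `/orders` path, fields, and hard-coded payload are illustrative only, and it assumes a JAX-RS runtime such as Jersey is already wired into the web application.

```java
// Hypothetical sketch: a JAX-RS resource exposing GET /orders/{id} as JSON.
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/orders")
public class OrderResource {

    // GET /orders/{id} -> returns order details as JSON.
    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public Response getOrder(@PathParam("id") String id) {
        // A real implementation would delegate to the service/DAO layer;
        // a hard-coded payload keeps this sketch self-contained.
        String json = "{\"orderId\":\"" + id + "\",\"status\":\"SHIPPED\"}";
        return Response.ok(json).build();
    }
}
```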
Java/J2EE Developer
Confidential, Raleigh, NC
Responsibilities:
- Worked with business users to determine requirements and technical solutions.
- Followed Agile methodology (Scrum Standups, Sprint Planning, Sprint Review, Sprint Showcase and Sprint Retrospective meetings).
- Developed business components using core Java concepts and classes such as inheritance, polymorphism, collections, serialization, and multithreading.
- Used the Spring framework to handle application logic and make calls to business components configured as Spring beans.
- Implemented and configured data sources and the session factory, and used HibernateTemplate to integrate Spring with Hibernate (see the DAO sketch following this section).
- Developed web services to allow communication between applications through SOAP over HTTP with JMS and Mule ESB.
- Actively involved in coding using Core Java and collection APIs such as Lists, Sets, and Maps.
- Developed a Web Service (SOAP, WSDL) that is shared between front end and cable bill review system.
- Implemented REST-based web services using JAX-RS annotations and the Jersey implementation for data retrieval with JSON.
- Developed Maven scripts to build and deploy the application onto WebLogic Application Server, ran UNIX shell scripts, and implemented an automated deployment process.
- Used Maven as the build tool, with builds scheduled and triggered by Jenkins.
- Developed JUnit test cases for application unit testing.
- Implemented Hibernate for data persistence and management.
- Used SOAP UI tool for testing web services connectivity.
- Used SVN as version control to check in the code, created branches and tagged the code in SVN.
- Used RESTful services to interact with the client by providing the RESTful URL mappings.
- Used the Log4j framework for application logging, tracking, and debugging.
Environment: JDK 1.6, Eclipse IDE, Core Java, J2EE, Spring, Hibernate, UNIX, Web Services, SOAP UI, Maven, WebLogic Application Server, SQL Developer, Camel, JUnit, SVN, Agile, Sonar, Log4j, REST, JSON, jBPM.
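A minimal sketch of the Spring-Hibernate integration mentioned above, built around `HibernateTemplate` from Spring's `orm.hibernate3` module. The `Customer` entity, query, and constructor wiring are hypothetical; in the real application the `SessionFactory` would come from a `LocalSessionFactoryBean` defined in the Spring context and the entity would be mapped via annotations or an hbm.xml file.

```java
// Hypothetical sketch: a DAO using Spring's HibernateTemplate for data access.
import java.io.Serializable;
import java.util.List;

import org.hibernate.SessionFactory;
import org.springframework.orm.hibernate3.HibernateTemplate;

public class CustomerDao {

    private final HibernateTemplate hibernateTemplate;

    // The SessionFactory is injected by Spring (e.g. from a LocalSessionFactoryBean).
    public CustomerDao(SessionFactory sessionFactory) {
        this.hibernateTemplate = new HibernateTemplate(sessionFactory);
    }

    public Serializable save(Customer customer) {
        return hibernateTemplate.save(customer);
    }

    public Customer findById(Long id) {
        return hibernateTemplate.get(Customer.class, id);
    }

    @SuppressWarnings("unchecked")
    public List<Customer> findByLastName(String lastName) {
        return (List<Customer>) hibernateTemplate.find(
                "from Customer c where c.lastName = ?", lastName);
    }

    // Hypothetical mapped entity, included only to keep the sketch self-contained.
    public static class Customer {
        private Long id;
        private String lastName;
        public Long getId() { return id; }
        public void setId(Long id) { this.id = id; }
        public String getLastName() { return lastName; }
        public void setLastName(String lastName) { this.lastName = lastName; }
    }
}
```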
Java Developer
Confidential
Responsibilities:
- Involved in all the phases of SDLC including Requirements Collection, Design & Analysis of the Customer Specifications, Development and Customization of the Application.
- Developed JSP, JSF and Servlets to dynamically generate HTML and display the data to the client side. Extensively used JSP tag libraries.
- Developed the application using Eclipse IDE.
- Used Spring Security extensively for authentication and authorization.
- Designed and developed Application based on Struts Framework using MVC design pattern.
- Used Struts Validator framework for client side validations.
- Used Spring Core for dependency injection / Inversion of Control (IoC).
- Used the Hibernate framework for persistence to the Oracle database.
- Wrote and debugged the Ant scripts for building the entire web application.
- Used XML to transfer the application data between client and server.
- Developed XSLT stylesheets for the XML data transformations.
- Participated in designing the web service framework in support of the product.
- Developed web services in Java using SOAP and WSDL.
- Used Log4j for logging errors.
- Used Maven as the build tool.
- Used Spring Batch for scheduling and maintenance of batch jobs.
- Deployed the application to various environments: DEV, QA, and Production.
- Used Data Access Objects (DAO) to abstract and encapsulate all access to the data source (see the JDBC DAO sketch following this section).
- Used JDBC for data retrieval from the database for various inquiries.
- Developed MQ application programs for Java JMS environments using queues and messages; handled and managed JMS exception conditions.
- Performed cleanup of the application database entries using Oracle 10g.
- Used CVS as source control.
- Created Application Property Files and implemented internationalization.
- Used JUnit to write repeatable tests mainly for unit testing.
- Worked within the Agile development methodology throughout and tested the application in each iteration.
- Wrote complex SQL and HQL queries to retrieve data from the Oracle database.
- Involved in end-to-end development, integrating the front end and back end and debugging across both.
- Performed defect tracking using HP Quality Center.
- Involved in fixing System testing issues and UAT issues.
Environment: Java, J2EE, JSP, JSF, Servlets, Struts 2.0, Spring 2.0, JDBC 3.0, Spring Security, Web Services, XML, JNDI, Hibernate 3.0, JMS, WebSphere Application Server 8.1, Eclipse, Oracle 10g, WinCVS 1.2, HTML, Rational Rose XDE, Spring Batch, Maven, JUnit 4.0, Log4j, jQuery 2.0, XML/XSLT, SAX, DOM.
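A minimal sketch of the DAO pattern with plain JDBC described above. The table, columns, and `DataSource` wiring are hypothetical; the point is that all database access for a domain concept is encapsulated behind one class, with statements and result sets closed deterministically.

```java
// Hypothetical sketch: a JDBC-backed DAO encapsulating account lookups.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

import javax.sql.DataSource;

public class AccountDao {

    private final DataSource dataSource;

    // The DataSource would normally be obtained via JNDI or injected by Spring.
    public AccountDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public List<String> findAccountNumbersByOwner(String ownerName) throws SQLException {
        String sql = "SELECT account_no FROM accounts WHERE owner_name = ?";
        List<String> accountNumbers = new ArrayList<String>();
        Connection conn = dataSource.getConnection();
        try {
            PreparedStatement ps = conn.prepareStatement(sql);
            try {
                ps.setString(1, ownerName);
                ResultSet rs = ps.executeQuery();
                try {
                    while (rs.next()) {
                        accountNumbers.add(rs.getString("account_no"));
                    }
                } finally {
                    rs.close();
                }
            } finally {
                ps.close();
            }
        } finally {
            conn.close();
        }
        return accountNumbers;
    }
}
```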