Senior Hadoop Developer Resume
Richmond, TX
SUMMARY:
- 7+ years of professional IT experience in analysing requirements and in designing and testing highly distributed, mission-critical applications.
- Strong working experience in Big Data and Hadoop ecosystems.
- Expertise in using various tools in the Hadoop ecosystem, including MapReduce, HBase, Hive, Pig, Oozie, Sqoop, Flume, Spark, Storm, Kafka, and ZooKeeper.
- Good knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, YARN, and MapReduce programs.
- Proficient in using HDFS commands, writing MapReduce jobs, managing and reviewing Hadoop log files using Hadoop Log Analyzer.
- Good understanding of security requirements for Hadoop clusters and integration with the Kerberos Key Distribution Centre.
- Strong skills in querying using Hive, Pig, HBase, Spark SQL, and MongoDB.
- Extensive experience working with Oracle, MySQL, and Microsoft SQL Server databases.
- Good knowledge of Amazon Web Services such as Amazon EC2, IAM, EMR, S3 Storage, Redshift, DynamoDB, Aurora, and AWS Security Compliance Programs.
- Experienced in constructing Apache Storm and Apache Kafka pipelines and implementing controlled data flow procedures using workflow tools like Oozie.
- Hands-on experience in designing ETL operations, including data extraction, data cleansing, data transformation, and data loading.
- Good knowledge of developing data ingestion processes using Flume agents and Spark Streaming for real-time and near-real-time data analysis.
- Experience in developing workflows using Flume agents with multiple sources, such as web server logs and REST APIs, and multiple sinks, such as the HDFS sink and the Kafka sink.
- Experience in implementing a unified data platform to collect data from different data sources using Apache Kafka brokers and various producers and consumers.
- Proficient in Java, Scala, and Python.
- Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
- Experience in working on various operating systems like UNIX, Linux, and Windows.
- Excellent understanding of the Software Development Life Cycle (SDLC) and strong knowledge of project implementation methodologies, including Waterfall and Agile.
- Worked in an Agile environment with active Scrum participation.
- Strong communication and interpersonal skills. Quick to master new concepts and able to adapt to different project environments and applications with minimal supervision.
TECHNICAL SKILLS:
Programming Languages: Core Java, Scala, Python, C, C++, Unix Shell Scripting
Big Data Technologies: Hadoop and MapReduce, HDFS, YARN, Pig, Hive, Sqoop, Oozie, HBase, Spark, Flume, Kafka, ZooKeeper
Databases: MySQL, Oracle, Microsoft SQL Server, Microsoft Azure SQL, PostgreSQL
NoSQL Databases: MongoDB, HBase, DynamoDB, Oracle NoSQL Database
Operating Systems: Windows, Linux, macOS
Methodologies: Agile, Waterfall, Scrum
Network Protocols: TCP/IP, HTTP, HTTPS, UDP, DNS
Integration Tools: Jenkins and Hudson
Build Tools: Ant, Maven, Gradle
Version Control: SVN, TortoiseGit, GitHub, TFS
IDEs: Eclipse, NetBeans, IntelliJ IDEA, Notepad++, Visual Studio
Cloud Services: Amazon Web Services, Microsoft Azure
PROFESSIONAL EXPERIENCE:
Senior Hadoop Developer
Confidential, Richmond, TX
Responsibilities:
- Handled SDLC requirements gathering, analysis, design, development, and testing of the application using Agile and Scrum methodologies.
- Developed a detailed understanding of the existing build system and the related tools that provide information on products, releases, and test results.
- Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, and Apache Pig.
- Consumed web services for transferring data between different applications using RESTful APIs.
- Built a mechanism for Talend that automatically moves the existing proprietary binary-format data files to HDFS using a service called the Ingestion service.
- Implemented a prototype to integrate PDF documents into a web application using GitHub.
- Actively participated in process improvement, normalization/de-normalization, data extraction, data cleansing, and data manipulation within the Scrum process.
- Performed data transformations in Scala and Hive, using partitions and buckets to improve performance.
- Wrote custom InputFormat and RecordReader classes for reading and processing the binary format in MapReduce.
- Used the Mockito framework for mocking dependencies in unit tests.
- Involved in Test Driven Development (TDD) and Acceptance Test Driven Development (ATDD).
- Managed and deployed Amazon Web Services Elastic MapReduce (AWS EMR) clusters.
- Built cloud-native applications using Amazon Web Services, specifically Elastic MapReduce (EMR), Lambda, DynamoDB, and Elastic Beanstalk.
- Managed data schema versions across various microservices.
- Developed and tested the enterprise application with JUnit.
- Wrote custom Writable classes for Hadoop serialization and deserialization of time-series tuples.
- Implemented a custom file loader for Pig to query large data files, such as build logs, directly.
- Used Python for pattern matching in build logs to format errors and warnings.
- Developed Pig Latin scripts & Shell scripts for validating the different query modes in Historian.
- Created Hive external tables on the MapReduce output, then applied partitioning and bucketing on top of them.
- Improved performance by tuning Scala, Hive, and MapReduce jobs, working with Talend, ActiveMQ, and JBoss.
- Developed a daily test engine using Python for continuous tests.
- Developed rich interactive visualizations integrating various reporting components from multiple data sources.
- Used Shell scripting for Jenkins job automation with Talend.
- Built a custom calculation engine that can be programmed according to user needs.
- Ingested data into Hadoop using shell scripting and Sqoop, and applied data transformations using Pig and Hive.
- Handled performance improvements to the Pre-Ingestion service, which generates the Big Data Format binary files from the older version of Historian.
- Worked with support teams and resolved operational and performance issues.
Environment: Apache Hadoop, Hive, Scala, Pig, HDFS, Cloudera, Java MapReduce, Python, Maven, Git, Jenkins, UNIX, MySQL, Eclipse, Oozie, Sqoop, Flume, Oracle, JDK 1.8/1.7, Agile and Scrum Development Process, NoSQL, JBoss, Flink, JavaScript, and Mockito
Senior Hadoop Developer
Confidential, Boca Raton, FL
Responsibilities:
- Involved in all phases of development activities, from requirements collection to production support.
- Developed a detailed understanding of the current system and identified the different sources of data for EMR.
- Involved in cluster setup.
- Performed batch processing of logs from various data sources using MapReduce.
- Submitted automated jobs on Cloudera via Jenkins scripts.
- Contributed to predictive analytics to monitor inventory levels and ensure product availability.
- Analysed customers' purchasing behaviours in JMS.
- Worked on responses to value-added services based on clients' profiles and purchasing habits.
- Worked on gathering and refining requirements, interviewing business users to understand and document data requirements including elements, entities and relationships, in addition to visualization and report specifications.
- Defined UDFs using Pig and Hive to capture customer behaviour.
- Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, Spark SQL, Apache Pig, and Oozie.
- Integrated Apache Kafka for data ingestion.
- Created external tables using Scala and Hive on the MapReduce output before partitioning and bucketing.
- Maintained data and imported scripts using HBase, Hive, and MapReduce jobs.
- Developed and maintained several batch jobs that run automatically depending on business requirements.
- Imported and exported data between environments such as MySQL and HDFS.
- Performed unit testing and deployment for internal use, and monitored the performance of the solution.
Environment: Apache Hadoop, Cloudera Distribution, RHEL, Hive, HBase, Pig, HDFS, Java MapReduce, Core Java, Python, Maven, Git, Jenkins, UNIX, MySQL, Eclipse, Oozie, Sqoop, Flume, Oracle, and Teradata.
Hadoop Developer
Confidential, Winston-Salem, NC
Responsibilities:
- Involved in system analysis and design, as well as object-oriented design and development, using OOAD methodology to capture and model business requirements.
- Worked with several clients on day-to-day requests and responsibilities.
- Installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
- Analysed system failures, identified root causes, and recommended courses of action.
- Worked on Hive for exposing data for further analysis and for generating and transforming files from different analytical formats to text files.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
- Managed and scheduled jobs on a Hadoop cluster.
- Implemented and maintained various projects in Java.
- Used Java and MySQL day to day to debug and fix issues with client processes.
- Developed, tested, and implemented a financial-services application to bring multiple clients into a standard database format.
- Assisted in designing, building, and maintaining a database to analyse the life cycle of checking and debit transactions.
- Developed Java and J2EE applications using object-oriented analysis, with extensive involvement throughout the Software Development Life Cycle (SDLC).
- Worked with J2SE, XML, Web Services, WSDL, SOAP, UDDI, and TCP/IP.
- Used JSP, Servlets, JavaServer Faces, EJB, JDBC, JNDI, Struts, Maven, Trac, Subversion, and JUnit during development.
- Used database technologies such as Oracle 8i and Oracle 9i, DB2, PL/SQL.
- Worked with Sun ONE Application Server, WebLogic Application Server, WebSphere Application Server, WebSphere Portal Server, and J2EE application deployment technology.
Environment: Hive, Pig, HBase, ZooKeeper, Sqoop, Cloudera, Java, JDBC, JNDI, Struts, Maven, Trac, Subversion, JUnit, SQL, Spring, Hibernate, Oracle, XML, PuTTY, and Eclipse.
Java/Big Data Developer
Confidential, Cleveland, OH
Responsibilities:
- Designed, implemented, and tested clustered, multi-tiered e-commerce products. Core technologies used include IIS, SQL Server, ASP, XML/XSLT, JSP, Tomcat, JavaBeans, and Java Servlets.
- Developed the XML Schema and Web services for the data maintenance and structures.
- Implemented the web service client for login authentication, credit reports, and applicant information using Apache Axis2.
- Followed Agile methodology (TDD, Scrum) to meet customer needs and wrote JUnit test cases for unit testing the integration layer.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
- Analysed the data by running Hive queries and Pig scripts to understand user behaviour.
- Developed scripts to extract data from MySQL into HDFS.
- Worked with different file formats, such as SequenceFiles, XML files, and MapFiles, using MapReduce programs.
- Used Hibernate ORM framework with Spring framework for data persistence and transaction management.
- Loaded the aggregated data onto DB2 for reporting on the dashboard.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Strong expertise in the MapReduce programming model with XML, JSON, and CSV file formats.
- Experience in managing and reviewing Hadoop log files.
- Loaded data from the Linux file system into HDFS.
- Implemented test scripts to support test driven development and continuous integration.
- Extensive working knowledge of partitioned tables, UDFs, performance tuning, compression-related properties, and the Thrift server in Hive.
- Worked with the Data Science team to gather requirements for various data mining projects.
Environment: HDFS, Hadoop 2.2.0 (YARN), Flume 1.5.2, Eclipse, SQL Server, MapReduce, Hive 1.1.0, Pig Latin 0.14.0, JavaBeans, SQL, Sqoop 1.4.6, Oozie, CentOS, ZooKeeper 3.5.0, and NoSQL database.
Java Developer
Confidential
Responsibilities:
- Involved in analysis, design, and development of Expense Processing systems.
- Created user interfaces using JSP.
- Developed the Web Interface using Servlets, Java Server Pages, HTML and CSS.
- Developed the DAO objects using JDBC.
- Developed business services using Servlets and Java.
- Designed and developed user interfaces and menus using HTML5, JSP, and JavaScript, with client-side and server-side validations.
- Developed the GUI using JSP and the Struts framework.
- Involved in developing the presentation layer using Spring MVC, AngularJS, and jQuery.
- Involved in designing the user interfaces using Struts Tiles Framework.
- Used Spring 2.0 Framework for Dependency injection and integrated with the Struts Framework and Hibernate.
- Used Hibernate 3.0 in data access layer to access and update information in the database.
- Gained experience in SOA (Service Oriented Architecture) by creating web services with SOAP and WSDL.
- Developed JUnit test cases for all the developed modules.
- Used Log4J to capture logs, including runtime exceptions; monitored error logs and fixed the problems found.
- Used RESTful services to interact with the client by providing the RESTful URL mapping.
- Used CVS for version control across common source code used by developers.
- Used Ant scripts to build the application and deployed it on Oracle WebLogic Server 10.0.
Environment: Struts 1.2, Hibernate 3.0, Spring 2.5, JSP, Servlets, XML, SOAP, WSDL, JDBC, JavaScript, HTML, CVS, Log4J, JUnit, WebLogic App Server, Eclipse, Oracle, RESTful.
Java Developer
Confidential
Responsibilities:
- Wrote SQL queries, stored procedures, and triggers to perform back-end database operations.
- Developed nightly batch jobs which involved interfacing with external third-party state agencies.
- Implemented a JMS producer and consumer using Mule ESB.
- Gathered business requirements and wrote functional specifications and detailed design documents.
- Extensively used Core Java, Servlets, JSP and XML.
- Wrote AngularJS controllers, views, and services.
- Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for Oracle 9i database.
- Implemented an enterprise logging service using JMS and Apache CXF.
- Developed unit test cases and used JUnit for unit testing of the application.
- Involved in designing user screens and validations using HTML, jQuery, Ext JS and JSP as per user requirements.
Environment: Java, Spring Core, JMS, Web services, JDK, SVN, Maven, Mule ESB, JUnit, WAS 7, jQuery, Ajax, SAX.
