Senior Hadoop Developer Resume
Richmond, TX
SUMMARY
- 7+ years of professional IT experience in analyzing requirements and designing, developing, and testing highly distributed, mission-critical applications.
- Strong working experience in Big Data and Hadoop ecosystems.
- Expertise in using various tools in the Hadoop ecosystem, including MapReduce, HBase, Hive, Pig, Oozie, Sqoop, Flume, Spark, Storm, Kafka, and ZooKeeper.
- Good knowledge of Hadoop architecture and its various components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, YARN, and MapReduce programs.
- Proficient in using HDFS commands, writing MapReduce jobs, managing and reviewing Hadoop log files using Hadoop Log Analyzer.
- Good understanding of security requirements for Hadoop clusters and integration with the Kerberos Key Distribution Center (KDC).
- Strong skills in querying using Hive, Pig, HBase, Spark SQL, and MongoDB.
- Extensive experience working with Oracle, MySQL, and Microsoft SQL Server databases.
- Good knowledge of Amazon Web Services such as Amazon EC2, IAM, EMR, S3, Redshift, DynamoDB, Aurora, and AWS security compliance programs.
- Experienced in constructing Apache Storm/Apache Kafka pipelines and implementing controlled data flow procedures using workflow tools such as Oozie.
- Hands on experience in designing ETL operations including data extraction, data cleansing, data transformations, data loading.
- Good knowledge of developing data ingestion processes using Flume agents and Spark Streaming for real-time and near-real-time data analysis.
- Experience in developing workflows using Flume agents with multiple sources, such as web server logs and REST APIs, and multiple sinks, such as HDFS and Kafka.
- Experience in implementing a unified data platform that collects data from different sources using Apache Kafka brokers with various producers and consumers (see the sketch at the end of this summary).
- Proficient in Java, Scala, and Python.
- Hands on experience in application development using Java, RDBMS, and Linux Shell Scripting.
- Experience in working on various operating systems like UNIX, Linux, and Windows.
- Excellent understanding of Software Development Life Cycle (SDLC) and strong knowledge on various project implementation methodologies including Waterfall and Agile.
- Worked in Agile environment with active SCRUM participation.
- Possess strong communication and interpersonal skills; can quickly master new concepts and adapt to different project environments and applications with minimal supervision.
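As one illustration of the Kafka-based ingestion noted above, below is a minimal sketch of a Java producer that publishes a web-server log line to a topic. The broker address, topic name, key, and record value are illustrative placeholders, not details from any specific engagement.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class LogEventProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Broker address and serializers; the values here are placeholders.
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Each log line is published as one record, keyed by the source host.
                producer.send(new ProducerRecord<>("weblogs", "web01", "GET /index.html 200"));
            }
        }
    }

A matching consumer would subscribe to the same topic and hand records to Spark Streaming or a Flume/Storm pipeline for downstream processing.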
TECHNICAL SKILLS
Programming Languages: Core Java, Scala, Python, C, C++, Unix Shell Scripting
Big Data Technologies: Hadoop and MapReduce, HDFS, YARN, Pig, Hive, Sqoop, Oozie, HBase, Spark, Flume, Kafka, ZooKeeper
Databases: MySQL, Oracle, Microsoft SQL Server, Microsoft Azure SQL, PostgreSQL
NoSQL Databases: MongoDB, HBase, DynamoDB, Oracle NoSQL Database
Operating Systems: Windows, Linux, MacOS
Methodologies: Agile, Waterfall, Scrum
Network Protocols: TCP/IP, HTTP, HTTPS, UDP, DNS
Integration Tools: Jenkins and Hudson
Build Tools: Ant, Maven, Gradle
Version Control: SVN, TortoiseGit, GitHub, TFS
IDEs: Eclipse, NetBeans, IntelliJ IDEA, Notepad++, Visual Studio
Cloud Services: Amazon Web Services, Microsoft Azure
PROFESSIONAL EXPERIENCE
Senior Hadoop Developer
Confidential, Richmond, TX
Responsibilities:
- Participated in the full SDLC (requirements gathering, analysis, design, development, and testing) of the application using Agile and Scrum methodologies.
- Developed a detailed understanding of the existing build system and the related tools that hold information about the various products, releases, and test results.
- Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, and Apache Pig.
- Consumed web services to transfer data between different applications using RESTful APIs.
- Built a Talend-based mechanism that automatically moves existing proprietary binary-format data files to HDFS through an ingestion service.
- Implemented a prototype to integrate PDF documents into a web application using GitHub.
- Actively participated in process improvement, normalization/denormalization, data extraction, data cleansing, and data manipulation within the Scrum process.
- Performed data transformations in Scala and Hive, using partitioning and bucketing for performance improvements.
- Wrote custom InputFormat and RecordReader classes for reading and processing the binary format in MapReduce.
- Used the Mockito framework for mocking dependencies in unit tests.
- Involved in Test Driven Development (TDD) and Acceptance Test Driven Development (ATDD).
- Managed and deployed Amazon Web Services Elastic MapReduce (AWS EMR) clusters.
- Built cloud-native applications using Amazon Web Services, specifically Elastic MapReduce (EMR), Lambda, DynamoDB, and Elastic Beanstalk.
- Managed data schema versions across various microservices.
- Developed and tested the enterprise application with JUnit.
- Wrote custom Writable classes for Hadoop serialization and deserialization of time-series tuples (see the sketch after this list).
- Implemented a custom file loader for Pig to query large data files, such as build logs, directly.
- Used Python for pattern matching in build logs to format errors and warnings.
- Developed Pig Latin scripts and shell scripts for validating the different query modes in Historian.
- Created Hive external tables on the MapReduce output, with partitioning and bucketing applied on top.
- Improved performance through Scala-based tuning of Hive and MapReduce jobs, working alongside Talend, ActiveMQ, and JBoss.
- Developed a daily test engine using Python for continuous testing.
- Developed rich interactive visualizations integrating various reporting components from multiple data sources.
- Used Shell scripting for Jenkins job automation with Talend.
- Built a custom calculation engine that can be programmed according to user needs.
- Ingested data into Hadoop using shell scripting and Sqoop, and applied data transformations using Pig and Hive.
- Handled performance-improvement changes to the pre-ingestion service, which generates the big-data-format binary files from an older version of Historian.
- Worked with support teams and resolved operational and performance issues.
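The custom Writable work mentioned above can be sketched roughly as follows; the (timestamp, value) field layout is an illustrative assumption about the time-series tuples, not the actual production schema.

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    // Minimal custom Writable for a (timestamp, value) time-series tuple.
    public class TimeSeriesTupleWritable implements Writable {
        private long timestamp;   // epoch millis
        private double value;     // measured value at that instant

        public TimeSeriesTupleWritable() { }

        public TimeSeriesTupleWritable(long timestamp, double value) {
            this.timestamp = timestamp;
            this.value = value;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            // Serialization order must match the read order below.
            out.writeLong(timestamp);
            out.writeDouble(value);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            timestamp = in.readLong();
            value = in.readDouble();
        }

        public long getTimestamp() { return timestamp; }
        public double getValue()   { return value; }
    }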
Environment: Apache Hadoop, Hive, Scala, Pig, HDFS, Cloudera, Java MapReduce, Python, Maven, Git, Jenkins, UNIX, MySQL, Eclipse, Oozie, Sqoop, Flume, Oracle, JDK 1.8/1.7, Agile and Scrum development process, NoSQL, JBoss, Flink, JavaScript, and Mockito
Senior Hadoop Developer
Confidential, Boca Raton, FL
Responsibilities:
- Involved in all phases of development activities from requirements collection to production support.
- Developed a detailed understanding of the current system and identified the different sources of data for EMR.
- Involved in cluster setup.
- Performed Batch processing of logs from various data sources using MapReduce.
- Submitted automated jobs on Cloudera via Jenkins scripts.
- Contributed to predictive analytics to monitor inventory levels and ensure product availability.
- Analyzed customers' purchasing behaviors using data received over JMS.
- Provided value-added services based on clients' profiles and purchasing habits.
- Worked on gathering and refining requirements, interviewing business users to understand and document data requirements including elements, entities and relationships, in addition to visualization and report specifications.
- Defined UDFs using Pig and Hive to capture customer behavior (see the sketch after this list).
- Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, Spark SQL, Apache Pig, and Oozie.
- Integrated Apache Kafka for data ingestion.
- Created Hive external tables on the MapReduce output and applied partitioning and bucketing, with supporting transformations written in Scala.
- Maintained data and import scripts using HBase, Hive, and MapReduce jobs.
- Developed and maintained several batch jobs that run automatically based on business requirements.
- Imported and exported data between environments such as MySQL and HDFS.
- Performed unit testing and deployment for internal use, monitoring the performance of the solution.
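A rough sketch of the kind of Hive UDF referred to above, written in Java; the spending-tier logic and thresholds are made up for illustration and are not the actual business rules.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hive UDF that buckets a purchase amount into a spending tier.
    public final class SpendingTierUDF extends UDF {
        public Text evaluate(Double amount) {
            if (amount == null) {
                return null;            // preserve NULLs
            }
            if (amount >= 500.0) {
                return new Text("HIGH");
            } else if (amount >= 100.0) {
                return new Text("MEDIUM");
            }
            return new Text("LOW");
        }
    }

After packaging the class into a JAR, it would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION and then used like any built-in function in a SELECT.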
Environment: Apache Hadoop, Cloudera Distribution, RHEL, Hive, HBase, Pig, HDFS, Java MapReduce, Core Java, Python, Maven, Git, Jenkins, UNIX, MySQL, Eclipse, Oozie, Sqoop, Flume, Oracle, and Teradata.
Hadoop Developer
Confidential, Winston Salem, NC
Responsibilities:
- Involved in system analysis and design, using object-oriented analysis and design (OOAD) to capture and model business requirements.
- Worked with several clients on day-to-day requests and responsibilities.
- Installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
- Analyzed system failures, identified root causes, and recommended corrective actions.
- Worked on Hive for exposing data for further analysis and for generating and transforming files from different analytical formats to text files.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
- Managed and scheduled jobs on the Hadoop cluster.
- Implemented and maintained various projects in Java.
- Used Java and MySQL daily to debug and fix issues with client processes (see the sketch after this list).
- Developed, tested, and implemented financial-services application to bring multiple clients into standard database format.
- Assisted in designing, building, and maintaining a database to analyze the life cycle of checking and debit transactions.
- Developed Java/J2EE applications using object-oriented analysis and was extensively involved throughout the Software Development Life Cycle (SDLC).
- Worked with J2SE, XML, Web Services, WSDL, SOAP, UDDI, and TCP/IP.
- Utilized JSP, Servlets, JavaServer Faces, EJB, JDBC, JNDI, Struts, Maven, Trac, Subversion, and JUnit as part of development.
- Used database technologies such as Oracle 8i/9i, DB2, and PL/SQL.
- Worked with Sun ONE Application Server, WebLogic Application Server, WebSphere Application Server, WebSphere Portal Server, and J2EE application deployment technology.
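As a rough illustration of the day-to-day Java and MySQL debugging mentioned above, the following sketch queries a hypothetical client-process table over JDBC; the connection details, table, and column names are assumptions made up for the example.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    // Lists client process records that have not completed, directly from MySQL.
    public class ClientProcessCheck {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:mysql://localhost:3306/clientdb";  // placeholder connection details
            try (Connection conn = DriverManager.getConnection(url, "appuser", "secret");
                 PreparedStatement ps = conn.prepareStatement(
                         "SELECT process_id, status FROM client_process WHERE status <> ?")) {
                ps.setString(1, "COMPLETED");
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getLong("process_id") + " -> " + rs.getString("status"));
                    }
                }
            }
        }
    }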
Environment: Hive, Pig, HBase, ZooKeeper, Sqoop, Cloudera, Java, JDBC, JNDI, Struts, Maven, Trac, Subversion, JUnit, SQL, Spring, Hibernate, Oracle, XML, PuTTY, and Eclipse.
Java/Big Data Developer
Confidential, Cleveland, OH
Responsibilities:
- Designed, implemented, and tested clustered, multi-tiered e-commerce products. Core technologies used include IIS, SQL Server, ASP, XML/XSLT, JSP, Tomcat, JavaBeans, and Java Servlets.
- Developed the XML Schema and Web services for the data maintenance and structures.
- Implemented the web service client for login authentication, credit reports, and applicant information using Apache Axis2 web services.
- Followed Agile Methodology (TDD, SCRUM) to satisfy the customers and wrote JUnit test cases for unit testing the integration layer.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
- Analyzed the data by running Hive queries and Pig scripts to understand user behavior.
- Developed scripts to extract data from MySQL into HDFS.
- Worked on different file formats, such as SequenceFiles, XML files, and MapFiles, using MapReduce programs (see the sketch after this list).
- Used the Hibernate ORM framework with the Spring Framework for data persistence and transaction management.
- Loaded the aggregated data onto DB2 for reporting on the dashboard.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Applied the MapReduce programming model to XML, JSON, and CSV file formats.
- Managed and reviewed Hadoop log files.
- Involved in loading data from Linux file system to HDFS.
- Implemented test scripts to support test driven development and continuous integration.
- Worked extensively with partitioned tables, UDFs, performance tuning, compression-related properties, and the Thrift server in Hive.
- Worked with the Data Science team to gather requirements for various data mining projects.
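A minimal sketch of the kind of MapReduce program used for the SequenceFile work noted above: a map-only job that reads records from a SequenceFile and writes them back out as plain text. The assumption that the keys and values are LongWritable and Text is illustrative; the actual record types depend on how the files were written.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

    // Reads (LongWritable, Text) records from a SequenceFile and emits them as plain text.
    public class SequenceFileDump {

        public static class DumpMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                context.write(key, value);  // pass records through unchanged
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "sequence-file-dump");
            job.setJarByClass(SequenceFileDump.class);
            job.setMapperClass(DumpMapper.class);
            job.setNumReduceTasks(0);  // map-only job
            job.setInputFormatClass(SequenceFileInputFormat.class);
            job.setOutputFormatClass(TextOutputFormat.class);
            SequenceFileInputFormat.addInputPath(job, new Path(args[0]));
            TextOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }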
Environment: HDFS, Hadoop 2.2.0 (YARN), Flume 1.5.2, Eclipse, SQL Server, MapReduce, Hive 1.1.0, Pig 0.14.0, JavaBeans, SQL, Sqoop 1.4.6, Oozie, CentOS, ZooKeeper 3.5.0, and a NoSQL database.
Java Developer
Confidential
Responsibilities:
- Involved in analysis, design and development of Expense Processing systems.
- Created user interfaces using JSP.
- Developed the Web Interface using Servlets, Java Server Pages, HTML and CSS.
- Developed the DAO objects using JDBC (see the sketch after this list).
- Developed business services using Servlets and Java.
- Designed and developed user interfaces and menus using HTML5, JSP, and JavaScript, with client-side and server-side validations.
- Developed the GUI using JSP and the Struts framework.
- Involved in developing the presentation layer using Spring MVC, AngularJS, and jQuery.
- Involved in designing the user interfaces using Struts Tiles Framework.
- Used the Spring 2.0 Framework for dependency injection and integrated it with the Struts framework and Hibernate.
- Used Hibernate 3.0 in the data access layer to access and update information in the database.
- Implemented SOA (Service-Oriented Architecture) by creating web services with SOAP and WSDL.
- Developed JUnit test cases for all the developed modules.
- Used Log4J to capture logs, including runtime exceptions; monitored error logs and fixed the reported problems.
- Used RESTful services to interact with the client by providing RESTful URL mappings.
- Used CVS for version control across common source code used by developers.
- Used Ant scripts to build the application and deployed it on Oracle WebLogic Server 10.0.
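The DAO work mentioned above might look roughly like the following JDBC sketch for the expense-processing system; the expense table and column names are illustrative assumptions rather than the actual schema.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    // Minimal JDBC DAO for looking up an expense amount by id.
    public class ExpenseDao {
        private final DataSource dataSource;

        public ExpenseDao(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        public Double findAmountById(long expenseId) throws SQLException {
            String sql = "SELECT amount FROM expense WHERE expense_id = ?";  // illustrative table/columns
            try (Connection conn = dataSource.getConnection();
                 PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setLong(1, expenseId);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getDouble("amount") : null;
                }
            }
        }
    }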
Environment: Struts 1.2, Hibernate 3.0, Spring 2.5, JSP, Servlets, XML, SOAP, WSDL, JDBC, JavaScript, HTML, CVS, Log4J, JUnit, WebLogic Application Server, Eclipse, Oracle, and RESTful services.
Java Developer
Confidential
Responsibilities:
- Wrote SQL queries, stored procedures, and triggers to perform back-end database operations.
- Developed nightly batch jobs which involved interfacing with external third-party state agencies.
- Implemented JMS producers and consumers using Mule ESB (see the sketch after this list).
- Gathered business requirements and wrote functional specifications and detailed design documents.
- Extensively used Core Java, Servlets, JSP, and XML.
- Wrote AngularJS controllers, views, and services.
- Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for Oracle 9i database.
- Implemented Enterprise Logging service using JMS and Apache CXF.
- Developed unit test cases and used JUnit for unit testing of the application.
- Involved in designing user screens and validations using HTML, jQuery, Ext JS and JSP as per user requirements.
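A minimal sketch of a JMS producer of the kind mentioned above, written against the plain javax.jms API rather than any Mule-specific configuration; the queue purpose and message content are illustrative assumptions.

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;

    // Sends a single text message to a queue; the factory and queue come from container/ESB configuration.
    public class BatchStatusProducer {

        private final ConnectionFactory connectionFactory;
        private final Queue statusQueue;

        public BatchStatusProducer(ConnectionFactory connectionFactory, Queue statusQueue) {
            this.connectionFactory = connectionFactory;
            this.statusQueue = statusQueue;
        }

        public void sendStatus(String status) throws Exception {
            Connection connection = connectionFactory.createConnection();
            try {
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                MessageProducer producer = session.createProducer(statusQueue);
                TextMessage message = session.createTextMessage(status);  // e.g. "NIGHTLY_BATCH_COMPLETE"
                producer.send(message);
            } finally {
                connection.close();
            }
        }
    }

A corresponding consumer would register a MessageListener on the same queue and process each TextMessage as it arrives.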
Environment: Java, Spring Core, JMS, web services, JDK, SVN, Maven, Mule ESB, JUnit, WAS 7, jQuery, Ajax, and SAX.
