Sr. Hadoop Developer Resume
Weston, FL
SUMMARY
- Over 8 years of IT experience with multinational clients, including around 5 years of Hadoop-related experience developing Big Data/Hadoop applications.
- Hands-on experience with the Hadoop stack (MapReduce, Hive, HDFS, Sqoop, Pig, HBase, Flume, Oozie, Zookeeper, Apache Solr, Apache Storm, Kafka, YARN), cluster architecture, and cluster monitoring.
- Well versed in developing and implementing MapReduce programs for analyzing Big Data stored in different file formats, covering both structured and unstructured data.
- 8+ years of IT experience, including 4 years in Big Data technologies and 4+ years in Java and mainframe technologies.
- Worked in the finance, insurance, health, and e-commerce domains.
- Expertise in various components of Hadoop Ecosystem.
- Hands-on experience with Spark/Scala programming and writing Spark Streaming applications.
- Hands-on experience working with the Cloudera Hadoop distribution.
- Wrote, executed, and deployed complex MapReduce Java code using various Hadoop APIs (an illustrative sketch appears at the end of this summary).
- Experienced in MapReduce code tuning and performance optimization.
- Knowledge in installing, configuring, and using Hadoop ecosystem components.
- Proficient in Hive Query Language and experienced in Hive performance optimization using partitioning, dynamic partitioning, and bucketing.
- Expertise in developing Pig scripts; wrote and implemented custom UDFs in Pig for data filtering.
- Used Impala for data analysis.
- Hands-on experience using the data ingestion tools Sqoop and Flume.
- Collected log data from various sources (web servers, application servers, and consumer devices) using Flume and stored it in HDFS for various analyses.
- Performed data transfer between HDFS and relational database systems (MySQL, SQL Server, Oracle, and DB2) using Sqoop.
- Worked on Hadoop 2.0 architecture.
- Used the Oozie job scheduler to schedule MapReduce, Hive, and Pig jobs; experienced in automating job execution.
- Experience with NoSQL databases like HBase and fair knowledge of MongoDB and Cassandra.
- Knowledge of installing, configuring, supporting, and managing Hadoop clusters using Apache and Cloudera (CDH3, CDH4) distributions.
- Experience in working with different relational databases like MySQL, SQL Server, Oracle and DB2.
- Strong experience in database design, writing complex SQL Queries.
- Used derived queries and OLAP functions to break complex queries into simpler ones.
- Expertise in the development of multi-tiered, web-based enterprise applications using J2EE technologies like Servlets, JSP, JDBC, and Hibernate.
- Extensive coding experience in Java and Mainframes - COBOL, CICS and JCL.
- Experience with development methodologies such as Agile, Scrum, BDD, Continuous Integration, and Waterfall.
- Strong background in writing test plans and performing unit, user acceptance, integration, and system testing.
- Proficient in software documentation and technical report writing.
- Worked coherently with multiple teams. Conducted peer reviews, organized and participated in knowledge transfer (technical and domain) sessions.
- Experience working in an onsite-offshore model.
- Developed various UDFs in MapReduce and Python for Pig and Hive.
- Solid experience and knowledge of other SQL and NoSQL databases such as MySQL, MS SQL, MongoDB, HBase, Accumulo, Neo4j, and Cassandra.
- Good Data Warehouse experience in MS SQL.
- Good knowledge and firm understanding of J2EE frontend/backend, SQL and database concepts.
- Good experience in Linux, Unix, Windows and Mac OS environment.
- Used various development tools like Eclipse, GIT, Android Studio and Subversion.
- Knowledge of Cloudera, Hortonworks, and MapR Hadoop distribution components and their custom packages.
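The MapReduce bullets above refer to hand-written Java jobs. Below is a minimal, hypothetical sketch of such a job (a simple token count), assuming the Hadoop 2.x `org.apache.hadoop.mapreduce` API; the class name and input/output paths are placeholders, not code from the engagements described later.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TokenCount {

    // Mapper: emits (token, 1) for every whitespace-separated token in a line
    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text token = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String t : value.toString().split("\\s+")) {
                if (!t.isEmpty()) {
                    token.set(t);
                    context.write(token, ONE);
                }
            }
        }
    }

    // Reducer (also used as combiner): sums the counts per token
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "token count");
        job.setJarByClass(TokenCount.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

A job of this shape is packaged into a jar and submitted with `hadoop jar`; the same structure extends to the more complex analysis jobs referenced above.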
TECHNICAL SKILLS
Hadoop/Big Data: MapReduce, Hive, Pig, Impala, Sqoop, Flume, HDFS, Oozie, Hue, HBase, Zookeeper, Kerberos, Sentry, Spark, Solr, Storm, Kafka, YARN.
Operating Systems: Windows, Ubuntu, RedHat Linux, Unix
Java & J2EE Technologies: Core Java, Servlets, JSP, JDBC
Frameworks: Hibernate
Databases/Database Languages: Oracle 11g/10g/9i, MySQL, DB2, SQL Server, SQL, HQL, NoSQL (HBase, Cassandra, MongoDB)
Web Technologies: JavaScript, HTML, XML, REST, CSS
Programming Languages: Java, Unix shell scripting, COBOL, CICS, JCL
IDEs: Eclipse, NetBeans
Reporting tools: Tableau
Web Servers: Apache Tomcat 6
Methodologies: Waterfall, Agile and Scrum
PROFESSIONAL EXPERIENCE
Confidential, WESTON, FL
Sr. Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Wrote multiple MapReduce programs in Java for data analysis.
- Wrote MapReduce jobs using Pig Latin and the Java API.
- Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
- End-to-end performance tuning of Hadoop clusters and Hadoop MapReduce routines against very large data sets.
- Involved in loading data from the Linux file system to HDFS.
- Developed Hive queries and UDFs to analyze/transform the data in HDFS.
- Importing and exporting data into HDFS and Hive using Sqoop and Flume.
- Developed Pig scripts for analyzing large data sets in the HDFS.
- Collected the logs from the physical machines and the OpenStack controller and integrated into HDFS using Flume.
- Experienced in machine learning, training data models using supervised classification algorithms.
- Worked independently with Cloudera and Hortonworks support on any issues or concerns with the Hadoop cluster.
- Worked with NoSQL databases like HBase, Cassandra, DynamoDB.
- Designed and presented plan for POC on Impala.
- Experience migrating on-premise systems to Windows Azure for disaster recovery in the cloud using Azure Recovery Vault and Azure backups.
- Worked on the ELK stack (Elasticsearch, Logstash, and Kibana) to develop reports and risk indicators for archived SPAM and Websense data.
- Worked on Hadoop 2.0 architecture.
- Configured internal load balancers, load-balanced sets, and Azure Traffic Manager.
- Knowledge of handling Hive queries using Spark SQL, which integrates with the Spark environment.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Python.
- Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
- Implemented Avro and Parquet data formats for Apache Hive computations to handle custom business requirements.
- Responsible for creating Hive tables, loading the structured data resulting from MapReduce jobs into those tables, and writing Hive queries to further analyze the logs and identify issues and behavioral patterns.
- Worked on Sequence files, RC files, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.
- Implemented Daily Cron jobs that automate parallel tasks of loading the data into HDFS using Oozie coordinator jobs.
- Built a data validation dashboard in Solr to display the message records.
- Consumed data from HBase and produced it to Apache Solr.
- Loaded data from HBase into Solr as a performance cache.
- Responsible for performing extensive data validation using Hive.
- Created Sqoop jobs and Pig and Hive scripts for data ingestion from relational databases, to compare against historical data.
- Involved in loading data from a Teradata database into HDFS using Sqoop queries.
- Involved in submitting and tracking MapReduce jobs using the JobTracker.
- Involved in preparing benchmark metrics for comparing MRv1 to MRv2 (YARN).
- Set up the monitoring tools Ganglia and Nagios for Hadoop monitoring and alerting, and monitored the HBase/Zookeeper cluster with them.
- Involved in creating Oozie workflow and Coordinator jobs to kick off the jobs on time for data availability.
- Used Pig as an ETL tool for transformations, event joins, filtering, and some pre-aggregations.
- Used visualization tools such as Power View for Excel and Tableau for visualizing data and generating reports.
- Exported data to Tableau and to Excel with Power View for presentation and refinement.
- Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
- Worked on implementing the features based on the finalized design using C/C++.
- Proficient in C/C++ programming for avionics software, especially Flight Management Systems.
- Implemented Hive generic UDFs to implement business logic (see the sketch after this list).
- Implemented test scripts to support test driven development and continuous integration.
- Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
- Used Bash shell scripting, Sqoop, Avro, Hive, Pig, Java, and MapReduce daily to develop ETL, batch processing, and data storage functionality.
- Used Jira for bug tracking and Bitbucket to check in and check out code changes.
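The Hive generic UDF item above refers to business logic implemented against Hive's GenericUDF API. The following is a minimal, hypothetical sketch (a function that masks all but the last four characters of a string); the function name and masking logic are illustrative only, not the business rules from this engagement.

```java
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

@Description(name = "mask_tail",
        value = "_FUNC_(str) - masks all but the last four characters of a string")
public class MaskTailUDF extends GenericUDF {

    private ObjectInspectorConverters.Converter textConverter;

    @Override
    public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
        if (arguments.length != 1) {
            throw new UDFArgumentLengthException("mask_tail expects exactly one argument");
        }
        // Convert whatever string-like input arrives into a plain Java String
        textConverter = ObjectInspectorConverters.getConverter(
                arguments[0], PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        // The UDF returns a Hive STRING
        return PrimitiveObjectInspectorFactory.javaStringObjectInspector;
    }

    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {
        Object raw = arguments[0].get();
        if (raw == null) {
            return null;
        }
        String value = (String) textConverter.convert(raw);
        int keep = Math.min(4, value.length());
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < value.length() - keep; i++) {
            masked.append('*');
        }
        return masked.append(value.substring(value.length() - keep)).toString();
    }

    @Override
    public String getDisplayString(String[] children) {
        return "mask_tail(" + children[0] + ")";
    }
}
```

Once packaged into a jar, a UDF like this is typically registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries.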
Environment: Apache Hadoop, MapReduce, HDFS, Pig, Hive, Solr, Sqoop, batch processing, Azure, Flume, Oozie, Jira, Spark, Scala, Java, C++, Linux, Hortonworks, Maven, Python, Teradata, Zookeeper, Ganglia, Tableau.
Confidential, Bridgeton, NJ
Hadoop Developer
Responsibilities:
- Worked on writing transformer/mapping Map-Reduce pipelines using Java.
- Used Hadoop as the big data platform for storing and analyzing the data.
- Involved in creating Hive tables, loading data, and writing Hive queries that invoke and run MapReduce jobs in the backend.
- Designed and implemented Incremental Imports into Hive tables.
- Worked on loading and transforming large sets of structured, semi-structured, and unstructured data.
- Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume.
- Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
- Streamed data in real time using Spark with Kafka.
- Implemented Spark using Scala, utilizing DataFrames and the Spark SQL API for faster data processing.
- Experienced in managing and reviewing the Hadoop log files.
- Configured Kerberos and AD/LDAP for Hadoop cluster.
- Worked in an AWS environment on the development and deployment of custom Hadoop applications.
- Migrated ETL jobs to Pig scripts to do transformations, event joins, and some pre-aggregations before storing the data in HDFS.
- Implemented the workflows using Apache Oozie framework to automate tasks.
- Ingested files into the HDFS layer to perform matching and aggregations using HiveQL.
- Worked with the Avro data serialization system to handle JSON data formats.
- Worked extensively with NoSQL databases like MongoDB and Cassandra.
- Developed MongoDB embedded documents from Java code using Spring Data MongoDB.
- Used Sqoop to move the data between MongoDB and HDFS.
- Used the Talend ETL tool to extract data from Oracle to HDFS.
- Led three internal initiatives related to AWS: lift-and-shift of the existing production system, and architecting the next-generation microservices-based crediting system using native AWS technologies like Lambda, Step Functions, S3, EC2, etc.
- Converted Hive queries into Spark transformations using Spark RDDs (a DataFrame/Spark SQL sketch appears after this list).
- Scheduled and executed workflows in Oozie to run Hive and Spark jobs.
- Worked on different file formats like Sequence files, XML files and Map files using MapReduce Programs.
- Implemented machine learning algorithms using Spark with Python.
- Involved in unit testing and delivered unit test plans and results documents using JUnit and MRUnit.
- Developed scripts and automated end-to-end data management and synchronization between all the clusters.
- Involved in the setup and benchmarking of Hadoop/HBase clusters for internal use.
- Created a wrapper script in Python that can be run to trigger Java programs and HiveQL scripts.
- Created and maintained Technical documentation for launching Hadoop Clusters and for executing Pig Scripts.
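The Spark items above describe converting Hive queries into Spark jobs (in Scala, using RDDs and the DataFrame/Spark SQL API). The sketch below illustrates the same idea with the Spark 2.x Java DataFrame/Spark SQL API, to stay in one language with the other sketches in this document; the web_logs table and event_date column are placeholders.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class HiveToSparkSql {
    public static void main(String[] args) {
        // Assumes a Hive metastore is reachable from the Spark application
        SparkSession spark = SparkSession.builder()
                .appName("hive-to-spark-sql")
                .enableHiveSupport()
                .getOrCreate();

        // A Hive-style aggregation expressed through Spark SQL
        Dataset<Row> dailySql = spark.sql(
                "SELECT event_date, COUNT(*) AS events FROM web_logs GROUP BY event_date");

        // The same logic expressed as DataFrame transformations
        Dataset<Row> dailyDf = spark.table("web_logs")
                .groupBy("event_date")
                .count();

        // Persist the result back as a Hive table
        dailyDf.write().mode("overwrite").saveAsTable("web_logs_daily");

        dailySql.show();
        spark.stop();
    }
}
```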
Environment: Hadoop, HDFS, MapReduce, Sqoop, Oozie, Pig, Hive, Spark, S3, EC2, Flume, Lambda, Kafka, Kerberos, AWS, Python, Linux, Java, C++, Eclipse, Talend, MongoDB, Cassandra.
Confidential, Newark, NY
Hadoop Developer
Responsibilities:
- Implemented CDH3 Hadoop cluster on CentOS.
- Designed and developed a daily process to incrementally import raw data from Oracle into Hive tables using Sqoop.
- Launched Amazon EC2 cloud instances from Amazon machine images (Linux/Ubuntu) and configured the launched instances for specific applications.
- Launched and set up the Hadoop cluster, including configuring its different components.
- Hands-on experience loading data from the UNIX file system to HDFS.
- Provided cluster coordination services through Zookeeper.
- Worked on designing and developing a module in OAJ to collect and store user actions performed by plant engineers in an I/A Series DCS system, using C++ and MKS tools on Windows.
- Installed and configured Flume, Hive, Pig, Sqoop, and Oozie on the Hadoop cluster.
- Involved in creating Hive tables, loading data, and running Hive queries on that data.
- Extensive working knowledge of partitioned tables, UDFs, performance tuning, compression-related properties, and the Thrift server in Hive (see the sketch after this list).
- Wrote a Python module to connect and view the status of an Apache Cassandra instance.
- Involved in writing optimized Pig scripts and in developing and testing Pig Latin scripts.
- Working knowledge of writing Pig Load and Store functions.
- Experience in utilizing Spark machine learning techniques implemented in Scala.
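The partitioned-table and performance-tuning item above touches on Hive partitioning and bucketing. Below is a minimal, hypothetical sketch of creating and loading such a table from Java over JDBC; it assumes a HiveServer2 endpoint and the ORC format (both newer than the CDH3-era stack listed for this role), and the host, table, and column names are placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class PartitionedTableLoad {
    public static void main(String[] args) throws Exception {
        // HiveServer2 JDBC driver and endpoint (placeholder host/port/database)
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "", "");
             Statement stmt = conn.createStatement()) {

            // Partitioned, bucketed target table
            stmt.execute("CREATE TABLE IF NOT EXISTS sales_part (id BIGINT, amount DOUBLE) "
                    + "PARTITIONED BY (sale_date STRING) "
                    + "CLUSTERED BY (id) INTO 16 BUCKETS "
                    + "STORED AS ORC");

            // Enable dynamic partitioning so partition values come from the data itself
            stmt.execute("SET hive.exec.dynamic.partition=true");
            stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");

            // Load from a staging table; Hive creates one partition per sale_date
            stmt.execute("INSERT OVERWRITE TABLE sales_part PARTITION (sale_date) "
                    + "SELECT id, amount, sale_date FROM sales_staging");
        }
    }
}
```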
Environment: Apache Hadoop 1.0.1, MapReduce, HDFS, CentOS, Zookeeper, C++, Sqoop, Hive, Pig, Oozie, Java, Eclipse, Amazon EC2, JSP, Servlets, Oracle.
Confidential
Java Developer
Responsibilities:
- Responsible and active across the full Software Development Life Cycle (SDLC) of the project: analysis, design, implementation, and deployment.
- Designed and developed user interface using JSP, HTML and JavaScript.
- Developed Struts action classes, action forms and performed action mapping using Struts framework and performed data validation in form beans and action classes.
- Extensively used Struts framework as the controller to handle subsequent client requests and invoke the model based upon user requests.
- Defined the search criteria and pulled customer records from the database, made the required changes, and saved the updated records back to the database.
- Validated the fields of user registration screen and login screen by writing JavaScript validations.
- Developed build and deployment scripts using Apache ANT to customize WAR and EAR files.
- Used DAO and JDBC for database access.
- Set up AWS infrastructure with resources like VPC, EC2, S3, IAM, CloudFormation, Lambda, RDS, ELB, CloudWatch, and CloudTrail.
- Developed application using AngularJS and Node.JS connecting to Oracle on the backend.
- Designed and developed RESTful APIs using the Spring REST API (see the sketch after this list).
- Consumed RESTful web services using the AngularJS HTTP service and rendered the JSON data on the screen.
- Used the AngularJS framework for building web apps that integrate efficiently with RESTful services.
- Designed and developed XML processing components for dynamic menus in the application.
- Involved in postproduction support and maintenance of the application.
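The Spring REST item above mentions RESTful APIs consumed by the AngularJS front end. The sketch below is a minimal, hypothetical Spring MVC controller, assuming Spring 4.3+ annotations (@RestController, @GetMapping); the Customer resource and endpoint path are placeholders, and a real implementation would delegate to the DAO/JDBC layer described above.

```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/customers")
public class CustomerController {

    // Minimal resource class, serialized to JSON for the AngularJS client
    public static class Customer {
        private final long id;
        private final String name;

        public Customer(long id, String name) {
            this.id = id;
            this.name = name;
        }

        public long getId() { return id; }
        public String getName() { return name; }
    }

    // GET /api/customers/{id} - in a real application this would call the DAO layer
    @GetMapping("/{id}")
    public Customer getCustomer(@PathVariable("id") long id) {
        return new Customer(id, "Sample Customer");
    }
}
```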
Environment: Oracle 11g, Java 1.5, Struts, Servlets, HTML, XML, SQL, J2EE, AngularJS, JUnit, RESTful, SOA, Tomcat 6.
Confidential
Jr. JAVA Developer
Responsibilities:
- Installation, Configuration & Upgrade of Solaris and Linux operating system.
- Actively participated in requirements gathering, analysis, design, and testing phases
- Designed use case diagrams, class diagrams, and sequence diagrams as a part of Design Phase
- Developed the entire application implementing MVC Architecture integrating JSF with Hibernate and Spring frameworks.
- Developed the Enterprise Java Beans (Stateless Session beans) to handle different transactions such as online funds transfer, bill payments to the service providers.
- Implemented Service Oriented Architecture (SOA) using JMS for sending and receiving messages while creating web services
- Developed XML documents and generated XSL files for Payment Transaction and Reserve Transaction systems.
- Developed SQL queries and stored procedures.
- Developed Web Services for data transfer from client to server and vice versa using Apache Axis, SOAP and WSDL.
- Used JUnit Framework for the unit testing of all the java classes.
- Implemented various J2EE Design patterns like Singleton, Service Locator, DAO, and SOA.
- Worked on AJAX to develop an interactive Web Application and JavaScript for Data Validations.
- Developed the application under the JEE architecture and designed dynamic, browser-compatible user interfaces using JSP, custom tags, HTML, CSS, and JavaScript.
- Deployed and maintained the JSP and Servlet components on WebLogic 8.0.
- Developed the application server persistence layer using JDBC, SQL, and Hibernate.
- Used JDBC to connect the web applications to databases.
- Implemented test-first unit testing using JUnit (see the sketch after this list).
- Developed and utilized J2EE services and JMS components for messaging communication in WebLogic.
- Configured the development environment using the WebLogic application server for developers' integration testing.
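The test-first/JUnit items above describe unit testing the Java classes. Below is a minimal, hypothetical JUnit 4 sketch; the FundsTransferService and Account classes are illustrative stand-ins for the online funds-transfer logic handled by the session beans, not the project's actual classes.

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class FundsTransferServiceTest {

    // Minimal domain classes so the sketch is self-contained
    static class Account {
        private double balance;
        Account(double balance) { this.balance = balance; }
        double getBalance() { return balance; }
        void debit(double amount) { balance -= amount; }
        void credit(double amount) { balance += amount; }
    }

    static class FundsTransferService {
        void transfer(Account from, Account to, double amount) {
            if (amount <= 0 || amount > from.getBalance()) {
                throw new IllegalArgumentException("invalid transfer amount");
            }
            from.debit(amount);
            to.credit(amount);
        }
    }

    private final FundsTransferService service = new FundsTransferService();

    @Test
    public void transferMovesMoneyBetweenAccounts() {
        Account source = new Account(500.00);
        Account target = new Account(100.00);

        service.transfer(source, target, 200.00);

        assertEquals(300.00, source.getBalance(), 0.001);
        assertEquals(300.00, target.getBalance(), 0.001);
    }

    @Test(expected = IllegalArgumentException.class)
    public void transferRejectsOverdraft() {
        service.transfer(new Account(50.00), new Account(0.00), 200.00);
    }
}
```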
Environment: Java/J2EE, SQL, Oracle 10g, JSP 2.0, EJB, AJAX, JavaScript, WebLogic 8.0, HTML, JDBC 3.0, XML, JMS, log4j, JUnit, Servlets, MVC, MyEclipse.