
Senior Hadoop Developer Resume


Phoenix, AZ

SUMMARY

  • Hadoop Developer with 7+ years of experience in IT and exclusive experience in Hadoop and Big Data related technologies.
  • Experienced in developing web applications in various domains such as Retail, Banking, Insurance, and Healthcare.
  • Strong development skills in Hadoop, HDFS, MapReduce, Hive, Sqoop, and HBase, with a solid understanding of Hadoop internals.
  • Well versed in installing, configuring, and using Apache Hadoop ecosystem components.
  • Very familiar with HDFS, Hive, Spark, Kafka, Sqoop, Pig Latin, Oozie, Flume, and the various components of the Hadoop ecosystem.
  • Expertise in ingesting real-time/near-real-time data using Flume, Kafka, and Storm.
  • Knowledge of NoSQL databases such as MongoDB, Cassandra, and HBase.
  • Good knowledge of writing Spark applications using Python, Scala, and Java.
  • Efficient in using Spark Streaming to divide streaming data into micro-batches that feed the Spark engine for batch processing (a brief sketch follows this summary).
  • Good understanding of Spark architecture and its components.
  • Comprehensive knowledge of and experience in process improvement, normalization/de-normalization, data extraction, data manipulation, and data cleansing on Hive.
  • Good hands-on experience in Apache Spark with Scala.
  • Developed/supported applications on the LAMP stack (PHP, MySQL, and Apache).
  • Good experience with Amazon EC2, Simple Storage Service (S3), and Amazon SQS.
  • Experience working with Oracle, DB2, SQL Server, and MySQL databases.
  • Skilled in data management, data extraction, manipulation, validation, and analysis of huge volumes of data.
  • Experience in implementing Java/J2EE technologies for application development in various layers of projects.
  • Deep knowledge of AngularJS practices and commonly used modules based on extensive work experience.
  • Extensively used JavaScript for client-side validations and implemented jQuery to reduce data transfer between client and server.
  • Experience with NLP, Elasticsearch, and text mining.
  • Owned the end-to-end development life cycle with high-quality solutions and evangelized test-driven development (test code coverage, etc.).
  • Developed core modules in large cross-platform applications using Java, J2EE, Spring, Struts, Hibernate, JAX-WS web services, and JMS.
  • Experience in Agile and Waterfall models.
  • Experience in UNIX shell scripting.
  • Proficient in using OOP concepts (polymorphism, inheritance, encapsulation, etc.).
  • Analytical, organized, and enthusiastic to work in a fast-paced, team-oriented environment.
  • Expertise in interacting with business users, understanding their requirements, and providing solutions that match those requirements.
  • Excellent communication and interpersonal skills; flexible and adaptive to new environments, self-motivated, a team player, a positive thinker, and enjoy working in multicultural environments.
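
As context for the Spark Streaming bullet above, the following minimal Scala sketch shows the micro-batching pattern it refers to: the stream is sliced into fixed-interval batches (DStreams of RDDs) that the regular Spark batch engine then processes. The socket source, host/port, and batch interval are illustrative assumptions, not project specifics.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StreamingWordCount")
    val ssc  = new StreamingContext(conf, Seconds(10))   // 10-second micro-batches (assumed interval)

    val lines  = ssc.socketTextStream("localhost", 9999) // placeholder test source
    val counts = lines.flatMap(_.split("\\s+"))
                      .map(word => (word, 1))
                      .reduceByKey(_ + _)                // ordinary batch-style processing per micro-batch
    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}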

TECHNICAL SKILLS

Big Data Ecosystem: HDFS, HBase, Hadoop MapReduce, Hive, Pig, Flume, Sqoop, Spark, Kafka, Oozie.

Languages: C, C++, Core Java, JDBC, PL/SQL, Scala

Methodologies: Agile, V-model (Verification & Validation Model)

Databases: Oracle 11g/10g, MySQL, Cassandra, MongoDB, and HBase (NoSQL).

IDE/Testing Tools: Eclipse

Operating System: Linux, Windows and UNIX

Scripts: JavaScript, Shell scripting, Python.

Others: MS Office, QC, JIRA, SharePoint, Visio.


PROFESSIONAL EXPERIENCE

Confidential, Phoenix, AZ

Senior Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Developed workflows using custom MapReduce, Pig, Hive, and Sqoop.
  • Performed cluster tasks such as adding and removing nodes without any effect on running jobs or data.
  • Worked on moving all log files generated from various sources to HDFS for further processing.
  • Responsible for developing multiple Kafka producers and consumers from scratch as per the software requirement specifications.
  • Involved in integrating Apache Storm with Kafka to perform web analytics; uploaded clickstream data from Kafka to HDFS, HBase, and Hive by integrating with Storm.
  • Developed Spark applications using Scala.
  • Integrated Kafka with Spark Streaming for high-speed data processing (see the sketch following this list).
  • Developed Spark scripts using Java and Python shell commands as per the requirements.
  • Used Spark DataFrames, Spark SQL, and Spark MLlib extensively.
  • Wrote Hive UDFs to sort struct fields and return complex data types.
  • Created, altered, and deleted topics (Kafka queues) as required.
  • Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Analyzed the SQL scripts and designed the solution to implement them using Scala.
  • Developed a suite of unit test cases for Mapper, Reducer, and Driver classes using the MRUnit testing library.
  • Developed Scala scripts and UDFs using both DataFrames/SQL/Datasets and RDDs/MapReduce in Spark 1.6 for data aggregation, queries, and writing data back into the OLTP system through Sqoop.
  • Responsible for loading data from the UNIX file system to HDFS.
  • Worked on MongoDB and HBase (NoSQL) databases, which differ from classic relational databases.
  • Developed a workflow in Control-M to automate the tasks of loading data into HDFS and preprocessing it with Pig.
  • Tuned the cluster for optimal performance to process large data sets.
  • Designed and developed a distributed processing system to process binary files in parallel and load the analysis metrics into a data warehousing platform for reporting.
  • Implemented dashboards that internally use Hive queries to perform analytics on structured, Avro, and JSON data to meet business requirements.
  • Wrote Hive and Pig scripts as per requirements.
  • Provided cluster coordination services through ZooKeeper.
  • Configured Oozie workflows to automate data flow, preprocessing, and cleaning tasks using Hadoop actions.
  • Worked with AWS EC2, configuring servers for Auto Scaling and Elastic Load Balancing.
  • Used Maven extensively for building JAR files of MapReduce programs and deployed them to the cluster.
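
The following is a hedged Scala sketch of the Kafka-to-Spark Streaming integration described above, using the Spark 1.6-era direct stream API: each micro-batch from Kafka is turned into a DataFrame and analyzed with Spark SQL. The broker address, topic name, and the Click record layout are illustrative assumptions, not the actual project configuration.

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

// Assumed record layout for the clickstream messages ("userId,page").
case class Click(userId: String, page: String)

object ClickstreamJob {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ClickstreamJob")
    val ssc  = new StreamingContext(conf, Seconds(5))   // 5-second micro-batches (assumed interval)

    val kafkaParams = Map("metadata.broker.list" -> "kafka-broker:9092") // assumed broker list
    val topics      = Set("clickstream")                                 // assumed topic name

    // Direct stream: one RDD partition per Kafka partition, no receiver.
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    stream.map(_._2).foreachRDD { rdd =>
      val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
      import sqlContext.implicits._

      // Parse each "userId,page" line into a DataFrame and query it with Spark SQL.
      val clicks = rdd.map(_.split(",")).collect { case Array(user, page) => Click(user, page) }.toDF()
      clicks.registerTempTable("clicks")
      sqlContext.sql("SELECT page, COUNT(*) AS hits FROM clicks GROUP BY page").show()
    }

    ssc.start()
    ssc.awaitTermination()
  }
}

The same per-batch results could equally be written out to HDFS, HBase, or Hive, in line with the clickstream bullets above.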

Environment: HiveQL, MySQL, Apache Spark 1.6.1, Scala 2.11, Hive, HDFS, YARN, Sqoop, Kafka, HBase, Eclipse (Kepler), Hadoop, AWS EC2, Oracle 11g, PL/SQL, SQL*Plus, Toad 9.6, Flume, Pig, UNIX, Cosmos.

Confidential, Fort Worth, TX

Sr. Hadoop Developer

Responsibilities:

  • Applied professional software engineering practices and best practices across the full software development life cycle, including coding standards, code reviews, source control management, and build processes.
  • Worked closely with various levels of individuals to coordinate and prioritize multiple projects; estimated scope and schedule and tracked projects throughout the SDLC.
  • Worked in the BI team on Big Data Hadoop cluster implementation and data integration, developing large-scale system software.
  • Handled structured and unstructured data and applied ETL processes.
  • Assessed existing and available data warehousing technologies and methods to ensure the data warehouse/BI architecture met the needs of the business unit and the enterprise and allowed for business growth.
  • Developed a data pipeline using Kafka, Spark, and Hive to ingest, transform, and analyze data.
  • Performance-tuned Spark applications by setting the right batch interval time, the correct level of parallelism, and appropriate memory settings (a brief sketch follows this list).
  • Designed a data warehouse using Hive.
  • Created Hive tables and partitioned tables, using Hive indexes and buckets to ease data analytics.
  • Involved in creating MapReduce jobs to power data for search and aggregation.
  • Worked extensively with Sqoop for importing and exporting data between relational database systems/mainframe and HDFS, in both directions.
  • Developed an interface for validating incoming data into HDFS before kicking off the Hadoop process.
  • Developed Pig Latin scripts to extract the data from the web server output files and load it into HDFS.
  • Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig.
  • Developed and maintained complex outbound notification applications that run on custom architectures, using diverse technologies including Core Java, J2EE, SOAP, JBoss, XML, JMS, and web services.
  • Designed and implemented a MapReduce-based, large-scale parallel relation-learning system.
  • Prepared developer (unit) test cases and executed developer testing.
  • Supported and assisted QA engineers in understanding, testing, and troubleshooting.
  • Coded complex Oracle stored procedures, functions, packages, and cursors for client-specific applications.
  • Involved in database migrations to transfer data from one database to another and in the complete virtualization of many client applications.
  • Wrote build scripts using Ant and participated in the deployment of one or more production systems.
  • Provided production rollout support, which included monitoring the solution post go-live and resolving any issues discovered by the client and client services teams.
  • Designed and documented operational problems following standards and procedures, using the software reporting tool JIRA.
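
The following is a minimal Scala sketch of the kind of tuning knobs referred to above (batch interval, parallelism, executor memory). The specific values and the placeholder socket source are assumptions for illustration, not the settings used on the actual cluster.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object TunedStreamingApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("TunedStreamingApp")
      .set("spark.executor.memory", "4g")                        // memory tuning (assumed value)
      .set("spark.executor.cores", "4")
      .set("spark.default.parallelism", "48")                    // roughly match total executor cores
      .set("spark.streaming.kafka.maxRatePerPartition", "1000")  // cap ingest rate per Kafka partition
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")

    // Batch interval chosen so each micro-batch finishes comfortably within the interval.
    val ssc = new StreamingContext(conf, Seconds(10))

    // Placeholder source and output operation so the context has something to run.
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}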

Environment: Hadoop, MapReduce, HDFS, Hive, Kafka, Spark, HBase, Sqoop, Java (JDK 1.6), Pig, Oozie, Oracle 11g/10g, DB2, MySQL, Eclipse, ETL tool (Informatica), PL/SQL, JSP, JDBC, XML, HTML, JSON, SOAP, Maven, Ant, SVN, JIRA, Linux, Shell scripting, SQL Developer, Toad, WinSCP, PuTTY.

Confidential, San Francisco, CA

Hadoop developer

Responsibilities:

  • Managed data coming from different sources, loaded structured and unstructured data, and was involved in HDFS maintenance.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Created a data pipeline of MapReduce programs using chained mappers.
  • Visualized HDFS data for customers using a BI tool with the help of the Hive ODBC driver.
  • Processed input from multiple data sources in the same reducer using GenericWritable and multiple input formats.
  • Developed and executed Hive queries for denormalizing the data.
  • Used Sqoop to import data from MySQL into HDFS on a regular basis.
  • Implemented optimized joins over different data sets to get top claims by state using MapReduce.
  • Worked on big data processing of clinical and non-clinical data using MapReduce.
  • Performed data validation on the ingested data using MapReduce by building a custom model to filter out invalid data and cleanse the data.
  • Worked with NoSQL databases such as MongoDB and Cassandra.
  • Used Flume for importing log files from various sources into HDFS.
  • Created a customized BI tool for the manager team that performs query analytics using HiveQL.
  • Implemented partitioning and bucketing concepts in Hive and designed both managed and external tables in Hive for optimized performance (see the sketch following this list).
  • Wrote Hive UDFs to sort struct fields and return complex data types.
  • Worked on Pig Latin scripts and UDFs for ingestion, querying, processing, and analysis of data.
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig, and Sqoop.
  • Modeled Hive partitions extensively for data separation and faster data processing, and followed Pig and Hive best practices for tuning.
  • Developed Hive queries to process the data and generate the data cubes for visualization.
  • Developed unit test cases using the JUnit, EasyMock, and MRUnit testing frameworks.
  • Worked on custom Pig loaders and storage classes to work with a variety of data formats such as JSON and XML.
  • Worked extensively with different kinds of compression techniques such as LZO, Gzip, and Snappy.
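
Below is a hedged Scala sketch of the Hive partitioning and bucketing work described above, issuing the DDL through the Hive JDBC driver. The connection URL, credentials, table name, columns, bucket count, and HDFS location are hypothetical placeholders.

import java.sql.DriverManager

object HiveTableSetup {
  def main(args: Array[String]): Unit = {
    // HiveServer2 JDBC connection; URL and credentials are placeholders.
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val conn = DriverManager.getConnection("jdbc:hive2://hive-server:10000/default", "hive", "")
    val stmt = conn.createStatement()

    // External table over raw claim files, partitioned by state and bucketed by member id.
    stmt.execute(
      """CREATE EXTERNAL TABLE IF NOT EXISTS claims_raw (
        |  claim_id  STRING,
        |  member_id STRING,
        |  amount    DOUBLE
        |)
        |PARTITIONED BY (state STRING)
        |CLUSTERED BY (member_id) INTO 32 BUCKETS
        |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        |STORED AS TEXTFILE
        |LOCATION '/data/claims/raw'""".stripMargin)

    stmt.close()
    conn.close()
  }
}

A managed table would simply omit the EXTERNAL keyword and the LOCATION clause, letting Hive own the data under its warehouse directory.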

Environment: Hadoop, HDFS, HBase, MongoDB, MapReduce, Java, Hive, Pig, Sqoop, Flume, Oozie, Hue, SQL, ETL, Cloudera Manager, Oracle, MySQL.

Confidential, Kalamazoo, MI

Big Data Analyst/Java Developer

Responsibilities:

  • Installed and configured Apache Hadoop to test the maintenance of log files in the Hadoop cluster.
  • Installed and configured Hive, Pig, Sqoop, and Oozie on the Hadoop cluster.
  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleansing and preprocessing (see the sketch following this list).
  • Extensively involved in loading data from the UNIX file system to HDFS.
  • Evaluated the business requirements and prepared detailed specifications that follow the project guidelines required to develop the programs.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Provided quick responses to ad hoc internal and external client requests for data and experienced in creating ad hoc reports.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Implemented MapReduce jobs in Hive by querying the available data.
  • Migrated ETL processes from Oracle to Hive to test easier data manipulation.
  • Used Amazon Redshift to store and retrieve data from data warehouses.
  • Developed Pig scripts to transform the raw data into meaningful data as specified by business users.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig.
  • Performed unit testing for the development team within the sandbox environment.
  • Created Hive tables and was involved in writing Hive UDFs and data loading.
  • Imported data into HDFS and Hive from other data systems using Sqoop.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Generated aggregations, groupings, and visualizations using Tableau.
  • Developed Hive queries to process the data.
  • Developed and maintained several batch jobs to run automatically depending on business requirements.
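
The resume notes that the cleansing jobs above were written in Java; the following is a simplified Scala sketch of that kind of map-only cleansing job, with an assumed pipe-delimited, five-column record layout, included purely for illustration.

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, NullWritable, Text}
import org.apache.hadoop.mapreduce.{Job, Mapper}
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

// Map-only job: drop malformed records (wrong column count or empty fields) and trim whitespace.
class CleansingMapper extends Mapper[LongWritable, Text, NullWritable, Text] {
  private val expectedColumns = 5 // assumed record width

  override def map(key: LongWritable, value: Text,
                   context: Mapper[LongWritable, Text, NullWritable, Text]#Context): Unit = {
    val fields = value.toString.split("\\|", -1).map(_.trim)
    if (fields.length == expectedColumns && fields.forall(_.nonEmpty)) {
      context.write(NullWritable.get(), new Text(fields.mkString("|")))
    }
  }
}

object CleansingJob {
  def main(args: Array[String]): Unit = {
    val job = Job.getInstance(new Configuration(), "record-cleansing")
    job.setJarByClass(classOf[CleansingMapper])
    job.setMapperClass(classOf[CleansingMapper])
    job.setNumReduceTasks(0) // map-only: no aggregation step needed
    job.setOutputKeyClass(classOf[NullWritable])
    job.setOutputValueClass(classOf[Text])
    FileInputFormat.addInputPath(job, new Path(args(0)))
    FileOutputFormat.setOutputPath(job, new Path(args(1)))
    System.exit(if (job.waitForCompletion(true)) 0 else 1)
  }
}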

Environment: Apache Hadoop, Cloudera Manager, CDH2, CDH3, CentOS, Apache Hama, Eclipse Indigo, Java, MapReduce, Hive, Sqoop, Pig, Oozie, SQL, Struts, JUnit.

Confidential

Java Developer

Responsibilities:

  • Involved in implementing the design through the key phases of the software development life cycle (SDLC), including development, testing, implementation, and maintenance support.
  • Developed the system following the Agile methodology.
  • Applied OOAD principles for the design and analysis of the system.
  • Created real-time web applications using Node.js.
  • Developed front-end screens using JSP, HTML, CSS, JavaScript, and jQuery.
  • Used the Spring Framework for developing business objects.
  • Performed data validation in Struts form beans and action classes.
  • Used Eclipse for the development, testing, and debugging of the application.
  • Used WebSphere Application Server to deploy the build.
  • Used a DOM parser to parse the XML files.
  • Used the Log4j framework for logging debug, info, and error data.
  • Used an Oracle 10g database for data persistence.
  • Used SQL Developer as a database client.
  • Performed test-driven development (TDD) using JUnit.
  • Used Ant scripts for build automation.
  • Used WinSCP to transfer files between local and remote systems.
  • Used Rational ClearQuest for defect logging and issue tracking.

Environment: Java/J2EE, SQL, Oracle 10g, JSP 2.0, JavaScript, WebSphere 6.1, HTML, JDBC 3.0, XML, JMS, Log4j, JUnit, Servlets, MVC.

Confidential

Responsibilities:

  • Analyzed the requirements and documented the technical specifications.
  • Actively involved in the development of JSP pages and servlet classes and in unit testing.
  • Utilized Java debugging and error-handling classes and techniques to troubleshoot and debug issues.
  • Worked extensively with the Eclipse IDE, with builds deployed on WebLogic Server.
  • Involved in design documentation, coding, and debugging.
  • Used AJAX controls and CSS to enrich the GUI.
  • Involved in preparing unit test cases and module-level test cases.
  • Implemented connectivity to the Oracle database using JDBC.
  • Created SQL views, queries, functions, and triggers used to fetch data for the system.
  • Involved in writing stored procedures and triggers using PL/SQL.
  • Performed code walkthroughs and code reviews.
  • Coordinated with project and Software Quality Assurance (SQA) teams.

Environment: JSP, Servlets, JDBC, RMI, Swing, WebSphere 6.0, WSAD 5, Oracle 9i.
