Hadoop Developer Resume
Sunnyvale, CA
SUMMARY:
- IT professional with 8+ years of experience in analysis, design, development, integration, testing, and maintenance of various applications using Java/J2EE technologies, along with 3+ years of Big Data/Hadoop experience.
- Experienced in building highly scalable big data solutions using Hadoop on multiple distributions (Cloudera, Hortonworks) and NoSQL platforms (HBase and Cassandra).
- Expertise in big data architecture with the Hadoop file system and its ecosystem tools: MapReduce, HBase, Hive, Pig, ZooKeeper, Oozie, Flume, Avro, Impala, and Apache Spark.
- Hands-on experience performing data quality checks on petabytes of data.
- Solid understanding of Hadoop MRv1 and MRv2 (YARN) architectures.
- Good knowledge of Amazon AWS services such as EMR and EC2, which provide fast and efficient processing of big data.
- Developed, deployed, and supported several MapReduce applications in Java to handle semi-structured and unstructured data.
- Experience writing MapReduce programs and using the Apache Hadoop API for data analysis.
- Strong experience in developing, debugging, and tuning MapReduce jobs in a Hadoop environment.
- Experienced in working with Ab Initio.
- Expertise in developing Pig and Hive scripts for data analysis.
- Hands-on experience in data mining, implementing complex business logic, and optimizing queries using HiveQL, controlling data distribution through partitioning and bucketing techniques to enhance performance.
- Expertise in using Apache HCatalog with different big data processing tools.
- Experience working with Hive data and extending the Hive library with custom UDFs to query data in non-standard formats.
- Experience in performance tuning of MapReduce jobs, Pig jobs, and Hive queries.
- Involved in the ingestion of data from various databases such as Teradata (Sales Data Warehouse), AS400, DB2, and SQL Server using Sqoop.
- Experience working with Flume to handle large volumes of streaming data.
- Good working knowledge of the Hadoop Hue ecosystem.
- Extensive experience in migrating ETL operations into HDFS using Pig scripts.
- Good knowledge of big data analytics libraries (MLlib) and the use of Spark SQL for data exploration.
- Experienced in using Apache Ignite for handling streaming data.
- Expert in implementing advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
- Experience in implementing a distributed messaging queue integrated with Cassandra using Apache Kafka and ZooKeeper.
- Expert in creating and designing data ingest pipelines using technologies such as Spring Integration and Apache Storm with Kafka.
- Experience with the Oozie workflow engine for running workflow jobs with actions that execute Hadoop MapReduce and Pig jobs.
- Good knowledge of using OCR with Kofax Capture.
- Worked with different file formats such as TextFile, Avro, and ORC for Hive querying and processing.
- Experienced in working with Apache Ambari.
- Experienced in working with Apache Accumulo.
- Used compression techniques (Snappy) with file formats to optimize storage in HDFS.
- Working knowledge of Hadoop HDFS admin shell commands.
- Developed core modules in large cross-platform applications using Java, J2EE, Hibernate, Python, Spring, JSP, Servlets, EJB, JDBC, JavaScript, XML, and HTML.
- Experienced with the build tools Maven and Ant and continuous integration tools such as Jenkins.
- Working knowledge of configuring and monitoring tools such as Ganglia and Nagios.
- Hands-on experience using relational databases such as Oracle, MySQL, PostgreSQL, and MS SQL Server.
- Extensive experience in developing and deploying applications using WebLogic, Apache Tomcat, and JBoss.
- Developed unit test cases using the JUnit, EasyMock, and MRUnit testing frameworks.
- Experienced with version control systems such as SVN and ClearCase.
- Experience using the IDE tools Eclipse 3.0, MyEclipse, RAD, and NetBeans.
- Hands-on development experience with RDBMS, including writing SQL queries, PL/SQL, views, stored procedures, triggers, etc.
- Participated in all business intelligence activities related to data warehousing, ETL, and report development methodology.
- Expertise in Waterfall and Agile software development models, and in project planning using Microsoft Project Planner and JIRA.
- Highly motivated, dynamic, self-starter with keen interest in emerging technologies
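As a concrete illustration of the partitioning and bucketing mentioned above: Hive routes each row to a partition directory by its partition-column value, and to a bucket file by hashing the bucketing column modulo the bucket count. A minimal plain-Python sketch of that placement logic (the table, columns, and data are hypothetical, and Hive's real hash function differs from Python's built-in `hash`):

```python
# Sketch of how Hive-style partitioning/bucketing distributes rows:
# partition = directory chosen by the partition column's value,
# bucket    = file chosen by hash(bucketing column) % num_buckets.
NUM_BUCKETS = 4

def bucket_for(key, num_buckets=NUM_BUCKETS):
    """Return the bucket index for a given bucketing-column value."""
    return hash(key) % num_buckets

# Hypothetical sales rows: partition by region, bucket by customer_id.
rows = [
    {"region": "west", "customer_id": 101, "amount": 40.0},
    {"region": "west", "customer_id": 102, "amount": 15.5},
    {"region": "east", "customer_id": 101, "amount": 22.0},
]

layout = {}
for row in rows:
    partition = row["region"]                # directory per partition
    bucket = bucket_for(row["customer_id"])  # file per bucket
    layout.setdefault((partition, bucket), []).append(row)
```

Because the bucket is a pure function of the key, all rows with the same customer_id land in the same bucket file, which is what makes bucketed map-side joins and sampling efficient.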
TECHNICAL SKILLS:
Big Data Technologies: HDFS, MapReduce, Hive, HCatalog, Pig, Sqoop, Flume, Oozie, Avro, Hadoop Streaming, ZooKeeper, Kafka, Impala, Apache Spark, Hue, Ambari, Apache Ignite
Hadoop Distributions: Cloudera (CDH4/CDH5), Hortonworks
Languages: Java, C, SQL, Python, PL/SQL, Pig Latin, HQL
IDE Tools: Eclipse, NetBeans, RAD
Frameworks: Hibernate, Spring, Struts, JUnit
Web Technologies: HTML5, CSS3, JavaScript, jQuery, AJAX, Servlets, JSP, JSON, XML, XHTML, JSF, AngularJS
Web Services: SOAP, REST, WSDL, JAXB, and JAXP
Operating Systems: Windows (XP, 7, 8), UNIX, Linux, Ubuntu, CentOS
Application Servers: JBoss, Tomcat, WebLogic, WebSphere
Reporting/ETL Tools: Tableau, Power View for Microsoft Excel, Informatica
Databases: Oracle, MySQL, DB2, Derby, PostgreSQL, NoSQL databases (HBase, Cassandra)
PROFESSIONAL EXPERIENCE:
Confidential, Sunnyvale, CA
Hadoop Developer
Responsibilities:
- Evaluated business requirements and prepared detailed design documents following project guidelines and SLAs, which required procuring data from all upstream data sources and developing the corresponding programs.
- Developed and implemented API services using Python in Spark.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Python.
- Developed multiple POCs using PySpark, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
- Maintained and administered HDFS through the Hadoop Java API, shell scripting, and Python.
- Used Python to write scripts that move data across clusters.
- Created Hive external tables, loaded data into the tables, and queried the data using HQL.
- Installed and maintained the Hadoop/Spark cluster from scratch in a plain Linux environment, defining code outputs as PMML.
- Experience in integrating Cassandra with Elasticsearch and Hadoop.
- Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig.
- Developed shell scripts to automate routine DBA tasks (e.g., database refreshes, backups, monitoring).
- Tuned and modified SQL for batch and online processes.
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Oozie, Java, Linux, Maven, Teradata, ZooKeeper, SVN, Autosys, HBase, Cassandra, Python, Spark
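The Hive-to-Spark conversion work above typically turns a SQL GROUP BY into a map/reduceByKey pipeline. A framework-free Python sketch of that pattern, with Spark's `reduceByKey` simulated in-process (the query, column names, and data are hypothetical):

```python
# The HiveQL query
#   SELECT dept, SUM(salary) FROM employees GROUP BY dept
# maps onto Spark RDD transformations roughly as:
#   rdd.map(lambda r: (r.dept, r.salary)).reduceByKey(lambda a, b: a + b)
# reduce_by_key() below simulates that shuffle-and-fold step in plain Python.
from collections import defaultdict

def reduce_by_key(pairs, func):
    """Group (key, value) pairs and fold each group with func,
    mimicking Spark's reduceByKey."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    out = {}
    for key, values in groups.items():
        acc = values[0]
        for v in values[1:]:
            acc = func(acc, v)
        out[key] = acc
    return out

employees = [("eng", 100), ("eng", 120), ("sales", 90)]
pairs = [(dept, salary) for dept, salary in employees]     # the map() step
totals = reduce_by_key(pairs, lambda a, b: a + b)          # the reduceByKey() step
```

On a real cluster the grouping happens across partitions during the shuffle, but the per-key fold is the same associative operation shown here.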
Confidential, Frankfort, KY
Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Wrote multiple MapReduce programs in Java for data analysis.
- Wrote MapReduce jobs using Pig Latin and the Java API.
- Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
- Developed Pig scripts for analyzing large data sets in HDFS.
- Collected logs from the physical machines and the OpenStack controller and integrated them into HDFS using Flume.
- Designed and presented a plan for a POC on Impala.
- Experienced in migrating HiveQL to Impala to minimize query response time.
- Knowledge of handling Hive queries using Spark SQL integrated with the Spark environment.
- Implemented Avro and Parquet data formats for Apache Hive computations to handle custom business requirements.
- Responsible for creating Hive tables, loading the structured data resulting from MapReduce jobs into the tables, and writing Hive queries to further analyze the logs to identify issues and behavioral patterns.
- Worked on sequence files, RC files, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.
- Performed extensive data mining applications using Hive.
- Implemented daily cron jobs that automate parallel tasks for loading data into HDFS, using Autosys and Oozie coordinator jobs.
- Performed streaming of data into Apache Ignite by setting up a cache for efficient data analysis.
- Responsible for performing extensive data validation using Hive.
- Created Sqoop jobs and Pig and Hive scripts for data ingestion from relational databases, for comparison against historical data.
- Utilized Storm for processing large volumes of data.
- Used Kafka to load data into HDFS and move data into NoSQL databases (Cassandra).
- Created HBase tables to load large sets of structured, semi-structured, and unstructured data coming from UNIX, NoSQL, and a variety of portfolios.
- Involved in submitting and tracking MapReduce jobs using the JobTracker.
- Involved in creating Oozie workflow and coordinator jobs to kick off jobs on time as data becomes available.
- Used Pig as an ETL tool for transformations, event joins, filtering, and some pre-aggregations.
- Responsible for cleansing data from source systems using Ab Initio components such as Join, Dedup Sorted, Denormalize, Normalize, Reformat, Filter by Expression, and Rollup.
- Used visualization tools such as Power View for Excel and Tableau for visualizing and generating reports.
- Exported data to Tableau and to Excel with Power View for presentation and refinement.
- Implemented business logic by writing Pig UDFs in Java, and used various UDFs from Piggybank and other sources.
- Implemented Hive generic UDFs to encapsulate business logic.
- Coordinated with end users for designing and implementation of analytics solutions for User Based Recommendations using R as per project proposals.
- Implemented test scripts to support test driven development and continuous integration.
- Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Oozie, Java, Linux, Maven, Teradata, ZooKeeper, SVN, Autosys, Tableau, HBase, Cassandra, Apache Ignite
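The MapReduce jobs above all follow the same mapper/shuffle/reducer contract; a compact Python sketch of that contract, run in-process rather than on a cluster (the input lines are hypothetical):

```python
# Mapper emits (word, 1) pairs; the framework sorts/groups by key;
# the reducer sums each group. Here the shuffle is simulated with
# sorted() + groupby() instead of a distributed sort.
from itertools import groupby
from operator import itemgetter

def mapper(line):
    for word in line.lower().split():
        yield (word, 1)

def reducer(word, counts):
    return (word, sum(counts))

lines = ["big data big wins", "data pipelines"]

# Shuffle/sort phase: collect all mapper output and sort by key.
pairs = sorted((kv for line in lines for kv in mapper(line)),
               key=itemgetter(0))

# Reduce phase: one reducer call per distinct key.
result = dict(reducer(word, (c for _, c in group))
              for word, group in groupby(pairs, key=itemgetter(0)))
```

The same mapper/reducer pair, written against Hadoop's Java API or Hadoop Streaming, would run unchanged in logic; only the shuffle would be performed by the framework across nodes.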
Confidential - Albuquerque, NM
Hadoop Developer
Responsibilities:
- Worked on writing transformer/mapping MapReduce pipelines using Java.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke and run MapReduce jobs in the backend.
- Involved in loading data into HBase using the HBase shell, the HBase client API, Pig, and Sqoop.
- Designed and implemented incremental imports into Hive tables.
- Worked on loading and transforming large sets of structured, semi-structured, and unstructured data.
- Deployed an Apache Solr search engine server to help speed up searches of government cultural assets.
- Involved in collecting, aggregating, and moving data from servers to HDFS using Apache Flume.
- Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
- Experienced in managing and reviewing Hadoop log files.
- Migrated ETL jobs to Pig scripts to perform transformations, event joins, and some pre-aggregations before storing the data in HDFS.
- Implemented workflows using the Apache Oozie framework to automate tasks.
- Worked with the Avro data serialization system to handle JSON data formats.
- Worked on different file formats such as sequence files, XML files, and map files using MapReduce programs.
- Involved in unit testing and delivered unit test plans and results documents using JUnit and MRUnit.
- Developed scripts that automated end-to-end data management and synchronization between all the clusters.
- Involved in the setup and benchmarking of Hadoop/HBase clusters for internal use.
- Set up a Hadoop cluster on Amazon EC2 using Whirr for a POC.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Pig scripts.
Environment: Hadoop, Big Data, HDFS, MapReduce, Sqoop, Oozie, Pig, Hive, HBase, Flume, Linux, Java, Eclipse, Cassandra, Cloudera distribution of Hadoop, PL/SQL, SQL*Plus, Toad 9.6, Windows NT, UNIX shell scripting, PuTTY
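The incremental imports noted above follow Sqoop's incremental-append pattern: record the maximum value of a check column after each run, and pull only rows above it on the next run. A minimal Python sketch of that bookkeeping (the function, column names, and data are hypothetical illustrations, not Sqoop's API):

```python
# Sqoop-style incremental append, sketched: each run imports only rows
# whose check column exceeds the last recorded max, then advances it.
def incremental_import(source_rows, last_value, check_column="id"):
    """Return (rows to import this run, updated last_value)."""
    new_rows = [r for r in source_rows if r[check_column] > last_value]
    new_last = max((r[check_column] for r in new_rows), default=last_value)
    return new_rows, new_last

table = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}]

first, last = incremental_import(table, last_value=0)      # initial full load
table.append({"id": 4, "v": "d"})                          # new source row arrives
second, last = incremental_import(table, last_value=last)  # delta load only
```

Sqoop itself stores the equivalent of `last_value` in its saved-job metastore, so scheduled runs (e.g., from Oozie) pick up where the previous run left off.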
Confidential
Java Developer
Responsibilities:
- Worked with business analyst in understanding business requirements, design and development of the project.
- Implemented the Struts framework with MVC architecture.
- Created new JSPs for the front end using HTML, JavaScript, jQuery, and Ajax.
- Developed JSP pages and configured the module in the application.
- Developed the presentation layer using JSP, HTML, and CSS, with client-side validations using JavaScript.
- Involved in designing, creating, and reviewing technical design documents.
- Developed DAOs (Data Access Objects) using Hibernate as the ORM to interact with the DBMS (Oracle).
- Collaborated with the ETL/Informatica team to determine the necessary data models and UI designs to support Cognos reports.
- Developed an Ab Initio graph that uses Java code to decompress compressed PDF files and store them in a directory.
- Performed several data quality checks, found potential issues, and designed Ab Initio graphs to resolve them.
- Applied J2EE design patterns such as Business Delegate, DAO, and Singleton.
- Deployed and tested the application using the Tomcat web server.
- Performed client-side validation using JavaScript.
- Involved in developing DAOs using JDBC.
- Involved in coding, code reviews, and JUnit testing; prepared and executed unit test cases.
- Used JBoss for application deployment and MySQL for the database.
- Worked with the QA team on the preparation and review of test cases.
- Used JUnit for unit testing and as the integration testing tool.
- Wrote SQL queries to fetch business data, using Oracle as the database.
- Developed the UI for customer service modules and reports using JSF, JSPs, and MyFaces components.
- Used Log4j to log the running system's application events and trace errors and certain automated routine functions.
- Used CVS as the configuration management tool.
Environment: Java, JSP, JavaScript, Servlets, Struts, Hibernate, EJB, JSF, Ant, Tomcat, CVS, Eclipse, SQL Developer, Oracle
Java Developer
Confidential
Responsibilities:
- Developed the application using the Struts framework, which leverages the classical Model-View-Controller (MVC) architecture; UML diagrams such as use cases, class diagrams, interaction diagrams (sequence and collaboration), and activity diagrams were used.
- Gathered business requirements and wrote functional specifications and detailed design documents.
- Extensively used Core Java, Servlets, JSP, and XML.
- Wrote AngularJS controllers, views, and services.
- Designed the logical and physical data models, generated DDL scripts, and wrote DML scripts for an Oracle 9i database.
- Implemented an enterprise logging service (ELS) using JMS and Apache CXF.
- Developed unit test cases and used JUnit for unit testing of the application.
- Implemented a framework component to consume the ELS service.
- Involved in designing user screens and validations using HTML, jQuery, Ext JS, and JSP as per user requirements.
- Implemented JMS producers and consumers using Mule ESB.
- Wrote SQL queries, stored procedures, and triggers to perform back-end database operations.
- Sent email alerts to the supporting team using BMC msend.
- Designed low-level design documents for the ELS service.
Environment: Java, Spring Core, JMS, web services, JDK, SVN, Maven, Mule ESB, JUnit, WAS7, jQuery, Ajax, SAX
Jr. Java Developer
Confidential
Responsibilities:
- Used the Hibernate ORM tool as the persistence layer, using database and configuration data to provide persistence services (and persistent objects) to the application.
- Implemented Oracle Advanced Queuing using JMS and message-driven beans.
- Responsible for developing the DAO layer using Spring MVC and configuration XMLs for Hibernate, and for managing CRUD operations (insert, update, and delete).
- Implemented dependency injection with the Spring framework.
- Developed and implemented the DAO and service classes.
- Developed reusable services using BPEL to transfer data.
- Participated in analysis, interface design, and development of JSPs.
- Configured Log4j to enable/disable logging in the application.
- Wrote SPAs (single-page web applications) using RESTful web services plus Ajax and AngularJS.
- Developed a rich user interface using HTML, JSP, AJAX, JSTL, JavaScript, jQuery, and CSS.
- Implemented PL/SQL queries and procedures to perform database operations.
- Wrote UNIX shell scripts and used the UNIX environment to deploy the EAR and read the logs.
- Implemented Log4j for logging purposes in the application.
- Involved in code deployment activities for different environments.
- Followed an agile development methodology.
Environment: Java, Spring, Hibernate, JMS, EJB, WebLogic Server, JDeveloper, SQL Developer, Maven, XML, CSS, JavaScript, JSON
