Hadoop Developer Resume
San Jose, CA
SUMMARY
- 8+ years of professional IT experience in analysis, development, integration, and maintenance of web-based and client/server applications using Java and Big Data technologies.
- 5+ years of experience in Hadoop development and analysis, working with technologies such as Hive, Pig, Java MapReduce, UNIX, and HDFS.
- Strong experience working with HDFS, MapReduce, Spark, Hive, Pig, Sqoop, Flume, Kafka, Yarn, Oozie and HBase.
- 2+ years of experience in development, Linux administration, and implementation and maintenance of web servers and distributed enterprise applications.
- Experience in all phases of software development life cycle (SDLC), which includes User Interaction, Business Analysis/Modelling, Design/Architecture, Development, Implementation, Integration, Documentation, Testing, and Deployment.
- Experience in analyzing business requirements and creating Hive or Pig scripts to process and aggregate data.
- Good understanding of real-time data processing using Spark.
- Involved in preparation of Test Plans, Test Cases & Test Scripts based on business requirements, rules, data mapping requirements and system specifications.
- Ingested data using Sqoop from various RDBMSs such as Oracle, MySQL, and Microsoft SQL Server into Hadoop HDFS.
- Experience in implementing open-source frameworks such as Spring, Hibernate, and Web Services.
- Troubleshot configuration issues of Hadoop environments in development and operations.
- Experience in Continuous Integration and Continuous Deployment using tools such as Jenkins.
- Experience in processing streaming data into clusters through Kafka and Spark Streaming (a brief sketch follows this summary).
- Experience with databases such as PostgreSQL and MySQL Server, including cluster setup and writing SQL queries, triggers, and stored procedures.
- Experience in collecting, aggregating, and moving data from various sources using Apache Flume and Kafka.
- Very good understanding and working knowledge of object-oriented programming (OOP), Python, and Scala.
- Experienced with Spark in improving the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Proficient in working with NoSQL databases such as MongoDB, Cassandra, and HBase (column-family store).
- Good knowledge of Hadoop MRv1 and MRv2 (YARN) architectures.
- Communicated with diverse client communities, offshore and onshore, with a dedication to client satisfaction and quality outcomes; extensive experience coordinating offshore development activities.
- Highly organized and dedicated, with good time management and organizational skills and the ability to handle multiple tasks with a positive attitude.
- A team player with good interpersonal, communication and leadership skills.
- Easily adaptable to working conditions, consistently delivering quality work, and capable of adopting new technologies and facing new challenges.
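As a brief illustration of the Kafka and Spark Streaming experience noted above, the following is a minimal PySpark Structured Streaming sketch; the broker address, topic name, schema, and HDFS paths are hypothetical placeholders rather than values from any specific engagement.

```python
# Minimal PySpark Structured Streaming sketch: consume JSON events from a
# Kafka topic and write the parsed records to HDFS as Parquet.
# Requires the spark-sql-kafka connector on the classpath; broker, topic,
# schema, and paths below are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, LongType

spark = SparkSession.builder.appName("kafka-stream-sketch").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_ts", LongType()),
    StructField("payload", StringType()),
])

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092")  # hypothetical broker
       .option("subscribe", "events")                       # hypothetical topic
       .load())

events = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(from_json(col("json"), event_schema).alias("e"))
          .select("e.*"))

query = (events.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/events")             # hypothetical path
         .option("checkpointLocation", "hdfs:///chk/events")
         .start())
query.awaitTermination()
```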
TECHNICAL SKILLS
Big Data Ecosystems: Hadoop, Teradata, Map Reduce, Spark, HDFS, HBase, Pig, Hive, Sqoop, Oozie, Storm, Scala, Kafka and Flume.
Programming Languages: Java (J2SE, J2EE), C, C#, PL/SQL, Swift, SQL+, ASP.NET, JDBC, Python.
Web Development: JavaScript, jQuery, HTML 5.0, CSS 3.0, AJAX, JSON
Development Tools: Net Beans 8.0.2, Visual Studio 2013, Eclipse Neon, Android Studio, SQL developer
Testing Tools: JUnit, HP Unified Functional Testing, HP Performance Center, Selenium, WinRunner, LoadRunner, QTP
UNIX Tools: Apache, Yum, RPM
Operating Systems: Windows, Linux, Ubuntu, Mac OS, Red Hat Linux
Protocols: TCP/IP, HTTP and HTTPS
Web Servers: Apache Tomcat
Cluster Management Tools: Cloudera Manager, Hortonworks, Ambari
Methodologies: Agile, V-model, Waterfall model
Databases: HBase, MongoDB, Cassandra, Oracle 10g, MySQL, Couch, MS SQL server
PROFESSIONAL EXPERIENCE
Confidential, San Jose, CA
Hadoop Developer
Responsibilities:
- The GVS-CS project comprises multiple teams; worked on the Data Engineering team.
- The team's focus is ingesting data from different vendors and processing that data according to business rules.
- After processing, the data is delivered to the Eloqua tool.
- Involved in the Hadoop security architecture work that added different users to the same YARN queue in the development and production clusters.
- After adding the users, validated sample jobs to confirm that the new users were allocated to the same YARN queue in the respective clusters.
- Also involved in the security architecture for the Google Cloud Platform, which is in the process of being rolled out to the Google Cloud projects.
- The security architecture for the Google platform is essentially two-step verification for anyone accessing the cloud projects.
- Used Spark SQL and Hive to validate large data sets against the business rules (a brief sketch follows this list).
- Involved in discussions on automating the Hadoop data pipelines on our Hadoop platform.
- For data pipeline automation, implementing Jenkins to trigger builds automatically on Git pushes.
- A number of offers go live weekly and monthly based on client requirements.
- Involved in cleaning up the database as required, including Hive tables and Python scripts.
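A minimal sketch of the Spark SQL/Hive validation pattern referenced above; the database, table, column names, and rule thresholds are hypothetical examples rather than the actual GVS-CS business rules.

```python
# Sketch: validate a Hive data set against simple business rules with Spark SQL.
# Database, table, and thresholds are hypothetical examples.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = (SparkSession.builder
         .appName("offer-validation-sketch")
         .enableHiveSupport()
         .getOrCreate())

offers = spark.table("vendor_stage.offers")   # hypothetical Hive table

# Business-rule checks: required keys present, discounts within range,
# no duplicate offer IDs.
null_keys   = offers.filter("offer_id IS NULL OR vendor_id IS NULL").count()
bad_amounts = offers.filter("discount_pct < 0 OR discount_pct > 100").count()
dupes       = offers.groupBy("offer_id").count().filter(col("count") > 1).count()

if null_keys or bad_amounts or dupes:
    raise SystemExit(
        f"validation failed: nulls={null_keys}, bad_amounts={bad_amounts}, dupes={dupes}")

# Only validated records move on to the downstream (Eloqua) feed.
offers.write.mode("overwrite").saveAsTable("curated.offers_validated")
```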
Environment: Hadoop, HDFS, Hive, Python, Spark, SQL, Jenkins, UNIX Shell Scripting, Big Data, Map Reduce, Git, Eloqua.
Confidential, Plano, TX
Hadoop Developer
Responsibilities:
- Expert in implementing advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
- Used Flume, Sqoop, Hadoop, Spark, and Oozie for building data pipelines.
- Provided cluster coordination services through ZooKeeper.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Automated all the jobs for pulling data from the FTP server and loading it into Hive tables, using Oozie workflows.
- Experienced in managing and reviewing Hadoop log files.
- Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
- Solved performance issues in Hive and Pig scripts with an understanding of joins, grouping, and aggregation and how they translate to MapReduce jobs.
- Developed Oozie workflows for scheduling and orchestrating the ETL process. Designed and implemented Java MapReduce programs to support distributed data processing.
- Worked with highly unstructured and semi-structured data of 30TB in size (90TB with replication factor of 3).
- Contributed towards developing a Data Pipeline to load data from different sources like Web, RDBMS, and NoSQL to Apache Kafka or Spark cluster.
- Migrated data from Spark RDDs into HDFS and NoSQL stores such as Cassandra and HBase.
- Implemented Pig Latin scripts to handle data preprocessing and normalization.
- Worked on reading multiple data formats on HDFS using PySpark.
- Developed Kafka producer and consumers, HBase clients, Spark and Hadoop MapReduce jobs along with components on HDFS, Hive.
- Developed MapReduce programs using Java.
- Worked extensively on the Spark Core and Spark SQL modules.
- Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance (see the sketch after this list).
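A minimal sketch of the Hive partitioning and bucketing pattern referenced above, issued as HiveQL through a PySpark session; database, table, column, and location names are hypothetical.

```python
# Sketch: external table over raw HDFS files plus a managed, partitioned,
# bucketed table for optimized queries. Names and paths are hypothetical.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-ddl-sketch")
         .enableHiveSupport()
         .getOrCreate())

# External table over raw files landed in HDFS (schema-on-read).
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS raw.events (
        event_id STRING,
        user_id  STRING,
        payload  STRING
    )
    PARTITIONED BY (event_date STRING)
    STORED AS PARQUET
    LOCATION 'hdfs:///data/raw/events'
""")

# Managed table, partitioned by date and bucketed by user for join performance.
spark.sql("""
    CREATE TABLE IF NOT EXISTS curated.events (
        event_id STRING,
        user_id  STRING,
        payload  STRING
    )
    PARTITIONED BY (event_date STRING)
    CLUSTERED BY (user_id) INTO 32 BUCKETS
    STORED AS ORC
""")
```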
Environment: Hadoop, HDFS, Hive, Python, Scala, Spark, SQL, Teradata, UNIX Shell Scripting, Big Data, Map Reduce, Sqoop, Oozie, Pig, Flume, LINUX, Java, Eclipse.
Confidential, South Portland, ME
Hadoop Developer
Responsibilities:
- Worked in Multi Clustered Hadoop Eco-System environment.
- Created MapReduce programs using the Java API that filter unnecessary records and identify unique records based on different criteria.
- Used Python's unittest library for testing many Python programs and blocks of code.
- Parsed JSON and XML data using Python (see the sketch after this list).
- Rewrote an existing Java application as a Python module to deliver data in specific formats.
- Loaded and transformed large sets of unstructured data from UNIX systems into HDFS.
- Used Apache Sqoop to load user data into HDFS on a weekly basis.
- Created production jobs using Oozie workflows that integrated different actions such as MapReduce, Sqoop, and Hive.
- Involved in importing real-time data into Hadoop using Kafka and implemented a daily Oozie job.
- Involved in developing Hive DDLs to create, alter and drop Hive tables.
- Experienced in transferring data from different data sources into HDFS using Kafka producers.
- Prepared an ETL pipeline with the help of Sqoop for downstream consumption.
- Wrote Pig scripts to analyze Hadoop logs.
- Created tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
- Troubleshot and debugged Hadoop ecosystem run-time issues.
- Participated in all phases of the SDLC, including requirement gathering, analysis, estimation, design, coding, testing, and documentation.
- Developed a SOAP web service as a publisher/producer.
- Developed various GUI screens as JSPs using HTML, JavaScript, and CSS.
- Designed the user interface of the application using AngularJS, Bootstrap, HTML5, CSS3, and JavaScript.
- Designed and developed the front-end graphical user interface with JSP, HTML5, CSS3, JavaScript, and jQuery.
- Developed entire front-end and back-end modules using Python on the Django web framework.
- Developed tools using Python, shell scripting, XML, and Big Data technologies to automate some of the routine tasks.
- Performed Single Point of Technical Contact for different application teams and DEV, QA, Line Managers.
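A minimal sketch of the Python JSON/XML parsing referenced above, using only the standard library; file names and field names are hypothetical.

```python
# Sketch: parse line-delimited JSON and an XML document with the Python
# standard library. File and field names are hypothetical examples.
import json
import xml.etree.ElementTree as ET

def parse_json_records(path):
    """Read a file of one JSON object per line and yield (id, status) pairs."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            record = json.loads(line)
            yield record.get("id"), record.get("status")

def parse_xml_users(path):
    """Extract (id, email) pairs from <user> elements in an XML document."""
    tree = ET.parse(path)
    for user in tree.getroot().iter("user"):
        yield user.get("id"), user.findtext("email")

if __name__ == "__main__":
    for rec in parse_json_records("sample_records.json"):
        print(rec)
    for user in parse_xml_users("sample_users.xml"):
        print(user)
```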
Environment: Hadoop MapReduce, Hive, HDFS, Java, CSV files, Python, Django, AWS, XML, Shell Scripting, MySQL, HTML, XHTML, Jenkins, Linux.
Confidential, Boston, MA
Hadoop Data Analyst
Responsibilities:
- Used Hive queries and Pig scripts to analyze data.
- Used Hive partitioning and bucketing of data from different kinds of sources to improve performance.
- Followed Agile (Scrum) methodology during development of the project and tracked the software development through daily stand-ups.
- Used Oozie to automate the flow of jobs and ZooKeeper for coordination.
- Used Flume to ingest unstructured and semi-structured data.
- Used Sqoop to ingest structured data.
- Wrote shell scripts run as cron jobs to automate the data migration process from external servers and FTP sites.
- Prepared an ETL pipeline with the help of Sqoop, Pig, and Hive to frequently bring in data from the source and make it available for consumption.
- Used Tableau for visualization and generate reports for financial data consolidation, reconciliation and segmentation.
- Involved in loading data from UNIX file system to HDFS.
- Created partitioned tables in Hive.
- Developed MapReduce programs using Java.
- Developed various Hive UDFs to add functionality to Hive scripts.
- Implemented Kafka messaging services to stream large volumes of data and insert them into the database (a brief sketch follows this list).
- Analyzed large data sets by writing Pig scripts.
- Developed MapReduce programs over the files generated by Hive query processing to produce key-value pairs and load the data into the NoSQL database HBase.
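A minimal sketch of the Kafka producer pattern referenced above, assuming the kafka-python client; the broker address, topic, and record layout are hypothetical.

```python
# Sketch: publish JSON records to a Kafka topic for downstream consumers to
# load into the database. Uses the kafka-python client (pip install kafka-python);
# broker, topic, and record fields are hypothetical examples.
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="broker1:9092",                      # hypothetical broker
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    acks="all",
)

def publish_transactions(records, topic="transactions"):   # hypothetical topic
    """Send each record to Kafka; downstream consumers insert them into the DB."""
    for record in records:
        producer.send(topic, value=record)
    producer.flush()

if __name__ == "__main__":
    publish_transactions([
        {"txn_id": "t-001", "amount": 120.50, "currency": "USD"},
        {"txn_id": "t-002", "amount": 75.00, "currency": "USD"},
    ])
```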
Environment: HDFS, Hive, MapReduce, Java, NoSQL, Unix, Linux, Jenkins, shell scripting, MySQL, Spreadsheet.
Confidential, Fort Worth, TX
Data Analyst
Responsibilities:
- Communicated effectively in both a verbal and written manner to client and offshore team.
- Completed documentation on all assigned systems and databases, including business rules, logic, and processes.
- Created Test data and Test Cases documentation for regression and performance.
- Designed, built and implemented relational databases.
- Determined changes in physical database by studying project requirements.
- Developed intermediate business knowledge of the functional area and its processes to understand how data supports the business function.
- Facilitated gathering of moderately complex business requirements by defining the business problem.
- Facilitated teh monthly Opportunities for Improvement (OFI) meeting.
- Identified Opportunities for Improvement (OFI) and recommended and implemented, as applicable, process improvement plans in collaboration with the relevant departments.
- Identified and addressed outliers in an efficient and professional manner following a predetermined protocol.
- Identified data requirements and isolated data elements.
- Leveraged a basic understanding of multiple data structures and sources.
- Maintained and assisted in the development of moderately complex business solutions, including data, reporting, and business intelligence/analytics.
- Maintained data dictionary by revising and entering definitions.
- Maintained direct, timely and appropriate communication with clients.
- Supported data governance, integrity, quality and audit functions.
- Supported teh implementation of technical data solutions and standards.
- Utilized and prepared analysis reports summarizing Opportunities for Improvements (OFIs).
- Worked closely with other members of the database group.
Environment: Linux, Unix, Java, spreadsheet, QlikView, SQL, Excel, shell scripting, MySQL.
Confidential
Java Developer
Responsibilities:
- Used Eclipse as an IDE for development of teh application.
- Developed Application in Jakarta Struts Framework using MVC architecture.
- Implemented J2EE design patterns Session Facade pattern, Singleton Pattern.
- Created Action Forms and Action classes for the modules.
- Customized all the JSP pages with a consistent look and feel using Tiles and CSS.
- Developed JSPs to validate information automatically using Ajax.
- Created struts-config.xml and tiles-def.xml files.
- Involved in coding the presentation layer using Apache Struts, XML, and JavaScript.
- Used XSLT for UI to display XML Data.
- Utilized JavaScript for client-side validation. Participated in designing the user interface for the application using HTML and connected it to the database using JDBC.
- Created web pages based on the requirements and styled them using CSS.
- Involved in writing client-side scripts using JavaScript and server-side scripts using JavaBeans, and used Servlets for handling the business logic.
- Developed teh Form Beans and Data Access Layer classes.
- Involved in writing complex sub-queries and used Oracle for generating on-screen reports.
- Worked on database interaction layer for insertions, updating and retrieval operations on data.
- Involved in deploying the application in the test environment using Apache Tomcat.
Environment: JSP, Core Java, Servlets, Struts, UML, AJAX, SQL, JUNIT, JavaScript, Eclipse, JIRA, HTML, CSS.