
Sr. Hadoop Developer Resume


San Diego, CA

SUMMARY

  • Over nine years of IT work experience, including development, testing, and implementation of Business Intelligence and Data Warehousing solutions.
  • Over eight years of experience with Apache Hadoop components such as HDFS, MapReduce, Hive (HiveQL), and Pig.
  • Experience in installing Cloudera Hadoop CDH4 on an Amazon EC2 cluster.
  • Experience in installing, configuring, and administering Hadoop clusters across major Hadoop distributions.
  • Hands-on experience writing MapReduce jobs using HiveQL and Pig Latin.
  • Hands-on experience installing, configuring, and using Hadoop ecosystem components such as MapReduce, HDFS, Oozie, Hive, Sqoop, Pig, and Flume.
  • Expertise in implementing database projects, covering analysis, design, development, testing, and implementation of end-to-end IT solutions.
  • Extensive knowledge of RDBMS and of developing database applications involving stored procedures, views, triggers, user-defined data types, and functions.
  • Knowledge of all phases of the software development life cycle (SDLC), including system analysis and design, software development, testing, implementation, and documentation.
  • Experience working with MVC architecture, Spring Core, Spring Boot, Struts, and Hibernate.
  • Implemented REST microservices using Spring Boot.
  • Excellent logical, analytical, communication, and interpersonal skills; a fast learner with complex systems, a good team player and problem solver, able to perform at a high level to meet deadlines and adapt to changing priorities.
  • Experience extracting data from log files and copying it into HDFS using Flume.
  • Developed Hive UDFs and Pig UDFs using Python in a Microsoft HDInsight environment.
  • Proficient in writing MapReduce programs and using the Apache Hadoop MapReduce API to analyze structured and unstructured data, including handling RSS feeds in MapReduce.
  • Used Pig for data cleansing and filtering.
  • Experience with streaming tools such as Spark Streaming, Spark Structured Streaming, and Kafka (see the sketch after this list).
  • Developed Hive scripts to perform analysis on the data.
  • Experience developing Sqoop jobs to import data from RDBMS sources into HDFS and to export data from HDFS into relational tables.
  • Strong skills with distributed stream-processing frameworks such as Apache Kafka.
  • Worked on installing and configuring multi-node big data clusters.
  • Good knowledge of Python and RHadoop.
  • Good understanding of NoSQL databases such as HBase and DynamoDB.
  • Used an AWS environment to run MapReduce jobs, load data into HDFS, and export MapReduce output into Hive tables.
  • Extensively worked on Spark with Scala on clusters for analytics; installed Spark on top of Hadoop and built advanced analytical applications using Spark with Hive and SQL/Oracle/Snowflake.
  • Responsible for building scalable distributed solutions using Hadoop MapR.
  • Implemented NiFi workflows for continuous data operations.
  • Consumed data from Kafka and inserted it into an RDBMS using NiFi.
  • Used Airflow DAGs (Directed Acyclic Graphs) to schedule tasks automatically and to send email alerts when a task fails.
  • Created Looper jobs using Jenkins and created operational playbooks.
  • Knowledge of data processing solutions using data streaming technologies in Azure Cloud.
  • Experience with data management tools and data pipelines.
  • Performed graph analysis with thresholds based on the mean and variance of the linkages.
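
A minimal PySpark Structured Streaming sketch of the Kafka consumption pattern referenced above; the broker address, topic name, and HDFS paths are hypothetical placeholders, not values from any particular project.

```python
# Minimal sketch: consume a Kafka topic with Spark Structured Streaming and
# land the raw records in HDFS. Broker, topic, and paths are placeholders.
# Submit with the matching Kafka connector, e.g.
#   spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.2.0 ...
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = (SparkSession.builder
         .appName("kafka-structured-streaming-sketch")
         .getOrCreate())

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")   # placeholder broker
          .option("subscribe", "clickstream")                  # placeholder topic
          .option("startingOffsets", "latest")
          .load()
          .select(col("key").cast("string"), col("value").cast("string")))

query = (events.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/clickstream/raw")            # placeholder path
         .option("checkpointLocation", "hdfs:///checkpoints/clickstream")
         .start())

query.awaitTermination()
```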

TECHNICAL SKILLS

Big Data Technologies: Hadoop, Scala 2.11.8, HDFS, Hive, MapR 2.7.0, Pig, Sqoop, Flume, Oozie, HBase, Spark 2.2.0, Python 2.7, Kafka

Programming Languages: Java (5, 6, 7), Python, C, C++

Databases/ RDBMS: MySQL, SQL/PL-SQL, MS-SQL Server 2005, Oracle 9i/10g/11g, DB2, Azure SQL Server 2017, Oracle SQL Developer Data Modeler 18.4

Scripting/ Web Languages: JavaScript, HTML5, CSS3, XML, SQL

ETL Tools: Informatica

Operating Systems: Linux CentOS 6.9, Windows XP/7/8/10, UNIX

Software Life Cycles: SDLC, Waterfall and Agile models

Office Tools: MS-Office, MS-Project and Risk Analysis tools

Utilities/Tools: Eclipse, Tomcat, NetBeans, IntelliJ IDEA CE, JUnit, SQL, Automation, MRUnit, Airflow 1.10.2 Scheduler, Jenkins 2.107.3

Cloud Platforms: Amazon EC2

Java Frameworks: MVC, Apache Struts 2.0, Spring, Spring Boot, and Hibernate

NoSQL Database: Cassandra, HBase, DynamoDB

PROFESSIONAL EXPERIENCE

Confidential, San Diego, CA

Sr. Hadoop Developer

Responsibilities:

  • Responsible for building the master member table, which contains the patient information.
  • Worked with the Anvita Edge team to extract data in Parquet file format.
  • Developed SQL server scripts to store the desired patient data in the Oracle server.
  • Worked with Azure DevOps for creating daily work items.
  • Developed generic classes encapsulating frequently used functionality so that it can be reused.
  • Developed new modules in Scala that process patient data.
  • Implemented an exception management mechanism using Exception Handling Application Blocks.
  • Involved in production Hadoop cluster set up, administration, maintenance, monitoring, and support.
  • Worked with 14 million patient records during open enrollment.
  • Supported the database team with SQL Developer scripts for Anvita alerts.
  • Involved in creating Hive tables as per requirements, defined with appropriate static and dynamic partitions (see the sketch after this list).
  • Supported MapReduce programs running on the cluster.
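
A minimal sketch of the static and dynamic Hive partitioning pattern referenced above, using PySpark with Hive support enabled; the database, table, and column names are hypothetical.

```python
# Minimal sketch of static vs. dynamic Hive partitions via PySpark.
# Database, table, and column names are hypothetical.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-partitioning-sketch")
         .enableHiveSupport()
         .getOrCreate())

# Dynamic partition inserts require these Hive session settings.
spark.sql("SET hive.exec.dynamic.partition = true")
spark.sql("SET hive.exec.dynamic.partition.mode = nonstrict")

spark.sql("""
    CREATE TABLE IF NOT EXISTS member.master_member (
        member_id   STRING,
        first_name  STRING,
        last_name   STRING
    )
    PARTITIONED BY (plan_year INT, state STRING)
    STORED AS PARQUET
""")

# Static partition: both partition values are fixed in the statement.
spark.sql("""
    INSERT OVERWRITE TABLE member.master_member
    PARTITION (plan_year = 2019, state = 'CA')
    SELECT member_id, first_name, last_name
    FROM member.staging_members
    WHERE plan_year = 2019 AND state = 'CA'
""")

# Dynamic partition: 'state' is derived from the data at write time and
# must be the last column in the SELECT list.
spark.sql("""
    INSERT OVERWRITE TABLE member.master_member
    PARTITION (plan_year = 2019, state)
    SELECT member_id, first_name, last_name, state
    FROM member.staging_members
    WHERE plan_year = 2019
""")
```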

Environment: Hadoop YARN, MapR 2.7.0, Hive 0.13.1, Scala 2.11.8, Airflow Scheduler 1.10.2, Linux CentOS 6.9, Oracle SQL Developer 18.4, ServiceNow, Azure DevOps, IntelliJ IDEA CE.

Confidential, Sunnyvale, CA

Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed solutions using Hadoop MapR.
  • Worked with the CBB (Customer Backbone) team to retrieve all IDs required to implement customer requests for action with IDLookup.
  • Developed Spark scripts that take a sequence DataFrame as input to the CCPA GraphX traversal JAR and traverse linked IDs for IDLookup.
  • Implemented Scala with Spark to load JSON data from a REST API into a DataFrame, and used DataFrames for transformations and the Spark SQL API for faster data processing.
  • Optimized Spark job configurations for real-time and batch workloads, and tested configurations for optimal performance on joins of around 800 GB and 1 TB with and without bucketing.
  • Developed automated SQL Server reports for IDLookup, delivered by email using shell scripts.
  • Created Airflow DAGs (Directed Acyclic Graphs) in Python to schedule tasks automatically and to send email alerts when a task fails (see the sketch after this list).
  • Implemented unit test cases using shell scripts.
  • Worked on traversal graph analysis on Hive tables with thresholds based on the mean and variance of the linkages.
  • Implemented NiFi workflows for continuous data operations.
  • Created Looper jobs using Jenkins and created operational playbooks.
  • Created a support plan for manual and automated processes.
  • Involved in end-to-end testing in staging and production, and sent daily reports for requests received from ServiceNow and processed by CBB.
  • Extensively worked on Spark with Scala on the cluster for analytics; installed Spark on top of Hadoop and built advanced analytical applications using Spark with Hive and SQL/Oracle/Snowflake.
  • Documented operational playbooks for IDLookup and ServiceNow acceptance support for the team.
  • Worked in an Agile environment, tracking tasks on a Jira board.
  • Worked with data management tools and data pipelines.
  • Consumed GCP (Google Cloud Platform) services and implemented security on big data applications.
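
A minimal Airflow 1.10-style DAG sketch of the scheduling-with-email-alerts pattern referenced above; the DAG id, schedule, alert address, and command are hypothetical.

```python
# Minimal Airflow 1.10-style DAG sketch: schedule a task and email on failure.
# The DAG id, schedule, email address, and command are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    "owner": "data-eng",
    "email": ["oncall@example.com"],   # placeholder alert address
    "email_on_failure": True,          # send an alert when a task fails
    "email_on_retry": False,
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

dag = DAG(
    dag_id="idlookup_daily_sketch",    # hypothetical DAG id
    default_args=default_args,
    start_date=datetime(2019, 1, 1),
    schedule_interval="0 6 * * *",     # daily at 06:00
    catchup=False,
)

run_lookup = BashOperator(
    task_id="run_idlookup_job",
    bash_command="spark-submit /opt/jobs/idlookup.py",  # placeholder command
    dag=dag,
)
```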

Environment: Hadoop YARN, Spark 2.2.0, Spark Core, Spark SQL, Scala 2.11.8, Python 2.7, MapR 2.7.0, Hive 0.13.1, Airflow Scheduler 1.10.2, Linux CentOS 6.9, Azure SQL Server 2017, ServiceNow, Jira, Jenkins 2.107.3, IntelliJ IDEA CE.

Confidential, North Chicago, IL

Hadoop Developer

Responsibilities:

  • Obtained requirement specifications from SMEs and business analysts in BR and SR meetings for the corporate workplace project, and interacted with business users to build sample report layouts.
  • Involved in writing HLDs along with RTMs tracing back to the corresponding BRs and SRs, and reviewed them with the business.
  • Implemented an enterprise-level transfer pricing system to ensure tax-efficient supply chains and achieve entity profit targets.
  • The IOP implementation involved understanding business requirements and solution design, translating the design into model construction, loading data using ETL logic, validating data, and creating several custom reports per end-user requirements.
  • Installed and configured Apache Hadoop and the Hive/Pig ecosystems.
  • Installed and configured Cloudera Hadoop CDH4 via Cloudera Manager in pseudo-distributed and cluster modes.
  • Developed Python APIs that represent the memory subsystem.
  • Developed Hive UDFs and Pig UDFs using Python in a Microsoft HDInsight environment (see the sketch after this list).
  • Developed Oozie workflows to automate loading data into HDFS and pre-processing it with HiveQL.
  • Developed Python APIs to dump the array structures in the processor at the failure point for debugging.
  • Developed MapReduce programs to extract and transform data sets; the resulting data sets were loaded into Cassandra and vice versa using Kafka 2.0.x.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke and run MapReduce jobs in the backend.
  • Developed a Spark Streaming custom receiver to process data from RabbitMQ into Cassandra and Aerospike tables.
  • Worked on XML stubs, integrating them with the Excel VB code and the backend database.
  • Created MapReduce jobs using Hive/Pig queries.
  • Used NoSQL database services such as DynamoDB.
  • Responsible for data ingestion using tools such as Flume and Kafka.
  • Involved in installing, configuring, and managing Hadoop ecosystem components such as Spark, Hive, Pig, Sqoop, Kafka, and Flume.
  • Used Spark Streaming and the Spark SQL API to process files.
  • Used Apache Spark with Python to develop and execute big data analytics.
  • Imported and exported data into HDFS using Sqoop, Flume, and Kafka.
  • Designed outbound packages to dump IOP-processed data into the out tables for the data warehouse and the Cognos BI team.
  • Worked on DB2 to store, analyze, and retrieve data.
  • Involved in unit testing, system integration testing, and UAT post development.
  • Provided end-user training and configured reports in IOP.
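
One common way to run Python logic inside Hive (including on HDInsight) is a streaming script invoked through TRANSFORM; a minimal sketch follows, with a hypothetical column layout and date format.

```python
#!/usr/bin/env python
# Minimal sketch of a Python "UDF" for Hive, implemented as a streaming
# script used with TRANSFORM. Hive pipes rows to stdin as tab-separated
# fields and reads tab-separated rows back from stdout.
# The column layout (id, raw_date) and date format are hypothetical.
#
# Example Hive usage (also hypothetical):
#   ADD FILE normalize_date.py;
#   SELECT TRANSFORM (id, raw_date)
#          USING 'python normalize_date.py'
#          AS (id, iso_date)
#   FROM events;
import sys
from datetime import datetime

for line in sys.stdin:
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 2:
        continue  # skip malformed rows
    row_id, raw_date = fields
    try:
        iso_date = datetime.strptime(raw_date, "%m/%d/%Y").strftime("%Y-%m-%d")
    except ValueError:
        iso_date = "\\N"  # Hive's NULL marker for unparseable dates
    print("\t".join([row_id, iso_date]))
```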

Environment: Oracle IOP, Apache Hadoop, HDFS, Sqoop, Flume, Kafka, Cassandra, Cloudera Hadoop CDH4, HiveQL, Pig Latin, Spark, DynamoDB.

Confidential, Great Neck, NY

Hadoop Developer

Responsibilities:

  • Worked as a senior developer on the project.
  • Used Enterprise JavaBeans as middleware in developing a three-tier distributed application.
  • Developed session beans and entity beans for business and data processing.
  • Implemented RESTful web services.
  • Developed the user interface using HTML, CSS, JSPs, and AJAX.
  • Performed client-side validation with JavaScript and jQuery, and applied server-side validation to the web pages as well.
  • Used an AWS environment to run MapReduce jobs, load data into HDFS, and export MapReduce output into Hive tables.
  • Used JIRA for bug tracking of the web application.
  • Wrote Spring Core and Spring MVC configuration files to associate the DAO with the business layer.
  • Worked with HTML, DHTML, CSS, and JavaScript in UI pages.
  • Wrote SOAP web services for sending data to and getting data from the external interface.
  • Extensively worked with the JUnit framework to write test cases for unit testing the application.
  • Developed a Spark Streaming custom receiver to process data from RabbitMQ into Cassandra and Aerospike tables.
  • Developed real-time data ingestion from Kafka to Elasticsearch using Kafka and Elasticsearch input and output plugins (see the sketch after this list).
  • Implemented JDBC modules in Java beans to access the database.
  • Designed the tables for the back-end Oracle database.
  • The application was hosted on WebLogic and developed using the Eclipse IDE.
  • Used XSL/XSLT for transforming and displaying reports, and developed schemas for XML.
  • Involved in writing Ant scripts to build and deploy the application.
  • Developed web-based reporting for the monitoring system with HTML and Tiles using the Struts framework.
  • Implemented field-level validations with AngularJS, JavaScript, and jQuery.
  • Designed and implemented multi-tier applications using web technologies such as Spring MVC and Spring Boot.
  • Prepared unit test scenarios and unit test cases.
  • Used DynamoDB for running applications.
  • Branded the site with CSS.
  • Created alter, insert, and delete queries involving lists, sets, and maps in DataStax Cassandra.
  • Worked with Spark on parallel computing to deepen knowledge of RDDs with DataStax Cassandra.
  • Worked with Scala to demonstrate to management the flexibility of Scala on Spark and Cassandra.
  • Performed code reviews and unit testing.
  • Used DB2 with support for object-oriented features and non-relational structures with XML.
  • Involved in unit testing using JUnit.
  • Implemented Log4j to trace logs and track information.
  • Involved in project discussions with clients, analyzed complex project requirements, and prepared design documents.
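
The Kafka-to-Elasticsearch ingestion above was plugin-based; purely as an illustration of the same pattern under different assumptions, here is a minimal Python sketch using the kafka-python and elasticsearch client libraries, with a hypothetical topic, broker, and index.

```python
# Illustrative sketch only: the ingestion described above used input/output
# plugins, but the same Kafka-to-Elasticsearch pattern can be expressed with
# the kafka-python and elasticsearch client libraries. Topic, broker, and
# index names are hypothetical.
import json

from kafka import KafkaConsumer          # pip install kafka-python
from elasticsearch import Elasticsearch  # pip install elasticsearch

consumer = KafkaConsumer(
    "weblogs",                                   # placeholder topic
    bootstrap_servers=["broker1:9092"],          # placeholder broker
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="latest",
)

es = Elasticsearch(["http://localhost:9200"])    # placeholder cluster URL

for message in consumer:
    # Index each Kafka record as a document in Elasticsearch.
    # 'document=' assumes elasticsearch-py 7.15+; older clients use body=.
    es.index(index="weblogs", document=message.value)
```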

Environment: Hive, Pig, HBase, ZooKeeper, Sqoop, Kafka, Cassandra, Cloudera, Java, JDBC, JNDI, Struts, Maven, Subversion, JUnit, SQL, DB2, Spring, Hibernate, Oracle, XML, PuTTY, Eclipse, DynamoDB, AWS.

Confidential

Hadoop Developer

Responsibilities:

  • Involved in automating clickstream data collection and storage into HDFS using Flume.
  • Involved in creating a data lake by extracting customer data from various data sources into HDFS.
  • Used Sqoop to load data from the Oracle database into Hive.
  • Developed MapReduce programs to cleanse the data in HDFS obtained from multiple data sources (see the sketch after this list).
  • Implemented various Pig UDFs to convert unstructured data into structured data.
  • Developed Pig Latin scripts for data processing.
  • Involved in writing optimized Pig scripts, along with developing and testing Pig Latin scripts.
  • Involved in creating Hive tables as per requirements, defined with appropriate static and dynamic partitions.
  • Used Hive to analyze the data in HDFS to identify issues and behavioral patterns.
  • Involved in production Hadoop cluster set up, administration, maintenance, monitoring, and support.
  • Handled logical implementation of and interaction with HBase.
  • Assisted in creating large HBase tables using large data sets from various portfolios.
  • Provided cluster coordination services through ZooKeeper.
  • Efficiently put and fetched data to/from HBase by writing MapReduce jobs.
  • Developed MapReduce jobs to automate the transfer of data to and from HBase.
  • Assisted with adding Hadoop processing to the IT infrastructure.
  • Used Flume to collect the entire web log from the online ad servers and push it into HDFS.
  • Implemented custom business logic by writing UDFs in Java, and used various UDFs from Piggybank and other sources.
  • Implemented and executed MapReduce jobs to process the log data from the ad servers.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Served as back-end Java developer for the Data Management Platform (DMP), building RESTful APIs so that other groups could build dashboards.
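
The cleansing MapReduce programs above may well have been written in Java; as a hedged illustration of the same idea, here is a minimal Hadoop Streaming mapper in Python, with a hypothetical tab-separated log layout.

```python
#!/usr/bin/env python
# Illustrative Hadoop Streaming mapper for cleansing ad-server web logs.
# The log layout (timestamp, ip, url, status) is hypothetical.
#
# Example invocation (paths and jar location are placeholders):
#   hadoop jar hadoop-streaming.jar \
#       -input /data/adserver/raw -output /data/adserver/clean \
#       -mapper clean_logs.py -reducer NONE -file clean_logs.py
import sys

for line in sys.stdin:
    parts = line.strip().split("\t")
    if len(parts) != 4:
        continue                      # drop malformed records
    timestamp, ip, url, status = parts
    if not status.isdigit():
        continue                      # drop records with a bad status code
    # Emit the cleansed, normalized record.
    print("\t".join([timestamp, ip, url.lower(), status]))
```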

Environment: Hadoop, Pig, Sqoop, Oozie, MapReduce, HDFS, Hive, Java, Eclipse, HBase, Flume, Oracle 10g, UNIX Shell Scripting, GitHub, Maven.

Confidential

Hadoop Developer

Responsibilities:

  • Extracted data from flat files and other RDBMS databases into a staging area and populated the data warehouse.
  • Worked on Spark and Cassandra for user behavior analysis and lightning-fast execution.
  • Developed mapping parameters and variables to support SQL overrides.
  • Used existing ETL standards to develop these mappings.
  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleansing and preprocessing.
  • Imported and exported data into HDFS and Hive using Sqoop (see the sketch after this list).
  • Used UDFs to implement business logic in Hadoop.
  • Extracted files from Oracle and DB2 through Sqoop, placed them in HDFS, and processed them.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Responsible for managing data coming from different sources.
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from the UNIX file system to HDFS.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Worked on JVM performance tuning to improve MapReduce job performance.
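
Sqoop imports are driven from the command line, so a common pattern is to wrap them in a small Python (or shell) helper; a minimal sketch follows, with a hypothetical JDBC URL, credentials file, and table names.

```python
# Minimal sketch: wrap a Sqoop import into Hive in a Python helper.
# The connection string, credentials file, and table names are hypothetical.
import subprocess

sqoop_cmd = [
    "sqoop", "import",
    "--connect", "jdbc:oracle:thin:@//dbhost:1521/ORCL",  # placeholder JDBC URL
    "--username", "etl_user",
    "--password-file", "/user/etl/.oracle_pwd",           # password kept in HDFS
    "--table", "CUSTOMERS",
    "--hive-import",                                      # load straight into Hive
    "--hive-table", "staging.customers",
    "--num-mappers", "4",
]

# Raise if the import fails so the calling scheduler can alert.
subprocess.check_call(sqoop_cmd)
```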

Environment: Hadoop, MapReduce, HDFS, Hive, Oracle 11g, Java, Struts, Servlets, HTML, XML, SQL, J2EE, JUnit, Tomcat 6.

Confidential

Java Developer

Responsibilities:

  • Implemented the project according to the Software Development Life Cycle (SDLC).
  • Implemented JDBC for mapping an object-oriented domain model to a traditional relational database.
  • Created stored procedures to manipulate the database and apply business logic according to the users' specifications.
  • Developed generic classes encapsulating frequently used functionality so that it can be reused.
  • Implemented an exception management mechanism using Exception Handling Application Blocks.
  • Designed and developed user interfaces using JSP, JavaScript, and HTML.
  • Developed a server-side application to interact with the database using Spring Boot.
  • Involved in database design and developing SQL queries and stored procedures on MySQL.
  • Used CVS for maintaining the source code.
  • Implemented REST microservices using Spring Boot.
  • Logging was done through Log4j.

Environment: Java, JavaScript, Spring Boot, HTML, JDBC drivers, SOAP web services, UNIX, shell scripting, SQL Server, Microservices.
