Hadoop Developer Resume
SUMMARY
- 8+ years of total IT experience in the analysis, design, testing, development, and implementation of Data Warehouse/Data Mart, OLAP, Web, and Business Intelligence applications on platforms such as Windows and Unix.
- Includes 4 years of hands-on experience with Big Data technologies and the Hadoop framework and its ecosystem, including MapReduce programming, Hive, Sqoop, NiFi, HBase, Impala, and Flume.
- Experience working with the Hortonworks and Cloudera Hadoop distributions.
- Experience in analyzing data using HiveQL and custom MapReduce programs.
- Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
- Experienced in designing, building, and deploying a multitude of applications utilizing much of the AWS stack (including EC2 and S3), focusing on high availability, fault tolerance, and auto-scaling.
- Experience with Amazon Web Services, AWS command line interface, and AWS data pipeline.
- Experienced in BI tools like Tableau.
- Excellent experience using TextMate on Ubuntu for writing Java, Scala, and shell scripts.
- Expert in implementing advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala (a minimal sketch follows this summary).
- Extended Hive core functionality by writing custom UDFs, UDTFs, and UDAFs.
- Collected data from different sources like web servers and social media using Flume for storing in HDFS and analyzing the data using other Hadoop technologies.
- Good knowledge of configuring and maintaining YARN schedulers.
- Experience using the Zookeeper and Oozie operational services to coordinate clusters and schedule workflows.
- Hands-on experience working with NoSQL databases such as HBase.
- Experience processing semi-structured data (XML, JSON, and CSV) in Hive/Impala.
- Good experience in shell scripting and Python programming.
- Good knowledge of Java, JDBC, Collections, JSP, JSON, REST, SOAP web services, and Eclipse.
- Experienced with UNIX commands.
- Developed and maintained web applications running on Apache Web server.
- Experience working in an Agile software development environment.
- Exceptional ability to learn new technologies and to deliver results under tight deadlines.
- Exceptional ability to quickly master new concepts; capable of working in a group as well as independently, with excellent communication skills.
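A minimal sketch, assuming hypothetical HDFS paths and application name, of the kind of Spark text-processing job in Scala referenced above:

```scala
// Illustrative only: tokenize raw text from HDFS and count word frequencies with Spark.
// Paths and the application name are hypothetical placeholders.
import org.apache.spark.sql.SparkSession

object WordFrequency {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("word-frequency").getOrCreate()
    import spark.implicits._

    // Read raw text, split into lowercase tokens, and count occurrences in memory.
    val counts = spark.read.textFile("hdfs:///data/raw/text")        // hypothetical input path
      .flatMap(_.toLowerCase.split("\\W+"))
      .filter(_.nonEmpty)
      .groupByKey(identity)
      .count()
      .toDF("word", "count")

    // Persist the results for downstream reporting.
    counts.orderBy($"count".desc)
      .write.mode("overwrite")
      .parquet("hdfs:///data/curated/word_counts")                   // hypothetical output path
    spark.stop()
  }
}
```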
TECHNICAL SKILLS
Languages: C, C++, Java, Visual Basic, COBOL, UNIX Shell Scripting, Hive, Hadoop, Scala
Frameworks: MapReduce, Spark
Database: Teradata (V13/V12), Oracle (9i/10g/11g), MS SQL Server, DB2, Data Lake.
Web Scripting: JavaScript, ETL, CSS
Web Services: SOAP and REST
Cloud Services: AWS, EC2, S3
Ingestion Tools: Sqoop, NiFi, Flume, Kafka
Others: AutoSys, Control-M, PL/SQL, XML
BI/GUI Tools: Tableau, MS Project, Visio, MS Office (Word, Excel, PowerPoint, Access), XML
Operating Systems: IBM AIX, HP UNIX, Solaris, Windows and Sun OS
Others: JIRA (tracking), GitHub (repository), Jenkins (build), uDeploy (deployment)
PROFESSIONAL EXPERIENCE
Confidential
Hadoop Developer
Responsibilities:
- Created Hive entities to build a presentation layer over the ingested (loaded) data on the data lake.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Primarily involved in the data migration process using Azure, integrating with the GitHub repository and Jenkins.
- Built code for real-time data ingestion using shell scripts, Scala, and Java.
- Involved in various phases of development; analyzed and developed the system following the Agile Scrum methodology.
- Involved in developing the Hadoop system and improving multi-node Hadoop cluster performance.
- Worked on analyzing the Hadoop stack and different big data tools including Pig, Hive, the HBase database, and Sqoop.
- Developed a data pipeline using NiFi, Sqoop, and Hive to extract data from weblogs and store it in HDFS.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Worked with different data sources like Avro data files, XML files, JSON files, SQL server and Oracle to load data into Hive tables.
- Used Spark to create structured data from large amounts of unstructured data from various sources (see the sketch after this responsibilities list).
- Performed transformations, cleaning, and filtering on imported data using Hive, MapReduce, and Impala, and loaded the final data into HDFS.
- Developed shell scripts to find SQL-injection vulnerabilities in SQL queries.
- Experienced in designing and developing POCs in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle.
- Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs.
- Extracted a real-time feed using Spark Streaming, converted it to RDDs, processed the data into DataFrames, and loaded the data into Cassandra.
- Loaded load-ready files from mainframes into Hadoop, converting the files to ASCII format.
- Worked on Amazon AWS concepts like EMR and EC2 web services for fast and efficient processing of Big Data.
- Involved in the data acquisition, data pre-processing, and data exploration phases of a telecommunications project in Scala.
- Specified the cluster size, resource pool allocation, and Hadoop distribution by writing specification files in JSON format.
- Imported weblogs and unstructured data using Apache Flume and stored the data in Flume channels.
- Exported event weblogs to HDFS by creating an HDFS sink that deposits the weblogs directly in HDFS.
- Utilized XML and XSL transformations for dynamic web content and database connectivity.
- Built the automated build and deployment framework using Jenkins and uDeploy.
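A minimal sketch, assuming a Hive-enabled SparkSession and hypothetical paths, table, and column names, of structuring raw weblog lines with Spark and loading them into a partitioned Hive table as described above:

```scala
// Illustrative only: parse raw weblog lines into typed records and write them
// to a date-partitioned Hive table. Names and paths are hypothetical.
import org.apache.spark.sql.SparkSession

case class WebLog(host: String, url: String, status: Int, dt: String)

object WeblogIngest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("weblog-ingest")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Parse each raw line; lines that are too short or have a non-numeric status are dropped.
    val logs = spark.read.textFile("hdfs:///landing/weblogs")        // hypothetical input path
      .map(_.split("\\s+"))
      .filter(f => f.length >= 4 && f(2).forall(_.isDigit))
      .map(f => WebLog(f(0), f(1), f(2).toInt, f(3)))
      .toDF()

    // Partition by date so downstream Hive/Impala queries can prune partitions.
    logs.write
      .mode("append")
      .partitionBy("dt")
      .format("parquet")
      .saveAsTable("analytics.weblogs")                              // hypothetical table name
  }
}
```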
Confidential
Hadoop developer
Responsibilities:
- Worked on implementation and data integration while developing large-scale system software, using Hadoop ecosystem components such as HBase, Sqoop, Zookeeper, Oozie, and Hive.
- Developed Hive UDFs to extend core functionality and wrote HiveQL for sorting, joining, filtering, and grouping structured data.
- Developed ETL applications using Hive, Spark, Impala, and Sqoop, automated with Oozie.
- Automated the process for extraction of data from warehouses and weblogs by developing work-flows and coordinator jobs in Oozie.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS.
- Created Hive tables with dynamic partitions and buckets for sampling, and worked on them using HiveQL.
- Used Sqoop for importing the data into HBase and Hive, exporting result set from Hive to MySQL using Sqoop export tool for further processing.
- Wrote Hive queries to analyze the data and to generate the end reports used by business users.
- Worked on scalable distributed computing systems, software architecture, data structures and algorithms using Hadoop, Apache Spark and Apache Storm.
- Ingested streaming data into Hadoop using Spark, Storm Framework and Scala.
- Worked on analyzing Hadoop cluster and different big data analytical and processing tools including Pig, Hive, Spark, and Spark Streaming.
- Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
- Migrated various Hive UDFs and queries to Spark SQL for faster execution.
- Configured Spark Streaming to receive real-time data from Apache Kafka and store the stream data in HDFS using Scala (see the sketch after this responsibilities list).
- Hands-on experience with Spark and Spark Streaming: creating RDDs and applying transformations and actions.
- Implemented POCs with Spark SQL to interpret complex JSON records; delivery experience with major Hadoop ecosystem components such as Pig, Hive, Spark, Elasticsearch, and HBase, with monitoring through Cloudera Manager.
- Used the Collections framework to transfer objects between the different layers of the application.
- Experience transferring streaming data and data from different sources into HDFS and NoSQL databases using Apache Flume.
- Developed Spark jobs written in Scala to perform operations like data aggregation, data processing and data analysis.
- Involved in Spark and Spark Streaming work: creating RDDs and applying transformations and actions.
- Developed Spark code using Scala and Spark SQL for faster processing and testing, and performed complex HiveQL queries on Hive tables.
- Used Kafka and Flume to build a robust, fault-tolerant data ingestion pipeline between JMS and Spark Streaming applications for transporting streaming weblog data into HDFS.
- Used Spark for a series of dependent jobs and for iterative algorithms. Developed a data pipeline using Kafka and Spark Streaming to store data in HDFS.
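A minimal sketch, assuming the spark-streaming-kafka-0-10 artifact and hypothetical broker, topic, consumer group, and output path, of a Spark Streaming job in Scala that consumes a Kafka topic and persists the messages to HDFS as described above:

```scala
// Illustrative only: consume a Kafka topic with Spark Streaming and append each
// micro-batch of message payloads to HDFS. Connection details are hypothetical.
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object KafkaToHdfs {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kafka-to-hdfs")
    val ssc  = new StreamingContext(conf, Seconds(30))   // 30-second micro-batches

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",            // hypothetical broker
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "weblog-consumers",        // hypothetical consumer group
      "auto.offset.reset"  -> "latest"
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("weblogs"), kafkaParams))

    // Keep only the message payloads and write each batch under the given HDFS prefix.
    stream.map(_.value)
      .saveAsTextFiles("hdfs:///streaming/weblogs/batch")            // hypothetical path

    ssc.start()
    ssc.awaitTermination()
  }
}
```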
Confidential
Hadoop Developer
Responsibilities:
- Involved in analyzing the scope of the application and defining relationships within and among groups of data using star and snowflake schemas.
- Responsible for the installation and configuration of Hive, HBase, and Sqoop on the Hadoop cluster, and created Hive tables to store the processed results in tabular format. Configured Spark Streaming to receive real-time data from Apache Kafka and store the stream data in HDFS using Scala.
- Developed Sqoop scripts to handle the interaction between Hive and Impala.
- Processed data into HDFS by developing solutions, and analyzed the data using MapReduce and Hive to produce summary results from Hadoop for downstream systems.
- Wrote MapReduce code to process and parse data from various sources and store the parsed data in HBase and Hive using HBase-Hive integration.
- Involved in loading and transforming large sets of structured, semi-structured, and unstructured data and analyzed them by running Hive queries.
- Created managed and external tables in Hive and loaded data from HDFS.
- Developed Spark code using Scala and Spark SQL for faster processing and testing, and performed complex HiveQL queries on Hive tables.
- Scheduled several time-based Oozie workflows by developing Python scripts.
- Exported data to RDBMS servers using Sqoop and processed that data for ETL operations.
- Designed the ETL data pipeline flow to ingest data from RDBMS sources into Hadoop using shell scripts, the Sqoop package, and MySQL.
- End-to-end architecture and implementation of client-server systems using Scala, Java, JavaScript, and related technologies on Linux.
- Optimized Hive tables using techniques like partitioning and bucketing to provide better query performance.
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, Pig, and Sqoop jobs.
- Involved in Spark and Spark Streaming work: creating RDDs and applying transformations and actions.
- Created partitioned tables and loaded data using both the static and dynamic partition methods (see the sketch after this responsibilities list).
- Developed custom Apache Spark programs in Scala to analyze and transform unstructured data.
- Handled importing data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Oracle into HDFS using Sqoop.
- Used Kafka for publish-subscribe messaging as a distributed commit log; experienced with its speed, scalability, and durability.
- Implemented a POC to migrate MapReduce jobs to Spark RDD transformations using Scala.
- Scheduled MapReduce jobs in the production environment using the Oozie scheduler.
- Managing Amazon Web Services (AWS) infrastructure with automation and orchestration tools such as Chef.
- Proficient in AWS services such as VPC, EC2, S3, ELB, IAM, and CloudFormation.
- Experienced in creating multiple VPCs with public and private subnets as required, and distributing them across the various availability zones of the VPC.
- Created NAT gateways and instances to allow communication from the private instances to the internet through bastion hosts.
- Involved in writing a Java API for AWS Lambda to manage some of the AWS services.
- Used security groups, network ACLs, internet gateways, and route tables to ensure a secure zone for the organization in the AWS public cloud.
- Created and configured elastic load balancers and auto-scaling groups to distribute traffic and to achieve a cost-efficient, fault-tolerant, and highly available environment.
- Created S3 buckets in the AWS environment to store files, some of which are required to serve static content for a web application.
- Used AWS Elastic Beanstalk for deploying and scaling web applications and services developed with Java.
- Configured S3 buckets with various lifecycle policies to archive infrequently accessed data to appropriate storage classes based on requirements.
- Possess good knowledge of creating and launching EC2 instances using AMIs for Linux, Ubuntu, RHEL, and Windows, and wrote shell scripts to bootstrap instances.
- Used IAM to create roles, users, and groups, and implemented MFA to provide additional security for the AWS account and its resources.
- Involved in cluster maintenance, cluster monitoring and troubleshooting, and managing and reviewing data backups and log files.
- Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, and Apache Pig.
- Analyzed the Hadoop cluster and different Big Data analytic tools including Pig, Hive, HBase, and Sqoop.
- Improved performance by tuning Hive and MapReduce.
- Researched, evaluated, and utilized modern technologies, tools, and frameworks in the Hadoop ecosystem.
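A minimal sketch, assuming a Hive-enabled SparkSession and hypothetical database, table, column, and partition values, of loading a partitioned Hive table with both static and dynamic partition inserts through Spark SQL as described above:

```scala
// Illustrative only: create a date-partitioned Hive table and load it using a static
// partition insert and a dynamic partition insert. All names are hypothetical.
import org.apache.spark.sql.SparkSession

object PartitionedLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partitioned-load")
      .enableHiveSupport()
      .getOrCreate()

    spark.sql(
      """CREATE TABLE IF NOT EXISTS analytics.events (event_id STRING, payload STRING)
        |PARTITIONED BY (event_date STRING)
        |STORED AS PARQUET""".stripMargin)

    // Static partition: the partition value is fixed in the statement itself.
    spark.sql(
      """INSERT INTO TABLE analytics.events PARTITION (event_date = '2017-01-01')
        |SELECT event_id, payload FROM staging.events_raw
        |WHERE event_date = '2017-01-01'""".stripMargin)

    // Dynamic partition: Hive derives the partition value from the last selected column.
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql(
      """INSERT INTO TABLE analytics.events PARTITION (event_date)
        |SELECT event_id, payload, event_date FROM staging.events_raw""".stripMargin)

    spark.stop()
  }
}
```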
Confidential
Java Developer
Responsibilities:
- Responsible for interacting with the business team to discuss and understand the business process and gather requirements. Developed and documented a high-level conceptual data process design for the projects.
- Involved in development of business domain concepts into Use Cases, Sequence Diagrams, Class Diagrams, Component Diagrams and Implementation Diagrams.
- Analyzed the requirements and communicated them to both the development and testing teams.
- Responsible for analysis and design of the application based on the MVC architecture, using the open-source Struts framework.
- Involved in configuring Struts, Tiles and developing the configuration files.
- Developed Struts Action classes and Validation classes using Struts controller component and Struts validation framework.
- Developed and deployed UI layer logic using JSP, XML, JavaScript, and HTML/DHTML.
- Used Spring Framework and integrated it with Struts.
- Involved in Configuring web.xml and struts-config.xml according to the struts framework.
- Created unit test cases and executed them using the JUnit testing framework. Supported testing and resolved coding issues in the Production/QA environments. Consumed web services for transferring data between different applications.
- Involved in fixing defects and unit testing with test cases using JUnit. Developed user and technical documentation.
- Designed a lightweight model for the product using the Inversion of Control principle and implemented it successfully using the Spring IoC container.
- Used the transaction interceptor provided by Spring for declarative transaction management.
- Dependencies between classes were managed by Spring using dependency injection to promote loose coupling between them.
- Provided connections using JDBC to the database and developed SQL queries to manipulate the data.
- Developed an Ant script for auto-generation and deployment of the web service.
- Wrote stored procedures and used Java APIs to call these procedures.
- Developed various test cases such as unit tests, mock tests, and integration tests using JUnit.
- Experience writing stored procedures, functions, and packages.
- Used log4j to perform logging in the applications.
Confidential
Web Developer (Internship)
Responsibilities:
- Involved in complete user interface design and coded the website in XHTML, CSS, and JavaScript.
- Responsible for translating designs and concepts into highly usable and engaging web applications using a variety of technologies.
- Used AJAX with jQuery controls to list all scripts in a grid that can be edited in place, with the changes (such as margins) reflected in the database table as well.
- Converted business requirements into technical requirements in preparation of high level design document and functional specifications.
- Implemented common styling with CSS across the entire application to control color, layout, width, height, font size, and image size, and accomplished other graphics-related features.
- Designed and implemented new features and software components for the front end of a large web application.
- Used MS Visio, Dreamweaver and Photoshop tools for web application development.
- Developed front end UI pages to support data access and user authorization.
- Extensively worked on designing web pages using HTML, DHTML, CSS, JavaScript and AJAX.
- Implemented a user-friendly UI design with HTML, CSS, and JavaScript for client-side validation and form submission functions, and PHP for server-side scripting for web development.
- Created databases in phpMyAdmin for internal projects.
- Managed datasets using pandas DataFrames and MySQL; queried the MySQL database from Python using the Python MySQL connector (MySQLdb package) to retrieve information.
- Created cross-browser compatible and standards-compliant based page layouts.
- Designed/modified images and banners per client requirements using Adobe Creative Suite.
- Interacted with User Experience teams to understand customer needs to design online user experiences, ensuring ease of navigation and simplicity of design.
- Responsible for unit testing and supporting the UAT and PROD environments.