Hadoop Developer Resume

SUMMARY

  • 8+ years of total IT experience in the analysis, design, development, testing, and implementation of Data Warehouse/Data Mart, OLAP, Web, and Business Intelligence applications on Windows and Unix platforms.
  • Includes 4 years of hands-on experience in Big Data technologies and the Hadoop framework and its ecosystem, including MapReduce programming, Hive, Sqoop, NiFi, HBase, Impala, and Flume.
  • Experience working with Hortonworks and Cloudera Hadoop distributions.
  • Experience in analyzing data using HiveQL and custom MapReduce programs.
  • Experience in importing and exporting data using Sqoop from HDFS to relational database systems and vice versa.
  • Experienced in designing, building, and deploying a multitude of applications utilizing much of the AWS stack (including EC2 and S3), focusing on high availability, fault tolerance, and auto-scaling.
  • Experience with Amazon Web Services, AWS command line interface, and AWS data pipeline.
  • Experienced in BI tools like Tableau.
  • Excellent experience using TextMate on Ubuntu for writing Java, Scala, and shell scripts.
  • Expert in implementing advanced procedures such as text analytics and processing using in-memory computing frameworks such as Apache Spark, written in Scala.
  • Extended Hive core functionality by writing custom UDFs, UDTFs, and UDAFs (a minimal UDF sketch follows this summary).
  • Collected data from different sources such as web servers and social media using Flume, stored it in HDFS, and analyzed the data using other Hadoop technologies.
  • Good knowledge of configuring and maintaining YARN schedulers.
  • Experience in using ZooKeeper and Oozie operational services to coordinate clusters and schedule workflows.
  • Hands-on experience working with NoSQL databases such as HBase.
  • Experience in processing semi-structured data (XML, JSON, and CSV) in Hive/Impala.
  • Good experience in shell scripting and Python programming.
  • Good knowledge of Java, JDBC, Collections, JSP, JSON, REST, SOAP web services, and Eclipse.
  • Experienced with UNIX commands.
  • Developed and maintained web applications running on Apache Web server.
  • Experience working in an Agile software development environment.
  • Exceptional ability to learn new technologies and deliver results under tight deadlines.
  • Exceptional ability to quickly master new concepts, capable of working in a group as well as independently, with excellent communication skills.
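
A minimal sketch of the kind of custom Hive UDF referenced above, written in Scala; the package, class name, and masking rule are hypothetical and shown only for illustration. The compiled JAR would be added to Hive with ADD JAR and the class registered with CREATE TEMPORARY FUNCTION.

    package com.example.udf

    import org.apache.hadoop.hive.ql.exec.UDF
    import org.apache.hadoop.io.Text

    // Hypothetical UDF that masks the local part of an e-mail address,
    // keeping the first character and the domain visible.
    class MaskEmail extends UDF {
      def evaluate(input: Text): Text = {
        if (input == null) return null
        val s = input.toString
        val at = s.indexOf('@')
        if (at <= 1) return input
        new Text(s.substring(0, 1) + "*" * (at - 1) + s.substring(at))
      }
    }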

TECHNICAL SKILLS

Languages: C, C++, Java, Visual Basic, COBOL, UNIX Shell Scripting, Hive, Hadoop, Scala

Frameworks: MapReduce, Spark

Databases: Teradata (V13/V12), Oracle (9i/10g/11g), MS SQL Server, DB2, Data Lake

Web Scripting: JavaScript, ETL, CSS

Web Services: SOAP and REST

Cloud Services: AWS, EC2, S3

Ingestion Tools: Sqoop, NiFi, Flume, Kafka

Others: AutoSys, Control-M, PL/SQL, XML

BI/GUI Tools: Tableau, MS Project, Visio, MS Office (Word, Excel, PowerPoint, Access)

Operating Systems: IBM AIX, HP UNIX, Solaris, Windows and Sun OS

Other Tools: JIRA (tracking), GitHub (repository), Jenkins (build), uDeploy (deployment)

PROFESSIONAL EXPERIENCE

Confidential

Hadoop Developer

Responsibilities:

  • Created Hive entities to build a presentation layer over ingested (loaded) data on the data lake.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Primarily involved in the data migration process using Azure by integrating with the GitHub repository and Jenkins.
  • Built code for real-time data ingestion using shell scripts, Scala, and Java.
  • Involved in various phases of development; analyzed and developed the system following the Agile Scrum methodology.
  • Involved in development of the Hadoop system and in improving multi-node Hadoop cluster performance.
  • Analyzed the Hadoop stack and different big data tools, including Pig, Hive, the HBase database, and Sqoop.
  • Developed a data pipeline using NiFi, Sqoop, and Hive to extract data from weblogs and store it in HDFS.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Worked with different data sources such as Avro data files, XML files, JSON files, SQL Server, and Oracle to load data into Hive tables.
  • Used Spark to create structured data from large amounts of unstructured data from various sources.
  • Performed transformations, cleaning, and filtering on imported data using Hive, MapReduce, and Impala, and loaded the final data into HDFS (an illustrative Spark sketch follows this list).
  • Developed shell scripts to find SQL injection vulnerabilities in SQL queries.
  • Experienced in designing and developing POCs in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle.
  • Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs.
  • Extracted a real-time feed using Spark Streaming, converted it to RDDs, processed the data into DataFrames, and loaded it into Cassandra.
  • Loaded load-ready files from mainframes into Hadoop, converting the files to ASCII format.
  • Worked with AWS services such as EMR and EC2 for fast and efficient processing of Big Data.
  • Involved in data acquisition, data pre-processing, and data exploration for a telecommunications project in Scala.
  • Specified the cluster size, resource pool allocation, and Hadoop distribution by writing specification files in JSON format.
  • Imported weblogs and unstructured data using Apache Flume and stored the data in a Flume channel.
  • Exported event weblogs to HDFS by creating an HDFS sink that deposits the weblogs directly into HDFS.
  • Utilized XML and XSL transformations for dynamic web content and database connectivity.
  • Built an automated build and deployment framework using Jenkins, uDeploy, etc.
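
A minimal sketch, in Scala, of the kind of Spark cleaning-and-loading job described above; the HDFS path, table name, and column names are hypothetical, and the actual pipeline fed by NiFi/Flume may differ.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object WeblogCleaner {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("weblog-cleaning")
          .enableHiveSupport()          // allows writing results as Hive tables
          .getOrCreate()

        // Hypothetical HDFS location of the raw JSON weblogs landed by the ingestion layer.
        val raw = spark.read.json("hdfs:///data/raw/weblogs/")

        // Basic cleaning and filtering: drop malformed rows, keep successful requests only,
        // and derive a partition column from the event timestamp.
        val cleaned = raw
          .filter(col("url").isNotNull && col("status").isNotNull)
          .filter(col("status") === 200)
          .withColumn("event_date", to_date(col("timestamp")))

        // Load the final data into a partitioned Hive table (the presentation layer on the lake).
        cleaned.write
          .mode("overwrite")
          .partitionBy("event_date")
          .saveAsTable("weblogs_presentation")

        spark.stop()
      }
    }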

Confidential

Hadoop developer

Responsibilities:

  • Worked on implementation and data integration while developing large-scale system software, gaining experience with Hadoop ecosystem components such as HBase, Sqoop, ZooKeeper, Oozie, and Hive.
  • Developed Hive UDFs for extended functionality and wrote HiveQL for sorting, joining, filtering, and grouping structured data.
  • Developed ETL applications using Hive, Spark, Impala, and Sqoop, automated with Oozie.
  • Automated the extraction of data from warehouses and weblogs by developing workflows and coordinator jobs in Oozie.
  • Developed workflows in Oozie to automate the tasks of loading the data into HDFS.
  • Created Hive tables, dynamic partitions, and buckets for sampling, and worked on them using HiveQL.
  • Used Sqoop to import data into HBase and Hive, and exported result sets from Hive to MySQL using the Sqoop export tool for further processing.
  • Wrote Hive queries to analyze the data and to generate end reports used by business users.
  • Worked on scalable distributed computing systems, software architecture, data structures and algorithms using Hadoop, Apache Spark and Apache Storm.
  • Ingested streaming data into Hadoop using Spark, Storm Framework and Scala.
  • Worked on analyzing Hadoop cluster and different big data analytical and processing tools including Pig, Hive, Spark, and Spark Streaming.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Migrated various Hive UDFs and queries to Spark SQL for faster processing.
  • Configured Spark Streaming to receive real-time data from Apache Kafka and store the streamed data in HDFS using Scala.
  • Hands-on experience in Spark and Spark Streaming, creating RDDs and applying transformations and actions.
  • Implemented POCs with Spark SQL to interpret complex JSON records; gained delivery experience on major Hadoop ecosystem components such as Pig, Hive, Spark, Elasticsearch, and HBase, with monitoring through Cloudera Manager.
  • Used the Collections framework to transfer objects between the different layers of the application.
  • Experience in transferring streaming data and data from different data sources into HDFS and NoSQL databases using Apache Flume.
  • Developed Spark jobs written in Scala to perform operations like data aggregation, data processing and data analysis.
  • Developed Spark code using Scala and Spark SQL for faster processing and testing, and performed complex HiveQL queries on Hive tables.
  • Used Kafka and Flume to build a robust, fault-tolerant data ingestion pipeline between JMS and Spark Streaming applications for transporting streaming weblog data into HDFS.
  • Used Spark for series of dependent jobs and for iterative algorithms; developed a data pipeline using Kafka and Spark Streaming to store data in HDFS (an illustrative sketch follows this list).
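
A minimal sketch, in Scala, of the kind of Kafka-to-HDFS Spark Streaming pipeline described above; the broker addresses, topic, consumer group, batch interval, and output path are hypothetical.

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

    object KafkaToHdfs {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("kafka-to-hdfs")
        val ssc = new StreamingContext(conf, Seconds(30))   // 30-second micro-batches

        // Hypothetical broker list, topic, and consumer group.
        val kafkaParams = Map[String, Object](
          "bootstrap.servers"  -> "broker1:9092,broker2:9092",
          "key.deserializer"   -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id"           -> "weblog-ingest",
          "auto.offset.reset"  -> "latest"
        )

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc, PreferConsistent, Subscribe[String, String](Seq("weblogs"), kafkaParams))

        // Persist each micro-batch of message values to HDFS as text files.
        stream.map(_.value).saveAsTextFiles("hdfs:///data/streaming/weblogs/batch")

        ssc.start()
        ssc.awaitTermination()
      }
    }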

Confidential

Hadoop Developer

Responsibilities:

  • Involved in analyzing the scope of the application and defining relationships within and among groups of data using star and snowflake schemas.
  • Responsible for installation and configuration of Hive, HBase, and Sqoop on the Hadoop cluster, and created Hive tables to store the processed results in a tabular format. Configured Spark Streaming to receive real-time data from Apache Kafka and store the streamed data in HDFS using Scala.
  • Developed Sqoop scripts to support the interaction between Hive and Impala.
  • Processed data into HDFS by developing solutions, and analyzed the data using MapReduce and Hive to produce summary results from Hadoop for downstream systems.
  • Wrote MapReduce code to process and parse data from various sources and store the parsed data in HBase and Hive using HBase-Hive integration.
  • Involved in loading and transforming large sets of structured, semi-structured, and unstructured data, and analyzed them by running Hive queries.
  • Created managed and external tables in Hive and loaded data from HDFS.
  • Developed Spark code using Scala and Spark SQL for faster processing and testing, and performed complex HiveQL queries on Hive tables.
  • Scheduled several time-based Oozie workflows by developing Python scripts.
  • Exported data using Sqoop to RDBMS servers and processed that data for ETL operations.
  • Designed an ETL data pipeline flow to ingest data from an RDBMS source into Hadoop using shell scripts, the Sqoop package, and MySQL.
  • End-to-end architecture and implementation of client-server systems using Scala, Java, JavaScript, and related technologies on Linux.
  • Optimized Hive tables using techniques such as partitioning and bucketing to provide better query performance.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig, and Sqoop.
  • Involved in Spark and Spark Streaming, creating RDDs and applying transformations and actions.
  • Created partitioned tables and loaded data using both the static partition and dynamic partition methods.
  • Developed custom Apache Spark programs in Scala to analyze and transform unstructured data.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Oracle into HDFS using Sqoop.
  • Used Kafka for publish-subscribe messaging as a distributed commit log, gaining experience with its speed, scalability, and durability.
  • Implemented a POC to migrate MapReduce jobs to Spark RDD transformations using Scala (an illustrative sketch follows this list).
  • Scheduled MapReduce jobs in the production environment using the Oozie scheduler.
  • Managed Amazon Web Services (AWS) infrastructure with automation and orchestration tools such as Chef.
  • Proficient in AWS services such as VPC, EC2, S3, ELB, IAM, and CloudFormation.
  • Experienced in creating multiple VPCs and public and private subnets as per requirements, and distributed them as groups across the various availability zones of the VPC.
  • Created NAT gateways and instances to allow communication from the private instances to the internet through bastion hosts.
  • Involved in writing Java APIs for AWS Lambda to manage some of the AWS services.
  • Used security groups, network ACLs, internet gateways, and route tables to ensure a secure zone for the organization in the AWS public cloud.
  • Created and configured elastic load balancers and auto scaling groups to distribute traffic and to achieve a cost-efficient, fault-tolerant, and highly available environment.
  • Created S3 buckets in the AWS environment to store files, some of which are required to serve static content for a web application.
  • Used AWS Elastic Beanstalk for deploying and scaling web applications and services developed with Java.
  • Configured S3 buckets with various life cycle policies to archive the infrequently accessed data to storage classes based on requirement.
  • Possess good knowledge of creating and launching EC2 instances using AMIs for Linux, Ubuntu, RHEL, and Windows, and wrote shell scripts to bootstrap instances.
  • Used IAM to create roles, users, and groups, and implemented MFA to provide additional security for the AWS account and its resources.
  • Involved in cluster maintenance, monitoring, and troubleshooting, and managed and reviewed data backups and log files.
  • Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, and Apache Pig.
  • Analyzed the Hadoop cluster and different Big Data analytic tools, including Pig, Hive, HBase, and Sqoop.
  • Improved performance by tuning Hive and MapReduce.
  • Researched, evaluated, and utilized modern technologies, tools, and frameworks in the Hadoop ecosystem.
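
A minimal sketch, in Scala, of the kind of MapReduce-to-Spark-RDD migration POC described above, using the classic word count as a stand-in job; the input and output paths are hypothetical.

    import org.apache.spark.{SparkConf, SparkContext}

    object WordCountRdd {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("mapreduce-to-spark-poc"))

        // The classic map/reduce word count expressed as Spark RDD transformations.
        val counts = sc.textFile("hdfs:///data/input/docs")
          .flatMap(_.split("\\s+"))       // mapper: emit one token per word
          .map(word => (word, 1))         // mapper: emit (word, 1) pairs
          .reduceByKey(_ + _)             // reducer: sum the counts per word

        counts.saveAsTextFile("hdfs:///data/output/wordcount")
        sc.stop()
      }
    }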

Confidential

Java Developer

Responsibilities:

  • Responsible for interacting with the business team to understand the business process and gather requirements. Developed and documented a high-level conceptual data process design for the projects.
  • Involved in development of business domain concepts into Use Cases, Sequence Diagrams, Class Diagrams, Component Diagrams and Implementation Diagrams.
  • Analyzed the requirements and communicated them to both the development and testing teams.
  • Responsible for analysis and design of the application based on the MVC architecture, using the open-source Struts framework.
  • Involved in configuring Struts, Tiles and developing the configuration files.
  • Developed Struts Action classes and Validation classes using Struts controller component and Struts validation framework.
  • Developed and deployed UI-layer logic using JSP, XML, JavaScript, and HTML/DHTML.
  • Used Spring Framework and integrated it with Struts.
  • Involved in configuring web.xml and struts-config.xml according to the Struts framework.
  • Created unit test cases and executed them with the JUnit testing framework. Supported testing and coding issues in the Production/QA environments. Consumed web services for transferring data between different applications.
  • Involved in fixing defects and unit testing with test cases using JUnit. Developed user and technical documentation.
  • Designed a lightweight model for the product using the Inversion of Control principle and implemented it successfully using the Spring IoC container.
  • Used the transaction interceptor provided by Spring for declarative transaction management.
  • Dependencies between the classes were managed by Spring using dependency injection to promote loose coupling between them.
  • Provided connections using JDBC to the database and developed SQL queries to manipulate the data.
  • Developed Ant scripts for auto-generation and deployment of the web service.
  • Wrote stored procedures and used Java APIs to call these procedures (an illustrative JDBC sketch follows this list).
  • Developed various test cases such as unit tests, mock tests, and integration tests using JUnit.
  • Experience writing stored procedures, functions, and packages.
  • Used log4j to perform logging in the applications.
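
A minimal sketch of the kind of JDBC stored-procedure call described above. The original work was done in Java; the snippet below is written in Scala (kept consistent with the other sketches in this resume) against the same JDBC API, and the connection URL, credentials, and procedure name are hypothetical.

    import java.sql.{CallableStatement, Connection, DriverManager}

    object StoredProcCall {
      def main(args: Array[String]): Unit = {
        // Hypothetical Oracle connection details; real values came from the
        // application's datasource configuration (requires the Oracle JDBC driver).
        val conn: Connection = DriverManager.getConnection(
          "jdbc:oracle:thin:@//db-host:1521/ORCL", "app_user", "app_password")
        try {
          // Call a hypothetical stored procedure that updates an account balance.
          val stmt: CallableStatement = conn.prepareCall("{ call update_account_balance(?, ?) }")
          stmt.setLong(1, 1001L)                                      // account id
          stmt.setBigDecimal(2, new java.math.BigDecimal("250.00"))   // amount
          stmt.execute()
          stmt.close()
        } finally {
          conn.close()
        }
      }
    }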

Confidential 

Web Developer (Internship)

Responsibilities:

  • Involved in complete user interface design and coded the website in XHTML, CSS, and JavaScript.
  • Responsible for translating designs and concepts into highly usable and engaging web applications using a variety of technologies.
  • Used AJAX with jQuery controls to list all scripts in a grid that can be edited in place, with the edits (such as margins) reflected in the database table as well.
  • Converted business requirements into technical requirements in preparation of high level design document and functional specifications.
  • Implemented common styling with CSS across the entire application, controlling color, layout, width, height, font size, image size, and other graphics-related features.
  • Designed and implemented new features and software components for the front end of a large web application.
  • Used MS Visio, Dreamweaver and Photoshop tools for web application development.
  • Developed front end UI pages to support data access and user authorization.
  • Extensively worked on designing web pages using HTML, DHTML, CSS, JavaScript and AJAX.
  • Implemented user-friendly UI design with HTML, CSS, and JavaScript for client-side validation and form submission functions, and PHP for server-side scripting in web development.
  • Created databases in phpMyAdmin for internal projects.
  • Managed datasets using pandas DataFrames and MySQL, and ran MySQL database queries from Python using the Python MySQLdb connector package to retrieve information.
  • Created cross-browser compatible and standards-compliant based page layouts.
  • Designed/modified images/banners as per client requirements using Adobe Creative Suite.
  • Interacted with User Experience teams to understand customer needs to design online user experiences, ensuring ease of navigation and simplicity of design.
  • Responsible for unit testing and supporting the UAT and PROD environments.
