Hadoop Developer Resume Charlotte, NC - Hire IT People

SUMMARY

6+ years of total IT experience in the analysis, design, testing, development, and implementation of Data Warehouse/Data Mart design, OLAP, Web, and Business Intelligence applications on platforms like Windows and Unix.
Includes 4 years of hands-on experience with Big Data technologies, including the Hadoop framework and its ecosystem: MapReduce programming, Hive, Sqoop, NiFi, HBase, Impala, and Flume.
Experience working with Hortonworks and Cloudera Hadoop distributions.
Experience in analyzing data using HiveQL and custom MapReduce programs.
Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
Extended Hive core functionality by writing custom UDFs, UDTFs, and UDAFs.
Collected data from different sources, such as web servers and social media, using Flume, stored it in HDFS, and analyzed it using other Hadoop technologies.
Good knowledge of configuring and maintaining YARN Schedulers.
Experience in using ZooKeeper and Oozie operational services to coordinate clusters and schedule workflows.
Hands-on experience working with NoSQL databases like HBase.
Experience processing semi-structured data (XML, JSON, and CSV) in Hive/Impala.
Good experience in shell scripting and Python programming.
Good knowledge of Java, JDBC, Collections, JSP, JSON, REST, SOAP Web services, and Eclipse.
Experience with UNIX commands.
Developed and maintained web applications running on Apache Web server.
Experience working in an Agile software development environment.
Exceptional ability to learn new technologies and deliver results on short deadlines.
Exceptional ability to quickly master new concepts; capable of working in a group as well as independently, with excellent communication skills.

TECHNICAL SKILLS

Database: Teradata (V13/V12), Oracle (9i/10g/11g), MS SQL Server, DB2, Data Lake.

Web Scripting: JavaScript, ETL, CSS

Web services: SOAP and REST

Frameworks: MapReduce, Spark

Cloud Services: AWS, EC2, S3

Ingestion Tools: Sqoop, NiFi, Flume, Kafka

Languages: C, C++, Java, Visual Basic, COBOL, UNIX Shell Scripting, Hive, Hadoop, Scala

Others: AutoSys, Control-M, PL/SQL, XML

BI/GUI Tools: Tableau, MS Project, Visio, MS Office (Word, Excel, PowerPoint, Access), XML

Operating Systems: Confidential AIX, HP UNIX, Solaris, Windows and Sun OS

Others: JIRA (tracking), GitHub (repository), Jenkins (build), uDeploy (deployment)

PROFESSIONAL EXPERIENCE

Confidential, Charlotte, NC

Hadoop Developer

Responsibilities:

Created Hive entities to build a presentation layer over data ingested into the data lake.
Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
Primarily involved in the data migration process using Azure, integrating with the GitHub repository and Jenkins.
Built real-time data ingestion code using shell scripts, Scala, and Java.
Involved in various phases of development; analyzed and developed the system following Agile Scrum methodology.
Involved in development of the Hadoop system and in improving multi-node Hadoop cluster performance.
Worked on analyzing the Hadoop stack and different big data tools, including Pig, Hive, HBase, and Sqoop.
Developed a data pipeline using NiFi, Sqoop, and Hive to extract data from weblogs and store it in HDFS.
Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
Worked with different data sources like Avro data files, XML files, JSON files, SQL Server, and Oracle to load data into Hive tables.
Used Spark to create structured data from large amounts of unstructured data from various sources.
Performed transformations, cleaning, and filtering on imported data using Hive, MapReduce, and Impala, and loaded the final data into HDFS.
Developed shell scripts to find vulnerabilities in SQL queries through SQL injection testing.
Designed and developed POCs in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle.
Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs.
Extracted real-time feeds using Spark Streaming, converted them to RDDs, processed the data into DataFrames, and loaded the data into Cassandra.
Involved in data acquisition, data pre-processing, and data exploration for a telecommunication project in Scala.
Specified the cluster size, resource pool allocation, and Hadoop distribution by writing the specifications in JSON format.
Imported weblogs and unstructured data using Apache Flume and stored the data in a Flume channel.
Exported event weblogs to HDFS by creating an HDFS sink that deposits the weblogs directly in HDFS.
Utilized XML and XSL Transformation for dynamic web content and database connectivity.
Built an automated build and deployment framework using Jenkins, uDeploy, etc.

Technology: Hadoop HDFS, Ambari View, Hive, Oozie, ZooKeeper, HBase, Spark, Storm, Spark SQL, Sqoop, NiFi, NoSQL, Scala, Kafka, Cassandra, AutoSys

Confidential

Hadoop Developer

Responsibilities:

Worked on implementation and data integration for large-scale system software using Hadoop ecosystem components like HBase, Sqoop, ZooKeeper, Oozie, and Hive.
Developed Hive UDFs for extended use and wrote HiveQL for sorting, joining, filtering, and grouping structured data.
Developed ETL applications using Hive, Spark, Impala, and Sqoop, automated with Oozie.
Automated the extraction of data from warehouses and weblogs by developing workflows and coordinator jobs in Oozie.
Developed Oozie workflows to automate loading data into HDFS.
Created Hive tables, dynamic partitions, and buckets for sampling, and queried them using HiveQL.
Used Sqoop to import data into HBase and Hive, and used the Sqoop export tool to export result sets from Hive to MySQL for further processing.
Wrote Hive queries to analyze the data and generate end reports for business users.
Worked on scalable distributed computing systems, software architecture, data structures and algorithms using Hadoop, Apache Spark and Apache Storm.
Ingested streaming data into Hadoop using Spark, the Storm framework, and Scala.
Implemented POCs with Spark SQL to interpret complex JSON records.
Delivery experience with major Hadoop ecosystem components such as Pig, Hive, Spark, Elasticsearch, and HBase, and monitoring with Cloudera Manager.
Used the Collections framework to transfer objects between the different layers of the application.
Transferred streaming data and data from different sources into HDFS and NoSQL databases using Apache Flume.
Developed Spark jobs written in Scala to perform operations like data aggregation, data processing and data analysis.
Worked with Spark and Spark Streaming, creating RDDs and applying transformations and actions.
Developed Spark code using Scala and Spark SQL for faster processing and testing, and ran complex HiveQL queries on Hive tables.
Used Kafka and Flume to build a robust, fault-tolerant data ingestion pipeline between JMS and Spark Streaming applications for transporting streaming weblog data into HDFS.
Used Spark for a series of dependent jobs and for iterative algorithms. Developed a data pipeline using Kafka and Spark Streaming to store data in HDFS.

Technology: Hadoop HDFS, Flume, CDH, Hive, Oozie, ZooKeeper, HBase, Spark, Storm, Spark SQL, NoSQL, Scala, Kafka, Cassandra

Confidential

Hadoop Developer

Responsibilities:

Involved in analyzing the scope of the application and defining relationships within and between groups of data using star and snowflake schemas.
Responsible for installation and configuration of Hive, HBase, and Sqoop on the Hadoop cluster; created Hive tables to store the processed results in a tabular format.
Configured Spark Streaming to receive real-time data from Apache Kafka and store the stream data in HDFS using Scala.
Developed Sqoop scripts to enable interaction between Hive and Impala.
Developed solutions to process data into HDFS and analyzed the data using MapReduce and Hive to produce summary results from Hadoop for downstream systems.
Wrote MapReduce code to process and parse data from various sources and store the parsed data in HBase and Hive using HBase-Hive integration.
Involved in loading and transforming large sets of structured, semi-structured, and unstructured data, and analyzed them by running Hive queries.
Created managed tables and external tables in Hive and loaded data from HDFS.
Developed Spark code using Scala and Spark SQL for faster processing and testing, and ran complex HiveQL queries on Hive tables.
Scheduled several time-based Oozie workflows by developing Python scripts.
Exported data to RDBMS servers using Sqoop and processed that data for ETL operations.
Designed an ETL data pipeline flow to ingest data from RDBMS sources into Hadoop using shell scripts, Sqoop, packages, and MySQL.
Delivered end-to-end architecture and implementation of client-server systems using Scala, Java, JavaScript, and related technologies on Linux.
Optimized the Hive tables using techniques like partitioning and bucketing to provide better performance.
Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, Pig, and Sqoop.
Worked with Spark and Spark Streaming, creating RDDs and applying transformations and actions.
Created partitioned tables and loaded data using both static and dynamic partition methods.
Developed custom Apache Spark programs in Scala to analyze and transform unstructured data.
Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Oracle into HDFS using Sqoop.
Used Kafka for publish-subscribe messaging as a distributed commit log, gaining experience with its speed, scalability, and durability.
Implemented a POC to migrate MapReduce jobs into Spark RDD transformations using Scala.
Scheduled MapReduce jobs in the production environment using the Oozie scheduler.
Managed Amazon Web Services (AWS) infrastructure with automation and orchestration tools such as Chef.
Proficient in AWS services like VPC, EC2, S3, ELB, IAM, and CloudFormation.
Created multiple VPCs with public and private subnets as required and distributed them as groups across the availability zones of the VPC.
Created NAT gateways and instances to allow communication from the private instances to the internet through bastion hosts.
Involved in writing a Java API for AWS Lambda to manage some of the AWS services.
Used security groups, network ACLs, internet gateways, and route tables to ensure a secure zone for the organization in the AWS public cloud.
Created and configured elastic load balancers and auto scaling groups to distribute traffic and to provide a cost-efficient, fault-tolerant, and highly available environment.
Created S3 buckets in the AWS environment to store files, some of which are required to serve static content for a web application.
Used AWS Elastic Beanstalk for deploying and scaling web applications and services developed with Java.
Configured S3 buckets with various lifecycle policies to archive infrequently accessed data to storage classes based on requirements.
Created and launched EC2 instances using AMIs of Linux, Ubuntu, RHEL, and Windows, and wrote shell scripts to bootstrap instances.
Used IAM to create roles, users, and groups, and implemented MFA to provide additional security for the AWS account and its resources.
Involved in cluster maintenance, monitoring, and troubleshooting; managed and reviewed data backups and log files.
Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, and Apache Pig.
Analyzed the Hadoop cluster and different Big Data analytic tools, including Pig, Hive, HBase, and Sqoop.
Improved performance by tuning Hive and MapReduce.
Researched, evaluated, and utilized modern technologies, tools, and frameworks around the Hadoop ecosystem.

Technology: HDFS, MapReduce, Hive, Sqoop, Flume, Oozie Scheduler, AWS, EC2, S3, Java, Shell Scripts, Teradata, Oracle, HBase, Cassandra, Cloudera, JavaScript, JSP, Kafka, Spark, Scala, ETL, Python.
