
Spark & Hadoop Developer Resume


Atlanta, GA

SUMMARY

  • 8 years of experience in IT, including Big Data technologies, the Hadoop ecosystem, Java/J2EE, and SQL-related technologies in the Retail, Manufacturing, Financial, and Communication sectors
  • 5 years of experience in Big Data analytics using various Hadoop ecosystem tools and the Spark framework; currently working extensively with Spark and Spark Streaming, using Scala as the main programming language
  • Experience installing, configuring, and maintaining Apache Hadoop clusters for application development, along with Hadoop tools such as Sqoop, Hive, Pig, Flume, HBase, Kafka, Hue, Storm, ZooKeeper, Oozie, and Cassandra
  • Worked with major distributions such as Cloudera (CDH 3 & 4), Hortonworks, and AWS; also worked on Unix and data warehousing in support of various distributions
  • Hands-on experience in developing and deploying enterprise applications using major components of the Hadoop ecosystem, including Hadoop 2.x, YARN, Hive, Pig, MapReduce, Spark, Kafka, Storm, Oozie, HBase, Flume, Sqoop, and ZooKeeper
  • Experience handling large datasets using partitions, Spark in-memory capabilities, and broadcast variables in Spark with Scala, performing effective and efficient joins, transformations, and other operations during the ingestion process itself
  • Experience developing data pipelines using Pig, Sqoop, and Flume to extract data from weblogs and store it in HDFS; developed Pig Latin scripts and used HiveQL for data analytics
  • Worked extensively with Spark Streaming and Apache Kafka to fetch live stream data
  • Experience converting Hive/SQL queries into Spark transformations using Java, and in ETL development using Kafka, Flume, and Sqoop
  • Good experience writing Spark applications using Scala and Java; used Scala sbt to develop Scala projects and executed them using spark-submit
  • Experience working with NoSQL databases including HBase, Cassandra, and MongoDB, and using Sqoop to import data into HDFS from RDBMSs and vice versa
  • Developed Spark scripts using the Scala shell as per requirements
  • Good experience writing Sqoop jobs for transferring bulk data between Apache Hadoop and structured data stores
  • Substantial experience writing MapReduce jobs in Java and working with Pig, Flume, ZooKeeper, Hive, and Storm
  • Created multiple MapReduce jobs using the Java API, Pig, and Hive for data extraction
  • Strong expertise in troubleshooting and performance tuning of Spark, MapReduce, and Hive applications
  • Good experience working with the Amazon EMR framework for processing data on EMR clusters and EC2 instances
  • Created AWS VPC networks for the installed instances and configured security groups and Elastic IPs accordingly
  • Developed AWS CloudFormation templates to create custom-sized VPCs, subnets, EC2 instances, ELBs, and security groups
  • Extensive experience developing applications that perform data-processing tasks using Teradata, Oracle, SQL Server, and MySQL databases
  • Worked with data warehousing, ETL, and BI tools such as Informatica, Pentaho, and Tableau
  • Experience understanding security requirements for Hadoop and integrating with Kerberos authentication and authorization infrastructure
  • Acquainted with Agile and Waterfall methodologies; handled several client-facing meetings with strong communication skills
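The broadcast joins noted above work by shipping a small lookup table to every executor so the large side can be joined without a shuffle. A minimal plain-Python sketch of the idea, with a dict standing in for Spark's broadcast variable (the sales/products names and data are invented for illustration):

```python
# Sketch of a broadcast (map-side) join: the small dimension table is
# shipped to every worker as an in-memory dict, so the large fact side
# is joined without a shuffle. Plain Python stands in for Spark here;
# all names and data are hypothetical.

def broadcast_join(fact_rows, dim_table):
    """Join each (key, value) fact row against a small in-memory dim table."""
    joined = []
    for key, value in fact_rows:
        if key in dim_table:            # inner-join semantics
            joined.append((key, value, dim_table[key]))
    return joined

sales = [("sku1", 3), ("sku2", 5), ("sku9", 1)]      # large side
products = {"sku1": "widget", "sku2": "gadget"}      # small, broadcast side

print(broadcast_join(sales, products))
# [('sku1', 3, 'widget'), ('sku2', 5, 'gadget')]
```

In Spark the same shape appears as a `map` over the large RDD against a `sparkContext.broadcast(smallMap)` value, instead of a shuffle-based `join`.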

TECHNICAL SKILLS

Big Data Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, ZooKeeper, Kafka, Cassandra, Apache Spark, Spark Streaming, HBase, Impala

Hadoop Distribution: Cloudera, Hortonworks, Apache, AWS

Languages: Java, SQL, PL/SQL, Python, Pig Latin, HiveQL, Scala, Regular Expressions

Web Technologies: HTML, CSS, JavaScript, XML, JSP, REST, SOAP

Operating Systems: Windows (XP/7/8/10), UNIX, Linux, Ubuntu, CentOS

Portals/Application Servers: WebLogic, WebSphere Application Server, WebSphere Portal Server, JBoss

Build Automation tools: SBT, Ant, Maven

Version Control: GIT

IDE & Build Tools, Design: Eclipse, Visual Studio, NetBeans, Rational Application Developer, JUnit

Databases: Oracle, SQL Server, MySQL, MS Access, NoSQL databases (HBase, Cassandra, MongoDB), Teradata

PROFESSIONAL EXPERIENCE

Confidential - Atlanta, GA

Spark & Hadoop Developer

Responsibilities:

  • Worked on analyzing Hadoop clusters and different big data analytical and processing tools including Pig, Hive, Sqoop, Spark with Scala and Java, and Spark Streaming
  • Wrote Spark Streaming applications to consume data from Kafka topics, wrote the processed streams to HBase, and streamed data using Spark with Kafka
  • Worked on a large-scale Hadoop YARN cluster for distributed data processing and analysis using Spark, Hive, and MongoDB
  • Involved in creating a data lake by extracting customer data from various data sources into HDFS, including data from Excel, databases, and server log data
  • Developed Apache Spark applications using Scala for data processing from various streaming sources
  • Implemented Spark solutions to generate reports and to fetch and load data in Cassandra
  • Experienced in writing real-time processing and core jobs using Spark Streaming with Kafka as a data pipeline system
  • Wrote HiveQL to analyze the number of unique visitors and their visit information, such as views and most visited pages
  • Used Scala to convert Hive/SQL queries into RDD transformations in Apache Spark
  • Configured Spark Streaming to receive real-time data from Apache Kafka and stored the stream data to HDFS using Scala
  • Monitored workload, job performance, and capacity planning using Cloudera Manager
  • Created the AWS VPC network for the installed instances and configured the security groups and Elastic IPs accordingly
  • Experienced in working with the Amazon EMR framework for processing data on EMR clusters and EC2 instances
  • Designed and implemented complete end-to-end Hadoop infrastructure including Pig, Hive, Sqoop, Oozie, Flume, and ZooKeeper
  • Used Pig with the Elephant Bird API to perform transformations, event joins, and pre-aggregations before loading JSON-format files onto HDFS
  • Involved in resolving performance issues in Pig and Hive with an understanding of MapReduce physical plan execution, using debugging commands to run code in an optimized way
  • Implemented a Kafka event log producer to publish logs to a Kafka topic, which are consumed by the ELK (Elasticsearch, Logstash, Kibana) stack to analyze the logs produced by the Hadoop cluster
  • Used Spark to perform analytics on data in Hive, and experienced with ETL work in Hive and MapReduce
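Analyses like the unique-visitor HiveQL above can be prototyped against any SQL engine before running at scale. This sketch uses Python's built-in sqlite3 as a stand-in for Hive (the HiveQL would be nearly identical); the page_views schema and data are hypothetical:

```python
import sqlite3

# Prototype of the unique-visitor / most-visited-page analysis, run
# against SQLite as a stand-in for Hive. Schema and data are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (visitor_id TEXT, page TEXT)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [("u1", "/home"), ("u1", "/cart"), ("u2", "/home"), ("u3", "/home")],
)

# How many distinct visitors appear in the logs?
unique_visitors = conn.execute(
    "SELECT COUNT(DISTINCT visitor_id) FROM page_views"
).fetchone()[0]

# Which page was visited most, and how many times?
top_page = conn.execute(
    "SELECT page, COUNT(*) AS views FROM page_views "
    "GROUP BY page ORDER BY views DESC LIMIT 1"
).fetchone()

print(unique_visitors, top_page)   # 3 ('/home', 3)
```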

Environment: Hadoop 2.6.0, HDFS, MapReduce, Spark Streaming, Spark-Core, Spark SQL, Scala, Pig 0.14, Hive 1.2.1, Sqoop 1.4.4, Flume 1.6.0, Kafka, JSON, HBase.

Confidential - Milwaukee, WI

Hadoop/Java Developer

Responsibilities:

  • Responsible for architecting Hadoop clusters with CDH3; involved in the installation of CDH3 and the upgrade from CDH3 to CDH4
  • Designed AWS CloudFormation templates to create the VPC architecture, EC2 instances, subnets, and NATs to meet high-availability and security requirements across multiple AZs
  • Managed the migration of on-prem servers to AWS by creating golden images for upload and deployment
  • Managed multiple AWS accounts with multiple VPCs for both production and non-production environments, where the primary objectives were automation, build-out, integration, and cost control
  • Involved in developing Pig scripts and Hive reports
  • Developed Sqoop scripts to handle the interaction between Pig and the MySQL database
  • Worked on performance enhancement; set up Pig, Hive, and HBase on multiple nodes and developed using Pig, Hive, HBase, and MapReduce
  • Worked with distributed n-tier architecture and client/server architecture
  • Supported MapReduce programs running on the cluster and developed multiple MapReduce jobs in Java for data cleaning and pre-processing
  • Developed MapReduce applications using Hadoop, MapReduce programming, and HBase
  • Evaluated the use of Oozie for workflow orchestration and experienced in cluster coordination using ZooKeeper
  • Developed ETL jobs following organization- and project-defined standards and processes
  • Experienced in enabling Kerberos authentication in the ETL process
  • Implemented data access using the Hibernate persistence framework
  • Developed the configuration files and the classes specific to Spring and Hibernate
  • Utilized the Spring framework for bean wiring and dependency injection principles
  • Expertise in server-side and J2EE technologies including Java, J2SE, JSP, Servlets, XML, Hibernate, Struts, Struts2, JDBC, and JavaScript development
  • Designed the GUI using the Model-View-Controller architecture (Struts framework)
  • Integrated Spring DAO for data access using Hibernate and involved in the development of Spring framework controllers
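The Java MapReduce data-cleaning jobs described above boil down to a map function that validates each record and drops malformed rows. A plain-Python sketch of that mapper logic (the tab-separated log layout is hypothetical; the production code was Java):

```python
import re

# Sketch of the data-cleaning map step: keep only well-formed records,
# drop everything else. The tab-separated (timestamp, user, status)
# layout is invented for illustration.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})\t(\w+)\t(\d{3})$")

def clean_map(lines):
    """Emit (user, status) pairs, silently dropping malformed records."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            _, user, status = m.groups()
            yield (user, int(status))

raw = ["2016-01-05\talice\t200", "garbage row", "2016-01-06\tbob\t404"]
print(list(clean_map(raw)))        # [('alice', 200), ('bob', 404)]
```

In a real job this function body would sit inside a Mapper, with the reduce phase aggregating the cleaned pairs.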

Environment: Hadoop 2.X, HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, HBase, Java, J2EE, Eclipse, HQL.

Confidential - Woodcliff Lake, NJ

HadoopDeveloper

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleansing and pre-processing
  • Evaluated business requirements and prepared detailed specifications, following project guidelines, required to develop the programs
  • Moved log data periodically into HDFS using Flume, building multi-hop flows, fan-out flows, and failover mechanisms
  • Developed MapReduce jobs to automate the transfer of data from HBase, and to read data files and scrub the data
  • Transferred data between MySQL and HDFS using Sqoop with connectors
  • Created and populated Hive tables and wrote Hive queries for data analysis to meet business requirements
  • Installed and configured Pig and wrote Pig Latin scripts
  • Migrated data from the MySQL database to HBase; ran MapReduce jobs to access HBase data from applications using the Java client APIs
  • Automated jobs using Oozie
  • Installed and configured Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster
  • Actively participated in the software development lifecycle, including design and code reviews, test development, and test automation
  • Involved in a solution-driven Agile development methodology and actively participated in daily scrum meetings
  • Monitored the Hadoop cluster using tools like Cloudera Manager
  • Wrote automation scripts to monitor HDFS and HBase through cron jobs
  • Created a complete processing engine based on Cloudera's distribution, tuned for performance
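The fan-out Flume flows mentioned above are declared in an agent properties file. A hypothetical sketch in which one source replicates every event to two channels, feeding an HDFS sink and a logger sink (agent, file, and path names are invented):

```properties
# Hypothetical Flume agent: a replicating selector fans one source
# out to two channels, one landing in HDFS, one going to a logger.
a1.sources = r1
a1.channels = c1 c2
a1.sinks = k1 k2

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/app/app.log
a1.sources.r1.selector.type = replicating
a1.sources.r1.channels = c1 c2

a1.channels.c1.type = memory
a1.channels.c2.type = memory

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/events/%Y-%m-%d
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.channel = c1

a1.sinks.k2.type = logger
a1.sinks.k2.channel = c2
```

A multi-hop flow chains agents instead, with an Avro sink on one agent pointing at an Avro source on the next.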

Environment: Hadoop, MapReduce, HDFS, Sqoop, HBase, Oozie, SQL, Pig, Flume, Hive, Java.

Confidential - Denver, CO

Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and pre-processing
  • Installed and configured Apache Hadoop to test the maintenance of log files in the Hadoop cluster
  • Imported and exported data into HDFS and Hive using Sqoop
  • Experienced in defining job flows and in managing and reviewing Hadoop log files
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data
  • Responsible for managing data coming from different sources and for implementing MongoDB to store and analyze unstructured data
  • Supported MapReduce programs running on the cluster and involved in loading data from the UNIX file system to HDFS
  • Installed and configured Hive and wrote Hive UDFs
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs
  • Involved in Hadoop cluster tasks such as adding and removing nodes without any effect on running jobs and data
  • Created HBase tables to store variable data formats of PII data coming from different portfolios
  • Implemented best-income logic using Pig scripts
  • Provided cluster coordination services through ZooKeeper
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager
  • Used the Hibernate ORM framework with the Spring framework for data persistence and transaction management, and worked on templates and screens in HTML and JavaScript
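Row-level logic like the Hive UDFs above can also be expressed as a Hive TRANSFORM streaming script, where Hive pipes tab-separated rows through stdin and reads the transformed rows back from stdout. A small Python sketch with a hypothetical (visitor, url) column layout:

```python
import sys

# Sketch of a Hive TRANSFORM streaming script: Hive sends one
# tab-separated row per line on stdin and expects transformed rows on
# stdout. The (visitor, url) layout is invented for illustration.

def transform(line):
    """Lowercase the visitor id and strip the query string from the URL."""
    visitor, url = line.rstrip("\n").split("\t")
    return "\t".join([visitor.lower(), url.split("?")[0]])

if __name__ == "__main__":
    for row in sys.stdin:
        print(transform(row))
```

In HiveQL this would be invoked as `SELECT TRANSFORM (visitor, url) USING 'python transform.py' AS (visitor, page) FROM ...`, with the script shipped via `ADD FILE`.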

Environment: Hadoop, HDFS, MapReduce, Pig, Sqoop, Unix, HBase, Java, JavaScript, HTML

Confidential

SQL/Java Developer

Responsibilities:

  • Worked with several clients on day-to-day requests and responsibilities
  • Designed and developed a Struts-like MVC 2 web framework using the front-controller design pattern, which has been used successfully in a number of production systems
  • Wrote SQL queries to perform back-end database operations
  • Wrote various SQL and PL/SQL queries and stored procedures for data retrieval
  • Prepared utilities for the unit testing of the application using JSP and Servlets
  • Developed database applications using SQL and PL/SQL
  • Applied design patterns and object-oriented design concepts to improve the existing Java/J2EE-based code base
  • Identified and fixed transactional issues due to incorrect exception handling and concurrency issues due to unsynchronized blocks of code
  • Resolved product complications at Confidential customer sites and relayed the findings to the development and deployment teams to adopt a long-term product development strategy with minimal roadblocks
  • Persuaded business users and analysts to adopt alternative solutions that are more robust and simpler to implement from a technical perspective while satisfying the functional requirements from the business perspective
  • Played a crucial role in developing the persistence layer
  • Analyzed, developed, tuned, tested, debugged, and documented processes using SQL, PL/SQL, Informatica, UNIX, and Control-M
  • Documented technical specs, class diagrams, and sequence diagrams; developed technical design documents based on changes; analyzed Portal capabilities and scalability and identified areas where the Portal could be used to enhance usability and improve productivity
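The back-end retrieval queries described above rely on bind parameters rather than string concatenation. A small sketch using Python's built-in sqlite3 in place of Oracle (table and column names are invented; the PL/SQL procedures themselves are not reproduced):

```python
import sqlite3

# Sketch of a parameterized back-end retrieval query, with SQLite
# standing in for Oracle. Schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "acme", 120.0), (2, "acme", 80.0), (3, "zen", 10.0)])

def totals_for(customer):
    """Bind the customer name as a parameter instead of concatenating SQL."""
    cur = conn.execute(
        "SELECT SUM(total) FROM orders WHERE customer = ?", (customer,))
    return cur.fetchone()[0]

print(totals_for("acme"))   # 200.0
```

Binding parameters this way also protects the back end from SQL injection, which string-built queries do not.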

Environment: Java, J2EE, JSP, Eclipse, SQL, Windows, PL/SQL, Oracle, Informatica, Unix, Control-M.
