- 7 years of professional experience in information technology, including 4 years of experience developing Big Data and Hadoop ecosystem components.
- Over 3 years of extensive experience in Java, J2EE technologies, database development, and data analytics.
- Hands-on experience developing Big Data projects using open-source tools/technologies such as Hadoop, Hive, Sqoop, Oozie, Pig, Flume, and MapReduce.
- Experience writing Pig Latin and HiveQL scripts and extending their functionality with User Defined Functions (UDFs).
- Hands-on experience with performance optimization techniques for data processing in Hive, Impala, Spark, Pig, and MapReduce.
- Wrote complex MapReduce code implementing custom Writable and WritableComparable types for analysis of large datasets.
- Good exposure to a variety of file formats (Parquet, Avro, JSON) and compression codecs (Snappy, Bzip2, and Gzip).
- Hands-on experience with Spark Core, Spark SQL, and the DataFrame/Dataset/RDD APIs.
- Developed applications using Spark for data processing.
- Replaced existing MapReduce jobs and Hive scripts with Spark DataFrame transformations and actions.
- Capable of using AWS services such as EMR, S3, and CloudWatch to run and monitor Hadoop and Spark jobs on AWS.
- Good knowledge on Spark architecture and real-time streaming using Spark.
- Fluent with core Java concepts such as I/O, multithreading, exceptions, regular expressions, collections, data structures, and serialization.
- Experience in Object-Oriented Analysis and Design (OOAD) and software development using UML methodology; good knowledge of J2EE and core Java design patterns.
- Experience in Java, JSP, Servlets, WebLogic, WebSphere, JavaScript, jQuery, XML, and HTML.
- Experience in writing stored procedures and complex SQL queries using relational databases like Oracle, SQL Server, and MySQL.
- Knowledge of ETL methods for data extraction, transformation, and loading in corporate-wide ETL solutions, and of data warehouse tools for reporting and data analysis.
- Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
- Well-versed in Agile and Waterfall methodologies.
- Strong team player with good communication, analytical, presentation and interpersonal skills.
Big Data Ecosystem: Hadoop, MapReduce, HDFS, HBase, Cassandra, MongoDB, ZooKeeper, Hive, Pig, Sqoop, Flume, and Oozie.
Operating Systems: Windows, UNIX, LINUX.
Programming Languages: C, Java, PL/SQL, Scala
Hadoop Distribution: Cloudera, Hortonworks.
Java/J2EE Technologies: Java, J2EE, JDBC.
Databases: Oracle, MS Access, MySQL, SQL, NoSQL.
IDE/Build Tools: Eclipse, IntelliJ IDEA, SBT.
Methodologies: J2EE design patterns, Scrum, Agile, Waterfall.
Version Control: SVN, Git, GitHub, Bitbucket.
Confidential, Atlanta, GA
- Developed Sqoop jobs to import data in Avro format from an Oracle database and created Hive tables on top of it.
- Created partitioned and bucketed Hive tables in Parquet format with Snappy compression.
- Ran Hive scripts through Hive, Impala, and Hive on Spark, and some through Spark SQL using Scala.
- Involved in performance tuning of Hive from design, storage, and query perspectives.
- Collected JSON data from an HTTP source and developed Spark APIs that perform inserts and updates in Hive tables.
- Developed Spark core and Spark SQL scripts using Scala for faster data processing.
- Worked with the Spark SQL context to create DataFrames for filtering input data for model execution.
- Experienced in performance tuning of Spark applications: setting the right batch interval, the correct level of parallelism, and memory configuration.
- Developed a Kafka consumer to consume data from Kafka topics.
- Involved in designing and developing tables in HBase and storing aggregated data from Hive tables.
- Integrated Hive with Tableau to generate reports for the end user.
- Developed shell scripts to run Hive scripts through Hive and Impala.
- Orchestrated a number of Sqoop and Hive scripts using Oozie workflows, scheduled via the Oozie coordinator.
- Used Jira for bug tracking and Bitbucket to check in and check out code changes.
Environment: HDFS, Yarn, Hive, Sqoop, Flume, Oozie, HBase, Kafka, Impala, Spark SQL, Spark Streaming, Eclipse, Oracle, Teradata, PL/SQL, Linux shell scripting, Hortonworks.
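As a minimal sketch of the Sqoop-to-partitioned-Hive-table pattern described above (all paths, database, table, and column names here are hypothetical illustrations, not taken from the engagement):

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: read Avro files landed by a Sqoop import and write a
// Snappy-compressed, partitioned Parquet table registered in the Hive metastore.
// Requires a Spark build with Hive support and the spark-avro package.
object OrdersToHive {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("orders-to-hive")
      .enableHiveSupport()
      .getOrCreate()

    // Path and schema are illustrative
    val orders = spark.read.format("avro").load("/data/raw/orders")

    orders.write
      .format("parquet")
      .option("compression", "snappy")
      .partitionBy("order_date")        // hypothetical partition column
      .saveAsTable("analytics.orders")  // hypothetical database.table

    spark.stop()
  }
}
```

This sketch assumes the cluster-side configuration (metastore, warehouse directory) is already in place; in practice the same write can also be expressed as HiveQL DDL plus an `INSERT ... SELECT`.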
Confidential, Minneapolis, MN
Hadoop Developer/Spark Developer
- Involved in Importing and exporting the data into HDFS and Hive using Sqoop and Kafka.
- Converted complex Teradata and Netezza SQL into HiveQL.
- Developed ETL using Hive, Oozie, shell scripts, and Sqoop; coded the components in Scala, making use of Scala pattern matching.
- Used Flume to collect, aggregate, and store weblog data in HDFS.
- Designed NoSQL schemas in HBase.
- Developed MapReduce ETL jobs in Java and Pig.
- Loaded log data into HDFS using Flume.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Created Hive tables, loaded them with data, and wrote Hive UDFs.
- Implemented partitioning, dynamic partitions, and bucketing in Hive.
Environment: MapReduce, HDFS, Hive, Pig, Sqoop, Scala, Oozie, SQL, Flume, Python, shell scripting, DataStage, Hortonworks.
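The Scala pattern-matching style mentioned above might look like the following sketch, e.g. for classifying parsed weblog records before they are loaded into HDFS (the record types and field layout are invented for illustration):

```scala
// Hypothetical sketch of Scala pattern matching in an ETL component:
// classify tab-separated weblog lines into typed events.
sealed trait LogEvent
case class PageView(path: String) extends LogEvent
case class ApiCall(endpoint: String, status: Int) extends LogEvent
case object Unknown extends LogEvent

object LogParser {
  def parse(line: String): LogEvent = line.split('\t') match {
    case Array("view", path)            => PageView(path)
    case Array("api", endpoint, status) => ApiCall(endpoint, status.toInt)
    case _                              => Unknown
  }
}
```

For example, `LogParser.parse("view\t/home")` yields `PageView("/home")`, while any line that does not fit a known shape falls through to `Unknown` instead of throwing.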
Confidential, Tampa, FL
- Loaded data using Sqoop from RDBMS servers such as Teradata and Netezza into the Hadoop HDFS cluster.
- Performed daily Sqoop incremental imports scheduled through Oozie.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Optimized Hive queries using map-side joins, dynamic partitioning, and bucketing.
- Responsible for executing Hive queries using Hive Command Line under Cloudera Manager.
- Implemented Hive generic UDFs to implement business logic around custom data types.
- Used Pig as an ETL tool for transformations, event joins, and pre-aggregations before storing the data in HDFS.
- Coordinated the Pig and Hive scripts using Oozie workflow.
- Loaded the data into HBase from HDFS.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data that includes Avro, sequence files, and XML files.
Environment: Hadoop, Cloudera, Big Data, HDFS, MapReduce, Sqoop, Oozie, Pig, Hive, Linux, Java, Eclipse.
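The HDFS-to-HBase load mentioned above could be sketched with the HBase client API roughly as follows (table, column family, and row-key choices here are hypothetical, not from the project):

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes

// Hypothetical sketch: write one aggregated row (say, a daily metric computed
// in Hive and staged on HDFS) into an HBase table keyed by date.
object HBaseLoader {
  def main(args: Array[String]): Unit = {
    val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val table = conn.getTable(TableName.valueOf("daily_metrics")) // invented name
    try {
      val put = new Put(Bytes.toBytes("2015-06-01")) // row key: the day
      // column family "m", qualifier "count" are illustrative
      put.addColumn(Bytes.toBytes("m"), Bytes.toBytes("count"), Bytes.toBytes(12345L))
      table.put(put)
    } finally {
      table.close()
      conn.close()
    }
  }
}
```

In a real load the rows would come from reading the staged HDFS files (or from a bulk-load via HFiles) rather than a single hard-coded `Put`.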
- Involved in various phases of Software Development Life Cycle (SDLC) such as requirements gathering, analysis, design, and development.
- Analyzed large datasets to provide strategic direction to the company.
- Collected logs from the physical machines and integrated them into HDFS using Flume.
- Involved in analyzing system and business requirements.
- Developed SQL statements to improve back-end communications.
- Loaded unstructured data into the Hadoop Distributed File System (HDFS).
- Created reports and dashboards using structured and unstructured data.
- Involved in importing data from MySQL to HDFS using Sqoop.
- Involved in writing Hive queries to load and process data in HDFS.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Involved in working with Impala for the data retrieval process.
- Performed sentiment analysis on reviews of the products on the client's website.
- Developed custom MapReduce programs to extract the required data from the logs.
- Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
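A custom log-extraction mapper of the kind described above might be sketched as follows (the "ERROR" filter and date-prefixed line format are assumptions for illustration, not the project's actual log schema):

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.Mapper

// Hypothetical sketch of a custom MapReduce mapper that extracts error lines
// from raw logs, emitting (date, full line) pairs for downstream aggregation.
class ErrorLineMapper extends Mapper[LongWritable, Text, Text, Text] {
  override def map(key: LongWritable, value: Text,
                   context: Mapper[LongWritable, Text, Text, Text]#Context): Unit = {
    val line = value.toString
    if (line.contains("ERROR")) {
      // Assumes the first whitespace-delimited field is the date
      context.write(new Text(line.takeWhile(_ != ' ')), value)
    }
  }
}
```

Paired with a reducer that counts or collects values per key, this is the shape of job whose performance would then be tuned by reviewing the Hadoop log files.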
- Involved in Full Life Cycle Development in Distributed Environment using Java and J2EE Framework.
- Designed the application by implementing Struts Framework based on MVC Architecture.
- Implemented the Web Service client for login authentication, credit reports, and applicant information using Apache Axis2 Web Services.
- Developed a framework for data processing using design patterns, Java, and XML.
- Used the lightweight container of the Spring Framework to provide architectural flexibility through Inversion of Control (IoC).
- Used Hibernate ORM framework with Spring framework for data persistence and transaction management.
- Designed and developed Session beans to implement the Business logic.
- Developed EJB components deployed on the WebLogic application server.
- Wrote unit tests using the JUnit framework and implemented logging using the Log4j framework.
- Designed and developed various configuration files for Hibernate mappings.
- Designed and documented REST/HTTP APIs, including JSON data formats and API versioning strategy.
- Developed Web Services for sending and receiving data between different applications using SOAP messages.
- Actively involved in code reviews and bug fixing.
- Applied CSS (Cascading Style Sheets) across the entire site for standardization.