
Hadoop/Spark Developer Resume


Atlanta, GA

PROFESSIONAL SUMMARY:

  • 7 years of professional experience in information technology, including 4 years of experience developing Big Data solutions with Hadoop ecosystem components.
  • Over 3 years of extensive experience in Java, J2EE technologies, database development, and data analytics.
  • Hands on experience in development of Big Data projects using Hadoop, Hive, Sqoop, Oozie, PIG, Flume, and MapReduce open source tools/technologies.
  • Experience in writing Pig Latin and HiveQL scripts and extending their functionality using User Defined Functions (UDFs).
  • Hands-on experience with performance optimization techniques for data processing in Hive.
  • Wrote complex MapReduce code by implementing custom Writable and WritableComparable interfaces.
  • Good exposure to working with various file formats (Parquet, Avro, and JSON).
  • Hands-on experience with Spark Core, Spark SQL, and the DataFrame/Dataset/RDD APIs.
  • Developed applications using Spark for data processing.
  • Replaced existing MapReduce jobs and Hive scripts with Spark DataFrame transformations and actions (see the sketch after this summary).
  • Capable of using AWS services such as EMR, S3, and CloudWatch to run and monitor Hadoop and Spark jobs on AWS.
  • Good knowledge on Spark architecture and real-time streaming using Spark.
  • Fluent with core Java concepts such as I/O, multithreading, exceptions, regular expressions, and collections.
  • Experience in Object-Oriented Analysis and Design (OOAD) and software development using UML methodology; good knowledge of J2EE and Core Java design patterns.
  • Experience in Java, JSP, Servlets, WebLogic, WebSphere, JavaScript, Ajax, jQuery, and XML.
  • Experience in writing stored procedures and complex SQL queries using relational databases like Oracle, SQL Server, and MySQL.
  • Knowledge on ETL methods for data extraction, transformation and loading in corporate-wide ETL solutions and Data warehouse tools for reporting and data analysis.
  • Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Well-Versed with Agile and Waterfall methodologies.
  • Strong team player with good communication, analytical, presentation and interpersonal skills.
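
For illustration, the DataFrame work noted in the summary (replacing MapReduce jobs and Hive scripts with Spark transformations) could look roughly like the minimal Scala sketch below; the path, table, and column names are hypothetical, not taken from an actual project.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object SalesAggregation {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("SalesAggregation")
          .enableHiveSupport()
          .getOrCreate()

        // Hypothetical input path and column names, for illustration only.
        val sales = spark.read.parquet("/data/sales/")

        // The equivalent of a Hive GROUP BY (or a hand-written MapReduce
        // aggregation) expressed as DataFrame transformations.
        val dailyTotals = sales
          .filter(col("status") === "COMPLETED")
          .groupBy(col("order_date"), col("region"))
          .agg(sum(col("amount")).as("total_amount"), count(lit(1)).as("order_count"))

        dailyTotals.write.mode("overwrite").saveAsTable("analytics.daily_sales_totals")
        spark.stop()
      }
    }

Expressing the aggregation as DataFrame transformations lets Spark's Catalyst optimizer plan the job, rather than relying on hand-written map and reduce phases.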

TECHNICAL SKILLS:

Big Data Ecosystem: Hadoop, MapReduce, HDFS, HBase, Cassandra, MongoDB, ZooKeeper, Hive, Pig, Sqoop, Flume and Oozie.

Operating Systems: Windows, UNIX, LINUX.

Programming Languages: C, Java, PL/SQL, Scala

Scripting Languages: JavaScript, Shell Scripting

Web Technologies: HTML, XHTML, XML, CSS, JavaScript, JSON, SOAP, WSDL.

Hadoop Distribution: Cloudera, Hortonworks.

Java/J2EE Technologies: Java, J2EE, JDBC.

Database: Oracle, MS Access, MySQL, SQL, NoSQL.

IDE: Eclipse, IntelliJ, SBT.

Methodologies: J2EE Design Patterns, Scrum, Agile, Waterfall

Version Control: SVN, Git, GitHub, Bitbucket

PROFESSIONAL EXPERIENCE:

Confidential, Atlanta, GA

Hadoop/Spark Developer

Responsibilities:

  • Imported data from Teradata into HDFS using the Spark API and created Hive tables.
  • Developed Sqoop jobs to import data in Avro file format from the Oracle database and created Hive tables.
  • Created partitioned and bucketed Hive tables in Parquet file format with Snappy compression.
  • Involved in running Hive scripts through Hive, Impala, and Hive on Spark.
  • Involved in performance tuning of Hive from design, storage, and query perspectives.
  • Collected JSON data from an HTTP source and developed Spark APIs to perform inserts.
  • Developed Spark core and Spark SQL scripts using Scala for faster data processing.
  • Worked with the Spark SQL context to create DataFrames to filter input data for model execution (see the Spark SQL sketch after this section).
  • Experienced in performance tuning of Spark applications, including setting the right batch interval time.
  • Developed Kafka consumers in Scala for consuming data from Kafka topics (see the Kafka consumer sketch after this section).
  • Involved in designing and developing tables in HBase and storing aggregated data from Hive Table.
  • Integrated Hive and Tableau Desktop reports and published to Tableau Server.
  • Developed shell scripts for running Hive scripts in Hive and Impala.
  • Orchestrated a number of Sqoop and Hive scripts using Oozie workflows and scheduled them with Oozie coordinators.
  • Used Jira for bug tracking and Bitbucket to check in and check out code changes.

Environment: HDFS, Yarn, Hive, Sqoop, Flume, Oozie, HBase, Kafka, Impala, Spark SQL, Spark Streaming, Eclipse, Oracle, Teradata, PL/SQL, Linux Shell Scripting, Cloudera.
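
A minimal Scala sketch of the JSON-to-Hive flow described in the bullets above (read JSON from HDFS, filter it with Spark SQL, and write a partitioned, Snappy-compressed Parquet table); the paths, table, and column names are hypothetical.

    import org.apache.spark.sql.{SaveMode, SparkSession}

    object JsonToHive {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("JsonToHive")
          .enableHiveSupport()
          .getOrCreate()

        // Hypothetical HDFS landing path for JSON events collected from the HTTP source.
        val events = spark.read.json("hdfs:///landing/http_events/")

        // Filter the input before model execution.
        val valid = events.filter("event_type IS NOT NULL AND payload IS NOT NULL")

        // Write into a partitioned Hive table stored as Snappy-compressed Parquet.
        valid.write
          .mode(SaveMode.Append)
          .format("parquet")
          .option("compression", "snappy")
          .partitionBy("event_date")
          .saveAsTable("staging.http_events")

        spark.stop()
      }
    }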
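
A minimal Scala sketch of a Kafka consumer along the lines of the Kafka bullet above; the broker address, group id, and topic name are hypothetical.

    import java.time.Duration
    import java.util.{Collections, Properties}
    import org.apache.kafka.clients.consumer.KafkaConsumer

    object EventConsumer {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put("bootstrap.servers", "broker1:9092") // hypothetical broker
        props.put("group.id", "event-loader")          // hypothetical consumer group
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

        val consumer = new KafkaConsumer[String, String](props)
        consumer.subscribe(Collections.singletonList("http-events")) // hypothetical topic

        // Poll loop kept deliberately simple; a real job would handle commits and shutdown.
        while (true) {
          val records = consumer.poll(Duration.ofMillis(500))
          records.forEach(r => println(s"offset=${r.offset()} value=${r.value()}"))
        }
      }
    }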

Confidential, Minneapolis, MN

Hadoop Developer/Spark Developer

Responsibilities:

  • Imported and exported data into HDFS and Hive using Sqoop and Kafka.
  • Converted complex Teradata and Netezza SQLs into Hive HQLs.
  • Developed ETL using Hive, Oozie, shell scripts, and Sqoop; coded the components in Scala and made use of Scala pattern matching (see the sketch after this section).
  • Used Flume to collect, aggregate and store the web log data onto HDFS.
  • Developed and implemented core API services using Python with Hive.
  • Designed NoSQL schemas in HBase.
  • Developed MapReduce ETL in Java and Pig.
  • Loaded log data into HDFS using Flume.
  • Developed simple to complex MapReduce Jobs using Hive and Pig.
  • Used Hive and created Hive tables and involved in data loading and writing Hive UDFs.
  • Implemented Partitioning, Dynamic Partitions, Buckets in HIVE.

Environment: MapReduce, HDFS, Hive, Pig, Sqoop, Scala, Oozie, SQL, Flume, Python, Shell Script, DataStage, Hortonworks, Cloudera.
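
A minimal Scala sketch of the pattern-matching style mentioned above for the ETL components; the record layout and type names are hypothetical.

    // Hypothetical ETL record types; pattern matching routes each parsed line.
    sealed trait EtlRecord
    case class Order(id: String, amount: Double)  extends EtlRecord
    case class Refund(id: String, amount: Double) extends EtlRecord
    case class Malformed(raw: String)             extends EtlRecord

    object RecordRouter {
      // Parses a pipe-delimited line; non-numeric amounts would still need handling.
      def parse(line: String): EtlRecord = line.split('|') match {
        case Array("ORD", id, amt) => Order(id, amt.toDouble)
        case Array("REF", id, amt) => Refund(id, amt.toDouble)
        case _                     => Malformed(line)
      }

      // Decides which output the record belongs to.
      def route(rec: EtlRecord): String = rec match {
        case Order(_, amt) if amt > 0 => "orders"
        case Refund(_, _)             => "refunds"
        case _                        => "errors"
      }
    }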

Confidential, Tampa, FL

Hadoop Developer

Responsibilities:

  • Loaded the data using Sqoop from different RDBMS Servers like Teradata and Netezza to Hadoop HDFS Cluster.
  • Performed daily Sqoop incremental imports scheduled through Oozie.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Performed Optimizations of Hive Queries using Map side joins, dynamic partitions, and Bucketing.
  • Responsible for executing Hive queries using Hive Command Line under Tez.
  • Implemented Hive generic UDFs to apply business logic around custom data types (see the sketch after this section).
  • Used Pig as ETL tool to do transformations, event joins and some pre-aggregations before storing the data onto HDFS.
  • Coordinated the Pig and Hive scripts using Oozie workflow.
  • Loaded the data into HBase from HDFS.
  • Load and transform large sets of structured, semi structured, and unstructured data that includes Avro, sequence files, and XML files.

Environment: Hadoop, Hortonworks, Big Data, HDFS, MapReduce, Tez, Sqoop, Oozie, Pig, Hive, Linux, Java, Eclipse.
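
A minimal sketch of a Hive UDF written on the JVM, along the lines of the UDF bullet above (shown in Scala with the simpler UDF interface rather than GenericUDF, for brevity); the class, function, and column names are hypothetical.

    import org.apache.hadoop.hive.ql.exec.UDF
    import org.apache.hadoop.io.Text

    // Hypothetical UDF that normalizes a custom account-code field before joins.
    class NormalizeAccountCode extends UDF {
      def evaluate(input: Text): Text = {
        if (input == null) null
        else new Text(input.toString.trim.toUpperCase)
      }
    }

    // Registered and used in Hive roughly like:
    //   ADD JAR hdfs:///udfs/normalize-account-code.jar;
    //   CREATE TEMPORARY FUNCTION normalize_code AS 'NormalizeAccountCode';
    //   SELECT normalize_code(account_code) FROM accounts;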

Confidential

Hadoop Developer

Responsibilities:

  • Involved in various phases of Software Development Life Cycle (SDLC) such as requirements gathering, analysis, design, and development.
  • Analyzed large datasets to provide strategic direction to the company.
  • Collected logs from the physical machines and integrated them into HDFS using Flume.
  • Involved in analyzing the system and business requirements.
  • Developed SQL statements to improve back-end communications.
  • Loaded unstructured data into Hadoop File System (HDFS).
  • Created reports and dashboards using structured and unstructured data.
  • Involved in importing data from MySQL to HDFS using Sqoop.
  • Involved in writing Hive queries to load and process data in Hadoop File System.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Involved in working with Impala for data retrieval process.
  • Performed sentiment analysis on reviews of the products on the client's website.
  • Developed custom MapReduce programs to extract the required data from the logs (see the sketch after this section).
  • Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
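
A minimal Scala sketch of a custom MapReduce mapper of the kind mentioned above for extracting fields from log files; the log format and field names are hypothetical.

    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.Mapper

    // Hypothetical mapper that pulls the request URL and HTTP status out of each access-log line.
    class LogExtractMapper extends Mapper[LongWritable, Text, Text, Text] {

      // Common-log-format style pattern; a real log layout would differ.
      private val logPattern =
        """^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3}) .*""".r

      override def map(key: LongWritable, value: Text,
                       context: Mapper[LongWritable, Text, Text, Text]#Context): Unit = {
        value.toString match {
          case logPattern(_, _, url, status) => context.write(new Text(url), new Text(status))
          case _                             => // skip lines that do not match
        }
      }
    }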

Confidential

Java/J2EE Developer

Responsibilities:

  • Involved in Full Life Cycle Development in Distributed Environment using Java and J2EE Framework.
  • Designed the application by implementing Struts Framework based on MVC Architecture.
  • Designed and developed the front end using JSP, HTML, JavaScript, and jQuery.
  • Implemented the web service client for login authentication, credit reports, and applicant information using the Apache Axis 2 web service.
  • Extensively worked on the user interface for a few modules using JSPs, JavaScript, and Ajax.
  • Developed framework for data processing using Design patterns, Java, XML.
  • Used the lightweight container of the Spring Framework to provide architectural flexibility through Inversion of Control (IoC).
  • Used Hibernate ORM framework with Spring framework for data persistence and transaction management.
  • Designed and developed Session beans to implement the Business logic.
  • Developed EJB components that are deployed on Web Logic Application Server.
  • Wrote unit tests using the JUnit framework; logging was done using the Log4J framework.
  • Designed and developed various configuration files for Hibernate mappings.
  • Designed and documented REST/HTTP APIs, including JSON data formats and API versioning strategy.
  • Developed Web Services for sending and getting data from different applications using SOAP messages.
  • Actively involved in code reviews and bug fixing.
  • Applied CSS (Cascading Style Sheets) across the entire site for standardization.
  • Assisted the QA team in defining and implementing a defect resolution process, including defect priority and severity.

Environment: Java 5.0, Struts, Spring 2.0, Hibernate 3.2, Web Logic 7.0, Eclipse 3.3, Oracle 10g, JUnit 4.2, Maven, Windows XP, HTML, CSS, JavaScript, and XML.

Confidential

Programmer Analyst

Responsibilities:

  • Involved in understanding the functional specifications of the project.
  • Assisted the development team in designing the complete application architecture.
  • Involved in developing JSP pages for the web tier and validating the client data using JavaScript.
  • Developed connection components using JDBC.
  • Designed Screens using HTML and images.
  • Cascading Style Sheet (CSS) was used to maintain uniform look across different pages.
  • Involved in creating Unit Test plans and executing the same.
  • Performed document and code reviews and knowledge transfers for status updates on ongoing project development.
  • Deployed web modules in Tomcat web server.

Environment: Java, JSP, J2EE, Servlets, Java Beans, HTML, JavaScript, JDeveloper, Tomcat Webserver, Oracle, JDBC, XML.
