Spark Developer Resume

Charlotte, NC

SUMMARY

  • Around 6 years of experience in application development and design using emerging technologies like Hadoop, NoSQL and Java/J2EE.
  • Strong experience in the requirements gathering, design and development, application migration and maintenance phases of the Software Development Lifecycle (SDLC).
  • Experience in installing, configuring and using Apache Hadoop ecosystem components and related tools such as Hadoop Distributed File System (HDFS), MapReduce, Pig, Hive, HBase, ZooKeeper, Sqoop, Hue, Spark, Kafka, Solr, Git, Maven, Avro, JSON and Chef.
  • Technically skilled at developing new applications on Hadoop according to business needs and converting existing applications to the Hadoop environment.
  • Exposure to analyzing data using HiveQL, HBase and MapReduce programs in Java.
  • Experience in Machine Learning and Data Science and in using new tools and technologies to drive improvements throughout the entire software development lifecycle.
  • Well versed in using Sqoop to import data into HDFS from RDBMS and vice versa.
  • Understanding of managed distributions of Hadoop like Cloudera and Hortonworks.
  • Proficient in Apache Spark and Scala programming for analyzing large datasets, and in using Spark Streaming and Kafka to process real-time data.
  • Experience in managing and scheduling Spark jobs on a Hadoop cluster using Oozie.
  • Expertise in cluster coordination services through ZooKeeper.
  • Developed Spark applications using Scala and Python.
  • Involved in HBase CRUD operations using both the Java API and shell commands. Proficient with indexes, scalability and query language support in Cassandra.
  • Involved in creating Hive tables, partitioning, bucketing, loading data and writing Hive queries.
  • Designed and implemented Hive UDFs using Java for evaluation, filtering, loading and storing of data.
  • Knowledge of installing, configuring, supporting and managing Hadoop clusters using Cloudera (CDH3, CDH4) distributions and Amazon Web Services.
  • Solid understanding of the working of EC2 and S3 in Amazon Web Services (AWS).
  • Proficiency in multiple databases like MongoDB, Cassandra, MySQL, Oracle 9i, 10g, 11g and MS SQL Server.
  • Experienced in Core Java with strong understanding and working knowledge of object-oriented programming (OOP) concepts, multi-threading, Collections Framework, exception handling, I/O system and JDBC.
  • Expertise in web technologies like HTML5, CSS3, XML, JavaScript and jQuery.
  • Well versed in the Scrum/Agile framework and Waterfall methodologies.
  • Good interpersonal skills and ability to work as part of a team. Exceptional ability to learn and master new technologies and to deliver outputs on short deadlines.

TECHNICAL SKILLS

Big Data Technologies: HDFS, Map Reduce, Hive, Pig, Sqoop, Flume, Oozie, Scala, Kafka, Spark, Spark SQL, Spark Streaming, Spark MLlib, Apache Nifi

Tools and IDE: Eclipse, NetBeans, Hue, IntelliJ, Tableau, Service Now

Programming Languages: C, C++, Java, Python, HiveQL, R

Java/J2EE Technologies: Java 8, JDBC, JSP, Servlets

Development Approach: Agile, Waterfall Model, TDD

Frameworks: MVC Struts, Hibernate, Spring

ETL: IBM WebSphere/Ascential DataStage 8.7/8.1.2/8.5

Database: HBase, Cassandra, MongoDB, Oracle, MySQL, SQL Server

Web Technologies: HTML5, CSS 3, JavaScript, XML, JQuery

Operating Systems: Windows, Linux, Mac

Version Control: GitHub, SVN

PROFESSIONAL EXPERIENCE

Confidential - Charlotte, NC

Spark Developer

Responsibilities:

  • Used Apache Kafka to develop a data pipeline that delivers logs as a stream of messages, through producers and consumers, into HDFS.
  • Responsible for developing prototypes and proofs of concept for the selected solutions and implementing complex big data projects with a focus on collecting, parsing, managing, analyzing and visualizing large sets of data using multiple platforms.
  • Developed Hive UDFs to bring all the customers' email IDs into a structured format.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark DataFrames, Scala and Python.
  • Performed unit testing for Spark and Spark Streaming with ScalaTest and JUnit.
  • Developed Spark projects in Scala and executed them using spark-submit.
  • Leveraged Tableau to perform visualizations on the collected data.
  • Imported data from various sources to the Cassandra cluster using Sqoop. Worked on creating data models for Cassandra from the existing Oracle data model.
  • Used the Spark-Cassandra connector to load data to and from Cassandra.
  • Used Spark SQL to fetch and generate reports on Cassandra table data.
  • Set up Solr for distributed indexing and search.
  • Developed Spark scripts using Scala shell commands as per the requirements.
  • Developed UDFs using both DataFrames/SQL and RDDs in Spark for data aggregation queries, and wrote the results back to OLTP systems through Sqoop.
  • Used Hive join queries to join multiple tables of a source system and load them into Elasticsearch.
  • Optimized Hive queries to extract the customer information from HDFS or Cassandra.
  • Automated the data flow using NiFi / Control-M.
  • Loaded datasets from two different sources, Oracle and MySQL, to HDFS and Hive respectively on a daily basis.
  • Worked in an Agile environment with active scrum participation.

Environment: Map Reduce, HDFS, Spark, Scala, Apache Kafka, Hive, Sqoop, Nifi, Solr, Cassandra, UNIX Shell Scripting, MySQL, Eclipse

Confidential - Houston, TX

Big Data Developer

Responsibilities:

  • Processed the web server logs by developing multi-hop Flume agents using Avro Sink and loaded them into MongoDB for further analysis; also extracted files from MongoDB through Flume and processed them.
  • Wrote the MapReduce jobs to parse the web logs stored in HDFS.
  • Involved in optimizing Hive queries and joins to get better results for Hive ad-hoc queries.
  • Implemented partitioning and bucketing in Hive for better organization of the data.
  • Created Hive queries for extracting data and sending it to clients.
  • Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Involved in creating Oozie workflow and coordinator jobs for Hive jobs to kick off the jobs on time for data availability.
  • Hands-on experience in developing MapReduce programs using Apache Hadoop for analyzing the big data.
  • Wrote multiple Hive queries to convert the processed data to multiple file formats including XML, JSON and CSV.
  • Created Mappings using Talend Open Studio for Evaluation and POC.
  • Spearheaded the POCs for the AWS ecosystem via the AWS Management Console.
  • Used Agile methodology for project management and Git for source code control.

Environment: HDFS, MapReduce, Hive, Sqoop, Flume, Oozie, Talend, MongoDB, Java, SQL scripting, Linux shell scripting, Eclipse

Confidential - St.Louis, MO

Big Data Developer

Responsibilities:

  • Imported and exported data into HDFS and Hive using Sqoop.
  • Involved in the design team for designing the flow architecture.
  • Handled data from different datasets, joining and preprocessing them using Pig join operations.
  • Moved bulk data into HBase using MapReduce integration.
  • Developed HBase data model on top of HDFS data to perform real time analytics using Java API.
  • Developed different kinds of custom filters and handled pre-defined filters on HBase data using the API.
  • Implemented counters on HBase data to count total records in different tables.
  • Handled Avro data files by passing schema into HDFS using Avro tools and MapReduce.
  • Created Hive Dynamic partitions to load time series data.
  • Created tables, partitions and buckets, and performed analytics using Hive ad-hoc queries.
  • Integrated Spring schedulers with the Oozie client as beans to handle cron jobs.
  • Actively participated in the software development lifecycle (scope, design, implement, deploy, test), including design and code reviews.
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.

Environment: Hadoop, HDFS, Map Reduce, Hive, HBase, Sqoop, RDBMS/DB, Flat files, MySQL, CSV, Avro data files

Confidential

Java Developer

Responsibilities:

  • Extensively used Core Java, Servlets, JSP and XML.
  • Used DB2 database to store the system data.
  • Used the Apache Log4j logging framework for trace and audit logging.
  • Used Asynchronous JavaScript and XML (AJAX) for a better and faster interactive front end.
  • Developed RESTful Web services client to consume JSON messages using Spring JMS configuration. Developed the message listener code.
  • Developed the business components using EJB Session Beans.
  • Created JSP pages for the Customer module of the application.
  • Involved in development across all the tiers of the J2EE application.
  • Performed code reviews. Unit tested all the components using JUnit.
  • Developed and deployed the project in a team that followed the Rational Unified Process (RUP) as its software management procedure, combined with Pair Programming and Test-Driven Development (TDD).

Environment: Java, JSP, Servlets, XML, WebSphere, SQL Server 2003, PL/SQL, Windows XP, SVN, ANT

Confidential

Software Developer

Responsibilities:

  • Implemented various Core Java concepts such as Multi-Threading, Exception Handling, Collection APIs to implement various features and enhancements.
  • Involved in various phases of Software Development Life Cycle (SDLC) such as requirements gathering, analysis, design and development.
  • Refined application framework for data flow and data handling.
  • Extensively involved in developing the web interface using JSP, JSP Standard Tag Libraries (JSTL), HTML, CSS and JavaScript to render the dynamic web pages (presentation layer) for the application.
  • Generated the application using Eclipse IDE.
  • Met stringent deadlines and analyzed the specific process requirements.
  • Involved in AJAX implementation.
  • Developed JavaScript validations to validate form fields.
  • Performed unit testing for the developed code using JUnit.
  • Worked on defining and improving a home-grown testing framework to support integration and performance testing.
  • Implemented Waterfall model practices according to the application requirements.

Environment: Java 1.6, JSP, MySQL, HTML, CSS, JavaScript, jQuery, Junit, Eclipse
