Scala/Spark Developer Resume

Pottsville, PA

SUMMARY

  • Highly confident and skilled professional with 8+ years of experience in the IT industry, including around 4 years of hands-on expertise in Big Data processing using Hadoop, Hadoop ecosystem implementation, maintenance, ETL, and Big Data analysis operations.
  • Over 4 years of comprehensive experience in Big Data processing using Apache Hadoop and its ecosystem (MapReduce, Pig, Spark, Scala, Hive, Sqoop, HBase, and Cassandra).
  • Experience in installing, configuring, and maintaining Hadoop clusters.
  • Wrote Hive queries for data analysis to meet the requirements
  • Created Hive tables to store data in HDFS and processed data using HiveQL.
  • Extended Hive functionality by writing custom UDFs.
  • Provided support in designing and building an end-to-end framework for the Data Acquisition Layer, ETL Transformer Layer for Data Mart / Operational Data Store (OLTP & OLAP), and Data Provisioning Layer to Consumers / Services.
  • Experience in using ZooKeeper distributed coordination service for High-Availability.
  • Experience in migrating data from RDBMS to HDFS and Hive using Sqoop and converting SQL to HQL (Hive Query Language).
  • Experience working with MapReduce programs on Apache Hadoop to process Big Data.
  • Hands-on experience with compression codecs such as Snappy and Gzip.
  • Good understanding of Data Mining and Machine Learning techniques
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa
  • Experience in developing solutions to analyze large data sets efficiently
  • Ability to work in high-pressure environments delivering to and managing stakeholder expectations.
  • Application of structured methods to project scoping and planning, risks, issues, schedules, and deliverables.

TECHNICAL SKILLS

Hadoop Technologies: Apache Hadoop, Cloudera Hadoop Distribution (HDFS and MapReduce)

Hadoop Ecosystem: Hive, Pig, Sqoop, Flume, ZooKeeper, Cassandra, MongoDB

NoSQL Databases: HBase

Programming Languages: Java, C, C++, Linux shell scripting

Web Technologies: HTML, J2EE, CSS, JavaScript, AJAX, Servlet, JSP, DOM, XML

Databases: MySQL, SQL, Oracle, SQL Server

Software Engineering: UML, Object Oriented Methodologies, Scrum, Agile methodologies

Operating System: Linux, Windows 7, Windows 8, XP

IDE Tools: Eclipse, Rational Rose

PROFESSIONAL EXPERIENCE

Confidential, Pottsville, PA

Scala/Spark Developer

RESPONSIBILITIES:

  • Responsible for design & development of Spark SQL Scripts based on Functional Specifications.
  • Implemented Spark RDD Transformations and Actions in Scala.
  • Developed DataFrames and case classes for the required input data and performed data transformations using Spark Core.
  • Used NoSQL queries in HBase to analyze and process the data.
  • Used machine learning to perform transformations and apply business logic using Scala.
  • Implemented partitioning, dynamic partitioning, indexing, and bucketing in Hive.
  • Stored processed data in Parquet file format.
  • Streamed data from data sources using Kafka.
  • Converted Hive/SQL queries into Spark transformations using Spark RDDs, Python, and the Akka framework.
  • Implemented advanced procedures such as text analytics and processing, using in-memory computing capabilities for machine learning in Scala.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke and run MapReduce jobs on the Google cloud service.
  • Imported and exported data into HDFS using Kafka and analyzed it using Hive.

Confidential, GA

Hadoop Developer/Scala Developer

RESPONSIBILITIES:

  • Developed a data pipeline using Kafka, Spark, Sqoop, and MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Performed performance tuning and troubleshooting of Spark jobs by analyzing and reviewing Hadoop log files.
  • Experienced in migrating to Scala to minimize query response time.
  • Worked on Sequence files, ORC files, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.
  • Exported the result set from Hive to MySQL using Sqoop.
  • Configured Hive using a shared metastore in MySQL and used Sqoop to migrate data into external Hive tables from different RDBMS sources (Oracle, Teradata, and DB2) for data warehousing.
  • Provided Apache Spark support to the BI team as required.
  • Performed extensive data mining using Spark.
  • Used a NoSQL database to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Involved in developing Hive DDLs to create, alter, and drop Hive tables.
  • Used Java MapReduce to compute various metrics that define user experience, revenue, etc.
  • Involved in processing ingested raw data using MapReduce, Apache Pig, and Hive.
  • Involved in developing Pig scripts for change data capture and delta record processing between newly arrived data and already existing data in HDFS.
  • Involved in pivoting HDFS data from rows to columns and columns to rows.

Confidential, Boca Raton, FL

Hadoop Developer

RESPONSIBILITIES:

  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
  • Wrote custom MapReduce scripts in Java for data processing.
  • Imported and exported data into HDFS and Hive using Sqoop, and used Flume to extract data from multiple sources.
  • Responsible for managing data coming from different sources.
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from the UNIX file system to HDFS.
  • Created Hive tables to store data in HDFS, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
  • Created HBase tables to store variable data formats coming from different portfolios.
  • Implemented best income logic using Pig scripts. Wrote a custom Pig UDF to analyze data.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Provided cluster coordination services through ZooKeeper.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.

Confidential, Rochester, MN

Java Developer

RESPONSIBILITIES:

  • Developed JSPs and Servlets in a J2EE environment.
  • Designed XML files to implement most of the wiring needed for Hibernate annotations and Struts configurations.
  • Responsible for developing forms containing employee details and for generating reports and bills.
  • Developed Web Services for data transfer from client to server and vice versa using Apache Axis, SOAP and WSDL.
  • Involved in designing class and dataflow diagrams using UML in Rational Rose.

ENVIRONMENT: Java (JDK 1.6), J2EE, JSP, Servlet, Hibernate, JavaScript, JDBC, Oracle 10g, UML, Rational Rose, SOAP, WebLogic Server, JUnit, PL/SQL, CSS, HTML, XML, Eclipse

Confidential, New York, NY

Java Developer

RESPONSIBILITIES:

  • Developed Enterprise JavaBeans (stateless session beans) to handle transactions such as online funds transfers and bill payments to service providers.
  • Worked with various types of controllers, such as SimpleFormController, AbstractController, and the Controller interface.
  • Implemented Service Oriented Architecture (SOA) using JMS for sending and receiving messages while creating web services.
  • Developed XML documents and generated XSL files for Payment Transaction and Reserve Transaction systems.
  • Coded, tested, debugged, and deployed JSPs and Servlets for the input and output forms in web browsers.

ENVIRONMENT: J2EE, JDBC, Servlet, JSP, Struts, Hibernate, Web services, MVC, HTML, JavaScript, WebLogic, XML, JUnit, Oracle, WebSphere, Eclipse

Confidential

Java Developer

RESPONSIBILITIES:

  • Designed use cases for different scenarios.
  • Involved in acquiring requirements from the clients.
  • Developed functional code and met expected requirements.
  • Wrote product technical documentation as necessary.
  • Designed the presentation layer in JSP (for dynamic content) and HTML (for static pages).
  • Designed business logic in EJBs and business facades.
  • Used Resource Manager to schedule jobs on the UNIX server.

ENVIRONMENT: J2EE, JSP, HTML, Struts Framework, EJB, JMS, WebLogic Server, JBoss Server, PL/SQL, CVS, MS PowerPoint, MS Outlook
