Scala/Spark Developer Resume
Pottsville, PA
SUMMARY
- Highly confident and skilled professional with 8+ years of experience in the IT industry, including around 4 years of hands-on expertise in Big Data processing using Hadoop, Hadoop ecosystem implementation, maintenance, ETL, and Big Data analysis operations.
- Over 4 years of comprehensive experience in Big Data processing using Apache Hadoop and its ecosystem (MapReduce, Pig, Spark, Scala, Hive, Sqoop, HBase, and Cassandra).
- Experience in installing, configuring, and maintaining Hadoop clusters.
- Wrote Hive queries for data analysis to meet the requirements
- Created Hive tables to store data in HDFS and processed data using HiveQL.
- Extended Hive functionality by writing custom UDFs.
- Supported the design and build of an end-to-end framework for the Data Acquisition Layer, the ETL Transformer Layer for Data Mart / Operational Data Store (OLTP & OLAP), and the Data Provisioning Layer to Consumers / Services.
- Experience using the ZooKeeper distributed coordination service for high availability.
- Experience in migrating data from RDBMS to HDFS and Hive using Sqoop and converting SQL to HQL (Hive Query Language).
- Experience working with MapReduce programs on Apache Hadoop for Big Data processing.
- Hands-on experience with compression codecs such as Snappy and Gzip.
- Good understanding of Data Mining and Machine Learning techniques
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa
- Experience in developing solutions to analyze large data sets efficiently
- Ability to work in high-pressure environments delivering to and managing stakeholder expectations.
- Applied structured methods to project scoping and planning, risks, issues, schedules, and deliverables.
TECHNICAL SKILLS
Hadoop Technologies: Apache Hadoop, Cloudera Hadoop Distribution (HDFS and MapReduce)
Hadoop Ecosystem: Hive, Pig, Sqoop, Flume, ZooKeeper, Cassandra, MongoDB
NoSQL Databases: HBase
Programming Languages: Java, C, C++, Linux shell scripting
Web Technologies: HTML, J2EE, CSS, JavaScript, AJAX, Servlet, JSP, DOM, XML
Databases: MySQL, SQL, Oracle, SQL Server
Software Engineering: UML, Object Oriented Methodologies, Scrum, Agile methodologies
Operating System: Linux, Windows 7, Windows 8, XP
IDE Tools: Eclipse, Rational Rose
PROFESSIONAL EXPERIENCE
Confidential, Pottsville, PA
Scala/Spark Developer
RESPONSIBILITIES:
- Responsible for design & development of Spark SQL Scripts based on Functional Specifications.
- Implemented Spark RDD Transformations and Actions in Scala.
- Developed DataFrames and case classes for the required input data and performed the data transformations using Spark Core.
- Used NoSQL queries in HBase to analyze and process the data.
- Used machine learning to perform transformations and apply business logic using Scala.
- Implemented partitioning, dynamic partitioning, indexing, and bucketing in Hive.
- Stored processed data in Parquet file format.
- Streamed data from data source using Kafka.
- Converted Hive/SQL queries into Spark transformations using Spark RDDs, Python, and the Akka framework.
- Implemented advanced procedures such as text analytics and processing, using in-memory computing capabilities for machine learning in Scala.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke and run MapReduce jobs in the Google cloud service.
- Imported and exported data to and from HDFS using Kafka and analyzed it using Hive.
Confidential, GA
Hadoop Developer/Scala Developer
RESPONSIBILITIES:
- Developed a data pipeline using Kafka, Spark, Sqoop, and MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Performed performance tuning and troubleshooting of Spark jobs by analyzing and reviewing Hadoop log files.
- Experienced in migrating to Scala to minimize query response time.
- Worked on SequenceFiles, ORC files, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.
- Exported the result set from Hive to MySQL using Sqoop.
- Configured Hive using a shared metastore in MySQL and used Sqoop to migrate data into external Hive tables from different RDBMS sources (Oracle, Teradata, and DB2) for data warehousing.
- Provided Apache Spark support to the BI team when required.
- Performed extensive data mining using Spark.
- Used a NoSQL database to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Involved in developing Hive DDLs to create, alter, and drop Hive tables.
- Used Java MapReduce to compute metrics that define user experience, revenue, etc.
- Involved in processing ingested raw data using MapReduce, Apache Pig, and Hive.
- Involved in developing Pig Scripts for change data capture and delta record processing between newly arrived data and already existing data in HDFS.
- Involved in pivoting HDFS data from rows to columns and columns to rows.
Confidential, Boca Raton, FL
Hadoop Developer
RESPONSIBILITIES:
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration.
- Wrote custom MapReduce scripts for data processing in Java.
- Imported and exported data into HDFS and Hive using Sqoop, and used Flume to extract data from multiple sources.
- Responsible for managing data coming from different sources.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Created Hive tables to store data in HDFS, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
- Created HBase tables to store variable data formats coming from different portfolios.
- Implemented best income logic using Pig scripts. Wrote custom Pig UDF to analyze data.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Used ZooKeeper for cluster coordination services.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
Confidential, Rochester, MN
Java Developer
RESPONSIBILITIES:
- Used JSP and Servlet coding in a J2EE environment.
- Designed XML files to implement most of the wiring needed for Hibernate annotations and Struts configurations.
- Responsible for developing forms containing employee details and for generating reports and bills.
- Developed Web Services for data transfer from client to server and vice versa using Apache Axis, SOAP and WSDL.
- Involved in designing class and dataflow diagrams using UML in Rational Rose.
ENVIRONMENT: Java (JDK 1.6), J2EE, JSP, Servlet, Hibernate, JavaScript, JDBC, Oracle 10g, UML, Rational Rose, SOAP, WebLogic Server, JUnit, PL/SQL, CSS, HTML, XML, Eclipse
Confidential, New York, NY
Java Developer
RESPONSIBILITIES:
- Developed Enterprise JavaBeans (stateless session beans) to handle different transactions, such as online funds transfers and bill payments to service providers.
- Worked with various types of controllers, such as SimpleFormController, AbstractController, and the Controller interface.
- Implemented Service Oriented Architecture (SOA) using JMS for sending and receiving messages while creating web services.
- Developed XML documents and generated XSL files for Payment Transaction and Reserve Transaction systems.
- Developed, coded, tested, debugged, and deployed JSPs and Servlets for the input and output forms in web browsers.
ENVIRONMENT: J2EE, JDBC, Servlet, JSP, Struts, Hibernate, Web services, MVC, HTML, JavaScript, WebLogic, XML, JUnit, Oracle, WebSphere, Eclipse
Confidential
Java Developer
RESPONSIBILITIES:
- Designed use cases for different scenarios.
- Involved in acquiring requirements from the clients.
- Developed functional code and met expected requirements.
- Wrote product technical documentation as necessary.
- Designed the presentation layer in JSP (dynamic content) and HTML (static pages).
- Designed business logic in EJBs and business facades.
- Used Resource Manager to schedule jobs on the UNIX server.
ENVIRONMENT: J2EE, JSP, HTML, Struts Framework, EJB, JMS, WebLogic Server, JBoss Server, PL/SQL, CVS, MS PowerPoint, MS Outlook
