Spark Scala Developer Resume Bloomfield, CT - Hire IT People

SUMMARY:

Around 5 years of extensive IT experience in all phases of the Software Development Life Cycle, including experience working with Hadoop, Spark and cloud projects.
Worked extensively with Hadoop distributions such as Cloudera and Hortonworks.
Hands-on experience with Hadoop architecture and its components such as YARN, HDFS, Resource Manager, Node Manager, Name Node, Data Node and MapReduce v1 and v2.
Experienced in working with different file formats (Avro, Parquet, RC and ORC) and compression codecs (Gzip, LZO, Snappy and Bzip2).
Broad working experience with, and certification in, Spark Core, Spark SQL (DataFrames) and Spark Streaming.
Experience in importing and exporting data from different RDBMS Servers like Oracle and Teradata into HDFS and Hive using Sqoop.
Experience in ingesting data from FTP/SFTP servers using Flume.
Experience in developing Spark applications in Scala that consume data from Kafka using the Kafka Consumer API.
Experience in Hive, Impala and Spark Performance Tuning and Optimization.
Experience in developing Hive UDFs and running Hive scripts using different execution engines such as Tez and Spark (Hive on Spark).
Experienced in tuning long-running Spark applications and implementing features such as graceful shutdown, fault tolerance and failover.
Experience in creating DStreams from sources such as Kafka and performing different Spark transformations and actions on them.
Experience in working with Akka Actor Model using Scala.
Hands-on experience working with Kerberos keytabs for application authentication and Sentry for defining role-based ACLs on objects such as URIs, databases and tables.
Well versed in Hadoop encryption for data at rest and data in transit.
Experience in integrating Hive, Impala and Spark with Tableau reports.
Experience in publishing and scheduling refreshes on the Tableau Server.
Experience with AWS components such as EC2 instances, S3 buckets and CloudFormation templates.
Experience with Azure components such as Azure SQL Database and Azure Data Factory.

TECHNICAL SKILLS:

Programming Languages: Scala, Java, shell scripting, SQL and PL/SQL

Big Data Technologies: Hadoop, HDFS, MapReduce, Hive, Pig, HBase, Impala, Hue, Sqoop, Oozie, Flume, Zookeeper, Kafka, Sentry, Cloudera and Hortonworks.

Spark Components: Core, SQL (DataFrames, Datasets), Streaming

Databases & NoSQL: Oracle, Teradata, MySQL, SQL Server, HBase

Hadoop Paradigms: MapReduce, YARN, In-memory computing, High Availability, Batch processing, Real-time Streaming.

Other Tools: Eclipse, IntelliJ, Maven, SBT, SVN, GitHub, Jira, Jenkins

Cloud Components: AWS (S3 Buckets, EMR, EC2, CloudFormation), Azure (SQL Database

Visualization: Tableau Desktop and Tableau Server

PROFESSIONAL EXPERIENCE:

Confidential - Bloomfield, CT

Spark Scala Developer

Responsibilities:

Involved in the complete Big Data flow of the application, from ingesting upstream data into HDFS to processing and analyzing the data in HDFS.
Orchestrated a number of Sqoop queries and Hive scripts through custom-developed infrastructure.
Handled encryption algorithms using Apache Shiro for password protection.
Transformed existing Hive scripts into Spark applications that use RDDs to transform data and persist it to HDFS.
Worked extensively with the Spark SQL context to create DataFrames and Datasets for preprocessing the model data.
Developed Spark HBase and Spark AtScale modules for retrieving data into Spark for processing.
Designed end to end integration testing and unit testing for Spark Applications.
Experienced in performance tuning of Spark applications at the code, memory and parallelism levels.
Developed a Spark application to stream a Hive table to a Kafka topic in Avro format.
Migrated all the data from Teradata to the Big Data environment using Sqoop and Hive.
Worked on converting Spark Hive models so that the Classic Environment could be turned off.
Responsible for developing Spark scripts to check for data quality issues in DataFrames.
Developed preprocessing logic to filter data for downstream teams based on their requirements.
Deployed all changes through Continuous Integration and Continuous Deployment (CI/CD) pipelines using Jenkins and IBM UDeploy.

Environment: Cloudera, AWS, Sqoop, Hive, Spark, HBase, AtScale, SBT, Jenkins, IBM UDeploy, Shiro, Oozie, IntelliJ, Teradata

Confidential - St. Louis, MO

Big Data - Senior Technical Consultant

Responsibilities:

Integrated Tableau with Azure SQL Database and published workbooks to Tableau Server.
Involved in developing Sqoop queries for moving data from RDBMS servers to Hadoop.
Implemented optimization techniques like partitioning, bucketing and query optimization in Hive.
Responsible for creating Spark DataFrames using Scala for the ingested data.
Developed data pipelines using Kafka and Spark Streaming to ingest, transform and aggregate data.

Environment: Hadoop, HDFS, Spark, Hive, Oozie, Impala, Cloudera, Azure SQL Server, Tableau

Confidential - Ridgefield Park, NJ

Hadoop Engineer

Responsibilities:

Involved in the complete Big Data flow of the application, from ingesting upstream data into HDFS to processing and analyzing the data in HDFS.
Transformed existing Hive scripts into Spark applications that use RDDs to transform data and persist it to HDFS.
Worked extensively with the Spark SQL context to create DataFrames that filter input data for model execution.
Developed data pipelines using Kafka and Spark Streaming to ingest, transform and aggregate data.
Developed Flume agents to handle data from FTP/SFTP sources with HDFS as the sink.
Developed Sqoop jobs to import data in Avro file format from an Oracle database and created Hive tables on top of the imported data.
Developed automated scripts to import data from Amazon S3 buckets to HDFS using the Boto library.
Involved in performance tuning of Hive from design, storage and query perspectives.
Worked extensively on performance optimization of Hive queries using map-side joins, parallel execution and cost-based optimization.
Automated the ETL pipelines using Oozie and scheduled jobs using coordinators and crontabs.
Integrated Hive and Impala with Tableau reports and published to Tableau Server.
Involved in designing and developing tables in HBase and storing aggregated data from Hive tables in them.
Designed role-based ACLs for the tables in Hive and Impala using Sentry.

Environment: HDFS, Yarn, Hive, Sqoop, Flume, Oozie, HBase, Kafka, Impala, Spark SQL, Spark Streaming, IntelliJ, Oracle, Teradata, Shell Scripting, Tableau, Scala, Cloudera, AWS.

Confidential

Hadoop Developer

Responsibilities:

Participated in gathering and analyzing requirements and in designing technical documents for business requirements.
Responsible for importing data to HDFS using Sqoop from different RDBMS servers and exporting data using Sqoop to the RDBMS servers after aggregations for other ETL operations.
Developed new UDFs and used existing ones for custom processing of table data.
Used partitioning, bucketing, map-side joins and parallel execution to optimize Hive queries.
Responsible for monitoring the cluster using Cloudera Manager.
Developed Pig scripts to track changes between newly arrived data and current data.
Orchestrated hundreds of Sqoop queries, Pig scripts, Hive queries using Oozie workflows and sub-workflows.
Responsible for handling different data formats such as Avro, Parquet and ORC.

Environment: Pig, Hive, Oozie, Linux, YARN, Cloudera Manager

Confidential

Java Developer

Responsibilities:

Involved in developing various data flow diagrams, use case diagrams and sequence diagrams.
Worked on various phases of the project, including improving the reporting module.
Worked extensively in JSP, HTML, JavaScript, and CSS to create the UI pages for the project.
Created JUnit test cases for unit testing and developed generic JS functions for validations.
Gathered requirements for migrating from ICD-9 to ICD-10 codes.

Environment: Java 5.0, Struts, Spring 2.0, Hibernate 3.2, WebLogic 7.0, Eclipse, Oracle, JUnit 4.2, Maven, Windows XP, HTML, CSS, JavaScript, and XML.
