
Spark/Azure Developer Resume

Redmond, WA

SUMMARY

  • Overall 6+ years of experience in software application development, including analysis, design, development, integration, testing and maintenance.
  • Work experience in Big Data/Hadoop development and ecosystem analytics using programming languages such as Java and Scala.
  • Experienced in developing big data applications in the cloud, specifically Amazon Web Services (AWS).
  • Good experience with Spark SQL, PySpark and Spark using Scala.
  • Configured Spark Streaming to receive real-time data from Apache Kafka and store the stream data to HDFS using Scala.
  • Experienced in building highly scalable Big Data solutions using Hadoop with multiple distributions (Cloudera, Hortonworks) and NoSQL platforms (HBase and Cassandra).
  • Experience in analyzing data using HiveQL and Pig Latin and writing custom MapReduce programs in Java and Python.
  • Expertise in Big Data architecture with the Hadoop file system and its ecosystem tools: MapReduce, HBase, Hive, Pig, Zookeeper, Oozie, Flume, Avro, Impala, Apache Spark, Spark Streaming and Spark SQL.
  • Hands on experience in Apache Sqoop, Apache Storm and Apache Hive integration.
  • Hands-on experience working with different file formats such as TextFile, JSON, Avro and ORC for Hive querying and processing.
  • Experience with Apache Kafka, used as a message broker and for log aggregation and stream processing.
  • Expertise in migrating data from different databases (Oracle, DB2, Teradata) to HDFS.
  • Experience in designing and coding web applications using Core Java and web technologies: JSP, Servlets and JDBC.
  • Experience in designing user interfaces using HTML, CSS, JavaScript and JSP.
  • Experience with version control tools like Git.
  • Developed User Defined Functions (UDFs) for Apache Pig and Hive using Python and Java languages.
  • Developed Scala 2.10+ applications on Hadoop and Spark SQL for high-volume and real-time data processing.
  • Used the Spark API over Cloudera to perform analytics on data in Impala.
  • Rich experience in software methodologies including Agile, Extreme Programming (XP), Scrum, the Waterfall model and Test-Driven Development (TDD).
  • Expert-level skills in designing and implementing web server solutions and deploying Java application servers like Tomcat, JBoss, WebSphere and WebLogic on Windows and UNIX platforms.
  • Knowledge of Spark APIs to cleanse, explore, aggregate, transform and store data.
  • Experience with RDBMS and writing SQL and PL/SQL scripts used in stored procedures.
  • Familiar with Spark, Kafka, Storm, Talend and Elasticsearch.
  • Knowledge of NoSQL databases such as HBase, MongoDB and Cassandra.
  • Well experienced in building servers such as DHCP, PXE with Kickstart, DNS and NFS, and using them to build infrastructure in a Linux environment. Automated build, testing and integration with Ant, Maven and JUnit.
  • Strengths include being a good team player; excellent communication, interpersonal and analytical skills; flexibility in working with new technologies; and the ability to work effectively in a fast-paced, high-volume, deadline-driven environment.
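The custom MapReduce programs in Python mentioned above typically follow the Hadoop Streaming map/shuffle/reduce contract. As an illustrative sketch only (not code from these projects), a word-count pair might look like this, with a local `sorted` call standing in for Hadoop's shuffle phase:

```python
from itertools import groupby

def mapper(lines):
    """Map phase: emit one tab-separated (word, 1) pair per word."""
    for line in lines:
        for word in line.strip().split():
            yield f"{word}\t1"

def reducer(pairs):
    """Reduce phase: sum counts per word. Input must be sorted by key,
    which Hadoop's shuffle guarantees between the two phases."""
    parsed = (pair.split("\t") for pair in pairs)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"

# Local simulation: sorted() plays the role of the shuffle/sort step.
counts = list(reducer(sorted(mapper(["to be or not to be"]))))
# counts == ['be\t2', 'not\t1', 'or\t1', 'to\t2']
```

In a real job each function would read `sys.stdin` and print to stdout, submitted with the Hadoop Streaming jar's `-mapper` and `-reducer` options.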

TECHNICAL SKILLS

Big Data Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Storm, Zookeeper, Kafka, Impala, HCatalog, Apache Spark, Spark Streaming, PySpark, Spark SQL, HBase and Cassandra

Hadoop Distributions: AWS, Cloudera, Hortonworks.

Databases: Oracle, MySQL, Teradata; NoSQL: HBase and Cassandra

Programming Languages: SQL, Scala, Java, Python, Unix Shell Scripting

Java Technologies: JDBC, Servlets, JSP, Spring and Hibernate

Operating Systems: Windows XP/7/8, Linux (Ubuntu), CentOS.

Tools & Utilities: HP Quality Center, Git, Maven.

PROFESSIONAL EXPERIENCE

Confidential, Redmond, WA

Spark/Azure Developer

Responsibilities:

  • Evaluated business requirements and prepared detailed specifications, following project guidelines, for the programs to be developed.
  • Worked on analyzing Hadoop cluster and different big data analytical and processing tools including Pig, Hive, Spark and Spark Streaming.
  • Designed solutions for various system components using Confidential Azure.
  • Experience developing multi-user cloud-hosted software on one or more public cloud platforms such as Azure.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Performed Code Optimizations to improve the performance of the applications.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and exported data from HDFS to MySQL using Sqoop.
  • Experience with Azure relational and NoSQL databases, HDInsight, Apache Storm and Spark.
  • Configured Spark Streaming to receive real time data from the Event Hubs and store the stream data to HDFS using Scala.
  • Developed numerous Spark jobs in Scala 2.10.x for data cleansing and analyzing data in Impala 2.1.0.
  • Hands-on experience in Spark and Spark Streaming, creating RDDs and applying operations (transformations and actions).
  • Used HIVE to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Developed cutting-edge cloud platform solutions using the latest tools and platforms, such as Azure.
  • Experience with Azure PaaS services such as Web Sites, SQL, Stream Analytics, IoT Hubs, Event Hubs, Data Lake and Azure Data Factory.
  • Hands-on experience with Spark SQL queries and DataFrames: importing data from data sources, performing transformations, performing read/write operations and saving the results to an output directory in HDFS.
  • Responsible for Ingestion of Data from Blob to Kusto and maintaining the PPE and PROD pipelines.
  • Collaborated with the infrastructure, network, database and application teams to ensure data quality and availability.
  • Experienced in running queries using Impala and used BI tools to run ad-hoc queries directly on Hadoop.
  • Responsible for creating Hive tables, partitions, loading data and writing hive queries.
  • Imported and exported the data using Sqoop between Hadoop Distributed File System (HDFS) and Relational Database systems.
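The distinction between Spark transformations (lazy) and actions (eager) that the RDD bullets refer to can be illustrated without a cluster. The sketch below mimics the semantics with plain Python generators; it is an analogy only, not the PySpark API:

```python
def lazy_map(source, fn):
    """Like rdd.map: a transformation; nothing executes yet."""
    return (fn(x) for x in source)

def lazy_filter(source, pred):
    """Like rdd.filter: also lazy."""
    return (x for x in source if pred(x))

def collect(source):
    """Like rdd.collect: an action that forces the pipeline to run."""
    return list(source)

# Hypothetical event records, invented for illustration.
events = ["click:3", "view:1", "click:7"]

# Build the pipeline; no element has been processed at this point.
pipeline = lazy_map(
    lazy_filter(lazy_map(events, lambda e: e.split(":")),
                lambda kv: kv[0] == "click"),
    lambda kv: int(kv[1]),
)

clicks = collect(pipeline)  # the action triggers evaluation
# clicks == [3, 7]
```

Spark keeps the same shape but records the transformations as a DAG, so a failed partition can be recomputed from lineage instead of re-running the whole job.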

Environment: MapReduce, Azure, Python, HDFS, Hive, Pig, Spark, Spark-Streaming, Spark SQL, Sqoop, Java, Scala, Eclipse, Git, Shell Scripting and Cassandra.

Confidential, Fort Wayne, IN

Hadoop/Spark Developer

Responsibilities:

  • Involved in cluster setup, monitoring and administration tasks like commissioning and decommissioning nodes.
  • Worked on Hadoop, MapReduce and YARN/MRv2; developed multiple MapReduce jobs in Java for structured, semi-structured and unstructured data.
  • Developed MapReduce programs in Java for parsing the raw data and populating staging Tables.
  • Created Hive queries to compare the raw data with EDW reference tables and perform aggregations.
  • Experienced in developing custom input formats and data types to parse and process unstructured and semi structured input data and mapped them into key value pairs to implement business logic in Map-Reduce.
  • Experience in implementing custom serializers, interceptors, sources and sinks as per the requirements in Flume to ingest data from multiple sources.
  • Performed big data processing using Hadoop, MapReduce, Sqoop, Oozie and Impala.
  • Involved in developing Hive DDLs to create, alter and drop Hive tables, and worked with Storm and Kafka.
  • Experience in setting up a fan-out workflow in Flume to design a V-shaped architecture that takes data from many sources and ingests it into a single sink.
  • Implemented extensive Impala 2.7.0 queries and created views for ad-hoc and business processing.
  • Used Spark Streaming APIs to perform transformations and actions on the fly for building common learner data model which gets the data from Kafka in near real time and persist it to Cassandra.
  • Consumed JSON messages using Kafka and processed the JSON file using Spark Streaming to capture UI updates
  • Developed Spark code in Scala in the IntelliJ IDE using SBT.
  • Performance tuning of Sqoop, Hive and Spark jobs.
  • Worked with .NET and C# to create dashboards according to the client's requirements.
  • Experienced in writing live real-time processing and core jobs using Spark Streaming with Kafka as a data pipeline system.
  • Implemented OLAP multi-dimensional cube functionality using Azure SQL Data Warehouse.
  • Wrote Azure PowerShell scripts to copy or move data from the local file system to HDFS Blob storage.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Experienced in analyzing data with Hive and Pig.
  • Configured Spark Streaming to get information from Kafka streams and then store it in HDFS.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Integrated bulk data into the Cassandra file system using MapReduce programs.
  • Expertise in designing and data modeling for the Cassandra NoSQL database.
  • Experienced in managing and reviewing Hadoop log files.
  • Involved in the data migration process using Azure, integrating with a GitHub repository and Jenkins.
  • Experienced in implementing High Availability using QJM and NFS to avoid a single point of failure.
  • Developed custom mappers in Python scripts, and Hive UDFs and UDAFs, based on the given requirements.
  • Connected to an NFSv3 storage server supporting the AUTH_NONE or AUTH_SYS authentication methods.
  • Used HiveQL to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Experienced in querying data using Spark SQL on top of the Spark engine.
  • Experience in managing and monitoring a Hadoop cluster using Cloudera Manager.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig, Hive and Sqoop.
  • Implemented an analytical platform that used Hive functions and different kinds of join operations, such as map joins and bucketed map joins.
  • Unit tested a sample of raw data, improved performance and turned it over to production.
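Custom Python mappers used with Hive typically read tab-separated rows on stdin and emit cleansed rows on stdout via Hive's `TRANSFORM` clause. The sketch below is hypothetical; the `(user_id, country, amount)` field layout is invented for illustration, not taken from the project:

```python
def cleanse_row(line):
    """Normalize one tab-separated (user_id, country, amount) row;
    return None so the caller can drop rows that fail validation.
    Field names here are invented for illustration."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3:
        return None
    user_id, country, amount = fields
    try:
        value = float(amount)
    except ValueError:
        return None
    # Normalize country codes and fix the amount to two decimals.
    return f"{user_id}\t{country.strip().upper()}\t{value:.2f}"

# In production the script would loop over sys.stdin and print each
# cleansed row; Hive streams rows in and out through those pipes.
sample = cleanse_row("42\t us \t19.5")
# sample == "42\tUS\t19.50"
```

Hive would wire this up with `ADD FILE cleanse.py;` followed by a `SELECT TRANSFORM (...) USING 'python cleanse.py' AS (...)` query.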

Environment: CDH, Java (JDK 1.7), Impala, Hadoop, Azure, MapReduce, HDFS, Hive, Sqoop, Flume, NFS, Cassandra, Pig, Oozie, Kerberos, Scala, Spark SQL, Spark Streaming, Kafka, Linux, Shell Scripting, MySQL, Oracle 11g, SQL*Plus, C++, C#

Confidential, Madison, WI

Hadoop Developer

Responsibilities:

  • Understood the exact report requirements from the business groups and users.
  • Imported trading and derivatives data into the Hadoop Distributed File System using the ecosystem components MapReduce, Pig, Hive and Sqoop.
  • Was part of the activity to set up the Hadoop ecosystem in the development and QA environments.
  • Managed and reviewed Hadoop Log files.
  • Responsible for writing Pig scripts and Hive queries for data processing.
  • Ran Sqoop to import data from Oracle and other databases.
  • Created shell scripts to collect raw logs from different machines.
  • Created static and dynamic partitions in Hive.
  • Implemented Pig Latin scripts using operators such as LOAD, STORE, DUMP, FILTER, DISTINCT, FOREACH, GENERATE, GROUP, COGROUP, ORDER, LIMIT and UNION.
  • Defined Pig UDFs for functions such as swap, hedging, speculation and arbitrage.
  • Coded MapReduce programs to process unstructured log files.
  • Worked on importing and exporting data between HDFS and Hive using Sqoop.
  • Used parameterized Pig scripts and optimized them using ILLUSTRATE and EXPLAIN.
  • Involved in configuring HA, resolving Kerberos security issues and restoring NameNode failures from time to time as part of maintaining zero downtime.
  • Implemented the Fair Scheduler as well.
  • Used the Spring framework to handle application logic and make calls to business objects, configuring them as Spring beans.
  • Implemented and configured data sources and the session factory, and used HibernateTemplate to integrate the Spring framework with Hibernate.
  • Developed JUNIT test cases for application unit testing.
  • Used SVN as version control to check in the code, created branches and tagged the code in SVN.
  • Used RESTful services to interact with the client by providing RESTful URL mappings.
  • Used the Log4j framework for application logging, tracking and debugging.
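A MapReduce program for unstructured log files, like the one mentioned above, first parses free-form lines into structured records. The sketch below assumes Apache common log format input; that format is an assumption for illustration, not stated in this resume:

```python
import re

# Apache common log format: host ident authuser [time] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_log_line(line):
    """Return (host, status, bytes_sent) for one log line,
    or None if the line does not match the expected format."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    # "-" means the server sent no body; treat it as zero bytes.
    size = 0 if match.group("size") == "-" else int(match.group("size"))
    return match.group("host"), int(match.group("status")), size

sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /index.html HTTP/1.0" 200 2326')
record = parse_log_line(sample)
# record == ('127.0.0.1', 200, 2326)
```

In the map phase each parsed tuple would be emitted as a key/value pair (for example host as key) so the reducers can aggregate traffic per host or per status code.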

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Shell Scripting, Sqoop, Java, Eclipse, Spring, Hibernate, SOAP, REST, SVN, Log4j

Confidential

Java/J2EE Developer

Responsibilities:

  • Involved in the complete software development life cycle (SDLC) of the application, from requirement analysis to testing.
  • Developed the modules based on Struts MVC Architecture.
  • Followed AGILE methodology (Scrum Stand-ups, Sprint Planning, Sprint Review, Sprint Showcase and Sprint Retrospective meetings).
  • Developed business components using Core Java concepts such as inheritance, polymorphism, collections, serialization and multithreading.
  • Developed the Web Interface using Servlets, Java Server Pages, HTML and CSS.
  • Developed the DAO objects using JDBC.
  • Developed business services using Servlets and Java.
  • Used the Spring framework for dependency injection and integrated it with the Struts framework and Hibernate.
  • Developed JUnit test cases for all the developed modules.
  • Used Log4j to capture the log that includes runtime exceptions, monitored error logs and fixed the problems.
  • Performed Unit Testing, System Testing and Integration Testing.
  • Provided technical support for production environments: resolving issues, analyzing defects, and providing and implementing solutions. Resolved high-priority defects as per the schedule.

Environment: Java, JDBC, Spring, Hibernate, HTML, CSS, JavaScript, Log4j, Oracle, Struts and Eclipse.

Confidential

Jr. Java/J2EE Developer

Responsibilities:

  • Developed the user interface module using JSP, JavaScript, DHTML and form beans for the presentation layer.
  • Developed Servlets and Java Server Pages (JSP).
  • Developed PL/SQL queries and wrote stored procedures and JDBC routines to generate reports based on client requirements.
  • Enhanced the system according to customer requirements.
  • Involved in the customization of the available functionalities of the software for an NBFC (Non-Banking Financial Company).
  • Involved in putting in place proper review processes and documentation for functionality development.
  • Provided support and guidance for production and implementation issues.
  • Used JavaScript validation in JSP.
  • Used the Hibernate framework to access data from the back-end SQL Server database.
  • Used AJAX (Asynchronous JavaScript and XML) to implement a user-friendly and efficient client interface.
  • Used MDBs for consuming messages from JMS queues/topics.
  • Designed and developed a web application using the Struts framework.
  • Used Ant to compile and generate EAR, WAR and JAR files.
  • Created test case scenarios for functional testing and wrote unit test cases with JUnit.
  • Responsible for integration, unit, system and stress testing in all phases of the project.

Environment: Java, J2EE, JSP 1.2, Performance Tuning, Spring 1.2, Hibernate 2.0, JSF 1.2, EJB 1.2, IBM WebSphere 6.0, Servlets, JDBC, XML, XSLT, DOM, CSS, HTML, DHTML, SQL, JavaScript, Log4j, Ant 1.6, WSAD 6.0, Oracle 9i, Windows 2000.
