
Hadoop/Spark Developer Resume


TX

SUMMARY:

  • 8+ years of professional experience in software development, including around 5 years developing Big Data applications using Hadoop and Spark.
  • Strong expertise in big data ecosystem tools such as Spark, Hive, Sqoop, HDFS, MapReduce, Kafka, Oozie, YARN, Pig, HBase, and Flume.
  • Strong expertise in building scalable applications using various programming languages (Java, Scala, and Python).
  • In-depth knowledge of distributed systems architecture and parallel computing.
  • Experience implementing end-to-end data pipelines that serve reporting and data science capabilities.
  • Experienced in working with Cloudera, Hortonworks, and Amazon EMR clusters.
  • Experience fine-tuning applications written in Spark and Hive to improve the overall performance of pipelines.
  • Developed production-ready Spark applications using the Spark RDD API, DataFrames, Datasets, Spark SQL, and Spark Streaming.
  • In-depth knowledge of importing and exporting data between databases and HDFS using Sqoop.
  • Well versed in writing complex Hive queries using analytical functions.
  • Experienced in writing custom UDFs in Hive to support custom business requirements (see the sketch after this list).
  • Solid experience using various file formats such as CSV, TSV, Parquet, ORC, JSON, and Avro.
  • Experience using compression codecs such as Gzip and Snappy within Hadoop.
  • Strong knowledge of NoSQL databases; worked with HBase, Cassandra, and MongoDB.
  • Experience using cloud services such as Amazon EMR, S3, EC2, Redshift, and Athena.
  • Extensively used IDEs such as IntelliJ, NetBeans, and Eclipse.
  • Proficient in RDBMS concepts with Oracle, MySQL, DB2, and Teradata, and experienced in writing SQL queries.
  • Experienced in writing shell scripts and scheduling them with cron jobs.
  • Experience working with Git repositories and the Jenkins and Maven build tools.
  • Developed cross-platform applications using Java, JSP, Servlets, Hibernate, RESTful services, JDBC, JavaScript, XML, and HTML.
  • Used Log4j to enable runtime logging and performed system integration testing to ensure the quality of the system.
  • Experience using the SoapUI tool to validate web services.
  • Expertise in writing unit test cases using the JUnit API.
  • Experienced in using Selenium for testing.
  • Highly self-motivated, with strong technical, communication, and interpersonal skills; able to work reliably under pressure; a committed team player with strong analytical and problem-solving skills and the ability to adapt quickly to new environments and technologies.
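A custom Hive UDF of the kind referenced above is a small JVM class with an evaluate method, here written against Hive's simple UDF interface. This is a minimal, hypothetical sketch: the class name, function name, and masking rule are illustrative only, not from an actual project.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF: masks all but the last four characters of a field
// (e.g. an account number) to satisfy a data-privacy requirement.
public final class MaskUdf extends UDF {
    public Text evaluate(final Text input) {
        if (input == null) {
            return null; // Hive passes NULLs through
        }
        String s = input.toString();
        if (s.length() <= 4) {
            return input; // nothing to mask
        }
        String masked = s.substring(0, s.length() - 4).replaceAll(".", "*")
                + s.substring(s.length() - 4);
        return new Text(masked);
    }
}

// Registered and used from Hive (illustrative):
//   ADD JAR mask-udf.jar;
//   CREATE TEMPORARY FUNCTION mask_acct AS 'MaskUdf';
//   SELECT mask_acct(account_no) FROM accounts;
```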

TECHNICAL SKILLS:

Big Data Ecosystem: Spark, Hive, Sqoop, HDFS, MapReduce, Kafka, Oozie, YARN, Pig, HBase, Flume

Databases: SQL Server, MySQL, Oracle, DB2, Teradata

NoSQL Databases: HBase, Cassandra, MongoDB

AWS Technologies: EMR, S3, Redshift, Athena

Programming Languages: Java, Scala, Python, SQL, Pig Latin, HiveQL, Shell Scripting

Web and Application Servers: WebLogic, WebSphere, JBoss, Tomcat

IDEs: Eclipse, IntelliJ, NetBeans

Operating Systems: UNIX, Linux, Windows, Mac

Web Technologies: HTML, XHTML, CSS, JavaScript, Ajax

Software development Methodologies: Agile Model, Waterfall Model

WORK EXPERIENCE:

Confidential, TX

Hadoop/Spark Developer

Responsibilities:

  • Created Sqoop scripts to import and export customer profile data between RDBMS and S3 buckets.
  • Built custom input adapters to migrate clickstream data from FTP servers to S3.
  • Developed enrichment applications in Spark using Scala to cleanse clickstream data and enrich it with customer profile lookups (see the first sketch after this list).
  • Troubleshot Spark applications to improve fault tolerance and reliability.
  • Used the Spark DataFrame and Spark SQL APIs to implement batch processing jobs.
  • Used Apache Kafka and Spark Streaming to ingest data from Adobe Live Stream REST API connections (see the second sketch after this list).
  • Automated the creation and termination of AWS EMR clusters.
  • Worked on fine-tuning and performance enhancement of various Spark applications and Hive scripts.
  • Used Spark features such as broadcast variables, caching, and dynamic allocation to design more scalable applications.
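A minimal sketch of the broadcast-style enrichment described above. It uses Spark's Java API for consistency with the other examples in this document (the original applications were written in Scala); the S3 paths and the customer_id join key are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.broadcast;

public final class ClickstreamEnrichment {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("ClickstreamEnrichment")
                .getOrCreate();

        // Large fact data (clickstream) and a small lookup (profiles).
        Dataset<Row> clicks = spark.read().json("s3://example-bucket/clickstream/");
        Dataset<Row> profiles = spark.read().parquet("s3://example-bucket/profiles/");

        // Broadcasting the small profile table ships it to every executor,
        // so the large clickstream dataset never has to be shuffled.
        Dataset<Row> enriched = clicks
                .filter(clicks.col("customer_id").isNotNull()) // basic cleansing
                .join(broadcast(profiles), "customer_id");

        enriched.write().mode("overwrite").parquet("s3://example-bucket/enriched/");
        spark.stop();
    }
}
```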
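And a sketch of the Kafka ingestion path, using Spark Streaming's direct Kafka approach (the spark-streaming-kafka-0-10 integration). The broker address, topic, consumer group, and output path are hypothetical; the real pipeline fed from the Adobe Live Stream connection mentioned above.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public final class ClickstreamStream {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("ClickstreamStream");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092"); // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "clickstream-consumers");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Collections.singletonList("clickstream"), kafkaParams));

        // Extract the message payload and land each micro-batch on S3.
        stream.map(ConsumerRecord::value)
              .foreachRDD(rdd ->
                      rdd.saveAsTextFile("s3://example-bucket/raw/" + System.currentTimeMillis()));

        jssc.start();
        jssc.awaitTermination();
    }
}
```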

Environment: AWS EMR, S3, Spark, Hive, Sqoop, Scala, Java, MySQL, Oracle DB, Athena, Redshift.

Confidential, PA

Hadoop Developer

Responsibilities:

  • Worked extensively with Sqoop to migrate data from RDBMS to HDFS.
  • Ingested data from source systems such as Teradata, MySQL, and Oracle databases.
  • Developed Spark applications to perform extract, transform, and load (ETL) operations using Spark RDDs and DataFrames.
  • Created Hive external tables on top of data in HDFS and wrote ad-hoc Hive queries to analyze the data based on business requirements (see the sketch after this list).
  • Utilized partitioning and bucketing in Hive to improve query processing times.
  • Performed incremental data ingestion using Sqoop, since the existing application generates data daily.
  • Re-implemented MapReduce jobs as Spark applications for better performance.
  • Handled data in different file formats such as Avro and Parquet.
  • Used the Cloudera Hadoop distribution extensively within the project.
  • Used Git for maintaining and versioning the code.
  • Created Oozie workflows to automate the data pipelines.
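The external-table DDL referenced above is ordinary HiveQL; the sketch below runs it through a Hive-enabled SparkSession only to keep the example self-contained in Java. Table, column, partition, and path names are hypothetical. Bucketing, also mentioned above, is declared in Hive itself with a CLUSTERED BY ... INTO n BUCKETS clause.

```java
import org.apache.spark.sql.SparkSession;

public final class OrdersTable {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("OrdersTable")
                .enableHiveSupport() // talk to the Hive metastore
                .getOrCreate();

        // External table: dropping it leaves the HDFS files in place.
        // Partitioning by load_date lets queries prune whole directories.
        spark.sql("CREATE EXTERNAL TABLE IF NOT EXISTS orders ("
                + " order_id BIGINT, customer_id BIGINT, amount DOUBLE)"
                + " PARTITIONED BY (load_date STRING)"
                + " STORED AS PARQUET"
                + " LOCATION '/data/orders'");

        // Register a newly landed partition, then run an ad-hoc query.
        spark.sql("ALTER TABLE orders ADD IF NOT EXISTS PARTITION (load_date='2016-01-01')");
        spark.sql("SELECT load_date, COUNT(*) AS cnt FROM orders GROUP BY load_date").show();

        spark.stop();
    }
}
```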

Environment: Cloudera (CDH 5.x), Spark, Scala, Sqoop, Oozie, Hive, HDFS, MySQL, Oracle DB, Teradata

Confidential, OH

Hadoop Developer

Responsibilities:

  • Wrote complex MapReduce jobs to perform data cleansing and ETL-style processing (see the sketch after this list).
  • Worked with file formats such as text, Avro, and Parquet in MapReduce programs.
  • Developed Hive scripts to create partitioned tables and build various analytical datasets.
  • Worked with cross-functional consulting teams within the data science and analytics group to design, develop, and execute solutions that derive business insights and solve clients' operational and strategic problems.
  • Extensively used Hive queries to analyze data in Hive tables and loaded data into HBase tables.
  • Exported the processed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Used Hive partitioning and bucketing to increase the performance of query processing.
  • Designed Oozie workflows for job scheduling and batch processing.
  • Helped the analytics team by writing Pig and Hive scripts for further detailed analysis of the processed data.
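A minimal sketch of a cleansing job of the kind described above, as a map-only MapReduce program in Java. The delimiter and expected field count are hypothetical; malformed rows are simply dropped.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public final class CleanseJob {

    public static final class CleanseMapper
            extends Mapper<LongWritable, Text, Text, NullWritable> {
        private static final int EXPECTED_FIELDS = 12; // illustrative

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Keep only well-formed pipe-delimited rows; drop the rest.
            if (value.toString().split("\\|", -1).length == EXPECTED_FIELDS) {
                context.write(value, NullWritable.get());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "cleanse");
        job.setJarByClass(CleanseJob.class);
        job.setMapperClass(CleanseMapper.class);
        job.setNumReduceTasks(0); // map-only: no shuffle needed
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```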

Environment: Java, HDFS, MapReduce, Hive, Pig, MySQL, CDH, IntelliJ, YARN, Sqoop, HBase, Unix Shell Scripting.

Confidential, PA

Java Developer

Responsibilities:

  • Involved in the analysis, design, development, and testing phases of the application using the Agile Scrum methodology.
  • Implemented an MVC-architecture application using Spring, JSP, and JavaBeans.
  • Used JavaScript, AngularJS, and Ajax extensively to provide users with interactive, fast, and usable interfaces.
  • Designed the front-end screens using JSP, HTML, CSS, and JSON.
  • Created and maintained the configuration of the Spring IoC container.
  • Developed business-layer and DAO classes and wired them together using the Spring Framework (see the sketch after this list).
  • Integrated Spring dependency injection among the different layers of the application.
  • Integrated Hibernate with Spring for the persistence layer.
  • Used Spring AOP for cross-cutting concerns such as logging and exception handling.
  • Developed SOAP-based web services.
  • Developed and deployed EJBs such as entity beans and session beans.
  • Supported the applications through debugging, fixes, and maintenance releases.
  • Involved in maintenance, changes to existing code, and support of the system.
  • Performed configuration management using SVN.
  • Wrote Jenkins and Maven scripts to automate building, testing, and deploying the system.
  • Developed test cases using JUnit.
  • Deployed the application on WebLogic Application Server.
  • Created several exception classes to catch errors and logged the whole process using Log4j, making it possible to pinpoint errors.
  • Communicated with the offshore team to resolve production issues and deliver high-quality application enhancements to the client.
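A minimal sketch of the Spring IoC wiring described above, shown with annotation-driven configuration for brevity (the actual project may have used XML bean definitions). All class and method names are hypothetical.

```java
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

interface CustomerDao {
    String findName(long id);
}

class JdbcCustomerDao implements CustomerDao {
    public String findName(long id) {
        return "customer-" + id; // stand-in for a real JDBC lookup
    }
}

class CustomerService {
    private final CustomerDao dao;

    CustomerService(CustomerDao dao) {
        this.dao = dao; // dependency supplied by the Spring container
    }

    String greet(long id) {
        return "Hello, " + dao.findName(id);
    }
}

@Configuration
class AppConfig {
    @Bean
    CustomerDao customerDao() {
        return new JdbcCustomerDao();
    }

    @Bean
    CustomerService customerService(CustomerDao dao) {
        return new CustomerService(dao); // Spring injects the DAO bean
    }
}

public final class Main {
    public static void main(String[] args) {
        ApplicationContext ctx = new AnnotationConfigApplicationContext(AppConfig.class);
        System.out.println(ctx.getBean(CustomerService.class).greet(42L));
    }
}
```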

Environment: Java, J2EE, HTML, CSS, JavaScript, AngularJS, JSP, JSON, AJAX, Servlets, Spring, Spring MVC, Hibernate, SOAP, Jenkins, Maven, JUnit, SVN, WebLogic, Log4j

Confidential

Junior Java Developer

Responsibilities:

  • Involved in requirements gathering and analysis of the existing system; captured requirements using use cases and sequence diagrams.
  • Designed web portals using HTML, JavaScript, AngularJS, and Ajax.
  • Used Spring IoC for dependency injection and Spring AOP for cross-cutting concerns such as logging, security, and transaction management.
  • Integrated Spring JDBC for the persistence layer.
  • Developed code with annotation-driven Spring IoC and core Java, making extensive use of the collections framework and of multithreading via the Executor framework, Callable, and Future (see the sketch after this list).
  • Developed DAO classes and wrote SQL for accessing data in the database.
  • Used XML for data exchange and developed web services.
  • Deployed the application to WebSphere Application Server.
  • Implemented Ant build scripts to build JAR and WAR files and deployed the WAR files to target servers.
  • Implemented test cases with JUnit.
  • Used RAD for developing and debugging the application.
  • Used Rational ClearCase as the version control system and for code management.
  • Coordinated with the QA team and participated in testing.
  • Involved in bug fixing of the application.
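A minimal sketch of the Executor-framework pattern mentioned above: submit independent tasks as Callables to a thread pool and collect their results through Futures. The task body and pool size are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public final class ParallelLookup {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < 10; i++) {
                final int id = i;
                // Each Callable stands in for an independent unit of work,
                // e.g. a per-record database lookup.
                Callable<String> task = () -> "result-" + id;
                futures.add(pool.submit(task));
            }
            for (Future<String> f : futures) {
                System.out.println(f.get()); // blocks until that task completes
            }
        } finally {
            pool.shutdown(); // always release the pool's threads
        }
    }
}
```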

Environment: HTML, JavaScript, AngularJS, Ajax, Spring, WebSphere, Ant, JUnit, RAD, ClearCase.
