Hadoop/Spark Developer Resume
TX
SUMMARY:
- 8+ years of professional experience in software development, including around 5 years developing Big Data applications using Hadoop and Spark.
- Strong expertise in the big data ecosystem, including Spark, Hive, Sqoop, HDFS, MapReduce, Kafka, Oozie, YARN, Pig, HBase, and Flume.
- Strong expertise in building scalable applications in multiple programming languages (Java, Scala, and Python).
- In-depth knowledge of distributed systems architecture and parallel computing.
- Experience implementing end-to-end data pipelines that serve reporting and data science capabilities.
- Experienced in working with Cloudera, Hortonworks, and Amazon EMR clusters.
- Experience fine-tuning Spark applications and Hive queries to improve overall pipeline performance.
- Developed production-ready Spark applications using the RDD API, DataFrames, Datasets, Spark SQL, and Spark Streaming.
- In-depth knowledge of importing and exporting data between databases and HDFS using Sqoop.
- Well versed in writing complex Hive queries using analytical functions.
- Experience writing custom Hive UDFs to support custom business requirements (see the sketch after this summary).
- Solid experience with various file formats, including CSV, TSV, Parquet, ORC, JSON, and Avro.
- Experience using compression codecs such as Gzip and Snappy within Hadoop.
- Strong knowledge of NoSQL databases; worked with HBase, Cassandra, and MongoDB.
- Experience using AWS services such as EMR, S3, EC2, Redshift, and Athena.
- Extensively used IDEs such as IntelliJ IDEA, NetBeans, and Eclipse.
- Proficient in RDBMS concepts with Oracle, MySQL, DB2, and Teradata, and experienced in writing SQL queries.
- Experience writing shell scripts and scheduling them with cron jobs.
- Experience working with Git repositories and the Jenkins and Maven build tools.
- Developed cross-platform applications using Java, JSP, Servlets, Hibernate, RESTful web services, JDBC, JavaScript, XML, and HTML.
- Used Log4j for runtime logging and performed system integration testing to ensure the quality of the system.
- Experience using SoapUI to validate web services.
- Expertise in writing unit test cases with the JUnit API.
- Experienced in using Selenium for testing.
- Highly self-motivated, with good technical, communication, and interpersonal skills; able to work reliably under pressure. Committed team player with strong analytical and problem-solving skills and the ability to quickly adapt to new environments and technologies.
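For illustration, a minimal sketch of the kind of custom Hive UDF mentioned above; the package, class name, and normalization rule are hypothetical rather than taken from a specific project:

```java
package com.example.udf; // hypothetical package

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Illustrative UDF: strips non-digit characters from a phone number.
// Registered in Hive with, e.g.:
//   ADD JAR normalize-phone.jar;
//   CREATE TEMPORARY FUNCTION normalize_phone AS 'com.example.udf.NormalizePhone';
public final class NormalizePhone extends UDF {
    public Text evaluate(final Text input) {
        if (input == null) {
            return null; // pass NULLs through, as Hive expects
        }
        return new Text(input.toString().replaceAll("[^0-9]", ""));
    }
}
```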
TECHNICAL SKILLS:
Big Data Ecosystem: Spark, Hive, Sqoop, HDFS, MapReduce, Kafka, Oozie, YARN, Pig, HBase, Flume
Databases: SQL Server, MySQL, Oracle, DB2, Teradata
NoSQL Databases: HBase, Cassandra, MongoDB
AWS Technologies: EMR, S3, Redshift, Athena
Programming Languages: Java, Scala, Python, SQL, Pig Latin, HiveQL, Shell Scripting
Web and Application Servers: WebLogic, WebSphere, JBoss, Tomcat
IDEs: Eclipse, IntelliJ IDEA, NetBeans
Operating Systems: UNIX, Linux, macOS, Windows
Web Technologies: HTML, XHTML, CSS, JavaScript, Ajax
Software Development Methodologies: Agile, Waterfall
WORK EXPERIENCE:
Confidential, TX
Hadoop/Spark Developer
Responsibilities:
- Created Sqoop scripts to import and export customer profile data between RDBMS and S3 buckets.
- Built custom input adapters to migrate clickstream data from FTP servers to S3.
- Developed enrichment applications in Spark using Scala to cleanse clickstream data and enrich it with customer profile lookups (see the sketch below).
- Troubleshot Spark applications to improve fault tolerance and reliability.
- Used the Spark DataFrame and Spark SQL APIs to implement batch processing jobs.
- Used Apache Kafka and Spark Streaming to ingest data from Adobe Live Stream REST API connections.
- Automated the creation and termination of AWS EMR clusters.
- Fine-tuned and improved the performance of various Spark applications and Hive scripts.
- Applied Spark concepts such as broadcast variables, caching, and dynamic allocation to design more scalable Spark applications.
Environment: AWS EMR, S3, Spark, Hive, Sqoop, Scala, Java, MySQL, Oracle DB, Athena, Redshift.
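A minimal sketch of the clickstream-enrichment pattern from this role, written here in Java (the project code was in Scala); the bucket names, paths, and join key are illustrative assumptions:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.broadcast;

public class ClickstreamEnrichment {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("ClickstreamEnrichment")
                .getOrCreate();

        // Raw clickstream events landed in S3 by the ingestion adapters.
        Dataset<Row> clicks = spark.read().json("s3://example-bucket/clickstream/raw/");

        // Customer profile data exported from the RDBMS via Sqoop.
        Dataset<Row> profiles = spark.read().parquet("s3://example-bucket/profiles/");

        // Broadcast the small profile lookup so the join avoids shuffling
        // the much larger clickstream dataset.
        Dataset<Row> enriched = clicks.join(broadcast(profiles), "customer_id");

        enriched.write()
                .partitionBy("event_date")
                .parquet("s3://example-bucket/clickstream/enriched/");

        spark.stop();
    }
}
```

Broadcasting the profile table is what makes the lookup scale: each executor gets a local copy of the small side, so no shuffle of the clickstream side is required.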
Confidential, PA
Hadoop Developer
Responsibilities:
- Worked extensively with Sqoop to migrate data from RDBMS to HDFS.
- Ingested data from various source systems, including Teradata, MySQL, and Oracle databases.
- Developed Spark applications to perform extract, transform, and load (ETL) work using Spark RDDs and DataFrames (see the sketch below).
- Created Hive external tables on top of data in HDFS and wrote ad-hoc Hive queries to analyze the data per business requirements.
- Utilized partitioning and bucketing in Hive to improve query processing times.
- Performed incremental data ingestion using Sqoop, since the existing application generates data on a daily basis.
- Migrated/re-implemented MapReduce jobs as Spark applications for better performance.
- Handled data in different file formats such as Avro and Parquet.
- Used the Cloudera Hadoop distribution extensively within the project.
- Used Git for maintaining and versioning the code.
- Created Oozie workflows to automate the data pipelines.
Environment: Cloudera (CDH 5.x), Spark, Scala, Sqoop, Oozie, Hive, HDFS, MySQL, Oracle DB, Teradata
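A minimal sketch of the DataFrame-based ETL path described above, assuming a hypothetical MySQL `orders` table and output layout (Sqoop would handle the bulk historical load; this shows the Spark side):

```java
import java.util.Properties;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class OrdersEtl {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("OrdersEtl")
                .getOrCreate();

        Properties props = new Properties();
        props.setProperty("user", "etl_user");
        props.setProperty("password", args[0]); // supplied at submit time

        // Extract: read the source table over JDBC.
        Dataset<Row> orders = spark.read()
                .jdbc("jdbc:mysql://dbhost:3306/sales", "orders", props);

        // Transform: keep completed orders only.
        Dataset<Row> completed = orders.filter("status = 'COMPLETED'");

        // Load: write partitioned Parquet to HDFS, so a Hive external table
        // partitioned on the same column can sit directly on top of it.
        completed.write()
                .partitionBy("order_date")
                .mode("append")
                .parquet("hdfs:///data/warehouse/orders_completed/");

        spark.stop();
    }
}
```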
Confidential, OH
Hadoop Developer
Responsibilities:
- Wrote complex MapReduce jobs to perform data cleansing and ETL-style processing on the data (see the sketch below).
- Worked with different file formats such as text, Avro, and Parquet in MapReduce programs.
- Developed Hive scripts to create partitioned tables and build various analytical datasets.
- Worked with cross-functional consulting teams within the data science and analytics team to design, develop, and execute solutions that derive business insights and solve clients' operational and strategic problems.
- Extensively used Hive queries to query data in Hive tables and loaded the results into HBase tables.
- Exported the processed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Used Hive partitioning and bucketing to increase the performance of query processing.
- Designed Oozie workflows for job scheduling and batch processing.
- Helped the analytics team by writing Pig and Hive scripts for further detailed analysis of the processed data.
Environment: Java, HDFS, MapReduce, Hive, Pig, MySQL, CDH, IntelliJ, YARN, Sqoop, HBase, UNIX shell scripting.
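A minimal sketch of a map-only cleansing step of the kind described above; the tab-delimited layout and field count are assumptions:

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Illustrative map-only cleansing job: drops malformed records and trims
// whitespace from each field before the data reaches Hive.
public class CleanseMapper
        extends Mapper<LongWritable, Text, Text, NullWritable> {

    private static final int EXPECTED_FIELDS = 5; // assumed record width

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\t", -1);
        if (fields.length != EXPECTED_FIELDS) {
            context.getCounter("cleanse", "malformed").increment(1);
            return; // skip malformed record, but count it for monitoring
        }
        StringBuilder cleaned = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) cleaned.append('\t');
            cleaned.append(fields[i].trim());
        }
        context.write(new Text(cleaned.toString()), NullWritable.get());
    }
}
```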
Confidential, PA
Java Developer
Responsibilities:
- Involved in the analysis, design, development, and testing phases of the application using the Agile Scrum methodology.
- Implemented an MVC-architecture application using Spring, JSP, and JavaBeans.
- Extensively used JavaScript, AngularJS, and Ajax to provide users with interactive, fast, and usable interfaces.
- Designed the front-end screens using JSP, HTML, CSS, and JSON.
- Created and maintained the configuration of the Spring IoC container.
- Developed business-layer and DAO classes and wired them together using the Spring Framework.
- Used Spring dependency injection to integrate the different layers of the application.
- Integrated Hibernate with Spring for the persistence layer.
- Used Spring AOP for cross-cutting concerns such as logging and exception handling (see the sketch below).
- Developed SOAP-based web services.
- Developed and deployed EJBs, including entity beans and session beans.
- Supported the applications through debugging, fixes, and maintenance releases.
- Involved in maintenance, changes to existing code, and support of the system.
- Performed configuration management using SVN.
- Wrote Jenkins and Maven scripts to automate building, testing, and deploying the system.
- Developed test cases using JUnit.
- Deployed the application on WebLogic Application Server.
- Created several exception classes to catch errors and logged the whole process using Log4j, which made it possible to pinpoint errors.
- Communicated with the offshore team to resolve production issues and deliver high-quality application enhancements to the client.
Environment: Java, J2EE, HTML, CSS, JavaScript, AngularJS, JSP, JSON, Ajax, Servlets, Spring, Spring MVC, Hibernate, SOAP, Jenkins, Maven, JUnit, SVN, WebLogic, Log4j
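A minimal sketch of the Spring AOP logging concern mentioned above; the pointcut package is a placeholder, and the aspect assumes AspectJ auto-proxying is enabled (e.g. via `<aop:aspectj-autoproxy/>`):

```java
import org.apache.log4j.Logger;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.springframework.stereotype.Component;

// Illustrative aspect: times every service-layer call and logs failures,
// without any changes to the business code itself.
@Aspect
@Component
public class LoggingAspect {

    private static final Logger LOG = Logger.getLogger(LoggingAspect.class);

    // "com.example.service" stands in for the real service-layer package.
    @Around("execution(* com.example.service..*(..))")
    public Object logInvocation(ProceedingJoinPoint joinPoint) throws Throwable {
        long start = System.currentTimeMillis();
        try {
            Object result = joinPoint.proceed();
            LOG.info(joinPoint.getSignature() + " completed in "
                    + (System.currentTimeMillis() - start) + " ms");
            return result;
        } catch (Throwable t) {
            LOG.error(joinPoint.getSignature() + " failed", t);
            throw t;
        }
    }
}
```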
Confidential
Junior Java Developer
Responsibilities:
- Involved in requirements gathering and analysis of the existing system; captured requirements using use cases and sequence diagrams.
- Designed web portals using HTML, JavaScript, AngularJS, and Ajax.
- Used Spring IoC for dependency injection and Spring AOP for cross-cutting concerns such as logging, security, and transaction management.
- Integrated Spring JDBC for the persistence layer (see the sketch below).
- Developed code in annotation-driven Spring IoC and core Java, with extensive use of the Collections framework and multithreading via the Executor framework, Callable, and Future.
- Developed DAO classes and wrote SQL for accessing data from the database.
- Used XML for data exchange and developed web services.
- Deployed the application to WebSphere Application Server.
- Implemented Ant build scripts to build JAR and WAR files and deployed the WAR files to target servers.
- Implemented test cases with JUnit.
- Used RAD for developing and debugging the application.
- Utilized Rational ClearCase for version control and code management.
- Coordinated with the QA team and participated in testing.
- Involved in bug fixing of the application.
Environment: HTML, JavaScript, AngularJS, Ajax, Spring, WebSphere, Ant, JUnit, RAD, ClearCase.
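A minimal sketch of a Spring JDBC DAO of the kind described above; the table and column names are placeholders:

```java
import java.util.List;
import javax.sql.DataSource;
import org.springframework.jdbc.core.JdbcTemplate;

// Illustrative DAO on Spring JDBC: JdbcTemplate handles connection
// acquisition, statement cleanup, and exception translation, so the DAO
// contains only the SQL and parameter binding.
public class CustomerDao {

    private final JdbcTemplate jdbcTemplate;

    public CustomerDao(DataSource dataSource) {
        this.jdbcTemplate = new JdbcTemplate(dataSource);
    }

    // Table and column names are hypothetical.
    public List<String> findNamesByCity(String city) {
        return jdbcTemplate.queryForList(
                "SELECT name FROM customers WHERE city = ?",
                String.class, city);
    }

    public int updateEmail(long customerId, String email) {
        return jdbcTemplate.update(
                "UPDATE customers SET email = ? WHERE id = ?",
                email, customerId);
    }
}
```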