Senior Hadoop/Spark Developer Resume

Denver, CO

SUMMARY

  • IT consultant with 10 years of experience in operations and in developing, maintaining, monitoring, and upgrading Hadoop clusters (Hortonworks distributions).
  • Extensive retail and telecom domain knowledge, with primary skills in merchandising, finance, product design and development, and supply chain management.
  • Good experience in translating clients' Big Data business requirements into Hadoop-centric solutions.
  • Hands-on experience in installing, configuring, and maintaining Apache Hadoop clusters for application development, and Hadoop tools such as Hive, Pig, Spark, Kafka, ZooKeeper, Hue, and Sqoop, using Hortonworks.
  • Hands-on experience in developing and deploying enterprise-based applications using major components of the Hadoop ecosystem such as Hadoop 2.x, YARN, Hive, Pig, MapReduce, Spark, Kafka, Storm, Oozie, HBase, Flume, Sqoop, and ZooKeeper.
  • Experience in handling large datasets using partitions, Spark in-memory capabilities, Spark broadcasts, efficient joins, and other transformations during the ingestion process itself.
  • Experience in converting Hive/SQL queries into Spark transformations using Java. Experience in ETL development using Kafka, Flume, and Sqoop.
  • Built large-scale data processing pipelines and data storage platforms using open-source big data technologies.
  • Experience in installation, configuration, management and deployment of Big Data solutions and the underlying infrastructure of Hadoop Cluster.
  • Experience in installing and configuring Hive, its services, and the Metastore. Exposure to Hive Query Language (HiveQL), including table operations such as importing data and altering and dropping tables.
  • Experience in installing and running Pig, including its execution types, Grunt, and Pig Latin editors. Good knowledge of loading, storing, filtering, combining, and splitting data.
  • Experience in tuning and debugging running Spark applications.
  • Experience in integrating Kafka with Spark for real-time data processing.
  • Involved in all phases of the Software Development Life Cycle (SDLC); worked on all activities related to the operation, implementation, administration, and support of ETL processes for large-scale data warehouses.
  • In-depth knowledge of database imports; worked with imported data to populate tables in Hive. Exposure to exporting data from relational databases to the Hadoop Distributed File System (HDFS).
  • Experience in setting up high-availability Hadoop clusters.
  • Good knowledge of Hadoop cluster planning, including choosing the distribution, selecting hardware for both master and slave nodes, and cluster sizing.
  • Experience in developing shell scripts for system management.
  • Experience in Hadoop administration, with good knowledge of Hadoop features such as safe mode and auditing.
  • Responsible for writing J2EE-compliant code using Java for an application development effort.
  • Implemented Java and J2EE design patterns such as Business Delegate, Data Transfer Object (DTO), Data Access Object (DAO), and Service Locator.
  • Experience with software development processes and models: Agile, Waterfall, and Scrum.
  • Good knowledge of sprint planning tools such as Rally and Jira, and of version control tools such as GitHub.
  • Team player and fast learner with good analytical and problem-solving skills.
  • Self-starter, able to work independently as well as in a team.
  • Experience in UNIX shell scripting, with a good understanding of OOP and data structures.

TECHNICAL SKILLS

Operating Systems: MS-DOS, Windows 95/98/NT/2000/XP, Windows 7, z/OS, UNIX

Project Management Tools: MS Project, Unified Modeling Language (UML), Rational Unified Process (RUP), Software Development Life Cycle (SDLC), Agile (Scrum), Kanban

Process/Model Tools: Rational Rose, MS Visio, Rally, Jira

Hadoop/Big Data Technologies: HDFS, Spark, Scala, Hive, Pig, Sqoop, Flume, Java, Kafka, Gobblin

Languages: JCL, REXX, EXTRIEVE, SQL, COBOL, JDK 1.8, Java/J2EE, JDBC

Database: DB2, MS Access, Oracle 9i, HBase

Database Tools: IBM DB2 Connect, TOAD, SQL Developer

Testing Strategies: System Integration Testing, Regression and System Testing

Testing Tools: HP Quality Center

Office Tools: MS Word, MS Excel, MS PowerPoint, MS Access, MS Project

Web related: HTML, XML, VBScript, and JavaScript

Others: Tandem (OutSide Overview), TotalSystem (TSYS)

PROFESSIONAL EXPERIENCE

Confidential - Denver, CO

Senior Hadoop/Spark developer

Responsibilities:

  • Experience in deploying Hadoop clusters on public and private cloud environments such as Amazon AWS, Rackspace, and OpenStack.
  • Experience in the Spark (RDDs, DataFrames, and Spark SQL) and Hadoop (MapReduce) ecosystems, with Scala as the underlying programming language.
  • Experienced in handling large datasets using partitions, Spark in-memory capabilities, Spark broadcasts, efficient joins, and other transformations during the ingestion process itself.
  • Worked on Spark SQL and DataFrames for faster execution of Hive queries using the Spark SQLContext.
  • Analyzed the client's existing Hadoop infrastructure, identified performance bottlenecks, and tuned performance accordingly.
  • Defined job flows in the Hadoop environment using tools such as Oozie and UC4 for data scrubbing and processing.
  • Loaded logs from multiple sources directly into HDFS using tools such as Flume.
  • Strong knowledge of administration and development with Hive and Pig, using HiveQL and Pig Latin scripts respectively.
  • Used Hive and Pig to analyze data in HDFS to identify issues and behavioral patterns.
  • Worked with Sqoop to import and export data between databases such as MySQL and Oracle and HDFS and Hive.
  • Effectively used Oozie to develop automated workflows of Sqoop, MapReduce, and Hive jobs.
  • Scheduled jobs in UC4 as per the deployments and set up workflows in UC4.
  • Troubleshot and monitored the cluster.
  • Worked on Hive queries from the Hue environment.
  • Created Hive tables and was involved in loading data and writing Hive queries.
  • Monitored user jobs from the ResourceManager and optimized long-running jobs.
  • Worked with Toad for Oracle 11.6 for data ingestion.
  • Created Kafka topics, granted ACLs to users, and set up REST mirror and MirrorMaker to transfer data between two Kafka clusters.
  • Helped users connect to Kerberized Hive from SQL Workbench and BI tools.
  • Wrote scripts for disk monitoring and log compression.
  • Handled data imports from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS.
  • Wrote scripts to automate processes such as taking periodic backups and setting up user batch jobs.
  • Deployed multi-module applications with build tools such as Maven and integrated them with continuous integration servers such as Jenkins.
  • Developed test cases using JUnit and configured Git to maintain the project repository.

Environment: Hadoop 2.6.0, HDFS, MapReduce, Spark Core, Spark SQL, Scala, Pig 0.14, Hive 1.2.1, Sqoop 1.4.4, Flume 1.6.0, Kafka, Gobblin, Knox 0.6.0, Ambari 2.4.1, Storm 0.9.3, JDK 1.8, Java/J2EE, JDBC, JUnit 4, Maven 2.0, Databricks.

Confidential - Denver, CO

Senior Hadoop Developer

Responsibilities:

  • Coordinated with business customers to gather business requirements, interacted with technical peers to derive technical requirements, and delivered the BRD and TDD documents.
  • Extensively involved in the design phase and delivered design documents.
  • Analyzed the Hadoop cluster and various Big Data analytics tools, including Pig, Hive, the HBase database, and Sqoop.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Migrated large volumes of data from different databases (e.g., Oracle, SQL Server) to Hadoop.
  • Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Experienced in defining job flows.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Experienced in managing and reviewing the Hadoop log files.
  • Loaded and transformed large sets of structured and semi-structured data.
  • Responsible for managing data coming from different sources.
  • Involved in creating Hive tables, loading data, and writing Hive queries.
  • Utilized the Apache Hadoop environment provided by Hortonworks.
  • Created the data model for Hive tables.
  • Exported data from the HDFS environment into RDBMS using Sqoop for report generation and visualization purposes.
  • Wrote helper classes using the Java Collections Framework.
  • Wrote JUnit test cases for the classes developed.
  • Worked on the Oozie workflow engine for job scheduling.
  • Performed unit testing of newly developed components using JUnit.
  • Involved in setting up the automation environment using Eclipse, Java, the Selenium WebDriver Java language bindings, and TestNG JARs.
  • Involved in unit testing and delivered unit test plans and results documents.

Environment: Hadoop 2.x, HDFS, MapReduce, Pig 0.12.1, Hive 0.13.1, Sqoop 1.4.4, Flume 1.6.0, Unix, JDK 1.8, Java/J2EE, JDBC, JUnit, JSON, Maven 2.0

Confidential - Minneapolis, MN

Hadoop Developer

Responsibilities:

  • Involved in creating Hive tables and in loading and analyzing data using Hive queries.
  • Developed and executed custom MapReduce programs, Pig Latin scripts, and HQL queries.
  • Implemented Java and J2EE design patterns such as Business Delegate, Data Transfer Object (DTO), Data Access Object (DAO), and Service Locator.
  • Worked on importing the data from different databases into Hive Partitions directly using Sqoop.
  • Performed data analytics in Hive and then exported the metrics to RDBMS using Sqoop.
  • Involved in running Hadoop jobs for processing millions of records of text data.
  • Extensively used Pig for data cleaning and optimization.
  • Implemented complex MapReduce programs to perform map-side joins using the distributed cache.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Thoroughly tested MapReduce programs using the MRUnit and JUnit testing frameworks.
  • Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
  • Extracted tables from MS SQL Server through Sqoop, placed them in HDFS, and processed the records.
  • Used Flume to collect and aggregate weblog data from different sources and push it to HDFS.
  • Deployed multi-module applications with build tools such as Maven and integrated them with continuous integration servers such as Jenkins.

Environment: Hadoop 1.x, HDFS, MapReduce, Pig 0.11, Hive 0.10, Sqoop, Unix, JDK 1.8, Java/J2EE, JDBC, JUnit, JSON, Maven 2.0

Confidential

Java Developer

Responsibilities:

  • Involved in various phases of the Software Development Life Cycle (SDLC), such as design, development, and unit testing.
  • Developed and deployed UI layer logic of sites using JSP, XML, JavaScript, HTML/DHTML, and Ajax.
  • Prepared design specifications for application development, covering the front end and back end, using design patterns.
  • Developed prototype test screens in HTML and JavaScript.
  • Involved in developing JSPs for client data presentation and client-side data validation within the forms.
  • Developed the application by using the Spring MVC framework.
  • Used the Java Collections Framework to transfer objects between the different layers of the application.
  • Used Spring IoC to inject values for dynamic parameters.
  • Actively involved in code review and bug fixing to improve performance.
  • Documented the application's functionality and enhanced features.
  • Created connections through JDBC and used JDBC statements to call stored procedures.
  • Created UML diagrams like use cases, class diagrams, interaction diagrams, and activity diagrams.
  • Worked extensively on the user interface for a few modules using JSPs, JavaScript, and Ajax.
  • Wrote complex SQL queries and stored procedures.
  • Developed the XML Schema and Web services for the data maintenance and structures.
  • Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for PostgreSQL database.
  • Used Hibernate ORM framework with Spring framework for data persistence and transaction management.
  • Used the Struts validation framework for form-level validation.
  • Wrote test cases in JUnit for unit testing of classes.
  • Involved in creating templates and screens in HTML and JavaScript.

Environment: Core Java, Eclipse, Java SDK 1.6, XML, JavaScript, HTML/DHTML

Confidential

Developer

Responsibilities:

  • Analyzed abended jobs in CA7 and followed the appropriate recovery procedure for all incidents.
  • Involved in synchronizing primary Finance application with production.
  • Processed testing team requests and ad hoc requests on the host.
  • Involved in root cause analysis and permanent fixes for critical abends.
  • Involved in casting of elements and implementation.
  • Involved in loading and unloading tables based on requests from the testing team.
  • Involved in casting the package to move element changes into the Integration region.
  • Involved in setting up the stream for testing.
  • Mapped the NUID to the testing team members.
  • Involved in reviewing the Test Criteria Form received from the testing team before element implementation.
  • Wrote complex SQL queries and stored procedures.
  • Developed the XML Schema and Web services for the data maintenance and structures.

Environment: COBOL, JCL, DB2, VSAM, Core Java, Eclipse, Java SDK 1.6, XML
