Sr Hadoop Developer Resume

Bentwood, TN

SUMMARY

  • 8 years of overall IT experience, including Java development, Big Data technologies, and web applications in multi-tiered environments using Hadoop, Spark, Hive, HBase, Pig, Sqoop, J2EE (Spring, JSP, Servlets), JDBC, HTML, CSS, and JavaScript (AngularJS).
  • 3 years of comprehensive experience in Big Data Analytics using Hortonworks and its ecosystem components.
  • Working knowledge of the AWS environment, including Spark on AWS, and strong experience with cloud computing platforms such as AWS.
  • Hands on experience in programming languages like C, C++ to create new applications.
  • C++ developer with experience in object-oriented analysis and design (OOAD)
  • Experience in LINUX IDE for C/C++, UNIX Shell Scripting.
  • Extensive experience in Hadoop Architecture and various components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and Map Reduce concepts.
  • Experience in NoSQL databases including HBase.
  • Experience in building large scale highly available Web Applications. Working knowledge of web services and other integration patterns.
  • Developed simple to complex MapReduce streaming jobs in Java.
  • Used Spark Streaming APIs to perform transformations and actions on the fly to build a common learner data model that receives data from Kafka in near real time and persists it to HBase.
  • Developed Hive scripts for end user / analyst requirements to perform ad hoc analysis.
  • Experience in managing and reviewing Hadoop log files.
  • Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
  • Configured Kafka to handle real time data, to read and write messages from external programs.
  • Developed end to end data processing pipelines that begin with receiving data using distributed messaging systems Kafka through persistence of data into HBase.
  • Hands on experience in RDBMS, and Linux shell scripting.
  • Developed UDFs, UDAFs, and UDTFs and used them in Hive queries.
  • Experience in analyzing data using HiveQL, Pig Latin and Map Reduce.
  • Developed Map Reduce jobs to automate transfer of data from HBase.
  • Knowledge in job work-flow scheduling and monitoring tools like Oozie and Zookeeper.
  • Experienced in Oracle Database Design and ETL with Informatica.
  • Good knowledge of building RESTful APIs.
  • Experience writing procedures, functions, packages, views, materialized views, function-based indexes, triggers, and dynamic SQL, and producing ad hoc reports using SQL.
  • Experience with Business Intelligence (DW) applications.
  • Knowledge of NoSQL databases such as HBase and Cassandra.
  • Experience in setting up HIVE, PIG, HBASE, and SQOOP on Ubuntu Operating system.
  • Excellent Java development skills using J2EE, Spring, J2SE, Servlets, JUnit, JSP, and JDBC.
  • Excellent global exposure to various work cultures and client interaction with diverse teams
  • Practical understanding of the Data modeling concepts like star-Schema Modeling, Snowflake Schema Modeling, Fact and Dimension tables.
  • Collaborate with data architects for data model management and version control and conduct data model reviews with project team members and create data objects (DDL).

TECHNICAL SKILLS

Programming Languages: Java, Scala, Python, Unix Shell Scripting, PL/SQL

J2EE Technologies: Spring, Servlets, JSP, JDBC, Hibernate.

Big Data Ecosystem: HDFS, HBase, MapReduce, Hive, Pig, Spark, Kafka, Storm, Sqoop, Impala, Cassandra, Oozie, Zookeeper, Flume.

DBMS: Oracle 11g, SQL Server, MySQL, IBM DB2.

Modeling Tools: UML on Rational Rose 4.0

Web Technologies: HTML5, JavaScript, XML, jQuery, Ajax, CSS3.

Web Services: Restful, SOAP.

IDEs: Eclipse, NetBeans, WinSCP, Visual Studio and IntelliJ.

Operating systems: Windows, UNIX, Linux (Ubuntu), Solaris, CentOS.

Version and Source Control: CVS, SVN and IBM Rational ClearCase.

Servers: Apache Tomcat, WebLogic, WebSphere, Solr.

Frameworks: MVC, Spring, Struts, Log4J, JUnit, Maven, Ant.

PROFESSIONAL EXPERIENCE

Confidential, Bentwood, TN

Sr Hadoop Developer

Responsibilities:

  • Building a Data Quality framework, which consists of a common set of model components and patterns that can be extended to implement complex process controls and data quality measurements using Hadoop.
  • Worked with Solr to develop a search engine over unstructured data in HDFS.
  • Used Solr indexing to enable searching on non-primary-key columns in Cassandra keyspaces.
  • Created and populated bucketed tables in Hive to allow for faster map side joins and for more efficient jobs and more efficient sampling. Also performed partitioning of data to optimize Hive queries.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data
  • Implemented DDL Curated Data Store logic using Spark Scala and Data frames concepts.
  • Used Spark and Hive to implement the transformations needed to join the daily ingested data to historic data.
  • Enhanced the performance of queries and daily running spark jobs using the efficient design of partitioned hive tables and Spark logic.
  • Implemented the Spark Scala code for Data Validation in Hive
  • Worked extensively with importing metadata into Hive and migrated existing tables and applications to work on Hive and Spark.
  • Implemented the automated workflows for all the jobs using the Oozie and shell script.
  • Used Spark SQL functions to move data from stage hive tables to fact and dimension tables in
  • Implemented dynamic partitioning in hive tables and used appropriate file format, compression technique to improve the performance of map reduce jobs.
  • Good understanding of ETL tools and how they can be applied in a Big Data environment.
  • Work with Data Engineering Platform team to plan and deploy new Hadoop Environments and expand existing Hadoop clusters.
  • Collaborate with BI teams to create reporting data structures.
  • Capable of using AWS utilities such as EMR, S3, and CloudWatch to run and monitor Hadoop and Spark jobs on AWS.
  • Involved in Agile methodologies, daily Scrum meetings, Sprint planning.
  • Experienced in Cloud Services such as AWS EC2, EMR, RDS, S3 to assist with big data tools, solve the data storage issue and work on deployment solution.

Environment: Spark, Scala, Hadoop, Hive, Sqoop, Oozie, Design Patterns, SOLID & DRY principles, SFTP, Code Cloud, Jira, Bash.

Confidential, Dallas, TX

Sr Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • The project ingests data generated by sensors that track car activity; Kafka collects the data from online aggregators into HDFS.
  • Created Kafka producers and consumers for Spark Streaming, which receives data from the patients' different learning systems.
  • Spark Streaming collects this data from Kafka in near-real-time and performs necessary transformations and aggregation on the fly to build the common learner data model.
  • Used Spark Streaming to divide streaming data into batches as an input to Spark engine for batch processing.
  • Experience in AWS to spin up the EMR cluster to process the huge data which is stored in S3 and push it to HDFS. Implemented automation and related integration technologies.
  • Used Spark SQL to read Hive tables into Spark for faster data processing.
  • Involved in Converting Hive/SQL queries into Spark transformations using Spark RDD, Scala.
  • Used Apache Oozie for scheduling and managing the Hadoop Jobs. Extensive experience with Amazon Web Services (AWS).
  • Developed and updated social media analytics dashboards on regular basis.
  • Worked with Apache NiFi for data ingestion. Triggered and scheduled shell scripts using NiFi.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Worked on migrating Pig scripts and MapReduce programs to the Spark DataFrames API and Spark SQL to improve performance.
  • Involved in moving all log files generated from various sources to HDFS through Flume for further processing, and processed the files using Piggybank.
  • Used Flume to collect, aggregate and store the web log data from different sources like web servers, mobile and network devices and pushed into HDFS. Used Flume to stream through the log data from various sources.
  • Used the Avro file format compressed with Snappy in intermediate tables for faster processing of data. Used the Parquet file format for published tables and created views on the tables.
  • Created Sentry policy files to give business users access to the required databases and tables from Impala in the dev, test, and prod environments.

Environment: Hadoop, MapReduce, Cloudera, Spark, Kafka, HDFS, Hive, Pig, Oozie, Scala, Eclipse, Flume, Kinesis, Oracle, UNIX Shell Scripting.

Confidential - Auburn Hills, MI

Hadoop Developer

Responsibilities:

  • Hands on experience in loading data from UNIX file system to HDFS. Also performed parallel transfer of data from landing zone to the HDFS file system using DistCp.
  • Experienced in loading and transforming large sets of structured and semi-structured data from HDFS through Sqoop and placing them in HDFS for further processing.
  • Designed appropriate partitioning/bucketing schema to allow faster data retrieval during analysis using HIVE.
  • Involved in processing the data in the Hive tables using high-performance, low-latency HQL queries.
  • Exported the analyzed data from HDFS to relational databases using Sqoop, enabling the BI team to visualize analytics.
  • Developed custom aggregate functions using Spark SQL and performed interactive querying.
  • Managing and scheduling Jobs on a Hadoop cluster using Airflow DAG.
  • Involved in creating Hive tables, loading data, and running Hive queries on that data.
  • Extensive working knowledge of partitioned table, UDFs, performance tuning, compression related properties in Hive.
  • Work with Data Engineering Platform team to plan and deploy new Hadoop Environments and expand existing Hadoop clusters.
  • Deploy Informatica objects in production repository.
  • Monitor and debug Informatica components in case of failure or performance issues.

Environment: Hadoop technologies (Spark, Hive, Impala, Sqoop), Informatica 9.1, Oracle, Autosys, Unix

Confidential, San Francisco, California

Java Developer

Responsibilities:

  • Responsible to analyze functional specifications and to prepare technical design specifications.
  • Involved in all Software Development Life Cycle (SDLC) phases of the project from domain knowledge sharing, requirement analysis, system design, implementation and deployment.
  • Developed REST web services for implementing the business logic for different functionalities in the features that are developed.
  • Utilized CSS, HTML and JavaScript for the development of the front-end screens.
  • Wrote JUnit test cases for testing the functionality of the developed web services.
  • Involved in writing the SQL queries to fetch data from database.
  • Utilized Postman for verifying the smooth workflow of the application, how the application is changing with the newly developed functionalities and also verified the output for the web services.
  • User login, search & portfolio created using HTML5, CSS3, JavaScript and jQuery.
  • Extensively worked on both Enterprise and Community edition of MULE ESB. Experience working with Mule API and Runtime manager and RAML.
  • Designed and implemented UI layer using JSP, JavaScript, HTML, DHTML, JSON, XML, XHTML and business logic using Servlets, JSP, SWING, EJBs and J2EE framework.
  • Worked on the logging mechanism; the WebNMS SNMP API supports logging of SNMP requests.
  • Responsible for the debugging, fixing and testing the existing bugs related to application.
  • Developed builds using continuous integration server Jenkins.
  • Extensively used GIT for push and pull requests of the code.
  • Actively participated in the daily scrum meetings and bi-weekly retro meetings for knowledge sharing.
  • Wrote DAO classes using spring and Hibernate to interact with database for persistence
  • Used Eclipse for application development.
  • Used JIRA as the task and defect tracking system.
  • Followed Agile methodologies to manage the life cycle of the project. Provided daily updates, sprint review reports, and regular snapshots of project progress.

Environment: Java 1.8, JavaScript, HTML, CSS, Spring, Hibernate, REST web services, JUnit, Oracle, Eclipse, Tomcat, JIRA, Postman, GIT.

Confidential

Java Developer

Responsibilities:

  • Involved in building and implementing the application using MVC architecture with Java Spring framework.
  • Used Hibernate as the Object-Relational mapping framework in order to simplify the transformation of business data between an application and relational database.
  • Used JUnit as the testing framework. Involved in developing test plans and test cases. Performed unit testing for each module and prepared code documentation.
  • Responsible for testing, analyzing and debugging the software.
  • Applied design patterns and OO design concepts to improve the existing code base.
  • Involved in documentation of the module and project. Involved in providing post-production support.
  • Followed Agile methodologies to manage the life cycle of the project. Provided daily updates, sprint review reports, and regular snapshots of project progress.

Environment: Java, MySQL, Google Web Kit, Spring framework, Hibernate, Eclipse, SVN, Maven, Bugzilla.

Confidential

Java/ J2EE Developer

Responsibilities:

  • Involved in the code review meetings with the developers.
  • Worked with Agile methodology.
  • Developed and analyzed the front-end and back-end using JSP, Servlets, and Spring.
  • Integrated Spring (Dependency Injection) among different layers of an application.
  • Used Spring framework for dependency injection, transaction management.
  • Used Spring MVC controllers for the controller layer of the MVC architecture; the controllers manage the flow of the application.
  • Extensively used JBoss for deployment purposes and used MongoDB (NoSQL) for JBoss Caching.
  • Coordinated with multiple teams to resolve escalations.
  • Built the backend services consumed by the Struts action classes.
  • Created SOAP web services to allow communication between the applications.
  • Used Java Message Service (JMS) for the reliable and asynchronous exchange of important information, such as loan status report.
  • Worked on Credit Card transactions.
  • Implemented various complex PL/SQL queries.
  • Generated OTP using Twilio Service.
  • Interacted with Business Analysts to come up with better implementation designs for the application.
  • Interacted with users to resolve technical problems and mentored business users.
  • Worked with the ISP Site Development to get any infrastructure related issues fixed.
  • Implement the best practices and performance improvement/productivity plans.
  • Co-ordination of activities between off-shore and onsite teams.
  • Developed the presentation layer and content management framework using HTML and JavaScript.

Environment: Java 6, J2EE, Servlets, JMS, Spring, SOAP Web Services, HTML, JavaScript, JDBC, Agile Methodology, PL/SQL, XML, UML, UNIX, NoSQL, JBoss, Apache Tomcat, Eclipse, PostgreSQL.
