
Senior Hadoop Developer Resume


Eden Prairie, MN

SUMMARY:

  • Hadoop developer with 7+ years of experience in IT, including 4+ years on Hadoop and 3 years as a Java developer.
  • Strong experience with Big Data and Hadoop technologies and excellent knowledge of the Hadoop ecosystem: Hive, Spark, Sqoop, Impala, Pig, HBase, Kafka, Flume, Storm, Zookeeper, Oozie, and Sentry.
  • Deep understanding of Hadoop architecture (HDFS, YARN, MapReduce) along with insight into their internal operations.
  • Experience in planning, designing, installing, configuring, supporting and managing large Hadoop clusters.
  • Developed several Java/J2EE applications and was involved throughout the software development lifecycle (SDLC), using approaches such as Agile (Scrum), Waterfall, and Test-Driven Development.
  • Proficient in writing optimized OLAP HiveQL queries using Partitioning, Bucketing, and Windowing, with extensive experience operating on Hive tables (an illustrative query follows this summary).
  • Experienced with Spark; performed various transformations and actions on large datasets using RDDs.
  • Hands-on experience with Spark SQL and (micro-batch) Spark Streaming using DataFrames/Datasets and DStream RDDs.
  • Commendable knowledge of and experience in importing and exporting data between HDFS/Hive and Relational Database Systems (RDBMS) using Sqoop.
  • Experienced in generating reports with Tableau and QlikView.
  • Worked along with the BI team to export analyzed data to relational databases for deeper insights.
  • Worked on the HBase NoSQL database to load large sets of structured, semi-structured, and unstructured data coming from a variety of data sources.
  • Hands-on experience with SQL, NoSQL, and data warehouse systems such as MySQL, MS SQL Server, Oracle, Cassandra, MongoDB, Teradata, and Netezza.
  • Excellent knowledge of Linux Shell scripting.
  • Ingested data (especially log data) from different sources into HDFS using Flume and Kafka.
  • Experienced in job workflow scheduling and coordination tools like Oozie and ZooKeeper.
  • Developed Map/Reduce jobs using Java to process large data sets by fitting the problem into the Map/Reduce programming paradigm.
  • Worked with Big Data Hadoop distributions such as Cloudera (CDH4 and CDH5), using Cloudera Manager and HUE for managing and performing operations, and Hortonworks with Ambari.
  • Wrote UDFs in Java and Python and implemented many Spark jobs using Scala.
  • Developed core modules in large cross-platform applications using Core Java, J2EE, Spring (Framework, Web Services, MVC), Struts, Hibernate, JMS, SOAP, and REST.
  • Installed Hadoop cluster on cloud (AWS) using Amazon EMR, EC2 instances, S3 for storage, monitored and controlled using IAM (Identity and Access Management) and deployed Cloudera Manager and CDH on to the cluster.
  • Worked on Talend Open Studio data integration, big data, and data preparation tools. Designed and performed ETL jobs using Talend Open Studio.
  • Performed analytical operations on IBM Netezza data warehouse by using optimized Netezza SQL queries and added SPUs to increase the parallel processing power and memory volume.
  • Developed queries for performing massively parallel processing jobs on Teradata AMP nodes by creating, loading, indexing and fetching table data for fast lookup responses.
  • Used Agile (Scrum) and Waterfall models along with automation and enterprise tools such as Jenkins, Chef, JIRA, and Confluence to develop projects, with Git for version control.
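
As a small illustration of the HiveQL work summarized above (partitioning, bucketing, and windowing), the sketch below runs a windowed query against a hypothetical partitioned sales table through the HiveServer2 JDBC driver from Java. The host, database, table, and column names are placeholders, not details from an actual engagement.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    /**
     * Minimal sketch: runs a windowed HiveQL query over a hypothetical
     * partitioned sales table via the HiveServer2 JDBC driver.
     */
    public class HiveWindowQuery {

        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            String url = "jdbc:hive2://hiveserver-host:10000/default";   // placeholder endpoint

            String query =
                  "SELECT region, sale_date, amount, "
                + "       SUM(amount) OVER (PARTITION BY region ORDER BY sale_date) AS running_total "
                + "FROM sales "
                + "WHERE sale_year = 2016";                               // predicate prunes to one partition

            try (Connection conn = DriverManager.getConnection(url, "hive", "");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(query)) {
                while (rs.next()) {
                    System.out.printf("%s %s %.2f %.2f%n",
                            rs.getString("region"), rs.getString("sale_date"),
                            rs.getDouble("amount"), rs.getDouble("running_total"));
                }
            }
        }
    }

The OVER (PARTITION BY ... ORDER BY ...) clause is the windowing construct, while the predicate on the partition column is what lets Hive prune partitions instead of scanning the whole table.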

TECHNICAL SKILLS:

HADOOP: HDFS, MapReduce, YARN, Hive, Spark (Core, SQL, Streaming), Impala, Sqoop, Pig, Flume, HBase, Storm, Kafka, ZooKeeper, Oozie, Tez, Sentry

Programming Languages: Java, J2EE, Python, Scala, Shell Scripting

J2EE Tech./ Frameworks: Servlets, JSP, Spring (Framework, MVC, Web Services), Struts 2, Hibernate, JPA, JSF, JAX-WS, JAX-RS

Hadoop Distribution & tools: Cloudera (Cloudera Manager, HUE), Hortonworks (Ambari)

Databases/Data Warehouses: MySQL, MS SQL Server, Oracle, MS Access, HBase, Cassandra, MongoDB, Teradata, Netezza

Visualization: Tableau, QlikView

Operating System: Linux (CentOS, Ubuntu, Red Hat), Windows, Windows Server 2008

Tools: Talend Open Studio 6, JUnit, Jenkins, Maven, Ant, Chef, Cygwin, Docker, Putty, Git, MS Visual Studio, IDEs (Eclipse, IntelliJ IDEA, PyCharm, NetBeans)

Web Technologies: JavaScript, AngularJS, Node.js, AJAX, jQuery, XML, PHP, HTML5, CSS3, Bootstrap, JSON, REST, SOAP

SDLC & Design Patterns: Agile (Scrum), Waterfall, MVC, Singleton, Factory

PROFESSIONAL EXPERIENCE:

Senior Hadoop Developer

Confidential, Eden Prairie, MN

Responsibilities:

  • Installed, configured, and maintained an Apache Hadoop cluster for application development along with Hadoop tools like Hive, Sqoop, Spark, Flume, Kafka, Oozie, HBase, and Zookeeper.
  • Used Sqoop to import data from multiple relational database systems into HDFS and Hive for storage and processing, and exported the results back.
  • Involved in the process of Cassandra data modeling and building efficient data structures.
  • Assisted the project manager in problem-solving with Big Data technologies for integration of Hive with HBase and Sqoop with HBase.
  • Involved in developing Shell scripts to orchestrate execution of all other scripts and move the data files within and outside of HDFS.
  • Involved in troubleshooting and performance tuning of reports, resolving issues within Tableau Server, and generating reports.
  • Created action filters, parameters, and calculated sets for preparing dashboards and worksheets in Tableau.
  • Configured, designed, implemented and monitored Kafka cluster and connectors.
  • Implemented proofs of concept ( Confidential ) using Kafka, Storm, and HBase for processing streaming data.
  • Involved in migrating MapReduce jobs into Spark jobs and used Spark SQL to load structured and semi-structured data into Spark clusters.
  • Loaded data from HDFS, calculated results using Spark and loaded them back to HDFS.
  • Involved in converting Cassandra/Hive/SQL queries into Spark transformations using Spark RDDs.
  • Used Spark Streaming APIs to perform the necessary transformations and actions on data received from Kafka and persisted it into a Cassandra database.
  • Developed multiple POCs using Spark and deployed them on the YARN cluster, comparing the performance of Spark with Cassandra and SQL.
  • Created a POC to store server log data in Cassandra to identify system alert metrics.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Migrated complex MapReduce programs into in-memory Spark processing using transformations and actions.
  • Created Partitioned and Bucketed Hive tables in Parquet file format with Snappy compression. Also, loaded data into Parquet Hive tables from Avro Hive tables.
  • Designed and developed Pig and Hive scripts with Java and Python UDFs to implement business logic, transforming the ingested data to get the required results.
  • Automated Sqoop, Hive, Spark jobs using Oozie.
  • Implemented custom interceptors for Flume to filter data and defined channel selectors to multiplex the data into different sinks (a sketch of such an interceptor follows this list).
  • Recovered from node failures and troubleshot common Hadoop cluster issues.
  • Scripted Hadoop package installation and configuration to support fully automated deployments.
  • Developed a data pipeline using Kafka, Spark, and Hive to ingest, transform and analyze data.
  • Worked on Talend Open Studio and Talend Integration Suite. Developed and designed ETL Jobs using Talend Integration Suite in Talend 5.2.2.
  • Exported the aggregated data onto Oracle using Sqoop for reporting on the Tableau dashboard.
  • Involved in QA support activities, Test data creation, and Unit testing activities.
  • Performed performance tuning using partitioning and bucketing of Impala tables.
  • Involved in the design, development, and testing phases of the Software Development Life Cycle.
  • Performed Hadoop installation, updates, patches and version upgrades when required.
  • Utilized Agile Scrum Methodology to help manage and organize a team with regular code review sessions.
  • Held weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
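
As an illustration of the custom Flume interceptor mentioned above, here is a minimal sketch, assuming the filter simply drops events whose body lacks an "ERROR" marker; the class name and filter condition are hypothetical, not the production logic.

    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.flume.Context;
    import org.apache.flume.Event;
    import org.apache.flume.interceptor.Interceptor;

    /** Minimal sketch: drops Flume events whose body does not contain "ERROR". */
    public class ErrorOnlyInterceptor implements Interceptor {

        @Override
        public void initialize() {
            // no state to set up
        }

        @Override
        public Event intercept(Event event) {
            String body = new String(event.getBody(), StandardCharsets.UTF_8);
            return body.contains("ERROR") ? event : null;   // returning null drops the event
        }

        @Override
        public List<Event> intercept(List<Event> events) {
            List<Event> kept = new ArrayList<Event>();
            for (Event e : events) {
                Event out = intercept(e);
                if (out != null) {
                    kept.add(out);
                }
            }
            return kept;
        }

        @Override
        public void close() {
            // nothing to release
        }

        /** Flume instantiates interceptors through a Builder named in the agent config. */
        public static class Builder implements Interceptor.Builder {
            @Override
            public Interceptor build() {
                return new ErrorOnlyInterceptor();
            }

            @Override
            public void configure(Context context) {
                // interceptor properties from the agent configuration would be read here
            }
        }
    }

The Builder class is what the agent configuration references as the interceptor type; the multiplexing channel selectors themselves are declared in the agent properties file rather than in code.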

Environment: Hadoop 2.x, Hive, Spark, Pig, Sqoop, Oozie, Java 8, CDH 4/5, Cassandra, Oracle 10g/11g, Flume, Kafka, Impala, Scala, Talend Open Studio 5.2.2, Teradata 15.x, Tableau, Maven.

Senior Hadoop Developer

Confidential, Orlando, FL

Responsibilities:

  • Maintained all the failure logs in Hadoop along with the process and history of applied solutions. The proactive model uses the log history to assess the nature of the fault and then applies the corresponding solution from the history.
  • Involved in collecting and aggregating large amounts of streaming data into HDFS using Flume and defined channel selectors to multiplex data into different sinks.
  • Loaded web server log data into HDFS using Apache Flume. Designed and implemented custom writables, custom input formats, custom partitioners, and custom comparators in MapReduce.
  • Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files. Implemented UDFs, UDAFs, and UDTFs in Java for Hive to process data that cannot be handled by Hive's built-in functions (a minimal UDF sketch follows this list).
  • Effectively used Oozie to develop automatic workflows of Sqoop, MapReduce, and Hive jobs.
  • Created Hive tables to store the processed results in tabular format. Wrote MapReduce code that takes log files as input, parses the logs, and structures them in tabular format to facilitate effective querying on the log data.
  • Created external Hive tables on top of the parsed data. Involved in gathering requirements, design, development, and testing.
  • Utilized Agile Scrum Methodology to help manage and organize a team of 4 developers with regular code review sessions.
  • Held weekly meetings with technical collaborators and actively participated in ETL code review sessions with senior and junior developers. Loaded and analyzed Omniture logs generated by different web applications.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data in various formats like text, sequence, XML, and JSON. Wrote multiple MapReduce programs to extract, transform, and aggregate data from multiple file formats including XML, JSON, CSV, and other compressed file formats.
  • Defined job flows and developed simple to complex MapReduce jobs as per the requirements.
  • Optimized Map/Reduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Developed Pig UDFs for manipulating the data according to business requirements and also worked on developing custom Pig Loaders.
  • A Pig UDF was required to extract area information from the huge volume of data received from the sensors.
  • Responsible for creating Hive tables based on business requirements. Implemented Static Partitions, Dynamic Partitions, and Buckets in HIVE for efficient data access.
  • Designed and developed automation test scripts using Python.
  • Worked with cloud computing services such as Amazon Web Services (AWS), creating a cluster in the cloud using Amazon EC2 instances and Amazon EMR and deploying Cloudera Manager and CDH onto the cluster.
  • Involved in NoSQL database design, integration, and implementation. Loaded data into NoSQL database HBase.
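
As a sketch of the kind of Hive UDF mentioned above, the class below uses Hive's simple reflection-based UDF API to normalize a log-level string; the class name and logic are hypothetical stand-ins for the production functions.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    /**
     * Minimal sketch of a Hive UDF: trims and upper-cases a log-level string
     * (e.g. " warn " -> "WARN"). Real UDFs would carry the project's business logic.
     */
    public final class NormalizeLevel extends UDF {

        public Text evaluate(Text input) {
            if (input == null) {
                return null;    // Hive passes NULLs straight through
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }

Packaged into a jar, such a function is registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries.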

Environment: Hadoop 2.x, MapReduce, Sqoop, Hive, Pig, HBase, Oozie, Flume, Java 8, JMS, Mahout (Naive Bayes algorithm), AWS (EC2, EMR, S3, IAM), Python 3.

Hadoop Developer

Confidential, Las Vegas, NV

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Responsible for Cluster maintenance, adding and removing cluster nodes. Cluster monitoring and troubleshooting, managing and reviewing data backups and log files.
  • Analyzed data using Hadoop components Hive and Pig.
  • Worked with Talend Open Studio to perform ETL jobs.
  • Generated reports using QlikView.
  • Wrote MapReduce code in Java for performing large data operations to process terabytes of data (a minimal example follows this list).
  • Wrote several Hive queries to extract valuable information hidden in the large datasets.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data concepts.
  • Responsible for creating Hive tables, loading data and writing Hive queries.
  • Handled importing data from various data sources, performed transformations using Hive, Map Reduce, and loaded data into HDFS.
  • Extracted data from Teradata database into HDFS using Sqoop.
  • Exported the patterns analyzed back to Teradata using Sqoop.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability.
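
A minimal sketch of the kind of Java MapReduce job referenced above: it counts records per key, taking the first tab-separated field of each input line as the key. The field layout and class names are assumptions for illustration only.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    /** Counts occurrences per key (first tab-separated field) in the input files. */
    public class KeyCountJob {

        public static class KeyMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
            private static final LongWritable ONE = new LongWritable(1);
            private final Text outKey = new Text();

            @Override
            protected void map(LongWritable offset, Text line, Context context)
                    throws IOException, InterruptedException {
                String[] fields = line.toString().split("\t");
                if (fields.length > 0 && !fields[0].isEmpty()) {
                    outKey.set(fields[0]);
                    context.write(outKey, ONE);
                }
            }
        }

        public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
            @Override
            protected void reduce(Text key, Iterable<LongWritable> counts, Context context)
                    throws IOException, InterruptedException {
                long total = 0;
                for (LongWritable c : counts) {
                    total += c.get();
                }
                context.write(key, new LongWritable(total));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "key-count");
            job.setJarByClass(KeyCountJob.class);
            job.setMapperClass(KeyMapper.class);
            job.setCombinerClass(SumReducer.class);   // cuts shuffle volume at scale
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Reusing the reducer as a combiner is a common choice here because the sum is associative, which keeps shuffle traffic manageable on terabyte-scale inputs.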

Environment: Hadoop 2.x, HDFS, Hive, Pig, Sqoop, MapReduce, HBase, Shell Scripting, QlikView, Teradata 14.0, Oozie, Java 7, Maven 3.x.

Hadoop/Java Developer

Confidential, Winston Salem, NC

Responsibilities:

  • Worked with several clients with day to day requests and responsibilities.
  • Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Involved in analyzing system failures, identifying root causes, and recommending courses of action.
  • Worked on Hive for exposing data for further analysis and for generating transformation files from different analytical formats to text files.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
  • Managed and scheduled Jobs on a Hadoop cluster.
  • Implemented and maintained various projects in Java.
  • Utilized Java and MySQL from day to day to debug and fix issues with client processes.
  • Developed, tested, and implemented financial-services application to bring multiple clients into standard database format.
  • Assisted in designing, building, and maintaining database to analyze life cycle of checking and debit transactions.
  • Excellent Java/J2EE application development skills with strong experience in Object-Oriented Analysis.
  • Extensively involved throughout the Software Development Life Cycle (SDLC).
  • Strong experience with J2SE, XML, Web Services, WSDL, SOAP, and TCP/IP.
  • Strong experience in software and system development using JSP, Servlets, JSF, EJB, JDBC, Struts, Maven, Subversion, Trac, JUnit, and SQL.
  • Rich experience in database design and hands-on experience with large database systems: Oracle 8i and 9i, DB2, and PL/SQL.
  • Hands-on experience with WebLogic Application Server, WebSphere Application Server, WebSphere Portal Server, and J2EE application deployment technology.

Environment: Java 7, Maven 3.x, Hive, Pig, HBase, Zookeeper, Sqoop, Cloudera, JDBC, Struts 2, Trac, Subversion, JUnit, SQL, Spring, Hibernate, Oracle, XML, PuTTY and Eclipse

Java/J2EE Developer

Confidential

Responsibilities:

  • Responsible for analyzing and documenting the requirements and for designing and developing the application based on J2EE standards. Strictly followed Test-Driven Development.
  • Used Microsoft Visio for designing use cases, class diagrams, sequence diagrams, and data models.
  • Designed ERDs (Entity Relationship Diagrams) for the relational database.
  • Extensively developed user interface using HTML, JavaScript, jQuery, AJAX and CSS on the front end.
  • Designed Rich Internet Application by implementing jQuery based accordion styles.
  • Used JavaScript for the client-side web page validation.
  • Used Spring MVC and Dependency Injection for handling presentation and business logic, and integrated Spring DAO for data access using Hibernate (a small controller sketch follows this list).
  • Developed Struts web forms and actions for validation of user request data and application functionality.
  • Developed programs for accessing the database using JDBC thin driver to execute queries, prepared statements, Stored Procedures and to manipulate the data in the database.
  • Created tile definitions, Struts configuration files, validation files and resource bundles for all modules using Struts framework.
  • Involved in the coding and integration of several business-critical modules using Java, JSF, and Hibernate.
  • Developed SOAP-based web services for communication between upstream applications.
  • Extensively used SQL, PL/SQL, Triggers, and Views using IBM DB2.
  • Implemented design patterns such as DAO and Singleton along with the MVC architectural pattern of Spring.
  • Implemented Service Oriented Architecture (SOA) on Enterprise Service Bus (ESB).
  • Developed Message-Driven Beans for asynchronous processing of alerts using JMS.
  • Used the Rational Rose tool for application development.
  • Used ClearCase for source code control and JUnit for unit testing.
  • Performed integration testing of the modules.
  • Used PuTTY for UNIX login to run the batch jobs and check server logs.
  • Deployed the application onto the GlassFish Server.
  • Involved in peer code reviews.
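
A minimal sketch of the Spring MVC plus dependency-injection pattern described above; AccountService is a hypothetical interface standing in for the real Hibernate-backed DAO layer, and the URL and view names are illustrative.

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Controller;
    import org.springframework.ui.Model;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;

    /**
     * Minimal Spring MVC controller illustrating constructor-based dependency
     * injection. AccountService is a hypothetical interface; a real implementation
     * would delegate to the Hibernate/Spring DAO layer.
     */
    @Controller
    @RequestMapping("/accounts")
    public class AccountController {

        /** Hypothetical service contract used only for this sketch. */
        public interface AccountService {
            double balanceFor(long accountId);
        }

        private final AccountService accountService;

        @Autowired
        public AccountController(AccountService accountService) {
            this.accountService = accountService;   // wired in by the Spring container
        }

        @RequestMapping(value = "/{id}", method = RequestMethod.GET)
        public String showBalance(@PathVariable("id") long id, Model model) {
            model.addAttribute("balance", accountService.balanceFor(id));
            return "accountBalance";                 // logical view name resolved to a JSP
        }
    }

Constructor injection keeps the controller free of lookup code; the concrete AccountService bean is supplied by the container configuration.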

Environment: Java 6/7, J2EE, Struts 2, GlassFish, JSP, JDBC, EJB, ANT, XML, IBM WebSphere, JUnit, IBM DB2, Rational Rose 7, CVS, UNIX, SOAP, SQL, PL/SQL.

Java/ J2EE Developer

Confidential

Responsibilities:

  • Involved in the code review meetings with the developers.
  • Worked with Agile methodology.
  • Developed and analyzed the front-end and back-end using JSP, Servlets, and Spring.
  • Integrated Spring (Dependency Injection) among different layers of an application.
  • Used Spring framework for dependency injection, transaction management.
  • Used Spring MVC framework controllers for the controller part of MVC; the flow of the application was controlled by these controllers.
  • Extensively used JBoss for deployment purposes and used MongoDB (NoSQL) for JBoss Caching.
  • Coordinated with multiple teams to resolve escalations.
  • Built the backend services, which are consumed by Struts action classes.
  • Created SOAP web services to allow communication between the applications.
  • Used Java Message Service (JMS) for reliable and asynchronous exchange of important information, such as loan status reports (a small publisher sketch follows this list).
  • Worked on Credit Card transactions.
  • Implemented various complex PL/SQL queries.
  • Generated OTPs using the Twilio service.
  • Worked with testers in resolving defects in the application and was an integral part of the team.
  • Interacted with Business Analysts to come up with better implementation designs for the application.
  • Interacted with users on technical problems and mentored the business users.
  • Worked with the ISP site development team to get infrastructure-related issues fixed.
  • Implemented best practices and performance improvement/productivity plans.
  • Coordinated activities between offshore and onsite teams.
  • Developed the presentation layer and content management framework using HTML and JavaScript.
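
A minimal sketch of the JMS usage described above, assuming a JMS 1.1 queue looked up over JNDI in the JBoss container; the JNDI names, message property key, and class name are hypothetical.

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.naming.InitialContext;

    /**
     * Minimal sketch: publishes a loan-status update onto a JMS queue.
     * JNDI names are placeholders defined in the container configuration.
     */
    public class LoanStatusPublisher {

        public void publish(String loanId, String status) throws Exception {
            InitialContext jndi = new InitialContext();
            ConnectionFactory factory = (ConnectionFactory) jndi.lookup("java:/ConnectionFactory");
            Queue queue = (Queue) jndi.lookup("queue/loanStatus");   // hypothetical queue name

            Connection connection = factory.createConnection();
            try {
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                MessageProducer producer = session.createProducer(queue);

                TextMessage message = session.createTextMessage(status);
                message.setStringProperty("loanId", loanId);   // lets consumers filter by loan
                producer.send(message);
            } finally {
                connection.close();   // releases the session and producer as well
            }
        }
    }

The producer returns as soon as the broker accepts the message, so downstream consumers process the loan-status update asynchronously.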

Environment: Java 6, J2EE, Servlets, JMS, Spring, SOAP Web Services, HTML, JavaScript, JDBC, Agile Methodology, PL/SQL, XML, UML, UNIX, NoSQL, JBoss, Apache Tomcat, Eclipse, PostgreSQL.
