
Sr. Big Data/hadoop Developer Resume


Lebanon, NJ

SUMMARY:

  • Over 9 years of software development experience with Big Data technologies, the Hadoop ecosystem, and Java/J2EE technologies.
  • Good understanding of Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies including Waterfall and Agile.
  • Worked with Spark to improve performance and optimize existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Developed applications using Scala, Spark SQL, and MLlib together with Kafka and other tools as required, and deployed them on YARN clusters.
  • Hands-on experience delivering projects in both Agile and Waterfall environments.
  • Strong working experience with ingestion, storage, processing and analysis of big data.
  • Experience working with databases such as Oracle, SQL Server, and MySQL.
  • Extensive experience with ETL and Query tools for Big Data like Pig Latin and HiveQL.
  • Experience in developing front-end systems with HTML5, JavaScript, CSS3, Bootstrap, JSON, JQuery and Ajax.
  • Good knowledge in job workflow scheduling and monitoring tools like Oozie and Zookeeper.
  • Deployed scripts to the GitHub version control repository and deployed code using Jenkins.
  • Involved in data migration to Azure, integrating with GitHub repositories and Jenkins.
  • Converted Hive/SQL queries into Spark transformations using Spark RDDs and Spark SQL in Scala (an illustrative sketch follows this list).
  • Proficient in Java, Collections, J2EE, Servlets, JSP, Spring, Hibernate, and JDBC/ODBC.
  • Experience developing data pipelines using Kafka, Spark, and Hive to ingest, transform, and analyze data.
  • Experience in data modeling, connecting to Cassandra from Spark, and saving summarized DataFrames to Cassandra.
  • Developed and maintained web applications on the Tomcat and IBM WebSphere web servers.
  • Experience with job workflow scheduling and monitoring tools like Oozie and NiFi.
  • Experience in front-end technologies such as HTML, HTML5, CSS, CSS3, and Ajax.
  • Experience in building high performance and scalable solutions using various Hadoop ecosystem tools like Pig, Hive, Sqoop, Spark, Solr and Kafka.
  • Extensive experience working with Oracle, MS SQL Server, DB2, and MySQL relational databases.
  • Well versed and hands-on with version control tools such as Git, CVS, and SVN.
  • Expert in implementing advanced procedures like text analytics and processing using Apache Spark written in Scala.
  • Successfully loaded files to HDFS from Oracle, SQL Server, and Teradata using Sqoop.
  • Worked with Sqoop to import and export data between databases such as MySQL and Oracle and HDFS/Hive.
  • Experience with cloud configuration on Amazon Web Services (AWS).
  • Experience working with structured and unstructured data in various file formats such as Avro, XML, JSON, SequenceFile, ORC, and Parquet.
  • Experience with Oozie Workflow Engine to automate and parallelize Hadoop, MapReduce and Pig jobs.
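
For illustration, a minimal sketch of the Hive-to-Spark conversion mentioned above: the same aggregation run once through Spark SQL and once as DataFrame transformations. The table and column names (sales.orders, customer_id, amount) are hypothetical, and the Java Dataset API is used only as a compact example of the technique; the production work was done in Scala.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import static org.apache.spark.sql.functions.col;
    import static org.apache.spark.sql.functions.sum;

    public class HiveToSparkSketch {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("hive-to-spark-sketch")
                    .enableHiveSupport()               // assumes a reachable Hive metastore
                    .getOrCreate();

            // The original HiveQL aggregation, run unchanged through Spark SQL.
            Dataset<Row> viaSql = spark.sql(
                    "SELECT customer_id, SUM(amount) AS total FROM sales.orders GROUP BY customer_id");

            // The same query expressed as DataFrame transformations.
            Dataset<Row> viaDataFrame = spark.table("sales.orders")
                    .groupBy(col("customer_id"))
                    .agg(sum(col("amount")).alias("total"));

            viaSql.show(10);
            viaDataFrame.show(10);
            spark.stop();
        }
    }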

TECHNICAL SKILLS:

Hadoop/Big Data: MapReduce, HDFS, Hive 2.3, Pig 0.17, HBase 1.2, Zookeeper 3.4, Sqoop 1.4, Oozie, Flume 1.8, Scala 2.12, Kafka 1.0, Storm, MongoDB 3.6, Hadoop 3.0, Spark, Cassandra 3.11, Impala 2.1, Control-M

Languages: Java, Python, Scala, Hive QL, SQL, Pig Latin

Databases: SQL Server 2016, MySQL 5.7, Oracle 12c, PostgreSQL; NoSQL: MongoDB

Web Technologies: HTML5, DHTML, XML, AJAX, WSDL, SOAP

IDE and Build Tools: Eclipse, NetBeans, MS Visual Studio, Ant, Maven, Jira, Confluence

Version Control: Git, SVN, CVS

Tools: Eclipse, IntelliJ, Git, NetBeans, Jenkins, JIRA, MicroStrategy (BI)

Reporting tools: Tableau, Talend, Pentaho, Power View, Kibana

Methodologies: RAD, JAD, RUP, UML, System Development Life Cycle (SDLC), Waterfall Model.

Java & J2EE Technologies: Core Java/J2EE, Servlets, JSP, JDBC, JNDI, Java Beans

WORK EXPERIENCE:

Confidential - Lebanon, NJ

Sr. Big Data/Hadoop Developer

Responsibilities:

  • As a Big Data/Hadoop Developer, worked on the Hadoop ecosystem including Hive, MongoDB, ZooKeeper, and Spark Streaming on the MapR distribution.
  • Responsible for building and configuring a distributed data solution using the MapR distribution of Hadoop.
  • Involved in the complete big data flow of the application: ingesting data from upstream sources into HDFS, processing it in HDFS, and analyzing it.
  • Involved in Agile methodologies, daily scrum meetings, sprint planning.
  • Installed and configured Hive, HDFS, and NiFi and implemented a CDH cluster; assisted with performance tuning and monitoring.
  • Wrote Scala-based Spark applications for performing various data transformations, denormalization, and other custom processing.
  • Tuned Spark application performance by setting the right batch interval, the correct level of parallelism, and appropriate memory settings.
  • Created a multi-threaded Java application running on edge node for pulling the raw click stream data from FTP servers and AWS S3 buckets.
  • Worked on conversion of Hive/ SQL queries into Spark transformations using Spark RDDs and data frames.
  • Worked on Installing and configuring the HDP Hortonworks 2.x and Cloudera (CDH 5.5.1) Clusters in Dev and Production Environments.
  • Implemented Kafka high-level consumers to read data from Kafka partitions and move it into HDFS.
  • Worked on analyzing the Hadoop cluster and different big data analytic tools including MapReduce, Hive, and Spark.
  • Implemented Kafka Custom encoders for custom input format to load data into Kafka Partitions.
  • Exported result sets from Hive to MySQL using the Sqoop export tool for further processing.
  • Evaluated the performance of Apache Spark in analyzing genomic data.
  • Implemented complex Hive UDFs to execute business logic within Hive queries.
  • Implemented Impala for data analysis.
  • Prepared Linux shell scripts for automating the process.
  • Implemented Spark RDD transformations to map business rules and applied actions on top of those transformations.
  • Implemented S3 by creating a bucket and passing the event information to AWS Lambda.
  • Automated all jobs end to end, from pulling data from sources such as MySQL and pushing result datasets to HDFS, to running MapReduce, Pig, and Hive jobs, using Kettle and Oozie for workflow management.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data with MapReduce, Hive, and Pig.
  • Managed and reviewed large Hadoop log files; involved in cluster maintenance, cluster monitoring, troubleshooting, and data cleansing.
  • Created technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Worked on importing data from MySQL DB to HDFS and vice-versa using Sqoop to configure Hive Metastore with MySQL, which stores the metadata for Hive tables.
  • Responsible for loading the customer's data and event logs from Kafka into HBase using REST API.
  • Used Hive data warehouse modeling to interface Hadoop with BI tools such as Tableau, and enhanced the existing applications.
  • Worked with NoSQL databases HBase in creating HBase tables to load large sets of semi-structured data coming from various sources.
  • Wrote multiple Spark jobs to perform data quality checks before files were moved to the data processing layer.
  • Migrated complex MapReduce programs into Spark RDD transformations and actions.
  • Implemented Cassandra and managed the other tools in the processing pipeline running on YARN.
  • Built the automated build and deployment framework using Jenkins, Maven etc.
  • Involved in Hadoop Cluster capacity planning, performance tuning, cluster Monitoring, Troubleshooting.
  • Loaded data from various sources into HDFS and built reports using Tableau.
  • Extended Hive and Pig core functionality by writing custom UDFs in Java (a minimal UDF sketch follows this list).
  • Extracted real-time feeds using Spark Streaming, converted them to RDDs, processed the data into DataFrames, and loaded it into Cassandra.
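
A minimal sketch of a custom Hive UDF of the kind referenced above, written against the classic org.apache.hadoop.hive.ql.exec.UDF API available in Hive 2.3; the function name and normalization logic are hypothetical.

    import org.apache.hadoop.hive.ql.exec.Description;
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF that normalizes free-text region codes before aggregation in HiveQL.
    @Description(name = "normalize_region",
            value = "_FUNC_(str) - trims, upper-cases and strips punctuation from a region code")
    public final class NormalizeRegionUDF extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;                          // let Hive pass NULLs through
            }
            String cleaned = input.toString().trim().toUpperCase().replaceAll("[^A-Z0-9]", "");
            return new Text(cleaned);
        }
    }

Such a UDF is packaged into a JAR, added to the session with ADD JAR, and registered with CREATE TEMPORARY FUNCTION before being called from Hive queries.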

Environment: Hadoop 3.0, HDFS, NiFi, Hive 2.3, Pig 0.17, XML, JSON, Spark, Python, MySQL, Sqoop 1.4, NoSQL, HBase 1.2, Kafka 1.0, Tableau, Cassandra 3.11, YARN, Maven, Jenkins, Scala 2.12, Java, AWS, Lambda

Confidential. - Merrimack, NH

Big Data/Hadoop Developer

Responsibilities:

  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
  • Ingested terabytes of click stream data from external systems like FTP Servers and S3 buckets into HDFS using custom Input Adaptors.
  • Implemented end-to-end pipelines for performing user behavioral analytics to identify user-browsing patterns and provide rich experience and personalization to the visitors.
  • Developed Kafka producers for streaming real-time click stream events from external Rest services into topics.
  • Worked with Hive and the NoSQL database HBase to create tables and store data.
  • Worked with Apache NiFi to develop custom processors for processing and distributing data among cloud systems.
  • Wrote and executed various MySQL database queries from Python using the Python MySQL connector and the MySQLdb package.
  • Worked with NoSQL Cassandra to store, retrieve, update, and manage all the details for Ethernet provisioning and customer order tracking.
  • Developed tools using Python, shell scripting, and XML to automate routine tasks.
  • Worked on setting up Pig, Hive and HBase on multiple nodes and developed using Pig, Hive, HBase and MapReduce.
  • Involved in the process of data acquisition, data pre-processing and data exploration of telecommunication project in Scala.
  • Created S3 buckets, managed bucket policies, and used S3 and AWS Glacier for storage and backup.
  • Used the HDFS FileSystem API to connect to FTP servers and HDFS, and the AWS S3 SDK to connect to S3 buckets.
  • Wrote Scala-based Spark applications for performing various data transformations, denormalization, and other custom processing.
  • Implemented data pipeline using Spark, Hive, Sqoop and Kafka to ingest customer behavioral data into Hadoop platform to perform user behavioral analytics.
  • Created a multi-threaded Java application running on edge node for pulling the raw click stream data from FTP servers and AWS S3 buckets.
  • Developed Spark streaming jobs using Scala for real time processing.
  • Installed and configured a multi-node cluster in the cloud on Amazon Web Services (AWS) EC2.
  • Worked on data visualization and analytics with research scientists and business stakeholders.
  • Imported unstructured data like logs from different web servers to HDFS using Flume and developed MapReduce jobs for log analysis, recommendations and analytics.
  • Converted and loaded local data files into HDFS through the Unix shell.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Involved in creating external Hive tables from the files stored in the HDFS.
  • Optimized Hive tables using techniques such as partitioning and bucketing to improve the performance of HiveQL queries.
  • Used Spark SQL to read data from Hive tables and perform transformations such as changing date formats and splitting complex columns (an illustrative sketch follows this list).
  • Wrote a Spark application to load the transformed data back into Hive tables in Parquet format.
  • Used the Oozie scheduler to automate the pipeline workflow and extract data on a timely basis.
  • Analyzed large amounts of data daily, including XML, JSON, and relational files from different data sources.
  • Developed MapReduce jobs in Java for data processing after installing and configuring Hadoop and HDFS.
  • Used Oozie Operational Services for batch processing and scheduling workflows dynamically.
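
A minimal sketch of the Spark SQL read-transform-write step described above, using the Java Dataset API; the table names (staging.clickstream_raw, analytics.clickstream_daily), column names, and date format are hypothetical placeholders, and a Spark 2.x build with Hive support is assumed.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SaveMode;
    import org.apache.spark.sql.SparkSession;
    import static org.apache.spark.sql.functions.col;
    import static org.apache.spark.sql.functions.split;
    import static org.apache.spark.sql.functions.to_date;

    public class ClickstreamTransformSketch {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("clickstream-transform-sketch")
                    .enableHiveSupport()
                    .getOrCreate();

            // Read the raw external Hive table (table and column names are placeholders).
            Dataset<Row> raw = spark.table("staging.clickstream_raw");

            // Normalize the event timestamp into a date and split a composite URL column.
            Dataset<Row> transformed = raw
                    .withColumn("event_date", to_date(col("event_ts"), "MM/dd/yyyy HH:mm:ss"))
                    .withColumn("page", split(col("url_path"), "/").getItem(1));

            // Write the result back to a partitioned Hive table in Parquet format.
            transformed.write()
                    .mode(SaveMode.Overwrite)
                    .format("parquet")
                    .partitionBy("event_date")
                    .saveAsTable("analytics.clickstream_daily");

            spark.stop();
        }
    }

Partitioning the Parquet output by event_date lines up with the partitioning and bucketing optimizations mentioned above.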

Environment: Hive 2.3, Pig 0.17, HDFS, Flume 1.8, MapReduce, Unix, NoSQL, HBase, NiFi, Python, MySQL, Cassandra 3.11, Scala, Kafka, Impala, Oozie, Oracle

Confidential - Chicago, IL

Sr. J2EE/Hadoop Developer

Responsibilities:

  • Worked on analyzing Hadoop cluster and different big data analytic tools including Hive, HBase NoSQL database, and Sqoop.
  • Created the automated build and deployment process for application, re-engineering setup for better user experience, and leading up to building a continuous integration system.
  • Involved in the complete software development life cycle (SDLC) to develop the application.
  • Involved in daily Scrum (Agile) meetings and sprint planning, estimated tasks for user stories, participated in retrospectives, and presented demos at the end of each sprint.
  • Used JPA (Java Persistence API) with Hibernate as the persistence provider for object-relational mapping (a minimal entity-mapping sketch follows this list).
  • Used JDBC and Hibernate for persisting data to different relational databases.
  • Implemented application-level persistence using Hibernate and Spring.
  • Used XML and JSON for transferring/retrieving data between different Applications.
  • Wrote complex PL/SQL queries using joins, stored procedures, functions, triggers, cursors, and indexes in the data access layer.
  • Interacted with business analysts to understand the requirements and their impact on the business.
  • Developed screens using JSP, DHTML, CSS, Ajax, JavaScript, Struts, Spring, Java, and XML.
  • Developed and implemented a Swing, Spring, and J2EE based MVC (Model-View-Controller) framework for the application.
  • Worked on deploying Hadoop 2.7.2 cluster with multiple nodes and different big data analytic tools including Pig 0.16, HBase 0.98.23 database and Sqoop HDP2.3.x.
  • Involved in loading data from LINUX file system to HDFS.
  • Experience in reviewing Hadoop log files to detect failures.
  • Exported the analyzed data to the relational databases using Sqoop 2.3.x for visualization and to generate reports for the BI team.
  • Developed continuous flow of data into HDFS from social feeds using Apache Storm Spouts and Bolts.
  • Streamed data in real time using Spark 1.6.0 with Kafka 0.10.0.0.
  • Importing and exporting data into HDFS and Hive 2.0 using Sqoop 2.3.x.
  • Implemented new Apache Camel routes and extended existing Camel routes that provide end-to-end communications between the web services and other enterprise back end services
  • Developed code using Core Java to implement technical enhancement following Java Standards.
  • Worked with Swing and RCP using Oracle ADF to develop a search application which is a migration project.
  • Implemented Hibernate utility classes, session factory methods, and different annotations to work with back end data base tables.
  • Implemented Ajax calls using JSF-Ajax integration and implemented cross-domain calls using JQuery Ajax methods.
  • Implemented object-relational mapping in the persistence layer using the Hibernate framework in conjunction with Spring functionality.
  • Worked with NoSQL databases like HBase to create tables and store data; collected and aggregated large amounts of log data using Apache Flume and staged it in HDFS for further analysis.
  • Collaborated with Business users for requirement gathering for building Tableau reports per business needs.
  • Implemented a RESTful web services architecture for client-server interaction and the respective POJOs for its implementation.
  • Used WebLogic for application deployment and Log4J for logging and debugging.
  • Used CVS as the version control tool and ANT as the project build tool.
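
A minimal sketch of the annotation-based object-relational mapping described above, showing a JPA entity that Hibernate would map to a relational table; the entity, table, and column names are hypothetical.

    import javax.persistence.Column;
    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.GenerationType;
    import javax.persistence.Id;
    import javax.persistence.Table;

    // Hypothetical entity: Hibernate maps this class to the CUSTOMER_ORDER table.
    @Entity
    @Table(name = "CUSTOMER_ORDER")
    public class CustomerOrder {

        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Long id;

        @Column(name = "ORDER_NUMBER", nullable = false, unique = true)
        private String orderNumber;

        @Column(name = "STATUS")
        private String status;

        public Long getId() { return id; }
        public String getOrderNumber() { return orderNumber; }
        public void setOrderNumber(String orderNumber) { this.orderNumber = orderNumber; }
        public String getStatus() { return status; }
        public void setStatus(String status) { this.status = status; }
    }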

Environment: Hadoop, Hive, Pig, HBase, Sqoop, Spark, MapReduce, Scala, Oozie, Teradata, SQL, NoSQL, HDFS, Kafka, Flume, Cassandra, Java, XML

Confidential - Malvern, PA

Sr. Java/J2EE Developer

Responsibilities:

  • Worked as a Sr. Java/J2EE professional with an extensive background in the software development and testing life cycle.
  • Worked in a Waterfall SDLC environment that included acceptance test-driven design and continuous integration/delivery.
  • Responsible for analyzing, designing, developing, coordinating, and deploying web-based applications.
  • Designed the data layer using a combination of SOAP and RESTful web services and, occasionally, Hibernate ORM.
  • Implemented the project using JAX-WS based Web Services using WSDL, UDDI, and SOAP to communicate with other systems.
  • Monitored error logs using Log4J; Maven was used as the build tool and continuous integration was done using Jenkins.
  • Used complex queries like SQL statements and procedures to fetch the data from the database.
  • Used Git as the version control repository and ServiceNow for issue tracking.
  • Developed test cases and performed unit testing using JUnit.
  • Used ANT as a build tool and developed build files for compiling the code and creating WAR files.
  • Involved in the development of the User Interfaces using JSP, JQuery and client-side validations using JavaScript and CSS.
  • Implemented Restful Web Services to retrieve data from client side using Micro Services architecture.
  • Worked on Spring IOC, Spring MVC, Web Application Framework, Spring Messaging Framework and Spring AOP to develop application service components.
  • Primarily focused on Spring components such as Spring MVC controllers, model and view objects, and view resolvers, along with jQuery.
  • Worked on Oracle as the backend database and integrated with Hibernate to retrieve Data Access Objects.
  • Wrote SQL queries, stored procedures, and triggers to perform back-end database operations by using DB2 database.
  • Developed Web Services to communicate to other modules using XML based Soap and WSDL protocols.
  • Designed and developed the application on Eclipse IDE utilizing the Spring framework and MVC Architecture.
  • Used JavaScript for client-side validations and a validation framework for server-side validations.
  • Designed and developed front-end using JSP, HTML, CSS, JavaScript and JQuery.
  • Developed multiple formatting, validation utilities in Java, JQuery and JavaScript functions and CSS so that they can be reused across the application.
  • Designed and developed XSLT transformations of components to convert data from XML to HTML.
  • Developed the application using the Spring MVC framework, which follows the Model View Controller (MVC) architecture with JSP as the view (a minimal controller sketch follows this list).
  • Used Spring MVC to manage application flow by developing configurable handler mappings and view resolution.
  • Used Spring Framework to inject the DAO and Bean objects by auto wiring the components.
  • Developed front end applications using the HTML, CSS, JavaScript, and JQuery.
  • Created Hibernate mapping files, sessions, transactions, and Query/Criteria objects to fetch data from the database.
  • Used Tortoise SVN for Source Control and Version Management.
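
A minimal sketch of a Spring MVC controller of the kind described above: the handler mapping routes the request to the annotated method and the returned logical view name is resolved to a JSP by the configured view resolver. The URL, controller, and view names are hypothetical.

    import org.springframework.stereotype.Controller;
    import org.springframework.ui.Model;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;

    // Hypothetical controller handling GET /accounts/{id}.
    @Controller
    public class AccountController {

        @RequestMapping(value = "/accounts/{id}", method = RequestMethod.GET)
        public String viewAccount(@PathVariable("id") Long id, Model model) {
            model.addAttribute("accountId", id);   // in practice populated via a service/DAO layer
            return "accountDetail";                // e.g. resolved to /WEB-INF/views/accountDetail.jsp
        }
    }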

Environment: Java, J2EE, MVC, JUnit, JavaBeans, HTML, CSS, JavaScript, jQuery, Oracle, Hibernate, SQL, SOAP, Eclipse, ANT, Maven

Confidential

Java Developer

Responsibilities:

  • As a Developer in Java, involved in System Requirements analysis and conceptual design.
  • Involved in prototyping, proof of concept, design, Interface Implementation, testing and maintenance.
  • Involved in all the phases of SDLC like Requirement Gathering, Design & Coding, Application implementation, Testing and Deployment.
  • Developed the application on Eclipse and deployed the application on JBoss to integrate run time components and the tools to develop applications.
  • Implemented various frameworks such as the Spring Framework, Spring JDBC, and Hibernate to improve the robustness, performance, and usability of the application.
  • Designed and developed user interfaces using Spring Framework and Struts MVC framework, JSP, HTML, CSS.
  • Used GWT for loading of Health History pages for good look and feel.
  • Implemented JMS for automatic subscriber ID upload.
  • Used JavaScript for client-side validations.
  • Involved in integrating the business layer with the DAO layer using the ORM tool Hibernate.
  • Responsible for developing configuration, mapping, and Java beans for the persistence layer.
  • Developed DAO's to invoke DAP's (Data Access Programs) to access data from .CSV files and to query MySQL database.
  • Extensively used Java Multi-Threading concept for downloading files from a URL.
  • Developed PL/SQL stored procedures and functions and actively involved in the design of views and triggers in the Oracle database.
  • Developed the Restful web services using Spring IOC to provide user a way to run the job and generate daily status report.
  • Involved in writing Spring Configuration XML file that contains declarations and other dependent objects declaration.
  • Built web-based maintenance application to maintain complex specification documents.
  • Developed the front-end GUI using HTML, CSS, and JavaScript and enforced a consistent look and feel.
  • Designed Use Cases, Sequence, ER-Diagrams and Class diagrams using Eclipse Plugins.
  • Used JSP, JavaScript, JQuery, Ajax, CSS, and HTML as data and presentation layer technology.
  • Used Hibernate for mapping an object-oriented domain model to a traditional relational database.
  • Implemented RESTful web services using JAX-RS (a minimal resource sketch follows this list).
  • Used Maven scripts to generate JAR files, build the workspace, and generate classes from the WSDL when creating the web service.
  • Used JUnit for unit testing of the system and Log4J for logging.
  • Involved in configuring and deploying the application on Weblogic Application Server.
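
A minimal sketch of a JAX-RS resource of the kind described above, exposing a status endpoint for the daily report job; the path, resource name, and response payload are hypothetical.

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.MediaType;
    import javax.ws.rs.core.Response;

    // Hypothetical resource exposing GET /jobs/{jobId}/status as JSON.
    @Path("/jobs")
    public class JobStatusResource {

        @GET
        @Path("/{jobId}/status")
        @Produces(MediaType.APPLICATION_JSON)
        public Response getStatus(@PathParam("jobId") String jobId) {
            // In the real service this would query the scheduler or DAO layer.
            String body = "{\"jobId\":\"" + jobId + "\",\"status\":\"RUNNING\"}";
            return Response.ok(body).build();
        }
    }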

Environment: Java 1.5, Spring, Eclipse, JBoss, JavaScript, HTML, Bootstrap, CSS, MySQL, PL/SQL, XML, Ajax, Hibernate, Maven, Log4J, WebLogic, JSP, jQuery
