
Sr. Big Data Developer Resume


St. Louis, MO

SUMMARY:

  • 8+ years of experience as a Sr. Big Data Developer, with skills in analysis, design, development, testing, and deployment of various software applications.
  • Good knowledge of Amazon Web Services (AWS) concepts such as EMR and EC2, which provide fast and efficient processing for Teradata big data analytics.
  • Experience collecting log data and JSON data into HDFS using Flume and processing the data with Hive/Pig.
  • Extensive experience developing Spark Streaming jobs, building RDDs (Resilient Distributed Datasets) and using Spark SQL as required.
  • Experience developing Java MapReduce jobs for data cleaning and data manipulation as required by the business.
  • Strong knowledge of the Hadoop ecosystem, including HDFS, Hive, Oozie, HBase, Pig, Sqoop, and ZooKeeper.
  • Experience importing and exporting data using Sqoop between HDFS and relational database systems.
  • Good knowledge of Hibernate for mapping Java classes to database tables and of Hibernate Query Language (HQL).
  • Hands-on experience configuring and working with Flume to load data from multiple sources directly into HDFS.
  • Good experience developing MapReduce jobs in Java/J2EE for data cleansing, transformation, pre-processing, and analysis (a minimal sketch follows this list).
  • Excellent experience installing and running various Oozie workflows and automating parallel job executions.
  • Experience with Spark, Spark SQL, Spark Streaming, Spark GraphX, and Spark MLlib.
  • Extensive development experience in IDEs such as Eclipse, NetBeans, and IntelliJ.
  • Extensive experience with advanced J2EE frameworks such as Spring, Struts, JSF, and Hibernate.
  • Expertise in JavaScript, JavaScript MVC patterns, object-oriented JavaScript design patterns, and AJAX calls.
  • Installation, configuration, and administration experience on big data platforms: Cloudera Manager for Cloudera and MCS for MapR.
  • Strong knowledge of NoSQL column-oriented databases such as HBase and their integration with a Hadoop cluster.
  • Good knowledge of coding using SQL, SQL*Plus, T-SQL, PL/SQL, and stored procedures/functions.
  • Experience with all stages of the SDLC and the Agile development model, from requirement gathering through deployment and production support.
  • Experience understanding, maintaining, and supporting existing systems built on technologies such as Java, J2EE, and various databases (Oracle, SQL Server).
  • Experience working with Hortonworks and Cloudera environments.
  • Good knowledge of implementing various data-processing techniques using Apache HBase to handle data and format it as required.
  • Experience developing custom UDFs for Pig and Hive to incorporate Python/Java methods and functionality into Pig Latin and HiveQL.
  • Experience with the Oozie scheduler, setting up workflow jobs with MapReduce and Pig jobs.
  • Knowledge of the architecture and functionality of NoSQL databases such as HBase, Cassandra, and MongoDB.
  • Good understanding of querying datasets: filtering, aggregating, joining disparate datasets, and producing ranked or sorted data using Spark RDDs, Spark DataFrames, Spark SQL, Hive, and Impala.
  • Good at writing custom RDDs in Scala and implementing design patterns to improve performance.
  • Experience analyzing large volumes of data using Hive Query Language, including assisting with performance tuning.
  • Experience with middleware architecture using Sun Java technologies such as J2EE, JSP, and Servlets.
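
As a minimal illustration of the Java MapReduce data-cleansing work described above, the map-only job below drops malformed records and re-emits trimmed fields; the pipe delimiter, expected field count, and class name are assumptions for this sketch rather than details from any specific engagement. The driver would set job.setNumReduceTasks(0) so mapper output is written directly to HDFS.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Map-only cleansing job: drop malformed rows, trim fields, re-emit the rest.
public class CleanRecordsMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {

    private static final int EXPECTED_FIELDS = 5; // hypothetical schema width

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\\|", -1); // assumed pipe-delimited input
        if (fields.length != EXPECTED_FIELDS) {
            context.getCounter("cleansing", "malformed").increment(1);
            return; // skip rows that do not match the expected layout
        }
        StringBuilder cleaned = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) cleaned.append('|');
            cleaned.append(fields[i].trim());
        }
        context.write(NullWritable.get(), new Text(cleaned.toString()));
    }
}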

TECHNICAL SKILLS:

Big Data Ecosystem: MapReduce, HDFS, Hive 2.3, HBase 1.2, Pig, Sqoop, Flume 1.8, HDP, Oozie, ZooKeeper, Spark, Kafka, Storm, Hue

Hadoop Distributions: Cloudera (CDH3, CDH4, CDH5), Hortonworks

Cloud Services: Amazon AWS, EC2, Redshift, MS Azure

Relational Databases: Oracle 12c, MySQL, MS SQL Server 2016

NoSQL Databases: HBase, Cassandra, and MongoDB

Version Control: GIT, GitLab, SVN

Java/J2EE Technologies: Servlets, JSP, JDBC, JSTL, EJB, JAXB, JAXP, JMS, JAX-RPC, JAX-WS

Programming Languages: Java, Scala, Python, SQL, PL/SQL, HiveQL, UNIX Shell Scripting.

Software Development & Testing Lifecycle: UML, Design Patterns (Core Java and J2EE), SDLC (Waterfall and Agile), STLC

Web Technologies: JavaScript, CSS, HTML and JSP.

Operating Systems: Windows, UNIX/Linux and Mac OS.

Build Management Tools: Maven, Ant.

IDEs & Command-Line Tools: Eclipse, IntelliJ, Toad, and NetBeans.

PROFESSIONAL EXPERIENCE:

Confidential - St. Louis, MO

Sr. Big Data Developer

Responsibilities:

  • Worked as a Sr. Big Data Developer with Hadoop Ecosystems components.
  • Developed Big Data solutions focused on pattern matching and predictive modeling.
  • Involved in Agile methodologies, daily scrum meetings, and sprint planning.
  • Primarily involved in the data migration process using Azure, integrating with a GitHub repository and Jenkins.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Involved in various phases of development; analyzed and developed the system following the Agile Scrum methodology.
  • Worked on MongoDB by using CRUD (Create, Read, Update and Delete), Indexing, Replication and Sharding features.
  • Designed the HBase row key to store text and JSON as key values, structuring the key so that rows can be retrieved and scanned in sorted order.
  • Developed Spark code and Spark-SQL/Streaming for faster testing and processing of data.
  • Used the Java Persistence API (JPA) framework for object-relational mapping based on POJO classes.
  • Responsible for fetching real time data using Kafka and processing using Spark and Scala.
  • Worked on Kafka to import real time weblogs and ingested the data to Spark Streaming.
  • Developed business logic using Kafka Direct Stream in Spark Streaming and implemented business transformations (a hedged sketch follows this list).
  • Maintained Hadoop, Hadoop ecosystem components, and databases with updates/upgrades, performance tuning, and monitoring.
  • Developed custom Hive UDFs and UDAFs in Java, set up JDBC connectivity to Hive, and developed and executed Pig scripts and Pig UDFs.
  • Used Hadoop YARN to perform analytics on data in Hive.
  • Developed and maintained batch data flows using HiveQL and Unix shell scripting.
  • Used Oozie and Zookeeper operational services for coordinating cluster and scheduling workflows.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Scala, and Python.
  • Configured Spark Streaming to receive real-time data from Kafka and stored the stream data to HDFS using Scala.
  • Developed SQL scripts using Spark to handle different data sets and verified their performance against MapReduce jobs.
  • Used J2EE design patterns like Factory pattern & Singleton Pattern.
  • Involved in converting MapReduce programs into Spark transformations using Spark RDDs with Scala and Python.
  • Developed and executed data pipeline testing processes and validated business rules and policies.
  • Built code for real-time data ingestion using Java, MapR Streams (Kafka), and Storm.
  • Extensively used jQuery to provide a dynamic user interface and client-side validations.
  • Responsible for defining the data flow within the Hadoop ecosystem and directing the team in implementing it.
  • Worked extensively with importing metadata into Hive and migrated existing tables and applications to work on Hive and Spark.
  • Built large-scale data processing systems in data warehousing solutions and worked with unstructured data mining on NoSQL databases.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Specified the cluster size, allocated resource pools, and configured the Hadoop distribution by writing specifications in JSON format.
  • Created Hive tables and loaded and analyzed data using Hive queries.
  • Wrote Hive Queries for analyzing data in Hive warehouse using Hive Query Language (HQL).
  • Developed Hive queries to process the data and generate the data cubes for visualizing.
  • Involved in running Hadoop jobs for processing millions of records of text data.
  • Created and maintained Technical documentation for launching Hadoop Clusters and for executing Hive queries.
  • Used Struts, an open-source MVC framework, for creating elegant, modern Java web applications.
  • Continuously coordinated with the QA, production support, and deployment teams.
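
A minimal sketch of the Kafka Direct Stream pattern referenced in this list, assuming the spark-streaming-kafka-0-10 integration; the broker address, consumer group, topic name, and HDFS output path are hypothetical placeholders.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class WeblogStream {
    public static void main(String[] args) throws InterruptedException {
        JavaStreamingContext jssc = new JavaStreamingContext(
                new SparkConf().setAppName("weblog-stream"), Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");   // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "weblog-consumers");         // hypothetical group
        kafkaParams.put("auto.offset.reset", "latest");

        // Direct stream: each Kafka partition maps to one Spark partition.
        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Collections.singletonList("weblogs"), kafkaParams));

        // Business transformation, then land each batch in HDFS.
        stream.map(ConsumerRecord::value)
              .filter(line -> !line.isEmpty())
              .dstream()
              .saveAsTextFiles("hdfs:///data/weblogs/batch", "txt");

        jssc.start();
        jssc.awaitTermination();
    }
}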

Environment: Agile, Hadoop 3.0, MS Azure, MapReduce, Java, MongoDB 4.0.2, HBase 1.2, JSON, Scala 2.12, Oozie 4.3, ZooKeeper 3.4, J2EE, Python 3.7, jQuery, NoSQL, MVC, Struts 2.5.17, Hive 2.3

Confidential - Chicago, IL

Sr. Spark/Hadoop Developer

Responsibilities:

  • As a Spark/Hadoop Developer, worked on Hadoop ecosystem components including Hive, MongoDB, ZooKeeper, and Spark Streaming with the MapR distribution.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Involved in various phases of development; analyzed and developed the system following the Agile Scrum methodology.
  • Designed the HBase row key to store text and JSON as key values, structuring the key so that rows can be retrieved and scanned in sorted order.
  • Used CloudWatch Logs to move application logs to S3 and created alarms based on exceptions raised by applications.
  • Used Kibana, an open-source browser-based analytics and search dashboard for Elasticsearch.
  • Maintained Hadoop, Hadoop ecosystem components, and databases with updates/upgrades, performance tuning, and monitoring.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Prepared data analytics processing and data egress to make analytics results available to visualization systems, applications, or external data stores.
  • Built large-scale data processing systems in data warehousing solutions and worked with unstructured data mining on NoSQL databases.
  • Responsible for design and development of Spark SQL Scripts based on Functional Specifications.
  • Used AWS services like EC2 and S3 for small data sets processing and storage.
  • Provisioned Cloudera Director AWS instances and added the Cloudera Manager repository to scale up the Hadoop cluster in AWS.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Specified the cluster size, allocated resource pools, and configured the Hadoop distribution by writing specifications in JSON format.
  • Developed Spark applications using Scala and Java, and implemented an Apache Spark data processing project to handle data from various RDBMS and streaming sources.
  • Wrote Hive Queries for analyzing data in Hive warehouse using Hive Query Language (HQL).
  • Created and maintained Technical documentation for launching Hadoop Clusters and for executing Hive queries.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Used Spark SQL on DataFrames to access Hive tables from Spark for faster processing of data (a hedged sketch follows this list).
  • Configured Spark Streaming to receive real-time data from Kafka and stored the stream data to HDFS using Scala.
  • Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
  • Developed Pig Latin scripts to extract data from the web server output files to load into HDFS.
  • Developed data pipeline using MapReduce, Flume, Sqoop and Pig to ingest customer behavioral data into HDFS for analysis.
  • Used different Spark modules: Spark Core, Spark SQL, Spark Streaming, Datasets, and DataFrames.
  • Used Spark for interactive queries, processing of streaming data, and integration with popular NoSQL databases for huge volumes of data.
  • Designed and developed automation test scripts using Python.
  • Integrated Apache Storm with Kafka to perform web analytics and to move clickstream data from Kafka to HDFS.
  • Used the Spark-Cassandra Connector to load data to and from Cassandra.
  • Handled importing data from different data sources into HDFS using Sqoop, then performed transformations using Hive and MapReduce and loaded the results back into HDFS.
  • Exported the analyzed data to the relational databases using Sqoop, to further visualize and generate reports for the BI team.
  • In the preprocessing phase of data extraction, used Spark to remove missing data and transform the data to create new features.
  • Developed batch scripts to fetch data from AWS S3 storage and perform the required transformations in Scala using the Spark framework.
  • Collected and aggregated large amounts of log data using Flume and staged the data in HDFS for further analysis.
  • Wrote Pig scripts to transform raw data from several data sources into baseline data.
  • Analyzed the SQL scripts and designed the solution for implementation using PySpark.
  • Analyzed the data by performing Hive queries (HiveQL) and running Pig scripts (Pig Latin) to study customer behavior.
  • Extracted large volumes of data feed on different data sources, performed transformations and loaded the data into various Targets.
  • Developed data-driven web applications and deployed scripts using HTML, XHTML, CSS, and client-side scripting with JavaScript.
  • Involved in loading and transforming large sets of structured, semi-structured, and unstructured data, and analyzed them by running Hive queries and Pig scripts.
  • Assisted in cluster maintenance, monitoring, and troubleshooting, and managed and reviewed data backups and log files.
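
A hedged sketch of the Spark SQL-on-Hive access and missing-data preprocessing described in this list; the table name, column names, and output path are hypothetical, and enableHiveSupport() assumes the job can reach the cluster's Hive metastore.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class HiveTablePrep {
    public static void main(String[] args) {
        // Hive support lets Spark SQL resolve tables from the Hive metastore.
        SparkSession spark = SparkSession.builder()
                .appName("hive-table-prep")
                .enableHiveSupport()
                .getOrCreate();

        Dataset<Row> weblogs = spark.table("weblogs"); // hypothetical Hive table

        // Preprocessing: drop rows with missing values before deriving features.
        Dataset<Row> cleaned = weblogs.na().drop();
        cleaned.createOrReplaceTempView("weblogs_clean");

        spark.sql("SELECT user_id, COUNT(*) AS hits "
                + "FROM weblogs_clean GROUP BY user_id ORDER BY hits DESC")
             .write()
             .mode("overwrite")
             .parquet("hdfs:///data/weblogs/agg"); // hypothetical output path

        spark.stop();
    }
}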

Environment: Hadoop 3.0, Spark 2.3, MapReduce, Java, MongoDB, HBase 1.2, JSON, Hive 2.3, Zookeeper 3.4, AWS, MySQL, Scala 2.12, Python, Cassandra 3.11, HTML5, JavaScript

Confidential - Eden Prairie, MN

Hadoop Developer

Responsibilities:

  • Extensively worked on Hadoop ecosystem components including Hive and Spark Streaming with the MapR distribution.
  • Implemented J2EE Design Patterns like DAO, Singleton, and Factory.
  • Managed connectivity using JDBC for querying/inserting & data management including triggers and stored procedures.
  • Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase database and Sqoop.
  • Upgraded the Hadoop cluster from CDH3 to CDH4, set up a High Availability cluster, and integrated Hive with existing applications.
  • Worked on NoSQL support for enterprise production and loaded data into HBase using Impala and Sqoop.
  • Performed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Built Hadoop solutions for big data problems using MR1 and MR2 on YARN.
  • Handled importing of data from various data sources, performed transformations using Hive and Pig, and loaded data into HDFS.
  • Moved data using Sqoop between HDFS and relational database systems, and handled maintenance and troubleshooting.
  • Developed a Java/J2EE-based multi-threaded application built on top of the Struts framework.
  • Used the Spring MVC framework to enable interactions with the JSP/view layer and implemented different design patterns with J2EE and XML technologies.
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop, using Spark Context, Spark SQL, DataFrames, and pair RDDs (a hedged sketch follows this list).
  • Created Hive Tables, loaded claims data from Oracle using Sqoop and loaded the processed data into target database.
  • Involved in PL/SQL query optimization to reduce the overall run time of stored procedures.
  • Exported data from HDFS to RDBMS via Sqoop for Business Intelligence, visualization and user report generation.
  • Implemented the J2EE design patterns Data Access Object (DAO), Session Façade and Business Delegate.
  • Developed NiFi flows dealing with various data formats such as XML, JSON, and Avro.
  • Implemented MapReduce jobs in HIVE by querying the available data.
  • Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Used Cloudera Manager for installation and management of Hadoop Cluster.
  • Collaborated with business users/product owners/developers to contribute to the analysis of functional requirements.
  • Implemented the application using MVC architecture, integrating the Hibernate and Spring frameworks.
  • Utilized various JavaScript and jQuery libraries, Bootstrap, and Ajax for form validation and other interactive features.
  • Involved in converting HiveQL into Spark transformations using Spark RDDs and Scala programming.
  • Integrated Kafka with Spark Streaming for high-throughput, reliable data processing.
  • Worked on tuning Hive and Pig scripts to improve performance and resolved performance issues in both.
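
As an illustration of converting a HiveQL aggregation into Spark pair-RDD transformations (done in Scala on this project; a Java equivalent is sketched here to keep the examples in one language), assuming a hypothetical comma-delimited claims file with member ID and amount columns:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class ClaimTotals {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
                new SparkConf().setAppName("claim-totals"));

        // RDD equivalent of: SELECT member_id, SUM(amount) FROM claims GROUP BY member_id
        JavaRDD<String> lines = sc.textFile("hdfs:///data/claims"); // hypothetical path

        JavaPairRDD<String, Double> totals = lines
                .map(line -> line.split(","))                       // assumed CSV layout
                .filter(fields -> fields.length >= 2)
                .mapToPair(fields ->
                        new Tuple2<>(fields[0], Double.parseDouble(fields[1])))
                .reduceByKey(Double::sum);

        totals.saveAsTextFile("hdfs:///data/claims_totals");
        sc.stop();
    }
}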

Environment: Hadoop 3.0, Hive 2.1, J2EE, JDBC, Pig 0.16, HBase 1.1, Sqoop, NoSQL, Impala, Java, Spring, MVC, XML, Spark 1.9, PL/SQL, HDFS, JSON, Hibernate, Bootstrap, jQuery, JavaScript, Ajax

Confidential - Holmdel, NJ

Java/J2EE Developer

Responsibilities:

  • As a Java/J2EE Developer, worked on middleware architecture using Java technologies such as J2EE and Servlets, and application servers such as WebSphere and WebLogic.
  • Worked as a Java/J2EE Developer to manage data and to develop web applications.
  • Implemented MVC architecture by separating the business logic from the presentation layer using Spring.
  • Involved in documentation and use-case design using UML modeling, including development of class diagrams, sequence diagrams, and use-case diagrams.
  • Extensively worked on an n-tier architecture system with application development using Java, JDBC, Servlets, JSP, Web Services, WSDL, SOAP, Spring, Hibernate, XML, SAX, and DOM.
  • Extensively used Eclipse IDE for developing, debugging, integrating and deploying the application.
  • Developed the UI using HTML, CSS, Bootstrap, jQuery, and JSP for interactive cross-browser functionality and a complex user interface.
  • Developed Service layer interfaces by applying business rules to interact with DAO layer for transactions.
  • Developed various UML diagrams, including use cases, class diagrams, interaction diagrams (sequence and collaboration), and activity diagrams.
  • Involved in requirements gathering and performed object-oriented analysis, design, and implementation.
  • Used the Spring MVC framework for writing controllers, validations, and views.
  • Provided utility classes for the application using Core Java and extensively used the Collections package.
  • Used Core Spring for Dependency Injection of various component layers.
  • Used SOA REST (JAX-RS) web services to provide web services to and consume them from downstream systems (a hedged sketch follows this list).
  • Developed web-based reporting for a credit monitoring system with HTML, CSS, XHTML, JSTL, and custom tags using Spring.
  • Developed user interface using JSP, JSP Tag libraries and Struts Tag Libraries to simplify the complexities of the application.
  • Implemented business logic using POJOs and used WebSphere to deploy the applications.
  • Used the build tools Maven to package JAR and WAR files and Ant to bundle all source files and web content into WAR files.
  • Worked on various SOAP and RESTful services used in various internal applications.
  • Developed JSP and Java classes for various transactional/ non-transactional reports of the system using extensive SQL queries.
  • Worked on analyzing Hadoop cluster and different big data analytic tools including MapReduce, Hive and Spark.
  • Implemented Storm topologies to pre-process data before moving into HDFS system.
  • Implemented POC to migrate MapReduce programs into Spark transformations using Spark and Scala.
  • Involved in configuring builds using Jenkins with Git and used Jenkins to deploy the applications onto the Dev and QA environments.
  • Involved in unit testing, system integration testing and enterprise user testing using JUnit.
  • Used Maven to build, run and create Aerial-related JARs and WAR files among other uses.
  • Used JUnit for unit testing of the system and Log4J for logging.
  • Worked with production support team in debugging and fixing various production issues.
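
A minimal JAX-RS sketch of the kind of REST endpoint this list describes providing to downstream systems; the resource path, field names, and payload are hypothetical.

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

// Hypothetical resource exposing account status to downstream consumers.
@Path("/accounts")
public class AccountResource {

    @GET
    @Path("/{id}/status")
    @Produces(MediaType.APPLICATION_JSON)
    public Response status(@PathParam("id") String id) {
        // A real implementation would delegate to the service/DAO layer.
        String json = "{\"id\":\"" + id + "\",\"status\":\"ACTIVE\"}";
        return Response.ok(json).build();
    }
}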

Environment: Java, Spring 3.0, XML, Hibernate 3.0, JavaScript, JUnit, HTML 4.0.1, CSS, Ajax, Bootstrap, AngularJS, WebSphere, Maven 3.0, Eclipse

Confidential

Java Developer

Responsibilities:

  • Responsible for design and implementation of various modules of the application using Struts-Spring-Hibernate architecture.
  • Created user-friendly GUI interface and Web pages using HTML, CSS and JSP.
  • Developed web components using MVC pattern under Struts framework.
  • Wrote JSPs and Servlets and deployed them on the WebLogic application server.
  • Used JSPs and HTML on the front end, Servlets as front controllers, and JavaScript for client-side validations.
  • Wrote Hibernate mapping XML files to define Java class-to-database table mappings.
  • Developed the UI using JSP, HTML, CSS, and AJAX, and implemented jQuery, JSP, and client- and server-side validations using JavaScript.
  • Implemented MVC architecture using Spring to send and receive data between the front end and the business layer.
  • Designed, developed and maintained the data layer using JDBC and performed configuration of Java Application Framework.
  • Extensively used Hibernate in data access layer to access and update information in the database.
  • Migrated the Servlets to the Spring Controllers and developed Spring Interceptors, worked on JSPs, JSTL, and JSP Custom Tags.
  • Used Jenkins for continuous integration, with SVN for version control and JUnit and Mockito for unit testing; created design documents and test cases for development work.
  • Worked in the Eclipse IDE as the front-end development environment, performing insert, update, and retrieval operations on an Oracle database by writing stored procedures.
  • Responsible for writing Struts action classes and Hibernate POJO classes, and integrating Struts and Hibernate with Spring to process business needs.
  • Developed the application using Servlets and JSP for the presentation layer along with JavaScript for the client side validations.
  • Wrote Hibernate classes and DAOs to retrieve and store data, and configured Hibernate files (a hedged sketch follows this list).
  • Used WebLogic for application deployment and Log4j for logging/debugging.
  • Used CVS as the version control tool and Ant as the project build tool.
  • Used various Core Java concepts such as multi-threading, Exception Handling, Collection APIs to implement various features and enhancements.
  • Wrote and debugged Maven scripts for building the entire web application.
  • Designed and developed Ajax calls to populate parts of screens on demand.
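
A minimal sketch of the Hibernate DAO pattern described in this list, using the Hibernate 3-era Session API; the Order entity and its mapping file are hypothetical.

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

// Hypothetical entity; in the real project it would be mapped via Order.hbm.xml.
class Order {
    private Long id;
    private String status;
    // getters/setters omitted for brevity
}

// DAO: open a session, wrap writes in a transaction, roll back on failure.
public class OrderDao {

    private final SessionFactory sessionFactory;

    public OrderDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    public void save(Order order) {
        Session session = sessionFactory.openSession();
        Transaction tx = null;
        try {
            tx = session.beginTransaction();
            session.save(order);
            tx.commit();
        } catch (RuntimeException e) {
            if (tx != null) tx.rollback();
            throw e;
        } finally {
            session.close();
        }
    }

    public Order findById(Long id) {
        Session session = sessionFactory.openSession();
        try {
            return (Order) session.get(Order.class, id);
        } finally {
            session.close();
        }
    }
}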

Environment: Struts, HTML, CSS, JSP, MVC, Hibernate, AJAX, jQuery, Java, Jenkins, Ant, Maven
