Sr. Big Data/Hadoop Developer Resume
Phoenix, AZ
SUMMARY
- 7+ years of IT experience in software analysis, design, development, testing and implementation of Big Data, Hadoop, NoSQL and Java/J2EE technologies.
- Experience using various Hadoop distributions (Cloudera, Hortonworks, MapR, etc.) to fully implement and leverage new Hadoop features.
- Installed Kafka on a Hadoop cluster and wrote producer and consumer code in Java to stream tweets with popular hashtags from Twitter into HDFS.
- Experience working with Data Frames, RDD, Spark SQL, Spark Streaming, APIs, System Architecture, and Infrastructure Planning.
- Experience with Core Java components: Collections, Generics, Inheritance, Exception Handling and Multi-threading.
- Very good understanding of NoSQL databases like MongoDB and HBase.
- Experience with major components of the Hadoop ecosystem including Hive, Sqoop and Flume, and knowledge of the MapReduce/HDFS framework.
- Hands-on programming experience in various technologies like Java, J2EE, HTML and XML.
- Very good experience in developing and deploying applications using WebLogic, Apache Tomcat, and JBoss.
- Experience in working with developer toolkits like Force.com IDE, Force.com Ant Migration Tool, Eclipse IDE and Maven.
- Experience in Object Oriented Analysis, Design (OOAD) and development of software using UML Methodology, good knowledge of J2EE design patterns and Core Java design patterns.
- Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
- Experience in using Apache Flume for collecting, aggregating and moving large volumes of data from various sources such as web servers and telnet sources.
- Experience in installation, configuration and deployment of Big Data solutions.
- Knowledge of implementing Big Data solutions on Amazon Elastic MapReduce (Amazon EMR) for processing data and managing the Hadoop framework on dynamically scalable Amazon EC2 instances.
- Experience in setting up standards and processes for Hadoop-based application design and implementation.
- Solid understanding of Hadoop MRv1 and Hadoop MRv2 (YARN) architecture.
- Hands on experience in configuring and administering the Hadoop Cluster using major Hadoop Distributions like Apache Hadoop and Cloudera.
- Expertise in using XML related technologies such as XML, DTD, XSD, XPATH, XSLT, DOM, SAX, JAXP, JSON and JAXB.
- Excellent knowledge of Hadoop architecture, including HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.
- Experience in implementing Spark solutions to enable real-time reports from Cassandra data.
- Hands-on expertise in designing row keys and schemas for NoSQL databases like MongoDB.
- Experience in extracting files from MongoDB through Sqoop, placing them in HDFS and processing them.
- Hands-on experience with Spark Core, Spark SQL and the DataFrame/Dataset/RDD APIs.
- Experience in using Kafka and Kafka brokers to initiate a Spark context and process live streaming data with RDDs.
- Developed Java applications using various IDEs like Spring Tool Suite and Eclipse.
- Good knowledge in using Hibernate for mapping Java classes with database and using Hibernate Query Language (HQL).
- Worked on Java/J2EE systems with different databases, including Oracle, MySQL and DB2.
- Built secure AWS solutions by creating VPCs with private and public subnets.
- Extensive experience with application servers like WebLogic, WebSphere, JBoss and GlassFish, and web servers like Apache Tomcat.
TECHNICAL SKILLS
Hadoop/Big Data Technologies: Hadoop 3.0, HDFS, MapReduce, HBase 1.4, Apache Pig 0.17, Hive 2.3, Sqoop 1.4, Apache Impala 3.0, Oozie 4.3, Yarn, Apache Flume 1.8, Kafka 1.1, Zookeeper 3.4
Hadoop Distributions: Cloudera, Hortonworks, MapR
Cloud: AWS, Azure, Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, HDInsight, Azure Data Lake and Data Factory.
Programming Language: Java, Scala 2.12, Python 3.6, SQL, PL/SQL, Shell Scripting, Storm 1.0, JSP, Servlets
Frameworks: Spring 5.0.5, Hibernate 5.2, Struts 1.3, JSF, EJB, JMS
Web Technologies: HTML5, CSS, JavaScript, JQuery 3.3, Bootstrap 4.1, XML, JSON, AJAX
Databases: Oracle 12c/11g, SQL
Database Tools: TOAD, SQL PLUS, SQL
Operating Systems: Linux, Unix, Windows 10/8/7
IDE and Tools: Eclipse 4.7, NetBeans 8.2
NoSQL Databases: HBase 1.4, Cassandra 3.11, MongoDB
Web/Application Server: Apache Tomcat 9.0.7, JBoss, WebLogic, WebSphere
SDLC Methodologies: Agile, Waterfall
Version Control: Git, SVN, CVS, Maven
PROFESSIONAL EXPERIENCE
Confidential - Phoenix, AZ
Sr. Big Data/Hadoop Developer
Responsibilities:
- Worked as a Sr. Big Data/Hadoop Developer with Hadoop ecosystem components.
- Developed Big Data solutions focused on pattern matching and predictive modeling.
- Worked on analyzing the Hadoop cluster and different big data analytic tools including Pig, HBase and Sqoop.
- Involved in Agile methodologies, daily scrum meetings and sprint planning.
- Primarily involved in Data Migration process using Azure by integrating with GitHub repository and Jenkins.
- Used Kibana, an open-source, browser-based analytics and search dashboard for Elasticsearch.
- Used the Java Persistence API (JPA) for object-relational mapping based on POJO classes.
- Upgraded the Hadoop cluster from CDH3 to CDH4, set up a High Availability cluster and integrated Hive with existing applications.
- Designed and developed a flattened view (merged and flattened dataset) by de-normalizing several datasets in Hive/HDFS.
- Supported enterprise NoSQL production and loaded data into HBase using Impala and Sqoop.
- Ran multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
- Built Hadoop solutions for big data problems using MR1 and MR2 in YARN.
- Handled importing of data from various data sources, performed transformations using Hive, Pig, and loaded data into HDFS.
- Involved in identifying job dependencies to design workflows for Oozie and YARN resource management.
- Designed solutions for various system components using Microsoft Azure.
- Explored Spark to improve the performance of and optimize existing Hadoop algorithms using Spark Context, Spark SQL, DataFrames and pair RDDs.
- Created Hive Tables, loaded claims data from Oracle using Sqoop and loaded the processed data into target database.
- Exported data from HDFS to RDBMS via Sqoop for Business Intelligence, visualization and user report generation.
- Developed NiFi flows dealing with various kinds of data formats such as XML, JSON and Avro.
- Developed and designed data integration and migration solutions in Azure.
- Worked on a proof of concept using Spark with Scala and Kafka.
- Worked on visualizing the aggregated datasets in Tableau.
- Worked on importing data from HDFS to a MySQL database and vice versa using Sqoop.
- Implemented MapReduce jobs in Hive by querying the available data.
- Configured the Hive Metastore with MySQL, which stores the metadata for Hive tables.
- Performed data analytics in Hive and then exported those metrics back to Oracle Database using Sqoop.
- Tuned the performance of Hive queries and MapReduce programs for different applications.
- Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
- Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
- Used Cloudera Manager for installation and management of Hadoop Cluster.
- Developed data pipeline using Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Involved in converting HiveQL into Spark transformations using Spark RDDs and Scala.
- Integrated Kafka with Spark Streaming for high throughput and reliability.
- Used Apache Flume to collect and aggregate large amounts of log data and stored it in HDFS for further analysis.
- Tuned Hive and Pig scripts to improve performance and resolved performance issues in both.
Environment: Hadoop 3.0, Agile, Pig 0.17, HBase 1.4.3, Jenkins 2.12, NoSQL, Sqoop 1.4, Impala 3.0.0, Hive 2.3, MapReduce, YARN, Oozie, Microsoft Azure, NiFi, Avro, MySQL, Kafka, Scala 2.12, Spark, Apache Flume 1.8
Confidential - Lowell, AR
Hadoop Developer
Responsibilities:
- Extensively worked on the Hadoop ecosystem, including Hive and Spark Streaming, with the MapR distribution.
- Implemented J2EE design patterns like DAO, Singleton, and Factory.
- Managed connectivity using JDBC for querying, inserting and data management, including triggers and stored procedures.
- Worked on analyzing the Hadoop cluster and different big data analytic tools including Pig, HBase and Sqoop.
- Upgraded the Hadoop cluster from CDH3 to CDH4, set up a High Availability cluster and integrated Hive with existing applications.
- Supported enterprise NoSQL production and loaded data into HBase using Impala and Sqoop.
- Ran multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
- Built Hadoop solutions for big data problems using MR1 and MR2 in YARN.
- Handled importing of data from various data sources, performed transformations using Hive and Pig, and loaded data into HDFS.
- Transferred data with Sqoop between HDFS and relational database systems in both directions, and handled maintenance and troubleshooting.
- Developed a Java/J2EE-based multi-threaded application built on top of the Struts framework.
- Used the Spring MVC framework to enable interactions between the JSP/View layer and implemented different design patterns with J2EE and XML technology.
- Explored Spark to improve the performance of and optimize existing Hadoop algorithms using Spark Context, Spark SQL, DataFrames and pair RDDs.
- Created Hive Tables, loaded claims data from Oracle using Sqoop and loaded the processed data into target database.
- Involved in PL/SQL query optimization to reduce the overall run time of stored procedures.
- Exported data from HDFS to RDBMS via Sqoop for Business Intelligence, visualization and user report generation.
- Implemented the J2EE design patterns Data Access Object (DAO), Session Façade and Business Delegate.
- Developed NiFi flows dealing with various kinds of data formats such as XML, JSON and Avro.
- Implemented MapReduce jobs in Hive by querying the available data.
- Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
- Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
- Used Cloudera Manager for installation and management of Hadoop Cluster.
- Collaborated with business users/product owners/developers to contribute to the analysis of functional requirements.
- Implemented the application using MVC architecture, integrating the Hibernate and Spring frameworks.
- Utilized JavaScript and jQuery libraries, Bootstrap and Ajax for form validation and other interactive features.
- Involved in converting HiveQL into Spark transformations using Spark RDDs and Scala.
- Integrated Kafka with Spark Streaming for high throughput and reliability.
- Tuned Hive and Pig scripts to improve performance and resolved performance issues in both.
Environment: Hadoop 3.0, Hive 2.1, J2EE, JDBC, Pig 0.16, HBase 1.1, Sqoop, NoSQL, Impala, Java, Spring, MVC, XML, Spark 1.9, PL/SQL, HDFS, JSON, Hibernate, Bootstrap, JQuery, JavaScript, Ajax
Confidential - San Francisco, CA
Sr. Java/Hadoop Developer
Responsibilities:
- Designed and developed application modules using the Spring and Hibernate frameworks.
- Responsible for building scalable distributed data solutions using Hadoop.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Used Maven for developing build scripts and deploying the application onto WebLogic.
- Implemented Spark RDD transformations to map business analysis and applied actions on top of the transformations.
- Developed Spark jobs and Hive Jobs to summarize and transform data.
- Involved in converting Hive/SQL queries into Spark transformations using Spark DataFrames, Scala and Python.
- Implemented MVC architecture using the Spring Framework; coding involved writing Action classes, custom tag libraries and JSPs.
- Implemented Spark applications in Scala using higher-order functions for both batch and interactive analysis requirements.
- Created Hive tables with periodic backups and wrote complex Hive/Impala queries to run on Impala.
- Implemented partitioning and bucketing in Hive and worked with file formats and compression techniques for optimization.
- Involved in designing and developing modules at both Client and Server Side.
- Worked on JDBC framework encapsulated using DAO pattern to connect to the database.
- Developed the UI screens using JSP and HTML and implemented client-side validation with JavaScript.
- Worked on various SOAP and RESTful services used in various internal applications.
- Developed JSP and Java classes for various transactional/non-transactional reports of the system using extensive SQL queries.
- Worked on analyzing Hadoop cluster and different big data analytic tools including MapReduce, Hive and Spark.
- Implemented Storm topologies to pre-process data before moving it into HDFS.
- Implemented POC to migrate MapReduce programs into Spark transformations using Spark and Scala.
- Involved in configuring builds using Jenkins with Git and used Jenkins to deploy the applications onto the Dev and QA environments.
- Involved in unit testing, system integration testing and enterprise user testing using JUnit.
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Developed Shell, Perl and Python scripts to automate and provide control flow to Pig scripts.
Environment: Spring 4.0, Hibernate 5.0.7, Hadoop 2.6.5, Spark 1.1, Hive, Python 3.3, Scala, Sqoop, Flume 1.3.1, Impala, MapReduce, Linux
Confidential
Java/J2EE Developer
Responsibilities:
- As a Java/J2EE developer, worked on both the back-end and front-end development teams.
- Designed and developed various modules of the application with J2EE design architecture and frameworks like Spring MVC and Spring Bean Factory, using IoC and AOP concepts.
- Implemented Java/J2EE design patterns such as Factory, DAO, Session Façade, and Singleton.
- Used Hibernate in the persistence layer and developed POJOs and Data Access Objects (DAOs) to handle all database operations.
- Used Maven as the build tool, Git for version control, Jenkins for continuous integration and JIRA as a defect tracking tool.
- Developed the user interface components using HTML, CSS, JavaScript, AJAX, JQuery and also created custom tags.
- Implemented the project structure based on the Spring MVC pattern using Spring Boot.
- Used JavaScript to validate input and manipulate HTML elements.
- Developed external JavaScript files that can be reused across several different web pages.
- Developed Web pages using JSP, HTML, CSS, Struts Tag libs and AJAX for the Credit Risk module.
- Used Spring Beans to encapsulate business logic and Implemented Application MVC Architecture using Spring MVC framework.
- Developed XMLs, JavaScript and Java classes for dynamic HTML generation to perform the server side processing on the client requests.
- Used JUnit framework for unit testing of application and Maven to build the application and deployed on Jetty server.
- Developed unit test cases in JUnit and documented all the test scenarios as per the user specifications.
- Implemented the Spring framework based on the Model-View-Controller design paradigm.
- Involved in development activities using Core Java/J2EE, Servlets, JSP and JSF for creating the web application, along with XML and Spring.
- Used the Spring Framework for dependency injection and integrated it with Hibernate.
- Developed a RESTful web services client to consume JSON messages using Spring JMS configuration.
- Developed services using Spring IoC and the Hibernate persistence layer with Oracle Database.
- Implemented a build script using Ant for compiling, building and deploying the application on the WebSphere application server.
Environment: J2EE, Java, Spring MVC 3.0, POJO, Jenkins, HTML, JavaScript, AJAX, JQuery, CSS, XML, Maven, JUnit, Hibernate 4.2.8, ANT, Oracle 10g
