Sr. Big Data Developer Resume
Charlotte, NC
SUMMARY
- 7+ years of professional experience in developing, integrating, analyzing, deploying, and maintaining Big Data and Hadoop ecosystems.
- Experience in installation, configuration and deployment of Big Data solutions.
- Experience in installing, customizing, and testing Big Data and Hadoop ecosystem components such as Hive, Pig, Sqoop, Spark, and Oozie.
- Experience in Hadoop Map Reduce, Pig, Hive, Oozie, Sqoop, Flume, Zookeeper.
- Excellent experience with AWS and the Cloudera and Hortonworks Hadoop distributions, including maintaining and optimizing AWS infrastructure (EC2 and EBS); good knowledge of MS Azure.
- Hands on experience with NoSQL Databases like HBase, Cassandra and relational databases like Oracle and MySQL.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Spark SQL in Scala.
- Proficient in Java, Collections, J2EE, Servlets, JSP, Spring, Hibernate, JDBC/ODBC.
- Experience in importing and exporting data using Sqoop from HDFS to relational database systems and vice versa.
- Experience in setting up Test, QA, and Prod environment.
- Experience in extending Hive and Pig core functionality by writing custom UDFs using Java (a minimal UDF sketch follows this list).
- Experience in developing MapReduce (YARN) jobs for cleansing, accessing, and validating data.
- Experience in all stages of the SDLC (Agile, Waterfall): writing technical design documents, development, testing, and implementation of enterprise-level data marts and data warehouses.
- Experience with different distributions such as Cloudera, Hortonworks, and MapR.
- Experience in creating different visualizations using bars, lines, pies, maps, scatter plots, bubbles, histograms, bullets, heat maps, and highlight tables.
- Expertise in web page development using JSP, HTML, JavaScript, jQuery, and AJAX.
- Experience in writing database objects like Stored Procedures, Functions, Triggers, PL/SQL packages and Cursors for Oracle, SQL Server, and MySQL.
- Involved in efficiently converting JSON files to XML and CSV files in Talend.
- Experience working with structured and unstructured data in various file formats such as Avro, XML, JSON, SequenceFile, ORC, and Parquet.
- Expertise with Application servers and web servers like Oracle WebLogic, IBM WebSphere and Apache Tomcat.
- Experience working in environments using Agile (Scrum) and Waterfall methodologies.
- Expertise in database modeling and development using SQL and PL/SQL on MySQL and Teradata.
- Strong programming experience using Java, Scala, Python and SQL.
- Expertise in developing production-ready Spark applications utilizing the Spark Core, DataFrame, Spark SQL, Spark ML, and Spark Streaming APIs.
- Experience in working with NoSQL databases such as HBase, Cassandra, and MongoDB.
- Good experience in developing applications using Java, J2EE, JSP, MVC, EJB, JMS, JSF, Hibernate, AJAX, and web-based development tools.
- Worked extensively with dimensional modeling, data migration, data cleansing, data profiling, and ETL processes for data warehouses.
- Good experience using modern version control systems such as GitHub and Bitbucket.
- Experience with web app development with NodeJS, ExpressJS, HTML, CSS and JavaScript.
- Extensive experience with advanced J2EE frameworks such as Spring, Struts, JSF, and Hibernate.
- Worked on Bootstrap, AngularJS, NodeJS, Knockout, Ember, and the Java Persistence API (JPA).
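To illustrate the custom UDF experience noted above, the following is a minimal sketch of a Hive UDF in Java; the class name and normalization logic are hypothetical rather than drawn from any specific project listed here:

```java
// Hypothetical Hive UDF: trims and upper-cases a string column.
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class NormalizeStringUDF extends UDF {
    // Hive calls evaluate() once per row; a null input yields a NULL output.
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}
```

Such a class is typically packaged into a JAR, registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION, and then used like any built-in function in HiveQL.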
TECHNICAL SKILLS
Hadoop/Big Data Technologies: Hadoop 3.0, HDFS, MapReduce, HBase 1.4, Apache Pig, Hive 2.3, Sqoop 1.4, Apache Impala 2.1, Oozie 4.3, Yarn, Apache Flume 1.8, Kafka 1.1, Zookeeper
Hadoop Distributions: Cloudera, Hortonworks, MapR
Cloud: AWS, Azure, Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, HDInsight, Azure Data Lake and Data Factory.
Programming Language: Java, Scala, Python 3.6, SQL, PL/SQL, Shell Scripting, Storm 1.0, JSP, Servlets
Frameworks: Spring 5.0.5, Hibernate 5.2, Struts 1.3, JSF, EJB, JMS
Web Technologies: HTML, CSS, JavaScript, JQuery 3.3, Bootstrap 4.1, XML, JSON, AJAX
Databases: Oracle 12c/11g, SQL
Database Tools: TOAD, SQL*Plus, SQL
Operating Systems: Linux, Unix, Windows 10/8/7
IDE and Tools: Eclipse 4.7, NetBeans 8.2, IntelliJ, Maven
NoSQL Databases: HBase 1.4, Cassandra 3.11, MongoDB
Web/Application Server: Apache Tomcat 9.0.7, JBoss, WebLogic, WebSphere
SDLC Methodologies: Agile, Waterfall
Version Control: Git, SVN, CVS
PROFESSIONAL EXPERIENCE
Confidential - Charlotte, NC
Sr. Big Data Developer
Responsibilities:
- As a Big Data Developer, worked on Hadoop ecosystem components including Hive, MongoDB, Zookeeper, and Spark Streaming with the MapR distribution.
- Followed the Agile methodology on the development project and used JIRA to manage issues and the project workflow.
- Worked in Azure environment for development and deployment of Custom Hadoop Applications.
- Designed and implemented scalable Cloud Data and Analytical architecture solutions for various public and private cloud platforms using Azure.
- Involved in the end-to-end process of Hadoop jobs that used various technologies such as Sqoop, Pig, Hive, MapReduce, Spark, and shell scripts (for scheduling a few jobs).
- Implemented various Azure platforms such as Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, HDInsight, Azure Data Lake and Data Factory.
- Extracted and loaded data into the Data Lake environment (MS Azure) using Sqoop; the data was accessed by business users and data scientists.
- Managed and supported enterprise data warehouse operations and big data advanced predictive application development using Cloudera and Hortonworks HDP.
- Designed, developed and maintained Big Data streaming and batch applications using Storm.
- Worked with the Spark ecosystem using Spark SQL and Scala queries on different file formats such as text and CSV.
- Implemented and configured workflows using Oozie to automate jobs.
- Participated in managing and reviewing Hadoop log files.
- Used Elasticsearch and MongoDB for storing and querying the offers and non-offers data.
- Developed web applications using Servlets, JSP, JDBC, and EJB, and web services using the JAX-WS and JAX-RS APIs.
- Improved the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
- Developed a Spark job in Java that indexes data into Elasticsearch from external Hive tables stored in HDFS (a simplified sketch follows this list).
- Used Spark Streaming to receive real-time data from Kafka and store the streamed data to HDFS using Scala and in NoSQL databases such as HBase and Cassandra.
- Performed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
- Built Hadoop solutions for big data problems using MR1 and MR2 (YARN).
- Handled importing of data from various data sources, performed transformations using Hive and Pig, and loaded the data into HDFS.
- Involved in identifying job dependencies to design workflows for Oozie and YARN resource management.
- Used Windows Azure SQL Reporting Services to create reports with tables, charts, and maps.
- Configured Oozie workflows to run multiple Hive and Pig jobs independently based on time and data availability.
- Imported and exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Supported the Cloud Strategy team in integrating analytical capabilities into an overall cloud architecture and business case development.
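A simplified sketch of the kind of Spark SQL job in Java referenced above: it reads an external Hive table and writes the result to HDFS as Parquet. The Elasticsearch indexing step is omitted, and the table, column, and path names are hypothetical:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class OffersExportJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("OffersExportJob")
                .enableHiveSupport()   // allows Spark SQL to read Hive metastore tables
                .getOrCreate();

        // Equivalent of a Hive query expressed through Spark SQL
        Dataset<Row> offers = spark.sql(
                "SELECT offer_id, customer_id, amount FROM ext_offers WHERE amount > 0");

        // Persist the result to HDFS for downstream consumers
        offers.write().mode("overwrite").parquet("hdfs:///data/curated/offers");

        spark.stop();
    }
}
```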
Environment: Azure, Hadoop 3.0, Sqoop 1.4.6, Pig 0.17, Hive 2.3, MapReduce, Spark 2.2.1, shell scripts, SQL, Hortonworks, Python 3.6, MLlib, HDFS, YARN 2.9, Java, Kafka 1.0, Cassandra 3.11, Oozie 4.3
Confidential - Peoria IL
Sr. Big Data/Hadoop Developer
Responsibilities:
- Involved in story-driven Agile development methodology and actively participated in daily scrum meetings.
- Worked on all activities related to the development, implementation and support for Hadoop.
- Designed custom reusable templates in NiFi for code reusability and interoperability.
- Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH4 distribution.
- Worked with teams in setting up AWS EC2 instances, using different AWS services such as S3, EBS, Elastic Load Balancing, Auto Scaling groups, VPC subnets, and CloudWatch.
- Responsible for managing data coming from different sources; involved in HDFS maintenance and loading of structured and unstructured data.
- Worked with the Kafka streaming tool to load data into HDFS and exported it into a MongoDB database.
- Created partitions and buckets based on state for further processing using bucket-based Hive joins.
- Installed and configured Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, Zookeeper, and Sqoop.
- Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing (a minimal mapper sketch follows this list).
- Wrote complex Hive queries and UDFs in Java and Python.
- Worked on provisioning AWS EC2 infrastructure and deploying applications behind Elastic Load Balancing.
- Worked with the systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters; experienced in converting MapReduce applications to Spark.
- Worked with the Hadoop ecosystem covering HDFS, HBase, YARN, and MapReduce.
- Used Scala and Spark SQL to develop Spark code for faster processing and testing, and performed complex Hive queries on Hive tables.
- Wrote and executed SQL queries to work with structured data in relational databases and to validate the transformation/business logic.
- Used Flume to move data from individual data sources to the Hadoop system.
- Used the MRUnit framework to test MapReduce code.
- Responsible for building scalable distributed data solutions using the Hadoop ecosystem and Spark.
- Involved in data acquisition and pre-processing of various types of source data using StreamSets.
- Responsible for the design and development of Spark SQL scripts using Scala/Java based on functional specifications.
- Worked with Cassandra (NoSQL) to store, retrieve, update, and manage all the details for Ethernet provisioning and customer order tracking.
- Analyzed data by performing Hive queries (HiveQL), running Pig scripts, Spark SQL, and Spark Streaming.
- Developed tools using Python, shell scripting, and XML to automate some of the menial tasks.
- Wrote scripts in Python for extracting data from HTML files.
- Implemented MapReduce jobs in Hive by querying the available data.
- Configured the Hive metastore with MySQL, which stores the metadata for Hive tables.
- Performed data analytics in Hive and then exported those metrics back to an Oracle database using Sqoop.
- Performed performance tuning of Hive queries and MapReduce programs for different applications.
- Proactively involved in ongoing maintenance, support, and improvements of the Hadoop cluster.
- Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
- Used Cloudera Manager for installation and management of Hadoop Cluster.
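As a minimal sketch of the data-cleansing MapReduce work mentioned above (the field layout, delimiter, and counter names are hypothetical):

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Drops malformed records and emits each valid record keyed by its first field.
public class CleansingMapper extends Mapper<LongWritable, Text, Text, Text> {

    private static final int EXPECTED_FIELDS = 5; // assumed record width

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",", -1);

        // Skip records that lack an id or the expected number of fields
        if (fields.length != EXPECTED_FIELDS || fields[0].trim().isEmpty()) {
            context.getCounter("cleansing", "malformed_records").increment(1);
            return;
        }

        // Emit the record id as the key and the trimmed record as the value
        context.write(new Text(fields[0].trim()), new Text(value.toString().trim()));
    }
}
```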
Environment: NiFi 1.1, Hadoop 2.6, JSON, XML, Avro, HDFS, Teradata r15, Sqoop, Kafka, MongoDB, Hive 2.3, Pig 0.17, HBase, Zookeeper, MapReduce, Java, Python 3.6, YARN, Flume, NoSQL, Cassandra 3.11
Confidential - Wilmington, DE
Sr. Java/Hadoop Developer
Responsibilities:
- Gathered the business requirements from the Business Partners and Subject Matter Experts.
- Supported HBase architecture design with the Hadoop Architect team to develop a database design in HDFS (a small client-side sketch follows this list).
- Supported MapReduce programs running on the cluster and wrote MapReduce jobs using the Java API.
- Involved in HDFS maintenance and loading of structured and unstructured data.
- Imported data from mainframe datasets to HDFS using Sqoop.
- Handled importing of data from various data sources (e.g., Oracle, DB2, Cassandra, and MongoDB) to Hadoop and performed transformations using Hive and MapReduce.
- Created mock-ups using HTML and JavaScript to understand the flow of the web application.
- Integrated Cassandra with Talend and automated jobs.
- Used the Struts framework to develop the MVC architecture and modularize the application.
- Wrote Hive queries for data analysis to meet the business requirements.
- Involved in managing and reviewing Hadoop log files.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Utilized the Agile Scrum methodology to help manage and organize work with developers, including regular code review sessions.
- Upgraded the Hadoop cluster from CDH4 to CDH5 and set up a high-availability cluster to integrate Hive with existing applications.
- Analyzed the data by performing Hive queries and running Pig scripts to understand user behavior.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Developed Hive queries to process the data and generate data cubes for visualization.
- Optimized mappings using various optimization techniques and debugged existing mappings using the Debugger to test and fix them.
- Used SVN version control to maintain the different versions of the application.
- Updated maps, sessions, and workflows as part of ETL changes, modified existing ETL code, and documented the changes.
- Involved in coding, maintaining, and administering EJB, Servlet, and JSP components to be deployed on a WebLogic Server.
- Worked on importing data from HDFS to a MySQL database and vice versa using Sqoop.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Extracted meaningful data from unstructured data on the Hadoop ecosystem.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
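The following is a small, hypothetical sketch of how a row might be written and read with the HBase Java client, in the spirit of the HBase design work described above; the table, column family, and row-key layout are illustrative only:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class CustomerOrderStore {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("customer_orders"))) {

            // Write one row keyed by customer id + order id
            Put put = new Put(Bytes.toBytes("cust42#order7"));
            put.addColumn(Bytes.toBytes("o"), Bytes.toBytes("status"), Bytes.toBytes("SHIPPED"));
            table.put(put);

            // Read the same row back
            Result result = table.get(new Get(Bytes.toBytes("cust42#order7")));
            byte[] status = result.getValue(Bytes.toBytes("o"), Bytes.toBytes("status"));
            System.out.println("status = " + Bytes.toString(status));
        }
    }
}
```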
Environment: Hadoop 2.4, Java, MapReduce, HDFS, Hive 2.0, Pig 0.15, Linux, XML, Eclipse, Cloudera CDH4/5 distribution, DB2, SQL Server, Oracle 11g, MySQL, WebLogic Application Server 8.1, EJB 2.0, Struts 1.1
Confidential
Java/J2EE Developer
Responsibilities:
- As a Java/J2EE Developer, my role was to design, develop, deploy, and maintain websites and applications.
- Involved in the Software Development Life Cycle (SDLC) of the application: requirement gathering, design analysis, and code development.
- Implemented the Struts framework based on the Model-View-Controller design paradigm.
- Designed the application by implementing Struts based on the MVC architecture, with simple Java Beans as the Model, JSP UI components as the View, and the ActionServlet as the Controller.
- Used JNDI to perform lookup services for the various components of the system.
- Involved in designing and developing dynamic web pages using HTML and JSP with Struts tag libraries.
- Responsible for designing rich user interface applications using JavaScript, CSS, HTML, and AJAX, and developed web services using SoapUI.
- Used JPA to persistently store large amounts of data in the database.
- Implemented modules using Java APIs, Java Collections, threads, and XML, and integrated the modules.
- Applied J2EE design patterns such as Factory, Singleton, Business Delegate, DAO, Front Controller, and MVC.
- Used JPA for the management of relational data in the application.
- Designed and developed business components using Session and Entity Beans in EJB.
- Developed EJBs (stateless session beans) to handle different transactions with the service providers.
- Developed JMS senders and receivers for loose coupling between the other modules and implemented asynchronous request processing using Message-Driven Beans.
- Used JDBC for data access from Oracle tables (a minimal DAO sketch follows this list).
- Successfully installed and configured the IBM WebSphere Application server and deployed the business tier components using EAR file.
- Used Maven for build framework and Jenkins for continuous build system.
- Deployed the application in a JBoss application server environment.
- Provided SQL scripts and PL/SQL stored procedures for querying the database.
- Used Eclipse for writing JSPs, Struts components, and other Java code snippets.
- Used the JUnit framework for unit testing of the application and ClearCase for version control.
- Built the application using ANT and used Log4j to generate log files for the application.
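A minimal sketch of the JDBC data-access style referenced above; the DAO name, table, and columns are hypothetical, and in the deployed application the DataSource would come from the application server rather than be constructed locally:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

import javax.sql.DataSource;

public class AccountDao {
    private final DataSource dataSource;

    public AccountDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // Returns the names of all active accounts for a customer
    public List<String> findActiveAccountNames(long customerId) throws SQLException {
        String sql = "SELECT account_name FROM accounts WHERE customer_id = ? AND status = 'ACTIVE'";
        List<String> names = new ArrayList<>();
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, customerId);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    names.add(rs.getString("account_name"));
                }
            }
        }
        return names;
    }
}
```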
Environment: Java, J2EE, Hibernate 4.3, JSON, XML, HTML, CSS, JavaScript, AJAX, jQuery, Apache Tomcat, Maven, JBoss, PL/SQL, Eclipse, JUnit, ANT