Sr. Big Data/Hadoop Developer Resume
Harrisburg, PA
SUMMARY
- Overall 10 years of IT experience as a Big Data/Hadoop Developer across all phases of the Software Development Life Cycle, with hands-on experience in Java/J2EE technologies and Big Data.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as MapReduce, HDFS, HBase, Hive, Sqoop, Pig, ZooKeeper, Storm, Spark, Kafka, and Flume.
- Well versed in Amazon Web Services (AWS) cloud services such as EC2, S3, EBS, RDS, and VPC.
- Improved the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Proficient in Core Java and enterprise technologies such as EJB, Hibernate, Java web services (SOAP and REST), threads, sockets, Servlets, JSP, and JDBC.
- Good exposure to Service-Oriented Architectures (SOA) built on web services (WSDL) using the SOAP protocol.
- Wrote multiple MapReduce programs in Python for data extraction, transformation, and aggregation across file formats including XML, JSON, CSV, and compressed formats (see the streaming sketch after this list).
- Experience working across the Hadoop ecosystem, with some experience installing and configuring the Hortonworks and Cloudera distributions (CDH3 and CDH4).
- Experience with the NoSQL databases HBase, MongoDB, and Cassandra.
- Good understanding of Hadoop architecture and hands-on experience with Hadoop components such as the JobTracker, TaskTracker, NameNode, DataNode, and MapReduce programming.
- Experience in importing and exporting data between HDFS and RDBMS using Sqoop.
- Extracted and processed streaming log data from various sources and integrated it into HDFS using Flume.
- Extensively worked with varied data sources, from non-relational sources such as XML files (using SAX and DOM parsers) to relational databases such as Oracle and MySQL.
- Experience working with application servers such as IBM WebSphere, JBoss, BEA WebLogic, and Apache Tomcat.
- Extensive experience in internet and client/server technologies using Java, J2EE, Struts, Hibernate, Spring, HTML, DHTML, CSS, JavaScript, XML, and Perl.
- Expert in deploying code through application servers such as WebSphere, WebLogic, and Apache Tomcat in the AWS cloud.
- Expertise in Core Java, J2EE, multithreading, JDBC, Hibernate, shell scripting, Servlets, JSP, Spring, Struts, EJB, web services, XML, JPA, JMS, and JNDI, and proficient in using Java APIs for application development.
- Good working experience with application and web servers such as JBoss and Apache Tomcat.
- Experience writing Pig and Hive scripts and extending their core functionality with custom UDFs.
- Converted HiveQL queries into Spark transformations using Spark RDDs and Scala.
- Integrated Kafka with Spark Streaming for high-throughput, reliable processing.
- Worked on Apache Flume to collect and aggregate huge amounts of log data and stored it in HDFS for further analysis.
- Extensive experience with Agile development, object modeling using UML, and the Rational Unified Process (RUP).
- Strong knowledge of Object-Oriented Programming (OOP) concepts, including polymorphism, abstraction, inheritance, and encapsulation.
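As an illustration of the Python MapReduce work noted above, here is a minimal Hadoop Streaming sketch; the CSV layout ("user_id,page,duration") and field positions are assumptions for the example, not project specifics:

```python
#!/usr/bin/env python
# page_counts.py -- serves as both -mapper and -reducer for Hadoop Streaming
# ("page_counts.py map" / "page_counts.py reduce").
# The "user_id,page,duration" CSV layout is a hypothetical example.
import csv
import sys

def mapper():
    for row in csv.reader(sys.stdin):
        if len(row) >= 2:
            print("%s\t1" % row[1])          # emit (page, 1) for the shuffle

def reducer():
    current_key, count = None, 0
    for line in sys.stdin:                   # lines arrive sorted by key
        key, value = line.rstrip("\n").split("\t", 1)
        if key != current_key:
            if current_key is not None:
                print("%s\t%d" % (current_key, count))
            current_key, count = key, 0
        count += int(value)
    if current_key is not None:
        print("%s\t%d" % (current_key, count))

if __name__ == "__main__":
    mapper() if sys.argv[1:] == ["map"] else reducer()
```

Such a script is submitted through the hadoop-streaming JAR, with the same file passed as both the -mapper ("page_counts.py map") and the -reducer ("page_counts.py reduce").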
TECHNICAL SKILLS
Big Data/Hadoop: Hadoop 2.7/2.5, HDFS 1.2.4, MapReduce, Hive, Pig, Sqoop, Oozie, Hue, Flume, Kafka, Spark 2.0/2.0.2
NoSQL Databases: HBase, MongoDB 3.2, Cassandra
Java/J2EE Technologies: Servlets, JSP, JDBC, JSTL, EJB, JAXB, JAXP, JMS, JAX-RPC, JAX-WS
Programming Languages: Java, Python, Scala, SQL, PL/SQL, HiveQL, Unix Shell Scripting
IDEs and Tools: Eclipse 4.6, NetBeans 8.2, BlueJ
Databases: Oracle 12c/11g, MySQL, SQL Server 2016/2014
Web Technologies: HTML5/4, DHTML, AJAX, JavaScript, jQuery, CSS3/2, JSP, Bootstrap 3/3.5
Application Servers: Apache Tomcat, JBoss, IBM WebSphere, WebLogic
Operating Systems: Windows 8/7, UNIX/Linux, and Mac OS
Other Tools: Maven, ANT, WSDL, SOAP, REST
Methodologies: Software Development Life Cycle (SDLC), Waterfall, Agile, STLC (Software Testing Life Cycle), UML, Design Patterns (Core Java and J2EE)
PROFESSIONAL EXPERIENCE
Confidential - Harrisburg, PA
Sr. Big Data/Hadoop Developer
Responsibilities:
- Analyzed the Hadoop cluster and various big data analytics tools, including Pig, Hive, HBase, and Sqoop.
- Installed Hadoop (MapReduce, HDFS) and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
- Worked in AWS EC2, configuring servers for Auto Scaling and Elastic Load Balancing.
- Gathered requirements from the client and estimated timelines for developing complex Hive and Impala queries for a logistics application.
- Designed and developed Spark SQL scripts based on functional specifications (a sketch follows this section).
- Explored Spark to improve the performance and optimization of existing Hadoop algorithms.
- Provided cluster coordination services through ZooKeeper.
- Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
- Exported data from HDFS into an RDBMS using Sqoop for report generation and visualization.
- Developed simple to complex jobs in Spark using Scala and Java.
- Developed data pipelines using Flume and Sqoop to ingest cargo data and customer histories into HDFS for analysis.
- Imported and exported data between HDFS and Oracle using Sqoop, and configured the Hive metastore with MySQL, which stores the metadata for Hive tables.
- Wrote Hive and Pig scripts per requirements and automated the workflow using shell scripts.
- Automated the jobs that extract data from sources such as MySQL and push result sets into the Hadoop Distributed File System, using the Oozie workflow scheduler.
- Participated in Rapid Application Development and Agile processes to deliver new cloud platform services.
- Responsible for writing Hive queries to analyze data in the Hive warehouse using Hive Query Language (HQL).
- Imported and exported data between MySQL/Oracle and Hive using Sqoop.
Environment: Apache Hadoop, Hive, ZooKeeper, MapReduce, Sqoop, Crunch API, Pig, HCatalog, Unix, Java, Oracle, SQL Server, MySQL, Oozie, Python.
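As a flavor of the Spark SQL scripting referenced above, here is a minimal PySpark sketch; the database, table, and column names (logistics.shipments, carrier, ship_date) are hypothetical:

```python
# Minimal Spark SQL-over-Hive sketch (PySpark, Spark 2.x); names are illustrative.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("logistics-report")
         .enableHiveSupport()        # use the cluster's Hive metastore
         .getOrCreate())

# HiveQL runs directly through Spark SQL and returns a DataFrame.
loads_by_carrier = spark.sql("""
    SELECT carrier, COUNT(*) AS total_loads
    FROM logistics.shipments
    WHERE ship_date >= '2017-01-01'
    GROUP BY carrier
    ORDER BY total_loads DESC
""")

loads_by_carrier.show()
```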
Confidential - Atlanta, GA
Sr. Big Data/Hadoop Developer
Responsibilities:
- Collaborated with business users/developers to contribute to the analysis of functional requirements.
- Analyzed the Hadoop cluster and various big data analytics tools, including Pig, HBase, and Sqoop.
- Upgraded the Hadoop cluster from CDH3 to CDH4, set up a High Availability cluster, and integrated Hive with existing applications.
- Worked on AWS, provisioning EC2 infrastructure and deploying applications behind Elastic Load Balancing.
- Designed and developed a flattened view (merged and flattened dataset) denormalizing several datasets in Hive/HDFS, consisting of key attributes consumed by the business and other downstream consumers.
- Worked on NoSQL (HBase) to support enterprise production, loading data into HBase using Impala and Sqoop.
- Identified job dependencies to design workflows for Oozie and YARN resource management.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Ran multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
- Handled importing of data from various data sources, performed transformations using Hive and Pig, and loaded the data into HDFS.
- Created tables in HBase to store variable data formats of PII data coming from different portfolios.
- Imported and exported data between HDFS and MySQL using Sqoop.
- Implemented MapReduce jobs in Hive by querying the available data.
- Configured the Hive metastore with MySQL, which stores the metadata for Hive tables.
- Moved data between HDFS and relational database systems using Sqoop, and handled ongoing maintenance and troubleshooting.
- Explored Spark to improve the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, and pair RDDs.
- Created Hive tables, loaded claims data from Oracle using Sqoop, and loaded the processed data into the target database.
- Responsible for writing Hive queries to analyze data in the Hive warehouse using Hive Query Language (HQL).
- Exported data from HDFS to RDBMS via Sqoop for business intelligence, visualization, and user report generation.
- Worked on a proof of concept with Spark, Scala, and Kafka (a streaming sketch follows this section).
- Worked on visualizing the aggregated datasets in Tableau.
- Performed data analytics in Hive and then exported those metrics back to Oracle Database using Sqoop.
- Tuned the performance of Hive queries and MapReduce programs for different applications.
- Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
- Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
- Used Cloudera Manager for installation and management of Hadoop Cluster.
- Developed data pipelines using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Specified the cluster size, allocated resource pools, and configured the Hadoop distribution by writing specifications in JSON format.
- Tuned Hive and Pig scripts to improve performance and resolved performance issues in both.
Environment: HDFS, MapReduce, Pig, Hive, Sqoop, Oracle 12c, Flume, Oozie, HBase, Impala, Spark Streaming, YARN, Eclipse, Spring, PL/SQL, UNIX Shell Scripting, Cloudera.
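The Kafka proof of concept above was done in Scala; purely to illustrate the pattern, here is a PySpark Streaming sketch against the Kafka 0.8 direct-stream API of that Spark generation, with the broker address, topic name, and record layout invented:

```python
# Hypothetical Kafka -> Spark Streaming sketch (PySpark, Kafka 0.8 direct stream).
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

sc = SparkContext(appName="kafka-poc")
ssc = StreamingContext(sc, 10)                       # 10-second micro-batches

stream = KafkaUtils.createDirectStream(
    ssc,
    topics=["claims-events"],                        # made-up topic name
    kafkaParams={"metadata.broker.list": "broker1:9092"})

# Records arrive as (key, value) pairs; count events per type within each
# batch, assuming a "type,payload" value layout.
counts = (stream.map(lambda kv: kv[1].split(",", 1)[0])
                .map(lambda t: (t, 1))
                .reduceByKey(lambda a, b: a + b))
counts.pprint()

ssc.start()
ssc.awaitTermination()
```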
Confidential - Dallas, TX
Sr. Hadoop Developer
Responsibilities:
- Worked on Spark SQL to handle structured data in Hive.
- Created Hive tables, loaded data, wrote Hive queries, and created partitions and buckets for optimization.
- Migrated tables from RDBMS into Hive using Sqoop and later generated visualizations using Tableau.
- Worked on complex MapReduce programs to analyze data residing on the cluster.
- Analyzed substantial datasets by running Hive queries and Pig scripts.
- Wrote Hive UDFs to sort struct fields and return complex data types.
- Worked in AWS environment for development and deployment of custom Hadoop applications.
- Created shell scripts to simplify the execution of the other scripts (Pig, Hive, Sqoop, Impala, and MapReduce) and to move data in and out of HDFS.
- Created files and tuned SQL queries in Hive using Hue.
- Involved in collecting and aggregating large amounts of log data using Storm and staging data in HDFS for further analysis.
- Created the Hive external tables using Accumulo connector.
- Managed real time data processing and real time Data Ingestion in MongoDB and Hive using Storm.
- Created custom Solr query components to optimize search matching.
- Developed Spark scripts in Python (a sketch follows this section).
- Stored the processed results in the data warehouse and maintained the data using Hive.
- Created Oozie workflow and coordinator jobs to kick off jobs on schedule and on data availability.
- Worked with NoSQL databases such as MongoDB, creating collections to load large sets of semi-structured data.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability.
- Worked with Amazon Web Services (AWS) cloud services such as EC2, S3, and EMR.
Environment: HDFS, MapReduce, Storm, Hive, Pig, Sqoop, MongoDB, Apache Spark, Python, Accumulo, Oozie Scheduler, Kerberos, AWS, Tableau, Java, UNIX Shell scripts, HUE, SOLR, GIT, Maven.
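A minimal example of the kind of PySpark script mentioned above; the HDFS path, field names, and target table are assumptions for the sketch:

```python
# Hypothetical PySpark script: aggregate raw JSON events from HDFS into a
# Hive warehouse table (path, fields, and table name are illustrative).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("daily-event-metrics")
         .enableHiveSupport()
         .getOrCreate())

events = spark.read.json("hdfs:///data/raw/events/")   # one JSON record per line

daily = (events.withColumn("day", F.to_date(F.col("event_time")))
               .groupBy("day", "event_type")
               .count())

# Persist the metrics for downstream reporting (e.g., Tableau).
daily.write.mode("overwrite").saveAsTable("analytics.daily_event_counts")
```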
Confidential - Warren, NJ
Sr. Java/Hadoop Developer
Responsibilities:
- Developed Pig UDFs to manipulate data according to business requirements and developed custom Pig loaders.
- Developed Java MapReduce programs to transform log data into a structured form for deriving user location, age group, and time spent.
- Implemented row-level updates and real-time analytics on Cassandra data using CQL (a sketch follows this section).
- Collected and aggregated large amounts of web log data from different sources such as web servers, mobile and network devices using Apache Flume and stored the data into HDFS for analysis.
- Developed Pig scripts for the analysis of semi-structured data.
- Worked on the ingestion of files into HDFS from remote systems using MFT (Managed File Transfer).
- Used Hibernate Transaction Management, Hibernate Batch Transactions, and cache concepts.
- Analyzed web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration, and the most purchased products on the website.
- Integrated Oozie with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Map-Reduce, Pig, Hive, and Sqoop) as well as system specific jobs (such as Java programs and shell scripts).
- Designed and implemented MapReduce based large-scale parallel processing.
- Developed and updated the web tier modules using Struts 2.1 Framework.
- Modified the existing JSP pages using JSTL.
- Implemented Struts Validator for automated validation.
- Utilized Hibernate for object/relational mapping for transparent persistence onto SQL Server.
- Built and deployed EAR, WAR, and JAR files to test and staging systems on the WebLogic Application Server.
- Developed Java and J2EE applications using Rapid Application Development (RAD), Eclipse.
- Used the Singleton, DAO, DTO, Session Facade, and MVC design patterns.
- Wrote complex SQL and PL/SQL queries for stored procedures.
- Developed the reference architecture for an e-commerce SOA environment.
- Created and populated custom tables, and analyzed and maintained custom and package indexes in relation to process performance.
- Used CVS for version controlling and JUnit for unit testing.
Environment: Eclipse, Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, MySQL, Cassandra, Java, Shell Scripting, SQL.
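The CQL work above was done from the Java stack; to illustrate a row-level update in CQL itself, here is a small sketch using the DataStax Python driver, with the contact point, keyspace, table, and values all invented:

```python
# Hypothetical CQL row-level update via the DataStax Python driver.
from cassandra.cluster import Cluster

cluster = Cluster(["cassandra-host"])            # made-up contact point
session = cluster.connect("weblogs")             # made-up keyspace

# Prepared statement: update one row identified by its full primary key.
update = session.prepare(
    "UPDATE page_views SET views = ? WHERE user_id = ? AND view_date = ?")
session.execute(update, (42, "u123", "2014-06-01"))

cluster.shutdown()
```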
Confidential - Northbrook, IL
Sr. Java/J2EE Developer
Responsibilities:
- Worked on designing and developing the Web Application User Interface and implemented its related functionality in Java/J2EE for the product.
- Used JSF framework to implement MVC design pattern.
- Developed and coordinated complex, high-quality solutions for clients using J2SE, J2EE, Servlets, JSP, HTML, Struts, Spring MVC, SOAP, JavaScript, jQuery, JSON, and XML.
- Wrote JSF managed beans, converters and validators following framework standards and used explicit and implicit navigations for page navigations.
- Designed and developed Persistence layer components using Hibernate ORM tool.
- Designed the UI using JSF tags, Apache Tomahawk, and RichFaces.
- Used Oracle 10g as the backend to store and fetch data.
- Used IDEs such as Eclipse and NetBeans, with Maven integration.
- Created real-time reporting systems and dashboards using XML, MySQL, and Perl.
- Worked on RESTful web services that enforced a stateless client-server model and supported JSON, migrating parts of the API from SOAP to REST (a client sketch follows this section).
- Involved in detailed analysis based on the requirement documents.
- Involved in Design, development and testing of web application and integration projects using Object Oriented technologies such as Core Java, J2EE, Struts, JSP, JDBC, Spring Framework, Hibernate, Java Beans, Web Services (REST/SOAP), XML, XSLT, XSL and Ant.
- Designed and implemented SOA-compliant management and metrics infrastructure for the Mule ESB, utilizing the SOA management components.
- Used Node.js for server-side rendering and implemented Node.js modules to integrate with designs and requirements. Used JAX-WS for the front-end module to interact with the back-end module, as they run on two different servers.
- Responsible for offshore deliverables; provided design and technical help to the team and reviewed work to meet quality standards and timelines.
- Migrated existing Struts application to Spring MVC framework.
- Provided and implemented numerous solution ideas to improve the performance and stabilize the application.
- Extensively used LDAP with Microsoft Active Directory for user authentication at login.
- Developed unit test cases using JUnit.
- Created the project from scratch using AngularJS for the frontend and Node.js/Express for the backend.
- Developed Perl scripts as well as other scripts such as JavaScript.
- Deployed the OMS web application on the Tomcat web server.
- Used the Perl SOAP::Lite module to communicate with different web services based on the given WSDL.
- Prepared technical reports and documentation manuals during program development.
Environment: JDK 1.5, JSF, Hibernate 3.6, JIRA, Node.js, CruiseControl, Log4j, Tomcat, LDAP, JUnit, NetBeans, Windows/UNIX.
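To illustrate the stateless JSON contract of such a RESTful service, here is a small stdlib-only Python client sketch; the endpoint URL and payload shape are invented for the example:

```python
# Hypothetical JSON-over-REST client (endpoint and payload are illustrative).
import json
from urllib import request

payload = json.dumps({"orderId": "A-100", "status": "SHIPPED"}).encode("utf-8")
req = request.Request(
    "http://example.com/api/orders",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST")

# Each call is self-contained: no server-side session state is assumed.
with request.urlopen(req) as resp:
    print(json.loads(resp.read()))
```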
Confidential
Java Developer
Responsibilities:
- Developed using new Java 1.5 features: annotations, generics, the enhanced for loop, and enums.
- Used Struts and Hibernate, implementing IoC, AOP, and ORM for back-end tiers.
- Designed the system per changing requirements using the Struts MVC architecture, JSP, and DHTML.
- Designed the application using J2EE patterns.
- Developed Java Beans for business logic.
- Designed REST APIs that allow sophisticated, effective, and low-cost application integrations.
- Developed the presentation layer using Struts Framework.
- Developed the persistence layer using Hibernate ORM to transparently store objects in the database.
- Responsible for coding all the JSPs and Servlets used in the module.
- Developed JSPs, Servlets, and various beans using the WebSphere server.
- Wrote Java utility classes common to all of the applications.
- Designed and implemented highly intuitive, user friendly GUI from scratch using Drag and Drop with Java Swing and CORBA.
- Extensively used multithreading concepts.
- Deployed the jar files in the Web Container on the IBM WebSphere Server 5.x.
- Designed and developed the screens in HTML with client side validations in JavaScript.
- Developed the server side scripts using JMS, JSP and Java Beans.
- Added and modified Hibernate configuration code and Java/SQL statements depending on the specific database access requirements.
- Designed database tables, views, and indexes, and created triggers for optimized data access.
- Analyzed and fine-tuned RDBMS/SQL queries to improve application performance against the database.
- Created XML-based configuration and property files for the application and developed parsers using JAXP, SAX, and DOM technologies (a SAX-style sketch follows this section).
- Developed web services using the Apache Axis tool.
Environment: Java 1.5, Struts MVC, JSP, Hibernate 3.0, JUnit, UML, XML, CSS, HTML, Oracle 9i, Eclipse, JavaScript, WebSphere 5.x, Rational Rose, ANT.
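The project's parsers were written in Java with JAXP/SAX/DOM; the same event-driven SAX model looks like this in a minimal Python sketch, with the element and attribute names invented:

```python
# Minimal SAX-style parser sketch (Python's xml.sax mirrors the JAXP/SAX
# event model); element/attribute names and the file name are illustrative.
import xml.sax

class PropertyHandler(xml.sax.ContentHandler):
    """Collects <property name="...">value</property> pairs from a config file."""
    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.properties = {}
        self._name, self._chunks = None, []

    def startElement(self, tag, attrs):
        if tag == "property":
            self._name, self._chunks = attrs.get("name"), []

    def characters(self, content):
        if self._name is not None:
            self._chunks.append(content)

    def endElement(self, tag):
        if tag == "property" and self._name is not None:
            self.properties[self._name] = "".join(self._chunks).strip()
            self._name = None

handler = PropertyHandler()
xml.sax.parse("app-config.xml", handler)
print(handler.properties)
```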