Sr. Big Data Engineer Resume
Edison, NJ
SUMMARY
- 8+ years of professional experience in all phases of the Software Development Life Cycle (SDLC), including hands-on experience with Java/J2EE technologies and Big Data analytics.
- Strong experience working with Hadoop distributions such as Cloudera, Hortonworks, MapR, and Apache.
- Extensive experience working with Oracle, SQL Server, and MySQL databases.
- Excellent understanding and knowledge of NoSQL databases like MongoDB, HBase, and Cassandra.
- Experience with AWS services such as EMR, EC2, S3, CloudFormation, and Redshift for fast and efficient processing of Big Data.
- Good understanding of R programming and data mining techniques.
- Strong experience and knowledge of real time data analytics using Storm, Kafka, Flume and Spark.
- Experience in troubleshooting errors in HBase Shell, Pig, Hive and MapReduce.
- Extensive experience working with Spark features such as RDD transformations, Spark MLlib, and Spark SQL.
- Extensive experience in middle-tier development using J2EE technologies such as JDBC, JNDI, JSP, Servlets, JSF, Struts, Spring, Hibernate, and EJB.
- Expertise in developing responsive front-end components and web pages with JSP, HTML, XHTML, JavaScript, DOM, Servlets, JSF, NodeJS, Ajax, jQuery, and AngularJS.
- Experience in writing database objects such as stored procedures, functions, triggers, PL/SQL packages, and cursors for Oracle, SQL Server, MySQL, and Sybase databases.
- Experience with Apache Flume for collecting, aggregating, and moving large volumes of data from sources such as web servers and telnet sources.
- Extensively designed and executed SQL queries to ensure data integrity and consistency at the backend.
- Worked with Sqoop to import and export data between databases such as MySQL and Oracle and HDFS/Hive.
- Experience in database design, entity relationships, and database analysis, and in programming SQL, PL/SQL stored procedures, packages, and triggers in Oracle.
- Hands-on experience with NoSQL databases like HBase and Cassandra, and relational databases like Oracle and MySQL.
- Primarily involved in the data migration process using Azure, integrating with GitHub repositories and Jenkins.
- Hands-on experience with real-time streaming into HDFS using Kafka and Spark Streaming.
- Understanding of workload management, scalability and distributed platform architectures.
- Experience in writing MapReduce and YARN jobs, Pig scripts, and Hive queries, and in using Apache Kafka and Storm for data analysis.
- Hands-on experience with NoSQL databases such as HBase and Cassandra, with working knowledge of MongoDB.
- Developed parser and loader MapReduce applications to retrieve data from HDFS and store it in HBase and Hive.
- Imported data from MySQL into HDFS using Sqoop and ingested unstructured data into HDFS using Flume.
- Experience in using PL/SQL to write Stored Procedures, Functions and Triggers.
- Excellent technical and analytical skills with a clear understanding of the design goals of ER modeling for OLTP and dimensional modeling for OLAP.
- Assessed technology stacks and implemented various proofs of concept (POCs) for eventual adoption as part of the Big Data/Hadoop initiative.
TECHNICAL SKILLS
Big Data and Hadoop Technologies: Hadoop 3.0, HDFS, MapReduce, Hive 2.3, Sqoop 1.4, Pig 0.17, Ambari 2.4, Hbase 1.2, MongoDB 3.6, Cassandra 3.11, Spark 2.3, Flume 1.8, Impala 2.10, Kafka 1.0.1, Oozie 4.3, Zookeeper 3.4, Cloudera Manager
Databases: MongoDB 3.6, PostgreSQL, MySQL 5.7, Oracle 12c, SQL Server, DB2, Cassandra 3.11, PL/SQL
Web Technologies: HTML5, CSS3, JavaScript, XML, Servlets, SOAP
Other Data Tools: Neo4j, Storm 1.0.5, RapidMiner
Languages: Java/J2EE, SQL, Shell Scripting, C/C++, Python
Java/J2EE Technologies: JDBC, JavaScript, JSP, Servlets, jQuery
IDE and Build Tools: Eclipse, NetBeans, MS Visual Studio, Ant, Maven, JIRA, Confluence
Version Control: Git, SVN, CVS, TFS
SDLC Methodologies: Agile, Waterfall.
Distributions: Apache Hadoop 3.0, Cloudera CDH3, CDH4.
Operating System: Windows, Unix, Linux.
Scripts: JavaScript, Shell Scripting.
PROFESSIONAL EXPERIENCE
Confidential - Edison, NJ
Sr. Big Data Engineer
Responsibilities:
- Working as a Big Data Engineer on the team handling issues with the firm's proprietary platform.
- Designed and developed a Big Data analytics platform for processing customer viewing preferences and social media comments using Java and Hadoop.
- Designed the application framework, data strategies, tools, and technologies using Big Data and cloud technologies.
- Used Apache Ignite to provide high-performance, cluster-wide messaging for exchanging data via publish-subscribe and direct point-to-point communication models.
- Worked with distributed technologies such as Cassandra, Apache Ignite, and Apache Kafka.
- Installed and configured Hadoop ecosystem components such as Hive, Oozie, and Sqoop on a Cloudera Hadoop cluster, and helped with performance tuning and monitoring.
- Worked in an Agile methodology and used JIRA to maintain project stories.
- Developed analytical solutions, data strategies, tools and technologies for the marketing platform using the Big Data technologies.
- Implemented solutions for ingesting data from various sources utilizing Big Data technologies such as Hadoop, MapReduce frameworks, Sqoop, and Hive.
- Imported and exported data between relational databases such as MySQL and HDFS/HBase using Sqoop.
- Used Hadoop on AWS EC2 instances to gather and analyze log files.
- Independently coded new programs and designed tables to load and test them effectively for the given POCs using Big Data/Hadoop.
- Used Sqoop to efficiently transfer data between databases and HDFS and used Flume to stream the log data from servers.
- Worked on writing Hadoop jobs for analyzing data in text, sequence file, and Parquet formats using Hive and Pig.
- Worked on analyzing Hadoop cluster and different Big Data components including Pig, Hive, Spark, Impala, and Sqoop.
- Estimated software and hardware requirements for the NameNode and DataNodes and planned the cluster.
- Performed rule checks on multiple file formats like XML, JSON, CSV and compressed file formats.
- Developed Spark code using Python and Spark-SQL for faster testing and data processing.
- Created Hive external tables, loaded data into the tables, and queried the data using HQL.
- Imported millions of structured records from relational databases using Sqoop, processed them with Spark, and stored the data in HDFS in CSV format (a brief sketch of this load pattern follows this list).
- Responsible for loading customer data and event logs from Kafka into HBase using a REST API.
- Developed UI models using HTML, JSP, JavaScript, Ajax, Web Link, and CSS.
- Created an XML parser in Python that uses the MapReduce framework to parse large XML files and store the data in HDFS efficiently.
- Helped data scientists create data pipelines from AWS and preprocess that data for modeling and machine learning purposes.
- Used partitioning, bucketing, map-side joins, and parallel execution to optimize Hive queries, decreasing execution time from hours to minutes.
- Worked with NoSQL databases like HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
- Worked with data scientists to create a product recommendation engine for clients using Hadoop and deployed that solution in the AWS cloud.
- Currently working on a data lake initiative to move on-premises data to an AWS cloud data lake and to create monitoring and alerting for data processing jobs.
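The Sqoop-to-Spark-to-Hive load pattern referenced above can be sketched roughly as follows. This is an illustrative example only, not the original production code: the HDFS path, table name, and the load_date partition column are hypothetical assumptions.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class CustomerPrefsLoad {
    public static void main(String[] args) {
        // Spark session with Hive support so saved tables land in the Hive metastore.
        SparkSession spark = SparkSession.builder()
                .appName("customer-prefs-load")
                .enableHiveSupport()
                .getOrCreate();

        // Read the CSV files that Sqoop landed in HDFS (path and columns are placeholders).
        Dataset<Row> prefs = spark.read()
                .option("header", "true")
                .option("inferSchema", "true")
                .csv("hdfs:///data/raw/customer_prefs/");

        // Persist as a Hive table partitioned by an assumed load_date column
        // so downstream HQL queries can prune partitions.
        prefs.write()
                .mode("overwrite")
                .partitionBy("load_date")
                .format("parquet")
                .saveAsTable("analytics.customer_prefs");

        // Example HQL-style query over the new table with Spark SQL.
        spark.sql("SELECT genre, COUNT(*) AS views FROM analytics.customer_prefs "
                + "WHERE load_date = '2018-06-01' GROUP BY genre ORDER BY views DESC")
             .show();

        spark.stop();
    }
}
```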
Environment: Big Data, Spark 2.3, Yarn, Hive 2.3, Flume 1.8, Pig 0.17, Python, Hadoop 3.0, AWS, Databases, Redshift.
Confidential - Wallingford, CT
Sr. Java/Hadoop Engineer
Responsibilities:
- Developed Pig Latin scripts to replace the existing legacy process with Hadoop, with the resulting data fed to AWS S3.
- Worked on MongoDB by using CRUD (Create, Read, Update and Delete), Indexing, Replication and Sharding features.
- Created Talend jobs to read messages from AWS SQS queues and download files from AWS S3 buckets.
- Experience in building solutions using Apache Spark and Apache Ignite.
- Technical knowledge of integration solutions leveraging open-source technologies such as the Spring Boot framework, Kafka, Spark, Apache Ignite, and Drools.
- Worked on analyzing the Hadoop cluster and different Big Data components including Pig, Hive, Spark, HBase, Kafka, Elasticsearch, databases, and Sqoop.
- Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
- Wrote MapReduce jobs to filter and parse inventory data stored in HDFS (see the sketch after this list).
- Configured Flume, Hive, Pig, Sqoop and Oozie on the Hadoop cluster for data pipelining.
- Imported and exported data into the HDFS from the Oracle database using Sqoop.
- Integrated MapReduce with Cassandra to import bulk log data.
- Converted ETL operations to the Hadoop system using Hive transformations and functions.
- Ran streaming jobs in Python to process terabytes of formatted data for machine learning purposes.
- Used Flume to collect, aggregate and store the web log data and loaded it into the HDFS.
- Implemented Partitioning, Dynamic Partitions, Buckets in Hive.
- Developed custom Pig UDFs for product-specific needs.
- Implemented and configured workflows using Oozie to automate jobs.
- Performed Hadoop cluster management and configuration of multiple nodes on AWS.
- Created buckets in AWS to store data and maintained the data repository for future needs and reusability.
- Worked with Tableau developers to help performance-tune visualizations, graphs, and analytics.
- Involved in the cluster coordination services through Zookeeper.
- Participated in the managing and reviewing of the Hadoop log files.
- Used Elastic Search & MongoDB for storing and querying the offers and non-offers data.
- Proficient in developing web applications using Servlets, JSP, JDBC, and EJB 2.0/3.0, and web services using the JAX-WS 2.0 and JAX-RS APIs.
- Imported data from sources such as HDFS/HBase into Spark RDDs and developed a data pipeline using Kafka and Storm to store data in HDFS.
- Used Spark Streaming with Scala to receive real-time data from Kafka and stored the streamed data in HDFS and in NoSQL databases such as HBase and Cassandra.
- Worked with teams to set up AWS EC2 instances using services such as S3, EBS, Elastic Load Balancing, Auto Scaling groups, VPC subnets, and CloudWatch.
- Utilized an SDLC methodology to help manage and organize a team of developers with regular code review sessions.
- Developed RESTful web services using JAX-RS and the GET, POST, PUT, and DELETE HTTP methods.
- Created scalable, high-performance web services for data tracking and implemented high-speed querying.
- Used Java Messaging Services (JMS) for reliable and asynchronous exchange of important information such as payment status report on IBM WebSphere MQ messaging system.
- Executed Hive queries on Parquet tables stored in Hive to perform data analysis to meet the business requirements.
- Created and maintained various shell and Python scripts to automate processes, and optimized MapReduce code and Pig scripts through performance tuning and analysis.
- Worked on Oozie workflow engine for job scheduling. Involved in Unit testing and delivered Unit test plans and results documents.
- Involved in ingesting data received from various providers into HDFS for big data operations.
- Wrote MapReduce jobs to perform big data analytics on ingested data using the Java API.
- Wrote MapReduce in Ruby using Hadoop Streaming to implement various functionalities.
- Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
- Conducted meetings with data analysts and wrangled data for data repositories using Python.
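The inventory filter/parse MapReduce job mentioned above could look roughly like the following. This is a hedged, simplified sketch: the delimiter, field positions, and HDFS paths are assumptions for illustration, not the original job's values.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class InventoryCountJob {

    // Mapper: parse a pipe-delimited inventory record, drop malformed lines,
    // and emit (warehouseId, 1).
    public static class ParseMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text warehouse = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|");
            if (fields.length < 3) {
                return; // filter: skip malformed records
            }
            warehouse.set(fields[1]); // assumed position of the warehouse id
            context.write(warehouse, ONE);
        }
    }

    // Reducer: sum record counts per warehouse.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "inventory-count");
        job.setJarByClass(InventoryCountJob.class);
        job.setMapperClass(ParseMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/data/raw/inventory"));   // placeholder input
        FileOutputFormat.setOutputPath(job, new Path("/data/out/inventory")); // placeholder output
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```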
Environment: Hadoop 3.0, Java, MapReduce, AWS, HDFS, Scala, Python, MongoDB, Spark, Hive 2.3, Pig 0.17, Linux, XML, Cloudera, CDH4/5 Distribution, Oracle 12c, PL/SQL, EC2, Flume 1.8, Zookeeper, Cassandra 3.11, Hortonworks, Elastic search, IBM WebSphere
Confidential - San Francisco, CA
Java/J2ee Developer
Responsibilities:
- Played a key role in requirements discussions and analysis of the entire system, along with estimation, development, and testing, keeping BI requirements in mind.
- Involved in the analysis, design and development of the application based on J2EE using Spring and Hibernate.
- Actively involved in designing web pages using HTML, Backbone, AngularJS, jQuery, JavaScript, Bootstrap, and CSS.
- Created an application configuration tool using the WebWork MVC framework with HTML, CSS, and JavaScript.
- Developed web applications using Spring Core, Spring MVC, iBatis, Apache Tomcat, JSTL, and Spring tag libraries.
- Implemented user-help tooltips with the Dojo Tooltip widget using multiple custom colors.
- Used Eclipse as the IDE to write code and debugged the application using separate log files.
- Designed and developed frameworks for Payment Workflow System, Confirmations Workflow System, Collateral System using GWT, Core Java, Servlets, JavaScript, XML, AJAX, J2EE design patterns and OOPS/J2EE technologies.
- Used Hibernate to manage transactions (update, delete) and wrote complex SQL and HQL queries.
- Developed the business logic using the J2EE framework and deployed components on the application server; Eclipse was used for component building.
- Established continuous integration with JIRA and Jenkins.
- Developed the user interface screens using JavaScript and HTML and performed client-side validations.
- Used JDBC to connect to the database and wrote SQL queries and stored procedures to fetch, insert, and update data in database tables (see the sketch after this list).
- Used Maven as the build tool and Tortoise SVN as the Source version controller.
- Developed data mapping to create a communication bridge between various application interfaces using XML, and XSL.
- Involved in developing JSPs for client data presentation and client-side data validation within the forms.
- Involved in various phases of the Software Development Life Cycle (SDLC), such as design, development, and unit testing.
- Extensive work writing SQL queries, stored procedures, and triggers using TOAD.
- Developed code using core Java concepts to provide the service and persistence layers; used JDBC as the connectivity layer to the Oracle database for data transactions.
- Implemented logging and the transaction manager using Spring's Aspect-Oriented Programming (AOP).
- Created build scripts for compiling and creating WAR and JAR files using the Ant toolkit.
- Used Angular to connect the web application to back-end APIs and used RESTful methods to interact with several APIs.
- Developed POJO classes and wrote Hibernate Query Language (HQL) queries.
- Experience in using TIBCO Administrator for User Management, Resource Management and Application Management.
- Developed user interface using JSP, JSP Tag libraries to simplify the complexities of the application.
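A minimal sketch of the JDBC access pattern referenced above. The DAO name, table, and columns are hypothetical placeholders, and the data source is assumed to be container-managed; try-with-resources requires Java 7 or later.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class PaymentDao {
    private final DataSource dataSource;

    public PaymentDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // Fetch the status of a payment by id using a parameterized query.
    public String findPaymentStatus(long paymentId) throws SQLException {
        String sql = "SELECT status FROM payments WHERE payment_id = ?";
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, paymentId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("status") : null;
            }
        }
    }
}
```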
Environment: Java 1.5/1.7, Core Java, Swing, Struts Framework 2.0, Hibernate 4.0, Eclipse 3.2, JUnit 4.x, JSP 2.x, Oracle SQL Developer 2.1, Oracle WebLogic 12.1, RESTful Web Services, SOAP, Tortoise SVN 1.5
Confidential
Java Developer
Responsibilities:
- Performed Requirements gathering, Analysis, Design, Code development, Testing using SDLC (Waterfall) methodologies.
- Designed and implemented the User Interface using JavaScript, HTML, XHTML, XML, CSS, JSP, and AJAX.
- Wrote a web service client for order tracking operations that accesses the web services API and is used in our web application.
- Implemented data archiving and persistence of report-generation metadata using Hibernate by creating mapping files and POJO classes and configuring Hibernate to set up the data sources.
- Developed the Spring framework DAO layer with JPA and EJB 3 in the Imaging Data Model and Doc Import modules.
- Developed the business logic using the J2EE framework and deployed components on the application server; Eclipse was used for component building.
- Actively involved in deploying EJB service JARs and application WAR files on the WebLogic application server.
- Developed GUI screens for login, registration, edit account, forgot password and change password using Struts 2.
- Used the JUnit framework for unit testing of the application and JUL logging to capture logs, including runtime exceptions.
- Wrote SQL queries for data access and manipulation using Oracle SQL Developer.
- Developed session beans to encapsulate the business logic, and model and DAO classes using Hibernate (see the sketch after this list).
- Designed and coded JAX-WS based Web Services used to access external financial information.
- Implemented EJB components using stateless and stateful session beans.
- Used the Spring framework with Spring configuration files to create the required beans and injected dependencies using dependency injection.
- Utilized JPA for Object/Relational Mapping purposes for transparent persistence onto the Oracle database.
- Involved in creation of Test Cases for JUnit Testing.
- Used Oracle as the database and Toad for query execution, and was involved in writing SQL scripts and PL/SQL code for procedures and functions.
- Used SOAP as an XML-based protocol for web service operation invocation.
- Packaged and deployed the application on IBM WebSphere Application Server in different environments such as development and testing.
- Used Log4J to validate functionalities and JUnit for unit testing.
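An illustrative sketch of the Hibernate DAO pattern mentioned above, written against the Hibernate 3-style API. The Order entity, its Order.hbm.xml mapping, and the property names are assumptions for the example, not the project's actual model.

```java
import java.util.List;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

// Assumed mapped POJO; the Order.hbm.xml mapping file is not shown.
class Order {
    private Long id;
    private String accountNumber;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getAccountNumber() { return accountNumber; }
    public void setAccountNumber(String accountNumber) { this.accountNumber = accountNumber; }
}

public class OrderDao {
    private final SessionFactory sessionFactory;

    public OrderDao() {
        // Reads hibernate.cfg.xml, which references the mapped POJOs (e.g. Order.hbm.xml).
        this.sessionFactory = new Configuration().configure().buildSessionFactory();
    }

    // Look up orders for an account using a named HQL parameter.
    @SuppressWarnings("unchecked")
    public List<Order> findOrdersByAccount(String accountNumber) {
        Session session = sessionFactory.openSession();
        try {
            return session
                    .createQuery("from Order o where o.accountNumber = :acct")
                    .setParameter("acct", accountNumber)
                    .list();
        } finally {
            session.close();
        }
    }
}
```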
Environment: Java, Servlets, JSP, Struts 1.0, Hibernate 3.1, Spring Core, Spring JDBC, HTML, JavaScript, AJAX, XSL, XSLT, XSD schema, XML Beans, WebLogic, Oracle 9i