Sr. Big Data Engineer Resume
Long Beach, CA
SUMMARY
- Professional with 9+ years of experience in software development, design, and architecture.
- Architected, designed, and developed a Big Data solutions practice, including setting up the Big Data roadmap and building the supporting infrastructure and team to deliver Big Data solutions.
- Strong Experience in Front End Technologies like JSP, HTML5, JQuery, JavaScript, CSS3.
- Experienced in application development using Java, J2EE, JDBC, Spring, and JUnit.
- Experienced in developing web-based GUIs using JavaScript, JSP, HTML, jQuery, XML, and CSS.
- Experienced in developing enterprise applications on J2EE/MVC architecture with application and web servers such as JBoss and Apache Tomcat 6.0/7.0/8.0.
- Architected and implemented a Portfolio Recommendation Analytics Engine using Hadoop MapReduce, Oozie, Spark SQL, Spark MLlib, and Cassandra.
- Worked extensively with Struts, Spring, CXF REST API, web services, SOAP, XML, JMS, JSP, JNDI, Apache, Tomcat, JDBC, and databases such as Oracle and Microsoft SQL Server.
- Excellent understanding of Hadoop architecture and underlying framework including storage management.
- Expertise in architecting Big data solutions using Data ingestion, Data Storage
- Experienced with NoSQL databases (HBase, Cassandra, and MongoDB), database performance tuning, and data modeling.
- Experienced with Spark for improving the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Extensive knowledge in programming with Resilient Distributed Datasets (RDDs).
- Experienced with Akka building high performance and reliable distributed applications in Java and Scala.
- Knowledge of and experience with job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
- Good experience in Shell programming.
- Knowledge of configuring and managing Cloudera's Hadoop platform, including CDH3 and CDH4 clusters.
- Experience in using PL/SQL to write Stored Procedures, Functions and Triggers.
- Excellent technical and analytical skills with clear understanding of design goals of ER modeling for OLTP and dimension modeling for OLAP.
- Experience in development of Big Data projects using Hadoop, Hive, HDP, Pig, Flume, Storm and Map Reduce open source tools/technologies.
- Architected, designed, and modeled DI (Data Integrity) platforms using Sqoop, Flume, Kafka, Spark Streaming, Spark MLlib, and Cassandra.
- Strong experience in migrating data warehouses and databases into Hadoop/NoSQL platforms.
- Strong expertise in Amazon AWS services including EC2, DynamoDB, S3, Kinesis, and other services.
- Expertise in Big Data architectures such as Hadoop distributed systems (Azure, Hortonworks, Cloudera), MongoDB, and NoSQL.
- Hands-on experience with Hadoop and Big Data technologies for storing, querying, processing, and analyzing data.
- Experienced in using various Hadoop infrastructures such as Map Reduce, Hive, Sqoop, and Oozie.
- Expert in Amazon EMR, Spark, Kinesis, S3, Boto3, Elastic Beanstalk, ECS, CloudWatch, Lambda, ELB, VPC, ElastiCache, DynamoDB, Redshift, RDS, Athena, Zeppelin, and Airflow.
- Experienced in testing data in HDFS and Hive for each transaction of data.
- Strong Experience in working with Databases like Oracle 12C/11g/10g/9i, DB2, SQL Server and MySQL and proficiency in writing complex SQL queries.
- Experienced in using database tools like SQL Navigator, TOAD.
TECHNICAL SKILLS
Hadoop/Big Data: MapReduce, HDFS, Hive, Pig, HBase, ZooKeeper, Sqoop, Oozie, Flume, Scala, Akka, Kafka, Storm, MongoDB
Java/J2EE Technologies: JDBC, JavaScript, JSP, Servlets, jQuery
Languages: Java, J2EE, PL/SQL, Pig Latin, HQL, R, Python, XPath, Spark
NoSQL Databases: Cassandra, MongoDB, DynamoDB
Frameworks: MVC, Struts, Spring, Hibernate.
Web Technologies: HTML, DHTML, XML, XHTML, JavaScript, CSS, XSLT, AWS, DynamoDB
Web/Application servers: Apache Tomcat 6.0/7.0/8.0, JBoss
Network protocols: TCP/IP fundamentals, LAN and WAN.
Databases: Oracle 12c/11g/10g/9i, Microsoft Access, MS SQL.
Operating Systems: UNIX, Ubuntu Linux, CentOS, Sun Solaris, and Windows.
PROFESSIONAL EXPERIENCE
Confidential, Long Beach, CA
Sr. Big Data Engineer
Responsibilities:
- Implemented solutions for ingesting data from various sources and processing the data at rest using Big Data technologies such as Hadoop, MapReduce frameworks, HBase, and Hive.
- Unified data lake architecture integrating various data sources on Hadoop architecture
- Designed and deployed full SDLC of AWS Hadoop cluster based on client's business need.
- Experience with BI reporting using AtScale OLAP for Big Data.
- Implementation of Big Data ecosystem (Hive, Impala, Sqoop, Flume, Spark, Lambda) with Cloud Architecture.
- Loaded and transformed large sets of structured, semi structured and unstructured data using Hadoop/Big Data concepts.
- Wrote Scala code to run Spark jobs on the Hadoop HDFS cluster.
- Identified query duplication, complexity, and dependencies to minimize migration effort.
- Technology stack: Oracle, Hortonworks HDP cluster, Attunity Visibility, Cloudera Navigator Optimizer, AWS Cloud, and DynamoDB.
- Implemented solutions on AWS using services such as EC2, S3, RDS, Redshift, and VPC.
- Worked as a Hadoop consultant on (Map Reduce/Pig/HIVE/Sqoop).
- Integrated the NoSQL database HBase with MapReduce to move bulk data into HBase.
- Worked using Apache Hadoop ecosystem components like HDFS, Hive, Sqoop, Pig, and Map Reduce.
- Worked with AWS to implement client-side encryption, as DynamoDB did not support encryption at rest at the time.
- Defined and managed the architecture and life cycle of Hadoop and Spark projects.
- Designed and Developed Real time Stream processing Application using Spark, Kafka, Scala and Hive to perform Streaming ETL and apply Machine Learning.
- Explored Spark to improve the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Used the NoSQL database HBase to load Hive tables into HBase tables through Hive-HBase integration, for consumption by the data science team.
- Performed data profiling and transformation on the raw data using Pig, Python, and Java.
- Developed predictive analytics using the Apache Spark Scala APIs.
- Performed big data analysis using Pig and user-defined functions (UDFs).
- Created Hive external tables, loaded the data into them, and queried the data using HQL.
- Used Sqoop to efficiently transfer data between databases and HDFS, and used Flume to stream the log data from servers.
- Implemented an enterprise-grade platform (MarkLogic) for ETL from mainframe to NoSQL (Cassandra).
- Imported millions of structured records from relational databases using Sqoop import, processed them with Spark, and stored the data in HDFS in CSV format.
- Developed Spark streaming application to pull data from cloud to Hive table.
- Used Spark SQL to process large volumes of structured data.
- Assigned names to each of the columns using Scala case classes.
- Implemented Spark GraphX application to analyze guest behavior for data science segments.
- Enhancements to traditional data warehouse based on STAR schema, update data models, perform Data Analytics and Reporting using Tableau.
- Responsible for importing log files from various sources into HDFS using Flume
- Wrote business analytics scripts using Hive SQL.
- Implemented continuous integration and deployment (CI/CD) through Jenkins for Hadoop jobs.
- Wrote Hadoop jobs for analyzing data using Hive and Pig, accessing text files, sequence files, and Parquet files.
- Implemented a proof of concept deploying this product in Amazon Web Services (AWS).
- Experience with different Hadoop distributions, including Cloudera (CDH3 and CDH4), Hortonworks (HDP), and MapR.
- Experience in integrating Oozie logs into a Kibana dashboard.
- Extracted the data from MySQL and AWS Redshift into HDFS using Sqoop.
- Developed Spark code using Scala and Spark-SQL for faster testing and data processing.
Environment: Big Data, Spark, YARN, Hive, Pig, Scala, Python, Hadoop, AWS, DynamoDB, Kibana, Cloudera, EMR, JDBC, Redshift, NoSQL, Sqoop, MySQL.
Confidential, NYC, NY
Big Data Engineer
Responsibilities:
- Architected, Designed and Developed Business applications and Data marts for Marketing and IT department to facilitate departmental reporting.
- Utilized AWS services with a focus on big data architecture, analytics, enterprise data warehouse, and business intelligence solutions to ensure optimal architecture, scalability, flexibility, availability, and performance, and to provide meaningful and valuable information for better decision-making.
- Experience in data cleansing and data mining.
- Designed AWS architecture and cloud migration using AWS EMR, DynamoDB, and Redshift, with event processing using Lambda functions.
- Loaded all the data from relational databases into Hive using Sqoop. Also received four flat files from different vendors, each in a different format, such as text, EDI, and XML.
- Ingest data into Hadoop / Hive/HDFS from different data sources.
- Created Hive external tables to stage data and then moved the data from staging to the main tables.
- The objective of this project was to build a cloud-based data lake in AWS using Apache Spark and to provide visualization of the ETL orchestration using the CDAP tool.
- Installed and configured a multi-node cluster in the cloud using Amazon Web Services (AWS) on EC2.
- Writing Hive join query to fetch info from multiple tables, writing multiple Map Reduce jobs to collect output from Hive
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
- Involved in migrating data from existing RDBMSs (Oracle and SQL Server) to Hadoop using Sqoop for processing.
- Worked across AWS cloud and on-premises environments on infrastructure provisioning and configuration.
- Wrote Perl scripts covering data feed handling, implementing mark logic, and communicating with web services through the SOAP::Lite module and WSDL.
- Developed the code for importing and exporting data into HDFS and Hive using Sqoop.
- Installed and configured Hadoop and responsible for maintaining cluster and managing and reviewing Hadoop log files.
- Developed Shell, Perl and Python scripts to automate and provide Control flow to Pig scripts.
- Design of Redshift Data model, Redshift Performance improvements/analysis
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Worked on configuring and managing disaster recovery and backup on Cassandra Data.
- Performed File system management and monitoring on Hadoop log files.
- Utilized Oozie workflows to run Pig and Hive jobs. Extracted files from MongoDB through Sqoop, placed them in HDFS, and processed them.
- Used Flume to collect, aggregate, and store the web log data from different sources such as web servers, mobile, and network devices, and pushed it to HDFS.
- Implemented partitioning, dynamic partitions and buckets in HIVE.
- Developed customized classes for serialization and deserialization in Hadoop.
Environment: Pig, Sqoop, Kafka, Apache Cassandra, Oozie, Impala, Cloudera, AWS, AWS EMR, Redshift, Flume, Apache Hadoop, HDFS, Hive, MapReduce, ZooKeeper, MySQL, Eclipse, DynamoDB, PL/SQL, and Python.
Confidential, Tampa, FL
JAVA/Big Data Developer
Responsibilities:
- Experience working with big data and real time/near real time analytics and big data platforms like Hadoop, Spark using programming languages like Scala and Java.
- Involved in Big Data Project Implementation and Support.
- Involved in the coding and integration of several business-critical modules of the CARE application using Java, Spring, Hibernate, and REST web services on WebSphere application server.
- Developed web components using JSP, Servlets, and JDBC.
- Designed and developed Enterprise Eligibility business objects and domain objects with Object Relational Mapping framework such as Hibernate.
- Developed the web-based Rich Internet Application (RIA) using Java/J2EE (Spring Framework).
- Used the lightweight container of the Spring Framework to provide architectural flexibility through inversion of control (IoC).
- Developed and implemented new UIs using AngularJS and HTML.
- Developed Spring configuration for dependency injection using Spring IoC and Spring Controllers.
- Implemented Spring MVC and IoC methodologies.
- Used JNDI for naming and directory services.
- Involved in the coding and integration of several business-critical modules of the application using Java, Spring, Hibernate, and REST web services on WebSphere application server.
- Delivered Big Data products, including re-platforming the legacy Global Risk Management System with Big Data technologies such as Hadoop, Hive, and HBase.
- Worked with the NoSQL database MongoDB and worked heavily with Hive, HBase, and HDFS.
- Developed RESTful web services using JAX-RS with the DELETE, PUT, POST, and GET HTTP methods in a Spring 3.0 and OSGi integrated environment.
- Developed optimal strategies for distributing the web log data over the cluster, importing and exporting the stored web log data into HDFS and Hive using Sqoop.
- Collected and aggregated large amounts of web log data from different sources such as web servers, mobile, and network devices using Apache Flume and stored the data in HDFS for analysis.
- Used Java Message Service (JMS) for reliable and asynchronous exchange of important information, such as payment status reports, on the IBM WebSphere MQ messaging system.
- Developed presentation layer using Java Server Faces (JSF) MVC framework.
- Participated in JAD meetings to gather the requirements and understand the end users' system.
- Developed user interfaces using JSP, HTML, XML and JavaScript.
- Generated XML Schemas and used XML Beans to parse XML files.
- Modified the existing JSP pages using JSTL.
- Developed web pages using JSPs and JSTL to help end user make online submission of rebates. Also used XML Beans for data mapping of XML into Java Objects.
- Used Spring JDBC DAO as a data access technology to interact with the database.
- Developed Unit and E2E test cases using Node JS.
Environment: JSP 2.1, Hadoop 1.x, Hive, Pig, HBase, JSTL 1.2, Java, J2EE, Java SE 6, UML, Servlets 2.5, Spring MVC, Hibernate, JSON, Eclipse Kepler-Maven, Serena Dimensions, Unix, JUnit, DB2, Oracle, RESTful web services, Big Data, jQuery, AJAX, AngularJS, JAXB, IRAD WebSphere Integration Developer, WebSphere 7.0.
Confidential, Chicago, IL
Sr. Java/ J2EE Developer
Responsibilities:
- Developed an intranet web application on J2EE architecture, using JSP to design the user interfaces, JSP tag libraries to define custom tags, and JDBC for database connectivity.
- Worked extensively on OIM connectors such as Active Directory, ED, IBM RACF, RSA, OID, OIF, Database User Management, CA Top Secret Advanced, and Flat File.
- Expertise in designing and creating RESTful APIs using Apache Solr and Spring WS. Developed and modified database objects as per the requirements.
- Implemented Log4j logging for the project, and used Ant and Maven to compile and package the application and to automate build and deployment scripts.
- Created POJO classes, JavaBeans, and EJBs, and wrote JUnit test cases to test code against the acceptance criteria throughout the application during the development and testing phases.
- Used Git for version control to maintain the code repository.
- As part of AngularJS development, used data binding and developed controllers, directives, and filters, and integrated them with the backend services.
- Designed and developed ETL workflows in Java for processing data in HDFS/HBase using Oozie.
- Imported unstructured data into HDFS using Flume.
- Wrote complex Hive queries and UDFs.
- Exported data from the HDFS environment into RDBMS using Sqoop for report generation and visualization purposes.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Used Flume extensively for gathering and moving log data files from application servers to a central location in the Hadoop Distributed File System (HDFS).
- Developed shell scripts to ease execution of all other scripts (Pig, Hive, and MapReduce) and to move the data files into and out of HDFS.
- Managed Hadoop jobs using the Oozie workflow scheduler system for MapReduce, Hive, Pig, and Sqoop actions.
- Involved in initiating and successfully completing Proof of Concept on FLUME for Pre-Processing, Increased Reliability and Ease of Scalability over traditional MSMQ.
- Used Flume to collect the log data from different resources and transfer the data to Hive tables using different SerDes, storing it in JSON, XML, and sequence file formats.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
Environment: Java, JSF, Spring, Hibernate, Linux Shell Script, JAX-WS, SOAP, WSDL, CSS3, HTML, JBoss, Rally, Hudson, XML, ClearCase, ClearQuest, MyEclipse, Ant, Linux, Oracle 10g database.
Confidential
Java Developer
Responsibilities:
- Responsible for design, development, test and maintenance of applications designed on Java technologies.
- Developed UI using HTML, JavaScript, and JSP, and developed Business Logic and Interfacing components using Business Objects, XML, and JDBC.
- Created rapid prototypes of interfaces to be used as blueprints for technical development.
- Developed user stories using Core Java and Spring 3.1 and consumed REST web services exposed from the profit center.
- Used UML diagrams (use case, object, class, state, sequence, and collaboration) to design the application following object-oriented analysis and design.
- Managed connectivity using JDBC for querying/inserting & data management including triggers and stored procedures.
- Developed JavaScript behavior code for user interaction.
- Created database programs in SQL Server to manipulate data accumulated by internet transactions.
- Wrote Servlet classes to generate dynamic HTML pages.
- Developed SQL queries and Stored Procedures using PL/SQL to retrieve and insert into multiple database schemas.
- Developed the XML Schema and web services for the data maintenance and structures. Wrote test cases in JUnit for unit testing of classes.
- Worked extensively with JSPs and Servlets to accommodate all presentation customizations on the front end.
- Developed JSPs for the presentation layer.
- Created DML statements to insert and update data in the database, and DDL statements to create and drop tables in the Oracle database.
- Configured Hibernate for storing objects in the database, retrieving objects, querying objects, and persisting relationships between objects.
- Worked with the DOM and DOM functions using Firefox and the IE Developer Toolbar.
- Debugged the application using Firebug to traverse the documents.
- Involved in developing web pages using HTML and JSP.
- Provided technical support for production environments: resolving issues, analyzing defects, and providing and implementing solutions.
- Used Oracle 10g as the backend database on UNIX.
- Used JSP, HTML, JavaScript, AngularJS, and CSS3 for content layout and presentation.
- Did core Java coding using JDK 1.3, the Eclipse Integrated Development Environment (IDE), ClearCase, and Ant.
- Used the Spring Core and Spring Web frameworks. Created numerous backend classes.
- Involved in writing SQL Queries, Stored Procedures and used JDBC for database connectivity with MySQL Server.
- Developed the browser presentation layer using CSS and HTML taken from Bootstrap.
Environment: Java, XML, HTML, JavaScript, JDBC, CSS, SQL, PL/SQL, Web MVC, Eclipse, Ajax, jQuery, Spring with Hibernate, ActiveMQ, JasperReports, Ant as build tool, MySQL, and Apache Tomcat.