
Sr. Big Data Architect Resume

Albertville, AL


  • Over 9 years of experience as a Java/J2EE full-stack programmer across the entire Software Development Life Cycle (SDLC), including analysis, design, implementation, integration, testing, and maintenance of applications using Java/J2EE and object-oriented client-server technologies.
  • Architected, designed, and developed a Big Data solutions practice, including setting up the Big Data roadmap and building the supporting infrastructure and team to provide Big Data.
  • Hands-on experience in developing, installing, configuring, and using Hadoop and ecosystem components such as Hadoop MapReduce, Spark, Scala, HDFS, HBase, Hive, Impala, Sqoop, Pig, Flume, Kafka, Storm, and Elasticsearch.
  • Excellent understanding of Hadoop architecture and underlying framework including storage management.
  • Expertise in architecting Big Data solutions using data ingestion and data storage.
  • Experienced with NoSQL databases (HBase, Cassandra, and MongoDB), database performance tuning, and data modeling.
  • Knowledge and experience of the architecture and functionality of NoSQL databases such as Cassandra and MongoDB.
  • Strong experience in front-end technologies such as JSP, HTML5, jQuery, JavaScript, and CSS3.
  • Experienced in application development using Java, J2EE, JDBC, Spring, and JUnit.
  • Experienced in developing web-based GUIs using JavaScript, JSP, HTML, jQuery, XML, and CSS.
  • Experienced in developing enterprise applications with J2EE/MVC architecture on application and web servers such as JBoss and Apache Tomcat 6.0/7.0/8.0.
  • Architected and implemented a Portfolio Recommendation Analytics Engine using Hadoop MR, Oozie, Spark SQL, Spark MLlib, and Cassandra.
  • Technologies worked on extensively during my tenure in software development include Struts, Spring, CXF REST API, web services, SOAP, XML, JMS, JSP, JNDI, Apache, Tomcat, JDBC, and databases such as Oracle and Microsoft SQL Server.
  • Experience in using PL/SQL to write Stored Procedures, Functions and Triggers.
  • Experience in development of Big Data projects using Hadoop, Hive, HDP, Pig, Flume, Storm and Map Reduce open source tools/technologies.
  • Architecting, solutioning, and modeling DI (Data Integrity) platforms using Sqoop, Flume, Kafka, Spark Streaming, Spark MLlib, and Cassandra.
  • Strong experience in migrating data warehouses and databases into Hadoop/NoSQL platforms.
  • Strong expertise in Amazon AWS EC2, DynamoDB, S3, Kinesis, and other services.
  • Expertise in Big Data architectures such as Hadoop (Azure, Hortonworks, Cloudera) distributed systems, MongoDB, and NoSQL.
  • Hands-on experience with Hadoop/Big Data technologies for storage, querying, processing, and analysis of data.
  • Knowledge and experience in job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
  • Good experience in Shell programming.
  • Knowledge of configuring and managing Cloudera’s Hadoop platform, including CDH3 and CDH4 clusters.
  • Extensive knowledge in programming with Resilient Distributed Datasets (RDDs).
  • Excellent technical and analytical skills with clear understanding of design goals of ER modeling for OLTP and dimension modeling for OLAP.
  • Experienced in using various Hadoop infrastructure components such as MapReduce, Hive, Sqoop, and Oozie.
  • Expert in Amazon EMR, Spark, Kinesis, S3, Boto3, Elastic Beanstalk, ECS, CloudWatch, Lambda, ELB, VPC, ElastiCache, DynamoDB, Redshift, RDS, Athena, Zeppelin, and Airflow.
  • Experienced in testing data in HDFS and Hive for each transaction of data.
  • Strong Experience in working with Databases like Oracle 12C/11g/10g/9i, DB2, SQL Server and MySQL and proficiency in writing complex SQL queries.
  • Experienced in using database tools like SQL Navigator, TOAD.
  • Experienced in using Spark to improve the performance of and optimize existing Hadoop algorithms, using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Experienced in building high-performance, reliable distributed applications in Java and Scala with Akka.


Java/J2EE Technologies: JDBC, JavaScript, JSP, Servlets, jQuery

Hadoop/Big Data: Map Reduce, HDFS, Hive, Pig, HBase, Zookeeper, Sqoop, Oozie, Flume, Scala, Akka, Kafka, Storm.

Frameworks: MVC, Struts, Spring, Hibernate.

Languages: Java, J2EE, PL/SQL, Pig Latin, HQL, R, Python, XPath, Spark.

Web Technologies: HTML, DHTML, XML, XHTML, JavaScript, CSS, XSLT, AWS

Web/Application servers: Apache Tomcat 6.0/7.0/8.0, JBoss.

NoSQL Databases: Cassandra, MongoDB, DynamoDB.

AWS: EC2, AMI, EMR, S3, Redshift.

IDE: Eclipse 2.1, 3.0, IBM WebSphere RAD 6, 7, JBuilder 7, XML Spy 4.1.

Packages and Utilities: MS Office, Adobe Acrobat.

Web Servers: Apache HTTP Server 1.3 and Tomcat 4.0.1, 5.0

Network protocols: TCP/IP fundamentals, LAN and WAN.

Databases: Oracle 12c/11g/10g/9i, Microsoft Access, MS SQL.

Operating Systems: UNIX, Ubuntu, Linux, Windows, CentOS, Sun Solaris.


Confidential, Albertville, AL

Sr. Big Data Architect

  • Involved in the high-level design of the Hadoop architecture for the existing data structure and Business process.
  • Unified data lake architecture integrating various data sources on Hadoop architecture.
  • Extensively involved in the design phase and delivered design documents for the Hadoop ecosystem with HDFS, Hive, Pig, Sqoop, and Spark with Scala.
  • Collected the logs from the physical machines and the OpenStack controller and integrated into HDFS using Kafka.
  • Designed and deployed full SDLC of AWS Hadoop cluster based on client's business need.
  • Experience in BI reporting with AtScale OLAP for Big Data.
  • Implemented the Big Data ecosystem (Hive, Impala, Sqoop, Flume, Spark, Lambda) on a cloud architecture.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data concepts.
  • Wrote Scala code to run Spark jobs on the Hadoop HDFS cluster.
  • Identified query duplication, complexity, and dependency to minimize migration efforts. Technology stack: Oracle, Hortonworks HDP cluster, Attunity Visibility, Cloudera Navigator Optimizer, AWS Cloud, and DynamoDB.
  • Experience in AWS, implementing solutions using services such as EC2, S3, RDS, Redshift, and VPC.
  • Worked as a Hadoop consultant on MapReduce, Pig, Hive, and Sqoop.
  • Integrated NoSQL databases such as HBase with MapReduce to move bulk data into HBase.
  • Worked using Apache Hadoop ecosystem components like HDFS, Hive, Sqoop, Pig, and Map Reduce.
  • Worked with AWS to implement client-side encryption, as DynamoDB did not support encryption at rest at the time.
  • Defined and managed the architecture and life cycle of Hadoop and Spark projects.
  • Designed and Developed Real time Stream processing Application using Spark, Kafka, Scala and Hive to perform Streaming ETL and apply Machine Learning.
  • Explored Spark for improving the performance of and optimizing existing Hadoop algorithms, using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Utilized the NoSQL database HBase to load Hive tables into HBase tables through Hive-HBase integration, which were consumed by the data science team.
  • Performed data profiling and transformation on the raw data using Pig, Python, and Java.
  • Developed predictive analytics using Apache Spark Scala APIs.
  • Involved in big data analysis using Pig and user-defined functions (UDFs).
  • Created Hive external tables, loaded data into the tables, and queried the data using HQL.
  • Used Sqoop to efficiently transfer data between databases and HDFS and used Flume to stream the log data from servers.
  • Implemented an enterprise-grade platform (MarkLogic) for ETL from mainframe to NoSQL (Cassandra).
  • Imported millions of structured records from relational databases using Sqoop import, processed them using Spark, and stored the data in HDFS in CSV format.
  • Developed Spark streaming application to pull data from cloud to Hive table.
  • Used Spark SQL to process the huge amount of structured data.
  • Assigned name to each of the columns using case class option in Scala.
  • Implemented Spark GraphX application to analyze guest behavior for data science segments.
  • Enhanced the traditional data warehouse based on a star schema, updated data models, and performed data analytics and reporting using Tableau.
  • Responsible for importing log files from various sources into HDFS using Flume.
  • Expert in writing business analytics scripts using Hive SQL.
  • Implemented continuous integration and deployment (CI/CD) through Jenkins for Hadoop jobs.
  • Wrote Hadoop jobs for analyzing data using Hive and Pig, accessing text files, sequence files, and Parquet files.
  • Implemented a proof of concept deploying this product in Amazon Web Services (AWS).
  • Experience with different Hadoop distributions such as Cloudera (CDH3 and CDH4), Hortonworks (HDP), and MapR.
  • Experience in integrating Oozie logs into a Kibana dashboard.
  • Extracted data from MySQL and AWS Redshift into HDFS using Sqoop.
  • Developed Spark code using Scala and Spark-SQL for faster testing and data processing.

Environment: Spark, YARN, Hive, Pig, Scala, Python, Hadoop, AWS, DynamoDB, Kibana, Cloudera, EMR, JDBC, Redshift, NoSQL, Sqoop, MySQL.

Confidential, Tampa, FL

Big Data Engineer

  • Architected, Designed and Developed Business applications and Data marts for Marketing and IT department to facilitate departmental reporting.
  • Involved in the design and architecture of Big Data solutions using the Hadoop ecosystem.
  • Involved in Requirement gathering, Business Analysis and translated business requirements into Technical design in Hadoop and Big Data.
  • Created Spark streaming projects for data ingestion and integrated with Kafka consumers and producers for messaging.
  • Utilized AWS services with a focus on big data architecture, analytics, enterprise data warehouse, and business intelligence solutions to ensure optimal architecture, scalability, flexibility, availability, and performance, and to provide meaningful and valuable information for better decision-making.
  • Experience in data cleansing and data mining.
  • Used NiFi/Flume to create flows for data ingestion.
  • Loaded all data from relational databases into Hive using Sqoop. Received four flat files from different vendors, each in a different format (e.g., text, EDI, and XML).
  • Ingested data into Hadoop (Hive/HDFS) from different data sources.
  • Created Hive external tables to stage data and then moved the data from staging to the main tables.
  • The objective of this project was to build a data lake as a cloud-based solution in AWS using Apache Spark and to provide visualization of the ETL orchestration using the CDAP tool.
  • Installed and configured a multi-node cluster in the cloud using Amazon Web Services (AWS) on EC2.
  • Wrote Hive join queries to fetch information from multiple tables and wrote multiple MapReduce jobs to collect output from Hive.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
  • Involved in migrating data from existing RDBMSs (Oracle and SQL Server) to Hadoop using Sqoop for processing.
  • Worked in AWS Cloud and on-premises environments with infrastructure provisioning and configuration.
  • Wrote Perl scripts covering data feed handling, implementing MarkLogic, and communicating with web services through the SOAP::Lite module and WSDL.
  • Developed the code for importing and exporting data into HDFS and Hive using Sqoop.
  • Installed and configured Hadoop and was responsible for maintaining the cluster and for managing and reviewing Hadoop log files.
  • Developed Shell, Perl and Python scripts to automate and provide Control flow to Pig scripts.
  • Designed the Redshift data model and performed Redshift performance analysis and improvements.
  • Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
  • Worked on configuring and managing disaster recovery and backup on Cassandra Data.
  • Performed File system management and monitoring on Hadoop log files.
  • Utilized Oozie workflows to run Pig and Hive jobs.
  • Extracted files from MongoDB through Sqoop, placed them in HDFS, and processed them.
  • Used Flume to collect, aggregate, and store the web log data from different sources like web servers, mobile and network devices and pushed to HDFS.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Developed customized classes for serialization and deserialization in Hadoop.

Environment: Pig, Sqoop, Kafka, Apache Cassandra, Oozie, Impala, Cloudera, AWS, EMR, Redshift, Flume, Apache Hadoop, HDFS, Hive, MapReduce, ZooKeeper, MySQL, Eclipse, DynamoDB, PL/SQL, and Python.

Confidential, Long Beach, CA

Big Data / Java Developer

  • Experience working with big data and real time/near real time analytics and big data platforms like Hadoop, Spark using programming languages like Scala and Java.
  • Involved in Big Data Project Implementation and Support.
  • Developed the project using Agile/Scrum methodologies.
  • Involved in the coding and integration of several business-critical modules of the CARE application using Java, Spring, Hibernate, and REST web services on WebSphere application server.
  • Developed web components using JSP, Servlets, and JDBC.
  • Designed and developed Enterprise Eligibility business objects and domain objects with Object Relational Mapping framework such as Hibernate.
  • Developed the web-based Rich Internet Application (RIA) using Java/J2EE (Spring Framework).
  • Used the lightweight container of the Spring Framework to provide architectural flexibility through Inversion of Control (IoC).
  • Involved in end to end implementation of Big data design.
  • Collected and aggregated large amounts of web log data from different sources such as webservers, mobile and network devices using Apache Flume and stored the data into HDFS for analysis.
  • Involved in the coding and integration of several business critical modules of application using Java, Spring, Hibernate and REST web services on WebSphere application server.
  • Worked closely with Business Analysts in understanding the technical requirements of each project and prepared the use cases for different functionalities and designs.
  • Worked with the NoSQL database MongoDB and worked heavily on Hive, HBase, and HDFS.
  • Created scalable, high-performance web services for data tracking and performed high-speed querying.
  • Developed optimal strategies for distributing the web log data over the cluster, and for importing and exporting the stored web log data into HDFS and Hive using Sqoop.
  • Used Java Message Service (JMS) for reliable and asynchronous exchange of important information, such as payment status reports, on the IBM WebSphere MQ messaging system.
  • Developed the presentation layer using the JavaServer Faces (JSF) MVC framework.
  • Participated in JAD meetings to gather the requirements and understand the End Users System.
  • Developed user interfaces using JSP, HTML, XML and JavaScript.
  • Generated XML Schemas and used XML Beans to parse XML files.
  • Modified the existing JSP pages using JSTL.
  • Developed web pages using JSPs and JSTL to help end user make online submission of rebates. Also used XML Beans for data mapping of XML into Java Objects.
  • Used Spring JDBC DAO as the data access technology to interact with the database.
  • Developed unit and E2E test cases using Node.js.
  • Developed and implemented new UIs using AngularJS and HTML.
  • Developed Spring configuration for dependency injection using Spring IoC and Spring Controllers.
  • Implemented Spring MVC and IoC methodologies.
  • Used the JNDI for Naming and directory services.

Environment: JSP 2.1, Hadoop, Hive, Pig, HBase, JSTL 1.2, Java, J2EE, Java SE 6, UML, Servlets 2.5, Spring MVC, Hibernate, JSON, Eclipse Kepler-Maven, Unix, JUnit, DB2, Oracle, RESTful web services, jQuery, AJAX, AngularJS, JAXB.

Confidential, Washington, DC

Sr. Java/ J2EE Developer

  • Implemented features such as logging and user session validation using the Spring AOP module, and used Spring IoC for dependency injection.
  • Used AJAX for client-to-server communication and developed web service APIs using Java.
  • Expertise in designing and creating RESTful APIs using Apache Solr and Spring WS.
  • Developed and modified database objects as per the requirements.
  • Wrote Hibernate queries over Spring framework with respect to the persistence layer.
  • Designed and developed the UI using the Struts view component, JSP, HTML, AngularJS, JavaScript, AJAX, and JSON.
  • Used JPA (Java Persistence API) with Hibernate as Persistence provider for Object Relational mapping.
  • Implemented Log4j for the project; used Ant and Maven to compile and package the application and to automate build and deployment scripts.
  • Involved in designing and writing Ant scripts to build the entire EAR along with the EJB JARs and WAR files, and to run the JUnit test cases.
  • As part of AngularJS development, used data binding, developed controllers, directives, and filters, and integrated them with the backend services.
  • Designed and developed ETL workflows using Java for processing data in HDFS/HBase using Oozie.
  • Used SOAP web services for transmission of large blocks of XML data over HTTP and developed web services for data transfer using SOAP and WSDL.
  • Exported data from the HDFS environment into RDBMS using Sqoop for report generation and visualization purposes.
  • Used Flume extensively in gathering and moving log data files from Application Servers to a central location in Hadoop Distributed File System (HDFS).
  • Managed Hadoop jobs using Oozie workflow scheduler system for Map Reduce, Hive, Pig and Sqoop actions.
  • Involved in initiating and successfully completing a proof of concept on Flume for pre-processing, increased reliability, and ease of scalability over traditional MSMQ.
  • Used Flume to collect log data from different sources and transfer the data to Hive tables, using different SerDes to store it in JSON, XML, and sequence file formats.
  • Created POJO classes, JavaBeans, and EJB beans, and wrote JUnit test cases to test code against the acceptance criteria throughout the application during the development and testing phases.
  • Used Git as the version control tool to maintain the code repository.

Environment: Java, JSF, Spring, Hibernate, Linux Shell Script, JAX-WS, SOAP, WSDL, CSS3, HTML3, JBoss, Rally, Hudson, XML, HTML, ClearCase, ClearQuest, MyEclipse, Ant, Oracle, Linux, Oracle 10g database.


Java Developer

  • Responsible for design, development, test and maintenance of applications designed on Java technologies.
  • Involved in designing, coding, debugging, documenting and maintaining a number of applications.
  • Used UML diagrams (use case, object, class, state, sequence, and collaboration) to design the application using object-oriented analysis and design.
  • Managed connectivity using JDBC for querying/inserting & data management including triggers and stored procedures.
  • Developed JavaScript behavior code for user interaction.
  • Developed the UI using HTML, JavaScript, and JSP, and developed business logic and interfacing components using Business Objects, XML, and JDBC.
  • Developed user stories using Core Java and Spring 3.1 and consumed REST web services exposed from the profit center.
  • Created a database program in SQL Server to manipulate data accumulated by internet transactions.
  • Wrote Servlet classes to generate dynamic HTML pages.
  • Used JSP, HTML, JavaScript, AngularJS, and CSS3 for content layout and presentation.
  • Did core Java coding using JDK 1.3, the Eclipse Integrated Development Environment (IDE), ClearCase, and Ant.
  • Used the Spring Core and Spring Web frameworks and created many backend classes.
  • Involved in writing SQL queries and stored procedures and used JDBC for database connectivity with MySQL Server.
  • Developed the browser-facing presentation layer using CSS and HTML taken from Bootstrap.
  • Developed SQL queries and Stored Procedures using PL/SQL to retrieve and insert into multiple database schemas.
  • Developed the XML Schema and web services for data maintenance and structures.
  • Wrote test cases in JUnit for unit testing of classes.
  • Worked extensively with JSPs and Servlets to accommodate all presentation customizations on the front end.
  • Developed JSPs for the presentation layer.
  • Created DML statements to insert and update data in the database and DDL statements to create and drop tables in the Oracle database.
  • Configured Hibernate for storing objects in the database, retrieving objects, querying objects and persisting relationships between objects.
  • Worked with the DOM and DOM functions using Firefox and the IE Developer Toolbar.
  • Involved in developing web pages using HTML and JSP.
  • Used Oracle 10g as the backend database on UNIX.

Environment: Java, XML, HTML, JavaScript, JDBC, CSS, SQL, PL/SQL, Web MVC, Eclipse, Ajax, jQuery, Spring, Hibernate, ActiveMQ, Ant as build tool, MySQL, and Apache Tomcat.
