Sr. Big Data Architect Resume
Englewood, CO
SUMMARY
- Over 10 years of experience as a Big Data Engineer and Hadoop/Java Developer in the analysis, design, development, deployment, and maintenance of Java/J2EE software and Big Data applications.
- Extensive experience architecting, prototyping, testing, and implementing Big Data analytic solutions on Hadoop/Spark/NoSQL platforms, leveraging a broad set of Big Data skills to provide innovative leadership, solution architecture, and implementation guidance for robust Big Data solutions.
- Experience creating data ingestion and processing pipelines for structured, semi-structured, and unstructured data to support data visualization and machine learning algorithms, including delivering proof-of-concept executions to prove the value of Big Data use cases.
- Proficient in Big Data exploration, profiling, quality, and transformation, with hands-on experience designing efficient and robust ETL/ELT workflows, schedulers, and event-based triggers.
- Expertise in developing relational and NoSQL databases, including OLTP, OLAP, MDM, Data Warehouse, and Data Governance solutions using 3NF, Star, and Snowflake schema designs.
- Experience implementing data mining, data visualization, and statistical, historical, and predictive data analytics solutions using enterprise-level BI and Big Data processing tools.
- Strong logical and physical data modeling experience and high proficiency with Erwin, ER/Studio, and Entity-Relationship Modeling, with an in-depth understanding of business applications, dataflow, and the use of stored procedures/triggers in ETL development tasks.
- Experience in Data Dictionary implementation, Data security (SSL/ACLs/RBACs/Encryption), TOGAF, SOA, UML, data structure/model classifications, data wrangling and data cleansing.
- Extensive hands-on experience in RESTful APIs, Docker, Cloud and Virtualization technologies.
- Hands-on experience implementing Big Data solutions leveraging Databricks, Azure HDInsight, Hortonworks, Cloudera CDH, and AWS; data ingestion using Kafka and Sqoop; and Big Data analysis using Python, Pig, Impala/HiveQL, Spark SQL, and IBM Big SQL.
- Hands-on experience in SaaS, PaaS, IaaS, DaaS, and the SDLC (Agile/SCRUM framework: sprint planning, daily SCRUM meetings, product/sprint backlog, sprint review, and sprint retrospective).
- Significant training and experience in project management, unit/functional and system testing, user story/requirements management, and team and task management using Rally Scrum/Agile, TFS (Team Foundation Server), GitHub, SharePoint, and IBM Rational DOORS.
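The star-schema design work mentioned above can be illustrated with a minimal sketch using Python's built-in sqlite3; all table and column names here are invented for the example, not taken from any actual project:

```python
import sqlite3

# Hypothetical star schema: one fact table keyed to two dimension tables.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, cal_date TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    amount REAL
);
INSERT INTO dim_date VALUES (1, '2017-01-01'), (2, '2017-01-02');
INSERT INTO dim_product VALUES (10, 'widget'), (11, 'gadget');
INSERT INTO fact_sales VALUES (1, 10, 100.0), (1, 11, 50.0), (2, 10, 25.0);
""")

# Typical star-schema query: join the narrow fact table to its dimensions
# and aggregate measures by a dimension attribute.
rows = cur.execute("""
    SELECT p.name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON f.product_key = p.product_key
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
print(rows)  # [('gadget', 50.0), ('widget', 125.0)]
```

The design choice that makes this a star (rather than snowflake) schema is that each dimension is a single denormalized table hanging directly off the fact table.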
TECHNICAL SKILLS
Hadoop Ecosystem: Hadoop 2.7/2.5, MapReduce, Sqoop, Hive, Oozie, Pig, HDFS 1.2.4, ZooKeeper, Flume, HBase, Impala, Storm; Hadoop distributions (Cloudera, Hortonworks, and Pivotal).
NoSQL Databases: HBase, Cassandra, MongoDB 3.2.
Web Technologies: HTML5/4, CSS3/2, JavaScript, jQuery, Bootstrap 3/3.5, XML, JSON, AJAX
Java & J2EE Technologies: Core Java, Servlets, JSP, JDBC
IDE and Tools: Eclipse 4.6, NetBeans 8.2, IntelliJ, Maven
Languages: Java, SAS, Scala (with Apache Spark), SQL, PL/SQL, Pig Latin, HiveQL, Unix shell.
Databases: Oracle 12c/11g, MySQL, DB2, MS SQL Server 2008.
PROFESSIONAL EXPERIENCE
Confidential - Englewood CO
Sr. Big Data Architect
Responsibilities:
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala.
- Led the development effort to migrate historical data from existing warehouses to Hadoop using Sqoop for scalable processing of the data; the eventual insights are Sqooped back.
- Provided direction to Compliance and IT Security groups in designing, documenting, and implementing perimeter security, access management, auditing, and data encryption for Hadoop clusters.
- Provided direction to admin teams for configuring and implementing Knox, Kerberos, Sentry, and data encryption solutions on the Hadoop cluster.
- Strong experience submitting Spark applications to different cluster managers such as Spark Standalone and Hadoop YARN.
- Responsible for the evaluation, internal branding, piloting, and full-scale implementation of third-party vendors (BI, visualization, analytics, machine learning, ETL, etc.) that can integrate with and leverage Hadoop as a processing platform.
- Responsible for evaluating new technologies and their possible integration to build robust, scalable, and configurable technology solutions that could be leveraged by new products for Barclays banking customers.
- Worked on Amazon Web Services (EC2, ELB, VPC, S3, CloudFront, IAM, RDS, Route 53, CloudWatch, SNS, SQS, and SES).
- Performance tuning of the MS SQL Server database.
- Installed and set up web servers (Apache and Tomcat) and a DB server (MySQL).
- Good knowledge of stored procedures, triggers, batch referential integrity, indexes, and other Teradata features.
- Installed and set up MySQL master and slave servers, including multiple MySQL instances on different ports.
- MySQL database backup (hot/cold) and recovery; repairing and optimizing tables.
- MySQL database security: creating users and managing permissions.
- Strong knowledge of developing Spark applications using Scala.
- Responsible for enterprise and systems architecture, design and application development initiatives, the strategic roadmap, solution delivery, etc., aligned with the overall IT strategy for the Hadoop implementation.
- Migrated Pentaho transformations and jobs using Import and Export.
- Performance tuning of SQL procedures and application queries.
- Worked with the Enterprise Architecture Board in creating the technology roadmap and reference architecture and deriving the project strategy.
- Working on a real-time streaming DW project for CTL using the ELK stack, with CI/CD and DevOps tooling (Jenkins, GitHub, Bamboo) and Java, Scala, and Python.
- Led the architecture and development efforts for complex real-time event processing from a myriad of internal and external data sources, ingesting the data into Hadoop and processing it in real time for specific use cases (fraud detection, recommendation engines, dashboards, and visualization) using case-specific integrated technologies such as Spark Streaming, Flume, Kafka, HBase, and Scala.
- Introduced the Lambda architecture at Sprint US, which combines the batch view (slow data) and real-time events (fast data) to provide the current state of the data.
- Led the application development of batch ingestion frameworks using custom Spark, Hive, and HDFS components, and used Pig scripting for data cleansing and transformations. Used Oozie as the workflow scheduler for the batch data pipeline.
- Creating folders; storing transformations and jobs; and moving, revising, locking, deleting, and restoring artifacts using the Pentaho Enterprise Repository.
- Developed the warehouse-specific Data Lake using Hive and Pig scripting, as well as Talend ETL pipelines for populating the data marts for user/business consumption using Hive/Impala and Spark/Scala.
- Developed the streaming-process POC with Apache Kafka and Apache Flink, both with the Talend Big Data framework and on a Hadoop cluster.
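The Lambda-architecture pattern described above, in which a precomputed batch view (slow data) is overlaid with recent real-time events (fast data), can be sketched in plain Python. The record layout and the `current_state` helper are hypothetical, for illustration only:

```python
# Hypothetical Lambda-architecture serving layer: merge a batch view with
# real-time deltas to answer "what is the current state?" queries.
batch_view = {"acct-1": 100, "acct-2": 250}       # from the batch layer (e.g. Hive/Pig jobs)
speed_events = [("acct-1", +20), ("acct-3", +5)]  # from the speed layer (e.g. Spark Streaming)

def current_state(batch, events):
    """Apply real-time deltas on top of the precomputed batch view."""
    merged = dict(batch)
    for key, delta in events:
        merged[key] = merged.get(key, 0) + delta
    return merged

print(current_state(batch_view, speed_events))
# {'acct-1': 120, 'acct-2': 250, 'acct-3': 5}
```

The point of the pattern is that the batch layer can be slow but complete, while the speed layer only has to cover the window since the last batch run.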
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, HBase, ZooKeeper, Spark, MySQL, Redhat Linux, EC2, Lambda, Elastic Beanstalk, S3, Glacier, Storage Gateway, RDS, RedShift, Pentaho, Teradata, VPC, Route 53, Jenkins, Snowball, CloudWatch, CloudTrail, IAM, ELB, SES, SNS, SQS, Talend
Confidential - Florham Park, NJ
Sr. Big Data/Hadoop Architect
Responsibilities:
- Architected and implemented a 50+ TB data application on a 40+ node cluster using the Cloudera (CDH 4.x) stack and various Big Data analytic tools.
- Experience designing Big Data and Hadoop File System (HDFS) solutions, including estimation of infrastructure needs, effort estimation, etc.
- Worked collaboratively with all levels of business stakeholders to architect, implement, and test Big Data based analytical solutions drawing on disparate sources.
- Designed and planned the data gathering and processing.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components.
- Hands-on experience working with Hadoop ecosystem applications such as Hive, Pig, Sqoop, MapReduce, and Oozie.
- Extensive experience developing Talend jobs using various components.
- Strong knowledge of Hadoop and of Pig's and Hive's analytical functions (UDFs).
- Capturing data from existing databases that provide SQL interfaces using Sqoop.
- Efficient in building Hive, Pig, and MapReduce scripts.
- Migration of huge amounts of data from different databases (i.e., Netezza, Oracle, SQL Server) to Hadoop using the relevant connectors.
- Monitoring performance using Teradata Viewpoint; notifying the batch and report users running problem queries and consulting with them on query changes and possible improvement areas.
- Troubleshooting, debugging, and fixing Talend-specific issues while maintaining the health and performance of the ETL environment.
- Used Solr for indexing and search capabilities on HDFS.
- Experienced in managing and reviewing Hadoop log files.
- Worked in a multi-clustered environment, setting up the Cloudera Hadoop ecosystem.
- Ability to build clusters on AWS and Rackspace cloud systems.
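The Hive, Pig, and MapReduce scripting mentioned above all follow the classic map / shuffle / reduce pattern. A minimal single-process word-count sketch in plain Python (not a real Hadoop job, just the conceptual shape) is:

```python
from collections import defaultdict

def mapper(line):
    # Map phase: emit a (word, 1) pair for every word in the input line.
    for word in line.split():
        yield word.lower(), 1

def reducer(word, counts):
    # Reduce phase: sum all counts emitted for one word.
    return word, sum(counts)

# Toy input standing in for HDFS blocks; contents are invented for the example.
lines = ["Hadoop and Hive", "Pig and Hadoop"]

# Shuffle phase: group mapper output by key before reducing.
groups = defaultdict(list)
for line in lines:
    for word, count in mapper(line):
        groups[word].append(count)

result = dict(reducer(w, c) for w, c in groups.items())
print(result)  # {'hadoop': 2, 'and': 2, 'hive': 1, 'pig': 1}
```

On a real cluster the map and reduce phases run in parallel across nodes and the shuffle moves data over the network; the logic per key is the same.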
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Hbase, ZooKeeper, Java (jdk1.6), Custom APIs, Hadoop distribution of Hortonworks & Cloudera, Teradata, SQL Server 2008/2012, Python, Talend, Perl, UNIX Shell Scripting, MySQL, Redhat Linux.
Confidential - Minneapolis, MN
Big Data Architect
Responsibilities:
- Implemented an Amazon AWS Data Lake leveraging EC2, EMR, RedShift, Data Pipeline, Lambda, S3, Kinesis, and CloudFormation for data processing/storage/visualization and data migration and integration, writing complex SQL queries with analytical and aggregate functions to understand the current state of Amtrak's data assets and how those assets can be leveraged for enhanced near-real-time decision making.
- Performed data extraction and migration, data cleaning, analysis, and visualization using Informatica and Tableau Desktop 9.3 to support the Redshift data warehousing solution on AWS.
- Implemented a Big Data solution using CDH 5.0 and performed data streaming into HDFS from AWS and Arrow web servers using Sqoop/Flume.
- Used Flume and Sqoop as the main Hadoop ETL tools for both batch and streaming data processing to extract, transform, and load data from many heterogeneous systems into the integrated SQL Server 2012 operational data store and the Netezza EDW solution.
- Prototyped BI solutions using Tableau Desktop 9.3 and led the BI product design and the development of data models, testing, and integration of the Sales ODS.
- Developed reports using Tableau and materialized/updatable views for real-time data analytics.
- Led the modeling and design of the data architecture for operational data stores, data marts, and the enterprise data warehouse, with data sources such as DB2, SQL Server 2012, and Oracle 11g.
- Performed data mapping, data profiling, data cleansing, data standardization, and consolidation tasks using Informatica PowerCenter, including intensive data migration and the creation, execution, and documentation of test cases.
- Used Informatica Designer, Workflow Manager, and Repository Manager to create source and target definitions, design mappings, create repositories, and establish users, groups, and their privileges.
- Extracted data from the source databases (Oracle 12c/11g, SQL Server 2012, and DB2) using Informatica to load it into a single Netezza warehouse repository, and developed many stored procedures to manage different aspects of the company's database and data warehouse systems.
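The cleansing and standardization steps above were done in Informatica; the same idea can be sketched in plain Python. The field names and rules below are hypothetical, invented purely for illustration:

```python
# Hypothetical cleansing/standardization pass over extracted source rows:
# trim whitespace, normalize case, cast types, and reject records missing a key.
raw_rows = [
    {"id": "001", "state": " mn ", "amount": "12.50"},
    {"id": "002", "state": "Va",   "amount": "7.00"},
    {"id": None,  "state": "CO",   "amount": "3.25"},  # missing key -> rejected
]

def cleanse(rows):
    clean, rejected = [], []
    for row in rows:
        if not row["id"]:
            rejected.append(row)          # route bad records to a reject file/table
            continue
        clean.append({
            "id": row["id"],
            "state": row["state"].strip().upper(),  # standardize state codes
            "amount": float(row["amount"]),         # cast text to numeric
        })
    return clean, rejected

clean, rejected = cleanse(raw_rows)
print(len(clean), len(rejected))  # 2 1
```

Separating rejected records from cleansed ones, rather than silently dropping them, is what makes the downstream load auditable.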
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Hue, Ganglia, Nagios, Java, Kafka, Elastic Search, SQL, Scala, Oracle, Netezza, Ambari, Sqoop, Flume, Oozie, Java, Eclipse.
Confidential - Virginia Beach, VA
Full Stack Developer
Responsibilities:
- Developed the application using Spring MVC, Hibernate, and Web Services.
- Played a vital role on the architecture team in the design and implementation of site components using the J2EE framework.
- Implemented the Dependency Injection (IoC) feature of the Spring framework to inject beans into the user interface, and AOP for logging.
- Created GUIs using Swing frame classes such as JFrame, along with JDialog and JApplet.
- Hands-on experience developing applications that use MongoDB 2, with knowledge of schema modeling, querying, and tuning.
- Used EJB to develop server-side software components that encapsulate the business logic of the application.
- Migrated the project (servlets/JSP/JBoss/JDBC/JNDI), built on WebLogic, to Apache Tomcat; configured the LDAP authentication server and replaced the xxx-jdbc.xml used by WebLogic.
- Used the ATG (ATG Dynamo) application framework for building data- and content-driven web applications, largely for commerce and publishing.
- Cleaned and built the EJB application to make sure the business logic worked as per the requirements.
- Created the JSF application configuration resource file for configuring application resources.
- At the framework level, used an application platform for hosting web-based applications and RMI-accessible business components, with an ORM layer, a component container, an MVC framework, and a set of JSP tag libraries.
- Built UI components using the React library; defined and created React.js components using JSX.
- Configured the development environment using the Tomcat application server for developers' integration testing.
- Qualified full stack engineer with a strong front-end emphasis and proven experience with Angular.js, Backbone.js, Node.js, and JavaScript.
- Used a test-driven approach for developing the application and implemented the unit tests with a unit test framework.
- Used GWT's support for the prototype pattern to set global defaults without creating a separate hierarchy of subclasses.
- Business process modeling and monitoring along with content management and collaboration in UNIX.
- Used the JBoss Web server, which is built on Apache and Tomcat.
- HTML forms are an integral part of web pages and applications, but styling form controls manually, one by one, with CSS is often tedious.
- Bootstrap greatly simplifies the styling and alignment of form controls such as labels, input fields, select boxes, textareas, and buttons through its predefined set of classes.
- Experience in responsive layout and design with CSS3/HTML5 and the Bootstrap 3.2 framework.
- Performed client-side validations using JavaScript.
- Developed the application using AngularJS.
- Implemented optional data caching to improve request-response performance.
- Analyzed the database needs of applications and optimized them using MongoDB and NoSQL.
- MongoDB stores its data in BSON (binary JSON). Each server has a number of databases, and each database has a number of collections.
- Created rich and highly interactive responsive UI components with JavaScript, HTML5 and CSS3.
- Created object-oriented classes that correspond to the GitHub domain model.
- Used Spring framework for Dependency Injection, AOP and Transaction management.
- Worked on web services that employ the SOAP and REST architectures.
- Developed reusable and interoperable web service modules based on SOA architecture using SOAP and RESTful APIs.
- Used MongoDB for data modeling.
- In MongoDB, the data is split into ranges (based on the shard key) and distributed across multiple shards.
- Implemented Hibernate as ORM and integrated to Spring using Spring ORM. Also implemented some DAO calls using Spring Security.
- Developed unit testing framework using JUnit test cases for continuous integration testing and used JTest Tool for performance testing.
- Implemented PL/SQL queries and used Oracle stored procedures and built-in functions to retrieve and update data from the databases.
- Used SonarQube to measure code coverage, code standard compliance, code duplication, and unit test results.
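The range-based sharding described earlier in this section (splitting data on a shard key) can be sketched conceptually in plain Python; the shard boundaries and document fields below are hypothetical, and real MongoDB routing is done by the mongos router, not application code:

```python
import bisect

# Hypothetical range sharding: a document is routed to a shard by comparing
# its shard-key value against sorted range boundaries, which is what MongoDB
# does conceptually for range-sharded collections.
boundaries = ["g", "p"]   # shard 0: keys < 'g'; shard 1: 'g'..'p'; shard 2: >= 'p'
shards = [[], [], []]

def route(doc, key_field="name"):
    """Pick the shard whose key range contains doc[key_field] and store the doc."""
    shard_id = bisect.bisect_right(boundaries, doc[key_field])
    shards[shard_id].append(doc)
    return shard_id

for name in ["alice", "mike", "zoe"]:
    route({"name": name})

print([len(s) for s in shards])  # [1, 1, 1]
```

Choosing a shard key with good spread across these ranges is what keeps the shards balanced; a monotonically increasing key would funnel all writes to the last shard.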
Environment: Java, J2EE, TDD, UML, Scrum, REST API, SOAP, HTML, CSS3, JavaScript, jQuery, AngularJS, XML, Spring MVC, Hibernate, IntelliJ, Tomcat, Oracle, MongoDB, Maven, GIT, Log4J, JUnit, Mockito, PL/SQL.
Confidential
Java Developer
Responsibilities:
- Designed and developed parts of the application using Spring (IoC, MVC, and Security), AJAX, and Hibernate.
- Extensively involved in the design of admin user interfaces using JSF and Web 2.0 technologies (YUI and AJAX).
- Worked on developing object-oriented, n-tier, scalable, high-performance web application modules using Core Java.
- Created GUIs using JavaFX and Swing.
- Worked on Core Java Multi-threading.
- Designed, implemented, and maintained an asynchronous, AJAX-based rich client for an improved customer experience.
- Wrote new JSPs, modified existing JSPs and servlets, and deployed them on the WebLogic application server.
- Developed custom tags and interceptors to persist the frame state.
- Extensively involved in server-side programming using Struts handlers for dynamic content generation and in building the user interface (UI) using XML.
- Developed and implemented the front end of the application using Spring MVC.
- Developed various controllers and validators for the front end and defined common page layouts using Tiles.
- Used dependency injection in Spring for the service layer and DAO layer.
- Designed, developed, and maintained the data layer using the Hibernate ORM framework.
- Implemented SOAP, WSDL, and a subset of XML Schema for a web services toolkit and for web services integration.
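The Spring-style dependency injection used in the service and DAO layers above can be illustrated language-agnostically. Below is a minimal Python sketch of constructor injection; the class names are hypothetical, invented for the example:

```python
class OrderDao:
    """Hypothetical DAO-layer component (would wrap JDBC/Hibernate in the real app)."""
    def find(self, order_id):
        return {"id": order_id, "status": "SHIPPED"}

class OrderService:
    """Service-layer component. The DAO is injected via the constructor rather
    than constructed inside the class: the core idea of IoC / dependency
    injection, which lets tests pass in a stub DAO instead."""
    def __init__(self, dao):
        self.dao = dao

    def status(self, order_id):
        return self.dao.find(order_id)["status"]

# The "container" wires the dependency at assembly time; Spring does this
# declaratively via XML configuration or annotations.
service = OrderService(OrderDao())
print(service.status(42))  # SHIPPED
```

Because `OrderService` never names a concrete DAO class internally, swapping the data layer (or mocking it in a JUnit/Mockito test) requires no change to the service code.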
Environment: JSDK, Core Java, J2EE, Servlet, Spring, Hibernate, HTML, DHTML, JavaScript, REST, Web services, JUnit, CVS.
