Big Data Technical Architect Resume
SUMMARY:
- Over 15 years of professional IT experience with an emphasis on Big Data technologies, working with many large-scale applications in various domains including Finance, Banking, Insurance, and Healthcare.
- Cloudera Certified Developer for Apache Hadoop (CCDH 410, Version 5).
- Experience in the complete Software Development Life Cycle of application development (requirements gathering, analysis, design, development, testing, and implementation).
- Expertise in Hadoop Distributed File System (HDFS), MapReduce, Pig, Hive, HBase, and Sqoop.
- Extensive experience working in the Big Data Hadoop ecosystem comprising Apache Spark 2.3, the PySpark API, Docker, MapReduce, Hive, Pig, Apache Oozie, Sqoop, Flume, HDFS, and Apache Avro.
- Expertise in working on AWS using Lambda, EMR, Redshift, SNS, SES, Glue, Data Pipeline, S3, API Gateway, Athena, Amazon Kinesis, and DynamoDB (NoSQL).
- Excellent understanding of Hadoop architecture and ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Extensive experience in importing/exporting data between RDBMS and the Hadoop ecosystem using Apache Sqoop.
- Good experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
- Extensive experience in creating complete workflow chains from scratch for multiple projects within the client domain using Apache Oozie; workflow scheduling involves MapReduce, Hive, PySpark, shell script, and email actions, with the output of one workflow fed as input to another.
- Good experience with the Cloudera platform and Cloudera Manager.
- Migrated revenue data from Oracle to Hadoop, Hive, and Amazon Redshift.
- Very strong industry experience in Apache Hive for data transformation.
- Strong experience in both development and maintenance/support projects.
- Good team player with excellent communication skills, able to work in both team and individual environments.
- Strong exposure to IT consulting, software project management, team leadership, design, development, implementation, maintenance/support, and integration of enterprise software applications.
- Extensive experience in conducting feasibility studies, plan reviews, implementations, and post-implementation surveys.
- Demonstrated ability to work independently, showing a high degree of self-motivation and initiative.
- Excellent problem-solving and troubleshooting capabilities; quick learner, result-oriented, and an enthusiastic team player.
- Extensive experience in designing and developing Spark applications using Python.
- Excellent analytical and problem-solving skills.
TECHNICAL SKILLS:
Big Data Technologies: Apache Spark 2.3, PySpark (Python API for Spark), Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, ZooKeeper, Oozie, Impala, Apache Avro
Programming Languages: Python, Java, COBOL
Databases: Oracle, DB2, HBase, MySQL, Redshift
AWS: Lambda, EMR, Redshift, CFT, ECS, SNS, SES, Glue, Data Pipeline, S3, API Gateway, Athena, Amazon Kinesis, DynamoDB (NoSQL)
NoSQL Database: DynamoDB
Operating Systems: Windows, Linux, UNIX
Schedulers: Control-M and Oozie
Other Tools/Utilities: TSO/ISPF, QMF, SPUFI, SDF II, Changeman, CVS, SVN, Git
Defect Tracking Tools: HP Quality Center
PROFESSIONAL EXPERIENCE:
Confidential
Big Data Technical Architect
Responsibilities:
- Designed and developed end-to-end applications for data ingestion, organized data layers, and business use cases.
- Developed DynamoDB components to store the insights data.
- Developed AWS Glue ETL jobs using PySpark to securely transform datasets in S3 curated storage into consumption data views.
- Worked on Continuous Integration, Continuous Deployment, Build Automation, and Test-Driven Development to enable rapid delivery of end-user capabilities using the Amazon Web Services (AWS) stack (CodeCommit, CodeDeploy, CodePipeline, CodeBuild, IAM, CFT).
- Designed and developed the insights applications using AWS Lambda, SNS, Glue, S3, API Gateway, and Athena.
- Developed a PySpark AWS Glue ETL job to process raw and aggregated insights data in Parquet and push the output to S3 and DynamoDB (see the sketch after this list).
- Created the system architecture/design and performed software development for the insights application.
- Worked on AWS CloudFormation to provision AWS resources (S3, SNS, RDS, EMR, Glue, Lambda, DynamoDB).
- Developed Spark code to implement data quality checks that validate processed data across the system (record counts, flight counts, etc.).
- Leveraged Amazon Athena for ad-hoc query analytics.
- Analyzed business requirements, produced the design/architecture identifying the different components and flow diagrams, and discussed them with the team.
- Participated in the end-to-end project life cycle, from requirements and design through development and testing.
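A minimal, hypothetical sketch of the PySpark AWS Glue ETL pattern referenced above (read raw Parquet from S3, aggregate, write back to S3 and DynamoDB); the bucket paths, column names, and DynamoDB table name are illustrative placeholders rather than project values:

    # Illustrative PySpark Glue job skeleton; all names are placeholders.
    import sys
    import boto3
    from awsglue.context import GlueContext
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext
    from pyspark.sql import functions as F

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    spark = glue_context.spark_session

    # Raw insights data landed in S3 as Parquet
    raw = spark.read.parquet("s3://example-insights-raw/events/")

    # Aggregate into a consumption-ready daily view
    daily = (raw.groupBy("customer_id", "event_date")
                .agg(F.count("*").alias("event_count")))

    # Persist the curated view back to S3 in Parquet
    daily.write.mode("overwrite").parquet("s3://example-insights-curated/daily_counts/")

    # Push a bounded slice of aggregates to DynamoDB for low-latency lookups
    table = boto3.resource("dynamodb").Table("insights_daily_counts")
    with table.batch_writer() as writer:
        for row in daily.limit(1000).collect():
            writer.put_item(Item={
                "customer_id": row["customer_id"],
                "event_date": str(row["event_date"]),
                "event_count": int(row["event_count"]),
            })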
Confidential
Big Data Technical Lead
Environment: AWS EMR, AWS S3, AWS CloudWatch, RDS (MySQL), HDFS, Hive, Redshift, Sqoop, Oozie Workflows, Shell Scripts, Spark
Responsibilities:
- Analyzed the requirements and the existing environment to help define the right strategy to build the BIC system.
- Developed Spark and Hive scripts for data processing.
- Developed Oozie workflows and coordinators to integrate other systems such as Denodo, Hadoop ETL (Hive, Sqoop), Redshift, and CloudWatch.
- Enabled the Oozie SLA feature to alert on long-running jobs.
- Built and owned the data ingestion process from different sources into the Hadoop cluster.
- Developed PySpark jobs to process raw data in Parquet and push the output to S3.
- Worked on ETL scripts to pull data from the Denodo database into HDFS.
- Developed Hive tables to load data from different sources.
- Involved in database schema design.
- Developed scripts to load data into Redshift from Hive tables (see the sketch after this list).
- Created different views in Redshift for different applications.
- Stored job status in MySQL RDS.
- Proposed an automated shell-script-based system to run the Sqoop jobs.
- Worked in an Agile development approach.
- Created the estimates and defined the sprint stages.
- Mainly worked on Hive queries to categorize data of different claims.
- Set up CloudWatch monitoring for the application.
- Monitored system health and logs and responded to any warning or failure conditions.
- Involved in the design of distribution styles for Redshift tables.
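A minimal, hypothetical sketch of the Hive-to-Redshift load referenced above: data exported from Hive to S3 is bulk-loaded into a Redshift table with a COPY command issued from Python. The connection details, bucket, table, and IAM role are placeholders, not project values:

    # Illustrative Redshift COPY driver; all identifiers are placeholders.
    import psycopg2

    conn = psycopg2.connect(
        host="example-cluster.redshift.amazonaws.com",
        port=5439, dbname="bic", user="etl_user", password="***")

    copy_sql = """
        COPY reporting.claims_summary
        FROM 's3://example-bic-exports/claims_summary/'
        IAM_ROLE 'arn:aws:iam::123456789012:role/example-redshift-copy-role'
        FORMAT AS PARQUET;
    """

    with conn, conn.cursor() as cur:
        cur.execute(copy_sql)   # bulk-load the exported Hive data into Redshift
    conn.close()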
Confidential
Big Data Technical Lead
Environment: CDH 5, HDFS, Hive, Impala, Java, Sqoop, Tableau, Oozie Workflows, Shell Scripts, IntelliJ, Gradle, Core
Responsibilities:
- Analyzed the requirements and the existing environment to help define the right strategy to build the DDSW system.
- Designed and executed Oozie workflows using Hive, Python, and shell actions to extract, transform, and load data into Hive tables.
- Worked extensively with Avro and Parquet file formats.
- Involved in low-level design of MapReduce, Hive, Impala, and shell scripts to process data.
- Worked on ETL scripts to pull data from the Oracle database into HDFS.
- Developed Hive tables to load data from different sources.
- Involved in database schema design.
- Involved in sprint planning and sprint retrospective meetings.
- Participated in the daily Scrum status meeting.
- Proposed an automated shell-script-based system to run the Sqoop jobs.
- Worked in an Agile development approach.
- Created the estimates and defined the sprint stages.
- Developed a strategy for full and incremental loads using Sqoop.
- Mainly worked on Hive/Impala queries to categorize data of different claims.
- Implemented partitioning, dynamic partitions, and buckets in Hive (see the sketch after this list).
- Generated final reporting data using Tableau for testing by connecting to the corresponding Hive tables using the Hive ODBC connector.
- Wrote Python scripts to generate alerts.
- Monitored system health and logs and responded to any warning or failure conditions.
- Implemented a POC on AWS.
- Worked on Kerberos authentication for Hadoop.
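A minimal, hypothetical sketch of the Hive dynamic-partitioning pattern referenced above, expressed through Spark's Hive support (bucketing was handled separately in the Hive DDL and is omitted here); the database, table, and column names are illustrative placeholders:

    # Illustrative dynamic-partition load into a partitioned Hive table.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("hive-partitioning-sketch")
             .enableHiveSupport()
             .getOrCreate())

    # Partitioned, Parquet-backed target table (placeholder schema)
    spark.sql("""
        CREATE TABLE IF NOT EXISTS claims.claims_by_year (
            claim_id STRING,
            member_id STRING,
            claim_amount DOUBLE
        )
        PARTITIONED BY (claim_year INT)
        STORED AS PARQUET
    """)

    # Allow dynamic partitions, then load from a staging table;
    # the partition column must be the last column selected.
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql("""
        INSERT OVERWRITE TABLE claims.claims_by_year PARTITION (claim_year)
        SELECT claim_id, member_id, claim_amount, claim_year
        FROM claims.claims_staging
    """)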
Confidential
Big Data Technical Lead
Environment: Hadoop, HDFS, MapReduce, HBase, Hive, Flume, Oozie, DB2 and Cloudera Hadoop Distribution (CDH 4).
Responsibilities:
- Lead developer for migrating application archival data to the Big Data platform.
- Worked on analyzing the Hadoop cluster using different big data analytic tools, including Hive and MapReduce.
- Responsible for managing data from multiple sources.
- Involved in managing and reviewing Hadoop log files.
- Involved in creating Hive tables, and loading and analyzing data using Hive queries.
- Developed simple to complex MapReduce jobs using Hive.
- Created HBase tables to store data in various formats.
- Implemented test scripts to support test-driven development and continuous integration.
- Worked on tuning the performance of Hive queries.
- Responsible for building scalable distributed data solutions.
- Imported and exported data into HDFS and Hive using Sqoop.
- Extracted data from DB2 through Sqoop, placed it in HDFS, and processed it (see the sketch after this list).
- Supported MapReduce programs running on the cluster.
- Provided batch processing solutions for large volumes of unstructured data using the Hadoop MapReduce framework.
- Involved in scheduling the Oozie workflow engine to run multiple Hive jobs.
- Actively participated in the software development lifecycle (scope, design, implement, deploy, test), including design and code reviews, test development, and test automation.
- Documented system processes and procedures for future reference.
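A minimal, hypothetical sketch of the Sqoop-based DB2-to-HDFS extraction referenced above, wrapped in a small Python driver; the JDBC URL, credentials path, table, and target directory are illustrative placeholders:

    # Illustrative Sqoop import driver; all connection details are placeholders.
    import subprocess

    def sqoop_import(table, target_dir):
        """Run a Sqoop import for one DB2 table into HDFS."""
        cmd = [
            "sqoop", "import",
            "--connect", "jdbc:db2://example-db2-host:50000/ARCHDB",
            "--username", "etl_user",
            "--password-file", "/user/etl_user/.db2.password",
            "--table", table,
            "--target-dir", target_dir,
            "--num-mappers", "4",
            "--as-avrodatafile",          # Avro output for downstream Hive/MapReduce
        ]
        subprocess.run(cmd, check=True)   # raise if the import fails

    if __name__ == "__main__":
        sqoop_import("CLAIMS_ARCHIVE", "/data/raw/claims_archive")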
Confidential
Technical Lead
Responsibilities:
- Involved in different phases of gathering requirements, documenting the functional specifications, design, data modeling, and development of the applications.
- Developed POCs for different use cases to implement big data solutions.
- Loaded and transformed large sets of structured and semi-structured data.
- Configured workflows to run on top of Hadoop using Spring Batch and shell scripts; these workflows comprise heterogeneous jobs such as Hive and MapReduce to ingest offloaded cold data from the enterprise data warehouse into HDFS for archival.
- Created JCLs (using ICETOOL and SORT) to copy VSAM files to flat files, convert the data types to a readable format (text files), and FTP them to the Hadoop cluster (HDFS).
- Developed a suite of unit test cases for Mapper, Reducer, and Driver classes using the MR testing library.
- Unloaded reference data from DB2 to HDFS using Sqoop.
- Worked on Hive for exposing data for further analysis.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Imported and exported data into HDFS and Hive using Sqoop.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Involved in writing Hive scripts to extract, transform, and load the data into the database.
- Developed shell-script workflows to automate running MapReduce jobs and Hive scripts on the data imported from mainframes (see the sketch after this list).
- Actively participated in the software development lifecycle (scope, design, implement, deploy, test), including design and code reviews, test development, and test automation.
- Built the Hadoop cluster, ensuring NameNode high availability, mixed-workload management, performance optimization, health monitoring, and backup and recovery across one or more nodes.
- Installed/configured/maintained Apache Hadoop clusters for application development and Hadoop tools such as Hive, HBase, ZooKeeper, and Sqoop.
- Scheduled and managed cron and Control-M jobs and wrote shell scripts to generate alerts.
- Monitored and managed daily jobs through RabbitMQ and an Apache dashboard application.
- Used Sqoop to efficiently transfer data between databases and HDFS and used Flume to stream log data from servers.
- Implemented NameNode backup using NFS for high availability.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Set up alerts with Cloudera Manager for memory and disk usage on the cluster.
- Configured property files such as core-site.xml, hdfs-site.xml, and mapred-site.xml based on the requirements.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Installed the Oozie workflow engine to run multiple Hive jobs.
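The automation workflow referenced above was built with shell scripts; the hypothetical sketch below renders the same sequencing idea in Python for illustration only, with all paths, jar names, and script names as placeholders:

    # Illustrative job chain: land the mainframe extract, clean it with
    # MapReduce, then load it into Hive; stop on the first failure.
    import subprocess

    STEPS = [
        # 1. Land the FTP'd mainframe extract into HDFS
        ["hdfs", "dfs", "-put", "-f", "/staging/mainframe/claims.txt", "/data/raw/claims/"],
        # 2. Clean/standardize the raw records with a custom MapReduce job
        ["hadoop", "jar", "etl-jobs.jar", "com.example.etl.ClaimsCleanser",
         "/data/raw/claims/", "/data/clean/claims/"],
        # 3. Load the cleansed data into the Hive archive table
        ["hive", "-f", "/scripts/load_claims_archive.hql"],
    ]

    for step in STEPS:
        print("Running:", " ".join(step))
        subprocess.run(step, check=True)   # abort the chain if any step fails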
Confidential
Sr. Software Engineer
Environment: Java, JSP, Oracle, JDBC Template, Tomcat, JavaScript, XML, DHTML, CSS, HTML, JQuery
Responsibilities:
- Involved in Analysis, Design, Development and Testing of application modules.
- Analyzed complex system relationships and improved the performance of various screens.
- Developed JSP pages using JavaScript, jQuery, and AJAX for client-side validation and CSS for data formatting.
- Developed various reports using Adobe APIs and Web services.
- Preparation of Technical Design for the enhancement and maintenance requests.
- Review of Code, Unit Test Plan and Unit test results.
- Team tracking and Issue management.
- Wrote test cases using JUnit and coordinated with the testing team for integration tests.
- Fixed bugs and improved performance through root-cause analysis in production support.
Confidential
Sr. Software Engineer
Environment: Java, J2EE (JSPs & Servlets), JUnit, HTML, CSS, JavaScript, Apache Tomcat, Oracle
Responsibilities:
- Involved in different phases of gathering requirements, documenting the functional specifications, design, data modeling, and development of the applications.
- J2EE server side development to support business logic, integration, and persistence.
- Used JSP with Spring Framework for developing User Interfaces.
- Integrated Security Web Services for authentication of users.
- Responsible for Testing and moving the application into Staging and Production environments.
- Responsible for Project Documentation, Status Reporting and Presentation.
- Used CVS version control to maintain the Source Code.
Confidential
Sr. Software Engineer
Environment: Java, J2EE (JSPs & Servlets), JUnit, HTML, CSS, JavaScript, Apache Tomcat, Oracle
Responsibilities:
- Involved in requirements analysis and prepared Requirements Specifications document.
- Designed implementation logic for core functionalities
- Developed service-layer logic for core modules using JSPs and Servlets and was involved in integration with the presentation layer.
- Involved in implementation of presentation-layer logic using HTML, CSS, JavaScript, and XHTML.
- Designed the Oracle database to store customer and account details.
- Used JDBC connections to store and retrieve data from the database.
- Development of complex SQL queries and stored procedures to process and store the data
- Developed test cases using JUnit
- Involved in unit testing and bug fixing.
- Used CVS version control to maintain the Source Code.
- Prepared design documents for code developed and defect tracker maintenance.
Confidential
Sr. Programmer Analyst
Environment: Core Java, Java Batch, Service Beans, EJB, RMI/IIOP, J2EE, COBOL390, CICS, DB2
Responsibilities:
- Responsible for Proof of Concept, Planning, Designing new proposed Architecture.
- Worked on Java, Swing, Web services, XML in addition to Mainframe Technology.
- Extracted the business rules from Legacy COBOL programs to code in Java.
- Used the latest methodologies to convert the existing mainframe programs to Java and Java batch.
- Completed the migration with the limited resources available on the mainframe.
- Fine-tuned application programs with the help of the DBA.
- Utilized transaction wrapper technology (EJB, Batch, ServiceBean on a WebSphere cluster).
- Attended the functional meetings and prepared the high-level detailed design document.
- Designed high- and low-level design documents for the new functions to be implemented.
- Supported the restructuring of DB2 tables by rewriting the existing programs.
- Debugged and troubleshot technical issues while implementing the applications.
- Implemented a Java client-based OLTP process with a WebSphere server running on the mainframe z/OS host.
Confidential
Programmer Analyst
Responsibilities:
- Procured the project requirements from business analysts and users, broke the project delivery into phases, and met the deadlines per the estimates.
- Transformed the business requirements into design.
- Prepared the analysis, estimation, and design.
- Single point of contact between customer and offshore team members.
- Prepared high-level and low-level design based on business requirement document.
- Preparation of Technical Specifications by using high-level design and business requirement document.
- Provided module inventory and estimates by identifying the impacted components.
- Shared business and technical knowledge with other team members.
- Coded complex programs and report programs (batch and online) in COBOL/VSAM/DB2/CICS.
- Prepared analysis documents, modified programs/JCLs, and performed peer reviews.
- Prepared the unit test case document and the coding and unit test results documents.
- Developed the maps and the online and batch programs and performed reviews of test cases and code.
- Resolved defects in the SIT/UAT phases and provided implementation support.
- Prepared defect log and defect action plan documents after implementation.
- Mentored and motivated team members, enabling the team to work independently on tasks.