Hadoop / Spark Development Lead Resume
Phoenix, AZ
SUMMARY
- Over 11 years of experience in the full software development life cycle (SDLC), including requirements gathering, analysis, design, development, writing technical/system specifications, interface development, and implementation of many successful Mainframe, distributed, and Big Data applications.
- Over 2 years of strong experience designing and implementing Big Data solution architectures from the ground up using cutting-edge Big Data technologies.
- Strong hands-on experience implementing Big Data solutions using a technology stack that includes Hadoop MapReduce, Hive, HDFS, Spark, Sqoop, Flume, and Oozie.
- Experience with multiple Big Data distributions, including Cloudera CDH 5.x and Hortonworks HDP 2.4.
- Experience in the Scala functional programming language.
- Experience in UNIX shell scripting, cron job scheduling, and working on Linux infrastructure.
- Strong experience with SQL scripts and PL/SQL in Oracle and MySQL databases, and with reporting solutions using Jasper Reports.
- Expertise in architecting, designing, and implementing large-scale mainframe enterprise applications using COBOL, DB2, IMS, JCL, VSAM, REXX, SAS, and FOCUS.
- Extensively worked with Microsoft tools for documentation and presentations: Visio, Word, PowerPoint, and Excel macros.
- Strong experience in designing and implementing reusable balancing and reconciliation controls using the Infogix product suite (ACR/Summary, ACR/Detail, Assure, Perceive, Insight, and ER).
- Built and mentored product development and engineering teams able to handle independent responsibilities from requirements through project delivery.
- Sound knowledge of SDLC models such as Agile, Scrum, and Test-Driven Development.
- Proficiency in object-oriented technology, object-relational modeling, and defining SOA roadmaps.
- Expertise in creating solutions and choosing technologies that map to company needs, with a deep understanding of prerequisites and hardware/software requirements.
- An excellent team player with extraordinary problem-solving and troubleshooting capabilities, able to work under pressure with minimal or no supervision.
- Exposure to post-production support and maintenance of applications.
- Experience in creating SDLC documents such as HLD (High-Level Design), DD (Detailed Design), AID (Application Interface Design), IDT (Interface Design Document), TRN (Technical Release Notes), and User Guides.
- Highly self-motivated to learn new skills to meet businesses' growing demand to modernize legacy applications with new technology solutions and innovations.
TECHNICAL SKILLS
Big Data and Data Science Technologies: Hadoop, MapReduce, Pig, Hive, HDFS, Spark, YARN, Zookeeper, Sqoop, Flume, Oozie, CDH5.x
Programming Languages: Scala, Bash Shell scripting, COBOL, MVS/JCL, REXX
Databases: DB2, Oracle, IMS, MySQL, SQL
Tools/Utilities/APIs: FOCUS, SAS, Jasper Reports, DFSORT, REXX, SPUFI, File Manager, QMF, Infogix product suite (ACR/Summary, ACR/Detail, Assure, Perceive, ER)
Development Tools: SVN, PuTTY, SQL Developer, SharePoint, ServiceNow
Development Processes: Scrum, classical SDLC, and eXtreme Programming
Operating Systems: Linux, Mainframes (z/OS, OS/390), Windows
PROFESSIONAL EXPERIENCE
Confidential, Phoenix, AZ
Hadoop / Spark Development Lead
Environment: Hadoop 2.7, YARN, Apache Spark, Pig, Hive, Oozie, Sqoop
Responsibilities:
- Involved in use case analysis, design and development of Big Data solutions using Hadoop for credit analysis, product performance monitoring and reporting.
- Implemented data ingestion from various Oracle tables using Sqoop and optimized the data loading (a sketch of this ingestion pattern follows this list).
- Facilitated review meetings and brainstorming sessions, and shared best practices for the Hadoop implementation.
- Gathered business requirements from the business stakeholders and data experts.
- Coordinated the Hadoop system setup and data flow implementation with the IT architect.
- Created HQL scripts for BI users to analyze intermediate results.
- Validated Hive results and evaluated end-user formatting and file-type requirements.
- Assisted other application leads with testing to complete end-to-end process validation.
- Created Hive and Pig scripts for data manipulation and cleansing to meet merchant pricing requirements.
- Developed merchant pricing calculations with additional requirements using user-defined algorithms.
- Designed and developed a fully dynamic processing application that expands the processing of accounting and reporting.
- Extracted and processed required data using Spark RDDs.
- Created workflows in Oozie and enabled scheduling for recurring processes.
- Managed onsite/offshore work and coordinated deliverables across multiple vendors.
- Explored new processes incorporating Spark for enterprise requirements.
- Monitored weekly batch execution and performed HDFS maintenance.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs required by different business users.
- Used Hortonworks Ambari to monitor the Hadoop ecosystem.
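A minimal sketch of the kind of Sqoop ingestion wrapper described above, assuming a shell-driven pipeline; the connection string, table name, split key, and paths are hypothetical placeholders, not actual production values.

```bash
#!/usr/bin/env bash
# Hypothetical Sqoop ingestion wrapper: imports one Oracle table into a
# dated HDFS directory for downstream Hive/Spark processing. All names,
# paths, and connection details are illustrative placeholders.
set -euo pipefail

TABLE="${1:?usage: $0 TABLE_NAME}"           # e.g. MERCHANT_PRICING
RUN_DATE=$(date +%Y%m%d)
TARGET_DIR="/data/raw/${TABLE}/${RUN_DATE}"

# Parallel import with 4 mappers split on the primary key; Snappy
# compression keeps the raw zone compact (tuned per table).
sqoop import \
  --connect "jdbc:oracle:thin:@//dbhost:1521/ORCL" \
  --username etl_user \
  --password-file /user/etl/.dbpass \
  --table "$TABLE" \
  --split-by ID \
  --num-mappers 4 \
  --target-dir "$TARGET_DIR" \
  --compress \
  --compression-codec org.apache.hadoop.io.compress.SnappyCodec

echo "Loaded ${TABLE} into ${TARGET_DIR}"
```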
Confidential, Phoenix, AZ
Hadoop / Spark Development Lead
Environment: Apache Hadoop 2.0, Pig, Hive, HBase, Sqoop, Spark, Oozie, Linux, shell scripting, MySQL, Ambari, and Hue.
Responsibilities:
- Worked on Hadoop platform to implement Big Data solutions using Pig, Hive, Spark and shell scripting.
- Developed services to ingest and extract data feeds into the application use-case area from RDBMS databases such as Oracle using Sqoop.
- Developed bash shell scripts invoking Hive HQL scripts and created the appropriate dependencies (a sketch of this pattern follows this list).
- Implemented critical solution components using technologies including Hadoop, MapReduce, Hive, Spark, HDFS, Sqoop, and Flume.
- Scheduled batch jobs using crontab and Oozie workflows.
- Extracted and processed required data using Spark RDDs.
- Analyzed and created solution diagrams and documentation for business presentations.
- Provided optimization recommendations and solutions for existing processes and designs.
- Led and coordinated the offshore development team for development and unit testing.
- Performed code reviews and coordinated with the QA team for system and user testing.
- Implemented and coordinated production deployments and application maintenance.
- Established best practices and techniques for data compression and compaction.
- Identified and evaluated new Big Data technologies, products, and tools to fill gaps in the overall enterprise architecture for future business needs.
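A minimal sketch of the shell-plus-HQL scheduling pattern noted above: a bash wrapper invokes an HQL script through the Hive CLI with a date variable, and a crontab entry schedules it. The script name, paths, and schedule are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical nightly wrapper: runs an HQL script via the Hive CLI,
# passing the run date and logging output so cron (or an Oozie shell
# action) can surface failures. Paths and names are illustrative.
set -euo pipefail

RUN_DATE=$(date +%Y-%m-%d)
LOG="/var/log/etl/daily_load_${RUN_DATE}.log"

# ${hivevar:run_date} would be referenced inside daily_load.hql.
hive --hivevar run_date="$RUN_DATE" \
     -f /opt/etl/hql/daily_load.hql >"$LOG" 2>&1

# Illustrative crontab entry scheduling the wrapper at 02:30 daily:
# 30 2 * * * /opt/etl/bin/daily_load.sh
```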
Confidential
Technical Lead
Environment: Linux, Bash scripting, Oracle, Jasper Reports, Infogix product suite
Responsibilities:
- Developed balancing and reconciliation controls using the Infogix Assure tool.
- Developed Jasper reports to produce on-demand reports and graphs from the Infogix Perceive application.
- Created bash shell scripts to monitor application usage, including CPU, mount, and heap usage (a sketch of this monitoring pattern follows this list).
- Created many summary controls using Infogix Assure to monitor daily data and report any discrepancies.
- Involved in designing and developing new balancing controls and enhancing existing controls.
- End-to-end involvement from analysis through warranty, including all quality processes related to the projects.
- Supported periodic audit requirements by providing supporting data and reports from the application.
- Created bash shell scripts to monitor the arrival of critical files, report errors from application logs, and track the timing of scheduled jobs.
- Performed post-production monitoring of controls and troubleshot application-related issues.
- The role also involved leading and guiding the team in all project-related tasks.
- Involved in the review and lead-level approval of all major deliverables in the project.
- Provided extensive support to the team in resolving project issues and day-to-day activities.
- Developed Excel macros to automate the validation of balancing results.
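A minimal sketch of the kind of monitoring script described in the bullets above, assuming threshold-based checks over `df` output and an expected inbound file; the threshold, paths, and alert text are illustrative assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical monitoring sketch: flags filesystems above a usage
# threshold and reports a missing critical inbound file. Thresholds
# and paths are illustrative, not actual production values.
set -u

THRESHOLD=85                                  # alert above 85% mount usage
CRITICAL_FILE="/data/inbound/daily_feed_$(date +%Y%m%d).dat"

# Scan df output (POSIX format) and flag any mount over the threshold.
df -P | awk -v t="$THRESHOLD" \
  'NR > 1 { gsub("%", "", $5); if ($5 + 0 > t) print "ALERT: " $6 " at " $5 "% usage" }'

# Report if the expected critical file has not arrived.
if [[ ! -f "$CRITICAL_FILE" ]]; then
  echo "ALERT: expected file ${CRITICAL_FILE} has not arrived"
fi
```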
Confidential
Technical Lead
Environment: Mainframes, Infogix product suite
Responsibilities:
- The role involved managing the end-to-end project life cycle, from requirements analysis through warranty.
- Coordinated with multiple business stakeholders to gather control requirements.
- Designed balancing controls using mainframe technologies (JCL, VSAM) and Infogix ACR tools to meet client quality and audit standards.
- Managed the team and coordinated development activities.
- Coordinated with multiple teams for system and UAT testing.
- Coordinated the incident-reduction initiative, which saved around $100K per year.
- Involved in the migration from Jobtrac to the Control-M scheduler, which included more than 3,000 jobs.
- Led the team in multiple LEAN initiatives for cost savings.
- Handled incidents for projects in warranty and troubleshot issues.
- Provided transition to the support team on the implemented controls, along with the required documentation.
- Defined the quality processes and documentation to be followed to ensure defect-free deliverables.
- Supported external audits (ISO) by maintaining documentation for all implemented projects and submitting the required evidence.
Confidential
Senior Software Engineer
Environment: Mainframes (COBOL, JCL, DB2, VSAM, REXX, Syncsort)
Responsibilities:
- Handled work items ranging from minor changes to large development projects.
- The project involved direct interaction with US clients; served as the onsite coordinator, ensuring smooth onsite-offshore coordination.
- Handled many critical and large work items (spanning months) that required deep understanding of the system and its business.
- Performed the role of module lead in the project and managed a team of 4.
- Each work item involved working through all project phases, from impact analysis to implementation and post-production support.
- The CLMI Portfolio project covers major applications such as ACER (Agent Account Experience Reporting), AFRF (Agent Financial Reporting Facility), Ratemaking, Controllable Income Statement, and Pool Reporting.
- Interacted with multiple business users, such as Actuarial Group members, Financial Group members, and the Residual Market Group.
- Involved in the complete software development life cycle for all work items delivered (impact analysis, design, test case preparation, development, unit and system testing, and implementation).
- Handled multiple tasks in parallel to meet aggressive deadlines.
- Provided technical and business solutions for report level performance improvement.
- Developed REXX tools to automate the system testing process and ensure speedy delivery.
Confidential
Software Engineer
Environment: Mainframes (COBOL, IMS, JCL, DB2, VSAM, Syncsort, NDM)
Responsibilities:
- Developed mainframe applications using COBOL, JCL, and VSAM to extract data from the IMS database and VSAM files.
- The extracted data was processed and stored in DB2 tables using LOAD utilities.
- Involved in the design of the DB2 database to store the application-related data.
- Performed maintenance projects to extract additional sources of data and load them into the DB2 tables.
- Developed multiple application programs to extract data from the centralized database and create reports for various downstream systems.
- Moved reports from the mainframe application to distributed platforms using NDM (Network Data Mover) utilities.
- Proactively worked as Quality Lead and defined processes to ensure better quality for all deliverables.
- As Quality Lead, successfully handled the ISAE 3402 (formerly SAS 70) audits twice for the team, as well as the ISO audits.
- Ensured defect-free deliverables by conducting interim defect-prevention activities within the team.