
Big Data Architect Resume


CA

SUMMARY

  • Innovative, hands-on, results-oriented leader and software architect who thrives in challenging environments. A solid track record in recruiting, mentoring, & managing distributed software development & IT teams.
  • Extensive experience in SDLC and Big Data apps/processes utilizing Agile & Scrum methodologies.
  • 15+ years of experience in developing applications & processes for various types of data collection & ingestion (ETL) for market research and advanced analytics of financial, high-tech, retail, media, digital, & e-commerce data.
  • 4+ years of professional hands-on experience in Hadoop ecosystem development & architecture.
  • Extensive experience working with HDFS, Pig, Sqoop, Flume, Hive, HBase, Phoenix, and the ELK Stack.
  • Prototyping next-generation processes with Spark, Spark SQL, Spark Streaming, and Spark ML in Python and Scala.
  • Presented at the Dreamforce conference in 2014 on QlikView implementation, usage, and best practices.
  • Hands-on experience with NoSQL databases including HBase and Apache Phoenix.
  • Experienced with business intelligence tools such as Oracle BI, Business Objects, Qlik, Tableau, and BIRT.
  • Implemented Waterline Data on the Hadoop enterprise data lake to find, understand, and govern data in Hadoop.
  • Experienced in ingesting data from various sources using Sqoop and Flume.
  • Experienced in importing and exporting data between RDBMS and HDFS using Sqoop.
  • Experienced in performance tuning of Hive for scalability and faster execution.
  • PMI-certified professional with the ability to lead and manage teams to successfully execute projects.
  • Presented to C-level staff and at several conferences and meetups.
  • Extensive experience in several data warehouse and business intelligence implementations.
  • Proficient and experienced in planning, analyzing, designing, & architecting software applications (SDLC), including interaction with end users, developers, & testers at multiple locations, along with troubleshooting, optimization, performance tuning, and analytical & problem-solving skills.
  • Extensive experience with client interaction, requirement analysis, negotiating features, and developing new software products/applications. Evaluates existing processes and suggests innovative ways to optimize them and improve ROI.
  • Self-motivated, with excellent written and verbal communication skills both as an individual and in a team environment. Consistently demonstrates solid leadership, manages multiple projects at the same time, works well in a challenging, fast-paced environment, and adapts to change quickly.

TECHNICAL SKILLS

Cloud & SaaS Platforms: AWS, Azure

Hadoop Ecosystem: HDFS, YARN, MapReduce, Sqoop, Flume, Hive, HBase, Apache Spark, Pig, Spark SQL, Spark Streaming, Spark ML, Cloudera, Hortonworks, Amazon EMR, ELK Stack (Elasticsearch, Logstash, Kibana)

Visualization: Oracle BI, SAP BW/Business Objects, Qlik, Tableau, BIRT, Kibana

Database: Oracle, SQL Server, MySQL, HBase

Data Ingestion: Informatica PowerCenter, DataStage

Operating Systems: Windows Server (2008, 2012), Linux (Red Hat, Ubuntu, CentOS)

Programming Language: Java, Python, PHP

Methodologies: Waterfall, Agile, SCRUM, Kanban

Atlassian Products: Jira, Confluence, Stash (Git)

PROFESSIONAL EXPERIENCE

Confidential, CA

Big Data Architect

Responsibilities:

  • Won a hackathon by showcasing the idea for a customer scoring initiative and engaging customers for feedback.
  • Worked with the data science team to implement the customer scoring algorithm, which can be leveraged for churn prevention, customer satisfaction, cross-selling, etc.
  • Developed a Sqoop-based data ingestion framework to ingest data from relational sources for the IDW and the digital and business banking applications, including large tables with 6 billion+ records (a sketch follows this list).
  • Built a Flume-based solution to ingest global logging from various logging and transaction events, multiplexed into HDFS and HBase (see the configuration sketch after this list).
  • Implemented the Flume-to-HBase multiplexing to support the fraud analytics feed to a third-party company, Guardian Analytics.
  • Designed the business banking billing solution based on Sqoop- and MOVEit-based data migration to the finance team; a detailed report in the admin platform portal enabled financial institutions to verify billing accuracy.
  • Led the technology evaluation of visualization tools, comparing Qlik Sense and Tableau, which resulted in the selection of Tableau.
  • Performed a POC connecting Spark to Tableau for Spark SQL-based analytics (see the sketch after this list).
  • Recommended and designed customer engagement meter projects to determine a customer score, which can be leveraged for churn prevention, customer satisfaction, cross-selling, etc.
  • Worked with Logstash and Kibana from the ELK Stack to extract and parse data and present dashboards within the analyst portal (an illustrative pipeline follows this list).
  • Identified and set up open-source reporting using BIRT.
  • Developed a CDH upgrade plan to ensure all aspects of the upgrade were covered without impacting existing functionality.
  • Evaluated the CDH 5.3 release bug list to ensure no known bug would impact the upgrade, and developed a mitigation plan.
  • Prepared a step-by-step implementation and migration playbook by performing the steps in the sandbox instance and documenting them for the production support team.
  • Installed and set up the new CDH 5.3 release using Cloudera Manager.
  • Deployed the HbaseSink and HbaseFilter components on the new cluster.
  • Updated the Flume configuration to start sending data to the new cluster.
  • Updated the Hive configuration to point to the new ZooKeeper quorum.
  • Migrated additional nodes from the old cluster to the new cluster to increase cluster capacity.
  • Performed end-to-end validation and testing, working with the Cloudera support team on bug fixes.
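
A minimal sketch of the Sqoop ingestion framework idea referenced above, assuming a Python wrapper around the sqoop CLI. The JDBC URL, credentials file, table names, and HDFS paths below are placeholders, not the production values; the real framework would also add auditing, retries, and incremental imports.

    #!/usr/bin/env python
    """Illustrative Sqoop-based ingestion wrapper (placeholder values)."""
    import subprocess

    JDBC_URL = "jdbc:oracle:thin:@dbhost:1521/ORCL"  # placeholder source DB
    PASSWORD_FILE = "/user/etl/.sqoop.pw"            # password file kept in HDFS

    def ingest_table(table, target_dir, split_col, mappers=8):
        """Run one parallel Sqoop import from the RDBMS into HDFS."""
        subprocess.check_call([
            "sqoop", "import",
            "--connect", JDBC_URL,
            "--username", "etl_user",
            "--password-file", PASSWORD_FILE,
            "--table", table,
            "--target-dir", target_dir,
            "--split-by", split_col,        # column used to split the import
            "--num-mappers", str(mappers),  # raise parallelism for 6B+ row tables
            "--as-avrodatafile",            # Avro keeps the schema with the data
        ])

    if __name__ == "__main__":
        ingest_table("TRANSACTIONS", "/data/idw/raw/transactions", "TXN_ID")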
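
An illustrative Flume agent configuration for the global-logging fan-out described above, using a multiplexing channel selector to route events to HDFS and HBase. The agent name, the "feed" header, the table, and the paths are assumptions rather than the production topology.

    # One Avro source fanned out to HDFS and HBase channels
    agent.sources  = logSrc
    agent.channels = hdfsCh hbaseCh
    agent.sinks    = hdfsSink hbaseSink

    agent.sources.logSrc.type     = avro
    agent.sources.logSrc.bind     = 0.0.0.0
    agent.sources.logSrc.port     = 4141
    agent.sources.logSrc.channels = hdfsCh hbaseCh

    # Route by the (hypothetical) "feed" header; everything else goes to HDFS
    agent.sources.logSrc.selector.type          = multiplexing
    agent.sources.logSrc.selector.header        = feed
    agent.sources.logSrc.selector.mapping.fraud = hbaseCh
    agent.sources.logSrc.selector.default       = hdfsCh

    agent.channels.hdfsCh.type  = memory
    agent.channels.hbaseCh.type = memory

    agent.sinks.hdfsSink.type                   = hdfs
    agent.sinks.hdfsSink.channel                = hdfsCh
    agent.sinks.hdfsSink.hdfs.path              = /data/logs/%Y/%m/%d
    agent.sinks.hdfsSink.hdfs.useLocalTimeStamp = true

    agent.sinks.hbaseSink.type         = hbase
    agent.sinks.hbaseSink.channel      = hbaseCh
    agent.sinks.hbaseSink.table        = global_logging
    agent.sinks.hbaseSink.columnFamily = events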
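
The Spark-to-Tableau POC above hinges on publishing Spark SQL tables that Tableau's Spark SQL connector can reach through the Spark Thrift Server. A minimal PySpark sketch of that idea, written against the current SparkSession API with placeholder paths and table names:

    from pyspark.sql import SparkSession

    # Share the Hive metastore so the Spark Thrift Server (and thus Tableau,
    # via JDBC/ODBC, port 10000 by default) can see the table we create.
    spark = (SparkSession.builder
             .appName("tableau-sparksql-poc")
             .enableHiveSupport()
             .getOrCreate())

    txns = spark.read.parquet("/data/idw/raw/transactions")  # placeholder path

    spark.sql("CREATE DATABASE IF NOT EXISTS analytics")
    (txns.groupBy("customer_id")
         .count()
         .write.mode("overwrite")
         .saveAsTable("analytics.txn_counts"))  # Tableau queries this table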
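
The ELK work above comes down to a Logstash pipeline that parses raw log lines into structured documents for Kibana. An illustrative configuration, where the log path, grok pattern, and index name are assumptions:

    input {
      file { path => "/var/log/portal/app.log" }
    }
    filter {
      # Parse "2015-06-01T12:00:00 INFO some message" style lines
      grok {
        match => { "message" => "%{TIMESTAMP_ISO8601:ts} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
      }
      date { match => [ "ts", "ISO8601" ] }
    }
    output {
      elasticsearch { hosts => ["localhost:9200"] index => "portal-%{+YYYY.MM.dd}" }
    }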

Confidential, San Rafael, CA

Big Data/BI Architect/Tech Lead

Responsibilities:

  • Developed an EWS (early warning system) to provide an internal customer score based on all touch points and to proactively engage customers, improving the chance of annual subscription renewal.
  • Delivered cost savings of around $250K by implementing proactive cluster management strategies.
  • As part of the UCP project, developed MapReduce jobs to move data from the ODS (operational data store) and EDW (enterprise data warehouse) into HDFS.
  • Implemented a centralized data lake in Hadoop with data from various sources.
  • Developed a fast-access layer using Amazon Redshift, connecting it to Tableau and QlikView for visualization and reporting (a load sketch follows this list).
  • Developed and documented on the wiki the data governance and security strategies for the big data platform.
  • Documented on the wiki the best practices and coding standards for the Hadoop big data platform.
  • Engaged with the admin team to ensure installation and setup of Amazon EMR.
  • Optimized big data performance by comparing Hadoop tools such as Hive, Impala, Presto, and Spark, along with other tuning options such as partitioning and bucketing (see the sketch after this list).
  • Performed a POC using Spark to provide real-time analytics capability for customer survey analysis (sketched after this list).
  • Led the Hadoop admin team on automation and monitoring for the Hadoop platform.
  • Set up the Datameer architecture and installation, with a Datameer edge node on an EC2 instance connected to the Amazon EMR cluster.
  • Developed a Hadoop big data business case justification for management and presented it to C-level executive staff.
  • Upon approval from C-level staff and management, developed an unstructured storage and big data roadmap.
  • Performed a detailed evaluation of big data technologies including Cloudera, Hortonworks, and Amazon Elastic MapReduce (EMR).
  • Attended a week-long Hadoop developer training from Hortonworks to understand the various components of big data.
  • Successfully performed a POC to convert existing EDW systems built on other ETL platforms into equivalent Java MapReduce, Hive, and Pig Latin; a background in all the ETL technologies helped me analyze and convert the existing system into Hadoop faster.
  • Ultimately implemented four full-production AWS EMR-based big data deployments for UCP, Cloud Platform, Finance, and Marketing.
  • As BI architect, developed the complete architecture for a highly complex Tier-1 implementation of QlikView 11.x for high availability and disaster recovery. Successfully established a distributed architecture with QMC, QlikView Server, and Publisher for a globally used QlikView platform.
  • Led and successfully executed several upgrades of the QlikView platform.
  • Performed solution design and developed the data-provisioning layer for several QlikView applications in the sales, service, and finance domains.
  • For the finance dashboard, developed QlikView scripts and tasks to extract a multi-layer QVD structure, finally joined into a consolidated revenue QlikMart. The dashboard has two components: an aggregated dashboard and a detailed dashboard.
  • Established the global footprint of OBIEE (Oracle BI Applications) for sales, services, and support analytics by collaborating with various business partners.
  • Engaged business partners to gather requirements, conducted workshops, and proposed ideas to streamline existing reporting processes. The role also involved project planning and tracking, change management, risk mitigation, and deployment strategies over the tenure of the project, establishing ground-up BI reporting requirements, from operations to C-level executives, across various source systems.
  • Engaged with database administrators in the installation of the BI applications suite (OBIEE, BI Apps, Informatica, DAC) in development, test, and production environments.
  • Trained end users on business intelligence analysis and facilitated KPI creation along with the operations managers in each group. Implemented the necessary information security and process controls in the project.
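
The Redshift fast-access layer above is typically fed by bulk COPY from S3, after which Tableau and QlikView query Redshift directly. A hedged Python sketch of such a load; the cluster endpoint, credentials, IAM role, table, and S3 path are all placeholders:

    import psycopg2  # standard PostgreSQL driver; Redshift speaks the same protocol

    conn = psycopg2.connect(
        host="example-cluster.abc123.us-west-2.redshift.amazonaws.com",
        port=5439, dbname="analytics", user="etl", password="***")

    # COPY is Redshift's parallel bulk-load path from S3
    with conn, conn.cursor() as cur:
        cur.execute("""
            COPY analytics.customer_scores
            FROM 's3://example-bucket/curated/customer_scores/'
            IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy'
            FORMAT AS PARQUET
        """)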
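
One of the tuning options compared above, sketched in PySpark with placeholder table and column names: a date-partitioned Hive table lets date-filtered queries prune partitions instead of scanning everything (bucketing a join key with CLUSTERED BY ... INTO n BUCKETS was evaluated the same way).

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    spark.sql("""
        CREATE TABLE IF NOT EXISTS sales_part (
            order_id    BIGINT,
            customer_id BIGINT,
            amount      DOUBLE)
        PARTITIONED BY (order_date STRING)
        STORED AS ORC
    """)

    # Allow dynamic partitions, then load from a (placeholder) staging table
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql("""
        INSERT OVERWRITE TABLE sales_part PARTITION (order_date)
        SELECT order_id, customer_id, amount, order_date
        FROM sales_staging
    """)

    # This query now reads only the 2015-06-01 partition
    spark.sql("SELECT SUM(amount) FROM sales_part "
              "WHERE order_date = '2015-06-01'").show()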
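
The real-time survey POC above can be illustrated with a minimal Spark Streaming job: consume responses as they arrive, score each one, and emit running counts per satisfaction band. The socket source, port, record format, and scoring rule are assumptions for the sketch:

    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="survey-poc")
    ssc = StreamingContext(sc, batchDuration=10)  # 10-second micro-batches

    # Stand-in for the real survey feed: "customer_id,rating" lines on a socket
    lines = ssc.socketTextStream("localhost", 9999)

    def band(line):
        """Toy NPS-style banding of a 0-10 rating."""
        rating = int(line.split(",")[1])
        if rating >= 9:
            return "promoter"
        return "passive" if rating >= 7 else "detractor"

    counts = lines.map(band).countByValue()
    counts.pprint()  # print each micro-batch's band counts

    ssc.start()
    ssc.awaitTermination()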
