Big Data Architect Resume

SUMMARY:

  • 10+ years of IT industry experience in BI, DWH and Big Data Lake, including over 3 years with the Cloudera and Hortonworks Hadoop ecosystems.
  • Expertise in end-to-end greenfield project planning and implementation, starting from scope management, across environments including release-based maintenance, custom application development, enterprise-wide application deployment, testing support and quality management, in adherence to international GDPR guidelines and regulatory norms.
  • Drafted guidelines, standards and best practices for Data Governance.
  • Managed projects, including resource planning, and presented the road map for Digital Data Transformation for banks.
  • Automated DevOps by integrating Jenkins, Nexus and Ansible for real-time production releases.
  • Optimized job flows and implemented best practices for a performance-oriented architecture compliant with NFRs.
  • Cloudera/Hortonworks Hadoop developer with hands-on experience in major components of the Cloudera/Hortonworks Hadoop ecosystem such as HUE, Navigator, Hadoop/HDFS, Hive on Spark, Impala, Cassandra, Kafka, ZooKeeper, Oozie, Sqoop and Flume.
  • Prepared Ansible playbooks for automated deployment of databases, the Hortonworks distribution and Kafka, and integrated data centre components such as HDFS and databases with Kafka Connect.
  • Created proofs of concept from scratch illustrating how these data integration techniques can meet specific business requirements while reducing cost and time to market.
  • Excellent understanding and knowledge of SQL databases such as Oracle and PostgreSQL, and of the NoSQL database DataStax Cassandra.
  • Experience working on Windows and UNIX/Linux platforms with technologies such as Big Data, SQL, NoSQL, Scala, Ansible, XML, HTML and shell scripting.
  • Expertise in setting up processes for Hadoop- and RDBMS-based application design and implementation for Big Data Lake analytics in near real time.
  • Experience in importing and exporting data using Kafka Streaming in Avro Format and CDC Applications (i.e. Attunity/Striim) from Oracle to HDFS, Cassandra, PostgreSQL and vice-versa.
  • Experience in using Tableau and OBIEE for Big Data analytics.
  • Experienced in processing Big data on the Apache Spark framework using Scala programs.
  • Led the team of engineers and coordinated the QA effort including training for prod-ops, QA and presentations to product and executive teams.
  • Strong experience in customer specification study, requirements gathering, system architectural design and turning requirements into the final product.
  • Experience in interacting with customers and working at client locations for real-time field testing of Big Data tools and services.
  • Ability to work effectively with data scientists, business analysts and other stakeholders at all levels within the organization.

TECHNICAL SKILLS:

Technology: Cloudera CDH 5.13, Hortonworks Hadoop Framework HDP 2.6, Tableau, Hive on Spark, Impala, Informatica Big Data Management, Confluent Kafka, DataStax Cassandra, HDFS, Sqoop 2.2.x, Flume 2, Oozie, PostgreSQL, CDC tools such as Attunity and Striim, ETL and data warehousing applications, OBIEE, Oracle and PL/SQL in the Energy & Gas and Investment Banking domains.

Operating system: Windows, Linux (RHEL) and UNIX

Hadoop/Bigdata: Cloudera Navigator, Ambari, Atlas, HDFS, S3, Scala, Hive, Impala, Cassandra, Kafka, ZooKeeper, Spark, Oozie.

Distributions: Cloudera 5.13.x, Hortonworks HDP 2.6

Front End: Cloudera Navigator, Informatica BDM, Tableau Desktop 10.1.3, HUE, Ambari, OBIEE 10.2.3.x/11.1.7.14, OBIEE 11g/10g, OBI Publisher, Oracle BI.

Databases: PostgreSQL, Cassandra, Hive on S3, Oracle, PL/SQL.

Programming Languages: SQL, Shell, Ansible Playbooks, Scala

IDEs & Utilities: IntelliJ, Microsoft Visual Studio Code, Jupyter, Anaconda.

Protocols: TCP/IP, SSH, HTTP and HTTPS

Scheduling: Control-M, Oozie, Stonebranch

Version control: Git, SVN, Bitbucket (Stash), SHARY.

Tools: PuTTY, PL/SQL Developer, pgAdmin, DbVisualizer, TOAD, BMC Remedy, JIRA, WinSCP, Informatica BDM.

Microsoft suite: Excel, Outlook, Word, OneNote, Notepad++ and PowerPoint.

PROFESSIONAL EXPERIENCE:

Confidential

Environment: Hadoop/HDFS, Scala, Spark on IntelliJ, Hive on Spark, Impala, Sqoop, Tableau, Oracle Business Intelligence 11g, Big Data Lake - Cassandra, HDFS, Database - Oracle 10g, 11g, 12c & PostgreSQL

Responsibilities:

  • 3+ years of industry experience working extensively with Big Data tools, specializing in the following:
  • Managed the greenfield project end to end, including resource planning, and presented the road map for Digital Data Transformation for banks.
  • Automated DevOps by integrating Jenkins, Nexus and Ansible 2.6 for real-time production releases.
  • Optimized job flows and implemented best practices for a performance-oriented architecture compliant with NFRs.
  • Developed solutions to load data into HDFS on both Cloudera and Hortonworks Hadoop distributions; analyzed the data using MapReduce, Scala-Spark (RDDs for Spark Streaming, Datasets for Spark Structured Streaming) and Hive queries, and delivered summary results from Hadoop to downstream systems.
  • Used Sqoop and Kafka extensively to import data from sources such as Oracle and PostgreSQL into HDFS.
  • Applied performance optimizations such as the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
  • Created Hive tables and ran HiveQL queries against them for data validation.
  • Moved the data from Oracle into HDFS and PostgreSQL through Kafka Streaming.
  • Used ZooKeeper for centralized configuration in Kafka and Hadoop.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data and analyzed them by running Hive queries and Scala-Spark transformations.
  • Managed and reviewed Hadoop log files using Ambari tools.
  • Tested raw data and executed performance scripts.
  • Shared responsibility for administration of Hadoop, Hive, PostgreSQL and Spark.
  • Developed multiple Spark jobs in Scala for transformation, data cleaning and preprocessing.
  • Imported and exported data to and from HDFS and Hive using Kafka and Sqoop.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Loaded data from the UNIX file system into HDFS.
  • Created Hive tables, loaded them with data and wrote Hive queries that execute internally as MapReduce jobs.
  • Worked with 5 TB of data per schema across 40+ nodes.
  • Good understanding of, and hands-on experience with, Hadoop stack internals, Hive, Spark and MapReduce.
  • Deep understanding of schedulers, workload management, availability, scalability and distributed data platforms.
  • Installed and configured Kafka, Hadoop, PostgreSQL and Cassandra for data cleaning and pre-processing.
  • Wrote Spark jobs to discover trends in data usage by users.
  • Ran Hadoop streaming jobs to process terabytes of text and structured data.
  • Developed Hive queries for the analysts.
  • Implemented partitioning, dynamic partitions and buckets in Hive.
  • Used Git/SVN and Bitbucket (Stash) for version control.
  • Maintained system integrity of all sub-components, primarily HDFS, MapReduce, Cassandra and Kafka.
  • Monitored system health as Ambari and Navigator admin and responded to any warning or failure conditions.

Big Data Architect

Confidential

Environment: Oracle Business Intelligence, Oracle Database

Responsibilities:

  • Provide business analytical support with OBIEE and data warehousing solutions.
  • Identify the data requirement list and analyze the data to be extracted from or added to the data warehouse in an OBIEE environment.
  • Administer and analyze reporting and analysis development, implementation and changes according to enterprise objectives.
  • Oversee project performance, status and administrative processes throughout the life cycle of the project.
  • Interact directly with the client to conduct meetings, presentations and status reporting.
  • Work in collaboration with IT, administrative and development teams and other business stakeholders to maintain cooperative relations and translate plans into action.
  • Resolve analytical problems and reporting needs of OBIEE and data warehouse applications.
  • Develop and maintain documentation, BI dashboards and reports.
  • Train new joiners, other business analysts, and data warehouse and development teams.
  • Expertise in the complete SDLC of analysis, design, development, maintenance and documentation of various functionalities of OBIEE.
  • Very strong experience in developing the OBIEE Repository (RPD) across its three layers (Physical, Business Model and Presentation), Time Series objects, and Siebel Interactive Dashboards with drill-down capabilities using global and local filters.
  • Experienced in data analysis and identification of dimensions, facts, measures and hierarchies; developed reports/dashboards with different Analytics views (drill-down, pivot table, chart, column selector, and tabular with global and local filters) using Siebel Analytics Web.
  • Worked extensively on extracting, transforming and loading data from various sources such as Oracle, SQL Server and flat files.
  • Responsible for all activities related to the development, implementation, administration and support of ETL processes for large-scale data warehouses using Informatica PowerCenter.
  • Strong experience in data warehousing and ETL using Informatica PowerCenter 8.6.
  • Experience in data modeling using Erwin, including star schema and snowflake modeling, fact and dimension tables, and physical and logical modeling.
  • Strong skills in data analysis, data requirement analysis and data mapping for ETL processes.
  • Knowledge of the Kimball and Inmon methodologies.
  • Hands-on experience in tuning mappings and in identifying and resolving performance bottlenecks at various levels such as sources, targets, mappings and sessions.
  • Extensive experience in ETL design, development and maintenance using Oracle SQL, PL/SQL, SQL*Loader and Informatica PowerCenter 8.x/9.x.
  • Experience in testing Business Intelligence applications developed in OBIEE 11g/10g.
  • Well versed in developing complex SQL queries with unions and multi-table joins, and experienced with views.
  • Experience in database programming in PL/SQL (Stored Procedures, Triggers and Packages).

Sr. Developer

Confidential

Environment: Data Warehousing, Oracle 10g, SQL, PL/SQL.

Responsibilities:

  • Extensively involved in coding business rules in server-side PL/SQL using functions, cursors, triggers, stored procedures and packages.
  • Participated actively in the technical and functional discussions.
  • Extensively involved in designing the project and coordinating a highly professional team.
  • Interacted with the user group on a regular basis to discuss requirements and updates.
  • Used Forms to provide the application interface and Reports to generate periodic reports.
  • Responsible for the Instructor module user interface design, implementation and testing.
  • Generated quarterly reports and graphs to study instructor utilization percentage and display feedback results.
  • Provided expert training to end users on the new system, its user interface and the procedure for running reports.
