Big Data Lead/Architect Resume

SUMMARY:

An accomplished, customer-focused Big Data Analytics and Data Services Architect with 11 years of wide-ranging IT experience in Hadoop, Big Data and Oracle across multiple project implementations, support and maintenance engagements. Strong hands-on, up-to-date skills in state-of-the-art Big Data analytics technologies, including the Hadoop ecosystem, Hive, Spark, Python and visualization tools (Tableau, QlikView, Power BI). Wide-ranging experience in data management technologies, providing effective solutions for Big Data and modern data architectures. Deep technical expertise combined with broad business exposure to understand and solve business problems with technology.

TECHNICAL SKILLS:

Big Data Platforms: Hortonworks, Cloudera, MapR

Big Data associated: Hive, Spark, Sqoop, Oozie, Flume, Pig, Kafka, HBase, Cassandra, Nagios

Databases (RDBMS): Amazon RDS, Oracle 9i/10g/11g/12c, SQL Server 2008 R2/2012/2014, MySQL, PostgreSQL

Cloud: Microsoft Azure, AWS (EC2, EMR, Redshift, S3, RDS), Oracle OPC cloud

Modelling: Kimball, Inmon, Data Vault (Hub & Spoke), Hybrid

Languages: HQL, Oracle SQL, T-SQL, R, RHadoop & Python

BI/Data Discovery Tools: QlikView, Tableau, Spotfire, Power BI, Oracle OBIEE

ETL/Data Integration Tools: Informatica, Talend, QlikView

Scripting: UNIX Shell Scripting

Code Management: GitHub

Data Tools: Microsoft Visio, Erwin, Redgate tools

Tools & Utilities: SQL Developer, PuTTY, Toad, CVS, SVN, AccuRev, Eclipse, Hue

PROFESSIONAL EXPERIENCE:

Confidential

Big Data Lead/Architect

Responsibilities:
  • Implement the Hadoop ecosystem and tools on Hortonworks Data Platform (HDP 2.6) in the Microsoft Azure cloud.
  • Lead the data lake project, involved in all phases of the project: analysis, design, ETL and testing.
  • Provide big data solutions by leveraging various Hadoop ecosystem tools.
  • Provide support for the Hortonworks Data Platform (HDP) cluster with Ambari and its managed services.
  • Subject Matter Expert (SME) in HDP Ambari, auto-configuring services such as Hive, ZooKeeper, Oozie, Sqoop, Flume, Spark and HBase.
  • Create a centralized data lake to store data coming from various sources.
  • Automate the data ingestion process using Oozie and shell scripting (see the ingestion sketch after this list).
  • Create data pipelines and develop DDL for transformations.
  • Involved in deployment-phase meetings for change management.
  • Involved in the full project life cycle, from analysis to production implementation, with emphasis on identifying sources and validating source data, developing logic and transformations per the requirements, creating mappings, and loading the data into different targets.
  • Involved in identifying opportunities for process/cost optimization, process redesign and development of new processes.
  • Wrote release notes and deployment documents, and scheduled the jobs.
  • Support the team on Big Data testing.
  • Define a high-level data governance framework addressing data discovery, data lineage, data quality, data catalogue, metadata management, data privacy and security, and data lifecycle management.
  • Document the environment configuration of all hardware and software.
  • Mentor the team to bring them up to speed on Big Data technology areas and deliver knowledge transition sessions.
  • Extracted data from Oracle and SQL Server into HDFS using Sqoop.
  • Scale the environment up/down by adding/removing nodes and server resources.
  • Track jobs using the job tracker.
  • Monitor cluster services and ensure they are available.
  • Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance (see the Hive DDL sketch after this list).
  • Install/configure security tools for fine-grained data security.
  • Enable data security (authentication, authorization, data protection) and access management.
  • Install/configure security tools (Kerberos, Knox and Ranger) and integrate the cluster with enterprise LDAP.
  • Actively participate in all stakeholder meetings.
  • Implemented fine grained access controls using Apache Ranger and Knox.
  • Solved performance issues through a sound understanding of joins, grouping and aggregations.
  • Interact with multiple stakeholders (business analysts, data modelers and developers), define source-to-target mapping documents, and document best practices.
  • Used Tableau, QlikView and Power BI for reporting purposes.
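
Below is a minimal sketch of the automated Sqoop ingestion and shell wrapper referenced above; the JDBC URL, credentials file, source table and HDFS paths are illustrative assumptions, and the real pipeline was orchestrated by Oozie with site-specific scripts.

    #!/bin/bash
    # Ingestion sketch: land one Oracle table in HDFS with Sqoop, then verify
    # the load before the next Oozie action runs. All names are illustrative.
    set -euo pipefail

    JDBC_URL="jdbc:oracle:thin:@//dbhost:1521/ORCLPDB"   # hypothetical source
    SRC_TABLE="SALES.ORDERS"                             # hypothetical table
    TARGET_DIR="/data/raw/sales/orders/$(date +%Y%m%d)"

    # --password-file keeps credentials out of the script and out of ps output.
    sqoop import \
      --connect "${JDBC_URL}" \
      --username etl_user \
      --password-file /user/etl/.ora_pwd \
      --table "${SRC_TABLE}" \
      --target-dir "${TARGET_DIR}" \
      --num-mappers 4 \
      --fields-terminated-by '\t'

    # Fail loudly if the MapReduce success marker is missing.
    hdfs dfs -test -e "${TARGET_DIR}/_SUCCESS"
    echo "Ingestion complete: ${TARGET_DIR}"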
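
And a brief Hive DDL sketch of the managed/external, partitioned and bucketed table design mentioned above; the schema, bucket count and locations are assumptions for illustration, not the actual design.

    #!/bin/bash
    # Hive DDL sketch: an external table over the raw landing zone plus a managed,
    # partitioned and bucketed ORC table for faster queries. Names are illustrative.
    hive -e "
      CREATE EXTERNAL TABLE IF NOT EXISTS raw_orders (
        order_id    BIGINT,
        customer_id BIGINT,
        amount      DOUBLE,
        order_ts    STRING)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
      LOCATION '/data/raw/sales/orders';

      CREATE TABLE IF NOT EXISTS curated_orders (
        order_id    BIGINT,
        customer_id BIGINT,
        amount      DOUBLE)
      PARTITIONED BY (order_date STRING)
      CLUSTERED BY (customer_id) INTO 16 BUCKETS
      STORED AS ORC;

      -- Load from the raw external table, creating date partitions dynamically.
      SET hive.exec.dynamic.partition.mode=nonstrict;
      INSERT OVERWRITE TABLE curated_orders PARTITION (order_date)
      SELECT order_id, customer_id, amount, to_date(order_ts) FROM raw_orders;
    "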

ENVIRONMENT: HDFS, YARN, Hive, Sqoop, HBase (NoSQL), Pig, Mahout, Flume, Spark, Kafka, Python, Oozie, Shell Scripting, HCatalog, RDBMS (SQL Server & Oracle warehouse)

Confidential

Big Data Architect /Data Pipeline Engineer

Responsibilities:
  • Involved in Big Data requirement analysis; developed and designed solutions for ETL and Business Intelligence platforms.
  • Set up the Hadoop environment using Cloudera Manager 5.6.1 in the AWS/Oracle cloud.
  • Mapped HBase key-value pairs to Impala tables to achieve optimal performance (see the sketch after this list).
  • Used Impala to query HBase tables stored in HDFS and enable analysis by downstream applications.
  • Implemented partitions in Impala based on different business drivers.
  • Tuned the performance of Impala queries by computing and analyzing table statistics.
  • Used Impala to query the historical data stored on S3 in combination with HDFS data.
  • Interacted with business analysts and data modelers, defined source-to-target mapping documents, and documented best practices.
  • Visual Analytics and Data wrangling using Tableau and Impala.
  • Used Sqoop extensively to import and export data between RDBMS systems (Oracle, MySQL, SQL Server) and HDFS, the Hive data warehouse and HBase.
  • Used Spark and Kafka for streaming requirements.
  • Used HQL scripts to extract and load data into the Hive data warehouse.
  • Handled Hadoop day-to-day operations (HDFS, MapReduce, HBase and Hive), including deployment and debugging of job issues.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Experienced in managing and reviewing the Hadoop log files.
  • Defined and created the Unified Data Platform for all enterprise needs.
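
A compact sketch of the HBase-to-Impala mapping, partitioning and statistics work described above, assuming hypothetical table, column-family and host names.

    #!/bin/bash
    # Expose an HBase table through the Hive metastore, then build a partitioned
    # Impala table and compute statistics for the planner. Names are illustrative.

    # 1) Map HBase key-value pairs to a metastore table via the HBase storage handler.
    hive -e "
      CREATE EXTERNAL TABLE IF NOT EXISTS hbase_customer_events (
        rowkey     STRING,
        event_type STRING,
        amount     DOUBLE)
      STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
      WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,d:event_type,d:amount')
      TBLPROPERTIES ('hbase.table.name' = 'customer_events');
    "

    # 2) Refresh Impala's view of the metastore, build a partitioned Parquet table,
    #    and gather table statistics so joins and scans are planned efficiently.
    impala-shell -i impalad-host -q "
      INVALIDATE METADATA hbase_customer_events;
      CREATE TABLE IF NOT EXISTS events_by_day (rowkey STRING, amount DOUBLE)
        PARTITIONED BY (event_date STRING) STORED AS PARQUET;
      INSERT OVERWRITE TABLE events_by_day PARTITION (event_date)
        SELECT rowkey, amount, substr(rowkey, 1, 8) FROM hbase_customer_events;
      COMPUTE STATS events_by_day;
    "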

ENVIRONMENT: Spark, Spark Streaming, Kafka, Python, Hadoop, HDFS, Java MapReduce, YARN, Hive, Impala, Pig, Sqoop, Oozie, HBase, Shell Scripting, Mahout, Spark ML

Confidential

Big Data Architect/Data Pipeline Engineer

Responsibilities:
  • Collaborate with team members and clients to deliver a technical solution that meets the unique needs of our clients.
  • Involved in Big Data requirement analysis and designed solutions for ETL and Business Intelligence platforms.
  • Set up the Hadoop environment using Hortonworks Data Platform (HDP 2.4).
  • Interacted with business analysts and data modelers, defined source-to-target mapping documents, and documented best practices.
  • Expertise in the Hadoop 1.x (MapReduce) and Hadoop 2.x (YARN) programming models.
  • Used Sqoop extensively to import and export data between RDBMS systems (Oracle, MySQL, SQL Server) and HDFS, the Hive data warehouse and HBase.
  • Mapped HBase key-value pairs to Impala tables to achieve optimal performance.
  • Created Impala partitions to suit the business requirements.
  • Developed a Spark Streaming application consuming from Kafka to process Hadoop job logs.
  • Wrote Hive/HQL scripts to extract and load data into the Hive data warehouse.
  • Wrote Pig scripts to read and transform large sets of structured, semi-structured and unstructured data and load them into HDFS and Hive.
  • Created quality documentation to communicate incident reports to the appropriate audiences.
  • Used the Oozie workflow engine to run workflow jobs with actions that launch Hadoop MapReduce and Pig jobs.
  • Performed Hadoop day-to-day operations (HDFS, MapReduce, HBase and Hive), including deployment and debugging of job issues.
  • Developed semantic layer on top of Hive to facilitate the analytics team to generate ad-hoc reports.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Experienced in managing and reviewing the Hadoop log files.
  • Used Pig as an ETL tool for transformations, joins and some pre-aggregations before storing the data in HDFS.
  • Involved in creating Hive Tables, loading data and writing hive queries.
  • Worked on implementing the Master Data Management strategies for the new analytics platform.
  • Defined and created the Unified Data Platform for all enterprise needs.
  • Exported data from HDFS into an RDBMS using Sqoop for report generation and visualization purposes (see the Sqoop export sketch after this list).
  • Worked on the Oozie workflow engine for job scheduling (see the Oozie sketch after this list).
  • Involved in Unit testing and delivered Unit test plans and results documents.
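
A minimal sketch of the Sqoop export used to publish curated HDFS data back to an RDBMS for reporting and visualization; the connection string, table and directory are illustrative assumptions.

    #!/bin/bash
    # Export a curated HDFS data set into a reporting database with Sqoop.
    # Connection details, table name and paths are illustrative.
    sqoop export \
      --connect "jdbc:postgresql://reportdb:5432/analytics" \
      --username report_etl \
      --password-file /user/etl/.pg_pwd \
      --table daily_sales_summary \
      --export-dir /data/curated/daily_sales_summary \
      --input-fields-terminated-by '\t' \
      --num-mappers 4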
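
And a short sketch of scheduling such a job through the Oozie workflow engine; the coordinator application path, hosts and ports are assumptions for illustration.

    #!/bin/bash
    # Submit a coordinator (scheduled workflow) to Oozie and check its status.
    # Hosts, ports and the HDFS application path are illustrative.
    printf '%s\n' \
      'nameNode=hdfs://namenode:8020' \
      'jobTracker=resourcemanager:8050' \
      'queueName=default' \
      'oozie.use.system.libpath=true' \
      'oozie.coord.application.path=${nameNode}/apps/oozie/daily-export' \
      > job.properties

    oozie job  -oozie http://oozie-host:11000/oozie -config job.properties -run
    oozie jobs -oozie http://oozie-host:11000/oozie -jobtype coordinator -len 5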

ENVIRONMENT: HDFS, YARN, Sqoop, Flume, Hive, Spark, Spark Streaming, Kafka, Python, Scala, Hadoop, Java MapReduce, Pig, Oozie, HBase, Shell Scripting, HCatalog, Redshift, RDBMS, PostgreSQL

Confidential

ETL/Reporting Engineer

Responsibilities:
  • Participated in business and system analysis to ensure the alignment of business intelligence solutions with business objectives and requirements.
  • Led the team in design and development.
  • Created RPDs, defined Physical Layer, and developed Business Model & Mapping and Presentation Layer using Admin Tool.
  • Created and managed OBIEE dashboards.
  • Developed/migrated/customized ETLs for data extraction, transformation and loading using Informatica PowerCenter 9.0.1/9.1.0.
  • Involved in performance tuning and code review.
  • Involved in DAC monitoring.
  • Involved in creating QC defects, PPM packages and requests, and uploaded tuned code into ClearCase.
  • Migrated mappings from one environment to another.
  • Involved in testing existing OBIEE reports.
  • Involved in preparing Unit test plans and test results.
  • Involved in preparing high level migration documentation.
  • Took care of SQL*Net-related problems such as listeners and connect strings.
  • Manually created a single instance and later added it to the RAC.
  • Built a physical standby database.
  • Verified backups in the recovery catalog database.
  • Managed database creation and maintenance of tablespaces, data files, control files and redo log files (see the sketch after this list).
  • Monitored storage usage and disk I/O by managing tablespaces, data files and OS file systems.
  • Implemented security and integrity for database users by enrolling, monitoring and dropping database users, roles, privileges and profiles.
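
A small sketch of the tablespace, role and user management work noted above; file paths, names and sizes are illustrative assumptions, not production values.

    #!/bin/bash
    # Create a tablespace, a role and a user with a quota, then grant the role.
    # Run as a DBA; all names, paths and sizes below are illustrative.
    echo "
      CREATE TABLESPACE reporting_data
        DATAFILE '/u02/oradata/PROD/reporting_data01.dbf' SIZE 2G
        AUTOEXTEND ON NEXT 256M MAXSIZE 8G;

      CREATE ROLE reporting_read;
      GRANT CREATE SESSION TO reporting_read;

      CREATE USER rpt_user IDENTIFIED BY \"ChangeMe#1\"
        DEFAULT TABLESPACE reporting_data
        TEMPORARY TABLESPACE temp
        QUOTA 500M ON reporting_data;
      GRANT reporting_read TO rpt_user;
    " | sqlplus -s / as sysdba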

ENVIRONMENT: Oracle EBS R12, OBIEE 10g/11g, BI Publisher, Informatica PowerCenter 9.0.1/9.1.0, Oracle Database 11g, Oracle SOA

Confidential

Oracle Datawarehouse/EBS/Fusion Programmer Analyst

Responsibilities:
  • Gained great experience and exposure working with Oracle to provide end-to-end database/application administration services for its top 25 customers (American Airlines, Clopay, Aramark, Essilor, School Specialty, Del Monte, etc.); these engagements involved high criticality and customer sensitivity.
  • Worked in complex environments: multi-node application tiers (up to 25 nodes) with shared application tops and shared technology stacks, PCP, Discoverer, and RAC, ASM, RMAN and standby configurations for the databases.
  • Worked on Oracle Applications releases from 11.5.9 to 12.1.3.
  • EBS application patching, cloning, configuration changes, database and application upgrades.
  • Worked on databases ranging in size from 500 GB to 12 TB, with versions from 9i to 11g.
  • Built physical standby databases.
  • Verified backups in the recovery catalog database.
  • Managed database creation and maintenance of tablespaces, data files, control files, redo log files and archive log files.
  • Monitored storage usage and disk I/O by managing tablespaces, data files and OS file systems.
  • Implemented security and integrity for database users by enrolling, monitoring and dropping database users, roles, privileges and profiles.
  • Monitored hit ratios and tuned the System Global Area (SGA) accordingly.
  • Applied RDBMS Patches.
  • Performing Database Cloning from production to test servers using RMAN.
  • Reorganized tables, indexes and LOBs.
  • ASM to non-ASM cloning.
  • Database Patching and Cloning.
  • Automated several processes like daily logical backup and physical backup.
  • Creation and maintenance of other database objects like views, synonyms, sequences, and database link.
  • Creating database roles & assigning privileges to the roles.
  • Created database users, setting their default and temporary tablespaces and assigning quotas, roles, privileges and profiles.
  • Made tablespaces available to users by adding data files and resizing data files whenever a tablespace crossed its usage threshold (see the sketch after this list).
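
A brief sketch of the tablespace threshold monitoring and datafile resizing described above; the views are standard, while the file name and new size are illustrative.

    #!/bin/bash
    # Report space usage per tablespace, then resize a datafile once a
    # tablespace crosses its threshold. File name and size are illustrative.
    echo "
      SET PAGESIZE 100 LINESIZE 200
      SELECT df.tablespace_name,
             ROUND(df.total_mb)                      AS total_mb,
             ROUND(df.total_mb - NVL(fs.free_mb, 0)) AS used_mb,
             ROUND(100 * (df.total_mb - NVL(fs.free_mb, 0)) / df.total_mb) AS pct_used
      FROM  (SELECT tablespace_name, SUM(bytes)/1024/1024 AS total_mb
             FROM dba_data_files GROUP BY tablespace_name) df
      LEFT JOIN
            (SELECT tablespace_name, SUM(bytes)/1024/1024 AS free_mb
             FROM dba_free_space GROUP BY tablespace_name) fs
        ON df.tablespace_name = fs.tablespace_name
      ORDER BY pct_used DESC;
    " | sqlplus -s / as sysdba

    # Remediation example once a tablespace is nearly full:
    echo "ALTER DATABASE DATAFILE '/u02/oradata/PROD/users01.dbf' RESIZE 4G;" \
      | sqlplus -s / as sysdba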

ENVIRONMENT: Oracle Database 10g/11g, Exadata, Oracle Applications (11.5.10, 11.5.10.2, R12), RAC, ASM, Data Guard, RMAN

Confidential

DBA/Data warehouse Engineer

Responsibilities:
  • Monitored the Concurrent Manager and its operations, such as starting and stopping concurrent managers, troubleshooting problems, and resolving log/output file viewing issues for concurrent requests.
  • Handled cloning, concurrent manager process maintenance, patching, space management and user management issues related to Oracle Applications.
  • Well versed in Oracle Applications utilities such as AD Administration, AD Controller, AutoConfig, AD Merge Patch, AD Relink, AD Splice, AutoPatch, FNDCPASS, f60gen, WFLOAD, FNDLOAD, etc.
  • Assigned specific concurrent requests to a concurrent manager.
  • Generated the application context (XML) files and environment files related to the applications.
  • Monitoring, identifying and resolving user issues.
  • Managed APPS and product schema passwords.
  • Altered several component configurations, such as port numbers, and made them functional.
  • Monitored the application system with Oracle Applications Manager (OAM).
  • Expertise in activities such as maintaining instances, maintaining the file system and taking appropriate backups for security and consistency.
  • Validated and created grants and synonyms on the APPS schema, performed JAR file/forms generation as a post-patch step, and relinked Oracle Applications programs.
  • Researched patches for specific problems and downloaded them.
  • Performed patch analysis, applied patches (one-offs, mini packs, family packs), checked patch impact, managed patch history until go-live and reviewed patches.
  • Observed patch impact and applied Applications patches, including prerequisites, co-requisites and post-requisites.
  • Reduced patching timelines using features such as the defaults file, various adpatch options and merged patches.
  • Troubleshot several worker issues while patching.
  • Applied RDBMS patches using OPatch.
  • Managed and troubleshot application components such as Apache and Forms.
  • Creation of application user accounts and assigning the responsibilities.
  • Moved customized concurrent program executables from one environment to another, i.e. development to test and test to production.
  • Worked on various issues such as user additions and deletions, locking accounts, enabling auditing for a user, granting privileges on one schema's objects to another schema, collecting statistics at the schema and database level, temporary table issues, data file additions, moving data files, and "snapshot too old" issues.
  • Took care of tablespace-related issues such as increasing storage parameter values, resizing data files, adding new data files, etc.
  • DB link creation.
  • Took care of SQL*Net-related problems such as listeners and connect strings.
  • Manually created a single instance and later added it to the RAC.
  • Oracle network administration by configuring tnsnames.ora and listener.ora.
  • Established and maintained sound backup policies and procedures, including hot and cold OS backups as well as logical backups using the Oracle export utility.
  • Backups using Recovery Manager (RMAN).
  • Checked whether archive logs were being applied correctly from the primary to the standby, and fixed the issue if not (see the sketch after this list).
  • Moving data files from one disk group to another disk group in ASM.
  • Collecting AWR reports.
  • Performance tuning.
  • Monitoring Logical Standby database.
  • Documented the database setup and updated the document whenever there was a change in the database structure, backup strategy, etc.
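
A small sketch of the primary-to-standby archive log check mentioned above, run against the standby database; it uses the standard v$ views, with the dollar sign escaped so the shell does not expand it.

    #!/bin/bash
    # On the standby: compare the latest archived log received per thread with
    # the latest one applied; a growing gap means redo apply needs attention.
    # The backslash before $ stops the shell from expanding the view name.
    echo "
      SELECT thread#,
             MAX(sequence#)                                    AS last_received,
             MAX(CASE WHEN applied = 'YES' THEN sequence# END) AS last_applied
      FROM   v\$archived_log
      GROUP  BY thread#
      ORDER  BY thread#;
    " | sqlplus -s / as sysdba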

ENVIRONMENT: Oracle Database 9i/10g, Oracle EBS 11i/R12, RAC, ASM, Data Guard, RMAN, Oracle Enterprise Manager, Sun Solaris (UNIX)
