Big Data Engineer/Administrator Resume
Manhattan, NY
SUMMARY
- 14+ years of solid experience in Information Technology with a strong background as a technical administrator/architect/engineer of Big Data, data warehouse, and data analytics platforms, managing very large cluster environments
- Around 7.5 years of experience administering multiple Hadoop distributions (CDP, Cloudera, Hortonworks, Azure HDInsight, MapR), covering cluster builds/upgrades, configuration management, POCs, and installation and maintenance of Hadoop ecosystem components including Cloudera Manager, Ambari, HDFS, YARN, Spark, Machine Learning, Hive, HBase (NoSQL), Hue, Impala, Kafka, MapReduce, ZooKeeper, Oozie, Solr, Sqoop, Flume, Pig, Chef, Puppet, Knox, and Cloudera Navigator; metadata (MySQL) backup and recovery; job scheduling and maintenance; code and data migration; debugging and troubleshooting (connectivity/alert/system/data issues); performance tuning; backup and recovery (BDR); monitoring Hadoop systems with Nagios and Ganglia; Python scripting; and security setup and configuration including Kerberos, Sentry, and LDAP
- Led complete projects end to end: CDP POC, Cloudera POC, Hortonworks POC, HDInsight POC, Hive LLAP POC, Waterline POC, StreamSets POC, Databricks POC, ADF (Data Factory) POC, Unravel POC, Trifacta POC, whole-cluster builds, high-availability setup and configuration for multiple components, security design and setup, and upgrades
- Served as a DevOps systems admin, supporting deployment and operations of various applications in production and lower environments using Git, GitHub, Jenkins, Chef, Puppet, Ansible, Salt, and Docker
- Good working experience in Azure HDInsight provisioning/services and in-depth knowledge of HDInsight cluster deployment, including Spark, HBase, Kafka, Hive LLAP, etc., plus expertise in the monitoring, logging, and cost management tools that integrate with HDInsight
- Experience in successful implementation of ETL solutions covering data extraction, transformation, and load with Sqoop, Hive, Pig, Spark, and HBase (NoSQL database); see the Sqoop sketch after this list
- Developed/migrated numerous applications on Spark using the Spark Core and Spark Streaming APIs in Java; optimized MapReduce jobs and SQL queries into Spark transformations using RDDs; developed and integrated Spark with Kafka topics for real-time streaming applications
- Designed and Documented Big Data Best Practices and Standards, EDL (Enterprise Data Lake) Overview, Step by Step Instructions on Cluster setup/upgrade/Adding/Decommission Nodes, Onboarding Process, Security Design Model, Performance Tuning, Failure Scenarios and Remedy, PROD to COB Discrepancies, and EDL Migration.
- Advanced expertise in SQL, Python, Scala, Shell, Java, and Perl scripting
- Innovative risk-taker with a proven track record of resolving issues that help steer the organization toward success
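Below is a minimal, illustrative sketch of the Sqoop-to-Hive ETL pattern referenced above. The JDBC URL, credentials, table, and Hive target are placeholder values, not details from any actual engagement.

#!/usr/bin/env bash
# Illustrative Sqoop import: pull one RDBMS table into Hive in parallel.
# Host, database, credentials, and table names are all placeholders.
sqoop import \
  --connect jdbc:mysql://db-host:3306/sales \
  --username etl_user -P \
  --table orders \
  --split-by order_id \
  --num-mappers 4 \
  --hive-import \
  --hive-table raw_orders

The same pattern extends to incremental loads with --incremental append and --check-column on a monotonically increasing key.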
TECHNICAL SKILLS
Operating Systems: Unix, Linux, Solaris-UNIX, AIX, Windows XP/NT/2000
Hadoop Distribution: Cloudera, Azure HDInsight, MapR and Hortonworks
Big Data: HDFS, NFS, HBase, MapReduce, Cloudera Manager, Ambari, MapR Control System, Cloudera Navigator, Machine Learning, YARN, HUE, Hive, Impala, Pig, Sqoop, Flume, Kafka, ADF (Data Factory), Oozie, Zookeeper, Spark, PySpark, Storm, Ganglia, Nagios, Avro, AWS, DevOps Tools (Chef, Puppet), Kerberos, Knox, Ranger, NiFi, Tez, BoKS, Sentry, LDAP, AD, Cassandra, MongoDB, REST APIs and Fabric
Teradata: Viewpoint, Data Mover, TMSM, Teradata Administrator, Teradata SQL Assistant, TASM, TSET, Teradata Visual Explain, Teradata Statistics Wizard, Teradata Index Wizard, Schmon, Tdwm, CTL/XCT, Lockdisp, Showlock, Qrysessn, TSTSQL, Vprocmanager, gsc tools, DBSControl, Ferret, rcvmanager, Update Space, Gateway Global, TPT, BTEQ, Fast Export, Fast Load, Multi Load and TPump
DevOps: Git, GitHub, Bitbucket, Jenkins, Chef, Puppet, Docker, CI/CD, Kubernetes (K8S), Splunk
Databases: HBase, Cassandra, MongoDB, PostgreSQL, Teradata DBMS, Oracle 9i, DB2, SQL Server
Languages & Others: SQL, Python, Ruby, Scala, Unix shell scripts, C, C++, Java, Perl scripts, Kerberos, APT, Sentry, Anaconda, JSON, XML and ACL
Ticket Management: Infoweb, ManageNow, mainframe, TechXL, IMR, MIS, HPSM, HPSD, ServiceNow
Backup Tools: Tivoli Storage Manager, NetVault, NetBackup, Teradata ARC
Data Integration Tools: Ab-Initio, Informatica, Talend
Others: Protegrity tools, GitHub, EventEngine, Autosys, ERwin, Clear Case, MicroStrategy, Eclipse, Control-M, Clear Quest, AtanaSuite, WinMerge, UltraEdit, SecureShell, Tectia Client
PROFESSIONAL EXPERIENCE
Confidential, Manhattan, NY
Big Data Engineer/Administrator
Responsibilities:
- Primary participant in administrative activities including HDInsight Spark/HBase/Kafka/Interactive Query (LLAP) cluster deployment using ARM templates and the Azure Portal; creating resource groups and NSG rules; scaling clusters; creating Blob and ADLS storage accounts; day-to-day operations such as monitoring jobs, making recommendations for skewed jobs, and handling service issues; tuning YARN, Kafka, Impala, Spark, and Hive; performance tuning and configuration checks; setting up the user onboarding process; and metadata backup (see the deployment sketch after this list)
- Played an architect role in setting up processes including user onboarding, application onboarding, and selection of optimal SKUs for Spark/HBase/Kafka clusters from lower to higher environments, and recommended non-default configuration changes for components such as Spark2, MapReduce2, Hive, LLAP, and Queue Manager to enhance cluster performance
- Handled coordination with the Microsoft/Hortonworks support teams and various production support teams
- Set up the Nagios alert system to collect all reasonable metrics while alerting only on conditions that require action
- Set up Kafka performance flags and alerts to catch lag between producer and consumer (see the lag-check sketch after this list)
- Set up OpsGenie notifications to page the on-call infrastructure admin
- Installed and configured Cloudbreak on a VM using Azure cloud resources
- Set up multiple Cloudbreak blueprints, recipes, and management packs, and configured an external database for Ranger
- Involved in setting up the Azure subscription, interactive credential, and VNet/subnet
- Set up an ADLS Gen2 storage account with two filesystems, storage-fs and logs-fs (see the ADLS sketch after this list)
- Created Managed Identities for Data Lake Admin, Assumer, Ranger Audit Logger and Logger
- Registered an Azure Environment
- Created a cluster template (blueprint) for our application based on an existing cluster template
- Used the Management Console to register existing clusters and build new ones
- Used Data Hub to configure cluster topology (master, worker, and compute) and cloud storage for HDFS, YARN, and Zeppelin
- Tested autoscaling based on performance metrics
- Used Replication Manager to register the existing clusters and copy the HDFS data
- Scheduled cluster stop/start while maintaining optimal performance
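As a companion to the cluster-deployment bullet above, here is a minimal Azure CLI sketch of provisioning an HDInsight Spark cluster. The production work described used ARM templates; this CLI form is an assumed equivalent, and all names, versions, and counts are placeholders.

#!/usr/bin/env bash
# Illustrative HDInsight Spark cluster provisioning with the Azure CLI.
# Resource names, location, versions, and node counts are placeholders.
RG=bigdata-rg
az group create --name "$RG" --location eastus

az storage account create --name bigdatastore123 \
  --resource-group "$RG" --location eastus --sku Standard_LRS

az hdinsight create \
  --name spark-prod-01 \
  --resource-group "$RG" \
  --type spark \
  --component-version Spark=2.4 \
  --http-user admin --http-password "$HTTP_PASS" \
  --ssh-user sshadmin --ssh-password "$SSH_PASS" \
  --workernode-count 4 \
  --storage-account bigdatastore123

Scaling of the kind mentioned above maps to the az hdinsight resize command.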
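A sketch of the producer/consumer lag check behind the Kafka alerting mentioned above, written as a Nagios-style plugin. The broker address, group name, and threshold are illustrative.

#!/usr/bin/env bash
# Illustrative consumer-lag check: sums the LAG column reported by
# kafka-consumer-groups and exits with Nagios status codes.
BOOTSTRAP=broker1:9092          # placeholder broker
GROUP=clickstream-consumers     # placeholder consumer group
THRESHOLD=10000                 # placeholder lag threshold

# Find the LAG column from the header row (its position varies across
# Kafka versions), then sum it across all partitions of the group.
LAG=$(kafka-consumer-groups.sh --bootstrap-server "$BOOTSTRAP" \
        --describe --group "$GROUP" 2>/dev/null |
      awk '/LAG/ {for (i = 1; i <= NF; i++) if ($i == "LAG") c = i; next}
           c && $c ~ /^[0-9]+$/ {sum += $c}
           END {print sum + 0}')

if [ "$LAG" -gt "$THRESHOLD" ]; then
  echo "CRITICAL: group $GROUP lag=$LAG exceeds $THRESHOLD"
  exit 2    # Nagios CRITICAL
fi
echo "OK: group $GROUP lag=$LAG"
exit 0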
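The ADLS Gen2 bullet above corresponds roughly to the following CLI sketch. The account and resource-group names are placeholders; storage-fs and logs-fs are the filesystem names from the bullet.

#!/usr/bin/env bash
# Illustrative ADLS Gen2 setup: a hierarchical-namespace storage account
# plus the two filesystems noted above. Names other than the filesystems
# are placeholders.
RG=bigdata-rg
ACCOUNT=bigdatadlsgen2

az storage account create \
  --name "$ACCOUNT" \
  --resource-group "$RG" \
  --location eastus \
  --sku Standard_LRS \
  --kind StorageV2 \
  --enable-hierarchical-namespace true

az storage fs create --name storage-fs --account-name "$ACCOUNT" --auth-mode login
az storage fs create --name logs-fs    --account-name "$ACCOUNT" --auth-mode login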
Environment: RHEL 6.5, CDP, HDP 1.3.1/2.1/2.5/3.0, Ambari, Hive, Kafka, Sqoop, Tez, Storm, Python, HBase, Teradata Query Grid, ZooKeeper, Oozie, Kerberos, Knox, Ranger, Pig, Avro
Confidential, Phoenix, AZ
Sr Big Data Analyst/Administrator
Responsibilities:
- Primary participant in certifying Big Data products for use within a multi-tenancy group, new-cluster evaluation recommendations, RAM estimates, and the Hadoop cluster upgrade from CDH 5.4.8/5.5.3 to CDH 5.7
- Active participant in performing COB (Continuity of Business) switchovers, including COB cluster checkout and COB cluster testing for all components: Hadoop cluster, MySQL, Flume, Datameer, Platfora, data ingestion, Talend, and data center failover
- Setup/Configured/Documented Hive Metastore high availability (HA)
- Primary participant in Sqoop setup, securing the Sqoop2 server, and using Sqoop on a Sentry-enabled cluster
- Built the BDR requirement template and the COB business recovery plan template
- Installed Kafka, enabled SSL and high availability on the Kafka brokers, ingested Kafka topics from multiple clusters into one cluster using the MirrorMaker service, and handled multiple Kafka failure issues (see the MirrorMaker sketch after this list)
- Tuned the Kafka production cluster by adjusting configuration parameters such as num.partitions
- Configured HBase to use HDFS high availability, cleaned up split logs, and added a Kafka topic to the ‘mk consumer config’ entity in HBase
- Secured the Sqoop2 server by enabling Kerberos authentication and SSL encryption, and resolved firewall issues when connecting to the RDBMS to extract data via Sqoop2
- Applied the Cloudera Manager 5.4.8 patch upgrade and documented the whole process
- Certified Spark SQL on CDH 5.7 by testing all of its functionality
- Enabled Spark encryption using Cloudera Manager to encrypt Spark data at rest and in transit
- Worked on Big Data issues and documented them in the Big Data issue tracker
- Resolved Hue issues such as users unable to log in, Hue not starting in Cloudera Manager, the Hue daemon failing to start, and changing the timezone for Hue logs
- Tuned YARN/Impala/Hive/Hue/Spark for performance; involved in translating MapReduce jobs to Spark, memory tuning, and setting Hive to use the Spark execution engine (see the spark-submit sketch after this list)
- Resolved Spark performance issues including SparkContext failing to initialize, queries running very slowly, Spark History Server logs not being created, NoClassDefFoundError, "Failed to get Spark client" errors, long-running Spark executors for Hive sessions, aborted Spark jobs, and Spark bugs and version compatibility issues
- Involved in configuring Spark, including Spark with Python 2.7, container and overhead settings on YARN, and RollingFileAppender
- Involved in Big Data security setup including Unix security, Cloudera Manager security, HDFS, MapReduce, HBase, and Hue security, Sqoop and Flume security, port/URL security, Sentry setup, ports used, and Cloudera Search authorization with Sentry
- Configured SPNEGO web authentication and Kerberos for Cloudera Manager and CDH (see the curl sketch after this list)
- Prepared best-practice documentation covering Hive, Impala, Spark, Sqoop, and Hadoop file formats and compression
- Primary participant in documenting the whole application life cycle, including application onboarding, technical assessment, performance review, and reporting; involved in day-to-day administration such as monitoring/troubleshooting user issues and user access management
- Gathered and documented failure scenarios and remedies, including software failures, manually moving the NameNode to a new node, adding new data disks to a worker node, NameNode failures, DataNode failures, Cloudera Manager failures, backing up the PostgreSQL database, data ingestion issues, Hue database failures, Hive Metastore database failures, hardware/rack/master server/proxy server/switch/disk failures, CPU/memory/IO spikes, and Kerberos issues
- Documented PROD-to-COB discrepancies across all components
- Raised cases and coordinated with support engineers to resolve a CPU saturation (100%) issue caused by a Python process, cluster and Cloudera Manager issues, and OOM issues
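A sketch of the MirrorMaker aggregation referenced above (legacy MirrorMaker, as shipped with the CDH 5.x line). The property files, topic regex, and stream count are illustrative; the SSL settings mentioned above live inside the two config files.

#!/usr/bin/env bash
# Illustrative MirrorMaker run: mirror matching topics from a source
# cluster (consumer.properties) into the aggregate target cluster
# (producer.properties). Topic regex and stream count are placeholders.
kafka-mirror-maker.sh \
  --consumer.config consumer.properties \
  --producer.config producer.properties \
  --whitelist 'orders.*|payments.*' \
  --num.streams 4

# Partition tuning of the num.partitions kind mentioned above; note
# that partition counts can be raised but never lowered.
kafka-topics.sh --zookeeper zk1:2181 --alter --topic orders --partitions 12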
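The Spark/YARN tuning bullets above touch a handful of concrete settings; the sketch below shows where they land on a spark-submit command line. All values are illustrative, and job.py is a placeholder application.

#!/usr/bin/env bash
# Illustrative Spark-on-YARN submission showing the tuning knobs noted
# above: Python 2.7 selection, container sizing, and memory overhead.
export PYSPARK_PYTHON=/usr/bin/python2.7   # pin the Python runtime

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 10 \
  --executor-cores 4 \
  --executor-memory 8g \
  --conf spark.yarn.executor.memoryOverhead=1024 \
  job.py

# Hive was pointed at the Spark engine with:
#   SET hive.execution.engine=spark;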
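A quick sketch of verifying SPNEGO-protected web endpoints after the Kerberos setup described above. The principal, keytab path, and URL are placeholders.

#!/usr/bin/env bash
# Illustrative SPNEGO check: obtain a Kerberos ticket, then hit a
# protected web UI. The empty "-u :" forces curl to use Negotiate auth.
kinit -kt /etc/security/keytabs/svc_hadoop.keytab svc_hadoop@EXAMPLE.COM

curl --negotiate -u : "http://namenode.example.com:50070/jmx" | head

klist   # confirm the HTTP service ticket was issued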
Environment: CDH 5.7/5.8.3, RHEL, Cloudera Manager, Cloudera Navigator, YARN, Impala, Hive, HUE, Kafka, Sqoop, Pig, HBase, Avro
Confidential, Manhattan, NY
Sr. Teradata System DBA/Big Data DBA
Responsibilities:
- Involved in everyday DBA activities: system monitoring, housekeeping, and change implementation
- Resolved various issues with the LDAP mechanism across different tools and prepared documentation on LDAP login procedures and issues
- Produced weekly audit reports on production databases to find direct-grant users, functional IDs used for logon from user workstations, non-LDAP users, users in non-LDAP profiles, users with rights greater than read, etc. (see the BTEQ audit sketch after this list)
- Initiated and led cleanup of extraneous privileges in roles, direct-grant privileges, and profiles per best practices
- Worked on system maintenance activities such as restarts, node-down issues, and performance issues, using the appropriate utilities: CNS utilities, Checktable (level 2), Packdisk, Scandisk, rcvmanager, qrysessn, DBSControl, etc.
- Raised issues to the GSC on various Viewpoint portlets (System Health, Query Spotlight, TASM workload designer/health/monitor), resolved portal issues, modified the system TDPID, and promoted the passive Viewpoint server to active during disasters (Hurricane Irene) to provide uninterrupted system monitoring
- Changed states, events, and event actions per client requirements
- Generated TASM reports based on exception logs and DBQL information
- Involved in the Sybase-to-Teradata migration project, including performance analysis, overnight watch on system alerts (using Netcool), RFB calls, team coordination, and regular status reporting and documentation
- Worked on the Morgan Stanley Halsey data center power-down project, involved in pre- and post-power-down tasks and issues
- Created backup and restore jobs via NetVault and NetBackup, resolved various drive-down issues (e.g., frozen drives in the configuration), debugged issues in the TARA GUI, added new policies, and handled NetVault device management issues
- Experienced with the AWS and SWS for window setup before a forced TPA restart of the system
- Led the JVC (production mirror image for testing) flush project to refresh it with PRODUCTION data
- Recovered PDCR jobs that failed for various reasons and documented the process
- Generated performance reports on poorly tuned and highly resource-consuming queries and analyzed them with the ETL teams
- Played a significant role in the full upgrade from DBMS V12 to V13
- Implemented a strong password rule set for all generic IDs
- Provided Teradata client installation packages to the Techconnect team for the rollout of Teradata on all Windows servers and resolved various installation issues
- Conducted presentations for end users on Viewpoint portlets such as Query Monitor, Query Spotlight, System Health, and Capacity Heatmap
- Cooperated with the Teradata DBS team and GSC on numerous database and performance issues
- Migrated data between environments and implemented versioned DDL changes using Perl scripts
- Performed the offshore team lead role
- Provided 24x7 support for the production system
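A minimal BTEQ sketch of the direct-grant portion of the weekly audit reports mentioned above. The TDPID, credentials, and database name are placeholders; DBC.AllRights is the standard dictionary view for rights granted directly to users (role-based rights sit in DBC.AllRoleRights instead).

#!/usr/bin/env bash
# Illustrative weekly audit extract via BTEQ; logon values are placeholders.
bteq <<'EOF'
.LOGON tdprod/audit_dba,password

/* Rights granted directly to users on production databases */
SELECT UserName, DatabaseName, TableName, AccessRight
FROM DBC.AllRights
WHERE DatabaseName = 'PROD_DB'   /* placeholder database */
ORDER BY UserName;

.LOGOFF
.QUIT
EOF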
Software/Platform: Teradata Database V12 & V13, Linux, AIX, Windows 32/64-bit, Viewpoint V12, 13.11/12, Teradata Administrator, Teradata Manager, Teradata SQL Assistant, BTEQ, TSET, NetBackup, NetVault
Confidential, Hartford, CT
Teradata DBA
Responsibilities:
- Involved in the LDAP mechanism for the PERSONAL INSURANCE (PI) lines of business (LOBs) per audit rules
- Cleaned up redundant roles, erroneous privileges within roles, and direct-grant privileges per Teradata best practices
- Implemented versioned DDL changes in all environments for the three LOBs using a Perl script
- Modified DBSControl internal parameter 142 (DeleteLeftOverSpool) so the system automatically deletes leftover spool when it is detected
- Involved ETL teams in debugging performance issues with long-running queries via DBQL and ResUsage reports (see the DBQL sketch after this list)
- Participated in IP Filter Setup in Production PI LOB using ipxml2bin utility as per Audit
- Coordinated with Teradata performance analysis representative
- Used different CNS utils, vprocmanager, qrysession, rcv to debug the issues and for maintenance activities
- Managed the crashdumps accordingly with the coordination of Teradata
- Using Viewpoint monitored the system health and query monitor portlet for blocking and other issues
- Released HUT locks on databases using Showlocks, released MLOAD locks, and resolved 7446 and 7449 errors
- Assisted the DEV team on resolving the optimization issues with LOAD Utilities
- Worked on weekly audit reports to stop direct-grant users, insecure non-adhoc profiles, and insecure adhoc users, and produced other reports in production for the three LOBs
- Worked with the Atana sync tool for database sync-up across environments (DEV/TEST/PROD)
- Aborted the blocked sessions in PMON
- Product Support on ODBC connection issues
- Worked on User Provision process (using macros & stored procedures) to manage Role Management, Profile Management, Space Allocation and User Management
- Worked on the SU ID for DDL Version Changes in PRODUCTION ENVIRONMENT
- Provided DB2 support for various version changes
- Responsible for documentation of the LDAP login process and issues, per audit
- Created the tables, macros, procedures on adhoc basis
- Weekend Support on Critical Production Issues
- Implemented Strong Password rule set
- Documented access rights and space allocation to track and streamline user requirements
- Worked on MIS and HPSM Ticket Management Systems
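A sketch of the DBQL side of the long-running-query analysis referenced above. The TDPID, credentials, and CPU threshold are placeholders; DBC.DBQLogTbl is the main DBQL log table.

#!/usr/bin/env bash
# Illustrative DBQL report: heaviest queries by AMP CPU time.
bteq <<'EOF'
.LOGON tdprod/dba_user,password

SELECT UserName,
       AMPCPUTime,
       TotalIOCount,
       StartTime,
       FirstRespTime
FROM DBC.DBQLogTbl
WHERE AMPCPUTime > 1000          /* placeholder CPU-seconds threshold */
ORDER BY AMPCPUTime DESC;

.LOGOFF
.QUIT
EOF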
Software/Platform: Solaris 10, Linux, AIX, Windows 32/64-bit, Teradata Database V12, Teradata Administrator, Viewpoint 12, Teradata Manager 12.0.0.5, Teradata SQL Assistant, BTEQ, TSET, Tivoli Storage Manager, Teradata Arcmain