Run Lead - Big Data Platform Resume
Chicago, IL
SUMMARY:
- Certified Hadoop Lead with extensive experience managing, designing, implementing, and administering industrial-grade Hadoop solutions that deliver a competitive edge.
TECHNICAL SKILLS:
Operating Systems: RHEL/Ubuntu
Hadoop Ecosystem: HDFS, NFS, Hive, Pig, Sqoop, HBase, MapReduce, Ranger, Ranger KMS, Zeppelin, Cloudbreak, Atlas, Hue, YARN, Oozie, Falcon, Knox, ZooKeeper, HAWQ, Flume, Spark
Distributions: Hortonworks Data Platform (HDP), Cloudera CDH & Pivotal Big Data Suite
Data Logistics Solution: Apache NiFi
Data-Centric Encryption Security Solution: Protegrity
Security: Kerberos, Knox, Ranger, LDAP/LDAPS, Active Directory
Pivotal Big Data Suite: GemFire, Greenplum & HDB
Cluster Provision/Auto Scaling: Cloudbreak/Cloudera Director
Cluster Monitoring: Ambari, Cloudera Manager
Infrastructure/Configuration Management: SaltStack platform
NoSQL Data Store: MongoDB, HBase
RDBMS: MS SQL Server (2012, 2008 R2), MySQL, PostgreSQL
Cloud/Data Center Solution: EC2, EMR & S3 (AWS), EMC (Dell)
Search: Solr
Programming Languages: PL/SQL, Python & Java
Methodologies: Agile
WORK EXPERIENCE:
Confidential (Chicago, IL)
Run Lead - Big Data Platform
Responsibilities:
- Manage onshore and offshore Hadoop teams consisting of Developers, Architects, Data Engineers, and Administrators.
- Report to stakeholders on a bi-weekly basis on the overall health of the Enterprise Big Data platform.
- Oversee Synchrony's five core Hadoop environments, which ingest large datasets sourced from real-time feeds, a variety of user interactions with mobile apps, and internal transactions.
- Serve as single point of contact for all platform operations (Run) issues, from immediate response and coordination through escalation, root cause analysis, and resolution.
- Ensure SLAs for availability, performance, security, maintenance/upgrades, installation, and user administration across all Data Lake environments.
- Administer and maintain Pivotal/Hortonworks Hadoop, Greenplum and GemFire clusters across all environments.
- Install and configure Hadoop ecosystem tools; continuously enhance and expand the enterprise data lake.
- Collaborate with development teams to assist in code promotion across environments and production deployments, including CMDB CI creation and updates.
- Proactively monitor cluster health and perform performance tuning activities.
- Perform capacity planning and expansion activities, working with Infrastructure and other enterprise service teams.
- Perform cluster maintenance: patching, upgrades, and migrations; user provisioning; automation of routine tasks; re-processing of failed jobs; and configuration and maintenance of security policies.
- Ensure the Enterprise Data Lake initiative continually provides customer-360 capabilities and associated services at all times.
Confidential (Chicago, IL)
Certified HDP Hadoop Administrator
Responsibilities:
- Cluster planning and engineering of POC and Production Clusters
- Strong experience with the Hortonworks and Cloudera Hadoop distributions
- Administer, troubleshoot, and debug cluster problems on RHEL/Ubuntu-based Linux infrastructure
- Kerberos administration: Kerberize clusters and ensure sound domain-name-to-realm mapping
- Periodically check HDFS health
- Configure YARN Capacity Scheduler based on infrastructure needs/YARN tuning
- Perform upgrades, patches and fixes using effective roll-out method
- Ensure HDFS is balanced and performing optimally at all times
- Commission/Decommission Hadoop cluster nodes
- Review the NameNode Web UI for information concerning DataNode volume failures
- Back up existing Hadoop databases (e.g., the Oozie database and Hive metastore) and HDFS metadata
- Active Directory/LDAPS integration and management
- Purge older log files
- Build cluster according to workload pattern/cluster type
- Configure Cluster services for High Availability
- Ensure volume and HDFS encryption (data-at-rest encryption)
- Experienced in AWS configuration optimization for Hadoop
- Backup Procedures and Disaster Recovery
- Configure ACLs and audits to meet compliance specifications using Ranger
- Open tickets and troubleshoot cluster problems with support
- Experienced in AWS Storage methodologies
- Expertise in Cloudbreak setup/configuration and blueprint creation
- Engineer data pipeline management using Falcon
- Define and execute feeds, processes, data pipelines, and job mirroring between production and testing clusters using Falcon
- ETL offload using Atlas
- Secure Hadoop services (WebHDFS/Hive/YARN) using Knox for REST API calls
- Train and lead teams in understanding and implementing Hadoop solutions across the organization
- Monitor cluster and organization servers for intrusion detection, log file reviews, and threats using the SIEM solution AlienVault; document and report findings to compliance and audit teams
- Perform Active Directory/Windows Server administration
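The periodic HDFS health checks described above can be scripted. The following is a minimal sketch that parses the text produced by `hdfs dfsadmin -report`; it is shown against a hard-coded sample string rather than a live cluster, and the threshold value is an illustrative assumption.

```python
import re

# Sample text in the shape produced by `hdfs dfsadmin -report`;
# on a real cluster this would come from running the command
# via subprocess and capturing stdout.
SAMPLE_REPORT = """\
Configured Capacity: 1099511627776 (1 TB)
DFS Used%: 83.50%
Live datanodes (3):
Dead datanodes (1):
"""

def check_hdfs_health(report, used_pct_threshold=80.0):
    """Flag dead DataNodes and high DFS usage in a dfsadmin report."""
    warnings = []
    dead = re.search(r"Dead datanodes \((\d+)\)", report)
    if dead and int(dead.group(1)) > 0:
        warnings.append(f"{dead.group(1)} dead datanode(s)")
    used = re.search(r"DFS Used%:\s*([\d.]+)%", report)
    if used and float(used.group(1)) > used_pct_threshold:
        warnings.append(f"DFS usage at {used.group(1)}%")
    return warnings

print(check_hdfs_health(SAMPLE_REPORT))
```

A check like this would typically run from cron and feed its warnings into the cluster-monitoring alerting path (Ambari/Cloudera Manager alerts in the environments listed above).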
Confidential (Oakbrook, IL)
Big Data Analyst
Responsibilities:
- Developed data governance process and controls and ensured compliance with enterprise data architecture principles and standards for the various systems and components.
- Analyzed and profiled data for quality and reconciled data issues.
- Built, tested, and deployed Hadoop solutions using most Hadoop ecosystem components.
- Administered Hadoop clusters and monitored performance with Ganglia.
- Designed robust Hadoop solutions for complex business problems.
- Utilized new open-source tools to address business challenges.
- Worked with multiple customer and support teams to execute Hadoop engagements.
- Integrated Hadoop into traditional ETL, accelerating the extraction, transformation, and loading of massive structured and unstructured data.
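Hadoop solutions like those above are often prototyped with Hadoop Streaming, where the mapper and reducer are plain Python scripts reading stdin. This sketch wires a word-count mapper and reducer together in-process instead of via the streaming jar, so it runs without a cluster; the input lines are made up for illustration.

```python
import itertools

def mapper(lines):
    # Emit (word, 1) pairs, as a streaming mapper would print
    # "word\t1" to stdout for each token.
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reducer(pairs):
    # Hadoop sorts mapper output by key before the reduce phase;
    # sorted() stands in for that shuffle/sort step here.
    for word, group in itertools.groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

counts = dict(reducer(mapper(["big data", "Big Data platform"])))
print(counts)
```

In an actual streaming job the same two functions would live in `mapper.py` and `reducer.py` and be passed to the `hadoop jar ... -mapper ... -reducer ...` invocation.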
Confidential
Python developer
Responsibilities:
- Built a command-line app with RESTful API layers catering to end users who are mostly kids.
- Pulled data out of HTML and XML files with Scrapy/Beautiful Soup.
- Interacted with the web app using Flask.
- Performed data analysis using Python Pandas.
- Processed data records and returned computed results using the MongoDB Aggregation framework.
- Parsed aggregated data into Apache Solr and the graph database OrientDB.
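Pulling data out of HTML, as with the Scrapy/Beautiful Soup work above, can be sketched with the standard library's `html.parser`, used here as a dependency-free stand-in for Beautiful Soup; the sample page and tag choice are illustrative assumptions.

```python
from html.parser import HTMLParser

class LinkTextExtractor(HTMLParser):
    """Collect the text inside every <a> tag, roughly what a
    BeautifulSoup find_all("a") loop would yield."""
    def __init__(self):
        super().__init__()
        self.in_link = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_link = True

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link and data.strip():
            self.links.append(data.strip())

# Made-up sample page standing in for a scraped document.
SAMPLE_HTML = '<ul><li><a href="/hdfs">HDFS</a></li><li><a href="/hive">Hive</a></li></ul>'
parser = LinkTextExtractor()
parser.feed(SAMPLE_HTML)
print(parser.links)
```

Beautiful Soup or Scrapy would replace the hand-rolled state machine with selectors, but the extraction logic is the same shape.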
Confidential (Chicago)
IT Analyst /Programmer
Responsibilities:
- Worked both independently and in a team - oriented collaborative environment.
- Worked with Microsoft SQL server.
- Documented and provided status of project and technical information related to the application/software supported (Web2Py & .NET).
- Supported remote users at their home office, hotel, or customer site utilizing remote tools and troubleshooting over phone using VPN.
- Knowledge of DSL/cable modem routers, Windows Server 2012 and 2008, Cisco switches and routers, and VoIP (i.e., Cisco Call Manager).
