Staff Data Engineer (Big Data) Resume
Michigan
SUMMARY:
- Big Data Platform Solution Architect with 12+ years of experience in Big Data solutions, database development, security architecture, administration, and virtualization.
- A high-integrity, results-oriented, creative, energetic technical expert with a proven track record of success with Business Data Lake, cloud services, and clustered server environments with distributed computing.
- Experienced with data-intensive applications using Hadoop, Greenplum, Isilon, Pig, Hive, and HBase. Architect and develop innovative security solutions for big data deployments using Hadoop and its ecosystem technologies such as Kerberos, Ranger, Atlas, and Knox, covering presentation components, middleware, databases, and security at all levels.
- Analyzed and designed security solutions on HDP 2.x using Protegrity and Zettaset, and used Ranger and Knox for auditing and perimeter security of the Hadoop cluster. Expert in security analysis, server/cluster hardening, monitoring, and risk mitigation across the entire architecture.
- Expert in understanding data and designing/implementing enterprise platforms such as business and finance data lakes, with security in place for authentication, authorization, and auditing.
- In-depth experience in translating key strategic objectives into actionable and governable roadmaps and designs using best practices and guidelines.
- Fine-tuned several complex ETL reporting applications, procedures, and programs to provide a faster and more efficient platform for business users.
- Solid Systems Engineer with extensive experience in Big Data, OLTP, NoSQL databases, operating systems, security, virtualization, DNS, Active Directory, and Group Policy.
- Effective communicator in cross-functional contexts. Stalwart supporter of change with steadfast dedication to business objectives.
- Design, Architect and build finance data lake solution for Confidential
- Design, Architect and build Business Data Lake environment in Confidential IT.
- Architecting and supporting Greenplum deployments and implementations for Confidential.
- Supporting Hadoop ecosystem: Hadoop, MapReduce, HBase, Sqoop, Solr on Confidential Big Data DCA (Data Computing Appliance) and Hortonworks distribution.
TECHNICAL SKILLS:
Big Data ecosystem: Hadoop, MapReduce, Druid, Presto, Spark, Storm, Hive LLAP, Pig, ZooKeeper, Oozie, NoSQL, Solr, Kerberos, Protegrity, Zettaset security for BDL, Hortonworks (HDP 2.x) and Pivotal Hadoop distributions
Security: Kerberos, LDAP, AD, Ranger, Ranger KMS, Knox, Atlas, Zettaset, Protegrity
Databases: Greenplum 4.x, Postgres 8.x/9.x, Hive, HBase, MongoDB, Oracle RAC 10g/11g; Oracle 11g, 10g & 9i
Applications: Oracle 10g and 11g Real Application Clusters, Oracle*Net, SQL*Net, TCP/IP, Oracle Advanced Security options, Oracle Enterprise User Security
Operating Systems: Red Hat Linux (6.x, 7.x), Oracle Unbreakable Linux, HP (HP-UX 10.0 - 11.11), Sun SPARC 64-bit and 32-bit, Microsoft Windows Server 2008 and Windows Server 2012
Programming Languages: Python, Perl, C, SQL*Plus, SQL and PL/SQL programming and development, Shell scripting, sed, awk, Java
Virtualization & Storage: VMware vSphere 5.0/6.0, VMware ESXi 4.x/5.x, Isilon, WANdisco, ScaleIO
PROFESSIONAL EXPERIENCE:
Staff Data Engineer (Big Data)
Confidential, Michigan
Technologies Used: Presto, Druid, AtScale, Greenplum Database, Hortonworks HDP 2.x Hadoop ecosystem, Talend, Tableau, Spotfire
Responsibilities:
- Responsible for architecting the end-to-end solution for a finance data lake on the Hadoop open source stack.
- Design, architect, and support Hadoop cluster: Hadoop, MapReduce, Hive, Sqoop, Ranger, Presto as a high-performance SQL query engine, Druid for indexing, etc.
- Building the knowledge base and helping the team to operationalize Finance Data Lake.
Confidential, Novi, MI
Big Data Architect
Technologies Used: Cloudera 5.x Hadoop ecosystem, Python.
Responsibilities:
- Architectural suggestions for the next level platform
- Develop logical architecture solutions
- Collaborated with Business Analysts, Architects and Senior Developers to build physical application framework
- Designed data pipeline for Ford Direct vendors and dealers.
Confidential, Los Angeles, CA
Big Data Architect
Technologies Used: Hortonworks HDP 2.6 Hadoop ecosystem, Talend, ER/Studio, Greenplum Database, Cassandra, Visual Studio 2015, SQL Server.
Responsibilities:
- Responsible for architecture and development of the end-to-end solution for the data lake implementation.
- Designed ingestion into deep storage (HDFS) from different sources, and designed and implemented transformation logic (CDC, SCD, building a common model) in Spark and Talend.
- Designing the models in ER/Studio and utilizing Collibra for data governance.
- Responsible for supporting Hadoop and Greenplum and ensuring their optimal performance.
Confidential
Sr Data and Information Architect
Technologies Used: Presto, Druid, AtScale, Greenplum Database, Hortonworks HDP 2.x Hadoop ecosystem, Talend, Tableau, Spotfire
Responsibilities:
- Responsible for architecting the end-to-end solution for a finance data lake on the Hadoop open source stack.
- Design, architect, and support Hadoop cluster: Hadoop, MapReduce, Hive, Sqoop, Ranger, Presto as a high-performance SQL query engine, Druid for indexing, etc.
- Building the knowledge base and helping the team to operationalize Finance Data Lake.
- Responsible for supporting Greenplum databases and ensuring their optimal performance.
Confidential
Principal Engineer
Technologies Used: Greenplum Database, Hadoop ecosystem, Confidential Command Center, Confidential Chorus, Alpine Miner, VMware vSphere, VMware vMotion, ECC 6.1, SRM 7
Responsibilities:
- Responsible for supporting 2 Hadoop clusters holding 2.5 PB of structured, unstructured, and semi-structured data and eight Greenplum databases holding 1 PB of data, ensuring their performance, availability, and security through scripts and appropriate tools.
- Support Hadoop ecosystem: Hadoop, MapReduce, HBase, Sqoop, Solr on Confidential Big Data DCA (Data Computing Appliance). Working on data-intensive applications using Hadoop ecosystem products.
- Working on cloud services and clustered server environment with distributed computing.
- Responsible for configuring and implementing customer scenarios with MPP (massively parallel processing) multi-node Greenplum databases and standalone databases.
- Responsible for improvement and maintenance of the Greenplum-Hadoop DCA (Data Computing Appliance) holding petabytes of structured and unstructured data.
- Technical owner responsible for multiple enterprise application suites, e.g. CRM and ERP.
- Responsible for evaluating future database management strategies, providing guidance in acquiring requisite system software and hardware and developing procedures for database and applications change control.
- Implement and test disaster recovery procedures and solutions. Administer and implement strict security integrity controls.
Confidential
Senior Software Engineer
Technologies Used: SQL Developer, Toad, RMAN, Statspack, AWR, ADDR, SQL*Loader, LogMiner
Responsibilities:
- Oracle EBS 11i installation, configuration, cloning and management
- Handle Oracle EBS 11i and DBA-related activities for databases located in Germany, Japan, and India, including 2 RAC databases on Windows and Linux platforms and more than 30 standalone databases.
- Responsibilities also include database development, handling critical issues, and daily monitoring of databases during APAC time zone hours.
Confidential
Software Engineer
Technologies Used: Oracle RAC 10g, ITS (Incident Tracking System), Statspack, AWR report generation, Remote Diagnostic Agent (RDA), strace/truss/tusc for Linux/Solaris/HP-UX
Responsibilities:
- Handle Oracle networking and advanced security issues for Oracle customers located worldwide, across platforms including Linux, UNIX, Windows, and HP-UX.
- Responsibilities include handling critical, high-severity issues related to Oracle networking and advanced security, including issues with RAC load balancing and failover methods.
Confidential
System/Database Administrator
Technologies Used: Squid, IPTables, qmail server and other datacenter related tools
Responsibilities:
- Configured RADIUS, mail, web, and DNS servers, all kept in a secured zone behind a Cisco PIX 5.2 firewall configured in failover mode.
- Install, configure, and maintain Windows and Linux operating systems.
- Configure Domain Name Server, NIS, and NFS, and perform backup and recovery using native tools, e.g. Windows Backup Utility, tar, jar, rsync.
- Implement and monitor Linux security on all Linux servers