Big Data Engineering Consultant Resume
Bay Area
SUMMARY
- 16+ years of experience as a data/database technology expert with proven technical, leadership, communication, and customer service skills in environments ranging from early-stage startups to large multinational enterprises.
- Expert in relational, NoSQL, and big data solutions, both on-premises and in the cloud.
- Excellent understanding and knowledge of relational and NoSQL databases including Postgres, SQL Server, Oracle, HBase, MongoDB, and Cassandra.
- Strong experience creating real-time data streaming solutions using Apache Spark Core, Spark SQL & Dataset/DataFrame/RDD, Spark Streaming, Apache Storm, and Kafka.
- Proficient at using Spark APIs to cleanse, explore, aggregate, transform, and store system logs, sales, marketing and machine sensor data.
- Worked extensively with dimensional modeling, data migration, data cleansing, data profiling, and ETL processes for data warehouses.
- Experience in building Data pipelines using Big Data Technologies
- Excellent knowledge of Hadoop (Gen-1 and Gen-2) and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and ResourceManager (YARN).
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems (RDBMS) and from RDBMS to HDFS.
- Configured Hadoop cluster to transfer data from Amazon S3 to HDFS and HDFS to Amazon S3 and also to direct input and output to the Hadoop MapReduce framework.
- Hands-on experience with message brokers such as Apache Kafka and RabbitMQ.
- Implemented Hadoop-based data warehouses and integrated Hadoop with Enterprise Data Warehouse systems.
- Experience working with various Cloudera distributions (CDH4/CDH5) and knowledge of Hortonworks and Amazon EMR Hadoop distributions.
- Experience in administering large-scale Hadoop environments, including design, configuration, installation, performance tuning, and monitoring of the cluster using Cloudera Manager and Ganglia.
- Experience in Object Oriented Analysis Design (OOAD) and development
- Experience in designing both time- and data-driven automated workflows using Oozie.
- Built real-time Big Data solutions using HBase for handling billions of records.
- Experience in building, deploying, and automating projects using Jenkins.
- Experience in writing UNIX shell scripts.
- Worked on Spark Machine Learning library for Recommendations, Regression and Classification problems.
- Involved in designing the data model in Hive for migrating the ETL process into Hadoop and wrote Pig Scripts to load data into Hadoop environment.
- Expertise in writing Hive UDFs and generic UDFs to incorporate complex business logic into Hive queries for high-level data analysis.
- Hands-on experience in writing MapReduce programs and user-defined functions for Hive and Pig
TECHNICAL SKILLS
Databases: SQL Server, Oracle, Postgres, MongoDB, Cassandra, Couchbase
Platforms: Unix, Linux, Windows, Ubuntu
Big-data technology: Spark, Hadoop, HDFS, YARN, Kafka, ZooKeeper, HBase, Hive, Neo4j, Oozie
Cloud Platforms: AWS, Google Cloud Platform, Azure, Internal/private cloud
Languages: Python, PySpark, SQL, N1QL, HiveQL
Technical Tools: MicroStrategy, Tableau Server, Tableau Prep, Tableau Desktop, Jenkins
Other: Global team management, Collaborative Problem Solving, Vendor & Contractor Selection, Systemic issues resolution, App Design Reviews, Customer engagement, Reliability Engineering, External/Internal Customers, High Availability, Service Assurance, SWAT initiatives, Enterprise Monitoring, DB Engineering, Production Deployment, Storage solutions, System Audit, Enterprise Architecture, DR Design, Process Documentation, IaaS/Internal Cloud
PROFESSIONAL EXPERIENCE
Big Data Engineering Consultant
Confidential, Bay Area
Responsibilities:
- Engineering and implementing a system-log analytics solution using Spark and Kafka along with many other Big Data tools.
- Establishing data infrastructure architecture using a variety of Big data technologies including Hadoop, Spark, MapReduce, NoSQL and relational DBs
- Creating different ETL architectures after resolving data and architectural issues
- Developed a platform/analytics system for system-log analysis using Spark RDD- and DataFrame-based data manipulation for large, complex datasets.
- Configured data pipeline for different system logs generated by diverse systems.
- Used Spark with relational databases such as Postgres and SQL Server.
- Leading business development efforts by negotiating with potential customers.
Solution Architect
Confidential
Responsibilities:
- Enterprise solutions lead with well-rounded experience, including technical expertise, relationship-building skills, and complex customer engagements in the data/database area; provided Dell’s internal ‘Data’ and Big Data solutions to its customers for Spark, Hadoop, RDBMS & data warehousing.
- Designed and reviewed clusters involving Hadoop, Spark (preview release of Spark 2.0), NoSQL DBs (MongoDB), and RDBMS (Postgres, Oracle, SQL Server).
- Managed and coordinated with a global team of Grid data and database coordinators, first-response and L2 teams (200 data professionals) for all data-related issues involving automation, job creation, ETL pipelines, cube refresh, BI system issues, data pipelines, architectural design, and problem resolution.
- Provided direction and support to all customers globally to meet their SLAs.
- Collaborated with engineering, security, infrastructure provisioning, application teams, and project/operations data professionals across the world to implement and establish industry/Dell standards.
- Designed highly scalable infrastructure (database clusters, Spark clusters, Hadoop systems, ETL pipelines etc.) to meet the key business performance metrics.
- Responsible for operations, security, high availability, capacity management, patching & migration.
- Deployed 30% new grids in APJ (Malaysia and China) saving millions by preventing individual server provisioning.
- Cut costs by 70% by implementing multitenant cluster solutions and made provisioning 3000% faster.
- Established a new web interface, processes & standards for seamless execution and significant time savings for internal and external customer teams across the globe.
- Worked with more than 1500 external and internal customers and led a transformation initiative across complex, disparate IT infrastructure and systems.
- Improved Data and Database Availability by implementing Big Data solutions
- Led complex external (Tenet Healthcare, SeaWorld, etc.) & internal (all application teams worldwide) customer engagements.
- Developed reliable data engineering, Infrastructure processes and frameworks to optimize solutions addressing systemic issues across enterprise
- Led design reviews on the database and monitoring towers for more than 150 applications, along with the storage, network, server, facility & application towers.
- Led assessment of clusters involving big data, NoSQL DBs, SQL Server & Oracle, covering configuration, life cycle & high-availability solutions such as RAC, clustering, mirroring, Always On, and Data Guard.
- Developed Enterprise monitoring and solutions to address issues related to EOTS, SCOM, OEM, BMC, Foglight
Senior Database Engineer
Confidential, San Antonio, TX
Responsibilities:
- SOX- and SAS 70-compliant Agile development environment of different DBMS systems involving more than 400 SQL Server instances spread across the US (40 different locations and 3 different data centers), and integration of new systems based on 2 to 4 new acquisitions yearly.
- POC Hadoop cluster
- Performed data engineering and operations for the enterprise environment across the globe.
- Security and PCI compliance initiatives that involved TDE (Transparent Data Encryption) and CDC (Change Data Capture)
- Led key Big Data projects and managed a team of DBAs and developers onsite and offshore. Also managed an environment of more than 200 servers running SQL Server 2000, 2005, and 2008 in standalone and clustered environments with IBM SAN storage.
- Worked on DB upgrade, patching, migration, maintenance plans, DTS packages, SSIS, performance tuning, indexing, replication, FTP set up (Using SFTP tool) and Security (access, app role, Firewall), space management, tools/technology evaluation
- Led a team consisting of developers, DBAs, network resources, and service providers, and completed the project 3 months ahead of the deadline.
- Actively involved in identifying and implementing requirements and producing the database design, along with implementing and monitoring the database architecture.
