Senior Big Data Engineer Resume
Mountain View, CA
SUMMARY:
- A goal-oriented and enthusiastic professional with a versatile skill set and high learning agility, looking to apply experience with large-scale big data methods and data pipelines in a Data Engineer position at a world-class, high-integrity company
- Holds a B.E. in Electrical & Electronics Engineering
- Ingest data into HDFS from heterogeneous sources such as S3, the web via REST API calls, and databases using BCP (SQL Server) and Sqoop (Oracle)
- Design and develop ETL using a variety of tools such as PySpark, bash scripting, and Hive to integrate disparate systems
- Possesses hands-on experience building ETL data pipelines that ingest CSV/Text/JSON data from the web into Hadoop HDFS and load it into Hive tables (see the sketch after this summary)
- Big data experience includes a strong understanding of the internals of the Hadoop ecosystem, including Apache Sqoop and Hadoop technologies in Cloudera 5.11; Sqoop to ingest data from Oracle and SQL Server; Oozie to schedule jobs
- Fluent in Python, with a very good understanding of data structures and algorithms
- Hands-on experience with large-scale big data methods, including Hadoop (HDFS, Oozie), Spark, Hive (data transformation), Impala & Hue
- Experience in schema modeling using Erwin and creating mapping documents for ETL transformation and load
- Experience in handling structured and unstructured data, with the aim of providing clean, usable data
- Worked with Amazon AWS S3 and EC2; served as Database Architect and provided load-balancing solutions (Oracle RAC); very knowledgeable in horizontal and vertical scaling, memory management, and disk maintenance
- Used Git for version control and JIRA/Cherwell for ticketing
- Provided on-call support for ETL job failures, performed root-cause analysis (RCA), recommended long-term solutions, and met SLAs
- Collaborated with Data Scientists, Analysts & TPMs to understand requirements and recommend ways to improve data reliability, efficiency, and quality
- Employs excellent problem-solving and verbal/written communication skills in all interactions
- Demonstrated ability to explain technical concepts in a manner that is easily understood by non-technical professionals
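A minimal, hypothetical sketch of the web-to-HDFS-to-Hive ingestion pattern referenced above; the file paths, database, and table names are illustrative placeholders, not actual project artifacts:

```python
# Hypothetical sketch: land CSV/JSON data on HDFS and expose it as Hive tables.
# All paths and table names below are illustrative placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("web_ingest_to_hive")
    .enableHiveSupport()  # allows Spark to create/write managed Hive tables
    .getOrCreate()
)

# Read raw files already landed on HDFS (CSV with header, JSON lines)
csv_df = spark.read.option("header", "true").csv("hdfs:///landing/vendor/orders/*.csv")
json_df = spark.read.json("hdfs:///landing/web/events/*.json")

# Light cleanup, then persist as Hive tables in Parquet for downstream ETL
csv_df.dropDuplicates().write.mode("overwrite").format("parquet") \
    .saveAsTable("staging.vendor_orders")

json_df.write.mode("append").format("parquet") \
    .saveAsTable("staging.web_events")
```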
TECHNICAL SKILLS:
Big Data Technologies: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, AWS S3, Spark SQL, Hue, Cloudera Manager
Programming Languages: Python, Bash Script, SQL
Databases: Oracle, SQL Server, MongoDB
Operating Systems: Linux, Sun Solaris, AIX & Windows
Scheduling Tools: Oozie, Crontab
PROFESSIONAL EXPERIENCE:
Senior Big Data Engineer
Confidential - Mountain View, CA
- Responsible for the Omnichannel Customer 360 project, which provides a single, complete, actionable view of each customer
- Created a data lake by extracting data from heterogeneous sources such as vendor flat files (CSV/Excel), databases using BCP/Sqoop, and the web using REST APIs (JSON)
- Used Apache Hive/Impala to build ETL on HDFS data with dynamically partitioned tables for efficiency, and provided data in the requested format for building dashboards
- Built aggregate layers using Hive/Impala queries to implement business logic for weekly/monthly reports (see the sketch after this list)
- Designed ETL using internal/external tables, stored in Parquet format for efficiency
- Improved performance through partitioning, data physicalization, and Hive tuning parameters
- Developed scripts for a process-validation framework that sends data-quality exception reports and job-failure alerts
- Developed a business-validation framework that validates source tables against dashboard views and reports exceptions
- Performed data modeling using techniques such as dynamic partitioning and Parquet table storage for effective storage and retrieval of data from HDFS
- Used bash scripting and Python to automate ETL and scheduled ETL jobs in Oozie
- Used JIRA for ticketing and Git for source control
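A hedged sketch of the dynamically partitioned, Parquet-backed aggregate layer described in the bullets above; the schema and business logic are illustrative assumptions, not the actual Customer 360 model:

```python
# Hypothetical sketch: build a weekly aggregate table with Hive-style dynamic
# partitioning, stored as Parquet. Table and column names are placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("customer360_weekly_aggregates")
    .enableHiveSupport()
    .getOrCreate()
)

# Enable dynamic partitioning for the INSERT below
spark.conf.set("hive.exec.dynamic.partition", "true")
spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")

spark.sql("""
    CREATE TABLE IF NOT EXISTS mart.customer_weekly_agg (
        customer_id  STRING,
        order_count  BIGINT,
        total_amount DOUBLE
    )
    PARTITIONED BY (week_ending STRING)
    STORED AS PARQUET
""")

# Populate the aggregate using dynamic partitioning (partition column last in SELECT)
spark.sql("""
    INSERT OVERWRITE TABLE mart.customer_weekly_agg PARTITION (week_ending)
    SELECT customer_id,
           COUNT(*)    AS order_count,
           SUM(amount) AS total_amount,
           week_ending
    FROM   staging.orders
    GROUP  BY customer_id, week_ending
""")
```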
Confidential - San Ramon, CA
Senior Data Architect
- Created a data pipeline that ingested cyber-attack information from the web utilizing Python scripts; uploaded to Hive via Oozie scheduler for use by the Cybersecurity Analytics Team
- Designed managed and external Hive tables in Parquet/Avro/ORC formats for efficient storage, compression, and retrieval
- Contributed to database schema design using Erwin and developed Hive tables to load data from heterogeneous systems including Oracle, SQL Server, and the web
- Partnered with the Data Science team to automate a machine learning model for the bank's online account attrition using shell scripts
- Designed and executed Oozie workflows that scheduled Sqoop and Hive actions to extract, transform, and load data
- Aided the Analytics team in deriving a method for retrieving data from Oracle/SQL Server using complex ETL queries, importing it into Hive with Sqoop full and incremental loads, and storing it in HDFS
- Gained exposure to PySpark dataframes and transformations for building machine learning models (illustrative sketch after this list)
- Acquired experience in SQL Tuning and gained proficiency with Oracle Tuning features like SQL Profiling and SQL Hints
- Teamed with the Data Analyst and Data Scientist on several joint endeavors
- Became familiar with machine learning algorithms (Linear Regression, Logistic Regression & Random Forest)
- Deployed data replication from SQL Server to Hive for the Risk Analytics team using Oracle GoldenGate
- Set up a sharded MongoDB cluster with 3 replica sets
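A brief, illustrative PySpark sketch of the kind of dataframe transformations used to prepare features for the attrition model mentioned above; the source tables, column names, and 90-day window are assumptions for illustration:

```python
# Hypothetical sketch: prepare model features with PySpark dataframe transformations.
# Source tables, column names, and the 90-day window are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("attrition_feature_prep")
    .enableHiveSupport()
    .getOrCreate()
)

accounts = spark.table("banking.online_accounts")
logins = spark.table("banking.login_events")

# Aggregate recent login activity per account
activity = (
    logins
    .where(F.col("event_date") >= F.date_sub(F.current_date(), 90))
    .groupBy("account_id")
    .agg(
        F.count("*").alias("logins_90d"),
        F.max("event_date").alias("last_login"),
    )
)

# Join activity back to the account dimension and derive simple features
features = (
    accounts
    .join(activity, on="account_id", how="left")
    .withColumn("logins_90d", F.coalesce(F.col("logins_90d"), F.lit(0)))
    .withColumn("days_since_login", F.datediff(F.current_date(), F.col("last_login")))
)

features.write.mode("overwrite").format("parquet").saveAsTable("analytics.attrition_features")
```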
Confidential - Foster City, CA
Database Engineer
- Automated PROD and DR switchover using shell script; integrated with SRM
Confidential - San Jose, CA
Database Engineer
- Installed and maintained R12 Oracle Applications for a Cisco International project
- Maintained and applied the latest PSUs for Oracle Database and application software