
Big Data Engineer Resume

San Jose, CA


  • 7+ years of experience in Business Analytics, Data Modeling and Data warehousing.
  • Hands-on experience with Hortonworks and Cloudera Hadoop platforms.
  • Worked on Microsoft Azure platform, deploying IaaS & PaaS solutions.
  • Worked with data analytics tools like Spark, Hive & Kafka.
  • Worked with Hive, creating Hive tables, writing queries for analysis and reporting.
  • Working knowledge of Python, SQL and shell scripting.
  • Involved in designing and building Data pipelines.
  • Led data migration efforts between RDBMS & Hadoop environments.
  • Worked with Hadoop Ecosystem tools like Hive, Sqoop, Flume & Zookeeper.
  • Experienced working with NoSQL databases like HBase.
  • Experienced in DWH, ETL best practices & relational databases.
  • Good knowledge of data architecture, data modeling and data integration.
  • Experienced with Git, Perforce and other source code management tools.
  • Experienced in all phases of Software Development Life Cycle.
  • Excellent communication skills, worked with multiple teams across the organization.


Data Analytics: Hadoop, MapReduce, Spark, Kafka, Storm, Hive.

Databases: HBase, Oracle, SQL Server, Teradata

Data Management Tools: Sqoop, Zookeeper, Flume

Data Reporting Tools: Tableau, Zeppelin

ETL: Informatica PowerCenter

Languages: Python, SQL, Shell Scripting.

Version Control SW: Perforce, Git, Jenkins, JIRA

IDE & Tools: Eclipse IDE, IntelliJ, Maven


Confidential, San Jose, CA

Big Data Engineer

  • Worked on FACT - Reconciliation financial project on Hadoop platform.
  • Fixed JIRA defects, resolved variance issues observed in data, removed duplicate data, and performed data cleansing, aggregation and reporting.
  • Worked with Braintree clients to resolve issues reported on Fuse reports and internal Rugo tables. Wrote scripts to check periodic variance reported in the ledger.
  • Queried and manipulated PROD and QA data sets using the Hive client.
  • Worked on data quality issues using Jupyter notebooks and the Hive client.

Confidential, Mountain View, CA

Sr. Data Engineer


  • As part of the Data Services team, built a Hadoop cluster on the Azure platform & deployed various data analytics solutions.
  • Worked on Confidential Telemetry project to create data aggregations and develop meaningful insights based on cyber threat level using Spark, Kafka for product team.
  • Ingested Learning Platform data and developed transformations using Spark to categorize data & generate course-related info & employee participation metric reports.
  • Used Hive extensively for data analysis in streaming & batch processing jobs.
  • Developed custom Python code to parse inbound CCS data of Confidential.
  • Worked with the CPE, Product, Data Science and Business teams to translate business requirements into technical specifications and to implement POCs.
  • Worked on Performance tuning of Spark jobs, Hive jobs & OS related tuning.
  • Experience working with various file formats like CSV, AVRO, Parquet and JSON.
  • Experienced working on data quality issues.
  • Collected streaming log data and stored it in HDFS using Flume.
  • Led Data migration efforts from various RDBMS sources like Teradata into Hadoop cluster for data processing, transformation, storage and BI reporting.
  • Involved in building and customizing Data pipelines using Azure Data Factory.
  • Worked with ETL team members, Microsoft solution architects, Infrastructure team & networking teams to coordinate & implement migration projects.
  • Generated BI reports in Tableau & published larger datasets using Tableau Server.
  • Worked with Pentaho Data Integration to integrate multiple data sources like Splunk for the Confidential Global Security team. Supported R&D activities.
  • Worked on troubleshooting HDFS issues & other tools such as Hive, Spark & Oozie.
  • Presented project technical details to the management & documented in Confluence.
  • Handled and resolved many production-related application issues. Worked with Hortonworks architects & analysts to obtain bug fixes.

Environment: - Hortonworks, Azure, Hive, Spark, Kafka, Sqoop, Flume, Teradata.

Confidential, San Diego, CA

Big Data Engineer


  • Involved in creating Hive tables, loading data and writing hive queries.
  • Wrote and implemented custom Hive UDFs per business requirements.
  • Worked on Hive performance tuning and Hadoop MapReduce operation optimization.
  • Responsible for Performance tuning, Hadoop cluster management, patching, resolving network issues, cluster monitoring and reviewing log data to fix issues.
  • Managed and reviewed Hadoop log files to optimize MapReduce job performance.
  • Imported data into, and exported data from, HDFS and Hive using Sqoop.
  • Performed analysis using Hive on the partitioned and bucketed data to compute various metrics for reporting using Tableau.
  • Developed shell scripts to run Hive scripts according to business needs.
  • Worked with and gained expertise in NoSQL databases like HBase and Cassandra.
  • Experience working with PL/SQL scripting and programming.
  • Trained and coordinated with offshore team members to meet tight deadlines.

Environment: - Hadoop, Hive, Pig, Kafka, Storm, Python, MS SQL Server.

Confidential, San Diego

Sr. System Administrator


  • Performed security management by maintaining roles, privileges, and user profiles. Added and removed users from many different security groups in the domain (Auto groups).
  • Ran, scheduled and enabled SQL jobs. Troubleshot failed jobs, provided job log files to dev teams, and monitored the health of the production databases.
  • Monitored mirroring, log shipping & replication.
  • Scheduled backup jobs and performed object-level recovery using the maxima tool.
  • Monitored online and scheduled jobs.

Environment: - Oracle, MS SQL Server, DB2, Tableau v7(Desktop/Server), MS Visio, Word, Excel, Access, HTML, XML, Agile Methodology, Shell Scripting.


System Administrator


  • Configured and maintained various MySQL databases for applications like Drupal.
  • Set up single sign-on on application servers using SSL keys.
  • Set up ProcessMaker and integrated it with Shibboleth (SSO).
  • Created a cluster of MySQL databases on Mac OS.
  • Experience installing Drupal multi-site and migrating data.

Environment: - Oracle, MS SQL Server, DB2.
