
Sr. Big Data Engineer Resume

San Francisco


  • Innovative software engineer offering experience in the full software development lifecycle - from concept through delivery of next-generation applications and customizable solutions.
  • Expert in advanced development methodologies, tools and processes contributing to the design and rollout of software applications.
  • Known for excellent troubleshooting skills - able to analyze code and engineer well-researched, cost-effective and responsive solutions.
  • High-performance technologist, excellent at creating horizontally scalable infrastructure and applications that handle large volumes of data with low latency, and at defining, designing, and implementing highly secure continuous-deployment pipelines to update applications in cloud environments.


Technical Tools: Scala, Python, Amazon Web Services, Kafka, HBase, MongoDB, DynamoDB, Cassandra, RedShift, Map/Reduce, Hive, Yarn, Hadoop, Spark, Oozie, AWS EMR, S3, EC2, Unix, Oracle, SQL, PL/SQL


Sr. Big Data Engineer

Confidential, San Francisco

  • Designed and developed the Confidential customer model system as a cloud solution
  • Designed and developed a data warehouse for KPIs on AWS using Hadoop, Hive, Spark, and Scala
  • Loaded unstructured data into the Hadoop Distributed File System (HDFS)
  • Integrated Hadoop into traditional ETL, accelerating the extraction, transformation, and loading of massive structured and unstructured data.
  • Designed and developed end-to-end Spark applications in Scala to perform data cleansing, validation, transformation, and summarization on customer data.
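The production jobs above used Spark DataFrames in Scala; as a simplified, pure-Python stand-in, the cleanse → validate → summarize flow can be sketched as follows (the record fields `customer_id`, `state`, and `amount` are hypothetical examples, not the actual schema):

```python
def cleanse(record):
    """Trim whitespace and normalize casing on string fields."""
    return {k: v.strip().upper() if isinstance(v, str) else v
            for k, v in record.items()}

def is_valid(record):
    """Drop records missing a customer id or with a non-positive amount."""
    return bool(record.get("customer_id")) and record.get("amount", 0) > 0

def summarize(records):
    """Aggregate total amount per state, mirroring a groupBy().agg(sum(...))."""
    totals = {}
    for r in records:
        totals[r["state"]] = totals.get(r["state"], 0) + r["amount"]
    return totals

raw = [
    {"customer_id": "c1", "state": " ca ", "amount": 10.0},
    {"customer_id": "",   "state": "ca",   "amount": 5.0},   # invalid: no id
    {"customer_id": "c2", "state": "CA",   "amount": 7.5},
]
clean = [cleanse(r) for r in raw]
valid = [r for r in clean if is_valid(r)]
print(summarize(valid))  # {'CA': 17.5}
```

In Spark, each stage would map onto a DataFrame transformation (a `map`/`withColumn` for cleansing, a `filter` for validation, and a `groupBy` with an aggregate for summarization), so the work distributes across the cluster instead of running in a single process.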

Sr. Big Data Engineer

Confidential, Foster City

  • Designed and developed a Big Data analytics platform for Confidential risk and fraud transactions using Python, Hadoop, Hive, and SQL.
  • Performed data ingestion with Sqoop imports, loading credit card transaction, payment, and credit data from legacy warehouses into HDFS and Hive tables; used Sqoop exports to move HDFS data into MySQL for visualization in Tableau.
  • Designed and developed streaming analytics using Flume and Kafka with Spark Streaming, saving raw and processed data simultaneously to HDFS via a multiplexing Flume configuration.
  • Loaded aggregate data into a relational database for reporting, dashboarding, and ad-hoc analyses, which revealed ways to lower operating costs
  • Resolved performance issues in Hive scripts by tuning joins, grouping, and pre-aggregation.
  • Developed Oozie workflows to schedule and orchestrate the ETL process.
  • Automated extraction of data from warehouses and weblogs into Hive tables by developing workflows and coordinator jobs in Oozie.
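The pre-aggregation tuning mentioned above can be illustrated with a toy example: instead of joining every row of a large fact table to a dimension table and then grouping, aggregate the fact table first so far fewer rows reach the join. This plain-Python sketch uses hypothetical table contents to show that both orderings give the same answer:

```python
transactions = [  # (merchant_id, amount) -- the large "fact" side
    (1, 10.0), (1, 20.0), (2, 5.0), (2, 15.0), (2, 30.0),
]
merchants = {1: "GROCER", 2: "AIRLINE"}  # small dimension table

# Naive order: join every transaction row to the dimension, then group.
joined = [(merchants[m], amt) for m, amt in transactions]
naive = {}
for name, amt in joined:
    naive[name] = naive.get(name, 0) + amt

# Optimized order: pre-aggregate per merchant_id, then join the tiny result.
pre = {}
for m, amt in transactions:
    pre[m] = pre.get(m, 0) + amt
optimized = {merchants[m]: total for m, total in pre.items()}

print(naive == optimized)  # True: same totals, far fewer rows joined
```

In Hive the same idea is expressed by grouping in a subquery before the JOIN; with billions of fact rows, shrinking the join input this way can cut shuffle volume dramatically.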

Sr. Software Engineer

Confidential, San Francisco

  • Designed and developed the WGPR authorization profile repository, containing data on CEO customers (companies) and users along with their products, accounts, and account authorizations.
  • Designed, developed, and established ETL pipelines for critical data loads using Oracle, PL/SQL, and SQL
  • Tuned performance of ETL and transaction processes using Explain Plan and TKPROF
  • Redesigned the existing CEO database and rewrote the WPR application
  • Worked on merger projects, migrating Wachovia data to the Wells database

Software Engineer

Confidential, San Diego

  • Designed and developed Confidential’s MPI, a demand planning and forecasting tool that enables business planners to plan efficiently and submit their demand to the manufacturing facility for production.
  • Prepared detailed design documents for the new release
  • Performed database design and data modeling
  • Tuned query performance and partitioned database tables
  • Developed UNIX shell scripts for the nightly data load

Software Engineer


  • Designed and developed the Monsanto Corporate Data Warehouse (CDW) using a star schema; it collects sales, marketing, and financial transaction data from Monsanto's order management, marketing, and accounting systems for use in decision support and executive information systems
  • Designed and developed ETL pipelines for both internal and external systems
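The star-schema layout behind the CDW can be sketched as a central fact table of transactions keyed to small dimension tables. This minimal Python illustration uses hypothetical product and region dimensions (not the actual CDW schema) to show the typical decision-support query pattern:

```python
dim_product = {101: "SEED", 102: "HERBICIDE"}   # product dimension
dim_region  = {1: "MIDWEST", 2: "SOUTH"}        # region dimension

fact_sales = [  # (product_key, region_key, revenue) -- the central fact table
    (101, 1, 1000.0), (101, 2, 500.0), (102, 1, 750.0),
]

# Typical decision-support query: revenue by product name and region name,
# i.e. the fact table joined to each dimension, then grouped.
report = {}
for pk, rk, rev in fact_sales:
    key = (dim_product[pk], dim_region[rk])
    report[key] = report.get(key, 0) + rev

print(report)
```

Keeping the descriptive attributes in small dimension tables and only surrogate keys plus measures in the fact table is what lets a warehouse like this answer slice-and-dice questions with simple key joins.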
