Big Data Engineer Resume

Phoenix, AZ


  • 13+ years of IT experience, including 3+ years in Big Data.
  • Hands-on experience with Apache Hadoop ecosystem components such as HDFS, MapReduce, Oozie, Hive, Sqoop, HBase, Pig, and Spark, and with Scala.
  • Hands-on experience developing Spark applications using Spark APIs such as Spark Core, Spark Streaming, and Spark SQL.
  • Experience in Azure cloud computing and Azure Cosmos DB.
  • Exposure to data modeling and data mining. Experienced in SDLC methodologies such as Agile and the Spiral model.
  • Knowledge of job workflow scheduling and monitoring tools such as Oozie. Good exposure to Elasticsearch and Datameer.
  • Loaded data from various sources into HDFS and generated reports using Tableau. A few months of experience with AWS S3 and Amazon Aurora. Exposure to microservice frameworks. Hands-on experience in UNIX shell scripting.
  • Basic knowledge of Hadoop administration: installation, configuration, performance monitoring, and capacity planning.
  • Experience in the IT Service Delivery areas of Configuration, SLM, Availability, Change, Problem, and IT Service Continuity Management.
  • Technically strong in IBM Mainframe technologies. Development experience in COBOL, DB2, and JCL, working in a CICS environment, plus application deployment to production.
  • Excellent communication and leadership skills.


Big Data Technologies: Hadoop, HDFS, Hive, Pig, Oozie, Sqoop, MapReduce, HBase, Spark, Kafka, ZooKeeper & Ambari

Database Technologies: SQL Server & DB2

Programming Languages: Java, Scala, R, COBOL, JCL & C#

Web Technologies: HTML & Scope Script

Operating Systems: Windows, Linux, OS/390 & z/OS

Tools: ChangeMan, QMF, SDSF, CONTROL-D, File-AID, Infoman, GitHub, FastExport, Informatica & Tableau


Confidential, Phoenix, AZ

Big Data Engineer


  • Used Hive to expose data for further analysis and to transform files from different analytical formats into text files.
  • Designed and developed Spark applications in Scala to compare the performance of Spark with Hive and SQL.
  • Created and designed data ingestion pipelines using technologies such as Kafka.
  • Created Spark RDDs for all data files and transformed them into cash-only transaction RDDs.
  • Aggregated the filtered cash-only RDDs according to business rules and CTR requirements, converted them into DataFrames, and saved them as temporary Hive tables for intermediate processing.
  • Worked on an in-memory Apache Spark application for ETL transformations.
  • Converted Hive/SQL queries into Spark transformations using Spark RDDs.
  • Wrote Hive queries for data analysis to meet business requirements. Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
  • Developed MapReduce programs in Java for data cleaning and pre-processing.
  • Prepared pre-processing scripts and loaded data from various sources into CS (Cornerstone Database).
  • Migrated data between HDFS and relational database systems using Sqoop, according to client requirements.
  • Responsible for managing and reviewing Hadoop log files. Designed and developed data management using MySQL.
  • Generated ad hoc reports using Hive queries. Provided operational support for Hadoop and/or MySQL databases.
  • Involved in creating calculated fields and dashboards in Tableau for visualization of the analyzed data.
  • Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.

Confidential, Seattle, WA

Big Data Consultant


  • Developed Java programs to parse raw data, populate staging tables, and store the refined data in partitioned tables in Azure Blob storage.
  • Imported data from SQL Server using Sqoop and used it for various analyses.
  • Performed complex data transformations in Spark using Scala.
  • Converted Hive/SQL queries into Spark transformations using Spark RDDs and Scala. Used the Event Hub messaging system.
  • Developed Spark code using Scala and Spark SQL for faster testing and data processing. Performed predictive analytics using the Apache Spark Scala APIs.
  • Wrote Hive queries for data analysis and to process data for visualization. Responsible for developing Hive scripts. Extended Hive core functionality by writing custom UDFs.
  • Managed and scheduled batch jobs on a cluster using Oozie.
  • Prepared weekly overall data validation reports and POC documentation.
  • An acknowledged expert and leader in the field; early adopter and evangelist of web science, Information Architecture best practice, and more recently Linked Data and Semantic architectures for Big Data solutions.
  • Maintained and monitored clusters. Loaded data from SQL Server to Blob storage using Azure Data Factory (ADF). Wrote SCOPE scripts for various Cosmos-related services. Created many ADF pipelines to move data to different storage accounts.
  • Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
  • Implemented schema extraction for Parquet and Avro file formats in Hive. Monitored job logs and other cluster information in Ambari.
  • Used GitHub to store project source code and track the complete history of all changes to that code, as well as for bug tracking and task management.
  • Held weekly meetings with the Product and Tech teams for handovers and issue prioritization.


Data Engineer


  • Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH4 distribution.
  • Experienced in managing and reviewing Hadoop log files.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Loaded and transformed large sets of structured and semi-structured data.
  • Supported MapReduce programs running on the cluster.
  • Imported and exported data between RDBMS and HDFS using Sqoop.
  • Installed and configured Hive and wrote Hive UDFs.
  • Created Hive tables, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
  • Worked closely with business users to refine requirements.
  • Prepared root cause analysis reports on incidents, including recommendations.
  • Wrote Hive queries on the data to meet business requirements.
  • Analyzed data using Pig and wrote Pig scripts for grouping, joining, and sorting the data.
  • Worked on data serialization formats for converting complex objects into sequences of bits using Avro, Parquet, JSON, and CSV.
  • Participated in the requirement gathering and analysis phase of the project, documenting business requirements by conducting workshops/meetings with various business users. Worked with the iText framework for manipulating PDF files.
  • Generated marketing reports using Elasticsearch. Actively participated in weekly meetings with the technical teams to review the code.


Software Developer / ITIL Change Manager


  • Coordinated release content and effort based on the service request backlog, pending service requests, third party applications, or operating system updates
  • Communicated all key project plans, commitments, and changes including requirements, QA plans, schedule, and scope changes
  • Managed relationships and coordinated work between different teams at different locations
  • Conducted Release Readiness reviews, Milestone Reviews, and Business Go/No-Go reviews
  • Produced deployment plans, run books, and implementation plans. Prepared weekly release reports
  • Communicated release details and schedules to the Business as required
  • Provided first-line investigation and diagnosis of all Incidents and Service Requests
  • Monitored, maintained, and controlled hardware and software configurations in a classified network environment.
  • Sponsored the process by ensuring the Change Manager had adequate resources to design a Change Management process conforming to best practices and meeting the needs of the organization.
  • Developed workflows for Release and Change Management procedures
  • Resolved disputes over the allocation of responsibilities and sponsored the communication campaign to promote awareness and acceptance of the Change Management process.
  • Provided the description, mission statement, roadmap, strategy, process objectives, and metrics to measure success, and obtained formal approval for the process and its associated procedures.

IT Service Delivery Manager



  • Worked with project members to develop specifications, diagrams, and flowcharts. Measured and determined progress through formal and informal reports
  • Liaised with production services (CMC and Production Change Control) on production implementation activities
  • Drafted, negotiated, and refined SLAs with the business units, ensuring that business requirements were met and that all parties involved agreed
  • Implemented SLAs. Measured SLA performance, reported results, and adjusted as necessary
  • ITIL Problem Management - Managed and coordinated all activities necessary to detect, analyze and initiate resolution of problems using Root Cause Analysis (RCA) methodologies
  • Determined current levels of service to use as a starting point in SLA negotiations
  • Worked closely with traders and portfolio counselors utilizing business and technology knowledge to analyze and assess IT operational maturity, identify design improvements and drive efficiency in business operations processes
  • Developed and maintained enterprise-wide integration and dependency schedules; identified and documented business and system processes on UAT and PROD in the configuration management system
  • Tailored RFC processes to fit the requirements of the business by optimizing risk exposure and minimizing severity of impact as per ITIL standards
