
Big Data Engineer Resume


SUMMARY:

  • 6 years of experience in software development, including 4 years on the Big Data Hadoop ecosystem: Hive, Spark, Scala, MapReduce, YARN, and Elastic MapReduce (AWS EMR).
  • Hands-on experience with Apache Hive (HQL), MySQL, HDFS, AWS, and shell scripting. Worked with the MR, Tez, and Spark execution engines for Hive.
  • Experience working with AWS EMR on EC2 instances, utilizing its elasticity.
  • Experience writing Spark jobs in Scala and MapReduce jobs in Java (a minimal sketch follows this list).
  • Experience migrating Greenplum, Netezza, and Oracle SQL programs to Hadoop and the cloud using Apache Hive, AWS EMR, and S3.
  • Experience migrating Pig scripts to Spark jobs using Scala.
  • Experience leading a team and serving as Scrum Master.
  • Strong experience optimizing HQL queries and resource utilization on distributed clusters based on input data size.
  • Experience working with the MapReduce Java API and HBase APIs.
  • Hands-on experience running jobs on petabyte-scale data.
  • Strong knowledge of algorithms and data structures, object-oriented programming, and the software development life cycle.
  • Experience with Hortonworks and Cloudera Hadoop distributions.
  • Hands-on experience with Big Data technologies such as Pig, Sqoop, and Oozie.
  • Hands-on experience with the version control systems SVN and Git.
  • Experience designing and coding web applications in core Java/Java EE, XML, and AJAX using MVC architecture.
  • Experience in all phases of SDLC using agile software development methodology.
  • Experience coding and debugging in C# using the .NET Framework and Visual Studio.
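
A minimal, hypothetical sketch of the Spark-in-Scala work listed above; the bucket names, paths, and column names are placeholders rather than details from any actual project:

    import org.apache.spark.sql.SparkSession

    // Minimal Spark job sketch: count events per day from S3 input.
    // All paths and column names here are hypothetical placeholders.
    object DailyEventCounts {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("DailyEventCounts")
          .getOrCreate()

        // Read Parquet input from S3 (EMR provides the s3:// filesystem).
        val events = spark.read.parquet("s3://example-bucket/events/")

        // Aggregate a count per event_date and write the result back to S3.
        events.groupBy("event_date")
          .count()
          .write
          .mode("overwrite")
          .parquet("s3://example-bucket/output/daily_counts/")

        spark.stop()
      }
    }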

TECHNICAL SKILLS:

Languages: Java, Scala, HQL (Hive Query Language), SQL, Spark SQL, Shell Scripting (Bash)

Big Data: Apache Hadoop, Apache Spark, HDFS, MapReduce, Apache Hive, Amazon S3, Amazon Elastic MapReduce (EMR), EC2, Apache HBase, Apache Sqoop, Apache Pig.

Databases: Oracle, Greenplum, Netezza, PostgreSQL, MySQL, AWS RDS

Tools: Splunk (monitoring), Jenkins, Ant, JIRA, Crucible/Fisheye, Aqua Data Studio, pgAdmin for Greenplum and PostgreSQL, Aginity for Netezza.

Version Control: Subversion and Git

Methodologies: Agile, Waterfall

EXPERIENCE:

Confidential

Big Data Engineer

Responsibilities:

  • Analyze and work on development tasks in every sprint.
  • Write Spark jobs in Scala and MapReduce jobs in Java, and operationalize them using shell scripting and JAMS jobs.
  • Analyze and work on enhancements.
  • Support production issues.
  • Participate in spikes and POCs.
  • Write unit tests for both Java and Scala code.
  • Migrate Pig scripts to Spark jobs using Scala.
  • Deploy the code to the QA environment and test end-to-end runs on the AWS cluster.
  • Perform performance analysis of Spark and MapReduce jobs on EC2 clusters with different configurations.
  • Perform parallel runs and verify that the build produces the expected results.
  • Use Spark to compare large datasets from the parallel and Prod environments (see the sketch after this list).
  • Port legacy C++ code to the new Java framework for one of the OATS components.
  • Fix bugs after the initial Java port, do unit testing, perform parallel runs, and compare the legacy C++ output with the new Java output.
  • Work on scripts to automate data retrieval and comparison.
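
A sketch of the parallel-run comparison described above, assuming Parquet outputs and Spark 2.4+ (for exceptAll); the paths are hypothetical placeholders:

    import org.apache.spark.sql.SparkSession

    // Sketch: flag rows that appear in one environment's output but not the other.
    // Paths are hypothetical; real runs would parameterize them per dataset.
    object ParallelRunCompare {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("ParallelRunCompare")
          .getOrCreate()

        val prod     = spark.read.parquet("s3://example-bucket/prod/output/")
        val parallel = spark.read.parquet("s3://example-bucket/parallel/output/")

        // exceptAll keeps duplicate rows, so row multiplicity is compared too.
        val missingInParallel = prod.exceptAll(parallel)
        val extraInParallel   = parallel.exceptAll(prod)

        println(s"Rows missing from the parallel run: ${missingInParallel.count()}")
        println(s"Unexpected rows in the parallel run: ${extraInParallel.count()}")

        spark.stop()
      }
    }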

Confidential

Big Data Engineer

Responsibilities:

  • Analyze pattern requirements, create user stories and sub-tasks, and assign story points and estimated hours to the sub-tasks in JIRA.
  • Lead a team of four developers in setting up an offshore team and serve as Scrum Master.
  • Design the implementation and present design walkthroughs to stakeholders.
  • Create Hive external tables on top of S3 data, add partitions, and code to requirements using Hive (HQL), Java UDFs, and shell scripting (see the sketch after this list).
  • Integrate the developed pattern with Pattern Toolkit Component (PTC) framework for running the pattern on AWS.
  • Integrate the patterns into the Spark environment and run Hive on top of Spark.
  • Perform performance analysis of Hive jobs on the Spark, Tez, and MR engines.
  • Spin up clusters on AWS, run the pattern stepwise, and copy the resulting Hive output back to S3 in the final step.
  • Perform code walkthroughs and participate in peer reviews.
  • Deploy the pattern and hand over the build label to the QA team for further testing.
  • Help the QA team write Ant targets to simulate the execution process in the QA environment.
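
A sketch of the external-table setup described above, written as Spark SQL with Hive support so it stays on the same Scala stack; the table, columns, and bucket are hypothetical:

    import org.apache.spark.sql.SparkSession

    // Sketch: define a Hive external table over S3 data and register a partition.
    // Table, column, and bucket names are hypothetical placeholders.
    object ExternalTableSetup {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("ExternalTableSetup")
          .enableHiveSupport() // required for Hive DDL such as external tables
          .getOrCreate()

        spark.sql("""
          CREATE EXTERNAL TABLE IF NOT EXISTS events (
            user_id STRING,
            action  STRING
          )
          PARTITIONED BY (event_date STRING)
          STORED AS PARQUET
          LOCATION 's3://example-bucket/events/'
        """)

        // Register one partition; in practice this step is scripted per load.
        spark.sql("ALTER TABLE events ADD IF NOT EXISTS PARTITION (event_date='2018-01-01')")

        spark.stop()
      }
    }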

Confidential, Charlotte

Freelancer/Volunteer

Responsibilities:

  • Assist in installing and maintaining a multi-terabyte Cloudera Hadoop cluster.
  • Manage data coming from different sources.
  • Support application users in debugging their MapReduce applications.
  • Assist with Hadoop maintenance and ongoing support.
  • Help PhD students in data analysis tasks using Hive and Pig.

Confidential

Associate Software Engineer

Responsibilities:

  • Analyze the functional requirements documents and prepare user stories as part of the requirements phase, following Agile/Scrum development.
  • Use the Atlassian tool JIRA for project and issue tracking, raising user stories for development tasks.
  • Develop the assigned user stories on time in C# .NET (.NET Framework 3.5, Visual Studio 2010).
  • Develop automated test scripts using NUnit.
  • Write unit and integration test cases and execute them.

Confidential

Associate Software Engineer

Responsibilities:

  • Prepare the user stories for new requirements and improvements/RFCs.
  • Fix assigned RFCs (Requests for Change) in C# .NET.
  • Check the developed code into VSS and Subversion.
  • Write unit test cases using NUnit.
  • Use NCover for code coverage.
  • Adhere to quality standards; conduct and participate in peer reviews.
  • Assist in integration, functional, regression and media check testing.

Confidential

Intern

Responsibilities:

  • Learn the firmware of the company’s products and become familiar with the working environment.
  • Communicate with on-site employees and self-learn the technologies used for product development.
  • Create batch files for automation of nightly builds and for executing unit test cases.
  • Write unit test cases for the developed code.
  • Perform code coverage using NCover.
