Big Data Engineer Resume

Phoenix, AZ

SUMMARY:

  • 8+ years of overall professional IT experience, including over 4 years of Big Data experience in ingestion, storage, querying, processing, and analysis.
  • Excellent understanding of and hands-on experience with HDFS, Hive, and Hadoop ecosystem tools including Pig and Sqoop.
  • Used Hive for data analysis, Sqoop for data migration, Flume for data ingestion, Oozie for scheduling and ZooKeeper for coordinating cluster resources.
  • Experience in developing Pig scripts for data processing on HDFS.
  • Experience in writing HiveQL queries to store processed data into Hive tables for analysis (see the sketch after this list).
  • Experience in optimizing Hive queries by tuning Hive configuration settings.
  • Experience in scheduling jobs using Oozie workflows.
  • Experience in deploying and managing the Hadoop cluster using Cloudera Manager.
  • Involved in the design and development of web and enterprise applications using technologies such as XML and Amazon Web Services.
  • Good experience with databases such as SQL Server and MySQL; skilled in database design and in creating tables, views, stored procedures, functions, triggers, and indexes.
  • Excellent interpersonal skills and experience interacting with clients; a strong team player with good problem-solving skills.
  • Experience in using different file formats: Avro, Parquet, RCFile, JSON, SequenceFile.
  • Hands on experience in Shell scripting and Python.
  • Created HBase tables to store data in various formats arriving from different sources.
  • Strong experience in all the phases of SDLC including requirements gathering, analysis, design, implementation, deployment and support.
  • Ability to blend technical expertise with strong Conceptual, Business and Analytical skills to provide quality solutions.
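
A minimal sketch of the kind of HiveQL work referenced above, wrapped in a shell script in keeping with the shell-scripting experience listed here; the database, table, and column names are hypothetical placeholders, not actual project code.

    #!/bin/bash
    # Sketch: aggregate raw staging data and store it into a Hive table for analysis.
    # All database/table/column names below are illustrative assumptions.
    hive -e "
      CREATE TABLE IF NOT EXISTS analytics.daily_orders (
        order_date   STRING,
        region       STRING,
        order_count  BIGINT,
        total_amount DOUBLE
      )
      STORED AS PARQUET;

      INSERT OVERWRITE TABLE analytics.daily_orders
      SELECT order_date, region, COUNT(*) AS order_count, SUM(amount) AS total_amount
      FROM   staging.orders_raw
      GROUP  BY order_date, region;
    "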

TECHNICAL SKILLS:

Big Data/Hadoop Ecosystem: HDFS, MapReduce, Crunch, Hive, Pig, HBase, Sqoop, Flume, Oozie, and Avro

Programming Languages: C, Scala, SQL, PL/SQL, Linux shell scripts.

NoSQL Databases: HBase

Databases: SQL Server 2008, MySQL.

Tools Used: Eclipse, IntelliJ, Git, PuTTY, WinSCP, Cygwin

Operating Systems: Ubuntu (Linux), Windows 95/98/2000/XP, Mac OS.

ETL Tools: SSIS

Testing: Hadoop Testing, Hive Testing, Quality Center (QC)

Monitoring and Reporting tools: Tableau, Custom Shell scripts.

PROFESSIONAL EXPERIENCE:

Confidential, Phoenix, AZ

Big Data Engineer

Responsibilities:

  • Involved in creating business case and functional requirement documents for the Hadoop ecosystem.
  • Involved in installing Hadoop ecosystem components, i.e. CDH4, Apache Hadoop 2.0, Python 2.7.5, etc.
  • Created Cognos dashboards on top of HDFS for VIP customer lifestyle analysis
  • Managed and reviewed the Hadoop log files.
  • Responsible for managing data coming from different sources.
  • Supported MapReduce programs running on the cluster.
  • Involved in HDFS maintenance and loading of structured and unstructured data.
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Wrote MapReduce jobs using Pig Latin.
  • Imported data from Teradata into HDFS using Sqoop on a regular basis (see the sketch after this list).
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Created Hive tables and worked on them using HiveQL.
  • Used Remedy for bug tracking, issue tracking, and project management.
  • Held weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
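
A minimal sketch of the recurring Sqoop import from Teradata into HDFS mentioned above; the JDBC URL, credentials file, table, and target directory are hypothetical, and depending on the cluster a dedicated Teradata connector/driver may also need to be configured.

    #!/bin/bash
    # Sketch: daily Sqoop import of a Teradata table into a dated HDFS directory.
    # Connection details and names are illustrative assumptions.
    sqoop import \
      --connect jdbc:teradata://td-host/DATABASE=sales \
      --username etl_user \
      --password-file /user/etl_user/.td_password \
      --table TRANSACTIONS \
      --target-dir /data/raw/transactions/$(date +%Y-%m-%d) \
      --num-mappers 4 \
      --fields-terminated-by '\t'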

Environment: Hadoop, MapReduce, HDFS, Sqoop, Flume, Oozie, HBase, Apache Spark, WinSCP, UNIX Shell Scripting, Hive, Pig, Cloudera.

Confidential, Austin, TX

Big Data Developer

Responsibilities:

  • Processed data into HDFS by developing solutions and analyzed the data using MapReduce programs, producing summary results from Hadoop for downstream systems.
  • Developed MapReduce jobs using the MapReduce Java API and HiveQL.
  • Wrote MapReduce programs and implemented different design patterns.
  • Developed Sqoop scripts to extract data from MySQL and load it into HDFS.
  • Applied Hive queries to perform data analysis on HBase using the HBase storage handler to meet the business requirements.
  • Developed UDF, UDAF, and UDTF functions and used them in Hive queries.
  • Wrote Hive queries to aggregate data to be pushed to Cassandra tables.
  • Developed scripts and batch jobs to schedule a bundle (a group of coordinators) consisting of various Hadoop programs using Oozie.
  • Implemented dynamic partitioning, bucketing, sequence files, multi-insert queries, and compression techniques (see the sketch after this list).
  • Used the Avro data serialization system to handle Avro data files in MapReduce programs.
  • Implemented optimized MapReduce joins to gather data from different data sources.
  • Optimized Hive queries and joins to handle different data sets.
  • Configured Oozie schedulers to handle different Hadoop actions on a timely basis.
  • Involved in ETL, data integration, and migration by writing Pig scripts.
  • Developed shell, Perl, and Python scripts to automate Pig scripts and provide control flow for them.
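
A minimal sketch of the dynamic-partition, bucketing, SequenceFile, and compression work mentioned above, as a shell-wrapped HiveQL job; all names and settings shown are illustrative assumptions rather than actual project code.

    #!/bin/bash
    # Sketch: dynamic-partition insert into a bucketed SequenceFile table with compression.
    # Database/table/column names are hypothetical placeholders.
    hive -e "
      SET hive.exec.dynamic.partition=true;
      SET hive.exec.dynamic.partition.mode=nonstrict;
      SET hive.enforce.bucketing=true;
      SET hive.exec.compress.output=true;

      CREATE TABLE IF NOT EXISTS analytics.events_by_day (
        user_id STRING,
        event   STRING,
        amount  DOUBLE
      )
      PARTITIONED BY (event_date STRING)
      CLUSTERED BY (user_id) INTO 32 BUCKETS
      STORED AS SEQUENCEFILE;

      -- the dynamic partition column (event_date) must come last in the SELECT
      INSERT OVERWRITE TABLE analytics.events_by_day PARTITION (event_date)
      SELECT user_id, event, amount, event_date
      FROM   staging.events_raw;
    "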

Environment: HDFS, Hive, SQL Server 2008, SQL, PL/SQL, Oozie, MySQL, UNIX Shell Scripting.

Confidential, Rancho Cordova, CA

Data Engineer

Responsibilities:

  • Worked extensively on data warehousing; used SQL Server Integration Services (SSIS) as the ETL tool to design packages that move data from source to target databases using stages.
  • Obtained detailed understanding of data sources, Flat files and Complex Data Schema.
  • Designed parallel jobs using various stages like Aggregator, Join, Transformer, Sort, Merge, Filter and Lookup, Sequence, ODBC, Hash file.
  • Worked extensively on Slowly Changing Dimensions using CDC stage.
  • Broadly involved in data extraction, transformation, and loading (the ETL process) from source to target systems using SSIS.
  • Generated surrogate IDs for the dimensions in the fact table for indexed, faster data access.
  • To reduce response time, aggregated, converted, and cleansed large chunks of data during transformation.
  • Involved in creating technical documentation for source-to-target mapping procedures to facilitate better understanding of the process and to incorporate changes as and when necessary.
  • Successfully integrated data across multiple, high-volume data sources and target applications.
  • Automated ETL processes using the SSIS job sequencer and transform functions.
  • Extensively used the SSIS director for job scheduling and emailed production support for troubleshooting based on log files.
  • Optimized job performance by carrying out Performance Tuning Methods.
  • Used Autosys for scheduling the jobs.
  • Strictly followed change control methodologies while deploying code from QA to production.

Environment: SQL Server 2008, SQL, PL/SQL, Autosys 4.5, Visio, UNIX Shell Scripting, SSIS.

Confidential

Sr. System Analyst

Responsibilities:

  • Involved in various phases of Software Development Life Cycle (SDLC) of the application like Requirement gathering, Design, Analysis and Code development.
  • Prepared use cases, sequence diagrams, class diagrams, and deployment diagrams based on UML to enforce the Rational Unified Process, using Rational Rose.
  • Developed and implemented the MVC Architectural Pattern using Struts Framework including JSP, Servlets, EJB, Form Bean and Action classes.
  • Wrote JUnit test cases for unit testing.
  • Used Rational ClearCase for version control.
  • Built WAR/EAR files using Ant scripts and deployed them to WebLogic Application Server.
  • Used JavaScript for client-side validation and Struts Validator Framework for form validations.
  • Implemented Java/J2EE Design patterns like Business Delegate and Data Transfer Object (DTO), Data Access Object.
  • Developed the web-based presentation layer using JSP, AJAX with YUI components, and Servlets, implemented using the Struts framework.
  • Designed and developed backend java Components residing on different machines to exchange information and data using JMS.
  • Worked with QA team for testing and resolving defects.
  • Used Jira for bug tracking and project management.

Environment: J2EE, JSP, JDBC, Spring Core, Struts, Hibernate, Design Patterns, XML, WebLogic, Apache Axis, ClearCase, JUnit, JavaScript, Web Services, SOAP, XSLT, Jira, Oracle, PL/SQL Developer and Windows.
