
Data Engineer Resume


San Francisco, CA

SUMMARY

  • Close to 5 years of IT industry experience, including 3 years in big data and 1 year using Teradata.
  • Experience supporting data analysts in running Hive queries.
  • Experience importing and exporting data between HDFS and relational database systems using Sqoop.
  • Good knowledge of SQL databases such as Netezza and Teradata.
  • Responsible for generating quarterly ongoing-monitoring reports in Tableau and sharing them with the manager and onsite team.
  • Hands-on experience in HBase-Hive integration and Hive-Spark integration.
  • Expertise in big data architecture with the Hadoop file system and its ecosystem tools: Flume, HBase, Hive, Pig, Sqoop, and Spark.
  • Good knowledge of Spark SQL.
  • Experience working with different source formats such as XML, JSON, CSV, and text files.
  • Actively involved in creating SQL scripts, testing and debugging them, and fixing code.
  • Hands-on experience developing models and responsible for monitoring model performance every month.
  • Used Spark SQL to access Hive tables from Spark for faster data processing (see the sketch at the end of this summary).
  • Involved in Data Migration, Data Ingestion, and Data Modeling projects.
  • Gained extensive experience in data analysis, working with three sources at a time (Teradata, Netezza, and SAS datasets).
  • Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
  • After loading data into HDFS, used Spark transformations and actions to create tables, later using those datasets for Tableau dashboards.
  • Experience in client-side technologies such as HTML, XML, and JavaScript.
  • Hands-on experience with PROC SQL; can easily debug and analyze various SAS data sets.
  • Strong data analysis and SQL skills in relational, columnar, and big data environments (joins, unions, rank, group by, order by, etc.).
  • Experience in performance optimization for both data loading and data retrieval.
  • Involved in test case and test scenario design per requirements, and in test execution once defects were fixed.
  • Experienced in implementing big data solutions using Hive, MapReduce, shell scripting, and Java technologies.
  • Developed core modules in large cross-platform applications using JavaScript, XML, and HTML.
  • Experienced in analyzing data and creating data visualizations (reports and dashboards) in Tableau as required to support business needs.
  • Good management and documentation skills.
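
A minimal sketch of the Hive-Spark access pattern referenced above, assuming a Spark application with Hive support enabled; the database, table, and column names are illustrative placeholders, not taken from the actual project:

    import org.apache.spark.sql.SparkSession

    object HiveSparkRead {
      def main(args: Array[String]): Unit = {
        // Reuse the existing Hive metastore so Spark SQL can see Hive tables
        val spark = SparkSession.builder()
          .appName("hive-spark-read")
          .enableHiveSupport()
          .getOrCreate()

        // Query a Hive table through Spark SQL and aggregate in memory
        val accounts = spark.sql("SELECT account_id, balance FROM credit.accounts")
        accounts.groupBy("account_id").sum("balance").show(20)

        spark.stop()
      }
    }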

PROFESSIONAL EXPERIENCE

Data Engineer

Confidential - San Francisco, CA

Responsibilities:

  • Created SQL scripts for production models and used them to monitor model performance every month.
  • Worked on databases including Netezza, Teradata, and HBase.
  • Worked with data modelers, ETL staff, and business system analysts in functional requirements reviews.
  • Responsible for migrating data from Netezza to Hadoop using Sqoop.
  • Involved in creating internal and external tables in Hive and later accessing these tables through Hive-Spark integration for faster results.
  • Extensively worked with Teradata utilities such as FastExport and FastLoad to export and load data to/from different source systems, including flat files.
  • Used various software tools such as Tableau, Toad, SVN, SAS Enterprise Guide, Teradata SQL Assistant, Teradata Studio Express, and WinSCP.
  • Involved in creating tables and debugging SAS datasets through PROC SQL.
  • Involved in creating PD, EAD, and LGD models for credit cards, auto loans, and mortgage loans.
  • Involved in migrating code from a legacy system to a new system.
  • Responsible for generating quarterly ongoing-monitoring reports through Tableau.
  • Experienced with SparkContext, DataFrames, Datasets, and RDDs.
  • Developed Spark applications using Scala to interact with the Teradata database to analyze the credit card accounts.
  • Used SVN to share Tableau reports and work-related documents.
  • Created work tables, global temporary tables, and volatile tables while developing SQL scripts in the Netezza and Teradata databases.
  • Gave knowledge-transfer (KT) sessions to new joiners and made improvements and fixes to existing code.
  • Created Sqoop jobs with full-refresh and incremental loads to populate Hive external tables.
  • Developed the Teradata stored procedures and scripts based on the technical design documents.
  • Used Spark SQL to process large amounts of data.
  • After migrating the data, involved in validating the data between Netezza and HDFS.
  • Experience with analytical manipulation and interpretation of large SAS data sets.
  • Developed Hive scripts to perform aggregation operations on the loaded data.
  • Used Spark SQL for better performance and faster results when comparing data between Netezza and Hadoop (see the sketch after this list).
  • Whenever credit card, mortgage, or auto loan model performance was low, traced upstream through the databases to find the root cause of the issue.
  • Developed dashboards in Tableau Desktop and published them to Tableau Server, allowing end users to understand the reports.
  • Defined, designed, and developed complex nested SQL queries for Tableau dashboards.
  • Used calculated fields to show detailed trend analysis in Tableau.
  • Involved in data analysis as part of monitoring model performance.
  • Used various flat files and servers to generate graphs in Tableau, and scheduled and emailed reports using Tableau Server.
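
A minimal sketch of the post-migration reconciliation described above, assuming Spark reads the source over the Netezza JDBC driver and the migrated copy from Hive; the connection details, table, and column names are placeholders:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.col

    object MigrationCheck {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("netezza-vs-hive-check")
          .enableHiveSupport()
          .getOrCreate()

        // Source-side snapshot over JDBC (Netezza driver jar on the classpath)
        val netezzaDf = spark.read.format("jdbc")
          .option("url", "jdbc:netezza://nz-host:5480/PROD")
          .option("dbtable", "CARDS.ACCOUNTS")
          .option("user", "etl_user")
          .option("password", "****")
          .load()

        // Target-side copy loaded into Hive after the Sqoop migration
        val hiveDf = spark.table("cards.accounts")

        // Row-count comparison plus keys present in the source but missing in Hive
        println(s"Netezza rows: ${netezzaDf.count()}, Hive rows: ${hiveDf.count()}")
        netezzaDf.select(col("ACCOUNT_ID").as("account_id"))
          .except(hiveDf.select(col("account_id")))
          .show(20)

        spark.stop()
      }
    }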

Environment: Hadoop, Teradata, Netezza, SAS, Tableau Desktop, Tableau Online, Toad, SQL, PROC SQL, FastExport, Excel, FastLoad, Impala, Sqoop, Scala, Spark, Hive, Teradata SQL Assistant, Cloudera.

Programmer Analyst

Confidential

Responsibilities:

  • Created Hive tables, loaded structured data into them, and wrote Hive queries to further analyze the data.
  • Hands-on experience with the MapR Hadoop platform to implement big data solutions using Hive and Pig.
  • Developed Spark scripts using Scala shell commands, as per requirements.
  • Involved in creating partitions for Hive tables, including multilevel partitioning (see the sketch after this list).
  • Used Flume to ingest data from client sources and Pig scripts to clean the data.
  • Validated final data sets by comparing them against RDBMS source systems using SQL and Hive queries.
  • Involved in creating HBase tables to query a minimal set of columns from a huge table.
  • Gained hands-on experience using Scala in the backend.
  • Involved in creating DataFrames and performed several aggregations using Spark.
  • Experienced in developing Spark applications in Scala on different data formats such as text and CSV files.
  • Involved in writing Pig scripts to transform data and save it into Hive tables.
  • Involved in exploring newer Hadoop components such as Spark, Oozie, HBase, and Kafka for the development of a project.
  • Learned to schedule Sqoop jobs using Oozie to pull incremental data from the traditional database.
  • Developed Hive queries to analyze the sales pattern and customer satisfaction index over the data present in various relational database tables.
  • Developed Pig Latin scripts to extract data from web server output files and store them in HDFS.
  • Created temp tables and used flat files to load data into Spark RDDs, then used the data for modeling calculations.
  • Worked on tuning the performance of Hive and Pig queries.
  • Worked on importing and exporting data from Oracle to HDFS using Sqoop.
  • Involved in Data Ingestion techniques to move data from various sources into HDFS.
  • Created Hive staging tables (managed tables) and permanent tables (external tables).
  • Involved in requirements gathering and responsible for helping teammates with Hive and Hadoop architecture.
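
A minimal sketch of the multilevel Hive partitioning mentioned above, run through Spark SQL with dynamic partitioning; the database, table, and column names are illustrative assumptions:

    import org.apache.spark.sql.SparkSession

    object PartitionedHiveLoad {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("hive-multilevel-partitions")
          .enableHiveSupport()
          .getOrCreate()

        // Allow partition values to be derived from the data itself
        spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

        // Two partition levels: year, then month
        spark.sql(
          """CREATE TABLE IF NOT EXISTS sales.orders_part (
            |  order_id STRING,
            |  amount   DOUBLE
            |) PARTITIONED BY (yr INT, mon INT)
            |STORED AS PARQUET""".stripMargin)

        // Load from a staging table; rows are routed to yr=/mon= partitions
        spark.sql(
          """INSERT OVERWRITE TABLE sales.orders_part PARTITION (yr, mon)
            |SELECT order_id, amount, year(order_ts) AS yr, month(order_ts) AS mon
            |FROM sales.orders_staging""".stripMargin)

        spark.stop()
      }
    }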

Environment: Hadoop, HDFS, Map Reduce, Hive, Pig, HBase, Spark, Scala, Cloudera, Oracle, SQL, SQOOP, Flume, Oozie, JSON.

Programmer Analyst

Confidential

Responsibilities:

  • Involvement in all stages of the SDLC, including requirements specification, review, test documentation, application testing, and defect reporting.
  • Involved in test case design and test execution whenever a new build was deployed.
  • Worked with IE Developer Tools to debug HTML.
  • Used Travelers Product Designer to fix bugs and issues identified during testing.
  • Wrote unit test cases using JUnit in Eclipse.
  • Involved in configuration, manual, unit, integration, and smoke testing to ensure web applications were defect-free.
  • Actively involved in configuring new UI drop-down boxes, radio buttons, and text boxes per requirements.
  • Worked within an Agile development methodology and tested the application in each iteration.
  • Automated test cases using Selenium IDE and Selenium WebDriver.
  • Created regression test cases for web applications to ensure nothing breaks when a new build is deployed.
  • Verified web applications with Selenium whenever a new build was integrated.
  • Performed cross-browser compatibility testing of the application (Safari, Firefox, IE, Chrome) using HTML IDs and XPath in Selenium WebDriver (see the sketch after this list).
  • Experienced in scripting tests and automating them with Selenium IDE / Selenium WebDriver.
  • Ran various XML files in the SoapUI tool; this XML data pre-filled the insurance policy automatically.
  • Created and executed automated test plans, test cases, and test scripts using Selenium (Java).
  • Tested Umbrella, Master Pac, and Master Pac Plus policy accounts using SQL queries to verify that the MySQL database captured correct values.
  • Received walkthroughs of new requirements from onsite leads.
  • Prepared weekly and daily status reports for onsite managers.
  • Found defects during system testing and system integration testing, and raised them in HP ALM QC.
  • Prepared high- and low-level design documents for the business modules for future reference and updates.
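
A minimal sketch of the id/XPath locator usage described above; this role used Selenium's Java bindings, and the same WebDriver API is shown here from Scala for consistency with the earlier sketches. The URL, element ids, XPath, and page title are illustrative placeholders:

    import org.openqa.selenium.{By, WebDriver}
    import org.openqa.selenium.firefox.FirefoxDriver

    object LoginSmokeTest {
      def main(args: Array[String]): Unit = {
        val driver: WebDriver = new FirefoxDriver()
        try {
          driver.get("https://policy-app.example.com/login")

          // Locate fields by HTML id, then the submit button by XPath
          driver.findElement(By.id("username")).sendKeys("test_user")
          driver.findElement(By.id("password")).sendKeys("****")
          driver.findElement(By.xpath("//button[@type='submit']")).click()

          // Simple smoke assertion on the landing page title
          assert(driver.getTitle.contains("Dashboard"))
        } finally {
          driver.quit()
        }
      }
    }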

Environment: Java, XML, Excel, Test case design, HTML, JavaScript, HP Quality Center, Selenium IDE/WebDriver, Firebug, FirePath, Eclipse, SoapUI, Product Designer, SQL, MySQL, MS Access, Account Manager.
