Data Engineer Resume
San Francisco, CA
SUMMARY
- Close to 5 years of IT industry experience, including 3 years in big data and 1 year using Teradata.
- Experience supporting data analysts in running Hive queries.
- Experience importing and exporting data between HDFS and relational database systems using Sqoop.
- Good knowledge of SQL databases such as Netezza and Teradata.
- Responsible for generating quarterly ongoing-monitoring reports in Tableau and sharing them with the manager and onsite team.
- Hands-on experience with HBase-Hive and Hive-Spark integration.
- Expertise in big data architecture with the Hadoop file system and its ecosystem tools: Flume, HBase, Hive, Pig, Sqoop, and Spark.
- Good knowledge of Spark SQL.
- Experience working with different source formats such as XML, JSON, CSV, and text files.
- Actively involved in creating SQL scripts, testing and debugging them, and fixing the code.
- Hands-on experience developing models and responsible for monitoring model performance every month.
- Used Spark SQL to access Hive tables from Spark for faster data processing.
- Involved in Data Migration, Data Ingestion, and Data Modeling projects.
- Gained extensive experience in data analysis, working with three sources at a time (Teradata, Netezza, and SAS datasets).
- Capable of processing large sets of structured, semi-structured, and unstructured data and supporting the systems and application architecture.
- After loading data into HDFS, used Spark transformations and actions to create tables, then used those datasets for Tableau dashboards (a minimal sketch follows this list).
- Experience in client-side technologies such as HTML, XML, and JavaScript.
- Hands-on experience with PROC SQL; able to debug and analyze various SAS datasets.
- Strong data analysis and SQL skills in relational, columnar, and big data environments (joins, unions, rank, group by, order by, etc.).
- Experience in performance optimization for both data loading and data retrieval.
- Involved in test case and test scenario design per requirements, and in test execution once defects were fixed.
- Experienced in implementing big data solutions using Hive, MapReduce, shell scripting, and Java technologies.
- Developed core modules in large cross-platform applications using JavaScript, XML, and HTML.
- Gained experience analyzing data and creating data visualizations (reports and dashboards) in Tableau as required to support business needs.
- Good management and documentation skills.
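The bullets above describe loading data into HDFS and then using Spark transformations and Spark SQL over Hive tables to feed Tableau dashboards. Below is a minimal Scala sketch of that pattern; the database, table, and column names are hypothetical, not taken from the actual project.

```scala
// Sketch only: read a Hive table through Spark SQL, apply transformations,
// and persist a summary table that a Tableau extract can point at.
// All table and column names below are placeholders.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object HiveSummarySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-summary-sketch")
      .enableHiveSupport()          // lets Spark SQL see the Hive metastore
      .getOrCreate()

    // Transformations: filter and aggregate the Hive table
    val accounts = spark.table("analytics_db.accounts")
    val monthly = accounts
      .filter(col("status") === "ACTIVE")
      .groupBy(col("snapshot_month"))
      .agg(count("*").alias("active_accounts"),
           sum("balance").alias("total_balance"))

    // Action: materialize the result as a Hive table for the dashboards
    monthly.write.mode("overwrite").saveAsTable("analytics_db.accounts_monthly_summary")

    spark.stop()
  }
}
```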
PROFESSIONAL EXPERIENCE
Data Engineer
Confidential - San Francisco, CA
Responsibilities:
- Created SQL scripts for production models and used them to monitor model performance every month.
- Worked on databases including Netezza, Teradata, and HBase.
- Worked with data modelers, ETL staff, and business systems analysts in functional requirements reviews.
- Responsible for migrating data from Netezza to Hadoop using Sqoop.
- Involved in creating internal and external tables in Hive, later accessing these tables through Hive-Spark integration for faster results.
- Worked extensively with Teradata utilities such as FastExport and FastLoad to export and load data to and from different source systems, including flat files.
- Used various software tools including Tableau, Toad, SVN, SAS Enterprise Guide, Teradata SQL Assistant, Teradata Studio Express, and WinSCP.
- Involved in creating tables and debugging SAS datasets through PROC SQL.
- Involved in creating PD, EAD, and LGD models for credit cards, auto loans, and mortgage loans.
- Involved in migrating code from a legacy system to a new system.
- Responsible for generating ongoing-monitoring reports through Tableau every quarter.
- Experienced with SparkContext (sc), DataFrames, Datasets, and RDDs.
- Developed Spark applications in Scala to interact with the Teradata database and analyze credit card accounts (sketched at the end of this role).
- Used SVN to share Tableau reports and work-related documents.
- Created work tables, global temporary tables, and volatile tables while developing SQL scripts in Netezza and Teradata databases.
- Gave knowledge-transfer (KT) sessions to new joiners and made improvements and fixes to existing code.
- Created Sqoop jobs with full-refresh and incremental loads to populate Hive external tables.
- Developed Teradata stored procedures and scripts based on the technical design documents.
- Used Spark SQL to process large volumes of data.
- After migration, involved in validating the data between Netezza and HDFS.
- Experience with analytical manipulation and interpretation of large SAS data sets.
- Developed Hive scripts to perform aggregation operations on the loaded data.
- Used Spark SQL for better performance and faster results when comparing data between Netezza and Hadoop (a reconciliation sketch also follows this role).
- Whenever credit card, mortgage, or auto loan model performance dropped, traced the issue upstream through the source databases to find the root cause.
- Developed dashboards in Tableau Desktop and published them to Tableau Server, giving end users easy access to the reports.
- Defined, designed, and developed complex nested SQL queries for Tableau dashboards.
- Used calculated fields to show detailed trend analysis in Tableau.
- Involved in data analysis as part of monitoring model performance.
- Used various flat files and servers to generate graphs in Tableau; scheduled and emailed reports using Tableau Server.
Environment: Hadoop, Teradata, Netezza, SAS, Tableau Desktop, Tableau Online, Toad, SQL, Proc SQL, FastExport, Excel, FastLoad, Impala, Sqoop, Scala, Spark, Hive, Teradata SQL Assistant, Cloudera.
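One bullet above mentions Spark applications in Scala that pull credit card account data from Teradata. The sketch below shows what such a job could look like; the JDBC URL, credentials, table, and columns are placeholders, and the specific analysis is illustrative only.

```scala
// Hedged sketch: read a Teradata table over JDBC into a Spark DataFrame
// and run a simple illustrative analysis. Connection details are placeholders.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object CreditCardAnalysisSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("credit-card-analysis-sketch").getOrCreate()

    // Pull the accounts table from Teradata through the Teradata JDBC driver
    val accounts = spark.read
      .format("jdbc")
      .option("url", "jdbc:teradata://td-host/DATABASE=risk_db")   // placeholder host/db
      .option("driver", "com.teradata.jdbc.TeraDriver")
      .option("dbtable", "risk_db.credit_card_accounts")           // placeholder table
      .option("user", sys.env("TD_USER"))                          // credentials from env vars
      .option("password", sys.env("TD_PASS"))
      .load()

    // Example analysis: 30+ days-past-due rate by origination year
    accounts
      .groupBy(col("origination_year"))
      .agg(avg(when(col("days_past_due") > 30, 1).otherwise(0)).alias("delinquency_rate"))
      .orderBy("origination_year")
      .show()

    spark.stop()
  }
}
```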
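Another bullet describes using Spark SQL to compare data between Netezza and Hadoop after migration. Below is one way such a reconciliation could look, assuming hypothetical connection details, table names, and a single numeric column to cross-check.

```scala
// Minimal sketch: compare row counts and a simple aggregate between the
// Netezza source (read via JDBC) and the migrated Hive table on Hadoop.
// All names and connection settings are placeholders.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object NetezzaHiveReconciliationSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("netezza-hive-recon-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Netezza side, read over JDBC
    val netezzaDf = spark.read
      .format("jdbc")
      .option("url", "jdbc:netezza://nz-host:5480/RISK_DB")   // placeholder
      .option("driver", "org.netezza.Driver")
      .option("dbtable", "ADMIN.LOAN_ACCOUNTS")                // placeholder
      .option("user", sys.env("NZ_USER"))
      .option("password", sys.env("NZ_PASS"))
      .load()

    // Hadoop side: the Hive table populated by the migration
    val hiveDf = spark.table("migrated_db.loan_accounts")

    // Compare row counts and a checksum-style aggregate on one column
    val nzCount   = netezzaDf.count()
    val hiveCount = hiveDf.count()
    val nzSum     = netezzaDf.agg(sum("balance")).first().get(0)
    val hiveSum   = hiveDf.agg(sum("balance")).first().get(0)

    println(s"Row counts  -> Netezza: $nzCount, Hive: $hiveCount")
    println(s"Balance sum -> Netezza: $nzSum, Hive: $hiveSum")

    spark.stop()
  }
}
```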
Programmer Analyst
Confidential
Responsibilities:
- Created Hive tables, loaded structured data into them, and wrote Hive queries to further analyze the data.
- Hands-on experience with the MapR Hadoop platform, implementing big data solutions using Hive and Pig.
- Developed Spark scripts using Scala shell commands, per requirements.
- Involved in creating partitions for Hive tables, including multilevel partitioning (sketched at the end of this role).
- Used Flume to ingest data from client sources and Pig scripts to clean the data.
- Validated final datasets against RDBMS source systems by writing SQL and Hive queries.
- Involved in creating HBase tables to query a minimal set of columns from a very large table.
- Also gained hands-on experience using Scala on the back end.
- Involved in creating DataFrames and performing several aggregations using Spark (see the sketch following this role).
- Experienced in developing Spark applications in Scala on different data formats such as text and CSV files.
- Involved in writing Pig scripts to transform the data and save it into Hive tables.
- Involved in exploring newer Hadoop components such as Spark, Oozie, HBase, and Kafka for project development.
- Learned to schedule Sqoop jobs using Oozie to pull incremental data from traditional databases.
- Developed Hive queries to analyze the sales pattern and customer satisfaction index over the data present in various relational database tables.
- Developed Pig Latin scripts to extract data from web server output files and store the files in HDFS.
- Created temp tables and used flat files to load data into Spark RDDs, then used that data for modeling calculations.
- Worked on tuning the performance of Hive and Pig queries.
- Worked on importing and exporting data from Oracle to HDFS using Sqoop.
- Involved in Data Ingestion techniques to move data from various sources into HDFS.
- Created Hive staging tables (managed tables) and permanent tables (external tables).
- Involved in requirements gathering and responsible for helping teammates with Hive and Hadoop architecture.
Environment: Hadoop, HDFS, Map Reduce, Hive, Pig, HBase, Spark, Scala, Cloudera, Oracle, SQL, SQOOP, Flume, Oozie, JSON.
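One bullet above mentions multilevel partitioning of Hive tables. The sketch below shows a two-level (year/month) partitioned table created and loaded through Spark's Hive support; the database, table, and column names are hypothetical, and the same DDL could also be run directly in the Hive CLI.

```scala
// Hedged sketch of a multilevel-partitioned Hive table (partitioned by year, then month)
// created and loaded via Spark SQL with Hive support. All names are placeholders.
import org.apache.spark.sql.SparkSession

object MultilevelPartitionSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("multilevel-partition-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Table with two partition levels: order_year, then order_month
    spark.sql("""
      CREATE TABLE IF NOT EXISTS sales_db.orders_partitioned (
        order_id BIGINT,
        customer_id BIGINT,
        amount DOUBLE
      )
      PARTITIONED BY (order_year INT, order_month INT)
      STORED AS PARQUET
    """)

    // Allow dynamic partitions, then load from a staging table
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql("""
      INSERT OVERWRITE TABLE sales_db.orders_partitioned
      PARTITION (order_year, order_month)
      SELECT order_id, customer_id, amount, order_year, order_month
      FROM sales_db.orders_staging
    """)

    spark.stop()
  }
}
```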
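The role also involved building DataFrames from flat files (text/CSV) and running aggregations in Spark. A minimal sketch of that pattern is below; the file path and column names are placeholders.

```scala
// Sketch only: load CSV files from HDFS into a DataFrame and aggregate.
// The path and columns are hypothetical.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object CsvAggregationSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("csv-aggregation-sketch").getOrCreate()

    val sales = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("hdfs:///data/raw/sales/*.csv")   // placeholder path

    // Aggregate sales per region and product, largest totals first
    sales
      .groupBy("region", "product")
      .agg(sum("amount").alias("total_amount"),
           countDistinct("customer_id").alias("unique_customers"))
      .orderBy(desc("total_amount"))
      .show(20)

    spark.stop()
  }
}
```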
Programmer Analyst
Confidential
Responsibilities:
- Involved in all stages of the SDLC, including requirements specification, review, test documentation, application testing, and defect reporting.
- Involved in test case design and test execution whenever a new build was deployed.
- Worked with IE Developer Tools to debug the given HTML.
- Used Travelers Product Designer to fix bugs and issues identified during testing.
- Wrote unit test cases using JUnit in Eclipse.
- Involved in configuration, manual, unit, integration, and smoke testing to ensure web applications were defect-free.
- Actively involved in configuring new UI drop-down boxes, radio buttons, and text boxes per requirements.
- Followed the Agile development methodology throughout and tested the application in each iteration.
- Automated test cases using Selenium IDE and Selenium WebDriver.
- Created regression test cases for web applications to ensure nothing breaks when a new build is deployed.
- Checked and verified web applications using Selenium whenever a new build was integrated.
- Performed cross-browser compatibility testing (Safari, Firefox, IE, Chrome) using HTML IDs and XPath locators in Selenium WebDriver (a minimal sketch follows this role).
- Experienced in scripting tests and automating them with Selenium IDE and Selenium WebDriver.
- Ran various XML files through the SoapUI tool; the XML data pre-filled the insurance policy automatically.
- Created and executed automated test plans, test cases, and test scripts using Selenium (Java).
- Tested Umbrella, Master Pac, and Master Pac Plus policy accounts using SQL queries to verify that the MySQL database captured correct values.
- Received walkthroughs of new requirements from onsite leads.
- Prepared weekly and daily status reports for onsite managers.
- Found defects during system testing and system integration testing, raising them in HP ALM QC.
- Prepared high- and low-level design documents for the business modules for future reference and updates.
Environment: Java, XML, Excel, Test case design, HTML, JavaScript, HP Quality Center, Selenium IDE/WebDriver, Firebug, FirePath, Eclipse, SOAP UI, Product Designer, SQL, MySQL, MS Access, Account Manager.
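The Selenium automation in this role was written in Java; the sketch below expresses the same WebDriver-plus-JUnit pattern in Scala (which calls the Selenium Java API directly) so that all code samples on this resume share one language. The page URL, element locators, and expected text are hypothetical, not from the actual application.

```scala
// Hedged sketch of a WebDriver smoke test run under JUnit 4.
// The URL, locators, and assertion text are placeholders.
import org.openqa.selenium.{By, WebDriver}
import org.openqa.selenium.chrome.ChromeDriver
import org.junit.{After, Before, Test}
import org.junit.Assert.assertTrue

class PolicySearchSmokeTest {
  private var driver: WebDriver = _

  @Before def setUp(): Unit = {
    driver = new ChromeDriver()   // assumes chromedriver is available on PATH
  }

  @Test def policySearchReturnsDetails(): Unit = {
    driver.get("https://example.com/policy-search")                 // placeholder URL
    driver.findElement(By.id("policyNumber")).sendKeys("TEST-0001") // locate by HTML ID
    driver.findElement(By.xpath("//button[@id='searchBtn']")).click() // locate by XPath
    assertTrue(driver.getPageSource.contains("Policy Details"))     // placeholder check
  }

  @After def tearDown(): Unit = {
    driver.quit()
  }
}
```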