
Big Data / Data Engineer Resume

Emeryville, CA

SUMMARY:

  • Designer, builder and manager of Big Data infrastructures
  • A collaborative engineering professional with substantial experience designing and executing solutions for complex business problems involving large-scale data warehousing, real-time analytics and reporting solutions.
  • Known for using the right tools when and where they make sense and creating an intuitive architecture that helps organizations effectively analyze and process terabytes of structured and unstructured data.

COMPETENCY SYNOPSIS:

Data Warehousing: Proven history of building large-scale data processing systems and serving as an expert in data warehousing solutions while working with a variety of database technologies. Experience architecting highly scalable, distributed systems using different open source tools as well as designing and optimizing large, multi-terabyte data warehouses. Able to integrate state-of-the-art Big Data technologies into the overall architecture and lead a team of developers through the construction, testing and implementation phase.

Databases and Tools: MySQL, MS SQL Server, Oracle, DB2, SAP HANA, Vertica, Greenplum, Pentaho and Teradata; NoSQL: HBase, MongoDB; HDFS.

Data Analysis: Consulted with business partners and made recommendations to improve the effectiveness of Big Data systems, descriptive analytics systems, and prescriptive analytics systems. Integrated new tools and developed technology frameworks/prototypes to accelerate the data integration process and empower the deployment of predictive analytics. Working knowledge of machine learning and/or predictive modeling.

Tools: Hive, Pig, Hadoop Streaming, MapReduce, Spark, Kafka.

Data Transformation: Experience designing, reviewing, implementing and optimizing data transformation processes in the Hadoop and Informatica ecosystems. Able to consolidate, validate and cleanse data from a vast range of sources, from applications and databases to files and Web services.
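For illustration, a minimal Python sketch of the consolidate/validate/cleanse step described above. The field names and the validation rule are hypothetical, not taken from any specific engagement:

```python
# Normalize records from mixed sources into one schema and drop
# rows that fail a simple validation rule (assumed: email must
# contain "@"). Field names "email"/"name" are hypothetical.
def cleanse(records):
    clean = []
    for rec in records:
        email = (rec.get("email") or "").strip().lower()
        if "@" not in email:  # validation rule (assumed)
            continue
        clean.append({"email": email, "name": (rec.get("name") or "").strip()})
    return clean

raw = [
    {"email": " Alice@Example.com ", "name": "Alice "},
    {"email": "not-an-email", "name": "Bob"},  # dropped by validation
]
result = cleanse(raw)
```

In a real pipeline the same normalize-then-validate pattern would run inside an Informatica mapping or a Hadoop job rather than plain Python.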

PROFESSIONAL EXPERIENCE:

Confidential, Emeryville, CA

Big Data / Data Engineer

Responsibilities:

  • Designed a large data warehouse using star and snowflake schemas.
  • Designed and developed a Big Data analytics platform for processing customer viewing preferences using Scala, Hadoop, Spark and Hive.
  • Gathered data for analytics across different client sectors, such as retail and employee reporting.
  • Integrated Hadoop with Teradata, accelerating the extraction, transformation and loading of massive structured and unstructured data.
  • Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
  • Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System (HDFS) and Pig to pre-process the data.
  • Loaded the aggregate data into a relational database for reporting, dash boarding and ad-hoc analyses, which revealed ways to lower operating costs and offset the rising cost of programming.
  • Developed Scala scripts and UDFs, using both DataFrames/SQL and RDD/MapReduce in Spark, for data aggregation, queries and writing data back into OLTP systems.
  • Implemented best-income logic using Pig scripts.
  • Implemented test scripts to support test-driven development and continuous integration. Managed data coming from different sources.
  • Installed and configured Hive and wrote Hive UDFs.
  • Performance-tuned Spark applications by setting the right batch interval, the correct level of parallelism and appropriate memory settings.
  • Created reports and dashboards from structured and unstructured data in tools like Tableau.
  • Designed and developed Spark jobs in Scala to compare the performance of Spark with Hive and SQL.
  • Developed Spark scripts using Scala shell commands as per requirements.
  • Generated dashboards with quick filters, parameters and sets to handle views more efficiently.
  • Generated context filters and data source filters while handling huge volumes of data.
  • Built dashboards for measures with forecasts, trend lines and reference lines.
  • Created visualizations using bars, lines, pies, maps, scatter plots, Gantt charts, bubbles, histograms, bullets, heat maps and highlight tables.
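The parse-raw-data-then-aggregate flow described in the MapReduce and staging-table bullets above can be sketched in plain Python. The record layout (timestamp, user, program, minutes viewed) and field separator are hypothetical stand-ins for the actual viewing-preference data:

```python
from collections import defaultdict

# Hypothetical raw records: "date|user_id|program|minutes_viewed"
RAW = [
    "2016-03-01|u1|news|30",
    "2016-03-01|u2|news|10",
    "2016-03-02|u1|sports|45",
]

def map_phase(line):
    """Parse one raw line into a ((date, program), minutes) pair."""
    date, _user, program, minutes = line.split("|")
    return (date, program), int(minutes)

def reduce_phase(pairs):
    """Sum minutes per (date, program) key, mimicking a table
    partitioned by date in the EDW."""
    totals = defaultdict(int)
    for key, minutes in pairs:
        totals[key] += minutes
    return dict(totals)

refined = reduce_phase(map_phase(line) for line in RAW)
```

In the actual system the same map and reduce logic would run as a Hadoop MapReduce job, with the (date, program) key doubling as the partition key of the refined table.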

Confidential, San Francisco

Big Data Engineer

Responsibilities:

  • Strong understanding of Data warehouse concepts, ETL, Star Schema, Snowflake, physical and logical data models.
  • Used Informatica IDQ to perform ETL.
  • Configured SQL database to store Spark metadata.
  • Designed a data analysis platform for the business using Python, Hadoop and Spark.
  • Loaded unstructured data into Hadoop File System and Teradata.
  • Wrote MapReduce jobs using Pig Latin.
  • Wrote Hive queries for data analysis to meet business requirements.
  • Created Hive tables and worked on them using HiveQL.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Performed advanced procedures like text analytics and processing, using the in-memory computing capabilities of Spark with Python.
  • Handled large datasets using partitions, Spark in-memory capabilities, broadcasts, effective and efficient joins, and transformations during the ingestion process itself.
  • Experience with scripting languages in the LAMP stack.
  • Created reports and dashboards from structured and unstructured data in tools like Tableau.
  • Generated dashboards with quick filters, parameters and sets to handle views more efficiently.
  • Generated context filters and data source filters while handling huge volumes of data.
  • Built dashboards for measures with forecasts, trend lines and reference lines.
  • Created visualizations using bars, lines, pies, maps, scatter plots, Gantt charts, bubbles, histograms, bullets, heat maps and highlight tables.
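The broadcast technique mentioned in the bullets above can be illustrated with a pure-Python analogue: a small dimension table held in memory and looked up per fact row, which is what a map-side (broadcast) join avoids a shuffle for. Table contents and names here are hypothetical:

```python
# Map-side ("broadcast") join sketch: the small dimension table is
# held in memory on every worker, so each fact row is enriched by a
# local lookup instead of a shuffle-based join.
dim_products = {1: "widget", 2: "gadget"}    # hypothetical dimension

facts = [(1, 100.0), (2, 50.0), (1, 25.0)]   # (product_id, amount)

def broadcast_join(facts, dim):
    """Enrich each fact row with its dimension attribute."""
    return [(dim.get(pid, "unknown"), amount) for pid, amount in facts]

joined = broadcast_join(facts, dim_products)
```

In Spark the equivalent is `sc.broadcast(...)` around the small table (or a broadcast-hinted DataFrame join); the payoff grows with the size of the fact dataset.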

Confidential, Concord, CA

Data Architect / Developer

Responsibilities:

  • Gathered business requirements, definition and design of the data sourcing and data flows, data quality analysis, working in conjunction with the data warehouse architect on the development of logical data models.
  • Used Erwin for data modeling.
  • Created complex stored procedures, triggers, functions, indexes, tables, views and other T-SQL code and SQL joins for applications.
  • Implemented database standards and naming conventions for database objects; established data granularity standards; designed and built star and snowflake dimensional models.
  • Developed Informatica packages to extract, transform and load (ETL) data into the data warehouse database from heterogeneous databases and data sources.
  • Used Java Transformations and wrote Java code for XML parsing, writing to files, etc.
  • Designed star schema models, creating facts, dimensions, measures and cubes, and optimized data connections, data extracts, schedules for background tasks and incremental refreshes for the weekly and monthly dashboard reports on Tableau Server.
  • Used Excel sheets, flat files and CSV files to generate Tableau ad-hoc reports. Involved in creating calculated fields, mappings and hierarchies.
  • Generated context filters and used performance actions while handling huge volumes of data. Generated Tableau dashboards for sales with forecasts and reference lines. Proficient in working with large databases in DB2, Oracle, Teradata and SQL Server.
  • Strong understanding of Data warehouse concepts, ETL, Star Schema, Snowflake, physical and logical data models.
  • In-depth knowledge of normalization, fact tables and dimension tables.
  • Involved in creating complex stored procedures, triggers, cursors, tables and views, and other SQL joins and statements for reporting application development.
  • Worked on building queries to retrieve data into Tableau from Oracle and developed SQL statements (ETL) for loading data into the target schema.
  • Excellent analytical, problem-solving and SQL debugging skills using Tableau and SQL. Drove informed decisions by analyzing business and product performance.
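The star-schema reporting queries described above boil down to joining a fact table to its dimensions and aggregating. A minimal sketch using Python's built-in `sqlite3` module, with hypothetical table and column names standing in for the warehouse schema:

```python
import sqlite3

# Minimal star-schema sketch: one fact table keyed to one dimension,
# queried the way a reporting layer (e.g. a Tableau data source) would.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE fact_sales (store_id INTEGER, amount REAL);
    INSERT INTO dim_store VALUES (1, 'West'), (2, 'East');
    INSERT INTO fact_sales VALUES (1, 100.0), (1, 50.0), (2, 70.0);
""")

# Aggregate fact measures by a dimension attribute.
rows = con.execute("""
    SELECT d.region, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_store d ON f.store_id = d.store_id
    GROUP BY d.region
    ORDER BY d.region
""").fetchall()
```

The same shape scales out: in the warehouse, the fact table carries the grain-level measures and the surrogate keys, and each dimension supplies the descriptive attributes the report groups by.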

Confidential, Los Angeles, CA

Data Developer

Responsibilities:

  • Gathered business requirements, definition and design of the data sourcing and data flows, data quality analysis, working in conjunction with the data warehouse architect on the development of logical data models.
  • Used Erwin for data modeling.
  • Created complex stored procedures, triggers, functions, indexes, tables, views and other T-SQL code and SQL joins for applications.
  • Implemented database standards and naming conventions for database objects; established data granularity standards; designed and built star and snowflake dimensional models.
  • Developed SSIS packages to extract, transform and load (ETL) data into the data warehouse database from heterogeneous databases and data sources.
  • Designed star schema models, creating facts, dimensions, measures and cubes in SSAS.
  • Developed drill-down and drill-through reports from multi-dimensional objects like star schema and snowflake schema using SSRS and SharePoint Server.
  • Designed aggregations and pre-calculations in SSAS.
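The pre-calculated aggregations mentioned above amount to storing the same measure at coarser grains so drill-down becomes a lookup rather than a scan. A small Python sketch of that idea, with hypothetical fact rows of (year, month, amount):

```python
from collections import defaultdict

# Hypothetical fact rows at (year, month) grain. SSAS-style
# pre-computed aggregations keep totals at coarser grains so a
# drill-down report reads them directly instead of re-scanning facts.
facts = [(2014, 1, 10.0), (2014, 2, 5.0), (2015, 1, 7.5)]

def preaggregate(facts):
    """Roll the measure up to the year and year-month grains."""
    by_year = defaultdict(float)
    by_year_month = defaultdict(float)
    for year, month, amount in facts:
        by_year[year] += amount
        by_year_month[(year, month)] += amount
    return by_year, by_year_month

by_year, by_year_month = preaggregate(facts)
```

In SSAS the engine picks which aggregations to materialize based on the aggregation design; the sketch just shows why a coarser-grain total answers a drill-down query without touching the leaf-level facts.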
