Sr. Big Data Engineer Resume
Arlington, VA
SUMMARY:
- Overall 7+ years of IT experience as a Big Data Engineer, Data Engineer, and Programmer Analyst.
- Hands-on experience writing SQL queries and PL/SQL programs; created new packages and procedures and modified and tuned existing procedures and queries using TOAD.
- Hands-on experience writing and optimizing SQL queries in Oracle, SQL Server, DB2, Netezza, and Teradata.
- Hands-on experience with normalization (1NF, 2NF, 3NF, and BCNF) and de-normalization techniques for effective and optimal performance in OLTP and OLAP environments.
- Experience in designing, building, and implementing the complete Hadoop ecosystem, comprising MapReduce, HDFS, Hive, Impala, Pig, Sqoop, Oozie, HBase, MongoDB, and Spark.
- Strong experience architecting highly performant databases using PostgreSQL, PostGIS, MySQL, and Cassandra.
- Extensive experience using ER modeling tools such as Erwin and ER/Studio, along with Teradata, BTEQ, MLDM, and MDM.
- Experienced in R and Python for statistical computing, with additional experience in Spark MLlib, MATLAB, Excel, Minitab, SPSS, and SAS.
- Extensive experience in loading and analyzing large datasets with Hadoop framework (MapReduce, HDFS, Pig, Hive, Flume, Sqoop).
- Good experience using SSRS and Cognos to create and manage reports for an organization.
- Excellent working experience in Scrum / Agile framework and Waterfall project execution methodologies.
- Expertise in Data Modeling, Data Migration, Data Profiling, Data Cleansing, Transformation, Integration, Data Import, and Data Export using ETL tools such as Informatica PowerCenter.
- Strong experience working with databases such as Teradata and proficiency in writing complex SQL and PL/SQL to create tables, views, indexes, stored procedures, and functions.
- Experience in importing and exporting Terabytes of data between HDFS and Relational Database Systems using Sqoop.
- Good experience working with analysis tools like Tableau for regression analysis, pie charts, and bar graphs.
- Good understanding of Apache Spark's high-level architecture and performance tuning patterns.
- Hands-on experience developing UDFs, DataFrames, and SQL queries in Spark SQL (a brief sketch follows this summary).
- Worked with NoSQL databases like HBase, Cassandra, and MongoDB for information extraction and for storing huge amounts of data.
- Understanding of data storage and retrieval techniques, ETL, and databases, including graph stores and relational databases.
- Experienced in writing Storm topologies to accept events from Kafka producers and emit them into Cassandra.
- Developed PL/SQL programs (Functions, Procedures, Packages and Triggers).
- Good experience in using Sqoop for traditional RDBMS data pulls.
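Illustrative sketch (not code from any actual engagement): a minimal Scala example of the kind of Spark SQL UDF and DataFrame work noted above. The table name (hypothetical_events), column names, and output path are hypothetical placeholders.

```scala
// Minimal Spark SQL sketch: register a UDF and use it from both the DataFrame
// API and a SQL query. All table/column names are hypothetical placeholders.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object SparkSqlUdfSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spark-sql-udf-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // UDF that normalizes country codes to a trimmed, upper-case form.
    val normalizeCountry = udf((code: String) =>
      Option(code).map(_.trim.toUpperCase).orNull)
    spark.udf.register("normalize_country", normalizeCountry)

    // DataFrame API usage against a hypothetical Hive table.
    val events = spark.table("hypothetical_events")
      .withColumn("country_code", normalizeCountry(col("country_code")))

    // Equivalent Spark SQL usage of the registered UDF.
    spark.sql(
      """SELECT normalize_country(country_code) AS country, COUNT(*) AS cnt
        |FROM hypothetical_events
        |GROUP BY normalize_country(country_code)""".stripMargin)
      .show()

    // Persist the normalized DataFrame to a placeholder HDFS location.
    events.write.mode("overwrite").parquet("hdfs:///tmp/hypothetical_events_clean")
    spark.stop()
  }
}
```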
TECHNICAL SKILLS:
Big Data & Hadoop Ecosystem: Hadoop 3.0, HBase 1.2, Hive 2.3, Pig 0.17, Solr 7.2, Flume 1.8, Sqoop 1.4, Kafka 1.0.1, Oozie 4.3, Hue, Cassandra 3.11
Data Modeling Tools: Erwin r9.7, ER Studio v16
BI Tools: Tableau 10, SAP Business Objects, Crystal Reports
Methodologies: Agile, SDLC, Ralph Kimball data warehousing methodology, Joint Application Development (JAD)
RDBMS: Microsoft SQL Server 2017, Teradata 15.0, Oracle 12c, and MS Access
Operating Systems: Microsoft Windows 7/8 and 10, UNIX, and Linux.
Packages: Microsoft Office 2019, Microsoft Project, SAP, Microsoft Visio 2019, and SharePoint Portal Server
OLAP Tools: Tableau, SAP BO, SSAS, Business Objects, and Crystal Reports 9
Cloud Platform: AWS, Azure, Google Cloud, Cloud Stack/Open Stack
Programming Languages: SQL, PL/SQL, UNIX shell scripting, Perl, AWK, SED
Databases: Oracle 12c/11g, Teradata R15/R14, MS SQL Server 2016/2014, DB2.
ETL/Data warehouse Tools: Informatica 9.6/9.1, SAP Business Objects XIR3.1/XIR2, Talend, Tableau, and Pentaho.
PROFESSIONAL EXPERIENCE:
Confidential - Arlington, VA
Sr. Big Data Engineer
Responsibilities:
- As a Sr. Big Data Engineer, worked on Big Data technologies such as Apache Hadoop, MapReduce, shell scripting, and Hive.
- Involved in all phases of the SDLC using Agile and participated in daily scrum meetings with cross-functional teams.
- Wrote complex Hive queries to extract data from heterogeneous sources (Data Lake) and persist the data into HDFS.
- Created data integration and technical solutions for Azure Data Lake Analytics, Azure Data Lake Storage, Azure Data Factory, Azure SQL databases and Azure SQL Data Warehouse for providing analytics.
- Involved in all phases of data mining, data collection, data cleaning, developing models, validation and visualization.
- Installed and configured Hadoop ecosystem components such as HBase, Flume, Pig, and Sqoop.
- Designed and developed Big Data analytics solutions on a Hadoop-based platform and engaged clients in technical discussions.
- Installed, configured, and maintained the Hadoop cluster for application development, along with Hadoop ecosystem components such as Hive, Pig, HBase, Zookeeper, and Sqoop.
- Developed workflows in Oozie to automate loading data into HDFS and pre-processing it with Pig.
- Worked on Hive queries to categorize data of different wireless applications and security systems.
- Responsible for loading and transforming large sets of structured, semi-structured, and unstructured data.
- Extensively involved in writing PL/SQL, stored procedures, functions and packages.
- Involved in data architecture, data profiling, data analysis, data mapping, and the design of data architecture artifacts.
- Created linked services to connect to Azure Storage, on-premises SQL Server, and Azure HDInsight.
- Responsible for Big data initiatives and engagement including analysis, brainstorming, POC, and architecture.
- Implemented logical and physical relational database designs and maintained database objects in the data model using Erwin.
- Worked with NoSQL databases like HBase, creating tables to load large sets of semi-structured data coming from source systems.
- Developed numerous MapReduce jobs in Scala for data cleansing and analyzed the data in Impala (a representative sketch appears at the end of this section).
- Created a data pipeline with processor groups and multiple processors in Apache NiFi for flat-file and RDBMS sources, as part of a POC on Amazon EC2.
- Managed the Metadata associated with the ETL processes used to populate the Data Warehouse.
- Created Hive queries and tables that helped the line of business identify trends by applying strategies to historical data before promoting them to production.
- Configured Azure SQL Database with Azure Storage Explorer and with SQL Server.
- Designed data marts following star schema and snowflake schema methodologies, using industry-leading data modeling tools such as Erwin.
- Designed class and activity diagrams using PowerDesigner and UML tools like Visio.
Environment: Hadoop 3.0, SDLC, Azure, HBase 1.2, Pig 0.17, Sqoop 1.4, Zookeeper, Oozie 4.3, SQL, HDFS, Hive 2.3, PL/SQL, Erwin 9.8, Scala, Apache Nifi, ETL, Excel, Flume 1.8.
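Illustrative sketch of the kind of Scala-based data-cleansing job referenced in this section (it uses Spark rather than classic MapReduce); the Hive table, columns, and HDFS path are hypothetical placeholders, not project artifacts.

```scala
// Sketch only: trim strings, normalize dates, drop incomplete and duplicate
// rows, then persist the cleansed data to HDFS. All names are hypothetical.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_date, trim}

object CleansingJobSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cleansing-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Read raw records from a (hypothetical) data-lake table exposed through Hive.
    val raw = spark.table("datalake.raw_security_events")

    // Basic cleansing: trim identifiers, derive a date column, drop rows
    // missing key fields, and remove exact duplicates on the business key.
    val cleansed = raw
      .withColumn("device_id", trim(col("device_id")))
      .withColumn("event_date", to_date(col("event_ts")))
      .na.drop(Seq("device_id", "event_date"))
      .dropDuplicates("device_id", "event_ts")

    // Persist the cleansed data to HDFS as Parquet, partitioned by date.
    cleansed.write
      .mode("overwrite")
      .partitionBy("event_date")
      .parquet("hdfs:///data/cleansed/security_events")

    spark.stop()
  }
}
```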
Confidential - Lowell, AR
Data Engineer
Responsibilities:
- Worked as a Big Data implementation engineer within a team of professionals.
- Installed and configured Hadoop and responsible for maintaining cluster and managing and reviewing Hadoop log files.
- Created HBase tables to store various data formats of PII data coming from different portfolios.
- Worked on the end-to-end Software Development Life Cycle in an Agile environment using Scrum methodologies.
- Used forward engineering to generate DDL from the Physical Data Model and handed it to the DBA.
- Created external tables pointing to HBase to access tables with a huge number of columns.
- Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume.
- Developed Spark code using Scala and Spark-SQL for faster testing and data processing.
- Involved in writing complex SQL Queries and provided SQL Scripts for the Configuration Data which is used by the application.
- Implemented an enterprise-grade platform (MarkLogic) for ETL from mainframe to NoSQL (Cassandra).
- Installed and configured a multi-node cluster in the cloud using Amazon Web Services (AWS) EC2.
- Developed code for importing and exporting data into HDFS and Hive using Sqoop.
- Developed Pig scripts to parse the raw data, populate staging tables and store the refined data in partitioned DB2 tables for Business analysis.
- Developed normalized Logical and Physical database models for designing an OLTP application.
- Integrated NoSQL databases like HBase with MapReduce to move bulk data into HBase.
- Developed code to extract data from Oracle Database and load it into the AWS platform using AWS Data Pipeline.
- Rendered and delivered reports in desired formats by using reporting tools such as Tableau.
- Developed Pig scripts to transform the data into a structured format, automated through Oozie coordinators.
- Assisted in designing, development and architecture of Hadoop and HBase systems.
- Worked on configuring and managing disaster recovery and backup of Cassandra data.
- Enforced referential integrity in the OLTP data model for consistent relationship between tables and efficient database design.
- Developed optimal strategies for distributing the web log data over the cluster and for importing and exporting the stored web log data into HDFS and Hive using Sqoop.
- Developed and maintained a data dictionary to create metadata reports for technical and business purposes.
- Implemented the AWS cloud computing platform using S3, RDS, DynamoDB, Redshift, and Python.
- Translated business requirements into working logical and physical data models for Data warehouse, Data marts and OLAP applications.
- Involved with data analysis, primarily identifying data sets, source data, source metadata, data definitions, and data formats.
- Implemented Kafka high-level consumers to get data from Kafka partitions and move it into HDFS (see the streaming sketch at the end of this section).
- Developed a Spark Streaming application to pull data from the cloud into Hive tables.
- Wrote SQL Scripts and PL/SQL Scripts to extract data from Database to meet business requirements and for Testing Purposes.
- Involved in manipulating, cleansing, and processing data using Excel and SQL; responsible for loading, extracting, and validating client data.
- Created a sheet selector to accommodate multiple chart types (pie, bar, line, etc.) in a single dashboard using parameters.
- Developed Python scripts to automate and provide Control flow to Pig scripts.
- Designed and Developed PL/SQL procedures, functions and packages to create Summary tables.
Environment: Hadoop 3.0, Agile, HDFS, HBase 1.2, Scala, Cassandra 3.1, SQL, ETL, AWS, Sqoop 1.4, Hive 2.3, MapReduce, Pig 0.17, Oracle 12c, Oozie 4.3, Tableau, OLAP, PL/SQL, Kafka 1.0.
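Illustrative sketch only: one common way to land Kafka data in HDFS, shown here with Spark Structured Streaming rather than the classic high-level consumer API mentioned above. The broker address, topic, and paths are hypothetical placeholders (the kafka source also assumes the spark-sql-kafka connector is on the classpath).

```scala
// Sketch: subscribe to a hypothetical Kafka topic and append the raw events
// to HDFS as Parquet with checkpointing, using Structured Streaming.
import org.apache.spark.sql.SparkSession

object KafkaToHdfsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-to-hdfs-sketch")
      .getOrCreate()

    // Subscribe to a (hypothetical) Kafka topic on a placeholder broker.
    val stream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "web_logs")
      .option("startingOffsets", "latest")
      .load()
      .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)", "timestamp")

    // Continuously write to HDFS; the checkpoint directory tracks progress.
    val query = stream.writeStream
      .format("parquet")
      .option("path", "hdfs:///data/landing/web_logs")
      .option("checkpointLocation", "hdfs:///checkpoints/web_logs")
      .start()

    query.awaitTermination()
  }
}
```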
Confidential - Houston, TX
Data Modeler
Responsibilities:
- Understood and translated business needs into data models supporting underwriting workstation services.
- Created DDL scripts using Erwin and source to target mappings to bring the data from source to the warehouse.
- Developed dimensional model for Data Warehouse/OLAP applications by identifying required facts and dimensions.
- Designed star schemas for the detailed data marts and plan data marts, consisting of conformed dimensions.
- Developed logical data models and physical database design and generated database schemas using Erwin.
- Reverse Engineered the existing Stored Procedures and wrote Mapping Documents for them.
- Developed stored procedures, triggers, packages, functions, and exception handling using PL/SQL.
- Designed both 3NF data models for OLTP systems and dimensional data models.
- Worked on reporting requirements and was involved in generating reports for the data model using Crystal Reports.
- Conducted design walkthrough sessions with the Business Intelligence team to ensure that reporting requirements were met for the business.
- Validated existing Data Quality rules to ensure they meet Data Governance requirements.
- Involved in writing queries and stored procedures using MySQL and SQL Server.
- Created data masking mappings to mask the sensitive data between production and test environment.
- Developed solutions for data quality issues and collaborated with the business and IT to implement those solutions.
- Created SQL queries using TOAD and SQL Navigator, and created various database objects such as stored procedures, tables, and views.
- Created DataStage jobs (ETL processes) to continuously populate the data warehouse from different source systems.
- Worked on Metadata exchange among various proprietary systems using XML.
- Extracted data from Oracle and uploaded it to Teradata tables using the Teradata utilities FastLoad and MultiLoad.
- Designed Data Flow Diagrams, E/R Diagrams and enforced all referential integrity constraints.
- Involved in the creation, maintenance of Data Warehouse and repositories containing Metadata.
- Used the Data Vault modeling method, which was adaptable to the needs of this project.
- Created business requirement documents and integrated the requirements and underlying platform functionality.
Environment: Erwin 9.5, Teradata 14.0, Oracle 11g, SQL, PL/SQL, OLAP, OLTP, TOAD, ETL, XML, MySQL, Crystal reports 14.1x.
Confidential
Data Analyst/Data Modeler
Responsibilities:
- Worked as a Data Analyst/Modeler to generate Data Models and subsequent deployment to Enterprise Data Warehouse.
- Conducted source data analysis of various data sources and developed source-to-target mappings with business rules.
- Conducted data modeling JAD sessions and communicated data-related standards.
- Generated DDL statements for the creation of new ER/Studio objects such as tables, views, indexes, packages, and stored procedures.
- Designed and developed Oracle PL/SQL and shell scripts for data import/export, data conversion, and data cleansing.
- Performed reverse engineering on the existing data model to understand the data flow and business flows.
- Performed data profiling to identify data issues upfront and provided SQL prototypes to confirm the business logic prior to development.
- Designed the data model and data exchange metadata model for all interfaces and data exchanges.
- Developed Conceptual, Logical and Physical data models for central model consolidation.
- Provided PL/SQL queries to developers as source queries, identifying the data and the assignment logic.
- Involved in the creation, maintenance of Data Warehouse and repositories containing Metadata.
- Designed and developed use cases, activity diagrams, sequence diagrams, and Object-Oriented Design (OOD) artifacts using UML and Visio.
- Developed Data Mapping, Data Governance, Transformation and Cleansing rules for the Master Data Management.
- Developed and deployed quality T-SQL code, stored procedures, views, functions, triggers, and jobs.
- Effectively used triggers and stored procedures necessary to meet specific application's requirements.
- Designed and maintained data models for OLTP and OLAP systems, ODS, and data marts using 3NF and dimensional design.
- Created SQL scripts for database modification and performed multiple data modeling tasks at the same time under tight schedules.
- Used DataStage Designer to design and develop jobs for extracting, cleansing, transforming, integrating, and loading data into different data marts.
- Wrote complex SQL queries for validating the data against different kinds of reports generated by Business Objects XIR2.
- Performed analysis and presented results using SQL, SSIS, Excel, and Visual Basic scripts.
Environment: ER/Studio, Oracle 11g, SQL, PL/SQL, T-SQL, ODS, OLAP, OLTP, Business Objects, SSIS, MS Excel 2012