
Sr. Big Data Developer Resume


Long Beach, CA

SUMMARY

  • Over 8 years of experience in Data Analysis, Data Modeling, and Big Data/Hadoop as an applied information technology professional.
  • Proficient in Data Analysis / DW / Big Data / Hadoop / Data Integration / Master Data Management, Data Migration, Operational Data Store, and BI Reporting projects, with a deep focus on the design, development, and deployment of BI and data solutions using custom, open-source, and off-the-shelf BI tools.
  • Experienced in technical consulting and end-to-end delivery with architecture, data modeling, data governance, and the design, development, and implementation of solutions.
  • Experienced in Apache Spark, Spark Streaming, Spark SQL, and NoSQL databases like HBase, Cassandra, and MongoDB.
  • Expertise in configuring monitoring and alerting tools, such as AWS CloudWatch, according to requirements.
  • Good experience in database design using PL/SQL, SQL, ETL, and T-SQL to write stored procedures, functions, triggers, and views.
  • Expertise in data analysis, design, development, implementation, and testing using data conversions, Extraction, Transformation and Loading (ETL), and SQL Server, Oracle, and other relational and non-relational databases.
  • Experience in BI/DW solutions (ETL, OLAP, data marts), Informatica, and BI reporting tools like Tableau and QlikView; also experienced in leading teams of application, ETL, and BI developers and testers.
  • Experience in developing, supporting, and maintaining ETL (Extract, Transform and Load) processes using Talend Integration Suite.
  • Worked on Informatica Power Center tools-Designer, Repository Manager, Workflow Manager.
  • Proficiency in multiple databases like MongoDB, Cassandra, MySQL, Oracle, and MS SQL Server.
  • Extensive use of Talend ELT, database, data set, HBase, Hive, Pig, HDFS, and Sqoop components.
  • Experience in installation, configuration, and administration of Informatica Power Center 8.x/9.1 Client/Server.
  • Experienced in Hadoop Ecosystem and Big Data components including Apache Spark, Scala, Python, HDFS, MapReduce, and Kafka.
  • Expertise in reading and writing data from and to multiple source systems such as Oracle, HDFS, XML, delimited files, Excel, positional files, and CSV files.
  • Experience in Business Intelligence (BI) project development and implementation using the MicroStrategy product suite, including MicroStrategy Desktop/Developer, Web, Architect, OLAP Services, Administrator, and Intelligence Server.
  • Logical and physical database designing like Tables, Constraints, Index, etc. using Erwin, ER Studio, TOAD Modeler and SQL Modeler.
  • Good understanding and hands-on experience with AWS S3 and EC2.
  • Good experience with the programming languages Python and Scala.
  • Experience in performance tuning of Informatica (sources, mappings, targets, and sessions) and tuning SQL queries.
  • Excellent knowledge of creating reports in SAP Business Objects and Webi reports for multiple data providers.
  • Created and maintained UDB DDL for databases, table spaces, tables, views, triggers, and stored procedures. Resolved lock escalations, lock-waits, and deadlocks.
  • Experience working with business intelligence and data warehouse software, including SSAS, Pentaho, Cognos, Amazon Redshift, and Azure Data Warehouse.
  • Extensive ETL testing experience using Informatica 9x/8x, Talend, Pentaho.
  • Experience in Dimensional Data Modeling, Star/Snowflake schema, FACT & Dimension tables.
  • Experience with relational (3NF) and dimensional data architectures. Experience in leading cross-functional, culturally diverse teams to meet strategic, tactical, and operational goals and objectives.
  • Good exposure to BI reporting with MicroStrategy 8i/9i and Tableau, SQL programming, and RDBMSs: Teradata, Oracle, and SQL Server.
  • Expertise on Relational Data modeling (3NF) and Dimensional data modeling.
  • Experience in developing Map Reduce Programs using Apache Hadoop for analyzing the big data as per the requirement.
  • Practical understanding of the Data modeling (Dimensional & Relational) concepts like Star-Schema Modeling, Snowflake Schema Modeling, Fact and Dimension tables.

TECHNICAL SKILLS

Big Data Hadoop: HDFS, Hive, Pig, HBase, MapReduce, Zookeeper, Sqoop, Oozie, Flume, Scala, Akka, Kafka, Storm, MongoDB.

Data Analysis/ Modeling Tools: Erwin R6/R9, Rational System Architect, ER Studio and Oracle Designer.

Database Tools: Teradata 15.0, Oracle 12c/11g/9i, MS Access, Microsoft SQL Server 12.0

BI Tools: Crystal Reports, Microsoft Office, Microsoft Visio, Tableau, SAP Business Objects, SharePoint Portal Server

Version Tool: VSS, SVN, CVS.

Project Execution Methodologies: Agile, Ralph Kimball and Bill Inmon data warehousing methodologies, Rational Unified Process (RUP), Rapid Application Development (RAD), Joint Application Development (JAD).

ETL/Data warehouse Tools: Informatica 9.6/9.1/8.6.1/8.1, Web Intelligence, Talend, Pentaho.

Tools: OBIEE 10g/11g/12c, SAP ECC6 EHP5, GoToMeeting, DocuSign, InsideSales.com, SharePoint, MATLAB.

Cloud Platforms: AWS, Azure

Operating System: Windows, Unix, Sun Solaris

RDBMS: MS SQL Server 14.0, Teradata 15.0, Oracle 12c/11g, and MS Access.

Other Tools: TOAD, SQL PLUS, SQL LOADER, MS Project, MS Visio and MS Office, C++, UNIX, PL/SQL.

PROFESSIONAL EXPERIENCE

Confidential, Long beach, CA

Sr. Big Data Developer

Responsibilities:

  • Installed Hadoop, MapReduce, HDFS, and AWS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Implemented solutions for ingesting data from various sources and processing the data-at-rest utilizing Big Data technologies such as Hadoop, MapReduce frameworks, HBase, and Hive.
  • Developed Scala scripts and UDFs using both DataFrames/SQL and RDD/MapReduce in Spark for data aggregation and queries, and wrote data back into RDBMSs through Sqoop.
  • Generated metadata and created Talend ETL jobs and mappings to load the data warehouse and data lake.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster processing of data.
  • Used Talend for Big data Integration using Spark and Hadoop.
  • Used Microsoft Windows Server and authenticated the client-server relationship via the Kerberos protocol.
  • Experience in BI reporting with AtScale OLAP for Big Data.
  • Loaded and transformed large sets of structured, semi structured and unstructured data using Hadoop/Big Data concepts.
  • Designed and Developed Real time Stream processing Application using Spark, Kafka, Scala and Hive to perform Streaming ETL and apply Machine Learning.
  • Identified query duplication, complexity, and dependency to minimize migration efforts.
  • Technology stack: Oracle, Hortonworks HDP cluster, Attunity Visibility, AWS Cloud, and DynamoDB.
  • Experience in AWS, implementing solutions using services like EC2 and S3.
  • Used Talend to perform fast integration tasks.
  • Worked as a Hadoop consultant on MapReduce, Pig, Hive, and Sqoop.
  • Worked with Apache Hadoop ecosystem components like HDFS, Hive, Sqoop, Pig, and MapReduce.
  • Led the architecture and design of data processing, warehousing, and analytics initiatives.
  • Worked with AWS to implement client-side encryption, as DynamoDB did not support encryption at rest at the time.
  • Explored Spark to improve the performance and optimization of the existing algorithms in Hadoop using Spark Context, Spark-SQL, DataFrames, Pair RDDs, and Spark on YARN.
  • Used the DataFrame API in Scala to work with distributed collections of data organized into named columns.
  • Performed data profiling and transformation on the raw data using Pig and Python.
  • Experienced with batch processing of data sources using Apache Spark.
  • Developed predictive analytics using Apache Spark Scala APIs.
  • Involved in big data analysis using Pig and user-defined functions (UDFs).
  • Created Hive External tables and loaded the data into tables and query data using HQL.
  • Used Sqoop to efficiently transfer data between databases and HDFS and used Flume to stream the log data from servers.
  • Implemented an enterprise-grade platform (MarkLogic) for ETL from mainframe to NoSQL (Cassandra).
  • Responsible for importing log files from various sources into HDFS using Flume.
  • Worked on tools Flume, Storm and Spark.
  • Assigned names to each of the columns using the case class option in Scala.
  • Enhanced the traditional data warehouse based on a star schema, updated data models, and performed data analytics and reporting using Tableau.
  • Wrote business-analytics scripts using Hive SQL.
  • Implemented continuous integration and deployment (CI/CD) through Jenkins for Hadoop jobs.
  • Wrote Hadoop jobs for analyzing data using Hive and Pig, accessing text-format files, sequence files, and Parquet files.
  • Experience in integrating Oozie logs into a Kibana dashboard.
  • Extracted the data from MySQL, AWS RedShift into HDFS using Sqoop.
  • Developed Spark code using Scala and Spark-SQL for faster testing and data processing.
  • Imported millions of structured records from relational databases using Sqoop, processed them using Spark, and stored the data in HDFS in CSV format.
  • Developed Spark streaming application to pull data from cloud to Hive table.
  • Used Spark SQL to process the huge amount of structured data.

Environment: Spark, YARN, Hive, Pig, Scala, Python, Hadoop, AWS, DynamoDB, Kibana, EMR, JDBC, Redshift, NoSQL, Sqoop, MySQL, Star Schema, Flume, Oozie.

Confidential, Chicago, IL

Big Data Engineer

Responsibilities:

  • Gathered the business requirements from the Business Partners and Subject Matter Experts.
  • Worked on Big Data integration and analytics based on Hadoop, SOLR, Spark, Kafka, Storm, and webMethods.
  • Involved in installing Hadoop Ecosystem components.
  • Responsible for Big data initiatives and engagement including analysis, brainstorming, POC, and architecture.
  • Developed Spark-SQL for faster testing and processing of data.
  • Imported data from different sources like HDFS/HBase into Spark RDDs and developed a data pipeline using Kafka and Storm to store data in HDFS.
  • Wrote Hive join queries to fetch information from multiple tables and multiple MapReduce jobs to collect output from Hive.
  • Created high-level and detailed data models for Azure SQL Databases and NoSQL databases, as well as the use of storage for logging and data movement between the on-premises data warehouse and cloud vNets.
  • Designed both 3NF data models for ODS, OLTP, and OLAP systems and dimensional data models using star and snowflake schemas.
  • Developed and designed data integration and migration solutions in Azure.
  • Implemented Spark GraphX application to analyze guest behavior for data science segments.
  • Ingested data into Hadoop/Hive/HDFS from different data sources.
  • Used Hive to analyze data ingested into HBase via Hive-HBase integration and computed various metrics for reporting on the dashboard.
  • Involved in developing the MapReduce framework, writing queries, and scheduling MapReduce jobs.
  • Developed the code for importing and exporting data into HDFS and Hive using Sqoop.
  • Installed and configured Hadoop and responsible for maintaining cluster and managing and reviewing Hadoop log files.
  • Deployed Hadoop clusters of HDFS, Spark Clusters and Kafka clusters on virtual servers in Azure environment.
  • Wrote ETL jobs to read from web APIs using REST and HTTP calls and loaded the data into HDFS using Talend.
  • Generated comprehensive analytical reports by running SQL queries against current databases to conduct data analysis.
  • Imported data frequently from MySQL to HDFS using Sqoop.
  • Created Hive tables and worked on them using HiveQL.
  • Built Azure Data Warehouse table data sets for Power BI reports.
  • Worked on configuring and managing disaster recovery and backup of Cassandra data.
  • Utilized Oozie workflows to run Pig and Hive jobs; extracted files from MongoDB through Sqoop, placed them in HDFS, and processed them.
  • Used Flume to collect, aggregate, and store the web log data from different sources like web servers, mobile, and network devices, and pushed it to HDFS.
  • Involved in the migration of data from existing RDBMSs (Oracle and SQL Server) to Hadoop using Sqoop for processing.
  • Advised the ETL and BI teams on the design and architecture of the overall solution.
  • Worked with the Development, Storage, and Network teams on the installation and administration of MongoDB in the IT enterprise environment.
  • Developed Talend ESB services and deployed them on ESB servers on different instances.
  • Designed and implemented the MongoDB schema.
  • Effectively used Informatica parameter files for defining mapping variables, workflow variables, FTP connections, and relational connections.
  • Finalized the naming standards for data elements and ETL jobs and created a data dictionary for metadata management.
  • Produced PL/SQL statements and stored procedures in DB2 for extracting as well as writing data.
  • Wrote and executed SQL queries to verify that data had been moved from the transactional system to the DSS, data warehouse, and data mart reporting systems in accordance with requirements.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs for data cleaning and preprocessing.

Environment: Pig, Sqoop, Kafka, Apache Cassandra, Azure, Oozie, Impala, Flume, Apache Hadoop, HDFS, Hive, MapReduce, Zookeeper, MySQL, SQL Server, SQL Server Analysis Services, Oracle 12c, Eclipse, DynamoDB, PL/S

Confidential, Auburn Hills, MI

Sr. Data Analyst / Modeler

Responsibilities:

  • Analyzed the business requirements by dividing them into subject areas and understood the data flow within the organization.
  • Involved in reviewing business requirements and analyzing data sources from Excel/Oracle/SQL Server for the design, development, testing, and production rollover of reporting and analysis projects within Tableau Desktop.
  • Worked very closely with the Data Architects and DBA team to implement data model changes in the database in all environments.
  • Developed a data mart for the base data in Star Schema and Snowflake Schema; involved in developing the data warehouse for the database.
  • Developed enhancements to the MongoDB architecture to improve performance and scalability.
  • Created DDL scripts for implementing data modeling changes. Created Erwin reports in HTML and RTF formats depending upon the requirement, published the data model in the model mart, created naming-convention files, and coordinated with DBAs to apply the data model changes.
  • Developed predictive analytics using Apache Spark Scala APIs.
  • Assigned names to each of the columns using the case class option in Scala.
  • Experienced with batch processing of data sources using Apache Spark.
  • Worked on Unit Testing for three reports and created SQL Test Scripts for each report as required.
  • Involved in developing the MapReduce framework, writing queries, and scheduling MapReduce jobs.
  • Involved in data modeling to define the table structure in the MDM system.
  • Implemented an enterprise-grade platform (MarkLogic) for ETL from mainframe to NoSQL (Cassandra).
  • Designed and developed a data mapping application for 30+ disparate source systems (COBOL, MS SQL Server, Oracle, and mainframe DB2) using MS Access and UNIX Korn shell scripts.
  • Extensively used Erwin as the main tool for modeling, along with Visio.
  • Installed, configured, and administered JBoss 4.0 servers in various environments.
  • Established and maintained comprehensive data model documentation including detailed descriptions of business entities, attributes, and data relationships.
  • Worked on the Metadata Repository (MRM) to keep definitions and mapping rules up to date.
  • Developed the conceptual data models and logical data models and transformed them into schemas using Erwin.
  • Performed data cleaning and data manipulation activities using the NZSQL utility.
  • Analyzed the physical data model to understand the relationships between existing tables.
  • Created a list of domains in Erwin and worked on building up the data dictionary for the company.

Environment: Erwin r8.2, Oracle SQL Developer, Oracle Data Modeler, Teradata 14, SSIS, Business Objects, SQL Server, ER/Studio, Windows, MS Excel.

Confidential, Orlando, FL

Analyst/Modeler

Responsibilities:

  • Worked with SQL, Python, Oracle PL/SQL, stored procedures, triggers, and SQL queries, and loaded data into the Data Warehouse/Data Marts.
  • Developed normalized logical and physical database models to design an OLTP system for reference and balance data conformance using the ER/Studio modeling tool.
  • Developed the logical data models and physical data models that capture current-state/future-state data elements and data flows using ER/Studio.
  • Delivered dimensional data models using ER/Studio to bring the Employee and Facilities domain data into the Oracle data warehouse.
  • Performed analysis of the existing source systems (transaction database).
  • Involved in maintaining and updating the Metadata Repository with details on the nature and use of applications/data transformations to facilitate impact analysis.
  • Created DDL scripts using ER/Studio and source-to-target mappings to bring the data from source to the warehouse.
  • Designed the ER diagrams, logical model (relationships, cardinality, attributes, and candidate keys), and physical database (capacity planning, object creation, and aggregation strategies) for Oracle and Teradata.
  • Worked on importing and cleansing high-volume data from various sources like Teradata, Oracle, flat files, and MS SQL Server.
  • Reverse-engineered DB2 databases and then forward-engineered them to Teradata using ER/Studio.
  • Part of the team conducting logical data analysis and data modeling JAD sessions; communicated data-related standards.
  • Involved in meetings with SMEs (subject matter experts) to analyze the multiple sources.
  • Involved in writing and optimizing SQL queries in Teradata.
  • Worked on importing and cleansing high-volume data from various sources like Teradata, Oracle, flat files, and SQL Server 2005.
  • Wrote and executed SQL queries to verify that data had been moved from the transactional system to the DSS, data warehouse, and data mart reporting systems in accordance with requirements.
  • Worked extensively on ER Studio for multiple Operations across Atlas Copco in both OLAP and OLTP applications.
  • Used forward engineering to create a physical data model with DDL that best suits the requirements from the logical data model.
  • Worked with the DBA to convert logical data models to physical data models for implementation.
  • Involved in preparing the design flow for the DataStage objects to pull the data from various upstream applications, perform the required transformations, and load the data into various downstream applications.

Environment: Business Objects, ER/Studio, Oracle SQL Developer, SQL Server 2008, Teradata, SSIS, Windows, MS Excel.

Confidential

Data Analyst

Responsibilities:

  • Responsible for the development and maintenance of logical and physical data models, along with corresponding metadata, to support applications.
  • Involved in data analysis, data validation, data cleansing, data verification, and identifying data mismatches.
  • Worked with business users during requirements gathering and prepared conceptual, logical, and physical data models.
  • Created conceptual, logical, and physical data models using best practices and company standards to ensure high data quality and reduced redundancy.
  • Generated ad-hoc SQL queries using joins, database connections, and transformation rules to fetch data from legacy SQL Server database systems.
  • Wrote PL/SQL statements, stored procedures, and triggers in DB2 for extracting as well as writing data.
  • The project involved production, test, and development administration and support for the client's existing DB2 UDB platform running DB2 UDB v9.1 and v8.2 on servers under various operating systems.
  • Attended and participated in information- and requirements-gathering sessions.
  • Translated business requirements into working logical and physical data models for the data warehouse, data marts, and OLAP applications.
  • Designed star and snowflake data models for the Enterprise Data Warehouse using Erwin.
  • Created and maintained the Logical Data Model (LDM) for the project, including documentation of all entities, attributes, data relationships, primary and foreign key structures, allowed values, codes, business rules, glossary terms, etc.
  • Validated and updated the appropriate LDMs to process mappings, screen designs, use cases, the business object model, and the system object model as they evolved and changed.
  • Excellent knowledge and experience in technical design and documentation.
  • Used forward engineering to create a physical data model with DDL that best suits the requirements from the logical data model.
  • Involved in preparing the design flow for the DataStage objects to pull the data from various upstream applications, perform the required transformations, and load the data into various downstream applications.
  • Performed logical data modeling and physical data modeling (including reverse engineering) using the Erwin data modeling tool.

Environment: Oracle 9i, PL/SQL, Solaris 9/10, Windows Server, NZSQL, Erwin, Toad, Informatica, IBM OS/390 (V6.0), DB2 V7.
