Big Data Developer Resume
Los Angeles, CA
SUMMARY:
- Sr. BI Developer/Architect with ten (10) years of experience in the design and development of data warehouse/big data applications using leading industry tools, working with Fortune 500 firms such as Apple, Confidential, Confidential and Confidential
- Well-rounded experience in ETL, Hadoop, Spark, BI reporting, data visualization, data warehousing, and server administration
- Strong knowledge of data integration concepts such as Dimensional Modeling, Data Quality, Streaming, CDC, Master Data Management (MDM) and ETL/ELT processes
- Good understanding of big data concepts such as Hadoop, MapReduce, YARN, Spark, RDDs, DataFrames and Datasets
- Proficient in Oracle, SQL Server, SQL, PL/SQL, T-SQL, Teradata and in managing very large databases
- Substantial experience in ETL, BI reporting and data visualization (Tableau and Power BI)
- Experience writing in-house UNIX shell scripts for Hadoop and big data development
- Skilled in performance tuning of ETL mappings, sessions, databases and SQL queries
- Strong data modeling skills with experience developing complex data models using Unified Modeling Language (UML), ER diagrams and conceptual/physical diagrams
- Recognized for superior performance with awards such as IBM Service Excellence, IBM Manager’s Choice and the Amex Chairman’s Award
TECHNICAL SKILLS:
ETL Tools: Informatica Power Center, Microsoft SSIS, IBM DataStage
Big Data: Hadoop, Sqoop, Flume, Hive, Spark, Pig, Kafka, Talend
Database: Oracle, SQL Server 2016, Teradata, Netezza, MS Access
Business Intelligence: MDM, Change Data Capture (CDC), Metadata, Data Cleansing, OLAP, OLTP, SCD, SOA, Web Services
Tools: Ambari, DBeaver, SQL Developer, TOAD, Erwin, Visio, TortoiseSVN
Operating Systems: Windows Server, UNIX/Linux (Red Hat, Solaris, AIX)
Languages: UNIX shell scripting, SQL, PL/SQL, T-SQL, Scala
PROFESSIONAL EXPERIENCE:
Big Data Developer
Confidential, Los Angeles, CA
Responsibilities:
- Worked with the project managers, business leaders and technical teams to finalize the requirements and create solution designs and architecture
- Architected the data lake by cataloging the source data, analyzing entity relationships, and aligning the design with performance, scheduling and reporting requirements
- Designed and developed Hadoop ETL solutions to move data from legacy systems to the data lake using big data tools such as Sqoop, Hive, Spark, HDFS, Informatica and Talend
- Developed algorithms and scripts in Hadoop to import data from source systems and persist it in HDFS (Hadoop Distributed File System) for staging
- Developed Informatica and Talend mappings and workflows to load data into the DW environment and scheduled them for daily and weekly data needs
- Developed Hive logic and stored procedures to implement business rules and perform data transformations
- Designed and developed Spark jobs in Scala and Spark SQL for high-speed data processing to meet critical business requirements (see the sketch after this list)
- Developed UNIX shell scripts to migrate the existing Informatica workflows to Hadoop
- Developed Hive scripts to perform transformations on the data and load it to target systems for use by data analysts in reporting
- Developed PoC implementations for Kafka streaming of web data and migrated sample workflows to Informatica Big Data Edition (see the streaming sketch below)
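A minimal Scala/Spark sketch of the staging-to-lake load described above. The staging path, table and column names (hdfs:///staging/orders/, lake.orders, order_id, order_date) are hypothetical, and the real business rules were project-specific:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object StageToLake {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("StageToLake")
      .enableHiveSupport() // write managed Hive tables in the lake
      .getOrCreate()

    // Read the raw extract that Sqoop landed in the HDFS staging zone
    val staged = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("hdfs:///staging/orders/") // placeholder staging path

    // Apply simple business-rule transformations before the lake load
    val cleaned = staged
      .filter(col("order_id").isNotNull)
      .withColumn("order_date", to_date(col("order_date"), "yyyy-MM-dd"))
      .withColumn("load_ts", current_timestamp())

    // Persist into a partitioned Hive table in the data lake
    cleaned.write
      .mode("append")
      .partitionBy("order_date")
      .format("parquet")
      .saveAsTable("lake.orders") // placeholder lake table

    spark.stop()
  }
}
```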
Environment: Hortonworks, Sqoop, Hive, Informatica, Spark, T-SQL, Talend, UNIX, Ambari, Oozie
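For the Kafka streaming PoC above, a hedged Spark Structured Streaming sketch in Scala (requires the spark-sql-kafka connector on the classpath); the broker address, topic and HDFS paths are placeholders rather than project values:

```scala
import org.apache.spark.sql.SparkSession

object WebClickStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("WebClickStream")
      .getOrCreate()

    // Subscribe to a hypothetical web-events topic
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092") // placeholder broker
      .option("subscribe", "web_clicks")                 // placeholder topic
      .load()
      .selectExpr("CAST(value AS STRING) AS click_json", "timestamp")

    // Land the raw stream in HDFS for downstream Hive/Informatica processing
    events.writeStream
      .format("parquet")
      .option("path", "hdfs:///lake/web_clicks/")
      .option("checkpointLocation", "hdfs:///checkpoints/web_clicks/")
      .start()
      .awaitTermination()
  }
}
```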
BI Architect
Confidential, San Francisco, CA
Responsibilities:
- Performed extensive data analysis and coordinated with the client teams to develop data models
- Developed the ETL/SQL code to load data from the raw staging relational databases and ingested it into the Hadoop environment using Sqoop
- Coordinated with other architects to develop data acquisition, data cleansing and data quality plans
- Developed the ETL architecture and data models with the data warehouse architect
- Developed Hive scripts to perform transformations and exported the data back to the relational databases with Sqoop (see the sketch after this list)
- Wrote scripts to automate data load and performed data transformation operations
- Wrote backend scripts in SQL and modified transformations in Informatica to fine tune the performance
- Coded Informatica mappings using Aggregator, Joiner and other transformations
- Wrote test cases, performed unit and integration testing and deployed mappings to the production environment
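The Hive-to-database export above was done with Sqoop; purely as an illustrative Scala equivalent, the same transformation and relational write can be sketched with Spark SQL and a JDBC sink (the table names, connection URL and credentials are placeholders):

```scala
import org.apache.spark.sql.SparkSession

object HiveToOracle {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("HiveToOracle")
      .enableHiveSupport()
      .getOrCreate()

    // HiveQL aggregation; table and column names are illustrative
    val monthly = spark.sql(
      """SELECT customer_id, SUM(amount) AS total_amount
        |FROM lake.transactions
        |GROUP BY customer_id""".stripMargin)

    // The project used Sqoop export for this step; a Spark JDBC write is
    // the equivalent expressed in Scala
    monthly.write
      .mode("overwrite")
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL") // placeholder URL
      .option("dbtable", "RPT.MONTHLY_TOTALS")               // placeholder table
      .option("user", "etl_user")                            // placeholder credentials
      .option("password", "change_me")
      .save()

    spark.stop()
  }
}
```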
Environment: Informatica, Shell Scripting, Sqoop, Hive, Oracle, PL/SQL, Tableau, UNIX
Sr. BI Developer
Confidential, Washington, DC
Responsibilities:
- Extracted and profiled data from the customer, commercial loans and retail source systems that would provide the data needed for the loan reporting requirements
- Determined criteria and wrote scripts for technical and business data quality checks, error handling and reject reporting during the data quality stage
- Provided input on the design of the physical and logical architecture, source-to-target mappings of the data warehouse and the ETL process
- Worked with the data administrator to determine the data and sizing requirements
- Performed data transformations such as calculations, data splitting and aggregations using Joiner, Lookup and Aggregator transformations
- Designed and developed the change data capture process based on the business and regulatory requirements (see the sketch after this list)
- Loaded operational data from the legacy systems into staging, then transformed and loaded it into the enterprise data warehouse tables using the Informatica ETL process
- Developed Informatica objects (mappings, mapplets, complex transformations and sessions) and scheduled workflows based on the conceptual and physical design documents
- Developed the ETL job workflows with exception handling and rollback framework
- Developed complex SQL and Stored Procedures both at source database and in mappings to implement business logic
- Performance-tuned design elements using advanced Informatica and UNIX techniques; debugged the Informatica mappings and validated the data in the target tables after each load
- Wrote UAT scripts, developed test cases and performed stress and data quality testing
- Participated in post implementation activities like handover, knowledge transfer and production support
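The change data capture process itself was implemented in Informatica; the sketch below is only a conceptual Scala/Spark illustration of delta detection against a staging snapshot, with hypothetical loan tables and columns:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._

object LoanCdc {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("LoanCdc")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical tables: today's staging snapshot vs. the warehouse image
    val staged = spark.table("stg.loans")
    val target = spark.table("dw.loans")

    // Hash the non-key columns so changed rows are found with one comparison
    def withHash(df: DataFrame): DataFrame =
      df.withColumn("row_hash", sha2(concat_ws("|", col("balance"), col("status")), 256))

    val s = withHash(staged)
    val t = withHash(target).select(col("loan_id"), col("row_hash").as("tgt_hash"))

    // Rows whose key is new, or whose hash differs, form the change set
    val changes = s.join(t, Seq("loan_id"), "left")
      .filter(col("tgt_hash").isNull || col("tgt_hash") =!= col("row_hash"))
      .drop("tgt_hash")

    changes.write.mode("append").saveAsTable("dw.loans_delta") // placeholder delta table
    spark.stop()
  }
}
```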
Environment: Informatica, Cognos, Oracle 11g, TOAD, UNIX, Teradata, Qlikview, Visio, SVN
BI Developer
Confidential, Phoenix, AZ
Responsibilities:
- Extracted data from five operational databases containing almost two terabytes of data, loaded it into the data warehouse and subsequently populated seven data marts
- Created complex transformations, mappings, mapplets and reusable components, and scheduled workflows based on the business logic and rules
- Developed ETL job workflows with QC reporting and analysis frameworks
- Developed Informatica mappings, lookups, reusable components, sessions and workflows (on the ETL side) as per the design documents and communication
- Designed metadata tables in the source staging area to profile data and perform impact analysis
- Performed query tuning and optimizer setting configuration on the Oracle database (rule- and cost-based)
- Created cardinalities, contexts, joins and aliases to resolve loops and checked data integrity
- Debugged issues, fixed critical bugs and assisted in code deployments to QA and production
- Coordinated with the external teams to assure the quality of master data and conduct UAT/integration testing
- Implemented PowerExchange CDC for mainframes to load large data modules into the data warehouse and capture changed data
- Designed and developed exception handling, data standardization procedures and quality assurance controls (see the sketch after this list)
- Used Cognos for analysis and presentation layers
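A minimal Scala/Spark sketch of the exception handling and standardization pattern described above, assuming hypothetical staging, target and reject tables; the project's actual controls were built in Informatica:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object QualityGate {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("QualityGate")
      .enableHiveSupport()
      .getOrCreate()

    val incoming = spark.table("stg.customers") // placeholder staging table

    // Standardize common fields before applying the quality rules
    val standardized = incoming
      .withColumn("email", lower(trim(col("email"))))
      .withColumn("state", upper(col("state")))

    // Basic quality rules; real projects would drive these from metadata
    val isValid = col("customer_id").isNotNull && col("email").contains("@")

    // Valid rows continue to the warehouse; failures go to a rejects table
    standardized.filter(isValid).write.mode("append").saveAsTable("dw.customers")
    standardized.filter(!isValid)
      .withColumn("reject_reason", lit("missing key or malformed email"))
      .write.mode("append").saveAsTable("audit.customer_rejects")

    spark.stop()
  }
}
```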
Environment: Informatica, Tableau, Oracle 10g, SQL Developer, Cognos, Windows Server and Teradata
ETL Developer
Confidential, Bloomington, IL
Environment: Informatica, Oracle, DB2, SAS, Shell Scripting, TOAD, SQL*Plus, Scheduler
BI Developer
Confidential, Richardson, TX
Environment: Microsoft BI Stack (SSIS, SSRS), Informatica, Teradata, Oracle, Linux/UNIX Shell Scripting, TOAD, SQL*Plus, Control-M scheduler
Business Intelligence Analyst
Confidential, Phoenix, AZ
Environment: IBM Datastage, UNIX, Microsoft Visio, Cognos and Oracle
BI Programmer Analyst
Confidential
Environment: Informatica, MS SQL Server, UNIX, Microsoft Visio, Lotus Notes
