
Senior Big Data Engineer Resume


OH

SUMMARY

  • 11+ years of experience in BNFS (banking and financial services), including hands-on experience in Big Data and data warehouse solutions and ETL design/implementation.
  • Good understanding of and hands-on experience with Big Data solutions, including Hadoop, HDFS, Hive, HBase, Spark and Sqoop.
  • Implemented a reusable framework in Spark, Python and Sqoop to handle dynamic metadata changes and load data into RDBMS databases.
  • Designed and implemented a reusable ETL framework for easy maintenance.
  • Designed and developed multiple applications in Spark using Python and Scala.
  • Good knowledge of and hands-on experience with Jenkins and GitHub.
  • Executed multiple end-to-end enterprise data warehousing projects.
  • Excellent ETL/BI SDLC experience, including development and unit testing, with a strong understanding of data warehousing concepts, data analysis, data warehouse architectures and database concepts.
  • Excellent knowledge of development in the Big Data Hadoop ecosystem: MapReduce, Hive, Oozie workflows and Sqoop.
  • Strong knowledge of Big Data management using Hadoop, HDFS, Pig Latin, Hive, PySpark, HBase, Sqoop, Kafka, Linux and Tableau.
  • Practical exposure to and good knowledge of implementing batch and real-time analytics on large datasets using Scala.
  • Experience in designing and developing Hadoop applications for automated business analytics in retail banking.
  • Experience in loading multiple large datasets into HDFS and processing them using Hive.
  • Wrote PySpark and SQL code to process source data and load it into Hive tables, inferring the schema according to business requirements (a minimal sketch follows this summary).
  • Experience in manipulating/analyzing large datasets and finding patterns and insights within structured data.
  • Exposure to implementing and deploying web applications on AWS EC2 instances.
  • Good knowledge of Python scripting and of implementing data validation rules using the Pandas and NumPy libraries.
  • Knowledgeable Oracle developer skilled in data collection, analysis and management.
  • Experience with data flow diagrams, data dictionaries, database normalization techniques and entity-relationship model design.
  • Effectively used table functions, table partitioning, collections, analytical functions, materialized views and indexes (B-tree, bitmap and function-based).
  • Strong exposure to finance product implementations (Moody’s, Axiom) covering financial products such as swaps, futures and securities finance.
  • Strong knowledge of the OBIEE and Cognos reporting tools, including developing and testing RPD changes such as creating new subject areas, adding measures and creating alias tables using the Administration Tool.
  • Excellent communication, interpersonal, technical, analytical and problem-solving skills.
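
The PySpark-and-Hive loading pattern referenced above can be illustrated with a minimal sketch; the file path and table name below are hypothetical, not actual production objects.

    # Minimal PySpark sketch: read a delimited source file with an inferred schema
    # and load it into a Hive table. Path and table name are illustrative only.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("source-to-hive-load")
             .enableHiveSupport()
             .getOrCreate())

    # inferSchema=True lets Spark derive column types from the data itself
    src_df = (spark.read
              .option("header", "true")
              .option("inferSchema", "true")
              .csv("/landing/transactions/"))

    # Write the result as a managed Hive table (overwrite for a full reload)
    src_df.write.mode("overwrite").saveAsTable("staging.transactions")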

TECHNICAL SKILLS

Big Data Tools: Hadoop, Spark, Hive, Sqoop, Yarn, Map Reduce

Programming Languages: Scala, Spark SQL, Hive, MapReduce

Databases: Oracle 11g/12c, Teradata, Hive, SQL Server 2016, MySQL, DB2, PL/SQL

ETL Tools: DataStage, AWS

Operating Systems: UNIX, Linux, Windows

Others: CA7 and Autosys scheduling, database architecture, performance tuning

PROFESSIONAL EXPERIENCE

Confidential, OH

Senior Big Data Engineer

Responsibilities:

  • Responsible for loading data from legacy databases (Teradata, Oracle, SQL Server) into HDFS as Hive tables using Sqoop jobs.
  • Created PySpark transformations on transactional tables such as DPI, ATB and Teller feeding the COR posting systems.
  • Worked extensively with the Spark Python APIs to manipulate data using broadcast joins and sort-merge joins (see the first sketch after this list).
  • Implemented Sqoop for large dataset transfers between Hadoop and RDBMS.
  • Experienced in optimizing Hive queries by tuning configuration parameters.
  • Involved in creating Python scripts for a centralized framework used by all Big Data applications across the bank.
  • Involved in code integration using GitHub and deployment to edge nodes using UDeploy.
  • Prepared CA7 DOC05 and agent CA7 job files in the Confidential standard format to run the scripts periodically.
  • Prepared the low-level design document for every transformation I was responsible for building.
  • Performed data ingestion into the Hadoop system from legacy source systems such as Teradata, Oracle and SQL Server, and processed the datasets using Hive and Python scripts scheduled through CA7.
  • Ensured Sqoop jobs and Oozie workflow configurations were working as expected.
  • Used the Parquet file format to utilize cluster space effectively and retrieve data faster while running jobs; as a columnar format, Parquet compresses well and suits read-heavy operations.
  • Peer-reviewed and approved code in GitHub for production deployment of changes defined in change requests.
  • Created data comparison scripts and validation rules using Python libraries such as Pandas, NumPy and NLTK to match source and target data (see the second sketch after this list).
  • Tested legacy data (Oracle 12c) thoroughly and performed unit, regression and user acceptance testing.
  • Participated in brainstorming sessions with business users to understand requirements thoroughly and perform validations.
  • Participated in scrum calls, coordinated with the offshore team and ensured deliverables were not impacted.
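
The broadcast/sort-merge join work and the Parquet storage choice described above follow the pattern sketched here; the table names and output path are hypothetical.

    # Minimal PySpark sketch: broadcast join of a small dimension against a large
    # fact table, then a Parquet write for read-heavy downstream jobs.
    # Table names and paths are illustrative, not the actual bank objects.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import broadcast

    spark = (SparkSession.builder
             .appName("posting-transform")
             .enableHiveSupport()
             .getOrCreate())

    transactions = spark.table("staging.teller_transactions")  # large fact table
    branches = spark.table("reference.branch_master")          # small dimension

    # Broadcasting the small table avoids shuffling the large one; without the
    # hint Spark typically falls back to a sort-merge join on the join key.
    enriched = transactions.join(broadcast(branches), on="branch_id", how="left")

    # Parquet is columnar and compresses well, so reads of a few columns stay fast
    enriched.write.mode("overwrite").parquet("/data/cor/posting/enriched/")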
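
The source-to-target comparison scripts mentioned above can be sketched with pandas; the key columns, amount column and file names here are assumptions for illustration.

    # Minimal pandas sketch of a source-vs-target reconciliation.
    # Keys, columns and file names are hypothetical.
    import pandas as pd

    source = pd.read_csv("source_extract.csv")
    target = pd.read_csv("target_extract.csv")

    keys = ["account_id", "posting_date"]

    # An outer merge with an indicator flags rows present on only one side
    merged = source.merge(target, on=keys, how="outer",
                          suffixes=("_src", "_tgt"), indicator=True)
    unmatched = merged[merged["_merge"] != "both"]

    # For matched rows, compare the amount columns within a small tolerance
    matched = merged[merged["_merge"] == "both"]
    mismatched = matched[(matched["amount_src"] - matched["amount_tgt"]).abs() > 0.01]

    print(f"{len(unmatched)} unmatched rows, {len(mismatched)} amount mismatches")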

Environment: Spark, Scala, Hive, HBase, Airflow, Oracle, MySQL, Unix, Python, Jenkins, GitHub, Bitbucket.

Confidential, MA

ETL Consultant/ Big data engineer

Responsibilities:

  • Analyzed the requirements for CCAR regulatory report data issued by the federal government.
  • Designed ETL scripts and performed legacy data loads from upstream systems into the financial data warehouse, ensuring the ETL flow was maintained across applications (Oracle 11g/12c and Exalytics) for assets, liabilities and counterparty information using Moody’s financial analytics tool.
  • Developed in Scala in a fast-paced agile environment, building features planned as part of a roadmap.
  • Designed complex algorithms and wrote clear, concise code that runs at scale to build data-driven features used within the application.
  • Participated in the technical design of solutions and in code reviews.
  • Built a Spark/Scala framework that dynamically applies source metadata changes to the target database without manual intervention across different sources (see the sketch after this list).
  • Worked on Sqoop to ingest and retrieve data from databases such as Oracle and MySQL.
  • Experience in UNIX scripting, Git, Jenkins, CI/CD pipelines, and data warehouse and data lake formation.
  • Created Autosys JIL jobs to ensure smooth data loads from source to target.
  • Performed ETL validations (unit testing) to ensure credit risk measures (EAD, RWA) for all financial products were calculated exactly, without deviations.
  • Prepared and executed Oracle procedures/functions and SQL queries to fetch data from Oracle 11g, at times through stored procedures.
  • Used explain plans, Oracle hints and new indexes to improve SQL performance.
  • Worked on performance tuning.
  • Involved in all existing releases and helped the team resolve critical issues.
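
The dynamic-metadata framework above was written in Scala; the PySpark sketch below only illustrates the general idea of adding new source columns to a target table before appending, with hypothetical table names and the assumption that the source feed carries every target column.

    # PySpark sketch of absorbing new source columns without manual intervention.
    # The production framework described above was Scala; names here are hypothetical.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("schema-sync")
             .enableHiveSupport()
             .getOrCreate())

    source_df = spark.table("staging.counterparty_feed")
    target_cols = {f.name for f in spark.table("fdw.counterparty").schema.fields}

    # Add any column that appeared at the source but is missing on the target
    for field in source_df.schema.fields:
        if field.name not in target_cols:
            spark.sql(
                f"ALTER TABLE fdw.counterparty "
                f"ADD COLUMNS ({field.name} {field.dataType.simpleString()})"
            )

    # Align column order to the (possibly extended) target, then append;
    # assumes the source feed supplies every target column
    target_order = [f.name for f in spark.table("fdw.counterparty").schema.fields]
    source_df.select(*target_order).write.mode("append").insertInto("fdw.counterparty")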

Environment: Spark, Scala, Hive, Oracle, MySQL, Unix, Python, Jenkins, GitHub, Moody’s financial analytics.

Confidential, NC

Senior ETL/BI Developer

Responsibilities:

  • Migrated Insight reports (Crystal Reports) to OBIEE using the Oracle Business Intelligence tool and Oracle BI Publisher.
  • Understood the client requirements, business scenarios and the application using the functional specification document.
  • Developed various ad hoc reports per the requirements, including table, chart, pivot, narrative and view selector reports, using the Answers component for testing purposes.
  • Implemented prompts to provide dynamic filter conditions to end users.
  • Responsible for creating and maintaining the RPD and for creating dimensional hierarchies and level-based measures.
  • Responsible for designing customized interactive dashboards in OBIEE using drill-downs, guided navigation, prompts, filters and variables.
  • Responsible for creating RTF templates in BI Publisher and integrating reports with OBIEE dashboards.
  • Involved in unit testing of the presentation layer and organized data into related subject areas that are easy for end users to use as a basis for reporting.
  • Tested roles by granting users appropriate access to subject areas and reports.
  • Good experience customizing the OBIEE dashboard user interface, CSS styles and image appearance by configuring skins and styles.
  • Worked on dashboard customization with radio buttons, hidden prompts, saved analytics requests and dashboard objects such as text, images and folders.

Environment: OBIEE 11g/12c, OBI Publisher, Toad, Oracle 11g, DMExpress, Bugzilla, ALM

Confidential

ETL developer

Responsibilities:

  • Analyzed the requirements for CCAR regulatory report data issued by the federal government.
  • Designed ETL scripts in DataStage and performed legacy data loads from upstream systems into the financial data warehouse, ensuring the ETL flow was maintained across applications (Oracle 11g/12c and Exalytics) for assets, liabilities and counterparty information using Moody’s financial analytics tool.
  • Participated in the technical design of solutions and in code reviews.
  • Experience in UNIX scripting, Git, Jenkins, CI/CD pipelines, and data warehouse and data lake formation.
  • Created Autosys JIL jobs to ensure smooth data loads from source to target.
  • Performed ETL validations (unit testing) to ensure credit risk measures (EAD, RWA) for all financial products were calculated exactly, without deviations.
  • Prepared and executed Oracle procedures/functions and SQL queries to fetch data from Oracle 11g, at times through stored procedures.
  • Used explain plans, Oracle hints and new indexes to improve SQL performance (see the sketch after this list).
  • Worked on performance tuning.
  • Involved in all existing releases and helped the team resolve critical issues.
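
The explain-plan work described above can be sketched from Python; the connection string, table and query below are assumptions for illustration, not the actual application objects.

    # Minimal cx_Oracle sketch: generate and print an execution plan for a query,
    # e.g. before and after adding an index or an optimizer hint.
    # Credentials, DSN, table and predicate are hypothetical.
    import cx_Oracle

    conn = cx_Oracle.connect("fdw_user", "password", "db-host/FDWPRD")
    cur = conn.cursor()

    # Ask the optimizer for a plan without executing the statement; a hint such as
    # /*+ INDEX(e EXPOSURES_CP_IDX) */ could be added to compare candidate plans
    cur.execute("""
        EXPLAIN PLAN FOR
        SELECT e.* FROM exposures e WHERE e.counterparty_id = 'CP0001'
    """)

    # DBMS_XPLAN.DISPLAY reads the plan back from the default plan table
    for (line,) in cur.execute("SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY())"):
        print(line)

    cur.close()
    conn.close()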

Environment: DataStage, Oracle, MySQL, Unix, Python, SVN, Moody’s financial analytics.
