
Big Data Architect Resume


SUMMARY:

  • A creative, hands-on technical leader specializing in Big Data applications, Cloud Computing, and highly scalable systems.
  • A collaborative engineering professional with substantial experience designing and executing solutions for complex business problems involving large-scale data warehousing, real-time analytics, and reporting solutions.
  • Strong development background in database and data warehouse technologies using a range of tools and programming languages.
  • Hadoop architecture experience with very good exposure to Hadoop technologies such as HDFS, MapReduce, Hive, HBase, Sqoop, HCatalog, Pig, Zookeeper, Flume, and Mahout, along with good experience with AWS services.
  • 10 years of IT experience in data warehousing, including business requirements analysis, design, development, testing, and implementation of Business Intelligence applications with RDBMS, data warehouse/data mart, ETL, OLAP, and client/server environments, as well as project management and technical/strategic consulting in data analysis and data management solutions.
  • Strong experience in data warehousing and ETL processes using Informatica Power Center (Mapping Designer, Repository Manager, Workflow Manager, and Workflow Monitor), data marts, OLAP, and OLTP.
  • Able to integrate state-of-the-art Big Data technologies into the overall architecture and lead a team of developers through the construction, testing and implementation phase.
  • Experience architecting highly scalable, distributed systems using different open source tools as well as designing and optimizing large, multi-terabyte data warehouses.
  • Proven history of building large-scale data processing systems and serving as an expert in data warehousing solutions while working with a variety of database technologies.
  • Good working experience with Hadoop Technologies like HDFS, Hive, Pig, Impala, YARN, Spark, NoSQL, Python and Sqoop.
  • Hands-on experience with AWS S3, EC2, IAM, VPC, Route53, Glacier, DynamoDB, CloudWatch, and Redshift services.
  • Experience designing, reviewing, implementing, and optimizing data transformation processes in the Hadoop and Informatica ecosystems.
  • Strong experience in Informatica ETL, data warehousing, data cleansing, transformation, and loading in complex, high-volume environments. Extensive programming skills in Informatica ETL, Oracle PL/SQL, Oracle, and UNIX shell scripting. Worked on Oracle Database (12c, 11g, 10g, 9i, 8x) in both OLTP and OLAP environments for high-volume database instances, and on NoSQL databases such as Cassandra and MongoDB.

TECHNICAL SKILLS:

  • Big Data: Hadoop, Hive, Pig, Flume, Storm, Spark, Splunk
  • Languages: Perl, JavaScript, SQL (T-SQL, PL/SQL, pgSQL), Unix Shell, C, C++, Unix Scripting
  • Databases: HBase, Cassandra, MongoDB, Amazon DynamoDB, Oracle, Postgres, MySQL/MSQL
  • General: Amazon AWS Stack, Database Development, Data Warehousing, Data Modelling, ETL Informatica, Cloud Computing

PROFESSIONAL EXPERIENCE:

Confidential

Big Data Architect

Responsibilities:

  • Created scalable data models for processing customer- and product-related data.
  • Reviewed the documentation and business context behind the ECRM project.
  • Worked with developers and data modelers to build microservices on the Amazon ecosystem (EC2) to meet business needs.
  • Architected the implementation of Amazon RDS for migrating customer-related data services, DynamoDB for product-related services, and a data warehouse hosted on Amazon Redshift for processing massively scalable web analytics reports.
  • Developed ingestion methods suited to business needs to move data into S3 storage.
  • Designed load-and-transform methodologies to move large sets of structured data from Teradata into HDFS using Sqoop.
  • Involved in designing and coding a RESTful service to support real-time queries on data in Elasticsearch; used the elasticsearch-head and Marvel plugins (a query sketch follows this list).
  • Analyzed and evaluated the existing architecture, and designed, configured, and migrated complex network architectures to the AWS public cloud on EC2.
  • Laid out a plan to reverse engineer legacy PL/SQL code into Hadoop MapReduce.
  • Analyzed OLTP data, application logs, clickstream data, and operations metrics to extract the necessary information from the raw data.
  • Programmed ETL functions between Oracle and Amazon Redshift (a staging-and-COPY sketch follows this list).
  • Set up AWS CloudTrail to log all API calls to Amazon S3 across various instances.
  • Moved reporting to Amazon Redshift and rewrote reports and dashboards in Tableau.
  • Handled event triggering on S3 using AWS Lambda functions (see the handler sketch after this list).
  • Designed and helped develop schemas in Hive.
  • Involved in writing SQL queries, creating views, triggers and audit tables in Oracle for data migration and data retention.
  • Created POCs to ingest data into the Hadoop cluster, analyze vast data stores, and arrive at feasible solutions that meet the business needs.
  • Improved model accuracy from 86% to 93% by using an ensemble approach (an illustrative ensemble sketch follows this list).
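The S3 ingestion and Lambda-triggered events called out above follow a common pattern: an S3 ObjectCreated notification invokes a Lambda function that picks up the new object. The sketch below is a minimal, hypothetical version of such a handler; the bucket wiring and the downstream processing step are placeholders, not the project's actual code.

```python
import json
import urllib.parse

import boto3

s3 = boto3.client("s3")


def lambda_handler(event, context):
    """Handle an S3 ObjectCreated notification and ingest the new object."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        obj = s3.get_object(Bucket=bucket, Key=key)
        payload = obj["Body"].read()

        # Placeholder for the real ingestion/transformation step.
        print(f"Ingesting s3://{bucket}/{key} ({len(payload)} bytes)")

    return {"statusCode": 200, "body": json.dumps("ok")}
```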
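For the Elasticsearch-backed RESTful query service noted above, a real-time lookup over Elasticsearch's REST _search endpoint could look roughly like this; the endpoint URL, index, and field names are assumptions for illustration.

```python
import requests

ES_URL = "http://localhost:9200"   # assumed cluster endpoint
INDEX = "customer_events"          # hypothetical index name


def search_recent_events(customer_id: str, size: int = 20):
    """Run a simple real-time query against Elasticsearch's REST _search API."""
    query = {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {"term": {"customer_id": customer_id}},
    }
    resp = requests.get(f"{ES_URL}/{INDEX}/_search", json=query, timeout=10)
    resp.raise_for_status()
    return [hit["_source"] for hit in resp.json()["hits"]["hits"]]
```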
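The Oracle-to-Redshift ETL functions mentioned above are typically built by staging extracts to S3 and bulk-loading with Redshift's COPY command. The following sketch shows only that general staging-and-COPY pattern; the bucket, IAM role, and table names are hypothetical.

```python
import csv
import io

import boto3
import psycopg2  # Redshift speaks the PostgreSQL wire protocol

S3_BUCKET = "example-etl-staging"  # hypothetical staging bucket
IAM_ROLE = "arn:aws:iam::123456789012:role/redshift-copy-role"  # assumed role


def stage_rows_to_s3(rows, key):
    """Write rows extracted from Oracle to S3 as CSV for a Redshift COPY."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    boto3.client("s3").put_object(Bucket=S3_BUCKET, Key=key,
                                  Body=buf.getvalue().encode("utf-8"))


def copy_into_redshift(conn, key, target_table):
    """Bulk-load the staged CSV into Redshift with COPY."""
    sql = f"""
        COPY {target_table}
        FROM 's3://{S3_BUCKET}/{key}'
        IAM_ROLE '{IAM_ROLE}'
        FORMAT AS CSV;
    """
    with conn.cursor() as cur:
        cur.execute(sql)
    conn.commit()

# conn = psycopg2.connect(host="<redshift-endpoint>", port=5439,
#                         dbname="analytics", user="etl", password="...")
```

COPY from S3 is used instead of row-by-row inserts because Redshift loads bulk files far more efficiently.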
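The 86% to 93% accuracy improvement is from the original work; the snippet below merely illustrates one common ensembling approach (a soft-voting combination of base learners in scikit-learn) with placeholder models, not the actual models or data used.

```python
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def build_ensemble():
    """Soft-voting ensemble over a few illustrative base learners."""
    return VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(max_iter=1000)),
            ("rf", RandomForestClassifier(n_estimators=200, random_state=42)),
            ("gb", GradientBoostingClassifier(random_state=42)),
        ],
        voting="soft",
    )

# Usage, where X and y are the prepared feature matrix and labels:
# print(cross_val_score(build_ensemble(), X, y, cv=5).mean())
```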

Environment: Oracle 12c, Hadoop, Pig, Hive, Spark, Python, HDFS, YARN, Impala, Scala, Sqoop, Tableau, Informatica ETL, AWS Redshift, AWS S3, AWS Glacier, AWS CloudWatch, WSDL, RESTful Web Services, HTML, CSS, JavaScript, JDBC, Agile Methodology, PL/SQL, XML, XSD, UML

Confidential

Data Engineer/Big Data Architect

Responsibilities:

  • Created Oracle packages and functions and worked closely with the business to create scalable data models related to CDD.
  • Leveraged Informatica transformations such as Web Service and Java transformations to handle complex XML and string-parsing scenarios.
  • Used various transformations such as Lookup, Filter, Expression, Sequence Generator, Update Strategy, Joiner, Stored Procedure, SQL, and Union to develop robust mappings in Informatica Designer to transport the data to the IRIS reporting database.
  • Created Java classes to pipe data from AWS to HBase and Hive, and implemented Pig scripts to process unstructured data.
  • Devised and led the implementation of the next-generation architecture for more efficient data ingestion and processing.
  • Mentored and onboarded new team members who were not proficient in Hadoop, getting them up to speed quickly.
  • Built real-time microservices to evaluate CDD functionality in parallel with the legacy design.
  • Extensive Experience with Hadoop and HBase, including multiple public presentations about these technologies.
  • Created an ETL architecture to load structured data into Oracle, semi-structured data into MongoDB, and key-value data into DynamoDB (a routing sketch follows this list).
  • Designed data mining through Redshift services.
  • Designed interfaces to flow Cassandra IoT customer data into the CRM (a read sketch follows this list).
  • Performed hands-on data analysis, often under pressure.
  • Big Data Strategy - performance management, data exploration, data science.
  • Implemented POCs for big data tools such as Mahout and Impala.
  • Served as administrator and architect for Hadoop, MongoDB, and MySQL clusters.
  • Implemented innovative software systems to solve key business challenges.
  • Set up the architecture for big data capture, representation, information extraction, and fusion.
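As a rough illustration of the polyglot ETL routing described above (structured data to Oracle, semi-structured documents to MongoDB, key-value items to DynamoDB), the sketch below dispatches records by shape; connection strings, table, and collection names are hypothetical.

```python
import boto3
import cx_Oracle  # python-oracledb is the newer name for this client
from pymongo import MongoClient

# Hypothetical targets; the real DSNs, tables, and collections differ.
oracle_conn = cx_Oracle.connect("etl_user/secret@db-host/ORCLPDB1")
mongo_coll = MongoClient("mongodb://localhost:27017")["etl"]["semi_structured"]
dynamo_table = boto3.resource("dynamodb").Table("key_value_store")


def route_record(record: dict):
    """Send one record to the store that matches its shape."""
    kind = record.get("kind")
    if kind == "structured":
        cur = oracle_conn.cursor()
        cur.execute("INSERT INTO customer_stage (id, name) VALUES (:1, :2)",
                    (record["id"], record["name"]))
        oracle_conn.commit()
        cur.close()
    elif kind == "semi_structured":
        mongo_coll.insert_one(record["payload"])  # schemaless JSON document
    else:
        dynamo_table.put_item(Item={"pk": record["id"], **record["payload"]})
```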
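For the Cassandra IoT-to-CRM interface noted above, a minimal read path using the DataStax Python driver might look like the following; the contact points, keyspace, table, and columns are assumptions.

```python
from cassandra.cluster import Cluster  # DataStax Python driver

# Assumed contact points and keyspace; the real cluster details differ.
cluster = Cluster(["10.0.0.11", "10.0.0.12"])
session = cluster.connect("iot")


def latest_readings(device_id: str, limit: int = 100):
    """Pull recent IoT readings for one device, ready to push to the CRM."""
    rows = session.execute(
        "SELECT device_id, reading_ts, payload "
        "FROM customer_readings WHERE device_id = %s LIMIT %s",
        (device_id, limit),
    )
    return [row._asdict() for row in rows]
```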

Environment: Informatica Power Center 9.1.0, TOAD for Oracle 9.7.2, Oracle 12c, Shell Scripting, MongoDB, Hadoop, Pig, Hive, Spark, Python, HDFS, Impala, Scala, Sqoop

Confidential

Lead Database Engineer

Responsibilities:

  • Responsible for analysis and development of new requirements for WAVE IT reporting based on feeds sent by external vendors.
  • Prioritized projects and deliverables by balancing criticality and resource requirements.
  • Managed project schedules, tracked project milestones, and ensured timely delivery.
  • Worked on Informatica Power Center tools: Designer, Repository Manager, Workflow Manager, and Workflow Monitor.
  • Worked on different tasks in Workflows like sessions, event wait, decision, e-mail, command, and scheduling of the workflow.
  • Used the Debugger to test the mappings and fix bugs.
  • Handled changes effectively by implementing the change management process at UAT stages
  • Responsible for updating status to the client on a weekly basis

Environment: Informatica Power Center 9.1.0, TOAD for Oracle 9.7.2, Oracle 11g, Oracle Designer

Confidential

Senior Database Developer

Responsibilities:

  • Participated in high-level architectural discussions for the DQS project.
  • Worked with the infrastructure groups on the setup of the DQS project databases.
  • Worked on Informatica Power Center tools: Designer, Repository Manager, Workflow Manager, and Workflow Monitor.
  • Used various transformations like Lookup, Filter, Expression, Sequence Generator, Update Strategy, Joiner, Stored Procedure, and Union to develop robust mappings in the Informatica Designer.
  • Developed mappings to load into Sources, ODS, Staging and DM tables
  • Developed mappings to load the static data into Dimensions
  • Used existing ETL standards to develop these mappings.
  • Worked on different tasks in Workflows like sessions, event wait, decision, e-mail, command, and scheduling of the workflow.
  • Used the Debugger to test the mappings and fix bugs.
  • Developed different types of reports: master/detail, cross-tab, charts, etc.

Environment: Informatica Power Center 8.6.1, TOAD for Oracle, Oracle 10g, Erwin 4.0, PL/SQL Developer 9.0.6

Confidential, California, CA

Database Developer

Responsibilities:

  • Developed different types of reports: master/detail, cross-tab, charts, etc.
  • Participated in requirements gathering, planning, and technical issue discussions, and frequently interacted with business users for the basket shipment optimization project.
  • Created new APIs to implement business logic and modified existing APIs to enhance it.
  • Created PL/SQL packages to implement the SCMT logic used for loading various surcharges.
  • Used window functions such as ROW_NUMBER() and DENSE_RANK() to decide the ranking logic when picking the best distributor (a query sketch follows this list).
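A small sketch of the ranking pattern referenced above, using DENSE_RANK() to pick the lowest-cost distributor per basket. The table and column names are illustrative only, not the actual SCMT schema.

```python
import cx_Oracle  # Oracle client library (python-oracledb is the newer name)

# Illustrative table/column names only; not the actual SCMT schema.
BEST_DISTRIBUTOR_SQL = """
    SELECT basket_id, distributor_id, total_cost
    FROM (
        SELECT b.basket_id,
               b.distributor_id,
               b.total_cost,
               DENSE_RANK() OVER (
                   PARTITION BY b.basket_id
                   ORDER BY b.total_cost ASC
               ) AS cost_rank
        FROM basket_quotes b
    )
    WHERE cost_rank = 1
"""


def pick_best_distributors(dsn: str):
    """Return the top-ranked (lowest-cost) distributor per basket."""
    with cx_Oracle.connect(dsn) as conn:
        cur = conn.cursor()
        cur.execute(BEST_DISTRIBUTOR_SQL)
        return cur.fetchall()
```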

Environment: Oracle 10g, Solaris 11x, Toad 9.1, Remedy 7X

Confidential

Database Engineer

Responsibilities:

  • Developed various interfaces to support data feeds to the SMS application. Created subroutines using Pro*C for the PASSPORT application to meet new change requests.
  • Developed PL/SQL packages that implement part of the mapping of data to the database tables.
  • Processed files using sed, awk, and regular expressions to strip extra line feeds and special characters from files before loading (a Python equivalent is sketched after this list).
  • Created bitmap, B-tree, and function-based indexes to improve performance.
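The pre-load cleanup above was done with sed and awk; the snippet below is a rough Python equivalent of that kind of cleanup (stripping stray line feeds and non-printable characters), shown only for illustration.

```python
import re

# Matches anything outside printable ASCII and tab.
NON_PRINTABLE = re.compile(r"[^\x20-\x7E\t]")


def clean_feed(in_path: str, out_path: str) -> None:
    """Strip CR/LF noise, non-printable characters, and blank lines before load."""
    with open(in_path, "r", encoding="utf-8", errors="replace") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            line = line.rstrip("\r\n")          # drop carriage returns / line feeds
            line = NON_PRINTABLE.sub("", line)  # drop special characters
            if line.strip():                    # skip empty / extra feed lines
                dst.write(line + "\n")
```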

Confidential, New Jersey, NJ

Database Developer

Responsibilities:

  • Developed various interfaces to support data feeds to the SMS application. Created subroutines using Pro*C for the PASSPORT application to meet new change requests.
  • Effectively used the SQL*Loader utility to load various mapping files (Pregnancy Exposure cases, Product Repository mappings, Gravidity and Parity, and Codelist mappings) into AERS (CARES 1.3).
  • Created packages with various stored procedures and functions to transform data while migrating from CLINTRACE to AERS (CARES 1.3).
