
Big Data Architect/Hadoop Consultant Resume


San Ramon, CA

SUMMARY:

  • 15 years of IT experience across a wide variety of Information Management domains, including Data Warehousing (DW), Business Intelligence (BI), Master Data Management (MDM), Data Quality, and Data Governance.
  • 3+ years of Hadoop, Java, MapReduce, Hive, Pig, and HBase development.
  • 5+ years of experience analyzing, designing, and developing component-based, distributed web applications using Java and web technologies for enterprise applications.
  • Ability to plan, scope, and estimate work effort, producing quality deliverables on time and on budget.
  • Expertise in end-to-end system design, system feasibility studies, system redesign/re-engineering, and implementation, with significant experience in data modeling, data quality analysis, identification of data issues, and gap assessment.
  • Business requirement gathering; defining and documenting business, technical, and functional specifications; organizing design workshops; coordinating business and technical teams.
  • Strong experience implementing data warehousing solutions involving dimensional and relational modeling, using star/snowflake schemas and third normal form (3NF).
  • Proven track record of successfully designing and implementing complex ETL/data warehousing systems.
  • Proven track record of delivering database applications, system optimization, and performance tuning for critical financial applications.
  • Accomplished problem solver with wide-ranging experience in systems design, programming and testing.

TECHNICAL SKILLS:

RDBMS: Teradata, Aster, HAWQ, MS SQL Server, T-SQL, Oracle, PL/SQL, DB2 UDB, Sybase, Informix.

Big Data Ecosystem: Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, Storm, Scala, Spark, Kafka, Oozie, Flume.

NoSQL: HBase, Cassandra, MongoDB.

ETL/BI Applications: Talend, Ab Initio, Informatica, DataStage, SSIS, BO XI R2, SSRS, Cognos, MicroStrategy, Pentaho.

MDM: AllFusion, Ab Initio/Informatica Metadata Hub.

IDE: Eclipse, WebSphere Application Developer, Talend Open Studio for Big Data.

Case Tools: Erwin, ER/Studio, Visio.

Languages/Other: UNIX Shell Scripting, Python, C/C++, Java, J2EE Framework, Servlets, Perl, XML, UML.

Version Control: MS Visual SourceSafe, IBM Rational ClearCase, CVS, TFS.

Application Servers: WebLogic, WebSphere, Tomcat.

Web Services: SOAP, REST, RMI.

CAREER HISTORY:

Confidential, San Ramon, CA

Big Data Architect/Hadoop Consultant

Responsibilities:

  • Designed and implemented the big data platform architecture and data pipelines for different GE business units (Aviation, Healthcare, Oil & Gas, and Transportation).
  • Developed ETL processes for several data sources and XML log data; implemented them using Java, Python, Spark, Hive, Pig, HBase, and Oozie workflows, and visualized results with Tableau.
  • Analyzed data and business processes and recommended solutions for analytics and reporting needs.
  • Designed streaming MapReduce jobs to process terabytes of XML-format data.
  • Developed XML parser programs for MR and CT machine logs in the healthcare domain, read from an OnWatch mount; extracted business KPIs and loaded them incrementally into Oracle tables and the data lake environment (a parser sketch follows this list).
  • Designed and developed data visualization charts and dashboards for data scientists.
  • Analyzed the data with Java MapReduce jobs, Hive queries, and Pig scripts, and published the results as key metrics to the business for data visualization.
  • Created UNIX and Python scripts to convert between various data formats for data massaging.
  • Provided a complete end-to-end analytics and visualization platform for various GE businesses.
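
The XML parsing and KPI extraction described above can be illustrated with a short Python sketch. It is only a minimal illustration: the element and field names (scan, machineId, scanDuration, errorCode) are hypothetical stand-ins for the actual machine log schema, and the incremental Oracle/data lake load happens in a separate step.

import csv
import sys
import xml.etree.ElementTree as ET

def extract_kpis(xml_path):
    """Yield one KPI record per <scan> event found in a machine log file."""
    tree = ET.parse(xml_path)
    for event in tree.getroot().iter("scan"):           # hypothetical element name
        yield {
            "machine_id": event.findtext("machineId"),  # hypothetical field names
            "timestamp": event.findtext("timestamp"),
            "duration_s": event.findtext("scanDuration"),
            "error_code": event.findtext("errorCode", default=""),
        }

if __name__ == "__main__":
    # Write the extracted KPIs as CSV on stdout; a separate step would load
    # them incrementally into Oracle and copy them to the data lake.
    writer = csv.DictWriter(
        sys.stdout,
        fieldnames=["machine_id", "timestamp", "duration_s", "error_code"],
    )
    writer.writeheader()
    for path in sys.argv[1:]:
        for row in extract_kpis(path):
            writer.writerow(row)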

Confidential, San Francisco, CA

Sr. Big Data/Hadoop Consultant

Responsibilities:

  • Banners presented to customers are based on the customer profile, which is pulled from the DECA backend (HBase/SQL Server).
  • Built customer profile data in Hive from IDW customer data and incrementally loaded it into an HBase table using Storm.
  • Wrote a JAX-RPC handler/servlet filter that integrates with RTD and performs ELI authentication and authorization.
  • Used RRBus, a messaging infrastructure within Schwab that provides a message-bus medium and facilitates dynamic service location by routing requests to the appropriate service endpoints.
  • Generated the necessary stubs and skeletons used by the web client for web service method invocation.
  • Used SAML assertions carrying attributes such as the correlation ID and brokerage customer ID to uniquely identify a customer for authorization, and used CAM (Customer Authentication Module) for authentication.
  • Wrote ELI log events in JSON format and made them available to the Splunk server for data analysis (see the sketch after this list).
  • Developed unit test cases to facilitate QA during integration and system testing.
  • Coordinated with SLG (Schwab Local Governance) for approval of release items.
  • Coordinated with the release management team during production deployments.
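
The JSON log events mentioned above can be sketched in a few lines of Python. The field names (correlation_id, customer_id, decision) and the log file name are illustrative assumptions, not the actual ELI schema; the point is simply that each event is written as one JSON object per line so Splunk can index it.

import json
import logging
import time
import uuid

logger = logging.getLogger("eli")
logger.addHandler(logging.FileHandler("eli_auth.log"))  # file watched by a Splunk forwarder
logger.setLevel(logging.INFO)

def log_auth_event(customer_id, decision):
    # One JSON object per line keeps the events easy for Splunk to parse.
    event = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S%z"),
        "correlation_id": str(uuid.uuid4()),   # hypothetical field names
        "customer_id": customer_id,
        "decision": decision,                  # e.g. "ALLOW" or "DENY"
    }
    logger.info(json.dumps(event))

log_auth_event("12345678", "ALLOW")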

Confidential, Foster City, CA

Big Data/Hadoop Consultant

Responsibilities:

  • Developed big data analytic models in Hive for detecting fraudulent customer transaction patterns from customer transaction data.
  • This also involved transaction sequence analysis (with and without gaps), network analysis of common customers for the top fraud patterns, and similar techniques.
  • Developed a customer transaction event path tree extraction model in Hive from customer transaction data (a query sketch follows this list).
  • Enhanced and optimized the customer path tree GUI viewer to incrementally load tree data from the HBase NoSQL database, using the open source Prefuse Java framework for the GUI.
  • Developed several Java utilities/programs in the data flow, e.g., a utility that clusters customers' transaction patterns based on their network overlap.
  • Built data systems that talk to different database platforms, enabling product and business teams to make data-driven decisions.
  • Installed, configured, and managed Hadoop, Hive, Pig, and HBase.
  • Wrote MapReduce programs to process data stored in HDFS.
  • Converted output to structured data and imported it into Hive tables, loading data and writing Hive queries that run internally as MapReduce jobs.
  • Defined problems in order to look for the right data, and analyzed results to scope new projects.
  • Worked on Metadata Hub to import metadata definitions from external sources, making the data dictionary visible to non-technical staff, managing metadata for dependency analysis of complex systems, and resolving lineage-break issues.
  • Worked on the Metadata Hub customization workbench for web services integration; extended the object model schema, writing schema extensions, view customizations, and reports, and wrote several MetaSQL queries to fetch reports against the metamodel classes.
  • Performed data quality analysis, identified data issues, assessed gaps, and implemented resolutions.
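
A minimal sketch of the kind of Hive query behind the event path extraction, wrapped in Python so it can run from a workflow script. The table and column names (transactions, customer_id, event_type, txn_ts) are hypothetical, and ordering inside collect_list relies on the common DISTRIBUTE BY/SORT BY idiom rather than a formal guarantee.

import subprocess

# Hypothetical schema: transactions(customer_id, event_type, txn_ts).
EVENT_PATH_HQL = """
SELECT customer_id,
       concat_ws('>', collect_list(event_type)) AS event_path
FROM (
    SELECT customer_id, event_type, txn_ts
    FROM transactions
    DISTRIBUTE BY customer_id
    SORT BY customer_id, txn_ts
) ordered
GROUP BY customer_id
"""

if __name__ == "__main__":
    # Run the query through the Hive CLI and print one
    # "customer_id <TAB> event_path" row per customer for the
    # downstream path-tree builder.
    result = subprocess.run(
        ["hive", "-e", EVENT_PATH_HQL],
        capture_output=True, text=True, check=True,
    )
    print(result.stdout, end="")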

Confidential, Chicago, IL

Sr. ETL Architect/Big Data Design Consultant

Responsibilities:

  • Instrumental in redefining the system architecture for functional correctness and performance optimization.
  • Redesigned process flows and recreated some of the ETL scripts without breaking the original functionality of the code, optimizing the IPDS application.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Imported and exported data between relational sources, HDFS, and Hive using Sqoop.
  • Defined job flows.
  • Managed and reviewed Hadoop log files.
  • Ran Hadoop Streaming jobs to process terabytes of XML-format data (a mapper sketch follows this list).
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Responsible for managing data coming from different sources.
  • Supported MapReduce programs running on the cluster.
  • Loaded data from the UNIX file system into HDFS.
  • Installed and configured Hive and wrote Hive UDFs.
  • Created Hive tables, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
  • Gained strong business knowledge of health insurance, claim processing, fraud suspect identification, and the appeals process.
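
The data cleaning work above can be illustrated with a Hadoop Streaming mapper in Python. This is a sketch under assumptions: tab-delimited records with a hypothetical width of 12 fields and a claim ID in the first column; the real record layout and cleansing rules are not shown. A job like this would typically be launched through the hadoop-streaming JAR with this script as the mapper and an identity or aggregating reducer.

#!/usr/bin/env python
# mapper.py: drop malformed records before downstream processing.
import sys

EXPECTED_FIELDS = 12   # hypothetical record width

for line in sys.stdin:
    fields = line.rstrip("\n").split("\t")
    if len(fields) != EXPECTED_FIELDS:
        continue                      # skip rows with the wrong field count
    claim_id = fields[0].strip()
    if not claim_id:
        continue                      # skip rows missing the key
    print("\t".join(f.strip() for f in fields))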

Sr. Lead Application Developer

Confidential, Memphis, TN

Responsibilities:

  • Developed the data integration transformations and change data capture (CDC) process for the merchant, transaction, and summary subject areas required to load and update the ClientLine EDW tables (a CDC sketch follows this list).
  • Responsible for interacting with the business team to gather and formulate requirements for different parts of the application, and for translating the requirements into detailed design specifications for developers on the team.
  • Instrumental in ETL application design; involved in four major data warehouse project implementations, covering both near-real-time and batch data warehouse designs.
  • Responsible for building up a global ETL support/offshore development team of 20+ members, overseeing ETL implementation, introducing Ab Initio to new members, and documenting knowledge-share material.
  • Responsible for system maintenance and for generating and tracking SLAs and KPIs for escalations and issue counts.
  • Redesigned the system architecture of DataStage application jobs into Ab Initio graphs, established offshore resources, designed and developed new applications (GRS and TRAFAC) in Ab Initio, created an incremental update process for the claim subsystem in the property/auto data warehouse, and successfully implemented a daily update process.
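
The change data capture logic can be sketched as a snapshot comparison in Python. This is only an illustration of the idea: the real process was implemented in the ETL tool, and the file names and merchant_id key used here are hypothetical.

import csv

def load_snapshot(path, key="merchant_id"):
    # Index a delimited extract by its (hypothetical) business key.
    with open(path, newline="") as f:
        return {row[key]: row for row in csv.DictReader(f)}

def capture_changes(prior_path, current_path):
    prior = load_snapshot(prior_path)
    current = load_snapshot(current_path)
    inserts = [row for key, row in current.items() if key not in prior]
    updates = [row for key, row in current.items()
               if key in prior and row != prior[key]]
    deletes = [row for key, row in prior.items() if key not in current]
    return inserts, updates, deletes

if __name__ == "__main__":
    ins, upd, dele = capture_changes("merchant_prior.csv", "merchant_current.csv")
    print(f"{len(ins)} inserts, {len(upd)} updates, {len(dele)} deletes")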

Senior Data Warehouse BI Developer

Confidential, Herndon, VA

Responsibilities:

  • Worked in "FAS91 Amortization" stream of Restatement and Get Current, which is related to amortizing the Guarantee Fee components, securities, loans, and other financial products
  • Also involves in implementing the Guarantee Fee Data mart (GFASDM) and Security Data Mart Team (SCBSLDM) for generating various kinds of monthly, quarterly and yearly reports.
  • Designed and Performed detailed and accurate source systems analysis, developed logical and physical data models, source to target data mappings and data movement task in support of their IDN Data management enterprise data warehouse.
  • Designed and developed REVAMART data warehouse / reporting system for a Broadband ADSL provisioning network.
  • Designed and Developed Data warehouse system to Automate the CART (Contacts Adhoc Retention and Telemarketing) procedure and Developed logical and physical data models in support of data warehousing, decision support, and executive information system in support of their enterprise data warehouse.

Confidential, CA

Senior Software Analyst

Responsibilities:

  • Developed a product and customer information integration system using event-based middleware (webMethods Enterprise Platform); supported the Oracle Financials and Purchasing applications; and wrote a Perl interface for Confidential's e*shop to enable communication with the Oracle and Vantive databases using VanAPI and oraperl.
  • Developed SQL queries, stored procedures, and PL/SQL blocks.
