Bigdata Enterprise Solution Architect Resume

Irvine, CA

SUMMARY:

  • Over 15 years of experience as an accomplished Data Science Engineer and BigData Enterprise Architect, with expertise in implementing Machine Learning systems, predictive analytics, UDW, Data Lake, and IoT platforms.
  • Self-driven and passionate about BigData technologies, with a flexible, creative approach to problem solving that combines business acumen with technical savvy to deliver highly integrated, scalable, real-time solutions on proven and emerging technologies. Proven track record of implementing enterprise-level solutions in healthcare, semiconductor, connected vehicles, IoE, eCommerce, cloud computing, telecom, retail, banking, and financial services. Entrepreneurial, enjoys taking on new challenges, and quickly delivers business value.

TECHNICAL SKILLS:

Integration Architectures: iPaaS, SaaS, PaaS, Cloud Computing, SOA, BigData Ecosystem, REST/JSON, Microservices, MDM and Governance.

BigData ecosystem: Apache Hadoop, Hive, Mahout, NoSQL, Core Java, Apache Spark, Scala, Oozie, Pig, Impala, Shell scripting, Amazon Web Services EC2, IBM DataStage, MongoDB, Cassandra, Apache Kafka & Storm, Splunk, Solr, ElasticSearch, Amazon RedShift, Amazon EMR, Apache Phoenix, Azure Cloud.

Programming Languages: Core Java, Scala, C++, Python, Enterprise Java (Servlets, JSP, Swing, JDBC, RMI, EJB, JMS), XML, XSLT, HTML, JavaScript, AJAX, jQuery.

Operating Systems: UNIX, Linux, Windows Server, DOS.

Distributed RDBMS: Teradata, Netezza, Greenplum, Aster Data, Vertica, NoSQL, HBase, MongoDB, Cassandra, SQL.

Commerce Platforms: Oracle ATG B2B/B2C, SAP Hybris, SFDC CloudCraze, AngularJS, Bootstrap, Responsive Design frameworks.

Technology Stack: Azure, IBM stack, Spark, Amazon IaaS, AWS S3, RedShift, PCF Cloud Foundry, Python, Cloudera/Hortonworks/MapR Hadoop, Apache Kafka & Storm, Hive, NoSQL, Scala, UNIX/Linux, Git, HBase, Apache Phoenix, Kameleoon A/B testing, Python Pybedtools, NumPy, Splunk, Teradata, Netezza, MongoDB, Cassandra, REST & microservices, Solr & ElasticSearch, Docker 0.9, Deeplearning4j, SFDC CloudCraze, Oracle ATG commerce platforms.

PROFESSIONAL EXPERIENCE:

Confidential

BigData Enterprise Solution Architect

Responsibilities:

  • Implementing an enterprise solution for GDPR compliance.
  • Defining the security layer for healthcare data (a field-level tokenization sketch follows this list).
  • Implementing multitenancy, queues, and capacity segments.
  • Enterprise solutioning for the Data Lake and iMDM.
  • Defining the consumption, data orchestration, and provisioning layers.
  • Developing a synergy architecture between the Data Lake and the Cloud EDP.
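
A minimal sketch of the kind of field-level protection such a security layer applies before healthcare records land in the lake. The HMAC-based tokenize helper, secret-key handling, and field names are illustrative assumptions, not the Protegrity API used on the project:

    import hashlib
    import hmac

    SECRET_KEY = b"replace-with-vaulted-key"  # illustrative; real keys come from a key vault

    def tokenize(value: str) -> str:
        # Deterministic stand-in for a tokenization call (not the Protegrity API).
        return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

    def protect_record(record: dict, sensitive=("ssn", "member_id", "dob")) -> dict:
        # Return a copy of the record with PHI fields tokenized.
        return {k: tokenize(v) if k in sensitive else v for k, v in record.items()}

    print(protect_record({"ssn": "123-45-6789", "state": "CA"}))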

Environment: AWS, IBM DataStage, MapR Hadoop, Apache Spark, Hive, Java, UNIX, Git, Apache Phoenix, HBase, MarkLogic, Protegrity.

Confidential

Enterprise Solution Architect

Responsibilities:

  • Designed and developed Spark connectors to security devices and plugged the live data stream into the virtualization layer.
  • Integrated the VM platform with QlikView and Qlik Sense for BI and visualization.
  • Designed and developed automated analysis of building security data and an alert system, using the Spark API as the communication channel and Python libraries for analytics (see the streaming-alert sketch below).
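
A hedged PySpark sketch of a Spark Structured Streaming alert pipeline of this shape; the broker address, topic name, and filter rule are illustrative placeholders, and the console sink stands in for the real alert channel:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("security-alerts").getOrCreate()

    # Read the live security-device feed from Kafka (topic/broker are placeholders).
    events = (spark.readStream.format("kafka")
              .option("kafka.bootstrap.servers", "broker:9092")
              .option("subscribe", "building-security")
              .load()
              .selectExpr("CAST(value AS STRING) AS raw"))

    # Flag suspicious events; this filter stands in for the real alerting rules.
    alerts = events.filter(F.col("raw").contains("DOOR_FORCED"))

    # Console sink for illustration; production would write to the alert channel.
    alerts.writeStream.format("console").start().awaitTermination()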

Environment: IBM API Connect, PCF, Apache Kafka and Storm, Apache Spark, Hive, Java, UNIX, Git, R, Apache Phoenix, Thrift REST web services, Python, HBase, R Studio, Cloudera Hadoop Cluster, Solr Search, SFDC CloudCraze, GE IoT device.

Confidential - Irvine, CA

Responsibilities:

  • Designed and developed Apache Spark APIs to integrate with the AWS Data Lake for in-memory transactions (see the sketch after this list).
  • Developed a migration framework to the RedShift DW platform.
  • Developed a regression model in R to provide visual statistics on the thermal health conditions of goods to a live monitor (SFDC), integrated with the Spark layer.
  • Developed a Machine Learning system using a Deeplearning4j MLP to capture logistics sensor data and automate P&L revenue predictions from vehicles' daily activities.
  • Architected and developed a speed layer with Apache Storm and Spark to feed live data streams to PostgreSQL and Heroku systems (EDI).
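
A minimal PySpark sketch of the data-lake-to-RedShift path described above; the bucket, table, and JDBC connection details are placeholder assumptions:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("datalake-inmemory").getOrCreate()

    # Read transactions from the S3 data lake and cache for in-memory aggregation.
    txns = spark.read.parquet("s3a://datalake-bucket/transactions/")
    txns.cache()
    daily = txns.groupBy("txn_date").sum("amount")

    # Land the aggregate in RedShift over JDBC (connection details are placeholders).
    (daily.write.format("jdbc")
          .option("url", "jdbc:redshift://cluster:5439/dw")
          .option("dbtable", "daily_txn_summary")
          .option("user", "etl_user").option("password", "***")
          .mode("append").save())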

Environment: Apache Storm, Apache Spark, Hive, NoSQL, Java, UNIX, Bitbucket, R, Python, HBase, R Studio, RedShift, Amazon EMR Hadoop Cluster, Oozie, Deeplearning4j, ElasticSearch, Docker 0.9.

Confidential - Peoria, IL

Responsibilities:

  • Designed and developed the real-time connection layer between CAT’s vehicles and the backend cloud (Data Lake) using Apache Kafka, Spark, Tableau, and Parts Management Services.
  • Implemented a Cloudera Hadoop Data Lake integrated with the cloud, and developed Spark connectors to fetch HBase data and provide it to R Studio for analytics.
  • Developed A/B testing statistical techniques in Python and R to determine parts' life expectancy and predict vehicle failure rates from time-dimension data (Kameleoon); a minimal test sketch follows this list.
  • Created Hive views and exposed them to the ETL tool through the Oozie engine.
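
A minimal sketch of the significance test underlying this kind of A/B comparison of part life expectancy; the hours-to-failure figures are synthetic, and Welch's t-test stands in for the production statistics:

    import numpy as np
    from scipy import stats

    # Synthetic hours-to-failure for two part variants (not actual fleet data).
    variant_a = np.array([1200.0, 1350.0, 1280.0, 1100.0, 1420.0])
    variant_b = np.array([1050.0, 1150.0, 990.0, 1210.0, 1080.0])

    # Welch's two-sample t-test: do the variants' life expectancies differ?
    t_stat, p_value = stats.ttest_ind(variant_a, variant_b, equal_var=False)
    print(f"t={t_stat:.2f}, p={p_value:.3f}")
    if p_value < 0.05:
        print("Reject H0: the variants' life expectancies differ.")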

Environment: PCF, Cloudera Hadoop (CDH4.0), Apache Phoenix, Apache Kafka, Apache Spark, Hive, NoSQL, Impala, Java, UNIX, Git, R, Python, HBase, R Studio, Kameleoon A/B Testing tool, Oozie, Tableau, Oracle ATG.

Confidential - LA, CA

Responsibilities:

  • Architected and implemented a Cloudera Hadoop Data Lake for enterprise sales data.
  • Developed a MapReduce layer to source targeted data for visualization.
  • Created a sales data predictive model in Python and R and provided an interface channel for business users' decision making.
  • Established a data quality program with profiling metrics, trend monitoring, and defect tracking/management.
  • Established a sustainable data architecture (current state, future state, and roadmap) to define and prioritize key projects for solution realization and results.
  • Designed, developed, and implemented a Recommendation Engine (Personalized Relevance) applying supervised learning and an Empirical Bayesian score model (sketched below).
  • Integrated the product catalog structure with the MapReduce framework for recommendations (ML classification and regression algorithms) and the POI strategy.
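
A minimal sketch of an Empirical Bayesian score of the kind used in such a recommendation engine: each item's raw mean is shrunk toward the catalog-wide mean, so sparsely rated items cannot dominate the ranking. The prior_count value is an illustrative tuning choice:

    def empirical_bayes_score(item_mean, item_count, global_mean, prior_count):
        # Weighted blend: few observations lean on the prior; many keep the item's own mean.
        return (item_count * item_mean + prior_count * global_mean) / (item_count + prior_count)

    # An item rated 5.0 only twice ranks below one rated 4.5 across 200 purchases.
    print(empirical_bayes_score(5.0, 2, 4.0, 25))    # ~4.07
    print(empirical_bayes_score(4.5, 200, 4.0, 25))  # ~4.44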

Environment: APIm, Pivotal Cloud Foundry (PCF), Cloudera Hadoop (CDH4.0), Hive, NoSQL, Impala, Java, UNIX, Git, R, Python, REST Python web services (Flask).

Confidential - San Diego, CA

Responsibilities:

  • Architected and designed the MDM platform for the NP’s enterprise data.
  • Implemented Hortonworks Hadoop clusters as part of the BigData strategy and established a single repository for data analytics.
  • Implemented IBM InfoSphere Business Glossary to create a metadata repository for promotional products and data warehouse management.
  • Designed data services using Service-Oriented Architecture (SOA), replacing individual applications' data access and stores with master data management (MDM) principles and data reusability.
  • Performed vendor management and defined the product roadmap.
  • Developed a price recommendation system (Search Relevance Engine) by applying matrix pattern recognition (statistical factor analysis); a minimal sketch follows this list.
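
A hedged sketch of statistical factor analysis for this kind of price relevance work, using scikit-learn; the product-attribute matrix and the choice of three latent factors are illustrative assumptions:

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    # Synthetic product-attribute matrix (rows = products, columns = pricing signals).
    rng = np.random.default_rng(42)
    X = rng.normal(size=(100, 8))

    # Extract latent pricing factors; products near each other in factor space
    # receive similar price recommendations.
    fa = FactorAnalysis(n_components=3, random_state=0)
    factors = fa.fit_transform(X)
    print(factors[:5])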

Environment: Hortonworks Hadoop, Hive, NoSQL, IBM InfoSphere, MEGA, Java, UNIX, SVN, Oracle ATG, AWS IaaS.

Confidential - San Diego, CA

Senior Software Engineer - Data Science

Responsibilities:

  • Designed and implemented a Cloudera Hadoop repository in HDF5 format for unstructured datasets of imported omics data (genomics, transcriptomics, and metabolomics).
  • Developed mapping and sequencing programs using Python Pybedtools and generated R reports showing the percentage of matched medicines for a particular Confidential (an interval-intersection sketch follows this list).
  • Developed Scala programs to clean the data (ORF), feed it back to PyTables, and integrate it with NumPy.
  • Coordinated with technical teams and business owners to meet strategic business and technology objectives.
  • Developed Scala programs to integrate the end results with the cloud-based SFDC platform of the Walgreens portal for business decisions.
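
A minimal Pybedtools sketch of the interval-intersection step behind this kind of matching report; the BED file names are placeholders:

    import pybedtools

    # Genomic intervals for sample reads and reference features (paths are placeholders).
    reads = pybedtools.BedTool("sample_reads.bed")
    genes = pybedtools.BedTool("reference_genes.bed")

    # Keep reads that overlap annotated features, then report the match rate.
    overlapping = reads.intersect(genes, u=True)
    match_pct = 100.0 * overlapping.count() / reads.count()
    print(f"{match_pct:.1f}% of reads map to annotated features")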

Environment: Cloudera Hadoop (CDH 3), Hive, Apache Phoenix, HBase, PyTables, Java, Scala, UNIX, Splunk v1, Python Pybedtools.

Confidential

Senior Consultant - Solution Architect

Responsibilities:

  • Designed and implemented the Hortonworks Hadoop platform and Data Lake for WB’s wide stream of data, and set up Teradata DB schemas for unstructured datasets such as media images and video/audio.
  • Developed R and Python programs for predictions (linear and sampling techniques) that read live input streams from social media (Twitter, Facebook) to predict the success rate of WB’s movies and TV serials (a minimal sketch follows this list).
  • Developed MapReduce APIs for real-time streaming of data from media channels to the HDFS Data Lake, and wrote MapReduce programs to clean the data.
  • Used MATLAB and developed Apache Mahout APIs to analyze the dictionary data and act as the Machine Learning model.
  • Integrated the Mahout ML system on an embedded platform (Android), an interactive machine in the WB studio where people get reviews and predictions of movies they might like.
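
A minimal sketch of the linear-prediction idea in Python (the project also used R); the social-signal features and success metric are synthetic illustrations:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Synthetic training data: [tweet volume, positive-sentiment share] per title.
    X = np.array([[12000, 0.61], [45000, 0.72], [3000, 0.40], [80000, 0.83]])
    y = np.array([35.0, 62.0, 11.0, 91.0])  # opening success metric (illustrative)

    model = LinearRegression().fit(X, y)

    # Predict the success rate of an upcoming release from its live social signals.
    print(model.predict(np.array([[30000, 0.68]])))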

Environment: Java, R, Teradata, Apache Mahout, Python, MATLAB, Hortonworks Hadoop HDFS, MapReduce APIs.

Confidential - Lisle, IL

Responsibilities:

  • Led and managed a team of developers and worked closely with the client on project delivery.
  • Implemented a Data Lake on HDFS by ingesting data from different data sources.
  • Designed the authentication and business layers in LDAP and the SiteMinder policy server for SSO, and integrated them with the Hadoop profiler.
  • Designed and developed MapReduce programs for data cleaning and data conversion (a streaming-mapper sketch follows this list).
  • Performed data analysis using NoSQL and HBase.
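
The cleaning jobs themselves were Java MapReduce; as an illustration, here is a Hadoop Streaming mapper in Python of the same shape, with an assumed pipe-delimited record layout:

    #!/usr/bin/env python
    # Hadoop Streaming mapper: drop malformed rows and normalize fields.
    import sys

    EXPECTED_FIELDS = 7  # assumed record width

    for line in sys.stdin:
        fields = line.rstrip("\n").split("|")
        if len(fields) != EXPECTED_FIELDS:
            continue  # discard malformed records
        print("\t".join(f.strip().lower() for f in fields))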

Environment: Java, Apache Hadoop HDFS, MapReduce, LDAP, SiteMinder, HBase.

Confidential

Responsibilities:

  • Architected the Hybris B2B platform and integrated the B2B model's payment gateways with AL.
  • Wrote batch jobs and cron jobs to schedule catalog imports to the AL store.
  • Automated the build process with tools such as Ant, Maven, and Jenkins.
  • Utilized the Agile software development methodology.

Environment: Java, Hybris 4, Eclipse, Vignette (TCL, VAP), Oracle 9i, UNIX, LDAP, SiteMinder, PVCS.

Confidential

Senior Software Engineer

Responsibilities:

  • Developed the Python analytics layer for J&J chemical datasets (a minimal sketch follows this list).
  • Developed R and Python libraries for analytics and visual presentation.
  • Applied J2EE design patterns such as Service Locator, Session Façade, and Data Access Object.
  • Involved in project, quality, and configuration management planning, design, database design, coding and testing, and developing standards, guidelines, and test plans.
  • Involved in data modeling, data analysis, and creation of the database schema (MySQL) and tables.
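
A minimal sketch of the kind of aggregation such an analytics layer performs, using pandas; the file name and column names are illustrative assumptions:

    import pandas as pd

    # Load a chemical assay extract (file and columns are placeholders).
    df = pd.read_csv("assay_results.csv")

    # Per-compound summary statistics feeding the visual-presentation layer.
    summary = (df.groupby("compound_id")["potency"]
                 .agg(["mean", "std", "count"])
                 .sort_values("mean", ascending=False))
    print(summary.head())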

Environment: UNIX, Java/J2EE, Eclipse, Python and R, MySQL.

Confidential

Software Engineer

Responsibilities:

  • Developed an Apache Lucene search layer for unstructured datasets.
  • Developed Python-based machine learning (building behavioral patterns by classification) and predictive analytics scripts to pinpoint fraud red flags and proactively detect suspicious fraud schemes (a classification sketch follows this list).
  • Developed a search-based R graphical interface to support investigators in analysis and evidence documentation.
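
A hedged sketch of classification-based fraud flagging in Python with scikit-learn; the transaction features and labels are synthetic, and a random forest stands in for the model actually used:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Synthetic transaction features: [amount, hour-of-day, txns-in-last-24h].
    rng = np.random.default_rng(7)
    X = rng.normal(size=(1000, 3))
    y = (X[:, 0] + X[:, 2] > 2).astype(int)  # stand-in fraud label

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Classify transactions; positives become red-flag candidates for investigators.
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
    print(f"Hold-out accuracy: {clf.score(X_test, y_test):.2f}")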

Environment: Java, C++, Python, Apache Lucene search, R.
