Data Engineer/Science Resume
West Chester, PA
SUMMARY:
- 4+ years of experience setting up and supporting large-dataset, multinode Cloudera Hadoop clusters for DEV, STG & PROD environments, using tools such as Fluentd, Spark Streaming and Java APIs for data ingestion from different sources into Hadoop, Hive/Impala/Spark SQL for querying, and Tableau for reporting. Worked with peers on designing the complete Big Data architecture and on evaluating and finalizing the products to be integrated to support the time series model and project needs.
- Experience setting up AWS EMR clusters along with HBase, Spark, OpenTSDB (time series), EC2 and S3 (HDFS). Integrated XSP Billing data service info into the AWS EMR cluster via the Kafka platform.
- 2+ years of experience supporting/implementing the complete workflow for multinode Elasticsearch clusters for Dev, QA & PROD for application event log analysis, using Fluentd as the data ingestion tool.
- Designed and set up a complete Hadoop cluster; integrated ODU event services in both JSON and Avro formats via Kafka for real-time streaming with Spark Streaming and real-time analytics with Spark DataFrames/SQL. Also using Spark ML to learn customer patterns.
- 1.5 - 2 years of experience implementing end-to-end infrastructure and workflow design.
- Understanding different customer data patterns; involved in retrieving data in different formats from billing event systems and loading it into the Big Data platform (Cloudera Hadoop, using Spark ML) to support use cases such as customer pattern detection:
- Predicting disconnecting/dissatisfied customers across regions.
- Anomaly detection for a higher number of disconnections, more changes of service, etc.
- Providing top consumers better service.
- Working in parallel to convert/implement an RNN model for the above use cases, given the rapid growth of the dataset.
- For enterprise solutions, working on setting up an NVIDIA GPU platform to support deep learning use cases.
- Hands-on experience with NumPy, pandas, scikit-learn, NLP, Spark ML, TensorFlow and Keras libraries.
- Working with the team on POCs for use cases such as log analytics and sentiment analysis of Slack/chat messages related to incident tickets, using NLP, scikit-learn & DL libraries.
- 3+ years of experience with Couchbase technology, including multi-datacenter/cluster configuration, data migration from Oracle to Couchbase, high-availability bidirectional replication across multiple datacenters, redesign of Couchbase buckets/views and automation of various tasks.
- 3+ Years on Couchbase services configuration/implementation with
- 4+ years of strong experience with DataStax/Apache Cassandra: migration from Oracle to Cassandra and from physical servers to cloud; upgrades, data modelling and data loading for new Cassandra clusters; day-to-day support of Cassandra servers using OpsCenter, Ganglia & Splunk for monitoring.
- 3+ years of experience setting up multiple Kafka clusters across datacenters. Wrote producers/consumers for different services via Kafka brokers. Used Confluent to set up topic replication across multiple datacenters. Set up a Grafana dashboard/Kafka monitoring tool covering different metrics for Kafka producers/consumers and ZooKeeper services. Set up Kafka-Spark streaming for loading event data into the analytics platform.
- 9+ years of experience managing/working on Oracle databases with RAC, along with PL/SQL programming, disaster recovery, performance tuning, etc.
- 1+ year of experience supporting a warehouse environment of 300+ terabytes on Oracle Exadata X2 & X3.
- 5+ years of Python scripting and programming experience for Cassandra/Couchbase/Hadoop. Scripts include data comparison across multiple Couchbase datacenters, Couchbase/Cassandra data comparison with Oracle, data loading, searching Cassandra data by different criteria, and integration with the Hadoop system. MapReduce/Spark programming for a few analytics requirements.
- 2+ years of experience writing Python code with NumPy, pandas, scikit-learn, NLP, etc. for data filtering, and TensorFlow/Keras code for the RNN model.
- 2+ years of experience in core Java programming, code walkthroughs/unit testing, implementing REST APIs, and support using the Spring Boot/MVC framework. Supported project activities include log event workflow, DAO implementation for Couchbase/Cassandra, implementing utils, etc.
- 8+ years of shell scripting experience for automation, monitoring, deployment & alerting across different environments, i.e., Oracle/Couchbase/Cassandra/Hadoop, etc.
- 3+ years of working experience with CI using Git, SVN, Jenkins and Ansible for automation of maintenance work/jobs.
- 3+ years of experience working in an Agile/DevOps model, attending daily scrums, playing the scrum master role and using tools such as YouTrack, Jira, Rally, Service Center, 4DX, Kanban, etc.
- Enterprise Cloud Framework/Computing:
- 5+ years of experience with cloud frameworks, i.e., Amazon EC2, Amazon EMR, CloudWatch, S3, SQS/SNS, VPC, etc.
- 3+ years of experience supporting OpenStack/Steel private cloud for databases/servers.
- Worked on various assignments and initiatives covering data quality, anomaly detection and integration of data across various platforms, using open-source technology via continuous integration.
- 3+ years of experience building and leading teams in an onsite/offsite model.
- Worked with various teams on enterprise solutions related to Event as a Platform (EaaP) and Database as a Service (DBaaS).
- Presented on various topics related to NoSQL and Big Data analytics across IT platforms and provided training to juniors/peers on various technologies.
- Conducted a workshop on AI/deep learning with data scientists across Confidential applications.
WORK EXPERIENCE:
Data Engineer/Science
Confidential, West Chester, PA
Responsibilities:
- Currently working on/supporting multiple projects, i.e., Enterprise Datagrid & INSPIRE.
- Enterprise/Global Datagrid: Leading in-memory computing solutions, i.e., reducing direct biller API calls and improving performance and availability even when billing and other backend systems are not available.
Technology used: Couchbase (bidirectional multi-DC), Coherence, Kafka, Cloudera Hadoop, Spark Streaming, Spark ML, Java, Spring Boot, Maven/Gradle, REST APIs, Elasticsearch-Kibana, Grafana, Python.
Solution Architect
Confidential, Mount Laurel, NJ
Responsibilities:
- The Publish-Subscribe Platform handles the non-biller, pub/sub and notification parts of messaging/user calls.
- Worked as a Solution Architect providing DBaaS solutions for Cassandra and Couchbase, along with automation of the messaging platform.
- Worked on microservices/REST APIs for multiple consumers.
Technology/Environment: Couchbase, Cassandra, Elasticsearch, AWS SQS/SNS, Kafka, Spark Streaming, Spring Boot, Java, Python, etc.
Solution Architect
Confidential, New York, NY
Responsibilities:
- Worked as a Solution Architect to complete migration projects from Oracle to Cassandra and the Hadoop system.
- Automated various jobs related to backup/recovery, implementation, scaling, etc.
- Wrote multiple REST APIs per consumer demand.
Technology/Environment: Oracle 10g/11g R1/R2, Cassandra DSE 3.0.8/3.2.7, MongoDB 2.2.4, Apache Hadoop v2, Spark, CDH4, shell/Python, Java, Hive/Impala/HBase/Flume/Sqoop, APIs, etc.
Senior Technical Specialist
Confidential, San Diego, CA
Responsibilities:
- Provided DB Engineer support for multiple OLTP & warehouse (Exadata/Hadoop) platforms along with ETL operations, i.e., GoldenGate.
- Wrote code/scripts to integrate various platforms.
- Involved in many initiatives to automate tasks, and led a team of 6 offshore members while working as onsite coordinator.
Technology/Environment: Hadoop/Hive, MySQL, PL/SQL, Exadata X2/X3, Python, Oracle RAC 10g/11g, GoldenGate Director/Monitor/Veridata, Cassandra 2.1.2, etc.
Senior Technical Specialist
Confidential
Responsibilities:
- Supported their database/warehouse/ETL platforms (Oracle, RAC), GoldenGate and ETL tools.
- Also took the initiative to automate various day-to-day jobs/reports, etc., along with various disaster recovery plans and backup strategies.
Technology/Environment: Oracle, GoldenGate, RAC, shell scripting, Python, Cognos, etc.
Tech Lead DBA
Confidential
Responsibilities:
- Support involved complete management and day-to-day support of Oracle clusters along with some MySQL environments.
- Also worked on migration of data from OLTP to the warehouse environment, including writing and automating ETL jobs.
- For reporting, worked on performance tuning of Oracle scheduled reports/jobs.
- DB Engineer with PL/SQL programming on the following technologies: Oracle 8i/9i/10g/11g Database, RAC 10g/11g, Data Guard/standby database, disaster recovery, OEM Grid 10g/11g, SQL code and performance tuning, etc.
- Migration and automated data loading to various OLTP/warehouse platforms.
- Automation, deployment & monitoring improvement of DBA Tasks using shell scripts.
- Worked on implementing various reports using Cognos and Oracle BI tools.