Technical Lead Resume
Chatham, NJ
SUMMARY:
- I care about building software with robustness embedded from the start.
- Because of that, I try to stay as language/framework/cloud/tool agnostic as possible.
- I consider it imperative that DevSecOps (culture + tools) be implemented as robustly as possible.
- Hands-on documentation comes with that as well.
- I have been consulting Confidential in various capacities.
- My experience spans a wide spectrum of technologies, and I have worked and collaborated with many international teams. I am capable of building engineering teams and coaching them.
- My experience with Russian, South American, European, and Indian offshore teams gives me good exposure to all kinds of international working environments.
- I have worked with many different tier 1 and 2 firms in the consulting space as a contractor and as an employee.
SKILLS:
HTML/CSS/JavaScript, Node.js, TypeScript, MySQL, Linux
EMPLOYMENT HISTORY:
Technical Lead
Confidential, Chatham, NJ
Responsibilities:
- Served as a tech lead building data platforms that help the organization drive its digital transformation.
- Hands-on code development, plus mentoring junior programmers/data engineers through pair programming.
- Brought in Fivetran to replace the ETL architecture with an ELT architecture (a minimal sketch of the ELT pattern follows this list).
- Worked with advertising, music, CRM, and LOB data.
- Provided technical leadership to a strong team of data scientists, engineers, and analysts.
- Completed a POC of Argo Workflows on AWS, but we later decided not to adopt it.
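A minimal sketch of the ELT pattern referenced above: land the raw data in the warehouse first, then transform it in place with SQL (the role a managed connector such as Fivetran would play). The psycopg2 client, Redshift-style COPY, bucket, and table names are illustrative assumptions, not the actual platform.

```python
# Minimal ELT sketch (warehouse client, bucket, and table names are assumed for illustration).
import psycopg2

conn = psycopg2.connect(host="warehouse.example.com", dbname="analytics",
                        user="loader", password="***")

with conn, conn.cursor() as cur:
    # "EL": land the raw records untouched in a staging table
    # (in production a managed connector such as Fivetran performs this step).
    cur.execute("""
        COPY staging.ad_events
        FROM 's3://example-bucket/ad_events/2024-01-01/'
        IAM_ROLE 'arn:aws:iam::123456789012:role/loader'
        FORMAT AS JSON 'auto';
    """)
    # "T": transform inside the warehouse with plain SQL.
    cur.execute("""
        INSERT INTO analytics.daily_ad_spend
        SELECT campaign_id, event_date, SUM(spend)
        FROM staging.ad_events
        GROUP BY campaign_id, event_date;
    """)
```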
Confidential
Lead Software/Data Engineer
Responsibilities:
- Worked on POCs to replace Java with Kotlin.
- Applied expertise in quantitative business analysis, data mining, and data presentation to see beyond the numbers and drive enterprise-level outcomes.
- Implemented a full DevOps culture of build and test automation with continuous integration and deployment.
- DevOps automation: orchestration/configuration management and CI/CD tools (Jenkins, CircleCI, Atlantis, Puppet, Troposphere, Terraform, Serverless, etc.).
- Brought in Agile/Scrum/Kanban experience.
- Collaborated with analytics and business teams to improve data models that feed business intelligence tools, increasing data accessibility and fostering data-driven decision making across the organization.
- Collaborated with Project Managers, Product Managers, QA and Engineering teams to deliver results.
Confidential
Lead Software/Data Engineer
Responsibilities:
- End-to-end AWS SageMaker implementation for machine learning (Python): model performance metrics (ROC curves plotted) plus model performance monitoring and data accuracy checks (Kubeflow + Pachyderm).
- Created Flask apps behind AWS API Gateway to update model versions and accommodate new changes (see the serving sketch after this list).
- Developed data wrangling tools in Golang as a POC to evaluate whether a Python-to-Golang migration was feasible.
- Databricks was used heavily for scrubbing, transforming, and loading the data into a Postgres database.
- Did extensive structured streaming in Databricks (using Scala) for advertising data.
- Created SDKs for database access in Kotlin.
- Used Delta Lake to complement the data lake (built with Lambdas and S3 buckets as a single landing zone).
- Did POCs with Flink and Kafka Streams (using KSQL) as well.
- AWS Kinesis Data Firehose and Data Streams were used heavily as well.
- Snowflake data sharing, Time Travel, Snowpipe, and streams and tasks were used extensively.
- Developed and maintained scalable data pipelines and built out new API integrations to support continuing growth in data sources, volume, and complexity.
- Created operational support for all the AWS Step Functions-related architectures; some architectures use Step Functions, while others use Airflow (ETL pipelines) and Matillion.
- To accommodate artists from different regions, Elasticsearch was selected as the NoSQL option. The "Artist Search" program was built to create more visibility into the publisher/writer side of the music business. Used graph databases such as Neo4j to deduce relationships among artists and albums.
- Built dbt + Snowflake pipelines (with CI/CD) for transformations.
- Created a Salesforce + Snowflake integration from scratch.
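A minimal sketch of the Flask-behind-API-Gateway model-serving pattern referenced above; the model files, routes, and payload shapes are illustrative assumptions rather than the actual service.

```python
# Minimal Flask sketch of an API Gateway-fronted model service
# (model paths, routes, and payload shapes are assumed for illustration).
import pickle
from flask import Flask, jsonify, request

app = Flask(__name__)
model, model_version = None, None

def load_model(version):
    # Each model version is assumed to be a pickled artifact on disk.
    global model, model_version
    with open(f"models/churn_model_{version}.pkl", "rb") as fh:
        model = pickle.load(fh)
    model_version = version

load_model("v1")

@app.route("/model-version", methods=["PUT"])
def update_version():
    # API Gateway forwards the request; swap in the requested model version.
    load_model(request.get_json()["version"])
    return jsonify({"model_version": model_version})

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]
    score = model.predict_proba([features])[0][1]
    return jsonify({"model_version": model_version, "score": float(score)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```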
Lead Data Engineer
Confidential, New York City
Main skills: Golang, Python, Scala, Kotlin, Java, Spark (MLlib), Databricks, AWS/Azure/ Confidential, Snowflake cloud data warehouse, Redshift, BigQuery, Airflow, Glue, Spark on EMR, Flask, Informatica, Synapse
Responsibilities:
- Redshift ETL implementation (Python/Scala)
- Airflow (Python)/Dagster orchestration (a minimal DAG sketch follows this list)
- Database access SDKs (ORM-style) were created
- Argo Workflows
- Step Functions with Lambda
- Glue for ETL and scheduling (task- and time-level triggers)
- Matillion was also used to implement Redshift migrations (for bulk loading we used AWS Snowball); some of our customers relied on Talend ETL, so the whole pipeline we created was also ported to Talend.
- Spark (EMR + Databricks)
- Kinesis and SNS were used in the data lake setup.
- Glue crawlers were used heavily to update the Data Catalog.
- Athena and Redshift Spectrum were used for virtual data warehouses.
- Handled S3 buckets with IAM policies and bucket policies
- Data quality checks Confidential the S3 level and then the Redshift level
- CloudWatch Logs and alarms were used for continuous monitoring.
- dbt was introduced for SQL CI/CD (especially for ad hoc business queries).
- Implemented a full DevOps culture of build and test automation with continuous integration and deployment.
- DevOps automation: orchestration/configuration management and CI/CD tools (AWS CodePipeline + CodeBuild + CodeDeploy + CodeStar).
- Azure Databricks with Azure Data Factory were instrumental in implementing the architectures (Azure Data Lake with the Databricks Delta Lake).
- Used Cosmos DB and Azure Blob Storage as well.
- Azure Data Factory (V2) was combined with Databricks to create the desired data flow.
- Azure Functions and Logic Apps were implemented in Node.js.
- DevOps automation on Azure: orchestration/configuration management and CI/CD tools (Azure Pipelines + Repos + Test Plans).
- Used Bigtable as an app backend.
- BigQuery as a Redshift replacement.
- Created data pipelines with Google Dataflow bridging GCS buckets, Pub/Sub, and BigQuery.
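A minimal sketch of the Airflow orchestration referenced above; the DAG id, schedule, and task callables are illustrative assumptions, not the production pipelines.

```python
# Minimal Airflow DAG sketch (DAG id, schedule, and task functions are assumed for illustration).
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_to_s3():
    # Pull the day's records from the source system and land them in S3.
    ...

def load_to_redshift():
    # COPY the landed files from S3 into Redshift staging tables.
    ...

with DAG(
    dag_id="daily_ads_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_to_s3", python_callable=extract_to_s3)
    load = PythonOperator(task_id="load_to_redshift", python_callable=load_to_redshift)
    extract >> load
```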
Lead Software/Data Engineer
Confidential
Main skills: Golang, Python, Scala, Kotlin, Java, Spark (MLlib), Databricks, AWS/Azure/ Confidential, Snowflake cloud data warehouse, Redshift, BigQuery, Airflow, Glue, Spark on EMR, Flask, Informatica, Synapse
Responsibilities:
- Advised on and participated in building the complete data analytics platform in Golang.
- K8s/Docker were used heavily for the implementation.
- API management.
- Created database SDKs with Kotlin
- Argo Workflows
- Built Data Pipelines for predictive science.
- Implemented a full DevOps culture of build and test automation with continuous integration and deployment.
- DevOps automation: orchestration/configuration management and CI/CD tools (AWS CodePipeline + CodeBuild + CodeDeploy + CodeStar).
- Created productionized systems for pricing algorithms.
- Crafted retailer recommendation algorithms from dev to prod.
- Part of the data versioning and data provisioning team as well, mostly dealing with line-of-business data such as POS and ERP data (Nike is an SAP shop).
- Structured forecasting was done for the customer supply chain group across several avenues.
- Good familiarity with MDM (master data management), since I was working to blend social media data into it.
- All the algorithms were scaled with Spark clusters.
- Spark (Lambda architecture: batch + streaming)
- AWS Glue was also used for ETL purposes with PySpark
- Hive/retail analytics was done Confidential scale, as mentioned earlier.
- Worked on both Databricks and Qubole environment setup.
- Data lake with Snowflake ETL implementation (used with Databricks and Tableau)
- Airflow
- For real-time dashboards, I used Kafka, Storm, and Clojure with Python.
- Used MLeap/PySpark/Pachyderm to demonstrate how release and deployment work for data science teams.
- Sentiment analysis tool (Amazon, Best Buy reviews, etc.) written entirely in R and Python
- Compared our own algorithm against many alternatives such as Stanford CoreNLP.
- Implemented a continuous learning algorithm
- Using packages such as gensim, I explored topic modeling (a minimal sketch follows this list).
- Collaborative filtering was also used extensively to derive cohort groups.
- In-depth knowledge of probability and statistics, including experimental design, predictive modeling & optimization
- As a member of the microservice architecture team, created Go microservices from the ground up (Revel, go-kit, and fasthttp)
- Worked on a game-theory-based key management service
- Brought intelligence back to the back end of the key-protect services by automating Kubernetes with control theory and queueing theory.
- Part of the data versioning, data provisioning, and data governance teams as well.
- Explored Pachyderm in depth
- NiFi and Atlas were explored, but since this was a Golang project, we went with Pachyderm.
- Was involved with code coverage tools to guide test design; statistical analysis tools were used extensively to eliminate code smells
- Architected/installed/implemented IBM BigFix and QRadar solutions in customer environments
- Spark was used for ETL purposes and streaming applications.
- Designed, developed, and maintained Mongo databases.
- Created functions, extracted data, and created indexes to perform various data extraction and data mining activities
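A minimal sketch of the gensim-style topic modeling exploration referenced above; the toy corpus and parameter choices are illustrative assumptions, not the production review data.

```python
# Minimal gensim topic-modeling sketch (toy corpus and parameters are assumed for illustration).
from gensim import corpora, models

# Tokenized review snippets standing in for the real review corpus.
texts = [
    ["battery", "life", "is", "great"],
    ["screen", "cracked", "after", "a", "week"],
    ["great", "sound", "poor", "battery"],
]

dictionary = corpora.Dictionary(texts)                 # word <-> id mapping
corpus = [dictionary.doc2bow(text) for text in texts]  # bag-of-words vectors

# Fit a small LDA model and print the discovered topics.
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, passes=10)
for topic_id, topic in lda.print_topics():
    print(topic_id, topic)
```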
Confidential
Developer
Responsibilities:
- Created machine learning scripts and deployments in Python, Golang (data wrangling), and Scala (in both functional and object-oriented styles).
- Worked extensively with audio files, video files, and images.
- Protocol Buffers and Thrift were used extensively for data serialization, so the programs were adapted to accommodate the ser/deser process (a minimal sketch follows this list).
- Built a training client and data broker for data integration.
- Azure HDInsight, Azure Event Hubs, Azure storage services, Azure SQL, and Azure data warehouses as part of the Cortana Intelligence Suite.
- Information management with Azure Data Factory, Spark ETL (DStreams used heavily), and Data Catalog
- Used Pachyderm extensively for data versioning (explored NiFi)
- Created FRP microservices from scratch for the validation and security module of machine learning applications.
- Spring Boot (Java) was used extensively alongside Flask (Python)
- Infrastructure, platform maintenance & administration
- Created the Kubernetes deployments (vanilla & PaaS)
- Network segmentation.
- Golang/Scala SDKs were used as well.
- Both gRPC and REST APIs were created for the whole platform.
- Created Docker containers while managing release and deployment management tasks.
- Docker optimization.
- Worked with Helm to create a Galera cluster (MariaDB)
- Created the ELK stack from the ground up.
- Chef was used for infrastructure automation.
- Habitat was explored for application automation, but we stayed with containerization.
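A minimal sketch of the Protocol Buffers ser/deser step referenced above; the `TrainingSample` message and the generated `training_pb2` module are hypothetical stand-ins for the actual schema.

```python
# Minimal protobuf ser/deser sketch; `training_pb2` and the `TrainingSample`
# message are hypothetical, generated by `protoc` from a .proto definition such as:
#
#   message TrainingSample {
#     string id = 1;
#     repeated float features = 2;
#   }
#
import training_pb2

# Serialize: build a message and encode it to bytes for transport/storage.
sample = training_pb2.TrainingSample(id="sample-001", features=[0.1, 0.7, 0.2])
payload = sample.SerializeToString()

# Deserialize: decode the bytes back into a message on the consuming side.
decoded = training_pb2.TrainingSample()
decoded.ParseFromString(payload)
print(decoded.id, list(decoded.features))
```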
Confidential
Lead Software/Data Engineer
Responsibilities:
- Reported directly to the Chief Information Architect on creating speech recognition and other analytics solutions for Citi.
- Spark/Scala lead for the analytics data infrastructure, implementing machine learning and deep learning algorithms. Golang was used for data wrangling (a compiled language fits this purpose better than Python).
- Scala was used for data wrangling + data science
- Data versioning was done with Scala (via Pachyderm).
- The microservices were created with the Play framework (sometimes Akka HTTP was used)
- Data migration to RDS and Redshift
- APIs were created with the Django framework (a minimal sketch follows this list)
- We handcrafted these algorithms from first principles using Scala (scalaz and cats were used to bring back the category theory feel)
- Involved in all kinds of activities associated with the EAP (Enterprise Analytics Platform).
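A minimal sketch of the Django-based API pattern referenced above, shown as view/URL fragments; the view name, route, and response shape are illustrative assumptions rather than the actual endpoints.

```python
# Minimal Django API sketch (view name, route, and payload are assumed for illustration).
# views.py
from django.http import JsonResponse

def transcription_status(request, job_id):
    # In the real service this would look the job up in the analytics store.
    return JsonResponse({"job_id": job_id, "status": "completed", "confidence": 0.93})

# urls.py
from django.urls import path

urlpatterns = [
    path("api/transcriptions/<str:job_id>/", transcription_status),
]
```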
Confidential
Lead Software/Data Engineer
Responsibilities:
- Azure HDInsight, Azure Event Hubs, Azure storage services, Azure SQL, and Azure data warehouses were explored as part of the Cortana Intelligence Suite to support the data science team.
- Architected and created executive dashboards from scratch.
- Created web applications that were essential for operational services.
- Data pipelines were created while implementing parsers for various websites.
- Forum, social feed, and touch commerce feeds are some of the API feeds we worked on.
- D3.js, dc.js/Crossfilter.js, and other JavaScript libraries were used here.
- Part of an in-house MVC framework akin to Play/Spring/Django but written completely as a Confidential script.
- Golang was used for data wrangling + data science
- Data versioning was done with Golang (via Pachyderm).
- The microservices, with a front end, were created with Golang
- We explored gRPC
- We orchestrated the microservices with Kubernetes (27 nodes)
Data Engineer
Confidential, Newark, CA
Main skills: Golang, Scala, Spark (Databricks + EMR), Python, Java, AWS, Redshift, Flask
Responsibilities:
- Web/RESTful applications for external/internal data for the whole company.
- Spring Boot microservices were created from scratch
- Built an anomaly detection engine with Spark, Akka, and Cassandra (a minimal sketch follows this list)
- Retail analytics (data science applications were created from scratch)
- Analytics platform for the IoT product portfolio
- Created communication channels and organized Jira activity among teams to make the dev-to-prod transition possible.
- Implemented a full DevOps culture of build and test automation with continuous integration and deployment.
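A minimal sketch of the anomaly detection idea referenced above, shown in PySpark for brevity; the column names, input path, and the 3-sigma z-score rule are illustrative assumptions (the production engine used Spark with Akka and Cassandra).

```python
# Minimal anomaly-detection sketch in PySpark (z-score on a sensor metric;
# input path, column names, and the 3-sigma threshold are assumed for illustration).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("anomaly-sketch").getOrCreate()

readings = spark.read.parquet("s3://example-bucket/iot/readings/")  # assumed path

# Per-device mean and standard deviation of the metric.
stats = readings.groupBy("device_id").agg(
    F.mean("value").alias("mu"),
    F.stddev("value").alias("sigma"),
)

# Flag readings more than 3 standard deviations away from the device mean.
anomalies = (
    readings.join(stats, "device_id")
    .withColumn("zscore", F.abs((F.col("value") - F.col("mu")) / F.col("sigma")))
    .filter(F.col("zscore") > 3)
)

anomalies.show()
```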