Sr Site Reliability/DevOps Engineer Resume
Waltham, MA
EXPERIENCE SUMMARY:
- 10+ years of IT experience with exposure to programming, system deployments, network administration and operations. I have worked on Python, Jenkins API, Ruby, Perl/Shell, Oracle, MySQL, Core Java, Linux and Apache, with a good understanding of the Software Development Life Cycle. I am a self-starter and a hardworking, multitasking person who can work in a team and help it become highly effective. I have recently worked on technologies like Apache Storm, Docker Image, Perl CGI, Splunk, New Relic, Python and Jenkins API.
- Experience of development in Core Java/Oracle using Eclipse IDE
- Experience of code versioning system (Git, Perforce and SVN)
- Exposure to continuous build and integration tools.
- Exposure to software build and deployment tools like Jenkins, Opensource based Software Factory tool, Confidential Internal CI/CD tools.
- Exposure to Atlassian DevOps software tools including Jira, BitBucket, Stash, Git repository, Confluence.
- Exposure to SonarQube for code coverage reporting.
- Strong background in Linux environment and managing deployments on Linux.
- Scripting knowledge in Python, Ruby on Rails, Perl/Shell over Linux based systems with Oracle as backend. Significant contributions in the form of scripts/automation, especially using Python, Perl and Oracle, to help reduce operational burden.
- Used Jenkins API, Python and Oracle to make continuous build run in parallel.
- Development experience in LAMP (RedHat Linux, Apache, MySQL, and PHP/Perl) technologies.
- Exposure to handling production systems running on Apache Storm multinode environment, which was co-ordinated and monitored by Zookeeper and supervisord.
- Experience of working on Distributed Systems. I have handled customer issues related to Confidential Services (RDS) on Confidential Cloud Network.
- Have been a single point of contact for tools, ensuring the services are healthy, keeping the knowledge base updated, configuring health monitors for the service.
- Experience of using trouble ticket tracking systems like Remedy, ServiceNow.
- Managed Root Cause Analysis program within team to help drive reduction of recurring operational issues.
- Follow Agile Methodology to plan my work. Have experience using both Bugzilla and Jira.
- System installation and configuration experience for OS, Database, Web server, etc.
- Used ETL manager (Data warehousing) software that helps extracting data from source databases to be loaded to team databases.
- Used Confidential internal tool to monitor system health and performance metrics to identify and report bottlenecks. I also participate in analyzing the sudden spikes and report the root cause for the same.
- Exposure to other monitoring tools like Splunk and New Relic.
- Setting up DNS and configurations such as adding/removing virtual hosts and LAN connectivity
- Network configuration exposure, and working knowledge of configuring iptables and Squid
- RHEL5 migration - Applications working on RHEL3 were migrated to RHEL5
- Application migration - For enhanced performance and better application throughput, applications were migrated to a better server.
- Handled Infrastructure and NFS related queries and issues
- Experienced in understanding, troubleshooting and providing root-cause fixes for technical issues experienced by customers.
- Identify bottlenecks using tools and metrics, then make improvements or work with development teams to fix them.
- Multi-tasking even in a fast-moving environment and strong customer focus.
- Have experience with, and an instinct for, automating repetitive manual tasks to help reduce operational burden.
- Develop useful dashboards to collect statistical data helpful for the team
- Actively create/update team SOPs
- co-workers and end-users and also involved in recruitment.
- Perform effectively under high-pressure situations.
TECHNICAL SKILLS:
Scripting: Python, Ruby on Rails, Perl, PHP, Shell, JavaScript and Greasemonkey
Languages & Tools: Java, C
Database: Oracle, MySQL, Postgres, AWS Redshift and SQLServer
Web/Application Server: Apache Tomcat, Apache Webservers, IHS WAS, Apache Storm
Operating System: RedHat Linux, Windows
Linux Administration: Squid, iptables, LAN configuration, DNS and Samba Server
Build Tools: Jenkins, RunDeck, Puppet, Chef, Nexus, Software Factory, Atlassian Software Tools
AWS: EC2, RDS, S3, Redshift, VPC and Route 53
Cloud Computing: OpenStack, Ceph, Docker Image
Other: Jakarta Ant, XML, Eclipse
Code Versioning: Git, Stash, SVN, CVS, Perforce
PROFESSIONAL EXPERIENCE:
Sr Site Reliability/DevOps Engineer
Confidential, Waltham MA
Responsibilities:
- Use Atlassian software tools for functioning in Agile environment. Git repositories, Stash and Software Factory/Workbench are integrated together for CI/CD.
- Integrated SonarQube code coverage reporting.
- Used Jenkins API, Python and Oracle to make the continuous integration builds run in parallel.
- Modified existing Jenkins jobs and created new ones in the Jenkins UI to complete the automation.
- Added new nodes to the environment.
- Used the console reporting for troubleshooting build related issues.
- Do general Linux system administration: configuration, installs, automation, monitoring, etc. This is the easy part.
- Get involved in every part of the Amadeus site, from the earliest stage of product design and development to deployment, troubleshooting, and performance analysis
- Design and build tools to manage a rapidly growing number of servers and services
- Be a release engineer, and manage the development workflow from the desktop to production
- Participate in a periodic on-call rotation
- Research open-source continuous build and integration tools in order to integrate existing Amadeus applications with the new tool.
Sr Site Reliability Engineer
Confidential, Woodland Hills CA
Responsibilities:
- Create and/or update standard operating procedures used by the team for troubleshooting production issues.
- Responsible for monitoring the health of payments-related platform services. We rely on monitoring tools like Splunk to alert us automatically about an issue. The platform services are based on Apache Storm, Apache Tomcat and Apache Webservers, REST APIs and Oracle Databases.
- Monitor the health of the Apache Storm multinode environment, which was co-ordinated and monitored by Zookeeper and supervisord.
- Identify gaps in monitoring of our services health and create and/or update our monitors.
- Created and shared with the team a Splunk dashboard to monitor critical services that the team supports.
- Mentor new members joining the team to make them familiar with our system.
- Troubleshoot issues down to the root cause, which involves deep dives into logs and relevant data in the database(s).
- Create automations for recurring manual tasks to reduce operational pain.
- Built a dashboard using Perl CGI, JSP and Oracle on Apache web-server to monitor the end to end data flow for daily funding.
- The team used to dig through data manually to monitor ban control related data. I built scripts to monitor ban control data automatically and notify the team.
- Built a script to send the daily Risk Case processing statistics.
- Built a script to monitor daily internal funding stats.
- Report bugs to development teams for recurring issues using Jira.
- Manage the development workflow from QA to production.
- Used Puppet to monitor application specific services and deploy configuration files.
- Used Jenkins for weekly deployment of services.
- Work with Developers in design and review for new features and products.
Sr. Support Engineer
Confidential
Responsibilities:
- Handle customer-impacting production issues that involve extensive troubleshooting, debugging, fixing and root cause identification.
- Create/update standard operating procedures used for troubleshooting within the team.
- Interview prospective candidates and mentor new members of the team.
- Analyze and monitor performance metrics and trends to help identify bottlenecks and service health using tools like Nagios.
- Troubleshoot Infrastructure related issues
- Identify areas of improvement for services, tools, or procedures to reduce contacts and drive down manual tasks.
- Report bugs for recurring issues using Jira (similar to Bugzilla).
- Work with Developers in design and review for new features and products.
- Deploy production-ready code and perform validations once the code is deployed.
- Perform database level troubleshooting by executing SQL statements.
- Build automated tools to reduce operational pain.
Sr. Support Engineer
Confidential
Responsibilities:
- Automated and enabled code versioning for the entire set of service health monitor creation, replication and audit using the Internal Monitoring APIs, Ruby, Perl, JSON and Oracle.
- Built dashboard to gather useful statistics that helps the entire team determine the major pain points and focus areas for improvements. The dashboard also tries to give a broad overview of efforts being put in by Operations team vs Development team. Technologies used were Ruby on Rails, AWS Redshift, XML and Oracle.
Support Engineer
Confidential
Responsibilities:
- Out of several improvements, two of the major contributions for the team were:
- Reduced the processing time for each report from 3 minutes to 5 seconds. The change saved the Accounting Team several hours during the month-close process. The improvement was highly appreciated not only by my Management but by our customers too.
- Developed a Queue Monitoring tool and Daemon Monitoring Tool using Perl and Oracle for the scheduling software used by the team to manage the execution of daily/weekly/monthly reports. The scheduling software used queues to manage resource contention issues. We managed queues manually until the tool was developed. This was a big win overall for our team, as a major chunk of manual monitoring was eliminated as a result.
Software Engineer
Confidential
Responsibilities:
- Actively involved in development of projects for Number Translation System (NTS) and SIP-IX. The technologies used were Core Java, JUnit, Oracle, MySQL, Shell Scripts and Linux.
- Interacted with the onsite team regarding bug fixing, development of newer modules and understanding the requirements.
- Code and maintain the versions, test the application at functional and integration level, and then deploy the modules on the production server in co-ordination with the onsite team.
- Built an onsite test environment, which involved setting up a MySQL server, because testing from a remote server caused extensive lag and slowed down the overall development process.
- Technical Support for Scalix (Linux based mailing server)
- Provide technical support for customer-impacting production issues related to the mail server.
- Identify, report, verify, and follow up on bugs for the software using Bugzilla.
- Received for the product from onsite group in San Jose CA
- Mentor new members in the team and update team standard operating procedures.
- Technologies used were sendmail, shell scripting, Linux, DNS, Active Sync
Software Engineer
Confidential
Responsibilities:
- Bharti Multi-Lingual Message Server
- Worked on development of a multi-lingual message server based on IMP and the Horde Framework, including the applet designed for multi-lingual message support. Technologies used in the project included Java and IMP (based on the Horde Framework).
- Deployed and actively involved in customization and setup of Multi-Lingual Message Server at multiple locations including Department of IT Sikkim and BITS Mesra.
- Imparted to the users and administrators of the software.
- Network Setup for Indian National Science Academy (INSA)
- Worked actively on setting up Squid, iptables, LAN configuration and a DNS server, and installed and configured a Samba Server.
- SMS Notifier, a module for Netram Motion (A product of ESCL)
- The module was responsible for sending instant alerts of detected motion to the mobile phones of security personnel in the form of an SMS, via a mobile phone connected to a computer through a serial cable. The project used javax communication APIs to send SMS using the AT (Attention) Commands.
- Message monitoring for eXpert Notification Server (XNS)
- Developed a message monitoring system for one of the critical products, eXpert Notification Server (XNS). The project was developed to track the path of message delivery end to end, from the XNS server to its customer. The software was also designed to report lost or undelivered messages. The project was developed in C, and the interface was built using the Tcl/Tk scripting language in a Linux environment.
