DevOps Site Reliability Engineer Resume
SUMMARY
- Architect professional with 15+ years of experience. Background skills include AWS, Big Data, DevOps, Virtualization, Storage, Cloud Technology, and Networking.
TECHNICAL SKILLS
- Oracle 10g/11g/12c, MS SQL Server 2005/2008/2012/2016, DB2 9.7/10.1
- ESX 3.5, vSphere 4.0, vCenter 5.0/6.0, Workstation 6.0/6.5/7.0
- AWS, Azure
- UNIX (Solaris, SGI, HP-UX, AIX), Linux (Red Hat, CentOS, Ubuntu), AWS, Cloudera, Windows Server 2008 R2/2012/2016, OpenStack, Puppet, Hadoop, EMC Isilon, EMC VNX, NetApp, HTML, Internet Explorer, Netscape Navigator, McAfee, Heat Management Tools, Lotus Notes, Pine, Hummingbird Exceed, NetBackup, VMware, Citrix XenApp/XenDesktop, Cisco Nexus, Active Directory, security scanning tools, Varonis, Linux server and workstation administration, HPC administration, NAS and SAN administration and engineering, system security administration and engineering, system integration, data migration, data center management, data center migration, big data management, Tanium Foundation, Operations, & IR Deep Dive
PROFESSIONAL EXPERIENCE
Confidential
DevOps Site Reliability Engineer
Responsibilities:
- Deploy cloud infrastructure (Security Groups and load balancers needed to support EBS environment)
- Create and manage Continuous integration builds on VSTS
- Responsible for maintaining AWS instances as part of EBS deployment
- Developed business logic using Python
- Provided support on AWS services and DevOps deploying applications in AWS to help take full advantage of the AWS platform
- Develop serverless applications on AWS (Lambda, ECS, SNS/SQS/Kinesis, RDS, DynamoDB)
- Develop microservice applications using Java
- Supported and worked with both relational and NoSQL databases
- Configure, test, deploy, and upgrade software for production EC2 servers in AWS
- Lead initiatives for automating and scaling our systems
- Participated in technical architecture design
- Improve system security, reliability, and performance
- Administer, monitor, and deploy cloud-based systems
- Collaborate with application engineers to design robust systems
- Take ownership of infrastructure projects and internal tools
- Apply automated testing approaches through CI/CD
- Communicate and collaborate with product managers, engineers, stakeholders, etc.
- Deploy, automate, maintain, and manage AWS cloud-based production systems to ensure their availability, performance, scalability, and security.
- Establish, maintain and evolve concepts in continuous integration and deployment (CI/CD) pipelines for existing and new services.
- Ensured security compliance with appropriate NIST and ICD requirements
- Assisted in the architecture, design, and implementation of the AWS public cloud build, and led the build itself (connectivity, network, security, containerization, monitoring)
- Provided guidance on security configurations and risk and compliance procedures (Identity Management, Network Configuration, Data Protection, Segregation of Duties)
- Work with in-house cloud security experts to implement a security framework that satisfies ISO standards for implementing cloud solutions in public clouds
- Work closely with product and platform teams to engineer and implement cloud security controls
- Design and implement Azure/cloud-based DevSecOps processes and tools
- Manage patch automation and security hardening for Azure infrastructure
- Deploy security automation services such as Puppet, Chef, and/or Terraform
- Secure microservices and harden containers
- Build automation/infrastructure as code to enforce cloud infrastructure security
- Operationalize tools to strengthen cloud security posture, e.g. cloud infrastructure scan tools, firewall scan, network scan, host scan, and vulnerability management tools.
- Roll out security infrastructure such as central logging, IAM roles, and SIEM tools.
- Manage and create cloud accounts for both AWS commercial and AWS GovCloud as defined by the government customer, and keep them in compliance.
- Manage day-to-day security operational tasks such as security event monitoring, log monitoring and security incident management, compliance monitoring, data loss prevention, and monitoring and responding to emerging threats varying from endpoint to server to public cloud system.
- Perform ongoing vulnerability assessments including vulnerability scanning and vulnerability exploit testing (penetration testing) with clear reporting, threat identification and action plans for remediation with prioritization. This will also include any assessments for changes that the security team has identified as requiring a vulnerability assessment prior to release
- Drive the shared responsibility model to roll out security compliance infrastructure, working with various sub-groups.
- Assist with the development, implementation, and administration of Cloud security awareness training for the enterprise.
- Designed, configured, and deployed Microsoft Azure for a multitude of applications utilizing the Azure stack (including Compute, Web & Mobile, Blobs, Resource Groups, Azure SQL, Cloud Services, and ARM), focusing on high availability, fault tolerance, and auto-scaling
- Configured SQL Server Master Data Services (MDS) in Windows Azure IaaS
- Manage different Azure environments for provisioning Linux servers and services executed by the providers.
- Designed and configured Azure Virtual Networks (VNets), subnets, Azure network settings, DHCP address blocks, DNS settings, security policies, and routing.
- Deployed Azure IaaS virtual machines (VMs) and Cloud services (PaaS role instances) into secure VNets and subnets.
- Design VNets and subscriptions to conform to Azure Network Limits.
Confidential, Hanover, MD
Sr Systems Engineer
Responsibilities:
- Deploy, monitor, and maintain AWS GovCloud infrastructure consisting of multiple EC2 nodes in a rapidly changing R&D environment.
- Created Windows and Linux desktops using Amazon WorkSpaces
- Set up Amazon WorkSpaces available in different Regions
- Provided access to high performance cloud desktops wherever the teams needed work done
- Manage global deployments of customers' WorkSpaces from the AWS console.
- Worked with Jenkins to automate orchestration and incident response
- Provision and de-provision desktops as needed when the customer's workforce changes.
- Launch AWS EC2 cloud instances using Amazon Images (Linux/Ubuntu) and configure launched instances for specific custom applications.
- Designed Splunk Enterprise 6.5 infrastructure to provide high availability by configuring clusters across two different data centers.
- Worked on Microsoft Azure (public) cloud to provide IaaS support to the client. Created virtual machines through PowerShell scripts and the Azure Portal.
- Managed and created Storage Accounts and Affinity Groups in the Azure Portal.
- Captured images of virtual machines. Attached disks to virtual machines. Managed and created virtual networks and endpoints in the Azure Portal.
- Deployed VMs, storage, networks, and Affinity Groups through PowerShell scripts.
- Created storage pools and disk striping for Azure virtual machines. Backed up, configured, and restored Azure virtual machines using Azure Backup.
- Configured a Windows Failover Cluster by creating a quorum for file sharing in Azure.
- Validate and stress-test multiple servers hosting custom software applications.
- Created proper documentation for new server setups and existing servers.
- Automate build and release management process, monitor all changes between releases.
- Maintained Git and Bitbucket repositories, handling branching, merging, tagging, and release activities.
- Manage multiple AWS instances, security groups, Elastic Load Balancers, and AMIs.
- Provided authenticated access to AWS resources using Multi-Factor Authentication (MFA).
- Created and managed users, accounts, roles, groups, and policies using Identity and Access Management (IAM).
- Design and development of Continuous Integration Process and deployment of Internet, Intranet and Client/Server business applications.
- Installed, Configured, Maintained, Tuned and Supported Splunk Enterprise server 6.x/5.x.
- Architected and implemented Splunk solutions in highly available, redundant, distributed computing environments.
- Performed Field Extractions and Transformations using the RegEx in Splunk.
- Installed, configured, and administered Splunk Enterprise on Linux and Windows servers.
- Supported the upgrade of Splunk Enterprise server and Splunk Universal Forwarder from 6.5 to 6.6.
- Installed and implemented the Splunk App for Enterprise Security, documented best practices for the installation, and performed knowledge transfer on the process.
- Worked on installing Universal Forwarders and Heavy Forwarders to bring any kind of data fields into Splunk.
- Write Splunk queries; experienced in searching, monitoring, analyzing, and visualizing Splunk logs.
- Design, optimize, and execute Splunk-based enterprise solutions.
- Installed and configured Splunk Universal Forwarders on both UNIX (Linux) and Windows Servers.
- Worked on customizing Splunk dashboards, visualizations, configurations using customized Splunk queries.
- Monitored the Splunk infrastructure for capacity planning, scalability, and optimization.
- Configured and supported Splunk DB Connect for real-time data integration between Splunk Enterprise and other databases.
- Responsible for Splunk Searching and Reporting modules, Knowledge Objects, Administration, Add-Ons, Dashboards, Clustering, and Forwarder Management.
- Monitored license usage, indexing metrics, Index Performance, Forwarder performance, death testing.
- Splunk Architecture/Engineering and Administration for SOX monitoring and control compliance.
- Design and implement Splunk Architecture (Indexer, Deployment server, Search heads, and Forwarder management), create/migrate existing Dashboards, Reports, Alerts, on daily/weekly schedule to provide the best productivity and service to the business units and other stakeholders.
- Involved in standardizing Splunk forwarder deployment, configuration and maintenance across UNIX and Windows platforms.
- Worked with and provided needed information to the Security Operations Center, Global Security Operations Manager, Global Security Operations Specialists and the Global Security Investigations and Intelligence Team to anticipate, identify and evaluate global risks that carry a significant risk to the enterprise
- Work with various version control systems like Subversion, and GIT and used Source code management client tools like Stash, SourceTree, Git Bash, GitHub, Git GUI and other command line applications.
- Work on cloud automation using AWS CloudFormation templates.
- Build & Release automation framework designing, Continuous Integration and Continuous Delivery, Build & release planning, procedures, scripting & automation. Good at documenting and implementing procedures related to build, deployment and release.
- Monitor and track Security Information and Event Management (SIEM) within the customer datacenter using various software tools and applications
- Work with Jenkins for automation, orchestration, and incident response with the Security Operations Center's cloud monitoring team
- Stand up and administer Kubernetes clusters on-premises and on AWS.
- Ensure optimum performance, high availability, and stability of solutions, and ensure the container orchestration platform (Docker/Kubernetes) is regularly maintained and released to production without any downtime
- Increase the effectiveness, reliability and performance of container orchestration platform (Docker/Kubernetes) by identifying and measuring key indicators, making changes to the production systems in an automated way and evaluating the results
- Ensure that the container orchestration platform (Docker/Kubernetes) is maintained properly by measuring and monitoring availability, latency, performance and system health.
- Assist development teams to migrate applications to Docker based PaaS platform
- Build Chef Server (set up, run, and maintain), Cookbook creation, Chef Environment Maintenance, & Version pinning.
- Utilize Jenkins for release management and assistance with CI/CD processes.
- Responsible for automation, virtual networking/security, and access in AWS Cloud Services. Provide DevOps and systems engineering work with AWS services (EC2, RDS, Redshift, etc.) and frameworks such as Chef
Confidential, Woodlawn, MD
Sr Systems Engineer
Responsibilities:
- Automate and manage our AWS infrastructure and deployment processes, including production, test and development environments.
- Installed and deployed Windows and Linux desktops using Amazon WorkSpaces
- Set up various desktop OS instances with Amazon WorkSpaces available in different Regions
- Monitor Azure infrastructure through System Center Operations Manager (SCOM).
- Moderated and contributed to the support forums (specific to Azure Networking, Azure Virtual Machines, Azure Active Directory, and Azure Storage) for Microsoft Developer Network, including Partners and MVPs.
- Provided consulting and cloud architecture for premier customers and internal projects running on Microsoft Azure platform for high-availability of services, low operational costs.
- Handle escalated support tickets through to closure for the MS Azure IaaS platform.
- Design VNets and subscriptions to conform to Azure Network Limits.
- Exposed Virtual machines and cloud services in the VNets to the Internet using Azure External Load Balancer.
- Provided high availability for IaaS VMs and PaaS role instances for access from other services in the VNet with Azure Internal Load Balancer.
- Implemented high availability with Azure Classic and Azure Resource Manager deployment models.
- Provided access permission to high performance cloud desktops wherever the teams needed work done
- Manage large global enterprise deployments of customers' WorkSpaces from the AWS console.
- Provision and de-provision desktops as needed when the customer's workforce changes.
- Launch AWS EC2 cloud instances using Amazon Images (Linux/Ubuntu) and configure launched instances for specific custom applications.
- Performed and advised on log analysis and the use of IDS, IPS, and/or other signature technology. Led teams that manage and maintain the log management and threat analysis solution
- Automate infrastructure using Terraform and Ansible
- Work with Jenkins for automation, orchestration, and incident response with the Security Operations Center's cloud monitoring team
- Develop, Maintain and support Continuous Integration framework based on Jenkins
- Develop Jenkins Pipelines and configure builds with the suite of Jenkins features and plugins to implement continuous delivery pipelines that automate the customer's process for getting software from source control through deployment to end users.
- Lead the development of innovative service solutions for Azure cloud service offerings
- Used Ansible and Ansible Tower as configuration management tools to automate repetitive tasks, quickly deploy critical applications, and proactively manage change.
- Wrote Python Code using Ansible Python API to Automate Cloud Deployment Process.
- Set up complete CI/CD pipelines
- Automate instance scheduling using Lambda, CloudWatch, S3, and RDS services in AWS
- Edit and repurpose WordPress plugins to meet customers’ needs in AWS
- Write and extend WordPress plugins in AWS
- Developed procedures to unify, streamline, and automate application development and deployment with Linux container technology using Docker Swarm.
- Worked in all areas of Jenkins setting up CI for new branches, build automation, plugin management and securing Jenkins and setting up master/slave configurations.
- Involved in deploying systems on Amazon Web Services infrastructure services: EC2, S3, RDS, SQS, and CloudFormation.
- Manage the Azure environments Network Design and Infrastructure Setup using Azure Services for both Development and Production systems.
- Build AWS-based services supporting production SaaS platform including web applications and data analytic services
- Provided leadership in developing innovative service capabilities for Azure Cloud and in managing Azure capability development projects. Plan, configure, optimize, and deploy Microsoft Azure solutions (IaaS, PaaS, VMs, AD, Automation, Monitor, etc.)
- Migrate existing on-premises services to an AWS cloud infrastructure.
- Build and maintain Docker container clusters managed by Kubernetes on GCP, using Linux, Bash, Git, and Docker. Utilized Kubernetes and Docker as the runtime environment of the CI/CD system to build, test, and deploy.
- Responsible for design and implementation of the Codex is Network and server infrastructure.
- Duties as Sr Engineer include firewall, switch, and router configuration and maintenance
- Secured, configured, and locked down Hadoop multi-tenant data sets, granting users access to resources based on each user’s unique needs.
- Work with OS and application teams to ensure client service success.
- Performed Vulnerability Assessment & Penetration Testing on the infrastructure on AWS for security.
- Installed, configured, and maintained Key Trustee Server with Apache Sentry on the current AWS cloud.
- Responsible for auditing and tracking usage across multiple tenants and multiple clusters.
- Build a technical and security architecture in Azure for the selected apps/workloads
- Lead compliance assessments and application portfolio assessment with the customer on designed Azure architecture
- Select a migration approach to lift and shift the workloads to Azure or architecting a greenfield development and/or production platform for new applications
- Configured, supported, and monitored Key Trustee Server with Apache Sentry within customers' offsite datacenter environments.
- Configured HDFS so that data read from and written to HDFS directories is transparently encrypted and decrypted without requiring any changes to user application code.
- Configured encryption layers in traditional data management software/hardware stack.
- Supported and deployed encryption at different layers of a traditional data management software/hardware stack, each with its own advantages and disadvantages: application-level, database-level, filesystem-level, and disk-level encryption
- Integrated various version control tools, build tools, Nexus, and deployment methodologies (scripting) into Jenkins to create end-to-end orchestrated build cycles.
- Troubleshoot build and performance issues in Jenkins and generate metrics on master performance and job usage.
- Implemented enterprise-grade authorization mechanisms based on user directories and authentication technologies such as Kerberos.
- Installed and configured Kerberos with a master/slave replication cluster consisting of any number of hosts, which stores all information, both account and policy data, in application databases.
- Ensure plan execution and Azure consumption targets are met
- Implemented Kerberos software distribution which includes software replication, such as copying data to other servers.
- Installed, configured, and designed Kerberos so that client applications can attempt authentication against secondary servers if the primary master is down.
- Create data level security rules for IDH Hive users leveraging Apache Sentry
- Create new infrastructure load balancing, packet routing, and SSH protocol designs to maximize network routing efficiency. Perform daily network monitoring and troubleshooting of network operation deficiencies
- Administer and design LANs, WANs, internet/intranet, and voice networks.
- Work with Tanium Foundation, Operations, & IR Deep Dive tools in customer enterprise AWS space
- Standardize Splunk forwarder deployment, configuration and maintenance across a variety of platforms
- Deploy and use enterprise EDR products such as Tanium
- Define, manage, and promote various development activities for DevOps practices, including continuous integration, continuous delivery, continuous testing, and continuous monitoring
- Support AWS Cloud infrastructure automation with multiple tools including Gradle, Chef, Nexus, Knife, and Docker, and monitoring tools such as Splunk, New Relic, and CloudWatch
- Responsible for designing, scaling and deploying various cloud services, modernizing processes and workflows along with building a consolidated and collaborative integration of IaaS, SaaS, and PaaS cloud services
- Manage all components of the DevOps Configuration Management platform (Jenkins, Nexus, GitLab, Sonar, etc.)
- Perform security log analysis during Information Security related events, identifying and reporting possible security breaches, incidents, and violations of security policies.
- Responsible for designing, developing, testing, troubleshooting, deploying and maintaining Splunk solutions, reporting, alerting and dashboards
- Implemented and supported Cloud Networks. Collaborate with security and network team to ensure all cloud platforms adhere to security models and compliance requirements for the cloud infrastructure for either on-premises or Cloud network. Assist in the support and troubleshooting of cloud network infrastructure along with the network support team to resolve complex operational issues
- Manage, configure and install VMware vSphere environment: vCenter, hypervisor on new hosts, virtual machines, datastore creation and maintenance
- Perform daily system monitoring of Virtual Infrastructure which includes VMware and Amazon Cloud Service
- Work with various teams to design, implement, integrate and operate AWS cloud solutions for high availability and scalable service delivery.
- Conduct and remediate Windows Security Content Automation Protocol (SCAP) and NESSUS system scans
- Implemented distributed data storage system using Accumulo and Hadoop Distributed File System (HDFS) for storing and running analytics on large volumes of data.
- Install, configure, and manage VMware vSphere environment: vCenter, hypervisor on new hosts, virtual machines, datastore creation and maintenance.
- Responsible for system administration, engineering, provisioning, operation, maintenance, and support of vCenter, vRealize Operations, and VMware Configuration Manager.
- Assist in the proper operation and performance of Splunk, loggers and connectors
- Responsible for installation and configuration of Hadoop, YARN, Cloudera Manager, Cloudera BDR, Hive, Hue, and MySQL applications
- Reviewed performance stats and query execution/explain plans, and recommended changes for tuning Hive/Impala queries
- Enforce best practices while maintaining customers' environments, as well as in service request management, change request management, and incident management, using the standard tools of preference
- Review security management best practices which includes ongoing promotion of awareness on current threats, auditing of server logs and other security management processes, as well as following established security standards.
- Work with Cloudera maintenance, monitoring, and configuration tools to accomplish task goals and build reports for the management review.
- Responsible to build and maintain the Cloudera distribution of Hadoop.
- Perform cluster maintenance as well as creation and removal of nodes using tools like Ganglia, Nagios, Cloudera Manager Enterprise, Dell OpenManage, and other tools
- Integrate data feeds (logs) into Splunk; administer Splunk and Splunk App for Enterprise Security (ES) log management
- Standardize Splunk agent deployment, configuration, and maintenance across a variety of UNIX and Windows platforms
- Work on System Center and Tanium design and deployment initiatives
Confidential, Orangeburg, NY
Systems Engineer
Responsibilities:
- Maintain network operations on a day-to-day basis for a complex network of over 400 workstations and 150 servers as part of the Infrastructure Services team.
- Troubleshot and proactively monitored networks, systems, and applications to identify and correct malfunctions and other operational difficulties.
- Provided documentation of network operating systems and network topology.
- Implement authorized network enhancements and special projects as assigned.
- Supported and maintained recovery systems and operations, including but not limited to: maintaining backup system configuration, performing backup system upgrades, monitoring backups, and performing restores as needed.
- Supported the organization in the design and implementation of network components, including support of development systems and assistance with testing.
- Recommended new and emerging technologies to solve business problems.
- Maintain Windows and Linux Operating Systems, Servers, Storage, Data Backups and other related equipment or software.
- Monitor and support datacenter systems and applications.
- Work with the Help Desk for support requests as needed, including change management and other key support measures and metrics.
- Handle remote and onsite installs, upgrades and configurations of Omnicell systems for customers.
- Resolve technical issues escalated by customers or internal departments as part of product installations and upgrades.
- Provided technical support for each client’s server-based systems and environment as a whole
- Provided technical support for each client’s collaborative technology, including email and unified communication systems