Big Data Architect/Lead Engineer Resume
SUMMARY
- 11+ years of experience as a Big Data Architect, Sr. Big Data Lead Engineer, and Cloud Development Engineer, with extensive work in code compilation, automation, packaging, building, debugging, managing, tuning, and deploying code across multiple cloud environments.
- Used Ansible and Ansible Tower as configuration management tools to automate repetitive tasks, quickly deploy critical applications, and proactively manage change.
- Managed the configurations of multiple servers using Ansible and Chef.
- Managed DEV, QA, UAT, and PROD environments for various releases and designed instance strategies.
- Good knowledge of CI/CD.
- Managed multiple flavors of Linux and Windows virtual servers with Ansible and Chef using Git.
- Extensively worked on CloudBees, Jenkins, and TeamCity for continuous integration (CI) and end-to-end automation of all builds and deployments.
- Automated and scaled applications using Kubernetes.
- Experience with Ansible Tower to manage multiple nodes and maintain inventory for different environments.
- Used Ansible to orchestrate software updates and verify functionality.
- Used Terraform and CloudFormation to codify infrastructure on Azure, AWS, and GCP.
- Organized infrastructure resources such as physical machines, VMs, and containers using Terraform.
- Worked on Terraform to set up AWS infrastructure such as EC2 instances, S3 buckets, VPCs, and subnets, and created module-driven AWS infrastructure with Terraform.
- Used ticketing and project management tools such as Jira, Team Foundation Server (Azure DevOps), ServiceNow, and HPQC.
- Knowledgeable in scripting languages such as Python, with hands-on experience.
- Worked as an independent contributor; self-motivated and energetic professional with strong organizational skills, the ability to multitask, and the ability to quickly acquire in-depth knowledge of a company's products and systems.
- Performed regular cluster maintenance with the Hadoop balancer and monitored clusters with Sensu and a Python Flask monitoring tool (a minimal sketch of such a health endpoint follows this list).
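To illustrate the kind of Python Flask monitoring tool mentioned in the last bullet, here is a minimal sketch of a cluster health endpoint. The route, port, and the use of `hdfs dfsadmin -report` as the health check are assumptions for illustration only, not the original tool.

```python
from flask import Flask, jsonify
import subprocess

app = Flask(__name__)

@app.route("/health/hdfs")
def hdfs_health():
    # Run the standard HDFS admin report and expose a pass/fail summary.
    # The command and the simple return-code check are illustrative only.
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-report"],
        capture_output=True, text=True, timeout=60,
    )
    healthy = result.returncode == 0
    return jsonify({"service": "hdfs", "healthy": healthy}), (200 if healthy else 503)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```

An endpoint like this can be polled by Sensu or any other HTTP check to raise cluster-health alerts.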
PROFESSIONAL EXPERIENCE
Confidential
Big Data Architect/Lead Engineer
Responsibilities:
- Worked with data pipelines, Terraform, Lambda functions, and AWS AppSync GraphQL.
- Involved in designing and deploying multi-tier applications using AWS services (EC2, Route 53, S3, RDS, DynamoDB, SNS, SQS, IAM), focusing on high availability, fault tolerance, and auto-scaling with AWS CloudFormation.
- Extensive expertise using the core Spark APIs and processing data on an EMR cluster.
- Worked on an ETL pipeline to source these tables and deliver the calculated ratio data from AWS to a data mart (SQL Server) and the Credit Edge server.
- Experience in using and tuning relational databases (Oracle, MySQL) and columnar databases (SQL Data Warehouse)
- Worked with the business to define, identify, and implement quick wins for the Get2Zero (GTZ) program, delivering incremental value to the business in collaboration with FedEx.
- Gathered and documented detailed business requirements to identify and prioritize quick wins.
- Engaged with the PSE team to determine the exact scope of quick wins to be delivered.
- Assessed and requested any infrastructure and environments required to implement the prioritized quick wins.
- Worked with the business on requirements gathering and prepared functional requirements documents; analyzed requirements and provided project estimates based on business requests.
- Designed cloud architecture on AWS and spun up clusters for developers during data processing, cleaning, and analysis.
- Worked on the data model, technical design, and implementation for Hive ETL and Big Data Hadoop projects.
- Took part in critical modeling, design, software development, and code reviews to support decision making and maintain quality standards.
- Worked with infrastructure teams (DBA, SAP BW, middleware, and UNIX) to set up environments during different phases of the software lifecycle.
- Performed performance tuning of Big Data components to meet SLAs critical to the customer.
- Installed and configured tools such as Jupyter Notebook, Redshift, Python libraries, and Spark; prepared data for consumption in the Tableau visualization layer.
- Developed an AWS data pipeline and SNS notifications to automate the dunning process in the cloud (see the Lambda/SNS sketch after this list).
- Managed onsite and offshore teams, assigning tasks and reporting on development work.
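As a rough illustration of the SNS-based dunning automation described in this list, the sketch below shows a Lambda handler that publishes one notification per overdue account. The event shape, topic ARN, and field names are hypothetical placeholders, not the project's actual interface.

```python
import json
import os

import boto3

# Hypothetical topic ARN, supplied through the Lambda environment configuration.
SNS_TOPIC_ARN = os.environ.get(
    "DUNNING_TOPIC_ARN",
    "arn:aws:sns:us-east-1:123456789012:dunning-alerts",
)

sns = boto3.client("sns")


def lambda_handler(event, context):
    """Publish a dunning notification for each overdue account in the event."""
    overdue = event.get("overdue_accounts", [])
    for account in overdue:
        sns.publish(
            TopicArn=SNS_TOPIC_ARN,
            Subject="Dunning notice",
            Message=json.dumps({
                "account_id": account["id"],
                "amount_due": account["amount_due"],
            }),
        )
    return {"notified": len(overdue)}
```

In a pipeline like the one described, a handler of this kind would typically be invoked on a schedule or by an upstream data pipeline step.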
Confidential
Big Data Lead
Responsibilities:
- Designed, implemented, and deployed highly available, cost-efficient, fault-tolerant, scalable systems (Linux and Windows) on the AWS platform.
- Identified performance bottlenecks and implemented remedies.
- Designed and implemented automation tools for system administration following AWS best practices.
- Researched and implemented existing and new IT products.
- Implemented an automated backup management system for disaster recovery based on Python Boto3 using AWS Lambda, SNS, EC2, and S3 services (a sketch follows this list):
- Taking AMIs on a recurring schedule.
- Purging AMIs and corresponding snapshots based on a customized retention policy.
- Sending email/SMS notifications on success or failure.
- Tagging chargeable resources for precise billing purposes.
- Served as an escalation point for Level 1 and Level 2 engineers.
- Network setup and deployment (Network and Telecommunication Engineer)
- Set up access points, routers, switches, and servers.
- Experience with Cisco switches and routers, Sophos UTM.
- Managed WAN/outage alerts from SolarWinds.
- Provisioned wireless access points.
- Set up and troubleshot DHCP and DNS; configured VPN for remote access.
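The backup items above can be illustrated with a minimal Boto3 sketch that creates and tags AMIs, then purges tagged images (and their snapshots) older than an assumed retention window. The tag key, retention value, and naming convention are placeholders rather than the production policy.

```python
import datetime

import boto3

ec2 = boto3.client("ec2")
RETENTION_DAYS = 7  # assumed retention window; the real policy was customer-specific


def backup_instances(instance_ids):
    """Create an AMI for each instance and tag it so the purge job can find it."""
    for instance_id in instance_ids:
        image = ec2.create_image(
            InstanceId=instance_id,
            Name=f"backup-{instance_id}-{datetime.date.today().isoformat()}",
            NoReboot=True,
        )
        ec2.create_tags(
            Resources=[image["ImageId"]],
            Tags=[{"Key": "AutomatedBackup", "Value": "true"}],
        )


def purge_expired_images():
    """Deregister tagged AMIs older than the retention window and delete their snapshots."""
    cutoff = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=RETENTION_DAYS)
    images = ec2.describe_images(
        Owners=["self"],
        Filters=[{"Name": "tag:AutomatedBackup", "Values": ["true"]}],
    )["Images"]
    for image in images:
        created = datetime.datetime.fromisoformat(image["CreationDate"].replace("Z", "+00:00"))
        if created < cutoff:
            ec2.deregister_image(ImageId=image["ImageId"])
            for mapping in image.get("BlockDeviceMappings", []):
                snapshot_id = mapping.get("Ebs", {}).get("SnapshotId")
                if snapshot_id:
                    ec2.delete_snapshot(SnapshotId=snapshot_id)
```

In a setup like the one described, functions of this kind would run in Lambda on a schedule, with SNS delivering the success/failure email or SMS notifications.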
Confidential
Big Data Lead Engineer
Responsibilities:
- Developed Spark applications in Scala using DataFrames and the Spark SQL API for faster data processing.
- Developed highly optimized Spark applications to perform data cleansing, validation, transformation, and summarization activities according to requirements.
- Built a data pipeline consisting of Spark, Hive, Sqoop, and custom-built input adapters to ingest, transform, and analyze operational data.
- Developed Spark jobs and Hive Jobs to summarize and transform data.
- Used Spark for interactive queries, streaming data processing, and integration with popular NoSQL databases for large data volumes.
- Converted Hive/SQL queries into Spark transformations using Spark DataFrames and Scala (see the PySpark-style sketch after this list).
- Implemented system wide monitoring and alerts.
- Installed and configured Hive, Impala, Oracle Big Data Discovery, Hue, Apache Spark, Apache Tika with Tesseract, Sqoop, Spark SQL, etc.
- Imported and exported data into MapR-FS and Hive using Sqoop.
- Used Bash shell scripting, Sqoop, Avro, Hive, Impala, HDP, Pig, Java, and MapReduce daily to develop ETL, batch processing, and data storage functionality.
- Developed a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
- Loaded all tables from the reference source database schema through Sqoop; designed, coded, and configured server-side J2EE components such as JSP and Java alongside AWS.
- Collected data from different databases (e.g., Oracle, MySQL) into Hadoop; used CA Workload Automation for workflow scheduling and monitoring.
- Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting.
- Developed Hive scripts in HiveQL to denormalize and aggregate the data.
- Scheduled and executed workflows in Oozie to run various jobs.
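As an illustration of the summarization and Hive-to-Spark conversion work listed above, here is a minimal sketch written in PySpark (the project itself used Scala); the table names, columns, and aggregation logic are placeholders, not the actual schema.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Table and column names below are illustrative placeholders.
spark = (
    SparkSession.builder
    .appName("daily-usage-summary")
    .enableHiveSupport()
    .getOrCreate()
)

# Read a raw Hive table, drop obviously bad rows, then aggregate per customer per day.
raw = spark.table("ops.raw_events")

summary = (
    raw
    .filter(F.col("event_ts").isNotNull() & (F.col("amount") >= 0))
    .withColumn("event_date", F.to_date("event_ts"))
    .groupBy("customer_id", "event_date")
    .agg(
        F.count("*").alias("event_count"),
        F.sum("amount").alias("total_amount"),
    )
)

# Write back to a partitioned Hive table for downstream reporting.
(
    summary.write
    .mode("overwrite")
    .partitionBy("event_date")
    .saveAsTable("analytics.daily_usage_summary")
)
```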
Confidential
System Analyst
Responsibilities:
- Participated in a small-team project to install the latest Windows OS on company computers and provided the necessary support.
- Supported numerous company activities by setting up and configuring laptops, projectors, video teleconferencing, and intranet and internet connections.
- Monitored, organized, and edited help desk tickets in the trouble ticket system.
- Monitored the performance of computer systems and addressed issues as they arose.
- Provided technical support for software reconfigurations to aid in function customization.
- Tested software performance throughout the desktop network to ensure peak performance.
- Installed computer hardware and software on desktops to keep versions current.