Cloud Engineer Resume

Atlanta, GA

SUMMARY:

  • 6+ years of experience in Big Data and Cloud technologies as a Data Engineer, using industry-accepted tools, methodologies, and procedures while driving business solutions for clients. Proficient in statistics, mathematics, and analytics.
  • Knowledge of installing, configuring, supporting, and managing Hadoop clusters using Apache and Cloudera (CDH3, CDH4) distributions, both on premises and on Amazon Web Services (AWS).
  • Created clusters in Google Cloud and managed them using Kubernetes (k8s); used Jenkins to deploy code to Google Cloud, create new namespaces, build Docker images, and push them to Google Container Registry.
  • Hands-on experience with GCP: BigQuery, GCS buckets, Cloud Functions, Cloud Dataflow, Pub/Sub, Cloud Shell, the gsutil and bq command-line utilities, Dataproc, and Stackdriver.
  • Exposure to data lake implementation using Apache Spark; developed data pipelines and applied business logic with Spark.
  • Hands-on ETL experience with heterogeneous sources using Informatica and SSIS.
  • Experience working with big data and cloud technologies such as Hadoop, Hive, Teradata, Informix, and Google Cloud Platform (BigQuery).
  • In-depth experience in using various Hadoop ecosystem tools such as HDFS, MapReduce, YARN, Pig, Hive, Sqoop, Spark, Storm, Kafka, Oozie, Elasticsearch, HBase, and ZooKeeper.
  • Experience in coding SQL and PL/SQL using procedures, triggers, and packages.
  • Experience in the practical implementation of AWS services including IAM, Elastic Compute Cloud (EC2), ElastiCache, Simple Storage Service (S3), CloudFormation, Virtual Private Cloud (VPC), Lambda, and EBS.
  • Created Hive tables, joins, and HQL queries against the databases, progressing to complex Hive UDFs.
  • Imported data from RDBMS into HDFS using Sqoop and vice-versa.
  • Created User Defined Functions (UDFs) and User Defined Aggregate Functions (UDAFs) in Pig and Hive.
  • Automated previously manual workflows using Python scripts and Linux bash scripting.
  • Hands-on expertise in row-key and schema design with NoSQL databases such as MongoDB 3.0.1, HBase, Cassandra, and DynamoDB.
  • Professional working experience with machine learning algorithms such as Linear Regression, Logistic Regression, Naive Bayes, Decision Trees, K-Means Clustering, and Association Rules.
  • Well experienced in Normalization, De-Normalization, and Standardization techniques for optimal performance in relational and dimensional database environments.
  • Well versed in collaboration tools such as GitHub, Pivotal Tracker, Confluence, and Jira.
  • Experience with best practices of Web services development and Integration (both REST and SOAP).
  • Experienced in using build tools like Ant, Gradle, SBT, and Maven to build and deploy applications to servers.
  • Experience in complete Software Development Life Cycle (SDLC) in both Waterfall and Agile methodologies.
  • Knowledge of creating dashboards and data visualizations using Tableau, Power BI, and Looker to provide business insights.
  • Excellent communication, interpersonal, and problem-solving skills; a strong team player with a can-do attitude and the ability to communicate effectively with all levels of the organization, including technical staff, management, and customers.

TECHNICAL SKILLS:

Programming languages & tools: Hadoop, Python, R, SQL, SAS, Teradata, MS Excel, Informix, Linux/Unix.

Visualization tools: Looker, Tableau, Power BI, MS PowerPoint, MS Excel Data Analysis, and PivotTables.

Analytical skills: Business Intelligence, Agile, Scrum, Regression, Decision Tree, Random Forest, Forecasting, Testing, Data modeling, Data Validation, Data Warehousing, Data Marts, Data Manipulation, Scikit-learn, Pandas, NumPy, Probabilistic models, Metadata organization, Data cleansing, SQL window functions, Joins, Case Statements, Slicing/Dicing.

Certification: Google Analytics, Business Analyst, and Agile Requirements foundation.

Interpersonal skills: Leadership, Teamwork, Relationship Building, Decision making, Problem-solving

PROFESSIONAL EXPERIENCE:

Confidential, Atlanta, GA

Cloud Engineer

Responsibilities:

  • Use Cloud Functions with Python to load data into BigQuery as CSV files arrive in the GCS bucket (a minimal sketch follows this list).
  • Write a program to download a SQL dump from the equipment maintenance site and load it into the GCS bucket; on the other side, load this SQL dump from the GCS bucket into MySQL (hosted in Google Cloud SQL) and load the data from MySQL into BigQuery using Python, Scala, Spark, and Dataproc.
  • Process and load bounded and unbounded data from a Google Pub/Sub topic into BigQuery using Cloud Dataflow with Python (see the Beam sketch after this list).
  • Create firewall rules to access Google Dataproc from other machines.
  • Write Scala programs for Spark transformations in Dataproc.
  • Experience in building and architecting multiple data pipelines and end-to-end ETL and ELT processes for data ingestion and transformation in GCP, and in coordinating tasks among the team.
  • Design and architect the various layers of the data lake and design star schemas in BigQuery.
  • Load Salesforce data incrementally every 15 minutes into the BigQuery raw and UDM layers using SOQL, Google Dataproc, GCS buckets, Hive, Spark, Scala, Python, gsutil, and shell scripts.
  • Use REST APIs with Python to ingest data from other sites into BigQuery.
  • Build a program with Python and Apache Beam, executed in Cloud Dataflow, to run data validation between raw source files and BigQuery tables.
  • Build a configurable Scala and Spark based framework to connect to common data sources such as MySQL, Oracle, Postgres, SQL Server, Salesforce, and BigQuery, and load the data into BigQuery.
  • Monitor BigQuery, Dataproc, and Cloud Dataflow jobs via Stackdriver across all environments.
  • Open SSH tunnels to Google Dataproc to reach the YARN resource manager and monitor Spark jobs.
  • Manage regular alignment meetings with clients and a 7-member offshore data science team to keep pace with agile development, using Pivotal Tracker to track progress and deliver projects on time.
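
Illustrative sketch (assumption, not project code): a minimal Python Cloud Function of the kind described above, triggered when a CSV file arrives in the GCS bucket and loading it into BigQuery. The dataset and table names are hypothetical.

```python
# Hypothetical GCS-triggered Cloud Function: load an arriving CSV into BigQuery.
from google.cloud import bigquery

def load_csv_to_bq(event, context):
    """Triggered by a google.storage.object.finalize event on the landing bucket."""
    client = bigquery.Client()
    uri = f"gs://{event['bucket']}/{event['name']}"

    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,              # assume the CSV carries a header row
        autodetect=True,                  # let BigQuery infer the schema
        write_disposition="WRITE_APPEND",
    )

    # "raw_layer.arrivals" is a placeholder destination table.
    load_job = client.load_table_from_uri(uri, "raw_layer.arrivals", job_config=job_config)
    load_job.result()  # block until the load job completes
```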
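
Illustrative sketch (assumption, not project code): a minimal Apache Beam pipeline in Python, run on Cloud Dataflow, that reads messages from a Pub/Sub topic and streams them into BigQuery. The project, topic, table, and schema are placeholders.

```python
# Hypothetical streaming pipeline: Pub/Sub -> BigQuery via Cloud Dataflow.
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def run():
    # Pass --runner=DataflowRunner and project/region options when deploying.
    options = PipelineOptions(streaming=True)

    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/events")
            | "ParseJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "my-project:raw_layer.events",
                schema="event_id:STRING,event_ts:TIMESTAMP,payload:STRING",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )

if __name__ == "__main__":
    run()
```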

Confidential, Bentonville, AR

Senior Data Engineer

Responsibilities:

  • Develop business architecture using requirements such as scope, processes, alternatives, and risks.
  • Analyze the client’s business requirements and processes through document analysis, interviews, workshops, and workflow analysis.
  • Conduct 5+ levels of testing, including functional, regression, user acceptance, integration, and performance, to verify that the client's needs are met.
  • Communicate the client's business requirements by constructing easy-to-understand data and process models.
  • Engage the client to gather software requirements/business rules and ensure alignment with development teams.
  • Translate stakeholder requirements into over 10 different tangible deliverables such as functional specifications, use cases, user stories, workflow/process diagrams, and data flow/data model diagrams.
  • Identify and reconcile errors in client data to ensure accurate business requirements.
  • Draft and maintain business requirements and align them with functional and technical requirements.
  • Created, maintained, and supervised a feature development framework for cross-functional teams.
  • Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
  • Used Spark to improve the performance and optimization of existing algorithms in Hadoop.
  • Managed and reviewed Hadoop log files to identify issues when jobs failed, and used Hue for UI-based Hive script execution and Automic scheduling.
  • Involved in creating a data lake by extracting customers' data from various sources into HDFS, including data from Excel, databases, and server logs.
  • Automated workflows using shell scripts to pull data from various databases into Hadoop, and developed scripts to automate the process and generate reports.
  • Developed Spark SQL automation components and was responsible for modifying Java components to connect directly to the server.
  • Used various Spark transformations and actions to cleanse the input data, and used the Spark application master to monitor Spark jobs and capture their logs (a PySpark sketch follows this list).
  • Refactored the existing Scala-based Spark batch processes for different logs.
  • Ensured targets were met through scrum meetings and tracked development status using Jira and Confluence.
  • Worked extensively on data migration from one Hadoop cluster to another and delivered an end-to-end solution.
  • Performed data validation between Hadoop and Teradata to confirm the correct source.
  • Converted a tool written in R to PySpark so it could run on one of the Hadoop clusters.
  • Set up data ETL using Hive and data pipelines.
  • Utilized GCP client libraries with a Node app to load big data into Google Bigtable and BigQuery.
  • Migrated unstructured data to Google Cloud Storage using the Google Cloud SDK.
  • Experience in Microsoft Azure: handled Azure Data Lake and Blob storage, and used tools such as Data Factory and AzCopy.
  • Used Logic Apps to ingest data (Excel sheets) from SharePoint into Blob storage, and to pull email attachments and copy them to Blob storage.
  • Experience with Azure Blob and ADLS services for storage.
  • Create data lakes by extracting customers' data from various data sources, including data from Excel, databases, and server logs.
  • Build new data pipelines, identify existing data gaps, and provide automated solutions to deliver analytical capabilities and enriched data to applications.
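
Illustrative sketch (assumption, not project code): a minimal PySpark job of the kind referenced above, cleansing raw server-log records before writing them to a curated zone. Paths and column names are placeholders.

```python
# Hypothetical PySpark cleansing job for raw server logs.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("log-cleansing").getOrCreate()

raw = spark.read.json("hdfs:///data/raw/server_logs/")

cleansed = (
    raw
    .filter(F.col("status").isNotNull())                 # drop malformed records
    .withColumn("event_ts", F.to_timestamp("event_ts"))  # normalize timestamps
    .withColumn("user_id", F.trim(F.col("user_id")))     # strip stray whitespace
    .dropDuplicates(["event_id"])                        # remove duplicate events
)

cleansed.write.mode("overwrite").parquet("hdfs:///data/curated/server_logs/")
```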

Confidential, Alpharetta, GA

Data Engineer\Data Science

Responsibilities:

  • Used data to analyze in-depth paid marketing channel performance using SAS, R, SQL, and Excel.
  • Advised the team on patterns and relationships in the data to recommend campaign strategies.
  • Created customer segments using RFM methodology as a base to create CLTV.
  • Developed hypotheses and built predictive models for acquisition, repeat and referral rates, & customer lifetime value using probabilistic models.
  • Created, maintained, and delivered tailored dashboards and reports for key stakeholders and worked with the tech teams to build data and reporting systems using data marts and Power BI.
  • Improved data mining processes, resulting in a 20% decrease in time needed to infer insights from customer data used to develop marketing strategies.
  • Used predictive analytics such as machine learning and data mining techniques to forecast company sales of new products with a 95% accuracy rate.
  • Developed ETS for data sources used for reporting by sales, inventory, and marketing departments.
  • Designed a long-term data obfuscation solution using AWS Glue for data transformation in Python (see the Glue sketch after this list).
  • Wrote Lambda functions in Python that trigger when a file lands in the raw-zone S3 bucket and copy it to the next zone's S3 bucket (a minimal sketch also follows this list).
  • Designed and developed ETL processes in AWS Glue to migrate campaign data from external sources such as S3 (ORC/Parquet/text files) into AWS Redshift.
  • Created external tables with partitions using Hive, AWS Athena, and Redshift.
  • Managed security groups on AWS, focusing on high availability, fault tolerance, and auto-scaling using Terraform templates, along with continuous integration and continuous deployment using AWS Lambda and AWS CodePipeline.
  • Utilized AWS CloudWatch to monitor the environment for operational and performance metrics during load testing.
  • Performed integration of various data sources like RDBMS, Spreadsheets, Text files, JSON, and XML files.
  • Published Power BI reports on dashboards in the Power BI server.
  • Designed and developed Power BI graphical and visualization solutions based on business requirements.
  • Guided proper structuring of data for Power BI reporting.
  • Migrated databases from Oracle/SQL Server to AWS RDS PostgreSQL using the AWS Database Migration Service (DMS).
  • Responsible for writing triggers in PostgreSQL using DbVisualizer.
  • Restored an RDS Instance to a point in time using snapshots.
  • Actively participated in daily stand up and sprint meetings and reviews using Agile methodology.
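
Illustrative sketch (assumption, not project code): a minimal AWS Glue (PySpark) job of the kind described above, reading campaign data from the Glue Data Catalog and writing it to Redshift. The catalog database, table, connection, and S3 temp path are placeholders.

```python
# Hypothetical Glue job: Data Catalog source -> Redshift target.
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw campaign data previously crawled into the Glue Data Catalog.
campaigns = glue_context.create_dynamic_frame.from_catalog(
    database="marketing_raw", table_name="campaign_events"
)

# Write the frame to Redshift through a pre-defined Glue connection.
glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=campaigns,
    catalog_connection="redshift-connection",
    connection_options={"dbtable": "analytics.campaign_events", "database": "dw"},
    redshift_tmp_dir="s3://glue-temp-bucket/redshift/",
)

job.commit()
```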
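
Illustrative sketch (assumption, not project code): a minimal Python Lambda handler that fires when a file lands in the raw-zone S3 bucket and copies it to the next zone. Bucket names are placeholders.

```python
# Hypothetical S3-triggered Lambda: copy newly landed objects to the next zone.
import boto3

s3 = boto3.client("s3")

def lambda_handler(event, context):
    for record in event["Records"]:
        source_bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Copy the object from the raw zone into the curated-zone bucket.
        s3.copy_object(
            Bucket="curated-zone-bucket",
            Key=key,
            CopySource={"Bucket": source_bucket, "Key": key},
        )
```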

Confidential

Business Data Engineer

Responsibilities:

  • Design and deploy dashboards (Power BI and Tableau) that help business leads manage growing marketing activities.
  • Responsible for the health of the B2B and B2C marketing database: ensure data integrity and accuracy, execute the internal marketing clean-up project, and serve as the liaison with external data vendors.
  • Set up data ETL using Microsoft SQL Server Integration Services (SSIS) and data pipelines that update autonomously and are consumed by analysts across the global business marketing organization.
  • Design and implement reporting dashboards that track key metrics and performance trends and provide actionable insights to marketing leadership.
  • Used Excel to create and consolidate data-heavy reports for senior management (pivot tables, VLOOKUPs, Access reports, and formulas to manipulate data).
  • Work with the marketing manager to develop advanced analytics techniques using different reporting tools and testing processes (A/B and multivariate testing).
  • Coordinate with the marketing and sales teams on recommendations for newly launched products; provide sales support through literature, presentations, and launch documents.
  • Create, manage, and distribute data and insights via regular and ad-hoc reporting, utilizing analytics tools (in-house or proprietary) and reporting/analysis platforms (MicroStrategy, SQL, Excel).
  • Conducted weekly PPM (pre-production and progress review) meetings with the Production and QA departments.
  • Compiled data and generated reports needed to conduct product planning during plant closure, ensuring accuracy, and appropriate inventory management.
  • Participated in the process improvement team to audit AS400 transactions, providing recommendations that expedited issue resolution and reduced cost.
  • Taking part in the development and improvement of IT systems as they relate to costing data and tools, with support from the IT team.
  • Generating business intelligence reports using pivot tables and pivot charts from cleaned data.
  • Utilized data analysis and visualization tools such as Power BI to deliver actionable insights for business development; created customizable programs for 5 major clients to increase retention.
  • Provide clients with technical expertise and support on sales and energy-efficiency program development.
  • Use SQL to organize and extract data from MS Access databases and create reports (a minimal sketch follows this list).
  • Worked closely with IT, Sales, and Marketing teams to automate and optimize key workflows to identify costs, results, and opportunities; streamlined business resources by standardizing business processes and improving internal system integration.
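
Illustrative sketch (assumption, not project code): pulling data from an MS Access database with SQL and summarizing it in a pivot-table-style report, as referenced above. The connection string, table, and columns are placeholders.

```python
# Hypothetical Access extract + pivot-style summary for a marketing report.
import pandas as pd
import pyodbc

conn = pyodbc.connect(
    r"Driver={Microsoft Access Driver (*.mdb, *.accdb)};"
    r"DBQ=C:\reports\marketing.accdb;"
)

# Pull the raw rows with SQL, then pivot region-by-product sales for the report.
orders = pd.read_sql("SELECT region, product, sales FROM Orders", conn)
summary = orders.pivot_table(index="region", columns="product",
                             values="sales", aggfunc="sum")
summary.to_excel("monthly_sales_report.xlsx")
```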
