AWS Big Data Engineer Resume
PROFESSIONAL SUMMARY:
Professional Big Data Engineer with 2.5 years of industry experience in big data technologies. Expertise in the Hadoop/Spark and AWS big data environments, with outstanding communication skills and a proven ability to work in a fast-paced environment. Enthusiastically seeking a Data Engineer role where I can leverage existing skills and learn new ones.
TECHNICAL SKILLS:
Programming languages: Python, Scala, C, C++, Java
RDBMS/NoSQL: MySQL, PostgreSQL, HBase, DynamoDB
Hadoop Ecosystem: Hive, Pig, Sqoop, Oozie, Impala, MapReduce, Flume
Apache Spark: Spark Core, DataFrames, Spark SQL, Spark Streaming, Kafka
Operating systems: Linux, Windows.
Machine Learning: scikit-learn, NumPy, pandas, linear regression, polynomial regression, KNN, logistic regression, Naive Bayes, LDA, PCA, XGBoost, AdaBoost
Visualization tools: Tableau, QuickSight
ETL tools: Talend
Cloud Technologies: AWS, Google Cloud Platform (GCP)
WORK HISTORY:
Confidential
AWS Big Data Engineer
Responsibilities:
- In-depth understanding of Hadoop architecture and its components, including HDFS, Application Master, NameNode, DataNode, and MapReduce concepts
- Migrated existing data from mainframes/SQL Server to an AWS Hadoop/EMR cluster and performed ETL jobs on it
- Implemented scalable data pipelines in Spark (Scala) to perform data transfer, aggregation, transformation, and mining, and loaded the results into S3 and Redshift (see the migration-and-load sketch after this list)
- Developed a prototype of the order-history feature for the user mobile app using Kinesis Data Streams, persisting the data into DynamoDB (see the DynamoDB sketch after this list)
- Implemented a transaction alarm for unexpected orders: Kinesis Data Streams and Kinesis Data Analytics monitor incoming orders, and a Lambda function fires an alarm to the user's cell phone through Amazon SNS when something unusual happens (see the alarm sketch after this list)
- Stored server logs and order history in Redshift and used QuickSight to build analyses and visualizations
- Developed a prototype for server-log analysis using Kinesis Data Firehose to pump the data into Amazon Elasticsearch Service and performed the analysis in Kibana (see the Firehose sketch after this list)
- Implemented an Amazon Machine Learning model to predict the quantity a user is likely to order for a specific item, enabling automatic suggestions, with data fed through Kinesis Data Firehose into an S3 data lake
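A minimal sketch of the migration-and-load pattern from the Spark bullets above. The SQL Server connection, table names, S3 paths, and Redshift details are placeholders, and the Redshift write assumes the community spark-redshift connector and the Microsoft JDBC driver are on the classpath:

```scala
import org.apache.spark.sql.{SparkSession, functions => F}

object OrdersMigration {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("orders-migration").getOrCreate()

    // Pull the source table out of SQL Server over JDBC (placeholder connection)
    val orders = spark.read.format("jdbc")
      .option("url", "jdbc:sqlserver://legacy-db:1433;databaseName=sales")
      .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
      .option("dbtable", "dbo.orders")
      .option("user", "etl_user")
      .option("password", sys.env.getOrElse("DB_PASSWORD", ""))
      .load()

    // Aggregate daily totals per customer as part of the ETL step
    val daily = orders
      .groupBy(F.col("customer_id"), F.to_date(F.col("order_ts")).as("order_date"))
      .agg(F.sum("amount").as("daily_total"), F.count(F.lit(1)).as("order_count"))

    // Land the curated data on S3 as Parquet
    daily.write.mode("overwrite").parquet("s3a://example-bucket/curated/daily_orders/")

    // Load the same aggregate into Redshift via the spark-redshift connector
    daily.write
      .format("io.github.spark_redshift_community.spark.redshift")
      .option("url", "jdbc:redshift://example-cluster:5439/dev?user=etl&password=***")
      .option("dbtable", "analytics.daily_orders")
      .option("tempdir", "s3a://example-bucket/tmp/redshift/")
      .option("forward_spark_s3_credentials", "true")
      .mode("append")
      .save()

    spark.stop()
  }
}
```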
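A sketch of the DynamoDB write behind the order-history prototype, using the AWS SDK for Java v2 from Scala; the table name, key schema, and attribute names are assumptions:

```scala
import software.amazon.awssdk.regions.Region
import software.amazon.awssdk.services.dynamodb.DynamoDbClient
import software.amazon.awssdk.services.dynamodb.model.{AttributeValue, PutItemRequest}

import scala.jdk.CollectionConverters._

object OrderHistoryWriter {
  // One client reused for all writes; the region is illustrative
  private val dynamo = DynamoDbClient.builder().region(Region.US_EAST_1).build()

  /** Persist a single order event (as consumed from the Kinesis stream) into a
    * hypothetical OrderHistory table keyed by customer_id and order_ts. */
  def saveOrder(customerId: String, orderTs: String, itemId: String, quantity: Int): Unit = {
    val item = Map(
      "customer_id" -> AttributeValue.builder().s(customerId).build(),
      "order_ts"    -> AttributeValue.builder().s(orderTs).build(),
      "item_id"     -> AttributeValue.builder().s(itemId).build(),
      "quantity"    -> AttributeValue.builder().n(quantity.toString).build()
    ).asJava

    dynamo.putItem(
      PutItemRequest.builder()
        .tableName("OrderHistory")
        .item(item)
        .build()
    )
  }
}
```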
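A sketch of the alarm path, assuming the Lambda is triggered by a Kinesis stream carrying the anomaly records emitted by Kinesis Data Analytics; the topic ARN, event wiring, and payload format are assumptions:

```scala
import java.nio.charset.StandardCharsets

import com.amazonaws.services.lambda.runtime.{Context, RequestHandler}
import com.amazonaws.services.lambda.runtime.events.KinesisEvent
import software.amazon.awssdk.services.sns.SnsClient
import software.amazon.awssdk.services.sns.model.PublishRequest

import scala.jdk.CollectionConverters._

/** Lambda handler: for every anomaly record on the stream, publish an alarm to
  * the (placeholder) order-alarms SNS topic, which fans out to the user's phone. */
class OrderAlarmHandler extends RequestHandler[KinesisEvent, Unit] {

  private val sns = SnsClient.create()
  private val topicArn = "arn:aws:sns:us-east-1:123456789012:order-alarms" // placeholder

  override def handleRequest(event: KinesisEvent, context: Context): Unit = {
    event.getRecords.asScala.foreach { record =>
      // The anomaly payload produced upstream is assumed to be a small JSON
      // document; it is forwarded to subscribers verbatim.
      val payload = StandardCharsets.UTF_8.decode(record.getKinesis.getData).toString

      sns.publish(
        PublishRequest.builder()
          .topicArn(topicArn)
          .subject("Unusual order activity detected")
          .message(payload)
          .build()
      )
    }
  }
}
```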
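A sketch of how an application host could push raw server-log lines into the Firehose delivery stream that feeds the Elasticsearch domain behind Kibana; the stream name is a placeholder:

```scala
import software.amazon.awssdk.core.SdkBytes
import software.amazon.awssdk.services.firehose.FirehoseClient
import software.amazon.awssdk.services.firehose.model.{PutRecordRequest, Record}

object LogShipper {
  private val firehose = FirehoseClient.create()

  /** Send one log line to the (placeholder) delivery stream; Firehose buffers
    * the records and delivers them to the Elasticsearch domain for Kibana. */
  def ship(logLine: String): Unit = {
    firehose.putRecord(
      PutRecordRequest.builder()
        .deliveryStreamName("server-logs-to-es")
        .record(Record.builder().data(SdkBytes.fromUtf8String(logLine + "\n")).build())
        .build()
    )
  }
}
```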
Confidential
Hadoop/Spark Developer
Responsibilities:
- Developed Spark programs using the Scala API for faster testing and processing of data
- Transformed and retrieved data using Spark, Impala, Pig, and Hive
- Imported/exported data between AWS S3 and Spark RDDs and performed transformations and actions on those RDDs (see the RDD sketch after this list)
- Tuned Spark application performance by setting the right batch interval, the correct level of parallelism, and appropriate memory configuration (see the tuning sketch after this list)
- Handled large datasets during the ingestion process itself using partitioning, Spark's in-memory capabilities, broadcast variables, and effective, efficient joins and transformations
- Developed a data pipeline on AWS to extract data from weblogs and store it in HDFS
- Used Sqoop to import/export data between various RDBMSs and the HDFS cluster and Hive tables, and designed daily Sqoop incremental-import jobs
- Created Hive tables, loaded them with data, and ran various HQL queries (see the Hive sketch after this list)
- Worked with file formats such as Avro, Parquet, ORC, and SequenceFile, and compression codecs such as Snappy (see the file-format sketch after this list)
- Developed a data pipeline using Flume, Sqoop, and Pig to ingest customer behavioral data and purchase histories into HDFS for analysis
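A minimal sketch of the S3-to-RDD pattern referenced above; the bucket, paths, and the assumption that the logs follow a common-log-format layout are all illustrative:

```scala
import org.apache.spark.sql.SparkSession

object WeblogRddJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("weblog-rdd-job").getOrCreate()
    val sc = spark.sparkContext

    // Read raw weblog lines from S3 into an RDD (placeholder path)
    val logs = sc.textFile("s3a://example-bucket/weblogs/*/*.log")

    // Transformations: keep server errors and pull out the request path,
    // assuming a common-log-format layout
    val errorPaths = logs
      .filter(_.contains(" 500 "))
      .map(_.split(" "))
      .filter(_.length > 6)
      .map(fields => fields(6))

    // Actions: count errors per path and bring the top offenders to the driver
    errorPaths
      .map(path => (path, 1L))
      .reduceByKey(_ + _)
      .sortBy(_._2, ascending = false)
      .take(20)
      .foreach { case (path, count) => println(s"$path -> $count") }

    spark.stop()
  }
}
```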
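A sketch of the tuning ideas mentioned above applied to a batch join (the Spark Streaming batch interval is set analogously on the StreamingContext and is not shown); all config values, paths, and column names are illustrative:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object TunedJoinJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("tuned-join-job")
      .config("spark.sql.shuffle.partitions", "200")  // level of parallelism for shuffles
      .config("spark.executor.memory", "4g")          // executor memory sizing
      .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .getOrCreate()

    val transactions = spark.read.parquet("s3a://example-bucket/curated/transactions/")
    val customers    = spark.read.parquet("s3a://example-bucket/curated/customers/")

    // Broadcast the small dimension table so the join avoids a full shuffle
    val enriched = transactions.join(broadcast(customers), Seq("customer_id"))

    // Cache the joined data because several downstream aggregations reuse it
    enriched.cache()

    enriched
      .groupBy("customer_segment")
      .count()
      .write
      .mode("overwrite")
      .parquet("s3a://example-bucket/curated/segment_counts/")

    spark.stop()
  }
}
```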
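A sketch of the Hive table creation and HQL querying; the same HQL ran in the Hive CLI, but it is shown here through Spark's Hive support to stay in one language, and the table layout is an assumption:

```scala
import org.apache.spark.sql.SparkSession

object PurchaseHiveTables {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("purchase-hive-tables")
      .enableHiveSupport()            // lets spark.sql run HQL against the metastore
      .getOrCreate()

    // External, partitioned Hive table over the ingested purchase data
    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS purchases (
        |  customer_id STRING,
        |  item_id     STRING,
        |  amount      DOUBLE
        |)
        |PARTITIONED BY (purchase_date STRING)
        |STORED AS PARQUET
        |LOCATION '/data/purchases'""".stripMargin)

    // Register a newly landed partition, then query it with ordinary HQL
    spark.sql("ALTER TABLE purchases ADD IF NOT EXISTS PARTITION (purchase_date = '2016-01-01')")

    spark.sql(
      """SELECT customer_id, SUM(amount) AS total_spend
        |FROM purchases
        |WHERE purchase_date = '2016-01-01'
        |GROUP BY customer_id
        |ORDER BY total_spend DESC
        |LIMIT 10""".stripMargin)
      .show()

    spark.stop()
  }
}
```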
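A sketch of the file-format and compression handling; paths are placeholders, and the Avro read assumes the spark-avro package is available:

```scala
import org.apache.spark.sql.SparkSession

object FormatConversion {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("format-conversion").getOrCreate()

    // Avro input (requires the spark-avro package on the classpath)
    val events = spark.read.format("avro").load("s3a://example-bucket/raw/events_avro/")

    // Columnar copies for analytics: Parquet and ORC, both Snappy-compressed
    events.write
      .mode("overwrite")
      .option("compression", "snappy")
      .parquet("s3a://example-bucket/curated/events_parquet/")

    events.write
      .mode("overwrite")
      .option("compression", "snappy")
      .orc("s3a://example-bucket/curated/events_orc/")

    spark.stop()
  }
}
```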