AWS Big Data Engineer Resume
Souderton, PA
SUMMARY
- Over 10 years of IT experience as a developer, designer, and QA engineer, with cross-platform integration experience spanning the Hadoop ecosystem, Java, and functional test automation
- Hands-on experience installing, configuring, and architecting Hadoop and Hortonworks clusters and services: HDFS, MapReduce, YARN, Pig, Hive, HBase, Spark, Sqoop, Flume, and Oozie
- 2+ years of experience on the AWS cloud platform.
- 2+ years of experience working with Spark.
- Expertise in Spark Streaming (Lambda architecture), Spark SQL, and tuning and debugging Spark clusters running on Mesos.
- Expertise in machine learning with Spark MLlib using Python (a minimal sketch follows this summary).
- Familiar with data architecture, including data ingestion pipeline design, Hadoop information architecture, data modeling, data mining, machine learning, and advanced data processing; experienced in optimizing ETL workflows.
- Hands-on experience with Amazon EC2, Amazon S3, Amazon RDS, Redshift, VPC, IAM, Elastic Load Balancing, Auto Scaling, CloudFront, CloudWatch, SNS, SES, SQS, and other AWS services.
- Skilled at selecting appropriate AWS services to design and deploy applications based on given requirements.
- Set up and managed a CDN on Amazon CloudFront to improve site performance.
- Expertise working with MongoDB and Apache Cassandra.
- Expertise in Java, J2EE, JavaScript, HTML, and JSP.
- Solid programming knowledge of Scala, Python, C#, and Ruby.
- Hands-on experience integrating REST APIs with cloud environments to access resources.
- Experience working with Teradata and preparing its data for batch processing on distributed computing frameworks.
- Good working experience with Hadoop data warehousing tools such as Hive and Pig, and with moving data onto the cluster using Sqoop.
- Developed Oozie workflows to schedule multiple Hive and Pig jobs that run independently based on time and data availability.
- Good knowledge of High-Availability, Fault Tolerance, Scalability, Database Concepts, System and Software Architecture, Security and IT Infrastructure.
- Led onshore and offshore service delivery functions to ensure end-to-end ownership of incidents and service requests.
- Mentored junior developers and kept them up to date with current technologies such as Hadoop, Spark, Spark SQL, and Presto.
- All of the projects I have worked on are open-source projects and have been tracked using JIRA.
- Experience with Agile methodologies, including Scrum.
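A minimal sketch of the MLlib-with-Python work mentioned above, assuming a hypothetical training dataset on S3 with numeric columns `f1`, `f2` and a binary `label`; the path and column names are placeholders, not project specifics.

```python
# Hypothetical example: train a logistic regression model with Spark MLlib (pyspark.ml)
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("mllib-sketch").getOrCreate()

# Placeholder input: CSV with columns f1, f2, label
df = spark.read.csv("s3://example-bucket/training/events.csv",
                    header=True, inferSchema=True)

# MLlib expects a single vector column of features
assembler = VectorAssembler(inputCols=["f1", "f2"], outputCol="features")
train = assembler.transform(df).select("features", "label")

model = LogisticRegression(maxIter=20).fit(train)
print("Training AUC:", model.summary.areaUnderROC)

spark.stop()
```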
TECHNICAL SKILLS
- Hadoop/Big Data: Hadoop, MapReduce, HDFS, ZooKeeper, Kafka, Hive, Pig, Sqoop, Oozie, Flume, YARN, HBase, Spark with Scala
- NoSQL Databases: HBase, Cassandra, MongoDB
- Scripting Languages: Python, Scala, Ruby, Bash, UNIX shell, and JavaScript
- Programming Languages: Java, SQL, C#, HTML5, CSS3
- Java/J2EE Technologies: Applets, Swing, JDBC, JNDI, JSON, JSTL
- Frameworks: MVC, Struts, Spring, Hibernate
- Operating Systems: Linux, Unix, Mac, Windows
- Web Technologies: HTML, DHTML, XML
- Web/Application servers: Apache Tomcat, WebLogic, JBoss
- Databases: Data Warehouse, SQL Server, MySQL, MongoDB, Oracle
- IDE: Eclipse, IntelliJ IDEA
PROFESSIONAL EXPERIENCE
AWS BIG DATA ENGINEER
Confidential, Souderton, PA
Responsibilities:
- Created S3 buckets, enforced policies on IAM roles, and customized the JSON policy templates (see the sketch below)
- Launched Amazon EC2 instances (Linux/Ubuntu) on AWS and configured the launched instances
- Managed Amazon Redshift clusters, including launching clusters and specifying node types
- Used AWS Elastic Beanstalk for deploying and scaling web applications and services developed in Java
- Owned end-to-end deployment for projects on AWS, including Python scripting for automation, scaling, and build promotion from staging to production
- Hands-on with Git/GitHub for code check-ins, checkouts, and branching
- Implemented and maintained monitoring and alerting of production and enterprise servers/storage using Amazon CloudWatch
- Built continuous integration environments using Jenkins and Puppet.
- Experienced in various AWS services including VPC, EC2, S3, RDS, Redshift, DynamoDB, Lambda, SNS, and SQS.
- Designed and deployed multiple applications using much of the AWS stack (EC2, Route 53, S3, RDS, DynamoDB, SNS, SQS, IAM), focusing on high availability, fault tolerance, and auto scaling via AWS CloudFormation.
- Experienced in installing, configuring, troubleshooting, and performance tuning WebLogic, Apache, IIS, and Tomcat.
- Wrote shell scripts for end-to-end build and deployment automation and ran Ansible scripts to provision dev servers.
- Created Docker containers from Docker images to test, ship, and run applications.
- Leveraged AWS services such as EC2, Auto Scaling, and VPC to build secure, highly scalable, and flexible systems that handled expected and unexpected load bursts.
Environment: Python, UNIX, VMware, Shell, Perl, IAM, S3, EBS, EC2, CloudWatch, CloudFormation, Puppet, Docker, Jenkins, Spark, Kafka.
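A minimal boto3 sketch of the S3/IAM work above: create a bucket and enforce a customized JSON policy on an existing IAM role. The bucket name, role name, and policy contents are hypothetical placeholders, not the project's actual configuration.

```python
# Hypothetical example: create an S3 bucket and attach a customized JSON policy
# to an existing IAM role as an inline policy.
import json
import boto3

BUCKET = "example-data-lake-bucket"   # placeholder bucket name
ROLE = "example-etl-role"             # placeholder IAM role name

s3 = boto3.client("s3", region_name="us-east-1")
iam = boto3.client("iam")

# Create the bucket (no LocationConstraint needed in us-east-1)
s3.create_bucket(Bucket=BUCKET)

# Customized JSON policy template scoped to the new bucket
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
        "Resource": [f"arn:aws:s3:::{BUCKET}", f"arn:aws:s3:::{BUCKET}/*"],
    }],
}

# Enforce the policy on the IAM role
iam.put_role_policy(
    RoleName=ROLE,
    PolicyName="example-s3-access",
    PolicyDocument=json.dumps(policy),
)
```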
CLOUD BIG DATA ENGINEER
Confidential, New York, NY
Responsibilities:
- Installed Kafka on virtual machines and created topics for different users
- Installed ZooKeeper, Kafka brokers, Schema Registry, and Control Center on multiple machines.
- Developed the SSL security layer, set up ACL/SSL security for users, and assigned multiple topics to them
- Worked on the Hadoop cluster and used Hive as the data querying tool to store and retrieve data.
- Migrated multiple applications and automated infrastructure creation for the new applications using CloudFormation.
- Used AWS Application Discovery Service to analyze the existing infrastructure.
- Designed and implemented public- and private-facing websites on Amazon Web Services.
- Migrated the application onto the AWS cloud.
- Created and configured Redshift clusters.
- Configured an EMR cluster and used Hive scripts to process data stored in S3
- Created data pipelines and configured EMR clusters to offload the data to Redshift.
- Infrastructure as code: automated infrastructure creation using AWS CloudFormation.
- Responsible for security, including opening ports on security groups and network ACLs and building peering connections, NAT instances, and VPN connections.
- Wrote various Lambda functions in Python and Java to automate tasks.
- Used SSM Run Command to run shell scripts on Linux instances (e.g., for server startup) and invoked the command from Lambda (see the sketch below).
Environment: EC2, Load Balancing, Auto Scaling, Route53, VPC, IAM, RDS, CloudFormation, Puppet, Spark, Hive, Kafka.
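A minimal sketch of a Python Lambda handler that triggers SSM Run Command on a Linux instance, as described in the last two bullets above; the instance ID and startup script path are hypothetical placeholders.

```python
# Hypothetical example: Lambda handler that runs a shell command on an EC2
# instance via SSM Run Command (AWS-RunShellScript document).
import boto3

ssm = boto3.client("ssm")

def handler(event, context):
    # Instance ID(s) could also come from the triggering event; this is a placeholder.
    instance_ids = event.get("instance_ids", ["i-0123456789abcdef0"])

    response = ssm.send_command(
        InstanceIds=instance_ids,
        DocumentName="AWS-RunShellScript",
        Comment="Server startup triggered from Lambda",
        Parameters={"commands": ["sudo /opt/app/startup.sh"]},  # placeholder script
    )
    return {"command_id": response["Command"]["CommandId"]}
```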
HADOOP DEVELOPER
Confidential, New York, NY
Responsibilities:
- Developed NiFi workflows to automate the data movement between different Hadoop systems.
- Configured, deployed, and maintained multi-node dev and test Kafka clusters.
- Developed Spark scripts using Scala shell commands as per the requirements.
- Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
- Developed Scala scripts and UDFs using both DataFrames/Datasets/Spark SQL and RDDs/MapReduce in Spark 1.6 for data aggregation and queries, and wrote data back into the OLTP system through Sqoop.
- Implemented Spark jobs in Scala with Spark SQL for faster testing and processing of data.
- Imported large datasets from DB2 into Hive tables using Sqoop
- Implemented Apache Pig scripts to load data from and store data into Hive.
- Partitioned and bucketed Hive tables and compressed data with Snappy to load Avro Hive tables into Parquet Hive tables (see the sketch at the end of this section)
- Ran Hive scripts through Hive, Impala, and Hive on Spark, and some through Spark SQL
- Worked with and gained deep familiarity with AWS cloud services such as EC2, S3, EBS, RDS, and VPC.
- Responsible for implementing the ETL process through Kafka-Spark-HBase integration per the requirements of the customer-facing API
- Worked on batch and real-time data processing with Spark Streaming using the Lambda architecture
- Developed and maintained workflow scheduling jobs in Oozie for importing data from RDBMSs into Hive
- Utilized the Spark Core, Spark Streaming, and Spark SQL APIs for faster data processing instead of Java MapReduce
- Responsible for extracting and integrating data from different sources into the Hadoop data lake by creating ETL pipelines using Spark, MapReduce, Pig, and Hive.
Environment: Hadoop, HDFS, Pig, Sqoop, Shell Scripting, Ubuntu, Linux Red Hat, Spark, Scala, Hortonworks, Cloudera Manager, Apache Yarn.
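A minimal sketch of the Avro-to-Parquet Hive table load described above, written here in PySpark rather than the Scala used on the project; the database, table, and partition column names are hypothetical placeholders.

```python
# Hypothetical example: rewrite an Avro-backed Hive table as a partitioned,
# Snappy-compressed Parquet Hive table.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("avro-to-parquet-sketch")
         .enableHiveSupport()
         .getOrCreate())

# Source: existing Avro-backed Hive table (placeholder name)
events = spark.table("raw_db.events_avro")

# Target: partitioned Parquet Hive table with Snappy compression (placeholder names)
(events.write
       .mode("overwrite")
       .format("parquet")
       .option("compression", "snappy")
       .partitionBy("event_date")
       .saveAsTable("curated_db.events_parquet"))

spark.stop()
```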