AWS Redshift / Hadoop Developer Resume
North Brunswick, NJ
PROFESSIONAL SUMMARY:
- 7+ years of experience in the Big Data Hadoop ecosystem, Business Intelligence, Data Visualization, ETL, Data Warehousing, Data Mining and Data Modeling.
- Hands-on experience with AWS cloud services such as S3, EMR, EC2, RDS and Redshift.
- Good understanding of Hadoop architecture.
- Used various Spark transformations and actions for cleansing input data (see the sketch after this list).
- Hands-on experience with major components of the Hadoop ecosystem, including HDFS, MapReduce, Hive, Pig, Spark, HBase, Sqoop, Flume, Kafka, NiFi and Talend.
- Experience in Python programming for ETL, data analysis and visualization.
- Set up standards and processes for Hadoop-based application design and implementation.
- Experienced in developing MapReduce programs in Java for working with Big Data.
- Good experience in optimizing MapReduce jobs using mappers, reducers, combiners and partitioners to deliver the best results for large datasets.
- Worked on Spark SQL and DataFrames for faster execution of Hive queries using the Spark SQL context.
- Experience ingesting both structured and unstructured data into HDFS from legacy systems using Flume, including streaming data.
- Good expertise working with different types of data, including semi-structured and unstructured data.
- Knowledge of job workflow scheduling and monitoring using Oozie.
- Worked on NoSQL databases including HBase, MongoDB.
- Experience in processing different file formats such as XML, JSON and SequenceFile.
- Good experience in creating Business Intelligence solutions using Tableau.
- Worked on large datasets using PySpark.
- Good experience with Agile/Scrum, Test-Driven Development and Waterfall methodologies.
- Exposure to Java development projects.
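A minimal PySpark sketch of the kind of cleansing described above, using common transformations (dropDuplicates, filter, withColumn) followed by an action (count); the S3 path and column names are hypothetical placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("cleanse-input").getOrCreate()

# Read raw input; the path, header option and column names are placeholders
raw = spark.read.option("header", "true").csv("s3://example-bucket/raw/events.csv")

# Transformations (lazy): drop duplicates, remove rows with null keys, derive a date column
cleansed = (
    raw.dropDuplicates()
       .filter(F.col("event_id").isNotNull())
       .withColumn("event_date", F.to_date(F.col("event_ts")))
)

# Action: triggers execution and reports how many rows survived cleansing
print(cleansed.count())
```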
TECHNICAL SKILLS:
Big Data Ecosystem: Hadoop, MapReduce, HDFS, Hive, Pig, Tez, Zookeeper, Sqoop, Flume, Kafka, Spark, NiFi, and Oozie.
Programming Languages: Java, Scala, Python, C# .NET.
NoSQL Databases: HBase, Cassandra, MongoDB
Methodologies: Agile/Scrum, Waterfall
Platforms: Windows, RedHat Linux, Ubuntu Linux
Relational Databases: SQL Server, Oracle and Teradata.
Reporting Tools: Power BI, Tableau.
SQL Server Tools: SQL Server 2016/2014/2012/2008 R2, SSIS/SSRS/SSAS
Cloud: Amazon AWS (S3, Redshift, EMR, DynamoDB, Lambda), Azure (Blob, HDInsight, Data Factory)
PROFESSIONAL EXPERIENCE
AWS Redshift / Hadoop Developer
Confidential, North Brunswick, NJ
Responsibilities:
- Migrated databases from SQL Server to Amazon Redshift using AWS DMS and SCT. Worked on AWS Data Pipeline to configure data loads from S3 into Redshift.
- Tuned and optimized AWS Redshift performance by choosing the appropriate distribution style (EVEN, KEY or ALL) and sort key (single, compound or interleaved); see the Redshift sketch after this list.
- Developed Spark ETL pipelines for data extraction, cleansing, transformation and custom aggregation from various sources (test data, weblogs and Twitter) in different formats (JSON, CSV and XML) on AWS EMR (see the PySpark sketch after this list).
- Hands-on experience working with Hadoop ecosystem components such as HDFS, Hive, Pig, Spark, Sqoop, Flume and Oozie.
- Experience importing and exporting data between HDFS and relational database systems/mainframes using Sqoop.
- Wrote Hive and Pig queries for data analysis to meet business requirements.
- Involved in creating tables, partitioning and bucketing tables, and creating UDFs in Hive (see the Hive sketch after this list).
- Experience with Hive query performance tuning.
- Well experienced in implementing join operations using Pig Latin.
- Involved in writing data transformations and data cleansing using Pig operations.
- Knowledge of NoSQL databases such as HBase, MongoDB and DynamoDB.
- Knowledge of newer tools such as Kafka, Spark and Scala.
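A hedged sketch of the Redshift table design and S3 load described above. Redshift accepts standard PostgreSQL connections, so psycopg2 is used here; the cluster endpoint, credentials, table layout, bucket and IAM role are all hypothetical.

```python
import psycopg2

# Connection details are placeholders, not a real cluster
conn = psycopg2.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439, dbname="analytics", user="admin", password="***",
)

ddl = """
CREATE TABLE IF NOT EXISTS sales_fact (
    sale_id     BIGINT,
    customer_id BIGINT,
    sale_date   DATE,
    amount      DECIMAL(12,2)
)
DISTSTYLE KEY
DISTKEY (customer_id)                       -- co-locate rows joined on customer_id
COMPOUND SORTKEY (sale_date, customer_id);  -- speeds up range-restricted scans
"""

copy_cmd = """
COPY sales_fact
FROM 's3://example-bucket/staging/sales/'
IAM_ROLE 'arn:aws:iam::123456789012:role/example-redshift-role'
FORMAT AS CSV IGNOREHEADER 1;
"""

# Create the table and load staged files from S3 in one transaction
with conn, conn.cursor() as cur:
    cur.execute(ddl)
    cur.execute(copy_cmd)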
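A PySpark sketch of the EMR-based ETL flow noted above, assuming hypothetical S3 paths, join key and column names: JSON and CSV sources are read, lightly cleansed, joined, aggregated and written back to S3 (the output format is an assumption).

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("emr-etl").getOrCreate()

# Source paths and fields are illustrative only
weblogs = spark.read.json("s3://example-bucket/raw/weblogs/")
tests   = spark.read.option("header", "true").csv("s3://example-bucket/raw/test_data/")

# Cleanse and join the two sources on a shared key (hypothetical: user_id)
combined = (
    weblogs.filter(F.col("user_id").isNotNull())
           .join(tests, "user_id", "left")
)

# Custom aggregation: daily request counts per user
daily = (
    combined.groupBy("user_id", F.to_date("request_ts").alias("day"))
            .agg(F.count("*").alias("requests"))
)

# Write curated results back to S3
daily.write.mode("overwrite").parquet("s3://example-bucket/curated/daily_requests/")
```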
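A hedged HiveQL sketch (run here through PySpark's Hive support) of the partitioned and bucketed table design and UDF usage mentioned above; the database, table, column and function names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import StringType

spark = (SparkSession.builder.appName("hive-ddl")
         .enableHiveSupport()
         .getOrCreate())

spark.sql("CREATE DATABASE IF NOT EXISTS sales_db")

# Partitioned by load date and bucketed by customer_id for faster joins and sampling
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_db.orders (
        order_id    BIGINT,
        customer_id BIGINT,
        amount      DECIMAL(12,2)
    )
    PARTITIONED BY (load_date STRING)
    CLUSTERED BY (customer_id) INTO 16 BUCKETS
    STORED AS ORC
""")

# A simple Python UDF registered for use from SQL, e.g. masking an identifier
def mask_id(value):
    return None if value is None else "XXXX" + str(value)[-4:]

spark.udf.register("mask_id", mask_id, StringType())
spark.sql("SELECT mask_id(customer_id) AS masked_id FROM sales_db.orders LIMIT 10").show()
```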
Environment: MapReduce, HDFS, Pig, Hive, Spark, Spark Streaming, HBase, Flume, Scala, Python, Java, Sqoop, Oozie, Zookeeper, Talend, Azure (Blob, HDInsight), Amazon AWS (S3, Redshift, EMR, DynamoDB, Lambda), Agile/Scrum environment.
SQL Server Developer
Confidential, Morgantown, WV
Responsibilities:
- Fine-tuned stored procedures using execution plans in T-SQL for better performance.
- Created and maintained databases and tables to support the application with proper normalization and indexes for better performance.
- Developed Transact-SQL (T-SQL) queries, user-defined functions, views and stored procedures (see the sketch after this list).
- Created user-defined functions to encapsulate frequently used business logic.
- Used joins and sub-queries to simplify complex queries involving multiple tables, and optimized procedures and triggers for use in production.
- Performed software installations and upgrades, monitored database performance, performed capacity planning and SQL Server clustering, and managed database quality assurance including database consistency checks.
- Created logins and users in MS SQL Server and assigned proper roles to maintain security.
- Generated reports in SSRS using global variables, expressions and functions, including string concatenation based on requirements.
- Optimized code and performed database performance tuning using SQL Profiler and Database Tuning Advisor; created new indexes and updated statistics based on requirements.
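A minimal sketch of the kind of T-SQL stored procedure described above, submitted here through pyodbc; the connection string, table, columns and procedure name are hypothetical.

```python
import pyodbc

# Connection details are placeholders
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=example-server;DATABASE=SalesDB;Trusted_Connection=yes;"
)

create_proc = """
CREATE PROCEDURE dbo.usp_GetCustomerOrders
    @CustomerId INT
AS
BEGIN
    SET NOCOUNT ON;
    SELECT o.OrderId, o.OrderDate, o.TotalAmount
    FROM dbo.Orders AS o
    WHERE o.CustomerId = @CustomerId
    ORDER BY o.OrderDate DESC;
END
"""

cursor = conn.cursor()
cursor.execute(create_proc)
conn.commit()

# Call the procedure with a parameterized EXEC
cursor.execute("EXEC dbo.usp_GetCustomerOrders @CustomerId = ?", 42)
for row in cursor.fetchall():
    print(row.OrderId, row.OrderDate, row.TotalAmount)
```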
Environment: MS SQL Server 2008R2/2012, SSIS, SSRS, C#.NET, Microsoft Visual Studio 2008, Windows Server 2008, SQL Profiler, Database Tuning Advisor, SharePoint.