
Big Data Developer Resume


SUMMARY

  • Extensive IT experience in Big Data technologies, data management/analytics, data visualization, and Java-based enterprise applications using Java/J2EE.
  • Worked in domains including finance, e-commerce, healthcare, and automotive.
  • Extensive experience with the Big Data ecosystem, including Hadoop 2.X, HDFS, YARN, MapReduce, Spark 1.4+, Hive 2.1, Impala 1.2, HBase 1.0+, Sqoop 1.4, Flume 1.7, Kafka 1.2+, Oozie 3.0+, and Zookeeper 3.4+
  • Experience in programming languages including Java 8+, Scala 2.1+, SQL, and Python
  • Experienced with real-time data processing in the Big Data ecosystem using Apache Kafka and Spark Streaming (a minimal sketch follows this list)
  • Experienced with the Spark Scala and Python APIs for transferring, processing, and analyzing data in different formats and structures
  • Experienced in writing HiveQL and developing Hive UDFs in Java to process and analyze data
  • Implemented Sqoop and Flume jobs to migrate large sets of structured and semi-structured data between HDFS and other data stores such as Hive and RDBMS
  • Transformed data between formats such as Avro and Parquet
  • Adept at using Sqoop to migrate data between RDBMS, NoSQL stores, and HDFS
  • Knowledge of Linux/Unix Shell Commands
  • Good knowledge of scheduling batch job workflows using Oozie
  • Worked with RDBMS including MySQL, MS SQL Server (2008 R2 and later), and Oracle, and with NoSQL databases including HBase, Cassandra, and MongoDB
  • Good understanding and working knowledge of object-oriented programming (OOP), core Java concepts, J2EE 8, JDBC, JavaScript, and jQuery.
  • Working knowledge of workflows and ETL batch jobs using SSIS, T-SQL, Informatica, and Talend
  • Experience in database design using stored procedures, functions, and triggers, and strong experience writing complex queries for DB2 and SQL Server.
  • Experience with the Microsoft Business Intelligence stack (SSIS, SSRS, SSAS) and BI tools such as Power BI and Tableau.
  • Knowledge of Software Development Life Cycle (SDLC) methodologies such as Agile and Waterfall
  • Familiarity with version control and project management tools such as Git and Microsoft Team Foundation Server 2015+
  • Knowledge of unit testing with ScalaCheck, ScalaTest, JUnit, and MRUnit; used JIRA for issue tracking, Jenkins for continuous integration, and A/B testing on certain projects.
  • Excellent interpersonal and communication skills; a creative, data-oriented, problem-solving, enthusiastic learner
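
The following is a minimal sketch of the kind of Kafka + Spark Streaming pipeline referenced above, assuming the spark-streaming-kafka-0-10 integration. The broker address, topic name, and consumer group id are illustrative placeholders, not details from any actual engagement.

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

    object KafkaStreamSketch {
      def main(args: Array[String]): Unit = {
        val ssc = new StreamingContext(new SparkConf().setAppName("kafka-stream"), Seconds(10))

        // Consumer settings; broker, topic, and group id are placeholders.
        val kafkaParams = Map[String, Object](
          "bootstrap.servers"  -> "broker:9092",
          "key.deserializer"   -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id"           -> "demo-consumer")

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc,
          LocationStrategies.PreferConsistent,
          ConsumerStrategies.Subscribe[String, String](Seq("events"), kafkaParams))

        // Count records in each micro-batch and print the result.
        stream.map(_.value).count().print()

        ssc.start()
        ssc.awaitTermination()
      }
    }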

TECHNICAL SKILLS

  • Big Data: Hadoop 2.X, Spark 1.4+, MapReduce, Hive 2.1, Impala 1.2+, Sqoop 1.4, Flume 1.7, Kafka 1.2+, HBase 1.0+, Oozie 3.0+, Zookeeper 3.4+
  • Cloud: Google Cloud Platform (Dataproc, Compute Engine, Bucket, SQL), Amazon Web Services (EC2, S3, EMR), Databricks Community Cloud
  • Languages: Java 8+, C++, Scala 2.1+
  • Operating Systems: Linux, Ubuntu, Mac OS, CentOS, Windows
  • Web: JavaScript, jQuery, AngularJS, Node.js, HTML, CSS
  • Databases: MySQL 5.X, Oracle 11g, PostgreSQL 9.X, Netezza 7.X, MongoDB 3.2, HBase 0.98
  • IDEs & Database Tools: NetBeans, Eclipse, Visual Studio Code, IntelliJ IDEA, SQL Server 2008 R2+, Aqua Data Studio
  • Data Analysis & Visualization: Python, R, Tableau, Matplotlib, D3.js
  • Scripting & Markup: UNIX Shell, HTML, XML, CSS, JSP, SQL, Markdown
  • Machine Learning: Regression, Decision Tree, Random Forest, K-Means, Neural Networks, SVM, NLP
  • Methodologies: Agile, Scrum, Waterfall
  • Development Tools: Git, Microsoft TFS, JIRA, Jenkins

PROFESSIONAL EXPERIENCE

Confidential

Big Data Developer

Responsibilities:

  • Helped maintain patterns, making changes according to evolving business requirements, such as adding a column or editing certain columns
  • Migrated a set of Hive patterns to Spark with updated cluster types (a minimal sketch follows this list)
  • Performed AMI upgrades of a number of patterns to the latest cluster type.
  • Used P2T2, an in-house Databricks tool, to compare production data against the rerun environment (Dev or QC).
  • Used the JAMS client to run the patterns in the QAPR environment.
  • Created pre-UAT packages for the AMI upgrades so BAs could check for any differences against production data before a pattern was sent to production.
  • Used Git for version control, JIRA for issue tracking and Jenkins for continuous integration
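
A minimal sketch of the Hive-to-Spark pattern migration described above: the same HiveQL runs unchanged on Spark's Hive-compatible engine, assuming Spark 2.x with Hive support enabled. The table and column names are hypothetical.

    import org.apache.spark.sql.SparkSession

    object HivePatternOnSpark {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("hive-pattern-on-spark")
          .enableHiveSupport() // reuse the existing Hive metastore
          .getOrCreate()

        // The original HiveQL pattern runs unchanged through Spark SQL;
        // sales.transactions and its columns are placeholder names.
        val result = spark.sql(
          """SELECT customer_id, SUM(amount) AS total_amount
            |FROM sales.transactions
            |GROUP BY customer_id""".stripMargin)

        result.write.mode("overwrite").saveAsTable("sales.customer_totals")
        spark.stop()
      }
    }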

Environment: Spark SQL, Hive, AWS (S3, EMR), Databricks, Presto, JAMS client, XML, Hadoop, Aqua Data Studio

Confidential, Piscataway, NJ

Big Data Engineer

Responsibilities:

  • Imported and exported large amounts of data using Sqoop, and real-time data using Flume and Kafka.
  • Uploaded data to Hadoop Hive and combined new tables with existing databases
  • Created various Hive external tables and staging tables and joined them as per the requirements. Implemented static partitioning, dynamic partitioning, and bucketing in Hive using internal and external tables (a minimal sketch follows this list).
  • Wrote transformations and actions on DataFrames, and used Spark SQL on DataFrames to access Hive tables from Spark for faster data processing.
  • Developed Spark applications in Scala using the DataFrame and Spark SQL APIs for faster data processing.
  • Used Talend to migrate historical data from Oracle and SQL Server to HDFS and Hive
  • Extracted data from Oracle, SQL Server, and MySQL databases to HDFS using Sqoop
  • Experienced in handling large datasets using partitions, Spark in-memory capabilities, broadcast variables, and effective and efficient joins and transformations during the ingestion process itself.
  • Used Scala to convert Hive/SQL queries into RDD transformations in Apache Spark.
  • Designed and built a custom, generic ETL framework as a Spark application in Scala for data loading and transformations.
  • Used Git for version control, JIRA for issue tracking and Jenkins for continuous integration
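
A minimal sketch of the Hive partitioning work described above, assuming Spark with Hive support; the table name, schema, and path are illustrative assumptions. Dynamic partitioning lets the INSERT route each row to a partition based on the value of its order_date column.

    import org.apache.spark.sql.SparkSession

    object HiveDynamicPartitioning {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("hive-dynamic-partitioning")
          .enableHiveSupport()
          .getOrCreate()

        // Allow fully dynamic partition values on insert.
        spark.sql("SET hive.exec.dynamic.partition=true")
        spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

        // External table partitioned by date; schema and location are placeholders.
        spark.sql(
          """CREATE EXTERNAL TABLE IF NOT EXISTS orders_part (
            |  order_id BIGINT,
            |  amount DOUBLE)
            |PARTITIONED BY (order_date STRING)
            |STORED AS PARQUET
            |LOCATION 'hdfs:///warehouse/orders_part'""".stripMargin)

        // Rows are routed to partitions by the trailing order_date column.
        spark.sql(
          """INSERT OVERWRITE TABLE orders_part PARTITION (order_date)
            |SELECT order_id, amount, order_date FROM staging_orders""".stripMargin)

        spark.stop()
      }
    }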

Environment: Hadoop 2.X, Cloudera CDH, HDFS, Java 8+, Python, Scala 2.1+, Spark 1.4+, Hive 2.1, Kafka 1.2+, Sqoop 1.4, Flume 1.7, Talend, Zookeeper 3.4+, Oozie 3.0+, Git, JIRA, Tableau

Confidential

Business Intelligence Developer

Responsibilities:

  • Used T-SQL (DDL, DML) to create stored procedures, user-defined functions, indexes, views, and tables (a minimal sketch follows this list)
  • Created Aggregate, Merge Join, Sort, Execute SQL Task, Data Flow Task, and Execute Package Task components in SSIS to generate underlying data for reports and to export cleaned data from Excel spreadsheets, text files, MS Access, and CSV files to the data warehouse.
  • Configured and maintained Report Manager and Report Server for SSRS.
  • Created visual reports, such as pie charts and bar graphs of the company's product sales, using Power BI.
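
A hedged sketch of the T-SQL stored-procedure work described above, driven over JDBC from Scala for consistency with the other sketches here. The connection string, schema, column names, and procedure name are hypothetical, and CREATE OR ALTER assumes SQL Server 2016 SP1 or later.

    import java.sql.DriverManager

    object ProductSalesProc {
      def main(args: Array[String]): Unit = {
        // Hypothetical connection string for a local SQL Server instance.
        val conn = DriverManager.getConnection(
          "jdbc:sqlserver://localhost:1433;databaseName=SalesDW;user=dev;password=...")
        try {
          // DDL mirroring the kind of stored procedure described above.
          conn.createStatement().execute(
            """CREATE OR ALTER PROCEDURE dbo.usp_ProductSales @Year INT AS
              |BEGIN
              |  SELECT ProductName, SUM(Amount) AS Total
              |  FROM dbo.Sales
              |  WHERE YEAR(SaleDate) = @Year
              |  GROUP BY ProductName;
              |END""".stripMargin)

          // Invoke the procedure and print each row.
          val call = conn.prepareCall("{call dbo.usp_ProductSales(?)}")
          call.setInt(1, 2019)
          val rs = call.executeQuery()
          while (rs.next()) println(s"${rs.getString(1)}: ${rs.getDouble(2)}")
        } finally conn.close()
      }
    }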

Environment: SQL Server 2017 (SSDT, SSRS, SSIS), FTP, JIRA, Windows 10, Visual Studio 2017, MS SQL Server 2016, MS Access, Microsoft Team Foundation Server (TFS)

Confidential

Spark/ Hadoop Developer

Responsibilities:

  • Installed and configured the Apache Hadoop and Hive environment on the prototype server.
  • Configured MySQL Database to store Hive metadata.
  • Responsible for loading unstructured data into the Hadoop Distributed File System (HDFS).
  • Imported and exported data between HDFS and Hive using Sqoop.
  • Supported MapReduce programs running on the cluster.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Worked extensively with SQL scripts to validate data before and after loads.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Created jobs to load data from MongoDB into Data warehouse.
  • Wrote Java MapReduce jobs to process the tagging functionality for each chapter, section, and subsection of the data stored in HDFS (an analogous sketch follows this list).
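
The tagging jobs above were written in Java; the following is an analogous sketch in Scala against the same Hadoop MapReduce API, counting occurrences of each chapter/section tag. The tab-delimited input layout and the position of the tag field are assumptions.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.io.{IntWritable, LongWritable, Text}
    import org.apache.hadoop.mapreduce.{Job, Mapper, Reducer}
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

    // Mapper: emits (tag, 1) for each line carrying a chapter/section tag
    // in its second tab-delimited field (an assumed layout).
    class TagMapper extends Mapper[LongWritable, Text, Text, IntWritable] {
      private val one = new IntWritable(1)
      private val outKey = new Text()
      override def map(key: LongWritable, value: Text,
                       ctx: Mapper[LongWritable, Text, Text, IntWritable]#Context): Unit = {
        val fields = value.toString.split("\t")
        if (fields.length > 1) { outKey.set(fields(1)); ctx.write(outKey, one) }
      }
    }

    // Reducer: sums occurrences per tag.
    class TagReducer extends Reducer[Text, IntWritable, Text, IntWritable] {
      override def reduce(key: Text, values: java.lang.Iterable[IntWritable],
                          ctx: Reducer[Text, IntWritable, Text, IntWritable]#Context): Unit = {
        var sum = 0
        val it = values.iterator()
        while (it.hasNext) sum += it.next().get()
        ctx.write(key, new IntWritable(sum))
      }
    }

    object TagJob {
      def main(args: Array[String]): Unit = {
        val job = Job.getInstance(new Configuration(), "section-tagging")
        job.setJarByClass(classOf[TagMapper])
        job.setMapperClass(classOf[TagMapper])
        job.setReducerClass(classOf[TagReducer])
        job.setOutputKeyClass(classOf[Text])
        job.setOutputValueClass(classOf[IntWritable])
        FileInputFormat.addInputPath(job, new Path(args(0)))
        FileOutputFormat.setOutputPath(job, new Path(args(1)))
        System.exit(if (job.waitForCompletion(true)) 0 else 1)
      }
    }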

Environment: Hadoop 2.X, HDFS, Hive 2.1, MapReduce, MySQL, Spark 1.4+, Scala 2.1+, Sqoop 1.4

Confidential

Data Analyst

Responsibilities:

  • Designed and coded certain application modules and components
  • Designed the logical and physical data model and generated DDL and DML scripts
  • Designed the user interface and used JavaScript for input validation.
  • Wrote SQL queries, stored procedures and database triggers on the database objects
  • Wrote SQL queries to extract data from archives using complex joins
  • Developed various Java classes, SQL queries, and procedures to retrieve and manipulate data from the backend database using JDBC (a minimal sketch follows this list)
  • Analyzed and reported on data using SSRS
  • Enabled reporting access to the archives for reporting tools and created documentation
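
A minimal sketch of the kind of JDBC extraction described above: a parameterized multi-join query against an archive schema. The connection string and all table and column names are illustrative assumptions.

    import java.sql.DriverManager

    object ArchiveExtract {
      def main(args: Array[String]): Unit = {
        val conn = DriverManager.getConnection(
          "jdbc:sqlserver://localhost:1433;databaseName=Archive;user=dev;password=...")
        try {
          // Complex join across archived orders, customers, and line items.
          val ps = conn.prepareStatement(
            """SELECT c.name, o.order_id, i.sku, i.qty
              |FROM archive.orders o
              |JOIN archive.customers c ON c.customer_id = o.customer_id
              |JOIN archive.order_items i ON i.order_id = o.order_id
              |WHERE o.archived_on >= ?""".stripMargin)
          ps.setDate(1, java.sql.Date.valueOf("2015-01-01"))
          val rs = ps.executeQuery()
          while (rs.next())
            println(s"${rs.getString(1)} ${rs.getLong(2)} ${rs.getString(3)} ${rs.getInt(4)}")
        } finally conn.close()
      }
    }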

Environment: .NET, C#, IIS, ASP, JavaScript, SQL Server 2008 R2, SSRS
