
Big Data Developer Resume

Livonia, MI


  • Certified Hadoop and Spark Developer from Cloudera and Hortonworks with 8+ years of experience developing applications using SQL, Java, Spark, AWS, and Big Data technologies.
  • 4 years of experience with Big Data tools.
  • In-depth knowledge and experience using Hadoop ecosystem tools such as HDFS, MapReduce, YARN, Pig, Hive, Sqoop, Kafka, Flume, Oozie, and ZooKeeper.
  • Excellent understanding and extensive knowledge of Hadoop architecture and ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm; strong knowledge of rack-awareness topology.
  • Experienced with Apache Hadoop as well as the enterprise Cloudera, Hortonworks, and MapR distributions.
  • Expert in importing and exporting data between relational database systems such as MySQL and Oracle and HDFS.
  • Hands-on experience with data ingestion tools such as Kafka and Flume and workflow management tools such as Oozie.
  • Experience analyzing data in NoSQL databases such as HBase, Cassandra, and DynamoDB.
  • Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
  • Experience using data visualization tools such as Tableau.
  • Expertise in extending Hive and Pig core functionality by writing custom UDFs.
  • Developed Spark applications for data transformations and loading into HDFS using RDDs and DataFrames.
  • Extensive knowledge of performance tuning Spark applications and converting Hive/SQL queries into Spark transformations.
  • Hands-on experience handling file formats such as JSON, Avro, ORC, and Parquet, and compression codecs such as Snappy, zlib, and LZ4.
  • Experience executing batch jobs on data streams with Spark Streaming.
  • Hands-on experience with Amazon Web Services (AWS): using Elastic MapReduce (EMR), creating and storing data in S3 buckets, and creating Elastic Load Balancers (ELB).
  • Extensive knowledge of creating and configuring Hadoop clusters on multiple EC2 instances in AWS and using IAM (Identity and Access Management) to create groups and users and assign permissions.
  • Hands-on experience with Data Pipeline, moving data between S3 and DynamoDB.
  • Extensive programming experience with core Java concepts such as OOP, multithreading, collections, and I/O.
  • Extensive experience with UNIX commands, shell scripting, and setting up cron jobs.
  • Good experience using relational databases such as Oracle and MySQL.
  • Experience working with build tools such as Maven, SBT, and Gradle to build and deploy applications to servers.
  • Expertise in Object-Oriented Analysis and Design (OOAD) and knowledge of the Unified Modeling Language (UML).
  • Expertise across the complete Software Development Life Cycle (SDLC) in both Waterfall and Agile.
  • Experience in software configuration management using Git.
  • Experience using IDEs such as Eclipse, NetBeans, and IntelliJ.
  • Experience developing web page interfaces using JSP, Java Swing, and HTML.
  • Comprehensive knowledge of software development using shell scripting, core Java, and web technologies.
  • Experience working with Java, JDBC, ODBC, JSP, Servlets, and JavaBeans.
  • Developed stored procedures and queries using PL/SQL.
  • Successfully work in fast-paced environments, both independently and in collaborative teams.
  • Strong background in mathematics with very good analytical and problem-solving skills.
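The Spark points above (RDD transformations and DataFrame loads into HDFS) can be sketched in Scala as follows; the input path, delimiter, and column names are hypothetical, not from an actual project:

```scala
import org.apache.spark.sql.SparkSession

object TransformAndLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("TransformAndLoad")
      .getOrCreate()

    // RDD-style transformation: parse delimited lines and drop malformed records
    val raw = spark.sparkContext.textFile("hdfs:///data/raw/orders") // hypothetical path
    val parsed = raw.map(_.split('|'))
      .filter(_.length == 3)
      .map(f => (f(0), f(1), f(2).toDouble))

    // Convert the RDD to a DataFrame and write it back to HDFS as Parquet
    import spark.implicits._
    val df = parsed.toDF("orderId", "customerId", "amount")
    df.write.mode("overwrite").parquet("hdfs:///data/curated/orders")
  }
}
```

This requires a running Spark cluster (or local mode) and is a sketch of the general pattern rather than the exact jobs described.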


Big Data Technologies: HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, Scala, Spark, Kafka, Flume, Ambari.

Hadoop Frameworks: Cloudera CDH, Hortonworks HDP, MapR.

Databases: Oracle 10g/11g, PL/SQL, MySQL, MS SQL Server 2012, DB2

Languages: C, C++, Java, Scala, Python

AWS Components: IAM, S3, EMR, EC2, Lambda, Route 53, CloudWatch, SNS

Development Methodologies: Agile, Waterfall

Build Tools: Maven, Gradle, Jenkins.

NOSQL Databases: HBase, Cassandra, MongoDB, DynamoDB

IDE Tools: Eclipse, NetBeans, IntelliJ

Modelling Tools: Rational Rose, StarUML, Visual Paradigm for UML

Architecture: Relational DBMS, Client-Server

Cloud Platforms: AWS Cloud

BI Tools: Tableau

Operating System: Windows 7/8/10, Vista, UNIX, Linux, Ubuntu, Mac OS X


Confidential, Livonia, MI

Big Data Developer


  • Worked on the Hortonworks HDP 2.5 distribution.
  • Involved in reviewing functional and non-functional requirements.
  • Involved in importing data from IBM DB2 into HDFS using Sqoop and created Hive external tables.
  • Developed Spark scripts in Scala as per requirements.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Performed different types of transformations and actions on RDDs to meet business requirements.
  • Involved in loading data from the UNIX file system to HDFS.
  • Developed multiple MapReduce jobs in Scala for data cleaning and pre-processing.
  • Involved in managing and reviewing Hadoop log files.
  • Exported the analyzed data to relational databases using Sqoop to generate reports for the BI team.
  • Involved in writing shell scripts for executing HiveQL.
  • Analyzed NDW data based on business logic and visualized the data in Tableau.
  • Deployed Tableau dashboards to Tableau Server and scheduled data loading.
  • Migrated existing SQL queries to Spark SQL and executed Spark jobs in cluster mode.
  • Implemented best-offer logic in Spark by writing Spark UDFs in Scala.
  • Created calculated fields in Tableau for aggregation and transformation of data.
  • Involved in scheduling Hive queries with the Oozie workflow engine.
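The best-offer logic above was delivered as Spark UDFs in Scala; a minimal sketch of registering and applying such a UDF follows. The discount rule and column names are hypothetical stand-ins, not the actual business logic:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

object BestOffer {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("BestOffer").getOrCreate()
    import spark.implicits._

    // Hypothetical rule: the best offer is the larger of a flat discount
    // and a percentage-of-price discount
    val bestOffer = udf((price: Double, flat: Double, pct: Double) =>
      math.max(flat, price * pct))

    val offers = Seq((100.0, 5.0, 0.10), (40.0, 5.0, 0.10))
      .toDF("price", "flatOff", "pctOff")

    // Apply the UDF as a derived column
    offers.withColumn("discount", bestOffer($"price", $"flatOff", $"pctOff"))
      .show()
  }
}
```

The same UDF can be registered with `spark.udf.register` for use from Spark SQL queries.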

Environment: Hortonworks, Hadoop, HDFS, Sqoop, Hive, Oozie, ZooKeeper, NoSQL, Shell Scripting, Scala, Spark, Spark SQL, Tableau, Tableau Server.

Confidential, Middletown, NJ

Hadoop Developer


  • Worked on the Hortonworks HDP 2.5 distribution.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Involved in importing data from Microsoft SQL Server, MySQL, and Teradata into HDFS using Sqoop.
  • Played a key role in dynamic partitioning and bucketing of data stored in Hive.
  • Wrote HiveQL queries integrating different tables to create views and produce result sets.
  • Collected log data from web servers and integrated it into HDFS using Flume.
  • Experienced in loading and transforming large sets of structured and unstructured data.
  • Developed MapReduce programs for data cleaning and transformation and loaded the output into Hive tables in different file formats.
  • Wrote several data processing jobs in Spark using the Scala programming language.
  • Wrote MapReduce programs to handle semi-structured and unstructured data such as JSON, Avro data files, and sequence files for log files.
  • Involved in loading data into the HBase NoSQL database.
  • Built, managed, and scheduled Oozie workflows for end-to-end job processing.
  • Experienced in extending Hive and Pig core functionality by writing custom UDFs in Java.
  • Analyzed large volumes of structured data using Spark SQL.
  • Wrote shell scripts to execute HiveQL.
  • Wrote automated shell scripts in Linux/UNIX environments using Bash.
  • Analyzed Hive log files and fixed issues to ensure all jobs ran correctly.
  • Migrated HiveQL queries to Spark SQL to improve performance.
  • Extracted real-time feeds using Spark Streaming, converted them to RDDs, processed the data into DataFrames, and loaded it into HBase.
  • Experienced in using the DataStax Spark connector to store data in and retrieve data from the Cassandra database.
  • Extracted real-time feeds using Spark Streaming, converted them to RDDs, processed the data into DataFrames, and loaded it into Cassandra.
  • Imported real-time weblogs using Flume and ingested the data into Spark Streaming.
  • Used Flume to collect, aggregate, and push log data from different log servers.
  • Extensive experience tuning Hive queries using map-side joins for faster execution and appropriate resource allocation.
  • Implemented the Flume, Spark, and Spark Streaming framework for real-time data processing.
  • Developed Spark code using Scala and Spark SQL/Streaming for faster data processing.
  • Generated reports using Tableau.
  • Migrated MapReduce jobs to Spark jobs for better performance.
  • Used GitHub as the repository for committing and retrieving code and Jenkins for continuous integration.
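The streaming-to-Cassandra flow above can be sketched with the DataStax Spark-Cassandra connector. This is an assumed setup: the socket source stands in for the actual Flume/Kafka feed, and the Cassandra host, keyspace, and table names are placeholders:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import com.datastax.spark.connector.streaming._ // DataStax Spark-Cassandra connector

object LogStream {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("LogStream")
      .set("spark.cassandra.connection.host", "127.0.0.1") // assumed host
    val ssc = new StreamingContext(conf, Seconds(10)) // 10-second micro-batches

    // Hypothetical socket source standing in for the Flume/Kafka weblog feed
    val lines = ssc.socketTextStream("localhost", 9999)
    val events = lines.map(_.split(','))
      .filter(_.length == 2)
      .map(f => (f(0), f(1)))

    // Persist each micro-batch to a Cassandra table (keyspace and table assumed)
    events.saveToCassandra("logs", "events")

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Swapping the source for a Kafka or Flume receiver changes only the input DStream; the rest of the pipeline stays the same.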

Environment: Hortonworks, Hadoop, HDFS, Pig, Sqoop, Hive, Oozie, ZooKeeper, NoSQL, HBase, Shell Scripting, Scala, Spark, Spark SQL, Git, GitHub.

Confidential, Raleigh, NC

Hadoop Developer


  • Worked on the Cloudera CDH distribution.
  • Hands-on experience with cloud services such as Amazon Web Services (AWS).
  • Created data pipelines for different events to load data from DynamoDB into an AWS S3 bucket and then into an HDFS location.
  • Developed Sqoop scripts to import data from relational sources and handled incremental loading.
  • Created Hive external tables for data in HDFS locations.
  • Wrote Hive queries for data analysis to meet business requirements.
  • Used various Hive optimization techniques such as partitioning, bucketing, map joins, small-file merging, and vectorization.
  • Processed complex/nested JSON and CSV data using Spark DataFrames.
  • Used Spark as an ETL tool.
  • Developed various Spark jobs in Scala for data analysis on different data formats.
  • Automatically scaled up EMR instances based on data volume.
  • Applied transformation rules on top of DataFrames.
  • Imported real-time weblogs using Kafka as the messaging system and ingested the data into Spark Streaming.
  • Used Kafka to load data into HDFS and move data into NoSQL databases.
  • Deployed the project on Amazon EMR with S3 connectivity.
  • Implemented Amazon EMR for processing Big Data across a Hadoop cluster of virtual servers on Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3).
  • Loaded data into Simple Storage Service (S3) in the AWS Cloud.
  • Good knowledge of using Amazon load balancers for auto scaling EC2 servers.
  • Executed Spark jobs on Amazon EMR.
  • Generated reports using Tableau.
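Processing complex/nested JSON with Spark DataFrames, as above, typically means selecting struct fields and exploding arrays. A minimal sketch, where the S3 path, schema, and column names are all hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode}

object FlattenJson {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("FlattenJson").getOrCreate()

    // Hypothetical S3 path; each event contains a nested struct and an array of items
    val df = spark.read.json("s3a://my-bucket/events/")

    val flat = df.select(
        col("eventId"),
        col("user.id").alias("userId"),      // pull a field out of a nested struct
        explode(col("items")).alias("item")) // one output row per array element
      .select(col("eventId"), col("userId"), col("item.sku"), col("item.qty"))

    // Land the flattened data in HDFS for downstream Hive/Impala queries
    flat.write.mode("append").parquet("hdfs:///data/flattened/events")
  }
}
```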

Environment: Data Pipeline, Hive, Impala, Amazon EC2, Amazon Load Balancer, Amazon S3, Amazon EMR, Spark, Spark SQL, Cloudera, IntelliJ IDE.

Confidential, Portland, Oregon

Hadoop Developer


  • Worked on the Cloudera CDH distribution.
  • Worked on a cluster of 50-100 nodes.
  • Loaded data from different relational databases to HDFS using Sqoop.
  • Implemented daily jobs that automate parallel tasks of loading data into HDFS using Oozie coordinator jobs.
  • Involved in reviewing functional and non-functional requirements.
  • Created external Hive tables and executed complex Hive queries on them using HiveQL.
  • Solved performance issues in Hive and Pig scripts with an understanding of joins, grouping, and aggregation and how they translate into MapReduce jobs.
  • Used Spark for transformations, event joins, and some aggregations before storing the data in HDFS.
  • Troubleshot and resolved data quality issues and maintained a high level of accuracy in the data being reported.
  • Analyzed large data sets to determine the optimal way to aggregate.
  • Used Oozie to automate and schedule business workflows that invoke Sqoop, MapReduce, and Pig jobs as required.
  • Loaded data from the UNIX file system to HDFS and wrote Hive user-defined functions.
  • Involved in processing ingested raw data using Apache Pig.
  • Involved in migrating HiveQL to Impala to minimize query response time.
  • Used Pig and Spark as ETL tools.
  • Involved in creating UDFs in Spark using the Scala programming language.
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
  • Created Hive-HBase tables, using Hive for the metastore and HBase for data storage in row-key format.
  • Implemented Impala for better management of data stored in clusters running on Apache Hadoop, with less CPU load than Hive.
  • Worked with different file formats such as JSON, Avro, ORC, and Parquet and compression codecs such as Snappy, zlib, and LZ4.
  • Involved in writing shell scripts for exporting log files to the Hadoop cluster through an automated process.
  • Gained knowledge creating Tableau dashboards for reporting on analyzed data.
  • Expertise with NoSQL databases such as HBase; loaded data into HBase.
  • Experienced in managing and reviewing Hadoop log files.
  • Used GitHub as the repository for committing and retrieving code and Jenkins for continuous integration.
  • Involved in creating and maintaining technical documentation for MapReduce, Hive, Sqoop, and Spark jobs along with the Hadoop clusters, and reviewed it to fix post-production issues.
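The Spark event joins and aggregations above, performed before landing data in HDFS, follow a common pattern; a sketch in which the Hive table names, join key, and measures are hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

object EventJoin {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("EventJoin")
      .enableHiveSupport() // read tables registered in the Hive metastore
      .getOrCreate()

    // Hypothetical Hive tables holding two event streams
    val clicks      = spark.table("raw.clicks")
    val impressions = spark.table("raw.impressions")

    // Join the events on a shared key, then aggregate before writing to HDFS
    val joined = clicks.join(impressions, Seq("adId"))
      .groupBy("adId")
      .agg(sum("cost").alias("totalCost"))

    joined.write.mode("overwrite").parquet("hdfs:///data/agg/ad_costs")
  }
}
```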

Environment: HDFS, MapReduce, Sqoop, Hive, Pig, Oozie, Cloudera, MySQL, Eclipse, Spark, Git, GitHub, Jenkins.


Java Developer


  • Involved in all phases of the project life cycle, from requirements gathering to quality assurance testing.
  • Developed UML diagrams using Rational Rose.
  • Involved in developing applications using Java, JSP, Servlets, and Swing.
  • Developed the UI using HTML, CSS, Ajax, and jQuery, and developed business logic and interfacing components using business objects, XML, and JDBC.
  • Created applications and connection pools and deployed JSPs and Servlets.
  • Used Oracle and MySQL databases for storing user information.
  • Developed the application back end using PHP for web applications.
  • Used Eclipse as the IDE; configured and deployed the application onto the WebLogic application server using Maven build scripts to automate the build and deployment process.

Environment: Java, JSP, Servlets, Swing, Oracle, MySQL, HTML, CSS, PHP, Eclipse.


Java Developer


  • Hands-on experience in all phases of the software development life cycle (SDLC).
  • Developed UML diagrams using Rational Rose.
  • Created UIs for web applications using HTML and CSS.
  • Created desktop applications using J2EE and Swing.
  • Developed the process using the Waterfall model.
  • Created SQL scripts for the Oracle database.
  • Executed test cases manually to verify expected results.
  • Used JDBC to establish connections between the database and the application.
  • Involved in designing, coding, debugging, documenting, and maintaining applications.
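The JDBC connection pattern above is standard; a minimal sketch is shown here in Scala for consistency with the other examples (the original work was in Java, and the `java.sql` API is identical). The Oracle URL, credentials, and table are placeholders:

```scala
import java.sql.DriverManager

object JdbcExample {
  def main(args: Array[String]): Unit = {
    // Placeholder Oracle connection details; a real driver JAR must be on the classpath
    val url  = "jdbc:oracle:thin:@//localhost:1521/ORCL"
    val conn = DriverManager.getConnection(url, "app_user", "secret")
    try {
      // Parameterized query to avoid SQL injection
      val stmt = conn.prepareStatement(
        "SELECT user_id, user_name FROM users WHERE user_id = ?")
      stmt.setInt(1, 42)
      val rs = stmt.executeQuery()
      while (rs.next())
        println(s"${rs.getInt("user_id")} -> ${rs.getString("user_name")}")
    } finally conn.close() // always release the connection
  }
}
```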

Environment: Rational Rose, HTML, CSS, J2EE, Swing, SQL, Oracle 9i, Java, Servlets.
