Hadoop Developer Resume
SUMMARY
A Data Engineer with an insatiable intellectual curiosity and the ability to mine hidden value from large sets of structured, semi-structured, and unstructured data. Able to leverage a heavy dose of mathematics and applied statistics with visualization and a healthy sense of exploration.
TECHNICAL SKILLS
Hadoop/Big Data Technologies: Spark (Scala), Kafka, Spark Streaming, MLlib, Sqoop, HBase, HDFS, MapReduce, Pig, Hive, Zeppelin
(Distributions: Databricks, Hortonworks, and Cloudera)
Programming Languages and Scripting: Java (JDK 5/JDK 6), C/C++, Python, Scala, HTML, SQL
Operating Systems : UNIX, Linux, Windows, Mac OS X
Application Servers : IBM WebSphere, Tomcat
Web technologies : JSP, Servlets, JDBC, JavaScript, CSS
Databases : Oracle 9i/10g, MySQL 4.x/5.x, HBase on AWS S3
Development and BI Tools : TOAD, Visio, Rational Rose, Endur, Informatica 9.1
Data Modelling : Erwin, Visual Studio
Development Methodologies : Agile (Scrum), Hybrid
PROFESSIONAL SUMMARY
- 5+ years of experience in analysis, architecture, design, development, testing, maintenance, and user training of software applications, including 4+ years in Big Data, Spark, Hadoop, and HDFS environments and about 1 year in Java/J2EE.
- 3 years of hands-on experience with Hadoop (HDFS, MapReduce, Pig, Hive, Sqoop, etc.).
- 1+ years of hands-on experience with Spark (1.5, 1.6) and Scala as a full-stack developer.
- HDPCD Certified Spark Developer
- Experience in designing and Testing solutions with the Microsoft Azure including HDInsight.
- Experience developing Machine Learning Algorithms using Azure ML Studio.
- Migrated data from relational databases (Oracle, DB2, and MySQL) to Hadoop, Spark, and NoSQL databases.
- NoSQL database experience with HBase, Cassandra.
- Experience using Sqoop to import data into HDFS from RDBMS and vice-versa.
- Expertise in Unix-based operating systems
- Expertise in developing Spark programs covering data processing, data management, and related tasks.
- Experience creating real-time data streaming solutions using Apache Spark Core, Spark SQL, Kafka, and Spark Streaming.
- Hands-on experience in data modelling (Erwin, Visual Studio), data analysis, data cleansing, and entity-relationship diagrams (ERD).
- Experience in metadata maintenance and enhancing existing Logical and Physical data models.
- Extensive experience with ETL and querying big data using tools such as HiveQL.
- Applies new data science and machine learning skills to derive actionable insights in industry and beyond.
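As a plain-Python sketch of the kind of windowed aggregation behind the streaming ETL work listed above (the function name and sample events are illustrative, not taken from any production codebase):

```python
from collections import Counter

def tumbling_window_counts(events, window_secs):
    """Bucket (epoch_seconds, key) events into fixed-size windows and
    count keys per window -- the core aggregation a Spark Streaming
    micro-batch job performs over a clickstream."""
    windows = {}
    for ts, key in events:
        # Align each timestamp to the start of its window.
        start = (ts // window_secs) * window_secs
        windows.setdefault(start, Counter())[key] += 1
    return windows

# Hypothetical events: two clicks in the first minute, one view in the second.
counts = tumbling_window_counts(
    [(0, "click"), (30, "click"), (61, "view")], window_secs=60)
```

The same bucketing logic maps directly onto a Spark Streaming window operation once the events arrive from Kafka instead of a list.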
EXPERIENCE
Data Engineer
Confidential
Responsibilities:
- Designed and deployed a Spark cluster and various Big Data analytics tools, including Spark, Kafka (streaming ETL), HBase, Zeppelin, and Sqoop, with the Cloudera distribution.
- Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
- Integrated Kafka with streaming ETL pipelines and performed the required transformations to extract meaningful insights.
- Developed application components interacting with HBase.
- Performed optimizations on Spark/Scala.
- Used a Kafka producer app to publish clickstream events into a Kafka topic, then explored the data with Spark SQL.
- Processed raw data at scale, including writing scripts, web scraping, calling APIs, and writing SQL queries.
- Imported streaming logs and aggregated the data to HDFS and MySQL through Kafka.
- Explored Spark to improve the performance and optimization of existing algorithms in Hadoop, using Spark Context, PySpark, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Implemented Machine learning algorithms to optimize electrode targeting and parameter settings for deep brain stimulation.
- Developed custom machine learning (ML) algorithms in Scala and exposed them to Python via MLlib wrappers.
- Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
- Imported data from sources such as HDFS and MySQL through Sqoop, and ingested streaming logs into Spark RDDs through Kafka.
- Performed visualization using SQL integrated with Zeppelin on different input data and created rich dashboards.
- Performed transformations, cleaning, and filtering on imported data using Spark SQL, and loaded the final data into HDFS and a MySQL database.
Environment: Hadoop, Spark, PySpark, Spark SQL, Spark Streaming, HDFS, MapReduce, Hive, Sqoop, Kafka, HBase, Oozie, Java, SQL scripting, Linux shell scripting, Zeppelin.
Hadoop Developer
Confidential
Responsibilities:
- Participated in requirement discussions and designed the solution.
- Estimated the Hadoop cluster requirements.
- Responsible for choosing the Hadoop components (Hive, Impala, MapReduce, Sqoop, Flume, etc.).
- Responsible for building scalable distributed data solutions using Hadoop.
- Built the Hadoop cluster and ingested data using Sqoop.
- Imported streaming logs to HDFS through Flume
- Used Flume to collect, aggregate, and store the web log data from different sources like web servers, mobile and network devices and pushed to HDFS
- Developed use cases and technical prototypes for implementing Hive and Pig.
- Worked on analyzing data using Hive, Pig, and custom MapReduce programs in Java.
- Implemented partitioning, dynamic partitions and buckets in HIVE
- Installed and configured Hive, Sqoop, Flume, Oozie on the Hadoop cluster.
- Involved in scheduling Oozie workflow engine to run multiple Hive and Pig jobs.
- Tuned the Hadoop clusters and monitored memory management and MapReduce jobs.
- Responsible for Cluster maintenance, Adding and removing cluster nodes, Cluster Monitoring and Troubleshooting.
- Developed a custom framework capable of solving the small-files problem in Hadoop.
- Deployed and administered a 70-node Hadoop cluster, and administered two smaller clusters.
Environment: MapReduce, HBase, HDFS, Hive, Pig, Java (JDK 1.6), SQL, Cloudera Manager, Sqoop, Flume, Oozie, Eclipse
Java Developer
Confidential
Responsibilities:
- Involved in various stages of Enhancements in the Application by doing the required analysis, development, and testing.
- Prepared high-level and low-level design documents and implemented digital signature generation.
- Developed the logic and code for registration and validation of enrolling customers.
- Developed web-based user interfaces using J2EE technologies.
- Handled client-side validations using JavaScript.
- Used a validation framework for server-side validations.
- Created test cases for the Unit and Integration testing.
- The front end was integrated with an Oracle database using the JDBC API through the JDBC-ODBC bridge driver on the server side.
Environment: Java Servlets, JSP, JavaScript, XML, HTML, UML, Apache Tomcat, Eclipse, JDBC, Oracle 10g.