
Hadoop Developer Resume


SUMMARY

A Data Engineer with newly acquired skills, an insatiable intellectual curiosity, and the ability to mine hidden gems within large sets of structured, semi-structured, and unstructured data. Able to leverage a heavy dose of mathematics and applied statistics with visualization and a healthy sense of exploration.
TECHNICAL SKILLS

Hadoop/Big Data Technologies: Spark (Scala), Kafka, Spark Streaming, MLlib, Sqoop, HBase, HDFS, MapReduce, Pig, Hive, Zeppelin

(Distributions: Databricks, Hortonworks, and Cloudera)

Programming Languages and Scripting: Java (JDK 5/JDK 6), C/C++, Python, Scala, HTML, SQL

Operating Systems: UNIX, Linux, Windows, Mac OS X

Application Servers: IBM WebSphere, Tomcat

Web Technologies: JSP, Servlets, JDBC, JavaScript, CSS

Databases: Oracle 9i/10g, MySQL 4.x/5.x, HBase on AWS S3

Development and BI Tools: TOAD, Visio, Rational Rose, Endur, Informatica 9.1

Data Modelling: Erwin, Visual Studio

Development Methodologies: Agile (Scrum), Hybrid

PROFESSIONAL SUMMARY

  • Over 5 years of experience in analysis, architecture, design, development, testing, maintenance, and user training of software applications, including 4+ years in Big Data (Spark, Hadoop, HDFS) and about 1 year in Java/J2EE.
  • 3 years of hands-on experience with Hadoop (HDFS, MapReduce, Pig, Hive, Sqoop, etc.).
  • 1+ years of hands-on experience with Spark (1.5, 1.6) and Scala full-stack development.
  • HDPCD Certified Spark Developer.
  • Experience designing and testing solutions on Microsoft Azure, including HDInsight.
  • Experience developing machine learning algorithms using Azure ML Studio.
  • Migrated data from relational databases (e.g., Oracle, DB2, MySQL) to Hadoop and Spark with NoSQL databases.
  • NoSQL database experience with HBase and Cassandra.
  • Experience using Sqoop to import data from RDBMS into HDFS and vice versa.
  • Expertise in Unix-based operating systems.
  • Expertise in developing Spark programs for data processing and data management.
  • Experience creating real-time data streaming solutions using Spark Core, Spark SQL, Kafka, and Spark Streaming.
  • Hands-on experience in data modelling (Erwin, Visual Studio), data analysis, data cleansing, and entity-relationship diagrams (ERD).
  • Experience in metadata maintenance and enhancing existing logical and physical data models.
  • Extensive experience with ETL and querying using big data tools such as HiveQL.
  • Data science and machine learning skills for deriving actionable insights.

EXPERIENCE

Data Engineer

Confidential

Responsibilities:

  • Designed and deployed a Spark cluster and various Big Data analytic tools, including Spark, Kafka ETL streaming, HBase, Zeppelin, and Sqoop, on the Cloudera distribution.
  • Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
  • Integrated Kafka with streaming ETL and performed the required transformations to extract meaningful insights.
  • Developed application components that interact with HBase.
  • Performed optimizations on Spark/Scala code.
  • Used a Kafka producer application to publish clickstream events to a Kafka topic, then explored the data with Spark SQL.
  • Processed raw data at scale, including writing scripts, web scraping, calling APIs, and writing SQL queries.
  • Imported streaming logs through Kafka and aggregated the data into HDFS and MySQL.
  • Improved performance and optimized existing Hadoop algorithms using Spark: SparkContext, PySpark, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Implemented machine learning algorithms to optimize electrode targeting and parameter settings for deep brain stimulation.
  • Developed custom machine learning (ML) algorithms in Scala and made them available to MLlib in Python via wrappers.
  • Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
  • Imported data from sources such as HDFS and MySQL through Sqoop, and used Kafka to load streaming logs into Spark RDDs.
  • Performed visualization using SQL integrated with Zeppelin on various input data and created rich dashboards.
  • Performed transformations, cleaning, and filtering on imported data using Spark SQL, and loaded the final data into HDFS and a MySQL database.

Environment: Hadoop, Spark (Core, SQL, Streaming), PySpark, HDFS, MapReduce, Hive, Sqoop, Kafka, HBase, Oozie, Java, SQL scripting, Linux shell scripting, Zeppelin.
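As an illustration of the clickstream filter/aggregate step described above, here is a minimal sketch in plain Python, runnable without a Spark cluster. The event shape and field names (`page`, `user`) are assumptions for the example, not the actual schema:

```python
import json
from collections import Counter

def aggregate_clicks(raw_events):
    """Parse clickstream JSON lines, drop malformed records, and count
    events per page -- the same filter/aggregate step the Spark SQL job
    performs, shown here on an in-memory list of lines."""
    counts = Counter()
    for line in raw_events:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # cleaning step: skip malformed records
        page = event.get("page")
        if page:  # filtering step: keep only events that carry a page field
            counts[page] += 1
    return dict(counts)

events = [
    '{"page": "/home", "user": "u1"}',
    '{"page": "/cart", "user": "u2"}',
    'not-json',                # malformed record, dropped
    '{"user": "u3"}',          # missing page field, filtered out
    '{"page": "/home", "user": "u4"}',
]
print(aggregate_clicks(events))  # {'/home': 2, '/cart': 1}
```

In the pipeline itself the equivalent logic would run as a Spark SQL query over the Kafka-fed stream rather than over a Python list.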

Hadoop Developer

Confidential

Responsibilities:

  • Participated in requirement discussions and designed the solution.
  • Estimated the Hadoop cluster requirements.
  • Responsible for choosing the Hadoop components (Hive, Impala, MapReduce, Sqoop, Flume, etc.).
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Built the Hadoop cluster and ingested data using Sqoop.
  • Imported streaming logs into HDFS through Flume.
  • Used Flume to collect, aggregate, and store web log data from sources such as web servers, mobile devices, and network devices, and pushed it to HDFS.
  • Developed use cases and technical prototypes for implementing Hive and Pig.
  • Analyzed data using Hive, Pig, and custom MapReduce programs in Java.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Installed and configured Hive, Sqoop, Flume, and Oozie on the Hadoop cluster.
  • Scheduled the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Tuned and monitored the Hadoop clusters for memory management and MapReduce jobs.
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring, and troubleshooting.
  • Developed a custom framework capable of solving the small-files problem in Hadoop.
  • Deployed and administered a 70-node Hadoop cluster, as well as two smaller clusters.
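The partitioning and bucketing work listed above comes down to a simple routing rule: Hive writes each row into a directory named after its partition-column value and into a bucket file chosen by hashing the clustering column. A plain-Python sketch of that layout, where the column names `country` and `user_id` are hypothetical and an integer modulo stands in for Hive's own hash function:

```python
def route_row(row, num_buckets=4):
    """Mimic Hive's on-disk layout: one directory per partition value
    (country), and a bucket file picked by the clustering key modulo the
    bucket count (Hive's real hash function differs; modulo on an int
    key is shown for clarity)."""
    partition = f"country={row['country']}"
    bucket = row["user_id"] % num_buckets
    return f"{partition}/bucket_{bucket:05d}"

print(route_row({"user_id": 42, "country": "US"}))  # country=US/bucket_00002
```

Because every row with the same country lands in the same directory, partition-pruned queries scan only the relevant directories, and equal bucket counts make bucketed joins cheap.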

Environment: MapReduce, HBase, HDFS, Hive, Pig, Java (JDK 1.6), SQL, Cloudera Manager, Sqoop, Flume, Oozie, Eclipse
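The custom small-files framework mentioned above is not shown in this resume; as a rough local-filesystem sketch of the underlying idea, many tiny files can be packed into one large container with an offset index, much as Hadoop archives and SequenceFiles do, so the NameNode tracks one object instead of thousands. All names here are illustrative:

```python
import os
import tempfile

def pack_small_files(input_dir, packed_path):
    """Pack every file in input_dir into one container file and return an
    index of {filename: (offset, length)} for later random access."""
    index = {}
    offset = 0
    with open(packed_path, "wb") as out:
        for name in sorted(os.listdir(input_dir)):
            data = open(os.path.join(input_dir, name), "rb").read()
            out.write(data)
            index[name] = (offset, len(data))
            offset += len(data)
    return index

def read_packed(packed_path, index, name):
    """Read one original file back out of the container via the index."""
    offset, length = index[name]
    with open(packed_path, "rb") as f:
        f.seek(offset)
        return f.read(length)

# demo with throwaway files
src = tempfile.mkdtemp()
for i in range(3):
    with open(os.path.join(src, f"log{i}.txt"), "w") as f:
        f.write(f"record {i}\n")
packed = os.path.join(tempfile.mkdtemp(), "packed.bin")
index = pack_small_files(src, packed)
print(read_packed(packed, index, "log1.txt"))  # b'record 1\n'
```

A production version would also persist the index and write the container to HDFS, ideally in chunks near the block size.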

Java Developer

Confidential

Responsibilities:

  • Involved in various stages of enhancements to the application, performing the required analysis, development, and testing.
  • Prepared the high- and low-level design documents and generated digital signatures.
  • Developed the logic and code for registration and validation of enrolling customers.
  • Developed web-based user interfaces using J2EE technologies.
  • Handled client-side validations using JavaScript.
  • Used the Validation Framework for server-side validations.
  • Created test cases for unit and integration testing.
  • Integrated the front end with an Oracle database using the JDBC API through the JDBC-ODBC bridge driver on the server side.

Environment: Java Servlets, JSP, JavaScript, XML, HTML, UML, Apache Tomcat, Eclipse, JDBC, Oracle 10g.
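The registration-validation work described above can be sketched as a server-side field check; the project itself used JavaScript plus the J2EE Validation Framework, so this Python version, with hypothetical field names and rules, only illustrates the shape of the logic:

```python
import re

def validate_registration(form):
    """Server-side validation sketch for an enrollment form: collect an
    error message per invalid field and return an empty dict when the
    submission is acceptable. Field names and rules are illustrative."""
    errors = {}
    if not form.get("name", "").strip():
        errors["name"] = "Name is required"
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", form.get("email", "")):
        errors["email"] = "Invalid email address"
    if len(form.get("password", "")) < 8:
        errors["password"] = "Password must be at least 8 characters"
    return errors

ok = {"name": "Ada", "email": "ada@example.com", "password": "s3cretpass"}
print(validate_registration(ok))  # {}
```

Mirroring the same rules on the client (as the JavaScript validations did) gives fast feedback, while the server-side pass remains the authoritative check.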
