Senior Big Data Developer Resume

New York City, NY

SUMMARY

  • Over 6 years of professional IT experience, including 3 years with Big Data ecosystems and 3 years in software development, with continuous work experience in Java.
  • Industry expertise in financial services, banking, and technologies.
  • In-depth understanding of Hadoop architecture and its components, such as HDFS, MapReduce, High Availability, and YARN, as well as a good understanding of workload management, schedulers, scalability, and distributed platform architectures.
  • Proficient in Python and Scala within Apache Spark.
  • Technical expertise in Big Data/Hadoop: HDFS, MapReduce, Spark, Hive, Pig, Sqoop, Flume, Oozie, Kafka, and NoSQL.
  • Experience developing MapReduce jobs with Java API in Hadoop.
  • Experience developing Spark applications using Scala and Python.
  • Extensive experience importing and exporting data using Sqoop from HDFS/Hive/HBase to Relational Database Systems (RDBMS) and vice versa.
  • Experience in collecting, aggregating and moving large amounts of streaming data using Flume, Kafka, RabbitMQ, Spark Streaming.
  • Extensive experience writing Pig scripts and Hive queries for processing and analyzing large volumes of data with varying levels of structure.
  • Strong experience writing custom UDFs in Java for Hive and Pig to extend their functionality.
  • Exposure to the HBase distributed database and the ZooKeeper distributed configuration service.
  • Strong in core Java, data structures, algorithm design, Object-Oriented Design (OOD), and Java components such as the Collections Framework, exception handling, and the I/O system.
  • Strong database experience with RDBMS (SQL Server, MySQL), with PL/SQL programming skills in creating packages, stored procedures, functions, triggers, and cursors.
  • Strong experience in Machine Learning and Data Mining by using R and Python.
  • Experience in data visualization using Tableau, Talend, SSIS, Qlik Sense, and MicroStrategy.
  • Experience in Agile, Waterfall, and Scrum development environments using JIRA and Jenkins.

TECHNICAL SKILLS:

Big Data Ecosystem: Hadoop 2.6.3, Spark 1.6.20, MapReduce 1.0, YARN, Hive 2.0.0, Pig 0.15.0, HBase 1.1.4, Flume 1.5.0, Sqoop 1.4.6, ZooKeeper 3.5.2, Kafka 0.10.1.1, RabbitMQ 3.6.6, Oozie 4.2

Languages: Java 1.5/1.6/1.7/1.8, Python 2.7.12/3.5.2, Scala 2.11.5, R, SQL, HiveQL, Pig Latin

NoSQL Databases: MongoDB, Cassandra, HBase

Relational Databases: MySQL 5.6.x, SQL Server 2005/2008/2012, Oracle 11g/10g/9i, PostgreSQL 8.0

Business Intelligence: Tableau, Talend, SSIS, Qlik Sense, MicroStrategy

Web Technologies: J2EE (Servlets, JSP, Struts 2, Spring 4.0, JDBC, ODBC, Hibernate 4), Django, HTML/CSS, XML, JavaScript

Others: Cloudera Distributed Hadoop, ZooKeeper 3.4.5, Eclipse IDE, NetBeans 8.1

PROFESSIONAL EXPERIENCE:

Confidential, New York City, NY

Senior Big Data Developer

Responsibilities:

  • Extensively involved in installation and configuration of Cloudera Distribution Hadoop platform.
  • Extracted, transformed, and loaded (ETL) data from multiple federated data sources (JSON, relational databases, etc.) with DataFrames in Spark.
  • Utilized Spark SQL to extract and process data, parsing it with Datasets or RDDs in HiveContext and applying transformations and actions (map, flatMap, filter, reduce, reduceByKey).
  • Extended the capabilities of DataFrames using user-defined functions in Python and Scala.
  • Resolved missing fields in DataFrame rows using filtering and imputation.
  • Integrated visualizations into a Spark application using Databricks and popular visualization libraries (ggplot, matplotlib).
  • Trained analytical models with Spark ML estimators, including linear regression, decision trees, logistic regression, and k-means.
  • Performed pre-processing on datasets prior to training, including standardization and normalization.
  • Built processing pipelines comprising transformations, estimations, and evaluation of analytical models.
  • Evaluated model accuracy by dividing data into training and test datasets and computing metrics using evaluators.
  • Tuned training hyper-parameters by integrating cross-validation into pipelines.
  • Used Spark MLlib functionality not present in Spark ML by converting DataFrames to RDDs and applying RDD transformations and actions.
  • Troubleshot and tuned machine learning algorithms in Spark.

Environment: Spark 1.6.2, Spark MLlib, Spark ML, Hive 1.2.1, Sqoop 1.4.6, Flume 1.5.0, HBase 1.1.4, MySQL 5.6, Scala 2.11.x, PySpark 1.4.0, Shell Scripting, Tableau 9.2, Agile

Confidential, Boston, MA

Big Data Engineer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Imported data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Installed and configured Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.
  • Developed simple to complex MapReduce jobs using Java, and scripts using Hive and Pig.
  • Analyzed the data by performing Hive queries (HiveQL) and running Pig scripts (Pig Latin) for data ingestion and egress.
  • Implemented business logic by writing UDFs in Java and used various UDFs from other sources.
  • Loaded and transformed large sets of structured and semi-structured data.
  • Managed and reviewed Hadoop log files; deployed and maintained the Hadoop cluster.
  • Exported filtered data into HBase for fast querying.

Environment: Hadoop, HBase, Hive, Pig, MapReduce, Sqoop, Oozie, Eclipse, Java

Confidential, Boston, MA

Big Data Developer

Responsibilities:

  • Participated in team meetings and releases, working closely with teammates and managers.
  • Developed on Hadoop technologies including HDFS, MapReduce2, YARN, Hive, HBase, Sqoop, Spark Streaming and RabbitMQ.
  • Translated, loaded, and streamed disparate data sets in multiple formats and from multiple sources, including Avro and JSON delivered by Kafka queues, RabbitMQ, Flume, etc.
  • Translated functional and technical requirements into detailed programs running on Hadoop MapReduce and Spark.
  • Migrated traditional database code to distributed system code (mainly HiveQL).
  • Migrated data between RDBMS and HDFS/Hive with Sqoop.
  • Used HBase for scalable storage and fast querying.
  • Involved in application performance tuning and troubleshooting.

Environment: Hadoop, HBase, MapReduce, Spark, Flume, Sqoop, Kafka, RabbitMQ, Hive

Confidential, NY

Database Developer

Responsibilities:

  • Involved in system design based on the Spring, Struts, and Hibernate frameworks.
  • Implemented the business logic in standalone Java classes using core Java.
  • Developed database (SQL Server) applications.
  • Used Spring's HibernateTemplate to access the SQL Server database.
  • Designed, implemented, and tested new features using T-SQL.
  • Optimized existing data aggregation and reporting for better performance.
  • Performed varied analyses to support organization and client improvement.

Environment: Eclipse, SQL Server 2012, Spring, HTML, JavaScript, Hibernate, JSF, JUnit, SDLC: Agile/Scrum

Confidential, NJ

Software Engineer

Responsibilities:

  • Designed and coded application components with JSP, Servlet and AJAX.
  • Implemented data persistence using JDBC for database connectivity and Hibernate for object-relational mapping.
  • Designed the logical and physical data models and generated DDL and DML scripts.
  • Designed the user interface and used JavaScript for input validation.
  • Wrote SQL queries, stored procedures and database triggers as required on the database objects.

Environment: Java, XML, Hibernate, SQL Server, Maven 2, JUnit, J2EE (JSP, JavaBeans, DAO), Eclipse, Apache Tomcat Server, Spring MVC, Spiral Methodology

Confidential, NJ

Software Engineer

Responsibilities:

  • Developed an XML-based data parsing system.
  • Developed the system UI using Java Swing.
  • Developed with the Struts and Hibernate frameworks for the MVC layer.
  • Developed the front-end application using HTML/CSS, JavaScript, and JSP.
  • Developed SQL queries using Oracle database.

Environment: Eclipse, MySQL Client 4.1, JSP, HTML, JavaScript, Spring, Hibernate, SDLC: Agile/Scrum