
Sr. Big Data Developer Resume

Boston, MA

SUMMARY

  • Over 10 years of IT experience as a Big Data/Hadoop Developer across all phases of the Software Development Life Cycle, including work with Java/J2EE technologies.
  • Hands-on experience installing, configuring, and using Hadoop ecosystem components such as MapReduce, HDFS, HBase, Hive, Sqoop, Pig, Zookeeper, Storm, Spark, Kafka, and Flume.
  • Well versed in Amazon Web Services (AWS) cloud services such as EC2, S3, EBS, RDS, and VPC.
  • Improved the performance of existing Hadoop algorithms using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN (see the sketch after this list).
  • Proficient in Core Java and enterprise technologies such as EJB, Hibernate, Java web services (SOAP and REST), Java threads, Java sockets, Servlets, JSP, and JDBC.
  • Good exposure to Service-Oriented Architectures (SOA) built on web services (WSDL) using the SOAP protocol.
  • Wrote multiple MapReduce programs in Python for data extraction, transformation, and aggregation from multiple file formats, including XML, JSON, CSV, and various compressed formats.
  • Experience working across the Hadoop ecosystem, with some experience installing and configuring the Hortonworks and Cloudera distributions (CDH3 and CDH4).
  • Experience with the NoSQL databases HBase, MongoDB, and Cassandra.
  • Good understanding of Hadoop architecture and hands-on experience with components such as JobTracker, TaskTracker, NameNode, DataNode, and MapReduce programming.
  • Expertise in data migration, data profiling, data cleansing, transformation, integration, import, and export using ETL tools such as Informatica PowerCenter.
  • Experience in designing, building, and implementing complete Hadoop ecosystems comprising MapReduce, HDFS, Hive, Impala, Pig, Sqoop, Oozie, HBase, MongoDB, and Spark.
  • Experience with client-server application development using Oracle PL/SQL, SQL*Plus, SQL Developer, TOAD, and SQL*Loader.
  • Strong experience architecting highly performant databases using PostgreSQL, PostGIS, MySQL, and Cassandra.
  • Extensive experience with ER modeling tools such as Erwin and ER/Studio, as well as Teradata, BTEQ, MLDM, and MDM.
  • Experienced in R and Python for statistical computing; also experienced with MLlib (Spark), MATLAB, Excel, Minitab, SPSS, and SAS.
  • Experience in importing and exporting data between HDFS and RDBMS using Sqoop.
  • Extracted and processed streaming log data from various sources and integrated it into HDFS using Flume.
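
For illustration, a minimal Scala sketch of the Spark SQL/DataFrame style of tuning referenced above (replacing a hand-written MapReduce aggregation); the application name, input path, and column names are hypothetical:

    import org.apache.spark.sql.SparkSession

    object ClickstreamAggregation {
      def main(args: Array[String]): Unit = {
        // Hypothetical job: replace an older MapReduce aggregation with Spark SQL / DataFrames.
        val spark = SparkSession.builder()
          .appName("ClickstreamAggregation")
          .getOrCreate()
        import spark.implicits._

        // Read the raw events from HDFS (path and schema are illustrative).
        val events = spark.read.json("hdfs:///data/raw/clickstream/")
        events.cache() // reused by several aggregations below

        // DataFrame aggregation instead of a hand-written reducer.
        val dailyCounts = events.groupBy($"eventDate", $"eventType").count()
        dailyCounts.write.mode("overwrite").parquet("hdfs:///data/curated/daily_counts/")

        // Equivalent pair-RDD style where lower-level control is needed.
        val typeCounts = events.rdd
          .map(row => (row.getAs[String]("eventType"), 1L))
          .reduceByKey(_ + _)
        typeCounts.saveAsTextFile("hdfs:///data/curated/type_counts")

        spark.stop()
      }
    }

Caching the reused DataFrame and pushing the aggregation into Spark SQL is the kind of change that typically replaces multi-pass MapReduce jobs.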

TECHNICAL SKILLS

  • Big Data/Hadoop: Hadoop 2.7/2.5, HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Hue, Flume, Kafka and Spark 2.0/2.0.2
  • NoSQL Databases: HBase, MongoDB 3.2 and Cassandra
  • Java/J2EE Technologies: Servlets, JSP, JDBC, JSTL, EJB, JAXB, JAXP, JMS, JAX-RPC, JAX-WS
  • Programming Languages: Java, Python, Scala, SQL, PL/SQL, HiveQL, UNIX shell scripting
  • IDE and Tools: Eclipse 4.6, NetBeans 8.2, BlueJ
  • Database: Oracle 12c/11g, MySQL, SQL Server 2016/2014
  • Web Technologies: HTML5/4, DHTML, AJAX, JavaScript, jQuery, CSS3/2, JSP, Bootstrap 3/3.5
  • Application Server: Apache Tomcat, JBoss, IBM WebSphere, WebLogic
  • Operating Systems: Windows 8/7, UNIX/Linux and Mac OS
  • Other Tools: Maven, ANT, WSDL, SOAP, REST
  • Methodologies: Software Development Lifecycle (SDLC), Waterfall, Agile, STLC (Software Testing Life cycle), UML, Design Patterns (Core Java and J2EE)

PROFESSIONAL EXPERIENCE

Confidential - Boston, MA

Sr. Big Data Developer / Data Warehouse Developer

Responsibilities:

  • Worked as a Sr. Big Data Developer with Hadoop ecosystem components.
  • Developed Big Data solutions focused on pattern matching and predictive modeling.
  • Involved in Agile methodologies, daily scrum meetings, and sprint planning.
  • Primarily involved in the data migration process on Azure, integrating with a GitHub repository and Jenkins.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Analyzed and developed the system across the various phases of development, following the Agile Scrum methodology.
  • Worked on MongoDB using CRUD (Create, Read, Update and Delete) operations, indexing, replication, and sharding.
  • Designed the HBase row key to store text and JSON as cell values, structuring the key so rows can be fetched and scanned in sorted order (see the row-key sketch after this list).
  • Strong working experience in data analysis, SQL design and development, and the implementation and testing of data warehouses using extraction, transformation and loading (ETL) tools and Teradata.
  • Developed Spark and Spark SQL/Streaming code for faster testing and processing of data.
  • Used the Java Persistence API (JPA) framework for object-relational mapping based on POJO classes.
  • Responsible for fetching real time data using Kafka and processing using Spark and Scala.
  • Worked on Kafka to import real time weblogs and ingested the data to Spark Streaming.
  • Developed business logic using the Kafka Direct Stream approach in Spark Streaming and implemented business transformations (see the streaming sketch after this list).
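
For illustration, a minimal Scala sketch of the sorted-scan row-key design mentioned above; the table name, column family, key widths, and payload are hypothetical:

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Put, Scan}
    import org.apache.hadoop.hbase.util.Bytes
    import scala.collection.JavaConverters._

    object HBaseRowKeySketch {
      // Fixed-width entity prefix keeps all rows for one entity contiguous in the table.
      def keyPrefix(entityId: String): String = f"$entityId%-16s"

      // Composite row key: entity prefix + zero-padded reverse timestamp, so a bounded
      // scan over the prefix returns that entity's rows newest-first, already sorted.
      def rowKey(entityId: String, eventTimeMs: Long): String =
        keyPrefix(entityId) + f"${Long.MaxValue - eventTimeMs}%019d"

      def main(args: Array[String]): Unit = {
        val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
        val table = connection.getTable(TableName.valueOf("event_store")) // hypothetical table

        // Store a JSON document as the cell value under a single column family "d".
        val jsonPayload = """{"type":"login","status":"ok"}"""
        val put = new Put(Bytes.toBytes(rowKey("customer-0042", System.currentTimeMillis())))
        put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("json"), Bytes.toBytes(jsonPayload))
        table.put(put)

        // Bounded scan over one entity's key range returns rows already in sorted order.
        val prefix = keyPrefix("customer-0042")
        val scan = new Scan()
          .setStartRow(Bytes.toBytes(prefix))
          .setStopRow(Bytes.toBytes(prefix + "~"))
        val scanner = table.getScanner(scan)
        scanner.asScala.foreach(result => println(Bytes.toString(result.getRow)))

        scanner.close()
        table.close()
        connection.close()
      }
    }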
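
And a minimal Scala sketch of a Kafka Direct Stream feeding Spark Streaming, as referenced in the last item above; the broker address, topic, consumer group, and filtering logic are hypothetical:

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010._
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

    object WeblogStreaming {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("WeblogStreaming")
        val ssc = new StreamingContext(conf, Seconds(10)) // 10-second micro-batches

        // Kafka consumer settings; broker, group id, and topic name are illustrative.
        val kafkaParams = Map[String, Object](
          "bootstrap.servers" -> "kafka-broker:9092",
          "key.deserializer" -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id" -> "weblog-consumer",
          "auto.offset.reset" -> "latest",
          "enable.auto.commit" -> (false: java.lang.Boolean)
        )

        // Direct stream: each Kafka partition maps to a Spark partition, with no receivers.
        val stream = KafkaUtils.createDirectStream[String, String](
          ssc,
          PreferConsistent,
          Subscribe[String, String](Array("weblogs"), kafkaParams)
        )

        // Placeholder business transformation: count HTTP 500 responses per batch.
        stream.map(record => record.value())
          .filter(line => line.contains(" 500 "))
          .count()
          .print()

        ssc.start()
        ssc.awaitTermination()
      }
    }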

Environment: Agile, Hadoop 3.0, MS Azure, MapReduce, Java, MongoDB 4.0.2, HBase 1.2, JSON, Scala 2.12, Oozie 4.3, Zookeeper 3.4, J2EE, Python 3.7, jQuery, NoSQL, MVC, Struts 2.5.17, Hive 2.3

Confidential - Dallas, TX

Sr. Hadoop Developer

Responsibilities:

  • Worked on Spark SQL to handle structured data in Hive (see the Spark SQL sketch after this list).
  • Involved in creating Hive tables, loading data, writing Hive queries, and creating partitions and buckets for optimization.
  • Involved in migrating tables from an RDBMS into Hive using Sqoop and later generated visualizations with Tableau.
  • Worked on complex MapReduce programs to analyze data residing on the cluster.
  • Analyzed large data sets by running Hive queries and Pig scripts.
  • Wrote Hive UDFs to sort struct fields and return complex data types.
  • Worked in an AWS environment for the development and deployment of custom Hadoop applications.
  • Created shell scripts to simplify the execution of the other scripts (Pig, Hive, Sqoop, Impala, and MapReduce) and to move data into and out of HDFS.
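
For illustration, a minimal Scala sketch of querying Hive-managed tables through Spark SQL, as referenced in the first item above; the database, table, and column names are hypothetical:

    import org.apache.spark.sql.SparkSession

    object HiveQuerySketch {
      def main(args: Array[String]): Unit = {
        // enableHiveSupport lets Spark SQL read and write tables in the Hive metastore.
        val spark = SparkSession.builder()
          .appName("HiveQuerySketch")
          .enableHiveSupport()
          .getOrCreate()

        // Hypothetical partitioned Hive table; the WHERE clause prunes on the partition column.
        val dailySpend = spark.sql(
          """
            |SELECT customer_id, SUM(order_total) AS total_spend
            |FROM sales.orders
            |WHERE order_date = '2018-06-01'
            |GROUP BY customer_id
          """.stripMargin)

        // Persist the result back to Hive for downstream reporting (e.g. Tableau extracts).
        dailySpend.write.mode("overwrite").saveAsTable("sales.daily_customer_spend")

        spark.stop()
      }
    }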

Environment: HDFS, MapReduce, Storm, Hive, Pig, Sqoop, MongoDB, Apache Spark, Python, Accumulo, Oozie Scheduler, Kerberos, AWS, Tableau, Java, UNIX Shell scripts, HUE, SOLR, GIT, Maven.

Confidential - Franklin Lakes, NJ

Sr. Big Data Engineer / Data Warehouse Developer

Responsibilities:

  • As a Sr. Big Data Engineer, worked on Big Data technologies such as Apache Hadoop, MapReduce, shell scripting, and Hive.
  • Involved in all phases of the SDLC using Agile and participated in daily scrum meetings with cross-functional teams.
  • Wrote complex Hive queries to extract data from heterogeneous sources (Data Lake) and persist the data into HDFS.
  • Involved in all phases of data mining, data collection, data cleaning, developing models, validation and visualization.
  • Developed the code to perform Data extractions from Oracle Database and load it into AWS platform using AWS Data Pipeline.
  • Installed and configured Hadoop ecosystem components such as HBase, Flume, Pig, and Sqoop.
  • Designed and developed Big Data analytics solutions on a Hadoop-based platform and engaged clients in technical discussions.
  • Created an SSIS package to extract, validate, and load data into the data warehouse.
  • Worked on Hive table creation and partitioning (see the partitioning sketch after this list).
  • Installed, configured, and maintained the Hadoop cluster for application development, along with ecosystem components such as Hive, Pig, HBase, Zookeeper, and Sqoop.
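
For illustration, a minimal Scala sketch of Hive table creation and partitioning driven through Spark SQL, as noted above; the schema, table names, and staging path are hypothetical:

    import org.apache.spark.sql.SparkSession

    object HivePartitioningSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("HivePartitioningSketch")
          .enableHiveSupport()
          .getOrCreate()

        // Hypothetical claims table, partitioned by load date so queries can prune partitions.
        spark.sql(
          """
            |CREATE TABLE IF NOT EXISTS warehouse.claims (
            |  claim_id     STRING,
            |  member_id    STRING,
            |  claim_amount DECIMAL(12,2)
            |)
            |PARTITIONED BY (load_date STRING)
            |STORED AS PARQUET
          """.stripMargin)

        // Load one day's staged extract into its own partition (staging path is illustrative).
        spark.read.parquet("hdfs:///staging/claims/2018-09-15/")
          .createOrReplaceTempView("staged_claims")
        spark.sql(
          """
            |INSERT OVERWRITE TABLE warehouse.claims PARTITION (load_date = '2018-09-15')
            |SELECT claim_id, member_id, claim_amount FROM staged_claims
          """.stripMargin)

        spark.stop()
      }
    }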

Environment: Hive 2.3, MapReduce, Hadoop 3.0, HDFS, Oracle, Spark 2.3, HBase 1.2, Flume 1.8, Pig 0.17, Sqoop 1.4, Oozie 4.3, Python, PL/SQL, NoSQL, SSIS, SSRS, Visio, AWS Redshift, Teradata, SQL, PostgreSQL, EC2, S3, Windows

Confidential - Florham Park, NJ

Jr. Data Analyst/Data Modeler

Responsibilities:

  • Worked with Business Analysts team in requirements gathering and in preparing functional specifications and translating them to technical specifications.
  • Worked with Business users during requirements gathering and prepared Conceptual, Logical and Physical Data Models.
  • Supported business analysis and marketing campaign analytics with data mining, data processing, and investigation to answer complex business questions.
  • Developed scripts that automated the DDL and DML statements used in the creation of databases, tables, constraints, and updates.
  • Planned and defined system requirements as use cases, use case scenarios, and use case narratives using UML (Unified Modeling Language) methodologies.
  • Gathered analysis report prototypes from business analysts across different business units and participated in JAD sessions to discuss various reporting needs.

Environment: PL/SQL, Erwin 8.5, MS SQL 2012, OLTP, ODS, OLAP, SSIS, Transact-SQL, Teradata SQL Assistant
