Sr. Big Data Developer Resume
Boston, MA
SUMMARY
- Over 10 years of IT experience as a Big Data/Hadoop Developer across all phases of the Software Development Life Cycle, including Java/J2EE technologies.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as Hadoop MapReduce, HDFS, HBase, Hive, Sqoop, Pig, Zookeeper, Storm, Spark, Kafka and Flume.
- Well versed in Amazon Web Services (AWS) cloud services such as EC2, S3, EBS, RDS and VPC.
- Improved the performance of existing Hadoop algorithms using Spark Context, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
- Proficient in Core Java and enterprise technologies such as EJB, Hibernate, Java web services, SOAP, REST services, Java threads, Java sockets, Servlets, JSP and JDBC.
- Good exposure to Service-Oriented Architectures (SOA) built on web services (WSDL) using the SOAP protocol.
- Wrote multiple MapReduce programs in Python for data extraction, transformation and aggregation across multiple file formats, including XML, JSON, CSV and various compressed formats.
- Experience working with the Hadoop ecosystem, including limited experience installing and configuring the Hortonworks and Cloudera (CDH3 and CDH4) distributions.
- Experience with NoSQL databases HBase, MongoDB and Cassandra.
- Good understanding of Hadoop architecture and hands-on experience with Hadoop components such as JobTracker, TaskTracker, NameNode, DataNode and MapReduce programming.
- Expertise in data migration, data profiling, data cleansing, transformation, integration, data import and data export using ETL tools such as Informatica PowerCenter.
- Experience designing, building and implementing a complete Hadoop ecosystem comprising MapReduce, HDFS, Hive, Impala, Pig, Sqoop, Oozie, HBase, MongoDB and Spark.
- Experience with client-server application development using Oracle PL/SQL, SQL*Plus, SQL Developer, TOAD and SQL*Loader.
- Strong experience architecting highly performant databases using PostgreSQL, PostGIS, MySQL and Cassandra.
- Extensive experience using ER modeling tools such as Erwin and ER/Studio, as well as Teradata, BTEQ, MLDM and MDM.
- Experienced in R and Python for statistical computing; also experienced with Spark MLlib, MATLAB, Excel, Minitab, SPSS and SAS.
- Experience in importing and exporting data between HDFS and RDBMS using Sqoop.
- Extracted and processed streaming log data from various sources and integrated it into HDFS using Flume.
TECHNICAL SKILLS
- Big Data/Hadoop: Hadoop 2.7/2.5, HDFS 1.2.4, MapReduce, Hive, Pig, Sqoop, Oozie, Hue, Flume, Kafka and Spark 2.0/2.0.2
- NoSQL Databases: HBase, MongoDB 3.2 & Cassandra
- Java/J2EE Technologies: Servlets, JSP, JDBC, JSTL, EJB, JAXB, JAXP, JMS, JAX-RPC, JAX-WS
- Programming Languages: Java, Python, SQL, PL/SQL, AWS, HiveQL, Unix Shell Scripting, Scala
- IDE and Tools: Eclipse 4.6, NetBeans 8.2, BlueJ
- Databases: Oracle 12c/11g, MySQL, SQL Server 2016/2014
- Web Technologies: HTML5/4, DHTML, AJAX, JavaScript, jQuery and CSS3/2, JSP, Bootstrap 3/3.5
- Application Servers: Apache Tomcat, JBoss, IBM WebSphere, WebLogic
- Operating Systems: Windows 8/7, UNIX/Linux and Mac OS.
- Other Tools: Maven, ANT, WSDL, SOAP, REST.
- Methodologies: Software Development Lifecycle (SDLC), Waterfall, Agile, STLC (Software Testing Life cycle), UML, Design Patterns (Core Java and J2EE)
PROFESSIONAL EXPERIENCE
Confidential - Boston, MA
Sr. Big Data Developer / Data Warehouse Developer
Responsibilities:
- Worked as a Sr. Big Data Developer with Hadoop Ecosystems components.
- Developed Big Data solutions focused on pattern matching and predictive modeling.
- Involved in Agile methodologies, daily scrum meetings and sprint planning.
- Primarily involved in the data migration process using Azure, integrating with a GitHub repository and Jenkins.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Involved in various phases of development; analyzed and developed the system following the Agile Scrum methodology.
- Worked on MongoDB by using CRUD (Create, Read, Update and Delete), Indexing, Replication and Sharding features.
- Designed the HBase row key to store text and JSON as key values, structuring the row key so that gets and scans return data in sorted order.
- Strong working experience in data analysis, SQL design and development, and the implementation and testing of data warehousing using extraction, transformation and loading (ETL) tools and Teradata.
- Developed Spark code and Spark-SQL/Streaming for faster testing and processing of data.
- Used the Java Persistence API (JPA) framework for object-relational mapping based on POJO classes.
- Responsible for fetching real time data using Kafka and processing using Spark and Scala.
- Worked on Kafka to import real time weblogs and ingested the data to Spark Streaming.
- Developed business logic using Kafka Direct Stream in Spark Streaming and implemented business transformations.
Environment: Agile, Hadoop 3.0, MS Azure, MapReduce, Java, MongoDB 4.0.2, HBase 1.2, JSON, Scala 2.12, Oozie 4.3, Zookeeper 3.4, J2EE, Python 3.7, JQuery, NoSQL, MVC, Struts 2.5.17, Hive 2.3
Confidential - Dallas, TX
Sr. Hadoop Developer
Responsibilities:
- Worked on Spark SQL to handle structured data in Hive.
- Involved in creating Hive tables, loading data, writing Hive queries, and creating partitions and buckets for optimization.
- Involved in migrating tables from RDBMS into Hive tables using Sqoop and later generating visualizations using Tableau.
- Worked on complex MapReduce programs to analyze data residing on the cluster.
- Analyzed substantial data sets by running Hive queries and Pig scripts.
- Wrote Hive UDFs to sort struct fields and return complex data types.
- Worked in AWS environment for development and deployment of custom Hadoop applications.
- Involved in creating Shell scripts to simplify the execution of all other scripts (Pig, Hive, Sqoop, Impala and MapReduce) and move the data inside and outside of HDFS.
Environment: HDFS, MapReduce, Storm, Hive, Pig, Sqoop, MongoDB, Apache Spark, Python, Accumulo, Oozie Scheduler, Kerberos, AWS, Tableau, Java, UNIX Shell scripts, HUE, SOLR, GIT, Maven.
Confidential - Franklin Lakes, NJ
Sr. Big Data Engineer / Data Warehouse Developer
Responsibilities:
- As a Sr. Big Data Engineer, worked on Big Data technologies such as Apache Hadoop, MapReduce, shell scripting and Hive.
- Involved in all phases of the SDLC using Agile and participated in daily scrum meetings with cross-functional teams.
- Wrote complex Hive queries to extract data from heterogeneous sources (Data Lake) and persist the data into HDFS.
- Involved in all phases of data mining, data collection, data cleaning, developing models, validation and visualization.
- Developed the code to perform Data extractions from Oracle Database and load it into AWS platform using AWS Data Pipeline.
- Installed and configured Hadoop ecosystem components such as HBase, Flume, Pig and Sqoop.
- Designed and developed Big Data analytics solutions on a Hadoop-based platform and engaged clients in technical discussions.
- Created an SSIS package to extract, validate and load data into the data warehouse.
- Worked on Hive table creation and partitioning.
- Installed, Configured and Maintained the Hadoop cluster for application development and Hadoop ecosystem components like Hive, Pig, HBase, Zookeeper and Sqoop.
Environment: Hive 2.3, MapReduce, Hadoop 3.0, HDFS, Oracle, Spark 2.3, HBase 1.2, Flume 1.8, Pig 0.17, Sqoop 1.4, Oozie 4.3, Python, PL/SQL, NoSQL, SSIS, SSRS, Visio, AWS Redshift, Teradata, SQL, PostgreSQL, EC2, S3, Windows
Confidential - Florham Park, NJ
Jr. Data Analyst/Data Modeler
Responsibilities:
- Worked with Business Analysts team in requirements gathering and in preparing functional specifications and translating them to technical specifications.
- Worked with Business users during requirements gathering and prepared Conceptual, Logical and Physical Data Models.
- Supported business analysis and marketing campaign analytics with data mining, data processing and investigation to answer complex business questions.
- Developed scripts that automated DDL and DML statements used in creations of databases, tables, constraints, and updates.
- Planned and defined system requirements as use cases, use case scenarios and use case narratives using UML (Unified Modeling Language) methodologies.
- Gathered analysis report prototypes from business analysts across different business units; participated in JAD sessions discussing various reporting needs.
Environment: PL/SQL, Erwin 8.5, MS SQL 2012, OLTP, ODS, OLAP, SSIS, Transact-SQL, Teradata SQL Assistant