Big Data Developer Resume

El Paso, Texas

SUMMARY

  • 6+ years of experience with an emphasis on Big Data technologies and the development and design of Java-based enterprise applications
  • 4 years of experience as a Hadoop Developer in Big Data/Hadoop technology development and 3 years of Java application development
  • Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
  • Expertise in Hadoop ecosystem components like MapReduce, HDFS, MapR-FS, Hive, Spark Streaming, Oozie, Pig, Kafka, Flume, NiFi, Impala, Storm, and ZooKeeper.
  • Hands-on experience with databases like Sybase, Oracle, MS SQL, and DB2, and with RDBMS development including SQL queries, stored procedures, and triggers.
  • Expertise in NoSQL databases like MongoDB, HBase, MapR-DB, and Cassandra and their integration with Hadoop clusters.
  • Exposure to simplifying and automating big data integration with graphical tools and wizards that generate native code, using Talend.
  • Worked as a Java developer with extensive experience in Java libraries, APIs, and frameworks.
  • Expertise in Java/J2EE technologies such as Core Java, Struts, RESTful and SOAP web services, JDBC, JSP, JSTL, HTML, JavaScript, and JSON.
  • Worked with popular frameworks like Hibernate and MVC; expertise with Spring Core, IoC, Spring MVC, JDBC, and web modules.
  • Extensive experience in importing and exporting data using stream processing platforms like Flume, Storm and Kafka.
  • Worked in different environments such as Hortonworks, Cloudera, and MapR distributions.
  • Hands-on MapReduce programming using Java and Scala for data cleaning and preprocessing.
  • Implemented discretization and binning, data wrangling: cleaning, transforming, merging and reshaping data frames.
  • Good knowledge of Amazon Web Services (AWS) and Amazon cloud services like Elastic Compute Cloud (EC2).
  • Experienced in Amazon EC2: setting up instances, virtual private clouds (VPCs), and security groups.
  • Worked with AWS EC2 and CloudWatch services; managed CI/CD pipelines through Jenkins; automated manual tasks using shell scripting.
  • Experience with application servers and web servers such as WebLogic, JBoss, WebSphere, and Apache.
  • Good knowledge of writing Spark applications in Scala and Python (PySpark).
  • Experienced in migrating ETL transformations using Pig Latin scripts, transformations, and join operations.
  • Expert in scheduling Spark jobs with Airflow.
  • Hands-on experience with ETL services like AWS Glue.
  • Flexible with Unix/Linux and Windows environments, working with operating systems like CentOS 5/6, Ubuntu 13/14, and Cosmos.
  • Strong experience working in both Agile Scrum and Waterfall SDLC methodologies.
  • Adept in Agile/Scrum methodology and familiar with the SDLC life cycle from requirement analysis and system study through design, testing, debugging, documentation, and implementation.
  • Proven ability to complete tasks ahead of schedule, along with good communication skills.
  • Collaborative team player with strong analytical and problem-solving skills.
  • Leadership skills, technical competence, and a research-minded approach.

TECHNICAL SKILLS

Big Data / Hadoop: HDFS, MapR-FS, MapReduce, HBase, Kafka, Pig, Hive, Sqoop, Impala, Flume, Talend, Cloudera, Hortonworks, MapR, MongoDB, NiFi, Scala, Cassandra, MapR-DB, Oozie and ZooKeeper

Real-time/Stream Processing: Apache Storm, Apache Spark

Operating Systems: Windows, Unix and Linux

Programming Languages: C, Java/J2EE, JDBC, JSP, JavaScript, jQuery, SQL, Shell Script

Database: Oracle 9i/10g, SQL Server, MS Access

IDE Development Tools: Eclipse, NetBeans

Java Technologies: Servlets, JSP, Struts, Spring, Web Services, Hibernate, JDBC

Methodologies: Agile, Scrum and Waterfall

PROFESSIONAL EXPERIENCE

Confidential, El Paso, Texas

Big Data Developer

Responsibilities:

  • Analyzed the business requirements thoroughly with the business partners
  • Part of the team that installed and configured Hadoop MapReduce and HDFS
  • Implemented NiFi flow topologies to perform cleansing operations before moving data into HDFS
  • Worked with Apache NiFi flows to convert raw XML data into ORC and Parquet files
  • Developed custom code to read messages from IBM MQ and dump them onto the NiFi queues
  • Used Flume to create fan-in and fan-out multiplexing flows, along with custom interceptors for data conversion
  • Installed the Oozie workflow engine to run multiple Hive, shell script, Sqoop, Pig, and Java jobs
  • Set up and managed Kafka for stream processing, including broker and topic configuration and creation
  • Integrated Apache Storm with Kafka to perform web analytics; uploaded clickstream data from Kafka to HDFS, HBase, and Hive by integrating with Storm
  • Created, altered, and deleted Kafka topics as required
  • Used Hive map-side/skew join queries to join multiple tables of a source system and load them into Elasticsearch tables
  • Solved performance issues in Hive and Pig scripts with an understanding of joins, grouping, and aggregation and how they translate to Elastic MapReduce jobs
  • Created Hive UDFs in Java, compiled them into JARs, added them to HDFS, and executed them from Hive queries (a minimal sketch of such a UDF follows this list)
  • Coded custom processors in NiFi and implemented consumers and producers for Kafka topics
  • Developing a Data ingestion workflow using tools like NiFi for HBase Ingestions
  • Worked on performance tuning of Apache NiFi workflow to optimize the data ingestion speeds
  • Used Pig as an ETL tool to perform transformations, joins, and some pre-aggregations before storing the data in HDFS
  • Worked with AWS EC2 and CloudWatch services; managed the CI/CD pipeline through Jenkins; automated manual tasks using shell scripting
  • Worked with AWS Data Pipeline using DynamoDB.
  • Proficient in data modeling with Hive partitioning, indexing, bucketing, and other Hive optimization techniques
  • Configured Spark Streaming to receive real-time data from Kafka and store the stream data to HDFS using Java (a minimal sketch appears after the Environment line below)
  • Involved in importing metadata into Hive using Java and migrated existing tables and applications to work on Hive
  • Created entities in Scala and Java, along with named queries, to interact with the database
  • Implemented Spark using Scala and Spark SQL for faster testing and processing of data
  • Developed an extensible REST client for the RESTful web services using Spring MVC and Java
  • Developed a framework of RESTful web services using Spring MVC, JPA, and APIs to help Hadoop developers automate data quality checks
  • Designed the HBase schema to avoid hot-spotting and exposed the data from HBase tables to a REST API on the UI
  • Stored data in HBase using Pig and was involved in parsing data with Pig
  • Involved in implementing a script to transfer data from Oracle to HBase using Sqoop
  • Development activities were carried out using Agile methodologies
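
A minimal sketch, in Java, of a Hive UDF of the kind described above; the class name, function name, and paths are illustrative placeholders rather than actual project code:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Simple UDF that trims whitespace and upper-cases a string column.
    public class TrimUpper extends UDF {
        // Hive calls evaluate() once per row; a null input yields a null output.
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }

Once compiled into a JAR and copied to HDFS, such a UDF would typically be registered and invoked from Hive roughly as: ADD JAR hdfs:///path/to/udfs.jar; CREATE TEMPORARY FUNCTION trim_upper AS 'TrimUpper'; SELECT trim_upper(col) FROM some_table;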

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, J2EE, Eclipse, ORC, Parquet, NiFi, HBase, Kafka, Oozie, ZooKeeper, Spring Boot, Spring Core, Spark RDD, Spark SQL and Spark Streaming.
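
The Spark Streaming ingestion from Kafka into HDFS mentioned above might look roughly like the following Java sketch (using the spark-streaming-kafka-0-10 integration); the broker address, topic name, group id, batch interval, and HDFS path are placeholders, not project values:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;

    public class KafkaToHdfs {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("KafkaToHdfs");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

            // Kafka consumer configuration (placeholder broker and group id).
            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "broker1:9092");
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "clickstream-consumer");

            JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                    jssc,
                    LocationStrategies.PreferConsistent(),
                    ConsumerStrategies.<String, String>Subscribe(
                        Collections.singletonList("clickstream"), kafkaParams));

            // Keep only the message payloads and land each micro-batch in HDFS.
            stream.map(ConsumerRecord::value)
                  .dstream()
                  .saveAsTextFiles("hdfs:///data/landing/clickstream", "txt");

            jssc.start();
            jssc.awaitTermination();
        }
    }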

Confidential

Big Data Developer

Responsibilities:

  • Involved in cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups
  • Worked in a MapR Distribution environment with the MapR File System (MapR-FS).
  • Used MapR Streams with the Apache Kafka API.
  • Also involved with HDFS, Apache Hadoop, and the MapReduce APIs.
  • Used MapR-DB for analytical and operational applications and real-time sharing.
  • Implemented Spark Streaming to read real-time data from Kafka in parallel, process it in parallel, and save the results in Parquet format in Hive
  • Involved in a complete analysis of the Avro, Parquet, and ORC file formats and decided to go with Parquet
  • Worked on the module of the project related to on-boarding data using Apache NiFi
  • Participated in MapReduce programs running on the cluster; managed and reviewed log files
  • Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis
  • Did a proof of concept with an Apache NiFi workflow in place of Oozie to automate the tasks of loading data into MapR-FS and pre-processing it with Pig
  • Used NiFi to copy data from the local file system to HDFS
  • Used NiFi to automate the data movement between different Hadoop systems
  • Part of the team working on a plug-in for Hadoop that provides the ability to use MongoDB as an input source and an output destination for MapReduce, Hive, and Pig
  • Extracted data from MongoDB through Sqoop, placed it in HDFS, and processed it
  • Involved in data modeling and in sharding and replication strategies in MongoDB
  • Worked on the integration part of storing data from REST into MongoDB
  • Involved in the migration process from Hadoop Java MapReduce programs to Spark Scala APIs
  • Wrote Scala programs to create Spark RDDs and SQL DataFrames to load processed data into an RDBMS for a mortgage analysis dashboard (a sketch of this pattern follows this list)
  • Composing the application classes as Spring Beans using Spring IOC/Dependency Injection.
  • Designed and Developed server side components using Java, REST
  • Worked on a POC to compare the processing time of Impala with Apache Hive for batch applications, in order to implement the former in the project
  • Optimized Hive/Impala queries for fast results and pushed data from Impala to MicroStrategy
  • Used Impala as the primary analytical tool, allowing visualization servers to connect and perform reporting directly on top of Hadoop
  • Used Impala and Tableau to create various reporting dashboards
  • Implemented Spark RDD transformations to map business analysis and applied actions on top of the transformations
  • Created Spark-based Talend Big Data Integration jobs to do lightning-speed analytics over the Spark cluster
  • Involved in migrating MapReduce jobs into Spark jobs and used the Spark SQL and DataFrames APIs to load structured data into Spark clusters
  • Participated in establishing connectivity between Tableau and Spotfire
  • Responsible for the design and development of Spark SQL scripts based on functional specifications
  • Developed Oozie workflow for scheduling and orchestrating the ETL process
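
The bullet above describes Scala programs that load Spark SQL DataFrames into an RDBMS; a rough Java rendering of the same pattern is sketched below for consistency with the other sketches in this document. The Hive table, column names, JDBC URL, credentials, and target table are placeholders:

    import java.util.Properties;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SaveMode;
    import org.apache.spark.sql.SparkSession;

    public class MortgageDashboardLoad {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("MortgageDashboardLoad")
                    .enableHiveSupport()
                    .getOrCreate();

            // Read the already-processed data from Hive (placeholder table and columns).
            Dataset<Row> mortgages = spark.sql(
                    "SELECT loan_id, state, balance FROM processed.mortgages");

            // JDBC connection details are placeholders.
            Properties props = new Properties();
            props.setProperty("user", "report_user");
            props.setProperty("password", "********");
            props.setProperty("driver", "oracle.jdbc.OracleDriver");

            // Overwrite the reporting table that backs the mortgage analysis dashboard.
            mortgages.write()
                    .mode(SaveMode.Overwrite)
                    .jdbc("jdbc:oracle:thin:@dbhost:1521/REPORTS", "MORTGAGE_DASHBOARD", props);

            spark.stop();
        }
    }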

Environment: MapR Distribution, MapR File System, MapReduce APIs, Flume, NiFi, Pig, Tableau, Scala, Spark RDD, Spark SQL, Hive, Impala, Oozie, Kafka, MongoDB, MapR-DB, SQL, Spring MVC, Spring IoC.

Confidential

Hadoop/Java Developer

Responsibilities:

  • Developed requirement prototypes according to the business requirements, definitions, and business process flows
  • Involved in requirements gathering and documenting the functional specifications
  • Worked with installation, configuration and testing on several Hadoop ecosystem components like Hive, Pig, HBase, and Sqoop.
  • Worked on MapReduce jobs in Java for data preprocessing and cleaning (a minimal mapper sketch follows this list).
  • Developed the Oozie workflow to automate the tasks of loading data into HDFS and preprocessing it with Pig.
  • Monitored the Hadoop cluster using tools like Cloudera Manager, managing and scheduling jobs on the Hadoop cluster.
  • Developed Kafka producers and consumers, HBase clients, and Spark jobs, along with the Hive and HDFS components.
  • Worked with and had knowledge of coordination services through ZooKeeper.
  • Analysis and Design of the Object models using JAVA/J2EE Design Patterns in various tiers of the application
  • Developers using the framework build the graphical components and define actions and popup menus in XML
  • Developed server-side code in Servlets and JSP, and designed and implemented a Struts framework for Swing
  • Designed the use cases, sequence diagrams, class diagrams, and activity diagrams
  • Developed the Struts presentation layer using JSP and used JavaScript for client-side validations
  • Created the test plan and developed and coded test classes and test cases
  • Executed test cases in JBuilder; handled defect fixing, client communication, and query resolution
  • Testing of the product using Regression Testing, Unit Testing and Integration Testing
  • Created struts-config file and resource bundles for Distribution module using Struts Framework
  • Worked on Core Java for multithreading and arrays, and developed the JSPs for the application
  • Designed and developed code for MVC architecture using Struts framework using Servlets, JSPs
  • Designed the application by implementing JSF Framework based on MVC Architecture with EJB, simple Java Beans as a Model, JSP and JSF UI Components as View and Faces Servlet as a Controller
  • Involved in deployment and local initialization of the application on the Oracle WebLogic server
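
A minimal sketch, in Java, of the kind of cleaning mapper described above; the delimiter and expected field count are illustrative assumptions, not project specifics:

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Drops blank or malformed records and trims whitespace from each field.
    public class CleaningMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

        private static final int EXPECTED_FIELDS = 5;   // assumed schema width

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString().trim();
            if (line.isEmpty()) {
                return;                                 // skip empty lines
            }
            String[] fields = line.split(",", -1);
            if (fields.length != EXPECTED_FIELDS) {
                return;                                 // skip malformed records
            }
            StringBuilder cleaned = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) {
                    cleaned.append(',');
                }
                cleaned.append(fields[i].trim());
            }
            context.write(new Text(cleaned.toString()), NullWritable.get());
        }
    }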

Environment: Java, J2SE 5.0, Struts, Servlets, JSP, Eclipse, Oracle 8i, CouchDB, XML, HTML/DHTML, JBuilder.

Confidential

Junior Java Developer

Responsibilities:

  • Implemented Action classes and ActionForm classes for the entire Reports module using the Struts framework
  • Created Tiles definitions, struts-config files, and resource bundles using the Struts framework
  • Worked with Core Java, implementing multithreading within the Struts framework
  • Worked with OOP concepts and memory concepts like string pools
  • Used Eclipse for writing HTML, Java, Servlet, JSP, and JavaScript code
  • Implemented various design patterns like MVC, Factory, and Singleton
  • Deployed and tested the JSP pages in Tomcat server
  • Developed Struts framework ActionServlet classes for the controller and developed form beans for transferring data between the Action classes and the view layer
  • Developed the front-end UI using JSP, HTML, JavaScript, and CSS
  • Implemented the Struts Validator framework to validate the data
  • Used Java Message Service (JMS) for reliable and asynchronous exchange of important information, such as loan status reports, between the clients and the bank (a minimal sketch follows this list)
  • Developed interfaces using HTML and JSP pages with the Struts presentation view
  • Used Hibernate for object-relational mapping and for database operations in Oracle
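
A minimal sketch of sending a loan-status message over JMS, as described above; the JNDI names and payload are placeholders, not the actual integration details:

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.naming.InitialContext;

    public class LoanStatusSender {

        // Sends one loan-status payload to the bank's queue and returns immediately,
        // so the exchange is asynchronous from the caller's point of view.
        public void send(String loanStatusXml) throws Exception {
            InitialContext ctx = new InitialContext();
            ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
            Queue queue = (Queue) ctx.lookup("jms/LoanStatusQueue");

            Connection connection = factory.createConnection();
            try {
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                MessageProducer producer = session.createProducer(queue);

                TextMessage message = session.createTextMessage(loanStatusXml);
                producer.send(message);
            } finally {
                connection.close();
            }
        }
    }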

Environment: Java, Servlets, Core Java, Multi-Threading, Struts, Hibernate, UML, Oracle, Tomcat, Eclipse, Windows XP, HTML, CSS, JSP.
