
Big Data Consultant Resume


Eden Prairie, MN

SUMMARY:

  • Over 9 years of professional IT experience, including work with the Big Data ecosystem and Java/J2EE technologies.
  • Good exposure to production-environment processes such as change management, incident management, and escalation handling.
  • Experience with AWS components such as Amazon EC2 instances, S3 buckets, and EBS volumes.
  • Experience with Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Hands-on experience installing, configuring, and using Hadoop ecosystem components such as Hadoop MapReduce, HDFS, HBase, Hive, Sqoop, Pig, Flume, and Kafka.
  • In-depth knowledge of analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
  • Good knowledge of Hadoop cluster architecture and experience working with Hadoop clusters on the Cloudera (CDH5) and Hortonworks distributions.
  • Excellent understanding and knowledge of NoSQL databases such as MongoDB and HBase.
  • Extensive knowledge of file formats such as Avro, SequenceFile, Parquet, ORC, and RCFile.
  • Experience importing and exporting data between Hadoop and relational database systems using Sqoop.
  • Good knowledge of job scheduling and workflow design tools such as Oozie.
  • Good experience creating real-time data streaming solutions using Apache Spark, Kafka, and Flume.
  • Experienced in developing Spark applications using the Spark Core, Spark SQL, and Spark Streaming APIs (a brief sketch follows this list).
  • Extended Hive and Pig core functionality by writing custom UDFs.
  • Experience managing Hadoop clusters using Cloudera Manager and Ambari.
  • Experience installing and monitoring standalone, multi-node Kafka and Storm clusters.
  • Very good experience with the complete project life cycle (design, development, testing, and implementation) and with Rapid Application Development (RAD), Agile, and Scrum software development processes.
  • Highly skilled in object-oriented architectures and patterns, systems analysis, software design, effective coding practices, databases, and servers.
  • Designed and maintained Oozie workflows to manage the flow of jobs in the cluster.
  • Developer on a Big Data team, working with Hadoop on the AWS cloud and its ecosystem.
  • Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Experience in Java, JSP, Servlets, WebLogic, WebSphere, Hibernate, Spring, JBoss, JDBC, JavaScript, Ajax, jQuery, XML, and HTML.
  • Experience with compression techniques such as Gzip, LZO, Snappy, and Bzip2.
  • Experience working with multiple operating systems, including Windows and Linux, with strong troubleshooting skills for finding and fixing critical problems.
  • Functional knowledge of Banking and Health Insurance domain.
  • Ability to adapt to evolving technology, strong sense of responsibility and accomplishment.
  • Proficient with Core Java and AWT, as well as web technologies such as HTML5, XHTML, DHTML, CSS, XML 1.1, XSL, XSLT, XPath, XQuery, Angular.js, and Node.js.
  • Worked with version control systems such as Subversion, Perforce, and Git, providing a common platform for all developers.
  • Articulate in written and verbal communication along with strong interpersonal, analytical, and organizational skills.
  • Experience preparing deployment packages, deploying them to Dev and QA environments, and preparing deployment instructions for the Production Deployment Team.
  • Highly motivated team player with the ability to work independently and adapt quickly to new and emerging technologies.
  • Creatively communicate and present models to business customers and executives, utilizing a variety of formats and visualization methodologies.
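
As a small illustration of the Spark SQL and DataFrame work referenced in this list, the following is a minimal sketch in Java. It assumes Spark 2.x; the application name, HDFS path, and field names are hypothetical, not details of any engagement listed below.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    // Minimal Spark SQL sketch: load a hypothetical web-log JSON file from HDFS,
    // register it as a temporary view, and aggregate hits per HTTP status code.
    public class WebLogStats {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("WebLogStats")
                    .getOrCreate();

            // Assumption: logs were parsed upstream into JSON records with "url" and "status" fields.
            Dataset<Row> logs = spark.read().json("hdfs:///data/weblogs/current.json");
            logs.createOrReplaceTempView("weblogs");

            Dataset<Row> hitsByStatus = spark.sql(
                    "SELECT status, COUNT(*) AS hits FROM weblogs GROUP BY status ORDER BY hits DESC");
            hitsByStatus.show();

            spark.stop();
        }
    }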

TECHNICAL SKILLS:

Big Data Technologies: Apache Hadoop (MRv1, MRv2), Hive, Pig, Sqoop, HBase, MongoDB, Flume, Spark, Zookeeper, Oozie.

Languages: C, Java, SQL/PLSQL.

Methodologies: Agile, RAD, V-model.

Databases: Oracle, MySQL, MongoDB, HBase, MS SQL Server.

Web Technologies: HTML, JSP, JSF, CSS, JavaScript, JSON & AJAX

IDEs: Eclipse, NetBeans

Build tools: Maven, Ant.

Web services: SOAP & RESTful Web Services

Cloud solutions: Amazon Web Services (AWS)

Monitoring Tools: Wireshark, Nagios, Ganglia

Operating Systems: Windows, Ubuntu, Red Hat Linux, CentOS.

Scripting languages: JavaScript, Shell Scripting.

PROFESSIONAL EXPERIENCE:

Confidential, Eden Prairie, MN

Big Data Consultant

Responsibilities:

  • Involved in installing and configuring parcels for various Hadoop ecosystem components, including HDFS, Pig, Hive, Sqoop, and HBase.
  • Evaluated, refined, and continuously improved the efficiency and accuracy of existing predictive models.
  • Developed a data pipeline using Sqoop to ingest customer behavioral data into HDFS for analysis.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, and capacity planning.
  • Imported data into Hive from RDBMS systems using Sqoop.
  • Wrote Hive queries to parse logs and structure them in tabular format, enabling effective querying of the log data for business analytics.
  • Good experience with Hive partitioning and bucketing, performing different types of joins on Hive tables, and implementing Hive SerDes.
  • Used HBase for fast, random reads and writes on all stored data and integrated it with other components such as Hive.
  • Provided production support for cluster maintenance.
  • Triggered workflows based on time or availability of data using Oozie.
  • Monitored and debugged Hadoop jobs and applications running in production.
  • Provided user support and application support on the Hadoop infrastructure.
  • Installed Kafka on the Hadoop cluster and configured producers and consumers in Java (a brief sketch follows this list).
  • All metrics data is published directly to Kafka, where it is consumed by a consumer group feeding the Spark Streaming API.
  • Loaded data from various data sources into HBase using Kafka.
  • Performance-tuned Kafka and Storm clusters and benchmarked real-time streams.
  • Added authorization to the server using each user's Kerberos identity to determine their role and the operations they could perform.
  • Used Spark for interactive queries, streaming-data processing, and integration with a popular NoSQL database for large data volumes.
  • Implemented Spark applications in Java and Scala, utilizing DataFrames and the Spark SQL API for faster data processing.
  • Used Spark SQL to analyze web logs, with Spark transformations and actions in Scala to compute statistics for web server monitoring.
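
A minimal sketch of the Kafka producer and consumer configuration mentioned above, written against the standard org.apache.kafka.clients Java API. The broker address, topic name, and consumer group id are hypothetical placeholders.

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class MetricsKafkaExample {

        // Publishes one metrics event to a hypothetical "metrics" topic.
        static void publish(String brokers, String payload) {
            Properties props = new Properties();
            props.put("bootstrap.servers", brokers);
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("metrics", payload));
            }
        }

        // Polls the same topic as part of a consumer group, e.g. the one feeding Spark Streaming.
        static void consume(String brokers) {
            Properties props = new Properties();
            props.put("bootstrap.servers", brokers);
            props.put("group.id", "spark-streaming-metrics");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("metrics"));
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> r : records) {
                    System.out.printf("offset=%d value=%s%n", r.offset(), r.value());
                }
            }
        }
    }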

Environment: AWS, EMR, EC2, S3, Hortonworks Distribution of Hadoop, HDFS, Oozie, Java (JDK 1.6), Eclipse, MySQL, Kafka, Impala, Spark SQL, Spark Streaming, UNIX Shell Scripting

Confidential, Omaha NE

Hadoop Developer

Responsibilities:

  • Handled the installation and configuration of a Hadoop cluster.
  • Handled data exchange between HDFS and various web applications and databases using Flume and Sqoop.
  • Closely monitored and analyzed MapReduce job executions on the cluster at the task level.
  • Changed cluster configuration properties based on the volume of data being processed and cluster performance.
  • Commissioned and decommissioned DataNodes from the cluster when problems occurred.
  • Worked on NoSQL databases such as MongoDB and ingested the data into HDFS.
  • Understood concepts such as replication and sharding and implemented them using MongoDB.
  • Worked on NoSQL (HBase) to support enterprise production.
  • Loaded data into HBase using Hive and Sqoop.
  • Involved in upgrading the Hadoop cluster from CDH3 to CDH4.
  • Created Hive tables and worked with them using HiveQL.
  • Performed cluster coordination and assisted with data capacity planning and node forecasting using ZooKeeper.
  • Installed Hadoop, MapReduce, and HDFS, and developed multiple jobs in Pig and Hive for data cleaning and pre-processing.
  • Developed job flows in Oozie to automate workflows for extracting data from warehouses and weblogs.
  • Wrote Pig UDFs for custom data processing (cleaning, editing, and formatting unstructured data); a brief sketch follows this list.
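
A minimal sketch of one such Pig UDF, assuming the org.apache.pig.EvalFunc API; the class name and the particular cleaning rules are illustrative only.

    import java.io.IOException;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Hypothetical Pig UDF that trims, de-noises, and lower-cases a raw text field
    // before further processing.
    public class CleanField extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            String raw = (String) input.get(0);
            // Strip control characters and normalize whitespace and case.
            return raw.replaceAll("\\p{Cntrl}", " ")
                      .trim()
                      .replaceAll("\\s+", " ")
                      .toLowerCase();
        }
    }

In Pig Latin such a function would be made available with REGISTER (pointing at the built jar) and DEFINE, then applied field by field in a FOREACH ... GENERATE statement.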

Environment: Hadoop, MapReduce, HDFS, Hive, Java, Hortonworks distribution of Hadoop, Pig, HBase, Linux, XML, MySQL, MySQL Workbench, Java 6, Eclipse, Oracle 10g, PL/SQL, SQL*Plus.

Confidential, San Francisco, CA

Sr. Hadoop Developer/ Admin

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and preprocessing (a brief sketch follows this list).
  • Experience installing, configuring, and using Hadoop ecosystem components.
  • Experience in administering, installing, upgrading, and managing CDH3, Pig, Hive, and HBase.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Experienced in defining job flows.
  • Knowledge of performance troubleshooting and tuning of Hadoop clusters.
  • Experienced in managing and reviewing Hadoop log files.
  • Participated in the development and implementation of a Cloudera Hadoop environment.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Responsible for managing data coming from different sources.
  • Good understanding of NoSQL databases such as HBase, Cassandra, and MongoDB.
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from the UNIX file system to HDFS.
  • Installed and configured Hive and wrote Hive UDFs.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Implemented a CDH3 Hadoop cluster on CentOS.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
  • Created HBase tables to store variable data formats of PII data coming from different portfolios.
  • Implemented best income logic using Pig scripts.
  • Provided cluster coordination services through ZooKeeper.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
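
A minimal sketch of the kind of data-cleaning MapReduce job referenced at the top of this list, using the standard org.apache.hadoop.mapreduce API. The field count, delimiter, and counter names are hypothetical.

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Hypothetical map-only cleaning step: drops malformed records and normalizes the
    // delimiter before the data is loaded into Hive tables.
    public class CleanRecordsMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        private static final int EXPECTED_FIELDS = 5; // assumption: pipe-delimited input with 5 fields

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|", -1);
            if (fields.length != EXPECTED_FIELDS) {
                context.getCounter("clean", "malformed").increment(1);
                return; // skip malformed rows
            }
            // Re-emit as tab-separated so downstream tables can use a default delimiter.
            context.write(NullWritable.get(), new Text(String.join("\t", fields)));
        }
    }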

Environment: Hadoop, MapReduce, HDFS, Hive, Java, SQL, Datameer, Pig, ZooKeeper, Sqoop, CentOS, Solr.

Confidential, Johnston, IA

Java/ J2EE Developer

Responsibilities:

  • Involved in the analysis, design, development, and testing of application modules.
  • Analyzed complex system relationships and improved the performance of various screens.
  • Developed various user interface screens using the Struts framework.
  • Worked with the Spring framework for dependency injection.
  • Developed JSP pages using JavaScript, jQuery, and AJAX for client-side validation and CSS for data formatting.
  • Wrote domain, mapper, and DTO classes and hbm.xml files to access data in DB2 tables.
  • Developed various reports using Adobe APIs and web services.
  • Wrote test cases using JUnit and coordinated with the testing team on integration tests.
  • Fixed bugs and improved performance through root-cause analysis in production support.
  • Analyzed the data by running Hive queries (HiveQL) and Pig scripts (Pig Latin) to study customer behavior.
  • Stored the data in an Apache Cassandra cluster.
  • Used Impala to query the Hadoop data stored in HDFS.
  • Managed and reviewed Hadoop log files.
  • Supported and troubleshot MapReduce programs running on the cluster.
  • Loaded data from the Linux file system into HDFS.
  • Installed and configured Hive and wrote Hive UDFs (a brief sketch follows this list).
  • Created tables, loaded data, and wrote queries in Hive.
  • Developed scripts to automate routine DBA tasks using Linux shell scripts and Python.
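
A minimal sketch of a Hive UDF of the kind mentioned above, assuming the classic org.apache.hadoop.hive.ql.exec.UDF base class; the class name and masking rule are illustrative.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical Hive UDF that masks all but the last four characters of a column value.
    public final class MaskValue extends UDF {
        public Text evaluate(Text value) {
            if (value == null) {
                return null;
            }
            String s = value.toString();
            if (s.length() <= 4) {
                return new Text(s);
            }
            StringBuilder masked = new StringBuilder();
            for (int i = 0; i < s.length() - 4; i++) {
                masked.append('*');
            }
            masked.append(s.substring(s.length() - 4));
            return new Text(masked.toString());
        }
    }

In HiveQL the jar would be added with ADD JAR and the function registered with CREATE TEMPORARY FUNCTION before use in queries.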

Environment: JDK 1.4.2, Swing, XML, SQL, Windows XP/7, Web services, Hyperion 8/9.3, Citrix, Mainframes, CVS.

Confidential, Lansing, MI

Java/ J2EE Developer

Responsibilities:

  • Created use case and sequence diagrams, functional specifications, and user interface diagrams using StarUML.
  • Involved in the complete requirement analysis, design, coding, and testing phases of the project.
  • Participated in JAD meetings to gather requirements and understand the end users' system.
  • Developed user interfaces using JSP, HTML, XML, and JavaScript.
  • Generated XML Schemas and used XMLBeans to parse XML files.
  • Created stored procedures and functions, and used JDBC to process database calls for DB2/AS400 and SQL Server databases (a brief sketch follows this list).
  • Developed code to create XML files and flat files from data retrieved from databases and XML files.
  • Created data sources and helper classes used by all the interfaces to access and manipulate data.
  • Developed a web application called iHUB (integration hub) to initiate all the interface processes, using the Struts framework, JSP, and HTML.
  • Developed the interfaces using Eclipse 3.1.1 and JBoss 4.1. Involved in integration testing, bug fixing, and production support.
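
A minimal sketch of invoking a stored procedure over plain JDBC, as referenced above. It is written with current Java idioms rather than the original project's Java version, and the procedure name, parameters, and connection details are hypothetical.

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    // Hypothetical DAO method calling a stored procedure that returns an account balance.
    public class AccountBalanceDao {

        public double fetchBalance(String jdbcUrl, String user, String password, int accountId)
                throws SQLException {
            try (Connection conn = DriverManager.getConnection(jdbcUrl, user, password);
                 CallableStatement stmt = conn.prepareCall("{call GET_ACCOUNT_BALANCE(?, ?)}")) {
                stmt.setInt(1, accountId);
                stmt.registerOutParameter(2, java.sql.Types.DECIMAL);
                stmt.execute();
                // For brevity the sketch assumes the procedure always returns a value.
                return stmt.getBigDecimal(2).doubleValue();
            }
        }
    }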

Environment: Java 1.3, Servlets, JSPs, JavaMail API, JavaScript, HTML, Spring Batch XML Processing, MySQL 2.1, Swing, Java Web Server 2.0, JBoss 2.0, Red Hat Linux 7.1.

Confidential

Java/ J2EE developer 

Responsibilities:

  • Designed and developed a Struts-like MVC 2 web framework using the front-controller design pattern, which has been used successfully in a number of production systems (a brief sketch follows this list).
  • Normalized the Oracle database, conforming to design concepts and best practices.
  • Produced the high-level and detailed design for Quiet Time alerts utilizing the existing message broker infrastructure; the architecture cut 400 man-hours from the initial estimate.
  • Invoked the web service from message flows through an XML gateway in a secured manner.
  • Developed the user interface that lets members set up preferences, using Struts-based JSP with the custom tag library facility, form beans, and action classes.
  • Developed ANT scripts to build the different modules for the Project, such as building the binary files and scripts for deploying to the server.
  • Provided technical assistance to the Infrastructure team in successfully navigating the challenges faced during installation and execution of the project.
  • Worked closely with the Infrastructure team during the build process in deployment and configuration of the Message broker. Resolved the environmental issues in a timely manner and performed IT checkout to ensure successful deployment on all the Message broker servers.
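
A minimal sketch of the front-controller idea mentioned at the top of this list, using the standard javax.servlet API and current Java idioms; the path mappings, JSP names, and class names are hypothetical.

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // A single servlet maps incoming paths to action handlers and forwards to the JSP view
    // each handler selects, in the spirit of Struts-style MVC 2.
    public class FrontControllerServlet extends HttpServlet {

        interface Action {
            String execute(HttpServletRequest request); // returns the view to forward to
        }

        private final Map<String, Action> actions = new HashMap<>();

        @Override
        public void init() {
            // Illustrative mappings only.
            actions.put("/preferences", request -> "/WEB-INF/jsp/preferences.jsp");
            actions.put("/alerts", request -> "/WEB-INF/jsp/alerts.jsp");
        }

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            Action action = actions.get(req.getPathInfo());
            if (action == null) {
                resp.sendError(HttpServletResponse.SC_NOT_FOUND);
                return;
            }
            req.getRequestDispatcher(action.execute(req)).forward(req, resp);
        }
    }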

Environment: MVC 2, Message Broker, ESQL, Java, JSP, Struts, Hibernate, Apache Ant, Log4j, XML gateway.
