Hadoop/Spark Developer Resume
Jacksonville, FL
SUMMARY
- 8+ years of experience building and developing Hadoop MapReduce solutions, with hands-on use of Hive, Pig, Spark, Storm, and Kafka.
- Strong proficiency in R for data transformation, filtering, and analytics.
- Experience with RDD architecture, implementing Spark operations on RDDs, and optimizing Spark transformations and actions.
- Good knowledge of Spark components such as Spark SQL, MLlib, Spark Streaming, and GraphX.
- Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
- Developed code to write canonical-model JSON records from various input sources to Kafka queues.
- Well versed in building, configuring, monitoring, and supporting Hadoop environments using Cloudera Manager, Hortonworks, AWS, and Apache Ambari.
- Managed, configured, installed, and supported Cloudera Hadoop CDH 5 and IBM InfoSphere BigInsights.
- Experience importing and exporting data between RDBMS and HDFS using Flume and Sqoop.
- Developed various Pig and Hive UDFs (User Defined Functions) to extend functionality and solve multiple big data filtering problems.
- Implemented HBase row key design and integration of Hive with HBase.
- Good knowledge of AWS tools: EC2, VPC, Route 53, CloudTrail, CloudWatch, IAM, and S3.
- Strong experience developing MapReduce solutions on the Hadoop ecosystem to solve various big data problems.
- Strong experience in MapReduce programming, customizing the framework at various levels, and working with input formats such as SequenceFileInputFormat and KeyValueTextInputFormat.
- Strong experience tuning RecordReaders and developing custom InputFormats to run MapReduce over highly unstructured data.
- Skilled in optimizing MapReduce performance by applying compression to the intermediate data created between the map and reduce tasks (see the sketch after this summary).
- Worked with the NoSQL databases MongoDB, HBase, and Cassandra to provide faster access to data on HDFS.
- Strong knowledge of MongoDB concepts such as locking, transactions, and indexes; experienced in managing MongoDB environments for scalability.
- Experience developing web pages using HTML, CSS, JavaScript, D3.js, Knockout.js, AJAX, and Underscore.js.
- Proficient with data formats including XML and JSON, and comfortable developing RESTful web services.
- Strong background as a Java developer using Java, J2EE, JSP, JDBC, SQL, and Hibernate.
- Proficient with version control tools GitHub and SVN.
- Strong experience with Unix commands and shell scripting.
- Experience with development tools Eclipse, NetBeans, PyCharm, IntelliJ, and Android Studio.
- Strong experience using MS Office and Excel for documentation and reporting.
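As an illustration of the shuffle-compression bullet above, here is a minimal sketch of how intermediate map output is typically compressed on MR2; the class and job names are hypothetical, and it assumes the Snappy codec is installed on the cluster.

```java
// Minimal sketch: compress the intermediate data written between map and reduce
// tasks. Class and job names are illustrative; mapper, reducer, and paths omitted.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.mapreduce.Job;

public class CompressedShuffleJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Turn on compression for map output (the shuffle data), assuming Snappy is available.
        conf.setBoolean("mapreduce.map.output.compress", true);
        conf.setClass("mapreduce.map.output.compress.codec", SnappyCodec.class, CompressionCodec.class);

        Job job = Job.getInstance(conf, "compressed-shuffle-sketch");
        // Mapper, reducer, and input/output paths would be configured here as usual.
    }
}
```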
TECHNICAL SKILLS
Big Data: Hadoop HDFS, MapReduce v2, Pig, Hive, HBase, Oozie, Spark, Kafka, Storm, ZooKeeper, Flume
Java Confidential: Core Java, J2EE, Servlets, JSP, JDBC, JNDI, JavaBeans, Hibernate, JavaScript, jQuery, Applets, Swing, Struts
AWS tools: EC2, VPC, Route 53, S3, IAM, CloudWatch, CloudTrail, Glacier, Elasticsearch
Programming Languages: C, C++, R, Java, Python, Scala, UNIX Shell Scripting
Databases: Oracle 11g, MySQL, DB2, Microsoft SQL Server, MS Access
NoSQL Databases: MongoDB, HBase, Apache Cassandra, Amazon DynamoDB, Neo4j
IDE tools: Eclipse, NetBeans, PyCharm, IntelliJ, Android Studio
Virtualization Confidential: VMware ESX Server, Microsoft Hyper-V Server, Citrix XenServer
Operating Systems: Linux, Unix, Windows XP/7/8/10, Windows Server 2003/2008, Mac, AMI
Web Confidential: HTML, CSS, JavaScript
Data Visualization: D3, Tableau, R
Networking Protocols: TCP/IP, UDP, HTTP, HTTPS, FTP, SMTP, SNMP, POP3
Unit Testing Tools: JUnit, TestNG
Version Control: GitHub, CVS, SVN
ETL Tools: Informatica, Pentaho
PROFESSIONAL EXPERIENCE
Confidential, Jacksonville, FL
Hadoop/Spark Developer
Responsibilities:
- Worked on the Hadoop cluster and used Hive as the data querying tool to store and retrieve data.
- Involved in the complete Software Development Life Cycle (SDLC) while developing applications.
- Reviewed and managed Hadoop log files by consolidating logs from multiple machines using Flume.
- Implemented a custom InputFormat and RecordReader to read XML input efficiently using a SAX parser.
- Set up Storm and Kafka clusters in the AWS environment and monitored and troubleshot the clusters.
- Documented the data flow from Application > Kafka > Storm > HDFS > Hive tables.
- Converted HiveQL to Spark SQL, connected JDBC drivers to Spark, and edited configuration parameters.
- Wrote Spark SQL queries using Scala.
- Loaded data from the Linux file system to HDFS and vice versa.
- Used CSVExcelStorage to parse data with different delimiters in Pig.
- Installed and monitored Hadoop and ecosystem tools on multiple operating systems: Ubuntu, CentOS, and SUSE 11.
- Exported analyzed output data to relational databases using Sqoop, visualized it with Tableau, and forwarded it to the BI team for report generation.
- Developed Pig Latin scripts to sort, join, and filter enterprise data.
- Implemented test scripts to support test-driven development and continuous integration.
- Developed multiple MapReduce jobs in Java to clean datasets.
- Worked with the Oozie workflow engine to run multiple MapReduce jobs.
- Filtered datasets by developing custom Hive user-defined functions that run on top of MapReduce (see the UDF sketch after this section).
- Supported setting up the QA environment.
- Worked with the applications team to install Hadoop updates and upgrades as required.
Environment: Hadoop MapReduce 2, HDFS, Pig, Hive, Flume, Eclipse, Java, Sqoop, Kafka, Storm
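To accompany the Hive UDF bullet above, here is a minimal, hypothetical sketch of a filtering-style UDF written against the classic org.apache.hadoop.hive.ql.exec.UDF API; the class name and cleanup rule are illustrative, not the original production code.

```java
// Illustrative Hive UDF: strips non-printable characters so downstream filters
// can compare fields reliably.
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public final class StripNonPrintable extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // Hive passes NULLs through unchanged.
        }
        return new Text(input.toString().replaceAll("\\p{C}", ""));
    }
}
```

A UDF like this is registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION and then executes inside the MapReduce tasks that Hive launches, which is what the bullet above refers to.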
Confidential, Hagerstown, MD
Hadoop / Spark Developer
Responsibilities:
- Worked with the RDD architecture and implemented Spark operations on RDDs.
- Implemented various machine learning models such as Random Forest, SVM, and k-NN using the Spark MLlib component.
- Implemented a de-duplication process to avoid duplicates in the daily load.
- Developed MapReduce programs to extract and transform datasets; results were exported back to the RDBMS using Sqoop on Hortonworks.
- Involved in data modeling and in sharding and replication strategies for MongoDB.
- Experienced with Spark performance tuning options.
- Developed code to write canonical-model JSON records from various input sources to Kafka queues (a producer sketch follows this section).
- Implemented Oozie workflows for MapReduce, Hive, and Sqoop actions.
- Analyzed data by running Hive queries on existing databases and analyzed system performance using Hortonworks HDP.
- Worked on the integration layer that stores data from REST services into MongoDB.
- Experience working with NoSQL databases including MongoDB and HBase.
- Developed Storm bolts and topologies involving Kafka spouts to stream data from Kafka.
- Managed a small Hortonworks cluster for the testing environment using Ambari.
- Transferred queried data to Tableau via the JDBC connector for visualization, and used the same data for D3 visualizations.
- Designed and implemented delta data load systems in Hive, which increased efficiency by more than 60%.
- Coordinated cluster services using ZooKeeper.
- Exported analyzed data to relational databases using Sqoop and visualized it using Tableau and R.
- Developed an application to migrate files from S3 to HDFS.
Environment: Hadoop MR2, HDFS, Spark 0.8.0, Kafka 0.8.1.1, Hive, ZooKeeper, Oozie
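To illustrate the canonical-model JSON bullet above, here is a minimal producer sketch against the old Kafka 0.8.x producer API listed in this environment; the broker list, topic name, and sample record are hypothetical.

```java
// Illustrative Kafka 0.8.x producer: sends one JSON record to a topic.
import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class CanonicalJsonProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092,broker2:9092"); // hypothetical brokers
        props.put("serializer.class", "kafka.serializer.StringEncoder");

        Producer<String, String> producer = new Producer<String, String>(new ProducerConfig(props));
        String record = "{\"id\":\"42\",\"source\":\"sample-feed\",\"payload\":\"...\"}";
        producer.send(new KeyedMessage<String, String>("canonical-records", record));
        producer.close();
    }
}
```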
Confidential, Burlington,MA
Hadoop Developer
Responsibilities:
- Performed sentiment analysis of consumer opinions about the product and the company.
- Our team designed and implemented the Hadoop architecture to perform web crawls and store data in HDFS.
- Developed the code base to stream data from sample data files > Kafka > Kafka spout > Storm bolt > HDFS bolt (a topology sketch follows this section).
- Set up the NameNode and formatted the data nodes for HDFS from the NameNode.
- Started deploying IBM InfoSphere BigInsights V3.0 for use with the Hadoop ecosystem.
- Scheduled jobs in BigInsights for the testing environment using Solr on SUSE 11.
- Trained interns to use Hadoop and modify the cluster according to requirements.
- Configured Hadoop configuration files (conf/masters, conf/slaves, conf/*-site.xml) for the master and all machines.
- Scheduled jobs with the Oozie workflow engine, managing actions both sequentially and in parallel.
- Optimized the code base to run independent tasks in a distributed manner to improve performance.
- Wrote MapReduce programs to process crawled data stored in HDFS.
- Ran different MapReduce jobs to analyze data with the help of the data science team.
- To perform sentiment analysis, used Storm to ingest real-time data from the Twitter streaming API.
- Manipulated, serialized, and modeled data in multiple formats (JSON, XML).
- Configured, deployed, and maintained a single-node ZooKeeper cluster in the DEV environment.
- Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
- Transferred data from MySQL to the HBase environment using Sqoop.
Environment: Hadoop 0.20, HBase, MapReduce, Storm, Java, Amazon EC2, Kafka, Scala, Sqoop
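To accompany the Kafka-to-Storm streaming bullet above, here is a minimal, hypothetical topology sketch using the storm-kafka integration of that era; the ZooKeeper host, topic, and component names are assumptions, and the downstream HDFS bolt is omitted for brevity.

```java
// Illustrative sketch only: wires a Kafka spout into a simple pass-through bolt.
import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.spout.SchemeAsMultiScheme;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.StringScheme;
import storm.kafka.ZkHosts;

public class KafkaToStormSketch {

    // Trivial bolt that re-emits each record; a real topology would parse the
    // record and hand it to an HDFS writer bolt.
    public static class PassThroughBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            collector.emit(new Values(input.getString(0)));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("record"));
        }
    }

    public static void main(String[] args) {
        SpoutConfig spoutConfig =
                new SpoutConfig(new ZkHosts("zk-host:2181"), "sample-topic", "/kafka-storm", "sample-reader");
        spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 1);
        builder.setBolt("pass-through", new PassThroughBolt(), 1).shuffleGrouping("kafka-spout");

        // Local-mode submission for a quick test run.
        new LocalCluster().submitTopology("kafka-to-storm-sketch", new Config(), builder.createTopology());
    }
}
```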
Confidential, Atlanta, GA
Hadoop Developer
Responsibilities:
- Designed docs and specs for near-real-time data analytics using Hadoop and HBase.
- Installed Cloudera Manager 3.7 on the clusters.
- Used a 60 node cluster with Cloudera Hadoop distribution on Amazon EC2.
- Developed ad-click based data analytics for keyword analysis and insights.
- Crawled public posts from Facebook and tweets from Twitter.
- Wrote MapReduce jobs with the data science team to analyze this data (a minimal job sketch follows this section).
- Converted the output to structured data and imported it into Spotfire with the analytics team.
- Defined problems to identify the right data and analyzed results to lay the groundwork for new projects.
- TIBCO Spotfire with an in-house custom application was used to perform and generate analytics.
Environment: Hadoop 0.20, HBase, HDFS, MapReduce, Java, Spotfire, Cloudera Manager 2, Amazon EC2-Classic
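As a companion to the MapReduce analysis bullet above, here is a minimal, illustrative job that counts clicks per keyword; the class names and the assumed input layout (tab-separated records with the keyword in the second column) are hypothetical, not the original code.

```java
// Illustrative MapReduce job: counts click records per keyword.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class KeywordClickCount {

    public static class ClickMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text keyword = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumes tab-separated click records with the keyword in the second column.
            String[] fields = value.toString().split("\t");
            if (fields.length > 1) {
                keyword.set(fields[1]);
                context.write(keyword, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "keyword-click-count");
        job.setJarByClass(KeywordClickCount.class);
        job.setMapperClass(ClickMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```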
Confidential, Raleigh, NC
Application developer J2EE
Responsibilities:
- Developed JavaScript behavior code for the user interface (UI).
- Created a database program in SQL Server to manipulate data accumulated from internet transactions.
- Developed servlet classes to generate dynamic HTML pages (a minimal servlet sketch follows this section).
- Developed back-end classes and servlets on the WebSphere application server.
- Developed an API to write XML documents from a database server.
- Tested application usability and performance using JUnit.
- Maintained a Java GUI application using JFC/Swing.
- Created complex SQL and accessed the database using JDBC connectivity.
- Involved in the design and coding of the data capture templates, presentation and component templates.
- As part of the team, designed, customized, and implemented metadata search and database synchronization.
- Used Toad for query execution, wrote SQL scripts and PL/SQL code for procedures and functions, and used Oracle as the database.
Environment: Java, JavaScript, Servlets, WebSphere 3.5, EJB, JDBC, SQL, JUnit, Eclipse IDE, Apache Tomcat 6
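To illustrate the dynamic-HTML servlet bullet above, here is a minimal, hypothetical servlet sketch; the class name and page content are illustrative only.

```java
// Illustrative servlet: writes a small dynamic HTML page based on a request parameter.
import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class GreetingServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        String user = request.getParameter("user");
        out.println("<html><body>");
        out.println("<h1>Hello, " + (user != null ? user : "guest") + "</h1>");
        out.println("</body></html>");
    }
}
```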
Confidential
Java Application Developer
Responsibilities:
- Created class and sequence diagrams along with data flow diagrams using UML.
- Developed use case, sequence, business, and data models using IBM Rational Rose.
- Developed JSPs with Struts custom tags and implemented JavaScript data validation.
- Developed web applications using the Struts framework.
- Developed UI using JSP, HTML, CSS, JavaScript.
- Implemented the Struts framework following the MVC pattern (a minimal Action class sketch follows this section).
- Used the Eclipse IDE to build the application.
- Created validation.xml, struts-config.xml, and web.xml to integrate all components in the Struts framework.
- Used log4j as the logging framework.
- Worked with Struts tags and used Struts as the front-end controller for the web application.
- SVN was used to manage application versions.
- Helped in developing user manuals and product documentation.
Environment: Java/J2EE, Oracle 10g, Struts 1.2, Hibernate 3, WebLogic 10.0, HTML, AJAX, JavaScript, XML, UML, JMS, JDBC, log4j, WebSphere, IBM Rational Rose, Eclipse 3.4.2 & 3.5
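To accompany the Struts/MVC bullets above, here is a minimal, hypothetical Struts 1.2 Action class; the class name is illustrative and the forward name is assumed to match an entry in struts-config.xml.

```java
// Illustrative Struts 1.x Action: the controller-side handler in the MVC pattern.
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

public class LoginAction extends Action {
    @Override
    public ActionForward execute(ActionMapping mapping, ActionForm form,
            HttpServletRequest request, HttpServletResponse response) throws Exception {
        // "success" must correspond to a <forward> entry in struts-config.xml.
        return mapping.findForward("success");
    }
}
```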