Sr Hadoop Developer Resume
San Antonio, TX
SUMMARY
- 8+ years of IT experience, including 3+ years of work experience in Big Data and Hadoop ecosystem technologies.
- Good knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, and DataNode.
- Experience in data analysis using Hive, Pig Latin, HBase, and custom MapReduce programs in Java.
- Hands-on experience with technologies such as Spark, Kafka, Apache Drill, Sqoop, and Flume.
- Expertise in composing MapReduce pipelines with many user-defined functions using Apache Crunch.
- Experience working with Apache Solr for indexing and querying.
- Experience in using Oozie for managing Hadoop jobs.
- Experience in cluster coordination using Zookeeper.
- Experience in managing and reviewing Hadoop log files.
- In-depth knowledge of statistics, machine learning, and data mining.
- Experience in supervised and unsupervised learning techniques such as Multiple Linear Regression, Nonlinear Regression, Logistic Regression, Support Vector Machines, Decision Trees, and Random Forests.
- 4+ years of extensive experience in JAVA/J2EE Technologies, Database development, ETL Tools, Data Analytics.
- Expertise in developing web-based GUIs using Applets, Swing, Servlets, JSP, HTML, XHTML, JavaScript, and CSS.
- Worked with the Tableau data visualization tool.
- Expertise in relational databases such as Oracle and MySQL.
TECHNICAL SKILLS
Languages: C, C++, Python, R, Java, SQL, UML, XML
Web Services: SOAP, REST
Databases: Oracle 10g/11g, SQL Server, SQLite, MySQL, PostgreSQL
NoSQL: HBase, Cassandra, MongoDB
Application/Web Servers: Apache Tomcat, JBoss, Mongrel, WebLogic, WebSphere
System Design Tools: Rational Rose, Enterprise Architect
Operating Systems: Windows XP/2000/2003/NT/98/95, Unix/Linux
Microsoft Products: MS Office, MS Visio, MS Project
Deployment tools: Heroku, Passenger
Frameworks: Spring, Hibernate, Struts
PROFESSIONAL EXPERIENCE
Confidential, San Antonio TX
Sr Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop
- Clustering/classification of delivery documents in Mahout using the k-NN algorithm
- Developed code using MongoDB
- Responsible for managing and scheduling Jobs on a Hadoop cluster
- Resource management of Hadoop Cluster including adding/removing cluster nodes
- Loading data from UNIX file system to HDFS and vice versa
- Created HBase tables to store variable data formats
- Implemented best offer logic using Pig scripts and Pig UDFs.
- Installed and configured Hive and wrote Hive UDFs
- Loaded and transformed large sets of structured, semi-structured, and unstructured data
- Provided cluster coordination services through ZooKeeper
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team
- Performed data visualization using Tableau for reporting from Hive tables
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing
- Responsible for writing Hive queries for data analysis to meet the business requirements
- Responsible for creating Hive tables and working on them using Hive QL
- Responsible for importing and exporting data between HDFS/Hive and relational databases using Sqoop, and for writing Hive queries that run internally as MapReduce jobs
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system
- Responsible for setup and benchmarking of Hadoop/HBase clusters
Environment/Tools: Hadoop, HDFS, Pig, Sqoop, Hive, MapReduce, ZooKeeper, Java, Ubuntu/CentOS, Tableau
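The MapReduce data-cleaning pattern from this role can be sketched as follows. This is illustrative only: the production jobs were written in Java and ran on the Hadoop cluster, while plain Python stands in here for readability, and the tab-separated record format is a hypothetical example.

```python
from collections import defaultdict

def mapper(line):
    """Emit (user_id, 1) for well-formed records; drop malformed ones."""
    fields = line.strip().split("\t")
    if len(fields) == 3 and fields[0]:          # expect: user_id, item, price
        yield fields[0], 1

def reducer(key, values):
    """Sum counts per key, as a Hadoop reducer would after the shuffle."""
    yield key, sum(values)

def run_job(lines):
    shuffled = defaultdict(list)                # simulate the shuffle phase
    for line in lines:
        for k, v in mapper(line):
            shuffled[k].append(v)
    return dict(kv for k, vs in sorted(shuffled.items())
                for kv in reducer(k, vs))

records = ["u1\tbook\t9.99", "u2\tpen\t1.50", "bad-record", "u1\tmug\t4.00"]
print(run_job(records))                         # {'u1': 2, 'u2': 1}
```

The malformed record is filtered in the map phase, mirroring how cleaning and preprocessing happen before aggregation in a real MapReduce job.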
Confidential, San Francisco, CA
Hadoop Developer
Responsibilities:
- Transformed data from SQL into the flat and basket file formats supported by R
- Performed data transformation functions on the data to improve algorithm efficiency
- Generated item frequency plots to examine how frequently customers purchase each item from the store
- Split the original data into training and test sets
- Trained the model on the training set using the Apriori algorithm
- Tuned the model over multiple iterations of support and other model parameters
- Tested the accuracy of the tuned model against the test dataset
- Performed further iterations on the test dataset across multiple support and confidence thresholds
- Managed the model and identified and troubleshot issues
- Recorded various statistical parameters from the model for further tuning
- Implemented a collaborative filtering algorithm to further improve the model; its accuracy and recommendations were better than those of the Apriori-generated model
- Attached the model to live data using pipelining operations
- Handled further requirements gathering and change requests to support the business scenario
Environment/Tools: R, HDFS, MapReduce, HBase, Hive, Hortonworks Hadoop distribution, Eclipse, JMS, MRUnit, Java Batch, SQL
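The Apriori tuning loop in this role revolves around support and confidence thresholds. A minimal sketch of how those two metrics are computed over market baskets, with hypothetical basket data (the original work used R; Python is used here only for illustration):

```python
def support(itemset, baskets):
    """Fraction of baskets containing every item in the itemset."""
    hits = sum(1 for b in baskets if itemset <= b)
    return hits / len(baskets)

def confidence(antecedent, consequent, baskets):
    """P(consequent | antecedent) = support(A U C) / support(A)."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

baskets = [{"milk", "bread"}, {"milk", "bread", "eggs"},
           {"bread", "eggs"}, {"milk", "eggs"}]

print(support({"milk", "bread"}, baskets))       # 0.5
print(confidence({"milk"}, {"bread"}, baskets))  # ~0.667
```

Tuning then amounts to re-running rule generation while raising or lowering these thresholds and comparing accuracy on the held-out test set.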
Confidential, OH
Hadoop Administrator
Responsibilities:
- Worked in the design and development phases of the application
- Administered Hadoop core components and environment: Hive, JobTracker, TaskTracker, Pig, Sqoop
- Performed all operational functions: cluster and job monitoring, troubleshooting, user onboarding, application status reporting, and incident and outage management
- Managed backup and disaster recovery for Hadoop data
- Experience in Hadoop cluster management and capacity planning
- Optimized and tuned the Hadoop environments to meet performance requirements
- Installed and configured monitoring tools
- Experience working with LDAP user accounts and configuring LDAP on client machines
- Worked with big data developers on designing scalable, supportable infrastructure
- Worked with the Linux server admin team in administering the server hardware and operating system
- Assisted with developing and maintaining the system run books
- Created and published various production metrics, including system performance and reliability information, to system owners and management
- Supported technical team members in management and review of Hadoop log files and data backups
- Participated in development and execution of system and disaster recovery processes
- Formulated procedures for installation of Hadoop patches, updates, and version upgrades
- Automated processes for troubleshooting, resolution, and tuning of Hadoop clusters
- Set up automated processes to send alerts for predefined system- and application-level issues
- Set up automated processes to send notifications on any deviation from predefined resource utilization
- Performed ongoing capacity management forecasts, including timing and budget considerations
- Coordinated root cause analysis (RCA) efforts to minimize future system issues
Environment: Ruby, Rails, AJAX, MySQL, Git, HTML, CSS, jQuery, Linux, RVM, JIRA, JavaScript, XML, RSpec, PostgreSQL
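The automated resource-utilization alerting described in this role reduces to threshold checks over collected metrics. A minimal sketch, where the metric names and threshold values are hypothetical (production monitoring used dedicated tooling):

```python
# Hypothetical utilization thresholds, in percent
THRESHOLDS = {"hdfs_used_pct": 80.0, "cpu_pct": 90.0, "mem_pct": 85.0}

def check_utilization(metrics, thresholds=THRESHOLDS):
    """Return alert messages for any metric that exceeds its threshold."""
    alerts = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            alerts.append(f"ALERT: {name}={value:.1f}% exceeds {limit:.1f}%")
    return alerts

sample = {"hdfs_used_pct": 92.3, "cpu_pct": 41.0, "mem_pct": 88.0}
for line in check_utilization(sample):
    print(line)          # HDFS and memory alerts fire; CPU is within limits
```

In practice such a check would run on a schedule and feed the notifications described above.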
Confidential, MN
Java Developer
Responsibilities:
- Worked with sprint planning, sprint demo, status and daily standup meeting
- Hands-on experience using the Spring Web MVC framework
- Worked with Spring configuration files to add new content to the website
- Used the Hibernate framework to retrieve and update information; dependency injection was achieved through the Spring MVC framework
- Worked on the Spring DAO module and ORM using Hibernate.
- Extensively used Spring's features such as Dependency Injection/Inversion of Control to allow loose coupling between business classes
- Configured association mappings such as one-to-one and one-to-many in Hibernate
- Worked with JavaScript calls, as the search is triggered through JS when a search key is entered in the search window
- Worked on analyzing other Search engines to make use of best practices
- Collaborated with the Business team to fix defects
- Interacted with project management to understand, learn and to perform analysis of the Search Techniques
Environment: Java 1.6, J2EE, Eclipse SDK 3.3.2, Spring 3.x, jQuery, Oracle 10g, Hibernate, JPA, JSON, SQL, stored procedures, XML, HTML, JUnit, Ant, MySQL
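The dependency-injection pattern described in this role (constructor injection into loosely coupled business classes, wired by the Spring container) is language-agnostic. A minimal sketch, with hypothetical class names and Python standing in for the original Java/Spring code:

```python
class OrderDao:
    """Data-access layer; in the original work this was a Hibernate-backed DAO."""
    def find_total(self, order_id):
        return {"o-1": 120.0}.get(order_id, 0.0)   # stubbed persistence

class OrderService:
    """Business class receiving its DAO via constructor injection,
    so it stays loosely coupled and testable with a fake DAO."""
    def __init__(self, dao):
        self.dao = dao

    def total_with_tax(self, order_id, rate=0.08):
        return self.dao.find_total(order_id) * (1 + rate)

# In Spring the container performs this wiring from configuration
service = OrderService(OrderDao())
print(round(service.total_with_tax("o-1"), 2))      # 129.6
```

Because the service depends only on the DAO's interface, a test can inject a stub DAO without touching the database, which is the loose coupling the bullet above refers to.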
Confidential
Java Developer
Responsibilities:
- Played an active role in the team by interacting with business and program specialists and converted business requirements into system requirements
- Conducted Design reviews and Technical reviews with other project stakeholders
- Implemented Services using Core Java
- Involved in development of classes using java
- Developed algorithms for serial interfaces
- Involved in testing of CAN protocols
- Developed the flow of algorithm in UML
- Developed verification and validation scripts in java
- Followed verification and validation cycle for development of algorithms
- Developed test cases for unit testing as well as system and user test scenarios
- Involved in Unit Testing, User Acceptance Testing and Bug Fixing
Environment: Java, UML, Linux, CAN, C, Doors
