
Hadoop Developer Resume


San Ramon, CA

SUMMARY

  • 5+ years of experience in software development using Big Data, Hadoop, Apache Spark, Java/J2EE, Scala, and Python technologies.
  • Solid foundation in mathematics, probability, and statistics, with broad practical experience in statistical and data mining techniques cultivated through industry work and academic programs.
  • Strong technical, administration, and mentoring knowledge in Linux and Big Data/Hadoop technologies.
  • Hands-on experience with major components of the Hadoop ecosystem, including Hadoop MapReduce, HDFS, Hive, Pig, Pentaho, HBase, ZooKeeper, Sqoop, Oozie, Cassandra, Flume, and Avro.
  • Experienced in deploying Hadoop clusters using the Puppet tool.
  • Work experience with cloud infrastructure such as Amazon Web Services (AWS).
  • Experience in importing and exporting data using Sqoop between HDFS and relational database systems/mainframes.
  • Installing, configuring, and managing Hadoop clusters and data science tools.
  • Managing Hadoop distributions with Cloudera Manager, Cloudera Navigator, and Hue.
  • Setting up high availability for Hadoop cluster components and edge nodes.
  • Experience in developing shell and Python scripts for system management.
  • Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Hands-on experience with Talend Integration Suite and Talend Open Studio; experienced in designing Talend jobs using various Talend components.
  • Extensive experience with SQL, PL/SQL, and database concepts.
  • Experience working with Java, J2EE, JDBC, ODBC, JSP, Eclipse, JavaBeans, EJB, Servlets, and MS SQL Server.
  • Detailed understanding of the Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies, including Waterfall and Agile.
  • Delivery assurance - quality focused and process oriented: ability to work in high-pressure environments while delivering to and managing stakeholder expectations.
  • Application of structured methods to project scoping and planning, risks, issues, schedules, and deliverables.
  • Strong analytical and problem-solving skills.
  • Good interpersonal skills and ability to work as part of a team.
  • Exceptional ability to learn and master new technologies and to deliver outputs under tight deadlines.

TECHNICAL SKILLS

Technology: Hadoop Ecosystem/J2SE/J2EE/Oracle

Operating Systems: Windows Vista/XP/NT/2000, UNIX/Linux (Ubuntu, CentOS, Red Hat), AIX, Solaris.

DBMS/Databases: DB2, MySQL, PL/SQL

Programming Languages: C, C++, Java SE, XML, JSP/Servlets, Struts, Spring, HTML, JavaScript, jQuery, Web Services, Scala

Big Data Ecosystem: HDFS, MapReduce, Oozie, Hive, Pig, Spark, Kafka, Sqoop, Flume, ZooKeeper, and HBase.

Methodologies: Agile, Waterfall

NoSQL Databases: Cassandra, MongoDB, HBase

Version Control Tools: SVN, CVS, VSS, PVCS, GIT

ETL Tools: Talend, Informatica

PROFESSIONAL EXPERIENCE

Confidential, San Ramon, CA

Hadoop Developer

Responsibilities:

  • Collaborated on insights with other Data Scientists, Business Analysts, and Partners.
  • Evaluated, refined, and continuously improved the efficiency and accuracy of existing predictive models.
  • Utilized various data analysis and data visualization tools for data analysis, report design, and report delivery.
  • Developed Scala and SQL code to extract data from various databases.
  • Championed new, innovative ideas around data science and advanced analytics practices.
  • Creatively communicated and presented models to business customers and executives, utilizing a variety of formats and visualization methodologies.
  • Uploaded data to Hadoop Hive and combined new tables with existing databases.
  • Developed statistical models to forecast inventory and procurement cycles.
  • Developed Python code to provide data analysis and generate complex data reports.
  • Deployed the Cassandra cluster in cloud (Amazon AWS) environment with scalable nodes as per the business requirement.
  • Implemented the data backup strategies for the data in the Cassandra cluster.
  • Generated data cubes using Hive, Pig, and Java MapReduce on a provisioned Hadoop cluster in AWS.
  • Implemented the ETL design to load the MapReduce data cubes into the Cassandra cluster.
  • Imported the data from relational databases into HDFS using Sqoop.
  • Implemented a POC using Apache Impala for data processing on top of Hive.
  • Utilized Python pandas DataFrames for data analysis.
  • Utilized Python regular expression operations (NLP) to analyze customer reviews.
  • Understanding of data storage and retrieval techniques, ETL, and databases, including graph stores, relational databases, tuple stores, NoSQL, Hadoop, Pig, MySQL, and Oracle databases.
  • Used Spark MLlib libraries to design recommendation engines, with predictions analyzed via statistical analysis in R (see the sketch after this list).
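
For illustration, a minimal sketch of an MLlib-based recommender of the kind described above, using Spark's Java API; the HDFS path, column names, and hyperparameters are hypothetical, not taken from the original project:

    import org.apache.spark.ml.recommendation.ALS;
    import org.apache.spark.ml.recommendation.ALSModel;
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class RecommenderSketch {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("recommender-sketch")
                    .getOrCreate();

            // Hypothetical ratings data with columns userId, itemId, rating.
            Dataset<Row> ratings = spark.read()
                    .option("header", "true")
                    .option("inferSchema", "true")
                    .csv("hdfs:///data/ratings.csv");

            // Alternating Least Squares collaborative filtering.
            ALS als = new ALS()
                    .setUserCol("userId")
                    .setItemCol("itemId")
                    .setRatingCol("rating")
                    .setRank(10)
                    .setMaxIter(10)
                    .setRegParam(0.1);

            ALSModel model = als.fit(ratings);

            // Top-5 item recommendations per user.
            model.recommendForAllUsers(5).show();
            spark.stop();
        }
    }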

Environment: Apache Spark, Apache Kafka, Spark MLlib, Scala, Akka, Python, Cassandra, Hive, Storm, Pig, Big Data, R programming.

Confidential, San Jose, CA

Hadoop Developer

Responsibilities:

  • Involved in architecture design, development and implementation of Hadoop deployment, backup and recovery systems.
  • Experience working on multi-petabyte clusters, on both the administration and development sides.
  • Developed Chef modules to automate the installation, configuration, and deployment of ecosystem tools, operating systems, and network infrastructure at the cluster level.
  • Performed advanced procedures like text analytics and processing, using the in-memory computing capabilities of Spark using Scala.
  • Performed cluster co-ordination and assisted with data capacity planning and node forecasting using ZooKeeper.
  • Implemented Hadoop framework to capture user navigation across the application to validate the user interface and provide analytic feedback/result to the UI team.
  • Implemented custom interceptors for Flume to filter data, and defined channel selectors to multiplex the data into different sinks.
  • Extracted data from Oracle and MySQL databases to HDFS using Sqoop.
  • Optimized MapReduce jobs to use HDFS efficiently by applying Gzip, LZO, and Snappy compression techniques (see the driver sketch after this list).
  • Wrote Pig scripts to transform raw data from several data sources into baseline data.
  • Created Hive tables to store the processed results in a tabular format, and wrote Hive scripts to transform and aggregate the disparate data.
  • Experience using the Avro, Parquet, RCFile, and JSON file formats; developed UDFs for Hive and Pig.
  • Responsible for cluster maintenance: rebalancing blocks, commissioning and decommissioning nodes, monitoring and troubleshooting, and managing and reviewing data backups and log files.
  • Drove the application from the development phase to the production phase using a Continuous Integration and Continuous Deployment (CI/CD) model with Chef, Maven, and Jenkins.
  • Developed Pentaho Kettle graphs to cleanse and transform the raw data into useful information and load it into a Kafka queue (subsequently loaded into HDFS) and a Neo4j database for the UI team to display via the web application.
  • Automated the extraction of data from warehouses and weblogs into Hive tables by developing workflows and coordinator jobs in Oozie.
  • Scheduled snapshots of volumes for backup, performed root cause analysis of failures, and documented bugs and fixes for cluster downtimes and maintenance.
  • Tuned and modified SQL for batch and online processes.
  • Commissioned and decommissioned cluster nodes.
  • Managed the cluster through performance tuning and enhancement.
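
For illustration, a minimal Java driver sketch of the compression tuning referenced above: Snappy compresses intermediate map output to cut shuffle I/O, while Gzip compresses the final output for a better ratio at rest. The job name, paths, and omitted mapper/reducer wiring are placeholders, not the original code:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.GzipCodec;
    import org.apache.hadoop.io.compress.SnappyCodec;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CompressedJobDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Compress intermediate map output (Snappy trades ratio for speed).
            conf.setBoolean("mapreduce.map.output.compress", true);
            conf.setClass("mapreduce.map.output.compress.codec",
                    SnappyCodec.class, CompressionCodec.class);

            Job job = Job.getInstance(conf, "compressed-etl");
            job.setJarByClass(CompressedJobDriver.class);
            // Mapper and reducer classes omitted for brevity.
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            // Compress the final output with Gzip for a better compression ratio.
            FileOutputFormat.setCompressOutput(job, true);
            FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }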

Environment: Hortonworks (HDP 2.2), HDFS, MapReduce, Apache Cassandra, Apache Kafka, YARN, Spark, Hive, Pig, Flume, Sqoop, Puppet, Oozie, ZooKeeper, Ambari, Oracle Database, MySQL, HBase, Spark SQL, Avro, Parquet, RCFile, JSON, UDFs, Java (JDK 1.7), CentOS.

Confidential, Dallas, TX

Java / J2EE Developer

Responsibilities:

  • Analyzed and reviewed client requirements and design.
  • Followed agile methodology for development process.
  • Developed the presentation layer using HTML5, CSS3, and Ajax.
  • Developed the application using the Struts framework, which follows the Model-View-Controller (MVC) architecture with JSP as the view.
  • Extensively used Spring IoC for dependency injection and worked on custom MVC frameworks loosely based on Struts.
  • Used RESTful web services for transferring data between applications.
  • Configured Spring with the ORM framework Hibernate to handle DAO classes and bind objects to the relational model.
  • Adopted J2EE design patterns like Singleton, Service Locator and Business Facade.
  • Developed POJO classes and used annotations to map them to database tables (see the entity sketch after this list).
  • Used Java Message Service (JMS) for reliable, asynchronous exchange of important information such as credit card transaction reports.
  • Used multithreading to handle more concurrent users.
  • Developed Hibernate JDBC code to establish communication with the database.
  • Worked with a DB2 database for persistence, with the help of PL/SQL querying.
  • Used SQL queries to retrieve information from database.
  • Developed various triggers, functions, procedures, views for payments.
  • Used XSL/XSLT for transforming and displaying reports.
  • Used GIT to keep track of all work and all changes in source code.
  • Used JProfiler for performance tuning.
  • Used JUnit, a test framework that uses annotations to identify test methods.
  • Used Log4j to log messages depending on message type and level.
  • Built the application using Maven and deployed it using Jenkins.
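
For illustration, a minimal sketch of an annotation-mapped POJO of the kind described above; the entity, table, and field names are hypothetical:

    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.GenerationType;
    import javax.persistence.Id;
    import javax.persistence.Table;

    // Hibernate maps this class to the "payments" table via JPA annotations,
    // removing the need for a separate XML mapping file.
    @Entity
    @Table(name = "payments")
    public class Payment {
        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Long id;

        private String accountNumber;
        private double amount;

        public Long getId() { return id; }
        public String getAccountNumber() { return accountNumber; }
        public void setAccountNumber(String accountNumber) { this.accountNumber = accountNumber; }
        public double getAmount() { return amount; }
        public void setAmount(double amount) { this.amount = amount; }
    }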

Environment: Java 8, Spring Framework, Spring Model-View-Controller (MVC), Struts 2.0, XML, Hibernate 3.0, UML, Java Server Pages (JSP) 2.0, Servlets 3.0, JDBC 4.0, JUnit, Log4j, Maven, HTML, REST client, Eclipse, Agile Methodology, Design Patterns, Jenkins.

Confidential

Java Developer

Responsibilities:

  • Worked on requirements analysis; gathered all possible business requirements from end users and business analysts.
  • Involved in the analysis, design, implementation, and testing of the project.
  • Involved in understanding the functional specifications of the project.
  • Implemented the presentation layer with HTML, XHTML, and JavaScript.
  • Involved in creating the object-to-relational mapping using Hibernate.
  • Used JDBC to handle large result sets (see the sketch after this list).
  • Designed class and sequence diagrams to modify and add modules.
  • Developed web components using JSP, Servlets and JDBC.
  • Performed JavaScript validations on JSP pages as per requirements.
  • Involved in creating various reusable Helper and Utility classes used across all modules of the application.
  • Implemented the database using SQL Server.
  • Designed tables and indexes.
  • Wrote complex SQL and stored procedures.
  • Involved in fixing bugs and unit testing with test cases using JUnit.
  • Developed user and technical documentation.
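
For illustration, a minimal JDBC sketch of handling a large result set with a bounded fetch size, as referenced above; the connection URL, credentials, table, and columns are hypothetical:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class LargeResultSetReader {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:sqlserver://localhost:1433;databaseName=app", "user", "secret");
                 PreparedStatement ps = conn.prepareStatement(
                         "SELECT id, status FROM orders",
                         ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
                // A modest fetch size hints the driver to stream rows in batches
                // rather than materializing the entire result set in memory.
                ps.setFetchSize(500);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getLong("id") + " " + rs.getString("status"));
                    }
                }
            }
        }
    }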

Environment: Java, SQL, Servlets, HTML, XML, Hibernate, JavaScript, Spring
