Big Data/Hadoop Developer Resume
Pittsburgh, PA
SUMMARY
- Around 8 years of overall IT experience, with hands-on experience in Big Data processing using Hadoop, Hadoop ecosystem implementation and maintenance, and Big Data analysis operations.
- 3 years of experience in application development and design using Object Oriented Programming and Java/J2EE technologies.
- Good knowledge of the Hadoop ecosystem and its components such as HDFS, JobTracker, TaskTracker, NameNode and DataNode.
- Technical expertise in Big Data/Hadoop tools including HDFS, MapReduce, Apache Hive, Apache Pig, Sqoop, HBase, Flume, Storm, Kafka, Spark, Oozie and ZooKeeper, as well as the NoSQL databases HBase, Cassandra and MongoDB.
- Experience in developing against NoSQL databases using CRUD operations, sharding, indexing and replication.
- Good understanding of Apache Storm-Kafka pipelines.
- Extensive experience working with Teradata, Oracle, Netezza, Informatica, SQL Server and MySQL databases.
- Experience in using Hadoop distributions such as Cloudera and Hortonworks.
- Good experience loading data from Oracle and MySQL databases into HDFS using Sqoop (structured data) and Flume (log files and XML).
- Knowledge of analyzing data interactively using Apache Spark and Apache Zeppelin.
- Extensive experience in developing Pig Latin scripts and using Hive Query Language for data analytics.
- Experienced in writing custom Hive UDFs to incorporate business logic into Hive queries (a minimal UDF sketch appears after this list).
- Good experience in optimizing MapReduce algorithms using mappers, reducers, combiners and partitioners to deliver the best results for large datasets.
- Experience in understanding the security requirements for Hadoop and integrating with the Kerberos Key Distribution Center.
- Hands-on experience with Amazon EC2, Amazon S3, Amazon RDS, VPC, IAM, Elastic Load Balancing, Auto Scaling, CloudFront, CloudWatch, SNS, SES, SQS and other services of the AWS family.
- Proficient in Java, Scala and Python.
- Expertise in AWS services such as EMR and EC2, which provide fast and efficient processing of Big Data.
- Hands-on experience using BI tools such as Tableau and Pentaho.
- Detailed understanding of the Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies including Waterfall and Agile.
- Involved in the design and development of various web and enterprise applications using technologies such as JSP, Servlets, Struts, Hibernate, Spring, JDBC, JSF, XML, JavaScript, HTML, AJAX, SOAP and Amazon Web Services.
- Experience in constructing pipelines using workflow tools like Oozie.
- Experienced in providing real-time analytics on big data platforms using HBase, Cassandra and MongoDB.
- Hands-on experience in application development using core Java, RDBMS and Linux shell scripting; developed UNIX shell scripts to automate various processes.
- Ability to work independently as well as in a team, and to communicate effectively with customers, peers and management at all levels inside and outside the organization.
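A minimal sketch of such a custom Hive UDF, using the classic org.apache.hadoop.hive.ql.exec.UDF API (the class name and masking rule are illustrative assumptions, not from a specific project):

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF: masks all but the last four characters of a string column.
    public final class MaskUdf extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null; // Hive passes NULL through to the UDF
            }
            String s = input.toString();
            int keep = Math.min(4, s.length());
            StringBuilder masked = new StringBuilder();
            for (int i = 0; i < s.length() - keep; i++) {
                masked.append('*');
            }
            masked.append(s.substring(s.length() - keep));
            return new Text(masked.toString());
        }
    }

After packaging into a JAR, it would be registered with ADD JAR and CREATE TEMPORARY FUNCTION mask AS 'MaskUdf', then used like any built-in function in a query.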
TECHNICAL SKILLS
Big Data Ecosystem: Hadoop, MapReduce, YARN, Pig, Hive, HBase, Flume, Pentaho, Sqoop, Impala, Oozie, ZooKeeper, Spark, MongoDB, Cassandra, Snappy, Kafka, AWS.
Databases: Oracle, NoSQL, MySQL, PL/SQL, PostgreSQL.
NoSQL Databases: Cassandra, MongoDB, HBase, DynamoDB.
Operating Systems: UNIX, Linux, Mac OS, Windows XP, Windows Server 2008.
Development Tools: Eclipse 3.3, Ant, Maven, JUnit, Log4j, ETL.
Programming Languages: C, C++, Python, HTML5, CSS3, Java, JavaScript, jQuery.
Development Methodologies: Agile/ Scrum, Waterfall.
Frameworks: Struts, Spring, Hibernate, MVC.
PROFESSIONAL EXPERIENCE
Confidential, PITTSBURGH, PA
BIG DATA/HADOOP DEVELOPER
Responsibilities:
- Installed, configured, monitored and maintained the Hadoop cluster on the Big Data platform.
- Worked with highly unstructured and semi-structured data.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Configured ZooKeeper and worked on Hadoop High Availability with the ZooKeeper failover controller to support a scalable, fault-tolerant data solution.
- Wrote multiple MapReduce programs in Java for data extraction, transformation and aggregation from multiple file formats including XML, JSON, CSV and other compressed file formats (a representative sketch appears after this list).
- Used Pig as an ETL tool for transformations, event joins, filtering and some pre-aggregations.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
- Used Spark to bring data into memory for stream processing and implemented RDD transformations and actions to process it in units (see the second sketch after this list).
- Configured Spark to optimize data processing.
- Created and modified UDFs and UDAFs for Hive.
- Populated HDFS and Cassandra with huge amounts of data using Apache Kafka.
- Used DML statements to perform different operations on Hive tables.
- Developed Hive queries for creating foundation tables from stage data.
- Adjusted the minimum share of map and reduce slots for all the queues.
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
- Managed Amazon Web Services (AWS) EC2 with Puppet.
- Worked with the Apache Crunch library to write, test and run Hadoop MapReduce pipeline jobs.
- Efficiently put and fetched data to/from HBase by writing MapReduce jobs in Java/Python.
- Provided cluster coordination services through ZooKeeper.
- Created Hive tables, dynamic partitions and buckets for sampling, and worked on them using HiveQL.
- Experienced in loading and transforming large sets of semi-structured data using Pig Latin operations.
- Worked on the Oozie workflow engine for job scheduling.
- Extracted the data from Teradata into HDFS using Sqoop.
- Performed data visualization using Tableau for reporting from Hive tables.
- Worked with SequenceFile, RCFile, Avro and HAR file formats.
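A minimal sketch of the Java MapReduce pattern referenced above, aggregating record counts by the first CSV column (the class name and field choice are illustrative assumptions; the real jobs parsed XML and JSON as well):

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Hypothetical aggregation: counts records per value of the first CSV column.
    public class CsvFieldCount {

        public static class FieldMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
            private static final LongWritable ONE = new LongWritable(1);
            private final Text outKey = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split(",");
                if (fields.length > 0 && !fields[0].isEmpty()) {
                    outKey.set(fields[0]);
                    context.write(outKey, ONE);
                }
            }
        }

        public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
            @Override
            protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                    throws IOException, InterruptedException {
                long sum = 0;
                for (LongWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new LongWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "csv field count");
            job.setJarByClass(CsvFieldCount.class);
            job.setMapperClass(FieldMapper.class);
            job.setCombinerClass(SumReducer.class); // combiner cuts shuffle volume
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Reusing the reducer as a combiner, as here, is the kind of mapper/combiner/partitioner optimization the summary refers to.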
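And a minimal sketch of the Spark RDD style mentioned above, in the Spark 2.x Java API (the token-count pipeline itself is an illustrative assumption):

    import java.util.Arrays;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    import scala.Tuple2;

    // Hypothetical RDD pipeline: cache lines in memory, tokenize, count per token.
    public class TokenCounts {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("token-counts");
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                // cache() keeps the RDD in memory so later actions can reuse it
                JavaRDD<String> lines = sc.textFile(args[0]).cache();
                JavaPairRDD<String, Long> counts = lines
                        .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
                        .filter(token -> !token.isEmpty())
                        .mapToPair(token -> new Tuple2<>(token, 1L))
                        .reduceByKey(Long::sum);
                counts.saveAsTextFile(args[1]); // action that triggers the computation
            }
        }
    }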
Environment: Hadoop, HDFS, Apache Crunch, MapReduce, Hive, Flume, Sqoop, Oozie, ZooKeeper, Kafka, Storm, Cassandra, Spark, Puppet, Linux.
Confidential, NEW YORK, NY
BIG DATA/HADOOP DEVELOPER
Responsibilities:
- Gained real-time experience with Kafka-Storm on the HDP 2.2 platform for real-time analysis.
- Enhanced and optimized the customer path tree GUI viewer to incrementally load the tree data from HBase, a NoSQL database.
- Created HBase tables to store variable data formats of input data coming from different portfolios (a client-API sketch appears after this list).
- Created a PoC to store server log data in MongoDB to identify system alert metrics.
- Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
- Performed end-to-end performance tuning of Hadoop clusters and Hadoop MapReduce routines against very large data sets.
- Extracted the data from Teradata into HDFS using Sqoop.
- Designed NoSQL schemas in HBase.
- Performed analysis on the unused user navigation data by loading it into HDFS and writing MapReduce jobs. The analysis provided inputs to the new APM front-end developers and the Lucent team.
- Responsible for writing Hive queries for data analysis to meet the business requirements.
- Wrote MapReduce jobs using the Java API and Pig Latin.
- Wrote Pig scripts to run ETL jobs on the data in HDFS and performed further testing.
- Used Hive to analyze the data and identify different correlations.
- Imported and exported data using Sqoop between HDFS and relational database systems.
- Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Involved in creating Hive tables, working on them using HiveQL, and performing data analysis using Hive and Pig.
- Automated the regular import of data into Hive partitions using Sqoop, orchestrated with Apache Oozie.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
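A minimal sketch of writing and reading one such HBase row through the HBase 1.x Java client (the table name, row key and column family here are illustrative assumptions):

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    // Hypothetical table "events": one row per portfolio, payload stored as raw bytes.
    public class HBaseEventStore {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("events"))) {

                // write one row keyed by portfolio id
                Put put = new Put(Bytes.toBytes("portfolio-42"));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"),
                        Bytes.toBytes("{\"type\":\"csv\"}"));
                table.put(put);

                // read it back
                Get get = new Get(Bytes.toBytes("portfolio-42"));
                Result result = table.get(get);
                byte[] payload = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("payload"));
                System.out.println(Bytes.toString(payload));
            }
        }
    }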
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Flume, ZooKeeper, Cloudera Manager, Oozie, Java (JDK 1.6), MySQL, SQL, Windows NT, Linux
Confidential
JAVA/J2EE DEVELOPER
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Actively participated in the implementation, maintenance and testing phases, including test plan generation, using the Struts framework.
- Designed and developed Web Services using UDDI, WSDL and SOAP to communicate with the other modules.
- Generated client classes using WSDL2Java and used the generated Java API.
- Experienced in designing, developing and implementing J2EE applications.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
- Supported MapReduce programs running on the cluster.
- Managed and scheduled jobs on a Hadoop cluster.
- Developed Pig Latin scripts to extract the data from the web server output files and load it into HDFS.
- Wrote MapReduce jobs using the Java API.
- Responsible for importing and exporting data into HDFS and Hive using Sqoop.
- Clustered customers into categories and, based on that, provided offers using Apache Hive.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Performed grouping, aggregation and sorting using Pig and Hive, which are higher-level abstractions over MapReduce.
- Experience in importing data from various sources into the Cassandra cluster using Java APIs.
- Developed Pig UDFs to pre-process the data for analysis.
- Wrote Hive queries for data analysis to meet the business requirements.
- Developed workflows in Oozie to automate the tasks of loading the data into HDFS and pre-processing it with Pig.
- Created data models for customer data using the Cassandra Query Language (a CQL sketch via the Java driver appears after this list).
- Took part in monitoring, troubleshooting and managing Hadoop log files.
- Mentored a few people on the team and reviewed design, code and test cases written by them.
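A minimal sketch of such a CQL data model exercised through the DataStax Java driver 3.x (the keyspace, table and columns are illustrative assumptions, not from a specific project):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    // Hypothetical customer model: keyspace "crm", table "customers".
    public class CustomerModel {
        public static void main(String[] args) {
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect()) {

                session.execute("CREATE KEYSPACE IF NOT EXISTS crm WITH replication = "
                        + "{'class': 'SimpleStrategy', 'replication_factor': 1}");
                session.execute("CREATE TABLE IF NOT EXISTS crm.customers ("
                        + "customer_id text PRIMARY KEY, name text, segment text)");

                // insert one customer row and read it back by partition key
                session.execute("INSERT INTO crm.customers (customer_id, name, segment) "
                        + "VALUES ('c-001', 'Example Customer', 'retail')");
                ResultSet rs = session.execute(
                        "SELECT name, segment FROM crm.customers WHERE customer_id = 'c-001'");
                for (Row row : rs) {
                    System.out.println(row.getString("name") + " / " + row.getString("segment"));
                }
            }
        }
    }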
Environment: Apache Hadoop, Hive, Java (JDK 1.6), Cassandra, DataStax, Oracle 11g/10g, MySQL, Toad 9.6, UNIX, Oozie, SOAP, JSON.
Confidential
JAVA/HADOOP DEVELOPER
Responsibilities:
- Used the Spring MVC framework for displaying the report.
- Involved in developing web pages using JSP, Servlets, CSS, and JavaScript/jQuery.
- Involved in various POC activities using technologies such as MapReduce, Hive, Pig and Oozie.
- Developed Pig UDFs to pre-process the data for analysis.
- Involved in designing and implementing a service layer over the HBase database.
- Imported data from various data sources such as Oracle and the Comptel server into HDFS using Sqoop and MapReduce transformations.
- Involved in loading data from Linux and UNIX file systems to HDFS.
- Developed Hive queries for the analysts (a JDBC access sketch appears after this list).
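A minimal sketch of running one such Hive query programmatically over JDBC against HiveServer2 (the host, user, table and columns are illustrative assumptions):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Hypothetical report query against HiveServer2; identifiers are placeholders.
    public class HiveReport {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://hiveserver2-host:10000/default", "analyst", "");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(
                         "SELECT region, COUNT(*) AS events FROM web_logs GROUP BY region")) {
                while (rs.next()) {
                    System.out.println(rs.getString("region") + "\t" + rs.getLong("events"));
                }
            }
        }
    }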
Environment: Struts MVC, Struts Tiles, Spring, Hibernate, AngularJS, Node.js, JSP 2.3, HTML5, CSS, JavaScript, jQuery, AJAX, XML, SOAP, Hadoop, MapReduce, HDFS, Hive, Oozie, Cloudera, NoSQL, Oracle 11g, DB2, PL/SQL, Toad 9.6, Java (JDK 1.6/8), JUnit, Mockito, SVN, Ant, Windows NT, UNIX
Confidential
JAVA DEVELOPER
Responsibilities:
- Worked in various phases of Software Development Life Cycle (SDLC) such as requirements gathering, analysis and development.
- Designed and developed the front-end using HTML, CSS, and JavaScript with Ajax.
- Designed use case diagrams, class diagrams, and sequence diagrams as a part of Design Phase using Rational Rose.
- Developed the application implementing the Spring MVC architecture with Hibernate as the ORM framework.
- Developed a JavaScript performance testing toolkit for web and Node.js applications.
- Used JNDI to perform lookup services for the various components of the system.
- Developed Enterprise Java Beans (stateless session beans) to handle transactions such as online funds transfers and bill payments to the service providers.
- Developed deployment descriptors for the EJBs to deploy on the WebSphere Application Server.
- Implemented Service Oriented Architecture (SOA) using JMS for sending and receiving messages while creating web services.
- Developed Web Services for data transfer from client to server and vice versa using Apache Axis, SOAP, WSDL, and UDDI.
- Extensively worked on MQ Series using point-to-point and publisher/subscriber messaging domains to implement the exchange of information through messages.
- Developed XML documents and generated XSL files for Payment Transaction and Reserve Transaction systems.
- Implemented various J2EE Design patterns like Singleton, Service Locator, Business Delegate, DAO, Transfer Object, and SOA.
- Worked on AJAX to develop an interactive web application and JavaScript for data validations.
- Used Subversion for version control.
- Built the Ant script for the application and used Log4j for debugging.
- Used the JUnit framework for unit testing of all the Java classes (a sample test appears after this list).
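A minimal sketch of a JUnit 4 test in the style used for those classes (FundsTransfer is a hypothetical stand-in, not a real project class):

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    // FundsTransfer is a hypothetical stand-in for an application class under test.
    public class FundsTransferTest {

        static class FundsTransfer {
            long apply(long balanceCents, long transferCents) {
                if (transferCents < 0 || transferCents > balanceCents) {
                    throw new IllegalArgumentException("invalid transfer amount");
                }
                return balanceCents - transferCents;
            }
        }

        @Test
        public void debitsTransferAmountFromBalance() {
            assertEquals(7500L, new FundsTransfer().apply(10000L, 2500L));
        }

        @Test(expected = IllegalArgumentException.class)
        public void rejectsTransferLargerThanBalance() {
            new FundsTransfer().apply(1000L, 5000L);
        }
    }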
Environment: JDK 1.5, J2EE, EJB 2.0, JNDI 1.2, Hibernate 2.1, Spring 2.0, HTML, JavaScript, XML, CSS, Node.js, JUnit, UML, MQ Series, Multithreading, Web Services, SOAP, WSDL, UDDI, Axis 2, Ajax, Git, Ant, Eclipse 3.3, IBM WebSphere 6.1, DB2, Subversion, Linux.
