Sr Hadoop Developer Resume
San Antonio, TX
SUMMARY
- Hadoop/Java/Python Developer with over 7 years of overall experience as a software developer in the design, development, deployment and support of large-scale distributed systems.
- 3 years of extensive experience as a Big Data Developer.
- Excellent understanding of Hadoop architecture and the underlying framework, including storage management.
- Expertise in using Hadoop ecosystem components such as MapReduce, Pig, Hive, HBase, Sqoop, Kafka, Spark, PySpark and Spark Streaming for data storage and analysis.
- Experienced in troubleshooting errors in HBase Shell/API, Pig, Hive and MapReduce.
- Highly experienced in importing and exporting data between HDFS and relational database management systems using Sqoop.
- Collected log data from various sources and integrated it into HDFS using Flume.
- Good experience in generating statistics, extracts and reports from Hadoop.
- Experience in writing Spark applications using spark-shell, PySpark and spark-submit.
- Developed prototype Spark applications using Spark Core, Spark SQL and the DataFrame API.
- Good knowledge in querying data from Cassandra for searching, grouping and sorting.
- Strong experience in core Java, J2EE, SQL, PL/SQL and RESTful web services.
- Hands-on experience with the Essbase tool.
- Extensive experience in developing applications using Core Java and multi-threading.
- Determined, committed and hardworking individual with strong communication, interpersonal and organizational skills.
- Hands-on experience with visualization tools such as Tableau.
- Expertise in J2EE architecture, 3-tier client/server development architecture and distributed computing architecture.
- Experience in design and development of multi-tiered web-based applications using J2EE technologies such as JSP, Servlets, EJB, JDBC, JMS, XML (SAX, DOM), XSL and custom tags.
- Experience in various open source frameworks and tools such as Spring and Hibernate.
- Experience in Web Service development (JAX-RPC, JAX-WS, SOAP, and WSDL).
- Involved in complete SDLC implementations, including preparing requirement specification documents, design documents, test cases and analysis, user training documents and technical help documents.
- Experience in Python, Java Design Patterns, J2EE Design Patterns, MVC, multi-tier architecture.
- Good working knowledge of Log4J, XML, XSLT (DOM, SAX), Multithreading, Collections, Exceptions.
- Experience with ETL tools such as DataStage and scheduling tools such as Control-M.
- Knowledge in Machine Learning and related technologies.
TECHNICAL SKILLS
Hadoop Ecosystem: HDFS, MapReduce, MRUnit, YARN, Hive, Pig, Spark, HBase, Kafka, Sqoop, DataStax Apache Cassandra, Flume
Web Technologies: HTML, XML, JDBC, JSP, JavaScript, AJAX
RDBMS: Oracle 10g/11g, MySQL, MS SQL Server, Teradata
NoSQL: HBase, Cassandra
Web/Application Servers: Tomcat, WebSphere
Java Frameworks: Struts, Spring, Hibernate
Methodologies: Agile, UML, Design Patterns (Core Java and J2EE)
Programming Languages: C, Java, Python, PHP, SQL, Linux shell scripting
Tools: Eclipse, PuTTY, Control-M, InfoSphere DataStage
Analytics, Monitoring and Performance Management Tools: SparkR, Splunk, AppViewX
PROFESSIONAL EXPERIENCE
Confidential, San Antonio, TX
Sr Hadoop Developer
Responsibilities:
- Designed and developed a custom MapReduce job to ingest clickstream data received from the bank into Hadoop.
- Developed Sqoop scripts to extract data from DB2 EDW source databases onto HDFS.
- Developed a custom MapReduce job to perform data cleanup, transform data from Text to Avro and write output directly into Hive tables by generating dynamic partitions.
- Developed custom FTP and SFTP drivers to pull flat files from UNIX and Windows into Hadoop and tokenize identified sensitive data from input records on the fly, in parallel.
- Developed custom InputFormat, RecordReader, Mapper, Reducer and Partitioner classes as part of developing end-to-end Hadoop applications.
- Developed a custom Sqoop tool to import data residing in relational databases, tokenize identified sensitive columns on the fly and store the results in Hadoop.
- Used the HBase Java API to populate operational HBase tables with key-value data (a minimal sketch follows this list).
- Experience in writing Spark applications using spark-shell, PySpark and spark-submit.
- Developed prototype Spark applications using Spark Core, Spark Streaming, Spark SQL and the DataFrame API.
- Developed several custom user-defined functions in Hive and Pig using Java and Python (see the UDF sketch after this list).
- Developed Sqoop jobs to perform full and incremental imports of data from relational tables into Hadoop, in formats such as Text, Avro and SequenceFile, and into Hive tables.
- Developed Sqoop jobs to export data from Hadoop to relational tables for visualization and report generation for the BI team.
- Used HCatalog to load text-format Hive tables and transform data into Avro within Pig scripts.
- Scheduled jobs using Control-M and edited ETL DataStage jobs.
- Developed Java code to generate, compare and merge Avro schema files.
- Designed and developed external and managed Hive tables with data formats such as Text, Avro, SequenceFile, RCFile, ORC and Parquet.
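A minimal sketch of the kind of HBase Java API population described above; the table name, row key and column family are placeholders, not the actual project schema:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class OperationalTableLoader {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("ops_table"))) {
            // Hypothetical row: a key plus one column under family "cf"
            Put put = new Put(Bytes.toBytes("row-001"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("status"),
                          Bytes.toBytes("ACTIVE"));
            table.put(put); // write the key-value into the operational table
        }
    }
}
```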
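And a sketch of a simple Hive UDF in Java, in the spirit of the tokenization work above; the masking rule and class name are illustrative assumptions, not the project's actual logic:

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Illustrative UDF: masks all but the last four characters of a value,
// a stand-in for the actual tokenization of sensitive columns.
public final class MaskValue extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        String s = input.toString();
        int keep = Math.min(4, s.length());
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < s.length() - keep; i++) {
            masked.append('*');
        }
        masked.append(s.substring(s.length() - keep));
        return new Text(masked.toString());
    }
}
```

In Hive, a function like this would be registered with ADD JAR and CREATE TEMPORARY FUNCTION before use in queries.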
Environment: Control-M, Hadoop, MapReduce, HDFS, Hive, Pig, Spark, Java, SQL, IG, Sqoop, Eclipse, Python, UNIX shell scripting.
Confidential, Charlotte, NC
Hadoop Developer
Responsibilities:
- Involved in review of functional and non-functional requirements.
- Installed and configured the Hadoop environment.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (a minimal sketch follows this list).
- Installed and configured Pig and wrote Pig Latin scripts.
- Wrote MapReduce jobs using Pig Latin.
- Involved in managing and reviewing Hadoop log files.
- Used Sqoop to import data from DB2 into HDFS on a regular basis.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Wrote Hive queries for data analysis to meet business requirements.
- Created Hive tables and worked with them using HiveQL.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experienced in defining job flows.
- Involved in creating Hive tables, loading data and writing Hive queries.
- Developed a custom file system plug-in for Hadoop so that it can access files on the data platform.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Extracted table feeds from the EDW database using Python scripts.
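A minimal sketch of a map-only data-cleaning job of the sort described above, assuming a hypothetical pipe-delimited input; the validation rule is a placeholder:

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CleanupJob {
    public static class CleanupMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String line = value.toString().trim();
            // Drop blank lines and malformed records (placeholder rule:
            // expect at least 3 pipe-delimited fields).
            if (!line.isEmpty() && line.split("\\|").length >= 3) {
                ctx.write(NullWritable.get(), new Text(line));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "data-cleanup");
        job.setJarByClass(CleanupJob.class);
        job.setMapperClass(CleanupMapper.class);
        job.setNumReduceTasks(0); // map-only pass, no shuffle needed
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Setting the reducer count to zero keeps the cleanup pass map-only, so cleansed records are written straight back to HDFS without a shuffle.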
Environment: Hadoop, MapReduce, HDFS, Sqoop, Hive, Java, Pig, MySQL, Eclipse, Oracle 10g, SQL.
Confidential, Charlotte, NC
Java Developer
Responsibilities:
- Development, enhancement and testing of webMethods flow services and Java services.
- Used web services for interaction between various components and created SOAP envelopes. Involved in unit testing of web services using SoapUI.
- Used multi-threading concepts while creating the application.
- Created web service Connectors for use within a flow service, while making the URLs abstract so as to change them at runtime without redeployment.
- Used the JAX-RS API to create RESTful web services to interact with the server (a minimal sketch follows this list).
- Created a front-end application using JSPs, JSF, GWT and Spring MVC.
- Created standalone Java programs to read data from several XLS files and insert it into the database as needed by the testing team. Used JUnit for testing.
- Involved in Ant Build tool configuration for automation of building processes for all types of environments - Test, QA, and Production.
- Developed and provided support to many components of this application end to end, from the front end (view) to webMethods and the database.
- Provided solutions for bug fixes in this application.
- Analyzed requirements and prepared the high level and low level design documents.
- Performed post-implementation certification and verification.
- Coordinated between the onsite and offshore teams.
- Created DataPower services such as Multi-Protocol Gateway (MPGW) and XML Firewall, along with all associated objects.
- Wrote custom XSLT code to meet requirements.
- Handled message formats such as SOAP/XML, OAG, FP and delimited.
- Created Schedulers as necessary.
- Created log targets and log categories to offload DataPower logs to a remote SAN using the syslog protocol.
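A minimal sketch of a JAX-RS resource of the kind referenced above; the resource path, fields and inline JSON are illustrative, not the production service:

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/accounts")
public class AccountResource {
    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public Response getAccount(@PathParam("id") String id) {
        // Placeholder lookup; a real service would delegate to a backend DAO.
        String json = "{\"id\":\"" + id + "\",\"status\":\"ACTIVE\"}";
        return Response.ok(json).build();
    }
}
```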
Environment: IBM DataPower Xi50, Xi52, Java, J2EE, WebSphere MQ, webMethods 8.1, SOAP, XML, XSLT, Contivo, Compuware, Gomez, SPLUNK, DynaTrace, Sitescope, Foglight, Introscope APM for Java
Confidential
Java/J2EE Developer
Responsibilities:
- Assisted in designing and programming for the system, which includes development of Process Flow Diagram, Entity Relationship Diagram, Data Flow Diagram and Database Design.
- Worked on the transactions, login and reporting modules and on customized report generation using controllers; tested and debugged the whole project for proper functionality and documented the modules developed.
- Designed front end components using HTML, CSS, JavaScript.
- Involved in developing Java APIs that communicate with the Java Beans.
- Implemented MVC architecture using Struts, Custom and JSTL tag libraries.
- Involved in development of POJO classes and writing Hibernate Query Language (HQL) queries (a minimal sketch follows this list).
- Used Java/J2EE design patterns such as MVC, Factory Method, Singleton, Data Transfer Object (DTO) and Service Locator.
- Created stored procedures using PL/SQL for data modification.
- Developed JUnit test cases for regression testing and integrated with ANT build.
- Implemented Logging framework using Log4J.
- Involved in code review and documentation review of technical artifacts.
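A minimal sketch of the POJO-plus-HQL pattern described above, assuming a hypothetical Transaction entity and a hibernate.cfg.xml on the classpath; the entity and query are placeholders:

```java
import java.util.List;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

// Hypothetical mapped entity (mapping via hbm.xml or annotations omitted)
class Transaction {
    private Long id;
    private String status;
    // getters/setters omitted for brevity
}

public class TransactionDao {
    private final SessionFactory sessionFactory =
            new Configuration().configure().buildSessionFactory();

    @SuppressWarnings("unchecked")
    public List<Transaction> findByStatus(String status) {
        Session session = sessionFactory.openSession();
        try {
            // HQL queries the mapped Transaction POJO, not the table directly
            return session
                    .createQuery("from Transaction t where t.status = :status")
                    .setParameter("status", status)
                    .list();
        } finally {
            session.close();
        }
    }
}
```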
Environment: Java/J2EE, Spring Web Services, WebLogic Portal, WebSphere Application Server 7.0, WebSphere Process Server, Linux, Struts, UNIX shell, Java batch client, Oracle 10g, Toad, Wily Introscope, BladeLogic, SSH Tectia, MS Excel 2003/2007