Hadoop Developer Resume
Boston, MA
SUMMARY:
- 7+ years of IT experience, including 3 years of work experience in Big Data and Hadoop ecosystem technologies.
- Good domain knowledge of Healthcare, Insurance, and E-commerce.
- Working experience with Apache Hadoop ecosystem components like HDFS, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, Oozie, and Spark.
- Experience working with major Hadoop distributions like Cloudera 5.x and Hortonworks HDP 2.x and above.
- Worked on Apache NiFi for real-time analytical processing.
- Experience in optimizing MapReduce programs using combiners, partitioners, and custom counters to deliver the best results.
- Experience in writing Pig and Hive scripts and extending their core functionality by writing custom UDFs (see the UDF sketch after this list).
- Good knowledge of file formats like SequenceFile, RCFile, ORC, and Parquet, and compression codecs like gzip, Snappy, and LZO.
- Extensively worked on Hive and Impala.
- Integration with various Hadoop ecosystem tools:
- Integrated Hive and HBase for better performance.
- Integrated Impala and HBase for real-time analytics.
- Integrated Hive and Spark SQL for high performance.
- Integrated Spark and HBase for OLTP workloads.
- Integrated Kafka with Spark Streaming for high throughput and reliability (see the streaming sketch after this list).
- Worked on Apache Flume to collect and aggregate large volumes of log data and store it on HDFS for further analysis.
- Experience in importing traditional RDBMS data to HDFS using Sqoop and exporting data from HDFS to an RDBMS to generate reports.
- Experience in writing both time-driven and data-driven workflows using Oozie.
- Solid understanding of algorithms, data structures and object-oriented programming
- Knowledge of column-oriented NoSQL databases like HBase and Cassandra.
- Experience in managing and troubleshooting Hadoop-related issues.
- Good knowledge and understanding of the Java and Scala programming languages.
- Knowledge of Linux and shell scripting.
- Diverse experience utilizing Java tools on business, web, and client-server platforms using core Java, JSP, Servlets, Swing, Java Database Connectivity (JDBC), and application servers like Apache Tomcat.
- Improved the performance of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Hands-on experience with Spark SQL queries and DataFrames: importing data from data sources, performing transformations and read/write operations, and saving results to an output directory in HDFS (see the DataFrame sketch after this list).
- Implemented POCs using Kafka, Spark Streaming, and Spark SQL.
- Knowledge of SQL queries for backend database analysis.
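A minimal sketch of the kind of custom Hive UDF mentioned above; the package, class name, and normalization logic are illustrative assumptions, and it assumes hive-exec on the classpath:

```java
package com.example.udf; // hypothetical package

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF that normalizes free-text state codes, e.g. " ma " -> "MA".
// Registered in Hive with:
//   ADD JAR normalize-udf.jar;
//   CREATE TEMPORARY FUNCTION normalize_state AS 'com.example.udf.NormalizeState';
public class NormalizeState extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // pass NULLs through unchanged, as Hive expects
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}
```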
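A minimal sketch of the Spark SQL / DataFrame usage described above; the application name, HDFS paths, and column names are illustrative assumptions:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.col;

public class ClaimsSummary {
    public static void main(String[] args) {
        // On a real cluster this would be submitted to YARN; paths are placeholders.
        SparkSession spark = SparkSession.builder()
                .appName("ClaimsSummary")
                .enableHiveSupport()
                .getOrCreate();

        // Import from a data source (Parquet on HDFS), transform, write results back.
        Dataset<Row> claims = spark.read().parquet("hdfs:///data/claims");
        Dataset<Row> totals = claims
                .filter(col("status").equalTo("APPROVED"))
                .groupBy(col("member_id"))
                .sum("claim_amount");
        totals.write().mode("overwrite").parquet("hdfs:///output/claim_totals");

        spark.stop();
    }
}
```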
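And a minimal sketch of the Kafka with Spark Streaming integration, assuming the spark-streaming-kafka-0-10 connector; the broker, topic, and group id are placeholder values:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaEventCount {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("KafkaEventCount");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092"); // placeholder broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "event-count");           // placeholder group id

        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Collections.singletonList("events"), kafkaParams));

        // Count records per micro-batch and print a sample to the driver log.
        stream.count().print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```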
TECHNICAL SKILLS:
Operating Systems: Windows XP, Linux
Languages/Technologies: Hadoop: MapReduce, Pig, Hive, Sqoop, HBase, Oozie, Spark, Scala, NiFi; Java: J2EE, JSP, Servlets, HTML, JavaScript, XML, CSS
Hadoop Distribution: Cloudera, Hortonworks
Databases: MySQL, Oracle
Tools: Tableau
PROFESSIONAL EXPERIENCE:
Confidential, Boston, MA
Hadoop Developer
Responsibilities:
- Played a senior Hadoop developer role and was involved in all phases of the project, from POCs through implementation.
- Involved in data migration activity using the Sqoop JDBC driver for MySQL.
- Worked on full and incremental imports and created Sqoop jobs
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Involved in loading data from local file system (Linux) to HDFS.
- Created a data model for structuring and storing the data efficiently. Implemented partitioning of tables in HBase.
- Involved in creating Hive tables, loading the data, and writing Hive queries that run internally as MapReduce jobs.
- Worked with various Hadoop file formats, including ORC and Parquet.
- Involved in integration of Hive and HBase.
- Implemented bucketing, partitioning, and other query performance tuning techniques (see the Hive DDL sketch after this list).
- Tested Apache Tez, an extensible framework for building high-performance batch and interactive data processing applications, on Hive jobs.
- Involved in writing Oozie workflows.
- Designed and documented standard operational procedures using Confluence.
- Created custom NiFi processors and built the flows accordingly.
- Created a NiFi flow to ingest data in real time from MySQL to Salesforce using REST APIs.
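A minimal sketch of the partitioning and bucketing described above, issued over Hive JDBC; the HiveServer2 endpoint, table layout, and staging table are illustrative assumptions:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreatePartitionedTable {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver"); // HiveServer2 JDBC driver
        Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hiveserver:10000/default", "hive", ""); // placeholder endpoint
        try (Statement stmt = conn.createStatement()) {
            // Partition by load date, bucket by customer id, store as ORC.
            stmt.execute("CREATE TABLE IF NOT EXISTS orders ("
                    + " order_id BIGINT, customer_id BIGINT, amount DOUBLE)"
                    + " PARTITIONED BY (load_date STRING)"
                    + " CLUSTERED BY (customer_id) INTO 32 BUCKETS"
                    + " STORED AS ORC");
            // Older Hive versions need these before bucketed, dynamic-partition inserts.
            stmt.execute("SET hive.enforce.bucketing=true");
            stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");
            stmt.execute("INSERT INTO TABLE orders PARTITION (load_date)"
                    + " SELECT order_id, customer_id, amount, load_date FROM orders_staging");
        } finally {
            conn.close();
        }
    }
}
```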
Environment: Hortonworks, Hadoop, Hive, Sqoop, HBase, HDFS, Tez, Java, MySQL, FileZilla, Unix Shell Scripting.
Confidential, Boston, MA
Hadoop Developer
Responsibilities:
- Worked on analyzing the Hadoop cluster and different Big Data analytic tools, including Pig, Hive, HBase, and Sqoop.
- Installed Hadoop, MapReduce, and HDFS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing (see the MapReduce sketch after this list).
- Coordinated with business customers to gather business requirements, interacted with technical peers to derive technical requirements, and delivered the BRD and TDD documents.
- Extensively involved in the design phase and delivered design documents.
- Involved in testing and coordinated user testing with the business.
- Imported and exported data into HDFS and Hive using Sqoop.
- Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Experienced in defining job flows.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Experienced in managing and reviewing the Hadoop log files.
- Used Pig as an ETL tool to perform transformations, joins, and some pre-aggregations before storing the data on HDFS.
- Loaded and transformed large sets of structured data.
- Responsible for managing data coming from different sources.
- Utilized the Cloudera Apache Hadoop environment.
- Created the data model for Hive tables.
- Involved in unit testing and delivered unit test plans and results documents.
- Exported data from the HDFS environment into an RDBMS using Sqoop for report generation and visualization purposes.
- Worked on Oozie workflow engine for job scheduling.
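The data-cleaning jobs above were written in Pig and Hive, which compile down to MapReduce; the following minimal map-only Java job illustrates the same cleaning pattern directly (the log layout and field positions are illustrative assumptions):

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LogClean {
    // Map-only job: drop malformed lines, emit tab-separated (timestamp, level, message).
    public static class CleanMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] parts = value.toString().split("\\s+", 3); // assumed log layout
            if (parts.length == 3) {
                ctx.write(NullWritable.get(),
                          new Text(parts[0] + "\t" + parts[1] + "\t" + parts[2]));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "log-clean");
        job.setJarByClass(LogClean.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0); // map-only: no shuffle or reduce phase needed
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```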
Environment: Cloudera, Hadoop, Hive, MapReduce, HBase, Pig, Sqoop, MySQL, Tableau
Confidential
Software Developer
Responsibilities:
- Performed weekly performance analysis of team members and assigned tasks, which improved the team's approach in critical phases of the project.
- Developed a client-server website in Java for clients to access required information.
- Extensively worked with the retrieval and manipulation of data from the Oracle database by writing queries using SQL and PL/SQL.
- Designed and developed web pages using HTML and JSP.
- Used Eclipse 3.2 for writing JSP and Servlet code.
- Used Spring's JdbcTemplate to access the database through stored procedures (see the sketch after this list).
- Used XML for mapping and configuring the project.
- Designed and developed Servlets to communicate between the presentation and business layer.
- Developed JSPs, JavaBeans, and Servlets to interact with the database.
- Used CSS and JavaScript for validations and for integrating server-side business components on the client side within the browser.
- Developed shell scripts for routine database maintenance.
- Involved in Integration testing and defect fixes.
- Conducted regular client and team meetings and delivered results beyond expectations.
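A minimal sketch of calling a stored procedure through Spring's JdbcTemplate, as mentioned above; the procedure name, parameter, and DataSource wiring are illustrative assumptions:

```java
import javax.sql.DataSource;
import java.util.Map;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.namedparam.MapSqlParameterSource;
import org.springframework.jdbc.core.simple.SimpleJdbcCall;

public class AccountDao {
    private final SimpleJdbcCall getAccountCall;

    public AccountDao(DataSource dataSource) {
        // SimpleJdbcCall builds on JdbcTemplate to invoke stored procedures.
        this.getAccountCall = new SimpleJdbcCall(new JdbcTemplate(dataSource))
                .withProcedureName("GET_ACCOUNT"); // hypothetical procedure
    }

    public Map<String, Object> findAccount(long accountId) {
        // The IN parameter name must match the procedure's declared parameter.
        return getAccountCall.execute(
                new MapSqlParameterSource().addValue("p_account_id", accountId));
    }
}
```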
Environment: Java 1.6, JSP 2.0, Servlets, MySQL, Oracle 11g, UNIX, FTP, Tomcat, HTML5, CSS, JavaScript, XML, Eclipse IDE.
Confidential
Software Developer
Responsibilities:
- Involved in the design, development, testing, and integration of the application.
- Implemented business logic and database connectivity.
- Performed client-side installation and configuration of the project.
- Used Struts validation to validate user input per the business logic and for initial data loading.
- Coordinated application testing with the testing team.
- Wrote database queries on Oracle.
- Wrote stored procedures, packages, views, cursors, functions, and triggers using SQL in the backend.
- Worked with business teams using agile methodology to integrate line-of-business apps with SOA in a seamless fashion.
- Used Hibernate for Object Relational Mapping (ORM) and data persistence.
- Wrote SQL commands and Stored Procedures to retrieve data from Oracle database.
- Developed REST APIs.
- Developed web services using RESTful services, WSDL, and XML.
- Developed the application using the Singleton, Business Delegate, and Data Transfer Object design patterns.
- Created and implemented Oracle Stored Procedures, Functions, Triggers and complex queries using SQL.
- Worked with the Java Message Service (JMS) for reliable, asynchronous communication (see the sketch below).
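A minimal sketch of the asynchronous JMS messaging mentioned above; the JNDI names and queue are placeholder values, and it assumes a container-managed connection factory:

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

public class OrderPublisher {
    public void publish(String payload) throws Exception {
        // Look up the provider-managed factory and queue from JNDI (names are placeholders).
        InitialContext ctx = new InitialContext();
        ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
        Queue queue = (Queue) ctx.lookup("jms/OrderQueue");

        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            TextMessage message = session.createTextMessage(payload);
            producer.send(message); // fire-and-forget: the consumer processes asynchronously
        } finally {
            connection.close(); // closes the session and producer as well
        }
    }
}
```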
Environment: Java 1.6, JSP 2.0, MySQL, UNIX, Shell Scripting, Oracle 11g, FTP, HTML5, CSS, JavaScript, XML, Eclipse IDE.