Java / Hadoop Engineer Resume
Sunnyvale, CA
SUMMARY:
- 8 years of experience in analysis, design, coding, testing, and production support of application software.
- 2 years of hands-on experience with the Hadoop ecosystem, including HDFS, MapReduce, Pig, Hive, HBase, Sqoop, Flume, and Apache Spark.
- Proficient in DB2, SQL Server, MySQL, JCL, COBOL, VSAM, the Web-Harvest tool, and PL/SQL.
- Experience analyzing data using HiveQL, Pig Latin, and HBase.
- Experience importing and exporting data between HDFS and relational databases such as MySQL, Oracle, and DB2 using Sqoop.
- Excellent understanding of Hadoop architecture and the components of a Hadoop cluster.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Extended Hive and Pig core functionality by writing custom UDFs.
- Experience working with Hadoop clusters using the Cloudera (CDH3) distribution.
- Worked extensively with the Web-Harvest tool and the Tableau reporting tool.
- Strong analytical, problem-solving, multitasking, and strategic planning skills.
- Experience designing and developing interactive data visualizations, graphical dashboards, and reports using tools such as Tableau.
- Experience working in the healthcare domain.
- Expertise in generating reports for ad-hoc business requests using SQL Server Reporting Services, Crystal Reports, MS Excel, Power BI, and Power View.
- Involved in large-scale data migrations and transfers using SSIS.
- Expert in calculating measures and dimension members using Multidimensional Expressions (MDX), mathematical formulas, and user-defined functions.
- Enthusiastic about learning new technologies and working in challenging domains.
- Experienced in debugging batch and online programs using Xpeditor and File-AID.
- Able to work independently.
- Proficient at all levels of the SDLC.
- Works well in a team and is self-motivated.
TECHNICAL SKILLS:
Operating Systems: Windows XP/98/95, Linux, UNIX, IBM Z/390
Languages: C, C++, Java, XQuery, PL/SQL, HTML, XML, Web-Harvest
Databases: SQL Server 2005/2008, Oracle 11g/10g/9i, DB2, MS Access, HBase
Hadoop Ecosystem: Hadoop 2.x (YARN), Pig, Hive, HBase, Sqoop, MapReduce, Cloudera
Business Intelligence Tools: SQL Server Reporting Services (SSRS), SQL Server Analysis Services (SSAS), SQL Server Integration Services (SSIS), Tableau 8.3, MDX
Business Process Tools: MS Office
PROFESSIONAL EXPERIENCE:
Confidential - Sunnyvale, CA
Java / Hadoop Engineer
Responsibilities:
- Installed and configured Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, Sqoop, HBase, Flume, and ZooKeeper.
- Developed ETL processes to load data into HDFS using Sqoop and to export results back to the RDBMS.
- Used Flume to collect, aggregate, and store web log data from sources such as web servers and mobile and network devices, and pushed it to HDFS.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Handled importing of data from various sources, performed transformations using Hive and MapReduce, and loaded the data into HDFS.
- Created Hive tables, loaded data, and wrote Hive queries that run internally as MapReduce jobs (see the MapReduce sketch at the end of this section).
- Analyzed business requirements against the functionality and features of NoSQL databases such as HBase and Cassandra to determine the best fit.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades of the Cloudera Hadoop distribution (CDH3 and CDH4) as required.
- Moved data between third-party systems and the Hadoop Distributed File System (HDFS) using shell commands on a cluster hosted on AWS.
- Configured Sqoop and developed scripts to extract data from MySQL into HDFS.
- Involved in a POC to import TIFF, text, and JSON files from a RabbitMQ server into HBase using Spark Streaming.
- Hands-on experience productionizing Hadoop applications: administration, configuration management, monitoring, debugging, and performance tuning.
- Involved in implementing High Availability and automatic failover to remove the NameNode as a single point of failure.
- Supported and assisted QA engineers with understanding, testing, and troubleshooting.
- Wrote build scripts using Ant and participated in the deployment of production systems.
- Developed Oozie workflows to automate loading data into HDFS and pre-processing it with Pig.
Environment: Java 7, Eclipse, Hadoop, Pig, Hive, MapReduce, HDFS, Sqoop, Flume, UNIX shell scripting, Oozie, AWS, RabbitMQ, CDH, Linux.
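For illustration only, a minimal Java MapReduce job of the kind referenced above (a sketch, not code from any of these projects): it counts requests per URL in web-server logs already landed in HDFS. The class name, log field position, and input/output paths are assumptions made for the sketch.

    // Hypothetical sketch: count requests per URL in web-server logs stored in HDFS.
    // Assumes the URL is the 7th space-separated field of each log line.
    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WebLogUrlCount {

        // Emits (url, 1) for each log line.
        public static class LogMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
            private static final LongWritable ONE = new LongWritable(1);
            private final Text url = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split(" ");
                if (fields.length > 6) {
                    url.set(fields[6]);
                    context.write(url, ONE);
                }
            }
        }

        // Sums the per-URL counts.
        public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
            @Override
            protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                    throws IOException, InterruptedException {
                long sum = 0;
                for (LongWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new LongWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "web log url count");
            job.setJarByClass(WebLogUrlCount.class);
            job.setMapperClass(LogMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));    // e.g. an HDFS log directory
            FileOutputFormat.setOutputPath(job, new Path(args[1]));  // must not already exist
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

A job like this would typically be packaged into a jar with Ant and submitted via the standard hadoop jar command, or invoked from an Oozie workflow as described above.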
Confidential - Hartford, CT
Programmer Analyst / Hadoop Developer
Responsibilities:
- Analyzed requirements and laid out the plan to execute each task.
- Involved in the end-to-end process of Hadoop cluster installation, configuration, and monitoring.
- Analyzed files obtained from claims submissions and third-party vendor data dumps using Pig Latin and Hive.
- Modified DB2 databases per client requests using SQL queries.
- Wrote optimized SQL queries to extract data from the data warehouse per business user requirements.
- Used Sqoop to move data between HDFS and MySQL on a regular basis.
- Developed Pig scripts to implement ETL transformations.
- Developed data set join scripts using Pig Latin join operations.
- Created Pig Latin scripts to sort, group, join, and filter enterprise-wide data.
- Wrote Pig user-defined functions as needed to carry out ETL tasks.
- Extensive working knowledge of partitioned tables, UDFs, and the Thrift server in Hive.
- Developed Hive UDFs to incorporate external business logic (see the UDF sketch at the end of this section).
- Imported bulk data into HBase using MapReduce programs.
- Managed and reviewed Hadoop log files to identify issues when jobs failed.
- Prepared dashboards in Tableau for data analysis.
- Recommended dashboards based on Tableau visualization features.
- Implemented advanced geographic mapping techniques, using custom images and geocoding to build spatial visualizations of non-geographic data.
- Purchased, set up, and configured a Tableau Server for data warehouse reporting.
- Ensured that claims were processed on time per the SLA.
- Analyzed claim files when batch jobs failed due to incorrectly formatted claims, and worked with the EDI team to ensure the files were resent in the correct format.
Environment: Hadoop (Pig Latin, Hive, HBase, Apache Spark, MapReduce), Cloudera Hadoop Distribution, SQL Server 2008, SQL Server Management Studio, Tableau 7/8.1, Tableau Server.
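For illustration only, a minimal Hive UDF in Java of the kind referenced above (a sketch, not code from any of these projects): it masks all but the last four characters of an identifier such as a claim or member number. The class name and masking rule are assumptions made for the sketch.

    // Hypothetical sketch: Hive UDF that masks all but the last four characters of an ID.
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    public class MaskIdUDF extends UDF {
        private final Text result = new Text();

        // Hive finds this evaluate() method by reflection when the UDF is called in a query.
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            String value = input.toString();
            int keep = Math.min(4, value.length());
            StringBuilder masked = new StringBuilder();
            for (int i = 0; i < value.length() - keep; i++) {
                masked.append('*');
            }
            masked.append(value.substring(value.length() - keep));
            result.set(masked.toString());
            return result;
        }
    }

A UDF like this is packaged into a jar and registered in a Hive session with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL queries.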
Confidential - Woonsocket, RI
Systems Engineer
Responsibilities:
- Planned, designed, and implemented application database code objects.
- Fixed abends quickly with the help of Abend-AID, Xpeditor, and similar tools.
- Analyzed the flow of JCL and COBOL jobs and ensured that no contention occurred.
- Developed SQL scripts, indexes, and complex queries.
- Performed quality assurance and testing of the SQL Server environment.
- Performed database normalization.
- Identified business requirements and provided solution options.
- Created database objects: tables, indexes, views, user-defined functions, cursors, triggers, stored procedures, constraints, and roles.
- Ensured data integrity was maintained through security and change management.
- Involved in analyzing, designing, building, and testing OLAP cubes with SSAS 2008/2012 and in adding calculations using MDX.
- Worked on SSRS reports and fine-tuned SSRS MDX queries.
- Deployed SSIS packages into production and used package configurations to externalize package properties, making packages environment-independent.
- Identified data issues and provided solution options for resolution.
- Monitored application development, execution, and maintenance.
Environment: SQL Server 2008, Business Intelligence technologies (SSIS, SSRS, SSAS), Visual Studio 2010, COBOL, VSAM, JCL, DB2.
Confidential
Systems Engineer
Responsibilities:
- As a team member, ensured that web crawling ran without issues and that data was fetched correctly.
- Ensured that adequate data was fetched from each web crawl using the Web-Harvest tool.
- Wrote new programs to crawl the latest e-commerce websites using XQuery and XPath.
- Designed and monitored UNIX jobs to upload the crawled data to Oracle.
- Generated SQL reports to identify gaps between client product pricing and competitor pricing.
- Wrote efficient SQL queries to support data analysis.
- Good understanding of SSRS report authoring; created daily, weekly, and monthly reports as required using SSRS.
Environment: SQL Server 2008/2012, Oracle 11g, TOAD 8.6.1.0, XML, HTML, Web-Harvest tool, BI tool: SSRS.