Hadoop Developer Resume
Pleasanton, CA
SUMMARY:
- Business-driven, results-oriented technologist with 10+ years of IT experience, including 3+ years of Big Data experience and in-depth business knowledge of the healthcare domain.
- Passionate about capturing the complex processing needs of big data, with experience developing code and modules to address those needs.
TECHNICAL SKILLS:
Hadoop Ecosystem: MapReduce, HDFS, Pig, Hive, Impala, Sqoop, Oozie, Spark
Hadoop Distributions: Cloudera, Hortonworks, Apache open source, Amazon EMR
Languages: SQL, HiveQL, Pig Latin, Java, COBOL, Scala, Python
Scripting Languages: Shell script, JavaScript, HTML
Operating Systems: Red Hat Linux, SCO UNIX, z/OS/MVS, Guardian 90, Windows
Tools: Informatica PowerCenter 9.5.1, SQL Developer, WinSCP, Rational ClearCase, Eclipse IDE, Apache Ant
Databases: Oracle 10g/9i/8i, SQL Server 2005/2008, DB2
Reporting Tools: Pentaho 5.2, Tableau 9.0
Version Control Tools: Stash, Git, Tortoise SVN
PROFESSIONAL EXPERIENCE:
Confidential, Pleasanton CA
Hadoop Developer
Responsibilities:
- Coordinated with business users and the BI team to understand requirements and build the database to meet their various reporting needs.
- Ingested data into HDFS from Oracle using Sqoop, and ingested log files of HL7 and NCPDP messages using Flume (see the ingestion sketch after this list).
- Loaded data from the edge node to HDFS using shell scripts.
- Developed MapReduce programs to parse the raw data, populate staging tables, and store the refined data in partitioned tables.
- Developed Pig scripts to transform the raw data into meaningful information per business requirements.
- Created several external and managed tables in Hive to handle the initial and incremental loads, and partitioned tables to store the data per reporting needs.
- Generated several Impala scripts to validate the data stored in Hive and to evaluate the performance of the database design and of report generation for end users.
- Used Avro and Parquet as file storage formats to save disk space.
- Wrote end-to-end Oozie workflows with Sqoop and Hive actions, and scheduled the jobs to run independently on both time and data availability (see the scheduling sketch below).
- Tuned Hive and Pig scripts to improve performance.
- Extensively used Cloudera Hue across multiple environments (Dev, QA, and Prod) for validation, browsing job logs, file system transfers, etc.
- Used Pentaho 5.2 and Tableau 9.0 to build reports for various business users.
- Attended daily stand-up meetings and provided project status in weekly meetings.
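The following is a minimal sketch of the kind of Sqoop ingestion and partitioned Hive load described above; the connection string, credentials, table, and paths are hypothetical placeholders, not the actual configuration.

```sh
#!/bin/bash
# Hypothetical daily Sqoop import from Oracle into HDFS, followed by
# registering the new partition in an external Hive table.
LOAD_DATE=$(date +%Y-%m-%d)

sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username etl_user \
  --password-file /user/etl/.ora_pwd \
  --table PHARMACY_CLAIMS \
  --target-dir /data/staging/pharmacy_claims/load_date=${LOAD_DATE} \
  --as-parquetfile \
  --num-mappers 4

# Make the freshly loaded data visible to Hive (and Impala) queries.
hive -e "ALTER TABLE pharmacy_claims ADD IF NOT EXISTS
  PARTITION (load_date='${LOAD_DATE}')
  LOCATION '/data/staging/pharmacy_claims/load_date=${LOAD_DATE}';"
```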
Major technologies: HDFS, MapReduce, Sqoop, Flume, Hive, Impala, Oozie, Cloudera Hue, Linux, Java, Tableau and Pentaho
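As referenced above, a hedged sketch of how such a workflow might be scheduled through the Oozie CLI; the hosts, paths, and coordinator settings are assumptions.

```sh
# Hypothetical coordinator properties: the coordinator fires on a schedule
# and waits on input-data availability before launching the workflow's
# Sqoop and Hive actions.
cat > job.properties <<'EOF'
nameNode=hdfs://nn-host:8020
jobTracker=rm-host:8032
oozie.coord.application.path=hdfs://nn-host:8020/apps/pharmacy/coordinator.xml
start=2015-01-01T01:00Z
end=2016-01-01T01:00Z
EOF

# Submit and start the coordinator job.
oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run
```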
Confidential, Pleasanton CA
Hadoop/ETL Developer
Responsibilities:
- Designed and developed the repository for the Pharmacy Enterprise Reporting Consolidation project on the Hadoop stack, providing business users with reporting and statistical analysis of live data from the legacy systems.
- Developed various UNIX shell scripts to SFTP data from different environments upon file arrival and to schedule and run the extraction and load process (see the sketch after this list).
- Assisted in fixing major data issues that surfaced in production and achieved the on-schedule cutover of all stores across the enterprise.
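A hedged sketch of the file-arrival/SFTP pattern mentioned above; the host, directories, and marker-file convention are assumptions.

```sh
#!/bin/bash
# Pull extracts over SFTP once the upstream system signals completion,
# then stage them into HDFS for the load process. All names hypothetical.
SRC=etl@legacy-host
SRC_DIR=/outbound/pharmacy
LANDING=/data/landing/pharmacy

# Proceed only when the "done" marker file has arrived upstream.
if echo "ls ${SRC_DIR}/extract.done" | sftp -b - "${SRC}" >/dev/null 2>&1; then
  echo "get ${SRC_DIR}/extract_*.dat ${LANDING}/" | sftp -b - "${SRC}"
  hdfs dfs -put -f "${LANDING}"/extract_*.dat /data/staging/pharmacy/
fi
```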
Major technologies: HDFS, MapReduce, Sqoop, Hive, Cloudera Hue, Linux, Informatica PowerCenter 9.5.1, Repository Manager, Mapping Designer, Workflow Manager, Workflow Monitor, Oracle, UNIX.
Confidential, Pleasanton CA
Developer/Consultant
Responsibilities:
- Incorporated pre-processing and parallel processing to update year-end benefits for over 10 million members, avoiding contention and system-hang issues when pulling member details at the front end.
- Designed and developed programs for claim adjudication with different PBMs, based on the type of member insurance, using TCP/IP connectivity to comply with the Affordable Care Act (ACA).
- Analyzed and fixed several critical issues impacting patient-care functionality, and developed many alerting scripts (see the sketch after this list) that reduced the number of critical incidents.
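A minimal sketch of the kind of alerting script described above; the log location, error pattern, and recipient address are hypothetical.

```sh
#!/bin/bash
# Scan the adjudication log for errors and notify the on-call team.
LOG=/var/log/claims/adjudication.log
ALERT_TO=oncall-team@example.com

ERRORS=$(grep -c 'ERROR' "${LOG}")
if [ "${ERRORS}" -gt 0 ]; then
  # Send the log as the message body so the team can triage immediately.
  mail -s "Claim adjudication: ${ERRORS} errors found" "${ALERT_TO}" < "${LOG}"
fi
```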
Major technologies: Core Java, UNIX, Oracle, Eclipse, Tortoise SVN, Apache Ant, XML, WebLogic, Remedy.
Confidential
Java Developer
Responsibilities:
- Involved in various stages of the project life cycle, primarily design, implementation, testing, deployment, and enhancement of the application.
- Accomplished a major data migration.
Major technologies: HTML, CSS, JavaScript, JSP, IBM WAS 7.5, Oracle 10g.
Confidential
Mainframe Developer
Responsibilities:
- Developed several new processes to reflect new categories of alterations made in the database and to auto-print reports for day-to-day operations, eliminating manual work for end users.
- Merged 100+ functions from the MCOS table into the MRMS table, reducing the separate maintenance cost of the MCOS application.
Major technologies: Java, IBM Mainframe, Tandem Guardian 90, JCL, COBOL, DB2, VSAM, CICS, Expeditor, ChangeMan, TACL, etc.