Hadoop Architect Resume
PROFESSIONAL SUMMARY:
- Over 11 years of progressive and diversified experience in all phases of software development, including the design and architecture of distributed, client/server, network-intensive, and multi-tier systems.
- Managed distributed clusters on HDP 2.6.0, AWS, and the Cloudera platform.
- Good understanding of processing large sets of structured, semi-structured, and unstructured data using the Hadoop ecosystem.
- Very good exposure to the Agile methodology of software development.
- Good knowledge of the Hadoop ecosystem, HDFS, Big Data, and RDBMS.
- Experienced in working with Big Data and the Hadoop Distributed File System (HDFS).
- Hands-on experience working with ecosystem components such as Hive, Pig, Sqoop, MapReduce, Flume, and Oozie.
- Strong knowledge of Hadoop, Hive, and Hive's analytical functions.
- Captured data from existing databases that provide SQL interfaces using Sqoop.
- Efficient in building Hive, Pig, and MapReduce scripts.
- Implemented proofs of concept on the Hadoop stack and various big data analytics tools, including migration from different databases (e.g. Teradata, Oracle) to Hadoop.
- Successfully loaded files into Hive and HDFS from HBase.
- Loaded datasets into Hive for ETL operations.
- Good knowledge of Hadoop cluster architecture and cluster monitoring.
- Hands-on experience with IDE tools such as Eclipse.
- Excellent problem-solving, analytical, communication, and interpersonal skills.
TECHNICAL SKILLS:
Big Data Technologies: Hadoop 2.1.0.2.0, HDFS 2.1.0.2.0, MapReduce 2, Hive 0.12.0.2.0, Oozie 4.0.0.2.0, Flume 1.5.2, Phoenix, Spark 1.6.3, HBase 0.96.0.2.0, YARN, Sqoop 1.4.4.2.0
Programming Skills: Core Java, Scala 2.10.6, C++, C, Python, SQL, PL/SQL, Gupta SQL, Oracle Forms 10g
Databases: MySQL, HBase, Hive, Oracle 10g
Tools and Utilities: Centura 2.1, Team Developer 5.1, CruiseControl, Clarify, Toad, PuTTY, StarTeam, Subversion, SQL Developer, EditPlus, SCCS, PVCS, CVS, Continuous, gdb, PL/SQL Developer, Eclipse, NetBeans, sbt, Maven
Operating Systems: Windows 9x, UNIX (HP-UX 11.11, Solaris 9, AIX), Linux (Red Hat 6, CentOS 6.6), MS-DOS
PROFESSIONAL EXPERIENCE:
Hadoop Architect
Confidential
Responsibilities:
- Designed and developed an ingestion service to load data from RDBMS sources and files into the Hadoop data lake.
- Prepared the solution design document and got it reviewed.
- Prepared the detailed design document and got it approved.
- Generated surrogate keys.
- Generated hash keys.
- De-duplicated the records.
- End-dated the final records (see the sketch after this list).
- Involved in coding, code reviews, unit testing, and functional testing.
- Scheduled the jobs using Autosys.
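The key-generation and history steps above can be illustrated with a minimal Spark/Scala sketch, assuming a DataFrame-based pipeline on the Spark 1.6 / Hive stack listed below; the database, table, and column names (staging_db.customer_stg, dw_db.customer_dim, cust_id, etc.) are hypothetical placeholders, not the project's actual objects.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

object IngestionPostProcess {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ingestion-post-process"))
    val hc = new HiveContext(sc)

    // Hypothetical staging table landed by the ingestion service.
    val staged = hc.table("staging_db.customer_stg")

    // Hash key over the business columns, used for change detection.
    val hashed = staged.withColumn(
      "hash_key",
      sha2(concat_ws("|", col("cust_id").cast("string"), col("name"), col("address")), 256))

    // De-duplicate: identical records (same hash key) are loaded only once.
    val deduped = hashed.dropDuplicates(Seq("hash_key"))

    // Surrogate key: row_number over an ordering, offset by the current maximum key.
    val maxKey = hc.table("dw_db.customer_dim")
      .agg(coalesce(max("customer_sk"), lit(0L))).first().getLong(0)
    val withSk = deduped.withColumn(
      "customer_sk", row_number().over(Window.orderBy("cust_id")) + lit(maxKey))

    // End-dating: close the currently open dimension rows that are superseded by an
    // incoming record, and open the incoming records with a null end date.
    val open = hc.table("dw_db.customer_dim").filter(col("end_date").isNull)
    val endDated = open.join(withSk.select("cust_id"), "cust_id")
      .withColumn("end_date", current_date())
    val opened = withSk
      .withColumn("start_date", current_date())
      .withColumn("end_date", lit(null).cast("date"))

    // Stage the closed rows for the merge back into the dimension (the merge itself
    // is outside this sketch) and append the newly opened rows.
    endDated.write.mode("overwrite").saveAsTable("dw_db.customer_dim_enddated_wrk")
    opened.write.mode("append").saveAsTable("dw_db.customer_dim")
  }
}
```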
Environment: Hadoop, HDFS, MapReduce 2, Hive 0.12.0.2.0, Scala, Phoenix, Spark 1.6.3, HBase 0.96.0.2.0, YARN, Sqoop 1.4.4.2.0, Shell Scripting, Oracle.
Hadoop Architect
Confidential, Windsor, CT
Responsibilities:
- Involved in loading data from Oracle into the Hadoop data lake.
- Created an ingestion framework to ingest the tables required for Specialty BI (see the sketch after this list).
- Involved in coding, code reviews, unit testing, and functional testing.
- Scheduled the jobs using Autosys.
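A minimal sketch of what such a table-driven ingestion framework might look like, assuming Spark's JDBC reader is used to pull Oracle tables into Hive; the connection URL, credentials handling, table list, and target schema (specialty_bi_raw) are illustrative placeholders, and the real framework's configuration and Autosys scheduling are not shown.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object SpecialtyBiIngest {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("specialty-bi-ingest"))
    val hc = new HiveContext(sc)

    // Placeholder connection details; a real framework reads these from configuration.
    val jdbcUrl = "jdbc:oracle:thin:@//oracle-host:1521/ORCL"
    val props = new java.util.Properties()
    props.setProperty("user", "etl_user")
    props.setProperty("password", sys.env.getOrElse("ORACLE_PWD", ""))
    props.setProperty("driver", "oracle.jdbc.OracleDriver")

    // Illustrative table list; the actual Specialty BI tables would come from config.
    val tables = Seq("POLICY", "CLAIM", "AGENT")

    tables.foreach { table =>
      // Pull each Oracle table and land it as a Hive table in the raw layer.
      val df = hc.read.jdbc(jdbcUrl, table, props)
      df.write.mode("overwrite").saveAsTable(s"specialty_bi_raw.${table.toLowerCase}")
    }
  }
}
```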
Environment: Hadoop, HDFS, MapReduce 2, Hive 0.12.0.2.0, Scala, Phoenix, Spark 1.6.3, HBase 0.96.0.2.0, YARN, Sqoop 1.4.4.2.0, Shell Scripting, Oracle.
Hadoop Architect
Confidential, Windsor, CT
Responsibilities:
- Involved in loading data from Oracle Exadata into the Hadoop data lake.
- Prepared the solution design document and got it reviewed.
- Prepared the detailed design document and got it approved.
- Ingested data from Exadata into Hadoop using Sqoop.
- Exported data from Hadoop back to Exadata using Sqoop (see the sketch after this list).
- Involved in coding, code reviews, unit testing, and functional testing.
- Scheduled the jobs using Crontab.
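The Sqoop import/export steps can be sketched as below, here wrapped in a small Scala driver via scala.sys.process (in practice the equivalent Sqoop commands would be run from shell scripts scheduled by crontab); the connection string, credentials file, table names, delimiter, and HDFS paths are placeholders rather than the project's actual values.

```scala
import scala.sys.process._

object ExadataSqoopJobs {
  // Placeholder Exadata connection string and HDFS password file.
  val connect = "jdbc:oracle:thin:@//exadata-host:1521/EDWSVC"
  val pwdFile = "/user/etl/.oracle.pwd"

  // Import an Oracle table into an HDFS directory.
  def importTable(table: String, targetDir: String): Int =
    Seq("sqoop", "import",
        "--connect", connect,
        "--username", "etl_user",
        "--password-file", pwdFile,
        "--table", table,
        "--target-dir", targetDir,
        "--num-mappers", "8").!

  // Export curated HDFS data back into an Oracle table.
  def exportTable(table: String, exportDir: String): Int =
    Seq("sqoop", "export",
        "--connect", connect,
        "--username", "etl_user",
        "--password-file", pwdFile,
        "--table", table,
        "--export-dir", exportDir,
        "--input-fields-terminated-by", ",",
        "--num-mappers", "8").!

  def main(args: Array[String]): Unit = {
    importTable("SALES_FACT", "/data/raw/sales_fact")
    exportTable("SALES_SUMMARY", "/data/curated/sales_summary")
  }
}
```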
Environment: Hadoop, HDFS, MapReduce 2, Hive 0.12.0.2.0, Scala, Phoenix, Spark 1.6.3, HBase 0.96.0.2.0, YARN, Sqoop 1.4.4.2.0, Shell Scripting, Oracle.
Hadoop Architect
Confidential
Responsibilities:
- Involved in loading data from SAP CRM/ISU into the Hadoop data lake.
- Prepared the solution design document and got it reviewed.
- Prepared the detailed design document and got it approved.
- Calculated deltas from the full-load files.
- Captured history from the delta files (see the sketch after this list).
- Ingested data from Teradata to Hadoop using Sqoop.
- Exported data from Hadoop to Teradata using Teradata connector.
- Involved in coding, code reviews, unit testing, and functional testing.
- Scheduled the jobs using Crontab.
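A minimal Spark/Scala sketch of the delta and history steps, assuming consecutive full-load snapshots are compared with DataFrame.except and the resulting delta is appended to a history table stamped with its load date; the paths, dates, schemas, and table names are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.functions._

object CrmDeltaLoad {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("crm-delta-load"))
    val hc = new HiveContext(sc)

    // Previous and current full-load snapshots (placeholder paths and dates).
    val prev = hc.read.parquet("/data/crm_isu/contracts/full/2017-06-01")
    val curr = hc.read.parquet("/data/crm_isu/contracts/full/2017-06-02")

    // Delta = rows present in the current snapshot but not in the previous one
    // (new and changed rows; deletes would need the reverse comparison).
    val delta = curr.except(prev)
    delta.write.mode("overwrite").parquet("/data/crm_isu/contracts/delta/2017-06-02")

    // History capture: append the delta, stamped with its load date.
    delta.withColumn("load_date", lit("2017-06-02").cast("date"))
      .write.mode("append").saveAsTable("history_db.contracts_history")
  }
}
```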
Environment: Hadoop, HDFS, MapReduce 2, Hive 0.12.0.2.0, Scala, Phoenix, Spark 1.6.3, HBase 0.96.0.2.0, YARN, Sqoop 1.4.4.2.0, Shell Scripting, Oracle.
Hadoop Architect
Confidential
Responsibilities:
- The ASV Data Fix is a BAU process run by Data Management on a weekly basis to fix discrepancies in Annual Site Visit (ASV) data between SAP CRM and WMIS.
- This is one of the major problem areas in the BGR business, causing several types of failures in the job-booking process in WMIS.
- ASV misalignment can be attributed to various causes, e.g. data inherited from legacy systems, Banner Replacement, Catalyst migration, poor design, and code errors; some of these root causes are being investigated so they can be fixed at source.
- A significant amount of time is lost running the ASV data fix, which usually requires the AMS (MI), WMIS (Apps-Hestia), and Data teams to coordinate in producing the ASV extract split into several known pots and fixing those in WMIS and SAP CRM separately.
- A requirement therefore arose to automate this data fix and reduce the manual intervention required; the work outlined the steps being carried out manually and an estimate of the effort to automate them, along with the assumptions considered.
Environment: Hadoop, HDFS, MapReduce 2, Hive 0.12.0.2.0, Scala, Phoenix, Spark 1.6.3, HBase 0.96.0.2.0, YARN, Sqoop 1.4.4.2.0, Shell Scripting, Oracle.
Team Leader
Confidential
Responsibilities:
- Both the instrument reference data feed file and the transaction feed file are subjected to a series of validation checks.
- Transaction/instrument reference data is classified as “validated” only if all field-level validations succeed.
- Transaction/instrument reference data is classified as “rejected” if one or more field-level validations fail; duplicate submissions are also rejected (see the sketch after this list).
- Reporting covers Ingestion Throughput (graph of total transaction count / total incoming feed file count versus the days of the month), Outbound Throughput (graph of the total number of outbound reports versus the days of the month), and Ingestion Breakup (percentage split of validated vs. rejected).
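A minimal Spark/Scala sketch of the validated/rejected classification, with a couple of illustrative field-level rules and a duplicate check keyed on a hypothetical submission_id; the real validation rule set, feed format, and column names differ.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

object FeedValidation {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("feed-validation"))
    val hc = new HiveContext(sc)

    // Incoming transaction feed (path and schema are placeholders).
    val feed = hc.read.json("/data/feeds/transactions/incoming")

    // Illustrative field-level validations: all must pass for a record to be "validated".
    val fieldChecks =
      col("transaction_id").isNotNull &&
      col("isin").rlike("^[A-Z]{2}[A-Z0-9]{9}[0-9]$") &&
      col("quantity").cast("double") > 0

    // Duplicate submissions (same submission_id seen more than once) are rejected:
    // only the first occurrence can be validated.
    val withDupRank = feed.withColumn(
      "dup_rank",
      row_number().over(Window.partitionBy("submission_id").orderBy(col("received_ts"))))

    val classified = withDupRank.withColumn(
      "status",
      when(fieldChecks && col("dup_rank") === 1, "validated").otherwise("rejected"))

    classified.filter(col("status") === "validated")
      .write.mode("append").saveAsTable("feeds_db.transactions_validated")
    classified.filter(col("status") === "rejected")
      .write.mode("append").saveAsTable("feeds_db.transactions_rejected")
  }
}
```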
Environment: Hadoop, HDFS, MapReduce 2, Hive 0.12.0.2.0, Scala, Phoenix, Spark 1.6.3, HBase 0.96.0.2.0, YARN, Sqoop 1.4.4.2.0
Team Leader
Confidential
Responsibilities:
- Built an end-to-end turnkey solution that captures, extracts, transforms, loads, manages, and analyzes device data.
- Used specialized machine learning algorithms for predictive analytics.
- Capitalized on out-of-the-box BI dashboards and reports to analyze and identify trends and patterns leading to device failures.
- Predicted device failures in near real time (see the sketch after this list).
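A hedged Spark MLlib sketch of the failure-prediction idea, assuming labeled device telemetry and a simple logistic-regression pipeline; the feature columns, tables, and choice of algorithm are illustrative assumptions, not the product's actual model.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.classification.LogisticRegression

object DeviceFailureModel {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("device-failure-model"))
    val hc = new HiveContext(sc)

    // Hypothetical labeled telemetry table: numeric features plus a 0/1 "failed" label.
    val telemetry = hc.table("iot_db.device_telemetry_labeled")

    // Assemble the feature columns into a single vector for the classifier.
    val assembler = new VectorAssembler()
      .setInputCols(Array("temperature", "vibration", "error_count", "uptime_hours"))
      .setOutputCol("features")

    val lr = new LogisticRegression()
      .setLabelCol("failed")
      .setFeaturesCol("features")

    val model = new Pipeline().setStages(Array(assembler, lr)).fit(telemetry)

    // Score fresh telemetry; downstream jobs would surface high-risk devices in dashboards.
    val scored = model.transform(hc.table("iot_db.device_telemetry_latest"))
    scored.select("device_id", "prediction")
      .write.mode("overwrite").saveAsTable("iot_db.device_failure_scores")
  }
}
```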
Team Leader
Confidential
Responsibilities:
- Recommended free items to customers on birthdays/anniversaries from an approved list.
- Recommended free items based on a customer's purchase history and loyalty score.
- Identified customer visit-frequency patterns and gaps in customer visits.
- Generated offers based on a customer's previous purchases and similarity to active customers (see the sketch after this list).
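A hedged Spark MLlib sketch of the purchase-history-based recommendation, using ALS collaborative filtering over implicit feedback; the tables, columns, and the idea of restricting candidates to an approved free-item list are illustrative assumptions, with the birthday/anniversary and loyalty rules applied as business filters on top.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.ml.recommendation.ALS
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

object LoyaltyRecommender {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("loyalty-recommender"))
    val hc = new HiveContext(sc)

    // Hypothetical purchase history: integer customer_id/item_id plus an implicit
    // "rating" derived from purchase counts.
    val purchases = hc.table("loyalty_db.purchase_history")

    val als = new ALS()
      .setUserCol("customer_id")
      .setItemCol("item_id")
      .setRatingCol("purchase_count")
      .setImplicitPrefs(true)
      .setRank(10)

    val model = als.fit(purchases)

    // Score only (customer, item) pairs from the approved free-item list, then keep
    // the top-scored item per customer as the birthday/anniversary offer.
    val candidates = hc.table("loyalty_db.customers").select("customer_id")
      .join(hc.table("loyalty_db.approved_free_items").select("item_id"))
    val scored = model.transform(candidates)

    val topPerCustomer = Window.partitionBy("customer_id").orderBy(col("prediction").desc)
    scored.withColumn("rank", row_number().over(topPerCustomer))
      .filter(col("rank") === 1)
      .write.mode("overwrite").saveAsTable("loyalty_db.birthday_offers")
  }
}
```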
Team Leader
Confidential
Responsibilities:
- Enhanced the functionality of the existing Confidential web portal. Confidential is the cargo division of Confidential.
- I was involved in the easy-booking project of this Confidential system; bookings can be made through the internet, intranet, and message queues.
- I worked on the enhancement part of this project, which was delivered on a lab-on-hire basis from Confidential to Confidential.