Hadoop Engineer Resume
CA
SUMMARY
- 26 years of experience in software engineering, architecting and delivering diverse applications/frameworks including client/server, web, distributed, service-oriented, real-time and native iPhone.
- Administered a dozen Hadoop clusters averaging about 500 nodes per cluster.
- Analyzed telephony Call Detail Record logs generated by an IVR system using MapReduce, obtaining metrics that were not available before.
- Architected an ETL solution that enabled genomic data to be bulk loaded into HBase, thus facilitating protein comparison to arrive at a score indicative of the degree of match.
- Pioneered a machine learning algorithm to build a prediction model that predicts the state of a semiconductor chip in the manufacturing process.
- Worked extensively on both Cloudera and Intel Distribution of Hadoop. Familiar with MapR distribution and trained on Hortonworks Data Platform.
- Enabled Intel Distribution of Hadoop on Amazon EC2.
- Extensive hands-on experience in heterogeneous environments: C++/C 12 years, Java 4 years, Objective-C 3 years.
- As an MES Architect in the R&D department, I analyzed manufacturing data to improve solar panel efficiency to 18%.
- As UNIX administrator, I was instrumental in automating access to applications on Mainframe.
- Provided leadership and technical direction in 2 out of 4 iOS projects.
- Provided software consultancy services across multiple industry domains such as Manufacturing, Energy, Healthcare, Security and Social Network, for clients Yahoo, Apple, Cisco, Microsoft, Intel, Unisys, Philips and First Solar.
- Worked in companies of all sizes, from startups to Fortune 500.
- References / recommendations publicly available on LinkedIn.
- Contributed code to GitHub.
TECHNICAL SKILLS
OS: Linux, Mac OS, Windows, iOS
BIG DATA: Hadoop, MapReduce, HBase, Hive, Pig, Cassandra, Spark, Kafka
SCRIPT: Bash, awk, sed, Expect
STAT TOOLS: Octave, Python, MLlib, Excel
LANGUAGE: C, C++, Objective-C, Java
PROFESSIONAL EXPERIENCE
Hadoop Engineer
Confidential, CA
Technology: CDH, bash, awk, sed, Expect, Python, Salt, Spark, Kafka, Oracle, Fabric, Java
Responsibilities:
- My responsibilities include stabilizing and growing the Hadoop ecosystem; researching, evaluating and introducing new technology; onboarding and educating customers; tuning clusters for performance; supporting infrastructure; and automating processes by writing scripts.
- Managed 4 production clusters averaging about 500 nodes per cluster. Performed cluster creation, authentication enablement, troubleshooting, cluster balancing, file system checks, and monitoring of disk/CPU/memory/network utilization.
- Upgraded and configured migration to Cloudera Manager. Performed capacity planning, backup, data compression, node commissioning, benchmarking, and scheduling for resource utilization. Analyzed job failures and tuned Hive performance.
- Analyzed millions of documents using MapReduce algorithms, unearthing hidden patterns such as frequently occurring pairs, to enhance metadata for easier classification and search. Used MarkLogic Connector for Hadoop for XML integration.
- Designed an ETL solution that enabled genome sequencing data to be bulk loaded into HBase, exceeding the required SLA. This was accomplished by bypassing the typical write path to get the data online into a running cluster. Data was prepared in HBase's internal file format such that each file fitted within a single region. Partitioning into disjoint ranges of the key space was accomplished with TotalOrderPartitioner.
- Pioneered a system for performing experiments with machine learning algorithms to build prediction models. The system allows for flexible test data retrieval, performing pattern extraction (based on features defined on wafer geometry) and applying the resulting training data to create prediction models. Predicted test results with greater accuracy, reducing tester and equipment time.
- Interacted with various clients to gather requirements, perform feasibility studies and check data sources for Call Detail Records from mobile operators and SMS aggregators. Accomplished the task by writing a MapReduce program and publishing the reduced data.
