Lead Hadoop Engineer/Tech Lead Resume

Beaverton, OR

SUMMARY

  • Over 15 years of multi-disciplined technology expertise in developing and delivering high-performance, scalable, reliable, and available systems that work on large amounts of data.
  • Experienced in architecting and implementing Big Data analytics solutions using distributed technologies such as Spark, Hadoop HDFS, MapReduce, Hive, Pig, and Oozie.
  • Hands-on experience with the Java, Scala, Python, and C++ programming languages.
  • Experienced in leading full life-cycle planning and delivery of projects, and in providing hands-on technical leadership to engineering teams.
  • Experienced in building and managing Scrum/Agile, globally distributed, and functionally interdependent teams.
  • Experienced in online & email data-driven campaigns with highly personalized/dynamic data and targeted messages.
  • Sun Certified Java Programmer & Enterprise Architect and Cloudera Certified Hadoop Developer.
  • Architected, designed, and developed core Java based applications for the financial trading industry, and also built JEE-based applications.
  • Built high-volume, high-performance, low-latency, scalable, multi-threaded, fault-tolerant systems.
  • Excellent object-oriented design, programming, and problem-solving skills; employed these skills to build highly extensible and maintainable systems.
  • Won an excellent-performance award from Confidential Group, GE Capital.
  • JVM fine-tuning.
  • Experienced in test-driven development and continuous integration.
  • A self-motivated professional and natural communicator possessing good technical, leadership, and problem-solving skills; a proven team player.
  • High personal and professional ethics and an aptitude to learn.
  • Experienced in migrating on-premises environments to cloud-based environments.
  • Experienced in test automation and build automation.
  • Ability to quickly ramp up and start producing results on any given tool or technology.
  • Excellent communication skills and understanding of business processes.

TECHNICAL SKILLS

Big Data Technologies: Hadoop, Hive, Pig, Java MapReduce, Apache Spark (Spark SQL, Spark Streaming, Spark ML), Machine Learning, Crunch, Cascading, Impala, Sqoop, Python streaming, Amazon EMR, Kafka, Twitter Elephant Bird, Apache DataFu.

Hadoop Distributions: Cloudera CDH 5.x/4.x, Hortonworks, Amazon EC2.

Languages: Java, Python, Scala, C, C++, Ruby, JavaScript, UML

AWS: EMR, EC2, SQS, SNS, Lambda, Machine Learning.

NoSQL: HBase, Cassandra, DynamoDB, SimpleDB.

Methodology: Waterfall, Scrum, Agile.

ORM technology: Hibernate, JPA

App/Web servers: WebLogic, Tomcat, WebSphere, Apache

Databases: Oracle, MySQL

Operating Systems: Linux, Ubuntu, Mac OS X, Windows.

Tools: Maven, Ant, JUnit, Log4j.

IDEs: Eclipse, IntelliJ, Toad

Scripting Languages: HTML, DHTML, JavaScript.

Web services: REST, SOAP

Learning: Advanced Python and Scala programming

Functional Programming: Akka for concurrent, distributed, resilient, message-driven applications; building scalable systems using AWS.

PROFESSIONAL EXPERIENCE

Lead Hadoop Engineer/Tech Lead

Confidential, Beaverton, OR

Responsibilities:

  • Provide technical leadership to the Big Data development team.
  • Design and develop data ingestion, aggregation, integration, and advanced analytics in Hadoop and Spark using Java, Scala, and Python.
  • Built continuous-integration and test-driven development environments.
  • Researched and deployed new tools, frameworks and patterns to build a sustainable big data platform
  • Built business aggregation reports by joining large datasets (a 26 TB clickstream with product and customer data) using Scala on Spark 1.3.0 in an EMR cluster; see the join sketch after this list.
  • Worked on Crunch and Cascading data pipelines and converted them into Hive and Pig data pipelines.
  • In-depth experience with Avro, Hive RCFile, Parquet, SequenceFile, and Google Protobuf for binary data storage.
  • Provided technical architecture and development leadership in implementing Hadoop and Spark in AWS on EC2 nodes.
  • Experience with Hadoop tools including Hive, Sqoop, Pig, Cascading, Crunch, and Impala.
  • Experience with both JobTracker (MRv1) and YARN, including job tuning and optimizing job processing.
  • Developed Java UDFs, UDAFs, and UDTFs for Hive processing; a sample UDF appears after this list.
  • Migrated on-premises Hadoop jobs to AWS.
  • Migrated long-running Hadoop jobs to EMR.
  • Led most of the development activities.
  • Developed Java and Python user-defined functions for Pig processes.
  • Developed a Spark Streaming job that subscribes to the Kafka message broker and ingests the data into the Hadoop cluster; see the streaming sketch after this list.
  • Developed Pig processes for ETL data pipelines.
  • Created multiple Spark prototypes and proof of concepts.
  • Involved in cluster migration.
  • Designed and led development of the job auditing process.
  • Worked with data scientists to create many POCs.
  • Implemented test automation and build automation.
  • Fine-tuned Hive queries and debugged production issues for MR jobs.
  • Design, configuration, and troubleshooting of the Big Data Hadoop solution in the customer environment.
  • Productionized Hadoop applications (administration, configuration management, monitoring, debugging, and performance tuning).
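
A minimal sketch of the kind of Spark 1.3 join/aggregation report described above. The original work was in Scala; Java is used here for consistency with the other samples, and the S3 paths, tab-delimited field layout, and report format are hypothetical placeholders:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class ClickstreamReport {
        public static void main(String[] args) {
            JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("ClickstreamReport"));

            // (productId, clickCount) aggregated from the raw clickstream
            JavaPairRDD<String, Long> clicks = sc.textFile("s3://bucket/clickstream/")
                    .mapToPair(line -> new Tuple2<>(line.split("\t")[0], 1L))
                    .reduceByKey((a, b) -> a + b);

            // (productId, productName) from the product reference data
            JavaPairRDD<String, String> products = sc.textFile("s3://bucket/products/")
                    .mapToPair(line -> {
                        String[] f = line.split("\t");
                        return new Tuple2<>(f[0], f[1]);
                    });

            // join clicks with product data and write the aggregated report
            clicks.join(products)
                  .map(t -> t._2()._2() + "\t" + t._2()._1())
                  .saveAsTextFile("s3://bucket/reports/clicks-by-product/");

            sc.stop();
        }
    }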
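
A minimal sketch of a Java Hive UDF like those mentioned above; the function name and its URL-normalizing behavior are hypothetical examples, not the actual production UDFs:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Strips the query string and lower-cases a raw clickstream URL.
    public final class NormalizeUrl extends UDF {
        public Text evaluate(Text url) {
            if (url == null) return null;
            String s = url.toString();
            int q = s.indexOf('?');
            return new Text((q >= 0 ? s.substring(0, q) : s).toLowerCase());
        }
    }

Such a UDF is registered in Hive with ADD JAR plus CREATE TEMPORARY FUNCTION normalize_url AS 'NormalizeUrl', then used like any built-in function in a query.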
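
A minimal Java sketch of the Spark Streaming ingestion path described above, using the receiver-based Kafka API from the Spark 1.x era; the ZooKeeper quorum, consumer group, topic name, batch interval, and HDFS path are hypothetical:

    import java.util.Collections;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka.KafkaUtils;

    public class ClickstreamIngest {
        public static void main(String[] args) throws InterruptedException {
            JavaStreamingContext jssc = new JavaStreamingContext(
                    new SparkConf().setAppName("ClickstreamIngest"), Durations.seconds(30));

            // receiver-based stream: (ZooKeeper quorum, consumer group, topic -> receiver threads)
            JavaPairReceiverInputDStream<String, String> stream = KafkaUtils.createStream(
                    jssc, "zk1:2181", "ingest-group",
                    Collections.singletonMap("clickstream", 1));

            // land each micro-batch in HDFS for the downstream Hive/Pig pipelines
            stream.map(t -> t._2())
                  .dstream()
                  .saveAsTextFiles("hdfs:///data/raw/clicks", "txt");

            jssc.start();
            jssc.awaitTermination();
        }
    }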

Confidential, Chicago

Responsibilities:

  • Architected and developed Big Data solutions for building various credit reports.
  • Developed Hive and Pig scripts for building data pipelines.
  • Created Hive UDFs and Hive UDAFs.
  • Used Python UDFs in Pig.
  • Created custom Java MR jobs that use custom libraries and parse JSON and XML payloads.
  • Used Avro, Google Protobuf, SequenceFile, and RCFile formats.
  • Designed an HBase schema for storing market data in time-series format; see the row-key sketch after this list.
  • Fine-tuned customer Hive queries.
  • Debugged production jobs.
  • Built a custom ingestor that subscribes to various messages from brokers and feeds the data into the Hadoop cluster.
  • Fine-tuned the JVM for reducer memory requirements.
  • Developed audit and logging processes using Java and Python.
  • Used Sqoop to export/import various tables from customer and credit databases.
  • Involved in development of a bond trading application.
  • Used the Spring Integration module for integrating messaging systems.
  • Implemented continuous integration using Git, Hudson, and Maven.
  • Automated various test cases.
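
A minimal Java sketch (old HBase client API, matching the CDH 3.x environment listed below) of the time-series row-key idea: symbol plus a reversed timestamp, so the newest tick for a symbol sorts first. The table name, column family, and values are hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class MarketDataWriter {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "market_data");

            // row key = symbol + (Long.MAX_VALUE - timestamp): newest rows sort first per symbol
            byte[] rowKey = Bytes.add(Bytes.toBytes("MSFT"),
                    Bytes.toBytes(Long.MAX_VALUE - System.currentTimeMillis()));

            Put put = new Put(rowKey);
            put.add(Bytes.toBytes("q"), Bytes.toBytes("price"), Bytes.toBytes("42.17"));
            put.add(Bytes.toBytes("q"), Bytes.toBytes("volume"), Bytes.toBytes("1200"));
            table.put(put);
            table.close();
        }
    }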

Environment: Java 1.6, Hadoop 0.20.2, HBase, Hive, HDFS, Sqoop, MapReduce programming, Pig, CDH 3.x, TIBCO RV, JMS, Spring, Hudson, Maven, Oracle, SQL Server, DB2.

Senior Consultant

Confidential, Chicago

Responsibilities:

  • Architected and designed the systems.
  • Led four engineers to develop the system.
  • Used JavaScript, AJAX, and jQuery for web page development.
  • Designed a communication-layer framework using Java, TIBCO, and Python.
  • Implemented J2EE pluggable authentication.
  • Used Hibernate as the ORM framework for object-to-database mapping; a sample mapping appears after this list.
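
A minimal sketch of the annotation-driven object-to-database mapping that Hibernate was used for; the entity, table name, and fields are hypothetical examples:

    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.Id;
    import javax.persistence.Table;

    // Maps a Java object to a relational table; Hibernate generates the SQL.
    @Entity
    @Table(name = "trade_orders")
    public class TradeOrder {
        @Id
        @GeneratedValue
        private Long id;

        private String symbol;
        private int quantity;

        // getters and setters omitted for brevity
    }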

Environment: Java, Spring Framework, JavaScript, AJAX, Oracle, Python, Rule Engine.

Confidential

Responsibilities:

  • Implemented Big Data analysis of certification logs to find business insights.
  • Developed a Java web application for certifying trading applications.
  • Developed AJAX (Web 2.0) pages to automatically refresh the status of order and market data messages.
  • Developed a scalable, highly fault-tolerant custom rule engine.
  • Used Hive and Java MapReduce to analyze certification logs.
  • Used Sqoop to export reports to an Oracle database.

Confidential

Responsibilities:

  • Developed a web application for trading administrators.
  • Used Apache Struts for the MVC architecture.
  • Used EJB as the session façade layer.
  • Used TIBCO as the communication layer to talk to various components.
  • Built a router to route messages to multiple engines.

Environment: Weblogic Application Server 9.0, J2EE (EJB, JMS, JNDI, JSP, Servlet), TIBCO RV, Sonic MQ, log4j, Sun OS, Swing, AJAX, JavaScript, CSS, HTML, DHTML, Eclipse, Oracle, TogetherSoft, ANT, CVS
