
Lead Hadoop Developer Resume


Libertyville, IL

SUMMARY

  • 9+ years of professional IT experience, including 5+ years of Hadoop experience, processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
  • Well experienced in Hadoop ecosystem components such as Hadoop, MapReduce, Cloudera, Hortonworks, Mahout, HBase, Oozie, Hive, Sqoop, Pig, and Flume.
  • Experience in using Automation tools like Chef for installing, configuring and maintaining Hadoop clusters.
  • Lead innovation by exploring, investigating, recommending, benchmarking and implementing data centric technologies for the platform.
  • Technical leadership role responsible for developing and maintaining the data warehouse and Big Data roadmap, ensuring the data architecture aligns with the business-centric roadmap and analytics capabilities.
  • Experienced in Hadoop Architect and Technical Lead roles, providing design solutions and Hadoop architectural direction.
  • Strong knowledge of Hadoop cluster connectivity and security.
  • Demonstrated ability to clearly understand users' data needs and identify how best to meet them with the application, database, and reporting resources available.
  • Strong understanding of data modeling in data warehouse environments, including star schema and snowflake schema.
  • Extended Hive and Pig core functionality by writing custom UDFs.
  • Strong understanding of relational database structure, enabling complex SQL statements that combine multiple joins and inline views.
  • Proficient in writing Structured Query Language under Microsoft SQL Server and Oracle environment.
  • Created, manipulated and interpreted reports with specific program data.
  • Determined client and practice-area needs and customized reporting systems to meet those needs.
  • Hands-on experience with HDFS, MapReduce, Pig, Hive, AWS, Zookeeper, Oozie, Hue, Sqoop, Spark, Impala, and Accumulo.
  • Good experience with general data analytics on distributed computing clusters such as Hadoop, using Apache Spark, Impala and Scala.
  • Worked on different RDBMS databases like Oracle, SQL Server, and MySQL.
  • Hands-on experience configuring and working with Flume to load data from multiple sources directly into HDFS, and transferred large datasets between Hadoop and RDBMS by implementing Sqoop.
  • Good experience with NoSQL databases such as MongoDB, Cassandra, and HBase.
  • Hands on experience developing applications on HBase and expertise with SQL, PL/SQL database concepts.
  • Excellent understanding and knowledge of ETL tools like Informatica.
  • Experience using Apache Avro to provide both a serialization format for persistent data and a wire format for communication between Hadoop nodes.
  • Extensive experience in Unix Shell Scripting.
  • Expertise in Hadoop workflow scheduling and monitoring using Oozie and Zookeeper.
  • Good Knowledge in developing MapReduce programs using Apache Crunch.
  • Strong experience as a Java Developer in Web/intranet and Client/Server technologies using Java and J2EE, including the Struts framework, MVC design patterns, JSP, Servlets, EJB, JDBC, JSTL, XML/XSLT, JavaScript, AJAX, JMS, JNDI, RDBMS, SOAP, Hibernate and custom tag libraries.
  • Supported technical team members for automation, installation and configuration tasks.
  • An excellent team player and self-starter with good communication and interpersonal skills and proven abilities to finish tasks before target deadlines.

TECHNICAL SKILLS

Big Data: Apache Hadoop, Cloudera, Hive, HBase, Sqoop, Flume, Apache Spark, Pig, HDFS, MapReduce, Oozie, Scala, Impala, Cassandra, Zookeeper, Apache Kafka, Accumulo and Apache Storm.

Databases: Oracle, MySQL, MS SQL Server, MS Access; T-SQL, PL/SQL, SSIS, SSRS

Programming Languages: C/C++, Java, Python.

Java Technologies: Java, J2EE, JDBC, JSP, Java Servlets, JMS, Junit, Log4j.

IDE/Development Tools: Eclipse, NetBeans, MyEclipse, SOAP UI, Ant.

Operating Systems: Windows, Mac, Unix, Linux.

Frameworks: Struts, Hibernate, Spring.

PROFESSIONAL EXPERIENCE

Confidential, Libertyville, IL

Lead Hadoop Developer

Responsibilities:

  • Worked on a Hadoop cluster that ranged from 4 to 8 nodes during pre-production and was at times extended to 24 nodes during production.
  • Built APIs that allow customer service representatives to access the data and answer queries.
  • Designed changes to transform current Hadoop jobs to HBase.
  • Handled fixing of defects efficiently and worked with the QA and BA team for clarifications.
  • Responsible for Cluster maintenance, Monitoring, commissioning and decommissioning Data nodes, Troubleshooting, Manage and review data backups, Manage & review log files.
  • Extended the functionality of Hive and Pig with custom UDFs and UDAFs.
  • Developed Spark Application by using Scala.
  • The new Business Data Warehouse (BDW) improved query/report performance, reduced the time needed to develop reports and established self-service reporting model in Cognos for business users.
  • Implemented partitioning, dynamic partitions, and bucketing in Hive to assist users with data analysis.
  • Used Oozie scripts for deployment of the application and Perforce as the secure versioning software.
  • Extracted large volumes of data feed from different data sources, performed transformations and loaded the data into various Targets.
  • Developed database management systems for easy access, storage and retrieval of data.
  • Performed DB activities such as indexing, performance tuning, and backup and restore.
  • Used Sqoop to import data from RDBMS into the Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop components.
  • Wrote Hadoop jobs for analyzing data using HiveQL queries, Pig Latin (a data flow language), and custom MapReduce programs in Java.
  • Performed various optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
  • Created Pig and Hive UDFs in Java to analyze the data efficiently.
  • Responsible for loading the data from BDW Oracle database, Teradata into HDFS using Sqoop.
  • Implemented AJAX, JSON, and JavaScript to create interactive web screens.
  • Wrote data ingestion systems to pull data from traditional RDBMS platforms such as Oracle and Teradata and store it in NoSQL databases such as MongoDB.
  • Involved in creating Hive tables and applying HiveQL queries on them, which automatically invoke and run MapReduce jobs.
  • Supported applications running on Linux machines.
  • Developed data-driven web applications and deployed scripts using HTML5, XHTML, CSS and client-side scripting with JavaScript.
  • Involved in loading and transforming large sets of structured, semi-structured and unstructured data, and analyzed them by running Hive queries and Pig scripts.
  • Participated in requirement gathering from subject-matter experts and business partners, converting the requirements into technical specifications.
  • Used Zookeeper to manage coordination among the clusters.
  • Analyzed the Cassandra database and compared it with other open-source NoSQL databases to find which one best suits the current requirements.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs that run independently, triggered by time and data availability.
  • Assisted application teams in installing Hadoop updates, operating system patches and version upgrades when required.
  • Assisted in Cluster maintenance, Cluster Monitoring and Troubleshooting, Manage and review data backups and log files.
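The Hive bucketing used in this project distributes rows into a fixed number of files by hashing the bucketed column, which co-locates equal keys and is what map-side joins rely on. A minimal Python sketch of that bucket-assignment logic (the function name and sample keys are illustrative, not from the project; the hash mimics Java's String.hashCode as Hive uses for string columns):

```python
def bucket_for(key: str, num_buckets: int) -> int:
    """Return the bucket index for a key, mimicking Hive's hash-mod bucketing."""
    # Java-style 32-bit string hashCode, so results are reproducible
    # across runs (unlike Python's salted built-in hash()).
    h = 0
    for ch in key:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Hive takes (hash & Integer.MAX_VALUE) % numBuckets to get a
    # non-negative bucket number.
    return (h & 0x7FFFFFFF) % num_buckets

# Equal keys always land in the same bucket, so a join on the bucketed
# column only needs to match corresponding buckets from each table.
rows = ["user_1001", "user_1002", "user_1003", "user_1001"]
buckets = [bucket_for(r, 4) for r in rows]
```

Because the assignment is a pure function of the key, both tables in a bucketed join partition their data identically, letting the join run map-side without a shuffle.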

Environment: Apache Hadoop 2.0.0, Pig 0.11, Hive 0.10, Sqoop 1.4.3, Flume, MapReduce, JSP, Struts 2.0, NoSQL, HDFS, Teradata, Linux, Oozie, Cassandra, Hue, HCatalog, Java, IBM Cognos, Oracle 11g/10g, Microsoft SQL Server, Microsoft SSIS, DB2 LUW, TOAD for DB2, IBM Data Studio, AIX 6.1, UNIX Scripting

Confidential, Seattle, WA

Hadoop Developer

Responsibilities:

  • Added and installed new components, and removed them, through Ambari.
  • Architecture design and implementation of deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Responsible for importing log files from various sources into HDFS using Flume.
  • Handled Big Data using a Hadoop cluster of 40 nodes.
  • Performed complex HiveQL queries on Hive tables.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Created technical designs, data models and data migration strategies; created dimensional data models and data marts.
  • Designed, built and maintained logical and physical databases, dimensional data models, ETL layer designs and data integration strategies.
  • Created final tables in Parquet format.
  • Developed PIG scripts for source data validation and transformation.
  • Developed Shell, Perl and Python scripts to automate and provide control flow to Pig scripts.
  • Developed a NoSQL database using CRUD, indexing, replication and sharding in MongoDB.
  • Experience using the Talend administration console to promote and schedule jobs.
  • Extracted and updated data in MongoDB using the mongoimport and mongoexport command-line utilities.
  • Involved in unit testing of MapReduce jobs using MRUnit.
  • Utilized Hive and Pig to create BI reports.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig, and Sqoop.
  • Worked with Informatica MDM in creating single view of the data.
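An Oozie workflow like those above expresses jobs as a dependency graph and launches each action only after everything upstream of it has finished. A minimal Python sketch of that ordering using the standard library's topological sorter (the job names and dependencies are illustrative, not from the project):

```python
# Sketch of the dependency ordering an Oozie workflow enforces:
# each action runs only after the actions it depends on complete.
from graphlib import TopologicalSorter

# action -> set of upstream actions it depends on (hypothetical names)
workflow = {
    "sqoop_import": set(),
    "pig_cleanse": {"sqoop_import"},
    "hive_load": {"pig_cleanse"},
    "hive_report": {"hive_load"},
}

# static_order() yields a valid execution order; a scheduler would
# launch each action here (shell-out to hive/pig/sqoop) in this order.
order = list(TopologicalSorter(workflow).static_order())
```

In a real Oozie deployment the same graph is declared in workflow.xml and the Oozie server handles launching, retries, and time/data-availability triggers; the sketch only shows the ordering guarantee.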

Environment: Cloudera, Hadoop, HDFS, Pig, Hive, MapReduce, Java (JDK 1.7), Flume, Informatica, Oozie, Linux/Unix Shell scripting, Avro, MongoDB, Python, Perl, Git, Maven, SAP BW, Cognos, Jenkins.

Confidential

Java Developer

Responsibilities:

  • Involved in analyzing and gathering requirements and user specifications from business analysts.
  • Involved in creating use case, class, sequence, package dependency diagrams using UML.
  • Involved in Database Design by creating Data Flow Diagram (Process Model) and ER Diagram (Data Model).
  • Used JavaScript for certain form validations, submissions and other client side operations.
  • Created Stateless Session Beans to communicate with the client. Created Connection Pools and Data Sources.
  • Implemented and supported the project through development, Unit testing phase into production environment.
  • Designed the database and coded SQL, PL/SQL, triggers and views using IBM DB2.
  • Deployed Server-side common utilities for the application and the front-end dynamic web pages using Servlets, JSP and custom tag libraries, JavaScript, HTML/DHTML and CSS.

Environment: Java 5.0, J2EE, JSP, HTML/DHTML, CSS, JavaScript DB2, Windows XP, Struts Framework, Eclipse IDE, Web Logic Server, SQL, PL/SQL.
