
Senior Hadoop Developer/Admin Resume


SUMMARY

  • 6 years of experience in analysis, design, development, and deployment in the IT industry.
  • 3 years of experience installing, configuring, and testing Hadoop ecosystem components.
  • Over 2 years of experience in MongoDB (versions 2.x, 3.0.0, and 3.0.5) installation, configuration, and administration.
  • 1 year of experience working on Spark SQL and Scala.
  • 2 years of experience in MS SQL Server development and administration.
  • 2 years of experience in client-side web development using technologies such as jQuery, JavaScript, JSP, JSTL, XML, HTML, DHTML, CSS, and AJAX.
  • Over 2 years of hands-on experience in cluster design, sizing, installation, and configuration of Hortonworks Data Platform 2.1 - 2.3.
  • Experience in end-to-end design, development, maintenance, and analysis of various types of applications using efficient data science methodologies and Hadoop ecosystem tools.
  • Experience in providing solution architecture for Big Data projects using the Hadoop ecosystem.
  • Experienced in setting up Hadoop clusters, performance tuning, developing logical and physical data models using Hive for analytics, data lake creation using Hive, and data load management using Sqoop.
  • Experience in Linux shell scripting.
  • Experience working with NoSQL (MongoDB) and Python.
  • Supported the development of Kafka pipelines.
  • Experience in developing applications and automated scripts leveraging MongoDB.
  • Experience in data analysis, data modeling, and data structures.
  • Extensive experience in designing MongoDB multi-sharded clusters and monitoring them with MMS.
  • Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
  • Able to assess business rules, collaborate with stakeholders, and perform source-to-target data mapping, design, and review.
  • Familiar with data architecture, including data ingestion pipeline design, Hadoop information architecture, data modeling, data mining, machine learning, and advanced data processing.
  • Coded database objects such as triggers, stored procedures, functions, and views.
  • Exposure to T-SQL programming and architecture; translated complex legacy processes into T-SQL procedures, functions, and packages.
  • Experience in designing logical and physical data models using MS Visio.
  • Knowledge of working with star schema and snowflake schema.
  • Knowledge of all phases of the software development life cycle (SDLC).
  • Excellent interpersonal skills and strong analytical, problem-solving skills with a customer-service-oriented attitude.
  • A good team player: self-motivated and dedicated in any work environment.

TECHNICAL SKILLS

Big Data Ecosystem: Hadoop, MapReduce, HDFS, HBase, Spark, Scala, Kafka, Zookeeper, Hive, Pig, Sqoop, Flume and Pivotal HD

Programming Languages: Java/J2EE, Python, Scala, C, R

Scripting Languages: JSP & Servlets, JavaScript, XML, HTML and Bash

Databases: MS SQL Server, Oracle, MongoDB (NoSQL)

IDEs & Tools: Rational Rose, Rational Team Concert, Eclipse, NetBeans, JUnit, jQuery, MQ, TOAD, SQL Developer, Microsoft Visual Studio 2008/2010, Yum, RPM

Versioning Tools: SVN, CVS, Dimensions, and MS Team Foundation Server

Scripts & Libraries: JavaScript, AngularJS, Node.js, FreeMarker, Groovy, Maven, Ant scripts, XML DTDs, XQuery, XPath, XSLT, XSDs, JAXP, SAX, and JDOM

Markup Languages: XSLT, XML, XSL, HTML5, DHTML, CSS, OO CSS, jQuery, AJAX

Operating Systems: Red Hat Linux 6.2/6.3, UNIX, Solaris, Windows 7/8

PROFESSIONAL EXPERIENCE

Confidential

Senior Hadoop Developer/Admin

Responsibilities:

  • Performed root cause analysis in the client data warehouse.
  • Analyzed system failures, identified root causes, and recommended courses of action.
  • Documented system processes and procedures for future reference.
  • Coded in Python with the NumPy, SciPy, and pandas modules. Performed statistical analysis on different data patterns to recommend the top-valued patterns with the highest accuracy in predicting the next data point (a pandas sketch of this analysis appears after this list). Used Matplotlib and Excel to generate charts, and reported to the customer with PowerPoint in weekly meetings.
  • Maintained client relationship by communicating the daily status and weekly status of the project.
  • Developed enhancements to the MongoDB architecture to improve performance and scalability. Collaborated with development teams to define and apply best practices for using MongoDB.
  • Consulted with the operations team on deploying, migrating data, monitoring, analyzing, and tuning MongoDB applications. Rolled out and administered sharded and non-sharded clusters totaling 10 TB.
  • Designed and implemented GridFS. Created Python scripts to conduct routine maintenance and deliver ad hoc reports. Monitored and tuned user-developed JavaScript.
  • Facilitated storage by identifying the need for, and subsequently developing, JavaScript to archive GridFS collections.
  • Proactively developed and implemented a Python script to report the health and metadata of a sharded cluster (see the pymongo sketch after this list).
  • Installed MongoDB on different operating systems (Linux, Windows).
  • Managed and maintained MongoDB servers across multiple environments.
  • Designed and implemented sharding and indexing strategies.
  • Monitored deployments for capacity and performance.
  • Implemented and maintained MMS (MongoDB Management Service).
  • Defined and implemented backup strategies per data retention requirements.
  • Developed and documented best practices for data migration.
  • Handled incident and problem management.
  • Developed a data pipeline using Kafka to store data in HDFS (a producer sketch follows this list).
  • Streamed data to Hadoop using Kafka.
  • Performed end-to-end integration across Kafka, NoSQL, Flume, Hadoop, and Hive.
  • Integrated Kafka with Flume in a sandbox environment using the Kafka source and Kafka sink.
  • Auto-populated HBase tables with data coming from the Kafka sink.
  • Installed and configured Hadoop MapReduce, HDFS and developed multiple MapReduce jobs in Java for data cleansing and preprocessing
  • Worked on the core and Spark SQL modules of Spark extensively.
  • Handled structured and unstructured data and applied ETL processes.
  • Managed and reviewed Hadoop log files.
  • Ran Hadoop Streaming jobs to process terabytes of data.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Proactively monitored systems and services; designed and implemented Hadoop deployments, configuration management, backups, and disaster recovery systems and procedures.
  • Performed visualization on different input data using SQL integrated with Tableau.
  • Extracted data from MongoDB through Sqoop, placed it in HDFS, and processed it.
  • Used Flume to collect, aggregate, and store web log data from sources such as web servers and mobile and network devices, and pushed it to HDFS.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Supported MapReduce programs running on the cluster.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure condition.
  • Involved in loading data from the UNIX file system to HDFS, configuring Hive, and writing Hive UDFs.
  • Used Java and MS SQL day to day to debug and fix issues with client processes.
  • Managed and reviewed log files.
  • Implemented partitioning, dynamic partitions, and bucketing in Hive (see the Spark SQL sketch after this list).
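
The pattern analysis above can be illustrated with a minimal pandas sketch. The input file, column names, and the naive one-step-ahead predictor are assumptions invented for the example, not the client's actual pipeline.

    # Minimal sketch: score each pattern column by how well its previous
    # value predicts the next one, then chart the top performers.
    # File and column names are hypothetical.
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv("patterns.csv")      # hypothetical input data
    test = df[-30:]                       # hold out the last 30 points

    scores = {}
    for col in df.columns:
        pred = test[col].shift(1).iloc[1:]    # previous value as forecast
        actual = test[col].iloc[1:]
        # R-squared-style score: 1 - normalized mean squared error
        scores[col] = 1 - ((pred - actual) ** 2).mean() / actual.var()

    top = pd.Series(scores).sort_values(ascending=False).head(5)
    top.plot(kind="bar", title="Top patterns by predictive accuracy")
    plt.savefig("top_patterns.png")       # chart for the weekly deck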
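
The shard health and metadata report mentioned above can be sketched with pymongo. The mongos host name is an assumption; the serverStatus command and the config database collections queried here are standard MongoDB interfaces.

    # Minimal sketch of a shard-cluster health/metadata report via pymongo.
    # The router address is hypothetical.
    from pymongo import MongoClient

    client = MongoClient("mongodb://mongos-host:27017")

    status = client.admin.command("serverStatus")   # basic health signal
    print("uptime (s):", status["uptime"])
    print("connections:", status["connections"]["current"])

    # Shard and database metadata live in the config database on a mongos.
    for shard in client.config.shards.find():
        print("shard:", shard["_id"], "hosts:", shard["host"])

    for db in client.config.databases.find():
        print("database:", db["_id"], "partitioned:", db.get("partitioned"))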
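
The producer side of the Kafka-to-HDFS pipeline can be sketched with the kafka-python client. The broker address, topic name, and log path are assumptions; in the setup described above, the Flume Kafka source drained the topic into HDFS.

    # Minimal sketch: feed a web log into a Kafka topic for downstream
    # delivery to HDFS. Broker, topic, and file path are hypothetical.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="broker-host:9092")

    with open("/var/log/app/access.log") as logfile:
        for line in logfile:
            producer.send("weblogs", line.encode("utf-8"))

    producer.flush()   # block until buffered records are delivered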
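
The Hive partitioning and bucketing work can be sketched through Spark SQL with Hive support enabled. The table, column, and staging names are assumptions for the example.

    # Minimal sketch: a Hive table with a date partition and user_id
    # buckets, loaded via dynamic partitioning. Names are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    spark.sql("""
        CREATE TABLE IF NOT EXISTS events (user_id STRING, action STRING)
        PARTITIONED BY (event_date STRING)
        CLUSTERED BY (user_id) INTO 32 BUCKETS
        STORED AS ORC
    """)

    # Let Hive derive partition values from the data being inserted.
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    spark.sql("""
        INSERT INTO TABLE events PARTITION (event_date)
        SELECT user_id, action, event_date FROM staging_events
    """)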

Environment: Hadoop, Pivotal HD 2.1, Eclipse, R, Kafka, Python, MapReduce, Hive, Pig, HBase, PuTTY, Sqoop, Flume, Scala, Spark, Linux, Java (JDK), Tableau, HDFS, MS SQL, and Ubuntu.

Confidential

Hadoop Developer/Administrator

Responsibilities:

  • Provided solution architecture for Big Data projects using the Hadoop ecosystem.
  • Set up Hadoop clusters, performed performance tuning, developed logical and physical data models using Hive for analytics, processed files using Pig, and managed data loads using Sqoop.
  • Installed and configured Hortonworks Data Platform 2.1 - 2.3.
  • Analyzed the Functional Specifications.
  • Installed and configured HDFS, Pig, Hive, and Hadoop MapReduce.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Wrote Pig scripts to process the data.
  • Implemented advanced procedures such as text analytics using the in-memory computing capabilities of Apache Spark, written in Scala and Python.
  • Developed and executed shell scripts to automate the jobs
  • Worked on reading multiple data formats on HDFS using PySpark
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala (a PySpark sketch follows this list).
  • Created Spark SQL queries for faster requests.
  • Developed multiple POCs using PySpark, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
  • Analyzed SQL scripts and designed solutions to implement them using PySpark.
  • Wrote Hive queries for data analysis to meet business requirements.
  • Responsible for managing data coming from different sources.
  • Involved in loading data from UNIX file system to HDFS.
  • Supported MapReduce programs running on the cluster.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Built, tuned, and maintained HiveQL and Pig scripts for user reporting.
  • Experienced in defining job flows.
  • Experienced in managing and reviewing Hadoop log files.
  • Installed and configured the Hive and Pig ecosystems.
  • Developed shell scripts to automate routine DBA tasks (e.g., database refreshes, backups, monitoring).
  • Imported and exported data between RDBMS and HDFS, Hive, and HBase using Sqoop.
  • Moved files in various formats such as TSV and CSV from RDBMS to HDFS for further processing.
  • Gathered the business requirements by coordinating and communicating with business team.
  • Prepared the documents for the mapping design and production support.
  • Wrote Apache Pig scripts to process HDFS data and send it to HBase.
  • Involved in developing Hive reports and partitioning Hive tables.
  • Moved user information from MS SQL Server to HBase using Sqoop.
  • Involved in the integration of Hive and HBase.
  • Created MapReduce jobs using Hive and Pig queries.
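
The Hive-to-Spark conversion work can be sketched as follows; the table and column names are assumptions, and the job would be submitted to the YARN cluster mentioned above.

    # Minimal sketch: the same aggregation expressed as a Hive query run
    # through Spark SQL and as a DataFrame pipeline. Names are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    # Original Hive query, executed directly by Spark SQL:
    by_sql = spark.sql(
        "SELECT region, COUNT(*) AS orders FROM sales GROUP BY region"
    )

    # Equivalent Spark transformation pipeline:
    by_df = (spark.table("sales")
                  .groupBy("region")
                  .agg(F.count("*").alias("orders")))

    by_df.write.mode("overwrite").saveAsTable("sales_by_region")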

Environment: Windows, Linux, Hortonworks 2.3, Eclipse, Kerberos, Ranger, HDFS, Pig, Hive, Sqoop, Flume, Java, JEE, Python, Spark, Kafka, HBase, and SQL Server

Confidential

MS SQL Developer/Administrator

Responsibilities:

  • Worked as a developer and administrator on MS SQL Server 2014.
  • Maintained client relationship by communicating the daily status and weekly status of the project.
  • Developed complex T-SQL code.
  • Created database objects: tables, indexes, views, user-defined functions, cursors, triggers, stored procedures, constraints, and roles.
  • Used SQL Profiler to review index performance and largely eliminate table scans.
  • Maintained table performance by following tuning practices such as normalization, creating indexes, and collecting statistics.
  • Managed and monitored the use of disk space.
  • Maintained the consistency of the client's database using DBCC.
  • Created indexes on selective columns to speed up queries and analysis in SQL Server Management Studio.
  • Implemented triggers and stored procedures and enforced business rules via checks and constraints.
  • Performed data transfers using BCP and BULK INSERT utilities.
  • Implemented mirroring and log shipping for disaster recovery.
  • Executed transactional and snapshot replication
  • Performed all aspects of database administration, including data modeling, backups and recovery.
  • Troubleshot replication problems.
  • Checked database health using DBCC commands and DMVs.
  • Generated server-side T-SQL scripts for data manipulation and validation, and created various snapshots and materialized views for remote instances.
  • Tuned stored procedures by adding TRY...CATCH blocks for error handling (see the sketch after this list).
  • Tested and optimized stored procedures and triggers for production use.
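
The TRY...CATCH tuning pattern above can be sketched as a stored procedure deployed from Python via pyodbc. The connection string, procedure, and table names are assumptions for the example.

    # Minimal sketch: create a stored procedure whose transaction is
    # wrapped in TRY...CATCH. All object names are hypothetical.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=sqlhost;DATABASE=ClientDB;Trusted_Connection=yes;"
    )
    conn.autocommit = True

    conn.execute("""
    CREATE PROCEDURE dbo.usp_ArchiveOrders @CutoffDate DATE
    AS
    BEGIN
        BEGIN TRY
            BEGIN TRANSACTION;
            DELETE FROM dbo.Orders WHERE OrderDate < @CutoffDate;
            COMMIT TRANSACTION;
        END TRY
        BEGIN CATCH
            IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
            THROW;  -- re-raise the original error to the caller
        END CATCH
    END
    """)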

Environment: MS SQL Server 2014, Business Intelligence Development Studio (BIDS), SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), SQL scripts, Linux scripting, UNIX, Windows, T-SQL, stored procedures, MS Access, MS Visio, and MS Excel

Confidential

Web Developer

Responsibilities:

  • Involved in designing user screens and validations using HTML, jQuery, Ext JS and JSP as per user requirements
  • Created front-end interfaces and Interactive user experience using HTML, CSS, and JavaScript
  • Responsible for validation of Client interface JSP pages using Struts form validations
  • Used AJAX and JavaScript for richer graphics and page rendering options.
  • Used APIs and SOAP for transferring data and information to and from other websites.
  • Worked on the Struts framework to create the web application.
  • Developed Servlets, JSP and Java Beans using Eclipse
  • Designed and developed Struts action classes for the controller responsibility.
  • Involved in the integration of Spring for implementing Dependency Injection (DI/IOC)
  • Responsible for Writing POJO, Hibernate-mapping XML Files, HQL
  • Involved with the database design and creating relational tables
  • Utilized Agile Scrum to manage full life-cycle development of the project
  • Building and Deployment of EAR, WAR, JAR files on test, stage and production servers
  • Involved with the version control and configuration management using SVN.

Environment: HTML, CSS, XML, DHTML, XHTML, DOM, POJO, HQL, SOAP, JSP, JavaScript, jQuery, AJAX, JSON, Eclipse
