
Big Data / Hadoop Developer Resume


Warren, NJ

SUMMARY

  • Around 8 years of overall IT experience across a variety of industries, including 3 years of hands-on experience with Big Data technologies and designing and implementing MapReduce applications.
  • Well versed in installing, configuring, supporting and managing Hadoop clusters and their underlying Big Data infrastructure.
  • In-depth knowledge of Hadoop architecture and its components, including YARN, HDFS, NameNode, DataNode, JobTracker, ApplicationMaster, ResourceManager, TaskTracker and the MapReduce programming paradigm.
  • Hands-on experience with major components of the Hadoop ecosystem, including Hadoop MapReduce, HDFS, Hive, Pig, Pentaho, HBase, ZooKeeper, Sqoop, Oozie, Cassandra, Flume and Avro.
  • Capable of processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
  • Experienced in building massively scalable, multi-threaded applications for large-scale data processing, primarily with Apache Spark and Pig on Hadoop.
  • Knowledge and experience in job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
  • Strong experience in importing and exporting data between HDFS and relational database systems (RDBMS) using Sqoop.
  • Experience in Object-Oriented Analysis and Design (OOAD) and software development using Java, Scala and Python on UNIX platforms.
  • Extensive use of core Java Collections, Generics, Exception Handling and Design Patterns.
  • Experience in database design; used Oracle PL/SQL to write stored procedures, functions and triggers, with strong experience writing complex Oracle queries.
  • Experience in installing, configuring, managing, supporting and monitoring Hadoop clusters using distributions such as Apache and Cloudera.
  • Experience writing MapReduce programs with custom logic based on requirements (a minimal Hadoop Streaming sketch follows this list).
  • Experience writing custom UDFs in Pig and Hive based on user requirements.
  • Experience in storing and processing unstructured data using NoSQL databases such as HBase, Cassandra and MongoDB.
  • Good knowledge of the analysis, design, development and administration of web applications using Documentum.
  • Reviewed business requirements, developed dashboards and built visualizations in QlikView.
  • Experience in writing Python scripts to extract data from HTML files.
  • Wrote Hive queries for data analysis and to prepare data for visualization.
  • Experience in managing and reviewing Hadoop log files.
  • Good knowledge of Scala functional programming concepts.
  • Very good experience with the complete project life cycle (design, development, testing and implementation) of client-server and web applications.
  • Excellent analytical abilities, strong technical skills and the ability to learn new technologies quickly.
  • Energetic, enthusiastic and hardworking; a results-oriented team player focused on achieving career goals and enterprise objectives.
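The custom-logic MapReduce work above was typically done through Hadoop Streaming. Below is a minimal sketch of that pattern in Python; the input layout (user_id, bytes_used) and all names are hypothetical, not taken from the actual projects.

```python
#!/usr/bin/env python
"""Hadoop Streaming job sketch: per-user data-usage totals.

Run as the mapper with `usage_trend.py map` and as the reducer with
`usage_trend.py reduce`. The input layout (user_id <TAB> bytes_used)
is a hypothetical example, not a real production schema.
"""
import sys


def mapper():
    # Emit user_id -> bytes_used for every well-formed record.
    for line in sys.stdin:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2 and fields[1].isdigit():
            print("%s\t%s" % (fields[0], fields[1]))


def reducer():
    # Streaming delivers keys sorted, so totals accumulate per run of equal keys.
    current, total = None, 0
    for line in sys.stdin:
        user_id, bytes_used = line.rstrip("\n").split("\t")
        if user_id != current and current is not None:
            print("%s\t%d" % (current, total))
            total = 0
        current = user_id
        total += int(bytes_used)
    if current is not None:
        print("%s\t%d" % (current, total))


if __name__ == "__main__":
    mapper() if sys.argv[1:] == ["map"] else reducer()
```

Both phases live in one file and are selected by an argument, so the script can be shipped once with the standard streaming jar, e.g. hadoop jar hadoop-streaming.jar -files usage_trend.py -mapper "usage_trend.py map" -reducer "usage_trend.py reduce" -input /logs -output /trends (paths hypothetical).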

TECHNICAL SKILLS

Big Data: HDFS, MapReduce, Pig, Hive, Impala, HBase, Cassandra, MongoDB, Mahout, Sqoop, Oozie, ZooKeeper, Flume, Spark, YARN, Falcon, Scala.

Programming Languages: Java (Core Java, J2EE), Python, Groovy, Swift, PHP

Web Technologies: jQuery, HTML5, CSS, AJAX, JavaScript, AngularJS, HTML, XML, XSLT, XPath, VBScript

Java Frameworks: Hibernate, Spring, JSP, Servlets, JavaBeans, JDBC, EJB 3.0

IDE Tools: Eclipse, NetBeans

Tools: Photoshop, Tableau, QlikView, Documentum, Enterprise Content Management, Multisim, MATLAB, Bizagi Modeler, Simul8, UML

Databases: Oracle, MySQL, SQL Server, DB2, Sybase, Cassandra, HBase and MongoDB

Operating Systems: Windows, UNIX, Linux and Mac OS

Other Tools: PuTTY, WinSCP, StreamWeaver.

PROFESSIONAL EXPERIENCE

Confidential, Warren, NJ

Big Data / Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Developed job processing scripts using Oozie workflows.
  • Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Used Spark SQL to extract data from and write data back to HBase (NoSQL).
  • Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
  • Involved in Hadoop cluster tasks such as commissioning and decommissioning nodes without affecting running jobs or data.
  • Wrote MapReduce jobs in Python to discover trends in data usage by users.
  • Involved in running Hadoop streaming jobs to process terabytes of text data.
  • Analyzed large data sets by running Hive queries and Pig scripts.
  • Helped the team to increase the Cluster size from 22 to 30 Nodes.
  • Managed jobs using the Fair Scheduler.
  • Worked extensively with Sqoop for importing metadata from Oracle.
  • Involved in creating Hive tables and loading and analyzing data using Hive queries.
  • Designed, developed and maintained data integration programs in a Hadoop and RDBMS environment, working with both traditional and non-traditional source systems as well as RDBMS and NoSQL data stores for data access and analysis. Experienced in running Hadoop streaming jobs to process terabytes of XML data.
  • Gained good working knowledge of Scala.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Used Spark in three distinct workloads: pipelines, iterative processing and research.
  • Responsible for managing data coming from different sources.
  • Assisted in exporting analyzed data to relational databases using Sqoop.
  • Wrote Hive queries and UDFs.
  • Developed Hive queries to process data and generate data cubes for visualization.
  • Created Pig Latin scripts to sort, group, join and filter enterprise-wide data.
  • Implemented partitioning, dynamic partitions and buckets in Hive (see the PySpark sketch after this list).
  • Gained experience in managing and reviewing Hadoop log files.
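As referenced in the partitioning bullet above, here is a minimal PySpark sketch of a dynamic-partition insert into a partitioned, bucketed Hive table. The table and column names are hypothetical, and it assumes a Hive-enabled Spark build with a reachable metastore.

```python
from pyspark.sql import SparkSession

# Hive-enabled Spark session; assumes submission to a cluster with a Hive metastore.
spark = (SparkSession.builder
         .appName("hive-dynamic-partitions")
         .enableHiveSupport()
         .getOrCreate())

# Allow fully dynamic partition inserts (Hive defaults to strict mode).
spark.sql("SET hive.exec.dynamic.partition=true")
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

# Hypothetical partitioned, bucketed target table.
spark.sql("""
    CREATE TABLE IF NOT EXISTS txn_by_day (
        account_id STRING,
        amount     DOUBLE
    )
    PARTITIONED BY (txn_date STRING)
    CLUSTERED BY (account_id) INTO 32 BUCKETS
    STORED AS ORC
""")

# The partition column comes last in the SELECT, so Hive creates
# one partition per distinct txn_date found in the source table.
spark.sql("""
    INSERT OVERWRITE TABLE txn_by_day PARTITION (txn_date)
    SELECT account_id, amount, txn_date FROM staging_txns
""")
spark.stop()
```

Whether the insert also honors the bucketing clause varies by Spark/Hive version; the DDL above only declares it.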

Environment: Hadoop ecosystem (Hortonworks), MongoDB, ZooKeeper, Spark, Scala, Python, MapReduce, Sqoop, HDFS, Hive, Pig, Oozie, Kafka, Cassandra, Elasticsearch, Oracle 10g, MySQL, QlikView.

Confidential - Jersey City, NJ

Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Loaded customer profile, customer spending and credit data from legacy warehouses onto HDFS using Sqoop.
  • Built data pipelines using Pig and Python MapReduce to store data on HDFS.
  • Used Oozie to orchestrate the MapReduce jobs that extract the data on a schedule.
  • Applied transformations and filtered traffic data using Pig.
  • Performed unit testing using MRUnit.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
  • Set up and benchmarked Hadoop/HBase clusters for internal use.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
  • Analyzed the data by running Hive queries and Pig scripts to study customer behavior.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Evaluated ETL applications to support overall performance and improvement opportunities.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Responsible for writing Hive Queries for analyzing data in Hive warehouse using Hive Query Language (HQL).
  • Gained good working knowledge of Scala.
  • Wrote Python scripts for Hadoop streaming MapReduce jobs.
  • Supported data analysts in running Pig and Hive queries.
  • Imported and exported data between MySQL/Oracle and Hive using Sqoop.
  • Imported and exported data between MySQL/Oracle and HDFS using Sqoop (see the Sqoop wrapper sketch after this list).
  • Responsible for defining the data flow within the Hadoop ecosystem and directing the team in implementing it.
  • Exported result sets from Hive to MySQL using shell scripts.
  • Developed Hive queries for the analysts.
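A minimal sketch of how the recurring Sqoop imports above can be scripted from Python; the JDBC URL, credentials and table names are placeholders, not the actual environment.

```python
import subprocess

# Placeholder connection settings -- not the actual environment.
JDBC_URL = "jdbc:mysql://dbhost:3306/sales"
TABLES = ["customers", "transactions"]


def sqoop_import(table):
    """Import one MySQL table into HDFS as tab-separated text."""
    cmd = [
        "sqoop", "import",
        "--connect", JDBC_URL,
        "--username", "etl_user",
        "--password-file", "/user/etl/.sqoop_pw",  # keeps the password out of ps output
        "--table", table,
        "--target-dir", "/data/raw/%s" % table,
        "--fields-terminated-by", "\t",
        "--num-mappers", "4",
    ]
    subprocess.check_call(cmd)


if __name__ == "__main__":
    for table in TABLES:
        sqoop_import(table)
```

The same wrapper pattern works in reverse with sqoop export for pushing Hive results back to MySQL.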

Environment: Core Java, Hadoop, Hive, ZooKeeper, MapReduce, Sqoop, Pig 0.10 and 0.11, JDK 1.6, HDFS, Flume, Oozie, DB2, HBase, Mahout, Python, PL/SQL and SQL.

Confidential, NY

Software Engineer

Responsibilities:

  • Responsible for the design, architecture, coding and unit testing of a Tableau dashboard that presents investment patterns in a pictorial view.
  • Designed and developed the application user interface as part of the presentation layer.
  • The components include the user interface for first-level authentication and for service authorization based on the line of business.
  • Worked on application design, database design and analysis.
  • Constructed the Hadoop interface providing interconnectivity between Tableau and the Hive database (a minimal Hive-connection sketch follows this list).
  • Responsible for designing the application layer as well as the table schemas for predictive analysis.
  • Designed wrapper classes for the processing components to facilitate big-data loading and data testing against the Hive database.
  • Developed separate modules to create, modify and delete data feeds for the automatic investment services (AIS).
  • Documented the end-to-end process enabling institutional clients to perform predictive analysis through the developed user interface for low-risk business decisions.
  • Provided a facility for admin users to check data correctness; these workflows were automated with cron jobs to load the data with proper date and time management.
  • Gave admin users the provision to view entire agreements as Excel sheets or PDFs.
  • Built the component that logs application events for AIS to analyze performance. This component had to be tuned carefully, since system administrators use it directly to manage resources, roles and role mappings (commonly known as entitlements); the tuning was done with the help of access-control facilities and SiteMinder.
  • Set up the infrastructure for AIS in different environments: Development, User Acceptance Testing (UAT), Data Center Acceptance Testing (DCAT) and Production (PROD).
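A minimal sketch of the Tableau-to-Hive connectivity layer described above, using PyHive against HiveServer2. The client library, host, database and query are assumptions; the resume does not state which driver was actually used.

```python
from pyhive import hive  # assumes PyHive; the actual client library is not stated

# Placeholder connection details for a HiveServer2 endpoint.
conn = hive.Connection(host="hive-gateway", port=10000,
                       username="dashboard_svc", database="investments")
cursor = conn.cursor()

# Hypothetical aggregate feeding the dashboard's investment-pattern view.
cursor.execute("""
    SELECT product_type, COUNT(*) AS trades
    FROM trade_history
    GROUP BY product_type
""")
for product_type, trades in cursor.fetchall():
    print(product_type, trades)

cursor.close()
conn.close()
```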

Environment: Java/J2EE, Python, Spring, Ext JS, JAXB, Tomcat 7.0, iBATIS, web services, Sybase, Documentum, SVN, Maven.

Confidential

Python Developer

Responsibilities:

  • Wrote Python routines to log into the website and fetch data for selected options.
  • Used Python modules such as requests, urllib and urllib2 for web crawling (see the crawler sketch after this list).
  • Used packages such as BeautifulSoup for data parsing.
  • Worked on reading and writing data in CSV and Excel file formats.
  • Developed a MATLAB algorithm that determines an object's dimensions from digital images.
  • Developed web-services back ends using Python (CherryPy, Django, SQLAlchemy).
  • Worked on the application's resulting reports and Tableau reports.
  • Worked with HTML5, CSS3, JavaScript, AngularJS, Node.js, Git, MongoDB and IntelliJ IDEA.
  • Extracted data from the database using SAS/ACCESS and SAS SQL procedures and created SAS data sets.
  • Performed QA testing on the application.
  • Created custom VB scripts for repackaging applications as needed.
  • Held meetings with the client and delivered the entire project with limited help from the client; participated in the complete SDLC process.
  • Developed rich user interfaces using CSS, HTML, JavaScript and jQuery.
  • Developed and maintained various automated web tools, reducing manual effort and increasing the efficiency of the Choice Based Credit System.
  • Created a database using MySQL and wrote several queries to extract data from it.
  • Wrote Python scripts to extract data from HTML files.
  • Effectively communicated with external vendors to resolve queries.
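A minimal sketch combining the crawling, parsing and CSV work described above, using requests and BeautifulSoup; the URLs, form fields and page layout are placeholders.

```python
import csv

import requests
from bs4 import BeautifulSoup

# Placeholder URLs and form fields -- the real site and credentials are not named.
LOGIN_URL = "https://example.com/login"
DATA_URL = "https://example.com/reports"

# A session keeps the login cookies for subsequent requests.
session = requests.Session()
session.post(LOGIN_URL, data={"username": "user", "password": "secret"})

# Fetch the report page and parse its first table with BeautifulSoup.
html = session.get(DATA_URL).text
soup = BeautifulSoup(html, "html.parser")
table = soup.find("table")  # assumes the page contains at least one table

rows = []
if table is not None:
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)

# Write the scraped rows to CSV, matching the read/write work described above.
with open("report.csv", "w") as fh:
    csv.writer(fh).writerows(rows)
```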

Environment: Python, Django 1.4, MySQL, Windows, Linux, HTML, CSS, jQuery, JavaScript, Apache.
