Big Data / Talend Developer Resume

Houston, TX

SUMMARY:

  • 7+ years of IT experience in analysis, design, and development in Scala, Spark, Hadoop, and HDFS environments, with additional experience in Java and J2EE.
  • Experienced in developing and implementing MapReduce programs on Hadoop to meet business requirements.
  • Excellent experience with Scala, Apache Spark, Spark Streaming, pattern matching, and MapReduce.
  • Developed ETL test scripts based on technical specifications/data design documents and source-to-target mappings.
  • Experienced in installing, configuring, and administering Hadoop clusters on the major Hadoop distributions Hortonworks and Cloudera.
  • Experienced in working with different data sources such as flat files, spreadsheet files, log files, and databases.
  • Experienced in working with Flume to load log data from multiple sources directly into HDFS.
  • Excellent experience with Apache Hadoop ecosystem components such as the Hadoop Distributed File System (HDFS), MapReduce, Sqoop, Apache Spark, and Scala.
  • Extensive experience working with Oracle, DB2, SQL Server, and MySQL databases and with core Java concepts such as OOP, multithreading, collections, and I/O.
  • Experience with the Oozie workflow engine in running workflow jobs with actions that run Hadoop MapReduce and Pig jobs.
  • Experience with the MapReduce and Pig programming models and with installation and configuration of Hadoop, HBase, Hive, Pig, Sqoop, and Flume using Linux commands.
  • Experience in managing and reviewing Hadoop log files using Flume and Kafka; developed Pig and Hive UDFs to pre-process the data for analysis.
  • Experience with NoSQL databases such as HBase and Cassandra.
  • Experience in UNIX shell scripting; proficient in Linux/UNIX and Windows operating systems.
  • Experienced in setting up data-gathering tools such as Flume and Sqoop.
  • Extensive knowledge of ZooKeeper for various types of centralized configuration.
  • Knowledge of monitoring and managing Hadoop clusters using Hortonworks.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
  • Experienced in analyzing, designing, and developing ETL strategies and processes, and in writing ETL specifications.
  • Experience building applications using Java, Python, and UNIX shell scripting.
  • Strong interpersonal, communication, and problem-solving skills; a motivated team player.
  • Able to be a valuable contributor to the company.

TECHNICAL SKILLS:

Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, MongoDB, Oozie, ZooKeeper, Spark, Storm & Kafka

Java & J2EE Technologies: Core Java

IDEs: Eclipse, NetBeans

Big data Analytics: Datameer 2.0.5

Frameworks: MVC, Struts, Hibernate, Spring

Programming languages: C, C++, Java, Python, Ant scripts, Linux shell scripts

Databases: Oracle 11g/10g/9i, MySQL, DB2, MS-SQL Server

Web Servers: WebLogic, WebSphere, Apache Tomcat

Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL

Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP, FTP

ETL/Reporting Tools: Informatica, Pentaho, SSIS, SSRS, BO, Crystal Reports, Cognos.

Testing: WinRunner, LoadRunner, QTP

WORK EXPERIENCE:

Confidential, Houston, TX

Big Data / Talend Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop
  • Worked extensively with Flume for importing social media data
  • Continuously monitored and managed the Hadoop cluster through Cloudera Manager
  • Upgraded the Hadoop Cluster from CDH 3 to CDH 4, setting up High Availability Cluster and integrating HIVE with existing applications
  • Analyzed the data by performing Hive queries and running Pig scripts to understand user behavior.
  • Developed Pig scripts in areas where extensive coding needed to be reduced.
  • Extensively used FORALL and BULK COLLECT to fetch large volumes of data from tables.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs
  • Handled importing of data from various data sources using Sqoop, performed transformations using Hive and MapReduce, and loaded data into HDFS
  • Configured Sqoop and developed scripts to extract data from MySQL into HDFS
  • Hands-on experience productionizing Hadoop applications, viz. administration, configuration management, monitoring, debugging, and performance tuning
  • Created HBase tables to store various formats of PII data coming from different portfolios. Processed data using Spark.
  • Parsed high-level design specifications into simple ETL coding and mapping standards. 
  • Provided cluster coordination services through ZooKeeper.
  • Developed complex Talend job mappings to load data from various sources using different components.
  • Designed, developed, and implemented solutions using Talend Integration Suite.
  • Partitioned data streams using Kafka; designed and configured a Kafka cluster to accommodate a heavy throughput of 1 million messages per second.
  • Used the Kafka 0.8.x producer APIs to produce messages (see the sketch after this list)
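
As a rough illustration of the producer pattern referenced above, the following Java sketch uses the Kafka Java producer client available from the 0.8.2 line onward. The broker addresses, topic name, keys, and payloads are placeholders for illustration, not the actual project configuration.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class EventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Broker list and serializers; the hosts below are placeholders.
        props.put("bootstrap.servers", "broker1:9092,broker2:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        Producer<String, String> producer = new KafkaProducer<>(props);
        try {
            for (int i = 0; i < 10; i++) {
                // Keyed records hash to a fixed partition, preserving per-key ordering
                // across the partitioned stream. The topic name is hypothetical.
                producer.send(new ProducerRecord<>("social-media-events", "user-" + i, "event-" + i));
            }
        } finally {
            producer.close();
        }
    }
}
```

Partitioning by key is what lets a multi-broker cluster absorb throughput on the order of a million messages per second while still keeping related messages in order.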

Environment: Hadoop (Cloudera), HDFS, MapReduce, Pig, Hive, Sqoop, HBase, Oozie, Flume, ZooKeeper, Java, SQL, Scripting, Spark, Kafka.

Confidential, Plano, TX

Big Data/ Hadoop Developer

Responsibilities:

  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
  • Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
  • Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
  • Managed and reviewed Hadoop log files.
  • Tested raw data and executed performance scripts.
  • Shared responsibility for administration of Hadoop, Hive and Pig.
  • Responsible for developing MapReduce programs using text analytics and pattern-matching algorithms (see the sketch after this list)
  • Involved in porting data from various client ticketing servers such as Remedy, Altiris, Cherwell, and OTRS into HDFS
  • Assisted the development team in installing single-node Hadoop 2.2.4 on a local machine
  • Coded REST web services and clients to fetch tickets from client ticketing servers
  • Facilitated sprint planning, retrospective, and closure meetings for each sprint and helped capture metrics such as team status
  • Participated in architectural and design decisions with respective teams
  • Developed an in-memory data grid solution across conventional and cloud environments using Oracle Coherence.
  • Worked with customers to develop and support solutions that use the in-memory data grid product.
  • Used Pig as an ETL tool to perform transformations, event joins, filters, and some pre-aggregations before storing the data in HDFS
  • Optimized MapReduce code and Pig scripts; performed user interface analysis, performance tuning, and analysis.
  • Performed analysis with the data visualization tool Tableau.
  • Wrote Pig scripts for data processing.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
  • Loaded the aggregated data into DB2 for reporting on the dashboard.
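
A minimal sketch of the pattern-matching MapReduce approach mentioned above is shown below; the regular expression, category label, and class name are hypothetical examples, not the production rules. The mapper emits a count per matching record and would typically be paired with a summing reducer such as LongSumReducer.

```java
import java.io.IOException;
import java.util.regex.Pattern;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TicketPatternMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    // Illustrative pattern: flag ticket descriptions that look like outage reports.
    private static final Pattern OUTAGE = Pattern.compile("(?i)\\b(outage|down|unavailable)\\b");
    private static final LongWritable ONE = new LongWritable(1);
    private final Text category = new Text("outage");

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Emit (category, 1) for every input line that matches the pattern;
        // a reducer sums the counts to produce a per-category total.
        if (OUTAGE.matcher(value.toString()).find()) {
            context.write(category, ONE);
        }
    }
}
```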

Environment: Big Data/Hadoop, JDK1.6, Linux, Python, Java, Agile, RESTful Web Services, HDFS, Map-Reduce, Hive, Pig, Sqoop, Flume, Zookeeper, Oozie, DB2, NoSQL, HBase and Tableau.

Confidential, NYC

Hadoop Developer

Responsibilities:

  • Developed Map-Reduce programs for data analysis and data cleaning.
  • Installed and configured Hortonworks Data Platform 2.1 - 2.3.
  • Implemented Big Data solutions including data acquisition, storage, transformation and analysis.
  • Wrote Map-Reduce jobs to discover trends in data usage by users.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Provided quick response to ad hoc internal and external client requests for data.
  • Loaded and transformed large sets of structured and unstructured data using Hadoop.
  • Developed Pig scripts in areas where extensive coding needed to be reduced.
  • Responsible for creating Hive tables, loading data, and writing Hive queries.
  • Involved in loading data from Linux file system to HDFS.
  • Created complex mappings in Talend 5.x.
  • Created Talend Mappings to populate the data into Staging, Dimension and Fact tables.
  • Excellent knowledge of the NoSQL databases MongoDB and Cassandra
  • Handled importing data from various data sources, performed transformations using Hive and MapReduce, streamed data using Flume, and loaded it into HDFS.
  • Installed the Oozie workflow engine to run multiple MapReduce, Hive, Impala, ZooKeeper, and Pig jobs that run independently based on time and data availability.
  • Worked with the NoSQL database HBase to create tables and store data (see the sketch after this list).
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Imported data using Sqoop to load data from MySQL to HDFS on a regular basis.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Wrote Hive queries for data analysis to meet business requirements.
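
The HBase table-creation and write path mentioned above could look roughly like the following sketch, written against the HBase 1.x Java client API. The table name, column family, row key, and cell values are placeholders for illustration only.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class TicketStore {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            // Create the table (and its single column family) if it does not exist yet.
            TableName tableName = TableName.valueOf("tickets");
            if (!admin.tableExists(tableName)) {
                HTableDescriptor desc = new HTableDescriptor(tableName);
                desc.addFamily(new HColumnDescriptor("info"));
                admin.createTable(desc);
            }
            // Store one row; the row key and cell contents are illustrative.
            try (Table table = connection.getTable(tableName)) {
                Put put = new Put(Bytes.toBytes("ticket-0001"));
                put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("status"), Bytes.toBytes("open"));
                table.put(put);
            }
        }
    }
}
```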

Environment: Hadoop, Pig, Hive, Oozie, NoSQL, Sqoop, Flume, HDFS, HBase, MapReduce, MySQL, Hortonworks, Impala, Cassandra, MongoDB, ZooKeeper.

Confidential, Peoria, IL

Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce, HDFS and developed multiple MapReduce jobs in Java for data cleansing and preprocessing
  • Importing and exporting data into HDFS and Hive using Sqoop
  • Used Multithreading, synchronization, caching and memory management
  • Used Java/J2EE application development skills with object-oriented analysis and was extensively involved throughout the Software Development Life Cycle (SDLC)
  • Proactively monitored systems and services; handled architecture design and implementation of the Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Responsible for creating Hive tables, loading data, and writing Hive queries.
  • Extracted files from MongoDB through Sqoop, placed them in HDFS, and processed them
  • Used Flume to collect, aggregate, and store the web log data from different sources like web servers, mobile and network devices and pushed to HDFS
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data
  • Supported MapReduce programs running on the cluster
  • Wrote shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
  • Wrote complex Hive queries and UDFs in Java and Python (see the sketch after this list).
  • Involved in loading data from UNIX file system to HDFS, configuring Hive and writing Hive UDFs
  • Utilized Java and MySQL day to day to debug and fix issues with client processes
  • Managed and reviewed log files
  • Implemented partitioning, dynamic partitions, and buckets in Hive
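
As one example of the Java Hive UDFs mentioned above, the sketch below shows a simple scalar UDF built on the classic org.apache.hadoop.hive.ql.exec.UDF base class; the class name and normalization rules are hypothetical, not the production logic.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Illustrative UDF that normalizes free-text status values before aggregation.
public class NormalizeStatus extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        String s = input.toString().trim().toLowerCase();
        if (s.startsWith("open") || s.startsWith("new")) {
            return new Text("OPEN");
        }
        if (s.startsWith("clos") || s.startsWith("resolv")) {
            return new Text("CLOSED");
        }
        return new Text("OTHER");
    }
}
```

Once packaged into a JAR, a UDF like this is typically registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries.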

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, CouchDB, Python, Java, Flume, HTML, XML, SQL, MySQL, J2EE, Eclipse

Confidential

Java Project

Responsibilities:

  • Involved in various phases of the Software Development Life Cycle (SDLC), such as design, development, and unit testing.
  • Developed and deployed UI layer logics of sites using JSP, XML, JavaScript, HTML/DHTML, and Ajax
  • The Agile Scrum methodology was followed for the development process. 
  • Developed prototype test screens in HTML and JavaScript
  • Involved in developing JSPs for client data presentation and data validation on the client side within the forms. 
  • Experience in writing PL/SQL stored procedures, functions, triggers, Oracle reports, and complex SQL queries.
  • Worked with JavaScript to perform client-side form validations. Introduced an innovative logging approach for all interdependent applications.
  • Used Struts tag libraries as well as the Struts Tiles framework. 
  • Used JDBC to access the database with the Oracle thin driver for application optimization and efficiency. Created connections through JDBC and used JDBC statements to call stored procedures (see the sketch after this list)
  • Performed client-side validation using JavaScript
  • Used Data Access Objects to make the application more flexible toward future and legacy databases. 
  • Actively involved in tuning SQL queries for better performance. 
  • Developed the application by using the Spring MVC framework
  • The Collections framework was used to transfer objects between the different layers of the application. 
  • Developed data mappings to create a communication bridge between various application interfaces using XML and XSL
  • Proficient in developing applications with exposure to Java, JSP, UML, Oracle (SQL, PL/SQL), HTML, JUnit, JavaScript, Servlets, Swing, DB2, and CSS.
  • Spring IoC was used to inject the values for dynamic parameters. 
  • Developed a JUnit testing framework for unit-level testing. 
  • Actively involved in code review and bug fixing for improving the performance. 
  • Documented application for its functionality and its enhanced features. 
  • Successfully delivered all product deliverables with zero defects.
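
A minimal sketch of the JDBC stored-procedure call pattern referenced above is shown below; the connection URL, credentials, and the procedure name and signature are placeholders, not the actual application values.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Types;

public class CustomerStatusDao {
    // Placeholder connection details for illustration only.
    private static final String URL = "jdbc:oracle:thin:@//dbhost:1521/ORCL";

    public String fetchStatus(long customerId) throws SQLException {
        try (Connection conn = DriverManager.getConnection(URL, "app_user", "app_password");
             CallableStatement stmt = conn.prepareCall("{call get_customer_status(?, ?)}")) {
            stmt.setLong(1, customerId);                  // IN parameter
            stmt.registerOutParameter(2, Types.VARCHAR);  // OUT parameter
            stmt.execute();
            return stmt.getString(2);
        }
    }
}
```

Wrapping the call in a DAO like this keeps the JDBC plumbing out of the Struts/Spring layers and makes it easier to swap databases later.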

Environment: Spring MVC, Oracle (SQL, PL/SQL), J2EE, Java, Struts, JDBC, Servlets, JSP, XML, Design Patterns, CSS, HTML, JavaScript 1.2, JUnit, Apache Tomcat, MS SQL Server 2008
