Big Data/Talend Developer Resume
Houston, TX
SUMMARY:
- 7+ years of IT experience in analysis, design, and development in Scala, Spark, Hadoop, and HDFS environments, with additional experience in Java and J2EE.
- Experienced in developing and implementing MapReduce programs on Hadoop as per requirements.
- Excellent experience with Scala, Apache Spark, Spark Streaming, pattern matching, and MapReduce (a minimal Spark sketch follows this summary).
- Developed ETL test scripts based on technical specifications/data design documents and source-to-target mappings.
- Experienced in installing, configuring, and administering Hadoop clusters of the major Hadoop distributions Hortonworks and Cloudera.
- Experienced in working with different data sources such as flat files, spreadsheet files, log files, and databases.
- Excellent experience with Apache Hadoop ecosystem components like the Hadoop Distributed File System (HDFS), MapReduce, Sqoop, Apache Spark, and Scala.
- Extensive experience working in Oracle, DB2, SQL Server, and MySQL databases, and with core Java concepts like OOP, multithreading, collections, and I/O.
- Experience with the Oozie workflow engine in running workflow jobs with actions that run Hadoop MapReduce and Pig jobs.
- Experience with the MapReduce and Pig programming models, and with installation and configuration of Hadoop, HBase, Hive, Pig, Sqoop, and Flume using Linux commands.
- Experience in managing and reviewing Hadoop log files using Flume and Kafka; also developed Pig UDFs and Hive UDFs to pre-process data for analysis.
- Experience with NoSQL databases like HBase and Cassandra.
- Experience in scripting using UNIX shell scripts. Proficient in Linux (UNIX) and Windows operating systems.
- Experienced in setting up data gathering tools such as Flume and Sqoop.
- Extensive knowledge of the Zookeeper process for various types of centralized configuration.
- Knowledge of monitoring and managing a Hadoop cluster using Hortonworks.
- Experienced in working with Flume to load the log data from multiple sources directly into HDFS.
- Worked with application teams to install operating system, Hadoop updates, patches and version upgrades as required.
- Experienced in analyzing, designing, and developing ETL strategies and processes, and in writing ETL specifications.
- Experience building applications using Java, Python, and UNIX shell scripting.
- Good interpersonal, communication, and problem-solving skills; a motivated team player with the ability to contribute value to the company.
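For illustration only, not taken from any project below: a minimal word-count-style aggregation over log data in the Spark 2.x Java API, representative of the Spark work this summary describes. The application name and HDFS paths are hypothetical.

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class LogWordCount {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("LogWordCount"));

        // Read raw log lines from HDFS (path is hypothetical).
        JavaRDD<String> lines = sc.textFile("hdfs:///data/logs/");

        // Classic tokenize -> pair -> reduce aggregation.
        JavaPairRDD<String, Integer> counts = lines
                .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
                .mapToPair(word -> new Tuple2<>(word, 1))
                .reduceByKey((a, b) -> a + b);

        counts.saveAsTextFile("hdfs:///data/logs-wordcount/");
        sc.stop();
    }
}
```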
TECHNICAL SKILLS:
Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, MongoDB, Oozie, Zookeeper, Spark, Storm & Kafka
Java & J2EE Technologies: Core Java
IDEs: Eclipse, NetBeans
Big data Analytics: Datameer 2.0.5
Frameworks: MVC, Struts, Hibernate, Spring
Programming languages: C, C++, Java, Python, Ant scripts, Linux shell scripts
Databases: Oracle 11g/10g/9i, MySQL, DB2, MS-SQL Server
Web Servers: WebLogic, WebSphere, Apache Tomcat
Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL
Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP, FTP
ETL Tools: Informatica, Pentaho, SSRS, SSIS, BO, Crystal Reports, Cognos
Testing: WinRunner, LoadRunner, QTP
WORK EXPERIENCE:
Confidential, Houston, TX
Big Data/Talend Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop
- Worked extensively with Flume for importing social media data
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager
- Upgraded the Hadoop cluster from CDH3 to CDH4, setting up a high-availability cluster and integrating Hive with existing applications
- Analyzed the data by performing Hive queries and running Pig scripts to understand user behavior.
- Developed Pig scripts in areas where extensive hand coding needed to be reduced.
- Extensively used FORALL and BULK COLLECT to fetch large volumes of data from tables.
- Installed Oozie workflow engine to run multiple Hive and Pig jobs
- Handled importing of data from various data sources using Sqoop, performed transformations using Hive, MapReduce, loaded data into HDFS
- Configured Sqoop and developed scripts to extract data from MySQL into HDFS
- Hands-on experience productionizing Hadoop applications: administration, configuration management, monitoring, debugging, and performance tuning
- Created HBase tables to store various data formats of PII data coming from different portfolios; processed data using Spark.
- Translated high-level design specifications into simple ETL coding and mapping standards.
- Provided cluster coordination services through Zookeeper.
- Developed complex Talend job mappings to load data from various sources using different components.
- Designed, developed, and implemented solutions using Talend Integration Suite.
- Partitioned data streams using Kafka; designed and configured a Kafka cluster to accommodate a heavy throughput of 1 million messages per second.
- Used the Kafka producer 0.8.3 APIs to produce messages, as sketched below.
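A minimal sketch of the 0.8-line Kafka producer API referenced above, assuming the older kafka.javaapi.producer client; the broker list, topic, key, and payload are hypothetical.

```java
import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class EventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // 0.8-era producer settings; broker list is hypothetical.
        props.put("metadata.broker.list", "broker1:9092,broker2:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("request.required.acks", "1");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));

        // The message key drives partition assignment, spreading load across partitions.
        producer.send(new KeyedMessage<String, String>(
                "events", "user-42", "{\"action\":\"click\"}"));
        producer.close();
    }
}
```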
Environment: Hadoop (Cloudera), HDFS, MapReduce, Pig, Hive, Sqoop, HBase, Oozie, Flume, Zookeeper, Java, SQL, shell scripting, Spark, Kafka.
Confidential, Plano, TX
Big Data/ Hadoop Developer
Responsibilities:
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
- Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
- Managed and reviewed Hadoop log files.
- Tested raw data and executed performance scripts.
- Shared responsibility for administration of Hadoop, Hive and Pig.
- Responsible for developing MapReduce programs using text analytics and pattern-matching algorithms (a representative mapper is sketched at the end of this section).
- Involved in importing data from various client servers such as Remedy, Altiris, Cherwell, and OTRS into HDFS.
- Assisted the development team in installing single-node Hadoop 2.2.4 on local machines.
- Coded REST web services and clients to fetch tickets from client ticketing servers.
- Facilitated sprint planning, retrospective, and closure meetings for each sprint, and helped capture various metrics such as team status.
- Participated in architectural and design decisions with respective teams
- Developed an in-memory data grid solution across conventional and cloud environments using Oracle Coherence.
- Worked with customers to develop and support solutions that use the in-memory data grid product.
- Used Pig as an ETL tool to perform transformations, event joins, filters, and some pre-aggregations before storing the data in HDFS.
- Optimized MapReduce code and Pig scripts; performed user interface analysis, performance tuning, and analysis.
- Performed analysis with the data visualization tool Tableau.
- Wrote Pig scripts for data processing.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
- Loaded the aggregated data onto DB2 for reporting on the dashboard.
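A representative sketch of a pattern-matching MapReduce mapper like the ones described above, assuming the org.apache.hadoop.mapreduce API; the class name and the INC-style ticket-id pattern are hypothetical.

```java
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Emits (ticket-id, 1) for every log line that references a ticket,
// so a summing reducer can count mentions per ticket.
public class TicketMentionMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final Pattern TICKET = Pattern.compile("\\b(INC\\d{6})\\b");
    private static final IntWritable ONE = new IntWritable(1);
    private final Text ticketId = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        Matcher m = TICKET.matcher(value.toString());
        while (m.find()) {
            ticketId.set(m.group(1));
            context.write(ticketId, ONE);
        }
    }
}
```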
Environment: Big Data/Hadoop, JDK1.6, Linux, Python, Java, Agile, RESTful Web Services, HDFS, Map-Reduce, Hive, Pig, Sqoop, Flume, Zookeeper, Oozie, DB2, NoSQL, HBase and Tableau.
Confidential, NYC
Hadoop Developer
Responsibilities:
- Developed Map-Reduce programs for data analysis and data cleaning.
- Installed and configured Hortonworks Data Platform 2.1-2.3.
- Implemented Big Data solutions including data acquisition, storage, transformation and analysis.
- Wrote Map-Reduce jobs to discover trends in data usage by users.
- Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
- Provided quick response to ad hoc internal and external client requests for data.
- Loaded and transformed large sets of structured and unstructured data using Hadoop.
- Developed Pig scripts in areas where extensive hand coding needed to be reduced.
- Responsible for creating Hive tables, loading data, and writing Hive queries.
- Involved in loading data from Linux file system to HDFS.
- Created complex mappings in Talend 5.x.
- Created Talend Mappings to populate the data into Staging, Dimension and Fact tables.
- Excellent knowledge of the NoSQL databases MongoDB and Cassandra.
- Handled importing data from various data sources, performed transformations using Hive and Map-Reduce, streamed using Flume and loaded data into HDFS.
- Installed the Oozie workflow engine to run multiple MapReduce, Hive, Impala, and Pig jobs which run independently based on time and data availability.
- Worked with the NoSQL database HBase to create tables and store data (see the sketch at the end of this section).
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Imported data using Sqoop to load data from MySQL into HDFS on a regular basis.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Wrote Hive queries for data analysis to meet business requirements.
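A minimal sketch of table creation and a single write with the 0.98-era HBase Java client, of the kind referenced above; the table name, column family, and row contents are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseLoader {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();

        // Create the table with a single column family if it does not exist yet.
        HBaseAdmin admin = new HBaseAdmin(conf);
        if (!admin.tableExists("tickets")) {
            HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("tickets"));
            desc.addFamily(new HColumnDescriptor("d"));
            admin.createTable(desc);
        }
        admin.close();

        // Write one row keyed by ticket id.
        HTable table = new HTable(conf, "tickets");
        Put put = new Put(Bytes.toBytes("TICKET-1001"));
        put.add(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("OPEN"));
        table.put(put);
        table.close();
    }
}
```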
Environment: Hadoop, Pig, Hive, Oozie, NoSQL, Sqoop, Flume, HDFS, HBase, MapReduce, MySQL, Hortonworks, Impala, Cassandra, MongoDB, Zookeeper.
Confidential, Peoria, IL
Hadoop Developer
Responsibilities:
- Installed and configured Hadoop MapReduce, HDFS and developed multiple MapReduce jobs in Java for data cleansing and preprocessing
- Importing and exporting data into HDFS and Hive using Sqoop
- Used multithreading, synchronization, caching, and memory management
- Used Java and J2EE application development skills with object-oriented analysis and was extensively involved throughout the Software Development Life Cycle (SDLC)
- Proactively monitored systems and services; handled architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
- Responsible for creating Hive tables, loading data, and writing Hive queries.
- Extracted files from MongoDB through Sqoop, placed them in HDFS, and processed them
- Used Flume to collect, aggregate, and store web log data from different sources such as web servers and mobile and network devices, and pushed it to HDFS
- Loaded and transformed large sets of structured, semi-structured, and unstructured data
- Supported MapReduce programs running on the cluster
- Wrote shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
- Wrote complex Hive queries and UDFs in Java and Python (a minimal Java UDF is sketched at the end of this section).
- Involved in loading data from the UNIX file system to HDFS, configuring Hive, and writing Hive UDFs
- Utilized Java and MySQL from day to day to debug and fix issues with client processes
- Managed and reviewed log files
- Implemented partitioning, dynamic partitions, and buckets in Hive
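A minimal Java Hive UDF sketch of the kind referenced above, assuming the classic org.apache.hadoop.hive.ql.exec.UDF base class; the function name and normalization logic are hypothetical.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Normalizes free-text columns before analysis. Registered in Hive with:
//   ADD JAR clean-text.jar;
//   CREATE TEMPORARY FUNCTION clean_text AS 'CleanText';
public final class CleanText extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        // Lower-case and collapse runs of whitespace to a single space.
        return new Text(input.toString().toLowerCase().trim().replaceAll("\\s+", " "));
    }
}
```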
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, CouchDB, Python, Java, Flume, HTML, XML, SQL, MySQL, J2EE, Eclipse
Confidential
Java Project
Responsibilities:
- Involved in various phases of the Software Development Life Cycle (SDLC), such as design, development, and unit testing.
- Developed and deployed UI-layer logic for sites using JSP, XML, JavaScript, HTML/DHTML, and Ajax.
- The Agile Scrum methodology was followed for the development process.
- Developed prototype test screens in HTML and JavaScript.
- Involved in developing JSPs for client data presentation and data validation on the client side within the forms.
- Experience in writing PL/SQL stored procedures, functions, triggers, Oracle Reports, and complex SQL.
- Worked with JavaScript to perform client-side form validations, and implemented an innovative logging approach for all interdependent applications.
- Used Struts tag libraries as well as the Struts Tiles framework.
- Used JDBC to access the database with the Oracle thin driver (Type 4) for application optimization and efficiency. Created connections through JDBC and used JDBC statements to call stored procedures, as sketched at the end of this section.
- Client-side validation was done using JavaScript.
- Used the Data Access Object pattern to make the application more flexible toward future and legacy databases.
- Actively involved in tuning SQL queries for better performance.
- Developed the application using the Spring MVC framework.
- Used the Collections framework to transfer objects between the different layers of the application.
- Developed data mappings to create a communication bridge between various application interfaces using XML and XSL.
- Proficient in developing applications with exposure to Java, JSP, UML, Oracle (SQL, PL/SQL), HTML, JUnit, JavaScript, Servlets, Swing, DB2, and CSS.
- Used Spring IoC to inject the values for the dynamic parameters.
- Developed a JUnit testing framework for unit-level testing.
- Actively involved in code review and bug fixing to improve performance.
- Documented application for its functionality and its enhanced features.
- Successfully delivered all product deliverables with zero defects.
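A minimal sketch of calling a PL/SQL stored procedure over JDBC with the Oracle thin driver, as described above; the connection URL, credentials, and procedure signature are hypothetical.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

public class StoredProcClient {
    public static void main(String[] args) throws Exception {
        // Connection URL and credentials are hypothetical.
        Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@dbhost:1521:ORCL", "app_user", "secret");

        // Call a PL/SQL procedure with one IN and one OUT parameter.
        CallableStatement cs = conn.prepareCall("{call get_account_status(?, ?)}");
        cs.setLong(1, 1001L);                       // IN: account id
        cs.registerOutParameter(2, Types.VARCHAR);  // OUT: status
        cs.execute();
        System.out.println("status = " + cs.getString(2));

        cs.close();
        conn.close();
    }
}
```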
Environment: Spring MVC, Oracle (SQL, PL/SQL), J2EE, Java, Struts, JDBC, Servlets, JSP, XML, Design Patterns, CSS, HTML, JavaScript 1.2, JUnit, Apache Tomcat, MS SQL Server 2008