Spark Developer Resume New York - Hire IT People

SUMMARY

Around 8 years of professional IT experience, including 4+ years in the Big Data space, with hands-on expertise in development on the Hadoop platform and in Java.
Expertise in executing best-in-class risk models and decision logic in Splunk.
Extensive experience with Splunk Searching and Reporting modules, Knowledge Objects, Administration, Add-Ons, Dashboards, Clustering and Forwarder Management, Visualizations, alerts, and reports.
Extensive knowledge of Splunk/Hunk architecture and its components (indexer, forwarder, search head, deployment server, virtual indexes, providers), heavy and universal forwarders, and the license model.
Created and managed Splunk DB Connect identities, database connections, database inputs, outputs, lookups, and access controls.
Proficiency in Java, Hadoop Map Reduce, Pig, Hive, Oozie, Sqoop, Flume, Zookeeper, Impala and NoSQL Database.
Good exposure to column-oriented NoSQL databases such as HBase.
Extensive experience writing custom Map Reduce programs for data processing and UDFs for both Hive and Pig in Java.
Strong experience analyzing large data sets by writing Pig scripts and Hive queries.
Extensive experience working with structured data using HiveQL, join operations, and custom UDFs, and in optimizing Hive queries.
Extensive experience working with semi-structured and unstructured data by implementing complex MapReduce programs using design patterns.
Experience importing and exporting data between HDFS and relational databases using Sqoop.
Experience using Apache Flume to collect, aggregate, and move large volumes of data from various sources such as web servers and telnet sources.
Adequate knowledge and working experience in Agile & Waterfall methodologies.
Great team player and quick learner with effective communication, motivation, and organizational skills, combined with attention to detail and a focus on business improvements.

TECHNICAL SKILLS

Hadoop/Big Data Technologies: Splunk, Hunk, Forwarder, DB Connect, HDFS, Map Reduce, Sqoop, Flume, Pig, Hive, Oozie, Apache Spark, Python, Impala, Zookeeper and Cloudera Manager, MapR clusters, HBase, Amazon Web Services

Monitoring and Reporting: Tableau, Jaspersoft

Build Tools: SQL Server Management Studio, Eclipse

Programming & Scripting: Core Java, C, SQL, Shell Scripting

Databases: Microsoft SQL Server, Teradata, MySQL

PROFESSIONAL EXPERIENCE

Confidential, New York

Spark Developer

Responsibilities:

Developed PySpark code to save data in Avro and Parquet formats and build Hive tables on top of them.
Developed equivalent PySpark code for existing SAS code to extract summary insights from the Hive tables.
Responsible for datatype, count and header validations for the ingested data.
Assisted the team with code reviews and bug fixes.
Responsible for writing RESTful services to invoke and run the Apache NiFi process.
Configured the NiFi ingestion tool for dynamic parameterization using Python scripts and JSON files.

Environment: Hadoop, HDP, MyEclipse IDE, Python 2.7, PySpark, Hive, Sqoop, Shell Scripting, Linux.

Confidential, New York

Data Architect

Responsibilities:

Involved in modeling different key risk indicators in Splunk and building extensive Hive queries to understand customer behavior across the customer life cycle.
Converted existing Hive queries to Spark SQL queries to reduce execution time.
Successfully implemented a proof of concept in Splunk for risk modeling covering three risk types: Credit, Operational, and Compliance.
Extensively used risk reporting tools such as Tableau and Jaspersoft to understand risk types and levels at Confidential.
Created reports, alerts, and dashboards in Splunk that demonstrate various risk levels.
Installed and configured heavy, universal, and intermediate forwarders to bring customer data from production systems.
Created and managed Splunk DB Connect identities, database connections, database inputs, outputs, lookups, and access controls.
Designed and maintained production-quality Splunk dashboards.
Performed Splunk configuration involving different web and batch applications; created saved searches, summary searches, and summary indexes.
Experience with search head clustering and indexer clustering.
Extracted various fields using the field extractor, field extractions (rex), and calculated fields to optimize search performance and reduce the load on the search head.
Configured various summary indexes by creating saved searches to collect aggregated data, and built dashboards on top of the summary indexes.
Integrated Splunk with the Global Alert Repository to show alerts to executive leaders at Confidential.
Used techniques to optimize searches for better performance, such as choosing between search-time and index-time field extraction, with an understanding of configuration files, their precedence, and how they work.
Led the team in actively implementing smart Splunk solutions.
In-depth experience with props.conf, transforms.conf, and inputs.conf.
Assisted other power users in optimizing their searches.
Configured Hunk to read customer transaction data from Hadoop ecosystem components such as HDFS and Hive.

Environment: Splunk 6.4.1, Hunk 6.4, DB Connect v2.0, HDP MapR 3.1, YARN, Hive 1.2.1, UNIX Shell Scripting, Teradata, MS SQL Server 2014.

Confidential

Big Data Engineer

Responsibilities:

Designed and developed a data ingestion framework using the Hadoop stack; analyzed logs and diagnosed issues
Used Flume for log analysis
Used SequenceFile and Avro file formats with Snappy compression when storing data in HDFS
Developed UNIX scripts to download files from FTP to MELD HDFS and load the data into stage and base hive tables after partitioning and bucketing
Designed and developed Map Reduce jobs to process data coming in different file formats like XML, CSV, JSON
Designed the framework for historical/incremental load
Created Hive tables to store the processed results in a tabular format in the Base schema
Developed Pig scripts to perform ETL operations and wrote UDFs where needed
Imported data into HDFS and Hive using Sqoop from Teradata and Oracle databases
Worked on migrating projects from MapR to Confidential Works

Environment: CentOS 6.4, JDK 1.7, HDP 2.1, YARN, Sqoop 1.4.4, Pig 0.12, Hive 0.12, Flume 1.4.0, Ambari, UNIX Shell Scripting, WinSCP, Teradata, Oracle 11.6.

Confidential, Bentonville, AR.

Hadoop Developer

Responsibilities:

Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
Automated delta feeds into Hive from Teradata using Sqoop, and also from FTP servers.
Developed MapReduce programs to cleanse the data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into Hive schema for analysis
Used Sqoop to import the data from RDBMS to Hadoop Distributed File System (HDFS) and later analysed the imported data using Hadoop Components
Developed custom MapReduce programs to analyze data and used Pig Latin to clean unwanted data
Performed performance tuning, including using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins
Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts
Participated in gathering requirements from experts and business partners and converted them into technical specifications
Implemented daily workflow for extraction, processing and analysis of data with Oozie.
Involved in loading data from LINUX file system to HDFS.

Environment: Hadoop, Pig, Hive, Sqoop, Flume, MapReduce, HDFS, LINUX, Oozie.

Confidential

SQL/JAVA Developer

Responsibilities:

Involved in database design.
Created tables and stored procedures in SQL for data manipulation and retrieval, and modified the database using SQL, stored procedures, and views in Oracle 10g.
Created User Interface using JSP.
Involved in integration testing the Business Logic layer and Data Access layer.
Used JSP, JavaScript, HTML, and XML for the presentation tier.
Performed unit testing of the application using the JUnit framework.
Implemented stored procedures, functions, and views to retrieve data.
Mentored and worked with team members to ensure that standards and guidelines were followed and tasks were delivered on time.

Environment: JSP, Servlets, JDBC, JAVA, Eclipse, UNIX, SQL
