Big Data Engineer Resume
New York, NY
SUMMARY
- Big Data professional with over six years of experience in Data Science and Data Analytics using Big Data technologies, along with Enterprise Application Development.
- Considerable experience gained working alongside seasoned professionals in the IT industry.
- Hardworking team player and progressive thinker seeking a challenging environment in which to apply theoretical knowledge and hands-on expertise.
- 5+ years of professional experience in the finance, e-commerce, and technology domains, developing, implementing, and configuring Java technologies for desktop, web, and cloud applications.
- 3+ years of experience with Apache Hadoop and Apache Spark.
- Extensive knowledge of concepts in Big Data, Scalable Distributed and Parallel Computing, and Data Science.
- Excellent knowledge of the Hadoop architecture and components such as the HDFS, MapReduce, YARN, and Hadoop Ecosystem.
- Experience working with Spark APIs and resilient distributed datasets (RDDs) for batch processing of data streams.
- Hands-on experience with Hadoop ecosystem components such as Hive, Pig, HBase, Oozie, Zookeeper, Sqoop, Mahout, and Flume.
- Experience in writing custom MapReduce computing jobs in Java.
- Experience with data pipeline and data logging using Kafka, Flume and Storm.
- Hands-on experience importing and exporting data using Sqoop between HDFS and relational database systems (RDBMS) such as Oracle.
- Extensive experience installing, configuring, and managing multi-node Hadoop clusters on Linux, using distributions such as Cloudera (CDH 3/4/5) and cloud platforms such as Amazon Web Services.
- Proficient in analyzing data using HiveQL, Pig Latin scripts and custom UDFs.
- Working experience with NoSQL databases such as HBase and Cassandra.
- Well-versed in applying various open-source APIs, tools, and virtualization technologies to Big Data tasks.
- Knowledge of data storage repositories such as data lakes and data warehouses as well as data cleansing and wrangling using Talend.
- Familiar with Python data analytics tools such as NumPy and Pandas.
- Extensive knowledge of Object-Oriented Programming (OOP) and machine learning algorithms for detecting patterns in data.
- Well-versed in core Java programming, servlets, multi-threading, and concurrency principles, with growing knowledge of Scala, R, and emerging technologies such as Apache Flink, Docker, and Titan.
- Work experience in Test-Driven Development (TDD) environments and knowledge of software development methodologies such as Agile, Scrum, RUP, and Waterfall.
- Able to work independently as well as part of a team on group projects with both onshore and offshore teams.
TECHNICAL SKILLS
Big Data: Hadoop, Spark, MapReduce, YARN, Zookeeper, Hive, Pig, Solr, Sqoop, Flume, Oozie, Storm, Kafka, Mahout, Cloudera, Talend
Languages: Java, Scala, Python, C#, C++, HTML, Shell
RDBMS: Oracle 10g/11g/12c, MySQL
NoSQL: Cassandra, HBase
Cloud: Amazon Web Services, Oracle E-Business Suite, Oracle Fusion Financials
Tools: Eclipse IDE, NetBeans, BlueJ, Visual Studio, MS Office 365, VMware, VirtualBox, Talend, Titan
Operating Systems: Windows XP/7/8/10, Mac OS X, Linux
PROFESSIONAL EXPERIENCE
Confidential, New York, NY
Big Data Engineer
Responsibilities:
- Developed MapReduce and Spark jobs in Java for batch processing and validating incoming data from multiple file formats and sources.
- Built data pipeline using MapReduce, Flume, Sqoop, Pig, and HDFS for financial analysis.
- Implemented Spark SQL and Spark Streaming for faster processing of real-time trading data.
- Pipelined and analyzed real-time streaming data logs using Spark with Kafka (a streaming sketch follows this section).
- Imported data from different databases into the HDFS using Sqoop and performed transformations using Hive.
- Collected and aggregated large amounts of log data using Flume.
- Wrote HiveQL queries and executed Pig scripts to study customer behavior.
- Used Hive to analyze the partitioned and bucketed data and computed metrics for financial reporting.
- Developed product profiles using Pig and product specific UDFs.
- Built scalable distributed data solutions and wrote CQL queries in Cassandra.
- Scheduled and executed workflows in Oozie to run Hive and Pig jobs.
- Used Impala to read, write, and query Hadoop data in HDFS that had been loaded from Cassandra.
- Configured Kafka to read and write messages from external programs and handle real time data.
- Involved in writing a Storm topology to consume data from Kafka and process it.
- Used Solr to index search data and perform real-time updates.
- Participated in data cleansing using Talend.
- Monitored Hadoop cluster using Cloudera Manager in CDH 5.
- Participated in presenting financial statistical analysis as distributed graphs using the Titan graph database.
Environment: Hadoop, MapReduce, Spark, Pig, Hive, Sqoop, Oozie, HBase, Kafka, Storm, Flume, Solr, Impala, Oracle 11g, Cloudera Manager, CDH 5, Cassandra, Linux, Java SE 8, Scala, Titan.
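Code sample (illustrative only): a minimal sketch of a Spark Streaming job consuming log messages from Kafka, along the lines of the pipeline described above. The broker address, topic name, batch interval, and per-batch count are placeholder assumptions rather than details from an actual project.

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;
import kafka.serializer.StringDecoder;

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class TradeLogStream {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("TradeLogStream");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, String> kafkaParams = new HashMap<>();
        kafkaParams.put("metadata.broker.list", "broker1:9092"); // placeholder broker
        Set<String> topics = Collections.singleton("trade-logs"); // placeholder topic

        // Direct stream from Kafka; each record arrives as a (key, value) pair of strings
        JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
                jssc, String.class, String.class,
                StringDecoder.class, StringDecoder.class,
                kafkaParams, topics);

        // Count log lines per micro-batch as a stand-in for the real analysis
        stream.map(record -> record._2())
              .count()
              .print();

        jssc.start();
        jssc.awaitTermination();
    }
}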
Confidential, New York, NY
Big Data Engineer
Responsibilities:
- Wrote MapReduce jobs to filter and parse inventory data stored in HDFS (a mapper sketch follows this section).
- Configured Flume, Hive, Pig, Sqoop and Oozie on the Hadoop cluster for data pipelining.
- Imported and exported data into the HDFS from the Oracle database using Sqoop.
- Integrated MapReduce with Cassandra to import bulk amounts of log data.
- Converted ETL operations to the Hadoop system using Hive transformations and functions.
- Conducted streaming jobs with basic Python to process terabytes of formatted data for machine learning purposes.
- Used Flume to collect, aggregate and store the web log data and loaded it into the HDFS.
- Implemented Partitioning, Dynamic Partitions, Buckets in Hive.
- Developed custom Pig UDFs for product-specific needs.
- Implemented and configured workflows using Oozie to automate jobs.
- Performed Hadoop cluster management and configuration of multiple nodes on AWS.
- Created buckets in AWS to store the data and maintained the data repository for future needs and reusability.
- Involved in cluster coordination services through Zookeeper.
- Participated in managing and reviewing Hadoop log files.
Environment: Hadoop, MapReduce, Hive, Flume, Sqoop, Zookeeper, Pig, Oozie, Python, Java SE 8, Oracle 11g, HBase, AWS, Linux
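Code sample (illustrative only): a minimal sketch of the kind of map-only MapReduce filter described above for inventory data. The comma delimiter and the position of the quantity field are assumptions made for illustration.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Map-only filter: keep inventory records whose quantity field parses and is positive.
public class InventoryFilterMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",");
        if (fields.length < 3) {
            return; // skip malformed lines
        }
        try {
            int quantity = Integer.parseInt(fields[2].trim()); // assumed quantity column
            if (quantity > 0) {
                context.write(NullWritable.get(), value);
            }
        } catch (NumberFormatException e) {
            // skip records with a non-numeric quantity
        }
    }
}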
Confidential, Jacksonville, FL
Big Data Engineer
Responsibilities:
- Analyzed and prepared functional specifications for the business and system requirements.
- Developed custom MapReduce use cases in Java to log customer behavior data and loaded it into HDFS.
- Fixed bugs and improved Java source code to support clusters.
- Used Sqoop to transfer data between the Oracle database and HDFS.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Migrated the dataset into Hive for ETL purposes and optimized Pig UDFs (a UDF sketch follows this section).
- Wrote column-mapping scripts to generate ETL queries in Hive.
- Developed Hive schemas to help business users extract data files.
- Handled importing data from various sources and performed transformations using Pig and Hive.
- Used Impala to query data stored in the HDFS.
- Participated in Mahout implementation for machine learning analysis.
- Performed data analysis on large datasets and presented results to the risk, finance, accounting, pricing, sales, marketing, and compliance teams.
- Imported data into Excel and created pivot tables and statistical models.
Environment: Hadoop, MapReduce, AWS, Hive, CDH, Sqoop, Pig, Oracle 11g, Java SE 7, Python, Zookeeper, Impala, Linux
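Code sample (illustrative only): a minimal sketch of a custom Pig EvalFunc UDF of the kind referenced above; the function name and normalization logic are placeholders. A UDF like this is registered in a Pig script with REGISTER and then called like a built-in function.

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Pig UDF that normalizes a product code to a trimmed, upper-case form.
public class NormalizeProductCode extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null; // pass nulls through unchanged
        }
        return input.get(0).toString().trim().toUpperCase();
    }
}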
Confidential, Des Plaines, IL
Java Developer
Responsibilities:
- Involved in the analysis, design, and development of the application based on J2EE using Spring and Hibernate.
- Involved in developing the user interface using Struts.
- Built scalable web services following the RESTful model.
- Developed user interface screens using JavaScript and HTML and performed client-side validation.
- Developed unit-testing classes using JUnit.
- Implemented Spring MVC to handle the user requests and used various controllers to delegate flow.
- Used JDBC to connect to the database and wrote SQL queries and stored procedures to fetch, insert, and update data in database tables (a DAO sketch follows this section).
- Worked with Servlets to handle and process electronic prescriptions, history, and analysis.
- Conducted data analysis with basic Python and wrangled data for data repositories.
- Applied machine learning principles to study market behavior for a trading platform.
- Worked with JavaScript and CSS to improve application performance.
- Used Log4J logging framework for logging.
Environment: Java SE 7, JDBC, Spring, Hibernate, Struts, Servlets, HTML, JavaScript, Apache Tomcat, jQuery, JUnit, XML, SQL
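Code sample (illustrative only): a minimal sketch of the plain-JDBC data access described above, using a PreparedStatement against an Oracle database. The connection URL, credentials, and table/column names are placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Fetch prescription counts for a patient via plain JDBC.
public class PrescriptionDao {

    private static final String URL = "jdbc:oracle:thin:@//dbhost:1521/ORCL"; // placeholder

    public int countPrescriptions(String patientId) throws SQLException {
        String sql = "SELECT COUNT(*) FROM prescriptions WHERE patient_id = ?";
        try (Connection conn = DriverManager.getConnection(URL, "app_user", "secret");
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, patientId);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getInt(1);
            }
        }
    }
}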
Confidential, Newark, DE
Java Developer
Responsibilities:
- Participated in Agile/Scrum meetings with cross-functional and offshore teams to define client requirements and develop reports.
- Designed UML use-case, activity, and class diagrams for technical documentation and requirements.
- Developed and tested the web application using core Java.
- Involved in the front-end design using JavaScript, CSS, HTML, and Servlets, and used Hibernate to connect to the Oracle database.
- Used Spring as the middle-tier framework.
- Utilized Java multi-threading and synchronization and built APIs for concurrent models and processes (a thread-pool sketch follows this section).
- Implemented Hibernate in the data access object layer to access and update information in the Oracle Database.
- Wrote Stored Procedures, Queries and Functions in SQL.
- Used Log4J logging framework for logging messages.
- Performed testing using JUnit.
Environment: Java EE 6/7, SQL, JavaScript, Servlets, JDBC, HTML, CSS, Apache Struts, Hibernate, Spring, XML, Eclipse, Oracle 10g
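Code sample (illustrative only): a minimal sketch of the Java multi-threading and concurrency work mentioned above, using an ExecutorService thread pool; the pool size and task logic are placeholders.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Run independent pricing tasks concurrently on a fixed thread pool, then collect the results.
public class ConcurrentPricing {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<Double>> results = new ArrayList<>();

        for (int i = 0; i < 8; i++) {
            final int taskId = i;
            results.add(pool.submit(new Callable<Double>() {
                @Override
                public Double call() {
                    // stand-in for a real pricing calculation
                    return taskId * 1.05;
                }
            }));
        }

        for (Future<Double> result : results) {
            System.out.println("price = " + result.get());
        }
        pool.shutdown();
    }
}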