Associate Big Data Analyst Resume
Shrewsbury, MA
SUMMARY:
A Big Data Architect with 3 years of experience in Big Data ecosystem technologies, with knowledge of Big Data infrastructure, distributed file systems (HDFS), parallel processing (the MapReduce framework), and the complete Hadoop ecosystem: Hive, Pig, Sqoop, HBase, Flume, and Oozie. Well versed in Java, Python, and R, and experienced with the Elastic Stack and front-end technologies such as HTML, CSS, and JavaScript. With a zeal to learn, the ability to work hard, commitment to the task, and a willingness to work both as a team member and independently, I am consistent at following direction and adding value to the project at hand.
TECHNICAL SKILLS:
Big Data Technologies: Hadoop, HDFS, MapReduce, Hive, Pig, Impala, Sqoop, Flume, Oozie, Kafka, Spark
NoSQL Databases: HBase, Cassandra
Programming Languages: Java, C, C++, R, Python
Web Technologies: HTML, J2EE, CSS, JavaScript
Databases: MySQL, Oracle, SQL Server, HBase
Operating Systems: Linux, Windows XP, Windows Vista, Windows 7, Windows 8
Work Environments: AWS, Docker, Eclipse, Visual Studio .NET, JUnit, Log4j, Putty, WP
PROFESSIONAL EXPERIENCE:
ASSOCIATE BIG DATA ANALYST
Confidential, SHREWSBURY, MA
Responsibilities:
- Successfully installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop ecosystem tools such as Hive, Pig, HBase, and Sqoop.
- Built Spark scripts and was responsible for performance tuning of Spark applications; used Spark's in-memory computing capabilities to perform advanced procedures such as text analytics and processing.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
- Implemented NameNode backup using NFS for high availability.
- Used Pig as an ETL tool to perform transformations, event joins, and pre-aggregations before storing highly optimized data in HDFS.
- Developed a data pipeline using HDInsight, Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
- Configured, ran, and used high-performance cloud platforms such as AWS and Docker.
- Scheduled time-sensitive jobs on the Oozie workflow engine to run multiple Hive and Pig jobs.
- Made rigorous use of Sqoop to import and export data between HDFS and relational databases.
- Created Hive tables, loaded data into them, and wrote Hive UDFs.
- Exported the analyzed data to relational databases using Sqoop for analysis, visualization, and report generation.
- Implemented Hive partitioning and bucketing to organize and clean data at large scale.
- Migrated data from different databases (SQL and NoSQL) to HDFS using various pipelines.
- Designed highly usable HBase row keys that prevent hotspotting on high-performance clusters.
- Performed high-level HQL queries on Hive external tables for data wrangling.
- Made extensive use of shell scripting to pull and clean processed data for migration.
- Adeptly used fully distributed and pseudo-distributed modes to build both POCs and fully built projects.
Environment: Hadoop, MapReduce, Hive, HDFS, Pig, Sqoop, Oozie, Spark (Scala), Kafka, Flume, HBase, Oracle, SQL, NoSQL, Unix/Linux
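The HBase row-key design work noted above can be illustrated with a minimal sketch. The salting scheme, bucket count, and key layout below are illustrative assumptions, not the production design: a deterministic salt prefix spreads monotonically increasing keys across regions so a single RegionServer does not become a hotspot.

```python
import hashlib

# Hypothetical salt count; in practice this would match the number of
# pre-split regions in the HBase table.
NUM_SALT_BUCKETS = 16

def salted_rowkey(user_id: str, timestamp: int) -> str:
    """Prefix the natural key with a deterministic salt bucket so that
    sequential writes spread evenly across regions instead of
    hotspotting one RegionServer."""
    bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % NUM_SALT_BUCKETS
    return f"{bucket:02d}|{user_id}|{timestamp}"
```

Because the salt is derived from the natural key rather than chosen randomly, all rows for the same entity stay contiguous within one bucket and can still be scanned with a prefix scan.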
ASSOCIATE DATA ANALYST
Confidential, SHREWSBURY, MA
Responsibilities:
- Worked on Big Data Hadoop cluster implementation and data integration while developing large-scale system software.
- Developed MapReduce programs to parse raw data, populate staging tables, and store the refined data in partitioned tables.
- Captured data from existing databases that provide SQL interfaces using Sqoop.
- Worked extensively with Sqoop to import and export data between HDFS and relational database systems/mainframes, and loaded data into HDFS.
- Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
- Provided design recommendations and thought leadership to sponsors/stakeholders, improving review processes and resolving technical problems.
Environment: Hadoop, MapReduce, HDFS, Hive, Spark (Scala), Kafka, Java (JDK 1.6), Hadoop distributions (Hortonworks, Cloudera, MapR, DataStax), IBM DataStage 8.1 (Designer, Director, Administrator), PL/SQL, SQL*Plus, Toad 9.6, Windows NT, UNIX Shell Scripting
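The MapReduce parsing and partitioning work described above can be sketched in miniature in pure Python; the weblog format and field names are hypothetical stand-ins for the actual raw data. The map phase emits records keyed by the partition column, and the reduce phase groups them, mirroring how refined data lands in date-partitioned tables.

```python
from collections import defaultdict

# Hypothetical raw log format: "<date> <path> <http_status>"
def map_phase(line):
    """Parse one raw log line and emit (partition_key, record),
    keyed by date to mirror a date-partitioned target table."""
    date, path, status = line.split()
    yield date, (path, int(status))

def reduce_phase(mapped):
    """Group records by partition key, as the shuffle/reduce step would."""
    partitions = defaultdict(list)
    for key, record in mapped:
        partitions[key].append(record)
    return dict(partitions)

raw = [
    "2017-03-01 /index.html 200",
    "2017-03-01 /login 404",
    "2017-03-02 /index.html 200",
]
refined = reduce_phase(kv for line in raw for kv in map_phase(line))
```

In a real job the map and reduce functions would run on the cluster (e.g. via Hadoop Streaming), with each partition key written out as a separate partition directory in HDFS.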