
Project Manager/Data Analyst Resume


Chicago, IL

PROFILE:

  • Big Data professional with 4+ years of experience across software development methodologies, including programming, analysis, design, deployment, and maintenance of applications in domains such as energy and retail.
  • Strong problem solver, responsive to the needs of clients, coworkers, and management; poised, resourceful, and adaptable to any office environment.
  • Detail-oriented, with a sharp awareness of organizational priorities and deadline schedules; eager to assume increasing levels of responsibility.
  • 2+ years of Hadoop development experience, with varying levels of expertise across different big data projects.
  • Ample experience with HDFS and its ecosystem tools: MapReduce, Hive, Impala, Spark, Pig, Avro, and Linux scripting for data processing and analysis; Sqoop and Flume for data migration and ingestion; Oozie and cron for scheduling.
  • 2+ years of extensive experience in MS SQL Server database design, development, and maintenance; implemented OLTP systems on MS SQL Server and OLAP systems with data mining functionality on the Microsoft BI stack for business intelligence applications.

TECHNICAL SKILLS:

Big Data Hadoop: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Impala, Spark, Avro

Amazon Web Services: IAM, S3, EC2

RDBMS: MS SQL Server 2008 R2, Oracle 10g, MySQL

NoSQL: HBase

Visualization and Data Management Tools: Podium, Tableau, SSIS, SSRS, SSAS

Programming Languages: Java, SQL, Linux shell, HQL, Pig Latin

IDEs and Tools: Eclipse 3.2, PuTTY, SSMS, Visual Studio 9.0, Amazon WorkSpaces, VMware, VirtualBox

Distributions: Microsoft, Cloudera, Hortonworks, Oracle, Amazon

PROFESSIONAL EXPERIENCE:

Project Manager/Data Analyst

Chicago, IL

Confidential

Roles & Responsibilities:

  • Actively participated in the data operations team in transitioning urology marketing science processes to the Hadoop environment.
  • Fetched data from various data vendor FTP sites to Amazon S3 over the SFTP protocol.
  • Accessed S3 from both public and private subnets.
  • Comprehensive experience with Podium, a data lake management system, connecting to sources such as FTP servers and S3 and loading data into HDFS.
  • Hands-on experience handling various Podium data issues, data set types, and their supporting properties.
  • Worked on Podium properties, issue handling, and onboarding new data feeds.
  • Managed Podium through its command-line Java APIs.
  • Optimized Hive queries using Apache Spark configuration properties.
  • Developed Linux scripts to fetch data from S3 into HDFS.
  • Optimized Hive scripts with Hive-on-Spark properties, bringing execution time down from 20 hours to 10 hours (see the tuning sketch after this list).
  • Programmed ETL functions between Oracle and Amazon Redshift (a load sketch also follows this list).
  • Developed Linux scripts integrating the Podium Java APIs to connect to S3 and load data into HDFS.
  • Developed Hive and Pig scripts tailored to be user-friendly for SAs and analytics purposes.
  • Optimized Impala queries.
  • Optimized various Hive scripts using different Hive settings and join optimizations.
  • Built UDFs and implemented Hive and Pig operations such as unions, groups, and filters.
  • Automated various Hive, Pig, and Linux scripts, using cron and Oozie schedulers to drive the data load process.
  • Handled new and ad hoc data feeds, building schemas from vendor data dictionaries and creating new tables.
  • Handled refresh, full, and delta loads.
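A minimal sketch of the Hive-on-Spark tuning referenced above, assuming a Cloudera-style Hive setup; the tables, columns, and property values are illustrative placeholders rather than the project's actual settings.

    -- Run Hive on the Spark execution engine instead of MapReduce.
    SET hive.execution.engine=spark;
    SET spark.executor.memory=4g;
    SET spark.executor.cores=4;

    -- Convert small-table joins to map-side joins and allow dynamic partitions.
    SET hive.auto.convert.join=true;
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    -- Hypothetical marketing-science aggregation joining a large fact table
    -- to a small vendor dimension; the WHERE clause restricts the months scanned.
    SELECT v.vendor_name,
           c.claim_month,
           COUNT(*) AS claim_count
    FROM   claims c
    JOIN   vendors v ON c.vendor_id = v.vendor_id
    WHERE  c.claim_month >= '2016-01'
    GROUP BY v.vendor_name, c.claim_month;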
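The Oracle-to-Redshift ETL mentioned above commonly stages extracts in S3 and loads them with Redshift's COPY command; this is a minimal sketch under that assumption, with hypothetical table, bucket, and IAM role names.

    -- Hypothetical target table in Redshift.
    CREATE TABLE IF NOT EXISTS sales_fact (
        sale_id     BIGINT,
        customer_id BIGINT,
        sale_date   DATE,
        amount      DECIMAL(12,2)
    );

    -- Load a gzipped, pipe-delimited extract staged in S3 (bucket and role are placeholders).
    COPY sales_fact
    FROM 's3://example-etl-bucket/oracle-extracts/sales_fact/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/example-redshift-load'
    DELIMITER '|'
    GZIP
    DATEFORMAT 'auto';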

 Hadoop Developer

Houston, TX

Confidential

Roles & Responsibilities:

  • Actively participated with the development team to meet the specific customer requirements and proposed effective Hadoop solutions.
  • Used the Hive data warehouse tool to analyze unified historical data in HDFS and identify issues and behavioral patterns.
  • Developed Pig scripts for handling raw data for analysis.
  • Worked closely with application developers and end users to analyze requirements.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several job types, including Java, MapReduce, Hive, Sqoop, and a few system-specific jobs.
  • Monitored cluster health on a daily basis, tuned performance-related configuration parameters, and backed up configuration XML files.
  • Monitored all MapReduce read jobs running on the cluster using Cloudera Manager and ensured they could read data from HDFS without issues.
  • Moved log files generated from various sources into HDFS for further processing.
  • Collected metrics for Hadoop clusters using Ganglia.
  • Supported data analysts in running MapReduce programs.
  • Developed Hive queries to process data and generate data cubes for visualization.
  • Responsible for deploying patches and remediating vulnerabilities.
  • Set up Test, QA, and Prod environments.
  • Worked extensively on Sqoop scripts for importing and exporting data to and from RDBMS sources.
  • Extracted data from various sources into HDFS for processing.
  • Streamed real-time log data from web servers into HDFS using Flume.
  • Implemented custom Flume interceptors to filter data and defined channel selectors to multiplex data into different sinks.
  • Created internal and external Hive tables as required, defined with appropriate static and dynamic partitions for efficiency (see the table sketch after this list).
  • Implemented Hive queries using buckets for time efficiency.
  • Designed and implemented Pig UDFs for evaluating, filtering, loading, and storing data.
  • Installed HBase and created HBase tables.
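A minimal sketch of the kind of partitioned, bucketed external Hive table described above; the schema, HDFS location, and bucket count are hypothetical.

    -- External table over raw web-server logs already landed in HDFS.
    CREATE EXTERNAL TABLE IF NOT EXISTS web_logs (
        event_time  STRING,
        user_id     STRING,
        url         STRING,
        status_code INT
    )
    PARTITIONED BY (log_date STRING)
    CLUSTERED BY (user_id) INTO 32 BUCKETS
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    STORED AS TEXTFILE
    LOCATION '/data/raw/web_logs';

    -- Static partition: register an existing day's directory with the table.
    ALTER TABLE web_logs ADD IF NOT EXISTS PARTITION (log_date = '2015-06-01');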

 Hadoop Developer

Houston, TX

Confidential

Roles & Responsibilities:

  • Developed solutions to process data into HDFS, analyzed it using MapReduce programs, and produced summary results from Hadoop for downstream systems.
  • Developed Hive jobs using HiveQL.
  • Wrote Hive queries and implemented different design patterns.
  • Developed Sqoop scripts to extract data from MySQL and load it into HDFS.
  • Applied Hive queries to perform data analysis on HBase through the HBase storage handler to meet business requirements (see the storage handler sketch after this list).
  • Developed UDF, UDAF, and UDTF functions and used them in Hive queries.
  • Wrote Hive queries to aggregate data to be pushed to HBase.
  • Developed scripts and batch jobs to schedule an Oozie bundle (a group of coordinators) consisting of various Hadoop programs.
  • Implemented dynamic partitions, bucketing, and sequence files.
  • Used the Avro data serialization system to handle Avro data files in Hive programs (an Avro table sketch also follows this list).
  • Implemented optimized Hive joins to gather data from different data sources.
  • Optimized Hive queries and joins to handle different data sets.
  • Configured Oozie schedulers to run different Hadoop actions on a timely basis.
  • Configured Flume sources, sinks, and memory channels to handle streaming data from server logs.
  • Designed and implemented a stream filtering system on top of Apache Kafka to reduce stream size.
  • Performed ETL, data integration, and migration using SSIS.
  • Used file formats such as text files, sequence files, and Avro with Hive SerDes.
  • Assisted in creating and maintaining technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
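A minimal sketch of a Hive table mapped onto HBase via the HBase storage handler, as referenced above; the table names, column family, and schema are hypothetical.

    -- Hive table backed by an HBase table; the column mapping ties the first Hive
    -- column to the HBase row key and the rest to the 'metrics' column family.
    CREATE TABLE hbase_user_metrics (
        user_id    STRING,
        page_views BIGINT,
        last_visit STRING
    )
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES (
        'hbase.columns.mapping' = ':key,metrics:page_views,metrics:last_visit'
    )
    TBLPROPERTIES ('hbase.table.name' = 'user_metrics');

    -- Aggregated results can then be pushed into HBase with a plain Hive INSERT.
    INSERT OVERWRITE TABLE hbase_user_metrics
    SELECT user_id, COUNT(*) AS page_views, MAX(event_time) AS last_visit
    FROM   page_events
    GROUP BY user_id;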
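Similarly, a sketch of an Avro-backed Hive table loaded through a dynamic-partition insert; it assumes a Hive version that accepts STORED AS AVRO (0.14+), and all names are placeholders.

    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    -- Avro-backed table; Hive derives the Avro schema from the column list.
    CREATE TABLE orders_avro (
        order_id    BIGINT,
        customer_id BIGINT,
        amount      DOUBLE
    )
    PARTITIONED BY (order_date STRING)
    STORED AS AVRO;

    -- Dynamic-partition insert: the partition value is taken from the last selected column.
    INSERT OVERWRITE TABLE orders_avro PARTITION (order_date)
    SELECT order_id, customer_id, amount, order_date
    FROM   orders_staging;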

 SQL Server Developer

Confidential

Roles and Responsibilities:

  • Designed, implemented, and administered SQL Server 2000 databases.
  • Implemented constraints (primary key, foreign key, default, unique, check, and null/not null) and indexes on tables (see the T-SQL sketch after this list).
  • Created and modified stored procedures, triggers, and views according to business requirements.
  • Developed various reports to validate the data between source and target systems.
  • Designed ETL processes to transfer customer-related data from MS Access and Excel to SQL Server.
  • Used Database Engine Tuning Advisor and SQL Profiler to monitor memory, processor, disk I/O, and SQL queries.
  • Well experienced in data extraction, transformation, and loading (ETL) using tools such as SQL Server Integration Services (SSIS), Import/Export Data, and Bulk Insert.
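A minimal T-SQL sketch of the constraint, index, and stored-procedure work described above; the table, column, and procedure names are hypothetical.

    -- Hypothetical tables illustrating the constraint types listed above.
    CREATE TABLE dbo.Customers (
        CustomerID  INT           NOT NULL PRIMARY KEY,
        Email       VARCHAR(255)  NOT NULL UNIQUE,
        CountryCode CHAR(2)       NOT NULL DEFAULT 'US',
        CreditLimit DECIMAL(12,2) NULL CHECK (CreditLimit >= 0)
    );

    CREATE TABLE dbo.Orders (
        OrderID    INT      NOT NULL PRIMARY KEY,
        CustomerID INT      NOT NULL FOREIGN KEY REFERENCES dbo.Customers (CustomerID),
        OrderDate  DATETIME NOT NULL DEFAULT GETDATE()
    );

    -- Nonclustered index to support lookups by customer.
    CREATE NONCLUSTERED INDEX IX_Orders_CustomerID ON dbo.Orders (CustomerID);
    GO

    -- Simple stored procedure returning a customer's orders.
    CREATE PROCEDURE dbo.usp_GetOrdersByCustomer
        @CustomerID INT
    AS
    BEGIN
        SET NOCOUNT ON;
        SELECT OrderID, OrderDate
        FROM   dbo.Orders
        WHERE  CustomerID = @CustomerID;
    END
    GO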
