
Senior Data Analyst Resume


Indianapolis, IN

SUMMARY

  • 8+ years of IT experience spanning data analysis and the Hadoop ecosystem
  • Good knowledge of Hadoop Architecture and various components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node and MapReduce concepts.
  • Experience developing MapReduce programs on Hadoop to process big data.
  • Experience in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java.
  • Experience in importing and exporting data using Sqoop from HDFS to relational database systems and vice versa.
  • Working experience designing and implementing complete end-to-end Hadoop infrastructure including Hive, Sqoop, Oozie, Spark and ZooKeeper.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Experience in writing shell scripts to dump shared data from MySQL servers to HDFS (illustrated in the sketch after this list).
  • Experience in designing both time driven and data driven automated workflows using Oozie.
  • Experience in providing support to data analysts in running Hive queries.
  • Good Knowledge on RDBMS concepts and writing complex queries to load and transform data.
  • Worked through the complete Software Development Life Cycle (analysis, design, development, testing, implementation and support) across application domains and technologies, from object-oriented development to Internet programming, on Windows, Linux and UNIX platforms, following RUP methodologies.
  • Experience in developing client- and server-side web applications, Web APIs and Windows services in Microsoft Visual Studio
  • Experience in generating BI reports using Power BI
  • Sound understanding of and good experience with object-oriented programming concepts
  • Adept at gathering and analyzing requirements and documentation, with a proven ability to solve problems and work on multiple projects simultaneously
  • Excellent leadership, interpersonal, problem solving and time management skills.
  • Excellent communication skills both written (documentation) and verbal (presentation).
  • Responsible team player who can also work independently with minimal supervision.
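
A minimal sketch of the kind of Sqoop-based shell script referenced above for dumping a MySQL table into HDFS and exposing it to Hive. The connection string, credentials, table and paths are hypothetical placeholders, not details from an actual engagement.

    #!/bin/bash
    # Nightly import of a MySQL table into HDFS, then register the partition in Hive.
    # All host, table and path names below are placeholders.
    set -euo pipefail

    LOAD_DATE=$(date +%Y-%m-%d)

    sqoop import \
      --connect jdbc:mysql://mysql-host:3306/sales_db \
      --username etl_user \
      --password-file /user/etl/.mysql.pwd \
      --table orders \
      --target-dir /data/raw/orders/dt=${LOAD_DATE} \
      --fields-terminated-by '\t' \
      --num-mappers 4

    # Make the newly landed files queryable from Hive
    hive -e "ALTER TABLE staging.orders ADD IF NOT EXISTS PARTITION (dt='${LOAD_DATE}')
             LOCATION '/data/raw/orders/dt=${LOAD_DATE}';"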

TECHNICAL SKILLS

Programming Languages: C, C++, Java, Python, Scala, UNIX Shell Scripting, SQL, HQL.

Scripting: HTML, JavaScript, CSS

Hadoop Ecosystem: MapReduce, HBase, Hive, Pig, Sqoop, ZooKeeper, Oozie, Flume, Hue, Kafka, Spark, SQL

Hadoop Distributions: Cloudera

Database: MySQL, NoSQL, Oracle DB

Data Visualization: Power BI, Tableau

Tools/Applications: Attunity, CA7, Git, Udeploy, MS-Excel, MS-Office, SharePoint.

Methodologies: Agile, SDLC

PROFESSIONAL EXPERIENCE

Confidential, Indianapolis, IN

Senior Data Analyst

Responsibilities:

  • Using Git for version control of the code and Udeploy to migrate the code from lower environments to production.
  • Performing data analysis in Socrata using its filtering, sorting and aggregation tools to extract insights from published datasets.
  • Performance monitoring of the production jobs/Hue workflows
  • Responsible for Implementation and support of the Hadoop environment
  • Establishing connectivity from Hadoop platform to data sources like Oracle Database, Mainframe system and others (Excel, flat files) for business purposes
  • Validating applications by running sample programs after applying the Spark upgrade.
  • Writing SQL queries to extract, transform and analyze data stored in Snowflake.
  • Designing Snowflake data models (schemas, data types and table relationships) to organize data for efficient analysis.
  • Exploring Socrata datasets with its visualization and analysis tools to identify patterns and insights in the data.
  • Managing Socrata data by creating and updating datasets, setting up data governance policies and administering user permissions.
  • Updating the SLA for every release and monitoring the data flow as per SLA standards
  • Moving and replacing data across locations for end users on the ad-hoc cluster with the help of the platform team
  • Running Hive and Impala queries to monitor the data flow (illustrated in the sketch after this list).
  • Preparing CA7 scripts for jobs and incorporating new code into the job scheduling
  • Manage and monitor the HDFS File system
  • Maintaining Sqoop import/export jobs, converting drivers to secure connections
  • Interacting with business users/clients to understand the requirements and being involved in the functional study of the application
  • Understanding existing changes, upgrades, migrations, tickets closing audit/exception & performing the impact analysis for the new/change requirements
  • Defining responsibilities, reviewing the code developed by the offshore team, and tracking the review comments for closure.
  • Coordinating with offshore teams & updating the status of work to clients
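
A minimal sketch, with hypothetical table, path and threshold choices, of the kind of Hive/Impala check used to monitor production data flow against the SLA mentioned above.

    #!/bin/bash
    # SLA check: confirm today's partition landed in HDFS and count its rows via Impala.
    # The feed directory and table names are placeholders.
    set -euo pipefail

    DT=$(date +%Y-%m-%d)
    FEED_DIR=/data/prod/claims/dt=${DT}

    # Fail fast if the feed has not arrived yet
    hdfs dfs -test -d "${FEED_DIR}" || { echo "SLA MISS: ${FEED_DIR} not found"; exit 1; }

    # -B gives plain delimited output suitable for scripting
    ROW_COUNT=$(impala-shell -B -q "SELECT COUNT(*) FROM prod.claims WHERE dt='${DT}';")
    echo "claims rows loaded for ${DT}: ${ROW_COUNT}"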

Environment: Hadoop Eco System, Jira, Linux, Putty, SecureFX, SQL, Python, Java, Udeploy, Scheduling, Attunity, MS-Excel, SharePoint.

Confidential, Boston, MA

Data Modeler Analyst

Responsibilities:

  • Designed and developed integration APIs using various data structure concepts and the Java Collection Framework, along with an exception handling mechanism, to return responses within 500 ms; used Java threading to handle concurrent requests.
  • Installed and configured Apache Spark, Hadoop Distributed File System (HDFS) and Apache Hive. Developed Spark scripts in Java to read/write JSON files (illustrated in the sketch after this list). Imported and exported data into HDFS and Hive using Sqoop.
  • Experience in running Apache Hadoop streaming jobs to process large amounts of XML format data and exporting analyzed data to relational databases using Sqoop.
  • Extracted relevant data from Tyler for analysis using its query builder and report writer tools.
  • Cleaned and preprocessed Tyler data, which is often complex, to ensure the accuracy of downstream analysis.
  • Hands-on statistical coding using R and advanced Excel
  • Handled sensitive EOE-related data with attention to ethical considerations around data privacy, confidentiality and security.
  • Conducted research in Social Media Analytics
  • Involved in collecting, processing, analyzing and reporting social media data of specific research topic
  • Worked on Tracking Community Development from Social Media using R
  • Explored Social Media Analysis on Community Development Practices based on the results from R
  • Performed data mining, data cleaning and data visualization on a variety of data stored in spreadsheets and text files, plotting the results with R packages
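
A minimal sketch of how the Java Spark JSON job described above might be submitted to the cluster; the class name, jar and HDFS paths are hypothetical placeholders.

    #!/bin/bash
    # Submit the Java Spark job that reads raw JSON from HDFS and writes processed output back.
    # Class, jar and path names are placeholders.
    spark-submit \
      --class com.example.etl.JsonEventsJob \
      --master yarn \
      --deploy-mode cluster \
      --num-executors 4 \
      --executor-memory 4g \
      /opt/jobs/json-etl-1.0.jar \
      hdfs:///data/raw/events/ hdfs:///data/curated/events/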

Environment: R-Studio, RPubs, Apache Spark, HDFS, Hive, Sqoop, Java, Excel

Confidential, MI

Data Analyst

Responsibilities:

  • Extensively worked with the business and data analysts in requirements gathering and to translate business requirements into technical specifications.
  • Built Informatica mappings and workflows to process data into the different dimension and fact tables.
  • Involved in designing data model with star schema as per the business requirements.
  • Developed complex mappings using Informatica Power Centre Designer to transform and load the data from various source systems like Flat files, XML, Oracle to Oracle target database.
  • Used look up, router, joiner, filter, source qualifier, aggregator, sequence generator, sorter and update strategy transformations extensively to load the data from flat files, tables and excel sheets.
  • Used session parameters and mapping variables/parameters, and created parameter files to enable flexible workflow runs based on changing variable values (illustrated in the sketch after this list).
  • Used Oracle performance tuning techniques to optimize SQL queries used in Informatica.
  • Tune existing Informatica mappings and SQL queries for better performance.
  • Developed transformation logic, Identifying and tracking the slowly changing dimensions, heterogeneous sources and determining the hierarchies in dimensions.
  • Used Mapplets and Worklets for reusability and to improve performance.
  • Used SQL and PL/SQL for data manipulation and worked with UNIX Shell Scripts.
  • Used debugger wizard to remove bottlenecks at source level, transformation level, and target level for the optimum usage of sources, transformations and target loads.
  • Used VB Scripts for updating and manipulating flat files being used by Informatica.
  • Analyzed EOE diversity and inclusion data, such as employee demographics, hiring practices and retention rates, to identify patterns and trends.
  • Applied diversity and inclusion concepts and best practices to interpret EOE-related data and make data-driven recommendations for improving diversity and inclusion initiatives.
  • Built data visualizations to present diversity and inclusion data clearly and compellingly to stakeholders.
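
A minimal sketch of how a parameter file and a pmcmd call can drive the flexible Informatica workflow runs mentioned above; the service, domain, folder, workflow and variable names are hypothetical placeholders.

    #!/bin/bash
    # Parameter file /informatica/params/wf_load_sales_dims.txt contains, for example:
    #   [DW_FOLDER.WF:wf_load_sales_dims]
    #   $$LOAD_DATE=01/31/2015
    #   $$SOURCE_SYSTEM=ORDERS
    # Credentials are assumed to come from environment variables set outside this script.

    # Kick off the workflow with that parameter file and wait for it to finish
    pmcmd startworkflow -sv INT_SVC -d DOMAIN_DEV -u "${INFA_USER}" -p "${INFA_PWD}" \
      -f DW_FOLDER -paramfile /informatica/params/wf_load_sales_dims.txt \
      -wait wf_load_sales_dims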

Environment: ETL Informatica Power Center, Oracle, Teradata, SQL Server, RedHat Linux, Perl, MS-Project, Visual Source Safe.

Confidential

Data Analyst

Responsibilities:

  • Worked collaboratively with different teams to smoothly slide the project to production.
  • Worked on Hadoop Ecosystem using different big data analytic tools including Hive, Pig.
  • Involved in loading data from LINUX file system to HDFS.
  • Importing and exporting data into HDFS and Hive using Sqoop.
  • Implemented partitioning and bucketing in Hive (illustrated in the sketch after this list).
  • Created HBase tables to store various data formats of incoming data from different portfolios.
  • Created Pig Latin scripts to sort, group, join and filter the enterprise-wide data.
  • Executed Hive queries on Parquet tables stored in Hive to perform data analysis and meet the business requirements.
  • Provided daily production support to monitor and troubleshoot Hadoop/Hive jobs.
  • Scheduled the Oozie workflow engine to run multiple Hive and Pig jobs
  • Ran Hadoop Streaming jobs to process terabytes of JSON-format data.
  • Worked with multiple input formats such as text, key-value and sequence file.
  • Worked on different file formats (ORCFILE, TEXTFILE) and different compression codecs like GZIP, Snappy and LZO
  • Used Shell Scripting to automate Hadoop Jobs.
  • Used ZooKeeper for centralized service configuration and synchronization.
  • Used MySQL for metadata storage and retrieval.
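
A minimal sketch of the kind of shell-automated Hive job referenced above, combining partitioning and bucketing on a Parquet table; the database, table and column names are hypothetical placeholders.

    #!/bin/bash
    # Daily load of a partitioned, bucketed Parquet table in Hive (placeholder names).
    DT=$(date +%Y-%m-%d)

    hive -e "
    CREATE TABLE IF NOT EXISTS analytics.txn_parquet (
      txn_id     BIGINT,
      account_id STRING,
      amount     DOUBLE
    )
    PARTITIONED BY (dt STRING)
    CLUSTERED BY (account_id) INTO 16 BUCKETS
    STORED AS PARQUET;

    -- needed on older Hive releases so inserts honor the bucket definition
    SET hive.enforce.bucketing=true;
    INSERT OVERWRITE TABLE analytics.txn_parquet PARTITION (dt='${DT}')
    SELECT txn_id, account_id, amount
    FROM staging.txn_raw
    WHERE load_dt = '${DT}';
    "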

Environment: HDFS, Pig, Hive, Sqoop, Shell Scripting, HBase, ZooKeeper, MySQL.

Confidential

Support Data Analyst

Responsibilities:

  • Worked with business and business analysts to translate business requirements into technical specifications.
  • Laid out specifications for writing data extracts to Excel files so that business users can analyze the extracts in spreadsheets. These extracts are also used for high-level business reporting.
  • Developed around 40 mappings and documented system processes, procedures and set ups.
  • Helped in creating and scheduling batches and sessions using the server manager.
  • Tested the data extracts to check if they conform to the business rules.
  • Maintained specific functional area documentation in a library defined by the technical lead.
  • Created source to target data mapping documents.
  • Identified, developed and documented requirements and functional specifications for change requests.
  • Performed extensive SQL, shell and PL/SQL scripting for regular maintenance and production support, loading the target database at regular intervals (illustrated in the sketch after this list).
  • Developed robust mappings with built in reconciliation and error handling.
  • Participated in first level problem resolution tasks.
  • Recommended methods to improve ETL performance.
  • Assisted technical client liaison with Knowledge transfer, system transitions and mentoring.
  • Provided documented operation and user procedures, when required, and performed supporting training for users.
  • Extensively involved in the testing phase before the application moved to production.
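
A minimal sketch of the kind of scheduled SQL/PL/SQL maintenance script described above; the connect string, schema and procedure names are hypothetical placeholders, and credentials would normally come from a secured environment file rather than the script itself.

    #!/bin/bash
    # Regular-interval load of the target database ahead of the Informatica batch.
    set -euo pipefail

    sqlplus -s "${ORA_USER}/${ORA_PWD}@DWHPROD" <<'EOF'
    WHENEVER SQLERROR EXIT FAILURE
    BEGIN
      -- refresh the staging extracts consumed by downstream loads (hypothetical package)
      dw_maint.refresh_stage_extracts(p_run_date => TRUNC(SYSDATE));
      COMMIT;
    END;
    /
    EXIT
    EOF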

Environment: Informatica Power Center 7.1.2/6.2, Oracle 9i, DB2, Sybase, COBOL, SQL, Erwin, Windows NT, UNIX (Sun Solaris).
