
Big Data Analyst Resume


Madison, WI

SUMMARY

  • Currently working as a Big Data Analyst on the DSS Advanced Business Intelligence and Infrastructure Analytics - Data Management Team at TDS.
  • Working with a CDH 5.3 cluster and its services and instances.
  • Working with Apache Spark for batch and interactive processing; developing and running Spark applications and using Spark with other Hadoop components.
  • Performing Extract, Transform, and Load (ETL) operations using Morphlines.
  • Working with the decision support team on deployment planning for Cloudera Search, JVM memory management, and custom JAR files implementing business logic.
  • Experienced in Solr server tuning.
  • Good knowledge of and experience with Flume, Solr, Impala, HBase, Mahout, and MapReduce.
  • Working in a Linux environment, making efficient use of Linux scripts and system administration processes.
  • Involved in Impala deployment, performance tuning, and troubleshooting.
  • Writing HiveQL with performance tuning and monitoring; using event filters and writing JSON objects to define and match audit events.
  • Good experience and knowledge with Impala, Sqoop, Spark, Crunch, Pig, Avro, Parquet, HUE, Oozie, and Flume.
  • Good experience with the Hive metastore database and logs, and with configuring proxy user groups for access control and security management.
  • Good experience in data modeling and data analysis, with very strong back-end SQL.
  • Experience with Hortonworks, Cloudera, Amazon Web Services (AWS), and Cloud9.
  • Good understanding of and hands-on exposure to big data concepts, including Hive and Pig queries.
  • Advanced skills in Excel and Access.
  • Experience with Hive queries, Pig scripts, Python in the Hadoop ecosystem, R scripting and programming, and Tableau visualization and analytics.
  • Experience in the data modeling process: conceptual, logical, and physical models, ERDs, data warehouse design (dimensional, star, and snowflake schemas), data integrity, OLAP, facts, indexing, and data dictionaries.
  • Very good professional experience in all phases of the Software Development Life Cycle (SDLC), including design, implementation, and testing of software applications, along with Hadoop, big data, Oracle Database (11g, 12c), SQL Server 2012, MySQL, Derby, MariaDB, Oracle Database Warehouse (DBW), Informatica PowerCenter, IBM Data Manager for BI, data analysis, data extraction, data transformation and loading, data marts, and the IBM Cognos reporting tool for creating cubes and reports.
  • Knowledge of DB2 and COBOL.
  • Expertise in ETL operations and tools.
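The ETL work described above can be illustrated with a minimal stdlib Python sketch of an extract-transform-load pass over flat-file records (Morphlines itself is driven by configuration files; the field names and values here are invented for illustration):

```python
import csv
import io
import json

# Hypothetical raw billing records, as they might arrive from a flat-file feed.
RAW = """customer_id,amount,currency
1001, 42.50 ,usd
1002,19.99,USD
1003,,usd
"""

def extract(text):
    """Extract: parse CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim whitespace, normalize currency, drop rows missing an amount."""
    out = []
    for r in rows:
        amount = r["amount"].strip()
        if not amount:
            continue  # reject incomplete records
        out.append({
            "customer_id": int(r["customer_id"]),
            "amount": float(amount),
            "currency": r["currency"].strip().upper(),
        })
    return out

def load(rows):
    """Load: serialize to JSON lines, standing in for a write to HDFS/Hive."""
    return "\n".join(json.dumps(r) for r in rows)

cleaned = transform(extract(RAW))
print(load(cleaned))
```

The reject-and-normalize steps mirror what a Morphlines command chain would do record by record.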

TECHNICAL SKILLS

  • Hadoop Tools & Concepts: Cloudera, CDH5 Hadoop cluster, YARN, Spark, Hive, Impala, HDFS, HBase, Flume, HCatalog, Pig, MapReduce, Hadoop ecosystem, ZooKeeper, Sqoop, Mahout (machine learning library), R connectors
  • Programming & Scripting Languages: Java, Python, PHP, JavaScript, PL/SQL, SQL, HTML, XHTML, XML, UNIX shell scripting; knowledge of functional programming, C, C++, COBOL, and Visual Basic
  • Java/J2EE: Core Java, Java SE, J2EE Common Services APIs, web services, JSP, Servlets, Struts-driven web sites, JDBC connections, Hibernate, Applets, SOAP, RESTful and RESTless services, JUnit, Eclipse, MyEclipse, IBM WebSphere
  • Databases: Oracle 11g, Oracle 12c, MS Access, MySQL, SQL Server 2012, Derby, MariaDB; NoSQL: MongoDB, Couchbase, Cassandra; PostgreSQL, Oracle Siebel CRM
  • Data Warehouse: Kimball DW/BI lifecycle methodology
  • Data Modeling Tools: Erwin, IBM InfoSphere, DIA, Microsoft Visio
  • Web Server: Apache Tomcat 7.0, Derby
  • ETL/BI Tools: SAP Webi, Crystal Reports, IBM Data Manager, Informatica PowerCenter
  • Designing and Modeling Tools: OOAD, DIA, MS Visio, StarUML, IBM Framework Manager
  • Tools & Utilities: Eclipse IDE, Netscape, JDK 1.6, SQL*Plus, SQL & PL/SQL Developer, SQL*Loader, CVS, SVN, JIRA, JAMA, IBM RAD, WebSphere, Golden 6, Erwin
  • Domains: Manufacturing, Healthcare, Medical Insurance, Telecom, Automotive, Finance
  • Internet Technologies: Oracle Web Toolkit, web services
  • Analytics & Visualization Tools: IBM Cognos, RStudio, R programming, Tableau 8.0, Pentaho, advanced Excel (VBA, macros), Microsoft Power BI
  • Methodologies & Frameworks: Agile, Scrum, Lean Software Development (LD), Extreme Programming (XP), Rapid Application Development (RAD), Waterfall
  • Operating Systems: Windows Vista/XP/2000/7/8, UNIX, Ubuntu

PROFESSIONAL EXPERIENCE

Confidential, Madison, WI

Big Data Analyst

Responsibilities:

  • Working with telecom billing and financial data, and holding discussions with stakeholders to decide on designs and migrations.
  • Writing HiveQL and managing the Hive metastore server to control various advanced activities.
  • Managing logs and monitoring health checks on the Hive metastore server.
  • Working with statistical analysis patterns and creating dashboards for quick reference, shared with internal customers on a daily, weekly, or monthly basis.
  • Worked on streaming and data warehousing projects.
  • Worked with JSON scripts, MongoDB, and UNIX environments on NoSQL data cleanup and grouping, and created analysis reports.
  • Writing Python scripts and Java code for business applications and MapReduce programs.
  • Working with the Hive warehouse directory and with Hive tables and services.
  • Working knowledge of policy-file-based Sentry.
  • Working with Cloudera Manager as an administrator on data management and operations.
  • Using Apache Spark for streaming applications and writing APIs in Scala, Python, and Java.
  • Implementing machine learning algorithms through the MLlib and GraphX APIs for graph-parallel computation.
  • Using and implementing Kerberos identity verification in cluster security management.
  • Good understanding of the Kerberos server and Kerberos principals.
  • Working with the Thrift JDBC and ODBC servers, with exposure to the cluster manager and other features.
  • Expertise in Spark authentication and encryption, and in managing and monitoring Spark applications.
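The MapReduce programs mentioned above follow a map-then-reduce shape that can be sketched in plain Python as a toy word count (the input lines are invented for illustration; a real job would run over HDFS input splits):

```python
from collections import Counter
from functools import reduce

# Hypothetical log lines standing in for an HDFS input split.
LINES = [
    "spark hive spark",
    "impala hive",
]

def mapper(line):
    """Map phase: emit (word, 1) pairs for each word in the line."""
    return [(word, 1) for word in line.split()]

def reducer(acc, pair):
    """Reduce phase: sum the counts per word into the accumulator."""
    word, count = pair
    acc[word] += count
    return acc

# Flattening all mapper outputs stands in for the shuffle step.
pairs = [p for line in LINES for p in mapper(line)]
counts = reduce(reducer, pairs, Counter())
print(dict(counts))  # {'spark': 2, 'hive': 2, 'impala': 1}
```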

Environment: Hadoop ecosystem, CDH 5.3, HDFS, HiveQL, HBase, Hortonworks & Cloudera, MongoDB, BI Launch Pad, SAP BusinessObjects Webi, Crystal Reports, MySQL, Oracle Database, SQL Server 2012, Python, SQL Developer, CMC, SAP Business Intelligence client, Golden 6, Java, Tableau 10.3

Confidential, Madison, WI

Data Analyst

Responsibilities:

  • Working on building cubes and reports.
  • Creating business-metric KPIs (Key Performance Indicators) to evaluate factors across different modules.
  • Creating catalogs in Data Manager: fact builds, dimension builds, and reference dimensions; customizing, creating, and deploying ETL jobs from transaction data.
  • Working on database migration, upgrades, and maintenance.
  • Cleansing, mapping, and transforming data; creating job streams and adding or deleting components in a job stream in Data Manager based on requirements.
  • Using ServiceNow for incident management.
  • Scheduling jobs through the Windows job scheduler.
  • Migrating data from the Dev environment to PROD.
  • Running ETL processes on data warehousing and OLAP data.
  • Creating look-up tables for data processing.
  • Experience creating different reports and cubes using Cognos.
  • Building pivot tables and creating various analysis reports using MS Excel.
  • Using statistical R packages and R programming for factor analysis, quantitative analysis, and k-means clustering.
  • Using a Python script API for R programming analysis.
  • Created Tableau reports and dashboards and distributed them in PDF format to the administration team.
  • Worked on database migration, BI integration, and cloud conversion.
  • Performed big data architectural analysis and Hadoop ecosystem migration with VMs.
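The k-means clustering mentioned above (done in R) can be illustrated with a tiny stdlib Python sketch on one-dimensional data; the data values and initial centers below are invented for illustration:

```python
import statistics

# Hypothetical 1-D metric values to cluster (e.g., per-customer usage scores).
DATA = [1.0, 1.2, 0.8, 8.0, 8.5, 7.9]

def kmeans_1d(data, centers, iterations=10):
    """A minimal 1-D k-means: assign each point to its nearest center,
    then move each center to the mean of its assigned points."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for x in data:
            nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[nearest].append(x)
        # Empty clusters keep their previous center.
        centers = [statistics.mean(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

centers, clusters = kmeans_1d(DATA, centers=[0.0, 10.0])
print(centers)
```

In practice this is what R's `kmeans()` does under the hood, generalized to many dimensions and with smarter initialization.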

Confidential, Madison, WI

Web Analyst

Responsibilities:

  • Worked as an analyst in back-end operations with object-oriented programming.
  • Handled job scheduling and PL/SQL module development using advanced PL/SQL concepts such as collections and dynamic SQL.
  • Wrote code to interface with external Java applications.
  • Performed performance tuning and SQL query tuning; wrote and tuned PL/SQL code effectively to maximize performance.
  • Involved in system analysis and design as well as object-oriented design and development, using OOAD methodology to capture and model business requirements.
  • Responsible for module development, web services (RESTful, RESTless), frameworks (MVC), Java design patterns (Factory, Singleton, etc.), and JVM memory management and tuning.
  • Designed new pages, mapped objects, and migrated pages to HTML5 tag changes.
  • Performed incident management through JIRA and JAMA.
  • Carried out development and unit testing using IBM WebSphere.
  • Actively worked as the database administrator and handled database architecture using Erwin and the Kimball methodology.
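The Factory design pattern mentioned above can be sketched briefly; the original work was in Java, but the same shape reads compactly in Python (all class and function names here are invented for illustration):

```python
# A minimal sketch of the Factory pattern: callers ask for a product by
# name and the factory picks the concrete class, so the caller never
# depends on the subclasses directly.

class Report:
    """Abstract product."""
    def render(self):
        raise NotImplementedError

class PdfReport(Report):
    def render(self):
        return "PDF report"

class HtmlReport(Report):
    def render(self):
        return "HTML report"

def report_factory(kind):
    """Factory: map a requested kind to a concrete Report subclass."""
    registry = {"pdf": PdfReport, "html": HtmlReport}
    try:
        return registry[kind]()
    except KeyError:
        raise ValueError(f"unknown report kind: {kind!r}")

print(report_factory("pdf").render())  # PDF report
```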

Environment: Java EE, JSP, Servlets, JSF, Spring DI/IoC, Hibernate, XML, HTML, JS, CSS, DB2, web services, Rational Software Architect, WebSphere Application Server, UNIX, JUnit, Log4j, SVN, Linux/Windows, Oracle 11g, JIRA, JAMA, Wiki, Kimball, Erwin
