Senior Data Analyst Resume
SUMMARY:
- 5 years of work experience in the software industry, mainly as a Big Data and Spark Developer on data warehousing projects at Confidential (System Engineer) and Confidential (Senior Data Analyst)
- Exposure to emerging technologies such as Big Data (HDFS, Hadoop, Hive, Impala, YARN, Spark, Sqoop, Oozie, Python, Core Java), Oracle 11g PL/SQL, data warehousing, Kafka, and Flume
- Informatica PowerCenter 9.6, basics of machine learning with Python, QlikView, Tableau, and basic knowledge of React JS and JavaScript
PROFESSIONAL SKILLS AND COMPETENCIES:
- Professional work experience on data warehousing projects
- Hands-on experience in HDFS, Hive, Spark Core, Spark SQL, Python, Impala, Big Data/Hadoop, Sqoop, Oozie, and Pig
- Extensive hands-on experience developing a PySpark framework using Python libraries (pandas, datapackage, scikit-learn, matplotlib, etc.) and Spark data structures (RDD, DataFrame, and Dataset); see the sketch after this list
- Used Git with Bitbucket and Artifactory for repository management, and Jira for SAFe (Scaled Agile)
- Experience scheduling jobs in Oozie, CA7, and Control-M
- Worked with Anaconda Navigator (Jupyter Notebook) and Spyder for Python and Spark development, Informatica PowerCenter 9.6 for ETL mappings, Teradata Studio and PL/SQL Developer for SQL queries, and the Hue web interface for Hive and Impala queries and Hadoop file management
- Working knowledge of reporting tools such as BO, OBIEE, Tableau, and QlikView
- Basic knowledge of machine learning and deep learning with Python (scikit-learn, Seaborn, matplotlib, pandas, goodtables, datapackage), plus React JS and JavaScript
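A minimal sketch of the Spark data structures and pandas interop named above, assuming a local Spark session; the file path and column names are hypothetical, not taken from any project described here.

```python
# Minimal PySpark sketch: DataFrame, RDD, and pandas interop.
# The file path and column names below are hypothetical examples.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sketch").getOrCreate()

# DataFrame API: read a CSV feed and aggregate.
df = spark.read.csv("feeds/sample.csv", header=True, inferSchema=True)
df.groupBy("region").count().show()

# RDD API: the same rows at a lower level, for custom transformations.
counts = df.rdd.map(lambda r: (r["region"], 1)).reduceByKey(lambda a, b: a + b)
print(counts.collect())

# pandas interop, e.g. for matplotlib plots or scikit-learn feature prep.
pdf = df.toPandas()
```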
KEY AREAS OF WORK:
- Creating complex Hive, Spark SQL, and Impala queries
- PySpark application design
- Connecting Spark to Oracle through the JDBC driver (see the sketch after this list)
- Data ingestion using Sqoop
- Creating PL/SQL procedures
- ETL Design & Development
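A minimal sketch of the Spark-to-Oracle JDBC connection referenced above; the host, service name, table, and credentials are placeholders, and the Oracle JDBC jar must be on the Spark classpath.

```python
# Minimal sketch: read an Oracle table into a Spark DataFrame over JDBC.
# URL, table name, and credentials are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("oracle-jdbc").getOrCreate()

df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@//db-host:1521/ORCLPDB")
    .option("dbtable", "SCHEMA.SOURCE_TABLE")
    .option("user", "etl_user")
    .option("password", "***")
    .option("driver", "oracle.jdbc.OracleDriver")
    .load()
)
df.show(5)
```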
PROFESSIONAL EXPERIENCE:
Confidential
Senior Data Analyst
Environment: Anaconda Navigator (Jupyter), Spark 2, PySpark, Hive Shell, Unix, Hue, WinSCP, Python
Responsibilities:
- Managed the ISR and BCM raw feed files provided by the corresponding users in a designated server location; extracted and loaded those feed files into partitioned Hive tables (partitioning based on requirements) in the SDL (Secure Data Lake) feed framework.
- Applied business logic and transformations to create intermediate views on that data, then loaded the derived data back into Hive tables in the IRRM project space; for reporting, QlikView processes the derived data and generates reports. A sketch of this flow follows.
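A minimal sketch of this feed-to-Hive flow, assuming a Hive-enabled Spark session; the paths, schema and table names, partition column, and business logic are hypothetical placeholders.

```python
# Minimal sketch of the feed-load flow: raw file -> partitioned Hive table
# -> intermediate view -> derived Hive table. All names are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder.appName("sdl-feed-load")
    .enableHiveSupport()
    .getOrCreate()
)

# Load a raw feed file and stamp a partition column onto it.
raw = spark.read.csv("/landing/isr/feed.csv", header=True, inferSchema=True)
raw = raw.withColumn("business_date", F.lit("2020-01-31"))

# Append into a partitioned Hive table in the feed-framework schema.
raw.write.mode("append").partitionBy("business_date").saveAsTable("sdl.isr_feed")

# Apply (placeholder) business logic as an intermediate view, then persist
# the derived data for downstream reporting.
spark.sql("""
    CREATE OR REPLACE TEMPORARY VIEW isr_derived AS
    SELECT business_date, COUNT(*) AS record_count
    FROM sdl.isr_feed
    GROUP BY business_date
""")
spark.table("isr_derived").write.mode("overwrite").saveAsTable("irrm.isr_summary")
```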
Confidential
Environment: Anaconda Navigator (Jupyter), Spark 2, PySpark, Hive Shell, Unix, Hue, WinSCP, Python
Big Data and Spark Developer
Responsibilities:
- Created mappings, sessions, command tasks, workflows, CA7 jobs, and Unix scripts for email generation
Confidential
Environment: Python, Hive, Impala, Spark, Sqoop, Teradata, Hadoop
Hadoop Developer
Responsibilities:
- Involved in creating a framework from scratch that replaced the existing system's ETL (see the ingestion sketch below).
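One way such a framework might wrap Sqoop for ingestion, sketched in Python; this is an illustration only, with a hypothetical Teradata connection string, credentials path, and table names, and it assumes a Teradata JDBC connector is available to Sqoop.

```python
# Hypothetical sketch: drive a Sqoop import from a Python framework.
# Connection string, credentials path, and table names are placeholders.
import subprocess

sqoop_cmd = [
    "sqoop", "import",
    "--connect", "jdbc:teradata://td-host/DATABASE=sales",
    "--username", "etl_user",
    "--password-file", "/user/etl/.password",
    "--table", "ORDERS",
    "--hive-import",
    "--hive-table", "staging.orders",
    "--num-mappers", "4",
]
subprocess.run(sqoop_cmd, check=True)
```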
Confidential
Developer
Responsibilities:
- Designed and developed Extract, Transform, and Load (ETL) processes using Informatica PowerCenter and Oracle Warehouse Builder; optimized PL/SQL code, wrote efficient SQL queries, and improved data quality through data analysis.
- Developed and performance-tuned the ETL code populating multiple layers (Stage, Integration (IDS), Data Mart, and Aggregate Data Mart) using Informatica, OWB, and PL/SQL.
- Supported data load processes by creating process flows and Control-M schedules.
- Analyzed functional requirements and planned execution to meet them during the development phase.
- Prepared a daily report in Microsoft Excel that is shared with the client.