Data Scientist/Spark Engineer Resume
Grand Rapids, MI
SUMMARY
- A data geek with 5 years of progressive experience in Data Analytics, Visualization and Machine Learning. Excellent capability in collaboration, quick learning and adaptation.
- Experience in data integration, profiling, validation, cleansing/transformation and visualization using R.
- In-depth knowledge of and hands-on experience with Big Data.
- Experience in manipulating large data sets with R packages.
- Completed an intensive hands-on boot camp in Data Analytics, spanning statistics to programming, including data engineering, data visualization, machine learning and programming in R and SQL.
- Experience in descriptive analysis problems such as frequent pattern mining, clustering and outlier detection.
- Good exposure to Python and libraries like NumPy, pandas, Matplotlib, scikit-learn and SciPy.
- Good exposure to deep learning with TensorFlow in Python.
- Good knowledge in Tableau for interactive data visualizations.
- Good understanding of NoSQL databases like MongoDB.
- Experience and Knowledge in developing software using Java.
- Good exposure to creating pivot tables and charts in Excel.
- Good knowledge of ETL tools like Informatica.
TECHNICAL SKILLS
Technical Skills: Machine Learning and Data Mining, Statistics
Programming Skills: R, Python, SQL, Hive, Hadoop
Data Sources: HDFS, Oracle, MySQL, MongoDB, Excel
Data Visualization: R, Python, Tableau
Data Exploration: R, Python
Repository: GitHub
Operating Systems: Windows, Linux
PROFESSIONAL EXPERIENCE
Confidential
Data Scientist/Spark Engineer
Responsibilities:
- Extracted data from Hive tables by writing efficient Hive queries
- Developed signals using machine learning algorithms
- Performed dimensionality reduction using near-zero variance and correlation techniques
- Built a Linear Regression model to predict the Sector Failure Rate of a die at a given confidence level
- Used R for data manipulation of large datasets
- Implemented different models like ANOVA, Logistic Regression, Random Forest and Gradient-Boosted Trees to predict whether given data will pass or fail the test
Confidential
Data Scientist
Responsibilities:
- As part of an enhancement, added an additional parameter to the URL to redirect the user to the content based on the value of the added parameter
- Used the Hive data warehouse tool to analyze data in HDFS and developed Hive queries
- Used the pandas and NumPy libraries to analyze data and process it in HDFS
- Developed simple to complex MapReduce jobs using Hive
- Used machine learning algorithm APIs to derive useful insights
- Responsible for creating Hive tables, loading the structured data resulting from MapReduce jobs into the tables and writing Hive queries to further analyze the data
- Worked on Spark architecture and implementation in Java/Scala
Environment: Java, Hive, MapReduce, RDBMS, Spark, R, Scala, Python
Confidential, Grand Rapids, MI
Business Intelligence Analyst
Responsibilities:
- Prepared Dashboards using calculations, parameters, calculated fields, groups, sets and hierarchies in Tableau.
- Strong experience in data analysis, migration, cleansing, transformation and integration using SQL queries
- Performed data quality analysis with the web tool REDCap using R, SPSS and Excel
- Transferred big data between the IT department and the research department; solved technical problems with the software and the database using SQL queries and commands
- Used data analysis tools and techniques such as SharePoint, Microsoft Access, MS Word, MS Excel (VLOOKUP, join, grouping, MATCH, INDEX, pivot tables), MS PowerPoint, MS Outlook, Google Apps
- Resolved and cleared IT tickets
- Assisted in designing and implementing IT procedures, project management procedures and activities
- Served as a project manager for own projects
- Created a master password list using LastPass
Confidential, Grand Rapids, MI
Research and Teaching Assistant
Responsibilities:
- Developed the nursing department’s website and conducted literature research
- Prepared instructions for bioinformatics labs for undergraduate and graduate students
- Conducted online competitive research on community colleges and developed strategic plans for KCON