Sr. Data Analyst / Engineer Resume
Newark, NJ
SUMMARY:
- Currently working as a Big Data Analyst, applying Advanced Business Intelligence and Analytics - Infrastructure.
- 6+ years of experience in Data Analysis, Machine Learning, Data Mining with large data sets of Structured and Unstructured data, Data Acquisition, Data Validation, Predictive Modeling, Data Visualization, and Web Scraping.
- Adept in statistical programming languages such as R and Python, and in Big Data technologies such as Hadoop, Hive, Spark, PySpark, and AWS.
- Experience in Statistical modeling, Multivariate Analysis, model testing, problem analysis, model comparison, and validation.
- Skilled in data parsing, data manipulation, and data preparation, using methods such as describing data contents, computing descriptive statistics, regex, split and combine, remap, merge, subset, reindex, melt, and reshape.
- Experience with Data Analytics, Data Reporting, Ad-hoc Reporting, Graphs, Scales, PivotTables and OLAP reporting.
- Highly skilled in using visualization tools such as Tableau, ggplot2, Dash, and Flask for creating dashboards.
- Hands-on experience with big data tools such as Hadoop, Spark, Hive, Pig, PySpark, and Spark SQL.
- Hands-on experience implementing LDA and Naive Bayes; skilled in Random Forests, Decision Trees, Linear and Logistic Regression, SVM, Clustering, Neural Networks, and Principal Component Analysis.
- Experience with object-oriented/object function scripting languages: Python, Java, C++.
TECHNICAL SKILLS:
Big Data Ecosystem: Cloudera / Hortonworks Hadoop, Hive, Pig, HBase, Apache Spark, PySpark
Python Libraries: NumPy, Matplotlib, NLTK, Statsmodels, Scikit-learn/sklearn, SOAP, SciPy
Python Frameworks: Pandas, Flask, Django, Docker
Languages: Python, Java, J2EE, SQL
Data Visualization: Tableau, SAS, Power BI
Databases: Oracle, MySQL, AWS Redshift
Database Dev: T-SQL, PL/SQL, UNIX Shell Scripting
NoSQL Databases: HBase, MongoDB, Cassandra
ETL Tools: AWS Glue, Informatica, Airflow
Data Modeling: Star-Schema Modeling, Snowflake-Schema Modeling, Fact and Dimension Tables, Pivot Tables, Erwin
Reporting Tools: Business Objects 6.5, XI R3, Cognos 8 Suite
Data Warehousing: Informatica 9.1/8.6/7.1.2 (Repository Manager, Designer, Workflow Manager, and Workflow Monitor), SSIS, DataStage 8.x
PROFESSIONAL EXPERIENCE:
Confidential, Newark, NJ
Sr. Data Analyst / Engineer
Responsibilities:
- Created an aggregated daily reporting system that lets clients analyze market trends and make investment decisions.
- Built an internal visualization platform for clients to view historical data, compare issuers, and run analytics across different bonds and markets.
- The model collects and merges daily data from market providers and applies cleaning techniques to eliminate bad data points.
- The model merges the daily data with the historical data and applies various quantitative algorithms to determine the best fit for the day.
Confidential, Roseland, NJ
AWS Architect
Responsibilities:
- Worked on the AWS architecture for next-generation cloud services using EMR, Spark, Lambda, and Redshift, which resulted in cost savings of about $50,000 per month for the company.
- Introduced and defined a new reporting tool with reports that assess data gaps, outliers, potential errors, and time series. The tool provides recommendations and visualizations for the data governance and data excellence teams through Amazon QuickSight.
Confidential, New York, NY
AWS Architect
Responsibilities:
- Defined and worked on the migration of data to an AWS architecture for next-generation cloud services.
- Used AWS technologies such as EMR, Spark, Lambda, and Redshift throughout the project.
- The project provided better investment performance for both clients and management.
- Millions of trades were collected overnight; clickthrough and web logs were loaded into the analytics platform so that large-scale ingest and EMR jobs could run to identify trading activity. Because the system was automated, analytical reports were available daily to data analysts and traders, which gave traders a significant advantage.
- Improved data processing and storage throughput by using Amazon S3 and the Hadoop framework for distribution and computing across a cluster of up to twenty-five nodes, and by implementing indexes and parallel processing.
- Worked with machine learning teams to identify various datasets needed for processing.
Confidential, Boston, MA
Associate Cloud Engineer
Responsibilities:
- Responsible for the data, ETL, and reporting architecture of overnight batch loads.
- Worked mainly with analytics and finance user groups to design data warehouse enhancements and to identify bottlenecks and slow-running processes, which helped improve job speed.
- Migrated the legacy data warehouse to newer Informatica workflows and database objects. Using a snowflake schema design, defined the ETL architecture for staging, the reporting schema, and data models.
- Explored proofs of concept from scratch for integrating existing systems with new systems, to meet specific business requirements for reducing cost and processing time.
Confidential, New York, NY
Associate Cloud Engineer
Responsibilities:
- Worked on data warehouse and ETL projects for the Data Governance and Master Data Management teams to implement a firm-wide anti-money-laundering system.
- Worked on a project to implement a portfolio reconciliation data mart to manage high-value and high-trading-potential customers, so that sales teams could reach out to more profitable customers and assign more experienced bankers to them.
- Used Amazon QuickSight to provide real-time visualizations of how much of the batch process succeeded; if there are outliers, missing data, potentially erroneous data, or imbalances in incoming data distributions, the system quickly alerts management.
Confidential, New York, NY
ReactJS / Python Developer
Responsibilities:
- Following MVC architecture, implemented web applications in the Flask and Spring frameworks.
- Used Python to write data into JSON files for testing Django websites.
- Updated and manipulated content files using Python scripts.
- Used Django configuration to manage URLs and application parameters.
- Wrote automated test cases with Selenium WebDriver using the Python API.
Confidential, Stamford, CT
Business Analyst / Python Developer
Responsibilities:
- Part of a small team responsible for managing the back office of a 10-billion-dollar Confidential.
- As a team member, provided research data and investment advice for processing bonds.
- Consulted with business partners and made recommendations to improve the effectiveness of Big Data systems and increase load throughput; integrated new tools and developed technology prototypes.
- Troubleshot ETL failures and processing pipelines.
- Established and maintained SQL queries in Oracle, SQL Server, and Postgres.
- Wrote ad-hoc queries for reporting purposes in support of data analytics and business teams.
