- Highly skilled IT professional with 9 years of work experience in various roles, including consultant and developer, for major corporate clients.
- Experienced professional with technical exposure to diverse technologies such as Python, MS Azure, AWS, the Hadoop ecosystem and Tableau.
- Experience using SSIS (SQL Server Integration Services) and Python for ETL: extracting data from various sources such as CSV and JSON, and loading/writing it to different targets (databases for SQL, Parquet format for DynamoDB, etc.).
- Experience with AWS: Aurora, Redshift, S3, EC2, Kinesis, Athena, AWS DMS, EMR, Glue
- Experience working with Azure: Data Lake, Databricks, Delta Lake, Data Factory
- Implemented big data processing applications to collect and clean large volumes of open data using Hadoop ecosystem tools such as Spark and Hive. Experience creating and driving large-scale data pipelines.
- Experience using the Microsoft suite (Management Studio, Business Intelligence Development Studio, SQL Server Integration Services (SSIS)) and PostgreSQL.
- Experience working with DevOps tools for CI/CD.
- Excellent communication and interpersonal skills, with user and client interaction experience.
- Ability to interact with peers and stakeholders to define and drive product and business impact.
Languages: Python, SQL
Databases: SQL Server, MongoDB, Oracle, PostgreSQL
Development Tools: Anaconda, Jupyter, SSIS, SSRS, Tableau, PySpark
Hadoop: Hive
Python Libraries/Algorithms: NumPy, Matplotlib, SciPy, Pandas, TensorFlow, Scikit-learn, PyTorch, Seaborn, Keras
NoSQL: HBase, Cassandra, MongoDB, AWS DynamoDB
Azure: Data Lake, Databricks, Data Factory
AWS: Aurora, Redshift, S3, EC2, Kinesis, Lambda, Athena, EMR, Glue, CloudFormation
DevOps Tools: Jenkins, Jira, Git/GitHub, Docker, Chef, Puppet, Ansible, Apache NiFi, Airflow
- Worked with AWS EMR (Elastic MapReduce) to process vast amounts of data quickly.
- Ingested data into DataFrames using Spark, then transformed and loaded it.
- Exported Spark SQL DataFrames into Hive tables.
- Cleansed and transformed data, using Spark SQL to sort, join and filter it.
- Designed ETL (extract, transform and load) processes to populate database tables.
- Performed day-to-day ETL operations: job monitoring, issue identification, documentation, analysis and resolution.
- Working knowledge of machine learning frameworks such as TensorFlow and Keras.
- Used Pandas, NumPy, Seaborn, SciPy, Matplotlib and Scikit-learn, and applied machine learning algorithms such as linear regression, KNN and K-means for data analysis in Databricks.
- Gathered and analyzed business requirements.
- Developed complex stored procedures utilizing transactions, MERGE statements and dynamic SQL.
- Designed and developed star and snowflake data models.
- Developed ETL processes using SSIS; wrote and optimized SQL queries to extract and merge data from SQL Server databases. Extracted data from multiple sources, such as Excel and CSV files, into SQL Server.