Data Engineer Resume
NC
SUMMARY
- 7+ years of experience as a Python Developer; proficient coder in multiple languages and environments, including Python and SQL.
- Worked with several standard Python packages such as NumPy, Pandas, PySide, and PyTables.
- Driven to architect Big Data solutions on multiple platforms using data analytics.
- Developed various Python scripts to generate reports, send FIX messages (FIX Simulator), SOAP requests, TCP/IP programming, and multiprocessing jobs.
- Good experience working with RESTful APIs.
- Good experience working with Amazon Web Services.
- Good experience working with Google Cloud Platform.
- Good experience developing web applications and implementing Model-View-Controller (MVC) architecture using server-side frameworks like Django and Flask.
- Experience in working on BDPaaS (Big Data Platform as a Service).
- Created and deployed CI/CD pipeline for applications using Jenkins for Dev, QA, Staging and Prod Environments.
- Good experience implementing data lake architecture on AWS; worked extensively with the Snowflake data warehouse and AWS data services such as EMR and RDS.
- Performed data extraction, aggregation, and consolidation of Adobe data within AWS Glue using PySpark.
- Designed ETL processes in AWS Glue to migrate campaign data from external sources such as S3 and ORC/Parquet/text files into AWS Redshift.
- Expertise in working with databases such as Microsoft SQL Server, Oracle, MySQL, and PostgreSQL, and good knowledge of NoSQL databases such as MongoDB and Cassandra.
- Expertise in working on data migration tools like Alembic.
- Good experience working with PySpark.
- Good experience working with Hadoop.
- Analyzed SQL scripts and redesigned them using PySpark SQL for faster performance.
- Proficient in developing complex SQL queries, Stored Procedures, Functions, and Packages.
- Hands-on Experience in Data mining and Data warehousing using ETL Tools.
- Excellent working knowledge in UNIX and Linux shell environments using command-line utilities.
- Created and stress-tested stand-alone and web applications, and generated graph reports.
- Attentive to cybersecurity and data protection.
- Motivated, proactive, innovative problem solver with excellent analytical, organizational, interpersonal, and written communication skills. Excellent team player and quick learner, keen to learn and implement new IT technologies.
- Good understanding of Spark architecture and its components.
- Strong knowledge of Data Structures and Algorithms, Object-Oriented Analysis, machine learning, and software design patterns.
- Good experience working with geospatial data.
- Developed API endpoints in Scala using functional programming, with data aggregation and pagination.
- Good experience with the Snowflake data warehouse.
- Proficient in designing, developing, and enhancing automated test scripts for Selenium using Python.
- Good experience working with TeamCity and Octopus for continuous integration and deployment.
- Knowledge of build and code-quality tools such as Jenkins, pylint, and Coverity.
- Good experience working on Python-based frameworks for test automation.
- Expert at version control systems like Git, GitHub, and Bitbucket.
- Experience documenting the build and the entire deployment process.
- Good experience documenting in Confluence.
- Good experience collaborating and working with offshore resources.
- Experienced in WAMP (Windows, Apache, MySQL, and Python) and LAMP (Linux, Apache, MySQL, and Python) architecture.
- Experience with Agile, Scrum, and Waterfall methodologies. Used ticketing systems like Jira, Bugzilla, and other proprietary tools.
- Experience troubleshooting software and automation issues with the assistance of testing engineers.
- Familiarity with development best practices such as code reviews, unit testing, system integration testing (SIT), and user acceptance testing (UAT).
- Highly motivated, quality-minded developer, with proven ability to deliver applications against tight deadlines.
- Possess good interpersonal, analytical, and presentation skills, with the ability to work in self-managed and team environments.
- Performed code reviews and implemented Pythonic programming best practices.
- Experience in writing test scripts, test cases, test specifications, and test coverage.
- Good experience in handling errors/exceptions and debugging the issues in large scale applications.
TECHNICAL SKILLS
Programming Languages: Python, SQL
Operating Systems: Ubuntu, Windows 10/XP/2000/Vista/7, Red Hat Linux, Windows Server 2008/2012
Python Libraries: Django, Flask, Beautiful Soup, SQLAlchemy, GeoAlchemy, HTML/CSS, Pandas, NumPy, PySide, PyTables
Testing Tools: Pytest, Selenium
Development Tools: Sublime Text, Eclipse, PyCharm, Notepad++, Jenkins, Coverity, pylint, Jupyter Notebooks, MobaXterm
Databases: Microsoft SQL Server, PostgreSQL, Oracle, MySQL, MS Access, MongoDB, and other NoSQL databases
Cloud Technologies: Amazon Web Services (S3, CloudWatch, API Gateway, Step Functions, Boto3, RDS, CloudFormation, Terraform, Lambda, ECS, EC2, SES, SNS, Batch, EMR); Google Cloud Platform (Cloud Functions, Composer, Airflow, BigQuery, Google Cloud Storage, AWS, Glue, Cloud Pub/Sub, Stackdriver, Elasticsearch, App Engine, Cloud Build, Compute Engine, IAM, SQL, VPC, KMS, Source Repositories, Dataflow, JupyterLab)
Big Data: Hive, Pig, PySpark, Spark
Version Controls: Git, GitHub, Bitbucket
Methodologies: Agile, SCRUM
PROFESSIONAL EXPERIENCE
Confidential, NC
Data Engineer
Responsibilities:
- Built Python-based frameworks for internal use and deployed them to a PyPI server.
- Responsible for debugging and troubleshooting web applications.
- Wrote Python scripts using Spark SQL across various data stages to analyze the data and keep it well-formed.
- Responsible for writing Hive Queries for analyzing data in Hive warehouse using Spark SQL.
- Good understanding of and related experience with Hadoop stack internals, Hive, Pig, and MapReduce.
- Involved in loading data from the UNIX file system to HDFS.
- Used pytest, a Python unit test framework, for all Python applications.
- Hands-on Experience with Google Cloud Platform.
- Working knowledge of Google Cloud Functions for serverless computing using Python.
- Deployed Google Cloud Functions with various triggers, such as HTTP, Storage bucket, and Pub/Sub, for serverless integration.
- Enabled single sign-on for Flask-based web applications and deployed them on Google App Engine.
- Created and deployed Flask-based web applications on Google App Engine.
- Worked with various Google Cloud Platform services, including Cloud Storage, Cloud Functions, Cloud Composer, BigQuery, Stackdriver, Pub/Sub, Dataflow, App Engine, and Compute Engine.
- Hands-on experience deploying and configuring Elasticsearch.
- Automated the CI/CD process using Jenkins, the Build Pipeline plugin, Maven, and Git.
- Used Terraform scripts to automate provisioning of instances that had previously been launched manually.
- Extensively involved in infrastructure as code, execution plans, resource graph and change automation using Terraform.
- Responsible for implementing monitoring solutions in Ansible, Terraform, Docker, and Jenkins.
- Developed environments of different applications on AWS by provisioning on EC2 instances using Docker, Bash and Terraform.
- Worked with the Airflow webserver and DAGs. Created DAGs on Google Cloud Composer to schedule Airflow jobs.
- Working experience with Spark components such as SQL, RDDs, DataFrames, and Datasets.
- Designed and implemented Python user-defined functions (UDFs) in PySpark.
- Read and wrote multiple data formats, such as JSON, ORC, and Parquet, on HDFS using PySpark.
- Created pipelines using Kinesis Firehose and Lambda functions.
- Created and scheduled jobs to load data from Hadoop to Google BigQuery.
- Responsible for debugging project issues tracked in Jira.
- Followed Agile and Scrum methodologies. Used the Jira ticketing system.
Environment: Python, HDFS, Hadoop MR, Hive/HQL, HTML/CSS, Spark, PySpark, Pytest, Google Cloud Platform, AWS, Boto3, Lambda, Kinesis, S3, Elasticsearch, Pub/Sub, BigQuery, Cloud SQL, Dataflow, Dataproc, AI Platform, Snowflake, IAM, VPC, Airflow, Composer, Salesforce Marketing Cloud, CI/CD, Terraform, Glue, UNIX shell script, Flask, MobaXterm, Jupyter Notebooks, Looker, Red Hat Linux, PuTTY.
Confidential, Iowa
Python Developer
Responsibilities:
- Developed RESTful APIs using Python.
- Analyzed complex user requirements, procedures, and problems to improve existing system design.
- Helped automate data analysis by automating pre-processing and incorporating helper functions into the API codebase.
- Used SQL toolkits like SQLAlchemy and GeoAlchemy.
- Extensively worked on Geospatial data.
- Developed end to end application components involving the business layer, persistence layer, database, and web services layer.
- Documented the detailed design within a given project structure.
- Developed code for HTTP API calls (GET, PUT, POST, DELETE).
- Worked on the ingestion process for geospatial data using ETL techniques.
- Developed test automation framework scripts using Python and Selenium WebDriver.
- Created Python test cases using the pytest and unittest frameworks.
- Wrote Python automation tests using Selenium WebDriver across.
- Experience in working and coordinating with offshore resources.
- Used analytical Python libraries like Pandas and NumPy for data manipulation.
- Successfully coordinated and developed the deployment process.
- Worked with .csv, .xml, .xlsx, and .json files.
- Worked extensively with Python libraries like Pandas for reading .csv files.
- Experience working with ORM toolkits like SQLAlchemy and GeoAlchemy.
- Performed development and test of system enhancements which include SQL Jobs and SQL queries.
- Documented the entire build and deployment process including detailed step-by-step instructions.
- Developed tables using Alembic, a database migration tool for SQLAlchemy.
- Worked extensively with TeamCity and Octopus for continuous integration and deployment.
- Wrote python scripts to parse XML and CSV documents to load the data in the database.
- Worked extensively with the AWS cloud platform and its services, including EC2, VPC, RDS, API Gateway, CloudWatch, CloudFormation, Step Functions, IAM, S3, SES, SNS, Batch, and Lambda.
- Created SNS notifications and assigned the ARN to S3 for object-loss notifications.
- Hands-on experience with Amazon Web Services (AWS), including Kinesis, Elastic MapReduce (EMR), and creating S3 buckets and storing data in them.
- Working knowledge of AWS Lambda for serverless computing using Python.
- Working knowledge of API Gateway for creating API keys and endpoints.
- Developed CloudFormation scripts to create and update stacks.
- Generated the property list for every application dynamically using Python.
- Worked on Defect Debugging.
- Created a unit test/regression test framework for existing and new code.
- Debugged and tested applications and fine-tuned performance. Provided maintenance support in a production environment.
- Followed Agile and Scrum Methodologies. Used Jira ticketing system.
Environment: Python, HTML/CSS, PostgreSQL 9.6, ETL, TeamCity, Octopus, Alembic, IBM, AWS, AWS Glue, Oracle, PL/SQL, Unix Shell Scripting, Red Hat Linux, Selenium, WebLogic Application Server.
Confidential
Software Developer
Responsibilities:
- Worked with stakeholders, gathered requirements, and developed high-level design and detailed design documents.
- Re-engineered various modules to implement changes and create a more efficient system.
- Designed and developed components using Python. Implemented Python code to retrieve and manipulate data.
- Implemented database access using the Django ORM.
- Used MySQL as the backend database and MySQLdb, the Python database connector, to interact with the MySQL server.
- Used RESTful APIs to access data from different suppliers.
- Supported script configuration, testing, execution, deployment, and run monitoring and metering.
- Used Python and Django for creating graphics, XML processing of documents, data exchange, and business logic implementation between servers.
- Used RESTful APIs to gather network traffic data from servers.
- Supported Apache Tomcat web server on Linux Platform.
- Developed and executed the User Acceptance Testing portion of the test plan.
- Debugged software to resolve bugs.
Environment: Python, MySQL, Shell Scripting, PL/SQL, UNIX, Linux, Agile, pylint, Jenkins.
