Sr. Data Engineer Resume
San Francisco, CA
SUMMARY
- Overall 5 years of experience in the IT industry, playing a major role in implementing, developing, and maintaining various web-based applications using Java and the Big Data ecosystem.
- 3+ years of strong end-to-end experience in Hadoop, Spark, and cloud development using a range of Big Data tools.
PROFESSIONAL EXPERIENCE
Confidential, SAN FRANCISCO, CA
Sr. Data Engineer
Responsibilities:
- Created Confidential Glue tables on existing CSV data using AWS Glue crawlers.
- Created Glue ETL jobs to convert CSV data to Parquet format.
- Triggered Glue jobs using AWS Lambda.
- Created Lambda functions to trigger the ETL jobs (see the sketch after this list).
- Proficient in shell scripting, awk, sed, grep, and Perl.
- Used CloudFormation templates to trigger the Lambda functions.
- Modified the Glue-generated ETL scripts per client requirements using Scala.
- Implemented CI/CD pipelines to build and deploy projects in the Hadoop environment.
- Wrote advanced SQL queries on views used in the metric reporting tools.
- Wrote shell scripts (Bourne, Korn, C) to automate repetitive tasks, with additional hands-on scripting in Perl, Python, and Ruby.
- Used AWS Lambda, DynamoDB, and CloudWatch.
- Implemented build and deployment pipelines in Jenkins.
- Experience with CDC (change data capture) tools for moving data from source systems.
- Experience with AWS cloud services such as EC2, EMR, RDS, and Redshift.
- Experience with relational SQL and NoSQL databases.
- Experience working with data, data warehousing, and BI.
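A minimal sketch of the Lambda-triggered Glue job described above, assuming a hypothetical Glue job name and source prefix (Python, boto3):

```python
import boto3

glue = boto3.client("glue")

def lambda_handler(event, context):
    """Kick off the CSV-to-Parquet Glue ETL job when this function is
    invoked (for example by an S3 event or a CloudFormation-wired trigger)."""
    run = glue.start_job_run(
        JobName="csv_to_parquet_job",  # hypothetical Glue job name
        Arguments={"--source_prefix": "s3://my-bucket/raw/csv/"},  # hypothetical argument
    )
    return {"JobRunId": run["JobRunId"]}
```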
ENVIRONMENT: Hadoop, Hive, HDFS, AWS, Glue, EMR, Spark Streaming, Spark SQL, Scala, Java, Lambda, Step Functions, Jenkins.
Confidential, DENVER, CO
Data Engineer
Responsibilities:
- Implemented workflows to process around 400 messages per second and push the messages to Document DB as well as Event Hubs.
- Built a custom message producer that can generate about 4,000 messages per second for scalability testing.
- Implemented call-back and notification architectures for real-time data.
- Implemented Spark Streaming in Scala to process JSON messages and push them to the Kafka topic (a sketch follows this list).
- Developed SQL reports using advanced SQL queries in the OLTP system and WebFOCUS.
- Wrote auto-scaling functions that consume data from Azure Service Bus or Azure Event Hub and send it to Document DB.
- Created real-time streaming dashboards in Power BI, using Stream Analytics to push datasets to Power BI.
- Wrote Perl, Python, and shell scripts.
- Developed a custom message consumer to read data from the Kafka topic and push the messages to Service Bus and Event Hub (Azure components).
- Wrote a Spark application to capture the change feed from Document DB using the Java API and write the updates to the new Document DB.
- Created custom dashboards in Azure using Application Insights and the Application Insights query language to process the metrics sent to App Insights.
- Implemented zero-downtime deployment for all production pipelines in Azure.
- Implemented CI/CD pipelines to build and deploy projects in the Hadoop environment.
- Implemented build and deployment pipelines in Jenkins.
- Used custom receivers, socket streams, file streams, and directory streams in Spark Streaming.
- Used AWS Lambda, Kinesis, DynamoDB, and CloudWatch.
- Used App Insights, Document DB, Service Bus, Azure Data Lake Store, Azure Blob Storage, Event Hub, and Azure Functions.
- Used Python to run the Ansible playbooks that deploy the Logic Apps to Azure.
- Developed Hadoop Streaming jobs in Python to integrate applications with Python API support.
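A minimal sketch of the JSON-to-Kafka streaming step described above; the original implementation was Spark Streaming in Scala, while this illustration uses PySpark Structured Streaming with hypothetical broker, topic, and schema names:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json, struct, to_json
from pyspark.sql.types import StringType, StructType

# Requires the spark-sql-kafka connector package on the classpath.
spark = SparkSession.builder.appName("json-to-kafka").getOrCreate()

# Hypothetical schema for the incoming JSON messages
schema = StructType().add("deviceId", StringType()).add("payload", StringType())

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
       .option("subscribe", "raw-events")                  # hypothetical source topic
       .load())

# Parse the JSON value and flatten it into columns
parsed = (raw
          .select(from_json(col("value").cast("string"), schema).alias("msg"))
          .select("msg.*"))

# Re-serialize and publish the processed messages to the downstream topic
query = (parsed
         .select(to_json(struct(col("deviceId"), col("payload"))).alias("value"))
         .writeStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("topic", "processed-events")              # hypothetical target topic
         .option("checkpointLocation", "/tmp/checkpoints/json-to-kafka")
         .start())

query.awaitTermination()
```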
ENVIRONMENT: Hadoop, Hive, HDFS, Azure, AWS, Spark Streaming, Spark SQL, Scala, Python, Java, web servers, Maven, Jenkins, Ansible.
Confidential, SAN DIEGO, CA
Big Data Engineer
Responsibilities:
- Developed a daily process to do incremental imports of data from DB2 and Oracle into HDFS using Sqoop.
- Extensively used Pig for data transformation and data cleansing.
- Converted Hive scripts into Spark using Scala and optimized the resulting Spark jobs (see the sketch after this list).
- Wrote stored procedures to transform data in Microsoft SQL Server.
- Scheduled multiple Spark jobs with the Oozie scheduler.
- Developed an error-logging script to load the ingestion logs into an HBase table.
- Worked on CI/CD deployments using Jenkins.
- Ingested XML files captured from RabbitMQ and stored them in an HBase table.
- Developed complex Hive queries to analyze data.
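A minimal sketch of a Hive-to-Spark conversion of the kind described above; the original conversions were written in Scala, while this illustration uses PySpark with hypothetical table and column names:

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-to-spark")
         .enableHiveSupport()   # lets Spark SQL read the existing Hive metastore
         .getOrCreate())

# The same aggregation a Hive script would produce, run through Spark SQL
# (staging.ingestion_events and its columns are hypothetical).
daily_counts = spark.sql("""
    SELECT load_date, COUNT(*) AS record_count
    FROM staging.ingestion_events
    GROUP BY load_date
""")

# Persist the result as a Hive-managed table for downstream reporting
daily_counts.write.mode("overwrite").saveAsTable("reporting.daily_ingestion_counts")
```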
ENVIRONMENT: MapR, Spark, Hive, Sqoop, HBase, Oozie, Pig, Scala, UNIX, Agile, Code Hub, Splunk, PySpark.
Confidential
Junior Data Engineer
Responsibilities:
- Developed Hive queries for the analytics required for report generation.
- Developed Pig scripts to process data coming from different sources.
- Performed data cleaning using Pig scripts and stored the results in HDFS (see the sketch after this list).
- Developed Pig user-defined functions (UDFs) in Java for external functions.
- Developed data requirements, performed database queries to identify test data, and created data procedures with expected results.
- Planned, coordinated, and managed internal process documentation and presentations describing the identified process improvements, along with the workflows and diagrams associated with the process flow.
- Scheduled jobs in Oozie to automate regularly executed processes.
- Developed Hive UDFs in Java.
- Used Pig joins to handle data across different data sets.
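A minimal sketch of the kind of record-level data cleaning described above; the original cleansing was done in Pig, while this illustration is a Python Hadoop Streaming mapper with a hypothetical field layout:

```python
#!/usr/bin/env python
"""Hadoop Streaming mapper that drops malformed rows and normalizes values
(the five-field, tab-delimited layout here is hypothetical)."""
import sys

EXPECTED_FIELDS = 5

for line in sys.stdin:
    fields = line.rstrip("\n").split("\t")
    # Skip rows that do not have the expected number of fields
    if len(fields) != EXPECTED_FIELDS:
        continue
    # Trim whitespace and replace empty values with a NULL marker
    cleaned = [field.strip() or r"\N" for field in fields]
    print("\t".join(cleaned))
```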
ENVIRONMENT: Hadoop, Hive, Pig, Sqoop, HBase, MapReduce, Java, Python