
Data Engineer Resume


SUMMARY:

  • 5+ years of professional experience in Big Data technologies and IT programming
  • Current Role: Data Engineer
  • Past Roles: Senior Hadoop Developer
  • Industry Sectors: Telecom, Travel
  • Hive and HQL for database development
  • Spark (on Hadoop) programming: Spark Streaming, RDDs, DataFrames, Datasets, and analytics
  • Hive joins, performance tuning of joins, and SerDes with different file formats
  • Kafka and Sqoop for data ingestion and import into the Hadoop ecosystem
  • 1+ year of professional experience in AWS: EC2, Lambda, Glue ETL, Athena

TECHNICAL SKILLS:

BIG Data: Spark, PySpark, Hadoop, MapReduce, Hive, Sqoop, SQL, Python, Glue ETL

Databases: MySQL, NoSQL, HBase, Cassandra

Business Intelligence: Hive, Sqoop, KNIME, Hue

Programming Languages: Python, Java (Core), Linux shell scripting

Tools: Bluefish and Geany editors, ZooKeeper, Jenkins

Operating Systems: Linux (Ubuntu, Mint, CentOS)

PROFESSIONAL EXPERIENCE:

Confidential

Data Engineer

Environment: NiFi, Spark, Cassandra, ML & AI, gRPC, Kubernetes, Docker, Python, Linux Shell Script

Responsibilities:

  • Created industrialized predictive use cases deployed as scalable, containerized solutions.
  • Worked on 2 predictive AI models for Bharti Airtel that were commercialized and deployed on Confidential. The models predicted degradation 4 hours and 1 hour in advance, respectively, so corrective action could be taken before the degradation occurred.
  • Developed feature engineering (FE) scripts in PySpark for the machine learning models; also created a target-selection model to dynamically locate the neighbor cell to which traffic should be transferred, subject to multiple conditions (a minimal sketch follows this list).
  • Performed feasibility studies to commercialize POCs by shortlisting those running within Ericsson that could be developed into a full-scale, plug-and-play architecture offered to multiple customers.
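A minimal PySpark sketch of the kind of feature engineering pass described above. The column names (cell_id, timestamp, kpi_value), paths, and window sizes are illustrative assumptions, not the actual project schema:

    # Assumed schema and paths, for illustration only
    from pyspark.sql import SparkSession, functions as F, Window

    spark = SparkSession.builder.appName("fe_sketch").getOrCreate()

    kpi = spark.read.parquet("/data/kpi/")  # hypothetical input path

    # Rolling aggregates per cell as candidate features for a degradation model
    w = Window.partitionBy("cell_id").orderBy("timestamp").rowsBetween(-23, 0)
    lag_w = Window.partitionBy("cell_id").orderBy("timestamp")

    features = (kpi
        .withColumn("kpi_avg_24", F.avg("kpi_value").over(w))
        .withColumn("kpi_max_24", F.max("kpi_value").over(w))
        .withColumn("kpi_delta", F.col("kpi_value") - F.lag("kpi_value", 1).over(lag_w)))

    features.write.mode("overwrite").parquet("/data/features/")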

Confidential

Senior Hadoop Developer

Environment: Hadoop, Spark, Hive, Sqoop, Python, SQL-HQL, Linux Shell Script

Responsibilities:

  • Responsible for data modeling.
  • Built and configured structured data loading into HDFS using Sqoop or local file transfer.
  • Developed Jenkins jobs and HQL to load the data into stage and base Hive tables (a minimal PySpark equivalent is sketched after this list).
  • Wrote Scala programs to filter and process XML files and to join large JSON data sets.
  • Built a canonical data model on HBase.
  • Developed Jenkins jobs to automate the pipeline.
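A minimal PySpark sketch of the stage-to-base load described above; the original work used HQL, and the database, table, and column names here (stage_db.orders_stage, base_db.orders, order_id, load_date) are assumptions, not the actual layout:

    from pyspark.sql import SparkSession, functions as F

    spark = (SparkSession.builder
             .appName("stage_to_base_sketch")
             .enableHiveSupport()
             .getOrCreate())

    # Read the Sqoop-landed stage table, drop bad records, append to the base table
    stage = spark.table("stage_db.orders_stage")
    clean = stage.filter(F.col("order_id").isNotNull())

    (clean.write
          .mode("append")
          .partitionBy("load_date")
          .saveAsTable("base_db.orders"))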

Environment: Hadoop, Kafka, MapR Streams topics, Spark Streaming, HBase, Python, Scala, Shell Script

Responsibilities:

  • Responsible for the Kafka producer.
  • Set up MapR Streams topics with partitions.
  • Used Spark Streaming to process data and apply HMF rules (a minimal Structured Streaming sketch follows this list).
  • Responsible for dynamic JSON schema evaluation.
  • Enabled upsert operations.
  • Responsible for data modeling
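A minimal PySpark Structured Streaming sketch of the Kafka-to-processing flow described above. The broker address, topic name, and JSON schema (msg_id, cell_id, value) are assumptions; the actual HMF rules and HBase upsert are not shown:

    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType

    spark = SparkSession.builder.appName("kafka_stream_sketch").getOrCreate()

    schema = StructType([
        StructField("msg_id", StringType()),
        StructField("cell_id", StringType()),
        StructField("value", DoubleType()),
    ])

    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
           .option("subscribe", "events")                      # hypothetical topic
           .load())

    # Parse the JSON payload into typed columns
    events = (raw.select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
                 .select("e.*"))

    # Placeholder sink; the real job applied HMF rules and upserted into HBase
    query = events.writeStream.format("console").outputMode("append").start()
    query.awaitTermination()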

Environment: Hadoop, Hive, Spark SQL, HBase, JSON, Python, Scala

Responsibilities:

  • Created the architecture of Hive master/staging HBase-integrated tables.
  • Applied currency conversion rules (a minimal Spark SQL sketch follows this list).
  • Enabled upsert operations.
  • Responsible for data modeling
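A minimal Spark SQL sketch of a currency-conversion step over staging data. The table and column names (staging.sales, ref.fx_rates, currency, amount, rate_to_usd) are assumptions about the master/staging layout described above:

    from pyspark.sql import SparkSession, functions as F

    spark = (SparkSession.builder
             .appName("currency_conversion_sketch")
             .enableHiveSupport()
             .getOrCreate())

    sales = spark.table("staging.sales")
    rates = spark.table("ref.fx_rates")  # maps currency -> rate_to_usd

    converted = (sales.join(rates, "currency", "left")
                      .withColumn("amount_usd", F.col("amount") * F.col("rate_to_usd")))

    # In the actual project the result was upserted into an HBase-backed master table;
    # here it is simply written to a Hive table
    converted.write.mode("overwrite").saveAsTable("master.sales_usd")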

Environment: EC2, Lambda, Glue ETL, Athena, XML, Parquet

Responsibilities:

  • Created AWS Lambda functions and Glue ETL jobs (a minimal Lambda sketch follows this list).
  • Responsible for project architecture and results.
  • Responsible for enabling and scheduling the AWS Lambda functions and Glue jobs.
  • Responsible for data modeling
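A minimal sketch of a Lambda handler that kicks off a Glue ETL job when a new object lands in S3. The Glue job name and the S3 trigger wiring are assumptions, not the actual project configuration:

    import boto3

    glue = boto3.client("glue")

    def handler(event, context):
        # Pass the S3 object that triggered the Lambda through to the Glue job
        record = event["Records"][0]["s3"]
        response = glue.start_job_run(
            JobName="xml-to-parquet-etl",  # hypothetical job name
            Arguments={
                "--source_bucket": record["bucket"]["name"],
                "--source_key": record["object"]["key"],
            },
        )
        return response["JobRunId"]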

Confidential

Associate Hadoop Developer

Environment: Hadoop, Spark, Python, Linux Shell Script

Responsibilities:

  • Responsible for the security of client data.
  • Provided a hassle-free interface to transfer data from one database to another.
  • Developed PySpark code to parse data (a minimal JDBC transfer sketch follows this list).
  • Responsible for mapping matching data types between source and target.
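A minimal PySpark sketch of a database-to-database transfer over JDBC. The URLs, table names, and credentials are placeholders for the source and target databases described above:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("db_transfer_sketch").getOrCreate()

    # Read from the source database
    src = (spark.read.format("jdbc")
           .option("url", "jdbc:mysql://source-host:3306/sales")
           .option("dbtable", "orders")
           .option("user", "reader").option("password", "****")
           .load())

    # Spark infers a common set of data types, so matching columns map
    # consistently between the source and target schemas
    (src.write.format("jdbc")
        .option("url", "jdbc:postgresql://target-host:5432/sales")
        .option("dbtable", "orders")
        .option("user", "writer").option("password", "****")
        .mode("append")
        .save())
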
Confidential

Python Developer

Environment: Hadoop, Hive, Sqoop, Python, SQL-HQL, Linux Shell Script

Responsibilities:

  • Responsible for the designing, coding, testing, and deployment phases of data modeling.
  • Assisted in upgrading, configuring, and maintaining Hadoop infrastructure components such as HDFS and Hive.
  • Built and configured structured data loading into HDFS using Sqoop.
  • Developed UNIX scripts and HQL to load the data into stage and base Hive tables after partitioning and bucketing (a minimal PySpark equivalent is sketched after this list).
  • Wrote MapReduce programs to filter and process XML files and to join large data sets.
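A minimal PySpark sketch of writing a partitioned and bucketed Hive table; the original work used UNIX scripts and HQL, and the table names and partition/bucket columns (load_date, customer_id) are assumptions:

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("partition_bucket_sketch")
             .enableHiveSupport()
             .getOrCreate())

    stage = spark.table("stage_db.transactions_stage")

    # Partition by load date and bucket by customer for faster joins and scans
    (stage.write
          .mode("overwrite")
          .partitionBy("load_date")
          .bucketBy(16, "customer_id")
          .sortBy("customer_id")
          .saveAsTable("base_db.transactions"))
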
Confidential

Environment: Linux scripts, Python, MySQL, Hadoop, Hive, Sqoop, AWS (cloud), EC2

Responsibilities:

  • This is a script-based project.
  • Records for 15 companies are managed through MySQL, while records for the other 15 companies are managed through Hive (a minimal combined-read sketch follows this list).
  • The stock market analysis is broadly divided into two categories.
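A minimal PySpark sketch of reading both sources for a combined analysis. The JDBC URL, table names, and columns (symbol, close_price) are placeholders, and the two sources are assumed to share a schema:

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("stock_analysis_sketch")
             .enableHiveSupport()
             .getOrCreate())

    # Companies tracked in MySQL
    mysql_stocks = (spark.read.format("jdbc")
                    .option("url", "jdbc:mysql://db-host:3306/stocks")
                    .option("dbtable", "daily_prices")
                    .option("user", "reader").option("password", "****")
                    .load())

    # Companies tracked in Hive
    hive_stocks = spark.table("stocks.daily_prices")

    # Union the two sources so the analysis runs over all 30 companies at once
    all_stocks = mysql_stocks.unionByName(hive_stocks)
    all_stocks.groupBy("symbol").avg("close_price").show()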
