
Sr Software Developer Resume


Mountain View, CA

SUMMARY

  • 15+ years of professional experience in the Information Technology industry, with expertise in administering, developing, managing, and deploying various commercial business applications.
  • Lead team of engineers in building innovative data products specific to a business requirement
  • Collaborate with team of data scientists and ML engineers to architect and deploy solutions
  • Passionate about engineering data related products to generate business insights
  • Architect data solutions with cloud, on-premises, or hybrid practices
  • Designed and developed a templatized data processing framework centered on Apache Airflow
  • Custom-built validation frameworks to provide insights into data latency and quality
  • Expertise working with AWS environments
  • Expertise with tools in the Hadoop ecosystem including Pig, Hive, HDFS, MapReduce, Sqoop, Spark, PySpark, Kafka, YARN, Oozie, and ZooKeeper.
  • Experience in developing algorithms in Spark using SparkContext, Spark SQL, RDDs, and DataFrames (see the PySpark sketch after this list).
  • Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, and pair RDDs.
  • Excellent knowledge of Hadoop ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm
  • Experience in designing and developing applications in Spark to compare the performance of Spark with Hive and SQL/Oracle.
  • Experienced in working with cloud technologies like OpenStack and Amazon Web Services (AWS), using EMR and EC2 for compute, S3 for storage, RDS, and Amazon Redshift.
  • Good experience in creating data visualizations using Tableau and custom Python dashboards.
  • Good exposure to the Agile software development process.
  • Experience in manipulating/analyzing large datasets and finding patterns and insights within structured and unstructured data.
  • Strong experience with Hadoop distributions like Apache Hadoop, Cloudera, EMR, and Hortonworks.
  • Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases like HBase, Cassandra, and MongoDB.
  • Experienced in writing complex MapReduce programs that work with different file formats like Text, SequenceFile, XML, Parquet, and Avro.
  • Experience with the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control flows.
  • Experience in migrating data between HDFS and relational database systems using Sqoop.
  • Extensive experience importing and exporting data using stream-processing platforms like Flume and Kafka.
  • Very good experience with the complete project life cycle (design, development, testing, and implementation) of client-server and web applications. Experience implementing CI/CD with Jenkins to build and deploy applications, and good knowledge of DevOps tools like Ansible, Chef, and Nagios.
  • Excellent Java development skills using J2EE, J2SE, Servlets, JSP, EJB, JDBC, SOAP and RESTful web services.
  • Strong experience with data warehousing and ETL concepts
  • Experience in database design, using PL/SQL to write stored procedures, functions, and triggers, and strong experience writing complex queries for Oracle.
  • Strong experience in Object-Oriented Design, Analysis, Development, Testing and Maintenance.
  • Experienced in using agile approaches, including Extreme Programming, Test-Driven Development and Agile Scrum.
  • Worked in large and small teams for systems requirement, design & development.
  • Key participant in all phases of the software development life cycle: analysis, design, development, integration, implementation, debugging, and testing of object-oriented software applications in client-server environments. Experience using IDEs such as Eclipse, IntelliJ, and PyCharm; repositories such as SVN, Git, Bitbucket, and GitLab; and build tools Ant, Maven, and sbt.
  • Prepared standard coding guidelines and analysis and testing documentation.
  • Experience with technology and web-based frameworks and applications.
  • Mentored junior team members in building data applications at scale
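
To illustrate the Spark SQL and DataFrame experience noted above, the following is a minimal PySpark sketch; the input path, view name, and column names (sales_events, region, amount, event_ts) are hypothetical placeholders.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Minimal sketch of Spark DataFrame / Spark SQL usage.
    # Paths and column names are hypothetical placeholders.
    spark = SparkSession.builder.appName("sales-aggregation").getOrCreate()

    # Read raw events (Parquet assumed) into a DataFrame.
    events = spark.read.parquet("s3://example-bucket/sales_events/")

    # Aggregate with the DataFrame API ...
    daily_totals = (
        events
        .filter(F.col("amount") > 0)
        .groupBy("region", F.to_date(F.col("event_ts")).alias("event_date"))
        .agg(F.sum("amount").alias("total_amount"))
    )

    # ... or express the same logic in Spark SQL.
    events.createOrReplaceTempView("sales_events")
    daily_totals_sql = spark.sql("""
        SELECT region, to_date(event_ts) AS event_date, SUM(amount) AS total_amount
        FROM sales_events
        WHERE amount > 0
        GROUP BY region, to_date(event_ts)
    """)

    daily_totals.write.mode("overwrite").parquet("s3://example-bucket/daily_totals/")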

TECHNICAL SKILLS

BigData/Hadoop Technologies: HDFS, Databricks, YARN, MapReduce, Hive, Pig, Spark, PySpark, Sqoop, Flume, Kafka, ZooKeeper, and Oozie

NoSQL Databases: HBase, Cassandra, MongoDB

Languages: Python, SQL, Java, PySpark, Scala, Impala, Pig Latin, HiveQL, JavaScript, Shell Scripting

Application Servers: Apache Tomcat, Nginx

Cloud Computing Tools: Amazon AWS (EMR, S3, RDS), Azure

Operating Systems: UNIX, Windows, Linux, Ubuntu

Databases: Microsoft SQL Server, MySQL, Oracle, DB2

Build Tools: Jenkins, Nexus, Maven, ANT

Business Intelligence Tools: Tableau, Splunk, QlikView, Grafana

Development Tools: Microsoft SQL Studio, Eclipse, NetBeans, IntelliJ, PyCharm

Development Methodologies: Agile/Scrum, Waterfall

Version Control Tools: Git, Bitbucket, SVN

PROFESSIONAL EXPERIENCE

Sr Software Developer

Confidential, Mountain View, CA

Responsibilities:

  • Responsible for building scalable, distributed data solutions using the big data stack and Spark.
  • Lead a team of engineers in building frameworks to reduce latency in data availability and help the business generate timely insights for retail operations and campaign management.
  • Implemented an event-driven architecture on AWS using Lambda, SQS, and SNS (see the sketch after this list)
  • Migrated on-premises data pipelines to the AWS cloud
  • Reduced latency of data delivery from hours to seconds
  • Designed and built dashboards with Tableau
  • Developed a framework for processing complex unstructured datasets from R&D teams
  • Designed a metric enablement framework and process
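
As a minimal sketch of the event-driven AWS architecture referenced above, the following Lambda handler consumes SQS records and publishes alerts to SNS; the environment variable name, topic ARN, and message fields are hypothetical placeholders.

    import json
    import os

    import boto3

    # Minimal sketch: a Lambda handler triggered by an SQS queue that inspects
    # each message and publishes an alert to SNS. The environment variable and
    # message fields are hypothetical placeholders.
    sns = boto3.client("sns")
    TOPIC_ARN = os.environ.get("ALERT_TOPIC_ARN", "")

    def handler(event, context):
        processed = 0
        for record in event.get("Records", []):    # SQS delivers a batch of records
            payload = json.loads(record["body"])   # message body assumed to be JSON
            if payload.get("status") == "FAILED":
                sns.publish(
                    TopicArn=TOPIC_ARN,
                    Subject="Pipeline event failed",
                    Message=json.dumps(payload),
                )
            processed += 1
        return {"processed": processed}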

Environment: Hadoop YARN, Spark Core, Spark SQL, PySpark, Scala, Python, Hive, HBase, Tableau, Airflow, Jenkins, MySQL.

Staff Data Engineer

Confidential, Mountain View, CA

Responsibilities:

  • Work with the business to extend algorithmic features of foreteller (an in-house inventory forecasting tool)
  • Collaborate with data scientists to implement and deploy model manager DAG templates
  • Design and build frameworks to streamline development of data pipelines (ingest, transform and load).
  • Design and build configurable and API driven data quality and validation processes.
  • Design and build self-service data tools for business associates to access datasets and insights generated.
  • Responsible for building scalable, distributed data solutions using the big data stack on the AWS cloud
  • Developed algorithms in Spark using SparkContext, Spark SQL, RDDs, and DataFrames.
  • Maintain timely delivery of foreteller forecasts to the business
  • Create and maintain Airflow's template engine to build configurable data pipelines (see the Airflow sketch after this list)
  • Create and maintain Airflow plugins (operators, hooks, macros, blueprints and admin views)
  • Understand the patterns of data ingestion with sources like Elasticsearch, REST APIs, and streaming events
  • Understand the patterns of data parsing and build configurable PySpark transformation modules
  • Create and maintain pluggable data quality framework to generate synchronous and asynchronous alerts
  • Create and maintain self-service application (custom portal) to publish datasets and metrics
  • Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, and DataFrames.
  • Designed, developed, and maintained data integration programs in Hadoop and RDBMS environments, with both traditional and non-traditional source systems as well as RDBMS and NoSQL data stores, for data access and analysis.
  • Implemented an automated deployment process with GitLab and Jenkins
  • Involved in creating Hive tables and loading and analyzing data using Hive queries
  • Involved in loading data into HDFS, transforming it for various data manipulations, generating reports, and re-loading results back to Postgres for the UI and for feeding to different systems.
  • Imported and exported data into HDFS and Hive using Spark JDBC
  • Implemented partitioning, dynamic partitions, and bucketing in Hive
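
As a minimal sketch of the config-driven Airflow template approach referenced above, the following builds one DAG per entry in a config dict; the pipeline names, config keys, and callables are hypothetical placeholders, and the import paths assume Airflow 2.x.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator  # Airflow 2.x import path

    # Minimal sketch of a templatized, config-driven pipeline: one DAG per config
    # entry. Pipeline names, config keys, and callables are hypothetical placeholders.
    PIPELINES = {
        "orders_daily": {"source": "orders", "schedule": "@daily"},
        "inventory_hourly": {"source": "inventory", "schedule": "@hourly"},
    }

    def ingest(source, **_):
        print(f"ingesting {source}")

    def transform(source, **_):
        print(f"transforming {source}")

    def build_dag(name, conf):
        dag = DAG(
            dag_id=f"pipeline_{name}",
            start_date=datetime(2021, 1, 1),
            schedule_interval=conf["schedule"],
            catchup=False,
        )
        with dag:
            ingest_task = PythonOperator(task_id="ingest", python_callable=ingest,
                                         op_kwargs={"source": conf["source"]})
            transform_task = PythonOperator(task_id="transform", python_callable=transform,
                                            op_kwargs={"source": conf["source"]})
            ingest_task >> transform_task
        return dag

    # Register one DAG per config entry so the scheduler discovers them all.
    for name, conf in PIPELINES.items():
        globals()[f"pipeline_{name}"] = build_dag(name, conf)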

Environment: Hadoop YARN, Spark Core, Spark SQL, PySpark, Scala, Python, Hive, Tableau, AWS, Databricks, Oracle 12c, MySQL, Linux.

Sr Software Developer

Confidential, Bellevue, WA

Responsibilities:

  • Responsible for building scalable, distributed data solutions using Hadoop and Spark.
  • Led a team of engineers in building frameworks to reduce latency in data availability and help the business generate timely insights for retail operations and campaign management.
  • Ownership of platform and data pipeline changes that occur with Hadoop distribution upgrades.
  • Led a team of engineers to build and maintain a custom ETL framework that generates templatized data pipelines.
  • Maintained and extended the Data Movement Framework, built in-house
  • Handled Hadoop distribution upgrades and updated template changes (MR, Hive, PySpark, HBase, and Oozie)
  • Created and maintained a framework to asynchronously perform audit, balance, and control checks on datasets
  • Document high-level design specs for data pipeline to be created and deployed
  • Integrated the un-carrier delivery model into Agile development and DevOps practices
  • Involved in offloading data pipelines from Informatica to the big data stack
  • Imported and exported data into HDFS and Hive using Spark JDBC (see the sketch after this list)
  • Implemented partitioning, dynamic partitions, and bucketing in Hive
  • Worked on creating HBase tables to load large sets of semi-structured data coming from various sources.
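
As a minimal sketch of the Spark JDBC import and dynamically partitioned Hive load referenced above; the JDBC URL, credentials, table, and column names are hypothetical placeholders.

    from pyspark.sql import SparkSession

    # Minimal sketch: pull a table over JDBC and load it into a partitioned Hive
    # table. The JDBC URL, credentials, and table/column names are hypothetical.
    spark = (
        SparkSession.builder
        .appName("jdbc-to-hive")
        .enableHiveSupport()
        .getOrCreate()
    )

    # Read a source table from an RDBMS over JDBC.
    orders = (
        spark.read.format("jdbc")
        .option("url", "jdbc:oracle:thin:@//db-host:1521/ORCL")
        .option("dbtable", "sales.orders")
        .option("user", "etl_user")
        .option("password", "********")
        .load()
    )

    # Allow dynamic partitions, then write into a Hive table partitioned by date.
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    (
        orders.write
        .mode("append")
        .format("parquet")
        .partitionBy("order_date")
        .saveAsTable("warehouse.orders")
    )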

Environment: Hadoop YARN, Spark Core, Spark SQL, PySpark, Scala, Python, Hive, HBase, Tableau, Control-M, Jenkins, Hortonworks, DB2, Oracle 12c, Linux.

Sr Hadoop/Spark Developer

Confidential

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop on AWS infrastructure.
  • Designed, prepared, and published big data architecture solutions for clients
  • Designed and built applications using Java, Scala, Python, SQL, Unix shell scripting, and REST APIs for various use cases
  • Designed and built real-time, near-real-time, and batch processing applications
  • Designed and built big data proofs of concept to showcase working solutions
  • Led a team of engineers providing on-call support for tier-1 applications
  • Developed tools to automate dataset recovery on ETL failures.
  • Developed generic notification modules and integrated them with in-house systems
  • Implemented the ELK (Elasticsearch, Logstash, Kibana) stack to collect and analyze the logs produced by the Spark cluster.
  • Performed advanced procedures like text analytics and processing, using the in-memory computing capabilities of Spark.
  • Experienced in handling large datasets using partitioning, Spark in-memory capabilities, broadcasts in Spark, and effective and efficient joins and transformations during the ingestion process itself (see the sketch after this list).
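
As a minimal sketch of an efficient ingestion-time join using Spark broadcasts, as referenced above; the paths and column names are hypothetical placeholders.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Minimal sketch: broadcasting a small dimension table avoids shuffling the
    # large fact table during the join. Paths and column names are hypothetical.
    spark = SparkSession.builder.appName("ingest-join").getOrCreate()

    events = spark.read.parquet("s3://example-bucket/raw_events/")    # large fact data
    regions = spark.read.parquet("s3://example-bucket/dim_regions/")  # small lookup table

    # Broadcast the small side so each executor performs the join locally.
    enriched = events.join(F.broadcast(regions), on="region_id", how="left")

    # Partition the output by date so downstream reads can prune partitions.
    (
        enriched.write
        .mode("overwrite")
        .partitionBy("event_date")
        .parquet("s3://example-bucket/enriched_events/")
    )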

Environment: Hadoop YARN, Spark Core, PySpark, Spark Streaming, Spark SQL, Scala, Python, Kafka, Hive, Sqoop, Elasticsearch, Impala, HBase, Tableau, Oozie, Jenkins, AWS, EMR, RDS, S3, Amazon Redshift, OAuth, Oracle 12c, Linux.
