
Software Developer Resume


Santa Monica, CA

SUMMARY:

  • Solid programming experience with Python 2.7.0/3.5.0, R 3.5.1, Java 7, HTML5, CSS3 and environments like Linux and UNIX.
  • Technical experience with Hortonworks 2.6.5 distributions, Databricks 2.4.2, Cloudera 4 and the Hadoop working environment, including Hadoop 2.8.3, Hive 1.2.2, Sqoop 1.4.7, Flume 1.5.0.1, HBase 2.0.0, Apache Spark 2.2.1 and Kafka 1.3.2.
  • Good working knowledge of Eclipse IDE 4.7 for developing and debugging Java applications.
  • Comprehensive knowledge of Core Java Concepts and Collections Framework, Object Oriented Design and Exception Handling
  • Experience in testing applications using JUnit 4.12
  • Used JIRA for bug tracking.
  • Familiar with Agile software development methodologies
  • Experience working with source and version control systems like BitBucket, Git 2.12, GitHub
  • Well acquainted with HDFS 2, YARN and MapReduce programming paradigms
  • Extensive practical knowledge of data imports and exports between HDFS 2 and relational database systems using Sqoop 1.4.7 and Flume 1.5.0.1.
  • Hive 1.2.2 table creation and data loading to run basic and advanced queries, as well as partitioning and bucketing Hive-stored data (a minimal sketch follows this list)
  • Experience converting MapReduce programs to Spark RDD transformations for improved performance
  • Proficient in working with Spark Ecosystem using Spark SQL and Scala 2.11.12 queries on different formats like Text file, Avro, Parquet files
  • Knowledge of working with NoSQL databases like Cassandra 2.2, HBase 2.0.0
  • Worked with Amazon Web Services using EC2 for computations and S3 as a storage mechanism
  • Implemented MLlib algorithms and tested different models using Spark Machine Learning APIs
  • Hands-on experience with Python libraries like Matplotlib 2.2.2, NumPy, SciPy, Pandas
  • Strong knowledge of the design and analysis of ML/data science algorithms like Classification, Association Rules, Clustering and Regression, and of models like Descriptive, Predictive and Prescriptive analytics, Machine Learning (ML), Deep Learning (DL), Natural Language Processing (NLP), Text Analytics, Data Mining, Unstructured Data Parsing and Sentiment Analysis.
  • Neural network libraries: TensorFlow r1.8.0, Keras 2.2.1 etc.
  • Hands-on working experience with Brain Computer Interfaces - Emotiv technology for Data Mining on Brainwaves
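
A minimal PySpark sketch of the Hive partitioning and bucketing mentioned above; the database, table, column names and input path (sales_db.orders, order_date, customer_id, /data/raw/orders) are hypothetical, not taken from any project listed here.

```python
# Minimal sketch: create a partitioned, bucketed Hive table with PySpark and
# query it with partition pruning. All names and paths are hypothetical.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-partitioning-sketch")
         .enableHiveSupport()
         .getOrCreate())

raw = spark.read.parquet("/data/raw/orders")  # placeholder input path

# Partition by order_date, bucket by customer_id (bucketBy requires saveAsTable).
(raw.write
    .partitionBy("order_date")
    .bucketBy(8, "customer_id")
    .sortBy("customer_id")
    .mode("overwrite")
    .saveAsTable("sales_db.orders"))

# Only the matching order_date partition is scanned by this query.
spark.sql("""
    SELECT customer_id, COUNT(*) AS order_count
    FROM sales_db.orders
    WHERE order_date = '2018-06-01'
    GROUP BY customer_id
""").show()
```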

TECHNICAL SKILLS:

Programming Languages: C, Java 7, Python 2.7/3.5.0, R 3.5.1

Web Programming & Scripting Languages: HTML5, CSS3, JavaScript 6

Databases: Oracle 10g, MySQL 5/8, Cassandra 2.2, HBase 2.0.0

Operating Systems: Windows 7, 8.1, 10; Linux; xv6 - Unix; Android

Frameworks: Hadoop 2.8.3, Apache Spark 2.2.1

Methodologies: Agile, Waterfall

Software & Analysis Tools: Microsoft Excel 2015, Microsoft Access 2015, Microsoft Word 2015, MATLAB 2015, SciLab, Tableau, Visual Studio 2017, Eclipse Oxygen, NetBeans IDE 8.2, RStudio 1.1.456, Amazon S3, Anaconda 5.1.0, RapidMiner 7.2, KNIME 3.5, PyCharm 2.0, Emotiv, MySQL Workbench

SCM Tools: BitBucket, Git 2.12, GitHub

Hadoop Ecosystem: HDFS 2, MapReduce, Hive 1.2.2

ETL: Sqoop 1.4.7, Flume 1.5.0

Bug Tracking Tools: JIRA

PROFESSIONAL EXPERIENCE:

Confidential, Santa Monica, CA

Software Developer

Responsibilities:

  • Deploying the features using traditional Maven commands
  • Proposed and implemented Jenkins automation for deployment, Maven releases (major and minor) and Nexus artifact uploads.
  • Wrote Salt scripts to take advantage of its parallel execution framework.
  • BigQuery usage estimation and testing
  • GitHub repository migration and testing, including enabling HTTPS in place of HTTP.
  • Research and analysis on the existing and new features
  • Deploying released artifacts on Linux machines, both manually and through automated deployment
  • Creating test cases for regression testing and application testing, and reusing existing test cases for sanity testing.
  • Creating Bash jobs for testing on the cluster, including testing Kerberos secrets and Vault.
  • Uploading the test jobs created for testing Spark, MapReduce, Sqoop and Presto to HDFS and testing them via the UI (see the sketch after this list)
  • Developing features for the job scheduler using Scala Play and Slick
  • Testing the features using MySQL Workbench to execute queries on various tables.
  • Rendering JSON HTTP responses to the UI using Scala Play routes.
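
A rough sketch of the kind of Spark test job referred to above, assuming a simple word count is enough to exercise HDFS reads, shuffles and writes on the cluster; the input path is a placeholder.

```python
# Sketch of a tiny Spark test job of the kind uploaded to HDFS for cluster
# validation; a word count exercises HDFS reads, shuffles and writes.
import sys
from pyspark.sql import SparkSession

def main(path):
    spark = SparkSession.builder.appName("cluster-smoke-test").getOrCreate()
    counts = (spark.read.text(path).rdd
              .flatMap(lambda row: row.value.split())
              .map(lambda word: (word, 1))
              .reduceByKey(lambda a, b: a + b))
    print("distinct words:", counts.count())  # non-zero count means the run succeeded
    spark.stop()

if __name__ == "__main__":
    # Placeholder HDFS path; override on the command line when submitting.
    main(sys.argv[1] if len(sys.argv) > 1 else "/tmp/smoke-test/input.txt")
```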

Environment: Scala, IntelliJ, Python 2.7/3.5, HDFS 2, Hive 1.2.2, MySQL 5, MySQL Workbench, Linux/Unix - Bash Shell Scripting, Mac

Confidential, Phoenix, AZ

Big Data Developer

Responsibilities:

  • Maintaining, enhancing and upgrading the Cornerstone data ingestion capabilities
  • Optimizing data retrieval for users using Hive
  • Efficiently storing and retaining data in Cornerstone.
  • Creating and managing nodes that use Java JARs, Python and shell scripts to schedule jobs, customizing data ingestion to provide uninterrupted, simplified user access through Event Engine, a dedicated organization-specific tool.
  • Implementing code changes in existing Java, Python and shell-script modules for enhancement.
  • Performing extensive MySQL queries on several tables for efficient retrieval of ingested data using MySQL Workbench (a programmatic equivalent is sketched after this list).
  • Rewriting existing scripts in Python and shell script for more efficient execution.
  • Performing testing on various modules as part of environment upgrades.
  • Resolving JIRA tickets raised through customer request tracking by debugging code errors.
  • Agile is used for project management and Bitbucket for source code tracking.
  • The data formats dealt with are XML, JSON, Parquet, and Text.
  • Hive is used for data visualization.
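
The queries above were run interactively in MySQL Workbench; as a hedged illustration, the sketch below shows an equivalent parameterized lookup issued from Python with PyMySQL. The connection details, table and column names (cornerstone, ingestion_audit, feed_name) are hypothetical.

```python
# Hedged sketch: an equivalent of the MySQL Workbench lookups, issued from
# Python with PyMySQL. Connection details, table and columns are hypothetical.
import pymysql

conn = pymysql.connect(host="localhost", user="ingest_user", password="secret",
                       database="cornerstone",
                       cursorclass=pymysql.cursors.DictCursor)
try:
    with conn.cursor() as cur:
        # Parameterized query against a hypothetical ingestion-audit table.
        cur.execute(
            "SELECT feed_name, status, row_count "
            "FROM ingestion_audit WHERE load_date = %s",
            ("2018-06-01",))
        for row in cur.fetchall():
            print(row["feed_name"], row["status"], row["row_count"])
finally:
    conn.close()
```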

Environment: Python 2.7/3.5, Putty, Java 7, Eclipse - Oxygen, HDFS 2, Hive 1.2.2, MySQL 5/8, MySQL Workbench, Linux/Unix - Bash Shell Scripting

Confidential, Piscataway, NJ

Big Data Developer

Responsibilities:

  • Implemented solutions for ingesting data from various sources and processing it using big data technologies such as Hive, Sqoop, HBase and MapReduce.
  • Installed Hadoop, MapReduce and HDFS, and developed multiple MapReduce jobs in Hive for data cleaning and pre-processing.
  • Demonstrated better organization of the data using techniques like Hive partitioning and bucketing.
  • Analyzed data on the Hadoop cluster using tools like Hive, Sqoop, Spark and Spark SQL.
  • Handled data import and export to and from HDFS, Hive using Sqoop.
  • Implemented data transfers from local/external file systems and RDBMSs to HDFS.
  • Hive queries were used to analyze the data.
  • SparkContext and Spark SQL were used for optimizing the analysis of data.
  • Spark RDDs were used to store data and perform in-memory computations on it (a minimal sketch follows this list).
  • Amazon S3 was used to store the data.
  • Anaconda was used as the development environment for the model, and WEKA and RapidMiner were used for generating statistics and analysis.
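
A minimal sketch of the cached (in-memory) Spark SQL analysis pattern described above; the Hive table and column names (warehouse.transactions, txn_date, customer_id) are illustrative assumptions.

```python
# Minimal sketch of the cached (in-memory) Spark SQL analysis described above;
# the Hive table warehouse.transactions and its columns are illustrative.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("spark-sql-analysis")
         .enableHiveSupport()
         .getOrCreate())

txns = spark.table("warehouse.transactions")
txns.cache()  # keep the working set in memory across repeated queries

txns.createOrReplaceTempView("txns")
spark.sql("""
    SELECT txn_date, COUNT(DISTINCT customer_id) AS daily_customers
    FROM txns
    GROUP BY txn_date
    ORDER BY txn_date
""").show()
```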

Environment: Hadoop 2.8.3, Hive 1.2.2, Apache Spark 2.2.1, Amazon S3, Anaconda 5.1.0, WEKA, and RapidMiner 7.2

Confidential, Houston, TX

Big Data Developer

Responsibilities:

  • Cart abandonment refers to a user leaving the shopping cart without completing a purchase.
  • The data file used was in JSON format.
  • Predictive analysis was performed on Dell shopping cart clickstream data using the Keras library running on top of TensorFlow (a minimal sketch follows this list).
  • PySpark and Spark were used for data cleaning and preprocessing.
  • Seaborn library was used for data visualization and analysis.
  • A decision rule classifier and the treatment learner TAR3 were used to obtain paths having a high likelihood of leading to a specific outcome (abandoning or purchasing user).
  • Agile methodology was used for project management and GitHub was used for source code tracking.
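
A hedged sketch of a Keras-on-TensorFlow binary classifier of the kind described above for predicting cart abandonment; the feature count, layer sizes and the randomly generated placeholder data are illustrative, not the actual Dell clickstream features.

```python
# Hedged sketch: a small Keras binary classifier for cart abandonment.
# Feature count, layer sizes and the random placeholder data are illustrative.
import numpy as np
from tensorflow import keras

# X: per-session clickstream features; y: 1 = abandoned cart, 0 = purchased.
X = np.random.rand(1000, 20).astype("float32")
y = np.random.randint(0, 2, size=(1000,))

model = keras.models.Sequential([
    keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2)
```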

Environment: Pandas, TensorFlow r1.8.0, Keras libraries, PySpark 2.2.1, Spark 2.2.1.

Confidential, Houston, TX

Big Data Engineer

Responsibilities:

  • Developed a book recommendation system to generate personalized recommendations.
  • Stored the Amazon free datasets on Amazon S3 for efficient processing and fast access.
  • Data used was in XML and CSV format.
  • Two datasets containing customer demographics and ratings were combined to generate recommendations for users to buy a book predicted to suit their taste.
  • Collaborative filtering with ALS matrix factorization was used to generate recommendations (a minimal sketch follows this list).
  • Databricks was used for processing the data and visualizations.
  • Analyzed the dataset using Spark MLlib, python-pandas, seaborn.
  • Python - Pandas library was used for data cleaning and pre-processing.
  • Trained the neural network model to classify unknown input values until the difference between estimated and actual values was small, while ensuring it was not overfitted.
  • Used k-fold cross-validation to divide the training and test data for an optimal split without data skew.
  • ROC curves and a confusion matrix were used for evaluation.
  • Genetic Algorithm Analysis was used for finding the best possible solution from available options
  • Processed the datasets using the GA, neuralnet and nnet packages in RStudio.
  • Trained neural network models until a high precision value was obtained.
  • Utilized RStudio packages for generating covariance, correlation matrices.
  • Visualization was done using ggplot.
  • Principal Component Analysis was used for identifying the major contributor towards the distribution.
  • Experimented with ensemble learning techniques such as bagging and boosting.
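
A minimal sketch of collaborative filtering with Spark MLlib's ALS, assuming that is the matrix factorization approach referenced above; the S3 path, column names and hyperparameters are illustrative.

```python
# Minimal sketch of collaborative filtering with Spark MLlib ALS; the S3 path,
# column names and hyperparameters below are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.ml.recommendation import ALS
from pyspark.ml.evaluation import RegressionEvaluator

spark = SparkSession.builder.appName("book-recommender").getOrCreate()

# ratings: user_id, book_id, rating (e.g. the combined demographics/ratings set).
ratings = spark.read.csv("s3a://example-bucket/book_ratings.csv",
                         header=True, inferSchema=True)
train, test = ratings.randomSplit([0.8, 0.2], seed=42)

als = ALS(userCol="user_id", itemCol="book_id", ratingCol="rating",
          rank=10, maxIter=10, regParam=0.1, coldStartStrategy="drop")
model = als.fit(train)

rmse = RegressionEvaluator(metricName="rmse", labelCol="rating",
                           predictionCol="prediction").evaluate(model.transform(test))
print("test RMSE:", rmse)

model.recommendForAllUsers(5).show(truncate=False)  # top-5 books per user
```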

Environment: Python 3.5.0 - Pandas, Seaborn libraries, Apache Spark 2.2.1 - MLlib, RStudio 1.1.456 - neuralnet, nnet, GA packages, R programming
