Sr. Software Engineer Resume
VA
SUMMARY:
- 7+ years of intensive experience in development environments within cross-platform systems
- Extensive experience implementing and maintaining open-source distributed systems (Hadoop ecosystem) and search engine solutions
- Diverse experience developing reporting and procedural standards
- Diagnosed and resolved complex technical problems in cross-platform systems
- Performed engineering and management as well as performance monitoring and capacity planning on distributed systems
- Extensive knowledge of the Software Development Life Cycle (SDLC)
TECHNICAL SKILLS:
Languages: Bash/Shell Scripting, SQL, Java, JSON, CSS, XML/XSLT, XPath, XQuery
Platform: UNIX, Linux, Windows, VMware, VirtualBox, Hyper-V
Database & Data Warehousing Tools: MS SQL Server, MySQL, HBase, PostgreSQL, Redshift
Distributed Systems: Apache Hadoop ecosystem (HDFS, YARN, HBase, ZooKeeper, Pig, Hive, Sqoop, Flume, Spark, Falcon, Ambari), Apache Solr/Lucene (SolrCloud), Splunk
IT Infrastructure Frameworks & Monitoring: Cloudera Manager, AWS cloud infrastructure (EMR, EC2, S3), Lucidworks Fusion, Hortonworks HDP & HDF, Cloudera Distribution for Hadoop (CDH4, CDH3), Nagios, Kibana/Banana, Ambari
Web & Application Servers: Apache Web Server, Jetty, Tomcat, JBoss
Networking: TCP/IP, DNS, LAN/WAN, NAT, LDAP/AD
Analysis/Design Methodologies: Scrum Agile, UML, REST
Others: MS Visio, Subversion, Selenium, Puppet, Kerberos, Knox, Ranger
EXPERIENCE:
Confidential, VA
Sr. Software Engineer
Responsibilities:
- Designed, provisioned, maintained, and upgraded the Lucidworks Fusion (SolrCloud) search application in AWS GovCloud on Red Hat Linux hosts
- Set up and configured monitoring tools New Relic and Zabbix
- Provided data management and indexing solutions for governmental agencies
- Leveraged daily Scrum/Agile meetings to communicate and clarify issues and solutions
- Took lead on migration of the application from GovCloud to a new AWS environment
- Designed, installed and configured HDP-Hadoop cluster in AWS on Linux hosts
- Leveraged Splunk for log aggregation and stack troubleshooting
Confidential, Washington, DC
Sr. Big Data Infrastructure Engineer
Responsibilities:
- Designed, provisioned, configured, and upgraded HDP 2.6.3 clusters, including Hadoop, YARN, HDFS, ZooKeeper, Solr, Oozie, Flume, Hive, HBase, Kafka, and Spark
- Set up and configured Ambari 2.6 for provisioning and monitoring
- Set up an external SolrCloud 7.1 cluster alongside Ambari and configured backup and recovery
- Set up firewalld on Red Hat 7
- Set up Nagios to monitor 27 Hadoop clusters with over 217 hosts across virtual-machine and bare-metal Red Hat Linux 6 and 7 servers
- Set up and configured Puppet for automation
- Conducted knowledge transfer with team members
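As a rough illustration of the Nagios monitoring setup above, a minimal host/service object definition can be generated and sanity-checked from the shell. This is a hedged sketch only: the hostname `hdp-master01`, the address, and the NameNode port check are invented for illustration, not taken from the resume.

```shell
# Sketch: write a minimal Nagios object file for one hypothetical Hadoop
# master node, then count its object definitions as a quick sanity check.
CFG=$(mktemp)
cat > "$CFG" <<'EOF'
define host {
    use        linux-server
    host_name  hdp-master01
    address    10.0.0.11
}
define service {
    use                  generic-service
    host_name            hdp-master01
    service_description  HDFS NameNode port
    check_command        check_tcp!8020
}
EOF
grep -c 'define' "$CFG"   # two object definitions: one host, one service
```

In a real deployment the file would live under the Nagios objects directory and be validated with `nagios -v` before a reload.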
Confidential, NYC
Big Data & Solr Engineer
Responsibilities:
- Designed and installed a 20-node HDP 2.2 cluster in a Red Hat Linux environment
- Set up and configured HDP cluster high availability
- Set up Ambari 2.1 for provisioning, managing, and monitoring the Hadoop cluster
- Secured Hadoop clusters with Kerberos, LDAP, Ranger, and Knox
- Set up and auto-configured the Hadoop cluster in a Puppet environment
- Installed, configured, and administered the HDP stack, including HBase, Solr, Hive, YARN, Oozie, Flume, Kafka, Spark, and HDFS
- Set up and configured Nagios and Ambari for system health checks, metrics collection, and alerting
- Set up a backup system and snapshot data replication for HDFS using Falcon
- Set up and configured a SolrCloud cluster with a ZooKeeper ensemble and Kibana/Banana data visualization
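A SolrCloud-plus-ZooKeeper-ensemble setup like the one above can be sketched as a dry run that assembles the ZooKeeper connect string and prints the corresponding `bin/solr` start command. The hostnames are hypothetical, and the command is echoed rather than executed, since no Solr installation is assumed here.

```shell
# Dry-run sketch (hypothetical hostnames): build the ZooKeeper connect
# string for a 3-node ensemble, then print the SolrCloud start command.
ZK_PORT=2181
ZK_HOSTS="zk1.example.com zk2.example.com zk3.example.com"
ZK_CONNECT=""
for h in $ZK_HOSTS; do
    ZK_CONNECT="$ZK_CONNECT$h:$ZK_PORT,"
done
ZK_CONNECT=${ZK_CONNECT%,}   # drop the trailing comma
echo "bin/solr start -cloud -z $ZK_CONNECT -p 8983"
```

Each Solr node started against the same `-z` connect string joins the same cluster, with the ensemble holding the shared collection configuration.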
Confidential, MD
Software Search Engineer
Responsibilities:
- Designed and initiated a prototype for “Confidential -Drug-Labels Indexing Solutions”
- Performed requirement gathering and data analysis
- Set up and configured Solr search engines in “Standalone” and “SolrCloud” modes
- Set up ETL tools, parsers, and transformers (DIH, Tika, XPath) to extract, transform, and load data into Solr nodes
- Defined customized Solr Schema for indexing different data formats (XML, HTML, PDF, JPEG, metadata …)
- Investigated integrating software, frameworks, and libraries to enhance system performance, including OpenNLP (machine-learning toolkit), Logstash (loading log files into Solr), and Kibana/Banana (data visualization)
- Set up and configured data visualization framework “Kibana/Banana” to visualize time-series and non-time-series data, indexed in Solr nodes
- Reconfigured and customized search engine and performed functional testing (both indexing and searching) for quality assurance
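A customized Solr schema of the kind described above declares per-format fields in `schema.xml`. The fragment below is a minimal, hypothetical sketch; the field names are invented for illustration and are not from the original project.

```
<!-- schema.xml fragment: hypothetical fields for mixed-format documents -->
<field name="id"     type="string"       indexed="true" stored="true" required="true"/>
<field name="title"  type="text_general" indexed="true" stored="true"/>
<field name="body"   type="text_general" indexed="true" stored="false"/>
<field name="format" type="string"       indexed="true" stored="true"/>
<!-- make titles searchable through the body field as well -->
<copyField source="title" dest="body"/>
```

Binary formats such as PDF and JPEG would typically reach these fields through Tika extraction, with the original format recorded in a field like `format` for faceting.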
Confidential, VA
Solr Search Engineer
Responsibilities:
- Set up and configured a SolrCloud search solution running on JBoss, comprising 3 clusters, 9 shards, and 9 replicas on Linux servers, to process TBs of data and provide near-real-time search
- Configured a ZooKeeper quorum as the central configuration service for the Solr clusters
- Installed and configured JBoss clusters
- Performed testing and debugging of different search components for high availability and performance enhancement
- Provided comprehensive documentation
Confidential, Washington, DC
Healthcare Business Analyst & Application Engineer
Responsibilities:
- Set up and installed a Solr/Lucene search solution in a Tomcat/JBoss environment on Azure Cloud
- Provided replication configuration for Solr/Lucene indexes
- Defined data-source connectivity from Solr to MS SQL and S3 on AWS
- Set up and installed SSL connectivity between Solr nodes
- Installed and configured SolrCloud cluster on Unix servers
- Performed requirements analysis
- Daily administration and management of multiple Linux servers
- Installed and configured Hadoop nodes/Cluster on Amazon Web Services
- Automated a start-up script for pulling data from MySQL and ingesting it into Hadoop Distributed File System (HDFS)
- Wrote shell scripts for day-to-day log-rolling processes
- Imported and exported data to and from HDFS and Hive using Sqoop
- Manipulated and managed data with Hive and Pig
- Managed the day-to-day operations of the cluster for backup and support
- Performed performance monitoring and capacity planning
- Performed reliability testing to reveal potential problems arising from extended runs
- Performed Solr performance testing to determine the response time of a given request
- Met expectations and general performance standards set forth by the company
- Oversaw and directed the quality assurance review of monthly activity reports including validation of results
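A day-to-day log-rolling script like the one mentioned above can be sketched as follows. The directory and file names are stand-ins (a temporary directory replaces a real path such as `/var/log/myapp` so the sketch is self-contained); the retention window of 7 days is an assumed example.

```shell
# Sketch of a daily log-rolling step: stamp the current log, compress it,
# and prune archives older than 7 days. Paths and names are hypothetical.
LOG_DIR=$(mktemp -d)                    # stand-in for e.g. /var/log/myapp
printf 'sample line\n' > "$LOG_DIR/app.log"

STAMP=$(date +%Y%m%d)
if [ -f "$LOG_DIR/app.log" ]; then
    mv "$LOG_DIR/app.log" "$LOG_DIR/app.log.$STAMP"
    gzip "$LOG_DIR/app.log.$STAMP"      # leaves app.log.YYYYMMDD.gz
fi
find "$LOG_DIR" -name 'app.log.*.gz' -mtime +7 -delete
```

In practice a script like this would run from cron, with the application reopening its log file (or being signaled) after the rotation.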