SRE and DevOps
Experience: 2-8 Yrs
Package: 10-20 LPA
Position: SRE and DevOps
FarEye is a logistics SaaS platform for predictive visibility. It enables brands to orchestrate, track, and optimize their logistics operations. The machine-learning-based platform empowers global enterprises to shrink delivery time by up to 27%, increase courier productivity by up to 15%, eliminate risks by up to 57%, and achieve operational excellence.
We are looking for experienced and motivated Site Reliability Engineers who will shape and drive the performance and scaling initiatives for the FarEye platform. You will be the champion and owner of most of the components running our products and solutions. You will work very closely with other SRE team members, product engineers, and business owners to deliver an enterprise-grade SaaS platform with 100% DB uptime and a 99.9% speed SLA.
You love to automate and have a passion for debugging and for challenging the status quo by experimenting with innovative OSS to solve complex SRE challenges. You have an innate desire to design and architect infra solutions using Infrastructure as Code. You have a passion for learning path-breaking technologies and solving complex business problems. Security, reliability, scalability, and speed are the keywords that drive your actions. Troubleshooting complex distributed-systems problems with like-minded peers keeps you motivated and engaged in building the next logistics SaaS platform.
More than anything else, you love to delight customers with software that works flawlessly, repeatedly and functions even at peak loads.
As a Cloud Infrastructure Engineer and SRE, you operate seamlessly between development and operations. You’ll engage in and improve the lifecycle of the platform and allied services, from design to deployment, operation, and refinement. You’ll maintain services by measuring and monitoring availability, latency, and overall system health. You’ll play an important role in scaling systems sustainably through automation and evolving them by pushing for changes that improve reliability and velocity. You will also participate in a 24x7 on-call support roster.
To be successful in this role, you must be a motivated self-starter and self-learner, possess strong problem-solving skills, and be someone who embraces challenges.
- Work with other Cloud Infrastructure Engineers and developers to ensure maximum performance, reliability and automation of our deployments and infrastructure
- Work with, consult and influence developers on new features and software architecture to ensure scalability
- Develop software, both as components of our solution and outside of the solution, for deployment automation, packaging and monitoring visibility.
- Identify tasks and areas where automation can be applied to achieve time efficiencies and risk reduction.
- Debug and troubleshoot service bottlenecks throughout the whole software stack.
- Measure and monitor availability, latency, and overall system health.
- You will have direct influence on the decisions and outcomes related to solution implementation.
- Minimum 3 years of relevant industry experience. 4-8 years would be preferred
- In-depth experience in Linux administration, TCP/IP networking, and virtualization using Xen, VMware or Linux KVM
- Hands-on experience in distributed systems based on Java, Node.js and/or Ruby
- Hands-on experience in administration and basic troubleshooting of RDBMSs such as MySQL and Postgres.
- Deep understanding of and working experience with load balancers such as Nginx, HAProxy, or similar, and with TLS-based security on load balancers
- Demonstrated ability and experience in administration, tuning and troubleshooting of microservices-based distributed systems built on Kubernetes, Elasticsearch, Kafka, RabbitMQ and Redis
- Proven experience in infrastructure automation, configuration and orchestration on public cloud platforms using Terraform, SaltStack, Docker and Kubernetes, and in CI/CD using Jenkins/Argo/Flux
- Exposure to the administration and operation of monitoring tools such as Zabbix, Prometheus, Grafana, EFK, AppDynamics (AppD) or New Relic
- An AWS or Azure cloud certification would be an added advantage
- Understanding of error budgets, observability, and maturity and reliability indices
- Strong practitioner of Infrastructure as Code
- Perform proactive health checks and drive root cause analysis (RCA) through to the final fix
- A minimum of 3 years of experience is required. 4 to 7 years of experience is preferred.
- A Bachelor of Science Degree in Computer Science or equivalent experience is required.