Title: Site Reliability Engineering
Location: Austin, TX (Day 1 Onsite)
Duration: Contract
The conversational Engineering team seeks a highly motivated individual with a background in Software In this position, you'll help craft a seamless, high-quality experience for our Join our team, and you'll help Apple ensure our customers get the products they want as effortlessly as
The Conversational Reliability Operations Engineer is a natural leader and facilitator; a strategic thinker who can "connect the dots" at multiple levels; is driven, organized, and detail-oriented; communicates with ease at all levels; is adept at facilitating actions and resolving conflicts; manages through relationships and influence; and displays grace under
Key Qualifications:
- Experience with DevOps, site reliability engineering (SRE), issue triaging, and troubleshooting
- Automation experience using Ansible, Jenkins, and Puppet
- Hands-on experience with CI/CD tools and building pipelines using Jenkins or any other tool
- Container tools & Orchestrator systems like Docker and
- Message queue systems like Kafka
- Strong hands-on Splunk experience, including building dashboards with complex queries
- Excellent debugging skills: ability to quickly recognize patterns in failures
- Experience with scripting languages such as Python, Perl, shell scripts,
- Strong verbal and written communication skills, and the ability to coordinate with multiple technical and functional users
- Self-motivated with excellent time management skills
- High attention to detail, and you are good at finding edge cases
Additional skills (a plus, not required):
- You know iOS and are familiar with the features of the Messages
- You are familiar with essential Machine Learning and NLP concepts
- You've worked with Chatbots, Conversational AI, or IVR systems
Responsibilities:
- Drive critical incidents with cross-functional teams across
- Use operational tools to monitor performance against the ecosystem, identifying trending issues and escalating any problems to the appropriate team(s). Follow the progress and work with the proper group(s) to identify the root cause and resolve each
- Look for improvement and automation opportunities to preserve and enhance customer
- Where you identify customer experience as sub-optimal, work with cross-functional teams to improve quality and Influence for improvements based on a solid understanding of expected and desired experience
- In an impactful production incident, ensure the issue is triaged and troubleshot by the correct teams and mitigated as soon as possible to restore customer Ensure the root cause is
- Perform code fixes for problems and support activities such as incident trend analysis under minimal supervision
- Ensure resilience through proactive testing and preventive maintenance
- Set up monitoring and alert mechanisms to address issues before they become problems
- Identify learnings or opportunities for improvement and influence for change as part of your quality and reliability
- Provide constructive feedback for testability and suitable solutions, relying on data to justify technical
- Create and maintain Operations-related documentation and processes for the role in a central
- Effectively document and communicate standards to platform
Experience:
- BE/B.Tech in Computer Science with 5+ years of experience in the software industry
- Proven work experience in software development and operations
- Strong knowledge of infrastructure deployment methodologies, tools, and processes
- Solid understanding of operations and infrastructure, and scripting
- Experience with performance and security testing is a plus
- Up-to-date on the latest industry trends; able to articulate trends and potential clearly and confidently
Twinkle
Skills: