Site Reliability Engineer II

Bank of America
Full-time
On-site
Charlotte, North Carolina, United States
$108,000 - $161,800 USD yearly

Job Description:

At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. We do this by driving Responsible Growth and delivering for our clients, teammates, communities and shareholders every day.

Being a Great Place to Work is core to how we drive Responsible Growth. This includes our commitment to being an inclusive workplace, attracting and developing exceptional talent, supporting our teammates’ physical, emotional, and financial wellness, recognizing and rewarding performance, and making an impact in the communities we serve.

Bank of America is committed to an in-office culture with specific requirements for office-based attendance, while allowing an appropriate level of flexibility for our teammates and businesses based on role-specific considerations.

At Bank of America, you can build a successful career with opportunities to learn, grow, and make an impact. Join us!

This job is responsible for partnering with engineering and technology teams to implement measures as prescribed by lead/senior SREs. Key responsibilities include ensuring appropriate instrumentation, tooling, ticketing, alerting and on-call routines are in place for key services, identifying root causes of issues through production triage efforts, and suggesting code enhancements to technology teams to automate services and improve reliability and efficiency. Job expectations include using software development skills to improve efficiency and to address gaps in reliability.

Position Summary:

This role will partner directly with Application Development and Production Support teams to implement the measures prescribed through the collaboration of the Senior SRE and their partners. This will include ensuring the appropriate instrumentation, tooling, ticketing, alerting and on-call routines are in place for key services. The SRE II will be engaged in production triage efforts and will work with Problem Management to identify the root cause of issues as required, using the knowledge gained in those efforts to partner closely with the Senior SRE to address any gaps in the reliability measurements and dashboards. The SRE II will also focus heavily on software development activities, delivering automated solutions to eliminate ‘toil’ and suggesting code enhancements to the application development teams. Ideally, we are looking for an individual with a keen interest in technology, forensics, troubleshooting and problem solving: someone who needs to keep digging until an answer is found.

Responsibilities:

  • Develops and maintains reliability scripts, tools and libraries, and leverages them both for common instrumentation, automation, and operational needs and when mentoring Site Reliability Engineer (SRE) resources on reliability practices and established tools/capabilities
  • Collaborates with Development and Infrastructure teams to understand technical solutions and implement monitoring capabilities outlined in the application and system monitoring designs put forward by the SRE Lead
  • Partners to implement code changes to make use of common reliability libraries and tools and helps Application Production Services and Application Development teammates understand how to use them
  • Identifies vulnerabilities and opportunities for reliability improvement, such as investigating low level error rates and 'noise' in monitoring, and defines solutions to reduce manual support effort and/or improve system reliability
  • Engages as a subject matter expert in major incident triage efforts and failure scenario modelling, and works with the Problem Manager to diagnose root causes for major incident/problem management investigations
  • Participates regularly in an on-call rotation with Production Support teammates to learn more about reliability issues affecting their portfolio
  • Integrates the analysis of business models, logical specifications and/or user requirements to design and implement solutions
  • Can lead a tool evaluation effort, including defining the evaluation criteria, identifying the tools to be evaluated, and performing the actual evaluation

Required Qualifications:

  • 5 to 10 years of experience supporting multiple systems
  • Real-world experience supporting a wide variety of technologies including HTTP servers, databases, Windows/Unix/Linux, load balancers, single sign-on, network design, as well as application platforms in a large-scale user setting
  • Must be skilled in the design/management/oversight of multiple development/infrastructure efforts integrating multiple technologies to implement the process component of a requirements-based solution
  • Experience supporting .NET/IIS and Java/WebSphere/WebLogic implementations
  • Practical, hands-on use of troubleshooting and diagnostic tools such as Splunk and Dynatrace
  • Working knowledge of and experience with Windows and Linux servers
  • Some scripting/development skills and working knowledge of SQL
  • Experience with log analysis - IIS, Event, and Performance logs
  • Capable of eliciting, understanding and instantiating a complete solution to a business problem
  • Can design an entire system configuration including hardware, software, and communications
  • Able to work with and manage multiple suppliers
  • Very clear written and oral communication skills
  • Good technical knowledge of Microsoft Server & Desktop products

Desired Qualifications:

  • Experience supporting Online Brokerage or Ecommerce websites
  • Experience in a large financial environment is a major plus
  • Experience interfacing with senior management
  • In-depth knowledge of networking and infrastructure components such as DNS, load balancers, firewalls, and proxies
  • Debugging of .NET applications (DebugDiag/WinDbg) and Java applications is a plus
  • Broad architectural knowledge of all components of applications and infrastructure systems, including object- and component-based development environments, communications technologies including middleware and open gateways, and multiple backend database technologies

Skills:

  • Analytical Thinking
  • Automation
  • Collaboration
  • Production Support
  • Result Orientation
  • Application Development
  • Architecture
  • Influence
  • Project Management
  • Solution Design
  • Adaptability
  • DevOps Practices
  • Risk Management
  • Solution Delivery Process
  • Stakeholder Management

Shift:

1st shift (United States of America)

Hours Per Week: 

40

Pay Transparency details

US - NJ - Pennington - 1200 American Blvd - Princeton Place At Hopewell Bldg. 2 (NJ2120)

Pay and benefits information

Pay range

$108,000.00 - $161,800.00 annualized salary; offers to be determined based on experience, education and skill set.

Discretionary incentive eligible

This role is eligible to participate in the annual discretionary plan. Employees are eligible for an annual discretionary award based on their overall individual performance results and behaviors, the performance and contributions of their line of business and/or group, and the overall success of the Company.

Benefits

This role is currently benefits eligible. We provide industry-leading benefits, access to paid time off, resources and support to our employees so they can make a genuine impact and contribute to the sustainable growth of our business and the communities we serve.