Project Management Central


Topics: Construction, Risk Management, Scrum
Agile as troubleshooting? Scrum?
Network:0


I work at a nuclear power plant. Every 2 years we shut down to refuel the reactor with fresh uranium and to repair or perform preventive maintenance on systems and components that cannot be worked on while the plant is running. These outages are planned well in advance. A Refuel Outage (RFO) can be as short as 25 days at 24/7 or, if a large scope of maintenance, modification or "discovery" work is to be performed, can last 90 days or more.
My job is handling the high-risk unknowns "discovered" during these scheduled outages. The burn rate for an outage of this sort, including the lost production of the reactor, can top a million bucks a day.
The teams I form and manage are referred to as "Emergent Issues Teams" (EITs) and, for most issues, are led by an EIT Leader selected from middle management. Issues of very high risk, complexity or consequence warrant and get senior-level management oversight.
The whole outage is planned in 1-hour increments and logic-tied in the schedule for maximum efficiency and to keep the critical path stable.
That is where the problems can begin.
Managing the waterfall of projects and programs in the outage schedule is pretty straightforward. Start when you are scheduled, keep up with the estimated production rate and finish when you should in the schedule. There can be issues, but they can usually be easily resolved.
The Emergent Issues I deal with are commonly threats, sometimes MAJOR threats, to the outage critical path. I view the EI Team as a "Fix It Now" (FIN) team, and we work outside the outage schedule specifically to PROTECT the plant, the workers and the outage schedule. Most of the issues we deal with are technical in nature and involve non-functioning equipment discovered while completing other work, or equipment accidentally damaged while performing other work.
An EIT generally consists of an EIT Leader, a couple of engineers as analysis leads, an operations specialist (who can tell us what does what and when it can be worked on) and maintenance (mechanics, electricians and instrumentation and control techs). Radiation Protection is usually represented, and sometimes Security, Training or Regulatory Compliance is involved.

Whew! That is more background than I thought I would need! So.... Here is my question:
Waterfall / plain vanilla project management is not ideally suited to technical troubleshooting and "fix on the fly" efforts.
I believe that aspects of Agile and Scrum management strategies are things I can use to great advantage. But what parts? Here is a bulleted list of items from a very simple EIT we recently completed.
1: The superwhamadyne zoomie transfer slinger (SZTS) has quit working and has sent automated alarms to the control room and the alarms disable other important equipment.
2: The Outage Manager declares that an EIT shall be formed to address the issue so the rest of the organization can continue with all available work to maintain the outage schedule.
3: The EIT Manager selects an EIT Leader and assists the Leader in assembling the team with Engineering, Maintenance, Ops and Purchasing. Their first meeting is held within 2 hours of the discovery of the issue.
4: Engineering and Maintenance develop a troubleshooting plan to discover why the SZTS has quit and what specifically caused the alarm. (There are 14 different alarm points on the device, but all show up at the control room as "SZTS General Trouble".)
5: Ops and the Shift Manager have to approve the Troubleshooting Plan to make sure that it will not affect any other outage work. Once the plan is approved, the Maintenance team can go to work.
6: The Maintenance team returns with the data collected under the troubleshooting plan and hands it over to Engineering. Engineering assembles an "FMA" (Failure Modes Analysis) team within the EIT to interpret the data. They may send the Maintenance team out again with more things to do based on the first iteration of troubleshooting. This may repeat several times.
7: Once the Failure mode is isolated, the team comes up with a fix. Parts are ordered or manufactured if required.
8: A Corrective Maintenance Work Order is planned and assigned to a specific Maintenance Work Group or the FIN Maintenance team for rapid execution.

As you can see, there are many unknowns, and many decisions are made while determining the best path to resolving the issue. Linear methods of management do not really work well here, although many senior managers think a straight timeline should be used to track the progress of the EIT towards resolution.
What I prefer to do as part of the EIT progress updates (usually every 3 hours) is publish a PERT chart marked up to show what has been done, how long it took and, if possible, a projection of the completion time based on present knowledge.
Of course, that can easily vary by 100% over a few hours depending on what troubleshooting reveals.
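
One way to put a number and an error band on that projection is the classic PERT three-point estimate, where each remaining activity gets an optimistic, most likely and pessimistic duration. The following is only a minimal sketch: the activity names and hours are hypothetical placeholders, not data from a real EIT, and adding the variances assumes the activities run in series and are independent of each other.

```python
from math import sqrt

def pert_estimate(optimistic, most_likely, pessimistic):
    """Return (expected, std_dev) using the classic PERT three-point formulas."""
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev

def project_completion(activities):
    """Sum expected durations and combine spreads for activities done in series.

    Assumption: activities are independent, so their variances add.
    """
    total = 0.0
    variance = 0.0
    for _name, o, m, p in activities:
        expected, std_dev = pert_estimate(o, m, p)
        total += expected
        variance += std_dev ** 2
    return total, sqrt(variance)

# Hypothetical remaining activities: (name, optimistic, most likely, pessimistic) in hours.
remaining = [
    ("second troubleshooting pass", 2, 4, 12),
    ("failure modes analysis", 1, 3, 8),
    ("order or fabricate parts", 4, 8, 24),
    ("corrective maintenance work order", 3, 6, 12),
]

expected_hours, sigma_hours = project_completion(remaining)
print(f"Projected remaining: {expected_hours:.1f} h, give or take {sigma_hours:.1f} h")
```

Re-running this at each 3-hour update with revised estimates shows how both the projection and its spread move as troubleshooting reveals more.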

Can anyone make a recommendation on Agile / Scrum / Kanban or other models that can be combined, integrated or bastardized into a general framework that supports the very short timeframe, high consequence, high cost and variable scope of an "Emergent Issue Team"?
Network:491



Troubleshooting efforts deal primarily with unknowns, and as such are best governed by processes like the one you detailed. It seems your management wants to feel it has control over the EIT effort, which is why it insists on trying to manage it like a typical project. I don’t see how Scrum or even a simple Kanban board can help you significantly improve your current process. That said, an Ishikawa diagram (fishbone diagram) could help you identify the root causes of your problems faster by helping you categorize the many unknowns you deal with (assuming you’re not already using one).
Network:167



I don't really feel qualified to comment on how you should handle a nuclear reactor - generally, I have a hands-off policy on things that can burn you through walls.

My initial thinking would be to pursue more Lean-centered practices, maybe not full-blown Agile. Lean elements could include Kaizen events to find areas that could be improved and to keep looking for ways to reduce wasted steps in the process, plus 5S to ensure the place is tidy and organized so you can find what you need when you need it.
A standardized process for how to do things should be created within that.
Network:88



How about trying more of the philosophy of building quality in rather than testing it in? A high-quality, mature process is characterized by not having the kinds of emergent issues you described. More risk analysis, more Ishikawa diagrams, etc. can help you to identify and proactively manage risks, and improve systems and processes so that they don't create emergent issues during routine maintenance. This is especially important for systems that pose a potential public health and security problem.
Network:1009



Karl -

Kanban with different classes of work items (e.g. expedite vs. normal priority) would seem to be a good starting point. I'd also focus on forming a whole self-managed, self-organizing team to minimize delays due to missing skill sets or decision making.
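
To make the classes-of-service idea concrete, here is a minimal sketch of a pull queue where expedite items jump ahead of normal ones. Everything in it is an assumption for illustration: the `KanbanQueue` class, the work-in-progress limit, the rule that expedite items may exceed that limit, and the task names are all hypothetical, not features of any particular Kanban tool.

```python
import heapq
from itertools import count

EXPEDITE, NORMAL = 0, 1  # lower number is pulled first

class KanbanQueue:
    """Ready queue with two classes of service and a work-in-progress limit."""

    def __init__(self, wip_limit):
        self.wip_limit = wip_limit
        self._ready = []         # heap of (service_class, arrival_order, title)
        self._arrival = count()  # tie-breaker keeps first-in, first-out within a class
        self.in_progress = []

    def add(self, title, service_class=NORMAL):
        heapq.heappush(self._ready, (service_class, next(self._arrival), title))

    def pull(self):
        """Start the next item, or return None if nothing can be started.

        Hypothetical policy: expedite items may exceed the WIP limit,
        normal items may not.
        """
        if not self._ready:
            return None
        service_class, _, title = self._ready[0]
        if service_class != EXPEDITE and len(self.in_progress) >= self.wip_limit:
            return None
        heapq.heappop(self._ready)
        self.in_progress.append(title)
        return title

    def finish(self, title):
        self.in_progress.remove(title)

# Hypothetical tasks on one team's board.
board = KanbanQueue(wip_limit=2)
board.add("collect field data")
board.add("walk down cable routing")
board.add("SZTS general trouble alarm", EXPEDITE)

first = board.pull()    # the expedite item, even though it was added last
second = board.pull()   # oldest normal item
third = board.pull()    # None: WIP limit reached for normal work
```

The point of the WIP limit is that normal work waits rather than piling up half-done, while anything flagged expedite still gets started immediately.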

Kiron
Network:0


First I want to thank you all for your inputs. In order, Mr. Simms writes that my current management model may be the best. And he posited that pressure from management to "control" troubleshooting activities accounts for the insistence on using straight-line management (waterfall) techniques. I am in agreement with the latter. Not all senior managers are very sophisticated in their thinking when it comes to non-linear, high-risk efforts with multiple decisions. Some are the "It's simple math! 9 women, 1 month, ya got a baby!" types. But he is right in that kanban cards are not really useful in troubleshooting. Scrum, with some of the time-constrained tools used there, may be of some use. One thing I personally use is the Agile Servant Leadership role, as a nuclear plant is a large, complex system. No one leader will have the technical knowledge for cross-cutting troubleshooting, but he might know how to use, and teach the use of, the tools to perform the troubleshooting.
Mr Render.... Don't be skeered! When ya got Nooklar perfessushnils like us'n runnin' the show, whut's ta go wrong? Seriously, I will research Lean concepts further to mine them for tools and methods to incorporate into my EITs. Right now I am working on emergent issue number 62 that has required a team over the past 80 days.
Mr Isom.... You are preaching to the choir. Almost without fail the emergent issues I have to address are a result of ineffective contingency planning. The managers and supervisors discount the risk instead of planning for the worst, and then are caught pants down without work planning, materials available or specialty workforce available.
And Mr Bondale.... I'm not sure at all about hi vs lo priority for EIT Troubleshooting. By definition we are high priority. But the self-organizing / self-managing part is spot on. The problem is, we are very much "in the spotlight", and every blood-sniffing shark is attracted to the smell and delights in pot-shotting the plan and resulting decisions.

There are a couple of things I am looking for that the forum may know of and can point me to. One is a classroom lean/agile/scrum or omnibus course that IS NOT SOFTWARE!
I don't really want an online course, I want to ask questions of live people.
The other I might have to build. I want a computer graphic tool that combines PERT charting for critical path and incorporates a decision tree. This would be great for troubleshooting. The nodes would be the ends of activities or decisions and have no duration. The arrows would have the duration and be represented with proportional length. As the troubleshooting progresses, the manager selects the arrows completed and takes credit for the nodes as milestones. Software in the background then calculates the critical path as each decision is made, and tracks changes over time to the critical path.
What do you think? Would it sell?
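
The background calculation for a tool like that can be sketched in a few lines. This is only an illustration under stated assumptions: the network, node names, activity names and hours are hypothetical, and the rule for an undecided decision node (keep every branch, so the projection is the worst case) is one possible choice, not an established method. Nodes are zero-duration events and arrows carry the durations, as described above.

```python
def longest_path(arrows, start, end, chosen=None):
    """Return (hours, [activity names]) for the longest path from start to end.

    arrows maps a node to its outgoing arrows: [(next_node, activity, hours), ...].
    chosen maps a decision node to the activity selected there. A decision node
    with no entry in chosen keeps all its branches, so the result is worst case.
    Returns None if end cannot be reached. Assumes the network has no loops.
    """
    chosen = chosen or {}
    memo = {}

    def best_from(node):
        if node == end:
            return 0, []
        if node in memo:
            return memo[node]
        best = None
        for nxt, activity, hours in arrows.get(node, []):
            if node in chosen and chosen[node] != activity:
                continue  # branch ruled out by a decision already made
            rest = best_from(nxt)
            if rest is None:
                continue
            candidate = (hours + rest[0], [activity] + rest[1])
            if best is None or candidate[0] > best[0]:
                best = candidate
        memo[node] = best
        return best

    return best_from(start)

# Hypothetical troubleshooting network; "data_in" is a decision node.
ARROWS = {
    "start": [("plan_approved", "write and approve troubleshooting plan", 3)],
    "plan_approved": [("data_in", "collect field data", 4)],
    "data_in": [
        ("fix_known", "replace failed relay", 6),
        ("fix_known", "fabricate replacement part", 20),
    ],
    "fix_known": [("done", "post-maintenance test", 2)],
    "done": [],
}

# Track how the projection changes as decisions are made.
history = []
history.append(longest_path(ARROWS, "start", "done"))
history.append(longest_path(ARROWS, "start", "done",
                            chosen={"data_in": "replace failed relay"}))
for hours, path in history:
    print(f"{hours} h via: {' -> '.join(path)}")
```

Each entry in `history` is a snapshot of the critical path at one update, which is the raw material for showing management how the projection moved with each decision.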
Reply by Joshua Render, Jun 26, 2018 4:50 PM:
I am glad you "Nooklar perfessushnils" are on the job. :)

I can't claim to know of any sources for Agile not dedicated to software for education except for like 1 online course in Kanban. I wish I knew of more, I would take them myself.

Lean should be much easier. I wouldn't be surprised if local community colleges offer classes on it. They may be manufacturing focused, which might not actually be too far off of what you need. ASQ I think offers classes where they will come to you and train your team. http://asq.org/lean-manufacturing/training.html
Network:167



Jun 26, 2018 4:36 PM
Replying to Karl Conley
Network:1634



I worked in this type of environment before Agile existed, and we used Barry Boehm's Spiral life cycle. We used kanban because it is a long-established technique that has been used in manufacturing and inventory control for a long time. Remember that Agile is not a life cycle. Returning to waterfall: most people confuse waterfall with sequential, but in waterfall you have feedback loops. So it is about finding the life cycle that best fits your environment.
