Update: See this follow-up post for a more elegant style that still provides the benefits of nesting over chaining.
So I've run across a modeling problem where the "proper functional way" is unsatisfactory. I experimented for several days with alternatives. In the end, the Pyramid of Doom prevailed. Since F# is whitespace-formatted, I suppose they are more like "Steps of Doom". This is a command handling pipeline for an API.
let run store request (readJsonFunc:ReadJsonFunc) at =
    let readJson = fun _ -> readJsonFunc.Invoke() |> Async.AwaitTask
    let logErr = tee (ErrorEncountered >> log) >> Responder.toErrorResponse
    let logResponse = flip tuple DateTimeOffset.Now >> ResponseCreated >> log
    async {
        log <| RequestStarted (request, at)
        match getHandler request.Path with
        | Error err ->
            return logErr err
        | Ok (handlerPath, handle) ->

        log <| HandlerFound handlerPath
        match getUser request.User with
        | Error err ->
            return logErr err
        | Ok user ->

        let tenantId = getClaimOrEmpty Constants.TenantClaimKey user
        let userId = getClaimOrEmpty Constants.UserIdKey user
        log <| UserFound (tenantId, userId)
        match authorize handlerPath user with
        | Error err ->
            return logErr err
        | Ok claim ->

        log <| OperationAuthorized claim
        let! jsonResult = getJson readJson ()
        match jsonResult with
        | Error err ->
            return logErr err
        | Ok json ->

        log <| JsonLoaded json
        match deserializeMeta json with
        | Error err ->
            return logErr err
        | Ok meta ->

        log <| RequestMetaDeserialized meta
        match checkTenancy user meta.TargetId with
        | Error err ->
            return logErr err
        | Ok x ->

        log <| TenancyValidated x
        // TODO page result from event store
        let! loadResult = loadEvents store meta.TargetId
        match loadResult with
        | Error err ->
            return logErr err
        | Ok slice ->

        log <| EventsLoaded (slice.FromEventNumber, slice.NextEventNumber, slice.LastEventNumber)
        match checkConcurrency meta.Version slice.LastEventNumber with
        | Error err ->
            return logErr err
        | Ok version ->

        log <| RequestVersionMatched version
        match deserializeSlice slice with
        | Error err ->
            return logErr err
        | Ok inEvents ->

        log <| EventsDeserialized inEvents
        let! outEventsResult = runHandler (meta:RequestMeta) (request:Request) user json handle inEvents
        match outEventsResult with
        | Error err ->
            return logErr err
        | Ok outEvents ->

        log <| HandlerFinished outEvents
        match outEvents with
        | [] ->
            log <| NoEventsToSave
            return Responder.noEventResponse ()
        | _ ->

        let eventMeta = createEventMeta tenantId userId
        match serializeEvents meta.TargetId meta.Version meta.CommandId request.CorrelationId eventMeta outEvents with
        | Error err ->
            return logErr err
        | Ok eventDatas -> // bad grammar for clarity!

        log <| EventsSerialized
        let! eventSaveResult = save store meta.TargetId meta.Version eventDatas
        match eventSaveResult with
        | Error err ->
            return logErr err
        | Ok write ->

        log <| EventsSaved (write.LogPosition.PreparePosition, write.LogPosition.CommitPosition, write.NextExpectedVersion)
        return Responder.toEventResponse ()
    }
    |> Async.tee logResponse
    |> Async.StartAsTask
Ok, so the Steps of Doom are invisible here because F# does not require me to indent nested match statements. Only the return expression requires indentation. Maybe it's more of the Cliffs of Insanity.
Now before you reach for your pitchfork and torch, let me explain how I got there and why it may really be the best choice here. (Not in general, though.)
Let's talk about how to change this code. I can insert/remove/edit a step at the appropriate place in the chain (above another match expression), then fix or add affected variable references in the same function. That's it. Now let me describe or show some alternatives I've looked at.
The standard functional way to represent "callbacks" is with monads. In the code above, you can see that I'm already using Result (aka Either). But that alone is not sufficient, since I need to keep values from many previous successes. I also need to do some logging based on those values. And some of the calls are Async as well. So I would need some combination of Async, Result, and State. Even if I were interested in building such a franken-monad, it still leaves the problem of representing the pipeline's state as a single object. Consider what that state object might look like:
type PipelineState =
    { Request: Request
      ReadJson: unit -> Async<string>
      // everything else optional!
      HandlerPath: string option
      ...
      ...
      ...
      ...
      ...
    }
Working with this state object actually makes the pipeline harder to reason about. The user of this object can't be sure which properties are available at which step without looking at the code that updates it. That's quite brittle. Updating the pipeline requires updating this type as well as code using it.
You could eliminate the question of whether properties were available at a given moment by nesting states and/or creating types per pipeline step. But then you have a Pyramid of Doom based on Option (aka Maybe). Updating the code around this is also quite a chore with all the types involved.
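For concreteness, a per-step typing scheme might look like the following sketch. The type names (`AfterHandler`, `AfterUser`, `AfterAuth`) and field layout are hypothetical, not from my actual code; each step's output type proves which values are available, but every pipeline change now ripples through several type definitions:

```fsharp
// Hypothetical per-step state types. Each one makes "what is available
// after step N" explicit in the type system.
type AfterHandler = { Request: Request; HandlerPath: string; Handle: Handler }
type AfterUser    = { AfterHandler: AfterHandler; User: User }
type AfterAuth    = { AfterUser: AfterUser; Claim: Claim }
// Inserting a step in the middle means rethreading every type
// (and every function consuming those types) downstream of it.
```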
Instead of keeping a pipeline state object, you could use an ever-growing tuple as the state. This would make it easier to tell what was available at what step. However, this has a very large downside when you go to change the pipeline. Anytime you modify a pipeline step and its corresponding value, you have to modify it in all subsequent steps. This gets quite tedious.
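Here is roughly what the growing-tuple style looks like, as a sketch with made-up step functions rather than my real pipeline:

```fsharp
// Hypothetical sketch of the ever-growing tuple. Each step receives the
// whole tuple so far and returns a wider one on success.
let step1 request =
    getHandler request.Path
    |> Result.map (fun (path, handle) -> (request, path, handle))

let step2 (request, path, handle) =
    getUser request.User
    |> Result.map (fun user -> (request, path, handle, user))

// Adding a value to step1's output forces edits to the patterns of
// step2, step3, and every step after them.
```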
I tried something similar to the tuple method, but with a list of events instead. I was basically trying to apply Event Sourcing to make the events both the log and the source of truth for the pipeline. I quickly realized updating state wasn't going to work out, so I used pattern matching on lists to get previous values as needed. However, it suffered from the same problem as the growing tuple method, plus it allowed for unrepresentable states (unexpected sequences in the list). This is pulled from an older version with slightly different pipeline steps.
let step events =
    match events with
    | EventsSaved _ :: _ ->
        createResponse Responder.toEventResponse ()
    | NoEventsToSave :: _ ->
        createResponse Responder.noEventResponse ()
    | ErrorEncountered error :: _ ->
        createResponse Responder.toErrorResponse error
    | [ RequestStarted (request, _, _) ] ->
        Async.lift getHandler request.Path
    | [ HandlerFound _; RequestStarted (request, _, _) ] ->
        Async.lift getUser request.User
    | [ ClaimsLoaded user; HandlerFound (handlerPath, _); _ ] ->
        Async.lift2 authorize handlerPath user
    | [ RequestAuthorized _; _; _; RequestStarted (_, readJson, _) ] ->
        getJson readJson ()
    | [ JsonLoaded json; _; _; _; _ ] ->
        Async.lift deserializeMeta json
    | [ RequestMetaDeserialized meta; _; _; _; _; _ ] ->
        loadEvents meta.TargetId
    | [ EventsLoaded es; RequestMetaDeserialized meta; _; _; _; _; _ ] ->
        Async.lift2 checkConcurrency meta.Version <| List.length es
    | [ ConcurrencyOkay _; EventsLoaded es; RequestMetaDeserialized meta; JsonLoaded json; _; ClaimsLoaded user; HandlerFound (_, handle); RequestStarted (request, _, _) ] ->
        runHandler meta request user json handle es
    | [ HandlerFinished es; _; _; _; _; ClaimsLoaded user; _; RequestStarted (request, _, _) ] ->
        save request user es
    | _ ->
        Async.lift unexpectedSequence events
As you can tell, the above has a number of readability issues in addition to the challenges listed.
So probably the most compelling alternative would be to create one large function which takes in all the values and uses Retn + Apply to build up the function until it's ready to run. A sketch of it might look like this:
retn (firstStepGroupFn request readJson at)
|> apply (getHandler request.Path |> Result.tee logHandler)
|> apply (getUser request.User |> Result.tee logUser)
|> apply (authorize handlerPath user |> Result.tee logAuthorize)
|> bind (
    retn (secondStepGroupFn ...)
    >> apply ...
)
We get into a little bit of Type Tetris here, because some results are Async but others are not. So all of the individual steps will have to be lifted to AsyncResult if they weren't already.
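For reference, the lifting boilerplate is small, but it has to live somewhere. A minimal AsyncResult sketch might look like this (the type alias and combinator names are my assumption of how one would write it by hand, not an existing library):

```fsharp
// Minimal AsyncResult sketch: Async<Result<'a,'e>> plus the usual combinators.
type AsyncResult<'a,'e> = Async<Result<'a,'e>>

module AsyncResult =
    // wrap a plain value in the Ok/async context
    let retn (x: 'a) : AsyncResult<'a,'e> = async { return Ok x }
    // lift a synchronous Result into AsyncResult
    let ofResult (r: Result<'a,'e>) : AsyncResult<'a,'e> = async { return r }
    // sequence two steps, short-circuiting on the first Error
    let bind (f: 'a -> AsyncResult<'b,'e>) (x: AsyncResult<'a,'e>) : AsyncResult<'b,'e> =
        async {
            let! r = x
            match r with
            | Ok v -> return! f v
            | Error e -> return Error e
        }
```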
This is what it would look like written out explicitly, but we could make it much nicer with adapter functions: some that do the lifting and tee off the logs, and some for the groups of steps (firstStepGroupFn, secondStepGroupFn, etc.). The step group functions are required because some steps need values computed from a combination of earlier ones.
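A lifting adapter might look like the following sketch. Every name here is hypothetical (`liftStep`, `LogEvent`); `Result.tee` and `log` are the helpers already in use above:

```fsharp
// Hypothetical adapter: run a Result-returning step, tee off a log event
// on success, and lift the outcome into Async so that every step in the
// pipeline ends up with the shape 'input -> Async<Result<_,_>>.
let liftStep (toLogEvent: 'a -> LogEvent) (step: 'i -> Result<'a,'e>) (input: 'i) =
    async { return step input |> Result.tee (toLogEvent >> log) }
```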
Changing this kind of structure is a challenge. For instance, we use the request data most of the way down the pipeline. So it would have to be passed into each successive function. If we later changed how we used that data, we would have to touch a number of functions. We may even have to change the way groups of steps are structured. The compiler will aid us there, but it's still tedious.
Reflecting on my Cliffs of Insanity solution... I think it's quite ugly, but it's also the simplest to understand and to change of the alternatives shown. It also makes sense that this solution could be the right answer because the edges of the system are where side effects happen. And indeed most of the code above is explicitly modeling all the ways that IO and interop can fail. For the core of an app where functions should be pure, this kind of solution would be a bad fit. But maybe, just maybe, it's good at the edges.
Of course, there may be a functional pattern that fits here that I just haven't considered. I will continue to learn and be on the lookout for better alternatives.