Services and endpoints
In addition to relying heavily on a Schema construct, which enables abstracting over serialisation, Smithy4s uses abstractions to codify the notion of interface, to allow for interoperability with various communication protocols. The idea is to reason generically about things of this shape:
trait Interface[Context[_]]{
def operation1(a: A, b: B): Context[Output1]
def operation2(c: C, d: D, e: E): Context[Output2]
}
This generalisation enables the easy interpretation of implementations of such interfaces into services (HTTP, RPC, etc), or conversely, the derivation of stub instances of these interfaces to talk to remote services.
The creation of an abstraction that allows for this generalisation is a problem similar to the one that led to the Schema construct: one needs to deconstruct the notion of "interface" into fundamental building blocks.
The duality of final and initial algebras
Before we dive into the core of the solution, one notion that is drastically helpful is the duality between finally-encoded algebras and initially-encoded algebras.
Finally-encoded algebras are object-oriented encodings of a set of operations, just like above: operations are represented as methods in an interface. Interpretation of expressions written in terms of these methods does not involve any runtime transformation from one context to another: the method call is merely executed. In other words, when they are executed, expressions coming from finally-encoded algebras are already in their "final form".
Conversely, initially-encoded algebras represent expressions as data, implying that interpretation involves a transformation of this data into lower level method calls. However, data has the quality of being a first class construct in programming languages, meaning you can pass it around and use it as parameter to functions. This allows for the unification of code-paths, as the differences between some aspects of a bit of logic can be absorbed by the data and handled later on.
Finally-encoded KVStore algebra :
trait KVStore[Context[_]]{
def put(key: String, value: String): Context[Unit]
def get(key: String) : Context[Option[String]]
def delete(key: String) : Context[Unit]
}
Initially-encoded KVStore algebra :
sealed trait KVStoreOp[Output]
object KVStoreOp {
case class Put(key: String, value: String) extends KVStoreOp[Unit]
case class Get(key: String) extends KVStoreOp[Option[String]]
case class Delete(key: String) extends KVStoreOp[Unit]
}
These two encodings contain a similar amount of information. It is nearly trivial to go from a KVStore[Context] instance to a KVStoreOp ~> Context polymorphic function (natural-transformation), and vice versa:
trait ~>[F[_], G[_]]{
def apply[A](fa: F[A]): G[A]
}
def asNaturalTransformation[Context[_]](impl: KVStore[Context]) = new (KVStoreOp ~> Context){
def apply[A](fa: KVStoreOp[A]): Context[A] = fa match {
case KVStoreOp.Put(key, value) => impl.put(key, value)
case KVStoreOp.Get(key) => impl.get(key)
case KVStoreOp.Delete(key) => impl.delete(key)
}
}
def fromNaturalTransformation[Context[_]](run: KVStoreOp ~> Context) = new KVStore[Context]{
def put(key: String, value: String) = run(KVStoreOp.Put(key, value))
def get(key: String) = run(KVStoreOp.Get(key))
def delete(key: String) = run(KVStoreOp.Delete(key))
}
This duality is heavily used by Smithy4s: finally-encoded interfaces are generally more natural to Scala developers, and are better supported in editors (autocompletion, etc). But from an implementation's perspective, the initial, data-based encoding is really interesting, because operations are reified as data-types that can be associated with instances of generic type-classes: it is possible to abstract over data, whereas it is not possible to abstract over method calls.
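As a minimal sketch of what "abstracting over data" buys us, the snippet below repeats the definitions above and adds a logging layer that works for every operation at once. The names Id, InMemory and logged are hypothetical and exist only for this illustration; they are not part of Smithy4s.

```scala
import scala.collection.mutable

object DualityExample {
  // Same shapes as the definitions above, repeated so the sketch is self-contained
  trait ~>[F[_], G[_]] { def apply[A](fa: F[A]): G[A] }

  trait KVStore[Context[_]] {
    def put(key: String, value: String): Context[Unit]
    def get(key: String): Context[Option[String]]
    def delete(key: String): Context[Unit]
  }

  sealed trait KVStoreOp[Output]
  object KVStoreOp {
    final case class Put(key: String, value: String) extends KVStoreOp[Unit]
    final case class Get(key: String) extends KVStoreOp[Option[String]]
    final case class Delete(key: String) extends KVStoreOp[Unit]
  }

  def asNaturalTransformation[Context[_]](impl: KVStore[Context]): KVStoreOp ~> Context =
    new (KVStoreOp ~> Context) {
      def apply[A](fa: KVStoreOp[A]): Context[A] = fa match {
        case KVStoreOp.Put(key, value) => impl.put(key, value)
        case KVStoreOp.Get(key)        => impl.get(key)
        case KVStoreOp.Delete(key)     => impl.delete(key)
      }
    }

  def fromNaturalTransformation[Context[_]](run: KVStoreOp ~> Context): KVStore[Context] =
    new KVStore[Context] {
      def put(key: String, value: String): Context[Unit] = run(KVStoreOp.Put(key, value))
      def get(key: String): Context[Option[String]] = run(KVStoreOp.Get(key))
      def delete(key: String): Context[Unit] = run(KVStoreOp.Delete(key))
    }

  // Hypothetical context that runs operations directly, without any effect type
  type Id[A] = A

  // Hypothetical in-memory implementation, written against the final encoding
  final class InMemory extends KVStore[Id] {
    private val data = mutable.Map.empty[String, String]
    def put(key: String, value: String): Unit = data.update(key, value)
    def get(key: String): Option[String] = data.get(key)
    def delete(key: String): Unit = { data.remove(key); () }
  }

  // Generic middleware, written once for all operations.
  // This is only possible because operations are plain values.
  def logged[Context[_]](run: KVStoreOp ~> Context, log: mutable.Buffer[String]): KVStoreOp ~> Context =
    new (KVStoreOp ~> Context) {
      def apply[A](fa: KVStoreOp[A]): Context[A] = {
        log += fa.toString
        run(fa)
      }
    }

  val log: mutable.Buffer[String] = mutable.ListBuffer.empty[String]

  // final encoding -> initial encoding -> logging -> final encoding again
  val store: KVStore[Id] =
    fromNaturalTransformation[Id](logged[Id](asNaturalTransformation[Id](new InMemory), log))
}
```

The user still calls store.put and store.get as ordinary methods, while the logging logic never needs to know how many operations the interface has.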
A detour around kinds
The methods generated by Smithy4s are conceptually similar to the methods expressed in the example above, except that the output types are significantly more verbose.
trait Interface[Context[_, _, _, _, _]]{
def operation1(a: A, b: B): Context[Input, Error, Output, StreamedInput, StreamedOutput]
}
Let's address this awkwardness right away, by explaining the rationale behind this seemingly humongous signature:
Input
It's the input type of an operation. Typically, a case class that holds fields matching the method parameters. We keep track of it in the return type for several reasons:
- In the internal logic of Smithy4s, it prevents having to prematurely shoe-horn kinds into other kinds by means of injection/projection, which helps both the implementor and the compiler
- It will come in handy for the implementation of some pagination-aware interpreters, as pagination typically works by performing a modification of the previous input in order to get the next batch (page) of results. This implies that the input (and therefore its type) must be tracked across several requests resulting from a single method call.
Error
The execution of an operation can result in errors. The Smithy language allows for tying a list of errors to operations. When generating the associated code, Smithy4s synthesizes a union. This allows the coproduct of errors associated to an operation to be represented as a bona fide Scala type, which we can abstract over via some type-class instance. This is also very useful for writing bi-functor interpreters, for users who are interested in this kind of UX.
Output
No surprise there: this is the data resulting from the run of the operation.
StreamedInput, StreamedOutput
Smithy supports the concept of streaming. It is communicated as a trait that annotates a single field of the input shape and/or output shape of an operation. Scala does not have a "standard" way of expressing streaming semantics. Moreover, streaming constructs in Scala are heavily context-dependent. It is therefore impossible for us to incorporate the concept of "streaming" into our Schema construct, as it is meant to be context-free and third-party-free.
To get some intuition for why that is: say we want to express streaming using fs2. If we naively generate a case class that has one of its fields annotated with @streaming, it means that the field is of type fs2.Stream[F, A], which means that we either need to make a decision on what the F is, which is not okay as it would tie the generated code to a single effect type, or we need to propagate the F[_] type parameter upward to the case class. Now our Schema value, which accompanies the case class, also has to carry the F, and this propagates throughout the whole codebase. We deemed that unacceptable.
Rather than polluting all layers of abstraction, we decided to have only the concept of operation be impacted, and to hold the streamed type in a separate type parameter. This allows for interpreters from various ecosystems to emerge. It also has the quality of allowing users to access the unary component of outputs (i.e. data that is communicated in the headers of HTTP responses) without necessarily allocating resources to consume the streamed component of the output.
NB: at the time of writing, Smithy4s does not have any streaming-aware interpreter implemented. However, streaming is such a fundamental notion in remote interactions that we had to devise a plan to ensure that third parties could implement such interpreters without waiting.
Transformation
Because of the complex kinds we're dealing with, we codify a polymorphic function (natural-transformation), called smithy4s.kinds.PolyFunction5, that allows us to work at this level:
trait PolyFunction5[F[_, _, _, _, _], G[_, _, _, _, _]] {
def apply[I, E, O, SI, SO](fa: F[I, E, O, SI, SO]): G[I, E, O, SI, SO]
}
This is a mouthful, but conceptually, it's exactly the same as our good old polymorphic function typically aliased to ~>.
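To make this concrete, here is a small sketch of a transformation written at this level. The trait is redefined locally so that the snippet stands alone, and the two contexts Attempt and Lenient are hypothetical, introduced only for the example.

```scala
object PolyFunction5Example {
  // Local copy of the shape shown above, so that the sketch is self-contained
  trait PolyFunction5[F[_, _, _, _, _], G[_, _, _, _, _]] {
    def apply[I, E, O, SI, SO](fa: F[I, E, O, SI, SO]): G[I, E, O, SI, SO]
  }

  // Two hypothetical five-parameter contexts. Only E and O carry data here,
  // the remaining parameters are phantom types that are merely tracked.
  final case class Attempt[I, E, O, SI, SO](result: Either[E, O])
  final case class Lenient[I, E, O, SI, SO](result: Option[O])

  // One implementation handles every possible combination of type parameters
  val dropErrors: PolyFunction5[Attempt, Lenient] =
    new PolyFunction5[Attempt, Lenient] {
      def apply[I, E, O, SI, SO](fa: Attempt[I, E, O, SI, SO]): Lenient[I, E, O, SI, SO] =
        Lenient[I, E, O, SI, SO](fa.result.toOption)
    }

  val success = Attempt[String, String, Int, Unit, Unit](Right(42))
  val failure = Attempt[String, String, Int, Unit, Unit](Left("boom"))
}
```

Just like a ~> instance, dropErrors changes the context while leaving all five type parameters untouched.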
Codifying the duality between initial and final algebras
What we want users to manipulate is the final-encoded version of a service: a good-old object-oriented interface that has decent editor support. But we need the initial-encoded version to implement interpreters in a generic fashion.
So we codify the duality to allow for switching from one to the other via an abstraction called smithy4s.Service, which is the entry point to all interpreters.
trait Service[Final[_[_, _, _, _, _]]] {
type Operation[_, _, _, _, _]
def toPolyFunction[F[_, _, _, _, _]](alg: Final[F]): PolyFunction5[Operation, F]
def fromPolyFunction[F[_, _, _, _, _]](polyFunction: PolyFunction5[Operation, F]): Final[F]
// ...
}
Implementations of such interfaces are typically code-generated. This implies that any Smithy Service shape gets translated as a finally-encoded interface, but also as an initially-encoded GADT.
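The following is a hand-written sketch of what such an implementation could look like for a KVStore reduced to a single get operation. It is not the actual generated code: the run method on operations, the OutputOnly context and the in-memory impl are assumptions made for the illustration.

```scala
object ServiceSketch {
  trait PolyFunction5[F[_, _, _, _, _], G[_, _, _, _, _]] {
    def apply[I, E, O, SI, SO](fa: F[I, E, O, SI, SO]): G[I, E, O, SI, SO]
  }

  trait Service[Final[_[_, _, _, _, _]]] {
    type Operation[_, _, _, _, _]
    def toPolyFunction[F[_, _, _, _, _]](alg: Final[F]): PolyFunction5[Operation, F]
    def fromPolyFunction[F[_, _, _, _, _]](pf: PolyFunction5[Operation, F]): Final[F]
  }

  final case class GetInput(key: String)
  final case class GetOutput(value: Option[String])

  // Finally-encoded interface
  trait KVStore[F[_, _, _, _, _]] {
    def get(key: String): F[GetInput, Nothing, GetOutput, Nothing, Nothing]
  }

  // Initially-encoded GADT. Each member knows how to call its own method
  // (hypothetical `run` helper, used here instead of pattern matching).
  sealed trait KVStoreOp[I, E, O, SI, SO] {
    def run[F[_, _, _, _, _]](impl: KVStore[F]): F[I, E, O, SI, SO]
  }
  object KVStoreOp {
    final case class Get(input: GetInput)
        extends KVStoreOp[GetInput, Nothing, GetOutput, Nothing, Nothing] {
      def run[F[_, _, _, _, _]](impl: KVStore[F]): F[GetInput, Nothing, GetOutput, Nothing, Nothing] =
        impl.get(input.key)
    }
  }

  object KVStoreService extends Service[KVStore] {
    type Operation[I, E, O, SI, SO] = KVStoreOp[I, E, O, SI, SO]

    // final -> initial: interpret a reified operation by calling the interface
    def toPolyFunction[F[_, _, _, _, _]](alg: KVStore[F]): PolyFunction5[Operation, F] =
      new PolyFunction5[Operation, F] {
        def apply[I, E, O, SI, SO](op: Operation[I, E, O, SI, SO]): F[I, E, O, SI, SO] =
          op.run[F](alg)
      }

    // initial -> final: each method builds the matching GADT member
    def fromPolyFunction[F[_, _, _, _, _]](pf: PolyFunction5[Operation, F]): KVStore[F] =
      new KVStore[F] {
        def get(key: String): F[GetInput, Nothing, GetOutput, Nothing, Nothing] =
          pf.apply[GetInput, Nothing, GetOutput, Nothing, Nothing](KVStoreOp.Get(GetInput(key)))
      }
  }

  // Hypothetical context that only keeps the output of an operation
  type OutputOnly[I, E, O, SI, SO] = O

  val impl: KVStore[OutputOnly] = new KVStore[OutputOnly] {
    private val data = Map("a" -> "1")
    def get(key: String): GetOutput = GetOutput(data.get(key))
  }

  // Going through both directions yields an equivalent interface
  val roundTripped: KVStore[OutputOnly] =
    KVStoreService.fromPolyFunction[OutputOnly](KVStoreService.toPolyFunction[OutputOnly](impl))
}
```

An interpreter only ever needs a Service instance: it can obtain a PolyFunction5 from a user implementation (server side), or hand one over to get an interface back (client side).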
The high-level philosophy of Smithy4s
The goal of Smithy4s is to allow users to derive client stubs and routers in various protocols, by running the generated code (or instances of generated interfaces) in some one-liner functions. To that end, Smithy4s surfaces a number of abstractions (such as smithy4s.schema.Schema) that allow for the implementation of (very) polymorphic interpreters. These interpreters operate on the generated code, which reflects what the user defines in their Smithy specs.
The abstractions used by interpreters contain all the elements that allow for turning a high-level method call (from an interface generated by Smithy4s) into a low level request of some sort, and then transform a low level response into the output of the method call.
Logical flow: client-side
Conceptually, to derive a high-level client that uses some sort of Request => Response protocol, the implementation has to follow a sequence of steps:
- Assuming this method call: kvstore.get("key")
- Turning the method call into a piece of data, KVStoreOp.Get("key"), using the initially-encoded dual of the KVStore interface
- Retrieving the Smithy4s Schemas (input and output) associated to the Get operation
- Running Get through some function that gives us its GetInput representation
- Compiling the schema associated to the input of the Get operation into some encoding function, GetInput => Request, and running the input through it
- Running the request through a low-level Request => Response function (like an HTTP client)
- Compiling the schema associated to the output (GetOutput ~= Option[String]) of the Get operation into some decoding function, Response => GetOutput, and running the response through it
So we get kvstore.get => KVStoreOp.Get => GetInput => Request => Response => GetOutput, which gives us the full data flow, client side.
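The client-side flow can be sketched in a few lines. Everything here is a simplification: Request and Response are hypothetical stand-ins for a real protocol, and encodeGet and decodeGet are written by hand where Smithy4s would compile them from schemas.

```scala
object ClientFlowSketch {
  // Hypothetical low-level protocol types
  final case class Request(path: String)
  final case class Response(status: Int, body: Option[String])

  final case class GetInput(key: String)
  final case class GetOutput(value: Option[String])

  // Simplified initial encoding, tracking only input and output types
  sealed trait KVStoreOp[I, O] { def input: I }
  final case class Get(input: GetInput) extends KVStoreOp[GetInput, GetOutput]

  // Hand-written stand-ins for the codecs compiled from the schemas
  val encodeGet: GetInput => Request = in => Request("/resource/" + in.key)
  val decodeGet: Response => GetOutput = res => GetOutput(res.body)

  // Derives the `get` method from a low-level Request => Response function
  def get(send: Request => Response): String => Option[String] = key => {
    val op = Get(GetInput(key))        // kvstore.get => KVStoreOp.Get
    val input = op.input               // KVStoreOp.Get => GetInput
    val request = encodeGet(input)     // GetInput => Request
    val response = send(request)       // Request => Response
    decodeGet(response).value          // Response => GetOutput
  }

  // Hypothetical fake transport, useful to exercise the flow without a network
  val fakeTransport: Request => Response = req =>
    if (req.path == "/resource/a") Response(200, Some("1")) else Response(404, None)
}
```

With this in place, ClientFlowSketch.get(ClientFlowSketch.fakeTransport) behaves like the get method of a KVStore stub.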
Logical flow: server-side
The server side is different in that we want to derive the Request => Response function from an instance of our interface (KVStore). The goal is to mechanically translate a request into a method call, and a method's output into a response. The sequence:
- From a given Request, find the corresponding operation Op (for instance, by means of HTTP path). Let's assume it's the get operation
- Retrieve the Smithy4s Schemas (input and output) associated to the operation (KVStoreOp.Get)
- Compile a Request => GetInput decoding function, and run the Request through it
- From GetInput, recreate the KVStoreOp.Get instance
- From KVStoreOp.Get, use the final-encoded dual of KVStoreOp to call the KVStore#get method (implemented by the user). This gets us a GetOutput
- Compile a GetOutput => Response encoding function from the schemas, and run the output through it
So we get Request => GetInput => KVStoreOp.Get => kvstore.get => GetOutput => Response, which gives us the full data flow, server side.
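The server-side flow can be sketched in the same spirit. Again, Request, Response and the two codecs are hypothetical, hand-written stand-ins; in particular, always answering with status 200 is an arbitrary choice made to keep the example short.

```scala
object ServerFlowSketch {
  // Hypothetical low-level protocol types
  final case class Request(method: String, path: String)
  final case class Response(status: Int, body: String)

  final case class GetInput(key: String)
  final case class GetOutput(value: Option[String])

  // Simplified interface implemented by the user (no Context, for brevity)
  trait KVStore { def get(key: String): GetOutput }

  sealed trait KVStoreOp
  final case class Get(input: GetInput) extends KVStoreOp

  private val prefix = "/resource/"

  // Stand-in for the decoder compiled from the input schema.
  // It also performs the matching that finds the operation.
  val decodeGet: Request => Option[GetInput] = req =>
    if (req.method == "GET" && req.path.startsWith(prefix))
      Some(GetInput(req.path.stripPrefix(prefix)))
    else None

  // Stand-in for the encoder compiled from the output schema
  val encodeGet: GetOutput => Response = out => Response(200, out.value.getOrElse(""))

  // Derives a router from a user-provided implementation
  def router(impl: KVStore): Request => Option[Response] = request =>
    decodeGet(request)                      // Request => GetInput
      .map(Get(_))                          // GetInput => KVStoreOp.Get
      .map(op => impl.get(op.input.key))    // KVStoreOp.Get => kvstore.get => GetOutput
      .map(encodeGet)                       // GetOutput => Response
}
```

Returning an Option lets the caller fall back to another router, or to a "not found" response, when no operation matches.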
Both the service-side and client-side logical flows guide the design of the abstractions that are exposed by Smithy4s.
A note about efficiency
The flows described above are merely conceptual, and do not account for the optimisations involved to ensure that schemas are not recompiled into codecs on a per-request basis (which would greatly impact performance). Interpreters provided by Smithy4s (HTTP and co) are written to ensure that all compilation is performed ahead of receiving requests, by means of preliminary computations and caching.
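The difference between compiling per request and compiling ahead of time can be illustrated with a toy model. The compile function below is a hypothetical stand-in for an expensive schema-to-codec compilation, and the counter exists only to make the difference observable.

```scala
object PrecompileSketch {
  // Counts how many times a "schema" gets compiled, for demonstration only
  var compilations: Int = 0

  // Hypothetical stand-in for an expensive Schema => codec compilation
  def compile(schemaName: String): String => String = {
    compilations += 1
    input => s"$schemaName($input)"
  }

  // Naive handler: the codec is compiled again on every request
  def naiveHandler(): String => String =
    input => compile("GetInput")(input)

  // Precompiled handler: the codec is built once, when the handler is
  // created, and then captured by the returned function
  def precompiledHandler(): String => String = {
    val encode = compile("GetInput")
    input => encode(input)
  }
}
```

Calling the precompiled handler any number of times triggers a single compilation, whereas the naive handler triggers one per call.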
The Endpoint abstraction
The smithy4s.Endpoint abstraction ties a specific operation to the various schemas associated with it.
trait Endpoint[Op[_, _, _, _, _], I, E, O, SI, SO] {
def schema: OperationSchema[I, E, O, SI, SO]
def wrap(input: I): Op[I, E, O, SI, SO]
}
where smithy4s.schema.OperationSchema is a product of all schemas involved in a specific operation.
final case class OperationSchema[I, E, O, SI, SO](
id: ShapeId,
hints: Hints,
input: Schema[I],
error: Option[ErrorSchema[E]],
output: Schema[O],
streamedInput: Option[StreamingSchema[SI]],
streamedOutput: Option[StreamingSchema[SO]]
) {
// ...
}
Endpoints are not type-classes. Instead, an Endpoint instance is provided by the companion object of each member of the GADT forming the initial-encoding of the service interface. So, going back to our KVStore, the corresponding sealed-trait would look like this:
sealed trait KVStoreOp[Input, Error, Output, StreamedInput, StreamedOutput]
and the put operation would look like:
case class Put(input: PutRequest) extends KVStoreOp[PutRequest, PutError, PutResult, Nothing, Nothing]
object Put extends Endpoint[KVStoreOp, PutRequest, PutError, PutResult, Nothing, Nothing] {
// all the schemas tied to the operation are gathered in a single OperationSchema value
val schema: OperationSchema[PutRequest, PutError, PutResult, Nothing, Nothing] =
Schema.operation(ShapeId("namespace", "Put"))
.withInput(PutRequest.schema)
.withError(PutError.errorSchema)
.withOutput(PutResult.schema)
def wrap(input: PutRequest) = Put(input)
}
A note on errors
As stated previously, Smithy4s generates a coproduct type for each operation, where the members of the coproduct point to the various errors listed in the Smithy operation shape. Additionally, each structure annotated with @error in Smithy is rendered as a case class that extends Throwable, because Throwables are the de facto standard for error handling on the JVM. Even libraries that use Either to perform error handling often represent the left-hand side of the Either as some throwable type, to facilitate the absorption of errors into the error channels of monadic constructs (IO.raiseError, etc.).
As a result, it is important for Smithy4s to expose functions that generically enable the filtering of throwables against the Error type parameter of an operation, so that interpreters can intercept errors and apply the correct encoding (dictated via Schema) before communicating them back to the caller over the wire. Conversely, it is important to expose a function that allows going from the generic Error type parameter to Throwable, so that errors received via low-level communication channels can be turned into Throwable at the client call site, in order to populate the relevant error channel when exposing mono-functor semantics.
Therefore, when a Smithy operation has errors defined, the corresponding smithy4s.schema.OperationSchema references a smithy4s.schema.ErrorSchema, which looks like this:
case class ErrorSchema[E] private[smithy4s] (
schema: Schema[E],
liftError: Throwable => Option[E],
unliftError: E => Throwable
)
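As an illustration, here is what the two functions could look like for a hypothetical Get operation that lists two errors. The error names and the shape of the GetError union are assumptions made for this sketch; the code actually generated by Smithy4s may differ.

```scala
object ErrorSketch {
  // Hypothetical structures annotated with @error, rendered as Throwables
  final case class KeyNotFound(key: String) extends Throwable
  final case class StoreUnavailable(reason: String) extends Throwable

  // Hypothetical union synthesized for the errors of the operation
  sealed trait GetError
  object GetError {
    final case class KeyNotFoundCase(error: KeyNotFound) extends GetError
    final case class StoreUnavailableCase(error: StoreUnavailable) extends GetError
  }

  // Filters arbitrary throwables against the errors known to the operation.
  // Unknown throwables yield None and are left for the interpreter to handle.
  val liftError: Throwable => Option[GetError] = {
    case e: KeyNotFound      => Some(GetError.KeyNotFoundCase(e))
    case e: StoreUnavailable => Some(GetError.StoreUnavailableCase(e))
    case _                   => None
  }

  // Goes back from the union to a Throwable, e.g. to raise it client side
  val unliftError: GetError => Throwable = {
    case GetError.KeyNotFoundCase(e)      => e
    case GetError.StoreUnavailableCase(e) => e
  }
}
```

A server-side interpreter would typically call liftError on a failure raised by the user's implementation, and encode the result with the error schema when it is defined.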
Services and endpoints
In order to implement any server-side interpreter, it is required to have a list of endpoints. That list is used to implement some matching logic based on the shapeId and/or the hints associated to the endpoints, in order to deterministically route a low-level Request to a specific Endpoint instance.
For instance, Smithy provides an @http trait out of the box that can annotate operations:
service KVStore {
operations: [Get, Put]
}
@http(method: "GET", uri: "/resource/{key}", code: 200)
operation Get {
input: GetInput
output: GetOutput
}
structure GetInput {
@httpLabel
key: String
}
structure GetOutput {
value: String
}
@http(method: "PUT", uri: "/resource/{key}", code: 200)
operation Put {
input: PutInput
}
structure PutInput {
@httpLabel
key: String,
@httpPayload
value: String
}
Each @http occurrence gets translated to a Scala value in the Hints associated to the corresponding endpoint.
- On the server side, having a list of all the endpoints associated to a service allows for creating a routing logic that dispatches an HTTP request to the correct endpoint.
- On the client side, a method call to a service stub gets translated to an instance of the corresponding GADT member. From there, we have to retrieve the schemas associated to the member in question. Additionally, we need to extract the input value out of the member, to run it through an encoder derived from the associated Schema.
Therefore, the Service abstraction needs to be enriched with the following methods:
trait Service[Final[_[_, _, _, _, _]]] {
type Operation[_, _, _, _, _]
// ...
// useful for server-side
def endpoints: IndexedSeq[Endpoint[Operation, _, _, _, _, _]]
// useful for client-side
// provides the index of the endpoint associated to the operation
def ordinal[I, E, O, SI, SO](op: Operation[I, E, O, SI, SO]): Int
// extracts the input value out of a reified operation
def input[I, E, O, SI, SO](op: Operation[I, E, O, SI, SO]): I
}
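The way these methods might be used can be sketched with a deliberately simplified model, in which operations only track two type parameters and the @http hint is reduced to a method and a path prefix. HttpHint, Endpoint, route and endpointOf are hypothetical names for this illustration, not the Smithy4s API.

```scala
object EndpointsSketch {
  // Hypothetical, simplified rendition of the @http trait
  final case class HttpHint(method: String, pathPrefix: String)

  // Simplified endpoint: a name and the hint used for routing
  final case class Endpoint(name: String, http: HttpHint)

  final case class GetInput(key: String)
  final case class PutInput(key: String, value: String)

  // Simplified initial encoding: each member knows its input and its index
  sealed trait Op[I, O] {
    def input: I
    def ordinal: Int
  }
  final case class Get(input: GetInput) extends Op[GetInput, Option[String]] {
    val ordinal = 0
  }
  final case class Put(input: PutInput) extends Op[PutInput, Unit] {
    val ordinal = 1
  }

  // Positions in this sequence match the ordinals above
  val endpoints: IndexedSeq[Endpoint] = Vector(
    Endpoint("Get", HttpHint("GET", "/resource/")),
    Endpoint("Put", HttpHint("PUT", "/resource/"))
  )

  // Server side: find the endpoint matching a low-level request
  def route(method: String, path: String): Option[Endpoint] =
    endpoints.find(e => e.http.method == method && path.startsWith(e.http.pathPrefix))

  // Client side: constant-time lookup of the endpoint tied to an operation
  def endpointOf(op: Op[_, _]): Endpoint = endpoints(op.ordinal)
}
```

Indexing by ordinal avoids a linear search on every client call, which is presumably why the endpoints are exposed as an IndexedSeq.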
Conclusion and complete interfaces
Here are links to the complete interfaces discussed in this chapter.