Generative Testing
In this post, we’re going to walk through building a generator for valid GraphQL schemas specified in the GraphQL Schema Language (sans Arguments). We will take a bottom-up approach, starting with primitive generators, composing these into higher-level generators, and those into yet higher-level generators, and so on (à la Stratified Design [1]), building our way up to the final goal: a single simple generator that provides a new, randomly generated valid GraphQL schema each time it is sampled. Along the way, we will pause to make sure that each primitive or composition of primitives produces correct values.
First, let’s require test.check.
cljs.user=> (require '[clojure.test.check.generators :as gen :include-macros true])
true
Now we can begin defining our primitive generators. For the typenames of our object types, let’s use the capital letters A-Z. gen/elements takes a collection of values and produces a generator that randomly chooses an element from the collection:
cljs.user=> (def a-to-z-gen (gen/elements (map char (range 65 (+ 65 26)))))
#'cljs.user/a-to-z-gen
To convince ourselves that this is working, let’s try sampling a few items using the function gen/sample. gen/sample takes two arguments: a generator and a number of samples to generate. (There is also a single-arity version which generates 10 samples by default.)
cljs.user=> (gen/sample a-to-z-gen 5)
("K" "M" "W" "G" "S")
Now that we can generate individual typenames for our objects, let’s build a generator that will produce a set of typenames given the first generator. Each of these sets will eventually specify the names of the types comprising an individual schema.
cljs.user=> (def entity-set-gen (gen/set a-to-z-gen {:min-elements 8 :max-elements 25}))
#'cljs.user/entity-set-gen
gen/set conveniently accepts the options :min-elements and :max-elements, which let us put bounds on the sizes of the generated sets.
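In the spirit of pausing to check each generator, the bounds can be verified directly. The following is a minimal sketch that repeats the two definitions above so that it runs on its own. The (comp str char) step is an addition, not part of the original definition; it keeps the names as strings on JVM Clojure as well as in ClojureScript.

```clojure
(require '[clojure.test.check.generators :as gen])

;; Same generators as above; (comp str char) is an added step so that the
;; names are strings on the JVM as well as in ClojureScript.
(def a-to-z-gen (gen/elements (map (comp str char) (range 65 (+ 65 26)))))
(def entity-set-gen (gen/set a-to-z-gen {:min-elements 8 :max-elements 25}))

;; Every sampled value should be a set holding between 8 and 25 names.
(assert (every? #(and (set? %) (<= 8 (count %) 25))
                (gen/sample entity-set-gen 100)))
```

If the assertion passes it returns nil, just like the multiplicity checks later in the post.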
Before sampling, we need sample-one, a small helper that simplifies the sampling of a single element:
cljs.user=> (defn sample-one [generator] (first (gen/sample generator 1)))
#'cljs.user/sample-one
Sampling a few of these sets gives:
cljs.user=> (sample-one entity-set-gen)
#{"K" "Q" "L" "G" "J" "M" "S" "Y" "H" "C" "B" "V" "U" "A" "W"}
cljs.user=> (sample-one entity-set-gen)
#{"T" "Q" "J" "M" "S" "Y" "E" "F" "A" "I" "D"}
cljs.user=> (sample-one entity-set-gen)
#{"Q" "J" "S" "Y" "C" "B" "V" "U" "O" "N" "I"}
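One caveat with sample-one: gen/sample walks through sizes starting at 0, so asking for a single sample always generates at the smallest size. This is harmless for generators that ignore size, such as gen/elements, but size-dependent generators will only ever show their smallest values. Here is a minimal sketch of the difference, using gen/generate (a development helper in test.check that accepts an explicit size) for comparison:

```clojure
(require '[clojure.test.check.generators :as gen])

(defn sample-one [generator] (first (gen/sample generator 1)))

;; An unbounded gen/vector picks its length from 0 up to the current size,
;; so at size 0 the only possible result is the empty vector.
(sample-one (gen/vector gen/boolean))

;; gen/generate takes the size explicitly (here 30), so the result can
;; contain anywhere from 0 to 30 booleans.
(gen/generate (gen/vector gen/boolean) 30)
```

For one-off exploration at the REPL, either approach works; the point is only that a single sample says little about what a generator produces at larger sizes.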
With a generator for complete sets of typenames for our schemas in hand, let us proceed to composing it with a generator of unique relations (pairs of related types/entities). Two types are considered related if at least one of them has a field whose type is the other. The relation may be bidirectional (mutual); when it is merely unidirectional, multiple fields may participate in it. [2] We will consider these restrictions more closely as we proceed but, for now, it is enough to consider two fairly obvious basic requirements: no relation may be repeated, and relations are unordered; i.e., [A B] is the same as [B A].
First, let’s define relation-gen:
cljs.user=> (defn relation-gen [entity-set]
              (gen/list-distinct (gen/elements entity-set) {:num-elements 2}))
#'cljs.user/relation-gen
And sample it a few times:
cljs.user=> (sample-one (relation-gen #{"A" "B" "C" "D"}))
("C" "D")
cljs.user=> (sample-one (relation-gen #{"A" "B" "C" "D"}))
("C" "A")
cljs.user=> (sample-one (relation-gen #{"A" "B" "C" "D"}))
("A" "D")
And then combine it with entity-set-gen:
cljs.user=>
(def entities-and-relations-gen
  (gen/let [entity-set entity-set-gen
            relations (gen/list-distinct-by set (relation-gen entity-set)
                        {:min-elements (int (* (count entity-set) (/ 2 3)))
                         :max-elements (int (* (count entity-set) 3))})]
    {:entities entity-set :relations relations}))
#'cljs.user/entities-and-relations-gen
The above form utilizes gen/let, which mirrors the semantics of the built-in let, except that generators are required on the right-hand side of each binding and the symbols on the left are bound to values generated by those generators. Also of note here is the use of gen/list-distinct-by, which is just like gen/list-distinct except that it accepts a key function to be applied before the equality checks (in our case, set).
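To see why set is a suitable key function for unordered relations, consider what it does to a handful of hand-written pairs. This is a plain Clojure illustration that needs no generators; the pairs are made-up examples.

```clojure
;; Three ordered pairs, two of which name the same unordered relation.
(def pairs '(("A" "B") ("B" "A") ("A" "C")))

;; Compared as plain lists, all three are distinct...
(count (distinct pairs))            ; => 3

;; ...but keyed by set, ("A" "B") and ("B" "A") collapse into one.
(count (distinct (map set pairs)))  ; => 2
```

Keying by set gives us both basic requirements at once: no repeats, and no distinction between [A B] and [B A].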
Let’s take a sample from this latest generator:
cljs.user=> (pprint (sample-one entities-and-relations-gen))
{:entities #{"T" "L" "Y" "F" "P" "O" "N" "W" "D"},
 :relations
 (("W" "T")
  ("Y" "N")
  ("P" "O")
  ("L" "T")
  ("F" "N")
  ("T" "N")
  ("O" "F")
  ("Y" "L")
  ("D" "T"))}
Each side of a relation may have a multiplicity of zero, one, or many, so let’s define a data structure representing these:
cljs.user=> (def multiplicities #{:zero :one :many})
#'cljs.user/multiplicities
And a generator for them:
cljs.user=> (def multiplicity-gen (gen/elements multiplicities))
#'cljs.user/multiplicity-gen
We will also require a non-zero multiplicity generator (i.e., either :one or :many):
cljs.user=> (require '[clojure.set])
nil
cljs.user=> (def non-zero-mult-gen (gen/elements (clojure.set/difference multiplicities #{:zero})))
#'cljs.user/non-zero-mult-gen
These work as expected:
cljs.user=> (assert (every? #(multiplicities %) (gen/sample multiplicity-gen 10000)))
nil
cljs.user=> (assert (every? #(#{:one :many} %) (gen/sample non-zero-mult-gen 10000)))
nil
Now we will define a Cardinality as two Multiplicities (one for each side of a relation):
cljs.user=> (defrecord Multiplicity [entity field multiplicity required?])
cljs.user/Multiplicity
cljs.user=> (defrecord Cardinality [left right])
cljs.user/Cardinality
Here is a wrapper around goog.string/format which we will use for string building:
cljs.user=> (require '[goog.string.format])
true
cljs.user=>
(defn format
  "Formats a string using goog.string.format."
  [fmt & args]
  (apply goog.string/format fmt args))
#'cljs.user/format
And a function that produces generators of Cardinalities from relations:
cljs.user=> (require '[clojure.string :refer [lower-case]])
nil
cljs.user=>
(defn relation->cardinalities-gen [[left right]]
  (gen/let [lmult multiplicity-gen
            repeats (gen/choose 1 5)]
    (let [;; Builds one Cardinality; %1 is the field name on the right side.
          create-card
          #(Cardinality.
             (Multiplicity. left (lower-case right)
                            lmult (sample-one gen/boolean))
             (Multiplicity. right %1 (sample-one non-zero-mult-gen)
                            (sample-one gen/boolean)))]
      (condp = lmult
        ;; Unidirectional: one to five indexed fields on the right side.
        :zero (map
                create-card
                (map #(format "%s%s" (lower-case left) %1) (range repeats)))
        ;; Otherwise: a single link with a field on each side.
        [(create-card (lower-case left))]))))
#'cljs.user/relation->cardinalities-gen
Let’s break that down a bit. First off, we generate a multiplicity for the left side of the relation and a number of repeats (which will only be used if the left side has a :zero multiplicity). Then we define a simple helper function for constructing a cardinality: create-card. Notice that the field names are merely the destination type names, lower-cased, optionally with an integer index appended (for multiple-link relations only). The right side’s multiplicity comes from non-zero-mult-gen, which simplifies downstream logic by reducing the number of cases we need to consider. Then, in both branches of the condp (i.e., when the left multiplicity is zero or non-zero), a sequence of Cardinality instances is produced.
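One thing worth noting about relation->cardinalities-gen is that it calls sample-one inside the generator, so part of the randomness comes from outside test.check’s own seed and size handling. The following is a sketch of an alternative, not the code used in this post: a hypothetical pure-cardinalities-gen that uses plain maps instead of records and draws every random value from a gen/let binding. The names pure-cardinalities-gen and card-parts-gen are invented for this illustration.

```clojure
(require '[clojure.test.check.generators :as gen :include-macros true]
         '[clojure.string :as string])

;; The random parts of one cardinality: the left side's required? flag,
;; then the right side's multiplicity and required? flag.
(def card-parts-gen
  (gen/tuple gen/boolean (gen/elements [:one :many]) gen/boolean))

(defn pure-cardinalities-gen
  "Hypothetical variant of relation->cardinalities-gen that draws every
  random value from a generator and returns plain maps."
  [[left right]]
  (gen/let [lmult   (gen/elements [:zero :one :many])
            repeats (gen/choose 1 5)
            ;; Later bindings may use earlier ones: `repeats` cardinalities
            ;; when the left side is :zero, exactly one otherwise.
            parts   (gen/vector card-parts-gen (if (= :zero lmult) repeats 1))]
    (map-indexed
      (fn [i [lreq rmult rreq]]
        {:left  {:entity left
                 :field (string/lower-case right)
                 :multiplicity lmult
                 :required? lreq}
         :right {:entity right
                 ;; The index suffix is only added for :zero relations.
                 :field (str (string/lower-case left) (when (= :zero lmult) i))
                 :multiplicity rmult
                 :required? rreq}})
      parts)))

;; Usage: one random sequence of cardinalities for the relation J-B.
(gen/generate (pure-cardinalities-gen ["J" "B"]))
```

The trade-off is a slightly longer binding vector in exchange for generators whose output depends only on what test.check passes in, which matters if the values are later used in properties that need to be replayed or shrunk.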
Next up we have a fairly straightforward generator to combine our entities, relations & cardinalities into a single structure:
cljs.user=>
(def entities-and-cardinalities-gen
  (gen/let [{:keys [entities relations]} entities-and-relations-gen
            cardinalities (apply gen/tuple (map relation->cardinalities-gen relations))]
    {:entities entities :relations relations :cardinalities (flatten cardinalities)}))
#'cljs.user/entities-and-cardinalities-gen
And here’s an abbreviated sample of its output:
{:entities
 #{"T" "Q" "L" "G" "J" "M" "S" "Z" "H" "R" "C" "F" "B" "V" "O" "X" "N"
   "A" "I" "W" "D"},
 :relations
 (("J" "B") ("N" "I") ("Z" "G") ("H" "R") ("Z" "V") ("G" "V") ...)
 :cardinalities
 ({:left
   {:entity "J", :field "b", :multiplicity :zero, :required? true},
   :right
   {:entity "B", :field "j0", :multiplicity :many, :required? true}}
  {:left
   {:entity "N", :field "i", :multiplicity :zero, :required? false},
   :right
   {:entity "I", :field "n0", :multiplicity :one, :required? false}}
  {:left
   {:entity "Z", :field "g", :multiplicity :zero, :required? false},
   :right
   {:entity "G", :field "z0", :multiplicity :one, :required? true}}
  {:left
   {:entity "J", :field "l", :multiplicity :one, :required? true},
   :right
   {:entity "L", :field "j", :multiplicity :many, :required? true}}
  {:left
   {:entity "H", :field "a", :multiplicity :many, :required? true},
   :right
   {:entity "A", :field "h", :multiplicity :one, :required? false}}
  ...)}
Now let’s define a record to represent the description of a field in a GraphQL entity specification:
cljs.user=> (defrecord FieldDescriptor [field type multiplicity required?])
cljs.user/FieldDescriptor
And a generator for a plain (i.e., scalar or non-object/non-reference) field descriptor:
cljs.user=> (require '[camel-snake-kebab.core :refer [->camelCase ->PascalCase]])
true
cljs.user=> (def scalar-types #{"Boolean" "String" "Float" "Timestamp" "Int"})
#'cljs.user/scalar-types
cljs.user=>
(defn field-gen [prefix]
  (let [nonempty-char-alpha-gen
        (gen/not-empty (gen/fmap clojure.string/join
                                 (gen/vector gen/char-alpha)))]
    (gen/fmap (partial apply ->FieldDescriptor)
              (gen/tuple (gen/fmap #(format "%s%s" prefix (->PascalCase %1))
                                   nonempty-char-alpha-gen)
                         (gen/elements scalar-types)
                         non-zero-mult-gen
                         gen/boolean))))
#'cljs.user/field-gen
And take a few samples [3]:
cljs.user=> (sample-one (field-gen "test"))
#cljs.user.FieldDescriptor{:field "testQk", :type "Boolean", :multiplicity :many, :required? true}
cljs.user=> (sample-one (field-gen "test"))
#cljs.user.FieldDescriptor{:field "testH", :type "Timestamp", :multiplicity :one, :required? true}
cljs.user=> (sample-one (field-gen "test"))
#cljs.user.FieldDescriptor{:field "testE", :type "Int", :multiplicity :one, :required? false}
Next we will define a generator for object fields:
cljs.user=>
(defn cardinality->obj-field-descriptor [entity cardinality]
  (let [;; Group the [:left m] and [:right m] entries by the entity they describe.
        grouped (group-by #(-> (%1 1) :entity) cardinality)
        ;; The Multiplicity on this entity's side of the relation.
        this (first (vals (into {} (grouped entity))))
        ;; The name of the entity on the opposite side.
        other-entity (:entity ((((first (dissoc grouped entity)) 1) 0) 1))]
    (if (not= :zero (:multiplicity this))
      (FieldDescriptor. (:field this) other-entity (:multiplicity this) (:required? this)))))
#'cljs.user/cardinality->obj-field-descriptor
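The nested lookups in cardinality->obj-field-descriptor can be hard to follow, so here is a sketch of the same selection logic written with destructuring. It is an illustration under assumptions, not the post’s code: the function name obj-field is invented, and it works on plain maps rather than the Cardinality and FieldDescriptor records.

```clojure
(defn obj-field
  "Hypothetical plain-map version: returns the field that `entity` gets from
  this cardinality, or nil when its side has multiplicity :zero."
  [entity {:keys [left right]}]
  (let [;; Pick out this entity's side and the opposite side.
        [this other] (if (= entity (:entity left)) [left right] [right left])]
    (when (not= :zero (:multiplicity this))
      {:field        (:field this)
       :type         (:entity other)
       :multiplicity (:multiplicity this)
       :required?    (:required? this)})))

;; The first cardinality from the abbreviated sample shown earlier.
(def j-b
  {:left  {:entity "J" :field "b"  :multiplicity :zero :required? true}
   :right {:entity "B" :field "j0" :multiplicity :many :required? true}})

(obj-field "J" j-b) ; => nil, J's side has multiplicity :zero
(obj-field "B" j-b) ; => {:field "j0", :type "J", :multiplicity :many, :required? true}
```

In other words, a unidirectional relation contributes a field only to the right-hand entity, and that field’s type is the left-hand entity.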
And then combine them together to describe complete entities:
cljs.user=>
(defn entity->cardinalities [entity cardinalities]
  (filter #(or (= entity (-> %1 :left :entity)) (= entity (-> %1 :right :entity))) cardinalities))
#'cljs.user/entity->cardinalities
cljs.user=>
(defn entity->descriptors-gen [entity cardinalities]
  (let [filtered (entity->cardinalities entity cardinalities) ; Note: O(n^2), can be optimized
        obj-fields (remove nil? (map (partial cardinality->obj-field-descriptor entity) filtered))
        default-field (FieldDescriptor. "id" "ID" :one true)]
    (gen/let [num-fields (gen/choose 2 8)
              plain-fields (apply gen/tuple (repeat num-fields (field-gen (->camelCase entity))))]
      {:entity entity :fields (concat [default-field] plain-fields obj-fields)})))
#'cljs.user/entity->descriptors-gen
cljs.user=>
(def entity-descriptors-gen
  (gen/let [{:keys [entities relations cardinalities]} entities-and-cardinalities-gen
            entity-descriptors (apply gen/tuple (map #(entity->descriptors-gen %1 cardinalities) entities))]
    entity-descriptors))
#'cljs.user/entity-descriptors-gen
And sampling one complete entity gives:
cljs.user=> (pprint (-> (sample-one entity-descriptors-gen) first))
{:entity "K",
 :fields
 ({:field "id", :type "ID", :multiplicity :one, :required? true}
  {:field "kB", :type "Int", :multiplicity :many, :required? false}
  {:field "kO", :type "String", :multiplicity :one, :required? true}
  {:field "u", :type "U", :multiplicity :many, :required? true}
  {:field "r", :type "R", :multiplicity :one, :required? true})}
Now all that is left is to emit the GraphQL Schema Language given an entity description record:
cljs.user=>
(defn emit-field [fd]
  (if (#{:one :many} (:multiplicity fd))
    (format " %s: %s%s%s%s"
            (:field fd)
            (if (= (:multiplicity fd) :many) "[" "")
            (:type fd)
            (if (= (:multiplicity fd) :many) "]" "")
            (if (:required? fd) "!" "")) nil))
#'cljs.user/emit-field
cljs.user=>
(defn emit-type [{:keys [entity fields]}]
  (format "type %s {\n%s\n}"
          entity
          (clojure.string/join "\n" (remove nil? (map emit-field fields)))))
#'cljs.user/emit-type
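To make the mapping from a field descriptor to a line of schema text concrete, here is a small portable sketch. It is an illustration rather than the code above: the name emit-field* is invented, it builds the string with str instead of goog.string/format, and the single leading space simply mirrors the format string as shown in emit-field.

```clojure
(defn emit-field*
  "Hypothetical str-based variant of emit-field for plain maps."
  [{:keys [field type multiplicity required?]}]
  (when (#{:one :many} multiplicity)
    (str " " field ": "
         ;; :many wraps the type in list brackets.
         (if (= :many multiplicity) (str "[" type "]") type)
         ;; required? adds the non-null marker to the outermost type.
         (when required? "!"))))

(emit-field* {:field "hX" :type "Boolean" :multiplicity :many :required? true})
;; => " hX: [Boolean]!"
(emit-field* {:field "x" :type "X" :multiplicity :one :required? false})
;; => " x: X"
(emit-field* {:field "b" :type "B" :multiplicity :zero :required? true})
;; => nil
```

Note that for a required :many field the non-null marker applies to the list itself: in GraphQL, [Boolean]! is a list that must be present but whose items may be null.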
And join a sequence of them together into the full schema:
cljs.user=>
(def schema-str-gen
  (gen/let [entity-descriptors entity-descriptors-gen]
    (clojure.string/join "\n\n" (map emit-type entity-descriptors))))
#'cljs.user/schema-str-gen
And the final product:
cljs.user=> (print (sample-one schema-str-gen))
type H {
id: ID!
hX: [Boolean]!
hY: Int!
hL: Boolean
hO: [String]!
x: X
r: R
m: [M]
y: [Y]!
}
type Y {
id: ID!
yUk: [Boolean]!
yI: [Timestamp]
yA: Int!
q: Q
h: H
}
...
[2] These restrictions are placed on us by the library this was originally designed to test, speako, which does not support non-standard extensions to the as-yet informally specified GraphQL Schema Language. Yet even with these restrictions, speako is quite usable and flexible in practice, and there are straightforward workarounds for obtaining multiple bidirectional relations between two entities if one absolutely requires them. Arguably, the benefit of keeping the GQL Schema Language syntax simpler (and more portable to other GQL backends) is well worth the tradeoff of slightly more restrictions on the nature of relations.
[3] field-gen takes a prefix so that we can ensure that each set of fields generated for our entities is unique (which is a requirement of speako).