Hello,
I am building a small ORM on top of metaobjects and metafields (maestrooo/metaobject-repository on GitHub: a simple ORM for manipulating Shopify metaobjects and metafields) to make them easier to work with. It works really well and I’m happy with the overall architecture; however, I found a problem with Shopify’s query cost calculation.
Right now, the library lets you populate one or multiple references (even recursively), and it generates an optimized query to do so. For instance, this code:
const events = await eventRepository.findAll({ populate: ['author.image', 'products'] })
will generate a GraphQL query that gets the events along with their products, the author, and the author’s image. Internally, the library generates a query that looks like this:
{
metaobjects(first: 50, type: "foo") {
nodes {
fields {
name
jsonValue
}
_author: field(key: "author") {
reference {
...on Metaobject {
fields {
name
jsonValue
}
_image: field(key: "image") {
reference {
...on MediaImage {
# properties
}
}
}
}
}
}
_products: field(key: "products") {
references(first: 10) {
nodes {
id
...on Product {
# other properties
}
}
}
}
}
}
}
This works really well and maps nicely onto a recursive approach. There is, however, one issue: the calculated cost is far higher than that of a naive approach that can actually fetch much more data.
This optimized query only fetches what we need: if the event object has other references, they are not fetched.
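To illustrate the recursive approach, here is a minimal sketch of how dotted populate paths could be turned into a tree that a query generator then walks, emitting one aliased field(key: ...) selection per key. The names PopulateTree and buildPopulateTree are hypothetical and only for illustration; this is not necessarily how the library does it.

```typescript
// Hypothetical sketch, not the library's actual implementation.
// Each key is a field to populate; its value holds the nested fields to populate.
type PopulateTree = { [key: string]: PopulateTree };

// Turns dotted paths such as 'author.image' into a nested tree,
// merging paths that share a prefix.
function buildPopulateTree(paths: string[]): PopulateTree {
  const root: PopulateTree = {};
  for (const path of paths) {
    let node = root;
    for (const key of path.split(".")) {
      // Create the child on first visit, reuse it afterwards.
      node[key] = node[key] ?? {};
      node = node[key];
    }
  }
  return root;
}
```

With the populate option from the example above, buildPopulateTree(['author.image', 'products']) yields a tree with author (containing image) and products at the top level, which matches the _author and _products aliases in the query.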
However, I’ve found that the following query has a much lower calculated cost, even though it potentially retrieves much more data:
{
metaobjects(first: 50, type: "foo") {
nodes {
fields {
name
jsonValue
reference {
...on Metaobject {
fields {
name
jsonValue
reference {
...on MediaImage {
# properties
}
}
}
}
}
references(first: 10) {
nodes {
...on Product {
# fields
}
}
}
}
}
}
}
The problem is that if the metaobject of type foo has several fields that reference metaobjects or lists of products, far more data will be retrieved (while we only want to populate the author). This is actually a problem when I serialize the data to our internal structure, as I might get references to objects the user did not ask for in the initial query.
I think that internally, for the query cost, if the query already includes a “fields” selection, then the individual “field” selections should be “merged” with it, so that the resulting cost is the same.
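For context, this is roughly what the serializer would have to do with the broad query's result: drop every resolved reference that is not in the populate list. The sketch below is an assumption on my part, with hypothetical names (RawField, pruneFields) and a field shape that mirrors the query above. It only hides the extra data; it does not avoid fetching it.

```typescript
// Hypothetical sketch: shapes and names are illustrative only.
type PopulateTree = { [key: string]: PopulateTree };

// Mirrors the selection in the broad query: every field may carry a reference.
interface RawField {
  name: string;
  jsonValue: unknown;
  reference?: { fields?: RawField[] } | null;
  references?: { nodes: unknown[] } | null;
}

// Keeps references only for fields listed in the populate tree, recursively.
function pruneFields(fields: RawField[], tree: PopulateTree): RawField[] {
  return fields.map((field) => {
    const subtree = tree[field.name];
    if (subtree === undefined) {
      // Not requested: keep the raw value, drop any resolved reference(s).
      return { name: field.name, jsonValue: field.jsonValue };
    }
    // Requested: keep the reference and prune its own fields one level down.
    const reference = field.reference?.fields
      ? { ...field.reference, fields: pruneFields(field.reference.fields, subtree) }
      : field.reference;
    return { ...field, reference };
  });
}
```

Calling pruneFields(node.fields, { author: { image: {} } }) would keep the author and the author's image, and strip the resolved reference from every other field.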
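To compare the two queries yourself, you can read the cost that the Admin API reports under extensions.cost in each response. The helper below is a small sketch; the names readCost and costDifference are hypothetical, and it assumes the response body has already been parsed from JSON.

```typescript
// Cost block reported by the Admin GraphQL API under `extensions.cost`.
interface QueryCost {
  requestedQueryCost: number;
  actualQueryCost: number | null;
}

// Minimal response shape for this sketch; `data` is left untyped on purpose.
interface GraphQLResponse {
  data?: unknown;
  extensions?: { cost?: QueryCost };
}

// Returns the reported cost, or undefined if the response carries none.
function readCost(response: GraphQLResponse): QueryCost | undefined {
  return response.extensions?.cost;
}

// Difference in requested cost between two responses (first minus second).
// Returns undefined if either response has no cost information.
function costDifference(a: GraphQLResponse, b: GraphQLResponse): number | undefined {
  const costA = readCost(a);
  const costB = readCost(b);
  if (!costA || !costB) return undefined;
  return costA.requestedQueryCost - costB.requestedQueryCost;
}
```

Passing the response of the aliased query first and the response of the broad query second gives a positive number whenever the aliased query is costed higher.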