Rebuilding object graph from JSONL output

Hi there,

I am trying to write a general algorithm that parses the JSONL output of a GraphQL bulk query and converts it to an object graph (or rather, a tree) equivalent to the output of the same query in non-bulk mode. I am running into issues with the way the __parentId field is used when objects are nested within one another.

For instance, take the following query:

query MyDraftOrderQuery {
  draftOrders(first: 5) {
    edges {
      node {
        id
        customer {
          id
          lastOrder {
            id
            paymentTerms {
              id
              paymentSchedules(first: 10) {
                edges {
                  node {
                    id
                    __typename
                  }
                }
              }
            }
          }
        }
        paymentTerms {
          id
          paymentSchedules(first: 10) {
            edges {
              node {
                id
                __typename
              }
            }
          }
        }
      }
    }
  }
}

In bulk mode, I get the following output (converted to a JSON array for readability):

[
  {
    "id": "gid://shopify/DraftOrder/1521654268227",
    "customer": {
      "id": "gid://shopify/Customer/22810943586627",
      "lastOrder": {
        "id": "gid://shopify/Order/10024036860227",
        "paymentTerms": { "id": "gid://shopify/PaymentTerms/54645981507" }
      }
    },
    "paymentTerms": { "id": "gid://shopify/PaymentTerms/54645948739" }
  },
  {
    "id": "gid://shopify/PaymentSchedule/53899329859",
    "__typename": "PaymentSchedule",
    "__parentId": "gid://shopify/DraftOrder/1521654268227"
  },
  {
    "id": "gid://shopify/PaymentSchedule/53899297091",
    "__typename": "PaymentSchedule",
    "__parentId": "gid://shopify/DraftOrder/1521654268227"
  }
]

In non-bulk mode, I get this output:

{
  "data": {
    "draftOrders": {
      "edges": [
        {
          "node": {
            "id": "gid://shopify/DraftOrder/1521654268227",
            "customer": {
              "id": "gid://shopify/Customer/22810943586627",
              "lastOrder": {
                "id": "gid://shopify/Order/10024036860227",
                "paymentTerms": {
                  "id": "gid://shopify/PaymentTerms/54645981507",
                  "paymentSchedules": {
                    "edges": [
                      {
                        "node": {
                          "id": "gid://shopify/PaymentSchedule/53899329859",
                          "__typename": "PaymentSchedule"
                        }
                      }
                    ]
                  }
                }
              }
            },
            "paymentTerms": {
              "id": "gid://shopify/PaymentTerms/54645948739",
              "paymentSchedules": {
                "edges": [
                  {
                    "node": {
                      "id": "gid://shopify/PaymentSchedule/53899297091",
                      "__typename": "PaymentSchedule"
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}

As the non-bulk output shows, the two payment schedule objects have different parents: one belongs to the payment terms of the customer's last order, the other to the payment terms of the top-level draft order. In the JSONL output, however, the __parentId field of both payment schedule objects points to the top-level draft order. As far as I can tell, this makes it impossible to re-attach them to their correct parents.
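To make the ambiguity concrete, here is a minimal Python sketch of the obvious first step: grouping the JSONL lines by __parentId. The function name group_by_parent is hypothetical and the sample lines are trimmed copies of the bulk output above; this is an illustration of the problem, not a working reconstruction.

```python
import json
from collections import defaultdict


def group_by_parent(jsonl_text):
    """Split bulk-output lines into root objects and children keyed by __parentId."""
    roots = {}
    children = defaultdict(list)
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        obj = json.loads(line)
        parent_id = obj.pop("__parentId", None)
        if parent_id is None:
            roots[obj["id"]] = obj
        else:
            children[parent_id].append(obj)
    return roots, children


# Sample data copied from the bulk output in this post.
DRAFT_ORDER_ID = "gid://shopify/DraftOrder/1521654268227"
sample_objects = [
    {
        "id": DRAFT_ORDER_ID,
        "customer": {
            "id": "gid://shopify/Customer/22810943586627",
            "lastOrder": {
                "id": "gid://shopify/Order/10024036860227",
                "paymentTerms": {"id": "gid://shopify/PaymentTerms/54645981507"},
            },
        },
        "paymentTerms": {"id": "gid://shopify/PaymentTerms/54645948739"},
    },
    {
        "id": "gid://shopify/PaymentSchedule/53899329859",
        "__typename": "PaymentSchedule",
        "__parentId": DRAFT_ORDER_ID,
    },
    {
        "id": "gid://shopify/PaymentSchedule/53899297091",
        "__typename": "PaymentSchedule",
        "__parentId": DRAFT_ORDER_ID,
    },
]
jsonl_text = "\n".join(json.dumps(obj) for obj in sample_objects)

roots, children = group_by_parent(jsonl_text)
for parent_id, kids in children.items():
    print(parent_id, [kid["id"] for kid in kids])
```

Both payment schedules end up in the same bucket under the draft order, with nothing in either line that says which of the two paymentTerms objects it should hang under.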

Is this behavior intentional? If so, is there a way to generate a correct object graph from the JSONL output in this case?


Hey @Jonas_Hogh - thanks for reporting this. It does seem strange. I’m going to dig into this a bit more to confirm whether it is intentional, but I might have a workaround in the meantime. Can you try this query and let me know if it resolves the issue?

query DraftOrderQuery {
  draftOrders(first: 5, reverse: true) {
    edges {
      node {
        id
        customer {
          id
          lastOrder {
            id
            paymentTerms {
              id
              paymentSchedules(first: 10) {
                edges {
                  node {
                    id
                    __typename
                    paymentTerms {
                      id
                    }
                  }
                }
              }
            }
          }
        }
        paymentTerms {
          id
          paymentSchedules(first: 10) {
            edges {
              node {
                id
                __typename
                paymentTerms {
                  id
                }
              }
            }
          }
        }
      }
    }
  }
}

There is a bit of redundancy there, but I think if we query paymentTerms again under each paymentSchedules node, this should let you tie every payment schedule back to its original parent payment terms. This query should also be usable in Bulk Operations (I did some testing on my end to confirm).
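For illustration, here is a rough Python sketch of how the extra paymentTerms id could be used on the client side. It assumes each PaymentSchedule line in the JSONL output then carries a nested "paymentTerms": {"id": ...} object, which is not shown in this thread; the helper names find_by_id and attach_schedule are hypothetical. The ids are taken from the outputs posted above.

```python
def find_by_id(node, target_id):
    """Depth-first search for the nested dict whose "id" equals target_id."""
    if isinstance(node, dict):
        if node.get("id") == target_id:
            return node
        for value in node.values():
            found = find_by_id(value, target_id)
            if found is not None:
                return found
    elif isinstance(node, list):
        for item in node:
            found = find_by_id(item, target_id)
            if found is not None:
                return found
    return None


def attach_schedule(root, schedule_line):
    """Attach one PaymentSchedule line under the paymentTerms it refers to."""
    child = dict(schedule_line)  # copy so the caller's dict is left untouched
    child.pop("__parentId", None)
    terms_ref = child.pop("paymentTerms")  # assumed shape: {"id": "..."}
    terms = find_by_id(root, terms_ref["id"])
    if terms is None:
        raise KeyError(f"no object with id {terms_ref['id']} under this root")
    connection = terms.setdefault("paymentSchedules", {"edges": []})
    connection["edges"].append({"node": child})


LAST_ORDER_TERMS = "gid://shopify/PaymentTerms/54645981507"
DRAFT_ORDER_TERMS = "gid://shopify/PaymentTerms/54645948739"

draft_order = {
    "id": "gid://shopify/DraftOrder/1521654268227",
    "customer": {
        "id": "gid://shopify/Customer/22810943586627",
        "lastOrder": {
            "id": "gid://shopify/Order/10024036860227",
            "paymentTerms": {"id": LAST_ORDER_TERMS},
        },
    },
    "paymentTerms": {"id": DRAFT_ORDER_TERMS},
}

# Assumed line shape once paymentTerms { id } is added to each schedule node.
schedule_lines = [
    {
        "id": "gid://shopify/PaymentSchedule/53899329859",
        "__typename": "PaymentSchedule",
        "paymentTerms": {"id": LAST_ORDER_TERMS},
        "__parentId": draft_order["id"],
    },
    {
        "id": "gid://shopify/PaymentSchedule/53899297091",
        "__typename": "PaymentSchedule",
        "paymentTerms": {"id": DRAFT_ORDER_TERMS},
        "__parentId": draft_order["id"],
    },
]

for line in schedule_lines:
    attach_schedule(draft_order, line)
```

One caveat with this sketch: if the same PaymentTerms id appears at two places under one root, the search returns only the first match, so the schedule would be attached in just one of them.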

Hopefully I’m understanding this correctly - I’ll loop back with you here once I have a more definitive answer on the expected behaviour though. Let me know if I can clarify anything on my end here/if this doesn’t work.

Hey @Jonas_Hogh - just confirming, after getting in touch with the team, that the initial behaviour you reported here is considered expected on our end. Can you let me know if you encounter any issues with the workaround though?

If so, I’m happy to take a closer look for sure, just let me know.

I did consider that as a workaround, though it seems quite unwieldy for several reasons:

  1. It seems to cause a non-negligible increase in query cost (in my admittedly very limited testing)
  2. It puts a lot of complexity on developers when writing queries
  3. It seems really hard to automatically detect from the GraphQL metadata which property to use for the reference back to the real parent in order to write a general algorithm for rebuilding the object graph

I also noticed a similar situation where, as far as I can see, this workaround does not apply: the Order type has two collections, “lineItems” and “nonFulfillableLineItems”, which have the same type and the same parent. There doesn’t seem to be any way to detect which collection a “LineItem” object with the Order as its parent belongs to (other than knowing the exact business logic Shopify uses to decide that a line item is fulfillable and then inspecting those properties, e.g. finding the line items that are non-physical products).

Am I missing something obvious here? Is there a way to configure the bulk query that I overlooked, or is it supposed to be this difficult to use?

@Alan_G could you suggest a way to tell which line items under an order belong to the lineItems connection and which belong to the nonFulfillableLineItems connection?