We are fetching historical order data using the Bulk Admin Graph QL API. We then use the ‘order updated’ webhook to get any changes that may happen to an order. Such as refunds etc. The issue we are having is reconciling the financial transactions, so that they match merchant activity, so that sales, shipping, discounts, returns/refunds are all fetched and recorded correctly on our system, so that the calculations match up. Can anybody help?
Hey @Gurpreet_Sidhu - good call on using the bulk query there. I’m not sure how you currently have things set up, but a bulk query like this might be what you’re looking for (just as an example):
mutation {
bulkOperationRunQuery(
query: """
{
orders {
edges {
node {
id
name
createdAt
# Financial totals
currentTotalPriceSet {
shopMoney {
amount
currencyCode
}
}
currentSubtotalPriceSet {
shopMoney {
amount
currencyCode
}
}
currentTotalTaxSet {
shopMoney {
amount
currencyCode
}
}
currentTotalDiscountsSet {
shopMoney {
amount
currencyCode
}
}
currentShippingPriceSet {
shopMoney {
amount
currencyCode
}
}
totalRefundedSet {
shopMoney {
amount
currencyCode
}
}
# Line items for detailed breakdown
lineItems {
edges {
node {
quantity
originalUnitPriceSet {
shopMoney {
amount
currencyCode
}
}
discountedUnitPriceSet {
shopMoney {
amount
currencyCode
}
}
totalDiscountSet {
shopMoney {
amount
currencyCode
}
}
}
}
}
transactions {
id
kind
status
amountSet {
shopMoney {
amount
currencyCode
}
}
parentTransaction {
id
}
createdAt
}
refunds {
id
totalRefundedSet {
shopMoney {
amount
currencyCode
}
}
createdAt
note
}
}
}
}
}
"""
) {
bulkOperation {
id
status
}
userErrors {
field
message
}
}
}
This should give you all of the correct historical data per order, including discounts, shipping rates, returns/refuds, etc. You would need to do some post-processing afterwards to aggregate the data, but this might work for you. Let me know if I’m missing anything in your post here though, happy to help!
Hi @Alan_G thanks for the reply, here is a copy of the query we are currently using:
$query = <<<GQL
mutation {
bulkOperationRunQuery(
query: """
{
orders(query: "updated_at:>='${firstDayOfPrevMonth}' AND updated_at:<='${currentTime}'") {
edges {
node {
id
name
createdAt
displayFinancialStatus
processedAt
currentSubtotalPriceSet {
presentmentMoney {
amount
}
}
currentTotalPriceSet {
presentmentMoney {
amount
currencyCode
}
}
totalShippingPriceSet {
presentmentMoney {
amount
}
}
customer {
defaultAddress {
countryCode
}
}
billingAddress {
countryCode
}
shippingAddress {
countryCode
}
discountApplications(first: 10) {
edges {
node {
__typename
... on DiscountCodeApplication {
code
value {
__typename
... on MoneyV2 {
amount
currencyCode
}
}
}
}
}
}
lineItems(first: 10) {
edges {
node {
name
sku
quantity
}
}
}
refunds {
id
createdAt
updatedAt
totalRefundedSet {
presentmentMoney {
amount
}
}
}
}
}
}
}
"""
) {
bulkOperation {
id
status
}
userErrors {
field
message
}
}
}
GQL;
Our initial attempts actually looked almost identical to your suggestion, however we ran into a bug in the API that wouldn’t allow queries on nodes within nodes, so it refused to execute if we asked for line items, transactions, or refunds.
It’s more in the post-processing stage that we’re seeing so many different ways of expressing complex transactions that it’s proving difficult to be certain that we’ve got the right data at the right points; combine that with needing real-time data to avoid the problem of being unable to acquire refund data for a given date range, and we have a few transactions that we can’t really parse correctly. Do you have any advice on that?
Thanks for clarifying @Gurpreet_Sidhu, definitely understand where you’re coming from. We do limit nested query nodes to two levels for bulk operations (more info on that here). I did a little bit of testing, and I think this might be an optimized query to use:
mutation {
bulkOperationRunQuery(
query: """
{
orders(query: "updated_at:>='2023-01-01T00:00:00Z' AND updated_at:<='2024-12-31T23:59:59Z'") {
edges {
node {
id
name
createdAt
displayFinancialStatus
processedAt
currentTotalPriceSet {
shopMoney {
amount
currencyCode
}
presentmentMoney {
amount
currencyCode
}
}
currentSubtotalPriceSet {
shopMoney {
amount
currencyCode
}
presentmentMoney {
amount
currencyCode
}
}
currentTotalTaxSet {
shopMoney {
amount
currencyCode
}
presentmentMoney {
amount
currencyCode
}
}
currentTotalDiscountsSet {
shopMoney {
amount
currencyCode
}
presentmentMoney {
amount
currencyCode
}
}
currentShippingPriceSet {
shopMoney {
amount
currencyCode
}
presentmentMoney {
amount
currencyCode
}
}
totalRefundedSet {
shopMoney {
amount
currencyCode
}
presentmentMoney {
amount
currencyCode
}
}
customer {
defaultAddress {
countryCode
address1
address2
city
province
zip
}
}
billingAddress {
countryCode
address1
address2
city
province
zip
}
shippingAddress {
countryCode
address1
address2
city
province
zip
}
discountApplications(first: 10) {
edges {
node {
__typename
... on DiscountCodeApplication {
code
value {
__typename
... on MoneyV2 {
amount
currencyCode
}
}
}
}
}
}
lineItems(first: 50) {
edges {
node {
name
sku
quantity
originalUnitPriceSet {
shopMoney {
amount
currencyCode
}
presentmentMoney {
amount
currencyCode
}
}
discountedUnitPriceSet {
shopMoney {
amount
currencyCode
}
presentmentMoney {
amount
currencyCode
}
}
totalDiscountSet {
shopMoney {
amount
currencyCode
}
presentmentMoney {
amount
currencyCode
}
}
}
}
}
transactions {
id
kind
status
amountSet {
shopMoney {
amount
currencyCode
}
presentmentMoney {
amount
currencyCode
}
}
parentTransaction {
id
}
createdAt
}
refunds {
id
totalRefundedSet {
shopMoney {
amount
currencyCode
}
presentmentMoney {
amount
currencyCode
}
}
createdAt
updatedAt
note
}
}
}
}
}
"""
) {
bulkOperation {
id
status
}
userErrors {
field
message
}
}
}
This should make it so that you’re not hitting the nested fields limitation (which might have been what you were seeing with that error message), but still give you all of the customer, order, transaction, financial, etc info you need.
In terms of post processing, there are definitely a lot of ways you can work with the data like you mentioned. I’m not sure how you’re building your integration, but since the data is output as JSONL, the easiest implementation might be something like using JSONL-DB if you’re using Node to store the data and then build an interpretation layer on top of that.
For real-time data, you could set up a webhook subscription for your integration to a topic like orders/create or orders/updated so that whenever an order is created or updated you receive up-to-date info that you could then pass back to your database with the bulk-query info (maybe setting the bulk queries to run at certain times).
On my end here, I’m not able to help with direct code-level support (unless it’s to do with our Shopify libraries, etc), but I hope this gives you a decent place to start from. Let me know if I can clarify anything on my end here!